Copying a production database into a test environment is the most common way to get realistic data, but it also moves personal information —names, emails, national IDs— into environments with weaker controls, which puts GDPR compliance at risk. Data masking prevents that by replacing sensitive values with fictitious but believable ones.
This is a practical guide: how to mask data in SQL Server using its two methods —Dynamic Data Masking and static masking—, what each one protects, and how to apply them without breaking the relationships between tables.
The two masking methods in SQL Server
SQL Server can mask data in two ways with very different results. Dynamic Data Masking hides data at query time without changing the table. Static masking rewrites the values permanently in a copy. The first limits what is visible in production; the second protects data that leaves it. The right choice becomes clear once you see how each one works.
How to apply Dynamic Data Masking
Available since SQL Server 2016, Dynamic Data Masking (DDM) is defined at the column level and applied automatically whenever the field is queried. You can declare it when creating the table:
sql
CREATE TABLE dbo.Customers (
CustomerID INT IDENTITY PRIMARY KEY,
FullName NVARCHAR(100) MASKED WITH (FUNCTION = 'default()'),
Email NVARCHAR(256) MASKED WITH (FUNCTION = 'email()'),
Phone VARCHAR(20) MASKED WITH (FUNCTION = 'partial(0,"XXX-XXX-",4)'),
Balance MONEY MASKED WITH (FUNCTION = 'random(1, 1000)')
);Or add it to a column that already exists:
sql
ALTER TABLE dbo.Customers
ALTER COLUMN Email ADD MASKED WITH (FUNCTION = 'email()');The four masking functions
Each function covers a data type:
- default() — fully masks the value by type: XXXX for strings, 0 for numerics, a fixed date for date fields.
- email() — exposes the first letter and replaces the rest with a .com suffix (for example, aXXX@XXXX.com).
- partial(prefix, padding, suffix) — exposes the characters you set at the start and end, and masks the middle with the padding text.
- random(start, end) — replaces a numeric value with a random one within the range.
Granting and revoking the UNMASK permission
By default, users see masked values; the UNMASK permission reveals the originals. As of SQL Server 2022 it is granular and can be granted at the database, schema, table, or column level:
sql
-- Allow a role to see only the email without a mask
GRANT UNMASK ON dbo.Customers(Email) TO support_role;
-- Revoke the permission
REVOKE UNMASK ON dbo.Customers(Email) FROM support_role;Checking which columns are masked
The sys.masked_columns catalog view lists active masks and their function, which is useful for auditing the configuration:
sql
SELECT t.name AS table_name, c.name AS column_name, c.masking_function
FROM sys.masked_columns c
JOIN sys.tables t ON c.object_id = t.object_id
WHERE c.is_masked = 1;The security limits of Dynamic Data Masking
Before relying on this feature, note one caveat most guides skip. Microsoft states that Dynamic Data Masking is not meant to be a complete security control: it prevents accidental exposure, not deliberate access, and shouldn't be your only safeguard.
The reason is that the mask only affects presentation; the real value stays stored and the database keeps operating on it. A user with SELECT permission, even without UNMASK, can infer masked data through filters:
sql
-- Balance is masked, but the filter reveals which customers
-- exceed a given threshold
SELECT FullName FROM dbo.Customers WHERE Balance > 50000;By repeating queries with different conditions, the hidden values can be reconstructed. On top of that, any administrative account —or a copy of the database into another environment— reaches the real data with no restriction. DDM works as a complementary layer alongside auditing, encryption, and row-level security, but it does not protect data that leaves production.
How to run static masking for non-production environments
When you need to hand a safe copy to development, QA, analytics, or model training, masking must be static: the values are physically rewritten and the real data disappears. Even if someone accesses the copy with full privileges, there is no personal information to recover. This is what removes those non-production environments from the scope of GDPR. These are the steps.
1. Map the dependencies before transforming
Safe transformation starts by mapping the schema: locating primary and foreign keys so you don't mask one column and orphan the ones that depend on it. SQL Server catalog views expose them:
sql
SELECT
tp.name AS parent_table, cp.name AS parent_column,
tr.name AS child_table, cr.name AS child_column
FROM sys.foreign_keys fk
JOIN sys.foreign_key_columns fkc ON fk.object_id = fkc.constraint_object_id
JOIN sys.tables tp ON fkc.referenced_object_id = tp.object_id
JOIN sys.columns cp ON fkc.referenced_object_id = cp.object_id AND fkc.referenced_column_id = cp.column_id
JOIN sys.tables tr ON fkc.parent_object_id = tr.object_id
JOIN sys.columns cr ON fkc.parent_object_id = cr.object_id AND fkc.parent_column_id = cr.column_id;2. Transform consistently
Each field is replaced while respecting its format and the system's validations. The key is that a given value always transforms to the same result, so the joins between tables keep working:
sql
-- Deterministic synthetic email derived from the identifier
UPDATE dbo.Customers
SET Email = CONCAT('customer', CustomerID, '@example.com');A national ID is replaced with another value that still passes the system's checksum; a phone number, with a correctly formatted but fictitious one.
3. Preserve referential integrity
Most masking projects fail at preserving referential integrity. If a customer identifier changes in its table but not in orders or invoices, the relationships break and the copy becomes useless for testing. The fix is a deterministic mapping propagated in cascade:
sql
-- Unique mapping table (ROW_NUMBER avoids collisions)
SELECT
CustomerID AS id_original,
ROW_NUMBER() OVER (ORDER BY CustomerID) AS id_new
INTO #CustomerMap
FROM dbo.Customers;
CREATE UNIQUE INDEX IX_CustomerMap ON #CustomerMap (id_original);
BEGIN TRAN;
UPDATE o
SET o.CustomerID = m.id_new
FROM dbo.Orders o
JOIN #CustomerMap m ON o.CustomerID = m.id_original;
UPDATE c
SET c.CustomerID = m.id_new
FROM dbo.Customers c
JOIN #CustomerMap m ON c.CustomerID = m.id_original;
COMMIT;When the identifier is protected by foreign keys, update the child tables first, use ON UPDATE CASCADE, or temporarily disable the constraints during the operation.
4. Automate and keep evidence
A control that depends on manual execution is not reliable. Masking should run inside the provisioning flow (Azure DevOps, GitHub Actions, Jenkins) so every environment is prepared with already-protected data. It should also record what was transformed, with which rule, and on which schema version: that traceability is what GDPR, NIS2, and HIPAA expect during an audit.

