SQL Server

7 min read

How to Mask Data in SQL Server, Step by Step

Mask data in SQL Server step by step: what Dynamic Data Masking does, what it doesn't protect, and how to apply static masking for secure environments.

author-image

Rodrigo de Oliveira

CEO @Gigantics

Copying a production database into a test environment is the most common way to get realistic data, but it also moves personal information —names, emails, national IDs— into environments with weaker controls, which puts GDPR compliance at risk. Data masking prevents that by replacing sensitive values with fictitious but believable ones.



This is a practical guide: how to mask data in SQL Server using its two methods —Dynamic Data Masking and static masking—, what each one protects, and how to apply them without breaking the relationships between tables.




The two masking methods in SQL Server



SQL Server can mask data in two ways with very different results. Dynamic Data Masking hides data at query time without changing the table. Static masking rewrites the values permanently in a copy. The first limits what is visible in production; the second protects data that leaves it. The right choice becomes clear once you see how each one works.




How to apply Dynamic Data Masking



Available since SQL Server 2016, Dynamic Data Masking (DDM) is defined at the column level and applied automatically whenever the field is queried. You can declare it when creating the table:



sql


CREATE TABLE dbo.Customers (
    CustomerID  INT IDENTITY PRIMARY KEY,
    FullName    NVARCHAR(100) MASKED WITH (FUNCTION = 'default()'),
    Email       NVARCHAR(256) MASKED WITH (FUNCTION = 'email()'),
    Phone       VARCHAR(20)   MASKED WITH (FUNCTION = 'partial(0,"XXX-XXX-",4)'),
    Balance     MONEY         MASKED WITH (FUNCTION = 'random(1, 1000)')
);

Or add it to a column that already exists:



sql


ALTER TABLE dbo.Customers
ALTER COLUMN Email ADD MASKED WITH (FUNCTION = 'email()');


The four masking functions



Each function covers a data type:


  • default() — fully masks the value by type: XXXX for strings, 0 for numerics, a fixed date for date fields.

  • email() — exposes the first letter and replaces the rest with a .com suffix (for example, aXXX@XXXX.com).

  • partial(prefix, padding, suffix) — exposes the characters you set at the start and end, and masks the middle with the padding text.

  • random(start, end) — replaces a numeric value with a random one within the range.



Granting and revoking the UNMASK permission



By default, users see masked values; the UNMASK permission reveals the originals. As of SQL Server 2022 it is granular and can be granted at the database, schema, table, or column level:



sql


-- Allow a role to see only the email without a mask
GRANT UNMASK ON dbo.Customers(Email) TO support_role;

-- Revoke the permission
REVOKE UNMASK ON dbo.Customers(Email) FROM support_role;


Checking which columns are masked



The sys.masked_columns catalog view lists active masks and their function, which is useful for auditing the configuration:



sql


SELECT t.name AS table_name, c.name AS column_name, c.masking_function
FROM sys.masked_columns c
JOIN sys.tables t ON c.object_id = t.object_id
WHERE c.is_masked = 1;



The security limits of Dynamic Data Masking



Before relying on this feature, note one caveat most guides skip. Microsoft states that Dynamic Data Masking is not meant to be a complete security control: it prevents accidental exposure, not deliberate access, and shouldn't be your only safeguard.



The reason is that the mask only affects presentation; the real value stays stored and the database keeps operating on it. A user with SELECT permission, even without UNMASK, can infer masked data through filters:



sql


-- Balance is masked, but the filter reveals which customers
-- exceed a given threshold
SELECT FullName FROM dbo.Customers WHERE Balance > 50000;


By repeating queries with different conditions, the hidden values can be reconstructed. On top of that, any administrative account —or a copy of the database into another environment— reaches the real data with no restriction. DDM works as a complementary layer alongside auditing, encryption, and row-level security, but it does not protect data that leaves production.




How to run static masking for non-production environments



When you need to hand a safe copy to development, QA, analytics, or model training, masking must be static: the values are physically rewritten and the real data disappears. Even if someone accesses the copy with full privileges, there is no personal information to recover. This is what removes those non-production environments from the scope of GDPR. These are the steps.



1. Map the dependencies before transforming



Safe transformation starts by mapping the schema: locating primary and foreign keys so you don't mask one column and orphan the ones that depend on it. SQL Server catalog views expose them:



sql


SELECT
    tp.name AS parent_table,  cp.name AS parent_column,
    tr.name AS child_table,   cr.name AS child_column
FROM sys.foreign_keys fk
JOIN sys.foreign_key_columns fkc ON fk.object_id = fkc.constraint_object_id
JOIN sys.tables  tp ON fkc.referenced_object_id = tp.object_id
JOIN sys.columns cp ON fkc.referenced_object_id = cp.object_id AND fkc.referenced_column_id = cp.column_id
JOIN sys.tables  tr ON fkc.parent_object_id     = tr.object_id
JOIN sys.columns cr ON fkc.parent_object_id     = cr.object_id AND fkc.parent_column_id     = cr.column_id;


2. Transform consistently



Each field is replaced while respecting its format and the system's validations. The key is that a given value always transforms to the same result, so the joins between tables keep working:


sql


-- Deterministic synthetic email derived from the identifier
UPDATE dbo.Customers
SET Email = CONCAT('customer', CustomerID, '@example.com');

A national ID is replaced with another value that still passes the system's checksum; a phone number, with a correctly formatted but fictitious one.



3. Preserve referential integrity



Most masking projects fail at preserving referential integrity. If a customer identifier changes in its table but not in orders or invoices, the relationships break and the copy becomes useless for testing. The fix is a deterministic mapping propagated in cascade:



sql


-- Unique mapping table (ROW_NUMBER avoids collisions)
SELECT
    CustomerID AS id_original,
    ROW_NUMBER() OVER (ORDER BY CustomerID) AS id_new
INTO #CustomerMap
FROM dbo.Customers;

CREATE UNIQUE INDEX IX_CustomerMap ON #CustomerMap (id_original);

BEGIN TRAN;

UPDATE o
SET o.CustomerID = m.id_new
FROM dbo.Orders o
JOIN #CustomerMap m ON o.CustomerID = m.id_original;

UPDATE c
SET c.CustomerID = m.id_new
FROM dbo.Customers c
JOIN #CustomerMap m ON c.CustomerID = m.id_original;

COMMIT;


When the identifier is protected by foreign keys, update the child tables first, use ON UPDATE CASCADE, or temporarily disable the constraints during the operation.



4. Automate and keep evidence



A control that depends on manual execution is not reliable. Masking should run inside the provisioning flow (Azure DevOps, GitHub Actions, Jenkins) so every environment is prepared with already-protected data. It should also record what was transformed, with which rule, and on which schema version: that traceability is what GDPR, NIS2, and HIPAA expect during an audit.


Criterion Dynamic Data Masking Static masking
What it does Hides the value in the query Rewrites the value in the copy
Real value on disk Remains Disappears
Resists inference and admin access No Yes
Referential integrity Not handled Preserved with mapping or dictionaries
Recommended use Limiting the view in production Data for development, QA, analytics, AI
Removes data from GDPR scope No Yes, if irreversible



Common mistakes when masking in SQL Server



The most frequent one is trusting Dynamic Data Masking to protect copies outside production: the real value travels with every replica and can be inferred through queries.



The second is transforming related columns independently and breaking referential integrity, which produces false negatives in tests and pipeline failures unrelated to the code.



The third is relying on manual scripts without first classifying personal data (PII) or versioning the rules. These approaches scale poorly, produce different results across runs, and leave sensitive columns untouched.




How Gigantics automates it



Gigantics is a local-first platform that applies static masking on SQL Server without the data leaving your infrastructure. AI-driven discovery scans the schema, tags columns containing PII, and scores their risk. Anonymization rules are defined once and reused across environments, and dictionaries ensure a given value transforms the same way in every table, which solves referential integrity without manual mapping scripts. Every operation is recorded in audit reports ready for compliance.



Development, QA, and analytics teams work with realistic data, while the organization stops exposing personal information outside production.


Turn SQL Server masking into an automated process.

Close the gap between discovery and protection. Gigantics detects personal data at the schema level and applies static masking with referential integrity before it reaches non-production environments. Provision safe, realistic data to any environment.

Get your Technical Demo


Frequently asked questions



Does SQL Server have native data masking?



Yes. Since SQL Server 2016 it includes Dynamic Data Masking, which hides values in query results based on the user's permissions. It does not change the stored content.



Is Dynamic Data Masking secure?



It is a display control, not a complete security measure. Microsoft warns that it does not protect against deliberate attempts to infer the data: a user with query permission can deduce masked values through filters. It should be combined with auditing, encryption, and row-level security.



How do you grant permission to see unmasked data in SQL Server?



With GRANT UNMASK. As of SQL Server 2022 it can be granted granularly at the database, schema, table, or column level, and revoked with REVOKE UNMASK.



How do you mask data in SQL Server without breaking referential integrity?



With a deterministic mapping between the original value and the new one, propagated in cascade to every table with foreign keys. Specialized platforms solve this with consistency dictionaries instead of manual scripts.



Does data masking help with GDPR compliance?



Yes. It reduces the exposure of personal data outside production and, when irreversible, removes it from the scope of the regulation. With the right traceability, it provides the evidence GDPR, NIS2, and HIPAA require.