Taxonomy Types (Policy and Usage Labels)
Policies generally converge on four usage labels that are assigned directly to data and map to the sensitivity levels above:
- Restricted: Maximum-sensitivity data that, if disclosed, would violate regulations or contracts with severe consequences (typically mapped to High Level).
- Confidential: Operationally critical information shared only with internal, need-to-know groups (mapped to High or Medium Level).
- Internal Use Only: Non-public information intended to be shared among authorized employees; damage from a leak would be moderate, but basic safeguards are required (mapped to Medium Level).
- Public: Information intended for broad disclosure, carrying no risk if exposed (mapped to Low Level).
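As a minimal sketch, the taxonomy above can be expressed as a lookup table. Note that the policy maps Confidential to either High or Medium; this example picks High as the conservative default, and treating unknown labels as High is an assumption (failing closed), not part of the policy text:

```python
# Illustrative mapping of usage labels to sensitivity levels.
LABEL_TO_LEVEL = {
    "Restricted": "High",
    "Confidential": "High",  # or "Medium", depending on local policy
    "Internal Use Only": "Medium",
    "Public": "Low",
}

def level_for(label: str) -> str:
    """Return the sensitivity level for a usage label.

    Unknown labels fail closed to "High" (an assumed safe default).
    """
    return LABEL_TO_LEVEL.get(label, "High")
```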
Data Classification Policy: Scope, Criteria, and Controls per Level
A usable data classification policy is concise and explicit. It must define scope (systems, domains, and all environments), the levels/types used with clear criteria and examples, and the controls required per level (access, encryption, masking/tokenization, retention, logging). Specify ownership and RACI, exception/waiver rules, and the audit model (what is recorded and how it is verified).
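A hedged sketch of what "controls required per level" might look like as policy-as-code. The control names, values, and retention periods below are illustrative assumptions, not a standard schema:

```python
# Hypothetical policy-as-code fragment: required controls per level.
# All keys and values are examples; a real policy would set them per
# regulation and business need.
CONTROLS_BY_LEVEL = {
    "High": {
        "access": "need-to-know RBAC",
        "encryption": "at rest and in transit",
        "masking": "required in non-production",
        "retention_days": 365,
        "logging": "full audit trail",
    },
    "Medium": {
        "access": "internal RBAC",
        "encryption": "at rest",
        "masking": "recommended in non-production",
        "retention_days": 180,
        "logging": "access logging",
    },
    "Low": {
        "access": "open",
        "encryption": "optional",
        "masking": "not required",
        "retention_days": 90,
        "logging": "basic",
    },
}
```

Keeping the per-level controls in one declarative structure makes exceptions and audits easier: the waiver process can diff an asset's actual controls against this table.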
Data Classification Process
An effective process is continuous and systematic, and relies on automation across three core stages:
1. Discovery & Labeling
Run automated discovery to detect sensitive data patterns (PII/PHI/PCI) and tag assets. Apply rules to map detections to High/Medium/Low labels, have data stewards adjudicate edge cases, and propagate labels through data lineage so that every copy of the data stays consistently classified.
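The discovery-and-labeling step can be sketched as follows. The regexes and the detection-to-label mapping are deliberately simplistic assumptions; production tools use far richer patterns, checksums (e.g. Luhn validation for card numbers), and dictionaries:

```python
import re

# Illustrative detectors only; real discovery uses richer patterns.
DETECTORS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card_number": re.compile(r"\b\d{13,16}\b"),
}

# Assumed rule mapping: which detection triggers which label.
DETECTION_TO_LABEL = {"us_ssn": "High", "card_number": "High", "email": "Medium"}

def classify_column(samples: list[str]) -> str:
    """Scan sample values and return the highest label any value triggers."""
    order = {"High": 2, "Medium": 1, "Low": 0}
    label = "Low"
    for value in samples:
        for name, pattern in DETECTORS.items():
            if pattern.search(value):
                candidate = DETECTION_TO_LABEL[name]
                if order[candidate] > order[label]:
                    label = candidate
    return label
```

Edge cases a rule engine cannot settle (e.g. a free-text column that only sometimes contains PII) are exactly where the data stewards adjudicate.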
2. Policy Enforcement
Use the assigned labels to automatically drive and enforce controls. This includes implementing Role-Based Access Control (RBAC), determining which data protection controls (such as encryption or tokenization) apply, and ensuring data is handled according to its sensitivity level across all environments.
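A minimal sketch of a label-driven enforcement hook. The role name, environment names, and the rule that High/Medium labels require masking outside production are all assumptions for illustration:

```python
# Assumed policy: these labels must be masked outside production.
REQUIRES_MASKING = {"High", "Medium"}
NON_PRODUCTION = {"dev", "test", "staging"}

def required_controls(label: str, environment: str, roles: set[str]) -> dict:
    """Return the enforcement decision for one labeled asset."""
    return {
        # RBAC: High-labeled data restricted to a hypothetical role.
        "access_granted": label != "High" or "data-steward" in roles,
        # Masking/tokenization mandatory for sensitive labels outside prod.
        "mask_before_release": label in REQUIRES_MASKING
        and environment in NON_PRODUCTION,
        "encrypt_at_rest": label in {"High", "Medium"},
    }
```

The point is that the label, not ad-hoc per-system configuration, is the single input that determines the controls, so enforcement is identical in every environment.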
3. Compliance and Audit Readiness
Record all label changes, policy evaluations, and control executions as tamper-evident logs. This documentation serves as the official audit evidence for every data asset, streamlining compliance checks and enabling a model of continuous, demonstrable protection.
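One common way to make logs tamper-evident is hash chaining: each entry commits to the hash of the previous entry, so any retroactive edit breaks verification. A sketch under that assumption (no signing, storage, or time-stamping, which a production audit system would add):

```python
import hashlib
import json

def append_entry(log: list[dict], event: dict) -> None:
    """Append an event, chaining it to the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    payload = json.dumps({"event": event, "prev": prev_hash}, sort_keys=True)
    log.append({"event": event, "prev": prev_hash,
                "hash": hashlib.sha256(payload.encode()).hexdigest()})

def verify_chain(log: list[dict]) -> bool:
    """Recompute every hash; any edited or reordered entry fails."""
    prev_hash = "0" * 64
    for entry in log:
        payload = json.dumps({"event": entry["event"], "prev": prev_hash},
                             sort_keys=True)
        if (entry["prev"] != prev_hash
                or entry["hash"] != hashlib.sha256(payload.encode()).hexdigest()):
            return False
        prev_hash = entry["hash"]
    return True
```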
A toolchain that supports this process should provide:
- High-quality detectors (PII/PHI/PCI) with custom patterns/dictionaries.
- Deterministic masking/tokenization that preserves referential integrity across joins.
- Rule engine: detections → High/Medium/Low → control actions.
- APIs & CI/CD gates for automated checks before promotion.
- Lineage & propagation across pipelines and sinks.
- Audit-ready reports per release/domain/regulation.
- Role-based access and clean segregation by environment.
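Deterministic masking that preserves referential integrity can be sketched with a keyed hash (HMAC): the same input under the same key always yields the same token, so joins on the tokenized column still line up across tables. The key below is a placeholder; key management is out of scope:

```python
import hashlib
import hmac

SECRET_KEY = b"example-key"  # placeholder; use a managed secret in practice

def tokenize(value: str) -> str:
    """Deterministically replace a sensitive value with a stable token."""
    digest = hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()
    return "tok_" + digest[:16]
```

Because `tokenize("alice@example.com")` produces the identical token everywhere it appears, a join between, say, a customers table and an orders table on the tokenized email still matches after masking.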
How Gigantics Operationalizes Data Classification Across All Environments
Gigantics turns classification labels into executable, verifiable controls:
- Discovery & labeling: detect PII/PHI/PCI patterns and apply consistent labels (e.g., Public, Internal, Confidential, Restricted) to columns and datasets.
- Label-driven protection: enforce deterministic masking/tokenization (with referential integrity preserved) for labeled data in lower environments; keep encryption/monitoring aligned to policy.
- CI/CD checks: use API-based data gates to verify that assets are classified and required protections ran before promotion.
- Evidence: generate audit artifacts that link classification results to the protections applied for each run or release.
Result: classification that is enforced the same way in every environment, with clear proof of what was protected and when.
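The CI/CD gate idea can be illustrated generically. This is not the Gigantics API; the asset structure and field names below are hypothetical, standing in for whatever the platform's API returns:

```python
# Hypothetical promotion gate: fail the pipeline if any asset is
# unclassified or its required protections did not run.
def gate(assets: list[dict]) -> tuple[bool, list[str]]:
    failures = []
    for asset in assets:
        if not asset.get("label"):
            failures.append(f"{asset['name']}: missing classification label")
        elif (asset["label"] in {"Restricted", "Confidential"}
                and not asset.get("protected")):
            failures.append(f"{asset['name']}: required protection did not run")
    return (not failures, failures)
```

Wired into a pipeline, a `False` result blocks promotion and the failure messages become part of the audit evidence for that release.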