Taxonomy Types (Policy and Usage Labels)
Most policies converge on four usage labels, which are assigned to data and map to the sensitivity levels above:
- Restricted: Maximum-sensitivity data whose disclosure would violate regulations or contracts, with severe consequences (typically mapped to the High level).
- Confidential: Operationally critical information shared only with internal, need-to-know groups (mapped to the Medium or High level).
- Internal Use Only: Non-public information intended for authorized employees. A leak would cause moderate damage, so basic safeguards are required (mapped to the Medium level).
- Public: Information intended for broad disclosure, carrying no risk if exposed (mapped to the Low level).
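As a rough illustration, the label-to-level mapping can be kept as a small lookup table. The sketch below is an assumption about how one might encode it in Python; the names `Level`, `LABEL_LEVELS`, and `strictest_level` are hypothetical and not part of any specific product.

```python
from enum import Enum


class Level(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Usage label -> sensitivity levels it may map to, following the list above.
LABEL_LEVELS = {
    "Restricted": (Level.HIGH,),
    "Confidential": (Level.MEDIUM, Level.HIGH),
    "Internal Use Only": (Level.MEDIUM,),
    "Public": (Level.LOW,),
}


def strictest_level(label: str) -> Level:
    """Return the most restrictive level a usage label can map to."""
    return max(LABEL_LEVELS[label], key=lambda lvl: lvl.value)
```

Defaulting to the strictest candidate level is one conservative design choice; a real policy may instead ask a data steward to pick the level for Confidential data.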
Data Classification Policy: Scope, Criteria, and Controls per Level
A usable data classification policy is concise and explicit. It should define the scope (systems, domains, and all environments, including ephemeral ones); the levels or labels in use (High/Medium/Low and/or Public, Internal, Confidential, Restricted), each with clear criteria and examples; and the controls required per level (access, encryption, masking/tokenization, sharing, retention, logging). It should also specify ownership and RACI, the exception/waiver rules (approver, duration, compensating controls), and the audit model: what is recorded, where, and how it is verified. Operationalize the policy with environment-aware enforcement (e.g., masking in lower environments), CI/CD policy checks, and per-release evidence mapped to GDPR/HIPAA/NIS2 obligations. Keep procedures such as detector dictionaries and runbooks in living documents rather than in the policy itself.
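One way to make "controls per level" and time-boxed waivers concrete is to hold them as data. The following is a minimal sketch under stated assumptions: the control matrix values, the `Waiver` fields, and the `controls_due` helper are all hypothetical examples, not a recommended baseline.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Set

# Illustrative control matrix; the values are assumptions, not a recommendation.
REQUIRED_CONTROLS = {
    "High": {"encryption", "masking", "access_logging"},
    "Medium": {"encryption", "masking"},
    "Low": {"encryption"},
}


@dataclass(frozen=True)
class Waiver:
    asset: str
    control: str
    approver: str
    expires: date               # waivers are time-boxed
    compensating_control: str


def controls_due(asset: str, level: str, waivers: Iterable[Waiver], today: date) -> Set[str]:
    """Controls still required for an asset after removing unexpired waivers."""
    waived = {
        w.control
        for w in waivers
        if w.asset == asset and today <= w.expires
    }
    return REQUIRED_CONTROLS[level] - waived
```

Because each waiver carries an approver and an expiry date, an expired waiver simply stops suppressing the control without any manual cleanup.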
Data Classification Process
Discovery & labeling
Run automated discovery to detect PII/PHI/PCI and tag assets, apply rules that map detections to High/Medium/Low labels, have data stewards adjudicate edge cases, and propagate labels through lineage.
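The detection-to-label step can be sketched with a few regular expressions. This is a deliberately simplified illustration: the patterns, the `DETECTION_LEVEL` mapping, and `classify_column` are assumptions made for the example, and production detectors would need validation (for instance checksum tests for card numbers) and steward review.

```python
import re

# Simplified, hypothetical detectors; real ones need validation and tuning.
DETECTORS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card_number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

# Assumed rule set: which level each detection implies.
DETECTION_LEVEL = {"email": "Medium", "us_ssn": "High", "card_number": "High"}
RANK = {"Low": 0, "Medium": 1, "High": 2}


def classify_column(samples):
    """Return (level, detections) for a list of sampled string values."""
    hits = {
        name
        for name, pattern in DETECTORS.items()
        for value in samples
        if pattern.search(value)
    }
    # The column takes the highest level among its detections, else Low.
    level = max(
        (DETECTION_LEVEL[h] for h in hits),
        key=RANK.__getitem__,
        default="Low",
    )
    return level, hits
```

A column with no detections falls back to Low here; a stricter policy might route such columns to a steward instead of labeling them automatically.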
Enforcement
Use labels to drive RBAC/ABAC, apply deterministic masking/tokenization in lower environments for High and Medium data as the policy requires, encrypt everywhere, and redact or tokenize data in exports, BI views, and APIs.
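Deterministic masking means the same input always yields the same masked value, so joins across tables still line up. One common way to get that property is a keyed hash; the sketch below uses HMAC-SHA256 from the Python standard library. It is an assumption about technique, not a description of any particular tool, and it is one-way (pseudonymization) rather than reversible tokenization. The `tokenize` helper and the key handling are hypothetical.

```python
import hashlib
import hmac


def tokenize(value: str, key: bytes, length: int = 16) -> str:
    """Map a value to a stable pseudonym using a secret key.

    The same (value, key) pair always produces the same token, which
    preserves referential integrity across tables masked with that key.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


# Example: the same customer id in two tables masks to the same token.
key = b"example-key-load-from-a-secret-store"  # placeholder, never hard-code
orders_customer = tokenize("customer-1001", key)
invoices_customer = tokenize("customer-1001", key)
```

Truncating the digest shortens tokens at the cost of a higher collision chance, so the length should be chosen with the size of the dataset in mind.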
Evidence
Record label changes, policy evaluations, and control executions as tamper-evident logs attached to each release. This shortens audit preparation and supports continuous compliance.
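A simple way to make a log tamper-evident is to chain entries by hash, so that altering any earlier entry invalidates every hash after it. The following is a minimal sketch of that idea; the entry layout and the `append_event` and `verify` helpers are hypothetical, and a real deployment would also need to protect the latest hash (for example by signing it or storing it externally).

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash: str, event: dict) -> str:
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_event(log: list, event: dict) -> None:
    """Append an event whose hash covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": _entry_hash(prev_hash, event)})


def verify(log: list) -> bool:
    """Recompute the chain and report whether every entry is intact."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Each event could carry the release identifier, the asset, the label, and the control that ran, which is the information an auditor typically asks for.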
Tooling capabilities that support this process:
- High-quality detectors (PII/PHI/PCI) with custom patterns/dictionaries.
- Deterministic masking/tokenization that preserves referential integrity across joins.
- Rule engine: detections → High/Medium/Low → control actions.
- APIs & CI/CD gates for automated checks before promotion.
- Lineage & propagation across pipelines and sinks.
- Audit-ready reports per release/domain/regulation.
- Role-based access and clean segregation by environment.
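To illustrate the CI/CD gate item in the list above, a promotion check can be reduced to a function that inspects asset metadata and returns a list of violations. This is a sketch under assumptions: the asset record shape (`name`, `label`, `masked`) and the `MASK_REQUIRED` set are hypothetical, and a real gate would read them from whatever catalog or API the team uses.

```python
# Labels that, in this example policy, require masking before promotion.
MASK_REQUIRED = {"Restricted", "Confidential"}


def gate_violations(assets):
    """Return human-readable problems that should block promotion."""
    problems = []
    for asset in assets:
        label = asset.get("label")
        if label is None:
            problems.append(f"{asset['name']}: unclassified")
        elif label in MASK_REQUIRED and not asset.get("masked", False):
            problems.append(f"{asset['name']}: masking has not run")
    return problems


# Example catalog snapshot for one release candidate.
catalog = [
    {"name": "crm.customers", "label": "Restricted", "masked": True},
    {"name": "crm.notes", "label": None},
    {"name": "billing.cards", "label": "Confidential", "masked": False},
    {"name": "web.pages", "label": "Public"},
]
violations = gate_violations(catalog)
```

In a pipeline, the calling script would exit with a non-zero status when `violations` is non-empty, so the promotion step fails until the assets are classified and protected.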
How Gigantics Operationalizes Data Classification Across All Environments
Gigantics makes data classification actionable across production, staging, development, analytics, and backups by turning labels into concrete, repeatable actions—without adding process overhead.
- Discovery & labeling: detect PII/PHI/PCI patterns and apply consistent labels (e.g., Public, Internal, Confidential, Restricted) to columns and datasets.
- Label-driven protection: enforce deterministic masking/tokenization (with referential integrity preserved) for labeled data in lower environments; keep encryption/monitoring aligned to policy.
- CI/CD checks: use API-based data gates to verify that assets are classified and required protections ran before promotion.
- Evidence: generate audit artifacts that link classification results to the protections applied for each run or release.
Result: classification that is enforced the same way in every environment, with clear proof of what was protected and when.