Data Classification
Data classification involves defining and categorizing information based on its type, sensitivity, and value to the organization. This process allows for more effective management, protection, and utilization of data. By identifying data as confidential, sensitive, internal, or public, organizations can implement appropriate security controls, access restrictions, and handling procedures to safeguard confidentiality, integrity, and availability.
Furthermore, classification enhances operational efficiency by making it easier for authorized users to locate and access information while ensuring compliance with regulatory requirements and internal policies. Organizations typically create their own classification models and categories to align with their business objectives, regulatory obligations, and risk tolerance. This enables them to prioritize resources and protect their most critical or valuable information.
Content-Based Classification
This approach examines the actual content of files (payload) to determine sensitivity, often identifying patterns such as credit card numbers or PII (personally identifiable information).
- Techniques: Automated scanning of text using pattern matching (e.g., regular expressions), validation algorithms, or machine learning.
- Pros: Highly accurate because it analyzes data content directly.
- Cons: Can be resource-intensive, since every file's content must be read and scanned.
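As a rough illustration of content-based scanning, the sketch below looks for credit card numbers in text. It assumes a simple regular expression for candidate digit runs and uses the Luhn checksum to filter out random digit strings. The names `CARD_CANDIDATE`, `luhn_valid`, and `contains_card_number` are hypothetical, and a real scanner would cover many more patterns.

```python
import re

# Assumed pattern: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def contains_card_number(text: str) -> bool:
    """Scan the payload itself for digit runs that look like card numbers."""
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            return True
    return False
```

A file for which `contains_card_number` returns `True` could then be given a high-sensitivity label. The checksum step reduces false positives, which is one reason content inspection costs more than looking at metadata alone.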
Context-Based Classification
This approach analyzes the surrounding circumstances (the metadata) rather than the data itself to infer sensitivity.
- Techniques: Evaluates application (e.g., Salesforce, Jira), location (e.g., specific file paths), creator, or time of creation.
- Pros: Fast and efficient, often used in DLP (Data Loss Prevention) tools.
- Cons: May miss sensitive data when the context is misleading (e.g., a sensitive file stored in a public folder).
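A minimal sketch of context-based rules follows. It infers a label from the source application and the file path without ever opening the file. The rule sets `SENSITIVE_APPS` and `SENSITIVE_DIRS`, the `FileContext` type, and the returned label are all assumptions made for illustration; creator and creation time could be handled the same way.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional


@dataclass(frozen=True)
class FileContext:
    application: str  # e.g., "Salesforce" or "Jira"
    path: str         # location of the file


# Hypothetical rules: which applications and directories imply sensitivity.
SENSITIVE_APPS = {"Salesforce"}
SENSITIVE_DIRS = (PurePosixPath("/finance"), PurePosixPath("/hr"))


def classify_by_context(ctx: FileContext) -> Optional[str]:
    """Infer a label from metadata only; return None when no rule applies."""
    if ctx.application in SENSITIVE_APPS:
        return "Confidential"
    parents = PurePosixPath(ctx.path).parents
    if any(directory in parents for directory in SENSITIVE_DIRS):
        return "Confidential"
    return None
```

Because only metadata is checked, a sensitive file saved outside the listed directories returns `None`, which is the weakness noted in the list above.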
User-Based Classification
This approach relies on human judgment, where creators or users manually select a classification label for a file at creation or modification.
- Techniques: Manual tagging prompts that ask the user to classify the data (e.g., Public, Confidential).
- Pros: Highly accurate for understanding business value, as the creator knows the data’s true purpose.
- Cons: Subjective, inconsistent, and prone to user error or negligence.
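One way to limit inconsistency in manual tagging is to accept only labels from a fixed list. The sketch below assumes a hypothetical `ALLOWED_LABELS` tuple and a `normalize_label` helper that validates whatever the user typed or selected.

```python
# Hypothetical label set; an organization would substitute its own scheme.
ALLOWED_LABELS = ("Public", "Sensitive", "Confidential", "Restricted")


def normalize_label(raw: str) -> str:
    """Map user input to a canonical label, or reject it."""
    cleaned = raw.strip().casefold()
    for label in ALLOWED_LABELS:
        if label.casefold() == cleaned:
            return label
    raise ValueError(f"Unknown classification label: {raw!r}")
```

Validation of this kind catches typos and free-text labels, but it cannot tell whether the user picked the right label for the data.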
Military Classification Scheme
- Top Secret
  - Data requires the highest degree of protection; its disclosure would cause exceptionally grave damage to national security.
  - Example: policy for conducting intelligence
- Secret
  - Disclosure would cause serious damage to national security.
  - Example: indications of weakness
- Confidential
  - Disclosure would cause damage to national security.
  - Example: intelligence reports
- Sensitive
  - Data is not classified, but its disclosure would cause limited damage to national security.
  - Example markings:
    - For Official Use Only (FOUO)
    - Limited Official Use (LOU)
    - Official Use Only (OUO)
- Unclassified
  - Data is not classified and is not sensitive.
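The levels above form a strict ordering, which can be modeled as an ordered enumeration. The sketch below is one possible encoding. The numeric values, the `MilitaryLevel` name, and the `may_access` check (a clearance must be at least as high as the data's level) are assumptions for illustration, not part of the scheme itself.

```python
from enum import IntEnum


class MilitaryLevel(IntEnum):
    # Values are arbitrary; only their relative order matters.
    UNCLASSIFIED = 0
    SENSITIVE = 1
    CONFIDENTIAL = 2
    SECRET = 3
    TOP_SECRET = 4


def may_access(clearance: MilitaryLevel, data_level: MilitaryLevel) -> bool:
    """Assumed rule: a reader needs a clearance at or above the data's level."""
    return clearance >= data_level


def combined_level(levels: list[MilitaryLevel]) -> MilitaryLevel:
    """Assumed rule: a document combining several items takes the highest level."""
    return max(levels)
```

For example, `may_access(MilitaryLevel.SECRET, MilitaryLevel.CONFIDENTIAL)` is true, while a `CONFIDENTIAL` clearance does not satisfy a `TOP_SECRET` requirement.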
Commercial Classification Scheme
- Restricted
  - Highly sensitive data; access is restricted to specific individuals or authorized third parties. Disclosure would cause permanent damage.
  - Examples:
    - Social Security numbers (SSNs)
    - Credit card numbers
    - Criminal records
    - Medical information
    - Biometric data
- Confidential
  - Sensitive data shared within a team; disclosure would harm the organization's operations.
  - Examples:
    - Vendor contracts
    - Employee salaries
    - Names, addresses, and dates
- Sensitive
  - Low-sensitivity data shared organization-wide that must not be disclosed outside the organization.
  - Examples:
    - Internal policies
    - Internal user guides
    - Organizational charts
    - Project documents
- Public
  - Information that can be disclosed to anyone.
  - Examples:
    - Public API documents
    - Job titles and names
    - Open API data
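To show how such a scheme might be applied, the sketch below maps a few of the example data types above to their labels and derives a label for a record that mixes several fields. The field keys, the `FIELD_LABELS` table, and the rule that a record inherits its most restrictive field's label are assumptions made for illustration.

```python
# Labels ordered from least to most restrictive.
LABEL_ORDER = ("Public", "Sensitive", "Confidential", "Restricted")

# Hypothetical field names mapped to labels, following the examples above.
FIELD_LABELS = {
    "job_title": "Public",
    "org_chart": "Sensitive",
    "internal_policy": "Sensitive",
    "salary": "Confidential",
    "vendor_contract": "Confidential",
    "ssn": "Restricted",
    "credit_card": "Restricted",
}


def label_for_fields(fields: list[str]) -> str:
    """Return the most restrictive label among the given fields.

    Raises KeyError for a field that has no entry in FIELD_LABELS.
    """
    labels = [FIELD_LABELS[name] for name in fields]
    return max(labels, key=LABEL_ORDER.index)
```

Under these assumptions, a record holding `job_title` and `salary` would be labeled Confidential, and adding `ssn` would raise it to Restricted.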