The core tenet of cloud data lake security is that data must be protected both at rest and in transit. This should be part of a data lake security strategy that takes both current and new security threats into account. Any serious protection strategy needs to consider data storage, access controls, external interfaces, and physical infrastructure.
Data Lake Encryption
End-to-end encryption needs to be the default for cloud data lake security, backed by security processes such as customer-managed keys. Many data lakes do not provide this kind of always-on protection, as many highly publicized data breaches have shown.
Encryption Keys
To fully protect data, the encryption keys themselves also need protection. Stronger data lake platforms use AES 256-bit encryption with a hierarchical key model rooted in a dedicated hardware security module. Data encryption and key management should be transparent to users and should not degrade performance.
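To make the idea of a hierarchical key model concrete, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not any vendor's implementation: the level names (account, table, file), the labels, and the `derive_key` helper are hypothetical, and the sketch derives child keys with HMAC-SHA256 instead of wrapping (encrypting) them with the parent key, which is what production systems backed by a hardware security module typically do.

```python
import hashlib
import hmac
import secrets


def derive_key(parent_key: bytes, label: str) -> bytes:
    """Derive a 256-bit child key from a parent key and a label using HMAC-SHA256."""
    return hmac.new(parent_key, label.encode("utf-8"), hashlib.sha256).digest()


# Hypothetical root key. In the model described above it would be generated
# and held inside a hardware security module, never in application memory.
root_key = secrets.token_bytes(32)

# Each level depends only on the key directly above it.
account_key = derive_key(root_key, "account:acme")
table_key = derive_key(account_key, "table:orders")
file_key = derive_key(table_key, "file:part-0001")
```

The point of the hierarchy is that each key protects only a narrow scope: a leaked file-level key exposes one file, and changing a key higher up changes every key beneath it.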
Security Updates and Logging
Security updates should be applied automatically across all relevant components of the data lake as soon as they are made available. Cloud vendors should also perform regular penetration testing to find security weaknesses.
Access Control
For authentication, make sure your connections to the cloud provider leverage standard security technologies such as:
Transport Layer Security (TLS) 1.2 and IP whitelisting
Support for the SAML 2.0 standard for password and user role security
Multifactor authentication (MFA) to prevent the use of stolen credentials
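As a rough illustration of the first item in the list above, the sketch below shows how a client could refuse connections below TLS 1.2 and how a service could check a caller against an IP whitelist, using only Python's standard `ssl` and `ipaddress` modules. The network ranges and the names `ALLOWED_NETWORKS`, `make_tls_context`, and `is_allowed` are assumptions made for the example. In practice these controls are usually configured in the cloud provider's network policy settings and not written by hand.

```python
import ipaddress
import ssl

# Hypothetical whitelist (documentation-only address ranges); replace with
# your organization's real network ranges.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def make_tls_context() -> ssl.SSLContext:
    """Client context that verifies certificates and rejects anything below TLS 1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def is_allowed(client_ip: str) -> bool:
    """Return True if client_ip falls inside one of the whitelisted networks."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in ALLOWED_NETWORKS)
```

For example, `is_allowed("203.0.113.7")` returns `True` with the ranges above, while `is_allowed("192.0.2.1")` returns `False`.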
Compliance and Attestations
Industry-standard attestation reports can verify that cloud vendors use appropriate security controls. This could include anything from FedRAMP certification to HIPAA/Health Information Trust Alliance (HITRUST).
Data Isolation
Isolation from other data lakes is a good tactic for data lakes deployed in a multi-tenant cloud environment. If you need it, make sure your cloud vendor offers this service.
Snowflake and Modern Data Lake Security
Snowflake leverages the most sophisticated cloud security technologies available. Security was built into Snowflake’s platform architecture from the very beginning. Snowflake’s security features are core to the platform, so users can focus on analyzing data and not worry about protecting it.
The platform includes a multitude of features, including dynamic data masking and end-to-end encryption for data in transit and at rest.
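The general idea behind dynamic data masking is that the stored value stays the same while the value returned depends on who is asking. The sketch below illustrates that idea in plain Python. It is not Snowflake's masking policy syntax or implementation; the role name `ANALYST_PII` and the `mask_email` function are hypothetical.

```python
# Hypothetical set of roles allowed to see unmasked values.
PRIVILEGED_ROLES = frozenset({"ANALYST_PII"})


def mask_email(email: str, role: str) -> str:
    """Return the email unchanged for privileged roles, otherwise hide the local part."""
    if role in PRIVILEGED_ROLES:
        return email
    local, separator, domain = email.partition("@")
    if not separator:
        # Not a well-formed address: mask the whole value.
        return "*" * len(email)
    return "*" * len(local) + "@" + domain
```

With this sketch, `mask_email("jane@example.com", "ANALYST_PII")` returns the full address, while any other role receives `****@example.com`.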
Snowflake’s government deployments have achieved Federal Risk and Authorization Management Program (FedRAMP) Authorization to Operate (ATO) at the Moderate level. In addition, Snowflake has achieved SOC 2 Type 2 and PCI DSS compliance, and it supports HIPAA compliance.