With total consolidated assets of US$35 billion at the end of 2020, this commercial banking organization uses a third-party tool to feed large volumes of data in compressed Parquet format into its AWS S3 buckets.
The bank needed to identify sensitive information in these Parquet files, remediate any false positives, and mask the genuinely sensitive data in a way that preserved the format of the original data. It also required a fully automated workflow that could be applied to any new data feeds arriving in the AWS S3 bucket.
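To make the format-preservation requirement concrete, here is a minimal sketch in Python. It is not PK Protect's masking algorithm, which the case study does not describe; the function name `mask_preserving_format` and the keyed-hash approach are assumptions chosen for illustration. Each digit is replaced by a digit and each ASCII letter by a letter of the same case, while separators stay in place, so a masked value keeps the length and layout of the original.

```python
import hashlib
import hmac
import string


def mask_preserving_format(value: str, key: bytes) -> str:
    """Mask a value while keeping its length, separators, and character classes.

    Hypothetical illustration only: a one-way, deterministic mask driven by
    an HMAC of the input. It is not reversible format-preserving encryption.
    """
    # Derive pseudo-random bytes from the secret key and the input value.
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()

    masked = []
    for index, char in enumerate(value):
        byte = digest[index % len(digest)]
        if char in string.digits:
            masked.append(string.digits[byte % 10])
        elif char in string.ascii_uppercase:
            masked.append(string.ascii_uppercase[byte % 26])
        elif char in string.ascii_lowercase:
            masked.append(string.ascii_lowercase[byte % 26])
        else:
            # Separators such as "-" or "@" are kept so the layout survives.
            masked.append(char)
    return "".join(masked)


if __name__ == "__main__":
    # The key here is a placeholder; a real deployment would manage it securely.
    print(mask_preserving_format("123-45-6789", b"example-key"))
```

Because the mask is deterministic for a given key, the same input always yields the same masked value, which keeps joins across masked columns consistent. The trade-off is that it cannot be reversed; a use case that needs to recover the original would call for a reversible scheme instead.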
The bank deployed PK Protect in its AWS environments with a custom policy that defined the required sensitive data. PKWARE used an Amazon EMR cluster to run MapReduce (MR) jobs so that discovery could be performed at scale, and an automated utility provided bulk remediation. The PKWARE team also provided S3 Orchestrator, which made it possible to segregate the scanning and masking tasks by task size for any given S3 bucket folder.
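The idea of segregating work by size can be sketched as follows. This is not S3 Orchestrator's implementation, which the case study does not detail; the `S3Object` type, the function names, and the byte thresholds are all hypothetical. The sketch separates large objects from small ones and then groups the small ones into batches that stay under a size budget.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class S3Object:
    """Hypothetical stand-in for an object listed in an S3 bucket folder."""
    key: str
    size_bytes: int


def split_by_size(
    objects: List[S3Object], threshold_bytes: int
) -> Tuple[List[S3Object], List[S3Object]]:
    """Return (small, large) lists split at an assumed size threshold."""
    small = [obj for obj in objects if obj.size_bytes <= threshold_bytes]
    large = [obj for obj in objects if obj.size_bytes > threshold_bytes]
    return small, large


def batch_by_total_size(
    objects: List[S3Object], max_batch_bytes: int
) -> List[List[S3Object]]:
    """Group objects, largest first, into batches under a size budget.

    An object bigger than the budget still gets a batch of its own.
    """
    batches: List[List[S3Object]] = []
    current: List[S3Object] = []
    current_size = 0
    for obj in sorted(objects, key=lambda o: o.size_bytes, reverse=True):
        # Close the current batch if adding this object would exceed the budget.
        if current and current_size + obj.size_bytes > max_batch_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(obj)
        current_size += obj.size_bytes
    if current:
        batches.append(current)
    return batches
```

In a setup like the one described, the large objects might be routed to cluster jobs while the batches of small objects go to a lighter-weight path, but that routing is an assumption rather than something the case study states.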
After standing up PK Protect, the commercial banking organization had access to an automated script that read new feeds from DynamoDB tables and performed automated masking as required. PK Protect also generated a report after every run, detailing the number of files submitted for masking and the number of files successfully masked. PK Protect enabled the commercial banking organization to scan and mask sensitive data in its AWS environments in a timely manner.