PII Detection at Scale on the Lakehouse
SEEK is Australia’s largest online employment marketplace and a market leader spanning ten countries across Asia Pacific and Latin America. SEEK provides employment opportunities for roughly 16 million monthly active users and processes 25 million candidate applications to listings. Processing millions of resumes means handling and managing highly sensitive candidate information, usually entered in a highly unstructured format. With recent high-profile data leaks in Australia, protection of personally identifiable information (PII) has become a major focus area for large digital organizations.
The first step is detection, and SEEK has developed a custom framework built on HuggingFace transformers that are fine-tuned to capture nuances around employment. For example, “Software Engineer at Databricks” is not PII, but “CEO at Databricks” is PII. After identifying and anonymizing PII in stream and batch data, SEEK uses Unity Catalog’s data lineage to track PII through their reporting, ETL, and other downstream ML use-cases and to govern access control. The result is an organization-wide data management capability driven by deep learning and enforced using Databricks.
Talk by: Ajmal Aziz and Rachael Straiton
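The anonymization step described above can be illustrated with a minimal sketch. It is not SEEK's framework: the detector itself is omitted, and the spans are hand-written in the shape that a HuggingFace token-classification pipeline returns when aggregation is enabled (dictionaries with `entity_group`, `score`, `start` and `end`). The function name `anonymize`, the `min_score` threshold, and the labels `PERSON`, `JOB_TITLE` and `EMAIL` are assumptions made for illustration only.

```python
from typing import Dict, List


def anonymize(text: str, spans: List[Dict], min_score: float = 0.5) -> str:
    """Replace each sufficiently confident span with a <LABEL> placeholder."""
    # Drop low-confidence detections (threshold is a hypothetical parameter).
    kept = [s for s in spans if s["score"] >= min_score]
    # Replace from right to left so earlier character offsets stay valid.
    for span in sorted(kept, key=lambda s: s["start"], reverse=True):
        placeholder = f"<{span['entity_group']}>"
        text = text[: span["start"]] + placeholder + text[span["end"]:]
    return text


# Hypothetical detector output for one resume line (offsets are character indices).
resume_line = "Jane Doe, CEO at Databricks, jane@example.com"
detected = [
    {"entity_group": "PERSON", "score": 0.99, "start": 0, "end": 8},
    {"entity_group": "JOB_TITLE", "score": 0.91, "start": 10, "end": 27},
    {"entity_group": "EMAIL", "score": 0.97, "start": 29, "end": 45},
]

print(anonymize(resume_line, detected))
```

In this sketch the title “CEO at Databricks” is masked only because the hypothetical detector flagged it; a model fine-tuned as the abstract describes would leave a title such as “Software Engineer at Databricks” unflagged.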