Session
Data anonymization using Cognitive Search and custom Deep Learning models
Society creates over 2.5 quintillion bytes of new data every day. This has brought new opportunities to create a positive impact on the world around us, especially in AI, which benefits from large amounts of data. By combining AI Knowledge Mining techniques, we can now provide solutions for any sector in which we can deeply understand the data, explore it, and uncover insights.
However, these benefits do not come without risk. Governments, corporations, and research institutes continue to gather massive amounts of data that contain personal information. In the wrong hands, this information could harm individuals. Thus, great efforts are being made to remove personally identifiable information from the data. In some cases, new legal requirements to anonymize data have even emerged. Unfortunately, many attempts to anonymize data are vulnerable to reidentification tactics, especially when multiple data sources contain overlapping information.
This raises the following question: How can we provide a Knowledge Mining solution without compromising data privacy? In our session, we will address this question.
We will present an end-to-end solution for multilingual Cognitive Search using an Azure Search pipeline with NLP Deep Learning models. We will use a BERT Transformer model as a custom skill to integrate data anonymization seamlessly into the pipeline and thus overcome the privacy challenge described above.
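To give a rough idea of how an anonymization step can plug into such a pipeline as a custom skill, here is a minimal sketch in Python. It is not the speaker's implementation. It assumes the skill is called with the JSON envelope used by Azure custom Web API skills (a `values` list of records, each with a `recordId` and a `data` object). The input field `text`, the output field `anonymizedText`, and the function names are hypothetical, and a simple regex masker stands in for the BERT model so that the example stays self-contained.

```python
import re
from typing import Callable

# Stand-in for the BERT model: two simple patterns (an assumption for
# illustration only; a real skill would call a trained NER model here).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s-]{7,}\d")


def mask_pii(text: str) -> str:
    """Replace e-mail addresses and phone-like numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


def run_skill(body: dict, anonymize: Callable[[str], str] = mask_pii) -> dict:
    """Process a custom-skill request and build the matching response.

    Each input record yields one output record with the same recordId.
    The field names "text" and "anonymizedText" are hypothetical.
    """
    results = []
    for record in body.get("values", []):
        out = {
            "recordId": record.get("recordId"),
            "data": {},
            "errors": [],
            "warnings": [],
        }
        text = record.get("data", {}).get("text")
        if isinstance(text, str):
            out["data"]["anonymizedText"] = anonymize(text)
        else:
            out["errors"].append({"message": "Missing or non-string 'text' input."})
        results.append(out)
    return {"values": results}
```

Passing the anonymizer as a parameter means the regex stand-in could be swapped for a model-backed function without changing the request handling.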
Rodrigo Cabello
Principal AI Research Engineer at Plain Concepts and Microsoft MVP in Artificial Intelligence
Granada, Spain