Designing effective data security schema

Security stack tools generate vast volumes of log data, and each log has its own data definitions and field names. Analyzing an individual dataset is relatively straightforward, but correlating data across multiple logs is a significant challenge, primarily because field names and data formats differ from one log to the next. To illustrate the scale and complexity involved: enterprise organizations collect logs from sources such as network devices, servers, applications, and user activity. These logs can reach petabytes or even exabytes in size, and they often contain a wide variety of fields with varying definitions and data types.

To address this issue, field normalization has emerged as a crucial step in enabling effective correlation. Developing a standardized security data schema is one of the important steps toward identifying misconfigurations, attack paths, and vulnerabilities across systems. In this presentation, we will discuss the importance of field normalization and demonstrate a tactical method for addressing the data schema problem effectively. Field normalization is an industry-standard method for homogenizing the name, definition, and data type of a given field across different datasets. When investigating security incidents, analysts must examine data from multiple logs to quickly identify the root cause of an attack, so having fields normalized across those logs is critical. Field normalization facilitates better data integration and improves correlation, productivity, and portability while reducing data errors.

Our team has developed an extensible, modular framework for defining variables within the data schema. We researched several security data sources and identified the key categories of security challenges that enterprises face, ranging from cloud misconfigurations and container security issues to network attack patterns and identity misuse. Once we had pinpointed these key security areas, we designed base classes (logical constructs) tailored to each of them. These base classes serve as building blocks, each evolving to encompass multiple objects (that is, variables or column names). The flexibility of this approach allows logs to be transformed seamlessly and adapted to different data lakes and SIEM solutions.
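
As a rough illustration of the idea, the sketch below shows one way a base class and per-source field mappings could be combined to normalize field names and data types. It is not the framework described in this session: the class name `NetworkEvent`, the source names `firewall` and `proxy`, and every raw field name are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional, Tuple


@dataclass
class NetworkEvent:
    """Hypothetical base class for network activity (illustrative fields only)."""
    src_ip: Optional[str] = None
    dst_ip: Optional[str] = None
    dst_port: Optional[int] = None
    user: Optional[str] = None


# Per-source mapping: raw field name -> (normalized field name, type converter).
# The source names and raw field names are invented for this example.
FIELD_MAPS: Dict[str, Dict[str, Tuple[str, Callable[[Any], Any]]]] = {
    "firewall": {
        "srcaddr": ("src_ip", str),
        "dstaddr": ("dst_ip", str),
        "dport": ("dst_port", int),
    },
    "proxy": {
        "client_ip": ("src_ip", str),
        "server_ip": ("dst_ip", str),
        "port": ("dst_port", int),
        "username": ("user", str),
    },
}


def normalize(source: str, raw: Dict[str, Any]) -> NetworkEvent:
    """Rename and retype one raw record into the shared schema.

    Raw fields without a mapping are dropped in this sketch.
    """
    mapping = FIELD_MAPS[source]
    normalized: Dict[str, Any] = {}
    for raw_name, value in raw.items():
        if raw_name in mapping:
            name, convert = mapping[raw_name]
            normalized[name] = convert(value)
    return NetworkEvent(**normalized)


firewall_event = normalize(
    "firewall", {"srcaddr": "10.0.0.5", "dstaddr": "10.0.0.9", "dport": "443"}
)
proxy_event = normalize(
    "proxy",
    {"client_ip": "10.0.0.5", "server_ip": "10.0.0.9", "port": 443, "username": "alice"},
)

# Once normalized, records from both sources can be compared on the same keys.
same_flow = (firewall_event.src_ip, firewall_event.dst_ip, firewall_event.dst_port) == (
    proxy_event.src_ip,
    proxy_event.dst_ip,
    proxy_event.dst_port,
)
print(same_flow)
```

In this sketch, supporting a new log source means adding one entry to `FIELD_MAPS`, and covering a new security area means adding another base class, which is one plausible reading of the extensibility described above.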

Attendees will come away with an understanding of the complexity of working with security logs and an effective approach to defining a data schema that helps identify complex attack patterns quickly.

Sai kiran Uppu

Cloud Security Researcher

San Jose, California, United States
