In today's environment, proactive cybersecurity is crucial to any public sector agency. For many organizations, the log data that security professionals need for effective threat monitoring and incident response is not readily accessible in one place, or it lives in siloed departments. In some instances, the data is also stored only for short-term operational purposes. This severely limits the ability to manage security effectively, and underscores the need for reliable log retention as well as secure access to critical cyber information.
Federal mandates require agencies to retain information system logs over a multi-year period to support the detection, investigation, and remediation of cyber incidents. This creates multiple challenges for agencies to navigate. First, storing massive log volumes can be costly, particularly in relatively high-cost on-premises or proprietary storage. Second, transferring large volumes of data to a single monolithic repository to provide centralized access can also be expensive and can result in data duplication across multiple environments. In short, these mandates significantly increase data management and cybersecurity demands on federal organizations.
Deloitte's Cyber Data Optimization solution looks to address these challenges by employing a hub-and-spoke model on the Databricks Data Intelligence Platform. A central analytics "Lakehouse Hub" coordinates with enterprise clouds and source systems, the "Nodes", to establish a centralized analytics layer for log data. Data is retained in low-cost cloud storage at the nodes and accessible by centralized queries from the hub, avoiding transfer of raw data across cloud boundaries. This multi-node, federated model allows data to be securely shared from individual nodes to the central hub, enabling comprehensive log access to address potential cyber threats more efficiently. This approach allows organizations to navigate the changing cyber landscape more effectively while avoiding costly data storage and egress.
Federal compliance requires that organizations not only collect an extensive list of system logs for an extended retention period, but also ensure comprehensive data visibility to support cybersecurity operations. At this scale, log data volumes can be technically and financially unsustainable for many organizations using their current tools.
Deloitte’s Cyber Data Optimization solution addresses these cost and scale challenges by leveraging low-cost cloud storage, reducing the need for expensive data indexing in proprietary systems. This is particularly impactful for high-volume telemetry data that is growing to petabyte scale.
The federated model provides centralized access and visibility to remote data distributed across the organization. Security operations center (SOC) analysts can then compile, search, and run advanced analytics on log data, enabling rapid response in cyber investigations that require significant historical data.
The hub-and-spoke architecture manages large-volume log data across multi-cloud environments while eliminating data duplication and reducing data egress. The framework is a federation of Databricks workspaces that uses a distributed medallion data pattern, incrementally improving data quality at each node as data flows from raw to consumption-ready. Nodes are deployed at or near source systems wherever possible. Raw log data is ingested at the node, processed, and made available for the central hub to query. Keeping the source log data at a single node avoids costly raw-data egress across clouds and regions; only curated responses to the hub's federated queries are transferred from node to hub.
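The query flow described above can be illustrated with a small, self-contained Python sketch. It is a conceptual model only, not the solution's implementation and not a Databricks API: the `Node` and `Hub` classes, the node names, and the sample log records are all hypothetical. It shows the one idea that matters here, which is that filtering and column projection happen at the node, so only the curated result rows travel to the hub.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

LogRecord = Dict[str, str]


@dataclass
class Node:
    """Hypothetical spoke: keeps raw logs local and answers queries in place."""
    name: str
    records: List[LogRecord] = field(default_factory=list)

    def query(self, predicate: Callable[[LogRecord], bool],
              columns: List[str]) -> List[LogRecord]:
        # Filter and project locally so raw payloads never leave the node.
        return [
            {**{c: r[c] for c in columns}, "node": self.name}
            for r in self.records
            if predicate(r)
        ]


@dataclass
class Hub:
    """Hypothetical hub: fans a query out to every node and merges the results."""
    nodes: List[Node]

    def federated_query(self, predicate: Callable[[LogRecord], bool],
                        columns: List[str]) -> List[LogRecord]:
        results: List[LogRecord] = []
        for node in self.nodes:
            results.extend(node.query(predicate, columns))
        return results


# Made-up sample data for two nodes in different environments.
node_a = Node("node-a", [
    {"ts": "2024-01-01T00:00:00Z", "src_ip": "10.0.0.5",
     "action": "login_failed", "raw": "<full raw event>"},
    {"ts": "2024-01-01T00:01:00Z", "src_ip": "10.0.0.9",
     "action": "login_ok", "raw": "<full raw event>"},
])
node_b = Node("node-b", [
    {"ts": "2024-01-01T00:02:00Z", "src_ip": "10.0.0.5",
     "action": "login_failed", "raw": "<full raw event>"},
])

hub = Hub([node_a, node_b])
hits = hub.federated_query(lambda r: r["src_ip"] == "10.0.0.5", ["ts", "action"])
for hit in hits:
    print(hit)
```

In this toy model the `raw` field stays at the node, and the hub receives only the `ts` and `action` columns for matching rows, tagged with the node that produced them.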
Ensuring the right users have the right access to log data is vital. By leveraging the Databricks governance framework, the hub defines and enforces access control rules that associate role-based user pools with collections of log datasets. In cases where more granular access management is needed, dynamic view functions can be constructed for row/column-level permissions or data masking.
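As a rough illustration of the dynamic view approach, the Python sketch below assembles a SQL statement that masks selected columns and filters rows based on group membership. It assumes Unity Catalog and the Databricks SQL function `is_account_group_member`. The catalog, schema, table, column, and group names are hypothetical, and the helper `dynamic_view_sql` is not part of any Databricks library. The code only builds the statement as a string; executing it would require a Databricks workspace, and in practice identifiers should be validated rather than interpolated from untrusted input.

```python
from typing import List


def dynamic_view_sql(view: str, source: str, plain_cols: List[str],
                     masked_cols: List[str], privileged_group: str,
                     row_condition: str) -> str:
    """Build a CREATE VIEW statement with column masking and a row filter.

    Members of `privileged_group` see every row and unmasked values;
    everyone else sees 'REDACTED' in masked columns and only rows
    that satisfy `row_condition`.
    """
    member_check = f"is_account_group_member('{privileged_group}')"
    select_items = list(plain_cols)
    for col in masked_cols:
        select_items.append(
            f"CASE WHEN {member_check} THEN {col} ELSE 'REDACTED' END AS {col}"
        )
    select_clause = ",\n  ".join(select_items)
    return (
        f"CREATE OR REPLACE VIEW {view} AS\n"
        f"SELECT\n  {select_clause}\n"
        f"FROM {source}\n"
        f"WHERE CASE WHEN {member_check} THEN TRUE ELSE {row_condition} END"
    )


# Hypothetical names, chosen only for illustration.
sql = dynamic_view_sql(
    view="cyber.curated.auth_logs_v",
    source="cyber.curated.auth_logs",
    plain_cols=["event_time", "action"],
    masked_cols=["user_name", "src_ip"],
    privileged_group="soc_tier2",
    row_condition="sensitivity = 'low'",
)
print(sql)
```

Granting analysts access to the view rather than the underlying table keeps the masking and filtering logic in one governed place.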
The Cyber Lakehouse integrates with common systems familiar to the organization’s workforce, augmenting the existing toolset while maintaining continuity and accelerating adoption. This eliminates the need for additional training while leveraging the benefits of the Databricks Data Intelligence Platform. With the Cyber Data Optimization solution, several use cases have been exercised such as:
The Cyber Data Optimization solution pairs the deep industry experience of Deloitte with the Databricks Data Intelligence Platform. With Brickbuilder Solutions, you are guaranteed to get:
Deloitte will be at the Databricks Government Forum on December 11. Come meet the team in person and see the Cyber Data Optimization solution in action by registering here.
