CUSTOMER STORY

Building a first-class data management environment to increase efficiency

NOL UNIVERSE achieved self-service data ops with Databricks in two months

66% reduction in time to complete batch aggregation

77% decrease in batch computing costs

27% increase in data availability time

Product descriptions:

Databricks SQL, Delta Sharing, Unity Catalog

NOL UNIVERSE, a comprehensive platform spanning domestic and international travel, leisure and culture, brings customers a variety of everyday pleasures. With three brands, NOL, NOL Interpark Tours and NOL Tickets, the travel company offers a wide range of services and experiences, from flights, accommodations and transportation to performances, tours and exhibitions. NOL Tickets holds the dominant market share as the largest performance booking platform in Korea, while NOL, which started as Yanolja, remains the leading domestic accommodation booking platform. NOL UNIVERSE is highly competitive thanks to their diverse service categories and advanced technology, which makes efficient data management across multiple brands essential. However, the existing data system was fragmented across multiple solutions and could not provide a suitable operating platform for the growing volume of data. NOL UNIVERSE therefore adopted the Databricks Platform as an integrated solution. Databricks has maximized data management efficiency and increased business agility by empowering employees across the company to use data autonomously.

Siloed legacy data systems were inefficient for integrated data management

NOL UNIVERSE’s mission is “letting everyone play at ease.” In the travel and leisure industry, it’s critical to react quickly to real-world situations as they change and to provide accurate guidance to the affected customers. Efficient data management is a core competency for NOL UNIVERSE, especially because analyzing the impact of various events on the business and formulating strategies in response both depend on data. Unfortunately, the existing data environment was spread across various systems, which severely limited its scalability and manageability.

The existing system relied on multiple databases, including MySQL and SQL Server, along with Hive and Spark distributed processing engines based on Amazon EMR to collect, cleanse and analyze data. A combination of Apache Airflow and Apache NiFi was used to schedule workflows and manage pipelines, and the processed data was stored on Amazon S3. For data analysis and visualization, the company used several BI tools, including Redash and Tableau.

As the services stayed in operation longer, the volume of accumulated data, and with it the batch processing time, increased significantly. In addition, because the data processing engine and the scheduling system were separate, operations were inefficient: both tracing the cause of a batch failure and responding to it were delayed.

Different cloud accounts and permission schemes for each service also made integrated management harder. As the number of partitions in Amazon S3 grew, so did the burden of managing metastores. As NOL UNIVERSE’s data engineers reflected, “The dispersed tools and the speed and reliability issues of data retrieval caused great frustration for data users.”

Additionally, as organizations merged and services expanded, NOL UNIVERSE’s data engineers wanted to analyze data generated by different systems within a single account scheme and platform. Because the data team was fully responsible for all data products, business users’ reliance on the team became a bottleneck during peak workloads. The engineers sought a self-service batch pipeline structure that would let each domain take ownership of its data operations.

To address the limitations of this legacy environment, the team considered scalability, integrated management, user autonomy and operational and cost efficiency. They ultimately chose the most suitable solution: the Databricks Data Intelligence Platform.

Encouraging autonomous data utilization with the Databricks Platform

NOL UNIVERSE used Databricks to create an integrated data environment. The team fully migrated the existing data lakehouse, converting approximately 2,500 Hive-based queries to the Databricks environment, and also migrated the data pipelines built on Apache Airflow and Apache NiFi. The company consolidated their disparate BI solutions into one platform to create a consistent analytics environment. As a result, the whole process, from data processing to analytics to workflow operations, can now be managed on a single platform.

The vast amount of data is now managed centrally and consistently, and each domain (flights, hotels, tickets, T&A, packages and others) has a self-service analytics environment for using the data autonomously, with full governance. Data users can now independently collect, analyze and serve data to build dashboards and make informed business decisions without relying on the data team. Recently, the customer service team used the machine learning capabilities of Databricks to automate the categorization of CS consultation history, which greatly increased the team’s efficiency.

After the organizations merged, the team integrated data from each domain in just one day by leveraging Delta Sharing. Allocating computing resources by policy for each type of workload, such as dashboards, queries and notebooks, ensured smooth, conflict-free access to data.

With the migration to the Databricks Platform, query performance was optimized using Databricks’ high-performance features, which dramatically reduced batch times and significantly increased overall data productivity. An integrated security and governance framework was also built: integration with the internal SSO system unified user authentication, and Unity Catalog established a consistent scheme for permission and metadata management, making data governance more efficient to operate.

NOL UNIVERSE also used cluster autoscaling, which expands and shrinks clusters automatically as workloads change, to maximize resource utilization. Cost visibility based on Databricks Units (DBUs) has enabled the team to track data costs at both the user and organization levels, significantly improving cost efficiency.
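As a rough illustration of what tracking cost at the user and organization levels can look like, the sketch below rolls up DBU usage records by user and by team. The record fields, the sample values and the flat price per DBU are all hypothetical; they are not taken from NOL UNIVERSE’s environment or from any Databricks API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    """One hypothetical usage row: who consumed how many DBUs."""
    user: str
    team: str
    dbus: float


def cost_by(records, key, price_per_dbu):
    """Sum DBU cost per group; `key` is the attribute to group on ("user" or "team")."""
    totals = defaultdict(float)
    for record in records:
        totals[getattr(record, key)] += record.dbus * price_per_dbu
    return dict(totals)


# Illustrative data only: names, quantities and the price are made up.
SAMPLE_RECORDS = [
    UsageRecord(user="alice", team="flights", dbus=10.0),
    UsageRecord(user="bob", team="flights", dbus=4.0),
    UsageRecord(user="alice", team="flights", dbus=6.0),
    UsageRecord(user="carol", team="tickets", dbus=8.0),
]
PRICE_PER_DBU = 0.5  # assumed flat rate for the example

cost_per_user = cost_by(SAMPLE_RECORDS, "user", PRICE_PER_DBU)
cost_per_team = cost_by(SAMPLE_RECORDS, "team", PRICE_PER_DBU)
```

The same grouping function serves both views, so adding another level (for example, per brand) would only require another attribute on the record.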

The team built a flexible infrastructure architecture that reflects security requirements and internal operational policies, so technical and nontechnical users alike can work in an environment that offers scalable resources within a tight governance framework. As NOL UNIVERSE’s data engineers explained, “It struck a balance between security and scalability, while leveraging the expertise of our data teams and cloud teams to create a foundation for building the infrastructure quickly and reliably.”

Transitioning to a new unified data experience in just two months

NOL UNIVERSE successfully built their new integrated data environment in just two months. In August 2024, the team ran a PoC (proof of concept) with the entire operations table, followed by an intensive transition and validation process from September through November. As a result, a stable integrated data operations environment was introduced within a short timeframe.

One key strategy that enabled this rapid transition was working with the internal cloud team to automate resource deployment. This allowed NOL UNIVERSE to quickly build an operational environment while laying the groundwork to flexibly respond to structural changes, such as mergers and reorganizations. Furthermore, data consistency was ensured by templating data validation tasks and standardizing the validation criteria across the people performing them.
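A templated validation task of the kind mentioned above might look like the following minimal sketch. It assumes each check compares a legacy table with its migrated counterpart on row count and on an order-independent content fingerprint. The function names, the check criteria and the in-memory sample rows are illustrative assumptions, not NOL UNIVERSE’s actual templates.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationResult:
    """Outcome of one templated check for a single table."""
    table: str
    row_count_match: bool
    content_match: bool

    @property
    def passed(self):
        return self.row_count_match and self.content_match


def fingerprint(rows):
    """Order-independent fingerprint: hash each row, sort the digests, hash the result."""
    digests = sorted(
        hashlib.sha256(repr(tuple(row)).encode("utf-8")).hexdigest() for row in rows
    )
    return hashlib.sha256("".join(digests).encode("utf-8")).hexdigest()


def validate_table(table, legacy_rows, migrated_rows):
    """Apply the same two criteria to every table so results are comparable."""
    legacy_rows = list(legacy_rows)
    migrated_rows = list(migrated_rows)
    return ValidationResult(
        table=table,
        row_count_match=len(legacy_rows) == len(migrated_rows),
        content_match=fingerprint(legacy_rows) == fingerprint(migrated_rows),
    )


# Hypothetical sample rows standing in for query results from each system.
legacy = [(1, "seoul", 120), (2, "busan", 80)]
migrated_ok = [(2, "busan", 80), (1, "seoul", 120)]   # same rows, different order
migrated_bad = [(1, "seoul", 120), (2, "busan", 81)]  # one value differs

ok_result = validate_table("bookings", legacy, migrated_ok)
bad_result = validate_table("bookings", legacy, migrated_bad)
```

Because every table goes through the same function with the same criteria, whoever runs the check gets a result that can be compared directly with anyone else’s.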

The adoption of Databricks has also brought positive and measurable data processing results. Specifically, based on a day’s worth of EMR computing resource usage, the batch processing time was reduced by approximately 84%, while the cost was reduced by approximately 48%. In the case of the customer service team’s categorization system automation, a task that used to take over a day can now be completed in less than two hours. “We’ve successfully implemented an advanced integrated data environment while minimizing technical complexity and operational risk in a short period. The base has been built for further enhancing data-driven decision-making and the efficiency of general service operations,” the data engineers said.

Looking ahead, NOL UNIVERSE plans to use more Databricks capabilities to continue modernizing their data environment and improving cost efficiency. By collecting GA4 data in real time through the Lakeflow Connector, they aim to deliver data to users faster. Converting the existing tables to Delta tables is expected to save more than 20% on average in operational time and cost. Introducing a self-service pipeline structure will give each domain an independent data operation system, further enhancing data productivity and flexibility through a Data Mesh architecture.

NOL UNIVERSE is looking forward to continuing collaboration with Databricks to further advance their data environment and drive creative initiatives.
