Modern technology for a modern workforce


Faster delivery of new applications to market


“With Databricks Data Intelligence Platform, we can harness billions of real-time labor data points to build upon our vision to build a more efficient global labor market.”

— Mohan Reddy, Co-Founder & Chief Technology Officer, SkyHive

The evolution of the workplace has introduced new levels of complexity in aligning the needs of employers with the right talent. SkyHive helps job seekers and organizations overcome this challenge by providing market intelligence that normalizes global job listings across industries. Its traditional Hadoop infrastructure, however, struggled with the tremendous amounts of data being processed and analyzed, which created scalability and speed issues, limited efficacy, and degraded the end-user experience. To modernize, SkyHive selected the Databricks Data Intelligence Platform over a cloud data warehouse because it provides multicloud support and the ability to unify all its data for analytics and AI on a single platform. With the lakehouse as the infrastructure behind its data intelligence, SkyHive can now democratize labor opportunities, helping organizations build and scale a capable workforce more efficiently.

Lack of a common language for labor markets and skill sets brings complications

The global labor market is more accessible than ever, especially with the rapid increase in remote work opportunities due to the global pandemic. Organizations have a large pool from which to find talent across the globe, but job roles and skill sets are defined differently in varied geographies. There is a myriad of semantics, syntax and job titles in a variety of languages. As a result, there isn’t a common digital work profile to map candidates and skill sets to organizational needs. And while workers have the freedom to work for organizations globally, they may need to refine their skills to match an employer’s requirements and enhance their careers. SkyHive provides labor data and insights to enterprises, workers and communities for better decision-making. Through AI-generated recommendations, it helps job seekers find new job opportunities or training for a career shift. It also helps organizations better plan their talent architecture with data insights to improve hiring and talent retention.

“There’s currently a gap in having a common language of skills that translates around the world,” said Mohan Reddy, Co-Founder & Chief Technology Officer at SkyHive. “However, achieving this level of consistency is not a trivial task with billions of jobs across millions of companies all vying for talent. We help organizations plan for the kind of talent they seek and determine what skills and amount of labor they require to drive the business forward.”

SkyHive dealt with a wide variety of data (e.g., job boards from web pages, images from classified ads, and news clippings) across various formats and languages. This diversity made the data challenging to translate into meaningful insights for both job seekers and employers. For example, an accountant in the U.S. might be called actuarial accounts in India or an accounts payable clerk in Nigeria. These different names must be reconciled and parallelized in real time to make them actionable. But as an ever-expanding labor market caused data volumes to rise, SkyHive quickly outgrew its Hadoop environment: data silos compromised access to the data, scalability was limited, and the supporting infrastructure was too expensive to operate. To provide on-demand access to labor data at scale, the company needed a data architecture that would be easy to manage and scale, provide a common view into the myriad data sources, and support analytics and machine learning to power its platform.
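
The reconciliation of regional job titles described above can be illustrated with a minimal sketch. It is not SkyHive's method: the lookup table, the `CANONICAL_TITLES` name and the `normalize_title` function are hypothetical, and a production system would presumably rely on machine learning models rather than a hand-written dictionary. The only data used is the accountant example from the text.

```python
from typing import Optional

# Hypothetical lookup table built from the example in the text.
# Keys are lowercased regional titles; values are the canonical title.
CANONICAL_TITLES = {
    "accountant": "accountant",              # U.S. title
    "actuarial accounts": "accountant",      # title cited for India
    "accounts payable clerk": "accountant",  # title cited for Nigeria
}


def normalize_title(raw_title: str) -> Optional[str]:
    """Map a raw job title to its canonical form, or None if unknown."""
    # Lowercase and collapse repeated whitespace before the lookup.
    key = " ".join(raw_title.lower().split())
    return CANONICAL_TITLES.get(key)


if __name__ == "__main__":
    for title in ["Accountant", "Actuarial Accounts", "Accounts Payable Clerk"]:
        print(title, "->", normalize_title(title))
```

Titles that are not in the table return `None`, which makes the unresolved cases explicit instead of guessing a match.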

Normalizing data and analytics in real time to drive better workforce decisions

Looking to modernize its underlying technology stack, SkyHive considered a cloud data warehouse and Databricks. After careful evaluation, it selected Databricks Data Intelligence Platform because it met all of its technology needs. “What set Databricks apart from the cloud data warehouse we evaluated in depth was its multicloud support, the ability to create a unified layer for our data via the lakehouse architecture, and the flexibility to support both analytics and AI workloads — from a single platform,” said Reddy.

Databricks helps SkyHive analyze hundreds of millions of documents and labor data sources dynamically and in real time. The entire workflow, from data access to delivery of insights, is now significantly faster, allowing the data team to take on work that was previously out of reach. Using Delta Lake and Delta Live Tables, the team removed data engineering complexity, making it easy to build fast and reliable ETL pipelines that deliver high-quality data to end consumers. With Delta Sharing, they can provide analysts and business teams with secure, instant access to real-time data. Those teams can then generate reports via Tableau and Power BI to better understand customers and the overall supply and demand for various types of skills and talent. With data processed and prepared for action, SkyHive’s data science team uses MLflow to orchestrate hundreds of machine learning models designed to serve workforce recommendations.

“Through the lakehouse architecture, we can now bring all the data together representing any facet of a labor market into a powerful knowledge graph with multiple dimensions such as jobs, skills, geographies, compensation and more,” explained Reddy. “With the world’s labor data at our disposal, we can provide our customers access to a capable and future-proof workforce that closes their skills gap with speed and efficiency.”

Lakehouse architecture delivers speed and scale for better workforce planning

For SkyHive, one of the biggest benefits of the Databricks Data Intelligence Platform is the significant increase in scalability and in how quickly meaningful insights are served to its customers. Its entire data set is over 1 petabyte, with a volume of 150 trillion transactions per day. Before Databricks, the company could process and analyze only 1/100th of its data set, and even that took at least 22 hours and was extremely expensive. Leveraging the lakehouse architecture, SkyHive is now able to access all its data and query it in less than 20 minutes. “With Databricks, the gains in operational efficiency have been tremendous, and as a result, we have been able to offer a multitude of new products and applications to the delight of our customers,” said Reddy. “This pace of innovation wasn’t possible before Databricks.”
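
As a rough back-of-the-envelope check, the figures above can be combined into a single relative throughput estimate. This sketch assumes the before and after workloads are comparable, which the source does not state explicitly, and it treats "at least 22 hours" and "less than 20 minutes" as exact values, so the result is best read as a lower bound. The function name is hypothetical.

```python
def relative_throughput_gain(
    fraction_before: float,
    hours_before: float,
    fraction_after: float,
    minutes_after: float,
) -> float:
    """Ratio of data-set fraction handled per minute, after vs. before."""
    rate_before = fraction_before / (hours_before * 60)  # fraction per minute
    rate_after = fraction_after / minutes_after          # fraction per minute
    return rate_after / rate_before


# Figures from the text: 1/100th of the data in 22 hours,
# versus the full data set in 20 minutes.
gain = relative_throughput_gain(1 / 100, 22, 1.0, 20)

if __name__ == "__main__":
    print(f"Approximate throughput gain: {gain:,.0f}x")
```

Under these assumptions, processing 100 times more data in 1/66th of the time works out to a gain of roughly 6,600x.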

With the lakehouse as the core technology supporting its analytics platform, SkyHive has improved its time-to-market for new labor insights applications by 13x, streamlining development efforts and shortening delivery times from 6-8 months to only 2.5 weeks.
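
The 13x figure can be sanity-checked against the delivery times quoted above. The sketch below assumes an average month of 52/12 weeks. That conversion factor and the `speedup` helper are assumptions for illustration, not part of the source.

```python
# Assumption: an average month is 52/12 (about 4.33) weeks.
WEEKS_PER_MONTH = 52 / 12


def speedup(before_months: float, after_weeks: float) -> float:
    """How many times shorter the new delivery time is."""
    return before_months * WEEKS_PER_MONTH / after_weeks


# Delivery times from the text: 6-8 months before, 2.5 weeks after.
low = speedup(6, 2.5)
high = speedup(8, 2.5)

if __name__ == "__main__":
    print(f"Speedup range: {low:.1f}x to {high:.1f}x")
```

With this conversion the range is roughly 10x to 14x, so the quoted 13x sits near the upper end, corresponding to projects that previously took close to eight months.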

Databricks has laid the foundation for SkyHive to realize its vision to re-architect the labor market. “We want to change the world by providing technology that helps move the global labor market forward and upward,” said Reddy. “With Databricks, we have positioned ourselves as the No. 1 labor market insights solution in the world, offering what no one else can.”