Unifying data and ML drives innovation and efficiency
With Databricks as their unified platform for data and ML, the data team at Coins.ph can streamline daily DevOps tasks and shift their focus to generating business value in key areas such as fraud detection and anti-money laundering via ad hoc analytics.
Using Delta Lake to ingest large volumes of data in real time, data engineers at Coins.ph bring greater reliability to their data lake and surface up-to-date insights, enabling more robust and scalable data pipelines. This, in turn, accelerates the development of data modeling prototypes by their data scientists. The platform's automated cluster management also lets the data team spin up clusters tailored for troubleshooting and round-the-clock monitoring without taxing data engineering resources.
Databricks’ interactive and easily maintained notebooks allow various data teams — data engineers, data scientists and business analysts — to leverage the platform and collaborate on data preparation, simple analytics and prototyping new models. Pre-built, self-service features within the notebooks expedite this work.
“Collaboration with cross-functional teams is facilitated with the sharing of high-quality reports through integration with third-party business insight tools such as Metabase,” explained Ustimov. “It further simplified technically-advanced Databricks capabilities to enable less technically-skilled business users to easily extract actionable insights from the data, such as enabling the finance team to access consistent financial data for accurate reporting.”
MLflow simplifies and streamlines the ML lifecycle, allowing data teams to easily track ML experiments and quickly develop new fraud detection prototypes. The platform also enables teams to run experiment jobs with standardized analytics, so the team can roll out new features or rule sets for anti-money-laundering compliance.
Another benefit of implementing the Databricks solution is having experts readily available to address the data team's issues.
“We asked for recommendations on the best way to architect a streaming ETL pipeline in order to provide optimal real-time insights in the most cost-efficient manner. We were pleased to receive a helpful response within hours,” said Ustimov.