Sandhya Raghavan

Senior Data Engineer, Virgin Hyperloop One

Sandhya Raghavan is a Senior Data Engineer at Virgin Hyperloop One, where she helps build the organization's data analytics platform. She has 13 years of experience working with leading organizations to build scalable data architectures that integrate relational and big data technologies. She also has experience implementing large-scale, distributed machine learning algorithms. Sandhya holds a bachelor’s degree in Computer Science from Anna University, India. When Sandhya is not building data pipelines, you can find her traveling the world with her family or pedaling a bike.

Past sessions

Virgin Hyperloop One is the leader in realizing a Hyperloop mass transportation system (VHOMTS), which will bring cities and people closer together than ever before while reducing pollution, greenhouse gas emissions, and transit times. To build a safe and user-friendly Hyperloop, we need to answer key technical and business questions, including:

  • What is the maximum safe speed at which the Hyperloop can travel?
  • How many pods (the vehicles that carry people) do we need to fulfill a given demand?

These questions must be answered accurately to convince regulators, operators, and governments, so that we can realize our ambitious goals within years instead of decades. To answer them, we have built a large-scale, configurable simulation framework that takes a diverse set of configurations, such as route information, demand and population information, and pod performance parameters.

How do we reduce time-to-insight so we can iterate on Hyperloop models faster? We have developed a generic execution and analytics framework around our core system simulation to achieve the key objectives of scale, concurrency, and speed. In this presentation, we will discuss the design of this framework, the challenges we encountered, and how we addressed them.

We will showcase the following points in detail:

  1. Utilizing the power of the cloud to execute multiple simulations in parallel and at scale.
  2. Data pipelines for:
    • Gathering demand and socio-economic data
    • Training and comparing demand prediction models (ARIMA, LSTM, XGBoost) with Keras & MLflow.
    • Analyzing massive simulation output data with Spark and Koalas.
  3. Managing and executing pipelines, including data provenance and element traceability with NiFi and Postgres DB.
  4. Comparing the reports from large batch simulations using MLflow.
  5. A video of our simulation and test result comparisons, including the impact of different demand prediction models for a prospective Hyperloop.