Zheng Gu is a Software Engineer at American Express. He’s passionate about building scalable systems with Spark, and end-to-end pipelines that schedule and deploy jobs and visualize the whole process of Spark application deployment.
May 27, 2021 04:25 PM PT
In the financial world, petabytes of transactional data need to be stored, processed, and distributed to global customers and partners in a secure, compliant, and accurate way, with high availability, resiliency, and observability. At American Express, we need to generate hundreds of different kinds of reports and distribute them to thousands of partners on different schedules, based on billions of daily transactions. Our next-generation reporting framework is a highly configurable enterprise framework that caters to different reporting needs with zero development effort. This reusable framework covers dynamic scheduling of partner-specific reports, and transforming, aggregating, and filtering the data into different DataFrames using built-in as well as user-defined Spark functions, leveraging Spark's in-memory and parallel processing capabilities. It also covers applying business rules and converting the results into different formats by embedding template engines like FreeMarker and Mustache into the framework.
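To make the configuration-driven idea concrete, here is a minimal sketch of how one report definition might drive filtering, aggregation, and template-based formatting. It is not the American Express framework: it uses plain Python lists in place of Spark DataFrames and the standard library's `string.Template` in place of FreeMarker or Mustache, and every name in it (`REPORT_CONFIG`, `build_report`, the config keys, the sample columns) is hypothetical.

```python
from string import Template

# Hypothetical report definition. In a config-driven framework, adding a new
# report would mean adding a new definition like this, not writing new code.
REPORT_CONFIG = {
    "partner": "partner_a",
    "filter": {"column": "country", "equals": "US"},
    "group_by": "merchant",
    "sum_column": "amount",
    # Stand-in for a FreeMarker/Mustache template (assumption: one line per row).
    "row_template": "$merchant,$total",
}


def build_report(transactions, config):
    """Filter, aggregate, and render rows as described by `config`."""
    # 1. Filter: keep only the rows that match the configured condition.
    flt = config["filter"]
    selected = [t for t in transactions if t[flt["column"]] == flt["equals"]]

    # 2. Aggregate: sum the configured column per group key.
    totals = {}
    for t in selected:
        key = t[config["group_by"]]
        totals[key] = totals.get(key, 0) + t[config["sum_column"]]

    # 3. Format: render each aggregated row through the configured template.
    row = Template(config["row_template"])
    return [
        row.substitute({config["group_by"]: key, "total": total})
        for key, total in sorted(totals.items())
    ]
```

In a Spark-based version, steps 1 and 2 would presumably map to DataFrame filter and group-by operations, and step 3 to a real template engine; the point of the sketch is only that the report's behavior lives in the configuration rather than in the code.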