ESG, followed by AMA

The future of finance goes hand in hand with social responsibility, environmental stewardship and corporate ethics. In order to stay competitive, businesses are increasingly disclosing more information about their environmental, social and governance (ESG) performance.

In this free demo, we’ll demonstrate ways to use machine learning to extract the key ESG initiatives communicated in yearly PDF reports and compare them with actual media coverage from news analytics data. Afterwards, FinServe Technical Director Antoine Amend will be available to answer questions about this solution or any other financial services analytics use case.

Speakers: Antoine Amend and Itai Weiss


– Hi everyone, thank you very much for your time today. My name is Antoine Amend, and I’m the technical director for financial services at Databricks. I come from a financial services background, having bridged the gap between business and technology. Today, I want to show you the solution accelerator program that we built at Databricks to show the technical capability of the platform within use cases you may be familiar with, and to give our clients and our customers a head start on those use cases. In that context, environmental, social and governance (ESG) is a top priority. Financial services customers, retail customers, or any large organization may have an ESG strategy, and the question is how to migrate from a marketing concept of ESG to data-driven, actionable insights. Through a series of notebooks, I want to quickly show you how the data practitioners within your organization can leverage AI and advanced analytics to extract key information from corporate social responsibility (CSR) reports, and how to correlate the ESG initiatives that were disclosed with ESG sentiment, that is, the way your brand, your organization, or your suppliers may be perceived through news analytics. Bridging the gap between what was disclosed and what was perceived brings a truly data-driven ESG strategy.
In this first example, I want to start with information coming from my financial services customers: different clients, using all the PDF documents that I could find along the way. A document is usually released on a quarterly or yearly basis and is about a hundred pages long, with a few pages of hard metrics and a lot of text. We want to start extracting this text content in an actionable way, using advanced analytics.
We scrape this content, extract each and every sentence, and use machine learning to programmatically learn the themes, the topics. Naturally, in the context of ESG, you will find themes that we machine-learn around supporting communities, valuing employees, reducing carbon emissions, investing in more sustainable finance, helping employees, and the risk management that is intertwined with ESG. That helps us start to categorize each of the key statements in a PDF document, drastically summarizing the document into what we call ESG initiatives. As you can see, without getting into the nitty-gritty of the math behind it, we can summarize the document: only 20% of a document may be actionable, may be specific enough towards those nine themes that we’ve been able to machine-learn. You can then start comparing your different suppliers, your different investments, your different competitors against those nine key metrics, such as valuing employees, sustainable finance, and code of conduct. How much more does this company value its employees compared to others? This brings a holistic view of your different CSR reports and helps you drastically, and even programmatically, summarize a complex 170-page document into nine key initiatives. Those were the key initiatives from JP Morgan in this specific example.
But you don’t necessarily need to be a Python expert or a Spark expert to start interacting with your model. With all models stored on MLflow, you can start creating simple analytics using a simple right click here. I created a Chrome plugin that programmatically extracts the key ESG initiatives for a specific PDF document, scraping the content and scoring that specific CSR report against the 50 financial services institutions I’ve trained the model on. This example shows me that Barclays is in the top 40% for everything related to supporting community.
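[Editor's sketch, not part of the talk] The peer comparison described above (where an institution ranks on a theme such as supporting community) could be approximated as follows. This is a minimal illustration in plain Python, assuming each sentence of a report has already been assigned one theme by some topic model. The company names, the sentence counts, and the helper names theme_shares and percentile_rank are all hypothetical, and this is not the speaker's actual implementation.

```python
from collections import Counter

# Hypothetical input: one theme label per extracted sentence, per company.
# Names and counts are invented for illustration only.
labelled_sentences = {
    "bank_a": ["supporting community"] * 30 + ["valuing employees"] * 50 + ["code of conduct"] * 20,
    "bank_b": ["supporting community"] * 10 + ["valuing employees"] * 60 + ["code of conduct"] * 30,
    "bank_c": ["supporting community"] * 40 + ["valuing employees"] * 40 + ["code of conduct"] * 20,
    "bank_d": ["supporting community"] * 20 + ["valuing employees"] * 50 + ["code of conduct"] * 30,
    "bank_e": ["supporting community"] * 25 + ["valuing employees"] * 50 + ["code of conduct"] * 25,
}


def theme_shares(labels):
    """Return the fraction of sentences assigned to each theme."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {theme: n / total for theme, n in counts.items()}


def percentile_rank(company, theme, corpus):
    """Return the fraction of peers whose share of `theme` is strictly lower."""
    shares = {
        name: theme_shares(labels).get(theme, 0.0)
        for name, labels in corpus.items()
    }
    peers = [share for name, share in shares.items() if name != company]
    below = sum(1 for share in peers if share < shares[company])
    return below / len(peers)


rank = percentile_rank("bank_a", "supporting community", labelled_sentences)
print(f"bank_a outranks {rank:.0%} of its peers on supporting community")
```

Under this definition, a statement like "top 40%" would correspond to a rank of 0.6 or higher. A real pipeline would replace the hand-written labels with the output of a trained topic model.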
This is a simple example of how to use that model programmatically or through SQL endpoints. The second aspect is consolidating and correlating that with news analytics data. Now that we understand what was disclosed, let’s look at what was perceived. We use news analytics data, scraping the content for the last 18 months of history. As you can see, the data is massive. This alternative data is publicly available, noisy and messy, but refreshed every 15 minutes. If we can derive some insights out of that 15-minute time window, then we create a real-time view of an ESG rating, no longer waiting for a yearly disclosure but operating on a real-time 15-minute window. So we extract the environmental, social and governance related themes together with sentiment analysis to give this data-driven view for each and every business mentioned in the news, not only the handful of companies we have a CSR report for, but really every business, medium or small, financial services or retail, to build that data-driven view and to capture the influence any business may have on your ESG rating. If you are directly or indirectly related to a badly rated ESG company, the data-driven framework will quantify, capture, and propagate that positive or negative influence and show it to you across your bonds, your suppliers, your investments.
How you then act on this ESG rating offers really endless opportunities. You may want to look at it from a supply chain perspective, or from a market risk perspective. In this example, I create a simple synthetic portfolio made of 40 or 15 equities. I simply download the historical data of their stock returns and overlay it with my data-driven ESG score.
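[Editor's sketch, not part of the talk] The overlay of stock returns with an ESG score described above could be approximated as follows. This is a minimal illustration in plain Python. The tickers, ESG scores, and daily returns are invented, the helper names annualised_volatility and volatility_ratio are hypothetical, and the numbers do not reproduce any result from the talk. Volatility is taken here as the standard deviation of daily returns scaled by the square root of 252 trading days, which is a common convention and an assumption about how risk was measured.

```python
import math
import statistics

# Hypothetical portfolio: an ESG score and a short series of daily returns
# per equity. All values are invented for illustration only.
portfolio = {
    "EQ1": {"esg": 80, "returns": [0.01, -0.01, 0.01, -0.01]},
    "EQ2": {"esg": 70, "returns": [0.01, -0.01, 0.01, -0.01]},
    "EQ3": {"esg": 40, "returns": [0.03, -0.03, 0.03, -0.03]},
    "EQ4": {"esg": 30, "returns": [0.03, -0.03, 0.03, -0.03]},
}


def annualised_volatility(daily_returns, trading_days=252):
    """Sample standard deviation of daily returns, scaled to one year."""
    return statistics.stdev(daily_returns) * math.sqrt(trading_days)


def volatility_ratio(positions):
    """Mean volatility of the lower-ESG half divided by the higher-ESG half."""
    ranked = sorted(positions.values(), key=lambda p: p["esg"])
    half = len(ranked) // 2
    low_esg, high_esg = ranked[:half], ranked[-half:]

    def mean_vol(group):
        return statistics.mean([annualised_volatility(p["returns"]) for p in group])

    return mean_vol(low_esg) / mean_vol(high_esg)


ratio = volatility_ratio(portfolio)
print(f"Lower-ESG half is {ratio:.1f}x as volatile as the higher-ESG half")
```

A ratio above 1 means the lower-rated half of the portfolio was more volatile over the sample. With real data one would use a much longer return history and a proper risk measure such as value at risk.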
Interestingly, the best and the worst scores in my framework correspond to the best and the worst performers in my portfolio. I apply this with more data to understand more than just two data points. And yes, ESG is directly correlated with market volatility. Poorly ESG-rated companies have, in my example, a risk that is two times higher, paving the way towards a more agile view of risk management, such as: what is your risk exposure to the E, to the S, to the G? What action should you take to reduce that risk? How do you achieve greater operational resilience by applying an ESG framework, this data-driven view of trust in your suppliers and trust in your investments? And how do you package all that information in a way that can be consumed by your lines of business, through dashboards, through SQL, to make that information actionable? What is key to your strategy? Where is the positive, the negative impact? How much of your score was reduced or improved based on the data-driven view versus disclosure? And what action, what article, what event may have affected your brand, or your competitors, or your supply chain in a positive or negative way?
If you have any questions about migrating from a marketing concept of CSR to a data-driven ESG, and acting upon those ESG ratings in a data-driven way, my colleague Giunta and I will be more than happy to sit down with you and your practitioners to enable that transformation, knowing that all those notebooks are publicly available and can be used today on Databricks. Thank you very much.

About Antoine Amend


Antoine Amend is a data practitioner passionate about distributed computing and advanced analytics. A graduate with a master's degree in computational astrophysics and author of “Mastering Spark for Data Science”, Antoine has been pushing both engineering and science disciplines side by side to extract commercial value from large datasets. More recently, Antoine served as director of data science at Barclays UK, leading their AI practice and driving Barclays through their data and analytics transformation. With his expertise in enterprise architecture and commercial experience delivering data science to production in a highly regulated environment, Antoine joined Databricks as the technical director for financial services, helping our customers redefine the future of banking.

About Itai Weiss


Information Solution Architect with over 20 years of experience. Extensive background in data management, big data, information systems, and data governance, as well as process and project management. Has implemented numerous solutions across a host of different architectures, including IBM, Oracle, open source, and data warehouse appliances. Experience in database design, database administration, data integration, security, big data, business analytics, and advanced analytics. Has implemented open source software encompassing Hadoop (and peripheral components), Spark, R, Python, RDBMS, and NoSQL technologies. Brings a breadth of industry experience to each engagement, with specific background in government, power, financial, manufacturing, technology, healthcare, and insurance. Long track record of success and delivery within time and budget. Has managed up to 12 team members in various positions. Brings an agnostic perspective to each assignment, providing the best overall solution to the challenge at hand.