Jefferies modernizes equity research at scale with Databricks and agentic analytics

How Jefferies gives hundreds of analysts natural-language access to multi-source research, powered by AI/BI Genie.

Published: March 2, 2026

Summary

  • Jefferies uses Databricks AI/BI Genie and agentic analytics to help 250+ analysts answer open-ended research questions across multiple data sources.
  • Complex research that once took days or weeks is now completed in minutes, combining self-service speed with embedded domain expertise.
  • Built directly on Databricks and governed by Unity Catalog, the solution scales globally without duplicating data, tools, or access controls.

Equity research is a game of breadth and conviction.

At Jefferies, the global equity research organization covers roughly 3,500 companies across sectors and geographies, with analysts based in the US, EMEA, and APAC. That scale is a competitive advantage, but it also creates a familiar challenge for any research organization working with an expanding universe of third-party and internal datasets.

“Our analysts have to synthesize signals across an enormous universe of companies, industries, and data sources,” said Ethan Geismar, Head of Data & AI, Equity Research at Jefferies. “Our goal is to help them turn that complexity into differentiated and actionable investment advice for our clients.”

The questions analysts ask are rarely narrow or prescriptive. They are open-ended, domain-specific, and framed in the language of markets and fundamentals, not in terms of which dataset to query or which table to join. For example, analysts ask questions like “What’s the demand and outlook for fast-casual restaurants?” or “How are foot traffic and app downloads trending across the brands I cover?” 

In a field where investment decisions hinge on confidence, a single signal is rarely enough. Analysts need corroboration across multiple independent sources to build conviction.

Over the past several years, Jefferies’ equity research engineering team has partnered closely with Databricks to ingest, clean, and standardize dozens of structured datasets — many of which originated as alternative data but now span financial, market, and macroeconomic indicators. As generative AI capabilities matured, the team set out to answer a new question:

How could Jefferies give analysts a faster, easier way to explore this data — one that preserved governance, plugged directly into existing data infrastructure, and translated natural-language questions into defensible, multi-source analysis?

To solve this, Jefferies built Jefferies Data Intelligence (JDI) — a conversational analytics experience powered by Databricks AI/BI Genie, allowing analysts to ask open-ended research questions directly against governed, multi-source datasets.

The Limits of Traditional Self-service and White-glove Support

Historically, Jefferies has supported new and ad hoc analyst requests in two primary ways.

First, traditional self-service data browsing tools gave analysts direct access to datasets but required them to understand the underlying data landscape and tooling to extract meaningful insights.

Second, a white-glove internal service model: the research engineering team translated analyst questions into data pulls and delivered synthesized results.

“Even after we’ve cleaned and mapped data, there’s still friction: someone has to translate the fundamental questions analysts ask into the right datasets and the right views,” Geismar explained. “Analysts don’t frame questions in terms of tables and joins, they ask questions about fundamentals, macroeconomics, industry trends, comparative positioning, catalysts, risks, etcetera.”

While powerful, this approach introduced a different constraint: team bandwidth.

“We work in monthly sprints, and the wiggle room for last-minute requests is limited,” Geismar said. “Even when something wasn’t technically hard to tackle, it used to take days or weeks in some situations before we were able to get to it, just due to capacity constraints.”

More involved questions, especially those requiring triangulation across multiple datasets, could take hours or days of focused effort once prioritized.

Complex research questions were often the most challenging. An analyst asking about consumer demand trends in fast-casual restaurants might need foot traffic data, mobile app engagement metrics, survey-based purchase intent, and macroeconomic context — each requiring separate data pulls, joins, and analyses.

Both models worked, but both imposed friction. What Jefferies needed was a way to combine the independence of self-service with the embedded expertise of the research engineering team without creating new bottlenecks.

A Research Agent That Meets Analysts Where They Are

To operationalize this at scale, Jefferies built an internal equity research assistant with a custom analyst-facing interface, powered by AI/BI Genie as the orchestration and reasoning engine sitting on top of the firm’s structured data lake.

The resulting experience allows analysts to ask the same questions they would pose to a domain expert and receive responses grounded in multiple relevant datasets. Importantly, the system understands the language analysts already use to frame their research.

For example, when an analyst asks about fast-casual restaurants, AI/BI Genie interprets that sector shorthand using domain-specific semantic mappings and curated business context, maps it to the appropriate coverage universe, and retrieves relevant data, without requiring the analyst to specify brands, tables, or joins.

Those same coverage mappings, aligned with how analysts naturally segment their sectors and with industry taxonomies, enable aggregate views such as total restaurant visitation across constituent brands. Because this logic is built directly into Genie, analysts can interrogate their coverage using familiar language and groupings.

From there, analysts can naturally iterate, requesting brand-level breakouts ("break this out by individual brands"), parent-company aggregations, or additional context, prompting deeper analysis without having to pre-specify those dimensions.

Where Open-ended Questions Reveal Hidden Insights

When analysts engage with open-ended prompts, the system helps identify which signals may be most relevant to the question at hand, often uncovering insights and datasets analysts may not have previously considered.

A simple query like “Show me visitation to fast-casual restaurants” retrieves associated foot traffic data and presents trend analysis.

But broader prompts such as “Show me demand and outlook for fast-casual restaurants” expand the scope of analysis by collating foot traffic, mobile app usage, survey-based purchase intent, macroeconomic indicators, and other signals.

Figure: Jefferies Data Intelligence answering a multi-source research question against governed datasets and live FRED/BLS APIs, with natural-language synthesis and generated visualizations.

“It gives analysts seamless access to our data without needing technical knowledge or support,” Geismar said. “But the more powerful value is that it exposes them to data they didn’t know existed or wouldn’t have thought to use for the question they’re asking.”

This multi-source response surfaces analytical angles that analysts may not have explicitly requested, enabling corroboration across independent sources.

That corroboration, Geismar says, is the core value proposition. “The power is bringing together multiple independent datasets to corroborate a thesis,” he added. “There's no redundancy — it's increasing conviction. That's the name of the game.”

Conversely, when outputs contradict assumptions, they prompt new lines of research and help refine investment theses.

How it Works: An Agentic Workflow Built on Databricks

The analyst experience feels conversational, but the infrastructure behind it is sophisticated. Under the hood, the application is powered by a LangGraph-based multi-agent architecture, operationalized through Databricks Model Serving.

When an analyst submits a question, the system follows a structured workflow:

  1. Tool validation ensures the required data services and APIs are available, checking both internal Databricks resources and third-party services such as Federal Reserve Economic Data (FRED), the Bureau of Labor Statistics (BLS), and others.
  2. A planning agent decomposes the question into a set of research tasks, creating a plan for what needs to be researched and how to use available tools to answer comprehensively.
  3. Execution agents retrieve data in parallel where possible, pulling from governed datasets through Genie and hitting third-party APIs as needed. These agents follow the research plan's ordering, executing sequentially when dependencies exist and in parallel when they can.
  4. A synthesis agent assembles the results into a coherent response, often including charts and analysis that combine findings from multiple sources.
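The four steps above can be sketched in plain Python. This is a simplified illustration of the validate → plan → execute → synthesize pattern, not Jefferies' actual implementation: the tool names and return values are hypothetical stand-ins, and in the real system the planning and synthesis steps are LLM-driven agents while the tools call Genie Spaces and third-party APIs.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical tool registry; in JDI these would be Genie queries over
# governed datasets or runtime calls to APIs such as FRED and BLS.
TOOLS = {
    "foot_traffic": lambda q: f"foot-traffic series for {q}",
    "app_downloads": lambda q: f"app-download series for {q}",
    "macro": lambda q: f"macro indicators for {q}",
}

def validate_tools(required):
    """Step 1: confirm every tool the plan needs is available."""
    missing = [name for name in required if name not in TOOLS]
    if missing:
        raise RuntimeError(f"unavailable tools: {missing}")

def plan(question):
    """Step 2: stand-in planner that decomposes a question into tasks.
    A real planning agent would use an LLM to produce this list."""
    return [{"tool": name, "query": question} for name in TOOLS]

def execute(tasks):
    """Step 3: run independent tasks in parallel; dependent tasks
    would be executed sequentially per the plan's ordering."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(TOOLS[t["tool"]], t["query"]) for t in tasks]
        return [f.result() for f in futures]

def synthesize(question, results):
    """Step 4: assemble retrieved signals into one response.
    A real synthesis agent would draft prose and charts with an LLM."""
    return {"question": question, "signals": results}

def answer(question):
    tasks = plan(question)
    validate_tools({t["tool"] for t in tasks})
    return synthesize(question, execute(tasks))
```

The key design property is that fan-out happens per question: one prompt yields several independent retrievals whose results are merged in a single synthesis step.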

Critically, the system can retrieve signals from multiple datasets in response to a single question, enabling cross-dataset corroboration rather than reliance on one table or a single joined view. This architecture allows analysts to iterate with natural follow-ups, such as ticker- or brand-level breakouts, to validate signals and drill into specifics.

Within this workflow, Genie plays a key role by enabling natural-language questions over curated, governed business data, while Databricks Model Serving provides the deployment and serving layer for the JDI application.

The system is model-agnostic and leverages a range of foundation models for reasoning-intensive tasks such as planning and synthesis, while maintaining the flexibility to incorporate lighter-weight or task-specific models for simpler steps (such as tool validation) as the architecture evolves.

For the team building JDI, this architecture signals a broader shift in how equity research will be conducted.

“Building out Jefferies Data Intelligence with Databricks has really given us a glimpse into what the future of research will look like,” explained Dylan Andrews, a Senior Associate Data Scientist on the Equity Research team. “Knowing the syntax of how to interact with data will matter less and less, and more focus will be placed on verifying or disproving hypotheses grounded in a mosaic of data across domains within minutes.”

Governance by Default with Unity Catalog

One of the most critical requirements for Jefferies was ensuring that governance was not an afterthought.

Because datasets are registered and accessed through Databricks Unity Catalog, access controls are enforced automatically based on user identity. Genie respects the same table-level and row- or column-level permissions already defined in Unity Catalog, eliminating the need to build and maintain custom authorization logic for the AI experience.

This enabled confident extension of powerful analytical capabilities to non-technical users without compromising data security or compliance. As the system scales to include more sensitive datasets and broader user access across global regions, these built-in governance controls ensure that the right people see the right data automatically.
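To make this concrete, the snippet below shows the kind of Unity Catalog SQL that governs such access. The catalog, schema, table, column, and group names are illustrative, not Jefferies', but the statements use standard Unity Catalog syntax: a table-level grant, plus a row filter function so that Genie queries only ever see rows a given user is entitled to.

```sql
-- Table-level access for a (hypothetical) analyst group
GRANT SELECT ON TABLE research.alt_data.foot_traffic TO `us-consumer-analysts`;

-- Row-level security: restrict non-members to US rows only
CREATE OR REPLACE FUNCTION research.alt_data.region_filter(region STRING)
RETURNS BOOLEAN
RETURN IS_ACCOUNT_GROUP_MEMBER('global-research-admins') OR region = 'US';

ALTER TABLE research.alt_data.foot_traffic
  SET ROW FILTER research.alt_data.region_filter ON (region);
```

Because Genie executes queries as the requesting user, these controls apply to the AI experience automatically, with no separate authorization layer to maintain.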

Built on Existing Data Infrastructure

The equity research agent was not developed as a standalone AI prototype. It was designed to sit directly on top of the data foundation Jefferies had already built on Databricks over seven years of partnership.

Today, the system draws from multiple sources in a hybrid architecture that combines governed Databricks datasets with runtime API calls:

Genie Spaces (curated datasets):

  • Fundamental data: Company-reported financial and operating metrics released during quarterly earnings cycles, including company-specific KPIs
  • Alternative datasets: Web traffic, foot traffic, social media engagement, and more, pre-joined and ready for cross-analysis

Runtime API connections:

  • Macroeconomic data: Indicators from FRED and the BLS
  • Other third-party services and APIs: Additional third-party data sources that are not pre-staged and are better ingested via API or MCP at runtime

The agent seamlessly joins data from API calls with governed datasets retrieved through Genie, providing comprehensive answers that span both real-time external data and carefully curated internal sources.
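A runtime macro-data pull of this kind can be sketched against FRED's public `series/observations` endpoint. This is an illustrative standalone client, not Jefferies' integration code; the series ID and API key are placeholders, and FRED requires a (free) registered key.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

FRED_BASE = "https://api.stlouisfed.org/fred/series/observations"

def fred_observations_url(series_id, api_key, start=None):
    """Build a request URL for FRED's series/observations endpoint."""
    params = {"series_id": series_id, "api_key": api_key, "file_type": "json"}
    if start:
        params["observation_start"] = start  # YYYY-MM-DD
    return f"{FRED_BASE}?{urlencode(params)}"

def fetch_series(series_id, api_key, start=None):
    """Fetch one macro series at runtime (performs a network call)."""
    with urlopen(fred_observations_url(series_id, api_key, start)) as resp:
        data = json.load(resp)
    return [(obs["date"], obs["value"]) for obs in data["observations"]]
```

In an agentic workflow, a result like this would then be joined with governed Genie datasets, for example aligning monthly unemployment observations against brand-level visitation trends.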

Because the assistant leverages the same ingestion pipelines, orchestration jobs, and governance model already in place, Jefferies was able to layer agentic capabilities on top of its existing infrastructure rather than introducing a parallel system. The orchestrated jobs handling ingestion, cleaning, and standardization continue to serve as the foundation, now accessible via natural language.

Tamar Kellner, Senior Associate Data Scientist on the Equity Research team, emphasized how Databricks’ native capabilities accelerated development:

“Databricks Genie and Model Serving handled data access, deployment, and governance out of the box, allowing our team to focus on JDI's core differentiators: agentic system design, analyst-first workflows, and rapid cross-dataset signal corroboration.”

Building Trust Through Transparency

Adoption required more than just speed. Analysts needed to trust the outputs, especially in a workflow with no human intermediary.

One of the critical challenges the team had to solve was making non-technical users comfortable with, and confident in, AI-generated outputs. Unlike tools built on unstructured data, the team couldn't simply link back to source documents and highlight where information was pulled. Nor could they expect analysts to validate SQL queries to verify correctness.

The solution was building auditability directly into every response. Every answer JDI returns includes an expandable dropdown, showing a chain-of-thought view that walks through how the system translated the analyst's prompt into data extraction calls. This transparency helps non-technical users understand and audit the reasoning process, building confidence in the outputs without requiring them to inspect SQL or source tables directly.
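Structurally, such an auditable response amounts to attaching a step-by-step trace to every answer. The sketch below is a hypothetical data model for that pattern, not JDI's actual schema: each agent records what it did, and the trace can be rendered as the plain-text equivalent of the expandable dropdown.

```python
from dataclasses import dataclass, field

@dataclass
class TraceStep:
    agent: str        # e.g. "planner", "genie", "synthesis" (illustrative names)
    action: str       # what the step did
    detail: str = ""  # dataset queried, API called, filters applied, etc.

@dataclass
class AuditableAnswer:
    text: str
    trace: list = field(default_factory=list)

    def record(self, agent, action, detail=""):
        """Append one reasoning step as the workflow executes."""
        self.trace.append(TraceStep(agent, action, detail))

    def render_trace(self):
        """Plain-text version of the expandable chain-of-thought view."""
        return "\n".join(
            f"[{step.agent}] {step.action} {step.detail}".rstrip()
            for step in self.trace
        )
```

The point of the design is that the trace is produced as a byproduct of execution rather than reconstructed afterward, so what the analyst audits is exactly what the system did.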

This explainability, combined with the system's ability to surface multiple corroborating datasets, gives analysts the evidence they need to build conviction in their investment recommendations.

Early Impact and What Comes Next

The assistant is currently rolled out to more than 250 users in the US, with plans to expand to EMEA and APAC, bringing total access to roughly 550 analysts globally.

Although the tool has been live for only a few weeks, adoption has been broad. Hundreds of questions have already been answered, generating thousands of insights and charts.

Work that previously took days or weeks due to bandwidth constraints or complexity is now delivered in minutes.

For users like Kaumil Gajrawala, Managing Director of Consumer Research at Jefferies, that acceleration is already changing how research gets done.

“JDI has massively accelerated our workflow,” Gajrawala said. “We’re doing more, faster. We’ve only scratched the surface; we’re evolving from getting our work done faster to discovering what we can now do that wasn’t possible before.”

The current system draws from roughly 10–12 core data sources, several of which contain multiple datasets, with a clear path to expanding to 30–40+ sources over time.

“We’re getting started with the most common sources, but we have a runway of two to three times as many,” Geismar said. “The vision is that this becomes the single access point for our department’s structured data, and a daily tool for most analysts.”

As the platform expands, Jefferies remains focused on maintaining performance, usability, and interpretability while increasing the breadth of accessible research data.

A New Access Point for Equity Research

By building on Databricks’ data engineering, governance, and AI capabilities, Jefferies is evolving how analysts interact with structured data, combining the autonomy of self-service with the embedded expertise of the research engineering team.

The result is not just faster answers, but a system that helps analysts develop stronger, more defensible investment theses, grounded in corroborated evidence and delivered at the speed research demands.
