Equity research is a game of breadth and conviction.
At Jefferies, the global equity research organization covers roughly 3,500 companies across sectors and geographies, with analysts based in the US, EMEA, and APAC. That scale is a competitive advantage, but it also creates a familiar challenge for any research organization working with an expanding universe of third-party and internal datasets.
“Our analysts have to synthesize signals across an enormous universe of companies, industries, and data sources,” said Ethan Geismar, Head of Data & AI, Equity Research at Jefferies. “Our goal is to help them turn that complexity into differentiated and actionable investment advice for our clients.”
The questions analysts ask are rarely narrow or prescriptive. They are open-ended, domain-specific, and framed in the language of markets and fundamentals, not in terms of which dataset to query or which table to join. For example, analysts ask questions like “What’s the demand and outlook for fast-casual restaurants?” or “How are foot traffic and app downloads trending across the brands I cover?”
In a field where investment decisions hinge on confidence, a single signal is rarely enough. Analysts need corroboration across multiple independent sources to build conviction.
Over the past several years, Jefferies’ equity research engineering team has partnered closely with Databricks to ingest, clean, and standardize dozens of structured datasets — many of which originated as alternative data but now span financial, market, and macroeconomic indicators. As generative AI capabilities matured, the team set out to answer a new question:
How could Jefferies give analysts a faster, easier way to explore this data—one that preserved governance, plugged directly into existing data infrastructure, and translated natural-language questions into defensible, multi-source analysis?
To solve this, Jefferies built Jefferies Data Intelligence (JDI) — a conversational analytics experience powered by Databricks AI/BI Genie, allowing analysts to ask open-ended research questions directly against governed, multi-source datasets.
Historically, Jefferies has supported new and ad hoc analyst requests in two primary ways.
First, traditional self-service data browsing tools gave analysts direct access to datasets but required them to understand the underlying data landscape and tooling to extract meaningful insights.
Second, a white-glove internal service model had the research engineering team translate analyst questions into data pulls and deliver synthesized results.
“Even after we’ve cleaned and mapped data, there’s still friction: someone has to translate the fundamental questions analysts ask into the right datasets and the right views,” Geismar explained. “Analysts don’t frame questions in terms of tables and joins, they ask questions about fundamentals, macroeconomics, industry trends, comparative positioning, catalysts, risks, etcetera.”
While powerful, this approach introduced a different constraint: team bandwidth.
“We work in monthly sprints, and the wiggle room for last-minute requests is limited,” Geismar said. “Even when something wasn’t technically hard to tackle, it used to take days or weeks in some situations before we were able to get to it, just due to capacity constraints.”
More involved questions, especially those requiring triangulation across multiple datasets, could take hours or days of focused effort once prioritized.
Complex research questions were often the most challenging. An analyst asking about consumer demand trends in fast-casual restaurants might need foot traffic data, mobile app engagement metrics, survey-based purchase intent, and macroeconomic context — each requiring separate data pulls, joins, and analyses.
Both models worked, but both imposed friction. What Jefferies needed was a way to combine the independence of self-service with the embedded expertise of the research engineering team without creating new bottlenecks.
To operationalize this at scale, Jefferies built an internal equity research assistant with a custom analyst-facing interface, powered by AI/BI Genie as the orchestration and reasoning engine sitting on top of the firm’s structured data lake.
The resulting experience allows analysts to ask the same questions they would pose to a domain expert and receive responses grounded in multiple relevant datasets. Importantly, the system understands the language analysts already use to frame their research.
For example, when an analyst asks about fast-casual restaurants, AI/BI Genie interprets that sector shorthand using domain-specific semantic mappings and curated business context, maps it to the appropriate coverage universe, and retrieves relevant data, without requiring the analyst to specify brands, tables, or joins.
Those same coverage mappings, aligned with how analysts naturally segment their sectors and with industry taxonomies, enable aggregate views such as total restaurant visitation across constituent brands. Because this logic is built directly into Genie, analysts can interrogate their coverage using familiar language and groupings.
From there, analysts can naturally iterate, requesting brand-level breakouts ("break this out by individual brands"), parent-company aggregations, or additional context, prompting deeper analysis without having to pre-specify those dimensions.
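Jefferies has not published the mappings themselves, but conceptually they resemble a lookup from analyst shorthand to a coverage universe, with aggregation logic layered on top. A minimal illustrative sketch, in which every sector, brand, and number is a hypothetical stand-in:

```python
# Illustrative only: a hypothetical coverage mapping from sector shorthand
# to constituent brands, plus the aggregate and breakout views analysts ask for.
SECTOR_COVERAGE = {
    "fast-casual restaurants": ["BrandA", "BrandB", "BrandC"],
}

FOOT_TRAFFIC = {  # hypothetical weekly visits per brand
    "BrandA": [120, 130, 125],
    "BrandB": [80, 85, 90],
    "BrandC": [60, 58, 62],
}

def sector_visitation(sector: str) -> list[int]:
    """Total weekly visitation across all brands in a sector."""
    series = [FOOT_TRAFFIC[b] for b in SECTOR_COVERAGE[sector]]
    return [sum(week) for week in zip(*series)]

def brand_breakout(sector: str) -> dict[str, list[int]]:
    """Per-brand breakout, the kind of follow-up an analyst might request."""
    return {b: FOOT_TRAFFIC[b] for b in SECTOR_COVERAGE[sector]}
```

In the real system this logic lives inside Genie's semantic layer rather than application code, so the same shorthand works across every dataset the sector touches.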
When analysts engage with open-ended prompts, the system helps identify which signals may be most relevant to the question at hand, often uncovering insights and datasets analysts may not have previously considered.
A simple query like “Show me visitation to fast-casual restaurants” retrieves associated foot traffic data and presents trend analysis.
But broader prompts such as “Show me demand and outlook for fast-casual restaurants” expand the scope of analysis by collating foot traffic, mobile app usage, survey-based purchase intent, macroeconomic indicators, and other signals.
“It gives analysts seamless access to our data without needing technical knowledge or support,” Geismar said. “But the more powerful value is that it exposes them to data they didn’t know existed or wouldn’t have thought to use for the question they’re asking.”
This multi-source response surfaces analytical angles that analysts may not have explicitly requested, enabling corroboration across independent sources.
That corroboration, Geismar says, is the core value proposition. “The power is bringing together multiple independent datasets to corroborate a thesis,” he added. “There's no redundancy — it's increasing conviction. That's the name of the game.”
Conversely, when outputs contradict assumptions, they prompt new lines of research and help refine investment theses.
The analyst experience feels conversational, but the infrastructure behind it is sophisticated. Under the hood, the application is powered by a LangGraph-based multi-agent architecture, operationalized through Databricks Model Serving.
When an analyst submits a question, the system follows a structured workflow: it interprets the question, plans which datasets are relevant, retrieves governed data through Genie, and synthesizes a response.
Critically, the system can retrieve and corroborate signals across multiple datasets in response to a single question, rather than relying on one table or a single joined view. This architecture allows analysts to iterate with natural follow-ups, such as ticker- or brand-level breakouts, to validate signals and drill into specifics.
Within this workflow, Genie plays a key role by enabling natural-language questions over curated, governed business data, while Databricks Model Serving provides the deployment and serving layer for the JDI application.
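Jefferies has not published the exact graph, but the interpret-plan-retrieve-synthesize loop described above can be sketched as a plain-Python pipeline. The dataset names, routing heuristics, and `retrieve` stand-in below are all hypothetical; in production, retrieval runs through Genie and the steps are coordinated by a LangGraph agent graph:

```python
# Conceptual sketch of the plan -> retrieve -> synthesize workflow.
# Dataset names and the keyword heuristics are illustrative stand-ins
# for the agent's planning step and Genie-backed retrieval.
def plan(question: str) -> list[str]:
    """Decide which datasets are relevant to the question."""
    if "demand" in question or "outlook" in question:
        return ["foot_traffic", "app_downloads", "purchase_intent"]
    if "visitation" in question:
        return ["foot_traffic"]
    return []

def retrieve(dataset: str, question: str) -> dict:
    """Stand-in for a Genie call against one governed dataset."""
    return {"dataset": dataset, "signal": f"trend for: {question}"}

def synthesize(results: list[dict]) -> str:
    """Combine independent signals into one corroborated answer."""
    sources = ", ".join(r["dataset"] for r in results)
    return f"Answer corroborated across: {sources}"

def answer(question: str) -> str:
    return synthesize([retrieve(d, question) for d in plan(question)])
```

The key property the sketch preserves is that one broad question fans out to several independent retrievals before synthesis, which is what enables cross-dataset corroboration.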
The system is model-agnostic and leverages a range of foundation models for reasoning-intensive tasks such as planning and synthesis, while maintaining the flexibility to incorporate lighter-weight or task-specific models for simpler steps (such as tool validation) as the architecture evolves.
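The routing logic is not described in detail, but a model-agnostic design like this typically amounts to a mapping from task type to model tier. A hypothetical sketch, with placeholder model names:

```python
# Hypothetical sketch of model-agnostic routing: reasoning-heavy steps go
# to a frontier model, simple steps to a lighter one. The model names are
# placeholders, not the models Jefferies actually uses.
MODEL_ROUTES = {
    "planning": "frontier-reasoning-model",
    "synthesis": "frontier-reasoning-model",
    "tool_validation": "lightweight-model",
}

def route(task: str) -> str:
    """Pick a serving endpoint for a task; default to the frontier model."""
    return MODEL_ROUTES.get(task, "frontier-reasoning-model")
```

Keeping this mapping in one place is what lets the team swap in lighter or task-specific models for individual steps without touching the rest of the agent.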
For the team building JDI, this architecture signals a broader shift in how equity research will be conducted.
“Building out Jefferies Data Intelligence with Databricks has really given us a glimpse into what the future of research will look like,” explained Dylan Andrews, a Senior Associate Data Scientist on the Equity Research team. “Knowing the syntax of how to interact with data will matter less and less, and more focus will be placed on verifying or disproving hypotheses grounded in a mosaic of data across domains within minutes.”
One of the most critical requirements for Jefferies was ensuring that governance was not an afterthought.
Because datasets are registered and accessed through Databricks Unity Catalog, access controls are enforced automatically based on user identity. Genie respects the same table-level and row- or column-level permissions already defined in Unity Catalog, eliminating the need to build and maintain custom authorization logic for the AI experience.
This let Jefferies confidently extend powerful analytical capabilities to non-technical users without compromising data security or compliance. As the system scales to include more sensitive datasets and broader user access across global regions, these built-in governance controls ensure that the right people see the right data automatically.
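Unity Catalog enforces these policies natively, so JDI never implements them itself; but the behavior of an identity-based row filter can be illustrated conceptually. This is not the Unity Catalog API, and the user-to-region entitlements are invented for the example:

```python
# Conceptual illustration of identity-based row-level filtering, the kind
# of policy Unity Catalog enforces natively. Users, regions, and rows are
# all hypothetical; this only illustrates the enforced behavior.
PERMITTED_REGIONS = {
    "us_analyst": {"US"},
    "global_analyst": {"US", "EMEA", "APAC"},
}

ROWS = [
    {"ticker": "AAA", "region": "US"},
    {"ticker": "BBB", "region": "EMEA"},
]

def visible_rows(user: str) -> list[dict]:
    """Return only the rows in regions the user is entitled to see."""
    allowed = PERMITTED_REGIONS.get(user, set())
    return [r for r in ROWS if r["region"] in allowed]
```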
The equity research agent was not developed as a standalone AI prototype. It was designed to sit directly on top of the data foundation Jefferies had already built on Databricks over seven years of partnership.
Today, the system draws from multiple sources in a hybrid architecture that combines two layers: Genie Spaces serving curated, governed datasets, and runtime API connections providing real-time external data.
The agent seamlessly joins data from API calls with governed datasets retrieved through Genie, providing comprehensive answers that span both real-time external data and carefully curated internal sources.
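That join can be sketched in miniature: a curated dataset keyed by ticker, enriched with a runtime API result. Both inputs below are hypothetical stand-ins, with a local function in place of the external API call:

```python
# Sketch of joining runtime API results with a curated, governed dataset,
# keyed on ticker. The curated rows and the "API" function are illustrative.
CURATED = [  # stand-in for a governed dataset retrieved through Genie
    {"ticker": "AAA", "foot_traffic_trend": "+4%"},
    {"ticker": "BBB", "foot_traffic_trend": "-1%"},
]

def fetch_live_quote(ticker: str) -> dict:
    """Stand-in for a runtime API call returning real-time data."""
    prices = {"AAA": 101.5, "BBB": 54.2}
    return {"ticker": ticker, "price": prices[ticker]}

def combined_view() -> list[dict]:
    """Merge curated signals with live API data for each covered ticker."""
    return [{**row, **fetch_live_quote(row["ticker"])} for row in CURATED]
```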
Because the assistant leverages the same ingestion pipelines, orchestration jobs, and governance model already in place, Jefferies was able to layer agentic capabilities on top of its existing infrastructure rather than introducing a parallel system. The orchestrated jobs handling ingestion, cleaning, and standardization on Databricks continue to serve as the foundation, now accessible via natural language.
Tamar Kellner, Senior Associate Data Scientist on the Equity Research team, emphasized how Databricks’ native capabilities accelerated development:
“Databricks Genie and Model Serving handled data access, deployment, and governance out of the box, allowing our team to focus on JDI's core differentiators: agentic system design, analyst-first workflows, and rapid cross-dataset signal corroboration.”
Adoption required more than just speed. Analysts needed to trust the outputs, especially in a workflow with no human intermediary.
One of the critical challenges the team solved was: How do we get non-technical users comfortable with and confident in AI-generated outputs? Unlike tools built on unstructured data, the team couldn't simply link back to source documents and highlight where information was pulled. Nor could they expect analysts to validate SQL queries to verify correctness.
The solution was building auditability directly into every response. Every answer JDI returns includes an expandable dropdown, showing a chain-of-thought view that walks through how the system translated the analyst's prompt into data extraction calls. This transparency helps non-technical users understand and audit the reasoning process, building confidence in the outputs without requiring them to inspect SQL or source tables directly.
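One way to structure such a response is to carry the reasoning trace alongside the answer itself, so the UI can render it as an expandable view. A minimal sketch, with a hypothetical `TracedAnswer` object and invented trace contents:

```python
# Illustrative response object that carries an auditable reasoning trace
# alongside the answer, mirroring the expandable dropdown described above.
class TracedAnswer:
    def __init__(self, answer: str):
        self.answer = answer
        self.trace: list[str] = []

    def log(self, step: str) -> "TracedAnswer":
        """Record one step of how the prompt became a data extraction."""
        self.trace.append(step)
        return self

# Hypothetical example of a traced response being assembled.
resp = (
    TracedAnswer("Visitation up 4% quarter over quarter")
    .log("Interpreted 'fast-casual restaurants' via coverage mapping")
    .log("Queried foot-traffic dataset for constituent brands")
    .log("Aggregated weekly visits and computed the trend")
)
```

Because the trace is ordered and human-readable, a non-technical analyst can audit the reasoning without ever seeing the underlying SQL.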
This explainability, combined with the system's ability to surface multiple corroborating datasets, gives analysts the evidence they need to build conviction in their investment recommendations.
The assistant is currently rolled out to more than 250 users in the US, with plans to expand to EMEA and APAC, bringing total access to roughly 550 analysts globally.
Although the tool has been live for only a few weeks, adoption has been broad. Hundreds of questions have already been answered, generating thousands of insights and charts.
Work that previously took days or weeks due to bandwidth constraints or complexity is now delivered in minutes.
For users like Kaumil Gajrawala, Managing Director of Consumer Research at Jefferies, that acceleration is already changing how research gets done.
“JDI has massively accelerated our workflow,” Gajrawala said. “We’re doing more, faster. We’ve only scratched the surface; we’re evolving from getting our work done faster to discovering what we can now do that wasn’t possible before.”
The current system draws from roughly 10–12 core data sources, several of which contain multiple datasets, with a clear path to expanding to 30–40+ sources over time.
“We’re getting started with the most common sources, but we have a runway of two to three times as many,” Geismar said. “The vision is that this becomes the single access point for our department’s structured data, and a daily tool for most analysts.”
As the platform expands, Jefferies remains focused on maintaining performance, usability, and interpretability while increasing the breadth of accessible research data.
By building on Databricks’ data engineering, governance, and AI capabilities, Jefferies is evolving how analysts interact with structured data, combining the autonomy of self-service with the embedded expertise of the research engineering team.
The result is not just faster answers, but a system that helps analysts develop stronger, more defensible investment theses, grounded in corroborated evidence and delivered at the speed research demands.