Aug 07 2025
Artificial Intelligence

How AI Mixture of Experts Works for Financial Services

MoE divides the computational work performed by artificial intelligence systems into subnetworks for faster, better results.

Financial leaders need the power of artificial intelligence to elevate their competitive efforts. But full large language models (LLMs) can be too slow and imprecise for the needs of this high-stakes sector. An AI architecture known as mixture of experts offers a way forward. MoE models “represent a fundamental shift in how AI systems are built and scaled,” according to industry advocacy group Renewable AI.


What Is Mixture of Experts (MoE) in AI?

Mixture of experts “is an architectural pattern for neural networks that splits the computation of a layer or operation into multiple ‘expert’ subnetworks,” says Kevin Levitt, global business development lead for financial services at NVIDIA. “These subnetworks each independently perform their own computation.”

With specific experts responding to particular tasks, MoE narrows the focus of an AI operation.

“We can think about it as bringing the right specialist to the job, instead of having the entire model having to weigh in every time,” says Shanker Ramamurthy, global managing partner for banking and financial markets at IBM.
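To make the pattern concrete, here is a minimal sketch of what a dense mixture-of-experts layer can look like in code. It is an illustration written in PyTorch (an assumption for this example); the class name, layer sizes and expert count are placeholders rather than details of any production model. Each expert is its own small feed-forward subnetwork, and a gating network decides how much weight each expert's output receives.

```python
# Illustrative sketch of a dense mixture-of-experts layer (not any vendor's implementation).
# Assumes PyTorch; all sizes and names are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseMoELayer(nn.Module):
    def __init__(self, d_model=256, d_hidden=512, num_experts=4):
        super().__init__()
        # Each "expert" is an independent feed-forward subnetwork that performs its own computation.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        ])
        # The gate (router) scores every expert for each input.
        self.gate = nn.Linear(d_model, num_experts)

    def forward(self, x):                              # x: (batch, d_model)
        weights = F.softmax(self.gate(x), dim=-1)      # (batch, num_experts) routing weights
        expert_outputs = torch.stack([expert(x) for expert in self.experts], dim=1)
        # Dense variant: every expert runs, and the gate blends their outputs.
        return (weights.unsqueeze(-1) * expert_outputs).sum(dim=1)
```

In this dense form, every expert still runs on every input; the sparse variant described in the next section is where the efficiency gains come from.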

DISCOVER: How to avoid inaccuracy and bias in LLMs.

How Does MoE Work?

Fundamentally, MoE works by targeting queries to a specialized subset within a larger LLM. “Rather than activating the full model for every task, MoE selectively routes each input to certain parts of the model that are most relevant,” Ramamurthy says.

This can happen in one of two ways. “MoE architectures can be either dense, meaning that every expert is used in the case of every input, or sparse, meaning that a subset of experts is used for every input,” Levitt says.

“Sparse MoEs are generally what provide performance benefits,” he says. For example, the MoE model known as Mixtral 8x7B activates only 2 of its 8 experts for each input. “This means a given forward pass only uses 12 billion of the 46 billion parameters — one-quarter the compute cost.”
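A sparse router can be sketched in a few lines as well. The function below is again an illustrative PyTorch example, not Mixtral's actual implementation: it keeps only the top-k experts the gate scores most highly (two, in the spirit of the 2-of-8 example above), so the remaining experts never execute for that input.

```python
# Hedged sketch of sparse (top-k) routing; the expert modules and gate are passed in,
# for example from the DenseMoELayer sketch earlier in this article.
import torch
import torch.nn.functional as F

def sparse_moe_forward(x, experts, gate, top_k=2):
    """x: (batch, d_model); experts: list of modules mapping d_model -> d_model; gate: nn.Linear."""
    scores = F.softmax(gate(x), dim=-1)                     # routing probabilities over experts
    weights, idx = scores.topk(top_k, dim=-1)               # keep only the top-k experts per input
    weights = weights / weights.sum(dim=-1, keepdim=True)   # renormalize over the chosen experts
    out = torch.zeros_like(x)
    for e, expert in enumerate(experts):
        for k in range(top_k):
            mask = idx[:, k] == e                            # inputs routed to expert e in slot k
            if mask.any():
                out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
    return out
```

Because only the selected experts execute, the compute per input scales roughly with the shared parameters plus k/E of the expert parameters, which is broadly how a model can carry 46 billion parameters while using only about 12 billion on any given forward pass.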

What Are the Benefits of MoE for Financial Services?

From a technical perspective, MoEs allow financial services companies to maximize the impact of their existing IT infrastructure. Because a sparse MoE activates only a fraction of its parameters for each input, the same compute budget stretches further. “You can train four times the amount of data in the same compute time, resulting in a better model given the fixed compute budget,” Levitt says.

There are other benefits as well, including:

  • Greater scalability: “If you’re a typical financial institution, you can have hundreds or thousands of people hitting these models,” Ramamurthy says. MoE’s ability to run multiple queries simultaneously makes it inherently more scalable, as different experts can tackle diverse queries at once without bogging down the entire LLM (see the short usage sketch after this list).
  • More-targeted insights: MoE can help financial services professionals zero in on specific areas of interest. “It can do things like fraud detection with one aspect of the model, while another aspect can handle customer engagement and personalized interactions,” Ramamurthy says. That leads to more narrowly targeted outputs.
  • Faster results: By activating only small subsets of the LLM, MoE delivers insights more quickly. “With the same amount of compute, you can do lots more now — and efficiency turns into speed at the end of the day,” Ramamurthy says.
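The short usage sketch below reuses the DenseMoELayer and sparse_moe_forward sketches from earlier in this article (all names and sizes remain placeholders). It shows a batch of embedded queries flowing through the layer together, with the gate free to send different rows to different experts within the same forward pass.

```python
import torch

# Assumes DenseMoELayer and sparse_moe_forward from the earlier sketches are in scope.
layer = DenseMoELayer(d_model=256, num_experts=4)
batch = torch.randn(32, 256)      # 32 concurrent queries, already embedded into vectors

dense_out = layer(batch)          # dense: every expert weighs in on every query
sparse_out = sparse_moe_forward(batch, list(layer.experts), layer.gate, top_k=2)
print(dense_out.shape, sparse_out.shape)  # both torch.Size([32, 256])
```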

RELATED: Four ways to integrate LLMs in finance.

What Are the Key MoE Use Cases in Finance?

Several key use cases show how MoE can support the financial services sector:

  • High-frequency trading: In HFT, “you’re looking for extremely fast response to take advantage of incredibly fleeting opportunities,” Ramamurthy says. That means parsing vast streams of market data with ultralow latency. Rather than using the full LLM (too slow) or a small language model (which can miss subtle market signals), “you can take an MoE approach to get that extreme responsiveness.”
  • Portfolio risk management: Risk profiles vary across asset classes, geographies and macroeconomic factors, and MoE accounts for all of that. For any given risk, “targeted experts only light up the aspects of the LLM for that particular type of risk,” Ramamurthy says. By quickly sorting through these variations, MoE improves predictions while optimizing infrastructure load.
  • Fraud detection and anti-money laundering: MoE can reduce false positives across millions of transactions, even as patterns of financial crime vary by transaction type, geography and client behavior. In fraud detection, “you need extraordinarily fast response, and you’ve got incredibly high volume,” Ramamurthy says. Tuned for specific risks, such as money laundering or customer identity, the MoE “will respond to that type of problem really quickly and efficiently.”
  • Natural language processing for market sentiment: MoE helps teams interpret financial news, earnings calls and analyst reports at scale, with experts specialized in legal language, financial terms or sentiment analysis (a routing pattern sketched after this list). Ramamurthy says it’s analogous to the human brain, “which is organized into domains that handle specific things relating to voice or language or human sight.” In the same way, MoE knows just what to look for in interpreting sentiment, increasing the accuracy and speed of NLP pipelines.
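As a purely hypothetical illustration of that routing idea, the snippet below labels a handful of stand-in experts with domain names invented for this sketch and lets an untrained gate pick the two most relevant ones for a query embedding. In a real model, the experts and the gate would be learned from data rather than hand-labeled.

```python
# Hypothetical domain-routing sketch in PyTorch; expert labels are invented for illustration.
import torch
import torch.nn as nn

d_model = 256
expert_names = ["fraud_patterns", "credit_risk", "market_sentiment", "legal_language"]
experts = nn.ModuleList([nn.Linear(d_model, d_model) for _ in expert_names])  # stand-in experts
gate = nn.Linear(d_model, len(experts))

query = torch.randn(1, d_model)                 # stand-in embedding of an incoming query
scores = gate(query).softmax(dim=-1)
weights, idx = scores.topk(2, dim=-1)           # "light up" only the two most relevant experts
chosen = idx[0].tolist()
print("Routed to:", [expert_names[i] for i in chosen])  # untrained gate, so routing here is arbitrary
output = sum(w * experts[i](query) for w, i in zip(weights[0], chosen))
```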

FIND OUT: Three use cases for multi-modal large language models.

How Can Businesses Implement MoE?

On a technical level, resources such as NVIDIA GPU Cloud host the Mixtral model family, a Sparse Mixture of Experts (SMoE) LLM. This offers “a great starting point for businesses to evaluate if their use cases benefit especially from this architecture,” Levitt says.
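For teams that want to experiment hands-on, one common route (an assumption here, alongside the NVIDIA GPU Cloud option Levitt mentions) is loading an open Mixtral checkpoint through the Hugging Face Transformers library, as sketched below. Note that the full model requires tens of gigabytes of GPU memory in half precision, so many organizations start with hosted endpoints or smaller MoE models instead.

```python
# Hedged sketch: loading an open Mixtral checkpoint with Hugging Face Transformers.
# Requires the accelerate package for device_map="auto" and substantial GPU memory.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"  # spread weights across available GPUs
)

prompt = "Summarize the main liquidity risks mentioned in this earnings call excerpt: ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```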

In terms of more general business practice, financial services firms looking to tap the power of MoE should “first step back and look,” Ramamurthy says. “What is the problem that I’m trying to solve? What’s the economic benefit I’m looking for?”

Then, organize your data accordingly. “In a typical financial institution, it’s about getting the right data, getting it organized the right way and ensuring you are pointing the MoE at the right problem,” he says.
