Federated Machine Learning Gives Enterprises a Competitive AI Advantage

By bringing the training of ML models to users, organizations can advance their AI ambitions while maintaining data security.

Phil Goldstein

Phil Goldstein is a former web editor of the CDW family of tech magazines and a veteran technology journalist.

Historically, machine learning models have been trained by consolidating data from multiple sources into a centralized cloud server or data center and then training the model based on the combined data.

This approach can streamline ML model training but “may also create significant privacy risks and potential vulnerabilities if the central data repository is compromised,” as a Google blog post notes. A range of organizations, especially in highly regulated industries such as healthcare and finance, are turning to a solution that has existed for many years but is growing in prominence: federated machine learning.

Federated machine learning is emerging as a critical architecture for enterprise AI because it allows organizations to train models collaboratively without moving or exposing raw data. Instead of centralizing sensitive data sets, the model is sent to where the data resides — across devices, departments or partner organizations — and each participant trains it locally. Crucially, only model updates are shared and aggregated, preserving privacy while still improving the overall system.

What Is Federated Machine Learning?

Federated learning is a decentralized yet collaborative approach to ML model training, according to Kathy Lange, research director for IDC’s AI, Data and Automation Software practice.

“It enables multiple parties to jointly train a model without exchanging or exposing potentially sensitive data,” she says. “It is maturing as a critical architecture for enterprise AI, especially where privacy, compliance and cross-organization collaboration are paramount.”

Across a range of industries, including finance, healthcare, life sciences and manufacturing, and for use cases such as disease research or fraud detection, centralized ML training may not be the most efficient way for organizations to train models.

That’s because, Lange notes, “no single organization may have enough data to build robust, generalizable AI models.”

Federated learning addresses this by letting institutions learn from one another’s data without moving it. As a result, “organizations can overcome sample size limitations, capture greater diversity and improve the accuracy and reliability of their insights” while still preserving privacy, she says.

How Federated Learning Differs From Traditional Centralized AI Training

With traditional centralized AI, all the data is combined into a single data set for training the AI model. In contrast, with federated learning, the data never leaves its original location.

“Only model updates are transmitted to other participants; typically, model parameter updates or gradient changes, not the data itself,” Lange says. “Often, you hear the phrase, ‘The model goes to the data,’ instead of the data going to where the model is being created. Each participant trains the model locally on its own data.”
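To make "the model goes to the data" concrete, here is a minimal sketch of what a single participant might do. It is an illustration under stated assumptions, not any vendor's implementation: the `ClientUpdate` type, the `local_update` function, the tiny linear model and the sample rows are all hypothetical. The point it shows is that the only thing returned is a parameter delta and a sample count.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class ClientUpdate:
    """What leaves the client: a parameter delta and a sample count, never raw rows."""
    delta: List[float]
    num_samples: int


def local_update(
    global_weights: Sequence[float],
    data: Sequence[Tuple[float, float]],
    lr: float = 0.05,
    epochs: int = 5,
) -> ClientUpdate:
    """Fit y = w*x + b on local data with gradient descent, starting from the global weights."""
    w, b = global_weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    # Only the change in parameters is returned; `data` stays inside this function.
    return ClientUpdate(delta=[w - global_weights[0], b - global_weights[1]], num_samples=n)


# Hypothetical private rows that never leave the participant.
private_rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
update = local_update([0.0, 0.0], private_rows)
```

In this sketch the server would receive `update.delta` and `update.num_samples` and nothing else. Real deployments often add further protections on top of this, which are outside the scope of the example.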

Privacy-Preserving AI: Why Regulated Industries Are Adopting Federated Learning First

Regulated industries like healthcare and finance have been early adopters of federated ML precisely because of their need for privacy and compliance.

Organizations in these industries can’t simply share sensitive data due to regulations like the Health Insurance Portability and Accountability Act (HIPAA) and General Data Protection Regulation (GDPR).

“Federated learning gives these industries a way to learn from multiple organizations’ data while still keeping tight controls and limiting exposure,” Lange notes. “It’s the best of both worlds: better model accuracy and governance.”

Real-World Use Cases: Healthcare, Finance, IoT and Edge AI

Organizations across a range of industries can apply federated ML to many use cases, Lange notes, including:

Healthcare: Hospitals collaboratively train models for cancer diagnosis, brain tumor segmentation and COVID-19 detection without sharing patient records, she says. For example, U.S. medical centers — including collaborators from Case Western Reserve University; Georgetown University; Mayo Clinic; the University of California, San Diego; the University of Florida; and Vanderbilt University — are using NVIDIA-powered federated learning for tumor segmentation, according to an NVIDIA blog post.
Finance: Banks, including Airstar Bank and livi bank in Hong Kong, use federated learning for fraud detection, anti-money laundering and credit risk prediction across institutions without revealing customer lists, Lange says.
Internet of Things and Edge AI: Federated learning powers mobile keyboard prediction in products such as Google Gboard and Apple QuickType, as well as autonomous vehicles and surveillance in smart city applications, by learning from on-device data, according to Lange.
Other Sectors: Automotive (autonomous driving), manufacturing and telecom sectors are also adopting federated learning for collaborative AI without centralizing sensitive data, Lange says.

Key Infrastructure Requirements for Federated Learning Deployments

To run federated learning, organizations need a central system to coordinate the process (including model distribution, scheduling and update aggregation), local infrastructure with sufficient computing power to train models and secure ways for sending updates between participants, according to Lange.

Google notes in its blog that organizations must bring the model to the clients that perform the local training, which can range from mobile phones and IoT devices to entire institutions, such as hospitals.

The central server or aggregator “acts as the orchestrator of the federated learning process,” Google notes. “It initializes and distributes the global model, collects model updates from clients, aggregates these updates to refine the global model, and then redistributes the updated model. It doesn’t directly access the clients’ raw data.”
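The orchestration cycle Google describes (initialize, distribute, collect, aggregate, redistribute) can be sketched in a few lines. This is a toy illustration, not a production design: `make_client` and `run_round` are hypothetical names, each client fits a single parameter to the mean of its own numbers, and the server uses a plain unweighted average for simplicity.

```python
from typing import Callable, List, Sequence

# A "client" here is just a function: global weights in, locally trained weights out.
Client = Callable[[List[float]], List[float]]


def make_client(local_data: Sequence[float], lr: float = 0.5) -> Client:
    """Hypothetical client that nudges one parameter toward the mean of its own data."""
    def train(global_weights: List[float]) -> List[float]:
        w = global_weights[0]
        local_mean = sum(local_data) / len(local_data)
        return [w + lr * (local_mean - w)]  # one local step; local_data is never returned
    return train


def run_round(global_weights: List[float], clients: Sequence[Client]) -> List[float]:
    """One round: distribute the global model, collect local models, aggregate them."""
    # Distribute and collect: each client receives its own copy of the global weights.
    local_models = [client(list(global_weights)) for client in clients]
    # Aggregate: unweighted average of each parameter across clients (a simplification).
    n = len(local_models)
    return [sum(m[i] for m in local_models) / n for i in range(len(global_weights))]


clients = [make_client([1.0, 2.0, 3.0]), make_client([10.0, 12.0])]
global_weights = [0.0]  # the server initializes the global model
for _ in range(20):
    # The aggregated model is redistributed at the start of the next round.
    global_weights = run_round(global_weights, clients)
```

Note that `run_round` only ever sees the weights each client returns. The lists passed to `make_client` stay captured inside the client functions, which mirrors the point that the server does not access raw data.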

IT leaders also need a defined communication protocol that governs how “clients and the server exchange information, primarily the model parameters and updates,” Google notes. “Efficient and secure communication protocols are crucial, especially given the potential for a massive number of clients and varying network conditions.”
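As one possible illustration of a client-to-server exchange, the sketch below packs a model update into a JSON message and attaches an HMAC tag so the server can detect tampering. The message fields, the function names `pack_update` and `unpack_update`, and the shared key are all assumptions made for this example. It covers integrity only, and assumes confidentiality is handled separately by an encrypted transport such as TLS.

```python
import hashlib
import hmac
import json
from typing import List


def pack_update(client_id: str, round_id: int, delta: List[float], key: bytes) -> bytes:
    """Serialize an update and attach an HMAC-SHA256 tag computed with a shared key."""
    payload = json.dumps(
        {"client_id": client_id, "round": round_id, "delta": delta}, sort_keys=True
    )
    tag = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return json.dumps({"payload": payload, "tag": tag}).encode("utf-8")


def unpack_update(message: bytes, key: bytes) -> dict:
    """Verify the tag, then return the update fields; raise ValueError if the check fails."""
    envelope = json.loads(message.decode("utf-8"))
    expected = hmac.new(key, envelope["payload"].encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest performs a constant-time comparison of the two tags.
    if not hmac.compare_digest(expected, envelope["tag"]):
        raise ValueError("update failed integrity check")
    return json.loads(envelope["payload"])


shared_key = b"hypothetical-demo-key"  # illustrative only; real keys need proper management
message = pack_update("hospital-a", 3, [0.12, -0.04], shared_key)
received = unpack_update(message, shared_key)
```

Sending JSON is a readability choice for the sketch. With a massive number of clients and varying network conditions, a compact binary encoding or compressed updates may be a better fit.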

Finally, a model aggregation algorithm determines how the central server combines the model updates received from the clients. “Algorithms like federated averaging are commonly used to average the weights or gradients, creating a single, improved global model,” Google notes.
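A minimal sketch of the averaging step might look like the following. It assumes the common formulation of federated averaging in which each client's weights count in proportion to the number of local samples it trained on. The function name `federated_average` and the example numbers are hypothetical.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Average client weight vectors, weighting each client by its local sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client weight vector")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical clients: one trained on 100 samples, the other on 300.
new_global = federated_average([[1.0, 0.0], [3.0, 4.0]], [100, 300])
# First parameter:  (1.0*100 + 3.0*300) / 400 = 2.5
# Second parameter: (0.0*100 + 4.0*300) / 400 = 3.0
```

Weighting by sample count means a participant with more data has more influence on the global model, which is usually the intent. A consortium that wants every institution to count equally could pass equal sizes instead.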

Organizations also need data and model governance, says Lange.

Building a Cross-Enterprise AI Ecosystem: The Key Takeaway

Ultimately, federated learning is about cross-enterprise collaboration for building AI models without sharing raw data, Lange says.

“In regulated, multienterprise environments, it can unlock better models without forcing organizations to give up control over their most sensitive data,” she adds. “For success, it’s critical that the participants establish clear agreements on data ownership, contributions and responsibilities.”

BizTech Magazine
