Serverless GPUs Deliver Value at Lower Cost
Serverless technology, often seen as the ultimate incarnation of cloud computing, allows developers to create and run applications in the cloud without provisioning or managing servers or back-end infrastructure.
With serverless GPUs, companies get the performance benefits of GPUs while keeping costs in check, using the elasticity of cloud infrastructure to spin capacity up or down as demand requires, Greden says. That makes them ideal for AI applications that require massive amounts of computing power but don’t run constantly.
“What we now have is a case where the aperture is opening for the market that will want to make use of GPUs, and not just those doing heavy graphical type of computing,” she says.
LEARN MORE: How can you use serverless computing to build and modernize applications for scale?
Get Agility Plus Capacity With Serverless GPUs
Serverless GPUs essentially operate as a Platform as a Service or even a Function as a Service, giving organizations access to GPU computing capacity for their applications without provisioning the underlying infrastructure, says Brijesh Kumar, a senior research analyst within IDC’s cloud application deployment platforms research practice.
They are ideal when organizations cannot reliably predict the traffic load their applications will place on cloud computing capacity, he says. The technology lets them spin up GPU capacity when requests come in and demand is high, then scale down to zero when requests stop.
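As a rough illustration of that request-driven model, the short Python sketch below simulates a pool that scales GPU capacity up as requests queue and back down to zero once the queue empties. The GpuPool class, its method names and the one-instance-per-request rule are illustrative assumptions for this article, not any cloud provider’s actual autoscaling API.

```python
# Simplified sketch of request-driven scaling behind serverless GPUs:
# capacity rises when requests arrive and returns to zero when the
# queue stays empty. GpuPool and its rules are illustrative only.
from collections import deque
from dataclasses import dataclass, field


@dataclass
class GpuPool:
    max_instances: int = 8          # cap on how far a spike can scale out
    instances: int = 0              # start scaled to zero
    queue: deque = field(default_factory=deque)

    def submit(self, request: str) -> None:
        """Queue a request; the next scaling pass may add GPU instances."""
        self.queue.append(request)

    def autoscale(self) -> None:
        """One scaling pass: match running instances to queued demand."""
        if self.queue:
            # Spin up roughly one instance per pending request, up to the cap.
            self.instances = min(self.max_instances, len(self.queue))
        else:
            # No pending work: scale to zero so no GPU time is billed.
            self.instances = 0

    def drain(self) -> list[str]:
        """Serve as many queued requests as current instances allow."""
        served = []
        for _ in range(min(self.instances, len(self.queue))):
            served.append(f"result for {self.queue.popleft()}")
        return served


pool = GpuPool()
pool.submit("summarize quarterly report")
pool.submit("draft customer email")
pool.autoscale()
print(pool.instances)   # 2 -> capacity spun up on demand
print(pool.drain())
pool.autoscale()
print(pool.instances)   # 0 -> scaled back down once requests stop
```

In an actual serverless GPU offering, the provider runs equivalent logic behind a function or container endpoint, so the organization never touches the instances directly.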
The technology also supports multitenancy and multi-instance capabilities, allowing cloud providers to partition a single GPU to serve workload requests from multiple users or sources, Kumar notes. Serverless GPUs also cut costs by removing the need to manage the underlying infrastructure, he adds.
WATCH: Discover how DevOps can add speed and efficiency to your process.
However, there are some potential drawbacks to serverless GPUs. Cost can become a constraint: running a serverless GPU for an extended period racks up charges with an organization’s cloud provider, and an unexpected spike in requests can also drive up the bill. And, Greden says, organizations risk being locked into a particular cloud provider for serverless GPU capabilities.
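To see why sustained use erodes the benefit, consider a purely hypothetical rate of $0.001 per GPU-second: an application that bursts to a total of two hours of GPU time per day costs roughly $0.001 × 7,200 seconds, or about $7.20 per day, while keeping serverless capacity busy around the clock bills all 86,400 seconds, roughly $86 per day, at which point reserved or owned GPU capacity may be the cheaper option.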
Even so, serverless GPUs can deliver answers quickly to users of generative AI applications because the underlying hardware performs the required computations at speed. And with market demand for GPU chipsets still extremely high, serverless GPUs could be a critical backstop while silicon providers rush to produce as many chips as they can, Greden says.