With our neuroscience-based optimization techniques, we shift the model-accuracy scaling laws so that, at a fixed cost or a given performance level, our models achieve higher accuracy than their standard counterparts.
Salad is a global cloud platform that harnesses latent compute resources from idle, high-end consumer hardware to power and distribute computing applications more affordably than traditional data centers. Salad and Numenta partnered to improve price-performance for AI inference.
Scaling up deep learning models without breaking the bank
Deploying deep learning systems today can be costly and complex. Models are becoming larger, as we see with large language models (LLMs) moving from millions to billions of parameters. Additionally, the reliance on highly available processing resources pushes many teams to deploy their networks to the public cloud, bringing restrictive technical requirements, expensive model development resources, and rising cloud spend. New methods are needed to optimize and scale these models on specialized hardware.
Deploying Numenta’s AI Inference Server on Salad Container Engine
Using hardware-aware optimizations and neuroscience-based acceleration techniques, we created an optimized BERT-Base model and deployed it on the Salad Container Engine (SCE), a fully managed orchestration platform built to facilitate container deployments on Salad’s distributed cloud. To assess the price-performance benefits of Numenta technology on SCE, we benchmarked our optimized BERT-Base model against a standard BERT-Base on four different Amazon Web Services (AWS) configurations and SCE.
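A throughput benchmark like the one described above can be sketched with a simple timing harness. This is an illustrative sketch only, not Numenta's actual benchmarking code; the `fake_model` stand-in is hypothetical and would be replaced by a real BERT-Base forward pass on the target instance.

```python
import time

def benchmark_throughput(infer_fn, batch, n_warmup=3, n_iters=10):
    """Measure inference throughput (inferences/second) for a model call.

    infer_fn: callable that runs one forward pass over `batch`.
    batch: the list of inputs processed per call.
    """
    for _ in range(n_warmup):           # warm-up passes: stabilize caches/JIT
        infer_fn(batch)
    start = time.perf_counter()
    for _ in range(n_iters):            # timed passes
        infer_fn(batch)
    elapsed = time.perf_counter() - start
    return (n_iters * len(batch)) / elapsed

# Stand-in "model" for illustration; swap in a real BERT-Base inference call.
fake_model = lambda batch: [len(x) for x in batch]
throughput = benchmark_throughput(fake_model, ["example text"] * 32)
```

Running the same harness with the optimized and the standard model on each configuration yields directly comparable inferences-per-second numbers.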
10x more inferences per dollar
Our optimized BERT-Base model delivered higher inference throughput than a standard BERT-Base on every AWS instance we tested. Deployed on Salad's infrastructure, it achieved a 10x price-performance improvement over a standard BERT-Base running on AWS.
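Price-performance here reduces to a simple ratio: inferences per dollar = (inferences per second × 3600 s/h) ÷ (cost per hour). The numbers below are purely hypothetical stand-ins chosen to illustrate how a 10x gap arises from both higher throughput and a lower hourly rate; they are not the measured benchmark figures.

```python
def inferences_per_dollar(throughput_per_sec, hourly_cost_usd):
    """Inferences per dollar = (inferences/sec * 3600 s/h) / ($/h)."""
    return throughput_per_sec * 3600 / hourly_cost_usd

# Hypothetical illustration only (not the benchmark's actual numbers):
baseline  = inferences_per_dollar(100, 1.00)   # standard model on a pricier instance
optimized = inferences_per_dollar(250, 0.25)   # faster model on cheaper compute
ratio = optimized / baseline                   # 2.5x throughput * 4x cheaper = 10x
```

The multiplicative structure is the point: a modest speedup combined with lower on-demand pricing compounds into a large inferences-per-dollar gain.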
Cost savings and performance speed-ups that enable AI deployment at scale
The combination of Numenta + SCE allows users interested in deploying deep learning models to benefit from the best of both worlds: performance improvements from Numenta’s optimized models and more affordable on-demand cloud service provider pricing from Salad.
- Do more with your existing budget by running 10x more inferences per dollar
- Run existing workloads at 10% of your current cost
- Enable users who previously couldn’t afford to run deep learning models to do so
Interested in working with us?