Numenta technologies combined with the new Advanced Matrix Extensions (Intel AMX) in the 4th Gen Intel Xeon Scalable processors yield breakthrough results.
Private Beta Program
Achieve dramatic performance improvements in your deep learning networks
About the Beta Program
Numenta's products are currently in beta. For this limited, closed Beta Program, we are working with a handful of customers to build highly performant, cost-effective deep learning networks for Natural Language Processing, Generative AI, and Computer Vision.
Solutions for real-time NLP and Computer Vision:
- ultra-low latency on BERT-Base
- throughput inference speedup
- higher throughput on BERT-Large
- cost reduction on inference

Who Should Apply
The beta is open to companies of all sizes and locations, and our customers typically fall into three categories:
- Companies that want to use deep learning but haven't yet
- Companies that use deep learning in production, but not Transformer networks
- Companies that already run Transformers in production
Whether you've never run deep learning in production or are looking to accelerate multiple Transformer models, we can help you achieve significant performance improvements.
For more details, read our blog.
We focus on deploying and accelerating deep learning networks for inference, and have applied our solutions across a range of natural language processing and computer vision use cases. You'll get the most out of our beta program if:
- You are encumbered by legacy systems and traditional machine learning techniques, making it difficult to scale to new customers. We'll guide you to models that offer better accuracy than standard techniques, and give you the support you need to jump-start your deep learning journey.
- You would like to squeeze more out of your deep learning budget. Increase throughput and run more inferences without increasing your budget. Our products deliver dramatic performance boosts that enable larger models at lower costs.
- You are ready to enable new deep learning applications with improved accuracy. Our beta program offers a cost-effective way to start using Transformers in your applications. Our technology accelerates Transformers by mapping structures of the brain to modern hardware, enabling efficient execution on today's CPUs.
- You face high latencies in your real-time applications. For many time-sensitive AI applications, milliseconds matter. With Numenta, you can meet and exceed industry requirements with high throughput and ultra-low latency.
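If you are evaluating whether your own workloads meet latency and throughput targets like those above, a small measurement harness is a good starting point. The sketch below is a hypothetical, minimal benchmark (not Numenta's tooling): it times repeated calls to an inference function and reports median latency, tail latency, and throughput. The toy workload stands in for a real model forward pass such as a BERT inference call.

```python
import statistics
import time

def benchmark(fn, warmup=10, iters=200):
    """Measure per-call latency (ms) of an inference function.

    Runs a few warmup calls first (to avoid timing cold caches or
    lazy initialization), then returns median (p50) and tail (p99)
    latency plus throughput in calls per second.
    """
    for _ in range(warmup):
        fn()
    times_ms = []
    for _ in range(iters):
        t0 = time.perf_counter()
        fn()
        times_ms.append((time.perf_counter() - t0) * 1e3)
    times_ms.sort()
    return {
        "p50_ms": statistics.median(times_ms),
        "p99_ms": times_ms[int(0.99 * len(times_ms)) - 1],
        "throughput_per_s": 1e3 * len(times_ms) / sum(times_ms),
    }

# Toy workload standing in for a real model call (e.g. a BERT forward pass).
stats = benchmark(lambda: sum(i * i for i in range(10_000)))
```

For real-time requirements, the p99 figure usually matters more than the mean: a service with a tight latency budget must hold its tail, not just its average, under the threshold.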
Learn more about our solutions here.
Underlying our AI products and solutions are two decades of deep neuroscience research and breakthrough advances in cortical theory. Our research has uncovered core principles of the neocortex that are not reflected in today's machine learning systems. These principles allow us to define new architectures, data structures, and algorithms that deliver significant benefits in today's deep learning networks and unlock a new wave of intelligent, neuroscience-based computing.
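One neocortical principle Numenta's published research emphasizes is sparsity: at any moment, only a small fraction of neurons are active. As a purely illustrative sketch (an assumption for illustration, not Numenta's actual kernels, which rely on packed formats and SIMD/AMX instructions), the toy dot product below skips zero activations, showing why sparse representations can cut the arithmetic a CPU must perform:

```python
import random

def sparse_dot(x, w):
    """Dot product that skips zero activations.

    If only k% of the entries of x are nonzero, only k% of the
    multiply-adds actually run -- the basic reason highly sparse
    networks can be much cheaper to execute than dense ones.
    """
    return sum(xi * wi for xi, wi in zip(x, w) if xi != 0.0)

random.seed(0)
n = 1000
w = [random.gauss(0, 1) for _ in range(n)]
# Activation vector with roughly 90% zeros (about 10% active units).
x = [random.gauss(0, 1) if random.random() < 0.1 else 0.0 for _ in range(n)]
y = sparse_dot(x, w)
```

The result is identical to the dense dot product, since zero entries contribute nothing; the savings come entirely from never touching them.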
Learn more about the technology behind our AI products here.
Our Solutions in Action
Numenta Transformer models significantly accelerate CPU inference while maintaining competitive accuracy.
With unique acceleration techniques built on neuroscience insights, our AI platform delivers high throughput at target low latencies for inference on CPUs.
Numenta-optimized networks deliver throughput acceleration and energy savings, and enable new potential for edge applications.