HTM cortical learning algorithms

This page gives an overview of the HTM cortical learning algorithms. For a detailed description, see our paper on the HTM cortical learning algorithms, available in the papers section of the web site.

HTM networks are modeled on the neocortex, the seat of human intelligence. They capture the essence of how humans learn, recognize patterns, and make predictions. At the heart of every HTM network is a set of learning algorithms that model the organization and behavior of a layer of neurons in the neocortex. In the same way that humans learn from their environment, the HTM cortical learning algorithms perform the difficult task of discovering the temporal structure in large and complex data streams. By observing how data changes over time, an HTM network learns which patterns are significant and causally related. After learning, an HTM network can recognize novel patterns and make predictions.

Starting in 2005, Numenta experimented with several approaches to time-based learning. Some of these experimental algorithms were released in early versions of the NuPIC development environment. In November 2009, we discovered a new learning method based on a model of a layer of cells in the neocortex. The new method is far superior to the previous methods, so we switched all our efforts to developing and testing it. For a brief time we used the term "FDR" to refer to the new algorithms, but going forward we will refer to them as the "HTM cortical learning algorithms". We have posted documentation about these new algorithms and plan to include them in a software release. The older learning algorithms are currently still available for experimentation, but we expect that all customers will use the new algorithms once they are released.

The HTM cortical learning algorithms, following the corresponding biological theory, perform the following functions.

  1. Convert input patterns into sparse distributed representations.
    The brain represents patterns through the activation of sets of cells, in a way that is described mathematically as a "Sparse Distributed Representation." Sparse distributed representations have many desirable qualities, including robustness to noise, high capacity, and the ability to simultaneously encode multiple meanings. The HTM cortical learning algorithms take advantage of these properties.
  2. Learn common transitions between sparse distributed representations.
    The neocortex learns by observing streams of temporal data, i.e. "movies" as opposed to "snapshots". When exposed to streams of sensory data the cortical learning algorithms remember transitions between patterns in the input stream. The transitions that occur again and again are reinforced; the transitions that do not occur again are forgotten. In the neocortex, this memory of transitions corresponds to the lateral connections between cells in a layer of a region.
  3. Predict likely future events.
    As described in On Intelligence, prediction is a key feature of human intelligence, and as we observe our environment we continuously predict what will happen next. When exposed to a sensory input, the HTM cortical learning algorithms use the previously learned transitions to make a prediction of likely future inputs. Depending on the learned transitions, the prediction can be massively parallel (many possible next inputs predicted at once) or highly specific.
  4. Send predictions to the next level in the hierarchy.
    The neocortex is organized in a hierarchy of levels, where information (e.g. signals from the retina) comes into the lowest level, propagates to the next higher level, and so on. The HTM cortical learning algorithms operate at each level in the hierarchy. At each level the predicted patterns are combined, and this union of predictions becomes the output of that level in the HTM. The next level takes this input and turns it back into a sparse distributed representation. Forming a union of predictions is equivalent to a many-to-one mapping, and it leads to increased stability as you ascend the hierarchy. Both of these properties are required in hierarchical models.
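
As a rough illustration of the four functions above, the following Python sketch encodes symbols as sparse sets of active bits, reinforces transitions that recur, lets unrepeated transitions fade, and outputs the union of everything predicted to come next. It is a deliberately simplified, first-order toy and not the HTM cortical learning algorithms themselves: the names `encode` and `TransitionMemory`, the 2048-bit width, the 40 active bits, and the increment, decay, and threshold values are all assumptions made for this example.

```python
import random

N_BITS = 2048    # assumed SDR width; not specified in the text above
N_ACTIVE = 40    # assumed number of active bits (about 2% sparsity)


def encode(symbol):
    """Function 1: map a symbol to a repeatable sparse set of active bit indices."""
    rng = random.Random(symbol)  # seeding with the symbol makes the mapping repeatable
    return frozenset(rng.sample(range(N_BITS), N_ACTIVE))


class TransitionMemory:
    """Hypothetical first-order transition store (functions 2 to 4)."""

    def __init__(self, increment=1.0, decay=0.25, threshold=0.5):
        self.strength = {}          # previous SDR -> {next SDR: strength}
        self.increment = increment  # reinforcement for an observed transition
        self.decay = decay          # penalty for competing transitions not observed
        self.threshold = threshold  # minimum strength needed to contribute a prediction

    def learn(self, prev_sdr, next_sdr):
        """Function 2: reinforce the observed transition, weaken the alternatives."""
        followers = self.strength.setdefault(prev_sdr, {})
        for sdr in list(followers):
            if sdr != next_sdr:
                followers[sdr] -= self.decay
                if followers[sdr] <= 0.0:
                    del followers[sdr]  # a transition that never recurs is forgotten
        followers[next_sdr] = followers.get(next_sdr, 0.0) + self.increment

    def predict(self, current_sdr):
        """Functions 3 and 4: return the union of all sufficiently strong predictions."""
        predicted = set()
        for sdr, strength in self.strength.get(current_sdr, {}).items():
            if strength >= self.threshold:
                predicted |= sdr
        return frozenset(predicted)


memory = TransitionMemory()
for sequence in [["A", "B", "C"], ["A", "B", "D"]] * 5:
    for prev, nxt in zip(sequence, sequence[1:]):
        memory.learn(encode(prev), encode(nxt))

after_a = memory.predict(encode("A"))  # specific: only B has ever followed A
after_b = memory.predict(encode("B"))  # parallel: both C and D have followed B
print(after_a == encode("B"))                # True
print(after_b == encode("C") | encode("D"))  # True
```

In this sketch the union returned by `predict` plays the role of a level's output. A next level would have to turn that union back into a sparse representation, a step the sketch leaves out. The exact-match lookup on the previous pattern is also a simplification, so the sketch has no variable order memory.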

The HTM cortical learning algorithms model the behavior of a layer of cells in the neocortex, but they also exhibit a number of mathematical properties that are recognized as being important for machine learning.

  • High capacity
    Sparse distributed representations composed of just a few thousand bits can represent a very large number of distinct entities.
  • Robustness to noise
    Sparse distributed representations and the HTM cortical learning algorithms are highly resistant to noise and occlusions. Performance degrades slowly with noise.
  • On-line learning
    The HTM cortical learning algorithms can learn on-line, meaning they can learn while doing inference. Brains can learn all the time. On-line learning is important for applications where the statistics can change over time.
  • Variable order sequence memory and prediction
    "Variable order" means that the amount of past context needed to make a prediction varies. Sometimes you need to go back a long way in time to make a prediction, and sometimes you only need to go back a short way. The HTM cortical learning algorithms automatically learn the variable order statistics in the data and adapt if those statistics change. They achieve variable order memory by modeling the columnar organization of cells in a layer of the neocortex: cells in a column have similar feed-forward properties but respond differently in the context of different sequences.
  • Sub-sampling
    An important property of sparse distributed representations is that knowing only a few active bits of a representation is almost as good as knowing all of them. Nowhere in the HTM cortical learning algorithms do we store copies of entire patterns. Learning is based on small sub-samples of patterns that, among other things, enable new means of generalization. These sub-samples correspond to the sets of synapses that form within an integrative region of a dendrite on a neuron.
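
The capacity, noise robustness, and sub-sampling properties above can be checked numerically with a short Python sketch. The sizes used here (2048 bits, 40 active, a 10-bit sub-sample, a match threshold of 8) and all function names are assumptions chosen for illustration; they are not taken from the algorithms' documentation.

```python
import math
import random

N_BITS = 2048    # assumed SDR width
N_ACTIVE = 40    # assumed number of active bits


def capacity(n_bits, n_active):
    """Number of distinct patterns with exactly n_active of n_bits set."""
    return math.comb(n_bits, n_active)


def random_sdr(rng):
    """Draw a random sparse pattern as a set of active bit indices."""
    return frozenset(rng.sample(range(N_BITS), N_ACTIVE))


def overlap(a, b):
    """Count the active bits that two patterns share."""
    return len(a & b)


def add_noise(sdr, fraction, rng):
    """Move the given fraction of active bits to previously inactive positions."""
    n_moved = int(len(sdr) * fraction)
    dropped = set(rng.sample(sorted(sdr), n_moved))
    inactive = [i for i in range(N_BITS) if i not in sdr]
    added = set(rng.sample(inactive, n_moved))
    return frozenset((sdr - dropped) | added)


def subsample(sdr, k, rng):
    """Keep only k of the active bits, as a stored sub-sample of the pattern."""
    return frozenset(rng.sample(sorted(sdr), k))


def matches(stored, candidate, threshold):
    """Recognize a candidate if it shares at least `threshold` bits with the sub-sample."""
    return overlap(stored, candidate) >= threshold


rng = random.Random(0)
pattern = random_sdr(rng)
noisy = add_noise(pattern, 0.25, rng)
stored = subsample(pattern, 10, rng)

print(capacity(N_BITS, N_ACTIVE) > 10**80)     # True: far more patterns than needed
print(overlap(pattern, noisy))                 # 30 of 40 bits survive 25% noise
print(matches(stored, pattern, 8))             # True: 10 stored bits identify the pattern
print(overlap(stored, random_sdr(rng)))        # an unrelated pattern shares few or no bits
```

Because unrelated sparse patterns rarely share more than a bit or two, a threshold well below the sub-sample size still separates the stored pattern from chance matches. The same property is why performance degrades slowly with noise.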

For additional information, see the Papers and videos section of our web site.