2 Developing a Simple HTM
This chapter explores the sequence of tasks involved in developing an HTM application.
Development Overview
The process of creating an HTM application differs significantly from a traditional software engineering process. HTM Networks learn by building a statistical model based on a sequence of input data. The quality of the final network depends on many factors, such as the network configuration (does the hierarchy have two levels or three?), node parameters (is maxDistance 0 or 0.2?), the training data (are there sufficient sequences?), and so on.
Because of this, the HTM development process is iterative at a very high level. You don't just write the program, find problems, rewrite, and rerun. Instead, each task in the process might influence other tasks earlier or later in the process. While you're designing the HTM Network, you might need to refine the problem definition. While you're testing and analyzing the results, you might find that the data representation is less than optimal.
Pay particular attention to the early stages of the process. Unless your data are represented in a way that's easy to understand for the HTM (and often for humans as well), the results will most likely not be satisfactory.
Figure 2 Numenta Development Process
Development Tasks
This section briefly discusses each task in the development process. The Advanced NuPIC Programming document explores each task in more detail.
Defining the Problem
A well-defined problem is essential for HTM Network development.
Before you start writing your code, you must determine the spatial and temporal correlations in your problem, and you must consider how these correlations are structured hierarchically.
To be successful, your HTM Network must build a statistical model of the underlying causes in your problem domain.
Representing Data
You must represent the data so they match the sensor node, which receives input for the HTM Network. You need to prepare multiple datasets: training data, category data (if available), and testing data.
Here are a few points to consider:
- Unless you're using a custom sensor, make sure the data match the VectorFileSensor.
If your data source does not produce vectors as output, manipulating the data so that they fit the available sensors is usually the best solution. You can also create a custom sensor using the node plug-in API.
- For the training dataset, include both an input file and a category file.
For some problems, no category information might be available. In that case, the system groups data using the spatial and temporal information it can abstract from the data.
- For the testing dataset, make sure the data are new to the system in interesting ways. If you've trained a vision system using photos of objects, you could include in the testing dataset photos of the same type of object but a different specific object (for example a teapot that's shaped slightly differently from the teapot you used for training). You might also present pictures of the objects from different angles.
If source data are incomplete, or if you want to start developing your HTM Network prototype with a small, easily controlled set of data, data generation makes sense. You might use only generated data, as in the Bitworm example, or modify existing data in different ways for a more complete training set.
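As a minimal sketch of data generation, the following Python functions build a small synthetic training set and write it as an input file plus a matching category file. The pattern shapes, the function names, and the one-vector-per-line text layout are assumptions chosen for illustration. They are not the Bitworm generator or the documented VectorFileSensor file format, so adapt the writer to whatever your sensor expects.

```python
import random


def make_pattern(kind, width, length, offset):
    """Return a binary vector with a 'solid' or 'striped' run starting at offset."""
    vector = [0] * width
    for i in range(length):
        # A solid run sets every position; a striped run sets every other one.
        if kind == "solid" or i % 2 == 0:
            vector[offset + i] = 1
    return vector


def generate_dataset(num_sequences, width=16, length=8, seed=0):
    """Build (vectors, categories): each sequence slides one pattern left to right."""
    rng = random.Random(seed)  # fixed seed keeps the dataset reproducible
    kinds = ["solid", "striped"]  # hypothetical categories 0 and 1
    vectors, categories = [], []
    for _ in range(num_sequences):
        category = rng.randrange(len(kinds))
        for offset in range(width - length + 1):
            vectors.append(make_pattern(kinds[category], width, length, offset))
            categories.append(category)
    return vectors, categories


def write_files(vectors, categories, input_path, category_path):
    """Write one space-separated vector per line and one category per line."""
    with open(input_path, "w") as f:
        for vector in vectors:
            f.write(" ".join(str(v) for v in vector) + "\n")
    with open(category_path, "w") as f:
        for category in categories:
            f.write(f"{category}\n")
```

Calling `generate_dataset(num_sequences=20)` and passing the result to `write_files` with two file names of your choice produces a training pair. Generating the testing set with a different seed or different pattern lengths is one way to make the test data new to the system.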
Designing and Configuring the HTM Network
Designing, configuring, and running the HTM Network are iterative steps.
As part of analysis and testing, experiment with a different number of levels, different number of nodes per level, and different node parameter settings.
The following questions might help you to make decisions when designing your HTM application.
- How much data are appropriate? Too much data generally doesn't hurt, although it might slow down processing. Too little data might not be enough for learning. Experimentation is usually necessary but it's better to err on the side of having too much data.
- What are the best values for node parameters? See Affecting Learning Node Behavior With Node Parameters on page 42 in Advanced NuPIC Programming for an introduction. For a more detailed discussion, see the white paper Zeta1 Algorithms Reference, available on the Numenta website.
- What is the best way to arrange nodes (how many levels, how linked, etc.)? Network geometry affects the number of groups that are generated.
In general, there are no easy answers to these questions. In many cases, you need to experiment with the available options. For example, you might run your HTM first with two levels of learning nodes, then with three levels and compare the results. The Numenta website and the forums have some discussion on those topics.
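The kind of experimentation described above can be organized as a simple sweep over candidate settings. The sketch below is plain Python and does not call any NuPIC API: `sweep_configurations` and `placeholder_evaluate` are hypothetical names, and the placeholder scorer only exists so the example runs. In practice you would replace it with a function that builds, trains, and tests your HTM Network for one configuration and returns a score such as test accuracy.

```python
import itertools


def sweep_configurations(level_options, max_distance_options, evaluate):
    """Score every combination of options and return results, best first.

    evaluate is a caller-supplied function (hypothetical here) that runs one
    configuration and returns a numeric score, higher being better.
    """
    results = []
    for levels, max_distance in itertools.product(level_options, max_distance_options):
        config = {"levels": levels, "maxDistance": max_distance}
        results.append((evaluate(config), config))
    # Sort by score only, highest first, so ties never compare the dicts.
    results.sort(key=lambda item: item[0], reverse=True)
    return results


def placeholder_evaluate(config):
    """Stand-in scorer so the sketch runs; replace with a real train/test run."""
    return 0.5 + 0.1 * config["levels"] - config["maxDistance"]


ranked = sweep_configurations([2, 3], [0, 0.2], placeholder_evaluate)
for score, config in ranked:
    print(f"{score:.2f}  {config}")
```

Keeping the configuration as a dictionary makes it easy to add further options, such as the number of nodes per level, without changing the ranking logic.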
Running the Network to Perform Learning and Inference
After the HTM Network file has been saved, you can train the network. Before you start your training runs, asking the following questions might be useful:
- What is the best sequence of training data to quickly train the HTM?
- Can I reuse some trained nodes? For instance, the Pictures example trains one node, then makes copies of that node for all other nodes at that level. See Training Your HTM Network: The Pictures Example.
- Can I speed up learning by supplying additional category information at certain levels?
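The node-reuse question in the list above can be illustrated with a toy sketch. `ToyNode` is a hypothetical stand-in that merely memorizes the patterns it sees. It is not a NuPIC node class, and the Pictures example relies on the NuPIC tools rather than on this code. The point is only the workflow: train one node, then fill the rest of the level with independent copies.

```python
import copy


class ToyNode:
    """Hypothetical node that memorizes distinct input patterns."""

    def __init__(self):
        self.patterns = []

    def learn(self, pattern):
        # Store each distinct pattern once, in order of first appearance.
        if pattern not in self.patterns:
            self.patterns.append(pattern)

    def infer(self, pattern):
        # Return the index of a known pattern, or None if it was never seen.
        return self.patterns.index(pattern) if pattern in self.patterns else None


def build_level_from_trained_node(trained_node, node_count):
    """Return a level of node_count independent deep copies of one trained node."""
    return [copy.deepcopy(trained_node) for _ in range(node_count)]


master = ToyNode()
for pattern in [(1, 0, 0, 1), (0, 1, 1, 0), (1, 0, 0, 1)]:
    master.learn(pattern)

level = build_level_from_trained_node(master, node_count=4)
```

Deep copies are used so that later changes to one node do not leak into the others. This shortcut only makes sense when every node at the level is expected to see statistically similar input.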
The following materials discuss training:
- Inside a Learning Node: How Learning and Inference Happen on page 38 in Advanced NuPIC Programming gives an introduction to the process.
- The white paper The HTM Learning Algorithms offers an in-depth discussion of HTMs in general.
- The white paper Zeta 1 Algorithms Reference discusses the Zeta1Node learning algorithm, including all parameters you can change.
Debugging, Testing, and Analyzing Your HTM System
When you first attempt to run your HTM Network, you might need to troubleshoot if it doesn't work at all. Once the HTM Network runs satisfactorily on the training data set, you might wish to improve accuracy, speed, or both.
It is often useful to examine intermediate results or to experiment with different parameter settings to understand how your HTM application processed the input data. Based on that information, you can change how you enter the data, change nodes and their parameters, or modify other characteristics of your HTM Network.
A number of tools and scripts allow you to view results and explore how the HTM Network arrived at the result.
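Independently of those tools, a few lines of Python are often enough for a first look at accuracy. The sketch below assumes you have already collected two equal-length lists, the expected category for each test input and the category the network inferred. The function name and the sample labels are hypothetical, and nothing here reads NuPIC output files directly.

```python
from collections import Counter


def summarize_results(expected, predicted):
    """Return overall accuracy and a Counter of (expected, predicted) pairs."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    if not expected:
        raise ValueError("no results to summarize")
    confusion = Counter(zip(expected, predicted))
    correct = sum(count for (exp, pred), count in confusion.items() if exp == pred)
    return correct / len(expected), confusion


# Hypothetical category labels collected from a test run.
expected = [0, 0, 1, 1, 1, 0]
predicted = [0, 1, 1, 1, 0, 0]

accuracy, confusion = summarize_results(expected, predicted)
print(f"accuracy: {accuracy:.2f}")
for (exp, pred), count in sorted(confusion.items()):
    if exp != pred:
        print(f"category {exp} mistaken for {pred}: {count} time(s)")
```

Listing which categories are confused with each other usually says more than the single accuracy figure, because it points to the inputs whose representation or training data might need another iteration.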
See Debugging Your HTM Application for troubleshooting steps to try if the network doesn't work at all. For more detailed information, see Testing and Improving HTM Network Performance on page 71 in Advanced NuPIC Programming.
Summary and Guide to Documentation
The process from problem definition to HTM deployment is not linear. At each stage, you might need to return to an earlier stage for redefinition or refinement.
The following table summarizes the tasks involved in designing and implementing an HTM Network and lists relevant documentation for each task.
Table 2: Tasks Overview and Documentation
Task Documentation (Concepts) Documentation (Task-based)
Numenta website. The white paper Problems that Fit HTMs Numenta website. Look at example code for examples of problem definition. Numenta website. Numenta website. Look at example code for examples of data representation (Pictures example) and data generation (all examples). Numenta website. Python online help. The white paper Zeta 1 Algorithms Reference Python online help. Numenta website. Numenta website (Pictures Case Study on the Community Wiki)
Numenta www.Numenta.com