5 Debugging Your HTM Application
This chapter gives some introductory information on troubleshooting your HTM application.
Making Sure The Data Are Valid
When the results don't make sense, check first whether the data you're using are fed in appropriately and picked up by the HTM Network as expected. A number of things can go wrong with data preparation and data submission. Ask yourself these questions:
- Are your sensors picking up the correct training data files? If you're not sure, move or rename the training data files and confirm that an error results.
- Are you using the right number of training files? In some examples with multiple sensors, the system expects a corresponding number of training files. If you're using both a data sensor and a category sensor, you must submit two files.
- Is your category data file in sync with the data file? It's easy to make a mistake, which results in wrong information for each array in the training data.
- Are your data formatted correctly? Different sensors might have different formatting requirements.
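Some of these checks can be automated before training. The following is a minimal sketch, assuming whitespace-separated text files with one input vector per line in the data file and one category label per line in the category file. The function name and file layout are hypothetical and not part of NuPIC, so adapt them to the format your sensors expect.

```python
import os


def check_training_files(data_path, category_path):
    """Return a list of problems found in a data/category file pair.

    Assumes (hypothetically) one whitespace-separated vector per line in
    the data file and one category label per line in the category file.
    """
    problems = []
    for path in (data_path, category_path):
        if not os.path.isfile(path):
            problems.append("missing file: %s" % path)
    if problems:
        return problems

    with open(data_path) as f:
        rows = [line.split() for line in f if line.strip()]
    with open(category_path) as f:
        labels = [line.strip() for line in f if line.strip()]

    # The category file must stay in sync with the data file.
    if len(rows) != len(labels):
        problems.append("row count mismatch: %d vectors, %d categories"
                        % (len(rows), len(labels)))

    # Every vector should have the same number of elements.
    widths = set(len(row) for row in rows)
    if len(widths) > 1:
        problems.append("inconsistent vector widths: %s" % sorted(widths))

    # Every element should parse as a number.
    for i, row in enumerate(rows):
        for value in row:
            try:
                float(value)
            except ValueError:
                problems.append("vector %d: non-numeric value %r" % (i + 1, value))
                break
    return problems
```

An empty result means none of these particular checks failed. It does not prove that the sensor reads the files the way you intend.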
Per-Level Training: Troubleshooting Efficiently
If you train your HTM Network completely and then run inference, the results are not always satisfactory. One reason is that if nodes at one level in your hierarchy are not performing well, nodes at higher levels can't perform well either, because their input is difficult to process. Because each node depends on all lower levels, it makes sense to train and analyze your HTM Network one level at a time while working on performance improvements.
1. Train the bottom level of your HTM Network and stop the training before training higher levels.
2. Examine the bottom-level output. You can either use the Visualizer tool (see Using HTM Network Visualizer) or extract data structures from the nodes directly.
   For example, either of the following statements retrieves a node from the runtime network:
   ```python
   categorySensor = runtimeNet.getElement("CategorySensor")
   categorySensor = runtimeNet.CategorySensor
   ```
   - You can also use `Session` methods (see Using Session.execute() to Access Nodes or Execute Commands at Runtime in Advanced NuPIC Programming).
3. See where potential problems lie, then change one or more parameters of the bottom-level nodes and repeat steps 1 and 2.
4. Repeat this process level by level until you have trained and analyzed your HTM Network completely.
Experimenting With Parameters and Input Data
At times, certain settings - either learning node parameters or application-specific parameter settings - might make it impossible for your HTM Network to complete a run. An example is shown in Running the Example with Noisy Data, where the `maxDistance` parameter becomes too small for a run with noise and an error results. In addition, issues with your input data might make it impossible for the HTM Network to recognize the patterns.
Experiment With Key Parameters
Each node performs spatial learning (coincidence detection) and temporal learning (grouping).
- For spatial learning, a key node parameter is `maxDistance`. Use larger `maxDistance` values for higher noise, smaller `maxDistance` values if things that don't belong together are lumped in the same group.
- For temporal grouping, a key parameter is `topNeighbors`. If you want fewer groups, increase `topNeighbors`. If you want more groups, decrease `topNeighbors`. Note that `topNeighbors` is needed only by `Zeta1Node`, not by the newer `SpatialPoolerNode` and `TemporalPoolerNode`.
See Affecting Learning Node Behavior With Node Parameters in Advanced NuPIC Programming for a discussion of each parameter.
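To build intuition for how `maxDistance` trades noise tolerance against over-merging, here is a toy sketch. It is not NuPIC's implementation. It assumes a simplified spatial learner that stores an input as a new coincidence only when no stored coincidence lies within `maxDistance`, measured here as Euclidean distance. The function name and sample data are hypothetical.

```python
import math


def learn_coincidences(inputs, max_distance):
    """Toy spatial learner: keep an input as a new coincidence only if it
    is farther than max_distance from every coincidence stored so far."""
    coincidences = []
    for vector in inputs:
        distances = [
            math.sqrt(sum((a - b) ** 2 for a, b in zip(vector, stored)))
            for stored in coincidences
        ]
        if not distances or min(distances) > max_distance:
            coincidences.append(vector)
    return coincidences


# Two underlying patterns, each seen once cleanly and once with slight noise.
samples = [
    (1.0, 0.0, 0.0),
    (0.9, 0.1, 0.0),   # noisy copy of the first pattern
    (0.0, 0.0, 1.0),
    (0.0, 0.1, 0.9),   # noisy copy of the second pattern
]

for max_distance in (0.0, 0.5, 2.0):
    count = len(learn_coincidences(samples, max_distance))
    print("maxDistance=%.1f -> %d coincidences" % (max_distance, count))
```

With these samples, a threshold of 0 stores every noisy copy as its own coincidence (4 in total), a moderate threshold merges each noisy copy with its pattern (2), and an overly large threshold lumps both patterns together (1).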
If your files are set up appropriately, experimenting with key parameters is fairly straightforward. For example, in Bitworm, you can make changes to many of the parameters by editing `RunOnce.py`.
A second approach is to experiment with key parameters interactively. The Bitworm example includes a script, `ParameterExploration.py`, that allows you to do so. Here is the core part of this script:
```python
# Now instantiate the RuntimeNetwork
net = CreateRuntimeNetwork("trained_bitworm.xml")

# Get the level 1 node
node = net.Level1

# Run through all reasonable values for the topNeighbors parameter
# and see the resulting number of groups.
for t in range(1,10):
    node.setParameter("topNeighbors",t)
    node.execute("computeGroups")
    print "With topNeighbors set to",t,"the number of groups = ", \
          node.getParameter("groupCount")
```
When you run the script, you will see the network get trained and then see the effect of various `topNeighbors` values on the group structure of the `Level1` node. Output is sent to `stdout`.
For example, you might find that a different `topNeighbors` value than the default value results in a cleaner group structure. With the default value, one of the groups has only one element, and it's not clear why that bitworm was not assigned to one of the groups. With a `topNeighbors` value of 4, the Group 6 bitworm is assigned to the appropriate textured bitworms group.
Experiment With Input Data
There are a number of things to look for in your input data:
- Do the input data have temporal sequences? - Temporal sequences are essential because they allow the HTM to learn that large sets of input patterns are related to each other because all patterns are generated by a single high-level cause. For example, a sequence of images of a cat moving, sitting, yawning, etc. is good input because even though the actual pixels are different, the underlying cause (the cat) is the same.
HTM Networks depend on temporal continuity: Based on the frequency of transitions between naturally occurring sequences of patterns, nodes learn invariances and form a statistical model of their domain.
- Are sequences long enough? - Short sequences are jarring. Like a movie with several cuts a second, short sequences result in many random transitions, which become noise in the transition matrix. Nodes can't tell whether a transition is part of a moving object or a move to a next object unless instructed explicitly.
- Consider blank space for differentiating sequences - Insert blank space as an input vector with all zeros. These blanks tell the learning algorithm that the last pattern of sequence N has no temporal association with sequence N+1.
- Do you have enough training data? - Increasing the number of examples might noticeably improve performance. While more data are especially helpful if you wish to improve recognition accuracy, they might also be necessary if you're experimenting with a sample set that's too small.
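The blank-space technique for separating sequences can be sketched as follows. This is a minimal illustration, assuming each sequence is a non-empty list of equal-length numeric vectors. The helper name is hypothetical, and how you write the result to disk depends on your sensor's file format.

```python
def join_with_blanks(sequences, blank_count=1):
    """Concatenate sequences, separating them with all-zero vectors so the
    last pattern of one sequence is not followed directly by the first
    pattern of the next."""
    stream = []
    for index, sequence in enumerate(sequences):
        if index > 0:
            width = len(sequence[0])
            stream.extend([[0] * width for _ in range(blank_count)])
        stream.extend(sequence)
    return stream


# Two hypothetical sequences with different underlying causes.
first_sequence = [[1, 0, 0, 0], [0, 1, 0, 0]]
second_sequence = [[0, 0, 1, 0], [0, 0, 0, 1]]

for vector in join_with_blanks([first_sequence, second_sequence]):
    print(" ".join(str(value) for value in vector))
```

The all-zero row between the two sequences is the blank. Raising `blank_count` inserts several blanks in a row if one is not enough to break the temporal association.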
Consider Preprocessing Data
In some cases, preprocessing is necessary because your data don't match the sensor you're using. You might also consider preprocessing your raw data if you believe that the raw data are difficult for the HTM Network to process. For example, if you're working with an HTM vision system, consider running the images through an edge detector and feeding the edge output instead of the images themselves to the sensor. However, be careful: Whenever you replace the raw data with preprocessor output you choose, you run the risk of discarding useful correlations hidden in the raw data before the HTM Network even sees the data.
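As a rough illustration of this kind of preprocessing, the sketch below marks pixels whose brightness differs sharply from the pixel to the right or the pixel below. It is a deliberately simple stand-in for a real edge detector. It assumes a grayscale image stored as a list of rows, and the function name and threshold are hypothetical.

```python
def simple_edges(image, threshold=0.5):
    """Return a binary edge map: 1 where a pixel differs from its right or
    lower neighbor by more than threshold, 0 elsewhere."""
    height = len(image)
    width = len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            right = abs(image[y][x] - image[y][x + 1]) if x + 1 < width else 0
            below = abs(image[y][x] - image[y + 1][x]) if y + 1 < height else 0
            if max(right, below) > threshold:
                edges[y][x] = 1
    return edges


# A bright 2x2 square in the lower-right corner of a dark 4x4 image.
image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

for row in simple_edges(image):
    print(row)
```

The edge map keeps the outline of the square but discards absolute brightness, which is the kind of information loss the caution above describes.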
Experiment with Network Topology
As a rule, network topology, such as fan-in at each node and the number of levels in the hierarchy, should be based on design decisions that stem from the hierarchical spatial and temporal correlations inherent in the problem. However, if you believe after experimentation that a different topology might yield better results, reconsider your design and try a different topology.
Adding levels and changing the number of nodes per level is easy with the helper functions.
Improving Recognition Accuracy
After your HTM Network has successfully completed a training run, you are ready to test your HTM Network's performance. Follow these steps:
- Test the network's recognition accuracy using the training dataset - At a minimum, your HTM Network should be able to categorize the data on which it's been trained with very high accuracy (though not at 100%).
- Test recognition accuracy on an independent dataset - As a second step, test the HTM Network using new data. You could start with data that are relatively easy to categorize, then move on to data that are more difficult to categorize. If you only test on the original training data, you won't know whether the network can generalize to recognize new data, or whether it simply has memorized what it has seen.
When you test with new data, make sure that the first set of new data is drawn from the same statistical model as your training data. When your HTM Network can correctly categorize that set of test data, you can move on to noisy data or to other data that are more difficult to test.
- Try changing other settings to improve performance (see Experiment With Key Parameters).
For each case, change the configuration settings and save a new HTM Network, then train and test the newly configured network.
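Comparing accuracy on the training set with accuracy on an independent set can be as simple as the sketch below. The helper name and the label lists are hypothetical. In practice, the predicted categories would come from running inference with your trained network.

```python
def recognition_accuracy(predicted, actual):
    """Fraction of test vectors whose inferred category matches the label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("no test vectors to score")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return float(correct) / len(actual)


# Hypothetical results on the training data (one miss out of ten).
train_labels = [0, 0, 1, 1, 0, 1, 1, 0, 0, 1]
train_predicted = [0, 0, 1, 1, 0, 1, 1, 0, 1, 1]

# Hypothetical results on independent data (one miss out of four).
test_labels = [0, 1, 1, 0]
test_predicted = [0, 1, 0, 0]

print("training-set accuracy:", recognition_accuracy(train_predicted, train_labels))
print("independent-set accuracy:", recognition_accuracy(test_predicted, test_labels))
```

A large gap between the two numbers suggests the network has memorized the training data rather than learned to generalize.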
Using HTM Network Visualizer
The Numenta Visualizer tool allows you to examine a trained HTM Network and see what the network has learned. Visualizer generates an HTML page for each node in the network. Each page displays the node's groups and coincidences as well as general statistics.
During development, it is usually best to train your network one level at a time, checking each level to make sure the nodes at that level have learned well before moving on. Training an entire network at once and then looking at end-to-end accuracy is not recommended, because it is difficult to determine exactly where a problem originates. If a certain level is not performing well, all higher levels are impacted because the outputs of one node are the inputs to its parent. You can use Visualizer to examine performance at one level before moving on to the next.
To illustrate the drawbacks of training the entire network at once, imagine that one of your bottom-level nodes is grouping all the patterns it sees into a single group. In this case, the nodes at the next level receive the exact same input no matter what data you feed to the hierarchy. The overall performance of your HTM Network is poor. It therefore makes sense to proceed as follows:
1. Run your training script, but stop after the Level 1 nodes have been trained.
2. Use Visualizer to examine the Level 1 nodes. If the nodes have poor coincidences and groups, change node parameters, and then train and look at the results again.
3. When you're satisfied with the bottom-level nodes, move on to successively higher levels of the hierarchy, making changes one level at a time.
Invoking Visualizer
Visualizer is installed as part of NuPIC and is located in the `nupic.analysis` package.
To invoke Visualizer
1. Start the Python interpreter.
2. Import the Visualizer module.
3. Run the Visualizer on your trained network
where `<your_network.xml>` points to your trained HTM Network file.
You can also run Visualizer automatically from another Python script by including the code from steps 2 and 3 above in your `.py` file. If you include a call to Visualizer at the end, the Visualizer HTML pages are generated automatically after training.
When you run `Visualizer.visualizeNetwork()`, Visualizer takes the following steps:
1. Starts a `RuntimeNetwork` with your trained HTM Network file.
2. Extracts data for each node in the HTM network.
3. Creates an HTML page for each node in the network, as well as a main page linking to all nodes.
If you wish to examine only a particular node, pass a node name or regular expression to `Visualizer.visualizeNetwork()`. For example, `visualizeNetwork("level1.*")` prints all nodes that begin with `level1`.
Visualizer deletes its temporary files when it is done.
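To preview which nodes a pattern such as `level1.*` would select, you can try it against your node names with Python's `re` module. This sketch assumes the pattern is matched from the start of each node name, as `re.match` does. Visualizer's exact matching rules are not specified here, and the node names are hypothetical.

```python
import re


def select_nodes(node_names, pattern):
    """Return the node names that the regular expression matches from the start."""
    compiled = re.compile(pattern)
    return [name for name in node_names if compiled.match(name)]


# Hypothetical node names; substitute the names used in your own network.
node_names = ["level1[0]", "level1[1]", "level2[0]", "sensor", "Level1"]
print(select_nodes(node_names, "level1.*"))
```

Regular expressions are case-sensitive by default, so under this assumption a node named `Level1` is not selected by `level1.*`. Write the pattern with the same capitalization as your node names.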
Visualizer Output
The Visualizer output is organized as follows:
- Visualizer places the output HTML pages into a folder named after your network. For example, if you call Visualizer with a network file named `mynetwork.xml`, Visualizer creates a folder named `mynetwork`.
- Visualizer generates a main page called `index.html` within the main folder. Open this file to see an overview of the network with links to the pages for each of the nodes in the network.
- Each node has its own subfolder within the top-level folder. The HTML page for a given node is stored as `index.html` in that node's subfolder.
Supported Types of Data
Once your data are fed from the sensor into the bottom-level nodes of your HTM Network, the input to any node is a vector of floating-point numbers. The network does not necessarily know whether these vectors represent pixels of an image, letters of text, or something else entirely.
Based on the type of sensor in your network, Visualizer can determine how to turn each of the node's coincidences into its original form. At higher levels, the coincidences do not correspond to raw input, so Visualizer does not show them.
Interpreting Visualizer Output
Visualizer creates a folder named after your network, located in the same directory. Open the `index.html` page within it to see an overview page for the network. This overview page lists the total number of nodes in the network - including sensors, effectors, and other nodes that are not visualized - as well as the number of nodes visualized. For each visualized node, the main page contains a link to the node's page and a count of the number of coincidences and groups in that node. Here's a main page from the Waves example:
Figure 14 Visualizer Main Page (Waves Example)
Each node page displays the following information (see Figure 15):
- Node name
- Links to neighboring nodes
- Link back to the main page
- Name of the node class (such as `SpatialPoolerNode`)
- Number of coincidences in the node (for nodes that do spatial pooling)
- Number of groups in the node (for nodes that do temporal pooling)
- Node parameters such as `maxDistance`
- A histogram of group sizes, where the size of a group is the number of coincidences it contains (for nodes that do temporal pooling)
- Stability ratings, computed for each coincidence and group, and for the node as a whole. Stability is related to the percentage of transitions to one coincidence that came from another member of the same group. A high stability value indicates that most transitions occur within the group. A low stability value indicates that many transitions occur between groups, suggesting the groups are not well formed (a node with random groups would have very low stability numbers). If you see low stability numbers, change the parameters of the temporal pooler, or examine the way you're feeding data into the network.
- For the bottom nodes, the page also displays the node's groups, and the coincidences within them
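The idea behind the stability rating can be sketched with a short calculation. This is an illustrative approximation, not Visualizer's actual formula. It assumes that a group's stability is the fraction of observed transitions into the group's coincidences that originate from a coincidence in the same group. The function name and data are hypothetical.

```python
def group_stability(coincidence_sequence, group_of):
    """For each group, return the fraction of transitions into its
    coincidences that came from a member of the same group."""
    within = {}
    total = {}
    pairs = zip(coincidence_sequence, coincidence_sequence[1:])
    for previous, current in pairs:
        group = group_of[current]
        total[group] = total.get(group, 0) + 1
        if group_of[previous] == group:
            within[group] = within.get(group, 0) + 1
    return dict((g, float(within.get(g, 0)) / total[g]) for g in total)


# Coincidences 0-2 belong to group "A", coincidences 3-4 to group "B".
group_of = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B"}

# Long runs inside each group, with a single switch from A to B.
sequence = [0, 1, 2, 0, 1, 3, 4, 3, 4]

print(group_stability(sequence, group_of))
```

Here every transition into group A comes from inside A, so A scores 1.0. One of the four transitions into group B comes from A, so B scores 0.75. Groups formed at random would see most transitions cross group boundaries and would score much lower.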
Here's a node page from the Waves example:
Figure 15 Visualizer Node Page (Waves Example)
Using Visualizer to examine the nodes of your HTM Network allows you to see how well your network is performing. If you spot problems with nodes at a particular level, you can adjust the node's parameters to fix them. If you spot serious problems in the level 1 nodes - for example, the groups seem nonsensical - you might have to reexamine how you feed your data into the network.
Plotting and GUI Packages Bundled with NuPIC
In addition to Numenta Visualizer, a number of plotting and GUI packages are bundled with NuPIC:
Numenta www.Numenta.com