
4. The Networks

This chapter describes the different types of Neural Network supported by Libann. Each network type has its own C++ class, and is declared in its own header file. In general, the process you will follow is (see the sketch after this list):

  1. Instantiate the network.
  2. Train the network.
  3. Present a feature vector to the network for recall.
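
For instance, using the Multi-Layer Perceptron described in section 4.3 below, the three steps look roughly like this (a sketch only; fm is an ann::FeatureMap of training samples and unknown is a feature vector, both explained later in this chapter):

ann::Mlp network(fm.featureSize(), 4);          // 1. Instantiate
network.train(fm);                              // 2. Train
std::string result = network.recall(unknown);   // 3. Recall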

4.1 Persisting the Network  Saving to a file
4.2 Kohonen Networks  Unsupervised Learning
4.3 Multi-Layer Perceptron Networks  Supervised Learning
4.4 Hopfield Networks  Content Addressable Memory
4.5 The Boltzmann Machine  Generalised Hopfield Networks



4.1 Persisting the Network

The power of neural networks lies in their fast recall times. Once a network has been trained, it can be used for recall many times. You will probably want to save a network to non-volatile storage to be read back at a later date. To do this, all the network classes have a save method to store the network to a std::ostream, and a constructor which reads from a std::istream to re-create that network.

For example, assume that you have instantiated a Multi-Layer Perceptron Network called network, and you have already trained it. The following code fragment will save it to a file called `myNetwork':

 
std::ofstream ofs("myNetwork");
network.save(ofs);
ofs.close();

Another program can then create an identical network with the following code:

 
std::ifstream ifs("myNetwork");
ann::Mlp network(ifs);
ifs.close();

Currently, the file formats produced are not portable between architectures of different endianness.



4.2 Kohonen Networks

A Kohonen network is useful when you want to classify samples into groups, but you don't know in advance how many groups there are, or exactly how the samples vary within them. The Kohonen network is an example of unsupervised learning.

First, you need the following preprocessor directives in your code:

 
#include <ann/ann.h>
#include <ann/kohonen.h>

You will also need to include a Standard Template Library header:

 
#include <set>

Now you're ready to create the network. The constructor for the Kohonen class takes two integer parameters: the first is the number of input units, and the second is the square root of the number of output units.

 
// Create a Kohonen Network with 100 inputs and 25 outputs
ann::Kohonen net(100,5);

This implies that the number of output nodes in a Kohonen network is always a perfect square. Non-square configurations are not (yet) supported by Libann.

After the network has been created, it needs to be trained. First, however, you need to gather all the training samples into a std::set, and then pass it to the train method. The following code fragment assumes that you have already defined a class called FeatureVec, derived from ann::ExtInput, and that you have instantiated N samples of this class: ft1, ft2, ... ftN.

 
std::set<FeatureVec> trainingData;

trainingData.insert(ft1);
trainingData.insert(ft2);
// ...
trainingData.insert(ftN);

net.train(trainingData);

There are other optional parameters to the train method, by which you can control the training process. Please see the header file `ann/kohonen.h' for details.

Having trained your network, you'll probably want to recall data from it. You do this with the recall method:

 
net.recall(ft1);

Here we used a sample which was one of the training data, but it need not have been. The return value of the recall method is of type ann::vector, which is derived from std::vector<float>. It has an operator<< defined, which you can use to stream it to std::cout.
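
For example, to capture the result of the recall above and stream it to std::cout (a minimal sketch):

ann::vector result = net.recall(ft1);
std::cout << result << std::endl;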



4.3 Multi-Layer Perceptron Networks

4.3.1 Creating the Network  Instantiating the network
4.3.2 FeatureMap  How to arrange your training data.

The Multi-Layer Perceptron is an example of a Neural Network which uses supervised learning. That means it is useful when you have a reasonable number of samples whose classes are already known, and you wish to assign unknown samples to those known classes.



4.3.1 Creating the Network

You will need to include the following headers:

 
#include <ann/ann.h>
#include <ann/fm.h>
#include <ann/mlp.h>
#include <map>

The first line is what you need for any program using Libann. The second line declares the ann::FeatureMap class, described below. The third line gives you the declarations for the Multi-Layer Perceptron (Mlp). The fourth line is a header from the C++ Standard Template Library. It contains the declaration of an associative array which we will need later. Also, you should not forget to catch all exceptions, as described in 2.3 Exceptions.

Before you create your network, it's best to have the training samples already prepared. As the network uses supervised learning, you must know the class of each sample. Class names have the type std::string. You can use any name for your classes, so long as they are unique. A quick look through the header file `ann/fm.h' shows that it contains a declaration for a class ann::FeatureMap which is inherited from std::map<std::string,std::set<ann::ExtInput> >. In fact it is nothing more than one of those, plus a couple of convenience methods. An object of this class will hold your samples. There's more about the FeatureMap below.
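
Because a FeatureMap is nothing more than a std::map underneath, the usual map operations are available on it. For example, to list the classes it holds and the number of samples in each (a minimal sketch, assuming a FeatureMap called fm which already holds your samples):

for (ann::FeatureMap::const_iterator it = fm.begin(); it != fm.end(); ++it)
  std::cout << it->first << ": " << it->second.size() << " samples" << std::endl;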



4.3.2 FeatureMap

Let us assume that you are writing an optical character recognition program, that you have already defined your input class (see section 3. Creating a Feature Vector), and that it is called Glyph. Each Glyph represents the feature vector of a printed character on a page. So now you should copy each Glyph into an ann::FeatureMap:

 
ann::FeatureMap fm;

while(/* there are more glyphs */) {
 Glyph glyph(/* arguments to the Glyph constructor*/);
 std::string className=/* the name of the class to which this Glyph
                          belongs */; 
 fm.addFeature(className,glyph);
}

Every feature that you add must have the same size. If you try to add one with a different size, an exception will be thrown.
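
If you prefer to handle that condition yourself rather than let the exception propagate, you can wrap the insertion (a sketch only; the concrete exception types are described in 2.3 Exceptions):

try {
  fm.addFeature(className, glyph);
} catch (...) {
  // The feature had the wrong size; see 2.3 Exceptions for the
  // actual exception types, which you should catch in preference to `...'.
  std::cerr << "feature size mismatch" << std::endl;
}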

Now you're in a position to create the network. The ann::Mlp constructor takes two mandatory arguments: the size of the input layer and the size of the output layer. An optional third argument allows you to specify the size of the hidden layer.

 
ann::Mlp network(fm.featureSize(),4);

Notice two things about the above code fragment:

  1. We have used the method ann::FeatureMap::featureSize() to find out the size of the features in the feature map. We should know that anyway since we put them there in the first place, but this method makes it easy for us.

  2. We've asked for an output size of 4. This implies that there are no more than 2^4 = 16 classes in the feature map. If there are more, you'll get problems later. We could have calculated this size instead of writing a literal `4', but that would involve the use of logarithms, so you might want to think twice before doing so; a way to avoid them is sketched below.
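
If you do decide to calculate it, a shift loop avoids the logarithms altogether. A minimal sketch (the outputSize helper is hypothetical; fm.size() counts the classes, because a FeatureMap is a std::map keyed by class name):

// Smallest number of output units able to encode nClasses classes
std::size_t outputSize(std::size_t nClasses)
{
  std::size_t bits = 1;
  while ((std::size_t(1) << bits) < nClasses)
    ++bits;
  return bits;
}

ann::Mlp network(fm.featureSize(), outputSize(fm.size()));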

The network is now ready for training. To do this, use the ann::Mlp::train method and pass it the feature map:

 
network.train(fm);

There is a multitude of optional arguments to this method. Refer to the `ann/mlp.h' header file to find out what they are. If training takes an unreasonably long time, you may have to tweak some of them. Alternatively, you might need to rethink the training data you're trying to train with. Increasing the size of the network's hidden layer could also help, as shown below.
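
For example, to ask for a larger hidden layer at construction time (the value 20 is purely illustrative; the third constructor argument is the hidden layer size, as noted in 4.3.1):

ann::Mlp network(fm.featureSize(), 4, 20);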

Once trained, you'll probably want to use your network to recall samples of unknown class. To do this, simply use the ann::Mlp::recall method, passing in the unknown sample as its argument:

 
Glyph unknown(/* some new glyph */);
std::string className = network.recall(unknown);

The return value from the recall method is the class-name of the sample passed in. That is to say, it'll be one of those that you entered into the feature map.



4.4 Hopfield Networks

The Hopfield network can be used as a content addressable memory. That is, it can learn a set of patterns, and recall any one of these patterns given an incomplete or approximate copy of that pattern.

First, you need the following preprocessor directives in your code:

 
#include <ann/ann.h>
#include <ann/hopfield.h>

As mentioned, the Hopfield network learns a set of patterns. Therefore you will also need to include a Standard Template Library header for sets:

 
#include <set>

This allows you to create objects of type ann::ExtInput (or of a class derived from it) and put them into a std::set, thus:

 
  std::set<MyPattern> patternSet;
  for (int i = 0 ; i < 10 ; ++i) {
     // Create a pattern using some pre-defined constructor
     MyPattern p(i);

     // and insert it into the set
     patternSet.insert(p);
  }

Once you have your set of patterns, you can create the Hopfield network using its constructor, passing in the set. Then you can use the recall method, passing in a pattern which approximates one of those given to the constructor; the Hopfield network should identify the closest pattern and return it.

 
   // Instantiate a Hopfield network
   ann::Hopfield h(patternSet);

   // recall a (hitherto unknown) pattern
   MyPattern approximation;

   std::cout << approximation << " is an approximation to ";
   std::cout << h.recall(approximation) << std::endl;
  



4.5 The Boltzmann Machine

The Boltzmann machine class can be used to classify binary data. It is a fully meshed network, like the Hopfield network, but it also has a concept of temperature, which helps it escape local minima. You will need the following headers:

 
#include <ann/ann.h>
#include <ann/fm.h>
#include <ann/boltzmann.h>

You must prepare the data to be classified and add it to an ann::FeatureMap (see section 4.3.2 FeatureMap). Having done this, you create the Boltzmann machine by invoking its constructor.

 
  ann::FeatureMap fm;

  // Insert features into the feature map
  // ...

  // Illustrative values only; see the parameter descriptions below
  int hiddenUnits = 10;
  float temperature = 100.0;
  float coolingRate = 0.95;

  ann::Boltzmann b(fm, hiddenUnits, temperature, coolingRate);

As you can see, the constructor needs some additional parameters:

  • hiddenUnits is the number of `hidden' units in the network. As a rule, the greater the number of hidden units, the more powerful the network, but the longer it will take to work.
  • temperature is a floating point value, which is the initial temperature of the network. A large value might give better results, but will give slower performance.
  • coolingRate is the amount by which the temperature is reduced in each iteration of the network. It must be a floating point value between 0 and 1. Values closer to 1.0 will be better at overcoming local minima but will be slower.

Once you have created the network, use the recall method to look up values.

 
  ann::ExtInput feature = getFeature();

  std::cout << "Feature is of class " << b.recall(feature) << std::endl;

The value returned by the recall method is the class name which was given in the FeatureMap. If the network cannot determine the class, then it will return an empty string.
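
So a recall with a failure check might look like this (a minimal sketch, reusing b and feature from above):

std::string cls = b.recall(feature);
if (cls.empty())
  std::cout << "Class could not be determined" << std::endl;
else
  std::cout << "Feature is of class " << cls << std::endl;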

