Don Hush analyzed a very simple multi-layer perceptron (MLP) to quantify its capacity and performance in “Classification with neural networks: a performance analysis”, IEEE International Conference on Systems Engineering, Fairborn, OH, USA, 24–26 Aug 1989, pp. 277–280. Some of the conclusions he draws are: networks with one hidden layer perform better than those with two hidden layers; the number of nodes in the hidden layer should be no smaller than d+1 and optimally about 3d, where d is the dimension of the data pattern; and, for best performance, the number of training samples should be approximately 60d(d+1).
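Those sizing rules are easy to turn into a back-of-the-envelope calculation. The sketch below is purely illustrative; the function name and the example dimension are mine, not Hush’s:

```python
# Hypothetical helper illustrating Hush's sizing heuristics for a
# one-hidden-layer MLP on d-dimensional patterns (names are mine, not his).

def hush_sizing(d):
    """Return (min_hidden, suggested_hidden, suggested_training_samples)."""
    min_hidden = d + 1                    # hidden nodes should be no smaller than d + 1
    suggested_hidden = 3 * d              # roughly optimal hidden-layer width
    training_samples = 60 * d * (d + 1)   # sample count for best performance
    return min_hidden, suggested_hidden, training_samples

# Example: 8-dimensional patterns -> at least 9 hidden nodes,
# about 24 suggested, and roughly 4320 training samples.
print(hush_sizing(8))
```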
Don Hush and Bill Horne surveyed the field in “Progress in supervised neural networks”, IEEE Signal Processing Magazine, Vol. 10, Issue 1, pp. 8–39, Jan 1993. This review article describes MLP neural net processing and, more crucially, MLP training algorithms. Back in the early nineties, I used this review to specify the processing for an MLP that fused the results from multiple independent classifiers. I observed two inescapable performance features.
If the training data contained near-identical inputs for two a priori distinct classes, then the MLP could not reliably distinguish between the classes (self-evident, but with serious consequences). The other was that MLP fusion performance was dominated by the best classifier; in fact, the MLP fusion performance was always lower than that of the dominant input classifier. I concluded that one either had to use classifiers of comparable capability or it paid to reject the fusion process.
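For concreteness, the sketch below shows the general shape of such a fusion stage, written with scikit-learn rather than the hand-rolled code I used at the time; the base classifiers, the synthetic data, and the fusion-layer size are all placeholders:

```python
# A minimal sketch of an MLP fusion stage, assuming scikit-learn is available
# (the original work used purpose-built software; everything here is illustrative).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two independent "input" classifiers of different quality.
base = [LogisticRegression(max_iter=1000), GaussianNB()]
for clf in base:
    clf.fit(X_train, y_train)

# The fusion MLP sees only the base classifiers' class-probability outputs.
def fusion_features(X):
    return np.hstack([clf.predict_proba(X) for clf in base])

# Hidden-layer width of 8 is an arbitrary illustrative choice.
fusion = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
fusion.fit(fusion_features(X_train), y_train)

for clf in base:
    print(type(clf).__name__, clf.score(X_test, y_test))
print("fusion", fusion.score(fusion_features(X_test), y_test))
```

Comparing the fused score against each base classifier’s score is exactly the check that led me to the conclusion above: when one input classifier dominates, the fusion stage adds nothing.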
I found that the same MLP training software could be used to train a time delay neural network (TDNN) with little modification. I trained a seven-node (one hidden layer) TDNN to “match filter” a discrete representation of the chaotic “Logistic map“.
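A rough reconstruction of that experiment is sketched below, again with scikit-learn standing in for the original software. The logistic-map parameters, the eight-tap delay line, and the use of uniform noise as the reject class are all my assumptions; only the seven hidden nodes come from the description above:

```python
# A rough sketch of the TDNN experiment: a tapped delay line feeds windows of a
# signal into a 7-hidden-node MLP trained to detect (crudely "match filter")
# segments generated by the logistic map, as opposed to uniform noise.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

def logistic_map(n, r=4.0, x0=0.3):
    """Iterate x_{n+1} = r * x_n * (1 - x_n); r = 4 gives chaotic behaviour."""
    x = np.empty(n)
    x[0] = x0
    for i in range(1, n):
        x[i] = r * x[i - 1] * (1.0 - x[i - 1])
    return x

taps = 8  # delay-line length (an assumption, not the original value)

def windows(signal):
    """Slide the tapped delay line along the signal."""
    return np.stack([signal[i:i + taps] for i in range(len(signal) - taps)])

chaos = windows(logistic_map(2000, x0=rng.uniform(0.1, 0.9)))
noise = windows(rng.uniform(0.0, 1.0, 2000))  # same amplitude range as the map

X = np.vstack([chaos, noise])
y = np.concatenate([np.ones(len(chaos)), np.zeros(len(noise))])

tdnn = MLPClassifier(hidden_layer_sizes=(7,), max_iter=2000, random_state=0)
tdnn.fit(X, y)
print("training accuracy:", tdnn.score(X, y))
```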