DataLab is a compact statistics package aimed at exploratory data analysis. Please visit the DataLab Web site for more information.



RBF Network - Growing Network

Command: Math -> Neural Network -> RBF Network -> Growing Network

The command Growing Network provides a method of feature selection that has been called a 'growing neural network' in previous work [Lohninger 93d]: the neural network grows in its input layer during the course of feature selection. The method starts with a network architecture that has only one input neuron. Each variable is tried in turn as the single input, and the network is trained and evaluated with it. Once all variables have been processed, the one that leads to the best result is stored and assigned to the first neuron of the input layer. The input layer then grows by one neuron and the selection process is repeated in the same way. Thus the best features of the previous rounds are combined with the new feature that gives the largest increase in goodness of fit. To prevent a feature from being selected more than once, features that have already been selected are excluded from the rest of the selection process.

In addition, the user may select the first variable interactively, thus skipping the automatic selection of the first feature. This can be useful when a single feature outperforms all others in fitting the training data but fails when combined with any other variable.
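The selection procedure described above, including the optional user-chosen first variable, can be sketched in a few lines of Python. This is only an illustration of the greedy strategy, not DataLab's implementation: the names grow_inputs, score and toy_score are hypothetical, and score stands in for "train the RBF network on these inputs and return its goodness of fit" (higher is assumed to be better).

```python
from typing import Callable, List, Optional, Sequence, Tuple


def grow_inputs(
    candidates: Sequence[str],
    score: Callable[[List[str]], float],
    max_inputs: int,
    first: Optional[str] = None,
) -> List[Tuple[str, float]]:
    """Greedy forward selection: the input layer grows by one variable per round.

    `score` is a hypothetical callback that trains and evaluates a network
    on the given input variables and returns its goodness of fit.
    """
    selected: List[str] = []
    history: List[Tuple[str, float]] = []
    remaining = list(candidates)

    if first is not None:
        # User-chosen first variable: skip the automatic first round.
        remaining.remove(first)
        selected.append(first)
        history.append((first, score(selected)))

    while remaining and len(selected) < max_inputs:
        # Try every unused variable together with those already selected.
        trials = [(score(selected + [name]), name) for name in remaining]
        best_score, best_name = max(trials, key=lambda t: t[0])
        selected.append(best_name)
        remaining.remove(best_name)  # a variable is never selected twice
        history.append((best_name, best_score))

    return history


def toy_score(inputs: List[str]) -> float:
    # Made-up stand-in for training a network; values are illustrative only.
    gain = {"x1": 50, "x2": 30, "x3": 10, "x4": 60}
    total = sum(gain[name] for name in inputs)
    # Assumption: x4 is the best single input but combines badly with others.
    if "x4" in inputs and len(inputs) > 1:
        total -= 60
    return float(total)


names = ["x1", "x2", "x3", "x4"]
automatic = grow_inputs(names, toy_score, max_inputs=2)            # picks x4, then x1
manual = grow_inputs(names, toy_score, max_inputs=3, first="x1")   # x1, then x2, then x3
```

In this toy setting the automatic run locks in x4 during the first round and never recovers, whereas fixing the first variable by hand leads to a better combination. That is the situation in which the interactive choice of the first variable is meant to help.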

After choosing the command Growing Network, the user has to specify the target variable. All other variables are treated as potential input variables. While the input layer is being built up, the currently selected input variables are displayed in inverse video in the survey window, which gives a good overview of the variables selected so far. The best combination of variables is displayed in a window at the right side of the screen. This list shows the number of each variable, the standard deviation of the residuals, and the goodness of fit achieved so far.

Hint: The optimum number of variables depends on the problem under examination. Most often the goodness of fit obtained in the training runs increases up to a maximum and then starts to decrease again. The combination of variables around this point is a good first guess for the best combination. However, only cross validation will reveal the best combination.
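One way to act on this hint outside DataLab is to cross-validate each prefix of the selection order and keep the size with the lowest mean test error. The sketch below assumes a simple interleaved k-fold split; kfold_splits, pick_size and fit_and_test are hypothetical names, and fit_and_test stands in for "train on the training samples with these inputs and return the error on the held-out samples".

```python
from typing import Callable, List, Sequence, Tuple


def kfold_splits(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index pairs; fold i holds out every k-th sample."""
    splits = []
    for fold in range(k):
        test = list(range(fold, n_samples, k))
        held_out = set(test)
        train = [i for i in range(n_samples) if i not in held_out]
        splits.append((train, test))
    return splits


def pick_size(
    ordered: Sequence[str],
    fit_and_test: Callable[[List[str], List[int], List[int]], float],
    n_samples: int,
    k: int = 5,
) -> Tuple[int, List[float]]:
    """Cross-validate the first 1, 2, ... selected variables.

    `ordered` lists the variables in the order they were selected.
    Returns the size with the lowest mean test error and all mean errors.
    """
    folds = kfold_splits(n_samples, k)
    mean_errors: List[float] = []
    for size in range(1, len(ordered) + 1):
        inputs = list(ordered[:size])
        errors = [fit_and_test(inputs, train, test) for train, test in folds]
        mean_errors.append(sum(errors) / len(errors))
    best_size = min(range(len(mean_errors)), key=mean_errors.__getitem__) + 1
    return best_size, mean_errors


def toy_fit_and_test(inputs: List[str], train: List[int], test: List[int]) -> float:
    # Made-up test errors: two inputs generalize best, a third one overfits.
    return {1: 4.0, 2: 2.0, 3: 3.0}[len(inputs)]


best_size, errors_by_size = pick_size(["x1", "x2", "x3"], toy_fit_and_test, n_samples=20)
```

Because the error is measured on samples that were not used for training, the chosen size reflects how well the combination generalizes rather than how well it fits the training data.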


Last Update: 2012-Jul-25