DataLab is a compact statistics package aimed at exploratory data analysis. Please visit the DataLab Web site for more information.


Data Imputation

Command: Tools -> Data Imputation...

DataLab provides a few simple algorithms to fill in missing values. The missing data can be imputed either column-wise or row-wise by the following methods:

  • Mean Fill: the empty cells of each variable are replaced by the mean of the known values of the particular column or row.
  • Propagate Down/Right: for each gap, the empty cells are filled with the value immediately above or to the left of the gap.
  • Propagate Up/Left: for each gap, the empty cells are filled with the value immediately below or to the right of the gap.
  • Linear Interpolation: the empty cells of each column/row are replaced by linear interpolation between the known values immediately before and after each gap.
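
The four methods can be illustrated by a minimal Python sketch. This is not DataLab's own code: the function names are hypothetical, a single column or row is modelled as a list with None marking an empty cell, and gaps at the very start or end of the data (which have no neighbour to propagate or interpolate from) are assumed to stay empty.

```python
from statistics import mean


def mean_fill(values):
    """Mean Fill: replace each empty cell by the mean of the known values."""
    known = [v for v in values if v is not None]
    if not known:
        return list(values)  # nothing known, nothing to impute
    m = mean(known)
    return [m if v is None else v for v in values]


def propagate_forward(values):
    """Propagate Down/Right: fill each gap with the last known value before it."""
    out, last = [], None
    for v in values:
        if v is not None:
            last = v
        out.append(last if v is None else v)
    return out  # a gap at the start stays None (assumption)


def propagate_backward(values):
    """Propagate Up/Left: fill each gap with the first known value after it."""
    return propagate_forward(values[::-1])[::-1]


def linear_interpolate(values):
    """Linear Interpolation between the known values bordering each gap."""
    out = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = values[left] + step * (i - left)
    return out  # gaps at the start or end stay None (assumption)


column = [1.0, None, None, 4.0, None, 6.0]
print(propagate_forward(column))   # [1.0, 1.0, 1.0, 4.0, 4.0, 6.0]
print(propagate_backward(column))  # [1.0, 4.0, 4.0, 4.0, 6.0, 6.0]
print(linear_interpolate(column))  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
```

Applying the same function to each column, or to each row, of the data matrix corresponds to the column-wise and row-wise modes.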

In addition, normally distributed random numbers can be added to the imputed values by ticking the "Add Gaussian Noise" box. The noise level is set by the "Amplitude" control, which specifies it as a multiple of the standard deviation of the known values of the particular column/row. To undo the last imputation, click the "Oops" button, which restores the empty states of the cells to their previous state. The "reset" button sets all imputed values of the entire matrix back to the "empty" state.
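
The effect of the noise option can be sketched as follows. This is an illustration under stated assumptions, not DataLab's implementation: the function name and its parameters are hypothetical, the sample standard deviation is used (the text does not say whether DataLab uses the sample or the population formula), and the noise is assumed to have zero mean.

```python
import random
from statistics import stdev


def add_gaussian_noise(original, imputed, amplitude, rng=None):
    """Add zero-mean Gaussian noise to the imputed cells only.

    original  -- the data before imputation, None marking empty cells
    imputed   -- the same data after imputation
    amplitude -- noise level as a multiple of the standard deviation
                 of the known values
    """
    if rng is None:
        rng = random.Random()
    known = [v for v in original if v is not None]
    if len(known) < 2:
        return list(imputed)  # standard deviation undefined, add no noise
    sigma = amplitude * stdev(known)  # sample standard deviation (assumption)
    noisy = []
    for before, after in zip(original, imputed):
        if before is None and after is not None:
            noisy.append(after + rng.gauss(0.0, sigma))  # imputed cell
        else:
            noisy.append(after)  # known cells are left untouched
    return noisy


original = [1.0, None, None, 4.0, None, 6.0]
imputed = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(add_gaussian_noise(original, imputed, amplitude=0.1, rng=random.Random(0)))
```

With an amplitude of 0 the imputed values are returned unchanged; larger amplitudes scatter them more widely around the values produced by the chosen imputation method.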

The imputed data are indicated in the numeric data editor by a colored background: