DataLab is a compact statistics package for exploratory data analysis. Please visit the DataLab website for more information.


Class TDataTable


The class TDataTable implements the data structure used by DataLab. It combines a rectangular numeric matrix with additional information, such as row and column names, cell states, or attributes.
The data table consists of numeric cells organized as a rectangular array. It is assumed that the measurement data is organized such that each row of the data matrix represents a particular measured object, while the measured variables are stored columnwise. This arrangement of the data is important when dealing with categorical data (nominal or ordinal). TDataTable supports only the columnwise assignment of categorical variables. By default, all variables are assumed to contain interval level data (which is the most common case). Other levels of measurement can be set up by setting the array properties MScaleType and NominalID.

The data of the table can be accessed by using the array property Elem. Please note that the data is always stored in floating point format. Values of categorical variables are represented by their (rounded) ordinal value (see Nominal and Ordinal Data for more).
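
As a minimal sketch of cell access via Elem, the following script fills a small table and reads one value back. It assumes that Elem is indexed as [column, row] with indices starting at 1, mirroring the argument order of Resize in the sample program further down. The index order is not stated on this page, so please check it against the Elem property reference. The program name and constants are chosen for illustration only.

```pascal
program ElemAccess;

const
  NCOLS = 3;
  NROWS = 5;

var
  dt   : TDataTable;
  c, r : integer;                  // column and row index

begin
dt := TDataTable.Create(nil);
dt.Resize (NCOLS, NROWS);
for c:=1 to NCOLS do
  for r:=1 to NROWS do
    dt.Elem[c,r] := 10*c + r;      // assumed index order: [column, row]
cout ('cell (2,3): ', dt.Elem[2,3]);
dt.Free;
end.
```

The integer expression is stored as a floating point value, in line with the storage format described above.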

The size of the matrix is controlled by the properties NrOfRows and NrOfColumns. The size can also be changed by calling the method Resize. Please note that any change in matrix size triggers the OnResize event.

Each row and each column has a numeric attribute between 0 and 255. These attributes may be used to assign different classes to some of the rows (array property RowAttrib) or columns (array property ColAttrib). The row and column names may have a maximum length of MaxNameDTWidth characters and can be accessed by the array properties RowName and ColName.
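
The following sketch shows how names and attributes might be assigned. It assumes that RowName, ColName and RowAttrib take a single index starting at 1, and that the chosen names are shorter than MaxNameDTWidth. Neither assumption is confirmed on this page. The names and class numbers are purely illustrative.

```pascal
program NamesAndAttribs;

var
  dt : TDataTable;
  i  : integer;                    // row index

begin
dt := TDataTable.Create(nil);
dt.Resize (2, 6);                  // 2 columns, 6 rows (same argument order as in the sample program)
dt.ColName[1] := 'length';
dt.ColName[2] := 'width';
for i:=1 to dt.NrOfRows do
  begin
  dt.RowName[i] := 'obj'+IntToStr(i);
  if i <= 3                        // rows 1..3 get class 1, the rest class 2
    then dt.RowAttrib[i] := 1
    else dt.RowAttrib[i] := 2;
  end;
dt.Free;
end.
```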

Each cell of the data matrix may assume a combination of up to 8 independent states (8 bits per cell). These cell states can be used to indicate certain aspects of the data (e.g. for marking part of the data). The cell states can be manipulated by the array property CellState. The methods IfColHasCellState and IfRowHasCellState allow testing the cell states of entire columns or rows. Several statistical methods evaluate the cell states and use only those cells which are marked neither as csNAN nor as csUndefined: CountNumCells, MeanVarOfNumCells, MinMaxOfNumCells, PercentileOfNumCells, QuartilesOfNumCells and SumOfNumCells.

Finally, TDataTable also stores a comment of arbitrary length which is accessible via the property Comment.

The data of the data table may be imported from and exported to a simply formatted ASCII file by using ImportASCFile and ExportAsASCFile. Note that there is also the general procedure ReadHeaderOfASC, which is not a method of TDataTable. This procedure reads the header information of ASC formatted files without the need to create an instance of TDataTable. Another way of importing data into the data table is to copy data from an external TDataTable instance by using various copy operations.

Sample program:
The following short program shows how to use a data table. It creates a new table, fills the cells with normally distributed data and calculates the actual means and variances afterwards.

program DataTable;

const
  NCOLS = 4;
  NROWS = 200;

var
  dt             : TDataTable;
  i              : integer;         // loop index over the columns
  mean, vari     : double;

begin
dt := TDataTable.Create(nil);
dt.Resize (NCOLS, NROWS);                 // set the table size
dt.FillRandomNormal (0,0,0,0,2.0,1.0);    // fill the cells with normally distributed data
for i:=1 to NCOLS do
  begin
  dt.MeanVarOfNumCells (i,0,i,0,mean,vari);
  cout ('col '+IntToStr(i)+': ',mean);
  end;
dt.Free;
end.

Hint: Please note that there is a predeclared instance of the class TDataTable, called DStore, which is globally available throughout a script and which allows you to access the currently loaded dataset.

Properties

Methods