DataLab is a compact statistics package aimed at exploratory data analysis. Please visit the DataLab website for more information.


DStore

Declaration: DStore: TDataTable;
The global variable DStore provides access to the data matrix and its related auxiliary data structures. DStore is an instance of the class TDataTable. The following table gives an overview of the DStore properties that provide access to the DataLab data structure:

Data                  Property
size of data table    NrOfColumns, NrOfRows
row names             RowName
row attributes        RowAttrib
data cells            Elem, ElemNominal
cell states           CellState
variable types        MScaleType
column names          ColName
column attributes     ColAttrib
comment               Comment
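
As a minimal sketch of how the properties in the table above can be read from a script, the following program prints the size of the data table and the value of its first cell. It uses only NrOfColumns, NrOfRows and Elem; the program name ShowTableInfo is hypothetical. It assumes that cout accepts integer values in the same way as floating-point values, and that Elem is indexed as [column, row] starting at 1, as in the speed test program on this page.

```pascal
program ShowTableInfo;

begin
// size of the data table
cout ('Columns: ', DStore.NrOfColumns);
cout ('Rows:    ', DStore.NrOfRows);

// first data cell; Elem is assumed to be indexed [column, row], starting at 1
if (DStore.NrOfColumns > 0) and (DStore.NrOfRows > 0) then
  cout ('Elem[1,1]: ', DStore.Elem[1,1]);
end.
```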

In addition, numerous functions and procedures let you manipulate the data table:

Filling the table cells: Clear, Fill, FillRandomNormal
Handling nominal variables: AddNominalID, ClearNominalIDs, CountNominalIDs, OrdinalOfNominalID
Copying data: see table of copy operations
Counting: CountEmptyCells, CountImputedCells, CountNumCells, CountValidCells
Import/Export: ExportAsASCFile, ImportASCFile
Handling cell states: IfColHasCellState, IfRowHasCellState, InvertCellStates
Managing the size: Resize, InsertColumn, InsertRow, RemoveColumn, RemoveRow
Basic calculations: MeanVarOfMarkedCells, MeanVarOfNumCells, MinMaxOfMarkedCells, MinMaxOfNumCells, MinMaxOfValidCells, PercentileOfMarkedCells, PercentileOfNumCells, QuartilesOfMarkedCells, QuartilesOfNumCells, SumOfMarkedCells, SumOfNumCells, CheckDichotomousColumn
Handling attributes (class numbers): SetColAttributes, SetRowAttributes

Hint: Accessing the data in loops is generally much slower than using the appropriate functions of the TDataTable class, because scripts are executed as interpreted p-code. For example, copying the entire data matrix with the function CopyDataTo2DArray is much faster than copying the data element by element.

The following program demonstrates this effect. The first part copies the entire matrix using the function CopyDataTo2DArray; the second part does the same with two nested for loops. For a matrix of one million cells, the fast version is approximately 500 times faster.

program TestCopySpeed;

var
  MyData1 : TDouble2DArray;
  MyData2 : TDouble2DArray;
  i,j     : integer;
  t1      : double;

begin
  // fast version: a single call copies the entire matrix
  Resize2DArray (MyData1, DStore.NrOfColumns, DStore.NrOfRows);
  t1 := CurrentTime;
  DStore.CopyDataTo2DArray (MyData1,0,0,0,0,0,0);
  t1 := CurrentTime-t1;
  // the factor 3600*24*1000 converts the elapsed time from days to milliseconds
  cout ('T1 [ms]: ', t1*3600*24*1000);

  // slow version: two nested loops copy the matrix cell by cell
  Resize2DArray (MyData2, DStore.NrOfColumns, DStore.NrOfRows);
  t1 := CurrentTime;
  for i:=1 to DStore.NrOfColumns do
    for j:=1 to DStore.NrOfRows do
      MyData2[i-1][j-1] := DStore.Elem[i,j];  // Elem is 1-based, the array 0-based
  t1 := CurrentTime-t1;
  cout ('T2 [ms]: ', t1*3600*24*1000);
end.