DataLab is a compact statistics package aimed at exploratory data analysis. Please visit the DataLab Web site for more information.


Python - Data and Parameters

When you run a Python script from DataLab (either via the Python script editor or by calling the DLabPascal command ExecutePyScript), the data matrix and several other parameters are made available via the Python module "datalab". Please note that the module "datalab" requires the NumPy library. If NumPy is not installed, DataLab returns an error message and prevents you from running a Python script. The following figure gives a quick overview of the relationship between the data structures used by the DLabPascal and Python engines.

In particular, the following Python parameters are defined in the module datalab (see DStore for details on the DLabPascal definition):

Parameter Type Explanation
colname list Contains the names of the columns of the data matrix (valid indices: 0...nrofcolumns-1).
comment string Contains the comment on the dataset as HTML code. Please note that the HTML code passed to and from the variable is the code within the <body>...</body> element. Meta tags, style sheets, and JavaScript are neither allowed nor supported.
datafname string The full path of the currently loaded dataset. If the data has not yet been saved, datafname returns "noname.idt".
dstore ndarray Contains the numeric values of the data matrix (valid indices: [0...nrofrows-1,0...nrofcolumns-1]).
matx, maty ndarray Two user-defined arrays for feeding data to the Python scripts. By default, these two arrays are undefined; they are defined only if they are explicitly passed via the DLabPascal function ExecutePyScript.
nrofcolumns integer The number of columns of the data matrix. Please note that changing nrofcolumns does not affect the size of the data matrix (nrofcolumns can be seen as a read-only variable).
nrofrows integer The number of rows of the data matrix. Please note that changing nrofrows does not affect the size of the data matrix (nrofrows can be seen as a read-only variable).
params ndarray An array of parameters which is passed to Python. The parameters are floating point numbers, even if a particular parameter is an integer. In order to pass non-numeric parameters, you have to encode them as floating point numbers (for example, a boolean parameter can be encoded as 0.0 and 1.0 for FALSE and TRUE, respectively).
resmat ndarray This variable is used for communicating data between DLabPascal and Python scripts. The two-dimensional array resmat is prepared by the calling process and passed to the Python script as a container which should be filled with the results of the Python script. By default, the variable resmat is configured as a 1xN array, with N being the number of rows of the main data matrix (i.e. DStore.NrOfRows). In order to pass a differently configured resmat to a particular Python script, you have to use the DLabPascal function ExecutePyScript.
rowattrib list Contains the row attributes (= class numbers) of the matrix (valid indices: 0...nrofrows-1). Please note that the row attributes are unsigned bytes. Assigning values outside the valid range of 0..255 will result in truncated values.
rowname list Contains the names of the rows of the data matrix (valid indices: 0...nrofrows-1).
scratchdir string The path to the scratch directory of DataLab. This directory is writeable, but will be cleared when DataLab closes.
workdir string The path to the working directory of DataLab. This directory is writeable; all files are stored permanently.
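
The following is a minimal sketch of how a script might work with dstore, params, resmat and rowattrib. It is not taken from the DataLab documentation: the helper names (decode_flag, fill_row_means, clamp_row_attributes), the meaning assigned to params[0], and the sample values are assumptions made for illustration. The functions take the arrays as arguments so that the sketch also runs outside DataLab; inside DataLab one would presumably pass datalab.dstore, datalab.resmat and datalab.params instead of the stand-in values.

```python
import numpy as np


def decode_flag(value):
    """Interpret a float parameter as a boolean (assumed convention: 0.0 = FALSE)."""
    return float(value) != 0.0


def fill_row_means(dstore, resmat, params):
    """Write the mean of each row of dstore into the first row of resmat.

    params[0] is treated as a hypothetical flag: non-zero means NaN values are ignored.
    """
    ignore_nan = decode_flag(params[0])
    mean = np.nanmean if ignore_nan else np.mean
    # Assign in place so that the container prepared by the caller is filled.
    resmat[0, :] = mean(dstore, axis=1)
    return resmat


def clamp_row_attributes(values):
    """Clip class numbers to the unsigned byte range 0..255 before assignment."""
    return [int(v) for v in np.clip(values, 0, 255)]


# Stand-in values for running the sketch outside DataLab (made up for illustration).
dstore = np.array([[1.0, 2.0], [3.0, 5.0], [4.0, np.nan]])
resmat = np.zeros((1, dstore.shape[0]))  # default layout: 1xN, N = number of rows
params = np.array([1.0])

fill_row_means(dstore, resmat, params)
print(resmat)
print(clamp_row_attributes([3, 300, -1]))
```

Filling resmat in place (rather than rebinding the name to a new array) is a cautious choice here, since the table describes resmat as a container prepared by the calling process. Clipping the row attributes explicitly avoids relying on how out-of-range values are truncated.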

Hint: DataLab uses a special Python variable called "_dlab_buffer" for interprocess communication. You should therefore avoid using a variable with this name in your Python scripts, as doing so may lead to unpredictable results.
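
One way to guard against such a clash is to scan a script for the reserved name before running it. The sketch below uses Python's standard ast module; the helper uses_reserved_name is hypothetical and not part of DataLab, and it only detects plain variable references (not, for example, function argument names).

```python
import ast

RESERVED_NAME = "_dlab_buffer"  # the variable named in the hint above


def uses_reserved_name(source):
    """Return True if the script source refers to the reserved variable name."""
    tree = ast.parse(source)
    return any(
        isinstance(node, ast.Name) and node.id == RESERVED_NAME
        for node in ast.walk(tree)
    )


# Two made-up script snippets for illustration.
safe_script = "total = 1 + 2\n"
clashing_script = "_dlab_buffer = []\n"

print(uses_reserved_name(safe_script))
print(uses_reserved_name(clashing_script))
```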