DataLab is a compact statistics package aiming at exploratory data analysis. Please visit the DataLab Web site for more information.
See also: Importing Data from Excel, Importing CSV Files, Mathematical Expressions in Import Scripts, Parsing Date and Time, Import Script Example, Importing Simple Text
Importing Complex Text Data
When acquiring data, a measuring device often delivers its measurements as a text file in a proprietary format. To import such text data, DataLab provides a simple script language which can be used to analyse the text lines and extract the required data from them.
The general approach of the text file analysis is straightforward: the data file is read line by line, and once the trigger point has been found, each line is interpreted by the script. The trigger point is a line in the data file which contains a particular substring (the trigger string is defined in the field "Trigger Line"). The matching of the trigger point is not case-sensitive.
In order to increase flexibility the user may specify a number of lines which are to be skipped at the beginning of the file before the search for the trigger point is started (field "No. of lines to skip before trigger"). Further, the user may also specify a number of lines which are skipped after the trigger point (field "No. of lines to skip after trigger").
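The selection of lines described above can be sketched in Python. This is only an illustration of the described behaviour, not DataLab's implementation: the function name and parameters are hypothetical, and the sketch assumes that the trigger line itself is not passed to the script and that the lines skipped before the trigger are not searched for the trigger string.

```python
from typing import Iterable, Iterator


def lines_to_interpret(
    lines: Iterable[str],
    trigger: str,
    skip_before: int = 0,
    skip_after: int = 0,
) -> Iterator[str]:
    """Yield the lines that would be handed to the extraction script."""
    needle = trigger.lower()
    found = False
    remaining_after = skip_after
    for index, line in enumerate(lines):
        if index < skip_before:
            continue  # skipped before the trigger search starts
        if not found:
            # Case-insensitive substring match against the trigger string.
            if needle in line.lower():
                found = True  # assumption: the trigger line itself is not interpreted
            continue
        if remaining_after > 0:
            remaining_after -= 1  # skipped after the trigger point
            continue
        yield line


sample = ["[DATA] old header", "device: X", "[data]", "units", "1\t2", "3\t4"]
selected = list(lines_to_interpret(sample, "[DATA]", skip_before=1, skip_after=1))
```

Here the first line is ignored even though it contains the trigger string, because it lies within the skipped range; the third line matches despite the different case, the "units" line is skipped, and only the two data lines remain.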
The actual extraction is controlled by the script which is loaded and edited in the field "Data Extraction Script" (see below for the syntax of the script). Each global variable (= column of the data matrix) can be accessed in the script by an identifier which consists of "C" and the corresponding column number.
The extracted variables are named by their default headings (C1, C2, ....). In order to change the variable names the field "List of Variables" has to be filled in, specifying the desired variable names.
Tab characters: Tab characters are often used in text files to separate different entities. In order to include tabs in the parsing script, they have to be represented by a special character combination: '\t' (without quotes) or '»'. Tabs in the data file display are always indicated by the character '»'.
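The two tab notations can be illustrated with a short Python sketch. The helper names are hypothetical and the code only mirrors the convention described above; it makes no claim about how DataLab handles tabs internally.

```python
TAB_MARKER = "\u00bb"  # the character »


def script_text_to_literal(text: str) -> str:
    """Replace both tab notations (backslash-t and ») with a real tab character."""
    return text.replace("\\t", "\t").replace(TAB_MARKER, "\t")


def display_line(line: str) -> str:
    """Show tabs the way the data file display does, as »."""
    return line.replace("\t", TAB_MARKER)


literal = script_text_to_literal("Time\\tValue" + TAB_MARKER + "Unit")
shown = display_line("1.5\t20.3")
```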
Debugging a script: Finding and correcting mistakes in a script may be quite time-consuming. A simple and efficient way to simplify bug fixing is to generate a debug report by ticking the checkbox "Debug Script". If this checkbox is activated, DataLab generates a report of all actions performed during the application of the script. The debug report is displayed in an extra tab at the right side of the window.
Available Script Commands
In general, each command has the same structure: the actual command is followed by the required parameters enclosed in parentheses and is finished by a semicolon or the end of the line. Commands must not be nested (i.e. a command cannot be called from within another command). If a command specifies a variable, the variable is created automatically without explicit declaration.
Commands starting with a hash character (#) are treated as comments. A comment either ends with the next semicolon or at the end of a line.
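The separation rules for commands and comments can be sketched in Python as follows. The command names `cmd_a`, `cmd_b` and `cmd_c` are placeholders rather than actual DataLab commands, and the sketch assumes that a semicolon never occurs inside a command's parameters.

```python
from typing import List


def split_commands(script: str) -> List[str]:
    """Split a script into commands and drop comments.

    A command ends at a semicolon or at the end of the line; a piece that
    starts with '#' is a comment and ends at the same boundaries.
    """
    commands = []
    for line in script.splitlines():
        for part in line.split(";"):
            part = part.strip()
            if not part or part.startswith("#"):
                continue  # empty piece or comment
            commands.append(part)
    return commands


script = "# read header; cmd_a(1); cmd_b(2)\n# a comment line\ncmd_c(3)"
commands = split_commands(script)
```

In this example the comment on the first line ends at its semicolon, so the two commands that follow it on the same line are still kept.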
Variable identifiers: The script may use any number of user-defined variables which are automatically created when needed. A variable identifier may be any string starting with a letter and containing only letters, digits and the underscore character; it must not use reserved words as defined in mathematical expressions (e.g. function names).
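A check along the lines of this identifier rule might look like the following Python sketch. It assumes that identifiers are limited to ASCII letters, digits and the underscore; the set of reserved words is a hypothetical sample, since the actual list is defined in the section on mathematical expressions, and comparing reserved words without regard to case is likewise an assumption.

```python
import re

# Hypothetical sample only; the real reserved words are listed elsewhere.
RESERVED_WORDS = {"sin", "cos", "exp", "ln"}

_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_identifier(name: str) -> bool:
    """Return True if name follows the identifier rule sketched above."""
    if not _IDENTIFIER.fullmatch(name):
        return False
    # Assumption: reserved words are compared case-insensitively.
    return name.lower() not in RESERVED_WORDS


results = {name: is_valid_identifier(name) for name in ["peak_2", "2peak", "my-var", "sin"]}
```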