DataLab is a compact statistics package aimed at exploratory data analysis. Please visit the DataLab Web site for more information.


Simple Regression

Command: Math -> Simple Regression...

The command Math/Simple Regression... calculates a regression between a single descriptor variable and a single response variable. The following regression functions(1) can be parametrized:
straight line: y = kx + d
parabolic curve: y = a + bx + cx^2
reciprocal curve: y = 1/(a + bx)
hyperbolic curve: y = a + b/x
logarithmic curve: y = a + b ln(x)
exponential curve: y = a e^{bx}
polynomial curve: y = a_0 + Σ a_i x^i
centered polynomial: y = a_0 + Σ a_i (x - x_0)^i
Hoerl function: y = k_0 · k_1^x · x^{k_2}
normal distribution
natural spline: y = S_k(x) = a_{k,0} + a_{k,1}(x - x_k) + a_{k,2}(x - x_k)^2 + a_{k,3}(x - x_k)^3 for x ∈ [x_k, x_{k+1}] and k = 0, 1, 2, ..., n-1
smoothed spline: y = S_k(x) = a_{k,0} + a_{k,1}(x - x_k) + a_{k,2}(x - x_k)^2 + a_{k,3}(x - x_k)^3 for x ∈ [x_k, x_{k+1}] and k = 0, 1, 2, ..., n-1, with smoothing factor γ = 0.0...1.0
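
Several of the curves listed above reduce to a straight-line fit after a change of variables, which is one common way to estimate their parameters. The following Python sketch illustrates this for the straight line, the hyperbolic curve, and the exponential curve. It is not DataLab's implementation, which is not documented here; the helper names fit_line, fit_hyperbolic, and fit_exponential are hypothetical, and fitting the exponential curve on ln(y) weights the errors differently than a direct nonlinear fit would.

```python
import math

def fit_line(xs, ys):
    """Least-squares straight line y = k*x + d; returns (k, d)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    k = sxy / sxx
    return k, my - k * mx

def fit_hyperbolic(xs, ys):
    """y = a + b/x, fitted as a straight line in u = 1/x; returns (a, b)."""
    b, a = fit_line([1.0 / x for x in xs], ys)
    return a, b

def fit_exponential(xs, ys):
    """y = a*exp(b*x), fitted as a straight line in ln(y); needs y > 0."""
    b, ln_a = fit_line(xs, [math.log(y) for y in ys])
    return math.exp(ln_a), b

# Noise-free synthetic data, so each fit should recover its parameters.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
k, d = fit_line(xs, [2.0 * x + 1.0 for x in xs])
a_h, b_h = fit_hyperbolic(xs, [3.0 + 2.0 / x for x in xs])
a_e, b_e = fit_exponential(xs, [1.5 * math.exp(0.4 * x) for x in xs])
print(k, d, a_h, b_h, a_e, b_e)
```

The reciprocal and logarithmic curves can be handled the same way (fit 1/y against x, or y against ln(x)); the splines and the normal distribution cannot be linearized like this.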

Hint: Data that span a small range at a large offset may cause numeric instabilities and round-off errors when polynomial fits of higher order are calculated (this is not a deficiency of DataLab but a general problem of the limited numeric accuracy of digital computers). A possible remedy is to use centered polynomials with the x_0 value set to the center of the x-values (DataLab does this automatically when the "Shifted Polyn." function is selected).
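
The round-off problem described in the hint already shows up in the sums of powers of x that a least-squares fit accumulates. The following Python sketch computes the sum of squared deviations of x in two ways for data with a small range (0 to 9) at a large offset (1e9): once from raw sums of x and x^2, and once after centering. It only illustrates the mechanism under these assumed data; the function names are hypothetical and nothing here reflects DataLab's internal code.

```python
def sum_sq_dev_naive(xs):
    """Sum of squared deviations from raw sums: sum(x^2) - (sum x)^2 / n."""
    n = len(xs)
    return sum(x * x for x in xs) - sum(xs) ** 2 / n

def sum_sq_dev_centered(xs):
    """Same quantity, computed after shifting x by its center x0."""
    n = len(xs)
    x0 = sum(xs) / n
    return sum((x - x0) ** 2 for x in xs)

# Small range (0..9) at a large offset (1e9); the exact result is 82.5.
xs = [1.0e9 + i for i in range(10)]
naive = sum_sq_dev_naive(xs)
centered = sum_sq_dev_centered(xs)
print("naive:   ", naive)
print("centered:", centered)
```

With raw values the two large sums (about 1e19 each) cancel almost completely, so the digits that carry the information are lost; after centering, all intermediate values are small and the result is exact. Higher polynomial orders involve higher powers of x and are affected even more strongly.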

After clicking the command, the user first has to select the input (descriptor) variable and then the response variable. A window is then displayed in which one of the above-mentioned curve types can be selected and the corresponding parameters calculated.

The regression results are displayed on five pages:

Regression: This page shows the plot of the data together with the regression curve. Data can be marked by drawing a rectangular window around them.
Residuals: This page shows the residuals plotted against the descriptor. Again, data can be marked by drawing a rectangular window around them.
Distribution of Residuals: This page shows the histogram of the residuals with the ideal normal distribution plotted on top of the histogram.
Details: This page shows the most important details of the regression function: the parameters of the curve, the Durbin-Watson test for serial correlation among the residuals, the Lilliefors test for normality of the residuals, and the ANOVA.
Calculate: On this page the user may calculate values of the dependent variable for particular values of the independent variable. The confidence interval is calculated both for the means and for individual values (at a level of confidence specified by the user).
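
Two of the quantities mentioned above can be sketched for the straight-line case in a few lines of Python: the Durbin-Watson statistic of the residuals, and the half-widths of the confidence intervals for the mean and for an individual value at a given x. This is a textbook sketch, not DataLab's code. The function names and the sample data are hypothetical, and the critical t value has to be supplied by the caller (2.306 is the two-sided 95% value for 8 degrees of freedom, matching the 10 data points used here).

```python
import math

def fit_line(xs, ys):
    """Least-squares straight line y = k*x + d; returns (k, d)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    k = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return k, my - k * mx

def durbin_watson(residuals):
    """Sum of squared successive differences divided by sum of squares."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    return num / sum(e * e for e in residuals)

def interval_half_widths(xs, ys, x_new, t_crit):
    """Half-widths of the intervals for the mean and an individual value."""
    n = len(xs)
    k, d = fit_line(xs, ys)
    mx = sum(xs) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sse = sum((y - (k * x + d)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))          # residual standard error
    lever = 1.0 / n + (x_new - mx) ** 2 / sxx
    return t_crit * s * math.sqrt(lever), t_crit * s * math.sqrt(1.0 + lever)

# Hypothetical sample data (10 points, hence 8 degrees of freedom).
xs = [float(i) for i in range(1, 11)]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]
k, d = fit_line(xs, ys)
residuals = [y - (k * x + d) for x, y in zip(xs, ys)]
dw = durbin_watson(residuals)
half_mean, half_ind = interval_half_widths(xs, ys, x_new=5.5, t_crit=2.306)
print(dw, half_mean, half_ind)
```

The Durbin-Watson statistic lies between 0 and 4, with values near 2 indicating no serial correlation. The interval for an individual value is always wider than the interval for the mean, because it additionally includes the scatter of a single observation around the curve.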

Several shortcut buttons to the left of the chart area support the following actions:

switching the axes: The currently selected variable can be switched by clicking the arrow buttons. The top buttons affect the dependent variable, the bottom buttons the independent variable.
selecting new variables: Another variable may be selected by using the eye dropper button. Clicking this button lets you pick a particular variable by means of the variable selection dialog.
setting up the plot symbols: Set up the plot symbols and colors.
zoom full: Zoom the window such that all data points are visible.
mouse action:
Pan: The visible part of the chart can be shifted by pressing and holding down the left mouse button. The chart follows the mouse movement, which results in panning the data.
Window: Any rectangular region of the chart can be magnified to fit the full area of the chart by pressing and holding down the left mouse button at one corner of the region. Moving the mouse then shows a rubber band rectangle. The area of the rectangle is blown up to the full chart area when the left mouse button is released.
Drag: The scale of the chart can be increased or reduced by pressing and holding the left mouse button. The scaling factors of the x and y axes are changed in proportion to the movement of the mouse (up and right = magnify, left and down = reduce).
marking data: Data may be marked by activating the Mark Data option. You can toggle between four different modes of marking by clicking this button when it is already selected or by selecting an option from the context menu of the button. A detailed description can be found in the section "marking the data".
removing marks: Marked data can be unmarked by clicking this button.
copying residuals: Clicking this button copies the residuals of the regression into the matrix clipboard.
saving the regression equation: The regression equation can be saved as a DLabPascal script file.



(1) Some of these functions are not available in the evaluation copy.