Non-linear Regression

G7 offers two algorithms for non-linear regression: the simplex method (sketched below and not to be confused with the simplex method of linear programming) and the Powell method. The form for the two is exactly the same, except that the first is invoked by the nl command and the second by the nlp command. We illustrate with nl here.

nl <y> = <non-linear function involving n parameters, a0, a1, ...an-1>
n
<starting values of the parameters>
<initial variations>

Example:

nl y = @exp(a0*@log(a1 + a2*x))
3
1.5  25.0  2.00
0.1   1.0  0.05
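
As a rough illustration of what is being minimized in this example, the sum of squared errors can be sketched in Python. This is not G7 code; the function name sum_squared_errors and the data below are hypothetical, and the data are generated from the starting values above purely so that the fit is exact.

```python
import math

def sum_squared_errors(params, xs, ys):
    """Sum of squared errors S for y = exp(a0 * log(a1 + a2*x))."""
    a0, a1, a2 = params
    total = 0.0
    for x, y in zip(xs, ys):
        predicted = math.exp(a0 * math.log(a1 + a2 * x))
        total += (y - predicted) ** 2
    return total

# Hypothetical data generated from known parameters, so S is zero there.
true_params = (1.5, 25.0, 2.0)
xs = [1.0, 2.0, 3.0, 4.0]
ys = [math.exp(1.5 * math.log(25.0 + 2.0 * x)) for x in xs]

print(sum_squared_errors(true_params, xs, ys))
print(sum_squared_errors((1.6, 25.0, 2.0), xs, ys))
```

The search algorithm only ever sees this one number, S, for each trial set of parameters, which is why speeding up its evaluation matters.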

Since the non-linear function must be evaluated repeatedly, anything the user can do to speed the evaluation is helpful. For example, put all of the variables involved into the workspace; make the workspace no larger than necessary; take any functions of variables, such as logs or squares, which do not change through the calculations and put the results in the workspace and use them instead of, say, taking the log over and over.

In the simplex algorithm, ‘S’, the sum of squared errors, is first calculated at the initial values of the parameters. Then, one by one, the parameters are changed by adding the initial variations, and ‘S’ is recalculated at each point, yielding values at ‘n+1’ points (a simplex). By the algorithm sketched below, points in the simplex are replaced by better points, or the simplex is shrunk towards its best point. The process continues until no point differs from the best point by more than one-tenth of the initial variation in any parameter. Each time a replacement is made, the simplex and the operation that yielded it are reported on screen.

Like all non-linear algorithms, this one is not guaranteed to work on all problems. It has, however, certain advantages: it is easily understood, no derivatives are required, the programming is easy, the process never forgets the best point it has found so far, and the process either converges or goes on improving forever. The display shows the regression coefficients, their standard errors, t-values, and variance-covariance matrix.

The Powell method also requires no derivatives and has the property of “quadratic convergence.” That is, applied to a quadratic form, it will find the minimum in a finite number of steps.

Following the nl command, there can be f commands, r commands, and con commands before the non-linear equation itself. These may contain the parameters a0, a1, etc.; each should be terminated by ‘;’. The non-linear search then includes the execution of these lines. For example:

nl f x1 = @cum(s,v,a0);
y = a1 + a2*x1

If the name of the left-side variable is “zero”, the difference between the left and right sides is not squared. Instead, any squaring must be done explicitly with the @sq() function. This feature allows soft constraints to be built into the objective function. For example,

nl zero = @sq(y - ( a0 + a1*x1 + a2*x2)) + 100*@sq(@pos(-a2))

will “softly” require a2 to be positive in the otherwise linear regression of y on x1 and x2.
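
The effect of the penalty term can be sketched in Python. This is an illustration only, not G7 code: it assumes that @pos() returns its argument when positive and zero otherwise, and that the right-hand side is summed over all observations, so the penalty is added once per observation. The names pos and penalized_objective and the data rows are hypothetical.

```python
def pos(v):
    """Positive part of v (assumed behaviour of @pos)."""
    return v if v > 0 else 0.0

def penalized_objective(params, rows, weight=100.0):
    """Sum over observations of squared error plus a penalty on negative a2."""
    a0, a1, a2 = params
    total = 0.0
    for y, x1, x2 in rows:
        residual = y - (a0 + a1 * x1 + a2 * x2)
        total += residual ** 2 + weight * pos(-a2) ** 2
    return total

# Hypothetical observations (y, x1, x2) generated from y = 1 + 2*x1 + 3*x2.
rows = [(6.0, 1.0, 1.0), (5.0, 2.0, 0.0)]

print(penalized_objective((1.0, 2.0, 3.0), rows))   # a2 > 0: no penalty
print(penalized_objective((1.0, 2.0, -3.0), rows))  # a2 < 0: penalty added
```

With a positive a2 the penalty vanishes and the objective is the ordinary sum of squared errors; with a negative a2 the objective rises steeply, which pushes the search back towards positive values without forbidding negative ones outright.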

If the name of the left-side variable is “last”, no summing will be performed. Instead, the value of the function on the right at the “last” period will be minimized. This “last” period is the one specified by the second date on the preceding limits command. In order to combine summing of some components of the function on the right with the use of the “last” option, the @sum() function is used. It places the sum of the elements from the first date to the second date in the second date position of the output series, which is otherwise zero. This device is useful in some maximum likelihood calculations.
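
The behaviour described for @sum() can be pictured with a short Python sketch. It is only an illustration of the description above, not G7's implementation; the function name sum_into_last and the use of list indexes in place of dates are assumptions.

```python
def sum_into_last(series, first, last):
    """Return a series that is zero everywhere except at index `last`,
    which holds the sum of series[first..last] inclusive."""
    out = [0.0] * len(series)
    out[last] = float(sum(series[first:last + 1]))
    return out

# Hypothetical series; indexes 1..3 stand in for the two limits dates.
summed = sum_into_last([1.0, 2.0, 3.0, 4.0], 1, 3)
print(summed)
```

Under the “last” option, only the value in the final position would enter the objective, so a summed component and an unsummed component can be combined there.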

Both the simplex method and Powell’s method are described in the book Numerical Recipes in C, by William H. Press et al., Cambridge University Press. The simplex method is sketched below.

Sketch of Simplex Method of Non-linear Regression

The method begins with a point in the parameter space of, say, ‘n’ dimensions and step sizes for each parameter. Initially, new points are found by taking one step in each direction. Actually, each step is made both forwards and backwards, and the better of the two positions is chosen. This initial operation gives ‘n+1’ points, the initial point and n others. These ‘n+1’ points in ‘n’ dimensions are known as a simplex. The algorithm then goes as follows.

Reflect the old worst point, W, through mid-point, M, of the other points to R (reflected).
(R is on the straight line from W to M, the same distance as W from M, but on the other side.)

if R is better than old best, B {
   expand to E by taking another step of length MR in the same direction.
   if E is better than R,
      replace W by E in the simplex.
   else
      replace W by R. (reflected)
   }
else if R is better than W {
   replace W by R in the simplex.  (reflected)
   }
else {
   contract W half way to mid-point of other points, to C. (contracted)
   if C is better than W,
      replace W by C.
   else
      shrink all points except B half way towards B. (shrunk)
   }
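
The steps above can be turned into a short Python sketch. It follows the description in this section and is not G7's actual source; the function name simplex_minimize, its arguments, the max_iter safeguard, and the test function bowl are all hypothetical. The stopping rule uses the one-tenth-of-initial-variation criterion mentioned earlier, exposed here as tol_fraction.

```python
def simplex_minimize(f, start, steps, tol_fraction=0.1, max_iter=10000):
    """Minimize f by the simplex sketch; returns (best_point, best_value)."""
    n = len(start)
    # Initial simplex: the starting point plus, for each parameter, the
    # better of one step forwards and one step backwards.
    points = [list(start)]
    for i in range(n):
        fwd = list(start)
        fwd[i] += steps[i]
        bwd = list(start)
        bwd[i] -= steps[i]
        points.append(fwd if f(fwd) <= f(bwd) else bwd)
    values = [f(p) for p in points]

    for _ in range(max_iter):
        # Sort so points[0] is the best (B) and points[-1] the worst (W).
        order = sorted(range(n + 1), key=lambda k: values[k])
        points = [points[k] for k in order]
        values = [values[k] for k in order]
        best, worst = points[0], points[-1]
        # Stop when no point differs from B by more than tol_fraction of
        # the initial variation in any parameter.
        if all(abs(p[i] - best[i]) <= tol_fraction * abs(steps[i])
               for p in points for i in range(n)):
            break
        # M is the mid-point of all points except W; R = M + (M - W).
        mid = [sum(p[i] for p in points[:-1]) / n for i in range(n)]
        refl = [2 * mid[i] - worst[i] for i in range(n)]
        f_refl = f(refl)
        if f_refl < values[0]:
            # E = R + (R - M): another step of length MR in the same direction.
            expd = [3 * mid[i] - 2 * worst[i] for i in range(n)]
            f_expd = f(expd)
            if f_expd < f_refl:
                points[-1], values[-1] = expd, f_expd   # expanded
            else:
                points[-1], values[-1] = refl, f_refl   # reflected
        elif f_refl < values[-1]:
            points[-1], values[-1] = refl, f_refl       # reflected
        else:
            # C is half way from W to M.
            con = [(worst[i] + mid[i]) / 2 for i in range(n)]
            f_con = f(con)
            if f_con < values[-1]:
                points[-1], values[-1] = con, f_con     # contracted
            else:
                # Shrink every point except B half way towards B.
                for k in range(1, n + 1):
                    points[k] = [(points[k][i] + best[i]) / 2
                                 for i in range(n)]
                    values[k] = f(points[k])            # shrunk

    k_best = min(range(n + 1), key=lambda k: values[k])
    return points[k_best], values[k_best]


def bowl(p):
    """Hypothetical test function with its minimum at (3, -1)."""
    return (p[0] - 3.0) ** 2 + (p[1] + 1.0) ** 2

point, value = simplex_minimize(bowl, [0.0, 0.0], [1.0, 1.0], tol_fraction=1e-4)
print(point, value)
```

Because a point is replaced only by a strictly better one and shrinking leaves B untouched, the best value found never gets worse from one iteration to the next, which matches the property noted earlier that the process never forgets its best point.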
