Model II regression should be used when the two variables in the regression equation are random and subject to error, i.e. not controlled by the researcher. Model I regression using ordinary least squares underestimates the slope of the linear relationship between the variables when they both contain error. According to Sokal and Rohlf (1995), the subject of Model II regression is one on which research and controversy are continuing and definitive recommendations are difficult to make.
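The underestimation of the ordinary least squares slope can be illustrated with a small simulation. The following is only a sketch in Python and is not part of MAREGRESS; the helper name ols_slope, the seed, the sample size, and the error variances are all assumptions chosen for illustration.

```python
import random

def ols_slope(y, x):
    """Model I (ordinary least squares) slope of y on x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

random.seed(1)                     # fixed seed so the run is repeatable
n = 5000
true_x = [random.gauss(0.0, 1.0) for _ in range(n)]
true_y = list(true_x)              # exact relationship: slope 1, intercept 0

# Both observed variables carry independent measurement error (variance 1).
x = [t + random.gauss(0.0, 1.0) for t in true_x]
y = [t + random.gauss(0.0, 1.0) for t in true_y]

slope = ols_slope(y, x)
print(round(slope, 2))             # noticeably below the true slope of 1
```

Under these assumed settings the error in X doubles its variance while leaving the covariance unchanged, so the fitted slope is expected to fall near one half of the true slope.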
MAREGRESS is a Model II procedure. A bivariate normal distribution can be represented by concentric ellipses centered at [mean(X), mean(Y)]. From analytic geometry, an ellipse can be described by two principal axes (the major and the minor axis), which are at right angles to each other. The major axis is the longest possible axis of the ellipse. To estimate its parameters we need the sample means, standard deviations, and covariance. Here, the main task is to find the slope and the equation of the major axis of the sample.
The equation of the major axis is defined as:
Y = mean(Y) + b1*(X - mean(X)) = mean(Y) - b1*mean(X) + b1*X = b0 + b1*X
As we can see, the equation involves the means of the two variables and the slope of the major axis, b1. The slope is calculated as:
b1 = SYX/[lambda1 - var(Y)]
where SYX is the covariance(Y,X), var(Y) is the variance of Y, and lambda1 the variability (variance) along the major axis [first eigenvalue, latent root or characteristic root of the variance-covariance matrix of Y and X].
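The slope and intercept formulas above can be sketched in a few lines. This is a minimal Python illustration, not the MAREGRESS source: the function name major_axis and the sample data are hypothetical, and lambda1 is obtained from the closed-form larger eigenvalue of the 2-by-2 variance-covariance matrix.

```python
import math

def major_axis(y, x):
    """Return (b0, b1, lambda1) of the major axis for paired samples."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    # Sample variances and covariance (n - 1 denominator).
    var_x = sum((a - mx) ** 2 for a in x) / (n - 1)
    var_y = sum((b - my) ** 2 for b in y) / (n - 1)
    syx = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    # Larger eigenvalue of [[var_x, syx], [syx, var_y]] in closed form.
    d = math.sqrt((var_x + var_y) ** 2 - 4.0 * (var_x * var_y - syx ** 2))
    lambda1 = (var_x + var_y + d) / 2.0
    b1 = syx / (lambda1 - var_y)   # assumes the covariance is nonzero
    b0 = my - b1 * mx              # the line passes through the two means
    return b0, b1, lambda1

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1, lambda1 = major_axis(y, x)
print(b0, b1)
```

For these five points the ordinary least squares slope is 1.99, and the major axis slope comes out slightly steeper, as expected when X is also treated as subject to error.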
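The major axis method dates back to Pearson (1901). Here we use the procedure given by Sokal and Rohlf (1995 [Box 15.5]).
The underlying theory comes from multivariate statistics and is related to principal component analysis (PCA): the major axis is the first principal axis of the variance-covariance matrix of Y and X.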
This procedure assumes:
-- a bivariate normal distribution
-- both variables are in the same physical units or are dimensionless
-- the error variances of the two variables are approximately equal (if there is no information on the ratio of the error variances and no reason to believe that it would differ from 1, results must be taken with caution)
-- as an exception to the units requirement, it can be used with dimensionally heterogeneous variables when the purpose of the analysis is to compare the slopes of the relationships between the two variables measured under different conditions
[B,BINTR] = MAREGRESS(Y,X,ALPHA) returns the vector B of major axis regression coefficients of the linear Model II fit and a matrix BINTR of confidence intervals for B at the significance level ALPHA.
MAREGRESS treats NaNs in Y or X as missing values, and removes them.
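The missing-value handling can be pictured as follows. This Python sketch assumes pairwise deletion, meaning that an observation is dropped whenever either its Y or its X value is NaN; the text above does not spell out that detail, and the helper name drop_missing_pairs is hypothetical.

```python
import math

def drop_missing_pairs(y, x):
    """Keep only the (y, x) pairs in which neither value is NaN."""
    kept = [(b, a) for b, a in zip(y, x)
            if not (math.isnan(b) or math.isnan(a))]
    y_clean = [b for b, _ in kept]
    x_clean = [a for _, a in kept]
    return y_clean, x_clean

# Hypothetical data: the second and third observations are incomplete.
y = [2.1, float("nan"), 6.2, 7.8]
x = [1.0, 2.0, float("nan"), 4.0]
y_clean, x_clean = drop_missing_pairs(y, x)
print(y_clean, x_clean)   # [2.1, 7.8] [1.0, 4.0]
```

Dropping whole pairs keeps Y and X aligned, which the means, variances, and covariance used by the major axis all require.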
Syntax: function [b,bintr] = maregress(y,x,alpha)
Inputs:
y - dependent variable (must be the first argument)
x - independent variable (must be the second argument)
alpha - significance level
Outputs:
b - major axis regression parameters [intercept; slope]
bintr - parameter confidence intervals [lower upper (intercept); lower upper (slope)]