Linear regression is a beautiful algorithm that models relationship between two or more variables. The algorithm estimates linear weights ⍵0, ⍵1 … ⍵n given data (y, X) and assumed model:
y = ⍵0 + ⍵1x1 + … + ⍵nxn (1)
The classical approach for calculating ⍵ is minimizing the residual sum of squares:
RSS(⍵) = ∑i=1..N (yi - f(xi))2 = (y − X⍵)T(y − X⍵) (2)
After differentiating the RSS we find it extrema:
⍵* = (XTX)−1XTy (3)
The formula is really great, however there are couple of problems with it from the computational point of view:
- Matrix inverse is quite expensive procedure
- We have to store all values of x and y
- To update the model with new sample we need to iterate over all dataset
Instead of using the classical formula for ⍵ (3) the online-linear-regression is based on the recursive least squares algorithm. It stores only one n×n matrix P and two n-dimensional vectors K and ⍵, where n - number of weights.
Ki+1 = Pixi+1(1 + xTi+1Pixi+1)−1
⍵i+1 = ⍵i + Ki+1(yi+1 - xTi+1⍵)
Pi+1 = Pi − Ki+1xTi+1Pi
npm install online-linear-regression
const LinReg = require('online-linear-regression')
To use in browsers, bundle with browserify
I've tried to make the function interface as simple as possible. However because of inner values and on-line nature it's still needed to initialize the function before usage:
const LinReg = require('online-linear-regression')
const linreg = LinReg()
then just call linreg
with 2 arguments (x, y) when training the model or one argument (x) when predicting ŷ.
// Example of polynomial regression
for (let i = 0; i < 5; i += Math.random() * 0.5) {
let noise = Math.random() * 10 - 5
let y = 3 * i ** 2 - 2 * i ** 2 + 9 * i + 10 + noise
linreg([i, i ** 2, i ** 3], y)
}
console.log(linreg([2, 4, 8])) // 43.22
The problem of finding solutions for a system of independent equations when the number of equations is greater than the number of variables is known as Least Squares Estimation. In our case we have data (X, y) with N
pairs of input/ouput values. Each input is a vector xi of length n-1
, output — scalar value of yi. We try to find weight vector ⍵ of length n
with coefficients ⍵0,⍵1…⍵n.
If N > n (i.e. data length is bigger that number of input variables) it's impossible to find one unique solution for ⍵. So we should use the LSE method.
y0 = ⍵0 + ⍵1x01 + … + ⍵nx0n
y1 = ⍵0 + ⍵1x11 + … + ⍵nx1n
…
yN = ⍵0 + ⍵1xN1 + … + ⍵nxNn
or in vector form:
y = X⍵
we are trying to find such ⍵ that minimizes the error function (3). To do that let's find the function's derivative:
RSS(⍵) = (y − X⍵)T(y − X⍵) = (yT − ⍵TXT)(y − X⍵) = yTy - ⍵TXTy - yTX⍵ + ⍵TXTX⍵
Understanding that ⍵TXTy is scalar and its transpose (⍵TXTy)T = yTX⍵ doesn't change its value:
RSS(⍵) = yTy - 2⍵TXTy + ⍵TXTX⍵
To differentiate the above formula, we use such math statements:
∂/∂⍵ (⍵TXTy) = XTy
∂/∂⍵ (⍵TXTX⍵) = 2XTX⍵
Now differentiate RSS(⍵) and find a closed-form formula for ⍵ when ∂RSS(⍵)/∂⍵ equals to 0 (function extrema)
∂/∂⍵ RSS(⍵) = -2XTy + 2XTX⍵ = 0
⍵ = (XTX)−1XTy
We are interested in a recursive algorithm for ⍵ such that when new data arrives it'd be possible to update weights without iterating over the dataset. Let's start from the formula (3) for i+1
case:
⍵i+1 = (XTi+1Xi+1)−1XTi+1yi+1 = Pi+1XTi+1yi+1 (4)
Where:
Pi+1 = (XTi+1Xi+1)−1 = (XTiXi + xi+1xTi+1)−1 = (Pi + xi+1xTi+1)−1 (5)
Using Woodbury matrix identity we get rid of matrix inversion in (5):
Pi+1 = Pi - αPixi+1xTi+1Pi
α is a scalar value equal to
α = (1 + xTi+1Pixi+1)-1
Now we can re-write the equation (4) in a recursive way:
⍵i+1 = Pi+1XTi+1yi+1 = Pi+1(XTiyi + xi+1yi+1) = (Pi - αPixi+1xTi+1Pi)(XTiyi + xi+1yi+1)
…
⍵i+1 = ⍵i + ki+1(yi+1 - xTi+1⍵i) (6)
where
ki+1 = Pi+1xi+1 = αPixi+1 = (1 + xTi+1Pixi+1)-1Pixi+1 (7)
The last thing we need is an updated formula of Pi+1 based on knowledge of k:
Pi+1 = Pi - αPixi+1xTi+1Pi = Pi - ki+1xTi+1Pi (8)
Now we have all needed pieces to calculate weights ⍵ iteratively. For each new sample:
- Calculate vector ki+1 using previous values of the matrix Pi and new input values x0,x1…xn
- Calculate vector ⍵i+1
- Update n×n matrix Pi+1