
[v1.1] Add support for errors-in-variables #19

Open · wants to merge 4 commits into base: main
7 changes: 6 additions & 1 deletion package-lock.json

Some generated files are not rendered by default.

10 changes: 7 additions & 3 deletions package.json
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
{
"name": "ml-levenberg-marquardt",
"version": "1.0.3",
"version": "1.1.0",
Contributor:

I think it's customary to separate release bump commits from other work (some package maintainers get really peeved about this). @targos Are there strong opinions about this for this project?

Member:

Yes, we use a tool that bumps the version number automatically (based on the commit messages) when we publish the release

"description": "Curve fitting method in javascript",
"main": "./lib/index.js",
"module": "./src/index.js",
Expand All @@ -12,7 +12,7 @@
"scripts": {
"eslint": "eslint src",
"eslint-fix": "npm run eslint -- --fix",
"prepare": "rollup -c",
"preparea": "rollup -c",
Author:

Was this intended? I have no idea what it should do, but it looks a lot like a typo to me 😀

Member:

Haha, it was a temporary rename because npm install was failing 🙃

Member:

So yes, typo. You can push up a fix.

"test": "npm run test-only && npm run eslint",
"test-coverage": "npm run test-only -- --coverage",
"test-only": "jest",
Expand Down Expand Up @@ -49,6 +49,10 @@
"rollup": "^0.66.6"
},
"dependencies": {
"ml-matrix": "^5.2.0"
"ml-matrix": "^5.2.0",
"norm-dist": "^2.0.1"
},
"jest": {
"testURL": "http://localhost"
}
}
10 changes: 5 additions & 5 deletions src/__tests__/curve.js
Original file line number Diff line number Diff line change
Expand Up @@ -43,8 +43,8 @@ test('bennet5 problem', () => {
damping: 0.00001,
maxIterations: 1000,
errorTolerance: 1e-3,
maxBound: [11, 11, 11],
minBound: [1, 1, 1],
maxValues: [11, 11, 11],
minValues: [1, 1, 1],
initialValues: [3.5, 3.8, 4]
};

Expand All @@ -62,7 +62,7 @@ test('fourParamEq', () => {
expect(levenbergMarquardt(data, fourParamEq, options)).toBeDeepCloseTo(
{
iterations: 200,
parameterError: 374.6448,
residuals: 16398.0009,
parameterValues: [-16.7697, 43.4549, 1018.8938, -4.3514]
},
3
Expand All @@ -78,8 +78,8 @@ test('error is NaN', () => {

expect(levenbergMarquardt(data, fourParamEq, options)).toBeDeepCloseTo(
{
iterations: 0,
parameterError: NaN,
iterations: 10,
residuals: NaN,
parameterValues: [-64.298, 117.4022, -47.0851, -0.06148]
},
3
Expand Down
33 changes: 29 additions & 4 deletions src/__tests__/errorCalculation.js
Original file line number Diff line number Diff line change
@@ -1,10 +1,10 @@
import errorCalculation from '../errorCalculation';
import * as errCalc from '../errorCalculation';

function sinFunction([a, b]) {
return (t) => a * Math.sin(b * t);
}

describe('errorCalculation test', () => {
describe('sum of residuals', () => {
it('Simple case', () => {
Contributor (@jacobq, Oct 19, 2018):

Could we simplify this test by making the sampleFunction always evaluate to a constant value (perhaps 0)? I don't see why it should matter whether we use sin or any other particular function since sumOfResiduals should be purely a function of the difference between the function's output and the corresponding value in the set of data points. e.g. something like this:

const sampleFunction = ([k]) => (x => k); // resulting function returns constant
const n = 3;
const xs = new Array(n).fill(0).map((zero, i) => i); // [0, 1, ..., n-1] but doesn't matter because f(x; [k]) -> k
const data = {
  x: xs,
  y: xs.map(sampleFunction([0])) // all zeros
};

expect(errCalc.sumOfResiduals(data, [0], sampleFunction)).toBeCloseTo(0, 1);
expect(errCalc.sumOfResiduals(data, [1], sampleFunction)).toBeCloseTo(n, 1);
expect(errCalc.sumOfResiduals(data, [2], sampleFunction)).toBeCloseTo(2*n, 1);

expect(errCalc.sumOfSquaredResiduals(data, [0], sampleFunction)).toBeCloseTo(0, 1);
expect(errCalc.sumOfSquaredResiduals(data, [1], sampleFunction)).toBeCloseTo(n, 1);
expect(errCalc.sumOfSquaredResiduals(data, [2], sampleFunction)).toBeCloseTo(4 * n, 1);

Author:

Yes, a simpler function could be used.

Contributor:

I realize now that all your change really does here is update the name. The simplification of the test function is outside the scope of this PR. I've attempted to address this in #26, so after that merges this PR can be rebased to resolve this.

const len = 20;
let data = {
Expand All @@ -17,7 +17,32 @@ describe('errorCalculation test', () => {
data.y[i] = sampleFunction(i);
}

expect(errorCalculation(data, [2, 2], sinFunction)).toBeCloseTo(0, 3);
expect(errorCalculation(data, [4, 4], sinFunction)).toBeCloseTo(48.7, 1);
expect(errCalc.sumOfResiduals(data, [2, 2], sinFunction)).toBeCloseTo(0, 5);
expect(errCalc.sumOfResiduals(data, [4, 4], sinFunction)).toBeCloseTo(48.7, 1);

expect(errCalc.sumOfSquaredResiduals(data, [2, 2], sinFunction)).toBeCloseTo(0, 5);
expect(errCalc.sumOfSquaredResiduals(data, [4, 4], sinFunction)).toBeCloseTo(165.6, 1);
});
});

describe('error propagation estimator', () => {
Contributor:

@m93a Could you please help me understand these test values? When I look at them it is not at all obvious to me what the correct values should be. (Partly because I am not sure how errorPropagation works other than that it seems to take pseudo-random steps around a point and estimate the slope there)

Author:

The errorPropagation method tries to approximate the propagation of uncertainty through the function, given a variable with known error. Since we know the mean, as well as the standard deviation of the variable, we can assume it's normally distributed and choose equiprobable intervals where the variable's measured value might actually be.

This means that if we slice the interval of all the variable's possible values into, e.g., 10 equiprobable intervals, the measured value is equally likely to fall in the first interval as in the second, third, etc. For practical purposes, I don't work with all of the possible values; I only use values between x̅ − 2.25σ and x̅ + 2.25σ. (There's a 97.6 % probability that the measured value lies between those bounds.) This process is not pseudo-random but completely deterministic – the intervals are computed using the CDF of the normal distribution.

Then, the functional values on the boundaries of the intervals are computed. This way the function can be approximated as a linear function on each interval. When a variable “passes through” a linear function, its standard error gets “stretched” (ie. multiplied) by the slope of the function. If a variable has the same probability of passing through multiple linear functions, its standard error (on average) gets stretched by the average of the slopes of those linear functions.
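The slope-averaging step described above can be sketched in plain JavaScript. This is a deliberate simplification: uniformly spaced sample points stand in for the equiprobable inverse-CDF stops (so with no `NaN` gaps the average collapses to a secant slope), and `propagateSigma` is an illustrative name, not part of the package:

```javascript
// Simplified sketch of the slope-averaging idea: sample fn on
// [x - 2.25*sigma, x + 2.25*sigma], average the interval slopes,
// and stretch the input sigma by that average slope.
function propagateSigma(fn, x, sigma, steps = 100) {
  const lo = x - 2.25 * sigma;
  const h = (4.5 * sigma) / steps;
  let slopeSum = 0;
  let count = 0;
  let xPrev = lo;
  let yPrev = fn(xPrev);
  for (let i = 1; i <= steps; i++) {
    const xNext = lo + i * h;
    const yNext = fn(xNext);
    // Skip intervals where fn is undefined (returns NaN)
    if (!Number.isNaN(yNext) && !Number.isNaN(yPrev)) {
      slopeSum += (yNext - yPrev) / (xNext - xPrev);
      count++;
    }
    xPrev = xNext;
    yPrev = yNext;
  }
  // Output sigma = input sigma stretched by the average slope
  return Math.abs((slopeSum / count) * sigma);
}

// For a small sigma this approaches |fn'(x)| * sigma:
console.log(propagateSigma(Math.sin, 0, 0.001)); // ~0.001, since cos(0) = 1
```

For a linear function the result is exact; for curved functions it is an approximation that improves as sigma shrinks.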

Of course the error distribution of the functional value most probably won't be a Gauss curve, even if the input variable was normally distributed in the first place. But as our fitting algorithm can't deal with those either (ie. it's not robust), it's not a big deal. We can simply pretend that all of the resulting distributions are close enough to the normal distribution (which is mostly true as long as the errors are small).

(An uncanny fact as a bonus: if the error is large enough and the function is very unsymmetrical, the mean of the functional value won't be equal to the functional value of the mean. For now let's just pretend those cases never occur.)

Now about the test values. You're right that the values are not exactly obvious. The results should match the standard deviation of normally distributed data that passed through the function. One could use Monte Carlo to get the right value.

// Pseudo-code to validate the results of errorPropagation
const mean = 42;
const sigma = 0.1;
let data = new Array(BILLIONS_AND_BILLIONS).fill(0);
data = data.map(() => normal.random(mean, sigma));
data = data.map( x => fn(x) );
const propagatedSigma = stdev(data);

expect(errCalc.errorPropagation(fn, mean, sigma))
  .toBeCloseTo(propagatedSigma, DEPENDS_ON_NUMBER_OF_INTERVALS);

Wow, this was really time-consuming, I'm taking a break for today. I've got other things to do 😉
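That validation idea can be fleshed out into a runnable sketch. Hedge: `randNormal` and `stdev` are hand-rolled stand-ins for a proper stats library, the sample count is scaled down from "billions and billions", and `Math.sin` plays the role of the fitted function:

```javascript
// Runnable Monte Carlo version of the validation pseudo-code above:
// push normally distributed samples through fn, then measure the
// standard deviation of the outputs.
function randNormal(mean, sigma) {
  // Box-Muller transform; 1 - Math.random() avoids log(0)
  const u1 = 1 - Math.random();
  const u2 = Math.random();
  return mean + sigma * Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

function stdev(xs) {
  const m = xs.reduce((a, b) => a + b, 0) / xs.length;
  return Math.sqrt(xs.reduce((a, x) => a + (x - m) ** 2, 0) / xs.length);
}

const fn = Math.sin;
const mean = 42;
const sigma = 0.1;
const data = Array.from({ length: 1e6 }, () => fn(randNormal(mean, sigma)));
const propagatedSigma = stdev(data);

// For a small sigma this should land near |cos(mean)| * sigma,
// the first-order (linearized) propagation estimate.
console.log(propagatedSigma);
```

This is what `errorPropagation(fn, mean, sigma)` approximates deterministically with far fewer function evaluations.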

Contributor:

Thanks for the detailed explanation! I plan to read through it a few times and attempt to make the test cases more expressive once I understand it.

it('Simple case', () => {
const q = Math.PI / 2;

errCalc.initializeErrorPropagation(10);
expect(errCalc.errorPropagation(sinFunction([4, 4]), 0, 0.001)).toBeCloseTo(0.016, 5);
expect(errCalc.errorPropagation(sinFunction([1, 1]), q, 0.010)).toBeCloseTo(0.000, 5);

errCalc.initializeErrorPropagation(50);
expect(errCalc.errorPropagation(sinFunction([4, 4]), 0, 0.25)).toBeCloseTo(2.568135, 1);

errCalc.initializeErrorPropagation(100);
expect(errCalc.errorPropagation(sinFunction([4, 4]), 0, 0.25)).toBeCloseTo(2.568135, 2);

errCalc.initializeErrorPropagation(1000);
expect(errCalc.errorPropagation(sinFunction([4, 4]), 0, 0.25)).toBeCloseTo(2.568135, 4);

errCalc.initializeErrorPropagation(10000);
expect(errCalc.errorPropagation(sinFunction([4, 4]), 0, 0.25)).toBeCloseTo(2.568135, 6);
});
});
55 changes: 47 additions & 8 deletions src/__tests__/test.js
Original file line number Diff line number Diff line change
Expand Up @@ -21,13 +21,13 @@ describe('levenberg-marquardt test', () => {
initialValues: [3, 3]
};

const { parameterValues, parameterError } = levenbergMarquardt(
const { parameterValues, residuals } = levenbergMarquardt(
data,
sinFunction,
options
);
expect(parameterValues).toBeDeepCloseTo([2, 2], 3);
expect(parameterError).toBeCloseTo(0, 2);
expect(residuals).toBeCloseTo(0, 2);
});

it('Exceptions', () => {
Expand All @@ -36,20 +36,20 @@ describe('levenberg-marquardt test', () => {
initialValues: [3, 3]
};

expect(() => levenbergMarquardt()).toThrow(
expect(() => levenbergMarquardt({}, sinFunction, { damping: -1 })).toThrow(
'The damping option must be a positive number'
);
expect(() => levenbergMarquardt([1, 2], sinFunction, options)).toThrow(
'The data parameter must have x and y elements'
'The data object must have x and y elements'
);
expect(() =>
levenbergMarquardt({ x: 1, y: 2 }, sinFunction, options)
).toThrow(
'The data parameter elements must be an array with more than 2 points'
'The data must have more than 2 points'
);
expect(() =>
levenbergMarquardt({ x: [1, 2], y: [1, 2, 3] }, sinFunction, options)
).toThrow('The data parameter elements must have the same size');
).toThrow('The data object must have equal number of x and y coordinates');
expect(() =>
levenbergMarquardt({ x: [1, 2], y: [1, 2] }, sumOfLorentzians, {
damping: 0.1,
Expand All @@ -75,13 +75,13 @@ describe('levenberg-marquardt test', () => {
maxIterations: 200
};

let { parameterValues, parameterError } = levenbergMarquardt(
let { parameterValues, residuals } = levenbergMarquardt(
data,
sigmoidFunction,
options
);
expect(parameterValues).toBeDeepCloseTo([2, 2, 2], 1);
expect(parameterError).toBeCloseTo(0, 1);
expect(residuals).toBeCloseTo(0, 1);
});

it('Sum of lorentzians example', () => {
Expand Down Expand Up @@ -109,6 +109,45 @@ describe('levenberg-marquardt test', () => {
});
});

it('Legacy', () => {
const len = 20;
let data = {
x: new Array(len),
y: new Array(len)
};
let sampleFunction = sinFunction([2, 2]);
for (let i = 0; i < len; i++) {
data.x[i] = i;
data.y[i] = sampleFunction(i);
}
const options = {
damping: 0.1,
initialValues: [3, 3]
};
const options1 = {
...options,
residualEpsilon: 1
};
const options2 = {
...options,
errorTolerance: 1
};

const result1 = levenbergMarquardt(
data,
sinFunction,
options1
);
const result2 = levenbergMarquardt(
data,
sinFunction,
options2
);

expect(result1.parameterValues).toBeDeepCloseTo(result2.parameterValues, 0);
expect(result1.residuals).toBe(result1.parameterError);
});

function sinFunction([a, b]) {
return (t) => a * Math.sin(b * t);
}
Expand Down
145 changes: 137 additions & 8 deletions src/errorCalculation.js
Original file line number Diff line number Diff line change
@@ -1,22 +1,151 @@
import nd from 'norm-dist';

/**
* @private
* @param {number} x
* @return {number}
*/
const sq = (x) => x * x;
Contributor:

FYI, this isn't really any faster than Math.pow(x, 2), which has been part of the ES standard since the early days. (You can also use x ** 2 if you don't care about IE.)

Author:

I doubt that the code in the perf actually does anything. Since you're effectively throwing away the result right after the computation, the interpreter probably optimizes that part out and just iterates i (probably by more than 1).

You're probably right about this function not adding very much to performance. If you don't like it, just remove it, but it seems quite readable to me.

Contributor (@jacobq, Oct 23, 2018):

I guess what I'm trying to say is that if you're doing it for performance reasons then it's probably better to remove it. If you think it is more readable I can see why that might be a good reason. However, I think it's hard to argue that a custom function that is identical to a built-in one is easier to read (because one has to look it up to see what it does rather than just recognize it), though in this case it matters little either way since the function is so simple.

You are probably right about jsPerf...sometimes it's too easy to fool oneself. Here's an updated example adapted to reduce the likelihood of this kind of mistake. It still shows less than 1% difference between the two methods of interest.
[screenshot: updated jsPerf results]

Author:

You're right, the performance differences are negligible. I would choose x ** 2 then, as it seems to have the best readability. And when we move to TypeScript, it will be compiled to Math.pow so we won't lose compatibility with old browsers.
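A sketch of the kind of micro-benchmark being discussed, written to dodge the dead-code-elimination pitfall: the result is accumulated and used, so the engine cannot discard the squaring. This assumes Node (`process.hrtime`), and exact timings will vary by engine and run:

```javascript
// Micro-benchmark sketch: accumulate into `acc` so the JIT cannot
// eliminate the squaring as dead code. Node-specific (process.hrtime).
function bench(label, square, n = 1e7) {
  let acc = 0;
  const t0 = process.hrtime.bigint();
  for (let i = 0; i < n; i++) acc += square(i);
  const t1 = process.hrtime.bigint();
  console.log(label, Number(t1 - t0) / 1e6, 'ms');
  return acc;
}

const a = bench('x * x        ', (x) => x * x);
const b = bench('Math.pow(x,2)', (x) => Math.pow(x, 2));
const c = bench('x ** 2       ', (x) => x ** 2);

// The three sums should agree to rounding, and on modern engines the
// timings are typically within a few percent of each other.
```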


/**
* @private
* @param {...number} n
* @return {boolean}
*/
const isNaN = (...n) => n.some((x) => Number.isNaN(x));

/**
* Compute equiprobable ranges in normal distribution
* @private
* @param {number} iterations - How many evaluations (per point per step) of the fitted function to use to approximate the error propagation through it
*/
export function initializeErrorPropagation(iterations) {
if (equiprobableStops.length === iterations) return;

const min = nd.cdf(-2.25);
const max = 1 - min;
const step = (max - min) / (iterations - 1);

for (var i = 0, x = min; i < iterations; i++, x += step) {
equiprobableStops[i] = nd.icdf(x);
}
}

/** @type {number[]} */
const equiprobableStops = [];

/**
* Estimate error propagation through a function
* @private
* @param {(x: number) => number} fn - The function to approximate; outside of the function's domain it must return `NaN`
* @param {number} x - The error propagation will be approximated in the neighbourhood of this point
* @param {number} xSigma - The standard deviation of `x`
* @return {number} The estimated standard deviation of `fn(x)`
*/
export function errorPropagation(fn, x, xSigma) {
const stopCount = equiprobableStops.length;

var slope = 0;
var N = 0;

var xLast = x + xSigma * equiprobableStops[0];
var yLast = fn(xLast);

for (var stop = 1; stop < stopCount; stop++) {
var xNew = x + xSigma * equiprobableStops[stop];
var yNew = fn(xNew);

if (!isNaN(xNew, yNew, xLast, yLast)) {
slope += (yNew - yLast) / (xNew - xLast);
N++;
}

xLast = xNew;
yLast = yNew;
}

const avgSlope = slope / N;

return Math.abs(avgSlope * xSigma);
}

/**
* Calculate current error
* Approximate errors in point location and calculate point weights
* @ignore
* @param {{x:Array<number>, y:Array<number>, xError:Array<number>|void, yError:Array<number>|void}} data - Array of points to fit in the format [x1, x2, ... ], [y1, y2, ... ]
* @param {Array<number>} params - Array of parameter values
* @param {(n: number[]) => (x: number) => number} paramFunction - Takes the parameters and returns a function with the independent variable as a parameter
* @return {Array<number>} Array of point weights
*/
export function pointWeights(data, params, paramFunction) {
const m = data.x.length;

/** @type {Array<number>} */
const errs = new Array(m);

if (!data.xError) {
if (!data.yError) {
return errs.fill(1);
} else {
errs.splice(0, m, ...data.yError);
}
} else {
const fn = paramFunction(params);
var point;

for (point = 0; point < m; point++) {
errs[point] = errorPropagation(fn, data.x[point], data.xError[point]);
}

if (data.yError) {
for (point = 0; point < m; point++) {
errs[point] = Math.sqrt(sq(errs[point]) + sq(data.yError[point]));
}
}
}

// Point weight is the reciprocal of its error
for (point = 0; point < m; point++) {
errs[point] = 1 / errs[point];
}

return errs;
}

/**
* Calculate the current sum of residuals
* @ignore
* @param {{x:Array<number>, y:Array<number>}} data - Array of points to fit in the format [x1, x2, ... ], [y1, y2, ... ]
* @param {Array<number>} parameters - Array of current parameter values
* @param {function} parameterizedFunction - The parameters and returns a function with the independent variable as a parameter
* @param {(...n: number[]) => (x: number) => number} paramFunction - Takes the parameters and returns a function with the independent variable as a parameter
* @return {number}
*/
export default function errorCalculation(
data,
parameters,
parameterizedFunction
) {
export function sumOfResiduals(data, parameters, paramFunction) {
var error = 0;
const func = parameterizedFunction(parameters);
const func = paramFunction(parameters);

for (var i = 0; i < data.x.length; i++) {
error += Math.abs(data.y[i] - func(data.x[i]));
}

return error;
}

/**
* Calculate the current sum of squares of residuals
* @ignore
* @param {{x:Array<number>, y:Array<number>}} data - Array of points to fit in the format [x1, x2, ... ], [y1, y2, ... ]
* @param {Array<number>} parameters - Array of current parameter values
* @param {(...n: number[]) => (x: number) => number} paramFunction - Takes the parameters and returns a function with the independent variable as a parameter
* @return {number}
*/
export function sumOfSquaredResiduals(data, parameters, paramFunction) {
var error = 0;
const func = paramFunction(parameters);

for (var i = 0; i < data.x.length; i++) {
error += sq(data.y[i] - func(data.x[i]));
}

return error;
}
Loading