
Errors in variables? #17

Open
m93a opened this issue Sep 4, 2018 · 6 comments · May be fixed by #19

Comments

@m93a

m93a commented Sep 4, 2018

I need to fit data points with errors in both variables. Does a "weighted" version of this algorithm exist? Would it be viable to implement it? I've spent hours searching for an implementation of something like that (in any understandable language) but had little success.

@maasencioh
Member

Hello @m93a, what do you mean by "weighted"?

@m93a
Author

m93a commented Sep 4, 2018

I mean weighted as in "Weighted Total Least Squares", which apart from values for x and y needs their standard deviations. When computing the regression, it gives more weight to data points with lower error (standard deviation) and vice versa. With TLS, the modification to the weighted form is as simple as multiplying the residuals by the reciprocal of error. But I don't understand the technical aspects of the Levenberg-Marquardt algorithm so I don't know whether something like this is also possible.

@m93a
Author

m93a commented Sep 6, 2018

Okay, maybe it's not called Weighted Total Least Squares, my bad... 😀
I'm a layman in statistics – otherwise I'd have implemented an error-in-variables version of LM myself – but as far as my knowledge goes, your algorithm tries to minimize the sum of squares of non-weighted residuals:

      χ² = Σ ( yᵢ − f(xᵢ, β) )²

where xᵢ and yᵢ are the data points, f is the fitted function and β is the vector of parameters. This is fine as long as you have very precise measurements of xᵢ and yᵢ or, at least, if their error stays constant regardless of i. The more general approach, however, is to minimize the sum of squares of weighted residuals:

               ( yᵢ − f(xᵢ, β) )²
      χ² = Σ ————————————————————————
              σ(yᵢ)² + σ( f(xᵢ, β) )²

where σ(yᵢ) is the error in yᵢ, and σ( f(xᵢ, β) ) is the error of the fitted function due to error in xᵢ, which can be approximated as:

                        ∂f
      σ( f(xᵢ, β) ) = ( —— )(xᵢ, β) * σ(xᵢ)
                        ∂x

.
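To make this concrete, here is a rough JavaScript sketch of such a weighted merit function; every name in it is made up for illustration (none of this is your library's API) and the derivative ∂f/∂x is estimated numerically:

    // Rough sketch of the weighted merit function above. All names are
    // illustrative, not library API; df/dx is a central finite difference.
    function weightedChiSquared(x, y, sigmaX, sigmaY, f, beta) {
      let chi2 = 0;
      for (let i = 0; i < x.length; i++) {
        const residual = y[i] - f(x[i], beta);
        // propagate the error in x through f: sigma(f) ≈ (df/dx)(x_i) * sigma(x_i)
        const h = 1e-8 * (Math.abs(x[i]) + 1);
        const dfdx = (f(x[i] + h, beta) - f(x[i] - h, beta)) / (2 * h);
        const variance = sigmaY[i] ** 2 + (dfdx * sigmaX[i]) ** 2;
        chi2 += residual ** 2 / variance;
      }
      return chi2;
    }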

However, modifying the algorithm to account for errors isn't as simple as changing the merit function, is it? (The merit function is the part which calculates the sum of squared residuals.)

Could you please point me to the place where the merit function is defined in your code? And would you mind sharing some of your insight into the problem? You probably understand fitting algorithms way better than I do. Thanks!

@jobo322
Member

jobo322 commented Sep 6, 2018

Hello @m93a, I'm interested in your approach and I'm trying to understand your explanation. What you call the merit function is embedded in the step function. There is a function called gradientFunction, which builds the gradient matrix used to compute the new candidates, and matrixFunction returns the difference matrix between the data to fit (the "real" data) and the result of func(xᵢ, β), where β is the vector of current parameters. Between those two functions you should be able to find the place to implement your idea.
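In rough JavaScript, the structure looks something like this (only a sketch; the per-point division by an error is the hypothetical modification, not something the library does today):

    // Sketch of the difference (residual) computation described above.
    // `data` holds the x and y arrays, `paramFunction(beta)` returns the
    // fitted function for the current parameters.
    function residualMatrix(data, paramFunction, beta, dataErrors) {
      const func = paramFunction(beta);
      const result = new Array(data.x.length);
      for (let point = 0; point < data.x.length; point++) {
        result[point] = data.y[point] - func(data.x[point]);
        if (dataErrors) {
          result[point] /= dataErrors[point]; // hypothetical weighting by the reciprocal error
        }
      }
      return result;
    }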

@andcastillo

andcastillo commented Sep 6, 2018

Hello there,

Indeed, the LM algorithm should minimize the sum of weighted squared residuals. I remember that in the original implementation there was a parameter called weight, a weighting vector for the least-squares fit (weight >= 0): https://github.com/mljs/curve-fitting
This does not actually solve your whole problem, because the LM algorithm considers the independent variable to be free of error, i.e. σ( f(xᵢ, β) ) = 0. Then, as far as I understand, you can deal with the error σ(yᵢ) by means of the weight vector, but somehow also with the error in xᵢ, because yᵢ = g(xᵢ) is an implicit function of xᵢ.
Now, this library lacks an implementation of the weights, but adding it is almost trivial given the old code in the curve-fitting project, so perhaps you could do it.
As I said, it could be a good approximation to the solution. If not, you will need to change LM to work with a function h(i) = <y(i), x(i)>, which is not supported by any implementation that I currently know of.
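Schematically, such a weight vector would enter the sum of squares roughly like this (just a sketch for illustration, not the actual ml-curve-fitting code):

    // Sketch: a non-negative weight vector applied to the squared residuals.
    function weightedSumOfSquares(residuals, weight) {
      let sum = 0;
      for (let i = 0; i < residuals.length; i++) {
        sum += weight[i] * residuals[i] ** 2; // weight[i] >= 0
      }
      return sum;
    }

Choosing weight[i] = 1 / σ(yᵢ)² would account for the error in yᵢ in the sense described above.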

@m93a
Author

m93a commented Sep 7, 2018

Hey again,
Thank you for your comments and tips!

I've looked at the old code of ml-curve-fitting, as @andcastillo suggested, and found the merit function very quickly. However, when I tried to understand the context, I found the code quite hard to follow, as it lacked types, segmentation into sub-functions and descriptive names.

The code in this repo, on the other hand, is rather delightful and bears little resemblance to ml-curve-fitting (probably for the better); however, I failed to find a part that would resemble the merit function, so I took @jobo322's advice and looked into the gradientFunction.

If I understand it correctly, the gradient function returns the gradient of residuals:

                  ⎛ ∂/∂β₁ ⎞   ⎛ y₁ − f(x₁, β) ⎞
                  ⎜ ∂/∂β₂ ⎟   ⎜ y₂ − f(x₂, β) ⎟
      G = ∇ ⊗ R = ⎜  ...  ⎟ ⊗ ⎜      ...      ⎟
                  ⎝ ∂/∂βₘ ⎠   ⎝ yₙ − f(xₙ, β) ⎠

where R is the vector of residuals, ∂/∂βᵢ is the partial derivative with respect to parameter βᵢ, and a ⊗ b is the outer product of two vectors. If the gradient G is used to minimize the norm of R, simply weighting every entry of R by the reciprocal of the error should produce the correct result, right? In code that would mean adding something like ans[param][point] /= dataErrors[point] here.
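Spelled out, the change I have in mind would look roughly like this; the surrounding structure is only my guess at how the gradient computation is organised, and dataErrors is a hypothetical new argument:

    // Sketch of a gradient computation with the proposed weighting. The
    // structure is only a guess, and `dataErrors` is a hypothetical argument.
    function weightedGradient(data, paramFunction, params, gradientDifference, dataErrors) {
      const func = paramFunction(params);
      const ans = new Array(params.length);
      for (let param = 0; param < params.length; param++) {
        const shiftedParams = params.slice();
        shiftedParams[param] += gradientDifference;
        const shiftedFunc = paramFunction(shiftedParams);
        ans[param] = new Array(data.x.length);
        for (let point = 0; point < data.x.length; point++) {
          // forward-difference estimate of ∂( yᵢ − f(xᵢ, β) ) / ∂β_param
          ans[param][point] =
            (func(data.x[point]) - shiftedFunc(data.x[point])) / gradientDifference;
          ans[param][point] /= dataErrors[point]; // the proposed weighting
        }
      }
      return ans;
    }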

Now there's the problem with determining the error… In the book Numerical Recipes in FORTRAN 77 (page 660 in book, page 690 in pdf) they say that the statistically correct way to fit a straight line to data with errors in variables is to minimize this value:

              ( yᵢ − a − bxᵢ )²
      χ² = Σ ———————————————————
              σ(yᵢ)² + b²σ(xᵢ)²

This exactly corresponds to the equation I've given in my previous comment if you substitute a + bx for f(x). I don't have the knowledge to verify that the book is actually correct, but I strongly believe that it is. This would mean that my equation for the weights is presumably correct, and the only thing left is to determine the value of σ( f(xᵢ, β) ) from xᵢ and σ(xᵢ).

Since the equation σ(f(xᵢ)) ≈ (∂f/∂x)(xᵢ) * σ(xᵢ) applies only for small σ(xᵢ) and I don't want to risk hitting a point where f(xᵢ) is locally very flat but gets very steep as you move to f(xᵢ + σ), I'll try to solve it in a different way:

We assume that the actual value (let's call it ξᵢ) which was measured as xᵢ lies somewhere around it, with the probability given by the normal distribution with mean xᵢ and standard deviation σᵢ. This means that ξᵢ lies in the range between xᵢ − 3σᵢ and xᵢ + 3σᵢ with a probability of 99.7 %. I propose that we divide this interval into N regions, so that each region has the same probability of containing ξᵢ. Now, if we evaluate f at the edge of each region, we should be able to use the basic formula for the standard deviation to approximate σ( f(xᵢ) ) in the neighbourhood of xᵢ (and hopefully ξᵢ).

(Edit: In the end I averaged the "average slopes" of the equiprobable intervals and used the first equation (in a loose sense), because the formula for the standard deviation converged too slowly.)
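For completeness, this is roughly how I picture the σ( f(xᵢ) ) estimate from the equiprobable intervals; inverseNormalCdf (the standard normal quantile function) is an assumed helper, e.g. taken from a statistics library, and the rest is only a sketch:

    // Sketch of the sigma(f(x_i)) estimate from equiprobable intervals.
    // `inverseNormalCdf` is an assumed helper, not part of this library.
    function sigmaOfF(f, beta, xi, sigmaXi, inverseNormalCdf, N = 8) {
      // edges of N equiprobable regions covering ~99.7 % of N(x_i, sigma_i)
      const edges = [];
      for (let k = 0; k <= N; k++) {
        const p = 0.0015 + (0.997 * k) / N; // from 0.15 % to 99.85 %
        edges.push(xi + sigmaXi * inverseNormalCdf(p));
      }
      // average of the "average slopes" over the equiprobable intervals
      let slopeSum = 0;
      for (let k = 0; k < N; k++) {
        slopeSum += (f(edges[k + 1], beta) - f(edges[k], beta)) / (edges[k + 1] - edges[k]);
      }
      const averageSlope = slopeSum / N;
      return Math.abs(averageSlope) * sigmaXi; // sigma(f) ≈ |df/dx| * sigma(x)
    }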

Wow, that was a lot of maths… If you either don't understand something, think I got something wrong or want to approve my conclusions, feel encouraged to comment!

@m93a m93a linked a pull request Sep 8, 2018 that will close this issue