Linear Least Squares to Solve Nonlinear Problems

Ever wondered how Excel comes up with those neat trendlines? Here's the theory so you can model your data however you like! #SoME1
Comments

"When I see a variable in an exponent, I try to use logarithm as a ladder so that I can bring them off their shelf" is a poetically quote-worthy sentence.

oceannuclear

Minimizing the sum of squares is not equivalent to minimizing the sum of absolute deviations. This is easiest to see if you try to just fit a single constant c to the data, i.e. minimize sum(|x-c|) vs sum((x-c)^2). In the former case, you get the median, whereas in the latter case, you get the mean. Generalized to curve-fitting, minimizing the sum of absolute deviations is called "least absolute deviation" fitting, which is different from "least squares". (Statistically, "least absolute deviation" can be interpreted as assuming that the errors are Laplace-distributed, while "least squares" can be interpreted as assuming that the errors are normally distributed.)
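
[Editor's note: a quick numerical check of the mean-vs-median claim, as a minimal Python sketch; the sample data here is made up for illustration.]

import numpy as np

data = np.array([1.0, 2.0, 3.0, 4.0, 20.0])

# Brute-force both losses over a grid of candidate constants c
cs = np.linspace(0.0, 20.0, 20001)
sq = ((data[None, :] - cs[:, None]) ** 2).sum(axis=1)
ad = np.abs(data[None, :] - cs[:, None]).sum(axis=1)

print(cs[sq.argmin()], data.mean())      # least squares minimizer -> mean (6.0)
print(cs[ad.argmin()], np.median(data))  # least absolute deviations -> median (3.0)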

tailcalled

The method of
raw data -> manipulation to make linear -> least squares fit -> post-analysis to recover actual fit parameters

is something that I've used several times, and it's a life saver every time. However, it's important to note that you're no longer minimizing the squared deviation of the raw data, and the errors can end up unequally weighted across data points.

As an example, an exponential function can be made linear through a logarithm, such that y = Ae^(Bx) becomes ln(y) = ln(A) + Bx. Fitting this line to the data using least squares will minimize the squared deviation between ln(y) and ln(data). The result is that larger data points are relatively less important to the fit than smaller data points. Say you have the points (1, 2) and (10, 200) in your data set and your least squares fit gives you the points (1, 1) and (10, 100) on the best fit line. The x=1 point has a real squared deviation of 1, and the x=10 point has a squared deviation of 10,000. However, the deviation used in the manipulated least squares fit is on ln(y), which gives the x=1 point a squared deviation of ln(2)^2 ≈ 0.48 and the x=10 point a squared deviation of ln(200/100)^2 ≈ 0.48: the same weighting. In this case the error is weighted as fractional error, and since both points are 2x the fit line, they have the same error as far as the least squares fitting is concerned.

This weighted error fitting can be desirable or not, depending on your use case. Just something I've noticed through use, and thought it might be useful to someone else. :)
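
[Editor's note: a minimal numpy sketch of this linearize-then-fit workflow; the synthetic data is chosen just to show the fractional-error weighting described above.]

import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 20)
y = 2.0 * np.exp(0.8 * x) * np.exp(rng.normal(0.0, 0.05, x.size))  # multiplicative noise

# Linearize y = A e^(Bx) as ln(y) = ln(A) + B x, then do an ordinary line fit
B, lnA = np.polyfit(x, np.log(y), 1)
A = np.exp(lnA)
print(A, B)  # roughly (2.0, 0.8)

# The fit minimized residuals in ln(y), i.e. fractional errors in y:
print(np.log(y) - (lnA + B * x))  # comparable in size across all x
print(y - A * np.exp(B * x))      # raw residuals grow with y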

Simonsays

Aside from the flaw others have already pointed out, it's a really well-made video, and it broadened my horizons for the least squares method, which so far I had only applied to lines 👍🏼

timdernedde

The correct motivation for least squares is the Gaussian error model. The probability of an error e goes like exp(-C e^2), and so the total probability density for all the errors is the product of these exponentials, or exp(-weighted sum of squares). Minimizing the squared deviation is therefore the same as maximizing the probability of the data given your model, i.e. maximum likelihood; with a flat prior, this is also the Bayesian rule for finding the most probable parameter values.
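
[Editor's note: a small Python sanity check of this equivalence, assuming unit-variance Gaussian errors; scipy's norm is used only for the log-density, and the residual values are hypothetical.]

import numpy as np
from scipy.stats import norm

resid = np.array([0.3, -1.2, 0.7, 0.1])  # hypothetical model errors, sigma = 1

# -log of prod(exp(-e^2/2)/sqrt(2*pi)) is half the sum of squares plus a constant,
# so maximizing the probability of the data = minimizing the sum of squares
nll = -norm.logpdf(resid).sum()
print(nll)
print(0.5 * (resid ** 2).sum() + 0.5 * resid.size * np.log(2.0 * np.pi))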

annaclarafenyo

Great video. Essentially it's taking early statistics formulas and changing the meaning of the operations to fit a new context; for example, the error is effectively a variance. There is always beauty in expanding the usability of the tools we already have. There is a great spirograph video someone came out with recently that absolutely blew my mind.

chuckhammond

Very nice video. I'm personally teaching least squares regression to my colleagues with a similar approach. I agree with the comments, but don't take it personally. There is too much material on this subject for a 12-minute video, so I understand the need for some simplifications. That will give you a reason to do a part 2, where you'll be able to refine and go further. One thing I would like you to consider is to warn your audience about the danger of using the X-transpose-X form of the normal equations: with a lot of data points, numerical instabilities can occur. It could also interest your audience to see some examples with a specific tool; such tools have specialized functions for least squares regression which are not as well known as you might think. Another topic you could add to your list is the uncertainty estimation of the fitted parameters. This is often neglected, and it is very important to have an idea of how well you can know the parameters. I'm encouraging you to continue. You have a gold mine in your hands. Good luck!
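
[Editor's note: a numpy sketch illustrating the X-transpose-X warning; the polynomial design matrix is just a convenient ill-conditioned example, not from the video.]

import numpy as np

x = np.linspace(0.0, 1.0, 50)
X = np.vander(x, 8)       # degree-7 polynomial design matrix, deliberately ill-conditioned
y = np.sin(2.0 * np.pi * x)

# Forming X^T X squares the condition number, amplifying round-off error
print(np.linalg.cond(X), np.linalg.cond(X.T @ X))

coef_svd, *_ = np.linalg.lstsq(X, y, rcond=None)  # SVD-based, numerically safer
coef_ne = np.linalg.solve(X.T @ X, X.T @ y)       # normal equations
print(np.abs(coef_svd - coef_ne).max())           # the two answers already disagree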

stephanel

Whatever you did here, it's beautiful. Very good explanation, and I hope to see more!

alihouadef

I really enjoyed this video, and it inspired me to write a little Python program to implement it. Thank you for sharing.

TheLuke

Just to echo the comment already made, minimizing squared deviations is NOT THE SAME as minimizing absolute deviations. Minimizing squared deviations provides an estimator for the mean of y given x. Minimizing absolute deviations provides an estimator for the median of y given x. Other properties also differ across approaches; for example, mean squared error is much more sensitive to outliers.
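
[Editor's note: a short Python sketch of the outlier sensitivity; the data and the Nelder-Mead minimization of the absolute loss are illustrative, and a dedicated quantile-regression routine would be used in practice.]

import numpy as np
from scipy.optimize import minimize

x = np.arange(6, dtype=float)
y = 2.0 * x + 1.0
y[-1] += 30.0  # one gross outlier

ls = np.polyfit(x, y, 1)  # least squares: slope dragged far above 2 by the outlier
lad = minimize(lambda p: np.abs(y - (p[0] * x + p[1])).sum(),
               x0=ls, method="Nelder-Mead").x  # least absolute deviations
print(ls)   # roughly (6.3, -4.7)
print(lad)  # close to the true (2.0, 1.0)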

matthewb

Thank you, this video helped me a lot. Also, really nice editing. Hoping to see more videos!

unraton

Thank you so much for explaining this. Cheers!

Numerically_Stable

It is a very useful video, thank you!

norbertbarna

At 3:47, the equations after setting the derivatives to 0 being linear in the coefficients m and b is a direct result of the expected function (here f(x) = mx + b) being linear in m and b. If the expected function were nonlinear in m or b, those equations would have been nonlinear too, e.g. for f(x) = m^2*x + mx + b or f(x) = e^(mx) + b.
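
[Editor's note: a minimal numpy illustration of this point; the data values are made up.]

import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])

# f(x) = m x + b is linear in (m, b), so the normal equations (X^T X) p = X^T y
# form a linear system that can be solved directly
X = np.column_stack([x, np.ones_like(x)])
m, b = np.linalg.solve(X.T @ X, X.T @ y)
print(m, b)
# For f(x) = e^(m x) + b the derivative conditions are nonlinear in m,
# and an iterative solver would be needed instead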

MayurGarg

Where was your video 3 months ago? I needed it so much then.
I was working on fitting COVID data with the SEIR model.
I hope you do more videos on that topic.

Xphy

Everyone getting nitpicky about "least absolute deviation" versus "least squares deviation" is missing the point, I think. Sure, he might have said that there's no difference, but the conceptual (and important) part as it relates to this video is that minimizing either one will minimize the error in some sense. For the general audience this video is intended for, that is plenty of justification for accepting that least squares is valid.

Simonsays

Did the video do an abrupt cut at ~ 10:32?

ivolol

So, what are you doing with the varactors?

fletcherreder

I KNEW I HAD HEARD THIS VOICE ALREADY!! Are you quantum boy? :))

ohanabergerdesouza

So is it possible to use this for parameter estimation of constants in ODE models, where numerical integration is used instead of a functional relation?
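
[Editor's note: one common approach, sketched in Python with scipy. Everything here, the decay model and the constants, is a hypothetical example, and note this is nonlinear least squares with the integrator inside the loop rather than the linearization trick from the video.]

import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

rng = np.random.default_rng(2)
t_obs = np.linspace(0.0, 4.0, 20)
y_obs = 5.0 * np.exp(-0.7 * t_obs) + rng.normal(0.0, 0.05, t_obs.size)  # synthetic data

def residuals(params):
    k, y0 = params
    # Integrate y' = -k y numerically instead of using a closed-form solution
    sol = solve_ivp(lambda t, y: -k * y, (0.0, 4.0), [y0], t_eval=t_obs)
    return sol.y[0] - y_obs

fit = least_squares(residuals, x0=[1.0, 4.0])
print(fit.x)  # approximately (0.7, 5.0)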

jamespeter