Wikipedia:Reference desk/Archives/Mathematics/2020 November 17

Mathematics desk
Welcome to the Wikipedia Mathematics Reference Desk Archives
The page you are currently viewing is a transcluded archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.


November 17


generalized curve fitting


If I have a bunch of points (x_i, y_i), with the x_i known exactly but the y_i subject to some error, and if I want to fit a straight line through the points, minimizing the square of each y error (either because I like the way Δy² takes the absolute value for me, or because I want outliers to have more weight), I can of course use the classic linear regression technique, which will actually spit out coefficients for me immediately, in closed form. Similarly, if I want something other than a straight-line fit, I can do things like taking log(y_i) before fitting, or do a polynomial regression.
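For concreteness, a minimal sketch of that closed-form straight-line case (the data values are invented and numpy is assumed to be available):

```python
import numpy as np

# made-up data: x known exactly, y noisy
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# classic least-squares line fit in closed form: design matrix for y = m*x + c
A = np.column_stack([x, np.ones_like(x)])
(m, c), *_ = np.linalg.lstsq(A, y, rcond=None)   # minimizes the sum of squared y errors
print(m, c)
```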

But what if I want something much more general? What if the y(x) function I'm trying to fit is arbitrary, perhaps with arbitrarily-many coefficients? What if I want to define my own error function, perhaps taking Δx into account as well? Finally, what if I don't insist on closed-form output, but am willing to search, to iterate?
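As one illustration of a hand-defined error that also charges for Δx (not from the thread; the linear model, the data, and the uncertainties sx, sy below are all invented, and numpy/scipy are assumed): treat the "true" x-values as extra unknowns, an errors-in-variables formulation, and let a general-purpose iterative minimizer search over everything.

```python
import numpy as np
from scipy.optimize import minimize

# made-up data; here both x and y are assumed to carry some error
x = np.array([0.0, 1.0, 2.1, 2.9, 4.0])
y = np.array([1.2, 3.1, 4.8, 7.2, 9.0])
sx, sy = 0.1, 0.3                 # assumed measurement uncertainties

def model(p, x):                  # any function of x and a parameter vector would do
    a, b = p
    return a * x + b

def cost(theta):
    # theta packs the model parameters plus one "true" x per data point
    p, u = theta[:2], theta[2:]
    return np.sum(((x - u) / sx) ** 2 + ((y - model(p, u)) / sy) ** 2)

theta0 = np.concatenate([[1.0, 0.0], x])   # crude starting guess
res = minimize(cost, theta0)               # iterative search (BFGS by default)
print("fitted parameters:", res.x[:2])
```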

I'm sure there are well-studied ways of doing this, I just don't know what they're called. I could try to write my own program to do the searching, but there are probably extant ones out there that already work well.

I know about linear programming but I'm looking for something more general than that, too, because I'm not interested in limiting myself to linear functions. —Steve Summit (talk) 03:08, 17 November 2020 (UTC)

A frequently occurring case of model fitting where the commonly taught methods do not work is that where the model to be fitted to is a curve defined by the sum of a number of exponential functions; even the sum of two exponentials is difficult if their growth rates are not far apart. Similarly, fitting to a sum of sinusoids is hard. For a whole class of model-fitting problems where different methods are needed, see Energy minimization. In model fitting you have a data set of pairs $(x_i, y_i)$ with $x_i \in X$ for some (further irrelevant) set $X$ and $y_i \in \mathbb{R}$. The model is defined by a parameter space $P$, typically (a subset of) $\mathbb{R}^n$, where $n$ is the number of parameters, and a function $f : P \times X \to \mathbb{R}$. We want to find a parametric vector $p \in P$ such that, ideally, all values $f(p, x_i)$ are close to $y_i$. The goodness of fit can be given by a weighted sum of squares

$$E(p) = \sum_i w_i \,\bigl(f(p, x_i) - y_i\bigr)^2$$

where the (non-negative) weights $w_i$ can be determined by any approach we fancy, one common method being to take $w_i$ to be the reciprocal of the expected variance in a measurement taken at $x_i$. After all this preliminary work, the problem has been reduced to an abstract optimization problem:
Minimize the value of $E(p)$ for $p \in P$.
This may seem too abstract, but unless we know something more specific about how the value of $E(p)$ depends on $p$ – as we do in linear regression problems, but which requires a very restricted class of models – there is not much better we can do. Now the tool chest of mathematical optimization can be applied; instead of aiming for the summit in the landscape, such as by hill climbing, we want to descend to the bottom of the deepest pit, but the mathematics is the same. Again, unless you can exploit some specific property of your class of models, in general you will have to use a computational iterative technique. In most cases we can analytically find the derivative of the function $E$ and use the conjugate gradient method.  --Lambiam 09:45, 17 November 2020 (UTC)
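A minimal sketch of that recipe (the exponential model $f$, the data, and the weights below are invented; scipy is assumed): define $E(p)$ and its analytic gradient, then hand both to a conjugate-gradient minimizer.

```python
import numpy as np
from scipy.optimize import minimize

# invented data and model: f(p, x) = p0 * exp(p1 * x); any differentiable model would do
x = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0])
y = np.array([1.0, 1.3, 1.8, 2.3, 3.1, 4.0, 5.4])
w = np.ones_like(y)                 # weights, e.g. 1 / expected variance at each x_i

def f(p, x):
    return p[0] * np.exp(p[1] * x)

def E(p):                           # weighted sum of squares from the text
    r = f(p, x) - y
    return np.sum(w * r**2)

def grad_E(p):                      # analytic derivative of E with respect to p
    r = f(p, x) - y
    dfdp0 = np.exp(p[1] * x)
    dfdp1 = p[0] * x * np.exp(p[1] * x)
    return np.array([np.sum(2 * w * r * dfdp0), np.sum(2 * w * r * dfdp1)])

res = minimize(E, x0=[1.0, 0.5], jac=grad_E, method="CG")   # conjugate gradient
print(res.x, res.fun)
```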
  • Every curve-fitting procedure consists (implicitly or explicitly) of two parts:
  1. Define a cost function that says how "good" a fit function is compared to the data
  2. Minimise that cost function
The first part is totally up to you, although of course there are standard ways to do it, and that choice will determine what results you get from the second step. Saying the second part has been well-studied is a major understatement; there are tons of algorithms that perform more or less well depending on certain general properties of the cost function (whether you know its Jacobian, whether it has a single minimum, whether it is convex, etc.).
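A minimal sketch of the two parts (the absolute-error cost and the data below are arbitrary choices for illustration; scipy is assumed), using a derivative-free minimizer so that no Jacobian is needed:

```python
import numpy as np
from scipy.optimize import minimize

# part 1: a custom cost function (here, sum of absolute errors of a line m*x + c)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

def cost(p):
    m, c = p
    return np.sum(np.abs(m * x + c - y))   # how "bad" this candidate line is

# part 2: minimise it; Nelder-Mead needs no derivative information at all
res = minimize(cost, x0=[0.0, 0.0], method="Nelder-Mead")
print(res.x)
```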
If you want to learn more about the topic, the keyword would be "optimization problem" but that article is rather short and the associated article mathematical optimization is IMO not very well-written. Gradient descent is a subset of the topic but the article is well-written and has lots of interesting refs.
If you are just looking at solving a given problem, you should just read the manual of whatever optimization library comes with your programming language of choice, learn barely enough about the subject to understand what the manual says, and pick whatever it tells you to pick. That is the math equivalent of taking whatever pills your doctor recommends rather than trying to form your own opinion about pharmacology: not intellectually rewarding, but incomparably quicker and safer. Tigraan (Click here to contact me) 10:40, 17 November 2020 (UTC)
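In Python, that "read the library manual" route might look like the following sketch (the exponential model and the data are invented; scipy's curve_fit is one such ready-made routine):

```python
import numpy as np
from scipy.optimize import curve_fit

# made-up data roughly following a * exp(b * x)
x = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0])
y = np.array([1.0, 1.3, 1.8, 2.3, 3.1, 4.0, 5.4])

def model(x, a, b):
    return a * np.exp(b * x)

popt, pcov = curve_fit(model, x, y, p0=[1.0, 0.5])   # iterative least squares under the hood
print("parameters:", popt)
print("rough 1-sigma errors:", np.sqrt(np.diag(pcov)))
```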
@Lambiam and Tigraan: Thanks for these tips. I would never have realized that "optimization" and "mathematical optimization" were the top-level terms to search on, but now I know. And I'm glad to find the pages on hill climbing and Tabu search, which make sense and are the kind of approaches I'd been thinking about. —Steve Summit (talk) 04:46, 18 November 2020 (UTC)