In statistics and image processing, to smooth a data set is to create an approximating function that attempts to capture important patterns in the data while leaving out noise or other fine-scale structures and rapid phenomena. In smoothing, the data points of a signal are modified so that individual points higher than the adjacent points (presumably because of noise) are reduced, and points lower than the adjacent points are increased, leading to a smoother signal. Smoothing can aid data analysis in two important ways: (1) it can extract more information from the data, as long as the assumption of smoothing is reasonable, and (2) it can provide analyses that are both flexible and robust. Many different algorithms are used in smoothing.
Smoothing may be distinguished from the related and partially overlapping concept of curve fitting in the following ways:
- curve fitting often involves the use of an explicit functional form for the result, whereas the immediate results of smoothing are the "smoothed" values, with no later use made of a functional form even if one exists;
- the aim of smoothing is to give a general idea of relatively slow changes of value, with little attention paid to closely matching the data values, while curve fitting concentrates on achieving as close a match as possible;
- smoothing methods often have an associated tuning parameter that controls the extent of smoothing, whereas curve fitting adjusts any number of parameters of the function to obtain the "best" fit.
However, the terminology used across applications is mixed. For example, an interpolating spline fits a smooth curve exactly through the given data points, and its use is sometimes called "smoothing".
In the case that the smoothed values can be written as a linear transformation of the observed values, the smoothing operation is known as a linear smoother; the matrix representing the transformation is known as a smoother matrix or hat matrix.
When the same weights are applied at every data point, applying such a matrix transformation amounts to a convolution, and the set of weights is called a convolution matrix or convolution kernel. For a simple series of data points (rather than a multi-dimensional image), the convolution kernel is a one-dimensional vector.
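The link between a smoother matrix and a one-dimensional convolution kernel can be illustrated with a minimal sketch. The function names (`smoother_matrix`, `apply_matrix`, `convolve_same`) are hypothetical, and the sketch assumes a symmetric kernel of odd length with zeros assumed beyond the ends of the series; other edge conventions give different values near the boundaries.

```python
def smoother_matrix(n, kernel):
    """Build an n-by-n matrix whose row i holds the kernel centred on point i.

    Assumption: weights that would fall outside the series are dropped,
    which is equivalent to padding the series with zeros.
    """
    half = len(kernel) // 2
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k, weight in enumerate(kernel):
            j = i + k - half
            if 0 <= j < n:
                matrix[i][j] = weight
    return matrix


def apply_matrix(matrix, values):
    """Multiply the smoother matrix by the vector of observed values."""
    return [sum(w * v for w, v in zip(row, values)) for row in matrix]


def convolve_same(values, kernel):
    """Slide the kernel along the series, giving one output per input point."""
    half = len(kernel) // 2
    n = len(values)
    result = []
    for i in range(n):
        total = 0.0
        for k, weight in enumerate(kernel):
            j = i + k - half
            if 0 <= j < n:
                total += weight * values[j]
        result.append(total)
    return result


observed = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [0.25, 0.5, 0.25]  # symmetric, so sliding and convolving coincide

by_matrix = apply_matrix(smoother_matrix(len(observed), kernel), observed)
by_kernel = convolve_same(observed, kernel)
```

Under these assumptions both routes produce the same smoothed values; the matrix form makes the linear transformation explicit, while the kernel form avoids storing the mostly zero matrix.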
One of the most common algorithms is the "moving average", often used to try to capture important trends in repeated statistical surveys. In image processing and computer vision, smoothing ideas are used in scale space representations. The simplest smoothing algorithm is the "rectangular" or "unweighted sliding-average smooth". This method replaces each point in the signal with the average of "m" adjacent points, where "m" is a positive integer called the "smooth width". Usually m is an odd number, so that the window is centred on the point being replaced. The triangular smooth is like the rectangular smooth except that it implements a weighted smoothing function, with weights that decrease linearly away from the centre point.
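The rectangular and triangular smooths described above can be sketched as follows. The function names are hypothetical, and the sketch assumes an odd smooth width and returns values only where the full window fits inside the signal, so the output is shorter than the input by m - 1 points; real implementations handle the edges in various ways.

```python
def weighted_smooth(signal, weights):
    """Replace each full window of the signal by its weighted average."""
    m = len(weights)
    total = sum(weights)
    return [
        sum(w * v for w, v in zip(weights, signal[i:i + m])) / total
        for i in range(len(signal) - m + 1)
    ]


def rectangular_smooth(signal, m):
    """Unweighted sliding average over m adjacent points (m odd)."""
    if m < 1 or m % 2 == 0:
        raise ValueError("smooth width m must be a positive odd integer")
    return weighted_smooth(signal, [1] * m)


def triangular_smooth(signal, m):
    """Sliding average with weights rising to the centre, e.g. 1, 2, 3, 2, 1."""
    if m < 1 or m % 2 == 0:
        raise ValueError("smooth width m must be a positive odd integer")
    half = m // 2
    weights = [half + 1 - abs(k - half) for k in range(m)]
    return weighted_smooth(signal, weights)


spike = [1, 2, 9, 2, 1]
rect = rectangular_smooth(spike, 3)
tri = triangular_smooth([0, 0, 9, 0, 0], 5)
```

In this sketch a larger m averages over more neighbours and so smooths more strongly, at the cost of flattening genuine narrow features such as the spike in the example.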
Some specific smoothing and filter types are:
- Additive smoothing
- Butterworth filter
- Digital filter
- Exponential smoothing: used to reduce irregularities (random fluctuations) in time series data, giving a clearer view of the true underlying behaviour of the series. It also provides an effective means of predicting future values of the time series (forecasting).
- Kalman filter
- Kernel smoother
- Kolmogorov–Zurbenko filter
- Laplacian smoothing
- Local regression, also known as "loess" or "lowess"
- Low-pass filter
- Moving average: a form of average which has been adjusted to allow for seasonal or cyclical components of a time series. Moving average smoothing is used to make the long-term trends of a time series clearer.
- Ramer–Douglas–Peucker algorithm
- Savitzky–Golay smoothing filter: based on the least-squares fitting of polynomials to segments of the data
- Smoothing spline
- Stretched grid method
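
As an illustration of the exponential smoothing entry in the list above, here is a minimal sketch of simple (single) exponential smoothing. The function name and the choice to initialise the smoothed series with the first observation are assumptions made for this example; `alpha` plays the role of the tuning parameter that controls the extent of smoothing.

```python
def exponential_smooth(series, alpha):
    """Simple exponential smoothing.

    Each smoothed value is alpha times the new observation plus
    (1 - alpha) times the previous smoothed value.
    Assumption: the first smoothed value equals the first observation.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in the interval (0, 1]")
    if not series:
        return []
    smoothed = [float(series[0])]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


readings = [2, 4, 6]
smoothed = exponential_smooth(readings, alpha=0.5)

# In this simple scheme the one-step-ahead forecast is the last smoothed value.
forecast = smoothed[-1]
```

Values of `alpha` close to 1 follow the data closely, while values close to 0 give heavier smoothing and a slower response to change.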