In mathematics, the Euclidean distance or Euclidean metric is the "ordinary" straight-line distance between two points in Euclidean space. With this distance, Euclidean space becomes a metric space. The associated norm is called the Euclidean norm. Older literature refers to the metric as the Pythagorean metric. A generalized term for the Euclidean norm is the L2 norm or L2 distance.
In Cartesian coordinates, if p = (p1, p2,..., pn) and q = (q1, q2,..., qn) are two points in Euclidean n-space, then the Euclidean distance (d) from p to q, or from q to p, is given by the Pythagorean formula:
The position of a point in a Euclidean n-space is a Euclidean vector. Hence p and q may be represented as Euclidean vectors, starting from the origin of the space (initial point) with their tips (terminal points) ending at the two points. The Euclidean norm (a.k.a, Euclidean length), or the magnitude of a vector, measures the length of the vector:
where the last expression involves the dot product.
When the vector described as a directed line segment from the origin of the Euclidean space (vector tail) to a point in that space (vector tip), its length is actually the distance from its tail to its tip. The Euclidean norm of a vector is seen to be just the Euclidean distance between its tail and its tip.
The relationship between points p and q may involve a direction (for example, from p to q), and when it does, this relationship can itself be represented by a vector, given by
In a two- or three-dimensional space (n = 2, 3), this can be visually represented as an arrow from p to q. In any space, it can be regarded as the position of q relative to p. It may also be called a displacement vector, if p and q represent two positions of some moving point.
The Euclidean distance between p and q is just the Euclidean length of this displacement vector:
which is equivalent to equation 1, and also to:
In the context of Euclidean geometry, a metric is established in one dimension by fixing two points on a line, and choosing one to be the origin. The length of the line segment between these points defines the unit of distance, and the direction from the origin to the second point is defined as the positive direction. This line segment may be translated along the line to build longer segments, whose lengths correspond to multiples of the unit distance. In this manner, real numbers can be associated to points on the line (as the distance from the origin to the point), and these are the Cartesian coordinates of the points on what may now be called the real line. An alternate way to establish the metric, instead of choosing two points on the line, is to choose one point to be the origin, a unit of length, and a direction along the line to call positive. The second point is then uniquely determined, as the point on the line that is at a distance of one positive unit from the origin.
The distance between any two points on the real line is the absolute value of the numerical difference of their coordinates. It is common to identify the name of a point with its Cartesian coordinate. Thus if p and q are two points on the real line, then the distance between them is given by:
In one dimension, there is a single homogeneous, translation-invariant metric (i.e., a distance that is induced by a norm), up to a scale factor of length, which is the Euclidean distance, induced by the absolute-value norm (which is the unique norm in one dimension, up to scaling). In higher dimensions, there are other possible norms, such as the norms (which are all equal in one dimension), and in one dimension there are other metrics (scaling Euclidean distance by any monotonic function), but they are not induced by norms (since as metrics, they are not homogeneous and translation-invariant).
This is equivalent to the Pythagorean theorem.
Alternatively, it follows from (2) that if the polar coordinates of the point p are (r1, θ1) and those of q are (r2, θ2), then the distance between the points is
In three-dimensional Euclidean space, the distance is
In general, for an n-dimensional space, the distance is
Squared Euclidean distanceEdit
The square of the standard Euclidean distance, which is known as the squared Euclidean distance (SED), is also of interest; as an equation:
Squared Euclidean distance is of central importance in estimating parameters of statistical models, where it is used in the method of least squares, a standard approach to regression analysis. The corresponding loss function is the squared error loss (SEL), and places progressively greater weight on larger errors. The corresponding risk function (expected loss) is mean squared error (MSE).
Squared Euclidean distance is not a metric, as it does not satisfy the triangle inequality, nor does removing the parameter and its associated terms render the SED function into a norm or seminorm for the same reason. However, it is a more general notion of distance, namely a divergence (specifically a Bregman divergence), and can be used as a statistical distance. The Pythagorean theorem is simpler in terms of squared distance (since there is no square root); if , then:
In information geometry, the Pythagorean identity can be generalized from SED to other Bregman divergences, including relative entropy (Kullback–Leibler divergence), allowing generalized forms of least squares to be used to solve non-linear problems.
The SED is a smooth, strictly convex function of the two points, unlike the distance, which is not smooth when two points are equal and is not strictly convex (because it is linear). The SED is thus preferred in optimization theory, since it allows convex analysis to be used. Since squaring is a monotonic function of non-negative values, minimizing the SED is equivalent to minimizing the Euclidean distance, so the optimization problem is equivalent in terms of either, but easier to solve using the SED.
If one of the points is fixed, the SED can be interpreted as a potential function, in which case a normalization factor of one half is used, and the sign may be switched, depending on convention. In detail, given two points , the vector points from to and has magnitude proportional to their Euclidean distance. If one fixes , one can thus define a smooth vector field "pointing at " by This is the gradient of the scalar-valued function "half SED from ", where the half cancels the two in the power rule. Writing half the squared distance from as , one has Alternatively, one can consider the vector field pointing from , and omit the minus sign.
In information geometry, the notion of a vector field of "pointing from one point to another" can be generalized to statistical manifolds – one can use an affine connection to connect tangent vectors at different points and the exponential map to flow from one point to another, and on a statistical manifold this is invertible, defining a unique "difference vector" from any given point to another. In this context, the SED (whose gradient generates the standard difference vector) generalized to a divergence that generates the information geometry of the manifold; a uniform construction of such a divergence (given the geometric structure) is called a canonical divergence.
In computer science (and especially computer graphics, simulations, and video game development), the usage of traditional Euclidean distance and norm functions (in certain contexts) can be sometimes considered nonoptimal due to its dependence on the square root operation, which in many cases can be prohibitively slow. Algorithms based on comparisons between multiple distances or magnitudes can forgo the Euclidean metric and can instead utilize SED as an optimization, as relations between arbitrary non-negative values (in our case, distances) in a tuple should remain preserved after all values become squared (the SED values), e.g. if , then it follows that .
- Chebyshev distance measures distance assuming only the most significant dimension is relevant.
- Euclidean distance matrix
- Haversine distance giving great-circle distances between two points on a sphere from their longitudes and latitudes.
- Mahalanobis distance measures distance between a point and a distribution.
- Manhattan distance measures distance following only axis-aligned directions.
- Minkowski distance is a generalization that unifies Euclidean distance, Manhattan distance, and Chebyshev distance.
- Pythagorean addition
- Sequential euclidean distance transforms
- Vincenty's formulae well known as "Vincent distance"
- "Compendium of Mathematical Symbols". Math Vault. March 1, 2020. Retrieved September 1, 2020.
- Anton, Howard (1994), Elementary Linear Algebra (7th ed.), John Wiley & Sons, pp. 170–171, ISBN 978-0-471-58742-2
- Weisstein, Eric W. "Distance". mathworld.wolfram.com. Retrieved September 1, 2020.
- "Distance Between 2 Points". www.mathsisfun.com. Retrieved September 1, 2020.
- Ay & Amari 2015, 2. A New Approach to the General Inverse Problem.
- "mathematics - Are there any disadvantages of using Distance Squared checks rather than Distance?". Game Development Stack Exchange. Retrieved August 18, 2020.
- "Fast Square Root For Distance Calculations?". GameDev.net. Retrieved August 18, 2020.
- Deza, Elena; Deza, Michel Marie (2009). Encyclopedia of Distances. Springer. p. 94.
- "Cluster analysis". March 2, 2011.