In mathematics, a covering number is the number of balls of a given size needed to completely cover a given space, with possible overlaps between the balls. The covering number quantifies the size of a set and can be applied to general metric spaces. Two related concepts are the packing number, the number of disjoint balls that fit in a space, and the metric entropy, the number of points that fit in a space when constrained to lie at some fixed minimum distance apart.

Definition edit

Let (M, d) be a metric space, let K be a subset of M, and let r be a positive real number. Let Br(x) denote the ball of radius r centered at x. A subset C of M is an r-external covering of K if:

 .

In other words, for every   there exists   such that  .

If furthermore C is a subset of K, then it is an r-internal covering.

The external covering number of K, denoted  , is the minimum cardinality of any external covering of K. The internal covering number, denoted  , is the minimum cardinality of any internal covering.

A subset P of K is a packing if   and the set   is pairwise disjoint. The packing number of K, denoted  , is the maximum cardinality of any packing of K.

A subset S of K is r-separated if each pair of points x and y in S satisfies d(x, y) ≥ r. The metric entropy of K, denoted  , is the maximum cardinality of any r-separated subset of K.

Examples edit

  1. The metric space is the real line  .   is a set of real numbers whose absolute value is at most  . Then, there is an external covering of   intervals of length  , covering the interval  . Hence:
     
  2. The metric space is the Euclidean space   with the Euclidean metric.   is a set of vectors whose length (norm) is at most  . If   lies in a d-dimensional subspace of  , then:[1]: 337 
     .
  3. The metric space is the space of real-valued functions, with the l-infinity metric. The covering number   is the smallest number   such that, there exist   such that, for all   there exists   such that the supremum distance between   and   is at most  . The above bound is not relevant since the space is  -dimensional. However, when   is a compact set, every covering of it has a finite sub-covering, so   is finite.[2]: 61 

Properties edit

  1. The internal and external covering numbers, the packing number, and the metric entropy are all closely related. The following chain of inequalities holds for any subset K of a metric space and any positive real number r.[3]
     
  2. Each function except the internal covering number is non-increasing in r and non-decreasing in K. The internal covering number is monotone in r but not necessarily in K.

The following properties relate to covering numbers in the standard Euclidean space,  :[1]: 338 

  1. If all vectors in   are translated by a constant vector  , then the covering number does not change.
  2. If all vectors in   are multiplied by a scalar  , then:
    for all  :  
  3. If all vectors in   are operated by a Lipschitz function   with Lipschitz constant  , then:
    for all  :  

Application to machine learning edit

Let   be a space of real-valued functions, with the l-infinity metric (see example 3 above). Suppose all functions in   are bounded by a real constant  . Then, the covering number can be used to bound the generalization error of learning functions from  , relative to the squared loss:[2]: 61 

 

where   and   is the number of samples.

See also edit

References edit

  1. ^ a b Shalev-Shwartz, Shai; Ben-David, Shai (2014). Understanding Machine Learning – from Theory to Algorithms. Cambridge University Press. ISBN 9781107057135.
  2. ^ a b Mohri, Mehryar; Rostamizadeh, Afshin; Talwalkar, Ameet (2012). Foundations of Machine Learning. US, Massachusetts: MIT Press. ISBN 9780262018258.
  3. ^ Tao, Terence. "Metric entropy analogues of sum set theory". Retrieved 2 June 2014.