Binary GCD algorithm

The binary GCD algorithm, also known as Stein's algorithm or the binary Euclidean algorithm,[1][2] is an algorithm that computes the greatest common divisor of two nonnegative integers. Stein's algorithm uses simpler arithmetic operations than the conventional Euclidean algorithm; it replaces division with arithmetic shifts, comparisons, and subtraction.

Visualisation of using the binary GCD algorithm to find the greatest common divisor (GCD) of 36 and 24. Thus, the GCD is 22 × 3 = 12.

Although the algorithm in its contemporary form was first published by the Israeli physicist and programmer Josef Stein in 1967,[3] it may have been known by the 2nd century BCE, in ancient China.[4][5]


The algorithm reduces the problem of finding the GCD of two nonnegative numbers v and u by repeatedly applying these identities:

  • gcd(0, v) = v, because everything divides zero, and v is the largest number that divides v. Similarly, gcd(u, 0) = u.
  • gcd(2u2v) = 2·gcd(u, v)
  • gcd(2uv) = gcd(uv), if v is odd (2 is not a common divisor). Similarly, gcd(u2v) = gcd(uv) if u is odd.
  • gcd(uv) = gcd(|u − v|, min(u, v)), if u and v are both odd.


Recursive version in CEdit

Following is a recursive implementation of the algorithm in C. The implementation is similar to the description of the algorithm given above, and optimised for readability rather than speed, though all but one of the recursive calls are tail recursive.

unsigned int gcd(unsigned int u, unsigned int v)
    // Base cases
    // gcd(n, n) = n
    if (u == v)
        return u;
    //  Identity 1: gcd(0, n) = gcd(n, 0) = n
    if (u == 0)
        return v;
    if (v == 0)
        return u;

    if (u & 1) { // u is odd
        if (v % 2 == 0) // v is even
            return gcd(u, v / 2); // Identity 3
        // Identities 4 and 3 (u and v are odd, so u-v and v-u are known to be even)
        if (u > v)
            return gcd((u - v) / 2, v);
            return gcd((v - u) / 2, u);
    } else { // u is even
        if (v & 1) // v is odd
            return gcd(u / 2, v); // Identity 3
        else // both u and v are even
            return 2 * gcd(u / 2, v / 2); // Identity 2

Iterative version in RustEdit

Following is an implementation of the algorithm in Rust, adapted from uutils. It removes all common factors of 2, using identity 2, then computes the GCD of the remaining numbers using identities 3 and 4, combining these to form the final answer.

pub fn gcd(mut u: u64, mut v: u64) -> u64 {
    use std::cmp::min;
    use std::mem::swap;

    // Base cases: gcd(n, 0) = gcd(0, n) = n
    if u == 0 {
        return v;
    } else if v == 0 {
        return u;

    // Using identities 2 and 3:
    // gcd(2ⁱ u, 2ʲ v) = 2ᵏ gcd(u, v) with u, v odd and k = min(i, j)
    // 2ᵏ is the greatest power of two that divides both u and v
    let i = u.trailing_zeros();  u >>= i;
    let j = v.trailing_zeros();  v >>= j;
    let k = min(i, j);

    loop {
        // u and v are odd at the start of the loop
        debug_assert!(u % 2 == 1, "u = {} is even", u);
        debug_assert!(v % 2 == 1, "v = {} is even", v);

        // Swap if necessary so u <= v
        if u > v {
            swap(&mut u, &mut v);

        // Using identity 4 (gcd(u, v) = gcd(|v-u|, min(u, v))
        v -= u;

        // Identity 1: gcd(u, 0) = u
        // The shift by k is necessary to add back the 2ᵏ factor that was removed before the loop
        if v == 0 {
            return u << k;

        // Identity 3: gcd(u, 2ʲ v) = gcd(u, v) (u is known to be odd)
        v >>= v.trailing_zeros();

This implementation showcases several performance optimisations:

  • Trial division by 2 is eschewed in favour of a single bitshift and the count trailing zeros primitive u64::trailing_zeros. This performs the equivalent of applying repeatedly identity 3, in a much smaller amount of time.
  • The loop is laid out so as to avoid repeated work; for instance, eliminating 2 as a factor of v was moved to the back of the loop, and after the exit condition, as v is known to be odd upon entering the loop.
  • The body of the loop is branch-free except for its exit condition (v == 0), as the exchange of u and v (ensuring u ≤ v) compiles down to conditional moves.[6] Hard-to-predict branches can have a large, negative impact on performance.[7][8]


The algorithm requires O(n) steps, where n is the number of bits in the larger of the two numbers, as every 2 steps reduce at least one of the operands by at least a factor of 2. Each step involves only a few arithmetic operations (O(1) with a small constant); when working with word-sized numbers, each arithmetic operation translates to a single machine operation, so the number of machine operations is on the order of log(max(u, v)) .

However, the asymptotic complexity of this algorithm is O(n2),[9] as those arithmetic operations (subtract and shift) each take linear time for arbitrarily-sized numbers (one machine operation per word of the representation). This is the same as for the Euclidean algorithm, though a more precise analysis by Akhavi and Vallée proved that binary GCD uses about 60% fewer bit operations.[10]


The binary GCD algorithm can be extended in several ways, either to output additional information, deal with arbitrarily-large integers more efficiently, or to compute GCDs in domains other than the integers.

The extended binary GCD algorithm, analogous to the extended Euclidean algorithm, fits in the first kind of extension, as it provides the Bézout coefficients in addition to the GCD, i.e. integers a and b such that a·u + b·v = gcd(u, v).[11][12][13]

In the case of large integers, the best asymptotic complexity is O(log n M(n)), with M(n) the cost of n-bit multiplication; this is near-linear, and vastly smaller than the O(n2) of the binary GCD algorithm, though concrete implementations only outperform older algorithms for numbers larger than about 64 kilobits (i.e. greater than 8×1019265). This is achieved by extending the binary GCD algorithm using ideas from the Schönhage–Strassen algorithm for fast integer multiplication.[14]

The binary GCD algorithm has also been extended to domains other than natural numbers, such as Gaussian integers,[15] Eisenstein integers,[16] quadratic rings,[17][18] and arbitrary unique factorisation domains.[19]

Historical descriptionEdit

An algorithm for computing the GCD of two numbers was known in ancient China, under the Han dynasty, as a method to reduce fractions:

If possible halve it; otherwise, take the denominator and the numerator, subtract the lesser from the greater, and do that alternately to make them the same. Reduce by the same number.

— Fangtian - Land surveying, The Nine Chapters on the Mathematical Art

The phrase "if possible halve it" is ambiguous,[4][5]

  • if this applies when either of the numbers become even, the algorithm is the binary GCD algorithm;
  • if this only applies when both numbers are even, the algorithm is similar to the Euclidean algorithm.

See alsoEdit


  1. ^ Brent, Richard P. (13–15 September 1999). Twenty years' analysis of the Binary Euclidean Algorithm. 1999 Oxford-Microsoft Symposium in honour of Professor Sir Antony Hoare. Oxford.
  2. ^ Brent, Richard P. (November 1999). Further analysis of the Binary Euclidean algorithm (Technical report). Oxford University Computing Laboratory. arXiv:1303.2772. PRG TR-7-99.
  3. ^ Stein, J. (February 1967), "Computational problems associated with Racah algebra", Journal of Computational Physics, 1 (3): 397–405, Bibcode:1967JCoPh...1..397S, doi:10.1016/0021-9991(67)90047-2, ISSN 0021-9991
  4. ^ a b Knuth, Donald (1998), Seminumerical Algorithms, The Art of Computer Programming, 2 (3rd ed.), Addison-Wesley, ISBN 978-0-201-89684-8
  5. ^ a b Zhang, Shaohua (2009-10-05). "The concept of primes and the algorithm for counting the greatest common divisor in Ancient China". arXiv:0910.0095 [math.HO].
  6. ^ Godbolt, Matt. "Compiler Explorer". Retrieved 4 November 2020.
  7. ^ Kapoor, Rajiv (2009-02-21). "Avoiding the Cost of Branch Misprediction". Intel Developer Zone.
  8. ^ Lemire, Daniel (2019-10-15). "Mispredicted branches can multiply your running times".
  9. ^ "GNU MP 6.1.2: Binary GCD".
  10. ^ Akhavi, Ali; Vallée, Brigitte (2000), "Average Bit-Complexity of Euclidean Algorithms", Proceedings ICALP'00, Lecture Notes Computer Science 1853: 373–387, CiteSeerX
  11. ^ Knuth 1998, p. 646, answer to exercise 39 of section 4.5.2
  12. ^ Menezes, Alfred J.; van Oorschot, Paul C.; Vanstone, Scott A. (October 1996). "§14.4 Greatest Common Divisor Algorithms" (PDF). Handbook of Applied Cryptography. CRC Press. pp. 606–610. ISBN 0-8493-8523-7. Retrieved 2017-09-09.
  13. ^ Cohen, Henri (1993). "Chapter 1 : Fundamental Number-Theoretic Algorithms". A Course In Computational Algebraic Number Theory. Graduate Texts in Mathematics. 138. Springer-Verlag. pp. 17–18. ISBN 0-387-55640-0.
  14. ^ Stehlé, Damien; Zimmermann, Paul (2004), "A binary recursive gcd algorithm" (PDF), Algorithmic number theory, Lecture Notes in Comput. Sci., 3076, Springer, Berlin, pp. 411–425, CiteSeerX, doi:10.1007/978-3-540-24847-7_31, ISBN 978-3-540-22156-2, MR 2138011, INRIA Research Report RR-5050.
  15. ^ Weilert, André (July 2000). "(1+i)-ary GCD Computation in Z[i] as an Analogue to the Binary GCD Algorithm". Journal of Symbolic Computation. 30 (5): 605–617. doi:10.1006/jsco.2000.0422.
  16. ^ Damgård, Ivan Bjerre; Frandsen, Gudmund Skovbjerg (August 12–15, 2003). Efficient Algorithms for GCD and Cubic Residuosity in the Ring of Eisenstein Integers. 14th International Symposium on the Fundamentals of Computation Theory. Malmö, Sweden. pp. 109–117. doi:10.1007/978-3-540-45077-1_11.CS1 maint: date format (link)
  17. ^ Agarwal, Saurabh; Frandsen, Gudmund Skovbjerg (June 13–18, 2004). Binary GCD Like Algorithms for Some Complex Quadratic Rings. Algorithmic Number Theory Symposium. Burlington, VT, USA. pp. 57–71. doi:10.1007/978-3-540-24847-7_4.CS1 maint: date format (link)
  18. ^ Agarwal, Saurabh; Frandsen, Gudmund Skovbjerg (March 20–24, 2006). A New GCD Algorithm for Quadratic Number Rings with Unique Factorization. 7th Latin American Symposium on Theoretical Informatics. Valdivia, Chile. pp. 30–42. doi:10.1007/11682462_8.CS1 maint: date format (link)
  19. ^ Wikström, Douglas (July 11–15, 2005). On the l-Ary GCD-Algorithm in Rings of Integers. Automata, Languages and Programming, 32nd International Colloquium. Lisbon, Portugal. pp. 1189–1201. doi:10.1007/11523468_96.CS1 maint: date format (link)

Further readingEdit

Covers the extended binary GCD, and a probabilistic analysis of the algorithm.

Covers a variety of topic, including the extended binary GCD algorithm which outputs Bézout coefficients, efficient handling of multi-precision integers using a variant of Lehmer's GCD algorithm, and the relationship between GCD and continued fraction expansions of real numbers.

An analysis of the algorithm in the average case, through the lens of functional analysis: the algorithms' main parameters are cast as a dynamical system, and their average value is related to the invariant measure of the system's transfer operator.

External linksEdit