Wednesday, December 25, 2024

Fermat's Christmas theorem

1. On 25 Dec. 1640, P. de Fermat (1607-1665) wrote to M. Mersenne, OM, about the two squares theorem for prime numbers, called Fermat's Christmas theorem.

Only some numbers can be written as a sum of two squares. P. de Fermat began a line of enquiry, founded on / premised upon the prime numbers, into which ones can be so written, on 25.12.1640; this can be considered the beginning of the modern higher arithmetic.

He stated in tbe letter that among (odd) primes, those (and, obviously, as shown below (art. 2, proposition III), only those) of the form p = 1 (mod 4) can be so written (i. e., those primes which yield 1 as remainder/residue on dividing by 4).

e. g.: 

    5 = 1 (mod 4).

    5 =  1^2 + 2^2.
 

Definition: (Odd) primes of the form p = 1 (mod 4) are called Pythagorean primes, (since the sum of two squares is reminiscent of the Pythagoras theorem); in contradistinction, (odd) primes of the form p = 3 (mod 4) = -1 (mod 4) are celled Gaussian primes (in the integers Z) (the reason for the name is explained in art. 6 below).

Proposition I (P. de Fermat): The Pythagorean primes can be written as a sum of two squares. 

Unlike the Pythagorean prime 5; the Gaussian primes 3, 7, or 11, cannot be so written; the next Pythagorean prime 13 =  2^2 + 3^2.

Note bene: The even prime 2 = 1^2 + 1^2.

Proposition II: If a prime is written as a sum of two numbers, they, the addends, are relatively prime (provided that the addends are proper, i. e., less than the sum).

 



2. Demonstration: For (odd) primes that can be so written, it is clear that they must be written as the sum of an odd square and an even square.

Odd squares are of the form x = 1^2, 3^2/(-1)^2 = 1 (mod 4); Even squares, of the form y = 0^2, 2^2 = 0 (mod 4). Thus their sum is of the form z = 1 (mod 4). Q. E. D.  Therefore:

Proposition III: If a prime can be written as a sum of two squares, then it is either even or a Pythagorean prime. The Gaussian primes cannot be so written.

N. B.: This is the converse of the Christmas theorem (proposition I).

What is not obvious is that all the Pythagorean primes can be so written. This is what P. de Fermat discovered and stated on 25.12.1640.



3. P. de Fermat did not give a demonstration of the propositon he had enunciated.

A complete demonstration of his Christmas theorem was given by L. Euler (1707-1783) only more than a century later, in two letters to C. Goldbach on 6.5.1747 and 12.4.1749, later published in two articles `De numeris qui sunt aggregata duorum quadratorum' [E228] and `Demonstratio theorematis Fermatiani omnem numerum primum formae 4n+1 esse summam duorum quadratorum' [E241].

J. L. Lagrange {G. L. La Grangia} (1736-1813) gave another demonstration in `Recherches d'arithmetique', Berlin, 1773, from his study of binary quadratic forms; C. F. Gauss (1777-1855) gave a succinct one in art. 182 of his `Disquisitiones arithmeticae', 1798, 1801.



4. A. Girard (1595-1632) had stated, in 1625, with no demonstrations, that: 

Proposition IV (A. Girard): The numbers that can be written as a sum of two squares are:

    i.   The even prime.
    ii.  Squares.  (N. B.: 0^2 is also permissible in the sum.)
    iii. The Pythagorean primes.
    iv.  Products of such numbers.

(in his commentary and annotated edition of the Arithmetic of Simon Stevinus).




5. Diophantus of Alexandria (fl. IIIrd century) showed (in his `Arithmetica') that: 

Proposition V (Diophantus of Alexandria): If two numbers can be written as sums of two squares, then so can be their product (and, in two different ways).

Demonstration:

(a^2 + b^2) (c^2 + d^2) = |a + bi|^2 |c + di|^2
                                       = (a+bi) (a-bi) (c+di) (c-di)
                                       = (a+bi) (c+di) (a-bi) (c-di)
                                       = [(ac-bd) + (ad+bc)i] [(ac-bd) - (ad+bc)i]
                                       = |(ac-bd) + (ad+bc)i|^2
                                       = (ac-bd)^2 + (ad+bc)^2

(a^2 + b^2) (c^2 + d^2) = (a^2 + b^2) (d^2 + c^2)
                                       = |a + bi|^2 |d + ci|^2
                                       = (a+bi) (a-bi) (d+ci) (d-ci)
                                       = (a+bi) (d+ci) (a-bi) (d-ci)
                                       = [(ad-bc) + (ac+bd)i] [(ad-bc) - (ac+bd)i]
                                       = |(ad-bc) + (ac+bd)i|^2
                                       = (ad-bc)^2 + (ac+bd)^2
                                       = (ac+bd)^2 + (ad-bc)^2

Thus we obtain that:

    (a^2 + b^2) (c^2 + d^2) = e^2 + f^2

where:

    either: e = ac-bd; f = ad+bc

    or:  e = ac+bd; f = ad-bc.

Q. E. D.  

Nota bene: Diophantus merely stated his identity, and gave no demonstration. I have written an anachronistic demonstration for his identity in the ring of Gaussian integers Z[i] (cf. the following art. 6 and, further, art. 10, Proposition IX).

 

 

6. C. F. Gauss in his second memoir on biquadratic {quartic} reciprocity, `Theoria residuorum biquadraticorum', 1831, discovered the ring Z[i] of Gaussian integers of the form a+bi (where a and b are integers, and the lateral/imaginary unit i = \sqrt{-1} is adjoined to the integers Z to obtain Z[i]; cf. art. 11 on the lateral unit), and showed that unique factorisation into primes (the fundamental theorem of arithmetic) holds for the Gaussian integers. 

The writing of a number as a sum of two squares can be reïnterpreted as splitting it into conjugate Gaussian integers (cf. art. 9 for the meaning of conjugate).

Some primes in Z split/decompose/factorise into conjugate primes in the ring Z[i], reflecting their being written as a sum of two squares (cf. art. 12 for a demonstration that these factors are Gaussian primes in Z[i], Proposition XI). 

The other primes in Z remain inert, prime in the ring Z[i] as well, reflecting the impossibility of their being written as a sum of two squares. 

Definition: Call the primes (in Z) that split into conjugate primes in the Gaussian integers, the splitting primes; and the primes (in Z) that remain inert in the Gaussian integers, the inert primes. In art. 2, it was shown (proposition III, reënunciated here):

Proposition VI: A splitting prime is either even or a Pythagorean prime. The Gaussian primes are inert.

Fermat and Girard enunciated that:

Proposition VII (P. de Fermat): The Pythagorean primes split. The inert primes are Gaussian.

 

7. C. F. Gauss in his memoir of 1831, and also C. Wessel in 1799 and J.-R. Argand in 1806, 1814, introduced the geometrical interpretation of the complex numbers

    a+bi = r e^{i \theta}

in the complex plane; where: 

    a+bi

is the Cartesian form of the complex number with abscissa a and ordinate b; and:

    r e^{i \theta}

is the polar form of the complex number with magnitude (radius, modulus, absolute value. norm) r and argument (polar angle, phase, azimuth) \theta, and: 

    e^{i \theta} = (\cos \theta + i \sin \theta)

is the direction factor/direction coëfficient/reduced form/polar factor/polar coëfficient/angular factor/angular coëfficient/phase factor/phase coëfficient/azimuthal factor/azimuthal coëfficient/polar arc

The complex numbers were discovered by Girolamo Fazio Cardano (1501-1576) in his `Ars magna', 1545. They were called complex numbers by C. F. Gauss in 1831.

The term lateral unit is due to C. F. Gauss, 1831; and the term imaginary unit, due to Renatus Cartesius, 1637. 

The notation i for the lateral unit is due to L. Euler in his lecture, `De formulis differentialibus angularibus maxime irrationalibus quas tamen per logarithmos et arcus circulares integrare licet' [E671], Academiae imperialis scientiarum, Petropolis {St. Petersburg}, Russia, May 1777 (5 days after the birth of C. F. Gauss); the next use of the notation is by C. F. Gauss in his `Disquisitiones arithmeticae', 1798, 1801.

 

1777

30 Apr. -> Birth of C. F. Gauss, Braunschweig, Braunschweig-Wolfenbüttel, Germania.

5 May -> L. Euler introduces the notation i = \sqrt{-1}. 

 

The formula for the polar coëfficient/reduced form:

    e^{i \theta} = (\cos \theta + i \sin \theta)

was obtained by L. Euler in his `Introductio in anaysin infinitorum', 1745, 1748; for \theta := \pi it yields:

    e^{i \pi} + 1 = 0.

 

 

8. The even prime, 2, a prime in Z, splits in Z[i], as 2 = (1+i) (1-i) = (-1+i) (-1-i), and is thus composite in Z[i]; 1+i, 1-i, -1+i, -i-i are the 4 Gaussian prime factors of 2 in Z[i]; these factorisations correspond to the sum of squares form 2 = 1^2 + 1^2.

The first Pythagorean prime, 5, a prime in Z, splits in Z[i], as 5 = (1+2i) (1-2i) = (2+i) (2-i), and is thus composite in Z[i]; 1+2i, 1−2i, −1+2i, −1−2i, 2+i, 2−i, −2+i, −2−i are the 8 Gaussian prime factors of 5 in Z[i]; these factorisations correspond to the sum of squares form 5 = 1^2 + 2^2. cf. the following art. 8 for the eightfold associates of 1+2i, 1-2i.

The first few (<= 1729) Pythagorean primes, i. e., of the form p = 1 (mod 4) are: 

5, 13, 17, 29, 37, 41, 53, 61, 73, 89, 97, 101, 109, 113, 137, 149, 157, 173, 181, 193, 197, 229, 233, 241, 257, 269, 277, 281, 293, 313, 317, 337, 349, 353, 373, 389, 397, 401, 409, 421, 433, 449, 457, 461, 509, 521, 541, 557, 569, 577, 593, 601, 613, 617, 641, 653, 661, 673, 677, 701, 709, 733, 757, 761, 769, 773, 797, 809, 821, 829, 853, 857, 877, 881, 929, 937, 941, 953, 977, 997, 1009, 1013, 1021, 1033, 1049, 1061, 1069, 1093, 1097, 1109, 1117, 1129, 1153, 1181, 1193, 1201, 1213, 1217, 1229, 1237, 1249, 1277, 1289, 1297, 1301, 1321, 1361, 1373, 1381, 1409, 1429, 1433, 1453, 1481, 1489, 1493, 1549, 1553, 1597, 1601, 1609, 1613, 1621, 1637, 1657, 1669, 1693, 1697, 1709, 1721, ...

For the splitting primes, the sum of squares representation is unique; and they have 8 Gaussian prime factors in Z[i], all corresponding to the sum of squares representation.

The first few (<= 1729) Gaussian primes in Z, i. e., of the form p = 3 (mod 4) = -1 (mod 4), are: 

3, 7, 11, 19, 23, 31, 43, 47, 59, 67, 71, 79, 83, 103, 107, 127, 131, 139, 151, 163, 167, 179, 191, 199, 211, 223, 227, 239, 251, 263, 271, 283, 307, 311, 331, 347, 359, 367, 379, 383, 419, 431, 439, 443, 463, 467, 479, 487, 491, 499, 503, 523, 547, 563, 571, 587, 599, 607, 619, 631, 643, 647, 659, 683, 691, 719, 727, 739, 743, 751, 787, 811, 823, 827, 839, 859, 863, 883, 887, 907, 911, 919, 947, 967, 971, 983, 991, 1019, 1031, 1039, 1051, 1063, 1087, 1091, 1103, 1123, 1151, 1163, 1171, 1187, 1223, 1231, 1259, 1279, 1283, 1291, 1303, 1307, 1319, 1327, 1367, 1399, 1423, 1427, 1439, 1447, 1451, 1459, 1471, 1483, 1487, 1499, 1511, 1523, 1531, 1543, 1559, 1567, 1571, 1579, 1583, 1607, 1619, 1627, 1663, 1667, 1699, 1723, ...

For the inert primes, there are no sum of squares representations; and they have 4 Gaussian prime factors in Z[i], to wit, p, pi, -p, -pi (the associates of p, cf. art. 11).

Proposition VIII: The threefold kinds of Gaussian primes in Z[i], which are factors of the three kinds of primes in Z; to wit: even (also splitting), splitting (and odd), and inert; are all of the Gaussian primes (cf. art. 12 for a demonstration, of Proposition XII).

 

 

9. Definition: Following C. F. Gauss, call:

    i. ||a + bi|| := a^2 + b^2 = r^2, the norm-squared;

    ii. |a + bi|  := \sqrt{||a+bi||} = \sqrt{a^2 + b^2} = \sqrt{r}, the norm;

of the complex number a + bi = r e^{i \theta}.

 

Nota bene: If \conj (a+bi) := a-bi = r e^{-i \theta}, called the conjugate of the complex number a+bi, then:

a. The conjugate of a product is the product of the conjugates; i. e.: 

        \conj{z_1 z_2} = \conj{z_1} \ conj{z_2} (Multiplicativity)

b. ||z|| = z \conj{z}.

c. \conj{z} is the reflection of z about the real axis/x- axis, negating the ordinate (and the polar angle). i \conj{z} is the reflection of z about the diagonal line having equation y = x of slope \pi/4.

d. For a complex number z = a + bi = r e^{i \theta}, consider the eightfold associates of z and \conj{z}:

    z, iz, -z, -iz, \conj{z}, i\conj{z}, -\conj{z}, -i\conj{z}.

which are, respectively, =

    a + bi, -b + ai, -a - bi, b - ai, a - bi, b + ai, -a + bi, -b - ai, =

    r e^{i \theta}, r e^{i (\theta + \pi/2)}, r e^{i (\theta + \pi)}, r e^{i (\theta + 3\pi/2)},
    r e^{-i \theta}, r e^{i (-\theta  + \pi/2)}, r e^{i (-\theta  + \pi)}, r e^{i (-\theta  + 3\pi/2)}.

These are the possible complex numbers that can be obtained from z by negating either the abscissa or ordinate or by transposing/exchanging them; they are the associates of either z or \conj{z} (cf. art. 11).

e. A complex number z is, together with \conj{z}, a zero of the polynomial:

    (x - z) (x - \conj{z}) = x^2 - 2\Re(z) + ||z||

 

It can be further deduced, for the Gaussian integers and their norms-squared:

i. The norm-squared of a Gaussian integer is a whole number.

ii. 

    a. The norm-squared of a unit (cf. art. 11) is 1; i. e.:

            ||1|| = ||i|| = ||-1|| = ||-i|| = 1

    b. The norm-squared of a product is the product of the norms-squared;

        i. e., for Gaussian integers z_1, z_2:

        ||z_1 z_2|| = (z_1 z_2) \conj {z_1 z_2}

                      = z_1 z_2 \conj{z_1} \conj{z_2}

                      = z_1 \conj{z_1} z_2 \conj{z_2}

                      = ||z_1|| ||z_2||    (Multiplicativity). 

    c. The norm-squared of a Gaussian integer is a sum of two squares.



10. The writing of a number as a sum of two squares can be reïnterpreted as the writing it as the norm-squared of a Gaussian integer.

The Diophantus identity (art. 5, proposition V) can be reënunciated as:

Proposition IX (Diophantus of Alexandria): If two numbers X = ||a+bi||, Y = ||c+di||, can be written as norms-squared of Gaussian integers, then so can be their product Z = ||e+fi|| (and, in two different ways).

Demonstration

XY = (a^2 + b^2) (c^2 + d^2) 
       = ||a + bi|| ||c + di||
       = ||(ac-bd) + (ad+bc)i|| (Multiplicativity)
       = (ac-bd)^2 + (ad+bc)^2

XY = (a^2 + b^2) (c^2 + d^2)
       = (a^2 + b^2) (d^2 + c^2)
       = ||a + bi|| ||d + ci||
       = ||(ad-bc) + (ac+bd)i|| (Multiplicativity)
       = (ad-bc)^2 + (ac+bd)^2
       = (ac+bd)^2 + (ad-bc)^2 

yielding:

    XY = ||a + bi|| ||c + di|| = ||e + fi|| = Z

where:

    either: e = ac-bd; f = ad+bc

    or:  e = ac+bd; f = ad-bc

and Z = ||e+fi||.

Q. E. D.   


11. Call i. -1. -i. 1, the units of the ring of Gaussian integers Z[i]; and for a Gaussian integer z; z, iz, -z, -iz, its associates in Z[i], obtained by multiplying by the units, rotating by right angles through the complex plane.

Likewise, call 1, -1, the units of the ring of integers Z; and for an integer m; m, -m, its associates in Z, obtained by multiplying by the units, rotating by straight angles through the complex plane.

Nota bene: The units are the only elements of the ring with norm 1, i. e., lying on the unit circle; and with their reciprocals also in the ring.

Nota bene: The units of Z[i] are the biquadratic {quartic} roots of unity, i. e.:

    \radical{1}{4} = 1^{1/4} = i. -1. -i. 1 = e^{i \pi/2}, e^{i 2\pi/2}, e^{i 3\pi/2}, e^{i 4\pi/2}

Likewise, the units of Z are the square roots of unity, i. e.: 

    \sqrt{1} = \radical {1}{2} = 1^{1/2} = -1. 1 = e^{i \pi}, e^{i 2\pi}

The nth roots of unity, i. e., \radical{1}{n} = 1^{1/n}, are the n zeros of the polynomial:

    x^n - 1 = (x-1) (x^{n-1} + x^{n-2} + ... + x + 1)

These are formed by rotating \omega := e^{i 2\pi/n} through the complex plane, i. e.: 

    \omega_k := \omega^k = e^{i 2\pi/n k} for k in [0..n)

i. e.: 

    \omega = \omega_1 = e^{i 2\pi/n}

generates the nth roots of unity.

A primitive nth root of unity is any such root that generates the n roots; these are:

    \omega_k := \omega^k = e^{i 2\pi/n k} for k in [0..n) with k relatively prime to n.

The nth roots of unity are said to be conjugates with respect to the polynomial x^n-1; the conjugates of 1, to wit, \omega_k for k in [1..n), are the zeros of the polynomial:

    x^{n-1} + x^{n-2} + ... + x + 1 = \prod_{k in [1..n)} (x - \omega_k).

For n = 2, \omega_1 = -1 is the primitive square root of unity; and for n = 4, \omega_1 = i and \omega_3 = -i are the primitive biquadratic {quartic} roots of unity.

For n=2, \omega = \omega_1 = e^{i \pi} = -1 is a zero of x+1, yielding:

    e^{i \pi} + 1 = 0  (as obtained in art. 6 too).     



12.  An inert prime is a Gaussian prime in Z[i]; the other Gaussian primes in Z[i] are as follows.

Demonstration: For Gaussian primes z = r+si in Z[i], either z is real (i. e., \Im(z)=s=0, and z is an inert prime); or z is non-real (i. e., has non-zero ordinate/lateral part/imaginary part), and n = ||z|| = r^2+s^2, a natural number that can be written as a sum of two squares; if n = ||z|| is not prime, its proper non-unit factors, say n_1, n_2, with n = n_1 n_2 would yield n_1 n_2 = (r + si) (r - si), and, due to unique factorisation in the Gaussian integers, proper non-unit factors for z in the Gaussian integers, an absurdity. Therefore, n = ||z|| = r^2+s^2 is a splitting prime. Q. E. D. Hence: 

Proposition X: A Gaussian prime in Z[i] is either an inert prime or its norm-squared is a splitting prime.

A splitting prime p is the norm-squared of a Gaussian integer z in Z[i], i. e., p = ||z||; the question arises whether these factors are Gaussian primes or composites in Z[i].

Demonstration: For splitting primes p, let p = r^2 + s^2 = ||r + si|| = ||z||, where z=r+si is a Gaussian integer; if z is not a Gaussian prime in Z[i], then, its proper non-unit factors, say z_1, z_2, with z = z_1 z_2 would yield p = ||z_1|| ||z_2||, proper non-unit factors for p, an absurdity. Therefore, z=r+si is a Gaussian prime (in Z[i]). Q. E. D. Hence:

Proposition XI: The factors of a splitting prime are conjugate Gaussian primes in Z[i]. 

 

Therefore (reënunciating proposition VIII in art. 8):

Proposition XII: The primes in the Gaussian integers Z[i] are: 

    i. 1+i (the proper non-unit factor of 2); and its associates in Z[i].

    ii. Proper non-unit factors of the splitting primes; and their associates in Z[i].

    iii. Inert primes; and its associates in Z[i].

Nota bene: 2 is a perfect square up to units in Z[i], since its proper factor 1+i has its conjugate 1-i being simultaneously its associate.

Nota bene: Purely lateral/imaginary perfect squares are of the form 2a^2 i = (a + ai)^2; the smallest such is 2i = (1+i)^2. 

These primes can be found on concentric circles emanating from the origin in the complex plane, having radii/norms: 

    i. \sqrt{2}.

    ii. \sqrt{p} where p is an odd splitting prime.

    iii. p, an inert prime.

 and norms-squared:

    i. 2.

    ii. p where p is an odd splitting prime.

    iii. p^2, where p is an inert prime.

 

13.  Reïnterpreting the writing of a number as a sum of two squares as the writing it as the norm-squared of a Gaussian integer:

Demonstration: Consider a natural number n that can be written as the norm-squared of a Gaussian integer z, i. e., n = ||z||, the prime factorisation of n, and the corresponding prime factorisation of z (obtained by taking one of the conjugate Gaussian prime factors in the case of the splitting prime factors of n). From the multiplicativity of norms-squared, it follows that n is the product of the three types of norms-squared enumerated in the preceding art. 12, wherein all the Gaussian primes in Z[i] contribute for each instance a prime of single multiplicity to the product n, except that the inert primes contribute primes of double multiplicity. Q. E. D. Therefore:

Proposition XIII: If a number can be written as the norm-squared of a Gaussian integer, then all the inert primes in its prime factorisation will be of even multiplicity.

and an analogue of Girard's theorem (art. 4, proposition IV) follows at once:

Proposition XIV (A. Girard): If a number can be written as the norm-squared of a Gaussian integer, then it is one of:

    i.   The even prime.
    ii.  Square.  (N. B.: 0^2 is also permissible in the sum.)
    iii. A splitting prime.
    iv.  Product of such numbers.