The Gaussian Integers - Lecture Notes | MATH 3240, Study notes of Number Theory

University of Connecticut (UConn) - Avery Point Number Theory

Prof. Keith Conrad

Material Type: Notes; Professor: Conrad; Class: Introduction to Number Theory; Subject: Mathematics; University: University of Connecticut; Term: Fall 2009;

Typology: Study notes

2009/2010

Uploaded on 03/28/2010

THE GAUSSIAN INTEGERS

KEITH CONRAD

Since the work of Gauss, number theorists have been interested in analogues of Z where concepts from arithmetic can also be developed. The example we will look at in this handout is the Gaussian integers:

Z[i] = {a + bi : a, b ∈ Z}.

Excluding the last two sections of the handout, the topics we will study are extensions of common properties of the integers. Here is what we will cover in each section:

(1) the norm on Z[i]
(2) divisibility in Z[i]
(3) the division theorem in Z[i]
(4) the Euclidean algorithm in Z[i]
(5) Bezout’s theorem in Z[i]
(6) unique factorization in Z[i]
(7) modular arithmetic in Z[i]
(8) applications of Z[i] to the arithmetic of Z
(9) primes in Z[i]

1. The Norm

In Z, size is measured by the absolute value. In Z[i], we use the norm.

Definition 1.1. For α = a + bi ∈ Z[i], its norm is the product

N(α) = αᾱ = (a + bi)(a − bi) = a² + b².

For example, N(2 + 7i) = 2² + 7² = 53. For m ∈ Z, N(m) = m². In particular, N(1) = 1.

Thinking about a + bi as a complex number, its norm is the square of its usual absolute value:

|a + bi| = √(a² + b²), N(a + bi) = a² + b² = |a + bi|².

The reason we prefer to deal with norms on Z[i] instead of absolute values on Z[i] is that norms are integers (rather than square roots), and the divisibility properties of norms in Z will provide important information about divisibility properties in Z[i]. This is based on the following algebraic property of the norm.

Theorem 1.2. The norm is multiplicative: for α and β in Z[i], N(αβ) = N(α) N(β).

Proof. Write α = a + bi and β = c + di. Then αβ = (ac − bd) + (ad + bc)i. We now compute N(α) N(β) and N(αβ):

N(α) N(β) = (a² + b²)(c² + d²) = (ac)² + (ad)² + (bc)² + (bd)²

and

N(αβ) = (ac − bd)² + (ad + bc)² = (ac)² − 2abcd + (bd)² + (ad)² + 2abcd + (bc)² = (ac)² + (bd)² + (ad)² + (bc)².

The two results agree, so N(αβ) = N(α) N(β).
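Theorem 1.2 is easy to spot-check numerically. The sketch below is only an illustration and is not part of the handout: it stores a Gaussian integer a + bi as a Python pair (a, b), and the helper names norm and mul are made up for this example.

```python
def norm(z):
    """Norm N(a + bi) = a^2 + b^2, with a + bi stored as the pair (a, b)."""
    a, b = z
    return a * a + b * b


def mul(z, w):
    """Product (a + bi)(c + di) = (ac - bd) + (ad + bc)i."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)


alpha, beta = (2, 7), (4, 5)
print(norm(alpha))                                         # 53
print(mul(alpha, beta))                                    # (-27, 38)
print(norm(mul(alpha, beta)) == norm(alpha) * norm(beta))  # True
```

A numerical check like this does not replace the proof, but it is a quick way to catch sign mistakes when multiplying Gaussian integers by hand.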
As a first application of Theorem 1.2, we determine the Gaussian integers which have multiplicative inverses in Z[i]. The idea is to apply norms to reduce the question to invertibility in Z.

Corollary 1.3. The only Gaussian integers which are invertible in Z[i] are ±1 and ±i.

Proof. It is easy to see ±1 and ±i have inverses in Z[i]: 1 and −1 are their own inverses, and i and −i are inverses of each other. For the converse direction, suppose α ∈ Z[i] is invertible, say αβ = 1 for some β ∈ Z[i]. We want to show α ∈ {±1, ±i}. Taking the norm of both sides of the equation αβ = 1, we find N(α) N(β) = 1. This is an equation in Z, so we know N(α) = ±1. Since the norm doesn’t take negative values, N(α) = 1. Writing α = a + bi, we have a² + b² = 1, and the integral solutions to this give us the four values α = ±1, ±i.

Invertible elements are called units. The units of Z are ±1. The units of Z[i] are ±1 and ±i. Knowing a Gaussian integer up to multiplication by a unit is analogous to knowing an integer up to its sign.

While there is no such thing as inequalities on Gaussian integers, we can talk about inequalities on their norms. In particular, induction on the norm (not on the Gaussian integer itself) is a technique to bear in mind if you want to prove something by induction in Z[i]. We will use induction on the norm to prove unique factorization (Theorems 6.4 and 6.6).

The norm of every Gaussian integer is a non-negative integer, but it is not true that every non-negative integer is a norm. Indeed, the norms are the integers of the form a² + b², and not every positive integer is a sum of two squares. Examples include 3, 7, 11, 15, 19, and 21. No Gaussian integer has norm equal to these values.

2. Divisibility

Divisibility in Z[i] is defined in the natural way: we say β divides α (and write β|α) if α = βγ for some γ ∈ Z[i]. In this case, we call β a divisor or a factor of α.

Example 2.1. Since 14 − 3i = (4 + 5i)(1 − 2i), 4 + 5i divides 14 − 3i.

Example 2.2.
Does (4 + 5i)|(14 + 3i)? We can do the division by taking a ratio and rationalizing the denominator:

(14 + 3i)/(4 + 5i) = (14 + 3i)(4 − 5i)/((4 + 5i)(4 − 5i)) = (71 − 58i)/41 = 71/41 − (58/41)i.

This is not in Z[i]: the real and imaginary parts are 71/41 and −58/41, which are not integers. Therefore 4 + 5i does not divide 14 + 3i in Z[i].

Theorem 2.3. A Gaussian integer α = a + bi is divisible by an ordinary integer c if and only if c|a and c|b in Z.

[...] |a − bq| ≤ (1/2)|b|. Write r = a − bq, so a = bq + r with |r| ≤ (1/2)|b|.

In the usual division theorem, the remainder is nonnegative and bounded above by |b|. We have shrunk the upper bound at the cost of possibly making the remainder negative. Sometimes a might land right in the middle between two multiples of b, in which case the quotient and remainder are not unique, e.g., if a = 27 and b = 6 then a is right in the middle between 4b and 5b:

27 = 6 · 4 + 3, 27 = 6 · 5 − 3.

Thus we get two choices of r, either 3 or −3. The usual division theorem in Z has a unique quotient and remainder, but the modified version gives up on uniqueness. This might seem like a calamity, but it’s exactly what we need to prove the division theorem in Z[i] (Theorem 3.1), which is what we turn to next. The proof is mostly a translation of the correct part of Example 3.2 into general algebraic terms. After the proof we will give further examples.

Proof. We have α, β ∈ Z[i] with β ≠ 0 and we want to construct γ, ρ ∈ Z[i] such that α = βγ + ρ where N(ρ) ≤ (1/2) N(β). Write

α/β = αβ̄/(ββ̄) = αβ̄/N(β) = (m + ni)/N(β),

where we set αβ̄ = m + ni. Divide m and n by N(β) using the modified division theorem in Z:

m = N(β)q₁ + r₁, n = N(β)q₂ + r₂,

where q₁ and q₂ are in Z and 0 ≤ |r₁|, |r₂| ≤ (1/2) N(β). Then

α/β = (N(β)q₁ + r₁ + (N(β)q₂ + r₂)i)/N(β) = q₁ + q₂i + (r₁ + r₂i)/N(β).

Set γ = q₁ + q₂i (this will be our desired quotient), so after a little algebra the above equation becomes

(3.2) α − βγ = (r₁ + r₂i)/β̄.
We will show N(α − βγ) ≤ (1/2) N(β), so using ρ = α − βγ will settle the division theorem. Take norms of both sides of (3.2) and use N(β̄) = N(β) to get

N(α − βγ) = (r₁² + r₂²)/N(β).

Feeding the estimates 0 ≤ |r₁|, |r₂| ≤ (1/2) N(β) into the right side,

N(α − βγ) ≤ ((1/4) N(β)² + (1/4) N(β)²)/N(β) = (1/2) N(β).

Example 3.3. Let α = 11 + 10i and β = 4 + i. Then N(β) = 17. We compute

α/β = αβ̄/N(β) = (54 + 29i)/17.

Since 54/17 = 3.17 . . . and 29/17 = 1.70 . . . , we use γ = 3 + 2i (why?). Then α − βγ = 1 − i, so we set ρ = 1 − i. Note N(ρ) = 2 ≤ (1/2) N(β).

Example 3.4. Let α = 41 + 24i and β = 11 − 2i. Then N(β) = 125 and

α/β = αβ̄/125 = (403 + 346i)/125.

Since 403/125 = 3.224 . . . and 346/125 = 2.768 . . . , we use γ = 3 + 3i (why?) and find α − βγ = 2 − 3i. Set ρ = 2 − 3i and compare N(ρ) with N(β).

There is one interesting difference between the division theorem in Z[i] and the (usual) division theorem in Z: the quotient and remainder are not unique in Z[i].

Example 3.5. Let α = 37 + 2i and β = 11 + 2i, so N(β) = 125. If you carry out the algorithm for division in Z[i] as it was developed above, you will be led to

α = β · 3 + (4 − 4i).

However, it is also true that

α = β(3 − i) + (2 + 7i).

The remainder in both cases has norm less than 125 (in fact, less than 125/2).

Example 3.6. The reader may not be impressed by the previous example, since only the first outcome would actually come out of our division algorithm in Z[i]. We now give an example where the division algorithm itself allows for two different outcomes. Let α = 1 + 8i and β = 2 − 4i. Then

α/β = αβ̄/N(β) = (−30 + 20i)/20 = −3/2 + i.

Since −3/2 lies right in the middle between −2 and −1, we can use γ = −1 + i or γ = −2 + i. Using the first choice, we obtain

α = β(−1 + i) − 1 + 2i.

Using the second choice,

α = β(−2 + i) + 1 − 2i.

That division in Z[i] lacks uniqueness in the quotient and remainder does not seriously limit the usefulness of division.
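The proof of the division theorem is constructive, so Examples 3.3 to 3.6 can be reproduced with a short sketch. This is an illustration under stated assumptions rather than the handout's own code: a + bi is stored as the pair (a, b), and the name divmod_gauss is hypothetical. When a coordinate of α/β lies exactly halfway between two integers, this sketch rounds up, which is just one of the allowed choices.

```python
def divmod_gauss(alpha, beta):
    """Return (gamma, rho) with alpha = beta*gamma + rho and N(rho) <= N(beta)/2.

    Gaussian integers a + bi are stored as pairs (a, b); beta must be non-zero.
    """
    a, b = alpha
    c, d = beta
    n = c * c + d * d                  # N(beta)
    if n == 0:
        raise ZeroDivisionError("division by zero in Z[i]")
    # alpha * conj(beta) = m + k*i, so alpha/beta = (m + k*i)/N(beta)
    m = a * c + b * d
    k = b * c - a * d
    # Nearest integers to m/n and k/n (exact halves round up)
    q1 = (2 * m + n) // (2 * n)
    q2 = (2 * k + n) // (2 * n)
    # rho = alpha - beta*gamma, where gamma = q1 + q2*i
    rho = (a - (c * q1 - d * q2), b - (c * q2 + d * q1))
    return (q1, q2), rho


print(divmod_gauss((11, 10), (4, 1)))    # ((3, 2), (1, -1))   Example 3.3
print(divmod_gauss((41, 24), (11, -2)))  # ((3, 3), (2, -3))   Example 3.4
print(divmod_gauss((1, 8), (2, -4)))     # ((-1, 1), (-1, 2))  Example 3.6, first choice
```

Working with integers only (via floor division) avoids floating-point rounding issues that could arise from computing the ratio α/β directly.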
Indeed, back in Z, the uniqueness of the quotient and remainder for the usual division theorem is irrelevant for many important applications (such as Euclid’s algorithm). All those applications will carry over to Z[i], with essentially the same proofs.

4. The Euclidean Algorithm

We begin by defining greatest common divisors in Z[i].

Definition 4.1. For non-zero α and β in Z[i], a greatest common divisor of α and β is a common divisor with maximal norm.

This is analogous to the usual definition of greatest common divisor in Z, except the concept is not pinned down as a specific number. If δ is a greatest common divisor of α and β, so are (at least) its unit multiples −δ, iδ, and −iδ. Perhaps there are other greatest common divisors; we just don’t know yet. (We will find out in Corollary 4.7.) We can speak about a greatest common divisor, but not the greatest common divisor. A similar technicality would occur in Z if we defined the greatest common divisor as a common divisor with largest absolute value, rather than the largest positive common divisor. There is no analogue of positivity in Z[i] (at least not in this course), so we are stuck with the concept of greatest common divisor always ambiguous at least by a unit multiple.

Definition 4.2. When α and β only have unit factors in common, we call them relatively prime.

Theorem 4.3 (Euclid’s algorithm). Let α, β ∈ Z[i] be non-zero. Recursively apply the division theorem, starting with this pair, and make the divisor and remainder in one equation the new dividend and divisor in the next, provided the remainder is not zero:

α = βγ₁ + ρ₁, N(ρ₁) < N(β)
β = ρ₁γ₂ + ρ₂, N(ρ₂) < N(ρ₁)
ρ₁ = ρ₂γ₃ + ρ₃, N(ρ₃) < N(ρ₂)
...

The last non-zero remainder is divisible by all common divisors of α and β, and is itself a common divisor, so it is a greatest common divisor of α and β.

Proof. The proof is identical to the usual proof that Euclid’s algorithm works in Z. We briefly summarize the argument.
Reasoning from the first equation down shows every common divisor of α and β divides the last non-zero remainder. Conversely, reasoning from the final equation up shows the last non-zero remainder (which is in the second-to-last equation) is a common divisor of α and β. Therefore this last non-zero remainder is a common divisor which is divisible by all the others. Thus it must have maximal norm among the common divisors, so it is a greatest common divisor.

Example 4.4. We compute a greatest common divisor of α = 32 + 9i and β = 4 + 11i. Details involved in carrying out the division theorem in each step of Euclid’s algorithm are omitted. The reader could work them out as more practice with the division theorem. We find

32 + 9i = (4 + 11i)(2 − 2i) + 2 − 5i,
4 + 11i = (2 − 5i)(−2 + i) + 3 − i,
2 − 5i = (3 − i)(1 − i) − i,
3 − i = (−i)(1 + 3i) + 0.

The last non-zero remainder is −i, so α and β only have unit factors in common. They are relatively prime.

Notice that, unlike in Z⁺, when two Gaussian integers are relatively prime we do not necessarily obtain 1 as the last non-zero remainder. Rather, we just obtain some unit as the last non-zero remainder.

Example 4.5. We show 4 + 5i and 4 − 5i, which are conjugates, are relatively prime in Z[i]:

4 + 5i = (4 − 5i)i − (1 − i),
4 − 5i = −(1 − i)(−4) − i,
−(1 − i) = (−i)(−1 − i) + 0.

The last non-zero remainder is a unit, so we are done.

[...]

Example 5.5. In Example 4.6, we saw −1 + 2i is a greatest common divisor of α = 11 + 3i and β = 1 + 8i. Reversing the steps of Euclid’s algorithm,

−1 + 2i = 1 + 8i − (2 − 4i)(−1 + i)
= 1 + 8i − (11 + 3i − (1 + 8i)(1 − i))(−1 + i)
= (11 + 3i)(1 − i) + (1 + 8i)(1 + (1 − i)(−1 + i))
= (11 + 3i)(1 − i) + (1 + 8i)(1 + 2i)
= α(1 − i) + β(1 + 2i).

Example 5.6. Let α = 10 + 91i and β = 7 + 3i. By Euclid’s algorithm,

α = β(6 + 11i) + 1 − 4i,
β = (1 − 4i)(2i) + (−1 + i),
1 − 4i = (−1 + i)(−3 + i) − 1,
−1 + i = −1(1 − i) + 0,

so the last non-zero remainder is −1. That tells us α and β are relatively prime.
Using back-substitution,

−1 = 1 − 4i − (−1 + i)(−3 + i)
= 1 − 4i − (β − (1 − 4i)(2i))(−3 + i)
= (1 − 4i)(1 + (2i)(−3 + i)) − β(−3 + i)
= (1 − 4i)(−1 − 6i) + β(3 − i)
= (α − β(6 + 11i))(−1 − 6i) + β(3 − i)
= α(−1 − 6i) + β(−(6 + 11i)(−1 − 6i) + 3 − i)
= α(−1 − 6i) + β(−57 + 46i).

We can negate to write 1 as a Z[i]-combination of α and β:

(5.2) 1 = α(1 + 6i) + β(57 − 46i).

While the previous example shows 10 + 91i and 7 + 3i do not have a common factor in Z[i], notice that their norms are

N(10 + 91i) = 8381 = 17² · 29, N(7 + 3i) = 58 = 2 · 29,

so the norms of 10 + 91i and 7 + 3i have a common factor in Z. We can understand how such phenomena (relatively prime Gaussian integers have non-relatively prime norms) happen by exhibiting the “prime factorizations” of 10 + 91i and 7 + 3i (without explaining how they are found, however):

(5.3) 10 + 91i = (1 + 4i)(4 − i)(5 + 2i), 7 + 3i = (1 + i)(5 − 2i).

Now we see why such examples are possible: the factors 5 + 2i and 5 − 2i have the same norm (namely 29) but they are relatively prime to each other.

All the usual consequences of Bezout’s theorem over Z have analogues over Z[i]. Here are some of them.

Corollary 5.7. Let α|βγ in Z[i] with α and β relatively prime. Then α|γ.

Proof. It’s just like the integer proof, but we write up the details anyway. Set βγ = ακ for some κ in Z[i]. Since α and β are relatively prime, we can solve the equation

1 = αx + βy

for some x, y ∈ Z[i]. Multiply both sides of the equation by γ:

γ = γαx + γβy = αγx + ακy = α(γx + κy).

Thus α|γ.

Corollary 5.8. If α|γ and β|γ in Z[i], with α and β relatively prime, then αβ|γ.

Proof. Left to the reader. It’s just like the integer case.

Corollary 5.9. For non-zero α, β, γ in Z[i], α and β are each relatively prime to γ if and only if αβ is relatively prime to γ.

Proof. Left to the reader. It’s just like the integer case.

We close out this section with an extension to Z[i] of several different characterizations of the greatest common divisor in Z.
The greatest common divisor of non-zero integers a and b can be described in several ways:

• the largest common divisor of a and b (definition)
• the positive common divisor which all other common divisors divide
• the smallest positive value of ax + by (x, y ∈ Z)
• the positive value of ax + by (x, y ∈ Z) which divides all other values of ax + by (x, y ∈ Z)

The corresponding characterizations of greatest common divisors of two non-zero Gaussian integers α and β are these:

• a common divisor of α and β with maximal norm (definition)
• a common divisor which all other common divisors divide
• a non-zero value of αx + βy (x, y ∈ Z[i]) with smallest norm
• a non-zero value of αx + βy (x, y ∈ Z[i]) which divides all other values of αx + βy (x, y ∈ Z[i])

Verifying the equivalence of all four conditions is left to the interested reader. It is completely analogous to the arguments used in the integer case. Notice the switch from “the” to “a” when we pass from Z to Z[i]: there are always four greatest common divisors, ambiguous up to multiplication by any of the four units.

6. Unique Factorization

We will define composite and prime Gaussian integers, and then prove unique factorization.

By Theorem 2.4, if β|α, then N(β)|N(α), so 1 ≤ N(β) ≤ N(α) when α ≠ 0. Which divisors of α have norm 1 or N(α)?

Lemma 6.1. For α ≠ 0, any divisor of α whose norm is 1 or N(α) is a unit or is a unit multiple of α.

Proof. If β|α and N(β) = 1, then β is ±1 or ±i. If β|α and N(β) = N(α), consider the complementary divisor γ, where α = βγ. Taking norms of both sides and cancelling the common value N(α), we see N(γ) = 1, so γ is ±1 or ±i. Therefore β has to be ±α or ±iα.

Lemma 6.1 is not saying the only Gaussian integers whose norm is N(α) are ±α and ±iα. For instance, 1 + 8i and 4 + 7i both have norm 65 and neither is a unit multiple of the other. What Lemma 6.1 is saying is that the only Gaussian integers which divide α and have norm equal to N(α) are ±α and ±iα.
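The distinction drawn after Lemma 6.1 can be seen concretely for α = 1 + 8i. The brute-force sketch below is illustrative only: the pair representation (a, b) for a + bi and the helper name divides are assumptions of this example. It lists every Gaussian integer of norm 65 and then keeps only those that divide α.

```python
def divides(beta, alpha):
    """True if beta divides alpha in Z[i]; a + bi is stored as (a, b), beta non-zero."""
    a, b = alpha
    c, d = beta
    n = c * c + d * d  # N(beta)
    # alpha/beta = alpha*conj(beta)/N(beta); both coordinates must be integers
    return (a * c + b * d) % n == 0 and (b * c - a * d) % n == 0


alpha = (1, 8)
target = 65  # N(1 + 8i)

# All Gaussian integers x + yi with x^2 + y^2 = 65
same_norm = [(x, y) for x in range(-8, 9) for y in range(-8, 9)
             if x * x + y * y == target]
# Those among them that actually divide alpha
divisors = [z for z in same_norm if divides(z, alpha)]

print(len(same_norm))  # 16
print(divisors)        # [(-8, 1), (-1, -8), (1, 8), (8, -1)]
```

Sixteen Gaussian integers have norm 65, but only the four unit multiples of 1 + 8i (namely ±(1 + 8i) and ±i(1 + 8i) = ±(−8 + i)) divide it, as Lemma 6.1 predicts.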
When N(α) > 1, there are always eight obvious factors of α: ±1, ±i, ±α, and ±iα. We call these the trivial factors of α. They are analogous to the four trivial factors ±1 and ±n of any integer n with |n| > 1. Any other factor of α is called non-trivial. By Lemma 6.1, the non-trivial factors of α are the factors with norm strictly between 1 and N(α).

Definition 6.2. Let α be a Gaussian integer with N(α) > 1. We call α composite if it has a non-trivial factor. If α only has trivial factors, we call α prime.

Writing α = βγ, the condition 1 < N(β) < N(α) is equivalent to: N(β) > 1 and N(γ) > 1. We refer to any such factorization of α, into a product of Gaussian integers with norm greater than 1, as a non-trivial factorization. Thus, a composite Gaussian integer is one which admits a non-trivial factorization.

For instance, a trivial factorization of 7 + i is i(1 − 7i). A non-trivial factorization of 7 + i is (1 − 2i)(1 + 3i). A non-trivial factorization of 5 is (1 + 2i)(1 − 2i). How interesting: 5 is prime in Z but it is composite in Z[i]. Even 2 is composite in Z[i]: 2 = (1 + i)(1 − i). However, 3 is prime in Z[i], so some primes in Z stay prime in Z[i] while others do not.

To show 3 is prime in Z[i], we argue by contradiction. Assume it is composite and let a non-trivial factorization be 3 = αβ. Taking the norm of both sides, 9 = N(α) N(β). Since the factorization is non-trivial, N(α) > 1 and N(β) > 1. Therefore N(α) = 3. Writing α = a + bi, we get a² + b² = 3. There are no integers a and b satisfying that equation, so we have a contradiction. Thus, 3 has only trivial factorizations in Z[i], so 3 is prime in Z[i]. (In Corollary 9.4, we will see any prime p in Z⁺ satisfying p ≡ 3 mod 4 is prime in Z[i].)

While we don’t really need to construct primes explicitly in Z[i] in order to prove unique factorization in Z[i], it is good to have some method of generating Gaussian primes, if only to get a feel for what they look like by comparison with prime numbers.
The following test for primality in Z[i], using the norm, provides a way to generate many Gaussian primes if we can recognize primes in Z.

Theorem 6.3. If the norm of a Gaussian integer is prime in Z, then the Gaussian integer is prime in Z[i].

For example, since N(4 + 5i) = 41, 4 + 5i is prime in Z[i]. Similarly, 4 − 5i is prime, as are 1 ± i, 1 ± 2i, 1 ± 4i, 2 ± 3i, and 15 ± 22i. Compute each of their norms and check the result is a prime number.

Proof. Let α ∈ Z[i] have prime norm, say p = N(α). We will show α only has trivial factors (that is, its factors have norm 1 or N(α) only), so α is prime in Z[i]. Consider any factorization of α in Z[i], say α = βγ. Taking norms, p = N(β) N(γ). This is an equation in positive integers, and p is prime in Z⁺, so either N(β) or N(γ) is 1. Therefore β or γ is a unit, so α does not admit non-trivial factors. Thus α is prime.

[...]

We will try to use the conclusion to tell us something about the hypothesis: use integer factorizations of the norm to suggest possible factors of the Gaussian integer.

For instance, take α = 3 + 4i. Its norm is 25 = 5 · 5. If 3 + 4i factors, a non-trivial factor has to have norm 5. We know the Gaussian integers with norm 5: up to unit multiple they are 1 + 2i and 1 − 2i. So we try the various possibilities:

(1 + 2i)(1 + 2i) = −3 + 4i, (1 + 2i)(1 − 2i) = 5, (1 − 2i)(1 − 2i) = −3 − 4i.

We recognize the last product as −α, so 3 + 4i = −(1 − 2i)(1 − 2i) = −(1 − 2i)². This is a prime factorization of 3 + 4i.

What about 2319 + 1694i? Its norm is 8247397, whose prime factorization in Z is

8247397 = 17 · 29 · 16729.

Let’s look for the Gaussian integers with norm 17, 29, and 16729, and then try multiplying them together to get 2319 + 1694i. Gaussian factors of 17, 29, and 16729 come from representations of each number as a sum of two squares:

17 = 1² + 4², 29 = 2² + 5², 16729 = 40² + 123².

(Admittedly, that last equation was not found by hand.)
These give us prime factorizations in Z[i]:

17 = (1 + 4i)(1 − 4i), 29 = (2 + 5i)(2 − 5i), 16729 = (40 + 123i)(40 − 123i).

(The Gaussian integers here are prime since their norms are prime in Z.) Let’s pick one factor from each product and multiply them together. Happily, the first choice gives us what we want:

(1 + 4i)(2 + 5i)(40 + 123i) = −2319 − 1694i.

Therefore the prime factorization of 2319 + 1694i is

2319 + 1694i = −(1 + 4i)(2 + 5i)(40 + 123i).

Except for the overall sign, each factor on the right is prime in Z[i] since its norm is prime in Z.

As an application of these ideas, try to discover the prime factorizations in (5.3) on your own.

7. Modular arithmetic in Z[i]

As in the integers, congruences are defined using divisibility.

Definition 7.1. For Gaussian integers α, β, and γ, we write α ≡ β mod γ when γ|(α − β).

Example 7.2. To check 1 + 12i ≡ 2 − i mod 3 + i, we subtract and divide:

((1 + 12i) − (2 − i))/(3 + i) = (−1 + 13i)/(3 + i) = 1 + 4i.

The ratio is in Z[i], so the congruence holds.

Congruences in Z[i] behave well under both addition and multiplication:

α ≡ α′ mod γ, β ≡ β′ mod γ ⟹ α + β ≡ α′ + β′ mod γ, αβ ≡ α′β′ mod γ.

The details behind this are just like in Z and are left to the reader to check.

Since congruence modulo 0 means equality, we usually assume the modulus is non-zero. A Gaussian integer can be reduced modulo α, if α ≠ 0, to get a congruent Gaussian integer with small norm by dividing by α and using the remainder.

Example 7.3. Let’s compute (3 + 2i)² mod 4 + i. Since (3 + 2i)² = 5 + 12i and

5 + 12i = (4 + i)(2 + 3i) − 2i,

we have (3 + 2i)² ≡ −2i mod 4 + i.

Example 7.4. To reduce 1 + 8i mod 2 − 4i, we divide. This was already done in Example 3.6, where we found more than one possibility:

1 + 8i = (2 − 4i)(−1 + i) − 1 + 2i, 1 + 8i = (2 − 4i)(−2 + i) + 1 − 2i.

Therefore 1 + 8i ≡ −1 + 2i mod 2 − 4i and 1 + 8i ≡ 1 − 2i mod 2 − 4i. There is no reason to think −1 + 2i or 1 − 2i is the more correct reduction. Both work.
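Definition 7.1 translates directly into a divisibility test, so the congruences in Examples 7.2 to 7.4 can be checked mechanically. The following is a minimal sketch under the assumption that a + bi is stored as the pair (a, b); the helper names divides and congruent are hypothetical.

```python
def divides(gamma, delta):
    """True if gamma divides delta in Z[i]; a + bi is stored as (a, b), gamma non-zero."""
    a, b = delta
    c, d = gamma
    n = c * c + d * d  # N(gamma)
    # delta/gamma = delta*conj(gamma)/N(gamma); both coordinates must be integers
    return (a * c + b * d) % n == 0 and (b * c - a * d) % n == 0


def congruent(alpha, beta, gamma):
    """True if alpha ≡ beta mod gamma, i.e. gamma divides alpha - beta."""
    diff = (alpha[0] - beta[0], alpha[1] - beta[1])
    return divides(gamma, diff)


print(congruent((1, 12), (2, -1), (3, 1)))  # True   (Example 7.2)
print(congruent((5, 12), (0, -2), (4, 1)))  # True   (Example 7.3)
print(congruent((1, 8), (-1, 2), (2, -4)))  # True   (Example 7.4)
print(congruent((1, 8), (1, -2), (2, -4)))  # True   (Example 7.4)
print(congruent((1, 0), (0, 0), (3, 1)))    # False: 3 + i does not divide 1
```

Both reductions from Example 7.4 pass the test, which matches the remark that neither one is the more correct choice.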
There is a way to picture what modular arithmetic in Z[i] means, by plotting the multiples of a Gaussian integer in Z[i]. For example, let’s look at the Z[i]-multiples of 1 + 2i. Algebraically, a general Z[i]-multiple of 1 + 2i is

(1 + 2i)(m + ni) = (1 + 2i)m + (1 + 2i)ni = m(1 + 2i) + n(−2 + i),

where m and n are in Z. This is an integral combination of 1 + 2i and −2 + i = (1 + 2i)i. In Figure 1 we plot 1 + 2i and −2 + i as the vectors (1, 2) and (−2, 1) in R².

Figure 1. 1 + 2i and −2 + i

The Z[i]-multiples of 1 + 2i are the integral combinations of the two vectors in Figure 1. Forming all these combinations produces the picture in Figure 2, where the plane is tiled by squares having the Gaussian multiples of 1 + 2i as the vertices.

The significance of Figure 2 for modular arithmetic is that Gaussian integers are congruent modulo 1 + 2i precisely when they are located in the same relative positions within different squares of Figure 2. For example, 2 + 3i and 4 − 3i are in the same relative position within their squares, and their difference is a Gaussian multiple of 1 + 2i:

((2 + 3i) − (4 − 3i))/(1 + 2i) = (−2 + 6i)/(1 + 2i) = (−2 + 6i)(1 − 2i)/((1 + 2i)(1 − 2i)) = (10 + 10i)/5 = 2 + 2i ∈ Z[i].

Why are congruent Gaussian integers mod 1 + 2i in the same position within their respective squares? Because each square shares its sides with four other squares, and moving to these squares corresponds to adding 1 + 2i, −(1 + 2i), −2 + i, or −(−2 + i). Moving from a position in any square to the same relative position in any other square is translation by a Gaussian multiple of 1 + 2i.

Figure 2. Z[i]-multiples of 1 + 2i

We can use Figure 2 to make a list of representatives for Z[i]/(1 + 2i): use the Gaussian integers inside a square and one of its vertices. (All the vertices are Z[i]-multiples of 1 + 2i, so we should use only one of them.) Choosing the square with edges 1 + 2i and −2 + i, we get a list of 5 Gaussian integers: 0, i, 2i, −1 + i, −1 + 2i.
Every Gaussian integer is congruent modulo 1 + 2i to exactly one of these. For instance, 2 + 3i ≡ −1 + 2i mod 1 + 2i since 2 + 3i and −1 + 2i are in the same relative position in their respective squares. Using instead the square with edges 1 + 2i and 2 − i, we get the list 0, 1, 2, 1 + i, 2 + i, and with this list we have 2 + 3i ≡ 1 + i mod 1 + 2i.

There is nothing special about using the vertex 0 in our lists: we could use any of the other vertices of the square in place of 0 for our list of representatives modulo 1 + 2i. In fact, there is nothing special about using points inside or on a single square. We just need to use a set of points which fills out each relative position within all these squares. For instance, the numbers 0, 1, 2, 3, 4 could be used, and with this list we have 2 + 3i ≡ 3 mod 1 + 2i.

Let’s look at the picture for modulus 2 + 2i. In Figure 3 we plot the Z[i]-multiples of 2 + 2i as vertices of squares. Since

(2 + 2i)(m + ni) = (2 + 2i)m + (2 + 2i)ni = m(2 + 2i) + n(−2 + 2i),

[...]

Theorem 7.6. For α and β in Z[i] with β ≠ 0, αx ≡ 1 mod β is solvable if and only if α and β are relatively prime in Z[i]. If α and β are relatively prime then any linear congruence αx ≡ γ mod β has a unique solution.

Proof. To solve αx ≡ 1 mod β with x ∈ Z[i] amounts to solving αx + βy = 1 with x and y in Z[i], which is equivalent to relative primality of α and β by Corollary 5.2. Once we can invert α mod β, we can solve αx ≡ γ mod β by multiplying both sides by the inverse of α mod β. If there is going to be a solution this must be it, and it does work.

Example 7.7. Can we solve (1 + 8i)x ≡ 1 mod 11 + 3i? No, since 1 + 8i and 11 + 3i have a common factor of −1 + 2i by Example 4.6.

Example 7.8. Can we solve (7 + 3i)x ≡ 1 mod 10 + 91i? According to Example 5.6, 7 + 3i and 10 + 91i are relatively prime (although their norms are not), so there is a solution.
Moreover, by using Euclid’s algorithm and back-substitution we found in (5.2) that

(7 + 3i)(57 − 46i) + (10 + 91i)(1 + 6i) = 1,

so a solution is x = 57 − 46i. (The norm of 57 − 46i is less than the norm of the modulus 10 + 91i, so there is no great incentive to reduce our solution further mod 10 + 91i.)

Corollary 7.9. Let π be a Gaussian prime. Every α ≢ 0 mod π has a multiplicative inverse modulo π, and any polynomial congruence

c_n xⁿ + c_{n−1} x^(n−1) + · · · + c₁x + c₀ ≡ 0 mod π,

where c_i ∈ Z[i] and c_n ≢ 0 mod π, has at most n solutions modulo π.

Proof. Since π is prime, any α ≢ 0 mod π is relatively prime to π and therefore α mod π has a multiplicative inverse by Theorem 7.6. Thus Z[i]/(π) is a field, so this corollary is a special case of the fact that polynomials have no more roots in a field than their degree.

When we allow Gaussian integers into our congruences, does it change the meaning of congruences among ordinary integers? That is, if a, b, and c are in Z, does the meaning of a ≡ b mod c change when we think in Z[i]? That is, could integers which are incongruent modulo c in Z become congruent modulo c in Z[i]? No.

Theorem 7.10. For a, b, and c in Z, a ≡ b mod c in Z if and only if a ≡ b mod c in Z[i].

Proof. In terms of divisibility, this is saying

c|(a − b) in Z ⟺ c|(a − b) in Z[i],

which is something we already checked in the paragraph after the proof of Theorem 2.3: divisibility between ordinary integers holds in Z if and only if it holds in Z[i].

So far modular arithmetic in Z[i] behaves just like in Z. But things now will get tricky, so pay attention! One of the useful properties of modular arithmetic in Z is Fermat’s little theorem. For a prime p in Z⁺, if a ≢ 0 mod p then a^(p−1) ≡ 1 mod p. Naively translating this result into the Gaussian integers, using a Gaussian prime π, we get something like this: if α ≢ 0 mod π then α^(π−1) ≡ 1 mod π. ???? If π is not a positive integer, then raising to the power π − 1 doesn’t mean anything in a congruence.
(Well, if you have had complex analysis you may know a way to do this, but then you would also know the result is almost certainly not going to be in Z[i], so it’s the wrong idea for us.) Moreover, even when π is a positive integer that is prime in Z[i] the congruence α^(π−1) ≡ 1 mod π is usually wrong.

Example 7.11. Let π = 3, which is prime in Z[i]. Take α = i. Then α^(π−1) = i² = −1, but −1 ≢ 1 mod 3, so α^(π−1) ≢ 1 mod π.

Despite this setback, there is a good Gaussian integer version of Fermat’s little theorem. The way to find it is to go back to the proof of Fermat’s little theorem and remind ourselves how a^(p−1) actually showed up in the proof. It came from comparing two different sets of representatives for the non-zero integers modulo p: 1, 2, . . . , p − 1 and a, 2a, . . . , (p − 1)a. The two products of all the numbers in both cases have to be congruent modulo p, and cancelling common terms on both sides of the congruence (essentially a factor of (p − 1)!) leaves behind 1 ≡ a^(p−1) mod p. So the source of a^(p−1) comes from the fact that there are p − 1 non-zero numbers modulo p. It is the size of the set of non-zero numbers modulo p which gave us the exponent in Fermat’s little theorem. There are p numbers in total modulo p, and we take away 1 because we don’t count 0. With this insight, we get almost for free a Z[i]-analogue.

Theorem 7.12. Let π be a Gaussian prime and denote the number of Gaussian integers modulo π by n(π). If α ≢ 0 mod π, then α^(n(π)−1) ≡ 1 mod π.

Proof. There is no natural complete set of representatives for Z[i]/(π), but we can use any complete set of representatives at all. Denote it β₁, β₂, . . . , β_{n(π)}, where we take β_{n(π)} = 0. Since α is invertible modulo π, another complete set of representatives for Z[i]/(π) is αβ₁, αβ₂, . . . , αβ_{n(π)}. The last term here is 0.
Multiplying congruent numbers retains the congruence, so let’s multiply each set of non-zero representatives together and compare:

β₁β₂ · · · β_{n(π)−1} ≡ (αβ₁)(αβ₂) · · · (αβ_{n(π)−1}) mod π
≡ α^(n(π)−1) β₁β₂ · · · β_{n(π)−1} mod π.

Since the β_i’s here are non-zero modulo π (why?), we can cancel them on both sides and we are left with 1 ≡ α^(n(π)−1) mod π.

As soon as we try to test this result in an example, we run into a problem. We defined n(π) to be the size of Z[i]/(π) but we never gave a working formula for this size. For instance, what is n(3)? Or, to jazz things up, n(3 + 4i)?

Example 7.13. Let’s show there are 9 elements in Z[i]/3, so n(3) = 9. A Gaussian integer is divisible by 3 exactly when its real and imaginary parts are divisible by 3 (Theorem 2.3). Therefore

a + bi ≡ c + di mod 3 ⟺ a ≡ c mod 3 and b ≡ d mod 3.

The real and imaginary parts have 3 possibilities modulo 3, so there is a total of 3 · 3 = 9 incongruent Gaussian integers modulo 3. We can even write down a nice set of representatives: a + bi where 0 ≤ a, b ≤ 2.

Since n(3) = 9, Theorem 7.12 says that if α ≢ 0 mod 3 then α⁸ ≡ 1 mod 3. This works at α = i (unlike what we saw in Example 7.11). Using α = 1 + i shows the exponent 8 is optimal: (1 + i)^k ≢ 1 mod 3 for 1 ≤ k < 8.

To make Theorem 7.12 really meaningful, we want a formula for n(π) in general. In fact, there is a nice formula for n(α) = #(Z[i]/(α)) even when α is not prime.

Theorem 7.14. If α ≠ 0 in Z[i], then n(α) = N(α). That is, the size of Z[i]/(α) is N(α).

There is an analogy with the absolute value on Z: #(Z/m) = |m| when m ≠ 0 and now #(Z[i]/(α)) = N(α) when α ≠ 0. Our earlier lists of representatives for Z[i]/(1 + 2i), Z[i]/(2 + 2i), Z[i]/(3), and Z[i]/(3 + i) are all consistent with this norm formula.

Perhaps we should point out why n(α) is finite (when α ≠ 0) before we prove the formula for it. Using division by α, every Gaussian integer is congruent modulo α to some Gaussian integer with norm less than N(α).
There are only finitely many Gaussian integers with norm below a given bound, so $n(\alpha)$ is finite.¹

Before we prove Theorem 7.14 we establish a few lemmas about the $n$-function.

Lemma 7.15. If $m \ne 0$ in $\mathbf{Z}$ then $n(m) = m^2$.

Proof. The argument is the same as the case $m = 3$ done in Example 7.13.

Lemma 7.16. If $\alpha \ne 0$ in $\mathbf{Z}[i]$ then $n(\overline{\alpha}) = n(\alpha)$.

Proof. Congruences modulo $\alpha$ and congruences modulo $\overline{\alpha}$ can be converted into one another by conjugating all terms:

$$x \equiv y \bmod \alpha \iff \overline{x} \equiv \overline{y} \bmod \overline{\alpha}.$$

Therefore a complete set of representatives modulo $\alpha$ becomes a complete set of representatives modulo $\overline{\alpha}$ by conjugating the representatives, so $n(\alpha) = n(\overline{\alpha})$.

The next lemma needs a bit more work.

Lemma 7.17. The function $n$ is multiplicative: if $\alpha$ and $\beta$ are non-zero in $\mathbf{Z}[i]$, then $n(\alpha\beta) = n(\alpha)n(\beta)$.

Proof. Let a complete set of representatives for $\mathbf{Z}[i]/(\alpha)$ be $x_1, x_2, \dots, x_r$ and a complete set of representatives for $\mathbf{Z}[i]/(\beta)$ be $y_1, y_2, \dots, y_s$. (That is, $r = n(\alpha)$ and $s = n(\beta)$.) Given any $z \in \mathbf{Z}[i]$, we have $z \equiv x_i \bmod \alpha$ for some $i$. Then $z - x_i = \alpha t$ for some Gaussian integer $t$, and $t \equiv y_j \bmod \beta$ for some $j$. Writing $t = y_j + \beta w$, we have

$$z = x_i + \alpha y_j + \alpha\beta w \equiv x_i + \alpha y_j \bmod \alpha\beta.$$

Thus the $rs$ numbers $x_i + \alpha y_j$ represent every class in $\mathbf{Z}[i]/(\alpha\beta)$. To show there are no repetitions among them (so they form a complete set of representatives with exactly $rs$ elements), suppose

$$x_i + \alpha y_j \equiv x_{i'} + \alpha y_{j'} \bmod \alpha\beta. \tag{7.1}$$

We want to show $i = i'$ and $j = j'$. Reducing both sides of (7.1) modulo $\alpha$, $x_i \equiv x_{i'} \bmod \alpha$. Since the $x$’s are a complete set of representatives modulo $\alpha$, this congruence must be equality: $x_i = x_{i'}$ (that is, $i = i'$). Then subtract the common $x_i$ on both sides of (7.1) and divide through the congruence (including the modulus!) by $\alpha$. We are left with $y_j \equiv y_{j'} \bmod \beta$. Since the $y$’s are a complete set of representatives modulo $\beta$, we have $j = j'$.

We are ready to prove Theorem 7.14. All the real work has been put into the lemmas, so the proof now will be a short and slick argument.

Proof. By Lemma 7.17, $n(\alpha\overline{\alpha}) = n(\alpha)n(\overline{\alpha})$. By Lemma 7.16, the right side is $n(\alpha)^2$.
At the same time, since $\alpha\overline{\alpha} = \mathrm{N}(\alpha)$ is an integer, Lemma 7.15 says $n(\alpha\overline{\alpha}) = \mathrm{N}(\alpha)^2$. Thus $\mathrm{N}(\alpha)^2 = n(\alpha)^2$. Take positive square roots.

¹ This shows Gaussian integers with norm less than $\mathrm{N}(\alpha)$ fill up all congruence classes modulo $\alpha$, but there could be different remainders which are congruent, unlike in $\mathbf{Z}$, so $n(\alpha)$ can be smaller than the number of these remainders.

If $u = 1$, then $c = a$ and $d = b$. If $u = -1$, then $c = -a$ and $d = -b$. If $u = i$, then $c = b$ and $d = -a$. If $u = -i$, then $c = -b$ and $d = a$. Thus $c$ and $d$ equal $a$ and $b$ up to order and sign.

Theorem 8.1 is not saying that any integer which is a sum of two squares has only one representation in that form. It is only referring to primes which are sums of two squares. Two non-primes which are a sum of two squares in more than one way are $50 = 5^2 + 5^2 = 1^2 + 7^2$ and $65 = 1^2 + 8^2 = 4^2 + 7^2$. (We will find more examples at the end of this section.) Some primes which can be written as sums of two squares (necessarily uniquely) are

$$2 = 1^2 + 1^2, \quad 5 = 1^2 + 2^2, \quad 13 = 2^2 + 3^2, \quad 17 = 1^2 + 4^2, \quad 29 = 2^2 + 5^2.$$

Example 8.2. The fifth Fermat number $2^{2^5} + 1 = 4294967297$ is easily a sum of two squares: $2^{2^5} + 1 = (2^{16})^2 + 1^2$. Euler found it can be written as a sum of two squares in a different way:

$$(2^{16})^2 + 1^2 = 62264^2 + 20449^2.$$

This actually has an interesting consequence. Fermat guessed that the fifth Fermat number was a prime, but the fact that it can be written as a sum of two squares in two different ways proves it is not prime without telling us what a nontrivial factor might be! (Euler did find a nontrivial factor, 641: $2^{2^5} + 1 = 641 \cdot 6700417$.)

Our next application of $\mathbf{Z}[i]$ to ordinary arithmetic is the classification of Pythagorean triples, which are integral solutions to the equation $a^2 + b^2 = c^2$. If any two of $a$, $b$, and $c$ have a common prime factor, it is also a factor of the third number (why?), so its square divides all three terms of the equation and can be cancelled.
Conversely, multiplying both sides by a square rescales $a$, $b$, and $c$ by the same amount. Therefore, we focus our attention on the Pythagorean triples $(a, b, c)$ whose entries share no common factor greater than 1 (equivalently, $a$ and $b$ alone share no common factor). Such triples are called primitive. Examples of primitive Pythagorean triples include $(3, 4, 5)$, $(5, 12, 13)$, and $(8, 15, 17)$, but not $(6, 8, 10)$.

We will use unique factorization in $\mathbf{Z}[i]$ to obtain a formula for the primitive Pythagorean triples. Before we give the formula, let’s make a few observations about primitive triples $(a, b, c)$. Since there is no common factor among the three numbers, at most one of them can be even (why?). Could $c$ be even? If so, then $a$ and $b$ are odd, so $a^2 \equiv 1 \bmod 8$ and $b^2 \equiv 1 \bmod 8$. Then $c^2 = a^2 + b^2 \equiv 2 \bmod 8$. But no number squares to 2 mod 8. Therefore $c$ is odd. Since $a^2 + b^2$ is now known to be odd, $a$ and $b$ do not have the same parity. That is, one of them is odd and the other is even. Relabelling if necessary, we may assume that $a$ is odd and $b$ is even. With these preliminary observations out of the way, we are ready for the main result.

Theorem 8.3. Every primitive Pythagorean triple $(a, b, c)$ with $a$ odd has the form

$$a = m^2 - n^2, \quad b = 2mn, \quad c = m^2 + n^2,$$

where $m > n > 0$, $(m, n) = 1$, and $m \not\equiv n \bmod 2$. Conversely, for any such choice of $m$ and $n$, the above formulas give a primitive Pythagorean triple. Different choices of $m$ and $n$ give different primitive triples.

We get $(3, 4, 5)$ from $(m, n) = (2, 1)$, $(5, 12, 13)$ from $(m, n) = (3, 2)$, and $(15, 8, 17)$ from $(m, n) = (4, 1)$.

Proof. We write the equation $a^2 + b^2 = c^2$ in the form

$$(a + bi)(a - bi) = c \cdot c. \tag{8.1}$$

Our proof will have three steps:
• use the primitivity of the triple to show $a + bi$ and $a - bi$ are relatively prime in $\mathbf{Z}[i]$,
• use unique factorization in $\mathbf{Z}[i]$ to show $a + bi$ is a square or $i$ times a square in $\mathbf{Z}[i]$,
• use the evenness of $b$ to show $a + bi$ is a square in $\mathbf{Z}[i]$, and then read off the consequences.
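Before working through these steps, the statement of Theorem 8.3 can be sanity-checked numerically. The following Python sketch is only an illustration: the function name primitive_triples and the bound max_m are choices made here, not notation from the notes. It lists the triples produced by the formulas for all admissible pairs (m, n) with m at most max_m.

```python
from math import gcd

def primitive_triples(max_m):
    """Yield (m^2 - n^2, 2mn, m^2 + n^2) for the pairs (m, n) allowed in Theorem 8.3.

    The name and the bound max_m are illustrative assumptions.
    """
    for m in range(2, max_m + 1):
        for n in range(1, m):          # enforces m > n > 0
            # Conditions of the theorem: (m, n) = 1 and opposite parity.
            if gcd(m, n) == 1 and (m - n) % 2 == 1:
                yield (m * m - n * n, 2 * m * n, m * m + n * n)

triples = list(primitive_triples(4))
for a, b, c in triples:
    # Each line shows the triple, whether a^2 + b^2 = c^2, and gcd(a, b, c).
    print((a, b, c), a * a + b * b == c * c, gcd(gcd(a, b), c))
```

With max_m = 4 the admissible pairs are (2, 1), (3, 2), (4, 1), and (4, 3), giving (3, 4, 5), (5, 12, 13), (15, 8, 17), and (7, 24, 25). A finite check like this is evidence for the statement, not a substitute for the proof.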
First we show $a + bi$ and $a - bi$ are relatively prime. This is going to follow from $(a, b) = 1$ and $c$ being odd. Let $\delta$ be a common divisor of $a + bi$ and $a - bi$ in $\mathbf{Z}[i]$. It divides their sum and their difference:

$$\delta \mid 2a, \quad \delta \mid 2b. \tag{8.2}$$

(Strictly, $\delta$ dividing the difference means $\delta \mid 2bi$, but $i$ is a unit so we can remove it.) Now we show $\delta$ is relatively prime to 2 in $\mathbf{Z}[i]$. Since $2 = -i(1+i)^2$ and $1+i$ is prime, this is equivalent to showing $\delta$ is not divisible by $1+i$. By Corollary 2.5, $(1+i) \mid \delta$ if and only if $\mathrm{N}(\delta)$ is even. Because $\delta$ divides both factors on the left side of (8.1), $\delta^2 \mid c^2$, and taking norms gives $\mathrm{N}(\delta)^2 \mid c^4$. Since $c^4$ is odd, $\mathrm{N}(\delta)$ is odd. That tells us $1+i$ does not divide $\delta$.

Now that we know $\delta$ is relatively prime to 2 in $\mathbf{Z}[i]$, (8.2) simplifies to

$$\delta \mid a, \quad \delta \mid b.$$

Because $a$ and $b$ are relatively prime in $\mathbf{Z}$, they are also relatively prime in $\mathbf{Z}[i]$ (just solve $ax + by = 1$ in $\mathbf{Z}$ and then view the equation in $\mathbf{Z}[i]$). Thus, their only common divisors in $\mathbf{Z}[i]$ are units, so at last we see $\delta$ is a unit.

In (8.1), we have a product of relatively prime Gaussian integers on the left and a perfect square on the right. If you think about it, the only way two relatively prime Gaussian integers can multiply to a square is if they are each squares. After all, think about how their prime factors can combine to give a square, given that they are relatively prime and that $\mathbf{Z}[i]$ has unique factorization. Thus, from (8.1) we must have $a + bi = (m + ni)^2$ for some Gaussian integer $m + ni$.

Alas, this reasoning is wrong! Two relatively prime Gaussian integers can multiply to a square without either factor being a square. In fact, this can already happen in $\mathbf{Z}$: $36 = (-4)(-9)$. Neither $-4$ nor $-9$ is a square in $\mathbf{Z}$, but their product is and they are relatively prime. Ah, the only sneaky thing here is the units. Remember, unique factorization always has an ambiguity due to units. (We tend to forget this in $\mathbf{Z}$ since we focus on factoring positive integers into positive factors, and the only positive unit is 1.) We can’t forget about units!
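The same phenomenon occurs in Z[i] itself. As an illustration that is not taken from the notes, consider 4 + 3i and 4 - 3i: they are relatively prime and their product is 25 = 5^2, yet 4 + 3i is not a square, since every square (m + ni)^2 has even imaginary part 2mn. The brute-force Python sketch below stores Gaussian integers as pairs of ints and searches for ways to write 4 + 3i as a unit times a square. The helper names gmul and unit_times_square and the search bound are assumptions made for this example.

```python
def gmul(x, y):
    """Multiply Gaussian integers stored as (real, imaginary) pairs of ints."""
    (a, b), (c, d) = x, y
    return (a * c - b * d, a * d + b * c)

UNITS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # 1, -1, i, -i

def unit_times_square(z, bound=10):
    """Return every (unit, root) with unit * root^2 == z among roots with small coordinates."""
    found = []
    for m in range(-bound, bound + 1):
        for n in range(-bound, bound + 1):
            square = gmul((m, n), (m, n))
            for u in UNITS:
                if gmul(u, square) == z:
                    found.append((u, (m, n)))
    return found

z = (4, 3)
print(gmul(z, (4, -3)))      # (25, 0): the product is the square 5^2
print(unit_times_square(z))  # every hit uses the unit i or -i, never 1 or -1
```

For instance, the search finds 4 + 3i = i(2 - i)^2, so the unit i genuinely cannot be dropped here.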
Very well, we keep in mind the units in $\mathbf{Z}[i]$ when looking at (8.1). Since the two factors on the left are relatively prime and their product is a square, unique factorization in $\mathbf{Z}[i]$ tells us each factor is itself a square up to unit multiple. The units in $\mathbf{Z}[i]$ are $\pm 1$ and $\pm i$. Since $-1 = i^2$ is a perfect square, it can be absorbed into any square factor. Therefore, we can say

$$a + bi = (m + ni)^2 \quad \text{or} \quad a + bi = i(m + ni)^2$$

for some $m + ni \in \mathbf{Z}[i]$. Expanding these out and collecting real and imaginary parts, we have

$$a + bi = (m^2 - n^2) + (2mn)i \quad \text{or} \quad a + bi = (-2mn) + (m^2 - n^2)i.$$

Now we appeal to our convention that $a$ is odd (and $b$ is even). The second choice makes $a = -2mn$ even, so it is not correct. We thus must have

$$a + bi = (m + ni)^2, \tag{8.3}$$

so $a + bi$ is a perfect square after all. (The point is that we have now argued this correctly, rather than incorrectly as before.) The derivation of (8.3) from unique factorization in $\mathbf{Z}[i]$ is really the key step in this proof. The remainder of the proof will be just a matter of careful bookkeeping.

Identifying real and imaginary parts in (8.3) gives us

$$a = m^2 - n^2, \quad b = 2mn.$$

Therefore

$$c^2 = a^2 + b^2 = (m^2 - n^2)^2 + 4m^2n^2 = m^4 + 2m^2n^2 + n^4 = (m^2 + n^2)^2.$$

Since $c > 0$ we see that $c = m^2 + n^2$. Since $b > 0$, the formula for $b$ shows $m$ and $n$ have the same sign: they are both positive or both negative. We can negate them both if necessary to assume $m$ and $n$ are positive without changing the values of $a$, $b$, or $c$. Since $a > 0$ we have $m > n$. Because $a$ is odd, $m$ and $n$ have different parities. If $m$ and $n$ have a common factor, then we get a common factor in $a$, $b$, and $c$. Therefore primitivity of the triple $(a, b, c)$ makes $m$ and $n$ relatively prime.

Now we show any triple $(m^2 - n^2, 2mn, m^2 + n^2)$ with $m$ and $n$ positive, relatively prime, of opposite parity, and $m > n$, is a primitive Pythagorean triple. It is easily checked to be a Pythagorean triple. Suppose it is not primitive. Then some prime $p$ divides each of $m^2 - n^2$, $2mn$, and $m^2 + n^2$.
Since the first term is odd, $p \ne 2$. Then from $p \mid 2mn$ we have either $p \mid m$ or $p \mid n$. If $p \mid m$, then the relation $m^2 \equiv n^2 \bmod p$ shows $n^2 \equiv 0 \bmod p$, so $p \mid n$. We were supposing $(m, n) = 1$, so we have a contradiction. If instead $p \mid n$ then we get a contradiction in the same way (just interchange the roles of $m$ and $n$). Either way, the triple is primitive.

As for different choices of $m$ and $n$ giving different triples, (8.3) tells us that the parameters $m$ and $n$ which describe the triple $(a, b, c)$ are the coordinates of a square root of $a + bi$. As there are only two square roots, which just differ by a sign, the uniqueness falls out (since we take $m > 0$ and $n > 0$).

This proof tells us how to produce Pythagorean triples on demand: take any Gaussian integer $\alpha$ (with non-zero real and imaginary parts of different absolute value) and square it, say $\alpha^2 = a + bi$. Then $(|a|, |b|, \mathrm{N}(\alpha))$ is a Pythagorean triple. For example, $(17 + 12i)^2 = 145 + 408i$ and $17^2 + 12^2 = 433$. Therefore $(145, 408, 433)$ is a Pythagorean triple (check it!). Moreover, since 17 and 12 are relatively prime and of opposite parity, this triple is primitive.

The next application uses $\mathbf{Z}[i]$ to show a perfect square in $\mathbf{Z}$ never comes right before a perfect cube, except for the pair 0 and 1.

Theorem 8.4. The only $x, y \in \mathbf{Z}$ satisfying $y^2 = x^3 - 1$ is $(x, y) = (1, 0)$.

Although the cubes are spread out more thinly than the squares in $\mathbf{Z}$, it is not obvious why they couldn’t come within one of each other many times.

As noted already, this lemma tells us the prime factors in $\mathbf{Z}[i]$ of the primes in $\mathbf{Z}^+$ will give us all Gaussian primes. Here are Gaussian prime factorizations of the first three prime numbers:

$$2 = (1 + i)(1 - i), \quad 3 = 3, \quad 5 = (1 + 2i)(1 - 2i).$$

For instance, by unique factorization, any Gaussian prime factor of 5 is a unit multiple of $1 + 2i$ or $1 - 2i$, which gives one of the following numbers:

$$1 + 2i, \quad -1 - 2i, \quad -2 + i, \quad 2 - i, \quad 1 - 2i, \quad -1 + 2i, \quad -2 - i, \quad 2 + i.$$

Up to unit multiple, these eight numbers are really just two numbers: $1 + 2i$ and $1 - 2i$.

Theorem 9.2.
A prime $p$ in $\mathbf{Z}^+$ is composite in $\mathbf{Z}[i]$ if and only if it is a sum of two squares.

Thus, any prime $p$ in $\mathbf{Z}^+$ which is not a sum of two squares is not composite in $\mathbf{Z}[i]$, so it stays prime in $\mathbf{Z}[i]$. Examples include 3, 7, 11, and 19.

Proof. If the prime $p$ in $\mathbf{Z}^+$ is composite in $\mathbf{Z}[i]$, let a non-trivial factorization be $p = \alpha\beta$. Then, taking norms, $p^2 = \mathrm{N}(\alpha)\mathrm{N}(\beta)$. Since the factorization of $p$ was non-trivial, neither norm is 1, and since $p > 0$ is prime we must have $\mathrm{N}(\alpha) = p$. Then, writing $\alpha = a + bi$, the norm equation tells us $p = a^2 + b^2$.

Conversely, suppose a prime $p$ in $\mathbf{Z}^+$ is a sum of two squares, say $p = a^2 + b^2$. Then in $\mathbf{Z}[i]$ we get the non-trivial factorization $p = (a + bi)(a - bi)$, so $p$ is composite in $\mathbf{Z}[i]$.

The first primes in $\mathbf{Z}^+$ which are sums of two squares are 2, 5, 13, 17, and 29:

$$2 = 1^2 + 1^2, \quad 5 = 1^2 + 2^2, \quad 13 = 2^2 + 3^2, \quad 17 = 1^2 + 4^2, \quad 29 = 2^2 + 5^2.$$

Therefore each of these prime numbers is composite in $\mathbf{Z}[i]$, e.g. $29 = (2 + 5i)(2 - 5i)$. This is a Gaussian prime factorization, since the factors have prime norm (and thus are themselves prime in $\mathbf{Z}[i]$). The factorization of 2 is special, since its prime factors are unit multiples of each other: $1 - i = -i(1 + i)$. In other words, $2 = -i(1 + i)^2$.

Corollary 9.3. If a prime $p$ in $\mathbf{Z}^+$ is composite in $\mathbf{Z}[i]$, and $p \ne 2$, then up to unit multiple $p$ has exactly two Gaussian prime factors, which are conjugate and have norm $p$.

Proof. By Theorem 9.2, when $p$ is composite we have $p = a^2 + b^2 = (a + bi)(a - bi)$ for some $a, b \in \mathbf{Z}$. Since $a + bi$ and $a - bi$ have prime norm $p$, they are prime in $\mathbf{Z}[i]$. Could they be unit multiples of each other? We consider all four ways this could happen and show each one leads to a contradiction. If $a + bi = a - bi$, then $b = 0$ and $p = a^2$, which is a contradiction. If $a + bi = -(a - bi)$, then $a = 0$ and we get a contradiction again. If $a + bi = i(a - bi)$, then $b = a$ and $p = a^2 + a^2 = 2a^2$, but $p \ne 2$. We have a contradiction. The final case, when $a + bi = -i(a - bi)$, again implies the contradiction $p = 2a^2$.

Corollary 9.4.
If a prime $p$ in $\mathbf{Z}^+$ satisfies $p \equiv 3 \bmod 4$, then it is not a sum of two squares in $\mathbf{Z}$ and it stays prime in $\mathbf{Z}[i]$.

Proof. Once we show $p$ is not a sum of two squares in $\mathbf{Z}$, it is prime in $\mathbf{Z}[i]$ by Theorem 9.2. We consider the squares modulo 4: the only squares are 0 and 1. Adding them together modulo 4 gives us 0 ($= 0 + 0$), 1 ($= 1 + 0$ or $0 + 1$), and 2 ($= 1 + 1$). We can’t get 3, so any number which is $\equiv 3 \bmod 4$ is not a sum of two squares in $\mathbf{Z}$.

We now know how 2 factors into Gaussian primes and how any prime $p$ in $\mathbf{Z}^+$ with $p \equiv 3 \bmod 4$ factors in $\mathbf{Z}[i]$ (it doesn’t factor). What about the primes $p \equiv 1 \bmod 4$? The first such primes are 5, 13, 17, and 29. These are primes we saw earlier among the sums of two squares, so they are all composite in $\mathbf{Z}[i]$ by Theorem 9.2 and they factor into conjugate Gaussian primes by Corollary 9.3. Is every prime $p \equiv 1 \bmod 4$ a sum of two squares? Numerical evidence suggests it is true, so we make the following conjecture.

Conjecture 9.5. For a prime $p$ in $\mathbf{Z}^+$, the following conditions are equivalent:
(1) $p = 2$ or $p \equiv 1 \bmod 4$,
(2) $p = a^2 + b^2$ for some $a, b \in \mathbf{Z}$.

The easier condition to check in practice is (1). The more interesting condition, at least from the viewpoint of ordinary arithmetic, is (2). It is easy to see that (2) implies (1): if $p = a^2 + b^2$ for some $a$ and $b$, then $p \bmod 4$ is a sum of two squares. The squares mod 4 are 0 and 1, so a sum of two squares mod 4 could be 0, 1, or 2. Therefore $p \equiv 0, 1, 2 \bmod 4$. The first choice is impossible (since $p$ is prime) and the third only happens for $p = 2$. (This argument may look familiar. You already met it in the proof of Corollary 9.4.)

What about the proof that (1) implies (2) (which is the more interesting direction anyway)? It turns out to be convenient to insert an additional property in between them, involving a polynomial modulo $p$.

Theorem 9.6. Let $p$ be a prime in $\mathbf{Z}^+$. The following conditions are equivalent:
(1) $p = 2$ or $p \equiv 1 \bmod 4$,
(2) the congruence $x^2 \equiv -1 \bmod p$ has a solution,
(3) $p = a^2 + b^2$ for some $a, b \in \mathbf{Z}$.

Proof. We have already shown (3) implies (1). To show (1) implies (2), we may take $p \ne 2$ (for $p = 2$, $x = 1$ works). Consider the polynomial factorization

$$T^{p-1} - 1 = (T^{(p-1)/2} - 1)(T^{(p-1)/2} + 1) \tag{9.1}$$

with mod $p$ coefficients. We are going to count roots of these polynomials modulo $p$. Recall that a polynomial of degree $d$ has no more than $d$ roots modulo $p$. By Fermat’s little theorem, the left side of (9.1) has $p - 1$ different roots modulo $p$, namely the non-zero integers modulo $p$. The first polynomial on the right side of (9.1) has degree $(p-1)/2$, so it has at most $(p-1)/2$ roots modulo $p$. Therefore the second polynomial $T^{(p-1)/2} + 1$ must have roots modulo $p$: some integer $c$ satisfies $c^{(p-1)/2} \equiv -1 \bmod p$. Since $p \equiv 1 \bmod 4$, $(p-1)/2$ is an even integer: if $p = 4k + 1$ then $(p-1)/2 = 2k$. Therefore $(c^k)^2 \equiv -1 \bmod p$, which proves (2).

To show (2) implies (3), we are going to show (2) implies $p$ is composite in $\mathbf{Z}[i]$. Then Theorem 9.2 says $p$ is a sum of two squares. View the congruence in (2) as a divisibility relation in $\mathbf{Z}$. When $x^2 \equiv -1 \bmod p$ for some $x \in \mathbf{Z}$, $p \mid (x^2 + 1)$ in $\mathbf{Z}$. Now consider this divisibility in $\mathbf{Z}[i]$, where we can factor $x^2 + 1$:

$$p \mid (x + i)(x - i). \tag{9.2}$$

To show $p$ is composite in $\mathbf{Z}[i]$, we argue by contradiction. If $p$ is a Gaussian prime, then by (9.2) $p \mid (x + i)$ or $p \mid (x - i)$ in $\mathbf{Z}[i]$. Therefore some Gaussian integer $m + ni$ satisfies $p(m + ni) = x \pm i$, but look at the imaginary part: $pn = \pm 1$. This is impossible! We have a contradiction, which proves $p$ is composite in $\mathbf{Z}[i]$, so $p$ is a sum of two squares by Theorem 9.2.

Be sure you make note of the way we used the condition $p \equiv 1 \bmod 4$ in the proof that (1) implies (2).

We can now summarize the factorization of primes in $\mathbf{Z}^+$ into Gaussian prime factors.

Theorem 9.7. Let $p$ be a prime in $\mathbf{Z}^+$. The factorization of $p$ in $\mathbf{Z}[i]$ is determined by $p \bmod 4$:
i) $2 = (1 + i)(1 - i) = -i(1 + i)^2$.
ii) If $p \equiv 1 \bmod 4$ then $p = \pi\overline{\pi}$ is a product of two conjugate primes $\pi$, $\overline{\pi}$ which are not unit multiples.
iii) If $p \equiv 3 \bmod 4$ then $p$ stays prime in $\mathbf{Z}[i]$.

Proof. Part i is a numerical check. Part ii is a consequence of Corollary 9.3 and Theorem 9.6. Part iii is Corollary 9.4.

Example 9.8. The prime 61 satisfies $61 \equiv 1 \bmod 4$, so 61 has two conjugate Gaussian prime factors, coming from an expression of 61 as a sum of two squares. Since $61 = 5^2 + 6^2$, $61 = (5 + 6i)(5 - 6i)$.

Combining the factorizations in Theorem 9.7 with Lemma 9.1, we now have a description of all the Gaussian primes in terms of the primes in $\mathbf{Z}^+$.

Theorem 9.9. Every prime in $\mathbf{Z}[i]$ is a unit multiple of one of the following primes:
i) $1 + i$,
ii) $\pi$ or $\overline{\pi}$, where $\mathrm{N}(\pi) = p$ is a prime in $\mathbf{Z}^+$ which is $\equiv 1 \bmod 4$,
iii) $p$, where $p$ is a prime in $\mathbf{Z}^+$ with $p \equiv 3 \bmod 4$.

Proof. Lemma 9.1 tells us any Gaussian prime is a factor of a prime in $\mathbf{Z}^+$. Theorem 9.7 and unique factorization in $\mathbf{Z}[i]$ tell us how the primes in $\mathbf{Z}^+$ factor in $\mathbf{Z}[i]$ up to unit multiple.

The Gaussian primes in parts i and ii of Theorem 9.9 have prime norm in $\mathbf{Z}$, while the primes occurring in part iii have norm $p^2$, where $p \equiv 3 \bmod 4$. Moreover, when $p \equiv 3 \bmod 4$, its unit multiples in $\mathbf{Z}[i]$ are $\pm p$ and $\pm ip$, which have real or imaginary part 0. Thus, although the converse to Theorem 6.3 is not strictly true, we see it is true for the “interesting” Gaussian integers, namely the ones with non-zero real and imaginary part: write $\alpha = a + bi$ and suppose $a$ and $b$ are both non-zero in $\mathbf{Z}$. Then $\alpha$ is prime in $\mathbf{Z}[i]$ if and only if $\mathrm{N}(\alpha)$ is prime in $\mathbf{Z}$!

Our classification of Gaussian primes tells us that a Gaussian prime has norm either $p$ or $p^2$, where $p$ is the prime in $\mathbf{Z}^+$ which the Gaussian prime divides. In particular, any Gaussian prime other than $1 + i$ (and its unit multiples) has an odd norm. Thus, a Gaussian integer which is not divisible by $1 + i$ must have a norm which is odd, so any Gaussian integer with an even norm must be divisible by $1 + i$. This is something we already checked, using simple algebra, back in Corollary 2.5.
But now we understand why it is true from a higher point of view, in connection with unique factorization in $\mathbf{Z}[i]$: Corollary 2.5 is true because every Gaussian integer with norm greater than 1 is a product of Gaussian primes and $1 + i$ is the only Gaussian prime up to unit multiple with even norm.
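The description of how primes in Z+ factor in Z[i] (Theorem 9.7) is easy to try out on small cases. The Python sketch below is a minimal illustration under stated assumptions: the function name gaussian_factorization is made up here, the input is assumed to be prime (no primality test is done), and for p ≡ 1 mod 4 the two squares are found by brute-force search rather than by the congruence argument used in the proof of Theorem 9.6.

```python
from math import isqrt

def gaussian_factorization(p):
    """Describe how a prime p of Z+ factors in Z[i], following the three cases of Theorem 9.7.

    Assumes p is prime; the name and the string output format are illustrative.
    """
    if p == 2:
        return "2 = -i(1 + i)^2"
    if p % 4 == 3:
        return f"{p} stays prime"
    # p = 1 mod 4: brute-force search for p = a^2 + b^2 with a <= b.
    for a in range(1, isqrt(p) + 1):
        b = isqrt(p - a * a)
        if a * a + b * b == p:
            return f"{p} = ({a} + {b}i)({a} - {b}i)"
    raise ValueError("no sum of two squares found; is p really prime?")

for p in [2, 3, 5, 29, 61]:
    print(gaussian_factorization(p))
```

For p = 61 the search stops at a = 5, b = 6, reproducing the factorization 61 = (5 + 6i)(5 - 6i) of Example 9.8.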
