Contents, Summaries of Relativity Theory


Typology: Summaries

2022/2023

Uploaded on 05/11/2023

shaukat54_pick

Partial preview of the text

Contents

1 Physics in Euclidean Space and Flat Spacetime: Geometric Viewpoint
   1.1 [N & R] Overview
   1.2 Foundational Concepts
      1.2.1 [N] Newtonian Concepts
      1.2.2 [R] Special Relativistic Concepts: Inertial frames, inertial coordinates, events, vectors, and spacetime diagrams
      1.2.3 [R] Special Relativistic Concepts: Principle of Relativity; the Interval and its Invariance
   1.3 [N & R] Tensor Algebra Without a Coordinate System
   1.4 Particle Kinetics and Lorentz Force Without a Reference Frame
      1.4.1 [N] Newtonian Particle Kinetics
      1.4.2 [R] Relativistic Particle Kinetics: World Lines, 4-Velocity, 4-Momentum and its Conservation, 4-Force
      1.4.3 [R] Geometric Derivation of the Lorentz Force Law
   1.5 Component Representation of Tensor Algebra
      1.5.1 [N] Euclidean 3-space
      1.5.2 [R] Minkowski Spacetime
      1.5.3 [N & R] Slot-Naming Index Notation
   1.6 [R] Particle Kinetics in Index Notation and in a Lorentz Frame
   1.7 Orthogonal and Lorentz Transformations of Bases, and Spacetime Diagrams
      1.7.1 [N] Euclidean 3-space: Orthogonal Transformations
      1.7.2 [R] Minkowski Spacetime: Lorentz Transformations
      1.7.3 [R] Spacetime Diagrams for Boosts
   1.8 [R] Time Travel
   1.9 [N & R] Directional Derivatives, Gradients, Levi-Civita Tensor, Cross Product and Curl
   1.10 [R] Nature of Electric and Magnetic Fields; Maxwell’s Equations
   1.11 Volumes, Integration, and Integral Conservation Laws
      1.11.1 [N] Newtonian Volumes and Integration
      1.11.2 [R] Spacetime Volumes and Integration
      1.11.3 [R] Conservation of Charge in Spacetime
      1.11.4 [R] Conservation of Particles, Baryons and Rest Mass
   1.12 The Stress-Energy Tensor and Conservation of 4-Momentum
      1.12.1 [N] Newtonian Stress Tensor and Momentum Conservation
      1.12.2 [R] Relativistic Stress-Energy Tensor
      1.12.3 [R] 4-Momentum Conservation
      1.12.4 [R] Stress-Energy Tensors for Perfect Fluid and Electromagnetic Field

[Figure 1.1: diagram with three boxes. General Relativity: the most accurate framework for classical physics; arena: curved spacetime. Special Relativity: classical physics in the absence of gravity; arena: flat, Minkowski spacetime. Newtonian Physics: approximation to relativistic physics; arena: flat, Euclidean 3-space, plus universal time. Transition labels in the diagram: "vanishing gravity"; "weak gravity, small speeds, small stresses"; "low speeds, small stresses, add weak gravity".]

Fig. 1.1: The three frameworks and arenas for the classical laws of physics, and their relationship to each other.

[...]

in Euclidean space, and Lorentz coordinates in Minkowski spacetime. This chapter is an introduction to all the differential geometric tools that we shall need in these limited arenas. In Parts III, IV, and V, when studying elasticity theory, fluid mechanics, and plasma physics, we will use curvilinear coordinates in nontrivial ways.
As a foundation for them, at the beginning of Part III we will extend our flat-space differential geometric tools to curvilinear coordinate systems (e.g. cylindrical and spherical coordinates). Finally, at the beginning of Part VI, we shall extend our geometric tools to the arena of curved spacetime. In this chapter we shall alternate back and forth, one section after another, between the laws of physics and flat-space differential geometry, using each to illustrate and illuminate the other. We begin in Sec. 1.2 by recalling the foundational concepts of Newtonian physics and of special relativity. Then in Sec. 1.3 we develop our first set of differential geometric tools: the tools of coordinate-free tensor algebra. In Sec. 1.4 we illustrate our tensor-algebra tools by using them to describe—without any coordinate system or reference frame whatsoever—the kinematics of point particles that move through the Euclidean space of Newtonian physics and through relativity’s Minkowski spacetime; the particles are allowed to collide with each other and be accelerated by an electromagnetic field. In Sec. 1.5, we extend the tools of tensor algebra to the domain of Cartesian and Lorentz coordinate systems, and then in Sec. 1.6 we use these extended tensorial tools to restudy the motions, collisions, and electromagnetic accelerations of particles. In Sec. 1.7 we discuss rotations in Euclidean space and Lorentz transformations in Minkowski spacetime, and we develop relativistic spacetime diagrams in some depth and use them to study such relativistic phenomena as length contraction, time dilation, and simultaneity breakdown. In Sec. 1.8 we illustrate the tools we have developed by asking whether the laws of relativity permit a highly advanced civilization to build time machines for traveling backward in time as well as forward. In Sec. 1.9 we develop additional differential geometric tools: directional derivatives, gradients, and the Levi-Civita tensor, and in Sec. 
1.10 we use these tools to discuss Maxwell’s equations and the geometric nature of electric and magnetic fields. In Sec. 1.11 we develop our final set of geometric tools: volume elements and the integration of tensors over spacetime, and in Sec. 1.12 we use these tools to define the stress tensor of Newtonian physics and relativity’s stress-energy tensor, and to formulate very general versions of the conservation of 4-momentum.

1.2 Foundational Concepts

1.2.1 [N] Newtonian Concepts

The arena for the Newtonian laws is a spacetime composed of the familiar 3-dimensional Euclidean space of everyday experience (which we shall call 3-space), and a universal time $t$. We shall denote points (locations) in 3-space by capital script letters such as $\mathcal{P}$ and $\mathcal{Q}$. These points and the 3-space in which they live require no coordinates for their definition.

A scalar is a single number that we associate with a point, $\mathcal{P}$, in 3-space. We are interested in scalars that represent physical quantities, e.g., temperature $T$. When a scalar is a function of location $\mathcal{P}$ in space, e.g. $T(\mathcal{P})$, we call it a scalar field.

A vector in Euclidean 3-space can be thought of as a straight arrow that reaches from one point, $\mathcal{P}$, to another, $\mathcal{Q}$ (e.g., the arrow $\Delta\mathbf{x}$ of Fig. 1.2a). Equivalently, $\Delta\mathbf{x}$ can be thought of as a direction at $\mathcal{P}$ and a number, the vector’s length. Sometimes we shall select one point $\mathcal{O}$ in 3-space as an “origin” and identify all other points, say $\mathcal{Q}$ and $\mathcal{P}$, by their vectorial separations $\mathbf{x}_\mathcal{Q}$ and $\mathbf{x}_\mathcal{P}$ from that origin.

The Euclidean distance $\Delta\sigma$ between two points $\mathcal{P}$ and $\mathcal{Q}$ in 3-space can be measured with a ruler and so, of course, requires no coordinate system for its definition. (If one does have a Cartesian coordinate system, it can be computed by the Pythagorean formula, a precursor to the “invariant interval” of flat spacetime, Sec. 1.2.3.) This distance $\Delta\sigma$ is also the length $|\Delta\mathbf{x}|$ of the vector $\Delta\mathbf{x}$ that reaches from $\mathcal{P}$ to $\mathcal{Q}$, and the square of that length is denoted

\[ |\Delta\mathbf{x}|^2 \equiv (\Delta\mathbf{x})^2 \equiv (\Delta\sigma)^2 . \tag{1.1} \]

Of particular importance is the case when $\mathcal{P}$ and $\mathcal{Q}$ are neighboring points and $\Delta\mathbf{x}$ is a differential (infinitesimal) quantity $d\mathbf{x}$. By traveling along a sequence of such $d\mathbf{x}$’s, laying them down tail-at-tip, one after another, we can map out a curve to which these $d\mathbf{x}$’s are tangent (Fig. 1.2b). The curve is $\mathcal{P}(\lambda)$, with $\lambda$ a parameter along the curve; and the infinitesimal vectors that map it out are $d\mathbf{x} = (d\mathcal{P}/d\lambda)\,d\lambda$.

The product of a scalar with a vector is still a vector; so if we take the change of location $d\mathbf{x}$ of a particular element of a fluid during a (universal) time interval $dt$, and multiply it by $1/dt$, we obtain a new vector, the fluid element’s velocity $\mathbf{v} = d\mathbf{x}/dt$, at the fluid element’s location $\mathcal{P}$. Performing this operation at every point $\mathcal{P}$ in the fluid defines the velocity field $\mathbf{v}(\mathcal{P})$. Similarly, the sum (or difference) of two vectors is also a vector and so taking the difference of two velocity measurements at times separated by $dt$ and multiplying by $1/dt$ generates the acceleration $\mathbf{a} = d\mathbf{v}/dt$. Multiplying by the fluid element’s (scalar) mass $m$ gives the force $\mathbf{F} = m\mathbf{a}$ that produced the acceleration; dividing an electrically produced force by the fluid element’s charge $q$ gives another vector, the electric field $\mathbf{E} = \mathbf{F}/q$, and so on.

[Figure 1.2: two panels, (a) and (b), showing points $\mathcal{P}$ and $\mathcal{Q}$, the origin $\mathcal{O}$, the vectors $\mathbf{x}_\mathcal{P}$, $\mathbf{x}_\mathcal{Q}$ and $\Delta\mathbf{x}$, and a curve $\mathcal{C}$.]

Fig. 1.2: (a) A Euclidean 3-space diagram depicting two points $\mathcal{P}$ and $\mathcal{Q}$, their vectorial separations $\mathbf{x}_\mathcal{P}$ and $\mathbf{x}_\mathcal{Q}$ from the (arbitrarily chosen) origin $\mathcal{O}$, and the vector $\Delta\mathbf{x} = \mathbf{x}_\mathcal{Q} - \mathbf{x}_\mathcal{P}$ connecting them. (b) A curve $\mathcal{C}$ generated by laying out a sequence of infinitesimal vectors, tail-to-tip.

We can define inner products [Eq. (1.9a) below] of pairs of vectors at a point (e.g., force and displacement) to obtain a new scalar (e.g., work), and cross products [Eq. (1.60a)] of vectors to obtain a new vector (e.g., torque). By examining how a differentiable scalar field changes from point to point, we can define its gradient [Eq. (1.54b)].
In this fashion, which should be familiar to the reader and will be elucidated and generalized below, we can construct all of the standard scalars and vectors of Newtonian physics. What is important is that these physical quantities require no coordinate system for their definition. They are geometric (coordinate-independent) objects residing in Euclidean 3-space at a particular time.

It is a fundamental (though often ignored) principle of physics that the Newtonian physical laws are all expressible as geometric relationships between these types of geometric objects, and these relationships do not depend upon any coordinate system or orientation of axes, nor on any reference frame (on any purported velocity of the Euclidean space in which the measurements are made) [footnote 2]. We shall return to this principle throughout this book. It is the Newtonian analog of Einstein’s Principle of Relativity (Sec. 1.2.3 below).

1.2.2 [R] Special Relativistic Concepts: Inertial frames, inertial coordinates, events, vectors, and spacetime diagrams

Because the nature and geometry of Minkowski spacetime are far less obvious intuitively than those of Euclidean 3-space, we shall need a crutch in our development of the Minkowski foundational concepts. That crutch will be inertial reference frames. We shall use them to develop in turn the following frame-independent Minkowski-spacetime concepts: events, 4-vectors, the principle of relativity, geometrized units, the interval and its invariance, and spacetime diagrams.

An inertial reference frame is a (conceptual) three-dimensional latticework of measuring rods and clocks (Fig. 1.3) with the following properties:

(i) The latticework moves freely through spacetime (i.e., no forces act on it), and is attached to gyroscopes so it does not rotate with respect to distant, celestial objects.
(ii) The measuring rods form an orthogonal lattice and the length intervals marked on them are uniform when compared to, e.g., the wavelength of light emitted by some standard type of atom or molecule; and therefore the rods form an orthonormal, Cartesian coordinate system with the coordinate $x$ measured along one axis, $y$ along another, and $z$ along the third.
(iii) The clocks are densely packed throughout the latticework so that, ideally, there is a separate clock at every lattice point.
(iv) The clocks tick uniformly when compared, e.g., to the period of the light emitted by some standard type of atom or molecule; i.e., they are ideal clocks.
(v) The clocks are synchronized by the Einstein synchronization process: If a pulse of light, emitted by one of the clocks, bounces off a mirror attached to another and then returns, the time of bounce $t_b$ [...]

Footnote 2: By changing the velocity of Euclidean space, one adds a constant velocity to all particles, but this leaves the laws, e.g. Newton’s $\mathbf{F} = m\mathbf{a}$, unchanged.

[...] spacetime from $\mathcal{P}$ to $\mathcal{Q}$. Different observers in different inertial frames will attribute different coordinates to each birth and different components to the births’ vectorial separation; but all observers can agree that they are talking about the same events $\mathcal{P}$ and $\mathcal{Q}$ in spacetime and the same separation vector $\Delta\vec{x}$. In this sense, $\mathcal{P}$, $\mathcal{Q}$, and $\Delta\vec{x}$ are frame-independent, geometric objects (points and arrows) that reside in spacetime.

1.2.3 [R] Special Relativistic Concepts: Principle of Relativity; the Interval and its Invariance

The principle of relativity states that every (special relativistic) law of physics must be expressible as a geometric, frame-independent relationship between geometric, frame-independent objects, i.e. objects such as points in spacetime and vectors and tensors, which represent physical quantities such as events and particle momenta and the electromagnetic field.
Since the laws are all geometric (i.e., unrelated to any reference frame or coordinate system), there is no way that they can distinguish one inertial reference frame from any other. This leads to an alternative form of the principle of relativity (one commonly used in elementary textbooks and equivalent to the above): All the (special relativistic) laws of physics are the same in every inertial reference frame, everywhere in spacetime. A more operational version of this principle is the following: Give identical instructions for a specific physics experiment to two different observers in two different inertial reference frames at the same or different locations in Minkowski (i.e., gravity-free) spacetime. The experiment must be self-contained, i.e., it must not involve observations of the external universe’s properties (the “environment”), though it might utilize carefully calibrated tools derived from the external universe. For example, an unacceptable experiment would be a measurement of the anisotropy of the Universe’s cosmic microwave radiation and a computation therefrom of the observer’s velocity relative to the radiation’s mean rest frame; such an experiment studies the Universal environment, not the fundamental laws of physics. An acceptable experiment would be a measurement of the speed of light using the rods and clocks of the observer’s own frame, or a measurement of cross sections for elementary particle reactions using cosmic-ray particles whose incoming energies and compositions are measured as initial conditions for the experiment. The principle of relativity says that in these or any other similarly self-contained experiments, the two observers in their two different inertial frames must obtain identically the same experimental results—to within the accuracy of their experimental techniques. 
Since the experimental results are governed by the (nongravitational) laws of physics, this is equivalent to the statement that all physical laws are the same in the two inertial frames.

Perhaps the most central of special relativistic laws is the one stating that the speed of light $c$ in vacuum is frame-independent, i.e., is a constant, independent of the inertial reference frame in which it is measured. In other words, there is no aether that supports light’s vibrations and in the process influences its speed, a remarkable fact that came as a great experimental surprise to physicists at the end of the nineteenth century.

The constancy of the speed of light is built into Maxwell’s equations. In order for the Maxwell equations to be frame independent, the speed of light, which appears in them, must also be frame independent. In this sense, the constancy of the speed of light follows from the Principle of Relativity; it is not an independent postulate. This is illustrated in Box 1.2.

Box 1.2: Measuring the Speed of Light Without Light

[Box figure: two diagrams, the left one labeled with $r$, $q$, $\mu$, $Q$, $a_e$ and the right one with $r$, $q$, $\mu$, $v$, $Q$, $a_m$.]

In some inertial reference frame we perform two experiments using two particles, one with a large charge $Q$; the other, a test particle, with a much smaller charge $q$ and mass $\mu$. In the first experiment we place the two particles at rest, separated by a distance $|\Delta\mathbf{x}| \equiv r$ and measure the electrical repulsive acceleration $a_e$ of $q$ (left diagram). In Gaussian cgs units (where the speed of light shows up explicitly instead of via $\epsilon_0\mu_0 = 1/c^2$), the acceleration is $a_e = qQ/(r^2\mu)$. In the second experiment, we connect $Q$ to ground by a long wire, and we place $q$ at the distance $|\Delta\mathbf{x}| = r$ from the wire and set it moving at speed $v$ parallel to the wire. The charge $Q$ flows down the wire with an e-folding time $\tau$, so the current (the rate at which the charge drains away) is $I = (Q/\tau)e^{-t/\tau}$, with $Q$ here the initial charge.
At early times $0 < t \ll \tau$, this current $I = Q/\tau$ produces a solenoidal magnetic field at $q$ with field strength $B = (2/cr)(Q/\tau)$, and this field exerts a magnetic force on $q$, giving it an acceleration $a_m = q(v/c)B/\mu = 2vqQ/(c^2\tau r\mu)$. The ratio of the electric acceleration in the first experiment to the magnetic acceleration in the second experiment is $a_e/a_m = c^2\tau/(2rv)$. Therefore, we can measure the speed of light $c$ in our chosen inertial frame by performing this pair of experiments, carefully measuring the separation $r$, speed $v$, current $Q/\tau$, and accelerations, and then simply computing $c = \sqrt{(2rv/\tau)(a_e/a_m)}$. The principle of relativity insists that the result of this pair of experiments should be independent of the inertial frame in which they are performed. Therefore, the speed of light $c$ which appears in Maxwell’s equations must be frame-independent. In this sense, the constancy of the speed of light follows from the Principle of Relativity as applied to Maxwell’s equations.

The constancy of the speed of light was verified with very high precision in an era when the units of length (centimeters) and the units of time (seconds) were defined independently. By 1983, the constancy had become so universally accepted that it was used to redefine the meter (which is hard to measure precisely) in terms of the second (which is much easier to measure with modern technology [footnote 4]): The meter is now related to the second in such a way that the speed of light is precisely $c = 299{,}792{,}458\ \mathrm{m\,s^{-1}}$; i.e., one meter is the distance traveled by light in (1/299,792,458) seconds.

Footnote 4: The second is defined as the duration of 9,192,631,770 periods of the radiation produced by a certain hyperfine transition in the ground state of a $^{133}$Cs atom that is at rest in empty space. Today (2008) all fundamental physical units except mass units (e.g. the kilogram) are defined similarly in terms of fundamental constants of nature.
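The arithmetic of Box 1.2 can be checked with a short numerical sketch. The charges, mass, separation, speed, and e-folding time below are hypothetical values chosen only for illustration, and the "measured" accelerations are simulated from the Box's own formulas (so the sketch tests the algebra, not the physics). The function names are not from the text.

```python
import math

# Speed of light in cm/s; used only to simulate the "measured" accelerations.
C_CGS = 2.99792458e10

def electric_acceleration(q, Q, r, mu):
    """First experiment: a_e = qQ / (r^2 mu) in Gaussian cgs units."""
    return q * Q / (r**2 * mu)

def magnetic_acceleration(q, Q, r, mu, v, tau, c=C_CGS):
    """Second experiment at early times (0 < t << tau):
    B = (2/(c r)) (Q/tau) and a_m = q (v/c) B / mu."""
    B = (2.0 / (c * r)) * (Q / tau)
    return q * (v / c) * B / mu

def speed_of_light_from_measurements(r, v, tau, a_e, a_m):
    """Box 1.2 result: c = sqrt((2 r v / tau) (a_e / a_m))."""
    return math.sqrt((2.0 * r * v / tau) * (a_e / a_m))

# Hypothetical experimental setup (Gaussian cgs units).
q, Q, mu = 1.0, 1.0e3, 1.0        # test charge, large charge, test mass
r, v, tau = 10.0, 1.0e5, 1.0e-3   # separation, test-particle speed, e-folding time

a_e = electric_acceleration(q, Q, r, mu)
a_m = magnetic_acceleration(q, Q, r, mu, v, tau)
c_recovered = speed_of_light_from_measurements(r, v, tau, a_e, a_m)
```

With these inputs, `c_recovered` agrees with `C_CGS` to floating-point accuracy, which confirms that the ratio $a_e/a_m = c^2\tau/(2rv)$ inverts to the quoted expression for $c$.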
Because of this constancy of the light speed, it is permissible when studying special relativity to set $c$ to unity. Doing so is equivalent to the relationship

\[ c = 2.99792458 \times 10^{10}\ \mathrm{cm\,s^{-1}} = 1 \tag{1.3a} \]

between seconds and centimeters; i.e., equivalent to

\[ 1\ \text{second} = 2.99792458 \times 10^{10}\ \text{cm} . \tag{1.3b} \]

We shall refer to units in which $c = 1$ as geometrized units, and we shall adopt them throughout this book, when dealing with relativistic physics, since they make equations look much simpler. Occasionally it will be useful to restore the factors of $c$ to an equation, thereby converting it to ordinary (SI or Gaussian-cgs) units. This restoration is achieved easily using dimensional considerations. For example, the equivalence of mass $m$ and energy $E$ is written in geometrized units as $E = m$. In cgs units $E$ has dimensions ergs = gram cm$^2$ sec$^{-2}$, while $m$ has dimensions of grams, so to make $E = m$ dimensionally correct we must multiply the right side by a power of $c$ that has dimensions cm$^2$/sec$^2$, i.e. by $c^2$; thereby we obtain $E = mc^2$.

We turn, next, to another fundamental concept, the interval $(\Delta s)^2$ between the two events $\mathcal{P}$ and $\mathcal{Q}$ whose separation vector is $\Delta\vec{x}$. In a specific but arbitrary inertial reference frame and in geometrized units, $(\Delta s)^2$ is given by

\[ (\Delta s)^2 \equiv -(\Delta t)^2 + (\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2 = -(\Delta t)^2 + \sum_{i,j} \delta_{ij}\,\Delta x^i \Delta x^j ; \tag{1.4a} \]

cf. Eq. (1.2). Here $\delta_{ij}$ is the Kronecker delta (unity if $i = j$; zero otherwise) and the spatial indices $i$ and $j$ are summed over 1, 2, 3. If $(\Delta s)^2 > 0$, the events $\mathcal{P}$ and $\mathcal{Q}$ are said to have a spacelike separation; if $(\Delta s)^2 = 0$, their separation is null or lightlike; and if $(\Delta s)^2 < 0$, their separation is timelike. For timelike separations, $(\Delta s)^2 < 0$ implies that $\Delta s$ is imaginary; to avoid dealing with imaginary numbers, we describe timelike intervals by

\[ (\Delta\tau)^2 \equiv -(\Delta s)^2 , \tag{1.4b} \]

whose square root $\Delta\tau$ is real.
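As a minimal sketch of Eqs. (1.3b), (1.4a) and (1.4b), the following code converts a time in seconds to centimeters, evaluates the interval for coordinate separations given in a single inertial frame, and classifies the separation. The function names and the sample separations are illustrative choices, not taken from the text.

```python
import math

# Eq. (1.3b): 1 second = 2.99792458e10 cm in geometrized units (c = 1).
CM_PER_SECOND = 2.99792458e10

def seconds_to_cm(t_seconds):
    """Express a time in centimeters by multiplying by c."""
    return CM_PER_SECOND * t_seconds

def interval_squared(dt, dx, dy, dz):
    """Eq. (1.4a): (Δs)^2 = -(Δt)^2 + (Δx)^2 + (Δy)^2 + (Δz)^2.
    All four separations must already be in the same (length) units."""
    return -dt**2 + dx**2 + dy**2 + dz**2

def classify_separation(ds2):
    """Spacelike if (Δs)^2 > 0, null if = 0, timelike if < 0."""
    if ds2 > 0:
        return "spacelike"
    if ds2 < 0:
        return "timelike"
    return "null"

def proper_time(ds2):
    """Eq. (1.4b): Δτ = sqrt(-(Δs)^2), defined only for timelike separations."""
    if ds2 >= 0:
        raise ValueError("proper time is defined only for timelike separations")
    return math.sqrt(-ds2)

# Sample separations (geometrized units, small integers so the arithmetic is exact).
timelike_ds2 = interval_squared(5, 3, 0, 0)    # -25 + 9 = -16
spacelike_ds2 = interval_squared(3, 5, 0, 0)   # -9 + 25 = 16
null_ds2 = interval_squared(5, 3, 4, 0)        # -25 + 9 + 16 = 0
```

For the first sample, `proper_time(timelike_ds2)` is 4: a clock carried between the two events on a straight world line records 4 units of time, while the chosen frame assigns a coordinate time difference of 5.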
The coordinate separation between $\mathcal{P}$ and $\mathcal{Q}$ depends on one’s reference frame; i.e., if $\Delta x^{\alpha'}$ and $\Delta x^{\alpha}$ are the coordinate separations in two different frames, then $\Delta x^{\alpha'} \neq \Delta x^{\alpha}$. Despite this frame dependence, the principle of relativity forces the interval $(\Delta s)^2$ to be the same in all frames:

\[ (\Delta s)^2 = -(\Delta t)^2 + (\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2 = -(\Delta t')^2 + (\Delta x')^2 + (\Delta y')^2 + (\Delta z')^2 . \tag{1.5} \]

We shall sketch a proof for the case of two events $\mathcal{P}$ and $\mathcal{Q}$ whose separation is timelike:

[...]

is more fundamental to relativity than light itself!

****************************

EXERCISES

Exercise 1.1 Practice: [R] Geometrized Units
Convert the following equations from the geometrized units in which they are written to cgs/Gaussian units:
(a) The “Planck time” $t_P$ expressed in terms of Newton’s gravitation constant $G$ and Planck’s constant $\hbar$, $t_P = \sqrt{G\hbar}$. What is the numerical value of $t_P$ in seconds? in meters?
(b) The Lorentz force law $m\,d\mathbf{v}/dt = e(\mathbf{E} + \mathbf{v} \times \mathbf{B})$.
(c) The expression $\mathbf{p} = \hbar\omega\mathbf{n}$ for the momentum $\mathbf{p}$ of a photon in terms of its angular frequency $\omega$ and direction $\mathbf{n}$ of propagation.
How tall are you, in seconds? How old are you, in centimeters?

Exercise 1.2 Derivation and Example: [R] Invariance of the Interval
Complete the derivation of the invariance of the interval given in the text [Eqs. (1.6)], using the principle of relativity in the form that the laws of physics must be the same in the primed and unprimed frames. Hints, if you need them:
(a) Having carried out the construction shown in Fig. 1.5 in the unprimed frame, use the same mirror and photons for the analogous construction in the primed frame. Argue that, independently of the frame in which the mirror is at rest (unprimed or primed), the fact that the reflected photon has (angle of reflection) = (angle of incidence) in the primed frame implies that this is also true for this same photon in the unprimed frame. Thereby conclude that the construction leads to Eq. (1.6b) as well as to (1.6a).
(b) Then argue that the perpendicular distance of an event from the common $x$ and $x'$ axis must be the same in the two reference frames, so $h' = h$ and $\Delta y' = \Delta y$; whence Eqs. (1.6b) and (1.6a) imply the invariance of the interval. [For a leisurely version of this argument, see Secs. 3.6 and 3.7 of Taylor and Wheeler (1992).]

****************************

1.3 [N & R] Tensor Algebra Without a Coordinate System

We now pause in our development of the geometric view of physical laws, to introduce, in a coordinate-free way, some fundamental concepts of differential geometry: tensors, the inner product, the metric tensor, the tensor product, and contraction of tensors. In this section we shall allow the space in which the concepts live to be either 4-dimensional Minkowski spacetime, or 3-dimensional Euclidean space; we shall denote its dimensionality by $N$; and we shall use spacetime’s arrowed notation $\vec{A}$ for vectors even though the space might be Euclidean 3-space.

We have already defined a vector $\vec{A}$ as a straight arrow from one point, say $\mathcal{P}$, in our space to another, say $\mathcal{Q}$. Because our space is flat, there is a unique and obvious way to transport such an arrow from one location to another, keeping its length and direction unchanged [footnote 5]. Accordingly, we shall regard vectors as unchanged by such transport. This enables us to ignore the issue of where in space a vector actually resides; it is completely determined by its direction and its length.

Footnote 5: This is not so in curved spaces, as we shall see in Sec. 24.7.

[Figure 1.6: a box labeled $\mathsf{T}$; the printed output shown is the number 7.95.]

Fig. 1.6: A rank-3 tensor $\mathsf{T}$.

A rank-$n$ tensor $\mathsf{T}$ is, by definition, a real-valued, linear function of $n$ vectors. Pictorially we shall regard $\mathsf{T}$ as a box (Fig. 1.6) with $n$ slots in its top, into which are inserted $n$ vectors, and one slot in its end, out of which rolls computer paper with a single real number printed on it: the value that the tensor $\mathsf{T}$ has when evaluated as a function of the $n$ inserted vectors. Notationally we shall denote the tensor by a bold-face sans-serif character $\mathsf{T}$:

\[ \mathsf{T}(\underbrace{\_\ ,\ \_\ ,\ \_\ ,\ \_}_{n\ \text{slots in which to put the vectors}}) . \tag{1.8a} \]

If $\mathsf{T}$ is a rank-3 tensor (has 3 slots) as in Fig. 1.6, then its value on the vectors $\vec{A}$, $\vec{B}$, $\vec{C}$ will be denoted $\mathsf{T}(\vec{A}, \vec{B}, \vec{C})$. Linearity of this function can be expressed as

\[ \mathsf{T}(e\vec{E} + f\vec{F}, \vec{B}, \vec{C}) = e\,\mathsf{T}(\vec{E}, \vec{B}, \vec{C}) + f\,\mathsf{T}(\vec{F}, \vec{B}, \vec{C}) , \tag{1.8b} \]

where $e$ and $f$ are real numbers, and similarly for the second and third slots.

We have already defined the squared length $(\vec{A})^2 \equiv \vec{A}^2$ of a vector $\vec{A}$ as the squared distance (in 3-space) or interval (in spacetime) between the points at its tail and its tip. The inner product $\vec{A} \cdot \vec{B}$ of two vectors is defined in terms of the squared length by

\[ \vec{A} \cdot \vec{B} \equiv \frac{1}{4}\left[ (\vec{A} + \vec{B})^2 - (\vec{A} - \vec{B})^2 \right] . \tag{1.9a} \]

In Euclidean space this is the standard inner product, familiar from elementary geometry.

Because the inner product $\vec{A} \cdot \vec{B}$ is a linear function of each of its vectors, we can regard it as a tensor of rank 2. When so regarded, the inner product is denoted $\mathsf{g}(\_, \_)$ and is called the metric tensor. In other words, the metric tensor $\mathsf{g}$ is that linear function of two vectors whose value is given by

\[ \mathsf{g}(\vec{A}, \vec{B}) \equiv \vec{A} \cdot \vec{B} . \tag{1.9b} \]

Notice that, because $\vec{A} \cdot \vec{B} = \vec{B} \cdot \vec{A}$, the metric tensor is symmetric in its two slots; i.e., one gets the same real number independently of the order in which one inserts the two vectors into the slots:

\[ \mathsf{g}(\vec{A}, \vec{B}) = \mathsf{g}(\vec{B}, \vec{A}) . \tag{1.9c} \]

With the aid of the inner product, we can regard any vector $\vec{A}$ as a tensor of rank one: The real number that is produced when an arbitrary vector $\vec{C}$ is inserted into $\vec{A}$’s slot is

\[ \vec{A}(\vec{C}) \equiv \vec{A} \cdot \vec{C} . \tag{1.9d} \]

Second-rank tensors appear frequently in the laws of physics, often in roles where one sticks a single vector into the second slot and leaves the first slot empty, thereby producing a single-slotted entity, a vector. A familiar example is a rigid body’s (Newtonian) moment-of-inertia tensor $\mathsf{I}(\_, \_)$.
Insert the body’s angular velocity vector $\mathbf{\Omega}$ into the second slot, and you get the body’s angular momentum vector $\mathbf{J}(\_) = \mathsf{I}(\_, \mathbf{\Omega})$. Other examples are the stress tensor of a solid, a fluid, a plasma or a field (Sec. 1.12 below) and the electromagnetic field tensor (Secs. 1.4.3 and 1.10 below).

From three (or any number of) vectors $\vec{A}$, $\vec{B}$, $\vec{C}$ we can construct a tensor, their tensor product (also called outer product in contradistinction to the inner product $\vec{A} \cdot \vec{B}$), defined as follows:

\[ \vec{A} \otimes \vec{B} \otimes \vec{C}(\vec{E}, \vec{F}, \vec{G}) \equiv \vec{A}(\vec{E})\,\vec{B}(\vec{F})\,\vec{C}(\vec{G}) = (\vec{A} \cdot \vec{E})(\vec{B} \cdot \vec{F})(\vec{C} \cdot \vec{G}) . \tag{1.10a} \]

Here the first expression is the notation for the value of the new tensor, $\vec{A} \otimes \vec{B} \otimes \vec{C}$, evaluated on the three vectors $\vec{E}$, $\vec{F}$, $\vec{G}$; the middle expression is the ordinary product of three real numbers, the value of $\vec{A}$ on $\vec{E}$, the value of $\vec{B}$ on $\vec{F}$, and the value of $\vec{C}$ on $\vec{G}$; and the third expression is that same product with the three numbers rewritten as scalar products. Similar definitions can be given (and should be obvious) for the tensor product of any two or more tensors of any rank; for example, if $\mathsf{T}$ has rank 2 and $\mathsf{S}$ has rank 3, then

\[ \mathsf{T} \otimes \mathsf{S}(\vec{E}, \vec{F}, \vec{G}, \vec{H}, \vec{J}) \equiv \mathsf{T}(\vec{E}, \vec{F})\,\mathsf{S}(\vec{G}, \vec{H}, \vec{J}) . \tag{1.10b} \]

One last geometric (i.e. frame-independent) concept we shall need is contraction. We shall illustrate this concept first by a simple example, then give the general definition. From two vectors $\vec{A}$ and $\vec{B}$ we can construct the tensor product $\vec{A} \otimes \vec{B}$ (a second-rank tensor), and we can also construct the scalar product $\vec{A} \cdot \vec{B}$ (a real number, i.e. a scalar, i.e. a rank-0 tensor). The process of contraction is the construction of $\vec{A} \cdot \vec{B}$ from $\vec{A} \otimes \vec{B}$:

\[ \text{contraction}(\vec{A} \otimes \vec{B}) \equiv \vec{A} \cdot \vec{B} . \tag{1.11a} \]

[...]

the world line as a frame-independent, geometric object. Also shown in the figure is the particle’s 4-velocity $\vec{u}$, which (by analogy with the velocity in 3-space) is the time derivative of its position:

\[ \vec{u} \equiv d\vec{x}/d\tau . \tag{1.15} \]

This derivative is defined by the usual limiting process

\[ \frac{d\vec{x}}{d\tau} \equiv \lim_{\Delta\tau \to 0} \frac{\vec{x}(\tau + \Delta\tau) - \vec{x}(\tau)}{\Delta\tau} . \tag{1.16} \]

The squared length of the particle’s 4-velocity is easily seen to be $-1$:

\[ \vec{u}^{\,2} \equiv \mathsf{g}(\vec{u}, \vec{u}) = \frac{d\vec{x}}{d\tau} \cdot \frac{d\vec{x}}{d\tau} = \frac{d\vec{x} \cdot d\vec{x}}{(d\tau)^2} = -1 . \tag{1.17} \]

The last equality follows from the fact that $d\vec{x} \cdot d\vec{x}$ is the squared length of $d\vec{x}$, which equals the invariant interval $(\Delta s)^2$ along it, and $(d\tau)^2$ is minus that invariant interval.

The particle’s 4-momentum is the product of its 4-velocity and rest mass:

\[ \vec{p} \equiv m\vec{u} = m\,d\vec{x}/d\tau \equiv d\vec{x}/d\zeta . \tag{1.18} \]

Here the parameter $\zeta$ is a renormalized version of proper time,

\[ \zeta \equiv \tau/m . \tag{1.19} \]

This $\zeta$, and any other renormalized version of proper time with position-independent renormalization factor, are called affine parameters for the particle’s world line. Expression (1.18), together with the unit length of the 4-velocity $\vec{u}^{\,2} = -1$, implies that the squared length of the 4-momentum is

\[ \vec{p}^{\,2} = -m^2 . \tag{1.20} \]

In quantum theory a particle is described by a relativistic wave function which, in the geometric optics limit (Chapter 6), has a wave vector $\vec{k}$ that is related to the classical particle’s 4-momentum by

\[ \vec{k} = \vec{p}/\hbar . \tag{1.21} \]

The above formalism is valid only for particles with nonzero rest mass, $m \neq 0$. The corresponding formalism for a particle with zero rest mass (e.g. a photon or a graviton [footnote 7]) can be obtained from the above by taking the limit as $m \to 0$ and $d\tau \to 0$ with the quotient $d\zeta = d\tau/m$ held finite. More specifically, the 4-momentum of a zero-rest-mass particle is well defined (and participates in the conservation law to be discussed below), and it is expressible in terms of the particle’s affine parameter $\zeta$ by Eq. (1.18):

\[ \vec{p} = \frac{d\vec{x}}{d\zeta} . \tag{1.22} \]

However, the particle’s 4-velocity $\vec{u} = \vec{p}/m$ is infinite and thus undefined; and proper time $\tau = m\zeta$ ticks vanishingly slowly along its world line and thus is undefined. Because proper time is the square root of the invariant interval along the world line, the interval between two neighboring points on the world line vanishes identically; and correspondingly the world line of a zero-rest-mass particle is null. (By contrast, since $d\tau^2 > 0$ and $ds^2 < 0$ along the world line of a particle with finite rest mass, the world line of a finite-rest-mass particle is timelike.)

Footnote 7: We do not know for sure that photons and gravitons are massless, but the laws of physics as currently understood require them to be massless and there are tight experimental limits on their rest masses.

[Figure 1.8: spacetime diagram with axes $t$, $x$, $y$, showing 4-momenta $\vec{p}_1$, $\vec{p}_2$, $\vec{p}_{\bar{1}}$, $\vec{p}_{\bar{2}}$ and the interaction region $\mathcal{V}$.]

Fig. 1.8: Spacetime diagram depicting the law of 4-momentum conservation for a situation where two particles, numbered 1 and 2, enter an interaction region $\mathcal{V}$ in spacetime, there interact strongly, and produce two new particles, numbered $\bar{1}$ and $\bar{2}$. The sum of the final 4-momenta, $\vec{p}_{\bar{1}} + \vec{p}_{\bar{2}}$, must be equal to the sum of the initial 4-momenta, $\vec{p}_1 + \vec{p}_2$.

The 4-momenta of particles are important because of the law of conservation of 4-momentum (which, as we shall see in Sec. 1.6, is equivalent to the conservation laws for energy and ordinary momentum): If a number of “initial” particles, named $A = 1, 2, 3, \ldots$ enter a restricted region of spacetime $\mathcal{V}$ and there interact strongly to produce a new set of “final” particles, named $\bar{A} = \bar{1}, \bar{2}, \bar{3}, \ldots$ (Fig. 1.8), then the total 4-momentum of the final particles must be the same as the total 4-momentum of the initial ones:

\[ \sum_{\bar{A}} \vec{p}_{\bar{A}} = \sum_{A} \vec{p}_A . \tag{1.23} \]

Note that this law of 4-momentum conservation is expressed in frame-independent, geometric language, in accord with Einstein’s insistence that all the laws of physics should be so expressible. As we shall see in Part VI, momentum conservation is a consequence of the translation symmetry of flat, 4-dimensional spacetime. In general relativity’s curved spacetime, where that translation symmetry is lost, we lose momentum conservation except under special circumstances; see Sec. 24.9.4.
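The relations (1.17), (1.20) and (1.23) can be illustrated numerically once components are chosen. The sketch below assumes a single Lorentz frame with $c = 1$ and uses the standard component form of the 4-velocity, $(\gamma, \gamma\mathbf{v})$ with $\gamma = 1/\sqrt{1 - \mathbf{v}^2}$; components are only introduced later in the chapter (Secs. 1.5 and 1.6), so this representation, the function names, and the collision data are assumptions made for illustration.

```python
import math

def minkowski_dot(a, b):
    """Inner product of two 4-vectors from their Lorentz-frame components (c = 1):
    -a0*b0 + a1*b1 + a2*b2 + a3*b3."""
    return -a[0] * b[0] + sum(ai * bi for ai, bi in zip(a[1:], b[1:]))

def four_velocity(vx, vy, vz):
    """Components (gamma, gamma*v) of the 4-velocity for ordinary velocity v, |v| < 1."""
    speed_squared = vx * vx + vy * vy + vz * vz
    if speed_squared >= 1.0:
        raise ValueError("a particle with nonzero rest mass must have |v| < 1")
    gamma = 1.0 / math.sqrt(1.0 - speed_squared)
    return (gamma, gamma * vx, gamma * vy, gamma * vz)

def four_momentum(rest_mass, vx, vy, vz):
    """Eq. (1.18): p = m u."""
    return tuple(rest_mass * component for component in four_velocity(vx, vy, vz))

def total_four_momentum(momenta):
    """Component-wise sum of a list of 4-momenta, as on either side of Eq. (1.23)."""
    return tuple(sum(components) for components in zip(*momenta))

def rest_mass_of(p):
    """Eq. (1.20) inverted: m = sqrt(-p^2)."""
    return math.sqrt(-minkowski_dot(p, p))

# Hypothetical collision: a unit-mass particle moving at v = 0.6 along x hits a
# unit-mass particle at rest, and the two merge into one final particle.
u1 = four_velocity(0.6, 0.0, 0.0)
p1 = four_momentum(1.0, 0.6, 0.0, 0.0)
p2 = four_momentum(1.0, 0.0, 0.0, 0.0)

# Conservation, Eq. (1.23): the single final particle carries the total 4-momentum.
p_final = total_four_momentum([p1, p2])
final_mass = rest_mass_of(p_final)
```

Here `minkowski_dot(u1, u1)` is $-1$ to rounding error, and `final_mass` comes out as $\sqrt{4.5} \approx 2.12$, which exceeds the summed rest masses of 2: in this frame part of the incoming kinetic energy ends up as rest mass of the merged particle.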
If a particle moves freely (no external forces and no collisions with other particles), then its 4-momentum $\vec p$ will be conserved along its world line, $d\vec p/d\zeta = 0$. Since $\vec p$ is tangent to the world line, this means that the direction of the world line never changes; i.e., the free particle moves along a straight line through spacetime. To change the particle's 4-momentum, one must act on it with a 4-force $\vec F$,

$$d\vec p/d\tau = \vec F . \tag{1.24}$$

If the particle is a fundamental one (e.g., photon, electron, proton), then the 4-force must leave its rest mass unchanged,

$$0 = \frac{dm^2}{d\tau} = -\frac{d\vec p^{\,2}}{d\tau} = -2\vec p\cdot\frac{d\vec p}{d\tau} = -2\vec p\cdot\vec F ; \tag{1.25}$$

i.e., the 4-force must be orthogonal to the 4-momentum.

1.4.3 [R] Geometric Derivation of the Lorentz Force Law

As an illustration of these physical concepts and mathematical tools, we shall use them to deduce the relativistic version of the Lorentz force law. From the outset, in accord with the principle of relativity, we insist that the law we seek be expressible in geometric, frame-independent language, i.e. in terms of vectors and tensors.

Consider a particle with charge $q$ and rest mass $m \neq 0$, interacting with an electromagnetic field. It experiences an electromagnetic 4-force whose mathematical form we seek. The Newtonian version of the electromagnetic force, $\mathbf F = q(\mathbf E + \mathbf v\times\mathbf B)$, is proportional to $q$ and contains one piece (electric) that is independent of velocity $\mathbf v$, and a second piece (magnetic) that is linear in $\mathbf v$. It is reasonable to expect that, in order to produce this Newtonian limit, the relativistic 4-force $\vec F$ will be proportional to $q$ and will be linear in the 4-velocity $\vec u$. Linearity means there must exist some second-rank tensor $\mathbf F(\_,\_)$, the “electromagnetic field tensor”, such that

$$d\vec p/d\tau = \vec F(\_) = q\,\mathbf F(\_,\vec u) . \tag{1.26}$$

Because the 4-force $\vec F$ must be orthogonal to the particle's 4-momentum and thence also to its 4-velocity, $\vec F\cdot\vec u \equiv \vec F(\vec u) = 0$, expression (1.26) must vanish when $\vec u$ is inserted into its empty slot.
In other words, for all timelike unit-length vectors $\vec u$,

$$\mathbf F(\vec u,\vec u) = 0 . \tag{1.27}$$

It is an instructive exercise (Ex. 1.3) to show that this is possible only if $\mathbf F$ is antisymmetric, so the electromagnetic 4-force is

$$d\vec p/d\tau = q\,\mathbf F(\_,\vec u) , \quad \text{where } \mathbf F(\vec A,\vec B) = -\mathbf F(\vec B,\vec A) \text{ for all } \vec A \text{ and } \vec B . \tag{1.28}$$

This must be the relativistic form of the Lorentz force law. In Sec. 1.10 below, we shall deduce the relationship of the electromagnetic field tensor $\mathbf F$ to the more familiar electric and magnetic fields, and the relationship of this relativistic Lorentz force to its Newtonian form (1.14).

This discussion of particle kinematics and the electromagnetic force is elegant, but perhaps unfamiliar. In Secs. 1.6 and 1.10 we shall see that it is equivalent to the more elementary (but more complex) formalism based on components of vectors.

**************************** EXERCISES

Note that $R_{ijik}$ is summed on the $i$ index, so it has only two free indices, $j$ and $k$, and thus is the component of a second rank tensor, as it must be if it is to represent the contraction of a fourth-rank tensor.

1.5.2 [R] Minkowski Spacetime

In Minkowski spacetime, associated with any inertial reference frame (Fig. 1.3 and associated discussion in Sec. 1.2.2), there is a Lorentz coordinate system $\{t, x, y, z\} = \{x^0, x^1, x^2, x^3\}$ generated by the frame's rods and clocks. And associated with these coordinates is a set of orthonormal basis vectors $\{\vec e_t, \vec e_x, \vec e_y, \vec e_z\} = \{\vec e_0, \vec e_1, \vec e_2, \vec e_3\}$; cf. Fig. 1.9. (The reason for putting the indices up on the coordinates but down on the basis vectors will become clear below.) The basis vector $\vec e_\alpha$ points along the $x^\alpha$ coordinate direction, which is orthogonal to all the other coordinate directions, and it has squared length $-1$ for $\alpha = 0$ (vector pointing in a timelike direction) and $+1$ for $\alpha = 1, 2, 3$ (spacelike):

$$\vec e_\alpha\cdot\vec e_\beta = \eta_{\alpha\beta} . \tag{1.30}$$

Here $\eta_{\alpha\beta}$, the orthonormality values (a spacetime analog of the Kronecker delta), are defined by

$$\eta_{00} \equiv -1 , \quad \eta_{11} \equiv \eta_{22} \equiv \eta_{33} \equiv 1 , \quad \eta_{\alpha\beta} \equiv 0 \text{ if } \alpha \neq \beta . \tag{1.31}$$

The fact that $\vec e_\alpha\cdot\vec e_\beta \neq \delta_{\alpha\beta}$ prevents many of the Euclidean-space component-manipulation formulas (1.29c)–(1.29h) from holding true in Minkowski spacetime. There are two approaches to recovering these formulas. One approach, used in many old textbooks (including the first and second editions of Goldstein's Classical Mechanics and Jackson's Classical Electrodynamics), is to set $x^0 = it$, where $i = \sqrt{-1}$, and correspondingly make the time basis vector be imaginary, so that $\vec e_\alpha\cdot\vec e_\beta = \delta_{\alpha\beta}$. When this approach is adopted, the resulting formalism does not care whether indices are placed up or down; one can place them wherever one's stomach or liver dictate without asking one's brain. However, this $x^0 = it$ approach has severe disadvantages: (i) it hides the true physical geometry of Minkowski spacetime, (ii) it cannot be extended in any reasonable manner to non-orthonormal bases in flat spacetime, and (iii) it cannot be extended in any reasonable manner to the curvilinear coordinates that one must use in general relativity. For these reasons, most modern texts (including the third editions of Goldstein and Jackson) take an alternative approach, one always used in general relativity. This alternative, which we shall adopt, requires introducing two different types of components for vectors, and analogously for tensors: contravariant components denoted by superscripts, e.g. $T^{\alpha\beta\gamma}$, and covariant components denoted by subscripts, e.g. $T_{\alpha\beta\gamma}$. In Parts I–V of this book we introduce these components only for orthonormal bases; in Part VI we develop a more sophisticated version of them, valid for nonorthonormal bases.

A vector or tensor's contravariant components are defined as its expansion coefficients in the chosen basis [analog of Eq. (1.29d) in Euclidean 3-space]:

$$\vec A \equiv A^\alpha \vec e_\alpha , \quad \mathbf T \equiv T^{\alpha\beta\gamma}\,\vec e_\alpha\otimes\vec e_\beta\otimes\vec e_\gamma . \tag{1.32a}$$

Here and throughout this book, Greek (spacetime) indices are to be summed whenever they are repeated with one up and the other down. The covariant components are defined as the numbers produced by evaluating the vector or tensor on its basis vectors [analog of Eq. (1.29e) in Euclidean 3-space]:

$$A_\alpha \equiv \vec A(\vec e_\alpha) = \vec A\cdot\vec e_\alpha , \quad T_{\alpha\beta\gamma} \equiv \mathbf T(\vec e_\alpha, \vec e_\beta, \vec e_\gamma) . \tag{1.32b}$$

These definitions have a number of important consequences. We shall derive them one after another and then at the end shall summarize them succinctly with equation numbers:

(i) The covariant components of the metric tensor are $g_{\alpha\beta} = \mathbf g(\vec e_\alpha, \vec e_\beta) = \vec e_\alpha\cdot\vec e_\beta = \eta_{\alpha\beta}$. Here the first equality is the definition (1.32b) of the covariant components and the second equality is the orthonormality relation (1.30) for the basis vectors.

(ii) The covariant components of any tensor can be computed from the contravariant components by $T_{\lambda\mu\nu} = \mathbf T(\vec e_\lambda, \vec e_\mu, \vec e_\nu) = T^{\alpha\beta\gamma}\,\vec e_\alpha\otimes\vec e_\beta\otimes\vec e_\gamma(\vec e_\lambda, \vec e_\mu, \vec e_\nu) = T^{\alpha\beta\gamma}(\vec e_\alpha\cdot\vec e_\lambda)(\vec e_\beta\cdot\vec e_\mu)(\vec e_\gamma\cdot\vec e_\nu) = T^{\alpha\beta\gamma}g_{\alpha\lambda}g_{\beta\mu}g_{\gamma\nu}$. The first equality is the definition (1.32b) of the covariant components, the second is the expansion (1.32a) of $\mathbf T$ on the chosen basis, the third is the definition (1.10a) of the tensor product, and the fourth is one version of our result (i) for the covariant components of the metric.

(iii) This result, $T_{\lambda\mu\nu} = T^{\alpha\beta\gamma}g_{\alpha\lambda}g_{\beta\mu}g_{\gamma\nu}$, together with the numerical values (i) of $g_{\alpha\beta}$, implies that when one lowers a spatial index there is no change in the numerical value of a component, and when one lowers a temporal index, the sign changes: $T_{ijk} = T^{ijk}$, $T_{0jk} = -T^{0jk}$, $T_{0j0} = +T^{0j0}$, $T_{000} = -T^{000}$. We shall call this the “sign-flip-if-temporal” rule. As a special case, $-1 = g_{00} = g^{00}$, $0 = g_{0j} = -g^{0j}$, $\delta_{jk} = g_{jk} = g^{jk}$; i.e., the metric's covariant and contravariant components are numerically identical; they are both equal to the orthonormality values $\eta_{\alpha\beta}$.
(iv) It is easy to see that this sign-flip-if-temporal rule for lowering indices implies the same sign-flip-if-temporal rule for raising them, which in turn can be written in terms of metric components as $T^{\alpha\beta\gamma} = T_{\lambda\mu\nu}\,g^{\lambda\alpha}g^{\mu\beta}g^{\nu\gamma}$.

(v) It is convenient to define mixed components of a tensor, components with some indices up and others down, as having numerical values obtained by raising or lowering some but not all of its indices using the metric, e.g. $T^\alpha{}_{\mu\nu} = T^{\alpha\beta\gamma}g_{\beta\mu}g_{\gamma\nu} = T_{\lambda\mu\nu}\,g^{\lambda\alpha}$. Numerically, this continues to follow the sign-flip-if-temporal rule: $T^0{}_{0k} = -T^{00k}$, $T^0{}_{jk} = T^{0jk}$, and it implies, in particular, that the mixed components of the metric are $g^\alpha{}_\beta = \delta_{\alpha\beta}$ (the Kronecker-delta values; plus one if $\alpha = \beta$ and zero otherwise).

Summarizing these results: The numerical values of the components of the metric in Minkowski spacetime are

$$g_{\alpha\beta} = \eta_{\alpha\beta} , \quad g^\alpha{}_\beta = \delta_{\alpha\beta} , \quad g_\alpha{}^\beta = \delta_{\alpha\beta} , \quad g^{\alpha\beta} = \eta_{\alpha\beta} ; \tag{1.32c}$$

and indices on all vectors and tensors can be raised and lowered using these components of the metric,

$$A_\alpha = g_{\alpha\beta}A^\beta , \quad A^\alpha = g^{\alpha\beta}A_\beta , \quad T^\alpha{}_{\mu\nu} \equiv g_{\mu\beta}g_{\nu\gamma}T^{\alpha\beta\gamma} , \quad T^{\alpha\beta\gamma} \equiv g^{\beta\mu}g^{\gamma\nu}T^\alpha{}_{\mu\nu} , \tag{1.32d}$$

which says numerically that lowering a temporal index changes the component's sign and lowering a spatial index leaves the component unchanged, and similarly for raising indices: the sign-flip-if-temporal rule.

This index notation gives rise to formulas for tensor products, inner products, values of tensors on vectors, and tensor contractions, that are the obvious analogs of those in Euclidean space:

$$[\text{Contravariant components of } \mathbf T(\_,\_,\_)\otimes\mathbf S(\_,\_)] = T^{\alpha\beta\gamma}S^{\delta\epsilon} , \tag{1.32e}$$

$$\vec A\cdot\vec B = A^\alpha B_\alpha = A_\alpha B^\alpha , \quad \mathbf T(\vec A,\vec B,\vec C) = T_{\alpha\beta\gamma}A^\alpha B^\beta C^\gamma = T^{\alpha\beta\gamma}A_\alpha B_\beta C_\gamma , \tag{1.32f}$$

$$\text{Covariant components of } [1\&3 \text{ contraction of } \mathbf R] = R^\mu{}_{\alpha\mu\beta} , \quad \text{Contravariant components of } [1\&3 \text{ contraction of } \mathbf R] = R^{\mu\alpha}{}_\mu{}^\beta . \tag{1.32g}$$

Notice the very simple pattern in Eqs. (1.32), which universally permeates the rules of index gymnastics, a pattern that permits one to reconstruct the rules without any memorization: Free indices (indices not summed over) must agree in position (up versus down) on the two sides of each equation. In keeping with this pattern, one often regards the two indices in a pair that is summed (one index up and the other down) as “strangling each other” and thereby being destroyed, and one speaks of “lining up the indices” on the two sides of an equation to get them to agree.

In Part VI, when we use non-orthonormal bases, all of these index-notation equations (1.32) will remain valid unchanged except for the numerical values (1.32c) of the metric components and the sign-flip-if-temporal rule.

1.5.3 [N & R] Slot-Naming Index Notation

[Note: In this and other sections marked “N&R”, the Newtonian reader should mentally lower all indices on tensor components and make them Latin; e.g. should mentally change $T^{\alpha\beta}{}_\alpha = T^{\alpha\beta\gamma}g_{\alpha\gamma}$ in Eq. (1.33) to $T_{aba} = T_{abc}\,g_{ac}$.]

We now pause, in our development of the component version of tensor algebra, to introduce a very important new viewpoint:

Consider the rank-2 tensor $\mathbf F(\_,\_)$. We can define a new tensor $\mathbf G(\_,\_)$ to be the same as $\mathbf F$, but with the slots interchanged; i.e., for any two vectors $\vec A$ and $\vec B$ it is true that $\mathbf G(\vec A,\vec B) = \mathbf F(\vec B,\vec A)$. We need a simple, compact way to indicate that $\mathbf F$ and $\mathbf G$ are equal except for an interchange of slots. The best way is to give the slots names, say $\alpha$ and $\beta$; i.e., to rewrite $\mathbf F(\_,\_)$ as $\mathbf F(\_{}_\alpha,\_{}_\beta)$ or more conveniently as $F_{\alpha\beta}$; and then to write the relationship between $\mathbf G$ and $\mathbf F$ as $G_{\alpha\beta} = F_{\beta\alpha}$. “NO!” some readers might object. This notation is indistinguishable from our notation for components on a particular basis. “GOOD!” a more astute reader will exclaim. The relation $G_{\alpha\beta} = F_{\beta\alpha}$ in a particular basis is a true statement if and only if “$\mathbf G = \mathbf F$ with slots interchanged” is true, so why not use the same notation to symbolize both? This, in fact, we shall do.
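The equivalence between the component statement $G_{\alpha\beta} = F_{\beta\alpha}$ and the slot statement $\mathbf G(\vec A,\vec B) = \mathbf F(\vec B,\vec A)$ can be checked numerically. The short Python sketch below does this for one arbitrary choice of components; the numbers and the helper names `evaluate` and `swap_slots` are illustrative assumptions only.

```python
# Components F_{alpha beta} of a hypothetical rank-2 tensor in some basis
# (the numbers are arbitrary illustrative values).
F = [[0.0, 1.0, 2.0, 3.0],
     [-1.0, 0.0, 4.0, 5.0],
     [7.0, -4.0, 0.5, 6.0],
     [-3.0, 2.5, -6.0, 0.0]]


def evaluate(T, A, B):
    """Value T(A, B) = T_{alpha beta} A^alpha B^beta, summed over repeated indices."""
    n = len(A)
    return sum(T[a][b] * A[a] * B[b] for a in range(n) for b in range(n))


def swap_slots(T):
    """Components G_{alpha beta} = T_{beta alpha} of T with its two slots interchanged."""
    n = len(T)
    return [[T[b][a] for b in range(n)] for a in range(n)]


G = swap_slots(F)

# Contravariant components of two arbitrary vectors.
A = [1.0, 2.0, 0.0, -1.0]
B = [0.5, -1.0, 3.0, 2.0]

# The component relation G_{alpha beta} = F_{beta alpha} reproduces
# the slot relation G(A, B) = F(B, A).
print(evaluate(G, A, B), evaluate(F, B, A))
```

The two printed numbers coincide for any choice of `A` and `B`, which is why the same symbols can stand both for components in a basis and for named slots.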
We shall ask our readers to look at any “index equation” such as $G_{\alpha\beta} = F_{\beta\alpha}$ as they would look at an Escher drawing: momentarily think of it as a relationship between components of tensors in a specific basis;

Fig. 1.10: Spacetime diagram in a specific Lorentz frame, showing the frame's 3-space $t = 0$ (stippled region), the 4-velocity $\vec u$ of a particle as it passes through that 3-space (i.e., at time $t = 0$); and two 3-dimensional vectors that lie in the 3-space: the spatial part of the particle's 4-velocity, $\mathbf u$, and the particle's ordinary velocity $\mathbf v = \mathbf u/\gamma$.

Here $v^j$ are the components of the ordinary velocity as measured in the Lorentz frame. This relation, together with the unit norm of $\vec u$, $\vec u^{\,2} = g_{\alpha\beta}u^\alpha u^\beta = -(u^0)^2 + \delta_{ij}u^iu^j = -1$, implies that the components of the 4-velocity have the forms familiar from elementary textbooks:

$$u^0 = \gamma , \quad u^j = \gamma v^j , \quad \text{where } \gamma = \frac{1}{(1 - \delta_{ij}v^iv^j)^{1/2}} . \tag{1.34c}$$

It is useful to think of $v^j$ as the components of a 3-dimensional vector $\mathbf v$, the ordinary velocity, that lives in the 3-dimensional Euclidean space $t = \text{const}$ of the chosen Lorentz frame. As we shall see below, this 3-space is not well defined until a Lorentz frame has been chosen, and correspondingly, $\mathbf v$ relies for its existence on a specific choice of frame. However, once the frame has been chosen, $\mathbf v$ can be regarded as a coordinate-independent, basis-independent 3-vector lying in the frame's 3-space $t = \text{const}$. Similarly, the spatial part of the 4-velocity $\vec u$ (the part with components $u^j$ in our chosen frame) can be regarded as a 3-vector $\mathbf u$ lying in the frame's 3-space; and Eqs. (1.34c) become the component versions of the coordinate-independent, basis-independent 3-space relations

$$\mathbf u = \gamma\mathbf v , \quad \gamma = \frac{1}{\sqrt{1 - \mathbf v^2}} . \tag{1.34d}$$

Figure 1.10 shows stippled the 3-space $t = 0$ of a specific Lorentz frame, and the 4-velocity $\vec u$ and ordinary velocity $\mathbf v$ of a particle as it passes through that 3-space.

The components of the particle's 4-momentum $\vec p$ in our chosen Lorentz frame have special names and special physical significances: The time component of the 4-momentum is the particle's energy $\mathcal E$ as measured in that frame,

$$\mathcal E \equiv p^0 = mu^0 = m\gamma = \frac{m}{\sqrt{1 - v^2}} = (\text{the particle's energy}) \simeq m + \tfrac{1}{2}mv^2 \quad \text{for } v \equiv |\mathbf v| \ll 1 . \tag{1.35a}$$

Note that this energy is the sum of the particle's rest mass-energy $m = mc^2$ and its kinetic energy $m\gamma - m$ (which, for low velocities, reduces to the familiar nonrelativistic kinetic energy $E = \tfrac{1}{2}mv^2$). The spatial components of the 4-momentum, when regarded from the viewpoint of 3-dimensional physics, are the same as the components of the momentum, a 3-vector residing in the chosen Lorentz frame's 3-space:

$$p^j = mu^j = m\gamma v^j = \frac{mv^j}{\sqrt{1 - v^2}} = \mathcal E v^j = (j\text{-component of particle's momentum}) ; \tag{1.35b}$$

or, in basis-independent, 3-dimensional vector notation,

$$\mathbf p = m\mathbf u = m\gamma\mathbf v = \frac{m\mathbf v}{\sqrt{1 - v^2}} = \mathcal E\mathbf v = (\text{particle's momentum}) . \tag{1.35c}$$

For a zero-rest-mass particle, as for one with finite rest mass, we identify the time component of the 4-momentum, in a chosen Lorentz frame, as the particle's energy, and the spatial part as its momentum. Moreover, if, appealing to quantum theory, we regard a zero-rest-mass particle as a quantum associated with a monochromatic wave, then quantum theory tells us that the wave's angular frequency $\omega$ as measured in a chosen Lorentz frame will be related to its energy by

$$\mathcal E \equiv p^0 = \hbar\omega = (\text{particle's energy}) ; \tag{1.36a}$$

and, since the particle has $\vec p^{\,2} = -(p^0)^2 + \mathbf p^2 = -m^2 = 0$ (in accord with the lightlike nature of its world line), its momentum as measured in the chosen Lorentz frame will be

$$\mathbf p = \mathcal E\mathbf n = \hbar\omega\mathbf n . \tag{1.36b}$$

Here $\mathbf n$ is the unit 3-vector that points in the direction of travel of the particle, as measured in the chosen frame; i.e. (since the particle moves at the speed of light $v = 1$), $\mathbf n$ is the particle's ordinary velocity. Eqs. (1.36a) and (1.36b) are the temporal and spatial components of the geometric, frame-independent relation $\vec p = \hbar\vec k$ [Eq. (1.21), which is valid for zero-rest-mass particles as well as finite-mass ones].

The introduction of a specific Lorentz frame into spacetime can be said to produce a “3+1” split of every 4-vector into a 3-dimensional vector plus a scalar (a real number). The 3+1 split of a particle's 4-momentum $\vec p$ produces its momentum $\mathbf p$ plus its energy $\mathcal E = p^0$; and correspondingly, the 3+1 split of the law of 4-momentum conservation (1.23) produces a law of conservation of momentum plus a law of conservation of energy:

$$\sum_{\bar A}\mathbf p_{\bar A} = \sum_{A}\mathbf p_{A} , \quad \sum_{\bar A}\mathcal E_{\bar A} = \sum_{A}\mathcal E_{A} . \tag{1.37}$$

Here the unbarred quantities are the momenta or energies of the particles entering the interaction region, and the barred quantities are the momenta or energies of those leaving; cf. Fig. 1.8.

Because the concept of energy does not even exist until one has chosen a Lorentz frame, and neither does that of momentum, the laws of energy conservation and momentum conservation separately are frame-dependent laws. In this sense they are far less fundamental than their combination, the frame-independent law of 4-momentum conservation.

Box 1.3 [N] Relativistic Particles for Newtonian Readers

Readers who are skipping the relativistic parts of this book will need to know two important pieces of relativity: (i) geometrized units, as embodied in Eqs. (1.3), and (ii) the (relativistic) energy and momentum of a moving particle, as described here:

A particle with rest mass $m$, moving with ordinary velocity $\mathbf v = d\mathbf x/dt$ and speed $v = |\mathbf v|$, has energy $\mathcal E$ (including its rest mass), energy $E$ (excluding its rest mass), and momentum $\mathbf p$ given by

$$\mathcal E = \frac{m}{\sqrt{1 - v^2}} \equiv \frac{m}{\sqrt{1 - v^2/c^2}} \equiv E + m , \quad \mathbf p = \mathcal E\mathbf v = \frac{m\mathbf v}{\sqrt{1 - v^2}} \tag{1}$$

[Eqs. (1.35)].
In the low-velocity, Newtonian limit, the energy $E$ with rest mass removed and the momentum $\mathbf p$ take their familiar, Newtonian forms:

$$\text{When } v \ll c \equiv 1 , \quad E \to \tfrac{1}{2}mv^2 \text{ and } \mathbf p \to m\mathbf v . \tag{2}$$

A particle with zero rest mass (a photon or a graviton$^7$) always moves with the speed of light $v = c = 1$, and like other particles it has momentum $\mathbf p = \mathcal E\mathbf v$, so the magnitude of its momentum is equal to its energy: $|\mathbf p| = \mathcal E v = \mathcal E$.

When particles interact (e.g. in chemical reactions, nuclear reactions, and elementary-particle collisions) the sum of the particle energies $\mathcal E$ is conserved, as is the sum of the particle momenta $\mathbf p$: Eq. (1.37).

By learning to think about the 3+1 split in a geometric, frame-independent way, one can gain much conceptual and computational power. As an example, consider a particle with 4-momentum $\vec p$, as studied by an observer with 4-velocity $\vec U$. In the observer's own Lorentz reference frame, her 4-velocity has components $U^0 = 1$ and $U^j = 0$, and therefore, her 4-velocity is $\vec U = U^\alpha\vec e_\alpha = \vec e_0$, i.e. it is identically equal to the time basis vector of her Lorentz frame. This means that the particle energy that she measures is $\mathcal E = p^0 = -p_0 = -\vec p\cdot\vec e_0 = -\vec p\cdot\vec U$. This equation, derived in the observer's Lorentz frame, is actually a geometric, frame-independent relation: the inner product of two 4-vectors. It says that when an observer with 4-velocity $\vec U$ measures the energy of a particle with 4-momentum $\vec p$, the result she gets (the time part of the 3+1 split of $\vec p$ as seen by her) is

$$\mathcal E = -\vec p\cdot\vec U . \tag{1.38}$$

We shall use this equation fairly often in later chapters. In Exs. 1.9 and 1.10, the reader can get experience at deriving and interpreting other frame-independent equations for 3+1 splits. Exercise 1.11 exhibits the power of this geometric way of thinking by using it to derive the Doppler shift of a photon.

**************************** EXERCISES

the basis vectors of one basis in terms of those of the other.
We shall denote the expansion coefficients by the letter $R$ and shall write

$$\mathbf e_i = \mathbf e_{\bar p}R_{\bar p i} , \quad \mathbf e_{\bar p} = \mathbf e_i R_{i\bar p} . \tag{1.43}$$

The quantities $R_{\bar p i}$ and $R_{i\bar p}$ are not the components of a tensor; rather, they are the elements of transformation matrices

$$[R_{\bar p i}] = \begin{bmatrix} R_{\bar 1 1} & R_{\bar 1 2} & R_{\bar 1 3} \\ R_{\bar 2 1} & R_{\bar 2 2} & R_{\bar 2 3} \\ R_{\bar 3 1} & R_{\bar 3 2} & R_{\bar 3 3} \end{bmatrix} , \quad [R_{i\bar p}] = \begin{bmatrix} R_{1\bar 1} & R_{1\bar 2} & R_{1\bar 3} \\ R_{2\bar 1} & R_{2\bar 2} & R_{2\bar 3} \\ R_{3\bar 1} & R_{3\bar 2} & R_{3\bar 3} \end{bmatrix} . \tag{1.44a}$$

(Here and throughout this book we use square brackets to denote matrices.) These two matrices must be the inverse of each other, since one takes us from the barred basis to the unbarred, and the other in the reverse direction, from unbarred to barred:

$$R_{\bar p i}R_{i\bar q} = \delta_{\bar p\bar q} , \quad R_{i\bar p}R_{\bar p j} = \delta_{ij} . \tag{1.44b}$$

The orthonormality requirement for the two bases implies that $\delta_{ij} = \mathbf e_i\cdot\mathbf e_j = (\mathbf e_{\bar p}R_{\bar p i})\cdot(\mathbf e_{\bar q}R_{\bar q j}) = R_{\bar p i}R_{\bar q j}(\mathbf e_{\bar p}\cdot\mathbf e_{\bar q}) = R_{\bar p i}R_{\bar q j}\delta_{\bar p\bar q} = R_{\bar p i}R_{\bar p j}$. This says that the transpose of $[R_{\bar p i}]$ is its inverse, which we have already denoted by $[R_{i\bar p}]$:

$$R_{i\bar p} = R_{\bar p i} . \tag{1.44c}$$

This property implies that the transformation matrix is orthogonal; i.e., the transformation is a reflection or a rotation [see, e.g., Goldstein (1980)]. Thus (as should be obvious and familiar), the bases associated with any two Euclidean coordinate systems are related by a reflection or rotation. [Note: Eq. (1.44c) does not say that $[R_{i\bar p}]$ is a symmetric matrix; in fact, it typically is not. Rather, (1.44c) says that $[R_{i\bar p}]$ is the transpose of $[R_{\bar p i}]$.]

The fact that a vector $\mathbf A$ is a geometric, basis-independent object implies that $\mathbf A = A_i\mathbf e_i = A_i(\mathbf e_{\bar p}R_{\bar p i}) = (R_{\bar p i}A_i)\mathbf e_{\bar p} = A_{\bar p}\mathbf e_{\bar p}$; i.e.,

$$A_{\bar p} = R_{\bar p i}A_i , \quad \text{and similarly } A_i = R_{i\bar p}A_{\bar p} ; \tag{1.45a}$$

and correspondingly for the components of a tensor

$$T_{\bar p\bar q\bar r} = R_{\bar p i}R_{\bar q j}R_{\bar r k}T_{ijk} , \quad T_{ijk} = R_{i\bar p}R_{j\bar q}R_{k\bar r}T_{\bar p\bar q\bar r} . \tag{1.45b}$$

It is instructive to compare the transformation law (1.45a) for the components of a vector with those (1.43) for the bases. To make these laws look natural, we have placed the transformation matrix on the left in the former and on the right in the latter.
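Equations (1.44b), (1.44c) and (1.45a) can be verified numerically for a simple case. The Python sketch below assumes, purely for illustration, that the barred basis is obtained from the unbarred one by a rotation through an angle of 0.3 radians about the $z$ axis; the helper names and the test vector are hypothetical.

```python
import math


def rotation_about_z(theta):
    """Matrix [R_{pbar i}] for a barred basis rotated by theta about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]


def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*M)]


def matmul(M, N):
    """Matrix product, i.e. a sum over the adjacent repeated index."""
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]


def apply(M, v):
    """Matrix acting on a column of components, as in Eq. (1.45a)."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]


R_bar_from_unbar = rotation_about_z(0.3)        # elements R_{pbar i}
R_unbar_from_bar = transpose(R_bar_from_unbar)  # elements R_{i pbar}, by Eq. (1.44c)

# Eq. (1.44b): the two matrices are inverses of each other.
identity = matmul(R_bar_from_unbar, R_unbar_from_bar)

# Eq. (1.45a): transform the components to the barred basis and back again.
A = [1.0, 2.0, 3.0]
A_bar = apply(R_bar_from_unbar, A)
A_back = apply(R_unbar_from_bar, A_bar)

print("R R^T     =", identity)
print("A_bar     =", A_bar)
print("A (again) =", A_back)
print("squared lengths:", sum(a * a for a in A), sum(a * a for a in A_bar))
```

The squared length is the same in both bases, as it must be for the components of a single geometric vector expressed in two orthonormal bases.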
In Minkowski spacetime, the placement of indices, up or down, will automatically tell us the order.

If we choose the origins of our two coordinate systems to coincide, then the vector $\mathbf x$ reaching from their common origin to some point $\mathcal P$ whose coordinates are $x_j$ and $x_{\bar p}$ has components equal to those coordinates; and as a result, the coordinates themselves obey the same transformation law as any other vector:

$$x_{\bar p} = R_{\bar p i}x_i , \quad x_i = R_{i\bar p}x_{\bar p} . \tag{1.45c}$$

The product of two rotation matrices, $[R_{i\bar p}R_{\bar p\bar s}]$, is another rotation matrix $[R_{i\bar s}]$, which transforms the Cartesian bases $\mathbf e_{\bar s}$ to $\mathbf e_i$. Under this product rule, the rotation matrices form a mathematical group: the rotation group, whose “representations” play an important role in quantum theory.

1.7.2 [R] Minkowski Spacetime: Lorentz Transformations

Consider two different inertial reference frames in Minkowski spacetime. Denote their Lorentz coordinates by $\{x^\alpha\}$ and $\{x^{\bar\mu}\}$ and their bases by $\{\vec e_\alpha\}$ and $\{\vec e_{\bar\mu}\}$, and write the transformation from one basis to the other as

$$\vec e_\alpha = \vec e_{\bar\mu}L^{\bar\mu}{}_\alpha , \quad \vec e_{\bar\mu} = \vec e_\alpha L^\alpha{}_{\bar\mu} . \tag{1.46}$$

As in Euclidean 3-space, $L^{\bar\mu}{}_\alpha$ and $L^\alpha{}_{\bar\mu}$ are elements of two different transformation matrices, and since these matrices operate in opposite directions, they must be the inverse of each other:

$$L^{\bar\mu}{}_\alpha L^\alpha{}_{\bar\nu} = \delta^{\bar\mu}{}_{\bar\nu} , \quad L^\alpha{}_{\bar\mu}L^{\bar\mu}{}_\beta = \delta^\alpha{}_\beta . \tag{1.47a}$$

Notice the up/down placement of indices on the elements of the transformation matrices: the first index is always up, and the second is always down. This is just a convenient convention which helps systematize the index shuffling rules in a way that can be easily remembered. Our rules about summing on the same index when up and down, and matching unsummed indices on the two sides of an equation, automatically dictate the matrix to use in each of the transformations (1.46); and similarly for all other equations in this section.

In Euclidean 3-space the orthonormality of the two bases dictated that the transformations must be orthogonal, i.e. must be reflections or rotations.
In Minkowski spacetime, orthonormality implies $g_{\alpha\beta} = \vec e_\alpha\cdot\vec e_\beta = (\vec e_{\bar\mu}L^{\bar\mu}{}_\alpha)\cdot(\vec e_{\bar\nu}L^{\bar\nu}{}_\beta) = L^{\bar\mu}{}_\alpha L^{\bar\nu}{}_\beta\,g_{\bar\mu\bar\nu}$; i.e.,

$$g_{\bar\mu\bar\nu}L^{\bar\mu}{}_\alpha L^{\bar\nu}{}_\beta = g_{\alpha\beta} , \quad \text{and similarly } g_{\alpha\beta}L^\alpha{}_{\bar\mu}L^\beta{}_{\bar\nu} = g_{\bar\mu\bar\nu} . \tag{1.47b}$$

Any matrix whose elements satisfy these equations is a Lorentz transformation.

From the fact that vectors and tensors are geometric, frame-independent objects, one can derive the Minkowski-space analogs of the Euclidean transformation laws for components (1.45a), (1.45b):

$$A^{\bar\mu} = L^{\bar\mu}{}_\alpha A^\alpha , \quad T^{\bar\mu\bar\nu\bar\rho} = L^{\bar\mu}{}_\alpha L^{\bar\nu}{}_\beta L^{\bar\rho}{}_\gamma T^{\alpha\beta\gamma} , \quad \text{and similarly in the opposite direction.} \tag{1.48a}$$

Notice that here, as elsewhere, these equations can be constructed by lining up indices in accord with our standard rules.

If (as is conventional) we choose the spacetime origins of the two Lorentz coordinate systems to coincide, then the vector $\vec x$ extending from the origin to some event $\mathcal P$, whose coordinates are $x^\alpha$ and $x^{\bar\alpha}$, has components equal to those coordinates. As a result, the transformation law for the coordinates takes the same form as that (1.48a) for components of a vector:

$$x^\alpha = L^\alpha{}_{\bar\mu}x^{\bar\mu} , \quad x^{\bar\mu} = L^{\bar\mu}{}_\alpha x^\alpha . \tag{1.48b}$$

The product $[L^\alpha{}_{\bar\mu}L^{\bar\mu}{}_{\bar\rho}]$ of two Lorentz transformation matrices is a Lorentz transformation matrix; and under this product rule, the Lorentz transformations form a mathematical group, the Lorentz group, whose “representations” play an important role in quantum field theory.

An important specific example of a Lorentz transformation is the following:

$$\|L^\alpha{}_{\bar\mu}\| = \begin{bmatrix} \gamma & \beta\gamma & 0 & 0 \\ \beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} , \quad \|L^{\bar\mu}{}_\alpha\| = \begin{bmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} , \tag{1.49a}$$

where $\beta$ and $\gamma$ are related by

$$|\beta| < 1 , \quad \gamma \equiv (1 - \beta^2)^{-1/2} . \tag{1.49b}$$

Fig. 1.12: Spacetime diagrams illustrating the pure boost (1.49c) from one Lorentz reference frame to another.

One can readily verify that these matrices are the inverses of each other and that they satisfy the Lorentz-transformation relation (1.47b). These transformation matrices produce the following change of coordinates [Eq. (1.48b)]:

$$t = \gamma(\bar t + \beta\bar x) , \quad x = \gamma(\bar x + \beta\bar t) , \quad y = \bar y , \quad z = \bar z ,$$
$$\bar t = \gamma(t - \beta x) , \quad \bar x = \gamma(x - \beta t) , \quad \bar y = y , \quad \bar z = z . \tag{1.49c}$$

These expressions reveal that any point at rest in the unbarred frame (a point with fixed, time-independent $x, y, z$) is seen in the barred frame to move along the world line $\bar x = \text{const} - \beta\bar t$, $\bar y = \text{const}$, $\bar z = \text{const}$. In other words, the unbarred frame is seen by observers at rest in the barred frame to move with uniform velocity $\vec v = -\beta\vec e_{\bar x}$, and correspondingly the barred frame is seen by observers at rest in the unbarred frame to move with the opposite uniform velocity $\vec v = +\beta\vec e_x$. This special Lorentz transformation is called a pure boost along the $x$ direction.

1.7.3 [R] Spacetime Diagrams for Boosts

Figure 1.12 illustrates the pure boost (1.49c). Diagram (a) in that figure is a two-dimensional spacetime diagram, with the $y$- and $z$-coordinates suppressed, showing the $\bar t$ and $\bar x$ axes of the boosted Lorentz frame $\bar F$ in the $t, x$ Lorentz coordinate system of the unboosted frame $F$. That the barred axes make angles $\tan^{-1}\beta$ with the unbarred axes, as shown, can be inferred from the Lorentz transformation equation (1.49c). Note that invariance of the interval guarantees that the event $\bar x = a$ on the $\bar x$-axis lies at the intersection of that axis with the dashed hyperbola $x^2 - t^2 = a^2$; and similarly, the event $\bar t = a$ on the $\bar t$-axis lies at the intersection of that axis with the dashed hyperbola $t^2 - x^2 = a^2$. As is shown in diagram (b) of the figure, the barred coordinates $\bar t, \bar x$ of an event $\mathcal P$ can be inferred by projecting from $\mathcal P$ onto the $\bar t$- and $\bar x$-axes, with the projection going parallel to the $\bar x$- and $\bar t$-axes respectively.
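The claims made about the boost matrices (1.49a), namely that they are mutually inverse, that they satisfy the Lorentz-transformation relation (1.47b), and that they produce the coordinate change (1.49c), can be checked numerically. The following Python sketch does so for an assumed boost speed $\beta = 0.6$ and an arbitrary event; the helper names and numerical values are illustrative assumptions.

```python
import math

# Components of the Minkowski metric in an orthonormal basis, signature (-, +, +, +).
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]


def boost_x(beta):
    """Matrix of Eq. (1.49a) taking barred to unbarred components, for |beta| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, beta * gamma, 0.0, 0.0],
            [beta * gamma, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def matmul(M, N):
    """Matrix product of two 4x4 matrices stored as lists of rows."""
    return [[sum(M[i][k] * N[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(M):
    return [list(row) for row in zip(*M)]


def apply(M, v):
    return [sum(M[i][k] * v[k] for k in range(4)) for i in range(4)]


def interval(x):
    """Squared interval from the origin: -t^2 + x^2 + y^2 + z^2."""
    return sum(ETA[a][a] * x[a] * x[a] for a in range(4))


beta = 0.6
L = boost_x(beta)        # barred -> unbarred
L_inv = boost_x(-beta)   # unbarred -> barred (second matrix in Eq. 1.49a)

product = matmul(L, L_inv)                        # Eq. (1.47a): identity matrix
metric_check = matmul(transpose(L), matmul(ETA, L))  # Eq. (1.47b): equals ETA

event_bar = [1.0, 2.0, 3.0, 4.0]                  # (tbar, xbar, ybar, zbar)
event = apply(L, event_bar)                       # Eq. (1.49c): (t, x, y, z)

print("t, x, y, z =", event)
print("interval in barred frame:  ", interval(event_bar))
print("interval in unbarred frame:", interval(event))
```

Replacing `beta` by any other value with magnitude below 1 leaves the three checks intact, which is the content of the statement that every such matrix is a Lorentz transformation.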
Diagram (c) shows the 4-velocity $\vec u$ of an observer at rest in frame $F$ and

Fig. 1.13: (a) Spacetime diagram depicting the twins paradox. Marked along the two world lines are intervals of proper time as measured by the two twins. (b) Spacetime diagram depicting the motions of the two mouths of a wormhole. Marked along the mouths' world tubes are intervals of proper time $\tau_c$ as measured by the single clock that sits on the common mouths.

1.8 [R] Time Travel

Time dilation is one facet of a more general phenomenon: Time, as measured by ideal clocks, is a “personal thing,” different for different observers who move through spacetime on different world lines. This is well illustrated by the infamous “twins paradox,” in which one twin, Methuselah, remains forever at rest in an inertial frame and the other, Florence, makes a spacecraft journey at high speed and then returns to rest beside Methuselah.

The twins' world lines are depicted in Fig. 1.13(a), a spacetime diagram whose axes are those of Methuselah's inertial frame. The time measured by an ideal clock that Methuselah carries is the coordinate time $t$ of his inertial frame; and its total time lapse, from Florence's departure to her return, is $t_{\rm return} - t_{\rm departure} \equiv T_{\rm Methuselah}$. By contrast, the time measured by an ideal clock that Florence carries is the proper time $\tau$, i.e. the square root of minus the invariant interval (1.7), along her world line; and thus her total time lapse from departure to return is

$$T_{\rm Florence} = \int d\tau = \int\sqrt{dt^2 - \delta_{ij}\,dx^i dx^j} = \int_0^{T_{\rm Methuselah}}\sqrt{1 - v^2}\,dt . \tag{1.52}$$

Here $(t, x^i)$ are the time and space coordinates of Methuselah's inertial frame, and $v$ is Florence's ordinary speed, $v = \sqrt{\delta_{ij}(dx^i/dt)(dx^j/dt)}$, relative to Methuselah's frame. Obviously, Eq. (1.52) predicts that $T_{\rm Florence}$ is less than $T_{\rm Methuselah}$. In fact (cf. Exercise 1.16), even if Florence's acceleration is kept no larger than one Earth gravity throughout her trip, and her trip lasts only $T_{\rm Florence} =$ (a few tens of years), $T_{\rm Methuselah}$ can be hundreds or thousands or millions or billions of years.

Does this mean that Methuselah actually “experiences” a far longer time lapse, and actually ages far more than Florence? Yes. The time experienced by humans and the aging of the human body are governed by chemical processes, which in turn are governed by the natural oscillation rates of molecules, rates that are constant to high accuracy when measured in terms of ideal time (or, equivalently, proper time $\tau$). Therefore, a human's experiential time and aging time are the same as the human's proper time, so long as the human is not subjected to such high accelerations as to damage her body.

In effect, then, Florence's spacecraft has functioned as a time machine to carry her far into Methuselah's future, with only a modest lapse of her own proper time (ideal time; experiential time; aging time).

Is it also possible, at least in principle, for Florence to construct a time machine that carries her into Methuselah's past, and also her own past? At first sight, the answer would seem to be Yes. Figure 1.13(b) shows one possible method, using a wormhole. [Papers on other methods are cited in Thorne (1993) and Friedman and Higuchi (2006).]

Wormholes are hypothetical “handles” in the topology of space. A simple model of a wormhole can be obtained by taking a flat 3-dimensional space, removing from it the interiors of two identical spheres, and identifying the spheres' surfaces so that if one enters the surface of one of the spheres, one immediately finds oneself exiting through the surface of the other. When this is done, there is a bit of strongly localized spatial curvature at the spheres' common surface, so to analyze such a wormhole properly, one must use general relativity rather than special relativity.
In particular, it is the laws of general relativity, combined with the laws of quantum field theory, that tell one how to construct such a wormhole and what kinds of materials (quantum fields) are required to “hold it open” so things can pass through it. Unfortunately, despite considerable effort, theoretical physicists have not yet deduced definitively whether those laws permit such wormholes to exist.$^9$ On the other hand, assuming such wormholes can exist, the following special relativistic analysis shows how one might be used to construct a machine for backward time travel.$^{10}$

The two identified spherical surfaces are called the wormhole's mouths. Ask Methuselah to keep one mouth with himself, forever at rest in his inertial frame, and ask Florence to take the other mouth with herself on her high-speed journey. The two mouths' world tubes (analogs of world lines for a 3-dimensional object) then have the forms shown in Fig. 1.13(b). Suppose that a single ideal clock sits on the wormhole's identified mouths, so that from the external Universe one sees it both on Methuselah's wormhole mouth and on Florence's. As seen on Methuselah's mouth, the clock measures his proper time, which is equal to the coordinate time $t$ [see tick marks along the left world tube in Fig. 1.13(b)]. As seen on Florence's mouth, the clock measures her proper time, Eq. (1.52) [see tick marks along the right world tube in Fig. 1.13(b)]. The result should be obvious, if surprising: When Florence returns to rest beside Methuselah, the wormhole has become a time machine. If she travels through the wormhole when the clock reads $\tau_c = 7$, she goes backward in time as seen in Methuselah's (or anyone else's) inertial frame; and then, in fact, traveling along the everywhere timelike, dashed world line, she is able to meet her younger self before she entered the wormhole.
This scenario is profoundly disturbing to most physicists because of the dangers of science-fiction-type paradoxes (e.g., the older Florence might kill her younger self, thereby preventing herself from making the trip through the wormhole and killing herself). Fortunately perhaps, it now seems moderately likely (though not certain) that vacuum fluctuations of quantum fields will destroy the wormhole at the moment when its mouths’ motion first makes backward time travel possible; and it may be that this mechanism will always prevent the construction of backward-travel time machines, no matter what tools one uses for their construction.11

9 See, e.g., Morris and Thorne (1987), Thorne (1993), Borde, Ford and Roman (2002), and references therein.
10 Morris, Thorne, and Yurtsever (1988).
11 Kim and Thorne (1991), Hawking (1992), Thorne (1993). But see also contrary indications in more recent research reviewed by Friedman and Higuchi (2006).

****************************
EXERCISES

Exercise 1.16 Example: [R] Twins Paradox

(a) The 4-acceleration of a particle or other object is defined by $\vec a \equiv d\vec u/d\tau$, where $\vec u$ is its 4-velocity and $\tau$ is proper time along its world line. Show that, if an observer carries an accelerometer, the magnitude of the acceleration $a$ measured by the accelerometer will always be equal to the magnitude of the observer’s 4-acceleration, $a = |\vec a| \equiv \sqrt{\vec a \cdot \vec a}$.

(b) In the twins paradox of Fig. 1.13(a), suppose that Florence begins at rest beside Methuselah, then accelerates in Methuselah’s $x$-direction with an acceleration $a$ equal to one Earth gravity, “1g”, for a time $T_{\rm Florence}/4$ as measured by her, then accelerates in the $-x$-direction at 1g for a time $T_{\rm Florence}/2$ thereby reversing her motion, and then accelerates in the $+x$-direction at 1g for a time $T_{\rm Florence}/4$ thereby returning to rest beside Methuselah. (This is the type of motion shown in the figure.) Show that the total time lapse as measured by Methuselah is

$$T_{\rm Methuselah} = \frac{4}{g}\sinh\left(\frac{g\,T_{\rm Florence}}{4}\right). \tag{1.53}$$

(c) Show that in the geometrized units used here, Florence’s acceleration (equal to the acceleration of gravity at the surface of the Earth) is $g = 1.033/{\rm yr}$. Plot $T_{\rm Methuselah}$ as a function of $T_{\rm Florence}$, and from your plot deduce that, if $T_{\rm Florence}$ is several tens of years, then $T_{\rm Methuselah}$ can be hundreds or thousands or millions or even billions of years.

Exercise 1.17 Challenge: [R] Around the World on TWA

In a long-ago era when an airline named Trans World Airlines (TWA) flew around the world, J. C. Hafele and R. E. Keating carried out a real live twins paradox experiment: They synchronized two atomic clocks, and then flew one around the world eastward on TWA, and on a separate trip, around the world westward, while the other clock remained at home at the U.S. Naval Observatory in Washington, D.C. When the clocks were compared after each trip, they were found to have aged differently. Making reasonable estimates for the airplane routing and speeds, compute the difference in aging, and compare your result

between $\boldsymbol{\epsilon}$ and the metric $\mathbf{g}$: If $\{\vec e_\alpha\}$ is an orthonormal basis [orthonormality being defined with the aid of the metric, $\vec e_\alpha \cdot \vec e_\beta = \mathbf{g}(\vec e_\alpha, \vec e_\beta) = \eta_{\alpha\beta}$ in spacetime and $= \delta_{\alpha\beta}$ in Euclidean space], and if this basis is right-handed (a new property, not determined by the metric), then

$$\boldsymbol{\epsilon}(\vec e_1, \vec e_2, \ldots, \vec e_N) = +1 \ \text{in a space of } N \text{ dimensions}; \qquad \boldsymbol{\epsilon}(\vec e_0, \vec e_1, \vec e_2, \vec e_3) = +1 \ \text{in spacetime}. \tag{1.59a}$$

The concept of right handedness should be familiar in Euclidean 2-space or 3-space. In spacetime, the basis is right handed if $\{\vec e_1, \vec e_2, \vec e_3\}$ is right handed and $\vec e_0$ points to the future. Equation (1.59a) and the antisymmetry of $\boldsymbol{\epsilon}$ imply that in an orthonormal, right-handed basis, the only nonzero covariant components of $\boldsymbol{\epsilon}$ are

$$\epsilon_{12\ldots N} = +1 , \qquad \epsilon_{\alpha\beta\ldots\nu} = \begin{cases} +1 & \text{if } \alpha, \beta, \ldots, \nu \text{ is an even permutation of } 1, 2, \ldots, N \\ -1 & \text{if } \alpha, \beta, \ldots, \nu \text{ is an odd permutation of } 1, 2, \ldots, N \\ \ \ 0 & \text{if } \alpha, \beta, \ldots, \nu \text{ are not all different.} \end{cases} \tag{1.59b}$$

(In spacetime the indices run from 0 to 3 rather than 1 to $N = 4$.) One can show that these components in one right-handed orthonormal frame imply these same components in all other right-handed orthonormal frames by virtue of the fact that the orthogonal (3-space) and Lorentz (spacetime) transformation matrices have unit determinant; and that in a left-handed orthonormal frame the signs of these components are reversed.

In 3-dimensional Euclidean space, the Levi-Civita tensor is used to define the cross product and the curl:

$$\mathbf{A} \times \mathbf{B} \equiv \boldsymbol{\epsilon}(\_, \mathbf{A}, \mathbf{B}) \quad \text{i.e., in slot-naming index notation, } \epsilon_{ijk}A_jB_k ; \tag{1.60a}$$

$$\nabla \times \mathbf{A} \equiv (\text{the vector field whose slot-naming index form is } \epsilon_{ijk}A_{k;j}) . \tag{1.60b}$$

[Equation (1.60b) is an example of an expression that is complicated if written in index-free notation; it says that $\nabla \times \mathbf{A}$ is the double contraction of the rank-5 tensor $\boldsymbol{\epsilon} \otimes \nabla\mathbf{A}$ on its second and fifth slots, and on its third and fourth slots.]

Although Eqs. (1.60a) and (1.60b) look like complicated ways to deal with concepts that most readers regard as familiar and elementary, they have great power. The power comes from the following property of the Levi-Civita tensor in Euclidean 3-space [readily derivable from its components (1.59b)]:

$$\epsilon_{ijm}\epsilon_{klm} = \delta^{ij}_{kl} \equiv \delta^i_k\delta^j_l - \delta^i_l\delta^j_k . \tag{1.61}$$

Here $\delta^i_k$ is the Kronecker delta. Examine the 4-index delta function $\delta^{ij}_{kl}$ carefully; it says that either the indices above and below each other must be the same ($i = k$ and $j = l$) with a + sign, or the diagonally related indices must be the same ($i = l$ and $j = k$) with a − sign. [We have put the indices $ij$ of $\delta^{ij}_{kl}$ up solely to facilitate remembering this rule. Recall (first paragraph of Sec. 1.5) that in Euclidean space and Cartesian coordinates, it does not matter whether indices are up or down.] With the aid of Eq.
(1.61) and the index-notation expressions for the cross product and curl, one can quickly and easily derive a wide variety of useful vector identities; see the very important Exercise 1.18.

****************************
EXERCISES

Exercise 1.18 **Example and Practice: [N] Vectorial Identities for the Cross Product and Curl

Here is an example of how to use index notation to derive a vector identity for the double cross product $\mathbf{A} \times (\mathbf{B} \times \mathbf{C})$: In index notation this quantity is $\epsilon_{ijk}A_j(\epsilon_{klm}B_lC_m)$. By permuting the indices on the second $\epsilon$ and then invoking Eq. (1.61), we can write this as $\epsilon_{ijk}\epsilon_{lmk}A_jB_lC_m = \delta^{lm}_{ij}A_jB_lC_m$. By then invoking the meaning (1.61) of the 4-index delta function, we bring this into the form $A_jB_iC_j - A_jB_jC_i$, which is the index-notation form of $(\mathbf{A} \cdot \mathbf{C})\mathbf{B} - (\mathbf{A} \cdot \mathbf{B})\mathbf{C}$. Thus, it must be that $\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = (\mathbf{A} \cdot \mathbf{C})\mathbf{B} - (\mathbf{A} \cdot \mathbf{B})\mathbf{C}$. Use similar techniques to evaluate the following quantities:

(a) $\nabla \times (\nabla \times \mathbf{A})$
(b) $(\mathbf{A} \times \mathbf{B}) \cdot (\mathbf{C} \times \mathbf{D})$
(c) $(\mathbf{A} \times \mathbf{B}) \times (\mathbf{C} \times \mathbf{D})$

****************************

1.10 [R] Nature of Electric and Magnetic Fields; Maxwell’s Equations

Now that we have introduced the gradient and the Levi-Civita tensor, we are prepared to study the relationship of the relativistic version of electrodynamics to the nonrelativistic (“Newtonian”) version.

Consider a particle with charge $q$, rest mass $m$ and 4-velocity $\vec u$ interacting with an electromagnetic field $\mathbf{F}(\_, \_)$. In index notation, the electromagnetic 4-force acting on the particle [Eq. (1.28)] is

$$dp^\alpha/d\tau = qF^{\alpha\beta}u_\beta . \tag{1.62}$$

Let us examine this 4-force in some arbitrary inertial reference frame in which the particle’s ordinary-velocity components are $v^j = v_j$ and its 4-velocity components are $u^0 = \gamma$, $u^j = \gamma v^j$ [Eqs. (1.34c)]. Anticipating the connection with the nonrelativistic viewpoint, we introduce the following notation for the contravariant components of the antisymmetric electromagnetic field tensor:

$$F^{0j} = -F^{j0} = E_j , \qquad F^{ij} = \epsilon_{ijk}B_k . \tag{1.63}$$

(Recall that spatial indices, being Euclidean, can be placed up or down freely with no change in sign of the indexed quantity.) Inserting these components of $\mathbf{F}$ and $\vec u$ into Eq. (1.62) and using the relationship $dt/d\tau = u^0 = \gamma$ between $t$ and $\tau$ derivatives, we obtain for the components of the 4-force $dp_j/d\tau = \gamma\,dp_j/dt = q\gamma(E_j + \epsilon_{jkl}v_kB_l)$ and $dp^0/d\tau = \gamma\,dp^0/dt = q\gamma E_jv_j$. Dividing by $\gamma$, converting into 3-space index notation, and denoting the particle’s energy by $\mathcal{E} = p^0$, we bring these into the familiar Lorentz-force form

$$d\mathbf{p}/dt = q(\mathbf{E} + \mathbf{v} \times \mathbf{B}) , \qquad d\mathcal{E}/dt = q\,\mathbf{v} \cdot \mathbf{E} . \tag{1.64}$$

Evidently $\mathbf{E}$ is the electric field and $\mathbf{B}$ the magnetic field as measured in our chosen Lorentz frame.

[Fig. 1.14: The electric and magnetic fields measured by an observer with 4-velocity $\vec w$, shown as 4-vectors $\vec E_{\vec w}$ and $\vec B_{\vec w}$ that lie in the observer’s 3-surface of simultaneity (stippled 3-surface orthogonal to $\vec w$).]

This may be familiar from standard electrodynamics textbooks, e.g. Jackson (1999). Not so familiar, but quite important, is the following geometric interpretation of $\mathbf{E}$ and $\mathbf{B}$:

The electric and magnetic fields $\mathbf{E}$ and $\mathbf{B}$ are spatial vectors as measured in the chosen inertial frame. We can also regard them as 4-vectors that lie in the 3-surface of simultaneity $t = {\rm const}$ of the chosen frame, i.e. that are orthogonal to the 4-velocity (denote it $\vec w$) of the frame’s observers (cf. Figs. 1.12 and 1.14). We shall denote this 4-vector version of $\mathbf{E}$ and $\mathbf{B}$ by $\vec E_{\vec w}$ and $\vec B_{\vec w}$, where the subscript $\vec w$ identifies the 4-velocity of the observers who measure these fields. These fields are depicted in Fig. 1.14.

In the rest frame of the observer $\vec w$, the components of $\vec E_{\vec w}$ are $E^0_{\vec w} = 0$, $E^j_{\vec w} = E_j$ [the $E_j$ appearing in Eqs. (1.63)], and similarly for $\vec B_{\vec w}$; and the components of $\vec w$ are $w^0 = 1$, $w^j = 0$. Therefore, in this frame Eqs. (1.63) can be rewritten as

$$E^\alpha_{\vec w} = F^{\alpha\beta}w_\beta , \qquad B^\beta_{\vec w} = \tfrac{1}{2}\epsilon^{\alpha\beta\gamma\delta}F_{\gamma\delta}w_\alpha . \tag{1.65a}$$

(To verify this, insert the above components of $\mathbf{F}$ and $\vec w$ into these equations and, after some algebra, recover Eqs. (1.63) along with $E^0_{\vec w} = B^0_{\vec w} = 0$.) Equations (1.65a) say that in one special reference frame, that of the observer $\vec w$, the components of the 4-vectors on the left and on the right are equal. This implies that in every Lorentz frame the components of these 4-vectors will be equal; i.e., it implies that Eqs. (1.65a) are true when one regards them as geometric, frame-independent equations written in slot-naming index notation. These equations enable one to compute the electric and magnetic fields measured by an observer (viewed as 4-vectors in the observer’s 3-surface of simultaneity) from the observer’s 4-velocity and the electromagnetic field tensor, without the aid of any basis or reference frame.

1.11 Volumes, Integration, and Integral Conservation Laws

1.11.1 [N] Newtonian Volumes and Integration

The Levi-Civita tensor is the foundation for computing volumes and performing volume integrals in any number of dimensions. In Cartesian coordinates of 2-dimensional Euclidean space, the area (i.e. 2-dimensional volume) of a parallelogram whose sides are $\mathbf{A}$ and $\mathbf{B}$ is

$$\text{2-Volume} = \epsilon_{ab}A_aB_b = A_1B_2 - A_2B_1 = \det\begin{bmatrix} A_1 & B_1 \\ A_2 & B_2 \end{bmatrix} , \tag{1.70a}$$

a relation that should be familiar from elementary geometry. Equally familiar should be the expression for the 3-dimensional volume of a parallelopiped with legs $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$:

$$\text{3-Volume} = \epsilon_{ijk}A_iB_jC_k = \mathbf{A} \cdot (\mathbf{B} \times \mathbf{C}) = \det\begin{bmatrix} A_1 & B_1 & C_1 \\ A_2 & B_2 & C_2 \\ A_3 & B_3 & C_3 \end{bmatrix} . \tag{1.70b}$$

Recall that this volume has a sign: it is positive if $\{\mathbf{A}, \mathbf{B}, \mathbf{C}\}$ is a right handed set of vectors and negative if left-handed. Equations (1.70a) and (1.70b) are foundations from which one can derive the usual formulae $dA = dx\,dy$ and $dV = dx\,dy\,dz$ for the area and volume of elementary surface and volume elements with Cartesian side lengths $dx$, $dy$ and $dz$.
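As an illustration of Eqs. (1.70a) and (1.70b), the following is a minimal numerical sketch, assuming Cartesian components in a right-handed orthonormal basis. The helper names `levi_civita` and `volume` are hypothetical and introduced only for this example; the sketch builds the Levi-Civita components of Eq. (1.59b), inserts the legs into the slots one at a time, and compares the result with the determinant.

```python
import itertools
import numpy as np

def levi_civita(n):
    """Levi-Civita components in n dimensions (right-handed orthonormal basis)."""
    eps = np.zeros((n,) * n)
    for perm in itertools.permutations(range(n)):
        # Parity of a permutation = (number of inversions) mod 2.
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if perm[i] > perm[j])
        eps[perm] = -1.0 if inversions % 2 else 1.0
    return eps

def volume(*legs):
    """Signed n-volume epsilon(A, B, ...) of the parallelepiped with n legs in n dimensions."""
    result = levi_civita(len(legs))
    for leg in legs:
        # Contract the first remaining slot of epsilon with the next leg.
        result = np.tensordot(np.asarray(leg, dtype=float), result, axes=([0], [0]))
    return float(result)

if __name__ == "__main__":
    A, B, C = [1.0, 2.0, 0.0], [0.0, 1.0, 3.0], [2.0, 0.0, 1.0]
    print("epsilon(A,B,C) =", volume(A, B, C))
    print("det [A B C]    =", np.linalg.det(np.column_stack([A, B, C])))
    print("epsilon(B,A,C) =", volume(B, A, C))  # swapping two legs flips the sign
```

Inserting legs slot by slot mirrors the geometric definition of the volume as the tensor $\boldsymbol{\epsilon}$ with all of its slots filled; for large $N$ the determinant is of course far cheaper.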
In Euclidean 3-space, we define the vectorial surface area of a 2-dimensional parallelogram with legs $\mathbf{A}$ and $\mathbf{B}$ to be

$$\boldsymbol{\Sigma} = \mathbf{A} \times \mathbf{B} = \boldsymbol{\epsilon}(\_, \mathbf{A}, \mathbf{B}) . \tag{1.70c}$$

This vectorial surface area has a magnitude equal to the area of the parallelogram and a direction perpendicular to it. Such vectorial surface areas are the foundation for surface integrals in 3-dimensional space, and for the familiar Gauss theorem

$$\int_{\mathcal{V}_3} (\nabla \cdot \mathbf{A})\,dV = \int_{\partial\mathcal{V}_3} \mathbf{A} \cdot d\boldsymbol{\Sigma} \tag{1.71a}$$

(where $\mathcal{V}_3$ is a compact 3-dimensional region and $\partial\mathcal{V}_3$ is its closed two-dimensional boundary) and Stokes theorem

$$\int_{\mathcal{V}_2} \nabla \times \mathbf{A} \cdot d\boldsymbol{\Sigma} = \int_{\partial\mathcal{V}_2} \mathbf{A} \cdot d\mathbf{l} \tag{1.71b}$$

(where $\mathcal{V}_2$ is a compact 2-dimensional region, $\partial\mathcal{V}_2$ is the 1-dimensional closed curve that bounds it, and the last integral is a line integral around that curve).

Notice that in Euclidean 3-space, the vectorial surface area $\boldsymbol{\epsilon}(\_, \mathbf{A}, \mathbf{B})$ can be thought of as an object that is waiting for us to insert a third leg $\mathbf{C}$ so as to compute a volume $\boldsymbol{\epsilon}(\mathbf{C}, \mathbf{A}, \mathbf{B})$, the volume of the parallelopiped with legs $\mathbf{C}$, $\mathbf{A}$, and $\mathbf{B}$.

This mathematics is illustrated by the integral and differential conservation laws for electric charge and for particles: The total charge and the total number of particles inside a three dimensional region of space $\mathcal{V}_3$ are $\int_{\mathcal{V}_3} \rho_e\,dV$ and $\int_{\mathcal{V}_3} n\,dV$, where $\rho_e$ is the charge density and $n$ the number density of particles. The rates that charge and particles flow out of $\mathcal{V}_3$ are the integrals of the current density $\mathbf{j}$ and the particle flux vector $\mathbf{S}$ over its boundary $\partial\mathcal{V}_3$. Therefore, the laws of charge conservation and particle conservation say

$$\frac{d}{dt}\int_{\mathcal{V}_3} \rho_e\,dV + \int_{\partial\mathcal{V}_3} \mathbf{j} \cdot d\boldsymbol{\Sigma} = 0 , \qquad \frac{d}{dt}\int_{\mathcal{V}_3} n\,dV + \int_{\partial\mathcal{V}_3} \mathbf{S} \cdot d\boldsymbol{\Sigma} = 0 . \tag{1.72}$$

Pull the time derivative inside each volume integral (where it becomes a partial derivative), and apply Gauss’s theorem (1.71a) to each surface integral; the results are $\int_{\mathcal{V}_3} (\partial\rho_e/\partial t + \nabla \cdot \mathbf{j})\,dV = 0$ and similarly for particles. The only way these equations can be true for all choices of $\mathcal{V}_3$ is by the integrands vanishing:

$$\partial\rho_e/\partial t + \nabla \cdot \mathbf{j} = 0 , \qquad \partial n/\partial t + \nabla \cdot \mathbf{S} = 0 . \tag{1.73}$$

These are the differential conservation laws for charge and for particles. They have a standard, universal form: the time derivative of the density of a quantity plus the divergence of its flux vanishes.

1.11.2 [R] Spacetime Volumes and Integration

The generalization to 4-dimensional spacetime should be obvious: The 4-dimensional parallelopiped whose legs are the four vectors $\vec A$, $\vec B$, $\vec C$, $\vec D$ has a 4-dimensional volume given by the analog of Eqs. (1.70a) and (1.70b):

$$\text{4-Volume} = \epsilon_{\alpha\beta\gamma\delta}A^\alpha B^\beta C^\gamma D^\delta = \boldsymbol{\epsilon}(\vec A, \vec B, \vec C, \vec D) = \det\begin{bmatrix} A^0 & B^0 & C^0 & D^0 \\ A^1 & B^1 & C^1 & D^1 \\ A^2 & B^2 & C^2 & D^2 \\ A^3 & B^3 & C^3 & D^3 \end{bmatrix} . \tag{1.74}$$

Note that this 4-volume is positive if the set of vectors $\{\vec A, \vec B, \vec C, \vec D\}$ is right-handed and negative if left-handed.

Just as Eqs. (1.70a) and (1.70b) give us a way to perform area and volume integrals in 2- and 3-dimensional Euclidean space, so Equation (1.74) provides us a way to perform volume integrals over 4-dimensional Minkowski spacetime: To integrate a tensor field $\mathbf{T}$ over some region $\mathcal{V}$ of spacetime, we need only divide spacetime up into tiny parallelopipeds, multiply the 4-volume $d\Sigma$ of each parallelopiped by the value of $\mathbf{T}$ at its center, and add. It is not hard to see from Eq. (1.74) that in any right-handed Lorentz coordinate system, the 4-volume of a tiny parallelopiped whose edges are $dx^\alpha$ along the four orthogonal coordinate axes is $d\Sigma = dt\,dx\,dy\,dz$ (the analog of $dV = dx\,dy\,dz$), and correspondingly the integral of $\mathbf{T}$ over $\mathcal{V}$ can be expressed as

$$\int_{\mathcal{V}} T^{\alpha\beta\gamma}\,d\Sigma = \int_{\mathcal{V}} T^{\alpha\beta\gamma}\,dt\,dx\,dy\,dz . \tag{1.75}$$

By analogy with the vectorial area (1.70c) of a parallelogram in 3-space, any 3-dimensional parallelopiped in spacetime with legs $\vec A$, $\vec B$, $\vec C$ has a vectorial 3-volume $\vec\Sigma$ (not to be confused with the scalar 4-volume $\Sigma$) defined by

$$\vec\Sigma(\_) = \boldsymbol{\epsilon}(\_, \vec A, \vec B, \vec C) ; \qquad \Sigma_\mu = \epsilon_{\mu\alpha\beta\gamma}A^\alpha B^\beta C^\gamma . \tag{1.76}$$

Here we have written the volume vector both in abstract notation and in component notation.
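The component form of Eq. (1.76) can be evaluated numerically. The sketch below is illustrative only: it assumes a right-handed Lorentz basis with $\epsilon_{0123} = +1$ and the metric signature $(-,+,+,+)$ (timelike vectors have negative squared length), and the names `ETA`, `EPS`, `three_volume_vector` and `four_volume` are hypothetical helpers, not notation from the text.

```python
import itertools
import numpy as np

# Assumption: Minkowski metric with signature (-,+,+,+); index order (t, x, y, z).
ETA = np.diag([-1.0, 1.0, 1.0, 1.0])

def levi_civita4():
    """Covariant Levi-Civita components with epsilon_{0123} = +1."""
    eps = np.zeros((4, 4, 4, 4))
    for perm in itertools.permutations(range(4)):
        inversions = sum(1 for i in range(4) for j in range(i + 1, 4)
                         if perm[i] > perm[j])
        eps[perm] = -1.0 if inversions % 2 else 1.0
    return eps

EPS = levi_civita4()

def three_volume_vector(A, B, C):
    """Return (Sigma_mu, Sigma^mu) for legs A, B, C given by contravariant components."""
    A, B, C = (np.asarray(v, dtype=float) for v in (A, B, C))
    sigma_lower = np.einsum('mabc,a,b,c->m', EPS, A, B, C)  # Eq. (1.76)
    sigma_upper = ETA @ sigma_lower  # raise the index; eta^{mu nu} = eta_{mu nu} here
    return sigma_lower, sigma_upper

def four_volume(A, B, C, D):
    """Signed 4-volume epsilon(A, B, C, D) of Eq. (1.74)."""
    A, B, C, D = (np.asarray(v, dtype=float) for v in (A, B, C, D))
    return float(np.einsum('abcd,a,b,c,d->', EPS, A, B, C, D))

if __name__ == "__main__":
    e_t, e_x, e_y, e_z = np.eye(4)
    # Purely spatial legs: Delta x = 2, Delta y = 3, Delta z = 5.
    lower, upper = three_volume_vector(2 * e_x, 3 * e_y, 5 * e_z)
    print("Sigma_mu =", lower, " Sigma^mu =", upper)
    # Filling the empty slot with a fourth leg gives the 4-volume.
    print("epsilon(e_t, legs) =", four_volume(e_t, 2 * e_x, 3 * e_y, 5 * e_z))
```

For purely spatial legs the contravariant vector comes out along $-\vec e_0$ with magnitude equal to the ordinary volume of the parallelopiped, while its contraction with a future-pointing fourth leg is positive.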
This volume vector has one empty slot, ready and waiting for a fourth vector (“leg”) to be inserted, so as to compute the 4-volume $\Sigma$ of a 4-dimensional parallelopiped.

Notice that the volume vector $\vec\Sigma$ is orthogonal to each of its three legs (because of the antisymmetry of $\boldsymbol{\epsilon}$), and thus (unless it is null) it can be written as $\vec\Sigma = V\vec n$ where $V$ is the magnitude of the volume and $\vec n$ is the unit normal to the three legs.

Interchanging any two legs of the parallelopiped reverses the 3-volume’s sign. Consequently, the 3-volume is characterized not only by its legs but also by the order of its legs, or equally well, in two other ways: (i) by the direction of the vector $\vec\Sigma$ (reverse the order of the legs, and the direction of $\vec\Sigma$ will reverse); and (ii) by the sense of the 3-volume, defined as follows. Just as a 2-volume (i.e., a segment of a plane) in 3-dimensional space has two sides, so a 3-volume in 4-dimensional spacetime has two sides; cf. Fig. 1.15. Every vector $\vec D$ for which $\vec\Sigma \cdot \vec D > 0$ points out of one side of the 3-volume $\vec\Sigma$. We shall call that side the “positive side” of $\vec\Sigma$; and we shall call the other side, the one out of which point vectors $\vec D$ with $\vec\Sigma \cdot \vec D < 0$, its “negative side”. When something moves through or reaches through or points through the 3-volume from its negative side to its positive side, we say that this thing is moving or reaching or pointing in the “positive sense”; and similarly for “negative sense”. The examples shown in Fig. 1.15 should make this more clear.

[Fig. 1.15, panels (a) and (b), axes $t$, $x$, $y$: Spacetime diagrams depicting 3-volumes in 4-dimensional spacetime, with one spatial dimension (that along the $z$-direction) suppressed.]

Figure 1.15(a) shows two of the three legs of the volume vector $\vec\Sigma = \boldsymbol{\epsilon}(\_, \Delta x\,\vec e_x, \Delta y\,\vec e_y, \Delta z\,\vec e_z)$, where $x$, $y$, $z$ are the spatial coordinates of a specific Lorentz frame.
It is easy to show that this vector can also be written as $\vec\Sigma = -\Delta V\,\vec e_0$, where $\Delta V$ is the ordinary volume of the parallelopiped as measured by an observer in the chosen Lorentz frame, $\Delta V = \Delta x\,\Delta y\,\Delta z$. Thus, the direction of the vector $\vec\Sigma$ is toward the past (direction of decreasing Lorentz time $t$). From this, and the fact that timelike vectors have negative squared length, it is easy to infer that $\vec\Sigma \cdot \vec D > 0$ if and only if the vector $\vec D$ points out of the “future” side of the 3-volume (the side of increasing Lorentz time $t$); therefore, the positive side of $\vec\Sigma$ is the future side. This means that the vector $\vec\Sigma$ points in the negative sense of its own 3-volume.

Figure 1.15(b) shows two of the three legs of the volume vector $\vec\Sigma = \boldsymbol{\epsilon}(\_, \Delta t\,\vec e_t, \Delta y\,\vec e_y, \Delta z\,\vec e_z) = -\Delta t\,\Delta A\,\vec e_x$ (with $\Delta A = \Delta y\,\Delta z$). In this case, $\vec\Sigma$ points in its own positive sense.

In Exercise 1.24, we show that when this global law of charge conservation (1.80) is subjected to a 3+1 split of spacetime into space plus time, it becomes the nonrelativistic integral law of charge conservation (1.72).

This global conservation law can be converted into a local conservation law with the help of the 4-dimensional Gauss theorem (1.79), $\int_{\partial\mathcal{V}} J^\alpha\,d\Sigma_\alpha = \int_{\mathcal{V}} J^\alpha{}_{;\alpha}\,d\Sigma$. Since the left-hand side vanishes, so must the right-hand side; and in order for this 4-volume integral to vanish for every choice of $\mathcal{V}$, it is necessary that the integrand vanish everywhere in spacetime:

$$J^\alpha{}_{;\alpha} = 0 ; \quad \text{i.e. } \vec\nabla \cdot \vec J = 0 . \tag{1.81}$$

In a specific but arbitrary Lorentz frame (i.e., in a 3+1 split of spacetime into space plus time), this becomes the standard differential law of charge conservation (1.73).

1.11.4 [R] Conservation of Particles, Baryons and Rest Mass

Any conserved scalar quantity obeys conservation laws of the same form as those for electric charge. For example, if the number of particles of some species (e.g.
electrons or protons or photons) is conserved, then we can introduce for that species a number-flux 4-vector $\vec S$ (analog of charge-current 4-vector $\vec J$): In any Lorentz coordinate system $S^0$ is the number density of particles $n$ and $S^j$ is the particle flux. If $\vec\Sigma$ is a small 3-volume (3-surface) in spacetime, then $\vec S(\vec\Sigma) = S^\alpha\Sigma_\alpha$ is the number of particles that pass through $\vec\Sigma$ from its negative side to its positive side. The frame-invariant global and local conservation laws for these particles take the same form as those for electric charge:

$$\int_{\partial\mathcal{V}} S^\alpha\,d\Sigma_\alpha = 0 , \quad \text{where } \partial\mathcal{V} \text{ is any closed 3-surface in spacetime,} \tag{1.82a}$$

$$S^\alpha{}_{;\alpha} = 0 ; \quad \text{i.e. } \vec\nabla \cdot \vec S = 0 . \tag{1.82b}$$

When fundamental particles (e.g. protons and antiprotons) are created and destroyed by quantum processes, the total baryon number (number of baryons minus number of antibaryons) is still conserved, or at least this is so to the accuracy of all experiments performed thus far. We shall assume it so in this book. This law of baryon-number conservation takes the forms (1.82a) and (1.82b), with $\vec S$ the number-flux 4-vector for baryons (with antibaryons counted negatively).

It is useful to reexpress this baryon-number conservation law in Newtonian-like language by introducing a universally agreed upon mean rest mass per baryon $\bar m_B$. This $\bar m_B$ is often taken to be 1/56 the mass of an $^{56}$Fe (iron-56) atomic nucleus, since $^{56}$Fe is the nucleus with the tightest nuclear binding, i.e. the endpoint of thermonuclear evolution in stars. We multiply the baryon number-flux 4-vector $\vec S$ by this mean rest mass per baryon to obtain a rest-mass-flux 4-vector

$$\vec S_{\rm rm} = \bar m_B\,\vec S , \tag{1.83}$$

which (since $\bar m_B$ is, by definition, a constant) satisfies the same conservation laws (1.82a) and (1.82b) as baryon number.
For media such as fluids and solids, in which the particles travel only short distances between collisions or strong interactions, it is often useful to resolve the particle number-flux 4-vector and the rest-mass-flux 4-vector into a 4-velocity of the medium $\vec u$ (i.e., the 4-velocity of the frame in which there is a vanishing net spatial flux of particles), and the particle number density $n_o$ or rest mass density $\rho_o$ as measured in the medium’s rest frame:

$$\vec S = n_o\,\vec u , \qquad \vec S_{\rm rm} = \rho_o\,\vec u . \tag{1.84}$$

See Exercise 1.25.

We shall make use of the conservation laws $\vec\nabla \cdot \vec S = 0$ and $\vec\nabla \cdot \vec S_{\rm rm} = 0$ for particles and rest mass later in this book, e.g. when studying relativistic fluids; and we shall find the expressions (1.84) for the number-flux 4-vector and rest-mass-flux 4-vector quite useful. See, e.g., the discussion of relativistic shock waves in Ex. 16.11, and the nonrelativistic limit of a relativistic fluid in Sec. 23.4.

****************************
EXERCISES

Exercise 1.23 Practice and Example: [R] Evaluation of 3-Surface Integral in Spacetime

In Minkowski spacetime the set of all events separated from the origin by a timelike interval $a^2$ is a 3-surface, the hyperboloid $t^2 - x^2 - y^2 - z^2 = a^2$, where $\{t, x, y, z\}$ are Lorentz coordinates of some inertial reference frame. On this hyperboloid introduce coordinates $\{\chi, \theta, \phi\}$ such that

$$t = a\cosh\chi , \quad x = a\sinh\chi\,\sin\theta\cos\phi , \quad y = a\sinh\chi\,\sin\theta\sin\phi , \quad z = a\sinh\chi\,\cos\theta . \tag{1.85}$$

Note that $\chi$ is a radial coordinate and $(\theta, \phi)$ are spherical polar coordinates. Denote by $\mathcal{V}_3$ the portion of the hyperboloid with $\chi \le b$.

(a) Verify that for all values of $(\chi, \theta, \phi)$, the points (1.85) do lie on the hyperboloid.

(b) On a spacetime diagram, draw a picture of $\mathcal{V}_3$, the $\{\chi, \theta, \phi\}$ coordinates, and the elementary volume element (vector field) $d\vec\Sigma$ [Eq. (1.77)].

(c) Set $\vec A \equiv \vec e_0$ (the temporal basis vector), and express $\int_{\mathcal{V}_3} \vec A \cdot d\vec\Sigma$ as an integral over $\{\chi, \theta, \phi\}$. Evaluate the integral.
(d) Consider a closed 3-surface consisting of the segment $\mathcal{V}_3$ of the hyperboloid as its top, the hypercylinder $\{x^2 + y^2 + z^2 = a^2\sinh^2 b,\ 0 < t < a\cosh b\}$ as its sides, and the ball $\{x^2 + y^2 + z^2 \le a^2\sinh^2 b,\ t = 0\}$ as its bottom. Draw a picture of this closed 3-surface on a spacetime diagram. Use Gauss’s theorem, applied to this 3-surface, to show that $\int_{\mathcal{V}_3} \vec A \cdot d\vec\Sigma$ is equal to the 3-volume of its spherical base.

Exercise 1.24 Derivation and Example: [R] Global Law of Charge Conservation in an Inertial Frame

Consider the global law of charge conservation $\int_{\partial\mathcal{V}} J^\alpha\,d\Sigma_\alpha = 0$ for a special choice of the closed 3-surface $\partial\mathcal{V}$: The bottom of $\partial\mathcal{V}$ is the ball $\{t = 0,\ x^2 + y^2 + z^2 \le a^2\}$, where $\{t, x, y, z\}$ are the Lorentz coordinates of some inertial frame. The sides are the spherical world tube $\{0 \le t \le T,\ x^2 + y^2 + z^2 = a^2\}$. The top is the ball $\{t = T,\ x^2 + y^2 + z^2 \le a^2\}$.

(a) Draw this 3-surface in a spacetime diagram.

(b) Show that for this $\partial\mathcal{V}$, $\int_{\partial\mathcal{V}} J^\alpha\,d\Sigma_\alpha = 0$ is the nonrelativistic integral conservation law (1.72) for charge.

Exercise 1.25 Example: [R] Rest-mass-flux 4-vector, Lorentz contraction of rest-mass density, and rest-mass conservation for a fluid

Consider a fluid with 4-velocity $\vec u$, and rest-mass density $\rho_o$ as measured in the fluid’s rest frame.

(a) From the physical meanings of $\vec u$, $\rho_o$, and the rest-mass-flux 4-vector $\vec S_{\rm rm}$, deduce Eq. (1.84).

(b) Examine the components of $\vec S_{\rm rm}$ in a reference frame where the fluid moves with ordinary velocity $\mathbf{v}$. Show that $S^0 = \rho_o\gamma$, $S^j = \rho_o\gamma v^j$, where $\gamma = 1/\sqrt{1 - \mathbf{v}^2}$. Explain the physical interpretation of these formulas in terms of Lorentz contraction.

(c) Show that the law of conservation of rest-mass $\vec\nabla \cdot \vec S_{\rm rm} = 0$ takes the form

$$\frac{d\rho_o}{d\tau} = -\rho_o\,\vec\nabla \cdot \vec u , \tag{1.86}$$

where $d/d\tau$ is derivative with respect to proper time moving with the fluid.

(d) Consider a small 3-dimensional volume $V$ of the fluid, whose walls move with the fluid (so if the fluid expands, $V$ goes up).
Explain why the law of rest-mass conservation must take the form $d(\rho_oV)/d\tau = 0$. Thereby deduce that

$$\vec\nabla \cdot \vec u = (1/V)(dV/d\tau) . \tag{1.87}$$

****************************

Electromagnetic field: See Ex. 1.26 below.

The stress tensor plays a central role in the Newtonian law of momentum conservation: Recall the physical interpretation of $T_{jk}$ as the $j$-component of momentum that crosses a unit area perpendicular to $\mathbf{e}_k$ per unit time [Eq. (1.89)]. Apply this definition to the little cube in Fig. 1.18. The momentum that flows into the cube in unit time across the front face (at $y = 0$) is $T_{jy}L^2$, and across the back face (at $y = L$) is $-T_{jy}L^2$; and their sum is $-T_{jy,y}L^3$. Adding to this the contributions from the side faces and the top and bottom faces, we find for the rate of change of total momentum inside the cube $(-T_{jx,x} - T_{jy,y} - T_{jz,z})L^3 = -T_{jk,k}L^3$. Since the cube’s volume is $L^3$, this says that

$$\frac{\partial(\text{momentum density})}{\partial t} + \nabla \cdot \mathbf{T} = 0 . \tag{1.90}$$

This has the standard form for any local conservation law: the time derivative of the density of some quantity (here momentum), plus the divergence of the flux of that quantity (here momentum flux is the stress tensor), is zero. We shall make extensive use of this Newtonian local law of momentum conservation in Part III (elasticity theory), Part IV (fluid mechanics) and Part V (plasma physics).

****************************
EXERCISES

Exercise 1.26 **Problem: [R] Electromagnetic Stress Tensor

An electric field $\mathbf{E}$ exerts (in Gaussian cgs units) a pressure $E^2/8\pi$ orthogonal to itself and a tension of this same magnitude along itself. Similarly, a magnetic field $\mathbf{B}$ exerts a pressure $B^2/8\pi$ orthogonal to itself and a tension of this same magnitude along itself. Verify that the following stress tensor embodies these stresses:

$$\mathbf{T} = \frac{1}{8\pi}\left[ (E^2 + B^2)\,\mathbf{g} - 2(\mathbf{E} \otimes \mathbf{E} + \mathbf{B} \otimes \mathbf{B}) \right] . \tag{1.91}$$

****************************

1.12.2 [R] Relativistic Stress-Energy Tensor

We conclude this chapter by formulating the law of 4-momentum conservation in ways analogous to our laws of conservation of charge, particles, baryons and rest mass. This task is not trivial, since 4-momentum is a vector in spacetime, while charge, particle number, baryon number, and rest mass are scalar quantities. Correspondingly, the density-flux of 4-momentum must have one more slot than the density-fluxes of charge, baryon number and rest mass, $\vec J$, $\vec S$ and $\vec S_{\rm rm}$; it must be a second-rank tensor. We call it the stress-energy tensor and denote it $\mathbf{T}(\_, \_)$ (the same notation as we use for the stress-tensor in Euclidean space).

Consider a medium or field flowing through 4-dimensional spacetime. As it crosses a tiny 3-surface $\vec\Sigma$, it transports a net electric charge $\vec J(\vec\Sigma)$ from the negative side of $\vec\Sigma$ to the positive side, and net baryon number $\vec S(\vec\Sigma)$ and net rest mass $\vec S_{\rm rm}(\vec\Sigma)$; and similarly, it transports a net 4-momentum $\mathbf{T}(\_, \vec\Sigma)$ from the negative side to the positive side:

$$\mathbf{T}(\_, \vec\Sigma) \equiv (\text{total 4-momentum } \vec P \text{ that flows through } \vec\Sigma); \quad \text{i.e., } T^{\alpha\beta}\Sigma_\beta = P^\alpha . \tag{1.92}$$

From this definition of the stress-energy tensor we can read off the physical meanings of its components on a specific, but arbitrary, Lorentz-coordinate basis: Making use of method (1.32b) for computing the components of a vector or tensor, we see that in a specific, but arbitrary, Lorentz frame (where $\vec\Sigma = -\vec e_0$ is a volume vector representing a parallelopiped with unit volume $\Delta V = 1$, at rest in that frame, with its positive sense toward the future):

$$-T_{\alpha 0} = \mathbf{T}(\vec e_\alpha, -\vec e_0) = \vec P(\vec e_\alpha) = \left( \begin{array}{c} \alpha\text{-component of 4-momentum that} \\ \text{flows from past to future across a unit} \\ \text{volume } \Delta V = 1 \text{ in the 3-space } t = \text{const} \end{array} \right) = (\alpha\text{-component of density of 4-momentum}) . \tag{1.93a}$$

Specializing $\alpha$ to be a time or space component and raising indices, we obtain the specialized versions of (1.93a)

$$T^{00} = (\text{energy density as measured in the chosen Lorentz frame}), \qquad T^{j0} = (\text{density of } j\text{-component of momentum in that frame}). \tag{1.93b}$$

Similarly, the $\alpha x$ component of the stress-energy tensor (also called the $\alpha 1$ component since $x = x^1$ and $\vec e_x = \vec e_1$) has the meaning

$$T_{\alpha 1} \equiv T_{\alpha x} \equiv \mathbf{T}(\vec e_\alpha, \vec e_x) = \left( \begin{array}{c} \alpha\text{-component of 4-momentum that crosses} \\ \text{a unit area } \Delta y\,\Delta z = 1 \text{ lying in a surface of} \\ \text{constant } x \text{, during unit time } \Delta t \text{, crossing} \\ \text{from the } -x \text{ side toward the } +x \text{ side} \end{array} \right) = \left( \begin{array}{c} \alpha \text{ component of flux of 4-momentum} \\ \text{across a surface lying perpendicular to } \vec e_x \end{array} \right) . \tag{1.93c}$$

The specific forms of this for temporal and spatial $\alpha$ are (after raising indices)

$$T^{0x} = \left( \begin{array}{c} \text{energy flux across a surface perpendicular to } \vec e_x , \\ \text{from the } -x \text{ side to the } +x \text{ side} \end{array} \right) , \tag{1.93d}$$

$$T^{jx} = \left( \begin{array}{c} \text{flux of } j\text{-component of momentum across a surface} \\ \text{perpendicular to } \vec e_x \text{, from the } -x \text{ side to the } +x \text{ side} \end{array} \right) = ( jx \text{ component of stress} ) . \tag{1.93e}$$

The $\alpha y$ and $\alpha z$ components have the obvious, analogous interpretations. These interpretations, restated much more briefly, are:

$$T^{00} = (\text{energy density}), \quad T^{j0} = (\text{momentum density}), \quad T^{0j} = (\text{energy flux}), \quad T^{jk} = (\text{stress}). \tag{1.93f}$$

Although it might not be obvious at first sight, the 4-dimensional stress-energy tensor is always symmetric: in index notation (where indices can be thought of as representing the names of slots, or equally well components on an arbitrary basis)

$$T^{\alpha\beta} = T^{\beta\alpha} . \tag{1.94}$$

This symmetry can be deduced by physical arguments in a specific, but arbitrary, Lorentz frame: Consider, first, the $x0$ and $0x$ components, i.e., the $x$-components of momentum density and energy flux. A little thought, symbolized by the following heuristic equation, reveals that they must be equal

$$T^{x0} = (\text{momentum density}) = \frac{(\Delta\mathcal{E})\,dx/dt}{\Delta x\,\Delta y\,\Delta z} = \frac{\Delta\mathcal{E}}{\Delta y\,\Delta z\,\Delta t} = (\text{energy flux}) , \tag{1.95}$$

and similarly for the other space-time and time-space components: $T^{j0} = T^{0j}$. [In Eq.
(1.95), in the first expression $\Delta\mathcal{E}$ is the total energy (or equivalently mass) in the volume $\Delta x\,\Delta y\,\Delta z$, $(\Delta\mathcal{E})\,dx/dt$ is the total momentum, and when divided by the volume we get the momentum density. The third equality is just elementary algebra, and the resulting expression is obviously the energy flux.] The space-space components, being equal to the stress tensor, are also symmetric, $T^{jk} = T^{kj}$, by the argument embodied in Fig. 1.18 above. Since $T^{0j} = T^{j0}$ and $T^{jk} = T^{kj}$, all components in our chosen Lorentz frame are symmetric, $T^{\alpha\beta} = T^{\beta\alpha}$. This means that, if we insert arbitrary vectors into the slots of $\mathbf{T}$ and evaluate the resulting number in our chosen Lorentz frame, we will find

$$\mathbf{T}(\vec A, \vec B) = T^{\alpha\beta}A_\alpha B_\beta = T^{\beta\alpha}A_\alpha B_\beta = \mathbf{T}(\vec B, \vec A) ; \tag{1.96}$$

i.e., $\mathbf{T}$ is symmetric under interchange of its slots.

Let us return to the physical meanings (1.93) of the components of the stress-energy tensor. With the aid of $\mathbf{T}$’s symmetry, we can restate those meanings in the language of a 3+1 split of spacetime into space plus time: When one chooses a specific reference frame, that choice splits the stress-energy tensor up into three parts. Its time-time part is the energy density $T^{00}$, its time-space part $T^{0j} = T^{j0}$ is the energy flux or equivalently the momentum density, and its space-space part $T^{jk}$ is the symmetric stress tensor.

1.12.3 [R] 4-Momentum Conservation

Our interpretation of $\vec J(\vec\Sigma) \equiv J^\alpha\Sigma_\alpha$ as the net charge that flows through a small 3-surface $\vec\Sigma$ from its negative side to its positive side gave rise to the global conservation law for charge, $\int_{\partial\mathcal{V}} J^\alpha\,d\Sigma_\alpha = 0$ [Eqs. (1.80) and Fig. 1.16]. Similarly the role of $\mathbf{T}(\_, \vec\Sigma)$ [$T^{\alpha\beta}\Sigma_\beta$ in slot-naming index notation] as the net 4-momentum that flows through $\vec\Sigma$ from its negative side to positive gives rise to the following equation for conservation of 4-momentum:

$$\int_{\partial\mathcal{V}} T^{\alpha\beta}\,d\Sigma_\beta = 0 . \tag{1.97}$$

This equation says that all the 4-momentum that flows into the 4-volume $\mathcal{V}$ of Fig.
1.16 through its 3-surface $\partial\mathcal V$ must also leave $\mathcal V$ through $\partial\mathcal V$; it gets counted negatively when it

(c) Read the discussion, in Ex. 1.10, of the tensor $\mathbf{P} = \mathbf{g} + \vec u \otimes \vec u$ that projects into the 3-space of the fluid’s rest frame. Explain why $P_{\mu\alpha} T^{\alpha\beta}{}_{;\beta} = 0$ should represent the law of force balance (momentum conservation) as seen by the fluid. Show that this equation reduces to

$$(\rho + P)\,\vec a = -\mathbf{P} \cdot \vec\nabla P, \tag{1.102c}$$

where $\vec a = d\vec u/d\tau$ is the fluid’s 4-acceleration. This equation is a relativistic version of Newton’s “F = ma”. Explain the physical meanings of the left and right hand sides. Infer that $\rho + P$ must be the fluid’s inertial mass per unit volume. See Ex. 1.29 for further justification of this inference.

Exercise 1.29 **Example: [R] Inertial Mass Per Unit Volume

Suppose that some medium has a rest frame (unprimed frame) in which its energy flux and momentum density vanish, $T^{0j} = T^{j0} = 0$. Suppose that the medium moves in the $x$ direction with speed very small compared to light, $v \ll 1$, as seen in a (primed) laboratory frame, and ignore factors of order $v^2$. The “ratio” of the medium’s momentum density $T^{j'0'}$ as measured in the laboratory frame to its velocity $v_i = v\,\delta_{ix}$ is called its total inertial mass per unit volume, and is denoted $\rho^{\rm inert}_{ji}$:

$$T^{j'0'} = \rho^{\rm inert}_{ji}\, v_i. \tag{1.103}$$

(a) Show, using a Lorentz transformation from the medium’s (unprimed) rest frame to the (primed) laboratory frame, that

$$\rho^{\rm inert}_{ji} = T^{00}\delta_{ji} + T_{ji}. \tag{1.104}$$

(b) Give a physical explanation of the contribution $T_{ji} v_i$ to the momentum density.

(c) Show that for a perfect fluid [Eq. (1.100b)] the inertial mass per unit volume is isotropic and has magnitude $\rho + P$, where $\rho$ is the mass-energy density and $P$ is the pressure measured in the fluid’s rest frame:

$$\rho^{\rm inert}_{ji} = (\rho + P)\,\delta_{ji}. \tag{1.105}$$

See Ex. 1.28 above for this inertial-mass role of $\rho + P$ in the law of force balance (momentum conservation) for a fluid.
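The relation (1.104) can be checked numerically before it is derived. The sketch below is only an illustration under stated assumptions: it uses the standard boost matrix for a medium moving at speed $v$ along $+x$ (the transformation itself belongs to Sec. 1.7, which is not reproduced here), and the helper names `boost_along_x`, `transform_rank2` and `inertial_mass_column` are hypothetical. It transforms a rest-frame stress-energy tensor with $T^{0j} = 0$ into the laboratory frame and divides the momentum density by $v$.

```python
import math


def boost_along_x(v):
    """Standard Lorentz matrix (assumed convention): rest-frame components ->
    lab-frame components, with the medium moving at speed +v along x (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, g * v, 0.0, 0.0],
            [g * v, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def transform_rank2(lam, t):
    """T'^{mk} = lam^m_a lam^k_b T^{ab}, summed over a and b."""
    n = 4
    return [[sum(lam[m][a] * lam[k][b] * t[a][b]
                 for a in range(n) for b in range(n))
             for k in range(n)] for m in range(n)]


def inertial_mass_column(energy_density, stress, v):
    """Return [T^{j'0'} / v for j = x, y, z], i.e. the column rho^inert_{jx}."""
    t_rest = [[0.0] * 4 for _ in range(4)]
    t_rest[0][0] = energy_density          # T^{00}; T^{0j} = T^{j0} = 0 in the rest frame
    for j in range(3):
        for k in range(3):
            t_rest[j + 1][k + 1] = stress[j][k]
    t_lab = transform_rank2(boost_along_x(v), t_rest)
    return [t_lab[j + 1][0] / v for j in range(3)]


# Illustrative (made-up) rest-frame values: T^{00} = 2.0 and a symmetric stress tensor.
stress = [[0.5, 0.1, 0.0],
          [0.1, 0.3, 0.0],
          [0.0, 0.0, 0.2]]
column = inertial_mass_column(2.0, stress, v=1e-4)
print(column)  # close to [T00 + T_xx, T_yx, T_zx] up to corrections of order v^2
```

For an isotropic stress $T_{jk} = P\,\delta_{jk}$ the same function returns approximately $[\rho + P, 0, 0]$, which is the content of Eq. (1.105).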
Exercise 1.30 **Example: [R] Stress-Energy Tensor, and Energy-Momentum Conservation for the Electromagnetic Field

(a) Compute from Eq. (1.101) the components of the electromagnetic stress-energy tensor in an inertial reference frame in Gaussian units. Your answer should be the expressions given in electrodynamic textbooks:

$$T^{00} = \frac{\mathbf{E}^2 + \mathbf{B}^2}{8\pi}, \qquad T^{0j}\mathbf{e}_j = T^{j0}\mathbf{e}_j = \frac{\mathbf{E}\times\mathbf{B}}{4\pi}, \qquad T^{jk} = \frac{1}{8\pi}\left[(\mathbf{E}^2 + \mathbf{B}^2)\,\delta_{jk} - 2(E_j E_k + B_j B_k)\right]. \tag{1.106}$$

See also Ex. 1.26 above for an alternative derivation of the stress tensor $T_{jk}$.

(b) Show that for the electromagnetic field,

$$T^{\alpha\beta}{}_{;\beta} = F^{\alpha\mu} J_\mu, \tag{1.107a}$$

where $J_\mu$ is the charge-current 4-vector.

(c) The matter that carries the electric charge and current can exchange energy and momentum with the electromagnetic field. Explain why Eq. (1.107a) is the rate per unit volume at which that matter feeds 4-momentum into the electromagnetic field, and conversely, $-F^{\alpha\mu}J_\mu$ is the rate per unit volume at which the electromagnetic field feeds 4-momentum into the matter. Show, further, that (as viewed in any reference frame) the time and space components of this quantity are

$$\frac{dE_{\rm matter}}{dt\,dV} = -F^{0j}J_j = \mathbf{E}\cdot\mathbf{j}, \qquad \frac{d\mathbf{p}_{\rm matter}}{dt\,dV} = \rho_e\mathbf{E} + \mathbf{j}\times\mathbf{B}, \tag{1.107b}$$

where $\rho_e$ is charge density and $\mathbf{j}$ is current density [Eq. (1.67)]. The first of these equations is ohmic heating of the matter by the electric field; the second is the Lorentz force per unit volume on the matter.

****************************

Bibliographic Note

For an inspiring taste of the history of special relativity, see the original papers by Einstein, Lorentz, and Minkowski, translated into English and archived in Einstein et al. (1923). Early relativity textbooks [see the bibliography on p. 567 of Jackson (1999)] emphasized the transformation properties of physical quantities, in going from one inertial frame to another, rather than their roles as frame-invariant geometric objects.
Minkowski (1908) introduced geometric thinking, but only in recent years (in large measure due to the influence of John Wheeler) has the geometric viewpoint gained ascendancy. It is still not common in texts on Newtonian physics, but it is almost universal in modern relativity texts.

In our opinion, the best elementary introduction to special relativity is the first edition of Taylor and Wheeler (1966); the more ponderous second edition (1992) is also good. Both adopt the geometric viewpoint. At an intermediate level, most physics students learn relativity from electrodynamics texts such as Griffiths (1999) and Jackson (1999), or classical mechanics texts such as Goldstein (1980). Avoid the first and second editions of Jackson and of Goldstein, which use imaginary time and obscure the geometry of spacetime! Griffiths and Jackson (like old relativity texts) adopt the “transformation” viewpoint on physical quantities, rather than the geometric viewpoint. Under John Safko’s influence, the third edition of Goldstein [Goldstein, Poole and Safko (2002)] has become strongly geometric.

For fully geometric treatments of special relativity, analogous to ours, see not only the third edition of Goldstein, but also the special relativity sections in modern general relativity texts. Some we like at the undergraduate level are Schutz (1985) and especially Hartle (2002);

Box 1.4 Important Concepts in Chapter 1

• Foundational Concepts
  – Frameworks for physical laws (general relativity, special relativity and Newtonian physics) and their relationships to each other, Sec. 1.1.
  – Inertial reference frame, Sec. 1.2.2.
  – Invariant interval and how it defines the geometry of spacetime, Sec. 1.2.3.
• Principle of Relativity: Laws of physics are frame-independent geometric relations between geometric objects, Sec. 1.2.3. Important examples:
  – Newton’s second law of motion F = ma, Eq. (1.13).
  – Lorentz force law in 3-dimensional Newtonian language (1.14), and in 4-dimensional geometric language (1.28) and their connection, Sec. 1.10.
  – Conservation of 4-momentum in particle interactions, Eq. (1.23).
  – Global and local conservation laws for charge, baryon number, and 4-momentum, Secs. 1.11.3, 1.11.4, 1.12.3.
• Differential geometry
  – Tensor as a linear function of vectors, Sec. 1.3. Examples: Electromagnetic field tensor (1.26), stress tensor (1.88) and stress-energy tensor (1.92).
  – Slot-naming index notation, Sec. 1.5.3.
  – Gauss’s theorem in Euclidean space (1.71a), and in spacetime (1.79).
  – Computations via geometric techniques, without coordinates or Lorentz transformations (e.g. derive Lorentz force law Sec. 1.4.3, derive Doppler shift Ex. 1.11, derive vector identities Ex. 1.18).
• 3+1 Split of spacetime into space plus time induced by choice of inertial frame, Sec. 1.6, and resulting 3+1 split of physical quantities and laws:
  – 4-momentum → energy and momentum, Eqs. (1.35), (1.36), (1.38); Ex. 1.9.
  – Electromagnetic tensor → electric field and magnetic field, Sec. 1.10.
  – Charge-current 4-vector → charge density and current density, Ex. 1.20.
• Spacetime diagrams used to understand Lorentz contraction, time dilation, breakdown of simultaneity (Sec. 1.7.3, Ex. 1.14) and conservation laws (Fig. 1.16).

Contents

VI GENERAL RELATIVITY

23 From Special to General Relativity
23.1 Overview
23.2 Special Relativity Once Again
23.2.1 Geometric, Frame-Independent Formulation
23.2.2 Inertial Frames and Components of Vectors, Tensors and Physical Laws
23.2.3 Light Speed, the Interval, and Spacetime Diagrams
23.3 Differential Geometry in General Bases and in Curved Manifolds
23.3.1 Non-Orthonormal Bases
23.3.2 Vectors as Differential Operators; Tangent Space; Commutators
23.3.3 Differentiation of Vectors and Tensors; Connection Coefficients
23.3.4 Integration
23.4 The Stress-Energy Tensor Revisited
23.5 The Proper Reference Frame of an Accelerated Observer [MTW pp. 163–176, 327–332]

Part VI GENERAL RELATIVITY

Chapter 23 From Special to General Relativity

Version 0623.1.K.pdf, 25 April 2007. Please send comments, suggestions, and errata via email to kip@tapir.caltech.edu or on paper to Kip Thorne, 130-33 Caltech, Pasadena CA 91125

Box 23.1 Reader’s Guide

• This chapter relies significantly on
  – The special relativity portions of Chap. 1.
  – The discussion of connection coefficients in Sec. 10.3.
• This chapter is a foundation for the presentation of general relativity theory in Chaps. 24–27.

23.1 Overview

We have reached the final Part of this book, in which we present an introduction to the basic concepts of general relativity and its most important applications. This subject, although a little more challenging than the material that we have covered so far, is nowhere near as formidable as its reputation. Indeed, if you have mastered the techniques developed in the first five Parts, the path to the Einstein Field Equations should be short and direct.

The General Theory of Relativity is the crowning achievement of classical physics, the last great fundamental theory created prior to the discovery of quantum mechanics; its formulation by Albert Einstein in 1915 marks the culmination of the great intellectual adventure undertaken by Newton 250 years earlier. It was created after many wrong turns and with little experimental guidance, almost by pure thought.
Unlike the special theory, whose physical foundations and logical consequences were clearly appreciated by physicists

Fig. 23.1: The world line $\mathcal P(\tau)$ of a particle in Minkowski spacetime and the tangent vector $\vec u = d\mathcal P/d\tau$ to this world line; $\vec u$ is the particle’s 4-velocity. The bending of the world line is produced by some force that acts on the particle, e.g. by the Lorentz force embodied in Eq. (23.3). Also shown is the light cone emitted from the event $\mathcal P(\tau = 1)$. Although the axes of an (arbitrary) inertial reference frame are shown, no reference frame is needed for the definition of the world line or its tangent vector $\vec u$ or the light cone, or for the formulation of the Lorentz force law.

symmetric, $\mathbf{g}(\vec A, \vec B) = \mathbf{g}(\vec B, \vec A)$ for all $\vec A$ and $\vec B$. Since the particle’s 4-velocity $\vec u$ is

$$\vec u = \frac{d\mathcal P}{d\tau} = \lim_{\Delta\tau\to 0} \frac{\mathcal P(\tau + \Delta\tau) - \mathcal P(\tau)}{\Delta\tau} \equiv \lim_{\Delta\tau\to 0} \frac{\Delta\vec x}{\Delta\tau}, \tag{23.2}$$

Eq. (23.1) implies that $\vec u\cdot\vec u = \mathbf{g}(\vec u, \vec u) = -1$.

The 4-velocity $\vec u$ is an example of a timelike vector; it has a negative inner product with itself (negative “squared length”). This shows up pictorially in the fact that $\vec u$ lies inside the light cone (the cone swept out by the trajectories of photons emitted from the tail of $\vec u$; see Fig. 23.1). Vectors $\vec k$ on the light cone (the tangents to the world lines of the photons) are null and so have vanishing squared lengths, $\vec k\cdot\vec k = \mathbf{g}(\vec k, \vec k) = 0$; and vectors $\vec A$ that lie outside the light cone are spacelike and have positive squared lengths, $\vec A\cdot\vec A > 0$.

An example of a physical law in 4-dimensional geometric language is the Lorentz force law

$$\frac{d\vec p}{d\tau} = q\,\mathbf{F}(\_\,, \vec u), \tag{23.3}$$

where $q$ is the particle’s charge and both sides of this equation are vectors, i.e. first-rank tensors, i.e. tensors with just one slot. As we learned in Sec. 1.5, it is convenient to give names to slots. When we do so, we can rewrite the Lorentz force law as

$$\frac{dp^\alpha}{d\tau} = q\,F^{\alpha\beta} u_\beta .$$
(23.4) Here $\alpha$ is the name of the slot of the vector $d\vec p/d\tau$, $\alpha$ and $\beta$ are the names of the slots of $\mathbf{F}$, $\beta$ is the name of the slot of $\vec u$, and the double use of $\beta$ with one up and one down on the right side of the equation represents the insertion of $\vec u$ into the $\beta$ slot of $\mathbf{F}$, whereby the two $\beta$ slots disappear and we wind up with a vector whose slot is named $\alpha$. As we learned in Sec. 1.5, this slot-naming index notation is isomorphic to the notation for components of vectors, tensors, and physical laws in some reference frame. However, no reference frames are needed or involved when one formulates the laws of physics in geometric, frame-independent language as above.

Those readers who do not feel completely comfortable with these concepts, statements and notation should reread the relevant portions of Chap. 1.

23.2.2 Inertial Frames and Components of Vectors, Tensors and Physical Laws

In special relativity a key role is played by inertial reference frames. An inertial frame is an (imaginary) latticework of rods and clocks that moves through spacetime freely (inertially, without any force acting on it). The rods are orthogonal to each other and attached to inertial-guidance gyroscopes so they do not rotate. These rods are used to identify the spatial, Cartesian coordinates $(x^1, x^2, x^3) = (x, y, z)$ of an event $\mathcal P$ [which we also denote by lower case Latin indices $x^j(\mathcal P)$ with $j$ running over 1, 2, 3]. The latticework’s clocks are ideal and are synchronized with each other via the Einstein light-pulse process (Sec. 1.2). They are used to identify the temporal coordinate $x^0 = t$ of an event $\mathcal P$; i.e. $x^0(\mathcal P)$ is the time measured by that latticework clock whose world line passes through $\mathcal P$, at the moment of passage. The spacetime coordinates of $\mathcal P$ are denoted by lower case Greek indices $x^\alpha$, with $\alpha$ running over 0, 1, 2, 3. An inertial frame’s spacetime coordinates $x^\alpha(\mathcal P)$ are called Lorentz coordinates or inertial coordinates.
In the real universe, spacetime curvature is very small in regions well-removed from concentrations of matter, e.g. in intergalactic space; so special relativity is highly accurate there. In such a region, frames of reference (rod-clock latticeworks) that are non-accelerating and non-rotating with respect to cosmologically distant galaxies (and thence with respect to a local frame in which the cosmic microwave radiation looks isotropic) constitute good approximations to inertial reference frames.

Associated with an inertial frame’s Lorentz coordinates are basis vectors $\vec e_\alpha$ that point along the frame’s coordinate axes (and thus are orthogonal to each other) and have unit length (making them orthonormal). This orthonormality is embodied in the inner products

$$\vec e_\alpha \cdot \vec e_\beta = \eta_{\alpha\beta}, \tag{23.5}$$

where by definition

$$\eta_{00} = -1, \qquad \eta_{11} = \eta_{22} = \eta_{33} = +1, \qquad \eta_{\alpha\beta} = 0 \ \text{if}\ \alpha \neq \beta. \tag{23.6}$$

Here and throughout Part VI (as in Chap. 1), we set the speed of light to unity [i.e. we use the geometrized units discussed in Eqs. (1.3a) and (1.3b)], so spatial lengths (e.g. along the $x$ axis) and time intervals (e.g. along the $t$ axis) are measured in the same units, seconds or meters with $1\ \mathrm{s} = 2.99792458 \times 10^{8}\ \mathrm{m}$.

In Sec. 1.5 we used the basis vectors of an inertial frame to build a component representation of tensor analysis. The fact that the inner products of timelike vectors with each other are negative, e.g. $\vec e_0\cdot\vec e_0 = -1$, while those of spacelike vectors are positive, e.g. $\vec e_1\cdot\vec e_1 = +1$, forced us to introduce two types of components: covariant (indices down) and contravariant (indices up). The covariant components of a tensor were computable by inserting the basis vectors into the tensor’s slots, $u_\alpha = \vec u(\vec e_\alpha) \equiv \vec u\cdot\vec e_\alpha$; $F_{\alpha\beta} = \mathbf{F}(\vec e_\alpha, \vec e_\beta)$. For example, in our Lorentz basis the covariant components of the metric are $g_{\alpha\beta} = \mathbf{g}(\vec e_\alpha, \vec e_\beta) = \vec e_\alpha\cdot\vec e_\beta = \eta_{\alpha\beta}$.
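The orthonormality relations (23.5) and (23.6) are easy to exercise in code. The following is a minimal sketch, assuming vectors are stored as lists of their contravariant Lorentz-frame components $(A^0, A^1, A^2, A^3)$; the names `ETA`, `inner`, `covariant_components` and `classify` are hypothetical and chosen only for this illustration. It computes covariant components by inserting each basis vector into the vector's slot, and sorts vectors into timelike, null and spacelike by the sign of their squared length.

```python
# Components eta_{alpha beta} of the metric in a Lorentz basis, Eq. (23.6).
ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]

# Contravariant components of the basis vectors e_0 ... e_3 in their own basis.
BASIS = [[1 if i == a else 0 for i in range(4)] for a in range(4)]


def inner(a, b):
    """Inner product g(A, B) = eta_{alpha beta} A^alpha B^beta."""
    return sum(ETA[i][j] * a[i] * b[j] for i in range(4) for j in range(4))


def covariant_components(a):
    """A_alpha = A . e_alpha: insert each basis vector into the slot of A."""
    return [inner(a, BASIS[alpha]) for alpha in range(4)]


def classify(a, tol=1e-12):
    """Label a vector by the sign of its squared length A . A."""
    s = inner(a, a)
    if s < -tol:
        return "timelike"
    if s > tol:
        return "spacelike"
    return "null"


print(covariant_components([2, 3, 0, 0]))  # the time component changes sign
print(classify([1, 0, 0, 0]), classify([1, 1, 0, 0]), classify([0, 0, 1, 0]))
```

The sign flip on the time component is the whole content of the covariant/contravariant distinction in a Lorentz basis; the tolerance `tol` is an implementation detail for floating-point input, not part of the definition.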
The contravariant components of a tensor were related to the covariant components via “index lowering” with the aid of the metric, $F_{\alpha\beta} = g_{\alpha\mu} g_{\beta\nu} F^{\mu\nu}$, which simply said that one reverses the sign when lowering a time index and makes no change of sign when lowering a space index. This lowering rule implied that the contravariant components of the metric in a Lorentz basis are the same numerically as the covariant components, $g^{\alpha\beta} = \eta_{\alpha\beta}$, and that they can be used to raise indices (i.e. to perform the trivial sign flip for temporal indices), $F^{\mu\nu} = g^{\mu\alpha} g^{\nu\beta} F_{\alpha\beta}$. As we saw in Sec. 1.5, tensors can be expressed in terms of their contravariant components as $\vec p = p^\alpha \vec e_\alpha$ and $\mathbf{F} = F^{\alpha\beta}\, \vec e_\alpha \otimes \vec e_\beta$, where $\otimes$ represents the tensor product [Eq. (1.10a)].

We also learned in Chap. 1 that any frame-independent geometric relation between tensors can be rewritten as a relation between those tensors’ components in any chosen Lorentz frame. When one does so, the resulting component equation takes precisely the same form as the slot-naming-index-notation version of the geometric relation. For example, the component version of the Lorentz force law says $dp^\alpha/d\tau = q F^{\alpha\beta} u_\beta$, which is identical to Eq. (23.4). The only difference is the interpretation of the symbols. In the component equation $F^{\alpha\beta}$ are the components of $\mathbf{F}$ and the repeated $\beta$ in $F^{\alpha\beta} u_\beta$ is to be summed from 0 to 3. In the geometric relation $F^{\alpha\beta}$ means $\mathbf{F}(\_\,, \_)$ with the first slot named $\alpha$ and the second $\beta$, and the repeated $\beta$ in $F^{\alpha\beta} u_\beta$ implies the insertion of $\vec u$ into the second slot of $\mathbf{F}$ to produce a single-slotted tensor, i.e. a vector whose slot is named $\alpha$.

As we saw in Sec. 1.6, a particle’s 4-velocity $\vec u$ (defined originally without the aid of any reference frame; Fig. 23.1) has components, in any inertial frame, given by $u^0 = \gamma$, $u^j = \gamma v^j$, where $v^j = dx^j/dt$ is the particle’s ordinary velocity and $\gamma \equiv 1/\sqrt{1 - \delta_{ij} v^i v^j}$. Similarly, the particle’s energy $E \equiv p^0$ is $m\gamma$ and its spatial momentum is $p^j = m\gamma v^j$, i.e.
in 3-dimensional geometric notation, $\mathbf{p} = m\gamma\mathbf{v}$. This is an example of the manner in which a choice of Lorentz frame produces a “3+1” split of the physics: a split of 4-dimensional spacetime into 3-dimensional space (with Cartesian coordinates $x^j$) plus 1-dimensional time $t = x^0$; a split of the particle’s 4-momentum $\vec p$ into its 3-dimensional spatial momentum $\mathbf{p}$ and its 1-dimensional energy $E = p^0$; and similarly a split of the electromagnetic field tensor $\mathbf{F}$ into the 3-dimensional electric field $\mathbf{E}$ and 3-dimensional magnetic field $\mathbf{B}$; cf. Secs. 1.6 and 1.10.

The principle of relativity (all laws expressible as geometric relations between geometric objects in Minkowski spacetime), when translated into 3+1 language, says that, when the laws of physics are expressed in terms of components in a specific Lorentz frame, the form of those laws must be independent of one’s choice of frame. The components of tensors in one Lorentz frame are related to those in another by a Lorentz transformation (Sec. 1.7), so the principle of relativity can be restated as saying that, when expressed in terms of Lorentz-frame components, the laws of physics must be Lorentz-invariant (unchanged by Lorentz transformations). This is the version of the principle of relativity that one meets in most elementary treatments of special relativity. However, as the above discussion shows, it is a mere shadow of the true principle of relativity: the shadow cast onto Lorentz frames when one performs a 3+1 split.
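The 3+1 split and the frame independence of geometric quantities can be illustrated together. The sketch below is an assumption-laden illustration rather than anything taken from the text: it builds the components $u^0 = \gamma$, $u^j = \gamma v^j$ from an ordinary velocity, splits the 4-momentum into $E = m\gamma$ and $\mathbf{p} = m\gamma\mathbf{v}$, and then applies the standard boost along $x$ (a convention assumed here, since Sec. 1.7 is not reproduced) to confirm that $\vec u\cdot\vec u = -1$ in both frames. All function names are hypothetical.

```python
import math


def lorentz_factor(v):
    """gamma = 1 / sqrt(1 - v.v) for an ordinary 3-velocity v (units with c = 1)."""
    return 1.0 / math.sqrt(1.0 - sum(c * c for c in v))


def four_velocity(v):
    """Components (u^0, u^1, u^2, u^3) = (gamma, gamma * v)."""
    g = lorentz_factor(v)
    return [g] + [g * c for c in v]


def energy_momentum(m, v):
    """3+1 split of the 4-momentum: energy E = m*gamma and momentum p = m*gamma*v."""
    u = four_velocity(v)
    return m * u[0], [m * c for c in u[1:]]


def minkowski_dot(a, b):
    """a . b = -a^0 b^0 + a^1 b^1 + a^2 b^2 + a^3 b^3."""
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))


def boost_x(beta, a):
    """Components of a in a frame moving with speed beta along +x (standard boost)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    t, x, y, z = a
    return [g * (t - beta * x), g * (x - beta * t), y, z]


u = four_velocity((0.6, 0.0, 0.0))
print(minkowski_dot(u, u))                  # squared length in the original frame
u_boosted = boost_x(0.3, u)
print(minkowski_dot(u_boosted, u_boosted))  # same squared length in the boosted frame
print(energy_momentum(2.0, (0.6, 0.0, 0.0)))
```

The individual components of $\vec u$ change under the boost, but the inner product does not; that is the component-level "shadow" of the frame-independent statement $\vec u\cdot\vec u = -1$.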
The ultimate, fundamental version of the principle of relativity is the one that needs no frames at all for its expression: All the laws of physics are expressible

the notion of a gradient of a vector depends on comparing the vector at two different points and thus depends on the details of transport, we will have to rework the notion of a gradient and the gradient’s connection coefficients; and since, in doing an integral, one must add contributions that live at different points in the manifold, we must also rework the notion of integration. We shall tackle each of these three issues in turn in the following four subsections.

23.3.1 Non-Orthonormal Bases

Consider an $n$-dimensional manifold, e.g. 4-dimensional spacetime or 3-dimensional Euclidean space or the 2-dimensional surface of a sphere. At some point $\mathcal P$ in the manifold, introduce a set of basis vectors $\{\vec e_1, \vec e_2, \ldots, \vec e_n\}$ and denote them generally as $\vec e_\alpha$. We seek to generalize the formalism of Sec. 23.2 in such a way that the index manipulation rules for components of tensors are unchanged. For example, we still want it to be true that covariant components of any tensor are computable by inserting the basis vectors into the tensor’s slots, $F_{\alpha\beta} = \mathbf{F}(\vec e_\alpha, \vec e_\beta)$, and that the tensor itself can be reconstructed from its contravariant components as $\mathbf{F} = F^{\mu\nu}\, \vec e_\mu \otimes \vec e_\nu$, and that the two sets of components are computable from each other via raising and lowering with the metric components, $F_{\alpha\beta} = g_{\alpha\mu} g_{\beta\nu} F^{\mu\nu}$. The only thing we do not want to preserve is the orthonormal values of the metric components; i.e. we must allow the basis to be nonorthonormal and thus $\vec e_\alpha\cdot\vec e_\beta = g_{\alpha\beta}$ to have arbitrary values (except that the metric should be nondegenerate, so no linear combination of the $\vec e_\alpha$’s vanishes, which means that the matrix $\|g_{\alpha\beta}\|$ should have nonzero determinant).

We can easily achieve our goal by introducing a second set of basis vectors, denoted $\{\vec e^{\,1}, \vec e^{\,2}, \ldots, \vec e^{\,n}\}$, which is dual to our first set in the sense that

$$\vec e^{\,\mu}\cdot\vec e_\beta \equiv \mathbf{g}(\vec e^{\,\mu}, \vec e_\beta) = \delta^\mu{}_\beta, \tag{23.8}$$

where $\delta^\alpha{}_\beta$ is the Kronecker delta. This duality relation actually constitutes a definition of the $\vec e^{\,\mu}$ once the $\vec e_\alpha$ have been chosen. To see this, regard $\vec e^{\,\mu}$ as a tensor of rank one. This tensor is defined as soon as its value on each and every vector has been determined. Expression (23.8) gives the value $\vec e^{\,\mu}(\vec e_\beta) = \vec e^{\,\mu}\cdot\vec e_\beta$ of $\vec e^{\,\mu}$ on each of the four basis vectors $\vec e_\beta$; and since every other vector can be expanded in terms of the $\vec e_\beta$’s and $\vec e^{\,\mu}(\_)$ is a linear function, Eq. (23.8) thereby determines the value of $\vec e^{\,\mu}$ on every other vector.

The duality relation (23.8) says that $\vec e^{\,1}$ is always perpendicular to all the $\vec e_\alpha$ except $\vec e_1$, and its scalar product with $\vec e_1$ is unity; and similarly for the other basis vectors. This interpretation is illustrated for 3-dimensional Euclidean space in Fig. 23.2. In Minkowski spacetime, if $\vec e_\alpha$ are an orthonormal Lorentz basis, then duality dictates that $\vec e^{\,0} = -\vec e_0$, and $\vec e^{\,j} = +\vec e_j$.

Fig. 23.2: Non-orthonormal basis vectors $\vec e_j$ in Euclidean 3-space and two members $\vec e^{\,1}$ and $\vec e^{\,3}$ of the dual basis. The vectors $\vec e_1$ and $\vec e_2$ lie in the horizontal plane, so $\vec e^{\,3}$ is orthogonal to that plane, i.e. it points vertically upward, and its inner product with $\vec e_3$ is unity. Similarly, the vectors $\vec e_2$ and $\vec e_3$ span a vertical plane, so $\vec e^{\,1}$ is orthogonal to that plane, i.e. it points horizontally, and its inner product with $\vec e_1$ is unity.

The duality relation (23.8) leads immediately to the same index-manipulation formalism as we have been using, if one defines the contravariant, covariant and mixed components of tensors in the obvious manner

$$F^{\mu\nu} = \mathbf{F}(\vec e^{\,\mu}, \vec e^{\,\nu}), \qquad F_{\alpha\beta} = \mathbf{F}(\vec e_\alpha, \vec e_\beta), \qquad F^\mu{}_\beta = \mathbf{F}(\vec e^{\,\mu}, \vec e_\beta); \tag{23.9}$$

see Ex. 23.4. Among the consequences of this duality are the following:

(i)

$$g^{\mu\beta} g_{\nu\beta} = \delta^\mu{}_\nu, \tag{23.10}$$
i.e., the matrix of contravariant components of the metric is inverse to that of the covariant components, $\|g^{\mu\nu}\| = \|g_{\alpha\beta}\|^{-1}$; this relation guarantees that when one raises indices on a tensor $F_{\alpha\beta}$ with $g^{\mu\alpha}$ and then lowers them back down with $g_{\nu\beta}$, one recovers one’s original covariant components $F_{\alpha\beta}$ unaltered.

(ii)

$$\mathbf{F} = F^{\mu\nu}\, \vec e_\mu \otimes \vec e_\nu = F_{\alpha\beta}\, \vec e^{\,\alpha} \otimes \vec e^{\,\beta} = F^\mu{}_\beta\, \vec e_\mu \otimes \vec e^{\,\beta}, \tag{23.11}$$

i.e., one can reconstruct a tensor from its components by lining up the indices in a manner that accords with the rules of index manipulation.

(iii)

$$\mathbf{F}(\vec p, \vec q) = F^{\alpha\beta} p_\alpha q_\beta, \tag{23.12}$$

i.e., the component versions of tensorial equations are identical in mathematical symbology to the slot-naming-index-notation versions.

Associated with any coordinate system $x^\alpha(\mathcal P)$ there is a coordinate basis whose basis vectors are defined by

$$\vec e_\alpha \equiv \frac{\partial\mathcal P}{\partial x^\alpha}. \tag{23.13}$$

Since the derivative is taken holding the other coordinates fixed, the basis vector $\vec e_\alpha$ points along the $\alpha$ coordinate axis (the axis on which $x^\alpha$ changes and all the other coordinates are held fixed).

In an orthogonal curvilinear coordinate system, e.g. circular polar coordinates $(\varpi, \phi)$ in Euclidean 2-space, this coordinate basis is quite different from the coordinate system’s orthonormal basis. For example, $\vec e_\phi = (\partial\mathcal P/\partial\phi)_\varpi$ is a very long vector at large radii and a very short vector at small radii [cf. Fig. 23.3]; the corresponding unit-length vector is $\vec e_{\hat\phi} = (1/\varpi)\,\vec e_\phi$. By contrast, $\vec e_\varpi = (\partial\mathcal P/\partial\varpi)_\phi$ already has unit length, so the corresponding orthonormal basis vector is simply $\vec e_{\hat\varpi} = \vec e_\varpi$. The metric components in the coordinate basis are readily seen to be $g_{\phi\phi} = \varpi^2$, $g_{\varpi\varpi} = 1$, $g_{\varpi\phi} = g_{\phi\varpi} = 0$, which is in accord with the equation for the squared distance (interval) between adjacent points $ds^2 = g_{ij}\,dx^i dx^j = d\varpi^2 + \varpi^2 d\phi^2$. The metric components in the orthonormal basis, of course, are $g_{\hat i\hat j} = \delta_{ij}$.

Henceforth, we shall use hats to identify orthonormal bases; bases whose indices do not have hats will typically (though not always) be coordinate bases.

Box 23.1 Dual Bases in Other Contexts

Vector spaces appear in a wide variety of contexts in mathematics and physics, and wherever they appear it can be useful to introduce dual bases. When a vector space does not possess a metric, the basis $\{\vec e^{\,\mu}\}$ lives in a different space from $\{\vec e_\alpha\}$, and the two spaces are said to be dual to each other.

An important example occurs in manifolds that do not have metrics. There the vectors in the space spanned by $\{\vec e^{\,\mu}\}$ are often called one forms and are represented pictorially as families of parallel surfaces; the vectors in the space spanned by $\{\vec e_\alpha\}$ are called tangent vectors and are represented pictorially as arrows; the one forms are linear functions of tangent vectors, and the result that a one form $\tilde\beta$ gives when a tangent vector $\vec a$ is inserted into its slot, $\tilde\beta(\vec a)$, is the number of surfaces of $\tilde\beta$ pierced by the arrow $\vec a$; see, e.g., MTW. A metric produces a one-to-one mapping between the one forms and the tangent vectors. In this book we regard this mapping as equating each one form to a tangent vector and thereby as making the space of one forms and the space of tangent vectors be identical. This permits us to avoid ever speaking about one forms, except here in this box.

Quantum mechanics provides another example of dual spaces. The kets $|\psi\rangle$ are the tangent vectors and the bras $\langle\phi|$ are the one forms: linear complex valued functions of kets with the value that $\langle\phi|$ gives when $|\psi\rangle$ is inserted into its slot being the inner product $\langle\phi|\psi\rangle$.

In general, we can construct the basis $\{\vec e^{\,\mu}\}$ that is dual to the coordinate basis $\{\vec e_\alpha\} = \{\partial\mathcal P/\partial x^\alpha\}$ by taking the gradients of the coordinates, viewed as scalar fields $x^\alpha(\mathcal P)$:

$$\vec e^{\,\mu} = \vec\nabla x^\mu. \tag{23.14}$$

It is straightforward to verify the duality relation (23.8) for these two bases:

$$\vec e^{\,\mu}\cdot\vec e_\alpha = \vec e_\alpha\cdot\vec\nabla x^\mu = \nabla_{\vec e_\alpha} x^\mu = \nabla_{\partial\mathcal P/\partial x^\alpha} x^\mu = \frac{\partial x^\mu}{\partial x^\alpha} = \delta^\mu{}_\alpha .$$
(23.15) In any coordinate system, the expansion of the metric in terms of the dual basis, $\mathbf{g} = g_{\alpha\beta}\, \vec e^{\,\alpha} \otimes \vec e^{\,\beta} = g_{\alpha\beta}\, \vec\nabla x^\alpha \otimes \vec\nabla x^\beta$, is intimately related to the line element $ds^2 = g_{\alpha\beta}\,dx^\alpha dx^\beta$: Consider an infinitesimal vectorial displacement $d\vec x = dx^\alpha (\partial/\partial x^\alpha)$. Insert this displacement into the metric’s two slots, to obtain the interval $ds^2$ along $d\vec x$. The result is $ds^2 = g_{\alpha\beta}\, \vec\nabla x^\alpha \otimes \vec\nabla x^\beta (d\vec x, d\vec x) = g_{\alpha\beta}\,(d\vec x\cdot\vec\nabla x^\alpha)(d\vec x\cdot\vec\nabla x^\beta) = g_{\alpha\beta}\,dx^\alpha dx^\beta$; i.e.

$$ds^2 = g_{\alpha\beta}\,dx^\alpha dx^\beta. \tag{23.16}$$

Fig. 23.4: A curve $\mathcal P(\zeta)$ on the surface of a sphere and the curve’s tangent vector $\vec A = d\mathcal P/d\zeta$ at $\mathcal P(\zeta = 0) \equiv \mathcal P_o$. The tangent vector lives in the tangent space at $\mathcal P_o$, i.e. in the flat plane that is tangent to the sphere there as seen in the flat Euclidean 3-space in which the sphere’s surface is embedded.

define the tangent vector to the curve $\mathcal P(\zeta)$ to be the derivative $d/d\zeta$ which differentiates scalar fields along the curve. This derivative operator is very well defined by the rules of ordinary differentiation; if $\psi(\mathcal P)$ is a scalar field in the manifold, then $\psi[\mathcal P(\zeta)]$ is a function of the real variable $\zeta$, and its derivative $(d/d\zeta)\psi[\mathcal P(\zeta)]$ evaluated at $\zeta = 0$ is the ordinary derivative of elementary calculus. Since the derivative operator $d/d\zeta$ differentiates in the manifold along the direction in which the curve is moving, it is often called the directional derivative along $\mathcal P(\zeta)$. Mathematicians notice that all the directional derivatives at a point $\mathcal P_o$ of the manifold form a vector space (they can be multiplied by scalars and added and subtracted to get new vectors), and so they define this vector space to be the tangent space at $\mathcal P_o$.

This mathematical procedure turns out to be isomorphic to the physicists’ more heuristic way of thinking about the tangent space.
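As a concrete check of the coordinate-basis relations (23.13) through (23.16), the following sketch works in circular polar coordinates $(\varpi, \phi)$ on the Euclidean plane. It is an illustration only: the basis vectors and coordinate gradients are written out in Cartesian components by hand, and the function names are hypothetical. It verifies that $\vec e_\alpha\cdot\vec e_\beta$ reproduces $g_{\varpi\varpi} = 1$, $g_{\phi\phi} = \varpi^2$, that the gradients of the coordinates are dual to the coordinate basis, and that inserting a small displacement into the metric gives $d\varpi^2 + \varpi^2 d\phi^2$.

```python
import math


def coordinate_basis(varpi, phi):
    """Cartesian components of e_varpi = dP/dvarpi and e_phi = dP/dphi, Eq. (23.13)."""
    e_varpi = (math.cos(phi), math.sin(phi))
    e_phi = (-varpi * math.sin(phi), varpi * math.cos(phi))
    return [e_varpi, e_phi]


def dual_basis(varpi, phi):
    """Cartesian components of the gradients of varpi and phi, Eq. (23.14)."""
    grad_varpi = (math.cos(phi), math.sin(phi))
    grad_phi = (-math.sin(phi) / varpi, math.cos(phi) / varpi)
    return [grad_varpi, grad_phi]


def dot(a, b):
    """Euclidean inner product of two vectors given by Cartesian components."""
    return a[0] * b[0] + a[1] * b[1]


def metric_components(varpi, phi):
    """g_{ij} = e_i . e_j in the coordinate basis."""
    e = coordinate_basis(varpi, phi)
    return [[dot(e[i], e[j]) for j in range(2)] for i in range(2)]


def duality_matrix(varpi, phi):
    """Matrix of e^mu . e_alpha, which Eq. (23.15) says is the Kronecker delta."""
    e = coordinate_basis(varpi, phi)
    d = dual_basis(varpi, phi)
    return [[dot(d[m], e[a]) for a in range(2)] for m in range(2)]


def line_element(varpi, phi, d_varpi, d_phi):
    """ds^2 = g_{ij} dx^i dx^j for the displacement (d_varpi, d_phi), Eq. (23.16)."""
    g = metric_components(varpi, phi)
    dx = (d_varpi, d_phi)
    return sum(g[i][j] * dx[i] * dx[j] for i in range(2) for j in range(2))


print(metric_components(2.0, 0.7))          # approximately [[1, 0], [0, 4]]
print(duality_matrix(2.0, 0.7))             # approximately the identity matrix
print(line_element(2.0, 0.7, 0.01, 0.02))   # approximately 0.01**2 + 4 * 0.02**2
```

Note how $\vec e_\phi$ grows with radius while $\vec\nabla\phi$ shrinks, so that their inner product stays equal to one; that is the duality relation at work in a non-orthonormal basis.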
In physicists’ language, if one introduces a coordinate system in a region of the manifold containing $\mathcal P_o$ and constructs the corresponding coordinate basis $\vec e_\alpha = \partial\mathcal P/\partial x^\alpha$, then one can expand any vector in the tangent space as $\vec A = A^\alpha\, \partial\mathcal P/\partial x^\alpha$. One can also construct, in physicists’ language, the directional derivative along $\vec A$; it is $\partial_{\vec A} \equiv A^\alpha\, \partial/\partial x^\alpha$. Evidently, the components $A^\alpha$ of the physicist’s vector $\vec A$ (an arrow) are identical to the coefficients $A^\alpha$ in the coordinate-expansion of the directional derivative $\partial_{\vec A}$. There therefore is a one-to-one correspondence between the directional derivatives $\partial_{\vec A}$ at $\mathcal P_o$ and the vectors $\vec A$ there, and a complete isomorphism between the tangent-space manipulations that a mathematician will perform treating the directional derivatives as vectors, and those that a physicist will perform treating the arrows as vectors.

“Why not abandon the fuzzy concept of a vector as an arrow, and redefine the vector $\vec A$ to be the same as the directional derivative $\partial_{\vec A}$?” mathematicians have demanded of physicists. Slowly, over the past century, physicists have come to see the merit in this approach: (i) It does, indeed, make the concept of a vector more rigorous than before. (ii) It simplifies a number of other concepts in mathematical physics, e.g., the commutator of two vector fields; see below. (iii) It facilitates communication with mathematicians. With these motivations in mind, and because one always gains conceptual and computational power by having multiple viewpoints at one’s finger tips (see, e.g., Feynman, 1966), we shall regard vectors henceforth both as arrows living in a tangent space and as directional derivatives. Correspondingly, we shall assert the equalities

$$\frac{\partial\mathcal P}{\partial x^\alpha} = \frac{\partial}{\partial x^\alpha}, \qquad \vec A = \partial_{\vec A}, \tag{23.24}$$

and shall often expand vectors in a coordinate basis using the notation

$$\vec A = A^\alpha \frac{\partial}{\partial x^\alpha} .$$
(23.25) This directional-derivative viewpoint on vectors makes natural the concept of the commutator of two vector fields $\vec A$ and $\vec B$: $[\vec A, \vec B]$ is the vector which, when viewed as a differential operator, is given by $[\partial_{\vec A}, \partial_{\vec B}]$, where the latter quantity is the same commutator as one meets elsewhere in physics, e.g. in quantum mechanics. Using this definition, we can compute the components of the commutator in a coordinate basis:

$$[\vec A, \vec B] \equiv \left[A^\alpha \frac{\partial}{\partial x^\alpha},\ B^\beta \frac{\partial}{\partial x^\beta}\right] = \left(A^\alpha \frac{\partial B^\beta}{\partial x^\alpha} - B^\alpha \frac{\partial A^\beta}{\partial x^\alpha}\right)\frac{\partial}{\partial x^\beta}. \tag{23.26}$$

This is an operator equation where the final derivative is presumed to operate on a scalar field just as in quantum mechanics. From this equation we can read off the components of the commutator in any coordinate basis; they are $A^\alpha B^\beta{}_{,\alpha} - B^\alpha A^\beta{}_{,\alpha}$, where the comma denotes partial differentiation. Figure 23.5 uses this equation to deduce the geometric meaning of the commutator: it is the fifth leg needed to close a quadrilateral whose other four legs are constructed from the vector fields $\vec A$ and $\vec B$.

The commutator is useful as a tool for distinguishing between coordinate bases and non-coordinate bases (also called non-holonomic bases): In a coordinate basis, the basis vectors are just the coordinate system’s partial derivatives, $\vec e_\alpha = \partial/\partial x^\alpha$, and since partial derivatives commute, it must be that $[\vec e_\alpha, \vec e_\beta] = 0$. Conversely (as Fig. 23.5 explains), if one has a basis with vanishing commutators $[\vec e_\alpha, \vec e_\beta] = 0$, then it is possible to construct a coordinate system for which this is the coordinate basis. In a non-coordinate basis, at least one of the commutators $[\vec e_\alpha, \vec e_\beta]$ will be nonzero.

Fig. 23.5: The commutator $[\vec A, \vec B]$ of two vector fields. In this diagram the vectors are assumed to be so small that the curvature of the manifold is negligible in the region of the diagram, so all the vectors can be drawn lying in the surface itself rather than in their respective tangent spaces. In evaluating the two terms in the commutator (23.26), a locally orthonormal coordinate basis is used, so $A^\alpha\, \partial B^\beta/\partial x^\alpha$ is the amount by which the vector $\vec B$ changes when one travels along $\vec A$ (i.e. it is the short dashed curve in the upper right), and $B^\alpha\, \partial A^\beta/\partial x^\alpha$ is the amount by which $\vec A$ changes when one travels along $\vec B$ (i.e. it is the other short dashed curve). According to Eq. (23.26), the difference of these two short-dashed curves is the commutator $[\vec A, \vec B]$. As the diagram shows, this commutator closes the quadrilateral whose legs are $\vec A$ and $\vec B$. If the commutator vanishes, then there is no gap in the quadrilateral, which means that in the region covered by this diagram one can construct a coordinate system in which $\vec A$ and $\vec B$ are coordinate basis vectors.

23.3.3 Differentiation of Vectors and Tensors; Connection Coefficients

In a curved manifold, the differentiation of vectors and tensors is rather subtle. To elucidate the problem, let us recall how we defined such differentiation in Minkowski spacetime or Euclidean space (Sec. 1.9). Converting to the above notation, we began by defining the directional derivative of a tensor field $\mathbf{F}(\mathcal P)$ along the tangent vector $\vec A = d/d\zeta$ to a curve $\mathcal P(\zeta)$:

$$\nabla_{\vec A}\mathbf{F} \equiv \lim_{\Delta\zeta\to 0} \frac{\mathbf{F}[\mathcal P(\Delta\zeta)] - \mathbf{F}[\mathcal P(0)]}{\Delta\zeta}. \tag{23.27}$$

This definition is problematic because $\mathbf{F}[\mathcal P(\Delta\zeta)]$ lives in a different tangent space than $\mathbf{F}[\mathcal P(0)]$. To make the definition meaningful, we must identify some connection between the two tangent spaces, when their points $\mathcal P(\Delta\zeta)$ and $\mathcal P(0)$ are arbitrarily close together. That connection is equivalent to identifying a rule for transporting $\mathbf{F}$ from one tangent space to the other.

In flat space or flat spacetime, and when $\mathbf{F}$ is a vector $\vec F$, that transport rule is obvious: keep $\vec F$ parallel to itself and keep its length fixed during the transport; in other words, keep constant its components in an orthonormal coordinate system (Cartesian coordinates in Euclidean space, Lorentz coordinates in Minkowski spacetime). This is called the law of parallel transport.
For a tensor $\mathbf F$ the parallel transport law is the same: keep its components fixed in an orthonormal coordinate basis.

In curved spacetime there is no such thing as an orthonormal coordinate basis. Just as the curvature of the earth's surface prevents one from placing a Cartesian coordinate system on it, so the spacetime curvature prevents one from introducing Lorentz coordinates; see Chap. 24. However, in an arbitrarily small region on the earth's surface one can introduce coordinates that are arbitrarily close to Cartesian (as surveyors well know); the fractional deviations from Cartesian need be no larger than $O(L^2/R^2)$, where $L$ is the size of the [...]

[...] $g_{\mu\nu}$ are constant, so the $g$'s will vanish and $\Gamma_{\alpha\beta\gamma}$ will be antisymmetric in its first two indices; and in a Cartesian or Lorentz coordinate basis, which is both coordinate and orthonormal, both the $c$'s and the $g$'s will vanish, so $\Gamma_{\alpha\beta\gamma}$ will vanish.

(iv) Raise the first index on the covariant Christoffel symbols to obtain the connection coefficients, which are also sometimes called the mixed Christoffel symbols:

$$\Gamma^\mu{}_{\beta\gamma} = g^{\mu\alpha}\, \Gamma_{\alpha\beta\gamma} . \tag{23.39}$$

The gradient operator $\vec\nabla$ is an example of a geometric object that is not a tensor. The connection coefficients can be regarded as the components of $\vec\nabla$; and because $\vec\nabla$ is not a tensor, these components $\Gamma^\alpha{}_{\beta\gamma}$ do not obey the tensorial transformation law (23.19) when switching from one basis to another. Their transformation law is far more complicated and is very rarely used. Normally one computes them from scratch in the new basis, using the above prescription or some other, equivalent prescription (cf. Chap. 14 of MTW). For most curved spacetimes that one meets in general relativity, these computations are long and tedious and therefore are normally carried out on computers using symbolic-manipulation software such as Macsyma, or GRTensor (running under Maple or Mathematica), or Mathtensor (under Mathematica). Such software is easily found on the Internet using a search engine.
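As a small illustration of such a computation, the sketch below evaluates connection coefficients numerically for spherical polar coordinates $(r, \theta, \phi)$ in Euclidean 3-space. It works in the coordinate basis, where the commutation coefficients vanish, and assumes the standard coordinate-basis expression $\Gamma_{\alpha\beta\gamma} = \tfrac12 (g_{\alpha\beta,\gamma} + g_{\alpha\gamma,\beta} - g_{\beta\gamma,\alpha})$, which is consistent with Eqs. (23.43) and (23.45). The function names, the finite-difference step, and the shortcut of inverting a diagonal metric entry by entry are choices made here for illustration; this is not a substitute for the symbolic packages named above.

```python
import math

def metric(x):
    """Covariant metric g_{jk} for spherical polar coordinates x = (r, theta, phi)."""
    r, theta, _phi = x
    return [[1.0, 0.0, 0.0],
            [0.0, r**2, 0.0],
            [0.0, 0.0, (r * math.sin(theta))**2]]

def metric_derivatives(x, h=1e-5):
    """dg[c][a][b] = d g_{ab} / d x^c, estimated by central differences."""
    n = len(x)
    dg = []
    for c in range(n):
        x_plus, x_minus = list(x), list(x)
        x_plus[c] += h
        x_minus[c] -= h
        g_plus, g_minus = metric(x_plus), metric(x_minus)
        dg.append([[(g_plus[a][b] - g_minus[a][b]) / (2 * h) for b in range(n)]
                   for a in range(n)])
    return dg

def connection_coefficients(x):
    """Return Gamma[m][b][c] = Gamma^m_{bc} in the coordinate basis."""
    n = len(x)
    g = metric(x)
    dg = metric_derivatives(x)
    # Covariant symbols: Gamma_{abc} = (g_{ab,c} + g_{ac,b} - g_{bc,a}) / 2
    lower = [[[0.5 * (dg[c][a][b] + dg[b][a][c] - dg[a][b][c])
               for c in range(n)] for b in range(n)] for a in range(n)]
    # Raise the first index, Eq. (23.39). Assumption: the metric is diagonal,
    # so g^{mm} = 1 / g_{mm}; a general metric would need a matrix inverse.
    return [[[lower[m][b][c] / g[m][m] for c in range(n)] for b in range(n)]
            for m in range(n)]

point = [2.0, math.pi / 3, 0.5]
Gamma = connection_coefficients(point)
print("Gamma^r_{theta theta}   =", Gamma[0][1][1])   # analytic value: -r
print("Gamma^theta_{r theta}   =", Gamma[1][0][1])   # analytic value: 1/r
print("Gamma^phi_{theta phi}   =", Gamma[2][1][2])   # analytic value: cot(theta)
```

The printed values can be compared with the familiar analytic results noted in the comments; agreement is limited only by the finite-difference step.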
The above prescription for computing the connection coefficients follows from two key properties of the gradient $\vec\nabla$.

First, the gradient of the metric tensor vanishes,

$$\vec\nabla \mathbf g = 0 . \tag{23.40}$$

This can be seen by introducing a locally orthonormal coordinate basis at the arbitrary point $\mathcal P$ where the gradient is to be evaluated. In such a basis, the effects of curvature show up only at quadratic order in distance away from $\mathcal P$, which means that the coordinate basis vectors $\vec e_\alpha \equiv \partial/\partial x^\alpha$ behave, at first order in distance, just like those of an orthonormal coordinate system in flat space. Since $\nabla_\beta \vec e_\alpha$ involves only first derivatives and it vanishes in an orthonormal coordinate system in flat space, it must also vanish here, which means that the connection coefficients vanish at $\mathcal P$ in this basis. Therefore, the components of $\vec\nabla \mathbf g$ at $\mathcal P$ are $g_{\alpha\beta;\gamma} = g_{\alpha\beta,\gamma} = \partial g_{\alpha\beta}/\partial x^\gamma$, which vanish since the components of $\mathbf g$ in this basis are all $0$ or $\pm 1$ plus corrections second order in distance from $\mathcal P$. This vanishing of the components of $\vec\nabla \mathbf g$ in our special basis guarantees that $\vec\nabla \mathbf g$ itself vanishes at $\mathcal P$; and since $\mathcal P$ was an arbitrary point, $\vec\nabla \mathbf g$ must vanish everywhere and always.

Second, for any two vector fields $\vec A$ and $\vec B$, the gradient is related to the commutator by

$$\nabla_{\vec A} \vec B - \nabla_{\vec B} \vec A = [\vec A, \vec B] . \tag{23.41}$$

This relation, like $\vec\nabla \mathbf g = 0$, is most easily derived by introducing a locally orthonormal coordinate basis at the point $\mathcal P$ where one wishes to check its validity. Since $\Gamma^\mu{}_{\alpha\beta} = 0$ at $\mathcal P$ in that basis, the components of $\nabla_{\vec A} \vec B - \nabla_{\vec B} \vec A$ are $B^\alpha{}_{;\beta} A^\beta - A^\alpha{}_{;\beta} B^\beta = B^\alpha{}_{,\beta} A^\beta - A^\alpha{}_{,\beta} B^\beta$ [cf. Eq. (23.33)]. But these components are identical to those of the commutator $[\vec A, \vec B]$ [Eq. (23.26)]. Since the components of these two vectors [the left and right sides of (23.41)] are identical at $\mathcal P$ in this special basis, the vectors must be identical, and since the point $\mathcal P$ was arbitrary, they must always be identical.

Turn, now, to the derivation of our prescription for computing the connection coefficients in an arbitrary basis.
By virtue of the relation $\Gamma^\mu{}_{\beta\gamma} = g^{\mu\alpha}\Gamma_{\alpha\beta\gamma}$ [Eq. (23.39)] and its inverse

$$\Gamma_{\alpha\beta\gamma} = g_{\alpha\mu}\, \Gamma^\mu{}_{\beta\gamma} , \tag{23.42}$$

a knowledge of $\Gamma_{\alpha\beta\gamma}$ is equivalent to a knowledge of $\Gamma^\mu{}_{\beta\gamma}$. Thus, our task reduces to deriving expression (23.38) for $\Gamma_{\alpha\beta\gamma}$, in which the $c_{\alpha\beta\gamma}$ are defined by equations (23.36) and (23.37).

As a first step in the derivation, notice that the constancy of the metric tensor, $\vec\nabla \mathbf g = 0$, when expressed in component notation using Eq. (23.35), and when combined with Eq. (23.42), becomes $0 = g_{\alpha\beta;\gamma} = g_{\alpha\beta,\gamma} - \Gamma_{\beta\alpha\gamma} - \Gamma_{\alpha\beta\gamma}$; i.e.,

$$\Gamma_{\alpha\beta\gamma} + \Gamma_{\beta\alpha\gamma} = g_{\alpha\beta,\gamma} . \tag{23.43}$$

This determines the part of $\Gamma_{\alpha\beta\gamma}$ that is symmetric in the first two indices. The commutator of the basis vectors determines the part antisymmetric in the last two indices: From

$$c_{\alpha\beta}{}^{\mu}\, \vec e_\mu = [\vec e_\alpha, \vec e_\beta] = \nabla_\alpha \vec e_\beta - \nabla_\beta \vec e_\alpha = \left( \Gamma^\mu{}_{\beta\alpha} - \Gamma^\mu{}_{\alpha\beta} \right) \vec e_\mu \tag{23.44}$$

(where the first equality is the definition (23.36) of the commutation coefficient, the second is expression (23.41) for the commutator in terms of the gradient, and the third follows from the definition (23.28) of the connection coefficient), we infer, by equating the components and lowering the $\mu$ index, that

$$\Gamma_{\gamma\beta\alpha} - \Gamma_{\gamma\alpha\beta} = c_{\alpha\beta\gamma} . \tag{23.45}$$

By combining equations (23.43) and (23.45) and performing some rather tricky algebra (cf. Ex. 8.15 of MTW), we obtain the computational rule (23.38).

23.3.4 Integration

Our desire to use general bases and work in curved space gives rise to two new issues in the definition of integrals.

First, the volume elements used in integration involve the Levi-Civita tensor [Eqs. (1.59b), (1.73), (1.76)], so we need to know the components of the Levi-Civita tensor in a general basis. It turns out [see, e.g., Ex. 8.3 of MTW] that the covariant components differ from those in an orthonormal basis by a factor $\sqrt{|g|}$ and the contravariant by $1/\sqrt{|g|}$, where

$$g \equiv \det \| g_{\alpha\beta} \| \tag{23.46}$$

is the determinant of the matrix whose entries are the covariant components of the metric. More specifically, let us denote by $[\alpha\beta\ldots\nu]$ the value of $\epsilon_{\alpha\beta\ldots\nu}$ in an orthonormal basis of our $N$-dimensional space [Eq. (1.59b)]:

$$[12\ldots N] = +1 , \qquad [\alpha\beta\ldots\nu] = \begin{cases} +1 & \text{if } \alpha, \beta, \ldots, \nu \text{ is an even permutation of } 1, 2, \ldots, N \\ -1 & \text{if } \alpha, \beta, \ldots, \nu \text{ is an odd permutation of } 1, 2, \ldots, N \\ \phantom{+}0 & \text{if } \alpha, \beta, \ldots, \nu \text{ are not all different.} \end{cases} \tag{23.47}$$

(In spacetime the indices must run from 0 to 3 rather than 1 to $N = 4$.) Then in a general right-handed basis the components of the Levi-Civita tensor are

$$\epsilon_{\alpha\beta\ldots\nu} = \sqrt{|g|}\, [\alpha\beta\ldots\nu] , \qquad \epsilon^{\alpha\beta\ldots\nu} = \pm \frac{1}{\sqrt{|g|}}\, [\alpha\beta\ldots\nu] , \tag{23.48}$$

where the $\pm$ is plus in Euclidean space and minus in spacetime. In a left-handed basis the sign is reversed.

As an example of these formulas, consider a spherical polar coordinate system $(r, \theta, \phi)$ in three-dimensional Euclidean space, and use the three infinitesimal vectors $dx^j (\partial/\partial x^j)$ to construct the volume element $d\Sigma$ [cf. Eq. (1.69b)]:

$$d\Sigma = \boldsymbol\epsilon \left( dr \frac{\partial}{\partial r},\; d\theta \frac{\partial}{\partial \theta},\; d\phi \frac{\partial}{\partial \phi} \right) = \epsilon_{r\theta\phi}\, dr\, d\theta\, d\phi = \sqrt{g}\, dr\, d\theta\, d\phi = r^2 \sin\theta\, dr\, d\theta\, d\phi . \tag{23.49}$$

Here the second equality follows from linearity of $\boldsymbol\epsilon$ and the formula for computing its components by inserting basis vectors into its slots; the third equality follows from our formula (23.48) for the components; and the fourth equality entails the determinant of the metric coefficients, which in spherical coordinates are $g_{rr} = 1$, $g_{\theta\theta} = r^2$, $g_{\phi\phi} = r^2 \sin^2\theta$, with all other $g_{jk}$ vanishing, so $g = r^4 \sin^2\theta$. The resulting volume element $r^2 \sin\theta\, dr\, d\theta\, d\phi$ should be familiar and obvious.

The second new integration issue that we must face is the fact that integrals such as

$$\int_{\partial V} T^{\alpha\beta}\, d\Sigma_\beta \tag{23.50}$$

[cf. Eqs. (1.76), (1.77)] involve constructing a vector $T^{\alpha\beta} d\Sigma_\beta$ in each infinitesimal region $d\Sigma_\beta$ of the surface of integration, and then adding up the contributions from all the infinitesimal regions. A major difficulty arises from the fact that each contribution lives in a different tangent space. To add them together, we must first transport them all to the same tangent space at some single location in the manifold.
How is that transport to be performed? The obvious answer is "by the same parallel transport technique that we used in defining the gradient." However, when defining the gradient we only needed to perform the parallel transport over an infinitesimal distance, and now we must perform it over long distances. As we shall see in Chap. 24, when the manifold is curved, long-distance parallel transport gives a result that depends on the route of the transport, and in general there is no way to identify any preferred route. As a result, integrals such as (23.50) are ill-defined in a curved manifold. The only integrals that are well defined in a curved manifold are those such as $\int_{\partial V} S^\alpha\, d\Sigma_\alpha$ whose infinitesimal contributions $S^\alpha d\Sigma_\alpha$ are scalars, i.e. integrals whose value is a scalar. This fact will have profound consequences in curved spacetime for the laws of energy, momentum, and angular momentum conservation.

****************************

EXERCISES

[...]

(a) In quantum mechanics one deals with "projection operators" $\hat P$, which satisfy the equation $\hat P^2 = \hat P$. Show that $P_{\alpha\beta}$ is a projection tensor, in the sense that $P_{\alpha\beta} P^\beta{}_\gamma = P_{\alpha\gamma}$.

(b) This suggests that $P_{\alpha\beta}$ may project vectors into some subspace of 4-dimensional spacetime. Indeed it does: Show that for any vector $A^\alpha$, $P_{\alpha\beta} A^\beta$ is orthogonal to $\vec u$; and if $A^\alpha$ is already perpendicular to $\vec u$, then $P_{\alpha\beta} A^\beta = A_\alpha$, i.e. the projection leaves the vector unchanged. Thus, $P_{\alpha\beta}$ projects vectors into the 3-space orthogonal to $\vec u$.

(c) What are the components of $P_{\alpha\beta}$ in the fluid's local rest frame, i.e. in an orthonormal basis where $\vec u = \vec e_{\hat 0}$?

(d) Show that the rate of change of $\vec u$ along itself, $\nabla_{\vec u} \vec u$ (i.e., the fluid 4-acceleration), is equal to the vector $\vec a$ that appears in the decomposition (23.54). Show, further, that $\vec a \cdot \vec u = 0$.

(e) Show that the divergence of the 4-velocity, $\vec\nabla \cdot \vec u$, is equal to the scalar field $\theta$ that appears in the decomposition (23.54).
(f) The quantities $\sigma_{\alpha\beta}$ and $\omega_{\alpha\beta}$ are the relativistic versions of the fluid's shear and rotation tensors. Derive equations for these tensors in terms of $u_{\alpha;\beta}$ and $P_{\mu\nu}$.

(g) Show that, as viewed in a Lorentz reference frame where the fluid is moving with speed small compared to the speed of light, to first order in the fluid's ordinary velocity $v^j = dx^j/dt$, the following are true: (i) $u^0 = 1$, $u^j = v^j$; (ii) $\theta$ is the nonrelativistic expansion of the fluid, $\theta = \nabla \cdot \mathbf v \equiv v^j{}_{,j}$ [Eq. (12.63)]; (iii) $\sigma_{jk}$ is the fluid's nonrelativistic shear [Eq. (12.63)]; (iv) $\omega_{jk}$ is the fluid's nonrelativistic rotation tensor [denoted $r_{jk}$ in Eq. (12.63)].

Exercise 23.10 Practice: Integration; Gauss's Theorem

In 3-dimensional Euclidean space the Maxwell equation $\nabla \cdot \mathbf E = \rho_e/\epsilon_0$ can be combined with Gauss's theorem to show that the electric flux through the surface $\partial V$ of a sphere is equal to the charge in the sphere's interior $V$ divided by $\epsilon_0$:

$$\int_{\partial V} \mathbf E \cdot d\mathbf\Sigma = \int_V (\rho_e/\epsilon_0)\, d\Sigma . \tag{23.56}$$

Introduce spherical polar coordinates so the sphere's surface is at some radius $r = R$. Consider a surface element on the sphere's surface with vectorial legs $d\phi\, \partial/\partial\phi$ and $d\theta\, \partial/\partial\theta$. Evaluate the components $d\Sigma_j$ of the surface integration element $d\mathbf\Sigma = \boldsymbol\epsilon(\ldots, d\theta\, \partial/\partial\theta, d\phi\, \partial/\partial\phi)$. Similarly, evaluate $d\Sigma$ in terms of vectorial legs in the sphere's interior. Then use these results for $d\Sigma_j$ and $d\Sigma$ to convert Eq. (23.56) into an explicit form in terms of integrals over $r$, $\theta$, $\phi$. The final answer should be obvious, but the above steps in deriving it are informative.

****************************

23.4 The Stress-Energy Tensor Revisited

In Sec. 1.12 we defined the stress-energy tensor $\mathbf T$ of any matter or field as a symmetric, second-rank tensor that describes the flow of 4-momentum through spacetime. More specifically, the total 4-momentum $\vec P$ that flows through some small 3-volume $\vec\Sigma$, going from the negative side of $\vec\Sigma$ to its positive side, is

$$\mathbf T(\ldots, \vec\Sigma) = (\text{total 4-momentum } \vec P \text{ that flows through } \vec\Sigma) ; \quad \text{i.e.,} \quad T^{\alpha\beta} \Sigma_\beta = P^\alpha \tag{23.57}$$

[Eq. (1.91)]. Of course, this stress-energy tensor depends on the location $\mathcal P$ of the 3-volume in spacetime; i.e., it is a tensor field $\mathbf T(\mathcal P)$.

From this geometric, frame-independent definition of the stress-energy tensor, we were able to read off the physical meaning of its components in any inertial reference frame [Eqs. (1.92)]: $T^{00}$ is the total energy density, including rest mass-energy; $T^{j0} = T^{0j}$ is the $j$-component of momentum density, or equivalently the $j$-component of energy flux; and $T^{jk}$ are the components of the stress tensor, or equivalently of the momentum flux.

We gained some insight into the stress-energy tensor in the context of kinetic theory in Secs. 2.4.2 and 2.5.3, and we briefly introduced the stress-energy tensor for a perfect fluid in Eq. (1.99b). Because perfect fluids will play a very important role in this book's applications of general relativity to relativistic stars (Chap. 25) and cosmology (Chap. 27), we shall now explore the perfect-fluid stress-energy tensor in some depth, and shall see how it is related to the Newtonian description of perfect fluids, which we studied in Part IV.

Recall [Eq. (1.99a)] that in the local rest frame of a perfect fluid, there is no energy flux or momentum density, $T^{j0} = T^{0j} = 0$, but there is a total energy density (including rest mass) $\rho$ and an isotropic pressure $P$:

$$T^{00} = \rho , \qquad T^{jk} = P \delta^{jk} . \tag{23.58}$$

From this special form of $T^{\alpha\beta}$ in the local rest frame, one can derive Eq. (1.99b) for the stress-energy tensor in terms of the 4-velocity $\vec u$ of the local rest frame (i.e., of the fluid itself), the metric tensor of spacetime $\mathbf g$, and the rest-frame energy density $\rho$ and pressure $P$:

$$T^{\alpha\beta} = (\rho + P)\, u^\alpha u^\beta + P g^{\alpha\beta} ; \quad \text{i.e.,} \quad \mathbf T = (\rho + P)\, \vec u \otimes \vec u + P \mathbf g ; \tag{23.59}$$

see Ex. 23.11, below. This expression for the stress-energy tensor of a perfect fluid is an example of a geometric, frame-independent description of physics.
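The relation between the rest-frame form (23.58) and the frame-independent form (23.59) can be checked numerically. The sketch below builds $T^{\alpha\beta}$ from Eq. (23.59) in a Lorentz frame, assuming units with $c = 1$ and the metric $g^{\alpha\beta} = \mathrm{diag}(-1, 1, 1, 1)$; that signature is inferred from the requirement that Eq. (23.59) give $T^{00} = \rho$ in the rest frame. The function names and sample numbers are illustrative only.

```python
import math

# Contravariant Lorentz-frame metric, signature (-,+,+,+); assumed, see lead-in.
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]

def four_velocity(v):
    """4-velocity components (u^0, u^1, u^2, u^3) for ordinary velocity v, with c = 1."""
    speed_sq = sum(c * c for c in v)
    u0 = 1.0 / math.sqrt(1.0 - speed_sq)
    return [u0] + [u0 * c for c in v]

def perfect_fluid_T(rho, P, v):
    """T^{ab} = (rho + P) u^a u^b + P g^{ab}, Eq. (23.59), in a Lorentz frame."""
    u = four_velocity(v)
    return [[(rho + P) * u[a] * u[b] + P * ETA[a][b] for b in range(4)]
            for a in range(4)]

# Rest frame: reproduces T^{00} = rho, T^{jk} = P delta^{jk}, T^{0j} = 0.
T_rest = perfect_fluid_T(1.0, 0.2, (0.0, 0.0, 0.0))
print([T_rest[a][a] for a in range(4)])

# Moving fluid: momentum density T^{0x} appears, and T stays symmetric.
T_moving = perfect_fluid_T(1.0, 0.2, (0.6, 0.0, 0.0))
print(T_moving[0][0], T_moving[0][1], T_moving[1][1])

# The trace g_{ab} T^{ab} = -rho + 3P is the same in both frames.
trace = sum(ETA[a][a] * T_moving[a][a] for a in range(4))
print(trace)
```

Because the trace is a scalar, its agreement between the two frames is a quick consistency check on the frame-independent form.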
It is instructive to evaluate the nonrelativistic limit of this perfect-fluid stress-energy tensor and verify that it has the form we used in our study of nonrelativistic, inviscid fluid mechanics (Table 12.1 on page 24 of Chap. 12, with vanishing gravitational potential $\Phi = 0$).

In the nonrelativistic limit the fluid is nearly at rest in the chosen Lorentz reference frame. It moves with ordinary velocity $\mathbf v = d\mathbf x/dt$ that is small compared to the speed of light, so the temporal part of its 4-velocity $u^0 = 1/\sqrt{1 - v^2}$ and spatial part $\mathbf u = u^0 \mathbf v$ can be approximated as

$$u^0 \simeq 1 + \tfrac12 v^2 , \qquad \mathbf u \simeq \left( 1 + \tfrac12 v^2 \right) \mathbf v . \tag{23.60}$$

In the fluid's rest frame, in special relativity, it has a rest mass density $\rho_o$ [defined in Eq. (1.83)], an internal energy per unit rest mass $u$ (not to be confused with the 4-velocity), and a total density of mass-energy

$$\rho = \rho_o (1 + u) . \tag{23.61}$$

Now, in our chosen Lorentz frame the volume of each fluid element is Lorentz contracted by the factor $\sqrt{1 - v^2}$, and therefore the rest mass density is increased from $\rho_o$ to $\rho_o/\sqrt{1 - v^2} = \rho_o u^0$; correspondingly the rest-mass flux is $\rho_o u^0 \mathbf v = \rho_o \mathbf u$ [Eq. (1.83)], and the law of rest-mass conservation is $\partial(\rho_o u^0)/\partial t + \partial(\rho_o u^j)/\partial x^j = 0$, i.e. $\vec\nabla \cdot (\rho_o \vec u) = 0$. When taking the Newtonian limit, we should identify the Newtonian mass density $\rho_N$ with the low-velocity limit of this rest mass density:

$$\rho_N = \rho_o u^0 \simeq \rho_o \left( 1 + \tfrac12 v^2 \right) . \tag{23.62}$$

The nonrelativistic limit regards the specific internal energy $u$, the kinetic energy per unit mass $\tfrac12 v^2$, and the ratio of pressure to rest mass density $P/\rho_o$ as of the same order of smallness,

$$u \sim \tfrac12 v^2 \sim \frac{P}{\rho_o} \ll 1 , \tag{23.63}$$

and it expresses the momentum density $T^{j0}$ accurate to first order in $v \equiv |\mathbf v|$, the momentum flux (stress) $T^{jk}$ accurate to second order in $v$, the energy density $T^{00}$ accurate to second order in $v$, and the energy flux $T^{0j}$ accurate to third order in $v$.
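A short sketch of how these orderings are applied to the time-time and time-space components, using only Eqs. (23.59), (23.61), and (23.62) together with $g^{00} = -1$, $g^{0j} = 0$ in the Lorentz frame. The shorthand $\gamma \equiv u^0 = 1/\sqrt{1 - v^2}$ is introduced here for compactness and is not the text's notation; terms smaller than the stated accuracy, such as $P v^2$ and $\rho_N u v^2$, are dropped in the last step of each line.

```latex
\begin{align*}
T^{00} &= (\rho + P)\gamma^2 - P
        = \rho_o (1 + u)\gamma^2 + P \gamma^2 v^2
        = \rho_N \gamma (1 + u) + P \gamma^2 v^2
        \simeq \rho_N \left( 1 + \tfrac12 v^2 + u \right) , \\
T^{0j} &= (\rho + P)\gamma^2 v^j
        = \left[ \rho_N \gamma (1 + u) + P \gamma^2 \right] v^j
        \simeq \rho_N v^j \left( 1 + \tfrac12 v^2 + u + \frac{P}{\rho_N} \right) .
\end{align*}
```

Truncating the second line at first order in $v$ leaves the momentum density $\rho_N v^j$.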
To these accuracies, the perfect-fluid stress-energy tensor (23.59) takes the following form:

$$T^{j0} = \rho_N v^j , \qquad T^{jk} = P g^{jk} + \rho_N v^j v^k , \qquad T^{00} = \rho_N + \tfrac12 \rho_N v^2 + \rho_N u , \qquad T^{0j} = \rho_N v^j + \left( \tfrac12 v^2 + u + \frac{P}{\rho_N} \right) \rho_N v^j ; \tag{23.64}$$

see Ex. 23.11(c). These are precisely the same as the momentum density, momentum flux, energy density, and energy flux that we used in our study of nonrelativistic, inviscid fluid mechanics (Chap. 12), aside from the notational change from there to here $\rho \to \rho_N$, and aside from including the rest mass-energy $\rho_N = \rho_N c^2$ in $T^{00}$ here but not there, and including the rest-mass-energy flux $\rho_N v^j$ in $T^{0j}$ here but not there.

Just as the nonrelativistic equations of fluid mechanics (Euler equation and energy conservation) are derivable by combining the nonrelativistic $T^{\alpha\beta}$ of Eq. (23.64) with the nonrelativistic laws of momentum and energy conservation, so also the relativistic equations of fluid mechanics are derivable by combining the relativistic version (23.59) of $T^{\alpha\beta}$ with the equation of 4-momentum conservation $\vec\nabla \cdot \mathbf T = 0$. (We shall give such a derivation and shall examine the resulting fluid mechanics equations in the context of general relativity in Chap. 24.) This, together with the fact that the relativistic $\mathbf T$ reduces to the nonrelativistic $T^{\alpha\beta}$ in the nonrelativistic limit, guarantees that the special relativistic equations of inviscid fluid mechanics will reduce to the nonrelativistic equations in the nonrelativistic limit.

A second important example of a stress-energy tensor is that for the electromagnetic field. We shall explore it in Ex. 23.13 below.

For a point particle which moves through spacetime along a world line $\mathcal P(\zeta)$ (where $\zeta$ is the affine parameter such that the particle's 4-momentum is $\vec p = d/d\zeta$), the stress-energy