Lecture Notes on Special Relativity, Exercises of Relativity Theory

Vanderbilt University (Vandy)Relativity Theory

Typology: Exercises

2022/2023

Uploaded on 05/11/2023

Lecture Notes on Special Relativity

prepared by J D Cresser
Department of Physics
Macquarie University
8th August 2005

Contents

1 Introduction: What is Relativity?
2 Frames of Reference
    2.1 Constructing an Arbitrary Reference Frame
        2.1.1 Events
    2.2 Inertial Frames of Reference
        2.2.1 Newton’s First Law of Motion
3 Newtonian Relativity
    3.1 The Galilean Transformation
    3.2 Newtonian Force and Momentum
        3.2.1 Newton’s Second Law of Motion
        3.2.2 Newton’s Third Law of Motion
    3.3 Newtonian Relativity
    3.4 Maxwell’s Equations and the Ether
4 Einsteinian Relativity
    4.1 Einstein’s Postulates
    4.2 Clock Synchronization in an Inertial Frame
    4.3 Lorentz Transformation
    4.4 Relativistic Kinematics
        4.4.1 Length Contraction
        4.4.2 Time Dilation
        4.4.3 Simultaneity
        4.4.4 Transformation of Velocities (Addition of Velocities)
    4.5 Relativistic Dynamics
        4.5.1 Relativistic Momentum
        4.5.2 Relativistic Force, Work, Kinetic Energy
        4.5.3 Total Relativistic Energy
        4.5.4 Equivalence of Mass and Energy
        4.5.5 Zero Rest Mass Particles

Chapter 1 Introduction: What is Relativity?

Consider a collection of experimenters (or, as they are often referred to, observers), each based in laboratories moving at constant velocities with respect to one another, and each undertaking a series of experiments designed to lead to a mathematical statement of a particular physical law, such as the response of a body to the application of a force. According to the principle of relativity, the final form of the equations derived (in this case, Newton’s laws) will be found to have exactly the same form for all experimenters.

It should be understood that whilst the mathematical form of the laws will be the same, the actual data obtained by each experimenter – even if they are monitoring the same physical event – will not necessarily be numerically the same. For instance, the point in space where two bodies collide, and the time at which this collision occurs, will not necessarily be assigned the same coordinates by all experimenters. However, there is invariably a mathematical relationship between such data obtained by the different observers. In the case of Newtonian relativity these transformation equations constitute the so-called Galilean transformation. Using these transformation equations, the mathematical statement of any physical law according to one observer can be translated into the law as written down by another observer.
The principle of relativity then requires that the transformed equations have exactly the same form in all frames of reference moving with constant velocity with respect to one another, in other words that the physical laws are the same in all such frames of reference.

This statement contains concepts such as ‘mathematical form’, ‘frame of reference’ and ‘Galilean transformation’ which we have not developed, so perhaps it is best at this stage to illustrate its content by a couple of examples. In doing so it is best to make use of an equivalent statement of the principle, that is:

Given two observers A and B moving at a constant velocity with respect to one another, it is not possible by any experiment whatsoever to determine which of the observers is ‘at rest’ or which is ‘in motion’.

First consider an example from ‘everyday experience’ – a train carriage moving smoothly at a constant speed on a straight and level track – this is a ‘frame of reference’, an idea that will be better defined later. Suppose that in a carriage of this train there is a pool table and suppose you were a passenger on this carriage, and you decided to play a game of pool. One of the first things that you would notice is that in playing any shot, you would have to make no allowance whatsoever for the motion of the train. Any judgement of how to play a shot as learned by playing the game at home, or in the local pool hall, would apply equally well on the train, irrespective of how fast the train was moving. If we consider that what is taking place here is the innate application of Newton’s Laws to describe the motion and collision of the pool balls, we see that no adjustment has to be made to these laws when playing the game on the moving train.

This argument can be turned around. Suppose the train windows are covered, and the carriage is well insulated so that there is no vibration or noise – i.e.
there is no immediate evidence to the senses as to whether or not the train is in motion. It might nevertheless still be possible to determine if the train is in motion by carrying out an experiment, such as playing a game of pool. But, as described above, a game of pool proceeds in exactly the same way as if it were being played back home – no change in shot-making is required. There is no indication from this experiment as to whether or not the train is in motion. There is no way of knowing whether, on pulling back the curtains, you are likely to see the countryside hurtling by, or to find the train sitting at a station. In other words, by means of this experiment which, in this case, involves Newton’s Laws of motion, it is not possible to determine whether or not the train carriage is moving, an outcome entirely consistent with the principle of relativity.

This idea can be extended to encompass other laws of physics. To this end, imagine a collection of spaceships with engines shut off, all drifting through space. Each spaceship constitutes a ‘frame of reference’. On each of these ships a series of experiments is performed: a measurement of the half-life of uranium 235, a measurement of the outcome of the collision of two billiard balls, an experiment in thermodynamics, e.g. a measurement of the boiling point of water under normal atmospheric pressure, a measurement of the speed of light radiating from a nearby star: any conceivable experiment. If the results of these experiments are later compared, what is found is that in all cases (within experimental error) the results are identical. For instance, we do not find that on one spaceship water boils at 100°C, on another hurtling towards the first it boils at 150°C and on another hurtling away from the first, it boils at 70°C.
In other words, the various laws of physics being tested here yield exactly the same results for all the spaceships, in accordance with the principle of relativity.

Thus, quite generally, the principle of relativity means that it is not possible, by considering any physical process whatsoever, to determine whether or not one or the other of the spaceships is ‘in motion’. The results of all the experiments are the same on all the spaceships, so there is nothing that definitely singles out one spaceship over any other as being the one that is stationary. It is true that, from the point of view of an observer on any one of the spaceships, it is the others that are in motion. But the same statement can be made by an observer in any spaceship. All that we can say for certain is that the spaceships are in relative motion; we cannot claim that one of them is ‘truly’ stationary, while the others are all ‘truly’ moving.

This principle of relativity was accepted (in somewhat simpler form, i.e. with respect to the mechanical behaviour of bodies) by Newton and his successors, even though Newton postulated that underlying it all was ‘absolute space’ which defined the state of absolute rest. He introduced the notion in order to cope with the difficulty of specifying with respect to what an accelerated object is being accelerated. To see what is being implied here, imagine space completely empty of all matter except for two masses joined by a spring. Now suppose that the arrangement is rotated around an axis through the centre of the spring, and perpendicular to the spring. As a consequence, the masses will undergo acceleration. Naively, in accordance with our experience, we would expect that the masses would pull apart. But why should they? How do the masses ‘know’ that they are being rotated? There are no ‘signposts’ in an otherwise empty universe that would indicate that rotation is taking place.
By proposing that there existed an absolute space, Newton was able to claim that the masses are being accelerated with respect to this absolute space, and hence that they would separate in the way expected for masses in circular motion. But this was a supposition made more for the convenience it offered in putting together his Laws of motion, than anything else. It was an assumption that could not be substantiated, as Newton was well aware – he certainly felt misgivings about the concept! Other scientists were more accepting of the idea, however, with Maxwell’s theory of electromagnetism for a time seeming to provide some sort of confirmation of the concept.

One of the predictions of Maxwell’s theory was that light was an electromagnetic wave that travelled with a speed $c \approx 3 \times 10^{8}\ \mathrm{m\,s^{-1}}$. But relative to what? Maxwell’s theory did not specify any particular frame of reference for which light would have this speed. A convenient resolution to this problem was provided by an already existing assumption concerning the way light propagated through space. That light was a form of wave motion was well known – Young’s interference experiments had shown this – but the Newtonian world view required that a wave could not propagate through empty space: there must be present a medium of some sort that vibrated as the waves passed, much as a tub of jelly vibrates as a wave travels through it. The proposal was therefore made that space was filled with a substance known as the ether whose purpose was to be the medium that vibrated as the light waves propagated through it. It was but a small step to then propose that this ether was stationary with respect to Newton’s absolute space, thereby solving the problem of what the frame of reference was in which light had the speed c.
Furthermore, in keeping with the usual ideas of relative motion, the thinking was then that if you were to travel relative to the ether towards a beam of light, you would measure its speed to be greater than c, and less than c if you travelled away from the beam. It then came as an enormous surprise when it was found experimentally that this was not, in fact, the case.

This discovery was made by Michelson and Morley, who fully accepted the ether theory, and who, quite reasonably, thought it would be a nice idea to try to measure how fast the Earth was moving through the ether. But the result they found was quite unexpected. Irrespective of the position of the Earth in its orbit around the sun, the result was always zero, which made no sense at all: surely somewhere in the orbit the Earth would have to be moving relative to the ether. To put it another way, they measured the speed of light always to be the same value c no matter what the relative motion might be of the Earth with respect to the ether. In our spaceship picture, this is equivalent to all the spaceships obtaining the same value for the speed of light radiated by the nearby star irrespective of their motion relative to the star. This result is completely in conflict with the rule for relative velocities, which in turn is based on the principle of relativity as enunciated by Galileo. Thus the independence of the speed of light from the motion of the observer seems to take on the form of an immutable law of nature, and yet it is apparently inconsistent with the principle of relativity. Something was seriously amiss, and it was Einstein who showed how to get around the problem, and in doing so he was forced to conclude that space and time had properties undreamt of in the Newtonian world picture.
The first contribution made by Einstein was to raise to the level of a postulate the observation that the speed of light was apparently independent of the state of motion of its source, and this, along with the principle of relativity presented above, leads to the Special Theory of Relativity. This theory is concerned almost entirely with physical processes as observed from reference frames moving at constant velocities with respect to each other, so-called inertial frames of reference, and incorporates the fact that the results of the hypothetical experiments described above will all be independent of the state of motion of the experimenters. This is an outcome which it is best to understand at a fundamental level in terms of the mathematical forms taken by the laws of nature. All laws of nature appear to have expression in mathematical form, and, as mentioned earlier, the principle of relativity can be understood as saying that the equations describing a law of nature take the same mathematical form in all frames of reference moving at a constant velocity with respect to each other, and moreover, the velocity of the reference frame does not appear anywhere in these equations. But in order to guarantee that the principle of relativity holds true for all physical processes, including the postulate concerning the constancy of the speed of light, Einstein was forced to propose, along with a new perspective on the properties of space and time, modified versions of the familiar Newtonian concepts of force, momentum and energy, leading, amongst other things, to the famous equation $E = mc^2$.

Much later (1915), after a long struggle, Einstein produced a generalization of this theory in which it was required that the laws of physics should be the same in all frames of reference whether in constant relative motion, or undergoing acceleration, or even accelerating different amounts in different places.
This amounts to saying that any physical process taking place in space and time should proceed in a fashion that takes no account of the reference frame used to describe it. In other words, it ought to be possible to write down the laws of physics in terms of quantities that make no mention whatsoever of any particular reference frame. In accomplishing this task, Einstein was able to show that the force of gravity could be understood as a reflection of underlying geometrical properties of space and time – that space and time can be considered as a single geometric entity that can exhibit curvature.

All these ideas, and a lot more besides, have to be presented in a much more rigorous form. It is this perspective on relativity in terms of the mathematical statements of the laws of physics that is developed here, and an important starting point is pinning down the notion of a frame of reference.

Chapter 2 Frames of Reference

[...] marks will be left on the netting, and the clock closest to the supernova will grind to a halt, so that it reads the time at which the event occurred, while the coordinates of the crossing point closest to where the burn marks appear will give the position of the event.

It does not seem to be a particularly useful state of affairs to have completely arbitrary netting, i.e. it would be far more useful to design the net so that the crossing points are evenly spaced along the threads. This could be done, for instance, by laying out rods of some predetermined length and marking off some convenient separation between the crossing points – and the finer the scale the better.

Likewise, it is not particularly useful to have a whole host of clocks ticking away independently of one another[1], particularly if we want to compare whether one event occurring at some point in space occurs earlier or later than some other event occurring elsewhere, and if we want to specify how far apart in time they occur.
In other words, it would be preferable to arrange for these clocks to be synchronized in some way. This, however, is not at all a straightforward procedure and in some cases not even possible! It is easy enough to synchronize clocks at the same point in space – the problem is coming up with a way of doing so for clocks at different points in space.

In some circumstances it is possible to carry out this synchronization, thereby assigning a global time throughout the reference frame. In such cases, one possible procedure is to suppose that a whole collection of identical clocks is gathered at one point, say the origin of coordinates, and there they are all synchronized to some ‘master clock’. These clocks are then carried at an exceedingly slow rate (since, as we will see later, moving clocks ‘run slow’, and this can affect the synchronization) and distributed around the reference frame – a process known as adiabatic synchronization. Some adjustment may be necessary to the rates of each of these clocks, depending on any gravitational field present, and then we are done[2].

Gravity-free space (i.e. the situation described by special relativity) is one important situation where this synchronization procedure is possible. An expanding isotropic universe is another. There are, however, circumstances in which this cannot be done, such as in space-time around a rotating black hole, or more exotic still, in Gödel’s model of a rotating universe.

This combination of clocks and netting thus gives us one possible frame of reference with which to specify the positions and times at which events occur in space and time. With this frame of reference, we could, for instance, plot the position of a particle moving through space as a function of time: just imagine that the particle is highly radioactive so it leaves burn marks on the netting, and stops any nearby clock it passes.
After the particle has passed by, someone (the observer) clambers along the netting and notes down the coordinates of all the burn marks, and the times registered on the clocks closest to each such mark, takes all this data back to his laboratory, and plots position as a function of time. The result is a depiction of the path of the particle according to this frame of reference.

[1] Two chronometers the captain had,
One by Arnold that ran like mad,
One by Kendal in a walnut case,
Poor devoted creature with a hangdog face.
Arnold always hurried with a crazed click-click
Dancing over Greenwich like a lunatic,
Kendal panted faithfully his watch-dog beat,
Climbing out of Yesterday with sticky little feet.

[2] Carrying out the procedure of setting length scales and synchronizing clocks is actually trickier than it seems. For instance, the rods have to be at rest with respect to the net at the location where the distance is to be marked off, and the rods cannot be too long. In fact, in curved space time, they need to be infinitesimal in length, or at least very short compared to the length scale of the curvature of spacetime in its vicinity. Both the setting of length scales and synchronizing of clocks can be achieved by the use of light signals, but we will not be concerning ourselves with these issues.

By using a discrete net and clocks that have a finite time interval between ticks we can only represent the positions and times of occurrence of events to the accuracy determined by how fine the netting is and how long this interval is between clock ticks. But as we believe that space and time are both continuous quantities (though quantum mechanics may have something to say about this), we can suppose that we can get a better approximation to the position and time of occurrence of an event by imagining a finer netting and clocks with shorter intervals ‘between ticks’.
Ultimately we would end up with an infinitesimally fine net, and clocks whose ticks occur an infinitesimally short interval apart. But in the end we usually do away with this operationally based picture of nets and clocks and rely on the abstract mathematical notion of a reference frame. But the physical meaning of these mathematically idealized reference frames is nevertheless to be found in the approximate pictures conjured up by using these ideas of a network of clocks attached to a three-dimensional scaffolding filling all of space. When the going gets tough it is often useful to return to the notion of a reference frame defined in this way.

We can set up any number of such reference frames, each with its own coordinate network and set of coordinate clocks. We have (almost) total freedom to set up a reference frame any way we like, including different reference frames being in motion, or even accelerating, with respect to one another, and not necessarily in the same way everywhere. But whichever reference frame we use, we can then conduct experiments whose outcomes are expressed in terms of the associated set of coordinates, and express the various laws of physics in terms of the coordinate systems used.

So where does the principle of relativity come into the picture here? What this principle is saying, in its most general form, is that any physical process taking place in space and time ought to proceed in a fashion that takes no account of the reference frame that we use to describe it. In other words, it ought to be possible to write down the laws of physics in terms of quantities that make no mention whatsoever of any particular reference frame. We can already do this for Newtonian mechanics: Newton’s second law can be written as

$$\mathbf{F} = m\,\frac{d^2\mathbf{r}}{dt^2}$$

i.e. expressed in a way that makes no mention of a reference frame (though note the appearance of a singled out time variable t – the absolute time of Newton).
If we had chosen a particular set of axes, we would have

$$F_x = m\,\frac{d^2x}{dt^2}$$

and so on, where the values of the components of F depend on the set of axes chosen. Later we will see how physical laws can be expressed in a ‘frame invariant way’ in the context of special relativity, rather than Newtonian physics. Requiring the relativity principle to be true for arbitrary reference frames, along with a further postulate, the principle of equivalence, which essentially states that an object undergoing free fall in a gravitational field is equivalent to the object being acted on by no forces at all, then leads to general relativity.

2.1.1 Events

Colloquially, an event is something that occurs at a localized region in space over a localized interval in time, or, in an idealized limit, at a point in space at an instant in time. Thus, the motion of a particle through space could be thought of as a continuous series of events, while the collision of two particles would be an isolated event, and so on. However, it is useful to release this term ‘event’ from being associated with something happening. After all, the coordinate network spread throughout space, and the clocks ticking away the hours, will still be labelling points in space, along with ‘the time’ at each point in space, irrespective of whether or not anything actually takes place at a particular locality and at a particular time. The idea then is to use the term ‘event’ simply as another name for a point in space and time, this point specified by the spatial coordinates of the point in space, and the reading of a clock at that point.

An event will have different coordinates in different reference frames. It is then important and useful to be able to relate the coordinates of events in one reference frame to the coordinates of the same event in some other reference frame.
In Newtonian physics, this relation is provided by the Galilean transformation equations, and in special relativity by the Lorentz transformation. It is the latter transformation law, and special relativity in particular, that we will be concerning ourselves with from now on.

2.2 Inertial Frames of Reference

Figure 2.2: Cartesian coordinate system (in two dimensions), with origin O and axes X and Y.

As we have just seen, a reference frame can be defined in a multitude of ways, but quite obviously it would be preferable to use the simplest possible, which brings to mind the familiar Cartesian set of coordinate axes. Thus, suppose we set up a lattice work of rods as illustrated in Fig. 2.2 in which the rods extend indefinitely in all directions. Of course, there will be a third array of rods perpendicular to those in the figure in the Z direction.

The question then arises: can we in fact do this for all of space? From the time of Euclid, and perhaps even earlier, until the 19th century, it was taken for granted that this would be possible, with the rods remaining parallel in each direction out to infinity, even though attempts to prove this from the basic axioms of Euclidean geometry never succeeded. Eventually it was realized by the mathematicians Gauss, Riemann and Lobachevsky that this idea about parallel lines never meeting, while intuitively plausible, was not in fact necessarily true. It was perfectly possible to construct geometries wherein ‘parallel’ lines could either meet, or diverge, when sufficiently far extended without resulting in any mathematical inconsistencies.

Practically, what this means is that by taking short lengths $\Delta l$ of rod and joining them together in such a way that each length is parallel to the one before, by a process known as parallel transport, then the extended rods could in fact come closer together or become increasingly separated, see Fig. 2.3.

Figure 2.3: Coordinate array constructed by laying down infinitesimal segments of length $\Delta l$.
Each segment is laid down parallel to its neighbour by a process known as parallel transport. The intrinsic curvature of the underlying space is revealed by the array of rods diverging (as here) or possibly coming closer together.

Such behaviour is indicative of space being intrinsically curved, and there is no a priori reason why space ought to be flat as Euclid assumed, i.e. space could possess some kind of curvature. In fact, Gauss attempted to measure the curvature of space by measuring the area of a very large triangle whose vertices were taken to be three widely separated mountain peaks in the Alps. The intention was to see if the area of the triangle came out to be either bigger or smaller than that which would be expected on the basis of Euclidean geometry – either result would have been an indicator of curvature of space. To within the probably very large experimental error of the experiment, no evidence of curvature was found. Riemann, a student of Gauss, surmised that the curvature of space was somehow related to the force of gravity, but he was missing one important ingredient – it is the curvature of space and time together that gives rise to gravity, as Einstein was able to show. Gravitational forces are thus rather peculiar forces in comparison to the other forces of nature such as electromagnetic forces or the nuclear forces in that they are not due to the action of some external influence exerting its effect within pre-existing space and time, but rather are associated with the intrinsic properties of spacetime itself. Thus, we [...]

[...] particle undergoing?’ In order to determine this we have to measure its position as a function of time, and to do this we have to provide a reference frame. We could imagine all sorts of reference frames, for instance one attached to a rocket travelling in some complicated path.
Under such circumstances, the path of the particle as measured relative to such a reference frame would be very complex. However, it is at this point that an assertion can be made, namely that for certain frames of reference, the particle will be travelling in a particularly simple fashion – a straight line at constant speed. This is something that has not and possibly could not be confirmed experimentally, but it is nevertheless accepted as a true statement about the properties of the motion of particles in the absence of forces. In other words we can adopt as a law of nature the following statement:

There exist frames of reference relative to which a particle acted on by no forces moves in a straight line at constant speed.

This is essentially a claim that we are making about the properties of spacetime. It is also simply a statement of Newton’s First Law of Motion. A frame of reference which has this property is called an inertial frame of reference, or just an inertial frame.

Gravity is a peculiar force in that if a reference frame is freely falling under the effects of gravity, then any particle also freely falling will be observed to be moving in a straight line at constant speed relative to this freely falling frame. Thus freely falling frames constitute inertial frames of reference, at least locally.

[...] response to any inherent properties of space (and time) in its vicinity.

Chapter 3 Newtonian Relativity

The arguments in the previous Chapter do not tell us whether there is one or many inertial frames of reference, nor, if there is more than one, do they tell us how we are to relate the coordinates of an event as observed from the point of view of one inertial reference frame to the coordinates of the same event as observed in some other. These transformation laws are essential if we are to compare the mathematical statements of the laws of physics in different inertial reference frames.
The transformation equations that are derived below are the mathematical basis on which it can be shown that Newton’s Laws are consistent with the principle of relativity. In establishing the latter, we can show that there is in fact an infinite number of inertial reference frames.

3.1 The Galilean Transformation

To derive these transformation equations, consider an inertial frame of reference $S$ and a second reference frame $S'$ moving with a velocity $v_x$ relative to $S$.

Figure 3.1: A frame of reference $S'$ is moving with a velocity $v_x$ relative to the inertial frame $S$. An event occurs with spatial coordinates $(x, y, z)$ at time $t$ in $S$ and at $(x', y', z')$ at time $t'$ in $S'$.

Let us suppose that the clocks in $S$ and $S'$ are set such that when the origins of the two reference frames $O$ and $O'$ coincide, all the clocks in both frames of reference read zero, i.e. $t = t' = 0$. According to ‘common sense’, if the clocks in $S$ and $S'$ are synchronized at $t = t' = 0$, then they will always read the same, i.e. $t = t'$ always. This, once again, is the absolute time concept introduced in Section 2.2. Suppose now that an event of some kind, e.g. an explosion, occurs at a point $(x', y', z', t')$ according to $S'$. Then, by examining Fig. 3.1, according to $S$, it occurs at the point

$$x = x' + v_x t', \qquad y = y', \qquad z = z'$$

and at the time

$$t = t'.$$ (3.1)

These equations together are known as the Galilean Transformation, and they tell us how the coordinates of an event in one inertial frame $S$ are related to the coordinates of the same event as measured in another frame $S'$ moving with a constant velocity relative to $S$.

Now suppose that in inertial frame $S$, a particle is acted on by no forces and hence is moving along the straight line path given by

$$\mathbf{r} = \mathbf{r}_0 + \mathbf{u}t$$ (3.2)

where $\mathbf{u}$ is the velocity of the particle as measured in $S$.
Then in $S'$, a frame of reference moving with a velocity $\mathbf{v} = v_x\mathbf{i}$ relative to $S$, the particle will be following a path

$$\mathbf{r}' = \mathbf{r}_0 + (\mathbf{u} - \mathbf{v})t'$$ (3.3)

where we have simply substituted for the components of $\mathbf{r}$ using Eq. (3.1) above. This last result also obviously represents the particle moving in a straight line path at constant speed. And since the particle is being acted on by no forces, $S'$ is also an inertial frame, and since $\mathbf{v}$ is arbitrary, there is in general an infinite number of such frames.

Incidentally, if we take the derivative of Eq. (3.3) with respect to $t$, and use the fact that $t = t'$, we obtain

$$\mathbf{u}' = \mathbf{u} - \mathbf{v}$$ (3.4)

which is the familiar addition law for relative velocities.

It is a good exercise to see how the inverse transformation can be obtained from the above equations. We can do this in two ways. One way is simply to solve these equations so as to express the primed variables in terms of the unprimed variables. An alternative method, one that is more revealing of the underlying symmetry of space, is to note that if $S'$ is moving with a velocity $v_x$ with respect to $S$, then $S$ will be moving with a velocity $-v_x$ with respect to $S'$, so the inverse transformation should be obtainable by simply exchanging the primed and unprimed variables, and replacing $v_x$ by $-v_x$. Either way, the result obtained is

$$x' = x - v_x t, \qquad y' = y, \qquad z' = z, \qquad t' = t.$$ (3.5)

3.2 Newtonian Force and Momentum

Having proposed the existence of a special class of reference frames, the inertial frames of reference, and the Galilean transformation that relates the coordinates of events in such frames, we can now proceed further and study whether or not Newton’s remaining laws of motion are indeed consistent with the principle of relativity. First we need a statement of these two further laws of motion.

3.2.1 Newton’s Second Law of Motion

It is clearly the case that particles do not always move in straight lines at constant speeds relative to an inertial frame.
In other words, a particle can undergo acceleration. This deviation from uniform [...]

3.4 Maxwell’s Equations and the Ether

The Newtonian principle of relativity had a successful career until the advent of Maxwell’s work, in which he formulated a mathematical theory of electromagnetism which, amongst other things, provided a successful physical theory of light. Not unexpectedly, it was anticipated that the equations Maxwell derived should also obey the above Newtonian principle of relativity, in the sense that Maxwell’s equations should also be the same in all inertial frames of reference. Unfortunately, it was found that this was not the case. Maxwell’s equations were found to assume completely different forms in different inertial frames of reference. It was as if $\mathbf{F} = m\mathbf{a}$ worked in one frame of reference, but in another, the law had to be replaced by some bizarre equation like $\mathbf{F}' = m(a')^2\mathbf{a}'$!

In other words it appeared as if Maxwell’s equations took a particularly simple form in one special frame of reference, but a quite complicated form in another moving relative to this special reference frame. For instance, the wave equation for light assumed the simple form

$$\frac{\partial^2 E}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2 E}{\partial t^2} = 0$$ (3.16)

in this ‘special frame’ $S$, which is the equation for waves moving at the speed $c$. Under the Galilean transformation, this equation becomes

$$\frac{\partial^2 E'}{\partial x'^2} - \frac{1}{c^2}\frac{\partial^2 E'}{\partial t'^2} + \frac{2v_x}{c^2}\frac{\partial^2 E'}{\partial x'\partial t'} - \frac{v_x}{c^2}\frac{\partial}{\partial x'}\left[v_x\frac{\partial E'}{\partial x'}\right] = 0$$ (3.17)

for a frame $S'$ moving with velocity $v_x$ relative to $S$. This ‘special frame’ $S$ was assumed to be the one that defined the state of absolute rest as postulated by Newton, and stationary relative to it was a most unusual entity, the ether. The ether was a substance that was supposedly the medium in which light waves were transmitted, in a way something like the way in which air carries sound waves.
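As a sketch of where the extra terms in the transformed wave equation come from, the chain rule can be applied to the Galilean transformation $x = x' + v_x t'$, $t = t'$. This is a worked outline under the assumption, used later in the text, that the field values agree at the same event, $E'(x', t') = E(x, t)$; it is not taken from the notes themselves.

```latex
\begin{align*}
\frac{\partial}{\partial x} &= \frac{\partial}{\partial x'}, \qquad
\frac{\partial}{\partial t} = \frac{\partial}{\partial t'} - v_x\frac{\partial}{\partial x'}, \\
\frac{\partial^2}{\partial t^2} &= \frac{\partial^2}{\partial t'^2}
  - 2v_x\frac{\partial^2}{\partial x'\,\partial t'}
  + v_x^2\frac{\partial^2}{\partial x'^2}, \\
0 = \frac{\partial^2 E}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2 E}{\partial t^2}
  &= \frac{\partial^2 E'}{\partial x'^2}
  - \frac{1}{c^2}\frac{\partial^2 E'}{\partial t'^2}
  + \frac{2v_x}{c^2}\frac{\partial^2 E'}{\partial x'\,\partial t'}
  - \frac{v_x^2}{c^2}\frac{\partial^2 E'}{\partial x'^2}.
\end{align*}
```

As a consistency check, a wave of the form $E' = f\big(x' - (c - v_x)t'\big)$ satisfies the last line, since $1 - (c - v_x)^2/c^2 - 2v_x(c - v_x)/c^2 - v_x^2/c^2 = 0$.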
Consequently it was believed that the behaviour of light, in particular its velocity, as measured from a frame of reference moving relative to the ether would be different from its behaviour as measured from a frame of reference stationary with respect to the ether. Since the earth is following a roughly circular orbit around the sun, it follows that a frame of reference attached to the earth must at some stage in its orbit be moving relative to the ether, and hence a change in the velocity of light should be observable at some time during the year. From this, it should be possible to determine the velocity of the earth relative to the ether. An attempt was made to measure this velocity. This was the famous experiment of Michelson and Morley. Simply stated, they argued that if light is moving with a velocity $c$ through the ether, and the Earth was at some stage in its orbit moving with a velocity $v$ relative to the ether, then light should be observed to be travelling with a velocity $c' = c - v$ relative to the Earth. We can see this by simply solving the wave equation in $S$:

$$E(x, t) = E(x - ct)$$ (3.18)

where we are supposing that the wave is travelling in the positive X direction. If we suppose the Earth is also travelling in this direction with a speed $v_x$ relative to the ether, and we now apply the Galilean Transformation to this expression, we get, for the field $E'(x', t')$ as measured in $S'$, the result

$$E'(x', t') = E(x, t) = E(x' + v_x t' - ct') = E\big(x' - (c - v_x)t'\big)$$ (3.19)

i.e. the wave is moving with a speed $c - v_x$, which is just the Galilean Law for the addition of velocities given in Eq. (3.4).

Needless to say, on performing their experiment, which was extremely accurate, they found that the speed of light was always the same. Obviously something was seriously wrong.
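The Galilean prediction in Eq. (3.19) can be illustrated with a small numerical sketch. It follows the peak of a pulse $E(x - ct)$ in the ether frame $S$, maps two events on its path into $S'$ with Eq. (3.5), and reads off the speed. The function names and the value used for the Earth's speed are illustrative assumptions, not part of the notes.

```python
C = 299_792_458.0  # speed of light in m/s, taken as the wave speed in the ether frame S


def galilean_S_to_S_prime(x, t, v_x):
    """Eq. (3.5): coordinates in S' of an event at (x, t) in S."""
    return x - v_x * t, t


def pulse_peak_in_S(t):
    """Position in S of the peak of a wave E(x - ct) moving in the +X direction."""
    return C * t


def pulse_speed_in_S_prime(v_x, t1=1.0, t2=2.0):
    """Speed of the peak as measured in S', from two events on its path."""
    x1p, t1p = galilean_S_to_S_prime(pulse_peak_in_S(t1), t1, v_x)
    x2p, t2p = galilean_S_to_S_prime(pulse_peak_in_S(t2), t2, v_x)
    return (x2p - x1p) / (t2p - t1p)


# Illustrative value only: a speed of 30 km/s relative to the ether (an assumption).
v_earth = 3.0e4
print(pulse_speed_in_S_prime(v_earth))  # Galilean prediction c - v_x
print(C - v_earth)
```

The two printed values agree, which is the pre-relativistic expectation that the Michelson and Morley experiment failed to confirm.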
Their experiments seemed to say that the earth was not moving relative to the ether, which was manifestly wrong since the earth was moving in a circular path around the sun, so at some stage it had to be moving relative to the ether. Many attempts were made to patch things up while still retaining the same Newtonian ideas of space and time. Amongst other things, it was suggested that the earth dragged the ether in its immediate vicinity along with it. It was also proposed that objects contracted in length along the direction parallel to the direction of motion of the object relative to the ether. This suggestion, due to Fitzgerald and elaborated on by Lorentz and hence known as the Lorentz-Fitzgerald contraction, ‘explained’ the negative results of the Michelson-Morley experiment, but faltered in part because no physical mechanism could be discerned that would be responsible for the contraction. The Lorentz-Fitzgerald contraction was to resurface with a new interpretation following from the work of Einstein. Thus some momentary successes were achieved, but eventually all these attempts were found to be unsatisfactory in various ways. It was Einstein who pointed the way out of the impasse, a way out that required a massive revision of our concepts of space, and more particularly, of time.

Chapter 4 Einsteinian Relativity

The difficulties with Newtonian relativity were overcome by Einstein, who made two postulates that led to a complete restructuring of our ideas of space and time, and the dynamical properties of matter.

4.1 Einstein’s Postulates

The difficulty that had to be resolved amounted to choosing amongst three alternatives:

1. The Galilean transformation was correct and something was wrong with Maxwell’s equations.

2. The Galilean transformation applied to Newtonian mechanics only.

3.
The Galilean transformation, and the Newtonian principle of relativity based on this transformation, were wrong, and there existed a new relativity principle valid for both mechanics and electromagnetism that was not based on the Galilean transformation.

The first possibility was thrown out as Maxwell’s equations proved to be totally successful in application. The second was unacceptable as it seemed something as fundamental as the transformation between inertial frames could not be restricted to but one set of natural phenomena, i.e. it seemed preferable to believe that physics was a unified subject. The third was all that was left, so Einstein set about trying to uncover a new principle of relativity. His investigations led him to make two postulates:

1. All the laws of physics are the same in every inertial frame of reference. This postulate implies that there is no experiment, whether based on the laws of mechanics or the laws of electromagnetism, from which it is possible to determine whether or not a frame of reference is in a state of uniform motion.

2. The speed of light is independent of the motion of its source.

Einstein was inspired to make these postulates through his study of the properties of Maxwell’s equations and not by the negative results of the Michelson-Morley experiment, of which he was apparently only vaguely aware. It is the second postulate that forces us to reconsider what we understand by space and time.

[...] kinematic/symmetry considerations. Doing so is based on two further assumptions which seem to be entirely reasonable:

Homogeneity: The intrinsic properties of empty space are the same everywhere and for all time. In other words, the properties of the rulers and clocks do not depend on their positions in (empty) space, nor do they vary over time.

Spatial Isotropy: The intrinsic properties of space are the same in all directions.
In other words, the properties of the rulers and clocks do not depend on their orientations in empty space.

There is a third, much more subtle condition:

No Memory: The extrinsic properties of the rulers and clocks may be functions of their current states of motion, but not of their states of motion at any other time.

This is not referring to what might happen to a ruler or a clock as a consequence of what it might have done in the past such as, for instance, having undergone such severe acceleration that its inner workings were wrecked. To see what it refers to, we can imagine that we prepare two identical clocks and send one off on an elaborate journey through space and time while the other stays behind. When brought back together, the clocks might not read the same time, but what this postulate is saying is that they will be ticking at the same rate. Similarly for a pair of rulers: they will have the same length when brought back together. Thus we do not have to consider the past history of any of our clocks and rulers when comparing lengths or intervals of time: space and time do not leave a lingering imprint on the objects that live in space and time.

The starting point is to consider two inertial frames $S$ and $S'$ where $S'$ is moving with a velocity $v_x$ relative to $S$. Let us suppose that when the two origins coincide, the times on the clocks in each frame of reference are set to read zero, that is $t = t' = 0$. Now consider an event that occurs at the point $(x, y, z, t)$ as measured in $S$. The same event occurs at $(x', y', z', t')$ in $S'$. What we are after is a set of equations that relate these two sets of coordinates. We are going to assume a number of things about the form of these equations, all of which can be fully justified, but which we will introduce more or less on the basis that they seem intuitively reasonable.
First, because the relative motion of the two reference frames is in the X direction, it is reasonable to expect that all distances measured at right angles to the X direction will be the same in both $S$ and $S'$, i.e. [1]

$$y = y' \quad\text{and}\quad z = z'.$$ (4.2)

We now assume that $(x, t)$ and $(x', t')$ are related by the linear transformations

$$x' = Ax + Bt$$ (4.3)

$$t' = Cx + Dt.$$ (4.4)

Why linear? Assuming that space and time are homogeneous tells us that a linear relation is the only possibility [2]. What it amounts to saying is that it should not matter where in space we choose our origin of the spatial coordinates to be, nor should it matter when we choose the origin of time, i.e. the time that we choose to set as $t = 0$.

[1] If we assumed, for instance, that $z = kz'$, then it would also have to be true that $z' = kz$ if we reverse the roles of $S$ and $S'$, which tells us that $k^2 = 1$ and hence that $k = \pm 1$. We cannot have $z = -z'$ as the coordinate axes are clearly not ‘inverted’, so we must have $z = z'$.

[2] In general, $x'$ will be a function of $x$ and $t$, i.e. $x' = f(x, t)$, so that we would have $dx' = f_x\,dx + f_t\,dt$ where $f_x$ is the partial derivative of $f$ with respect to $x$, and similarly for $f_t$. Homogeneity then means that these partial derivatives are constants. In other words, a small change in $x$ and $t$ produces the same change in $x'$ no matter where in space or time the change takes place.

Now consider the origin $O'$ of $S'$. This point is at $x' = 0$ which, if substituted into Eq. (4.3), gives

$$Ax + Bt = 0$$ (4.5)

where $x$ and $t$ are the coordinates of $O'$ as measured in $S$, i.e. at time $t$ the origin $O'$ has the X coordinate $x$, where $x$ and $t$ are related by $Ax + Bt = 0$. This can be written

$$\frac{x}{t} = -\frac{B}{A}$$ (4.6)

but $x/t$ is just the velocity of the origin $O'$ as measured in $S$. This origin will be moving at the same speed as the whole reference frame, so then we have

$$-\frac{B}{A} = v_x$$ (4.7)

which gives $B = -v_x A$, which can be substituted into Eq. (4.3) to give

$$x' = A(x - v_x t).$$ (4.8)

If we now solve Eq.
(4.3) and Eq. (4.4) for $x$ and $t$ we get

$$x = \frac{Dx' + v_x A t'}{AD - BC}$$ (4.9)

$$t = \frac{At' - Cx'}{AD - BC}.$$ (4.10)

If we now consider the origin $O$ of the reference frame $S$, that is, the point $x = 0$, and apply the same argument as just used above, noting that $O$ will be moving with a velocity $-v_x$ with respect to $S'$, we get

$$-\frac{v_x A}{D} = -v_x$$ (4.11)

which then gives

$$A = D$$ (4.12)

and hence the transformations Eq. (4.9) and Eq. (4.10) from $S'$ to $S$ will be, after substituting for $D$ and $B$:

$$x = \frac{x' + v_x t'}{A + v_x C}, \qquad t = \frac{t' - (C/A)x'}{A + v_x C}$$ (4.13)

which we can compare with the original transformation from $S$ to $S'$

$$x' = A(x - v_x t), \qquad t' = A\big(t + (C/A)x\big).$$ (4.14)

At this point we will introduce a notation closer to the conventional notation, i.e. we will now write

$$A = \gamma \quad\text{and}\quad C/A = K$$ (4.15)

so that the sets of equations above become

$$x = \frac{x' + v_x t'}{\gamma(1 + v_x K)}, \qquad t = \frac{t' - Kx'}{\gamma(1 + v_x K)}$$ (4.16)

and

$$x' = \gamma(x - v_x t), \qquad t' = \gamma(t + Kx).$$ (4.17)

We now want to make use of some of the symmetry properties listed above to learn more about $\gamma$ and $K$. In doing this, it should be understood that the quantities $\gamma$ and $K$ are not constants. While it is true that they do not depend on $x$ or $t$, they still potentially depend on $v_x$. However, the assumed isotropy of space means that $\gamma$ cannot depend on the sign of $v_x$. If we write $\gamma \equiv \gamma(v_x)$ and $\gamma' \equiv \gamma(-v_x)$ (with a similar meaning for $K$ and $K'$), this means that [3]

$$\gamma = \gamma'.$$ (4.18)

A symmetry property we have already used is that if $S'$ is moving with a velocity $v_x$ relative to $S$, then $S$ must be moving with velocity $-v_x$ relative to $S'$. We now make use of this fact to reverse the transformation equations Eq. (4.17) to express $x$ and $t$ in terms of $x'$ and $t'$. We do this by making the substitutions $v_x \to -v_x$, $x \leftrightarrow x'$, and $t \leftrightarrow t'$, which leads to

$$x = \gamma(x' + v_x t'), \qquad t = \gamma(t' + K'x').$$ (4.19)

By comparison with Eq.
(4.16) we have

$$\gamma = \frac{1}{\gamma(1 + v_x K)} \quad\text{and}\quad \frac{-K}{\gamma(1 + v_x K)} = \gamma K'$$ (4.20)

which tells us that

$$\gamma^2 = \frac{1}{1 + v_x K} \quad\text{and}\quad K = -K'.$$ (4.21)

The second of these two equations tells us that we can write $K$ as

$$K = -v_x/V^2$$ (4.22)

where $V^2$ will not depend on the sign of $v_x$, though it could still depend on $v_x$. We are motivated to write $K$ in this way because by doing so the quantity $V$ will have the units of velocity, which will prove to be convenient later. There is nothing physical implied by doing this; it is merely a mathematical convenience. Thus we have

$$\gamma = \frac{1}{\sqrt{1 - (v_x/V)^2}}.$$ (4.23)

[3] To see this, suppose we have a third reference frame $S''$ which is moving with a velocity $-v_x$ relative to $S$. We then have the two transformation equations $x' = \gamma(x - v_x t)$ and $x'' = \gamma'(x + v_x t)$. Now suppose some event occurs at the origin of $S$, i.e. at $x = 0$ at a time $t$ as measured in $S$. The position of this event as measured in $S'$ will be $x' = -\gamma v_x t$ while, as measured in $S''$, it would be at $x'' = \gamma' v_x t$. By the assumed isotropy of space we ought to have $\gamma v_x t = \gamma' v_x t$, i.e. $\gamma = \gamma'$.

[...] must be related by the transformation laws Eq. (4.24) obtained above. If we substitute these expressions into Eq. (4.33) we get

$$\left[1 - (cv_x/V^2)^2\right]x^2 + \left[1 - (v_x/V)^2\right]y^2 + \left[1 - (v_x/V)^2\right]z^2 - \left[1 - (v_x/c)^2\right](ct)^2 - 2v_x\left[1 - (c/V)^2\right]xt = 0.$$ (4.34)

This equation must reduce to Eq. (4.32). Either by working through the algebra, or simply by trial and error, it is straightforward to confirm that this requires $V = c$, i.e. the general transformation Eq. (4.24) with $V = c$ guarantees that the two spheres of light are expanding at the same rate, that is at the speed $c$, in both inertial frames of reference. Now writing the quantity $\gamma$ as

$$\gamma = \frac{1}{\sqrt{1 - (v_x/c)^2}}$$ (4.35)

we are left with the final form of the transformation law consistent with light always being observed to be travelling at the speed $c$ in all reference frames:

$$x' = \gamma(x - v_x t), \qquad y' = y, \qquad z' = z, \qquad t' = \gamma\big(t - (v_x/c^2)\,x\big).$$
(4.36)

These are the equations of the Lorentz transformation. We can find the inverse transformation either by solving Eq. (4.36) for $x$, $y$, $z$, and $t$ in terms of $x'$, $y'$, $z'$, and $t'$, or else by simply recognizing, as was mentioned above in the derivation of this transformation, that if $S'$ is moving with velocity $v_x$ relative to $S$, then $S$ is moving with velocity $-v_x$ relative to $S'$. Consequently, all that is required is to exchange the primed and unprimed variables and change the sign of $v_x$ in Eq. (4.36). The result by either method is

$$x = \gamma(x' + v_x t'), \qquad y = y', \qquad z = z', \qquad t = \gamma\big(t' + (v_x/c^2)\,x'\big).$$ (4.37)

These equations were first obtained by Lorentz, who was looking for a mathematical transformation that left Maxwell’s equations unchanged in form. However he did not assign any physical significance to his results. It was Einstein who first realized the true meaning of these equations, and consequently, with this greater insight, was able to derive them without reference at all to Maxwell’s equations. The importance of his insight goes to the heart of relativity. Although the use of a flash of light played a crucial role in deriving the transformation equations, it was introduced as a means by which the value of the unknown parameter $V$ could be determined. The final result simply establishes a connection between the two sets of space-time coordinates associated with a given event, this event being the passage of a flash of light past the point $(x, y, z)$ at time $t$, as measured in $S$, or $(x', y', z')$ at time $t'$, as measured in $S'$. The transformation equations therefore represent a property that space and time must have in order to guarantee that light will always be observed to have the same speed $c$ in all inertial frames of reference.
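The transformation of Eq. (4.36) and its inverse, Eq. (4.37), can be sketched in a few lines of code. This is only an illustration: the function names are hypothetical, units with $c = 1$ are used for the example values, and rejecting $|v_x| \geq c$ is a design choice of this sketch.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(v_x, c=C):
    """Eq. (4.35). This sketch only accepts |v_x| < c."""
    if abs(v_x) >= c:
        raise ValueError("gamma is real and finite only for |v_x| < c")
    return 1.0 / math.sqrt(1.0 - (v_x / c) ** 2)


def lorentz(x, y, z, t, v_x, c=C):
    """Eq. (4.36): coordinates in S' of an event at (x, y, z, t) in S."""
    g = gamma(v_x, c)
    return g * (x - v_x * t), y, z, g * (t - v_x * x / c**2)


def lorentz_inverse(xp, yp, zp, tp, v_x, c=C):
    """Eq. (4.37): the same formulas with v_x replaced by -v_x."""
    return lorentz(xp, yp, zp, tp, -v_x, c)


# Example in units where c = 1 (illustrative numbers only).
event = (3.0, 1.0, -2.0, 5.0)
v = 0.6
primed = lorentz(*event, v, c=1.0)
back = lorentz_inverse(*primed, v, c=1.0)
print(primed)
print(back)  # recovers the original event up to rounding

# An event on the path x = ct of a light flash stays on x' = ct'.
xp, _, _, tp = lorentz(4.0, 0.0, 0.0, 4.0, v, c=1.0)
print(xp, tp)
```

Applying the inverse after the forward transformation returns the original coordinates, and an event on the light path $x = ct$ maps to one on $x' = ct'$, which is the property the derivation was built to guarantee.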
But given that these transformation equations represent an intrinsic property of space and time, it can only be expected that the behaviour of other material objects, which may have nothing whatsoever to do with light, will also be influenced by this fundamental property of space and time. This is the insight that Einstein had: that the Lorentz transformation was saying something about the properties of space and time, and the consequent behaviour that matter and forces must have in order to be consistent with these properties.

Later we will see that the speed of light acts as an upper limit to how fast any material object can travel, be it light or electrons or rocket ships. In addition, we shall see that anything that travels at this speed $c$ will always be observed to do so from all frames of reference. Light just happens to be one of the things in the universe that travels at this particular speed. Subatomic particles called neutrinos also apparently travel at the speed of light, so we could have formulated our arguments above on the basis of an expanding sphere of neutrinos! The constant $c$ therefore represents a characteristic property of space and time, and only less significantly is it the speed at which light travels.

Two immediate conclusions can be drawn from the Lorentz Transformation. Firstly, suppose that $v_x > c$, i.e. that $S'$ is moving relative to $S$ at a speed greater than the speed of light. In that case we find that $\gamma^2 < 0$, i.e. $\gamma$ is imaginary, so that both position and time in Eq. (4.36) become imaginary. However, position and time are both physical quantities which must be measured as real numbers. In other words, the Lorentz transformation becomes physically meaningless if $v_x > c$. This immediately suggests that it is a physical impossibility for a material object to attain a speed greater than $c$ relative to any reference frame $S$.
The frame of reference in which such an object would be stationary will then also be moving at the speed $v_x$, but as we have just seen, in this situation the transformation law breaks down. We shall see later how the laws of dynamics are modified in special relativity, one of the consequences of this modification being that no material object can be accelerated to a speed greater than $c$ [4].

Secondly, we can consider the form of the Lorentz Transformation in the mathematical limit $v_x \ll c$. We find that $\gamma \approx 1$, so that Eq. (4.36) becomes the equations of the Galilean Transformation, Eq. (3.1). (This also requires the $x$ dependent term in the time transformation equation to be negligible, which it will be over small enough distances.) Thus, at low enough speeds, any unusual results due to the Lorentz transformation would be unobservable.

4.4 Relativistic Kinematics

The Lorentz transformation leads to a number of important consequences for our understanding of the motion of objects in space and time without concern for how the matter was set into motion, i.e. the kinematics of matter. Later, we will look at the consequences for our understanding of the laws of motion themselves, that is, relativistic dynamics.

Perhaps the most startling aspect of the Lorentz Transformation is the appearance of a transformation for time. The result obtained earlier for the Galilean Transformation agrees with, indeed it was based on, our ‘common sense’ notion that time is absolute, i.e. that time passes in a manner completely independent of the state of motion of any observer. This is certainly not the case with the Lorentz Transformation which leads, as we shall see, to the conclusion that moving clocks run slow. This effect, called time dilation, and its companion effect, length contraction, will now be discussed.

[4] In principle there is nothing wrong with having an object that is initially travelling with a speed greater than $c$. In this case, $c$ acts as a lower speed limit.
Particles with this property, called tachyons, have been postulated to exist, but they give rise to problems involving causality (i.e. cause and effect) which make their existence doubtful.

4.4.1 Length Contraction

The first of the interesting consequences of the Lorentz Transformation is that length no longer has an absolute meaning: the length of an object depends on its motion relative to the frame of reference in which its length is being measured. Let us consider a rod moving with a velocity $v_x$ relative to a frame of reference $S$, and lying along the X axis. This rod is then stationary relative to a frame of reference $S'$ which is also moving with a velocity $v_x$ relative to $S$.

Figure 4.4: A rod of length $l_0$ at rest in reference frame $S'$, which is moving with a velocity $v_x$ with respect to another frame $S$.

As the rod is stationary in $S'$, the ends of the rod will have coordinates $x'_1$ and $x'_2$ which remain fixed as functions of the time in $S'$. The length of the rod, as measured in $S'$, is then

$$l_0 = x'_2 - x'_1$$ (4.38)

where $l_0$ is known as the proper length of the rod, i.e. $l_0$ is its length as measured in a frame of reference in which the rod is stationary. Now suppose that we want to measure the length of the rod as measured with respect to $S$. In order to do this, we measure the X coordinates of the two ends of the rod at the same time $t$, as measured by the clocks in $S$. Let $x_2$ and $x_1$ be the X coordinates of the two ends of the rod as measured in $S$ at this time $t$. It is probably useful to be aware that we could rephrase the preceding statement in terms of the imaginary synchronized clocks introduced in Section 2.2 and Section 4.2 by saying that ‘the two clocks positioned at $x_2$ and $x_1$ both read $t$ when the two ends of the rod coincided with the points $x_2$ and $x_1$.’ Turning now to the Lorentz Transformation equations, we see that we must have

$$x'_1 = \gamma(x_1 - v_x t), \qquad x'_2 = \gamma(x_2 - v_x t).$$
(4.39)

We then define the length of the rod as measured in the frame of reference $S$ to be

$$l = x_2 - x_1$$ (4.40)

where the important point to be re-emphasized is that this length is defined in terms of the positions of the ends of the rod as measured at the same time $t$ in $S$. Using Eq. (4.39) and Eq. (4.40) we find

$$l_0 = x'_2 - x'_1 = \gamma(x_2 - x_1) = \gamma l$$ (4.41)

which gives for $l$

$$l = \gamma^{-1} l_0 = \sqrt{1 - (v_x/c)^2}\; l_0.$$ (4.42)

But for $v_x < c$

$$\sqrt{1 - (v_x/c)^2} < 1$$ (4.43)

[...]

Thus, consider two events 1 and 2 which are simultaneous in $S$, i.e. $t_1 = t_2$, but which occur at two different places $x_1$ and $x_2$. Then, in $S'$, the time interval between these two events is

$$t'_2 - t'_1 = \gamma(t_2 - v_x x_2/c^2) - \gamma(t_1 - v_x x_1/c^2) = \gamma(x_1 - x_2)v_x/c^2 \neq 0 \quad\text{as}\quad x_1 \neq x_2.$$ (4.50)

Here $t'_1$ is the time registered on the clock in $S'$ which coincides with the position $x_1$ in $S$ at the instant $t_1$ that the event 1 occurs, and similarly for $t'_2$. Thus events which appear simultaneous in $S$ are not simultaneous in $S'$. In fact the order in which the two events 1 and 2 are found to occur will depend on the signs of $x_1 - x_2$ and $v_x$. It is only when the two events occur at the same point (i.e. $x_1 = x_2$) that the events will occur simultaneously in all frames of reference.

4.4.4 Transformation of Velocities (Addition of Velocities)

Suppose, relative to a frame $S$, a particle has a velocity

$$\mathbf{u} = u_x\mathbf{i} + u_y\mathbf{j} + u_z\mathbf{k}$$ (4.51)

where $u_x = dx/dt$ etc. What we require is the velocity of this particle as measured in the frame of reference $S'$ moving with a velocity $v_x$ relative to $S$. If the particle has coordinate $x$ at time $t$ in $S$, then the particle will have coordinate $x'$ at time $t'$ in $S'$ where

$$x = \gamma(x' + v_x t') \quad\text{and}\quad t = \gamma(t' + v_x x'/c^2).$$
(4.52)

If the particle is displaced to a new position $x + dx$ at time $t + dt$ in $S$, then in $S'$ it will be at the position $x' + dx'$ at time $t' + dt'$ where

$$x + dx = \gamma\big(x' + dx' + v_x(t' + dt')\big), \qquad t + dt = \gamma\big(t' + dt' + v_x(x' + dx')/c^2\big)$$

and hence

$$dx = \gamma(dx' + v_x\,dt'), \qquad dt = \gamma(dt' + v_x\,dx'/c^2)$$

so that

$$u_x = \frac{dx}{dt} = \frac{dx' + v_x\,dt'}{dt' + v_x\,dx'/c^2} = \frac{\dfrac{dx'}{dt'} + v_x}{1 + \dfrac{v_x}{c^2}\dfrac{dx'}{dt'}} = \frac{u'_x + v_x}{1 + v_x u'_x/c^2}$$ (4.53)

where $u'_x = dx'/dt'$ is the X velocity of the particle in the $S'$ frame of reference. Similarly, using $y = y'$ and $z = z'$ we find that

$$u_y = \frac{u'_y}{\gamma(1 + v_x u'_x/c^2)}$$ (4.54)

$$u_z = \frac{u'_z}{\gamma(1 + v_x u'_x/c^2)}.$$ (4.55)

The inverse transformation follows by replacing $v_x \to -v_x$ and interchanging the primed and unprimed variables. The result is

$$u'_x = \frac{u_x - v_x}{1 - v_x u_x/c^2}, \qquad u'_y = \frac{u_y}{\gamma(1 - v_x u_x/c^2)}, \qquad u'_z = \frac{u_z}{\gamma(1 - v_x u_x/c^2)}.$$ (4.56)

In particular, if $u_x = c$ and $u_y = u_z = 0$, we find that

$$u'_x = \frac{c - v_x}{1 - v_x/c} = c$$ (4.57)

i.e., if the particle has the speed $c$ in $S$, it has the same speed $c$ in $S'$. This is just a restatement of the fact that if a particle (or light) has a speed $c$ in one frame of reference, then it has the same speed $c$ in all frames of reference.

Now consider the case in which the particle is moving with a speed that is less than $c$, i.e. suppose $u_y = u_z = 0$ and $|u_x| < c$. We can rewrite Eq. (4.56) in the form

$$u'_x - c = \frac{u_x - v_x}{1 - u_x v_x/c^2} - c = -\frac{(c + v_x)(c - u_x)}{c(1 - v_x u_x/c^2)}.$$ (4.58)

Now, if $S'$ is moving relative to $S$ with a speed less than $c$, i.e. $|v_x| < c$, then along with $|u_x| < c$ it is not difficult to show that the right hand side of Eq. (4.58) is always negative, i.e.

$$u'_x - c < 0 \quad\text{if}\quad |u_x| < c,\ |v_x| < c$$ (4.59)

from which follows $u'_x < c$. Similarly, by writing

$$u'_x + c = \frac{u_x - v_x}{1 - u_x v_x/c^2} + c = \frac{(c + u_x)(c - v_x)}{c(1 - v_x u_x/c^2)}$$ (4.60)

we find that the right hand side of Eq. (4.60) is always positive provided $|u_x| < c$ and $|v_x| < c$, i.e.

$$u'_x + c > 0 \quad\text{if}\quad |u_x| < c,\ |v_x| < c$$ (4.61)

from which follows $u'_x > -c$. Putting together Eq.
(4.59) and Eq. (4.61) we find that

$$|u'_x| < c \quad\text{if}\quad |u_x| < c \ \text{and}\ |v_x| < c.$$ (4.62)

What this result is telling us is that if a particle has a speed less than $c$ in one frame of reference, then its speed is always less than $c$ in any other frame of reference, provided this other frame of reference is moving at a speed less than $c$. As an example, consider two objects A and B approaching each other, A at a velocity $u_x = 0.99c$ relative to a frame of reference $S$, and B stationary in a frame of reference $S'$ which is moving with a velocity $v_x = -0.99c$ relative to $S$.

Figure 4.7: Object B stationary in reference frame $S'$, which is moving with a velocity $v_x = -0.99c$ relative to reference frame $S$. Object A is moving with velocity $u_x = 0.99c$ with respect to reference frame $S$.

According to classical Newtonian kinematics, B will measure A as approaching at a speed of $1.98c$. However, according to the Einsteinian law of velocity addition, the velocity of A relative to B, i.e. the velocity of A as measured in frame $S'$, is, from Eq. (4.56),

$$u'_x = \frac{0.99c - (-0.99c)}{1 + (0.99)^2} = 0.99995c$$

which is, of course, less than $c$, in agreement with Eq. (4.62).

In the above, we have made use of the requirement that all speeds be less than or equal to $c$. To understand physically why this is the case, it is necessary to turn to consideration of relativistic dynamics.

4.5 Relativistic Dynamics

Till now we have only been concerned with kinematics, i.e. what we can say about the motion of the particle without consideration of its cause. Now we need to look at the laws that determine the motion, i.e. the relativistic form of Newton’s Laws of Motion. Firstly, Newton’s First Law is accepted in the same form as presented in Section 2.2.1. However two arguments can be presented which indicate that Newton’s Second Law may need revision.
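The velocity transformation of Eq. (4.56) and the worked example with A and B can be checked with a short sketch. The function name is hypothetical and speeds are expressed as fractions of $c$ (so $c = 1$ by default); this is an illustration of the formula rather than anything taken from the notes.

```python
import math


def transform_velocity(u, v_x, c=1.0):
    """Eq. (4.56): velocity (ux, uy, uz) in S mapped to (ux', uy', uz') in S'."""
    ux, uy, uz = u
    g = 1.0 / math.sqrt(1.0 - (v_x / c) ** 2)
    denom = 1.0 - v_x * ux / c**2
    return (ux - v_x) / denom, uy / (g * denom), uz / (g * denom)


# Object A at 0.99c in S, frame S' (carrying B) moving at -0.99c relative to S.
upx, _, _ = transform_velocity((0.99, 0.0, 0.0), -0.99)
print(round(upx, 5))  # 0.99995, compared with the Newtonian 1.98

# Light moving along X in S still has speed c in S' (Eq. 4.57).
print(transform_velocity((1.0, 0.0, 0.0), 0.5)[0])

# Light moving along Y in S: the components change in S', the speed does not.
lx, ly, lz = transform_velocity((0.0, 1.0, 0.0), 0.6)
print(math.hypot(lx, ly, lz))
```

The last case is not discussed in the text above, but it follows from the same equations: the direction of the light ray changes between frames while its speed stays equal to $c$.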
One argument only suggests that something may be wrong, while the second is of a much more fundamental nature.

Firstly, according to Newton's Second Law, if we apply a constant force to an object, it will accelerate without bound, i.e. up to and then beyond the speed of light. Unfortunately, if we are going to accept the validity of the Lorentz Transformation, then we find that for speeds greater than $c$ the factor $\gamma$ becomes imaginary. Thus real position and time transform into imaginary quantities in the frame of reference of an object moving faster than the speed of light. This suggests that a problem exists, though it does turn out to be possible to build up a mathematical theory of particles moving at speeds greater than $c$ (tachyons).

The second difficulty with Newton's Laws arises from the result, derived from the Second and Third laws, that in an isolated system the total momentum of all the particles involved is constant, where momentum is defined, for a particle moving with velocity $\mathbf{u}$ and having mass $m$, by
\[ \mathbf{p} = m\mathbf{u} \tag{4.63} \]
The question then is whether or not this law of conservation of momentum satisfies Einstein's first postulate, i.e. with momentum defined in this way, is momentum conserved in all inertial frames of reference? To answer this, we could study the collision of two bodies [...]

[...] where $T$ is the relativistic kinetic energy of the particle. We can write this last equation as
\[ \frac{dT}{dt} = \mathbf{F}\cdot\mathbf{u} = \mathbf{u}\cdot\frac{d\mathbf{p}}{dt} = \mathbf{u}\cdot\frac{d}{dt}\left[\frac{m_0\mathbf{u}}{\sqrt{1 - u^2/c^2}}\right] = \frac{m_0\,\mathbf{u}\cdot\dfrac{d\mathbf{u}}{dt}}{\sqrt{1 - u^2/c^2}} + \frac{m_0\,\mathbf{u}\cdot\mathbf{u}\,u\,\dfrac{du}{dt}}{c^2\sqrt{(1 - u^2/c^2)^3}} \]
But
\[ \mathbf{u}\cdot\frac{d\mathbf{u}}{dt} = u\frac{du}{dt} \tag{4.73} \]
and hence
\[ \frac{dT}{dt} = \left[\frac{m_0}{\sqrt{1 - u^2/c^2}} + \frac{m_0 u^2/c^2}{\sqrt{(1 - u^2/c^2)^3}}\right] u\frac{du}{dt} = \frac{m_0}{\sqrt{(1 - u^2/c^2)^3}}\,u\frac{du}{dt} \]
so that we end up with
\[ \frac{dT}{dt} = \frac{d}{dt}\left[\frac{m_0c^2}{\sqrt{1 - u^2/c^2}}\right]. \tag{4.74} \]
Integrating with respect to $t$ gives
\[ T = \frac{m_0c^2}{\sqrt{1 - u^2/c^2}} + \text{constant}. \tag{4.75} \]
By requiring that $T = 0$ for $u = 0$, we find that
\[ T = \frac{m_0c^2}{\sqrt{1 - u^2/c^2}} - m_0c^2. \]
(4.76)

Interestingly enough, if we suppose that $u \ll c$, we find that, by the binomial approximation (see footnote 6),
\[ \frac{1}{\sqrt{1 - u^2/c^2}} = (1 - u^2/c^2)^{-1/2} \approx 1 + \frac{u^2}{2c^2} \tag{4.77} \]
so that
\[ T \approx m_0c^2\left(1 + \frac{u^2}{2c^2}\right) - m_0c^2 = \tfrac{1}{2}m_0u^2 \tag{4.78} \]
which, as should be the case, is the classical Newtonian expression for the kinetic energy of a particle of mass $m_0$ moving with a velocity $u$.

Footnote 6: The binomial approximation is $(1 + x)^n \approx 1 + nx$ if $|x| \ll 1$.

4.5.3 Total Relativistic Energy

We can now define a quantity $E$ by
\[ E = T + m_0c^2 = \frac{m_0c^2}{\sqrt{1 - u^2/c^2}}. \tag{4.79} \]
This quantity $E$ is known as the total relativistic energy of the particle of rest mass $m_0$. It is all well and good to define such a thing, but, apart from the neatness of the expression, is there any real need to introduce such a quantity? In order to see the value of defining the total relativistic energy, we need to consider the transformation of momentum between different inertial frames $S$ and $S'$. To this end consider
\[ p_x = \frac{m_0u_x}{\sqrt{1 - u^2/c^2}} \tag{4.80} \]
where
\[ u = \sqrt{u_x^2 + u_y^2 + u_z^2} \tag{4.81} \]
and where $u$ is the speed of the particle relative to the frame of reference $S$. In terms of the velocity $\mathbf{u}'$ of this particle relative to the frame of reference $S'$ we can write
\[ u_x = \frac{u'_x + v_x}{1 + u'_xv_x/c^2}, \qquad u_y = \frac{u'_y}{\gamma(1 + u'_xv_x/c^2)}, \qquad u_z = \frac{u'_z}{\gamma(1 + u'_xv_x/c^2)} \tag{4.82} \]
with
\[ \gamma = \frac{1}{\sqrt{1 - v_x^2/c^2}} \tag{4.83} \]
as before. After a lot of exceedingly tedious algebra, it is possible to show that
\[ \sqrt{1 - u^2/c^2} = \frac{\sqrt{1 - u'^2/c^2}\,\sqrt{1 - v_x^2/c^2}}{1 + u'_xv_x/c^2} \tag{4.84} \]
so that, using Eq. (4.82), Eq. (4.83) and Eq. (4.84), we find
\[ p_x = \frac{m_0(u'_x + v_x)}{\sqrt{(1 - u'^2/c^2)(1 - v_x^2/c^2)}} = \gamma\left[\frac{m_0u'_x}{\sqrt{1 - u'^2/c^2}} + v_x\left(\frac{m_0}{\sqrt{1 - u'^2/c^2}}\right)\right] \]
which we can readily write as
\[ p_x = \gamma\left[p'_x + v_x(E'/c^2)\right] \tag{4.85} \]
i.e. we see appearing the total energy $E'$ of the particle as measured in $S'$.
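Since the algebra behind Eq. (4.84) is skipped, a numerical spot check of Eq. (4.85) may be reassuring. The sketch below computes $p_x$ in two ways: directly from the velocity in $S$ obtained through Eq. (4.82), and from $p'_x$ and $E'$ through Eq. (4.85). The helper names, the sample velocity and the units with c = 1 are illustrative assumptions, not part of the notes.

```python
import math

def lorentz_factor(speed, c=1.0):
    """1 / sqrt(1 - speed^2 / c^2)."""
    return 1.0 / math.sqrt(1.0 - (speed / c) ** 2)

def momentum_x_two_ways(u_prime, vx, m0=1.0, c=1.0):
    """Return (px computed directly in S, px obtained from Eq. (4.85))."""
    upx, upy, upz = u_prime
    g = lorentz_factor(vx, c)
    denom = 1.0 + upx * vx / c**2

    # Velocity components in S from those in S', Eq. (4.82).
    ux = (upx + vx) / denom
    uy = upy / (g * denom)
    uz = upz / (g * denom)
    u = math.sqrt(ux**2 + uy**2 + uz**2)
    px_direct = m0 * ux * lorentz_factor(u, c)            # Eq. (4.80)

    # Momentum and total energy in S', then the transformation Eq. (4.85).
    u_p = math.sqrt(upx**2 + upy**2 + upz**2)
    p_prime_x = m0 * upx * lorentz_factor(u_p, c)
    e_prime = m0 * c**2 * lorentz_factor(u_p, c)
    px_transformed = g * (p_prime_x + vx * e_prime / c**2)
    return px_direct, px_transformed

direct, transformed = momentum_x_two_ways((0.3, 0.4, 0.1), vx=0.6)
print(direct, transformed)  # the two values should agree to rounding error
```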
A similar calculation for $p_y$ and $p_z$ yields
\[ p_y = p'_y \quad \text{and} \quad p_z = p'_z \tag{4.86} \]
while for the energy $E$ we find
\[ E = \frac{m_0c^2}{\sqrt{1 - u^2/c^2}} = \frac{m_0c^2}{\sqrt{1 - u'^2/c^2}}\cdot\frac{1 + u'_xv_x/c^2}{\sqrt{1 - v_x^2/c^2}} = \gamma\left[\frac{m_0c^2}{\sqrt{1 - u'^2/c^2}} + \frac{m_0u'_xv_x}{\sqrt{1 - u'^2/c^2}}\right] \]
which we can write as
\[ E = \gamma\left[E' + p'_xv_x\right]. \tag{4.87} \]
Now consider the collision between two particles 1 and 2. Let the X components of momentum of the two particles be $p_{1x}$ and $p_{2x}$ relative to $S$. Then the total momentum in $S$ is
\[ P_x = p_{1x} + p_{2x} \tag{4.88} \]
where $P_x$ is, by conservation of relativistic momentum, a constant, i.e. $P_x$ stays the same before and after any collision between the particles. However,
\[ p_{1x} + p_{2x} = \gamma\left(p'_{1x} + p'_{2x}\right) + \gamma\left(E'_1 + E'_2\right)v_x/c^2 \tag{4.89} \]
where $p'_{1x}$ and $p'_{2x}$ are the X components of momentum of particles 1 and 2 respectively, while $E'_1$ and $E'_2$ are the energies of particles 1 and 2 respectively, all relative to frame of reference $S'$. Thus we can write
\[ P_x = \gamma P'_x + \gamma\left(E'_1 + E'_2\right)v_x/c^2. \tag{4.90} \]
Once again, as momentum is conserved in all inertial frames of reference, we know that $P'_x$ is also a constant, i.e. the same before and after any collision. Thus we can conclude from Eq. (4.90) that
\[ E'_1 + E'_2 = \text{constant} \tag{4.91} \]
i.e. the total relativistic energy in $S'$ is conserved. But since $S'$ is an arbitrary frame of reference, we conclude that the total relativistic energy is conserved in all frames of reference (though of course the conserved value would in general be different in different frames of reference). Since, as we shall see later, matter can be created or destroyed, we generalize this to read:

The total relativistic energy of a system of particles is always conserved in all frames of reference, whether or not the total number of particles remains a constant.

Thus we see that conservation of relativistic momentum implies conservation of total relativistic energy in special relativity, whereas in Newtonian dynamics they are independent conditions.
Nevertheless, both conditions have to be met when determining the outcome of any collision between particles, i.e. just as in Newtonian dynamics, the equations representing the conservation of energy and momentum have to be employed.

A useful relationship between energy and momentum can also be established. Its value lies both in treating collision problems and in suggesting the existence of particles with zero rest mass. The starting point is the expression for energy
\[ E = \frac{m_0c^2}{\sqrt{1 - u^2/c^2}} \tag{4.92} \]
from which we find
\[ E^2 = \frac{m_0^2c^4}{1 - u^2/c^2} = \frac{m_0^2c^4\left[1 - u^2/c^2 + u^2/c^2\right]}{1 - u^2/c^2} \]
so that
\[ E^2 = m_0^2c^4 + \frac{m_0^2u^2}{1 - u^2/c^2}\cdot c^2. \tag{4.93} \]

4.5.5 Zero Rest Mass Particles

For a single particle of rest mass $m_0$, its momentum $p$ and energy $E$ are related by the expression
\[ E^2 = p^2c^2 + m_0^2c^4. \]
This result allows us to formally take the limit of $m_0 \to 0$ while keeping $E$ and $p$ fixed. The result is a relationship between energy and momentum for a particle of zero rest mass. In this limit, with $E, p \neq 0$, we have
\[ E = pc = |\mathbf{p}|c \tag{4.101} \]
i.e. $p$ is the magnitude of the momentum vector $\mathbf{p}$. If we rearrange Eq. (4.79) to read
\[ E\sqrt{1 - u^2/c^2} = m_0c^2 \]
and if we then let $m_0 \to 0$ with $E \neq 0$, we must have
\[ \sqrt{1 - u^2/c^2} \to 0 \]
so that, in the limit of $m_0 \to 0$, we find that
\[ u = c. \tag{4.102} \]
Thus, if there exist particles of zero rest mass, we see that their energy and momentum are related by Eq. (4.101) and that they always travel at the speed of light. Particles with zero rest mass need not exist, since all that we have presented above is a mathematical argument. However, it turns out that they do indeed exist: the photon (a particle of light) and the neutrino, though recent research in solar physics seems to suggest that the neutrino may in fact have a non-zero, but almost immeasurably tiny, mass.
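The energy-momentum relation and its zero rest mass limit can be illustrated numerically. In the sketch below, the function names and the units with c = 1 are assumptions made for illustration. The speed is recovered from $u = pc^2/E$, which follows from dividing $p = m_0u/\sqrt{1 - u^2/c^2}$ by $E = m_0c^2/\sqrt{1 - u^2/c^2}$.

```python
import math

def energy_and_momentum(m0, u, c=1.0):
    """Total energy, Eq. (4.92), and momentum magnitude for speed u < c."""
    factor = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return m0 * c**2 * factor, m0 * u * factor

def energy_from_momentum(m0, p, c=1.0):
    """Energy from E^2 = p^2 c^2 + m0^2 c^4; also valid for m0 = 0."""
    return math.sqrt((p * c) ** 2 + (m0 * c**2) ** 2)

def speed_from_energy_momentum(energy, p, c=1.0):
    """Speed u = p c^2 / E, obtained by dividing p by E."""
    return p * c**2 / energy

# Massive particle: both routes to the energy should agree.
E, p = energy_and_momentum(m0=2.0, u=0.6)
print(E, energy_from_momentum(2.0, p))

# Zero rest mass: E = pc, Eq. (4.101), and the speed equals c, Eq. (4.102).
E0 = energy_from_momentum(0.0, 3.0)
print(E0, speed_from_energy_momentum(E0, 3.0))
```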
Quantum mechanics presents us with a relationship between the frequency $f$ of a beam of light and the energy of each photon making up the beam:
\[ E = hf = \hbar\omega \tag{4.103} \]

Chapter 5 Geometry of Flat Spacetime

The theory of relativity is a theory of space and time and as such is a geometrical theory, though the geometry of space and time together is quite different from the Euclidean geometry of ordinary 3-dimensional space. Nevertheless it is found that if relativity is recast in the language of vectors and "distances" (or preferably "intervals") a much more coherent picture of the content of the theory emerges. Indeed, relativity is seen to be a theory of the geometry of the single entity, 'spacetime', rather than a theory of space and time. Furthermore, without the geometrical point of view it would be next to impossible to extend special relativity to include transformations between arbitrary (non-inertial) frames of reference, which ultimately leads to the general theory of relativity, the theory of gravitation.

In order to set the stage for a discussion of the geometrical properties of space and time, a brief look at some of the more familiar ideas of geometry, vectors etc. in ordinary three dimensional space is probably useful.

5.1 Geometrical Properties of 3 Dimensional Space

For the present we will not be addressing any specifically relativistic problem, but rather we will concern ourselves with the issue of fixing the position in space of some arbitrary point. To do this we could, if we wanted to, imagine a suitable set of rulers so that the position of a point P can be specified by the three coordinates $(x, y, z)$ with respect to this coordinate system, which we will call $R$.

Figure 5.1: A displacement vector $\Delta\mathbf{r}$ in space with an arbitrary coordinate system $R$.
If we then consider two such points, $P_1$ with coordinates $(x_1, y_1, z_1)$ and $P_2$ with coordinates $(x_2, y_2, z_2)$, then the line joining these two points defines a vector $\Delta\mathbf{r}$ which we can write in component form with respect to $R$ as
\[ \Delta\mathbf{r} \doteq \begin{pmatrix} x_2 - x_1 \\ y_2 - y_1 \\ z_2 - z_1 \end{pmatrix}_R \tag{5.1} \]
where the subscript $R$ is to remind us that the components are specified relative to the set of coordinates $R$. Why do we need to be so careful? Obviously, it is because we could have, for instance, used a different set of axes $R'$ which have been translated and rotated relative to the first:

Figure 5.2: Displacement vector and two coordinate systems rotated with respect to each other about the Z axis through angle $\theta$. The vector has an existence independent of the choice of coordinate systems.

In this case the vector $\Delta\mathbf{r}$ will have new components, but the vector itself will still be the same vector, i.e.
\[ \Delta\mathbf{r} \doteq \begin{pmatrix} x_2 - x_1 \\ y_2 - y_1 \\ z_2 - z_1 \end{pmatrix}_R \doteq \begin{pmatrix} x'_2 - x'_1 \\ y'_2 - y'_1 \\ z'_2 - z'_1 \end{pmatrix}_{R'} \tag{5.2} \]
or
\[ \Delta\mathbf{r} \doteq \begin{pmatrix} \Delta x \\ \Delta y \\ \Delta z \end{pmatrix}_R \doteq \begin{pmatrix} \Delta x' \\ \Delta y' \\ \Delta z' \end{pmatrix}_{R'} \tag{5.3} \]
So the components themselves are meaningless unless we know with respect to what coordinate system they were determined. In fact, the lack of an absolute meaning of the components unless the set of axes used is specified means that the vector $\Delta\mathbf{r}$ is not so much 'equal' to the column vector as 'represented by' the column vectors, hence the use of the dotted equal sign '$\doteq$' to indicate 'represented by'.

The description of the vector in terms of its components relative to some coordinate system is something done for the sake of convenience. Nevertheless, although the components may change as we change coordinate systems, what does not change is the vector itself, i.e. it has an existence independent of the choice of coordinate system.
In particular, the length of $\Delta\mathbf{r}$ and the angles between any two vectors $\Delta\mathbf{r}_1$ and $\Delta\mathbf{r}_2$ will be the same in any coordinate system.

While these last two statements may be obvious, it is important for what comes later to see that they also follow by explicitly calculating the length and angle between two vectors using their components in two different coordinate systems. In order to do this we must determine how the components of $\Delta\mathbf{r}$ are related in the two different coordinate systems. We can note that the displacement of the two coordinate systems with respect to each other is immaterial, as we are considering differences between vectors. Thus we only need to worry about the rotation, which we have, for simplicity, taken to be through an angle $\theta$ about the Z axis (see the above diagram). The transformation between the sets of coordinates can then be shown to be given, in matrix form, by
\[ \begin{pmatrix} \Delta x' \\ \Delta y' \\ \Delta z' \end{pmatrix}_{R'} = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \\ \Delta z \end{pmatrix}_R \tag{5.4} \]

[...] four-momentum of a particle. This four-momentum is, of course, by virtue of this transformation property, also a four-vector. We can note that the ('length')$^2$ of this four-vector is given by
\[ (E/c)^2 - \left[p_x^2 + p_y^2 + p_z^2\right] = (E/c)^2 - p^2 = (E^2 - p^2c^2)/c^2 = m_0^2c^2 \tag{5.17} \]
where $m_0$ is the rest mass of the particle. This quantity is the same (i.e. invariant) in different frames of reference.

A further four-vector is the velocity four-vector
\[ \vec{v} \doteq \begin{pmatrix} c\,dt/d\tau \\ dx/d\tau \\ dy/d\tau \\ dz/d\tau \end{pmatrix} \tag{5.18} \]
where
\[ d\tau = ds/c \tag{5.19} \]
is known as the proper time interval. This is the time interval measured by a clock in its own rest frame as it makes its way between the two events an interval $ds$ apart.

To see how the velocity four-vector relates to our usual understanding of velocity, consider a particle in motion relative to the inertial reference frame $S$.
We can identify two events, $E_1$ wherein the particle is at position $(x, y, z)$ at time $t$, and a second event $E_2$ wherein the particle is at $(x + dx, y + dy, z + dz)$ at time $t + dt$. The displacement in space and time between these events will then be represented by the four-vector $d\vec{s}$ defined in Eq. (5.14). Furthermore, during this time interval $dt$ as measured in $S$, the particle undergoes a displacement $d\mathbf{r} = dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k}$ and so has a velocity
\[ \mathbf{u} = \frac{dx}{dt}\mathbf{i} + \frac{dy}{dt}\mathbf{j} + \frac{dz}{dt}\mathbf{k} = u_x\mathbf{i} + u_y\mathbf{j} + u_z\mathbf{k}. \tag{5.20} \]
The time interval between the events $E_1$ and $E_2$ as measured by a clock moving with the particle will be just the proper time interval $d\tau$ in the rest frame of the particle. We therefore have, by the time dilation formula,
\[ dt = \frac{d\tau}{\sqrt{1 - (u/c)^2}} \tag{5.21} \]
where $u$ is the speed of the particle. So, if we form the four-velocity to be associated with the two events $E_1$ and $E_2$, we write
\[ \vec{u} \doteq \begin{pmatrix} c\,dt/d\tau \\ dx/d\tau \\ dy/d\tau \\ dz/d\tau \end{pmatrix} = \frac{1}{\sqrt{1 - (u/c)^2}} \begin{pmatrix} c \\ dx/dt \\ dy/dt \\ dz/dt \end{pmatrix} = \frac{1}{\sqrt{1 - (u/c)^2}} \begin{pmatrix} c \\ u_x \\ u_y \\ u_z \end{pmatrix} \tag{5.22} \]
Thus, if $u \ll c$, the three spatial components of the four-velocity reduce to the usual components of ordinary three-velocity. Note also that the invariant ('length')$^2$ of the velocity four-vector is just $c^2$.

Finally, if we take the expression for the four-velocity and multiply by the rest mass of the particle, we get
\[ m_0\vec{u} \doteq \frac{m_0}{\sqrt{1 - (u/c)^2}} \begin{pmatrix} c \\ u_x \\ u_y \\ u_z \end{pmatrix} = \begin{pmatrix} E/c \\ p_x \\ p_y \\ p_z \end{pmatrix} \tag{5.23} \]
which can be recognized as the four-momentum defined above.

We can continue in this way, defining four-acceleration
\[ \vec{a} = \frac{d\vec{u}}{d\tau} \tag{5.24} \]
and the four-force, also known as the Minkowski force $\vec{F}$:
\[ \vec{F} = \frac{d\vec{p}}{d\tau}. \tag{5.25} \]
A direct generalization of the Newtonian definition would have been $\vec{F} = m_0\vec{a}$, but this definition is not applicable to zero rest mass particles, hence the more general alternative in Eq. (5.25).
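The statement that the ('length')$^2$ of the four-velocity is $c^2$, and that of the four-momentum $m_0\vec{u}$ is $m_0^2c^2$, is easy to confirm numerically. The following is a minimal sketch; the function names, the sample velocity and the units with c = 1 are assumptions made for illustration.

```python
import math

def four_velocity(ux, uy, uz, c=1.0):
    """Components of the four-velocity of Eq. (5.22) for three-velocity (ux, uy, uz)."""
    speed_sq = ux**2 + uy**2 + uz**2
    factor = 1.0 / math.sqrt(1.0 - speed_sq / c**2)
    return (factor * c, factor * ux, factor * uy, factor * uz)

def invariant_length_squared(a):
    """(Time component)^2 minus the sum of the squared spatial components."""
    return a[0] ** 2 - (a[1] ** 2 + a[2] ** 2 + a[3] ** 2)

m0 = 2.0                                   # illustrative rest mass
u4 = four_velocity(0.3, 0.4, 0.5)          # speeds in units of c
p4 = tuple(m0 * component for component in u4)   # four-momentum, Eq. (5.23)

print(invariant_length_squared(u4))        # should be close to c^2 = 1
print(invariant_length_squared(p4))        # should be close to m0^2 c^2 = 4
```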
5.3 Minkowski Space

Till now we have represented a frame of reference $S$ by a collection of clocks and rulers. An alternative way of doing the same thing is to add a fourth axis, the time axis, 'at right angles' to the X, Y, Z axes. On this time axis we can plot the time $t$ that the clock reads at the location of an event. Obviously we cannot draw in such a fourth axis, but we can suppress the Y, Z coordinates for simplicity and draw as in Fig. (5.3):

Figure 5.3: An event $E(x, t)$ represented as a point in spacetime (axes $x$ and $ct$).

This representation is known as a spacetime or Minkowski diagram and on it we can plot the positions in space and time of the various events that occur in spacetime. In particular we can plot the motion of a particle through space and time. The curve traced out is known as the world line of the particle. We can note that the slope of such a world line must be greater than the slope of the world line of a photon, since all material particles move with speeds less than the speed of light. Some typical world lines are illustrated in Fig. (5.4) below.

Figure 5.4: Diagram illustrating different kinds of world lines (world line of a photon; world line of a particle stationary in $S$; world line of a particle moving at speed $< c$).

The above diagram gives the coordinates of events as measured in a frame of reference $S$, say. We can also use these spacetime diagrams to illustrate Lorentz transformations from one frame of reference to another. Unfortunately, due to the peculiar nature of the interval between two events in spacetime, the new set of axes for some other frame of reference $S'$ is not a simple rotation of the old axes. The equations for the $S'$ axes are determined in a straightforward fashion from the Lorentz transformation equations. The $x'$ axis is just the line for which $t' = 0$, which gives
\[ ct = \frac{v_x}{c}x \tag{5.26} \]
and for the $t'$ axis, for which $x' = 0$,
\[ ct = \frac{c}{v_x}x. \]
(5.27)

It therefore turns out that these new axes are oblique, as illustrated in Fig. 5.5, and with increasing speeds of $S'$ relative to $S$, these axes close in on the world line of the photon passing through the common origin.

Figure 5.5: Space and time axes for two different reference frames. The rectilinear axes are for the reference frame $S$, the oblique axes those for a reference frame $S'$ moving with respect to $S$. The lengths OP and OQ are the same on the figure, but if OP represents a distance of 1 m in $S$, then OQ′ represents the same distance in $S'$. (The figure also shows the hyperbolae $c^2t^2 - x^2 = -1$ and $c^2t^2 - x^2 = 1$.)

It should be noted that in deriving these equations, the $\gamma$ factor cancels out. But, as this factor plays an integral role in the Lorentz transformation, appearing in both the length contraction and time dilation formulae, it is clear that it is not sufficient to simply determine the new axes in $S'$ if the spacetime diagram is to be used to compare lengths or times in the two reference frames. What also needs to be done is to rescale the units of time and distance along each $S'$ axis. To put it another way, if the two events O and P on the $x$ axis are one metre apart, then two points O and Q on the $x'$ axis which are the same distance apart on the diagram (they are about 1.5 cm in Fig. 5.5) will not represent a distance of one metre in $S'$. To see what separation is required on the $x'$ axis, we proceed as follows.

The spacetime interval between O and P is given by $\Delta s^2 = -1\ \mathrm{m}^2$. If we now plot all points that have the same spacetime separation from O on this spacetime diagram, we see that these points will lie on the curve
\[ (ct)^2 - x^2 = -1 \tag{5.28} \]
which is the equation of a hyperbola. It will cut the $x'$ axis (where $t' = 0$) at the point Q′. But since the interval is the same in all reference frames, we must also have
\[ (ct')^2 - x'^2 = -1 \tag{5.29} \]
so at $t' = 0$ we have $x' = 1$. Thus, it is the distance between O and Q′ that represents a distance of 1 m in $S'$.
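To make the rescaling concrete, the position of the unit mark Q′ can be computed for a particular speed. The sketch below takes the event with $ct' = 0$, $x' = 1$ and transforms it to $S$ using $x = \gamma(x' + v_xt')$ and $ct = \gamma(ct' + v_xx'/c)$. The function name and the value $v_x/c = 0.6$ are illustrative assumptions, not values taken from the figure.

```python
import math

def unit_mark_on_x_prime_axis(beta):
    """Coordinates (ct, x) in S of the event with ct' = 0, x' = 1 in S'.

    beta = vx / c. Uses the transformation from S' to S:
    x = gamma (x' + beta ct'),  ct = gamma (ct' + beta x').
    """
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * beta, gamma

beta = 0.6  # illustrative value of vx / c
ct, x = unit_mark_on_x_prime_axis(beta)

# Q' lies on the x' axis, Eq. (5.26), and on the hyperbola of Eq. (5.28).
print("on the x' axis:", math.isclose(ct, beta * x))
print("on the hyperbola:", math.isclose(ct**2 - x**2, -1.0))

# Length of OQ' as drawn on the diagram, in units where OP has length 1.
print("drawn length of OQ':", math.hypot(ct, x))
```

The drawn length exceeds 1 for any non-zero speed, which is why a point Q at the same drawn distance as P falls short of the unit mark Q′.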
A similar argument can be used to determine the scaling along the time axis in the $S'$ frame, i.e. the point for which $ct' = 1$, where the hyperbola $(ct)^2 - x^2 = 1$ cuts the $ct'$ axis, gives the unit of time on the $ct'$ axis.

[...]

2. Two events separated by a time-like or light-like interval can affect one another.

Thus, returning to our spacetime diagram, we have:

Figure 5.8: Future and past light cones of the event O. (Regions labelled in the figure: future light cone, events that can be affected by O; past light cone, events that can affect O; outside the cones, events that are not affected by, and cannot affect, the event O.)

All the events that can be influenced by O constitute the future of event O, while all events that can influence O constitute the past of event O.

5.5 Four-Vector Notation

It is at this point that a diversion into further mathematical development of the subject is necessary. For the present, we will be more concerned with the way that the physics is described mathematically, rather than the content of the physics itself. This is necessary to put in place the notation and mathematical machinery that is used in general relativity (and in further developments in special relativity, for that matter).

The first step in this direction is to introduce a more uniform way of naming the components of the four-vector quantities introduced above, one which better emphasizes their vector nature, that is:
\[ x^0 = ct, \quad x^1 = x, \quad x^2 = y, \quad x^3 = z \tag{5.33} \]
where the superscript numbers are NOT powers of $x$. In the same way, the components of the momentum four-vector will be
\[ p^0 = E/c, \quad p^1 = p_x, \quad p^2 = p_y, \quad p^3 = p_z \tag{5.34} \]
and similarly for other four-vectors. In terms of these names for the components we can write the Lorentz transformation equations as
\[ \left(\Delta x^{\mu}\right)' = \sum_{\nu=0}^{3} \Lambda^{\mu}{}_{\nu}\,\Delta x^{\nu} \tag{5.35} \]
where, if $S'$ is moving with velocity $v_x$ relative to $S$, then the $\Lambda^{\mu}{}_{\nu}$ will be the components of the $4 \times 4$ matrix appearing in Eq. (5.10) and Eq.
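(5.16), that is (see footnote 1)
\[ \Lambda^{\mu}{}_{\nu} = \begin{pmatrix} \Lambda^{0}{}_{0} & \Lambda^{0}{}_{1} & \Lambda^{0}{}_{2} & \Lambda^{0}{}_{3} \\ \Lambda^{1}{}_{0} & \Lambda^{1}{}_{1} & \Lambda^{1}{}_{2} & \Lambda^{1}{}_{3} \\ \Lambda^{2}{}_{0} & \Lambda^{2}{}_{1} & \Lambda^{2}{}_{2} & \Lambda^{2}{}_{3} \\ \Lambda^{3}{}_{0} & \Lambda^{3}{}_{1} & \Lambda^{3}{}_{2} & \Lambda^{3}{}_{3} \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma v_x/c & 0 & 0 \\ -\gamma v_x/c & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{5.36} \]
It is at this point that we make the first of two notational changes. First we note that we have usually represented quantities as observed in $S'$ by attaching a prime to the symbol, e.g. $x'$, $t'$ and so on. Now, we will attach the prime to the index, so that we will henceforth write:
\[ \Delta x^{\mu'} = \sum_{\nu=0}^{3} \Lambda^{\mu'}{}_{\nu}\,\Delta x^{\nu} \tag{5.37} \]
where now $\mu' = 0', 1', 2'$ or $3'$, so that the transformation matrix is now
\[ \Lambda^{\mu'}{}_{\nu} = \begin{pmatrix} \Lambda^{0'}{}_{0} & \Lambda^{0'}{}_{1} & \Lambda^{0'}{}_{2} & \Lambda^{0'}{}_{3} \\ \Lambda^{1'}{}_{0} & \Lambda^{1'}{}_{1} & \Lambda^{1'}{}_{2} & \Lambda^{1'}{}_{3} \\ \Lambda^{2'}{}_{0} & \Lambda^{2'}{}_{1} & \Lambda^{2'}{}_{2} & \Lambda^{2'}{}_{3} \\ \Lambda^{3'}{}_{0} & \Lambda^{3'}{}_{1} & \Lambda^{3'}{}_{2} & \Lambda^{3'}{}_{3} \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma v_x/c & 0 & 0 \\ -\gamma v_x/c & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{5.38} \]
It is important to recognize that this matrix, as used in Eq. (5.37), transforms the components of 4-vectors in $S$ to the components in $S'$. If we were to carry out a transformation from $S'$ to $S$, we would have to write
\[ \Delta x^{\mu} = \sum_{\nu'=0'}^{3'} \Lambda^{\mu}{}_{\nu'}\,\Delta x^{\nu'} \tag{5.39} \]
with now
\[ \Lambda^{\mu}{}_{\nu'} = \begin{pmatrix} \Lambda^{0}{}_{0'} & \Lambda^{0}{}_{1'} & \Lambda^{0}{}_{2'} & \Lambda^{0}{}_{3'} \\ \Lambda^{1}{}_{0'} & \Lambda^{1}{}_{1'} & \Lambda^{1}{}_{2'} & \Lambda^{1}{}_{3'} \\ \Lambda^{2}{}_{0'} & \Lambda^{2}{}_{1'} & \Lambda^{2}{}_{2'} & \Lambda^{2}{}_{3'} \\ \Lambda^{3}{}_{0'} & \Lambda^{3}{}_{1'} & \Lambda^{3}{}_{2'} & \Lambda^{3}{}_{3'} \end{pmatrix} = \begin{pmatrix} \gamma & \gamma v_x/c & 0 & 0 \\ \gamma v_x/c & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{5.40} \]
where we note that $v_x \to -v_x$ as we are transforming 'the other way', that is, from $S'$ to $S$. It is reasonable to expect that the two matrices for the Lorentz transformations from $S$ to $S'$ and vice versa would be inverses of each other. That this is indeed the case can be readily confirmed by multiplying the two matrices together. This point is discussed further in the following Section.

The second change in notation is very important as it offers considerable simplification of what would otherwise be exceedingly complicated expressions.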
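The claim that the matrices of Eq. (5.38) and Eq. (5.40) are inverses of each other can be confirmed by carrying out the multiplication numerically. This is a small sketch using plain nested lists; the function names and the value $v_x/c = 0.6$ are illustrative assumptions.

```python
import math

def boost_matrix(beta):
    """Matrix of Eq. (5.38) for S' moving with velocity vx = beta * c relative to S."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def matmul(a, b):
    """Product of two 4 x 4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

beta = 0.6
s_to_s_prime = boost_matrix(beta)     # Eq. (5.38)
s_prime_to_s = boost_matrix(-beta)    # Eq. (5.40): same form with vx -> -vx

# The product should be the 4 x 4 identity matrix up to rounding error.
for row in matmul(s_prime_to_s, s_to_s_prime):
    print([round(entry, 12) for entry in row])
```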
This new notation goes under the name of the Einstein summation convention.

Footnote 1: Note the unusual notation in which the symbol for an element of a matrix is used as the symbol for the complete matrix. This is done as a shorthand convenience, and though mathematically inelegant, it does have its uses.

5.5.1 The Einstein Summation Convention

This convention, which Einstein looked on as his greatest invention, means replacing sums like
\[ \Delta x^{\mu'} = \sum_{\nu=0}^{3} \Lambda^{\mu'}{}_{\nu}\,\Delta x^{\nu} \]
by
\[ \Delta x^{\mu'} = \Lambda^{\mu'}{}_{\nu}\,\Delta x^{\nu} \tag{5.41} \]
with the understanding that whenever there is any repeated (greek) index appearing in a 'one up', 'one down' combination, the summation over the four values of the repeated index is understood. Thus here, as the index $\nu$ appears 'down' in $\Lambda^{\mu'}{}_{\nu}$ and 'up' in $\Delta x^{\nu}$, a summation over this index is understood. There are a number of important features associated with the convention.

Dummy indices: A repeated index is known as a dummy index, by which is meant that any (greek) symbol can be used instead without a change in meaning, i.e.
\[ \Delta x^{\mu'} = \Lambda^{\mu'}{}_{\alpha}\,\Delta x^{\alpha} = \Lambda^{\mu'}{}_{\beta}\,\Delta x^{\beta} = \ldots \tag{5.42} \]
This changing around of dummy indices can be a useful trick in simplifying expressions, particularly when a substitution has to be made, as will be illustrated later.

No summation implied: If an index is repeated, but both occur in an up position or in a down position, then no summation is implied, i.e. $\Gamma_{\mu\mu} \neq \Gamma_{00} + \Gamma_{11} + \Gamma_{22} + \Gamma_{33}$.

No meaning assigned: If an index is repeated more than twice, then no meaning is assigned to such a combination, i.e. $\Gamma^{\mu}{}_{\mu\mu}$ does not have an unambiguous meaning. If such a combination should occur, then there is a good chance that an error has been made!

Free index: Any index that is not repeated in a one up, one down arrangement is known as a free index: we are free to give it any of its four possible values. In an equation, all free indices must appear on both sides of the equation in the same (i.e. up or down) position.
Thus
\[ a_{\mu} = \Lambda^{\mu}{}_{\nu'}\,a^{\nu'} \]
is NOT correct, as $\mu$ appears in different positions on either side of the equation. The following example is also not correct:
\[ G = g_{\alpha\beta}u^{\beta} \]
as the free index $\alpha$ appears only on the right hand side of the equation. The name of a free index can be changed, of course, provided it is changed on both sides of an equation. Changing the name of a free index is also a useful trick when manipulating expressions, particularly when one expression is to be substituted into another.

Multiple repeated indices: If more than one pair of repeated indices occurs, then a summation is implied over all the repeated indices, i.e.
\[ \begin{aligned} g_{\mu\nu}a^{\mu}b^{\nu} &= g_{0\nu}a^{0}b^{\nu} + g_{1\nu}a^{1}b^{\nu} + g_{2\nu}a^{2}b^{\nu} + g_{3\nu}a^{3}b^{\nu} \\ &= g_{00}a^{0}b^{0} + g_{01}a^{0}b^{1} + g_{02}a^{0}b^{2} + g_{03}a^{0}b^{3} \\ &\quad + g_{10}a^{1}b^{0} + g_{11}a^{1}b^{1} + g_{12}a^{1}b^{2} + g_{13}a^{1}b^{3} \\ &\quad + g_{20}a^{2}b^{0} + g_{21}a^{2}b^{1} + g_{22}a^{2}b^{2} + g_{23}a^{2}b^{3} \\ &\quad + g_{30}a^{3}b^{0} + g_{31}a^{3}b^{1} + g_{32}a^{3}b^{2} + g_{33}a^{3}b^{3}. \end{aligned} \]

[...]

Hence we must conclude that
\[ \left. \begin{aligned} \vec{e}_{\mu}\cdot\vec{e}_{\nu} &= 0 \qquad \mu \neq \nu \\ \vec{e}_{0}\cdot\vec{e}_{0} &= 1 \\ \vec{e}_{i}\cdot\vec{e}_{i} &= -1 \qquad i = 1, 2, 3 \end{aligned} \right\} \tag{5.59} \]
so they are most unusual basis vectors indeed! At this point we introduce a new quantity
\[ g_{\mu\nu} = \vec{e}_{\mu}\cdot\vec{e}_{\nu} \tag{5.60} \]
which, written as a matrix, looks like
\[ g_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{5.61} \]
In terms of this quantity, the interval $\Delta s$ can be written
\[ \Delta s^2 = g_{\mu\nu}\,\Delta x^{\mu}\Delta x^{\nu}. \tag{5.62} \]
The quantity $g_{\mu\nu}$ plays a central role in defining the geometrical properties of spacetime, in that in curved spacetime the components of $g_{\mu\nu}$ are not simple constants but rather are functions of the spacetime coordinates $x^{\mu}$. A more precise statement is that in the presence of curvature, no matter what frame of reference we use to describe the events in spacetime, there are none for which all the $g_{\mu\nu}$ are constants given by Eq. (5.61) throughout all spacetime. In the particular case in which the $g_{\mu\nu}$ have the constant values given in Eq. (5.61), spacetime is said to be flat. In that case, a different notation is occasionally used, that is, $g_{\mu\nu}$ is written $\eta_{\mu\nu}$.
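The double summation implied in Eq. (5.62) can be spelled out as two nested loops, which also gives a quick check that the interval is unchanged by a Lorentz transformation. The sketch below assumes the flat-spacetime values of Eq. (5.61); the names `ETA`, `interval_squared` and `boost_components`, the sample displacement and the units with c = 1 are illustrative choices, not part of the notes.

```python
import math

# Components of the metric in flat spacetime, Eq. (5.61).
ETA = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, -1.0],
]

def interval_squared(dx):
    """g_{mu nu} dx^mu dx^nu with both repeated indices summed, Eq. (5.62)."""
    return sum(ETA[mu][nu] * dx[mu] * dx[nu]
               for mu in range(4) for nu in range(4))

def boost_components(dx, beta):
    """Components in S' of a displacement with components dx in S (beta = vx / c)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [gamma * (dx[0] - beta * dx[1]),
            gamma * (dx[1] - beta * dx[0]),
            dx[2],
            dx[3]]

dx = [2.0, 1.0, 0.5, -1.0]            # (c dt, dx, dy, dz), illustrative values
dx_prime = boost_components(dx, 0.6)

print(interval_squared(dx), interval_squared(dx_prime))  # should agree
```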
As it plays a role in determining the interval or 'distance' between two events in spacetime, $g_{\mu\nu}$ is referred to as the metric tensor. Why it is called a tensor is something to be examined later.

There are two properties of $g_{\mu\nu}$ that are worth keeping in mind. First, it is symmetric in the indices, i.e.
\[ g_{\mu\nu} = g_{\nu\mu} \tag{5.63} \]
and secondly, it has the same components in all reference frames, i.e.
\[ g_{\mu'\nu'} = \Lambda^{\alpha}{}_{\mu'}\Lambda^{\beta}{}_{\nu'}\,g_{\alpha\beta} \tag{5.64} \]
a result that can be confirmed by direct calculation.

5.5.4 Covectors and Covariant Components

The metric tensor plays another useful role in that we can define a new set of quantities
\[ \Delta x_{\mu} = g_{\mu\nu}\,\Delta x^{\nu}, \tag{5.65} \]
a procedure known as 'lowering an index'. Using the values of $g_{\mu\nu}$ (or $\eta_{\mu\nu}$) given in Eq. (5.59) we can easily evaluate these quantities:
\[ \left. \begin{aligned} \Delta x_{0} &= \Delta x^{0} \\ \Delta x_{i} &= -\Delta x^{i}, \qquad i = 1, 2, 3. \end{aligned} \right\} \tag{5.66} \]
In terms of the $\Delta x_{\mu}$, the interval becomes
\[ \Delta s^2 = g_{\mu\nu}\,\Delta x^{\mu}\Delta x^{\nu} = \Delta x_{\mu}\Delta x^{\mu}. \tag{5.67} \]
If we write this out in matrix form we get
\[ \Delta s^2 = \begin{pmatrix} \Delta x_{0} & \Delta x_{1} & \Delta x_{2} & \Delta x_{3} \end{pmatrix} \begin{pmatrix} \Delta x^{0} \\ \Delta x^{1} \\ \Delta x^{2} \\ \Delta x^{3} \end{pmatrix}. \tag{5.68} \]
The column vector here represents the four-vector $\Delta\vec{s}$, but what does the row vector represent? The fact that it is written out as a row vector, and that its components are different from those that appear in the column vector, suggests that it represents a different mathematical object as compared to the four-vector $\Delta\vec{s}$, and so we (temporarily) give it a new name, $\Delta\tilde{s}$:
\[ \Delta\tilde{s} \doteq \begin{pmatrix} \Delta x_{0} & \Delta x_{1} & \Delta x_{2} & \Delta x_{3} \end{pmatrix}. \tag{5.69} \]
The newly defined mathematical object is known as a one-form or a covector with components $\Delta x_{\mu}$. We can define a set of basis covectors $\tilde{e}^{\mu}$ so that we have
\[ \Delta\tilde{s} = \Delta x_{\mu}\,\tilde{e}^{\mu} \tag{5.70} \]
but as we will soon see, we will not need to develop this idea any further.

The components $\Delta x_{\mu}$ will, of course, be different in different reference frames. We can derive the transformation law by, once again, making use of the fact that $\Delta s^2$ is the same in all reference frames to write
\[ \Delta s^2 = \Delta x_{\mu'}\Delta x^{\mu'}. \]
(5.71)

Using $\Delta x^{\mu'} = \Lambda^{\mu'}{}_{\nu}\,\Delta x^{\nu}$, this becomes
\[ \Delta s^2 = \Lambda^{\mu'}{}_{\nu}\,\Delta x_{\mu'}\Delta x^{\nu} = \Delta x_{\nu}\Delta x^{\nu} \tag{5.72} \]
where
\[ \Delta x_{\nu} = \Lambda^{\mu'}{}_{\nu}\,\Delta x_{\mu'}. \tag{5.73} \]
If we compare this with Eq. (5.54), that is
\[ \vec{e}_{\nu} = \Lambda^{\mu'}{}_{\nu}\,\vec{e}_{\mu'} \]
we see that $\Delta x_{\nu}$ and $\vec{e}_{\nu}$ transform in exactly the same way. Consequently, the $\Delta x_{\nu}$ are referred to as the covariant components of $\Delta\tilde{s}$.

We have arrived at a state of affairs analogous to what we have in quantum mechanics, namely that
\[ \Delta\vec{s} \to |\psi\rangle \quad \text{and} \quad \Delta\tilde{s} \to \langle\psi| \]
though here no complex conjugation is required, as the components of $\Delta\vec{s}$ are all real. Further, just as in quantum mechanics we can equally well describe the state of a physical system in terms of a bra or ket vector, we have here a perfect one-to-one correspondence between $\Delta\vec{s}$ and $\Delta\tilde{s}$. In fact, in general, no distinction need be drawn between them as they equally well represent the same geometrical object in spacetime, so in future we will have no need to talk about the covector $\Delta\tilde{s}$, and instead will simply refer to the four-vector $\Delta\vec{s}$ which has covariant components $\Delta x_{\mu}$ or contravariant components $\Delta x^{\mu}$.

Any four-vector, such as the velocity, acceleration, and Minkowski force four-vectors, can be expressed in terms of its covariant components in the same way as the spacetime displacement vector, with the components transforming in exactly the same fashion as in Eq. (5.73). In all cases, the covariant components of these four-vectors will be related to their contravariant counterparts in the same way as for the components of $\Delta\vec{s}$, e.g. for the momentum four-vector:
\[ \left. \begin{aligned} p_{0} &= p^{0} \\ p_{i} &= -p^{i}, \qquad i = 1, 2, 3. \end{aligned} \right\} \tag{5.74} \]

5.5.5 Transformation of Differential Operators

To see an important example of a situation in which the covariant components of a four-vector naturally arise, consider the derivatives
\[ \frac{\partial\phi}{\partial x^{\mu}} \]
where $\phi$ is some function that has the same value in all reference frames (a scalar function).
To transform these derivatives to their values in another reference frame, we need to use the chain rule for partial derivatives, i.e.
\[ \frac{\partial\phi}{\partial x^{0'}} = \frac{\partial\phi}{\partial x^{0}}\frac{\partial x^{0}}{\partial x^{0'}} + \frac{\partial\phi}{\partial x^{1}}\frac{\partial x^{1}}{\partial x^{0'}} + \frac{\partial\phi}{\partial x^{2}}\frac{\partial x^{2}}{\partial x^{0'}} + \frac{\partial\phi}{\partial x^{3}}\frac{\partial x^{3}}{\partial x^{0'}}. \tag{5.75} \]
Using the Lorentz transformation
\[ x^{\mu} = \Lambda^{\mu}{}_{\nu'}\,x^{\nu'} \tag{5.76} \]
we see that
\[ \frac{\partial x^{0}}{\partial x^{0'}} = \Lambda^{0}{}_{0'}. \tag{5.77} \]
If we carry out the same calculation for all the partial derivatives, we find that
\[ \frac{\partial x^{\mu}}{\partial x^{\nu'}} = \Lambda^{\mu}{}_{\nu'} \tag{5.78} \]
so that
\[ \frac{\partial\phi}{\partial x^{\nu'}} = \Lambda^{\mu}{}_{\nu'}\frac{\partial\phi}{\partial x^{\mu}}. \tag{5.79} \]
If we introduce a new notation and write
\[ \frac{\partial\phi}{\partial x^{\nu}} = \partial_{\nu}\phi \tag{5.80} \]
then Eq. (5.79) becomes
\[ \partial_{\nu'}\phi = \Lambda^{\mu}{}_{\nu'}\,\partial_{\mu}\phi \tag{5.81} \]
which is just the transformation rule, Eq. (5.73), for the covariant components of a four-vector. In fact, it is usual practice to treat the differential operators $\partial_{\mu}$ themselves as being the covariant components of a four-vector, and write
\[ \partial_{\nu'} = \Lambda^{\mu}{}_{\nu'}\,\partial_{\mu}. \tag{5.82} \]

5.6 Tensors

The last formal mathematical tool that we need to introduce is the concept of a tensor. A tensor is a generalization of the idea of a four-vector, and as such a tensor represents a geometrical object existing in spacetime, but one that is even more difficult to visualize than a four-vector. One viewpoint with regard to tensors is that they can be considered as being 'operators' that act upon four-vectors to produce real numbers, and that is the way that the concept will be introduced here. The connection between tensors defined in this manner and concepts already introduced will emerge later, as will the physical applications of the idea. Thus, we begin with a definition.

A tensor $T(\vec{a}, \vec{b}, \vec{c}, \ldots)$ is a linear function of the four-vectors $\vec{a}, \vec{b}, \vec{c}, \ldots$ that maps these four-vectors into the real numbers.

[...]

By using $a^{\mu'} = \Lambda^{\mu'}{}_{\nu}\,a^{\nu}$, and similarly for the other vector components, this becomes
\[ T_{\mu\nu\alpha\ldots}\,a^{\mu}b^{\nu}c^{\alpha}\ldots = T_{\mu'\nu'\alpha'\ldots}\,\Lambda^{\mu'}{}_{\mu}\Lambda^{\nu'}{}_{\nu}\Lambda^{\alpha'}{}_{\alpha}\ldots a^{\mu}b^{\nu}c^{\alpha}\ldots \tag{5.100} \]
As the vectors $\vec{a}, \vec{b}, \vec{c}, \ldots$ are arbitrary, we have
\[ T_{\mu\nu\alpha\ldots} = \Lambda^{\mu'}{}_{\mu}\Lambda^{\nu'}{}_{\nu}\Lambda^{\alpha'}{}_{\alpha}\ldots T_{\mu'\nu'\alpha'\ldots} \]
(5.101) In other words, the transformation is carried out in the same fashion as we have seen for the single index case (i.e. for the components of vectors). In a similar way (by use of gµν to raise indices), we can show for the contravariant components that T µνα... = Λ µ µ′Λ ν ν′Λ α α′ . . . T µ′ν′α′... (5.102) The results Eq. (5.101) and Eq. (5.102), and a corresponding result for mixed components of the tensor T can be used as a test to see whether or not a multi-indexed quantity is, in fact, a tensor. We shall see how this can be implemented in the case of the Faraday tensor used to describe the electromagnetic field. Chapter 6 Electrodynamics in Special Relativity O of the driving forces behind Einstein’s formulation of the principles of special relativity was the deep significance he attached to the laws of electromagnetism. It is therefore not too surprising to find that these laws can be expressed in the language of four-vectors and tensors in a way that explicitly shows that electromagnetism is consistent with the principles of special relativity. The central feature of this relativistic formulation of Maxwell’s theory is the Faraday tensor. 
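Before turning to the Faraday tensor, it may help to see the transformation rule of Eq. (5.101) at work numerically. The sketch below is only an illustration under stated assumptions: it takes the standard boost along the x axis as the matrix $\Lambda^{\mu'}{}_{\nu}$ (assumed to agree with the form used earlier in the notes), restricts the tensor to two indices, and uses arbitrary made-up component values. The function names `boost_x`, `transform_vector`, `pull_back_rank2` and `contract` are hypothetical helpers, not anything defined in the text.

```python
import math

def boost_x(beta):
    """Assumed standard boost along x; entry [m][n] plays the role of Lambda^{m'}_n."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def transform_vector(lam, a):
    """Contravariant components in the primed frame: a^{m'} = Lambda^{m'}_n a^n."""
    return [sum(lam[m][n] * a[n] for n in range(4)) for m in range(4)]

def pull_back_rank2(lam, t_primed):
    """Two-index case of Eq. (5.101): T_{mn} = Lambda^{m'}_m Lambda^{n'}_n T_{m'n'}."""
    return [
        [
            sum(lam[mp][m] * lam[nq][n] * t_primed[mp][nq]
                for mp in range(4) for nq in range(4))
            for n in range(4)
        ]
        for m in range(4)
    ]

def contract(t, a, b):
    """The real number T_{mn} a^m b^n produced by the tensor acting on two vectors."""
    return sum(t[m][n] * a[m] * b[n] for m in range(4) for n in range(4))

lam = boost_x(0.6)

# Arbitrary illustrative values for the primed covariant components and two vectors.
t_primed = [[float(4 * m + n + 1) for n in range(4)] for m in range(4)]
a = [1.0, 2.0, -1.0, 0.5]
b = [0.3, -0.7, 2.0, 1.0]

t_unprimed = pull_back_rank2(lam, t_primed)

# The same number should come out whichever frame's components are used.
value_unprimed = contract(t_unprimed, a, b)
value_primed = contract(t_primed, transform_vector(lam, a), transform_vector(lam, b))
print(value_unprimed, value_primed)
```

The two printed numbers should agree up to floating-point rounding, which is the content of Eq. (5.100): the components change from frame to frame, but the real number assigned to a given pair of four-vectors does not.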
6.1 The Faraday Tensor

The transformation of the electromagnetic field can be obtained by judicious arguments that apply length contraction and time dilation to the basic laws of electromagnetism: Ampere's law for the magnetic field produced by currents, Faraday's law of magnetic induction for the electric fields produced by a time varying magnetic field, and Gauss's law for the electric field produced by static electric charges, all of which are expressed in terms of either line or surface integrals. Alternatively, one can work directly from Maxwell's equations (which are simply restatements of the integral laws in differential form). Either way, it is possible to show that the electric and magnetic fields $\mathbf{E}(x, y, z, t)$ and $\mathbf{B}(x, y, z, t)$ as measured in a frame of reference $S$ are related to the electric and magnetic fields $\mathbf{E}'(x', y', z', t')$ and $\mathbf{B}'(x', y', z', t')$ as measured in a reference frame $S'$ moving with a velocity $v_x$ with respect to $S$ by

$$\begin{aligned}
E'_x &= E_x, & B'_x &= B_x, \\
E'_y &= \gamma\left(E_y - v_x B_z\right), & B'_y &= \gamma\left(B_y + \frac{v_x}{c^2}E_z\right), \\
E'_z &= \gamma\left(E_z + v_x B_y\right), & B'_z &= \gamma\left(B_z - \frac{v_x}{c^2}E_y\right),
\end{aligned} \tag{6.1}$$

with, in addition, the usual Lorentz transformation equations for the spacetime coordinates.

The question then arises as to how the electromagnetic field fits in with the general mathematical formalism presented above. It is first of all clear that the transformation laws given in Eq. (6.1) are not those of a four-vector. For one thing, a four-vector has four components, whereas the electromagnetic field has six. A second rank tensor, on the other hand, has two indices and hence has 4 × 4 = 16 components. However, these components need not all be independent. In fact, we can distinguish two important special cases in which the tensor is either symmetric or antisymmetric in its components. In discussing this point, we will work with the contravariant components of a rank 2 tensor as this turns out to be most convenient when dealing with Maxwell's equations in four-vector notation.
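Thus, we have the two possibilities

$$\begin{aligned}
T^{\mu\nu} &= T^{\nu\mu} && \text{symmetric} \\
T^{\mu\nu} &= -T^{\nu\mu} && \text{antisymmetric}
\end{aligned} \tag{6.2}$$

In the symmetric case, we only need to know 10 components, that is $T^{\mu\mu}$, $\mu = 0, 1, 2, 3$, and $T^{01}, T^{02}, T^{03}, T^{12}, T^{13}$ and $T^{23}$. An important example of a symmetric tensor is the energy-momentum tensor. In the antisymmetric case, we have that (with no sum over $\mu$)

$$T^{\mu\mu} = -T^{\mu\mu} \tag{6.3}$$

so that $T^{\mu\mu} = 0$. What is left are the components $T^{01}, T^{02}, T^{03}, T^{12}, T^{13}$ and $T^{23}$, which automatically give their transposed companions by a change of sign. Written as a matrix, we have

$$T^{\mu\nu} = \begin{pmatrix}
0 & T^{01} & T^{02} & T^{03} \\
-T^{01} & 0 & T^{12} & T^{13} \\
-T^{02} & -T^{12} & 0 & T^{23} \\
-T^{03} & -T^{13} & -T^{23} & 0
\end{pmatrix}. \tag{6.4}$$

Thus only six independent quantities are needed to fully specify the components of an antisymmetric tensor, exactly the same as the number of components of the electromagnetic field. The prospect therefore exists that the electromagnetic field components together constitute the components of a second rank antisymmetric tensor. To test whether or not this is the case, we need to show that an antisymmetric tensor transforms in the same way as the electromagnetic components as given in Eq. (6.1). The transformation equations for $T^{\mu\nu}$ are given by

$$T^{\mu'\nu'} = \Lambda^{\mu'}{}_{\mu}\Lambda^{\nu'}{}_{\nu}\,T^{\mu\nu}. \tag{6.5}$$

Expanding the various components we have

$$\begin{aligned}
T^{0'1'} &= \Lambda^{0'}{}_{\mu}\Lambda^{1'}{}_{\nu}\,T^{\mu\nu} \\
&= \Lambda^{0'}{}_{0}\Lambda^{1'}{}_{\nu}\,T^{0\nu} + \Lambda^{0'}{}_{1}\Lambda^{1'}{}_{\nu}\,T^{1\nu} \\
&= \Lambda^{0'}{}_{0}\Lambda^{1'}{}_{1}\,T^{01} + \Lambda^{1'}{}_{0}\Lambda^{0'}{}_{1}\,T^{10}
\end{aligned} \tag{6.6}$$

where all the other terms vanish either because the diagonal elements $T^{\mu\mu}$ are all zero, or because the elements of the Lorentz transformation matrix are zero. Further, since $T^{01} = -T^{10}$, we get

$$T^{0'1'} = \left(\Lambda^{0'}{}_{0}\Lambda^{1'}{}_{1} - \Lambda^{1'}{}_{0}\Lambda^{0'}{}_{1}\right)T^{01} = \left(\gamma^2 - \gamma^2\frac{v_x^2}{c^2}\right)T^{01} = T^{01}. \tag{6.7}$$

In the same way we find that

$$\begin{aligned}
T^{0'2'} &= \Lambda^{0'}{}_{\mu}\Lambda^{2'}{}_{\nu}\,T^{\mu\nu} \\
&= \Lambda^{0'}{}_{0}\Lambda^{2'}{}_{\nu}\,T^{0\nu} + \Lambda^{0'}{}_{1}\Lambda^{2'}{}_{\nu}\,T^{1\nu} \\
&= \Lambda^{0'}{}_{0}\Lambda^{2'}{}_{2}\,T^{02} + \Lambda^{0'}{}_{1}\Lambda^{2'}{}_{2}\,T^{12}
\end{aligned} \tag{6.8}$$

Substituting for the elements of the Lorentz transformation matrix, $\Lambda^{0'}{}_{0} = \gamma$, $\Lambda^{0'}{}_{1} = -\gamma v_x/c$ and $\Lambda^{2'}{}_{2} = 1$, then gives

$$T^{0'2'} = \gamma\left(T^{02} - \frac{v_x}{c}T^{12}\right). \tag{6.9}$$

Proceeding in this way, we end up with the set of transformation equations:

$$\begin{aligned}
T^{0'1'} &= T^{01}, & T^{2'3'} &= T^{23}, \\
T^{0'2'} &= \gamma\left(T^{02} - \frac{v_x}{c}T^{12}\right), & T^{3'1'} &= \gamma\left(T^{31} + \frac{v_x}{c}T^{03}\right), \\
T^{0'3'} &= \gamma\left(T^{03} + \frac{v_x}{c}T^{31}\right), & T^{1'2'} &= \gamma\left(T^{12} - \frac{v_x}{c}T^{02}\right),
\end{aligned} \tag{6.10}$$

which can be compared with Eq. (6.1), with the electric field equations divided through by $c$, written as follows:

$$\begin{aligned}
\frac{E'_x}{c} &= \frac{E_x}{c}, & B'_x &= B_x, \\
\frac{E'_y}{c} &= \gamma\left(\frac{E_y}{c} - \frac{v_x}{c}B_z\right), & B'_y &= \gamma\left(B_y + \frac{v_x}{c}\frac{E_z}{c}\right), \\
\frac{E'_z}{c} &= \gamma\left(\frac{E_z}{c} + \frac{v_x}{c}B_y\right), & B'_z &= \gamma\left(B_z - \frac{v_x}{c}\frac{E_y}{c}\right).
\end{aligned} \tag{6.11}$$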
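The comparison between Eq. (6.10) and Eq. (6.11) can also be checked numerically. The following is a minimal sketch, not part of the original notes, and it rests on two assumptions: the Lorentz matrix is taken to be the standard boost along x (consistent with the elements $\gamma$ and $-\gamma v_x/c$ used in Eqs. (6.7) and (6.9)), and the tensor components are identified with the fields by reading the two sets of equations side by side, i.e. $T^{01}, T^{02}, T^{03} \leftrightarrow E_x/c, E_y/c, E_z/c$ and $T^{23}, T^{31}, T^{12} \leftrightarrow B_x, B_y, B_z$. The helper names and the field values are made up for illustration.

```python
import math

C = 299792458.0  # speed of light in m/s

def boost_x(beta):
    """Assumed standard boost along x; entry [m][n] plays the role of Lambda^{m'}_n."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def boost_rank2(lam, t):
    """Eq. (6.5): T^{m'n'} = Lambda^{m'}_m Lambda^{n'}_n T^{mn}."""
    return [
        [
            sum(lam[mp][m] * lam[nq][n] * t[m][n]
                for m in range(4) for n in range(4))
            for nq in range(4)
        ]
        for mp in range(4)
    ]

def antisymmetric(t01, t02, t03, t12, t13, t23):
    """Matrix layout of Eq. (6.4)."""
    return [
        [0.0, t01, t02, t03],
        [-t01, 0.0, t12, t13],
        [-t02, -t12, 0.0, t23],
        [-t03, -t13, -t23, 0.0],
    ]

def transform_fields(e, b, v):
    """Eq. (6.1) for a frame moving with velocity v along x."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    ex, ey, ez = e
    bx, by, bz = b
    e_new = (ex, gamma * (ey - v * bz), gamma * (ez + v * by))
    b_new = (bx, gamma * (by + v * ez / C ** 2), gamma * (bz - v * ey / C ** 2))
    return e_new, b_new

beta = 0.6
gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
e_field = (1.0e3, -2.0e3, 0.5e3)      # V/m, illustrative values only
b_field = (2.0e-6, 1.0e-6, -3.0e-6)   # T, illustrative values only

# Assumed identification; note T^{13} = -T^{31} = -B_y.
t = antisymmetric(e_field[0] / C, e_field[1] / C, e_field[2] / C,
                  b_field[2], -b_field[1], b_field[0])
t_new = boost_rank2(boost_x(beta), t)

# Closed forms of Eq. (6.10).
closed_forms = {
    (0, 1): t[0][1],
    (0, 2): gamma * (t[0][2] - beta * t[1][2]),
    (0, 3): gamma * (t[0][3] + beta * t[3][1]),
    (2, 3): t[2][3],
    (3, 1): gamma * (t[3][1] + beta * t[0][3]),
    (1, 2): gamma * (t[1][2] - beta * t[0][2]),
}
error_610 = max(abs(t_new[i][j] - value) for (i, j), value in closed_forms.items())

# Fields transformed directly with Eq. (6.1), arranged as in Eq. (6.11).
e_new, b_new = transform_fields(e_field, b_field, beta * C)
from_fields = {
    (0, 1): e_new[0] / C, (0, 2): e_new[1] / C, (0, 3): e_new[2] / C,
    (2, 3): b_new[0], (3, 1): b_new[1], (1, 2): b_new[2],
}
error_fields = max(abs(t_new[i][j] - value) for (i, j), value in from_fields.items())
print(error_610, error_fields)
```

Both printed discrepancies should be at the level of floating-point rounding. The first confirms that the general rule Eq. (6.5) reduces to the closed forms of Eq. (6.10); the second shows that, under the assumed identification, the six independent components of the boosted antisymmetric tensor reproduce the field transformation of Eq. (6.1).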
