A modern viewpoint on special relativity (I)

In this series of blog posts, I will explain our current understanding of spacetime using the notions of relativity.

Special relativity originates from two simple principles, but its predictions challenge our day-to-day experience. For instance, according to relativity, a person standing in a train station will measure the length of a moving train to be slightly smaller than that of a stationary train of the same type. Furthermore, the same person will find that the clocks hanging in the train tick a little slower than the one on her wrist. These phenomena are termed “length contraction” and “time dilation”. The magnitude of these contractions and dilations depends directly on the relative velocity of the train and the person standing on the platform. At everyday train speeds this relative velocity is so low that one doesn’t see any appreciable change. The effects do show up at high speeds: cosmic particles, which travel at very high speeds, are known to have longer half-lives than their laboratory counterparts. But convincing the audience that relativity is a correct theory of nature is not the purpose of this post. Assuming that relativity is a valid theory, the aim of this post is to describe and discuss the theory.
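To get a feeling for the magnitudes involved, here is a minimal Python sketch. It assumes the standard Lorentz factor \gamma = 1/\sqrt{1-v^2/c^2}, which sets the size of both effects: a moving clock runs slow by the factor \gamma and a moving length shrinks by 1/\gamma. The two speeds are illustrative values chosen for this example, not measurements.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_factor(v: float) -> float:
    """Return gamma = 1/sqrt(1 - v^2/c^2) for a speed v in m/s."""
    if abs(v) >= C:
        raise ValueError("speed must be below the speed of light")
    beta = v / C
    return 1.0 / math.sqrt(1.0 - beta * beta)


# Illustrative (assumed) speeds: a fast train and a hypothetical fast particle.
train_speed = 100.0        # m/s, roughly 360 km/h
particle_speed = 0.99 * C  # 99% of the speed of light

for label, v in [("train", train_speed), ("particle", particle_speed)]:
    gamma = lorentz_factor(v)
    # gamma - 1 measures how far the clock rate deviates from "no effect";
    # 1/gamma is the fraction of the rest length that the platform observer measures.
    print(f"{label}: gamma - 1 = {gamma - 1:.3e}, length fraction = {1 / gamma:.6f}")
```

For the train the deviation of \gamma from 1 is far too small to notice, while for the fast particle \gamma is about 7, which is why a longer half-life becomes observable.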

In my opinion, special relativity is all about the properties of spacetime in a fairly small region and how various observers record the events occurring in that spacetime. The words in italics have precise definitions and meanings in physics, so let us spend some time understanding them. When you read the word spacetime you might think of it as a three-dimensional space with time t as a separate entity running in the background. This is a completely wrong picture of spacetime. If you are a physics major and have had a course in relativity, you might think of it as a strange mixed space of \mathbb{R}^3 and t with 4 coordinates, in which the invariant path length is given by ds^2 = dx^2+dy^2+dz^2-c^2dt^2, where c is the speed of light.

The problem with the last definition is that it doesn’t capture the essence of spacetime as such. It is like explaining to an alien that the person in the red shirt is a boy. Just as the word “boy” is assigned to a specific creature having certain characteristics and is independent of the shirt he wears, the word “spacetime” is assigned to a specific physical entity and is independent of the coordinates assigned to it by the observer (yes, an observer is someone who assigns the coordinates such that her own coordinates are fixed in the coordinate system). Please note that this is not a standard definition of the observer, but it is quite helpful in explaining the subject. In relativity, spacetime is a special set, with many interesting properties, which mathematicians call a “manifold”. The coordinate system is a map from the spacetime manifold (say the set \mathcal{M}) to the set \mathbb{R}^4. The coordinate chart of an N-dimensional manifold is shown in the figure below (courtesy Carroll’s notes)


Thus, when an observer assigns coordinates x, y, z and t to an event, she is basically mapping the events in \mathcal{M} to \mathbb{R}^4. When an observer assigns a coordinate system, she defines a frame of reference. So when we say “the frame of reference of the observer A”, we mean the coordinates assigned by A.

In physics we postulate that all physical phenomena, which are composed of events, happen in this spacetime manifold. In order to analyze them quantitatively, we require an observer who can assign coordinates to these events. The first postulate of relativity says that physics doesn’t depend on the coordinate systems of observers moving with constant relative velocities with respect to each other. We call all such observers “inertial observers”. It simply means that if we do the physics calculations using the coordinates of any inertial observer, we will find the same results. This shouldn’t be hard to accept, as the real physical phenomena take place in the spacetime manifold, and no matter which map (coordinate system) we use for quantitative analysis, we must find the same physical results. Now be careful! This postulate talks about maps which are inertial, so we have to be careful when working with accelerated bodies.

Vectors are mathematical objects (maps, to be precise) which, by definition, remain unchanged under coordinate transformations. In relativity, we have 4-dimensional vectors living in the spacetime manifold which remain the same in all inertial coordinate systems. We call such objects invariants. In fact there is a class of invariants called “tensors”, and we assign all sorts of physical objects (energy-momentum, electromagnetic field etc.) to tensors as they, too, don’t depend on the frame of reference. Now events, which are also physical objects, are assigned to 4-vectors. The distance between two events is called the “path length” and is a very good example of an invariant (actually it is a scalar, which is a tensor of rank 0). The second postulate gives us a prescription to make scalars, in terms of what is known as the metric. According to this prescription the squared path length is mapped to dx^2+dy^2+dz^2-c^2dt^2 in a coordinate system constructed from x, y, z and ct. Here c is a constant introduced so that c^2dt^2 has the dimensions of length squared. Simple dimensional analysis shows that it must have the dimension of \frac{\text{length}}{\text{time}}. In another coordinate system with the coordinates x^\prime, y^\prime, z^\prime and ct^\prime the squared path length is mapped to dx^{\prime 2}+dy^{\prime 2}+dz^{\prime 2}-c^2dt^{\prime 2}. Note that the same constant c has been used to maintain the equality of the frames. Now, due to the invariance of the path length, dx^2+dy^2+dz^2-c^2dt^2=dx^{\prime 2}+dy^{\prime 2}+dz^{\prime 2}-c^2dt^{\prime 2}. This is the starting point of the derivation of the Lorentz transformation in the standard relativity textbooks (although they arrive at this expression using light rays, which is sometimes misleading). The Lorentz transformation is a relation between the coordinate systems of two inertial observers, assigned to the same events in the spacetime manifold \mathcal{M}.

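To end this post I would like to give the celebrated Lorentz transformations. Consider two observers moving with a constant relative speed v along one particular direction (say along the x axis). Then the relation between the coordinates of the observers is (in the standard form, as given for example on Wikipedia)

t^\prime = \gamma\left(t - \frac{vx}{c^2}\right), \quad x^\prime = \gamma\left(x - vt\right), \quad y^\prime = y, \quad z^\prime = z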


Here \gamma = 1/\sqrt{1-\frac{v^2}{c^2}}. When physicists did the experiments, they found that the value of the constant c is nearly 3\times 10^8 meters per second, which is the speed of light. The derivation of these transformations can be found in any standard textbook on relativity or on Wikipedia (https://en.wikipedia.org/wiki/Derivations_of_the_Lorentz_transformations).
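As a quick numerical check, the following Python sketch applies the transformation to two events and confirms that the squared path length comes out the same in both coordinate systems. It assumes the standard form for relative motion along x, namely t^\prime = \gamma(t - vx/c^2), x^\prime = \gamma(x - vt), y^\prime = y, z^\prime = z. The function names and the sample events are made up for this illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def boost_x(t, x, y, z, v):
    """Coordinates (t', x', y', z') assigned by an observer moving with velocity v along x."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    t_prime = gamma * (t - v * x / C**2)
    x_prime = gamma * (x - v * t)
    return t_prime, x_prime, y, z


def squared_path_length(event_a, event_b):
    """dx^2 + dy^2 + dz^2 - c^2 dt^2 between two events given as (t, x, y, z)."""
    dt, dx, dy, dz = (b - a for a, b in zip(event_a, event_b))
    return dx**2 + dy**2 + dz**2 - (C * dt) ** 2


# Two sample events (assumed values): times in seconds, positions in meters.
event_a = (0.0, 0.0, 0.0, 0.0)
event_b = (1.0e-6, 200.0, 50.0, -30.0)
v = 0.6 * C  # relative velocity of the second observer

s2_unprimed = squared_path_length(event_a, event_b)
s2_primed = squared_path_length(boost_x(*event_a, v), boost_x(*event_b, v))

# The individual coordinates differ between the two observers,
# but the squared path length agrees up to floating-point rounding.
print(s2_unprimed, s2_primed)
print(math.isclose(s2_unprimed, s2_primed, rel_tol=1e-9))
```

The sample events are chosen so that c\,dt and the spatial separations are of comparable size, which keeps the rounding error in the subtraction small.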

