Determining the causal structure of spacetime (I)

In this series of blog posts, I aim to explain the causal structure of spacetime solutions in general relativity. Currently, I am working on a special extension of Einstein’s theory (general relativity) known as Horndeski theory. There, I am trying to find the causal structure of the allowed solutions, which allegedly permit superluminal propagation of metric perturbations. The methodology for obtaining this structure is similar for all gravitational theories, and I wish to demonstrate it for general relativity (which is my comfort zone).

In order to make observable predictions from a consistent physical theory, we are interested in finding how the degrees of freedom evolve and behave. For that, we try to formulate equations of motion, which capture the physics at the infinitesimal scale. Once we embed the physics in these equations, we solve them and make verifiable predictions. For instance, in Newtonian mechanics, we study a point particle moving in one dimension x. We then ask how this degree of freedom behaves or evolves in time (the existence of such a time parameter is, again, an assumption). For such a theory, we have the equation of motion F=ma=m\frac{d^2 x(t)}{dt^2}, which contains the spectacular physical insight from Newton (I won’t even try to elaborate on that because it deserves a separate blog post). Now this is a second order ordinary differential equation, which requires two initial conditions. In other words, if you give me the initial position x_0 and the initial velocity u, I can tell you the entire future of x(t) by solving the differential equation. In fact, for constant F, we get x(t)=ut+\frac{1}{2}\frac{F}{m}t^2+x_0.
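To make the idea of evolving initial data concrete, here is a minimal sketch in Python. It assumes a constant force and uses a simple step-by-step update; the function names `closed_form` and `integrate` are made up for this illustration and are not part of any library.

```python
def closed_form(t, x0, u, force, mass):
    """Position at time t for constant force: x = x0 + u t + (F / 2m) t^2."""
    return x0 + u * t + 0.5 * (force / mass) * t ** 2


def integrate(t_end, x0, u, force, mass, steps=10_000):
    """Evolve the initial data (x0, u) forward in small time steps using m x'' = F."""
    dt = t_end / steps
    a = force / mass          # constant acceleration (assumption of this sketch)
    x, v = x0, u
    for _ in range(steps):
        x += v * dt + 0.5 * a * dt * dt   # position update over one step
        v += a * dt                        # velocity update over one step
    return x


# Same initial data fed to both routes: x0 = 1, u = 3, F = 4, m = 2, t = 2.
x_exact = closed_form(2.0, 1.0, 3.0, 4.0, 2.0)
x_numeric = integrate(2.0, 1.0, 3.0, 4.0, 2.0)
```

Both routes take exactly two numbers as input, the initial position and the initial velocity, which is the point of the paragraph above.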

Another way to look at it: equation(s) of motion (of a theory) give us the prescription to evolve the initial data (known values of the degrees of freedom) to final data that we are interested in.

Now Einstein’s theory of gravity has the components of the metric as its degrees of freedom. We denote them by g_{\mu\nu}. Another radical difference (common to all relativistic theories) is that space and time are unified into a single parameter space called the spacetime manifold (inheriting all the structure of pseudo-Riemannian manifolds). So here we ask our favorite question: how do the g_{\mu\nu} evolve in spacetime? Mathematically, we want to know g_{\mu\nu}(x), where x stands for the spacetime coordinates from now on.

This time, we use the insight from Einstein to write the equation of motion for general relativity (in vacuum) as

R_{\mu\nu}-\frac{1}{2}R\,g_{\mu\nu}=0

where the symbols have their usual meaning (R_{\mu\nu} is the Ricci tensor and R the Ricci scalar of g_{\mu\nu}). Here we are working on a 4-dimensional pseudo-Riemannian manifold (\mathcal{M},g), on which g is to be determined by the equation of motion. We define a Cauchy surface \mathcal{C}, a codimension-1 surface in \mathcal{M} on which the initial data for Einstein’s equations is specified. So what does this initial data comprise? (We will take a small detour and return to this question in the next blog post.)

Helmholtz wave

[Figure: Cauchy.jpg]

In order to understand that, we start with a simple example. Consider a 2+1 dimensional manifold as shown. Now let us consider the wave operator on this manifold given by \hat{L}=-\frac{\partial^2}{\partial t^2}+\frac{\partial^2}{\partial x^2}+\frac{\partial^2}{\partial y^2}. For a scalar degree of freedom denoted by u, the operator \hat{L} generates the wave equation \hat{L}u=0. Here we are not interested in solving this linear second order partial differential equation. Instead, we want to deduce the properties of the wave, like its speed of propagation and the extent to which a disturbance/fluctuation in u can propagate.

Let us say that we know the value of u=u(x,y) on the surface \phi(x,y,t)=t=0 (where we might have a specified source or some boundary condition). This is a codimension-1 surface, and we will call x,y the internal coordinates (with respect to the surface). Thus we can easily compute the internal derivatives (from u(x,y)) and denote them by u_i=\partial_iu(x,y) where i\in \{x,y\}. Let the exterior/normal derivative to the surface be u_\phi (x,y)=\eta(du,d\phi)=-\partial_tu (note that d\phi = dt on \mathcal{C}).

In this example, we have a Cauchy surface \mathcal{C} defined by \phi=t=0 with the data

  • u(x,y)
  • internal derivatives u_i(x,y)
  • external derivative u_\phi(x,y)

given on \mathcal{C}. Clearly, all second derivatives of u except u_{\phi\phi} can be computed from the given data, by taking internal derivatives of u_i and u_\phi. We then use the wave equation (or the equation of motion in general) to find the missing second derivative of u. The co-normal to the surface \phi=0 is d\phi, which in the coordinate basis reads d\phi=\phi_\mu dx^\mu with \phi_\mu=\partial_\mu\phi. We then perform a coordinate transformation to a chart \lambda^\mu such that \lambda^0=\phi. Note that in this particular case d\phi=dt, so the new chart coincides with the x^\mu chart. This is because \mathcal{C} is already a surface of constant time coordinate.

It is not very difficult to show that the wave equation \hat{L}u=0, under the above transformation, takes the form u_{\phi\phi}Q(\phi_\mu)+\ldots=0, where the dots stand for terms already determined by the data on the surface and Q(\phi_\mu)=\eta^{\mu\nu}\phi_\mu\phi_\nu is called the characteristic form. Now there are two situations:

  1. Q\neq 0: in this case, we can invert the equation u_{\phi\phi}Q(\phi_\mu)+\ldots=0 and find the second derivative of u in the direction normal to the surface. With that information we can easily evolve the data further in time.
  2. Q=0: well, we can’t invert the equation which basically implies that there is no unique evolution of u beyond that surface (which is now called the characteristic surface).
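The origin of the u_{\phi\phi}Q term can be sketched with the chain rule. This is only an outline, assuming a smooth change of chart x^\mu\to\lambda^\alpha with \lambda^0=\phi; the index labels \alpha,\beta are introduced here for illustration, and \partial^2u/\partial\phi^2 is what the text calls u_{\phi\phi}.

```latex
\begin{aligned}
\partial_\mu u &= \frac{\partial\lambda^\alpha}{\partial x^\mu}\,
                  \frac{\partial u}{\partial\lambda^\alpha}, \\
\eta^{\mu\nu}\partial_\mu\partial_\nu u
  &= \eta^{\mu\nu}\frac{\partial\lambda^\alpha}{\partial x^\mu}
     \frac{\partial\lambda^\beta}{\partial x^\nu}
     \frac{\partial^2 u}{\partial\lambda^\alpha\partial\lambda^\beta}
   + \eta^{\mu\nu}\frac{\partial^2\lambda^\alpha}{\partial x^\mu\partial x^\nu}
     \frac{\partial u}{\partial\lambda^\alpha} \\
  &= \underbrace{\eta^{\mu\nu}\phi_\mu\phi_\nu}_{Q(\phi_\mu)}\,
     \frac{\partial^2 u}{\partial\phi^2} + \ldots
\end{aligned}
```

The dots contain only first derivatives of u and second derivatives with at most one \phi-derivative, all of which follow from the data on the surface.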

In our example, the characteristic form is nonzero: with \phi=t we have \phi_\mu=(1,0,0), so Q=-\phi_t^2+\phi_x^2+\phi_y^2=-1. Hence the Cauchy surface t=0 (the entire x,y plane) is not a characteristic hypersurface for the degree of freedom u obeying the wave equation \hat{L}u=0, and the data on it can be evolved uniquely. This is not surprising if you think about plane electromagnetic waves, which obey the same wave equation we wrote above and can likewise be evolved from data on a constant-time surface.
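The characteristic form is easy to evaluate for any candidate surface. The sketch below assumes the signature (-,+,+) read off from the coefficients of \hat{L}, with components ordered as (t, x, y); the name `characteristic_form` is made up for this illustration.

```python
# Inverse Minkowski metric in 2+1 dimensions, signature (-, +, +),
# matching the coefficients of the wave operator (order: t, x, y).
ETA_INV = [
    [-1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]


def characteristic_form(grad_phi, a=ETA_INV):
    """Return Q = a^{mu nu} phi_mu phi_nu for the covector (phi_t, phi_x, phi_y)."""
    n = len(grad_phi)
    return sum(
        a[m][k] * grad_phi[m] * grad_phi[k]
        for m in range(n)
        for k in range(n)
    )


q_constant_time = characteristic_form((1.0, 0.0, 0.0))   # surface phi = t
q_null_plane = characteristic_form((1.0, -1.0, 0.0))     # surface phi = t - x
```

For the surface \phi=t the form evaluates to -1, so the equation can be solved for u_{\phi\phi}. For the null plane \phi=t-x it vanishes, which is the second situation in the list above.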

Moving on, we obtained an equation for a surface in the new coordinates \lambda^\mu, given by Q(\phi_\mu)=0. Physically, this is the surface beyond which we cannot evolve the degrees of freedom uniquely (as the second derivative is not uniquely determined). Now we need a mechanism to generate this surface.

Bicharacteristic curves

First we define the bicharacteristic curves, or rays, associated with the linear second order partial differential operator \hat{L}, which we now generalize to \hat{L}[u]=a^{\mu\nu}u_{\mu\nu}+d where u_{\mu\nu}=\partial_\mu\partial_\nu u. It is not difficult to check that in this case the characteristic form becomes Q=a^{\mu\nu}\phi_\mu\phi_\nu. For our Helmholtz wave, we get the form Q=-\phi_t^2+\phi_x^2+\phi_y^2.

Now the bicharacteristic curves are generated by the following ordinary differential equations, with a parameter s:

\frac{dx^\mu}{ds}=\frac{1}{2}\frac{\partial Q}{\partial\phi_\mu} and \frac{d\phi_\mu}{ds}=-\frac{1}{2}\frac{\partial Q}{\partial x^\mu}

For our Helmholtz wave, one can easily solve them and find the solutions x(t)=at+b, y(t)=ct+d. On the characteristic surface Q=0 the constants satisfy a^2+c^2=1, so these are the rays of a cone (with appropriate boundary conditions), traveling with speed 1.
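As a rough numerical check, one can integrate the ray equations for Q=-\phi_t^2+\phi_x^2+\phi_y^2 directly. The sketch below is illustrative only: the helper names `ray`, `null_covector` and `coordinate_speed` are made up, and the covector is chosen by hand so that Q=0.

```python
import math


def ray(phi, start=(0.0, 0.0, 0.0), s_end=1.0, steps=100):
    """Integrate dx^mu/ds = (1/2) dQ/dphi_mu for Q = -phi_t^2 + phi_x^2 + phi_y^2.

    phi = (phi_t, phi_x, phi_y) stays constant along the ray because Q does
    not depend on the coordinates, so d(phi_mu)/ds = 0.
    """
    phi_t, phi_x, phi_y = phi
    velocity = (-phi_t, phi_x, phi_y)  # (dt/ds, dx/ds, dy/ds)
    ds = s_end / steps
    t, x, y = start
    for _ in range(steps):
        t += velocity[0] * ds
        x += velocity[1] * ds
        y += velocity[2] * ds
    return t, x, y


def null_covector(angle):
    """Hypothetical helper: a covector with Q = 0 whose ray points along `angle`."""
    return (-1.0, math.cos(angle), math.sin(angle))


def coordinate_speed(phi):
    """Speed sqrt(x^2 + y^2) / |t| of the ray launched from the origin with covector phi."""
    t, x, y = ray(phi)
    return math.hypot(x, y) / abs(t)


speeds = [coordinate_speed(null_covector(k * math.pi / 4)) for k in range(8)]
```

Every entry of `speeds` comes out equal to 1 up to rounding, whatever the direction, which is the cone described above.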

We have shown that if we introduce some perturbations at some point in the Minkowski manifold, those perturbations will travel at unit speed and won’t escape the cone. This essentially exhibits the causal structure of flat Lorentzian spacetime, which is consistent with the wave equation.



BMS on my mind

I thought that I knew the Minkowski spacetime solution fairly well. But recently, to my surprise, I found that there is much more physics in that solution, especially at null infinity. There is a set of symmetry transformations acting asymptotically at null infinity which preserve certain boundary conditions. This group of symmetries is known as the BMS (Bondi-Metzner-Sachs) group. I will make more precise statements below in this post. Another, more surprising fact I found is that general relativity may not be a truly diffeomorphism invariant theory. In fact, these BMS transformations map one asymptotically flat solution of the constraints to another, physically inequivalent, asymptotically flat solution. But that is a topic for a later post (when I am older and wiser).

Before progressing further, let me try to explain why physicists are interested in this symmetry group. In the well established Standard Model, the particles are the unitary irreducible representations of the isometry group of flat Minkowski spacetime (known as the Poincaré group) and of the internal symmetries. For a curved and dynamical spacetime, a part of the isometry group breaks down and the rest gets gauged (I plan to write a blog post describing this phenomenon in the near future). The Standard Model is formulated within the framework of Quantum Field Theory, which works only on flat spacetimes, on curved spacetimes with fixed geometry, or on curved dynamical spacetimes with a classical graviton. Gravity, on the other hand, is the theory of the dynamics of spacetime. Therefore, the Standard Model cannot include gravitons, the quanta of gravity (just as photons are the quanta of the electromagnetic field).

Now, this is the most interesting part: if we study physics on asymptotically flat dynamical spacetime solutions, then we do have an asymptotic symmetry group of this gravitational space (it is essentially BMS). The irreducible representations of this group will give the usual particles and some extra multiplets. These extra multiplets are termed soft gravitons, which have gained much attention recently in the quantum gravity community.

Again, in the hope of using these notes for my future reference, and to save a great fraction of my energy, I will take the liberty to be mathematically intense. But, I will try to maintain the rigor of topology, differential geometry and group theory so that my mathematician friends don’t get annoyed.

Now there are two ways to fish out the BMS group of Minkowski space. One is due to Sachs and the other is due to Penrose. Let us start with the first one.

Consider a normal-hyperbolic Riemannian manifold (\mathcal{M},g_{\mu\nu}) and a chart (u(=t-r),r,\theta,\phi) in the neighborhood of a point P with the following properties:

  1. the hypersurfaces u=\text{constant} are tangent to the local light cone everywhere.
  2. r is the corresponding luminosity distance.
  3. the scalars \theta, \phi are constant along each ray defined by the tangent vector k^a=- g^{ab}\partial_b u.

For such a manifold \mathcal{M}, the line element ansatz can be written as ds^2=\frac{V}{r}e^{2\beta}du^2-2e^{2\beta}du\,dr+r^2h_{AB}(dx^A-U^Adu)(dx^B-U^Bdu), where A,B \in \{\theta, \phi\}, the quantities V, \beta, U^A and h_{AB} are functions of the coordinates, and \det(h_{AB})=b.

After feeding this ansatz into Einstein’s field equations, one obtains the following asymptotic behavior of the functions:

  •  V=-r+2M+\mathcal{O}(\frac{1}{r})
  • \beta=\frac{cc^*}{(2r)^2}+\mathcal{O}(\frac{1}{r^4})
  • h_{AB}dx^Adx^B=(d\theta^2+\sin^2\theta d\phi^2)+\mathcal{O}(\frac{1}{r})
  • U^A=\mathcal{O}(\frac{1}{r^2})

A spacetime is said to be asymptotically flat if

  1. there exists a chart (u,r,\theta,\phi) with the properties mentioned above
  2. the line element ansatz and the asymptotic behavior mentioned above hold

Also, at large r the line element reduces to \lim_{r\to\infty}{ds^2}=-du^2-2du\,dr+r^2d\Omega^2, which is the flat Minkowski line element in the coordinates u=t-r, as we would expect.
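As a check on this limit, one can start from the flat line element in spherical coordinates and substitute t=u+r (the retarded time u=t-r used in the chart above):

```latex
\begin{aligned}
ds^2 &= -dt^2 + dr^2 + r^2 d\Omega^2, \qquad t = u + r \\
     &= -(du + dr)^2 + dr^2 + r^2 d\Omega^2 \\
     &= -du^2 - 2\,du\,dr + r^2 d\Omega^2 .
\end{aligned}
```

The same expression follows from the ansatz when V/r\to -1, \beta\to 0, U^A\to 0 and h_{AB}dx^Adx^B\to d\Omega^2, which is what the asymptotic behavior listed above gives at large r.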

At this point, Bondi and Metzner studied the set of all coordinate transformations which preserve the form of the line element ds^2 and the asymptotic behavior of the functions. Their considerations were subsequently generalized, and the following group was obtained.

So consider the spacetime smoothly covered by the coordinates 0\leq r<\infty, 0\leq\theta\leq\pi, 0\leq\phi\leq 2\pi \text{ and } -\infty<u<\infty. The BMS transformations, labeled by (\alpha, \Lambda), are given by

  • \bar{u}=(K_\Lambda(x))^{-1}(u+\alpha(x))+\mathcal{O}(\frac{1}{r})
  • \bar{r}=K_\Lambda(x)r+J(x,u)+\mathcal{O}(\frac{1}{r})
  • \bar{\theta}=(\Lambda x)_\theta+H_\theta(r,u)r^{-1}+\mathcal{O}(\frac{1}{r})
  • \bar{\phi}=(\Lambda x)_\phi+H_\phi(r,u)r^{-1}+\mathcal{O}(\frac{1}{r})

Here x=(\theta,\phi) is the coordinate on S^2, \Lambda is the Lorentz transformation acting as a conformal transformation on S^2, and K_\Lambda(x) is the corresponding conformal factor. \alpha is a scalar function on S^2 which parametrizes the supertranslation subgroup. The rest of the functions are uniquely determined by imposing the following composition constraints:

  • (\alpha_1,\Lambda_1)(\alpha_2,\Lambda_2)=(\alpha_1+\Lambda_1\alpha_2,\Lambda_1\Lambda_2)
  • (\Lambda_1\alpha_2)(x)=(K_{\Lambda_1}(x))^{-1}\alpha_2(\Lambda_1^{-1}x)

One can immediately notice the semi-direct product structure of the BMS group here: B=N\rtimes L, where N is the infinite dimensional normal subgroup of supertranslations and L is the connected component of the homogeneous Lorentz group.
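As a quick consistency check of the composition law (a sketch, assuming that the identity Lorentz transformation, written 1 here, has conformal factor K_1(x)=1), two pure supertranslations compose by simple addition, while two pure Lorentz transformations compose as usual:

```latex
\begin{aligned}
(\alpha_1, 1)(\alpha_2, 1) &= (\alpha_1 + \alpha_2,\; 1), \\
(0, \Lambda_1)(0, \Lambda_2) &= (0,\; \Lambda_1\Lambda_2).
\end{aligned}
```

So the supertranslations form an abelian subgroup, and the Lorentz transformations act on them through \alpha\mapsto\Lambda\alpha.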

Penrose’s derivation coming soon!