Geometrical representation of the Killing spinors preserving N=4 supersymmetry (I)

In the low energy limit the mysterious M-theory boils down to a much more tractable d=11 supergravity theory (SUGRA). It is therefore essential to understand the supersymmetric constraints of the theory, which have crucial applications in the field of holography.

Supersymmetry is essentially a (very awesome if you ask me!) symmetry which keeps the theory invariant under the bosonic and fermionic variations given by

\delta_\epsilon\Theta =\epsilon, \qquad \delta_\epsilon X^M=i\bar{\epsilon}\Gamma^M\Theta

Here \epsilon is a Killing spinor which satisfies the Killing equation

\nabla_X\epsilon=\lambda X.\epsilon

It becomes covariantly constant for \lambda =0. In the curved solutions of SUGRA, supersymmetries are broken due to the non-trivial covariant derivative. In order to preserve SUSY, the solutions of the Killing equation play an essential role. We focus on those spinors which are invariant under the spin lift of the holonomy group of the appropriate manifold. For d=11 SUGRA, the Killing equation takes the following algebraic form


Now the notion of G-structures essentially classifies the special differential forms which arise in supersymmetric flux compactifications. As can be deduced from the Killing equation, its solutions tie the spin bundle of the supersymmetries to the metric of the manifold (which carries a spin structure) in a very intimate way.

Definition: A spin structure on a manifold (\mathcal{M},g) with signature (s,t) is a principal Spin(s,t)-bundle Spin(\mathcal{M})\to \mathcal{M} together with a bundle morphism \phi : Spin(\mathcal{M})\to SO(\mathcal{M}).

To define the G-structure, we associate the differential forms with the Killing spinors as follows


The aim is to show that these differential forms obey a set of first order differential equations as a natural consequence of the Killing equations. Now it can be shown that for Calabi-Yau manifolds, or manifolds with G_2 holonomy, one usually finds the Killing spinor bundles trivially defined by algebraic projections, built from some differential forms, applied to the complete spin bundle.

So this seems like a good point to start and make an ansatz for the projection operator for the spin bundle structure in the curved spacetime. These projections are essentially the differential forms defined above which give rise to the notion of the G_2 structures.

Here (for reasons beyond me right now), three projection operators \Pi_j, j=0,1,2, are defined which break the 32 supersymmetries to four. Another fact is that if there is a holographic dual to the theory with a Coulomb branch, then there is a non-trivial moduli space for brane probes. This moduli space will be realized as a conformally Kahler section of the metric (for four supersymmetries). And it is on this section of the metric that the supersymmetries will satisfy the projection conditions \Pi_j\epsilon=0 with \Pi_j=\frac{1}{2}(1+\Gamma^{\xi_j}), where \Gamma^{\xi_j} represents the product of gamma matrices parallel to the moduli space of the branes.
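The counting from 32 supersymmetries down to four can be illustrated with a toy model. The sketch below is not built from the actual d=11 gamma matrices. It only assumes that the three \Gamma^{\xi_j} square to one, are traceless and commute with each other, so that they can be diagonalized simultaneously with eigenvalues +1 and -1. Each condition \Pi_j\epsilon=0 then keeps half of the remaining components. The labelling of the basis by five signs and the function name are my own choices for illustration.

```python
from itertools import product

# Toy model (an assumption, not the real d=11 Clifford algebra):
# label a basis of the 32-component spinor by 5 signs (+1 or -1),
# and let Gamma^{xi_j} act diagonally as the j-th sign for j = 0, 1, 2.
N_SIGNS = 5          # 2**5 = 32 spinor components
N_PROJECTORS = 3     # Pi_0, Pi_1, Pi_2


def surviving_components(n_signs=N_SIGNS, n_projectors=N_PROJECTORS):
    """Return the basis labels annihilated by every Pi_j = (1 + Gamma_j)/2."""
    survivors = []
    for signs in product((+1, -1), repeat=n_signs):
        # Pi_j multiplies this basis vector by (1 + sign_j)/2,
        # so Pi_j * epsilon = 0 requires sign_j = -1.
        if all((1 + signs[j]) // 2 == 0 for j in range(n_projectors)):
            survivors.append(signs)
    return survivors


total = 2 ** N_SIGNS
kept = len(surviving_components())
print(f"{total} supersymmetries -> {kept} after imposing Pi_j epsilon = 0")
```

Each projector halves the count (32, 16, 8, 4), which is all that the statement "three projections leave four supersymmetries" needs from the algebra.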

Now we can find the equations of motion of the theory by demanding that the fermionic variations vanish, implying the Killing equation! The solution we are considering here essentially has the topology of AdS_4\times S^7. Using orthonormal frames, one can show the presence of a Kahler structure on the brane-probe moduli space as a conformal multiple of

J_{\text{moduli}}=e^6\wedge e^9+e^7\wedge e^8-e^5\wedge e^{10}

I will continue from here in the next blog-post!


The Holometer

Physics is all about understanding the phenomena that occur in nature. We essentially want to write down the equations which describe these phenomena and can be used for the benefit of humanity. Black holes are naturally occurring, mysterious objects in space. They have a very strong gravitational pull and are condensed into a very small region where quantum effects are appreciable. Therefore, we need to formulate a successful theory of quantum gravity which will not only give us a better understanding of nature, but also provide powerful practical tools in the future.

The Holographic Principle is a physical principle expected of any successful theory of quantum gravity. Although we don’t quite understand quantum gravity, we can extrapolate notions from the already well established physical theories and cleverly deduce a pattern which should be manifest in quantum gravity. The pattern here is essentially the existence of a precise and very strong limit on the information content of spacetime. The holographic principle intimately connects the number of quantum mechanical states with a region of spacetime and sets the stage for a consistent theory of quantum mechanics, matter and gravity. String Theory is a theory of quantum gravity which has successfully realized this principle through the AdS/CFT conjecture. It is important to note that an experiment that directly validates the holographic principle does not necessarily validate string theory itself.

Recently, my friend Suzanne Jacobs introduced me to a project named Holometer at Fermilab which is an attempt to experimentally verify the holographic principle. I aim to explain the concept behind the working of the Holometer in this, hopefully, self-contained, blog-post.

Now, general relativity is a theory of spacetime and matter. It is a good theory for large length scales (a rough estimate is from the radius of the Earth to that of the Milky Way galaxy and beyond!). At these scales, matter can be appropriately described by classical mechanics and spacetime can be treated as a continuum. At length scales smaller than the radius of the Earth, we have Newtonian gravity, in which space and time are treated separately and matter is again governed by classical mechanics. When you go down to the length scale of an atom, gravity becomes weak and other forces like the electromagnetic force become dominant. This is the regime of quantum mechanics. So for all practical purposes, we can forget about gravity and classical mechanics, and just work with the Hilbert space of quantum mechanics. In a very loose sense, Hilbert space is the stage for quantum theory just as spacetime is the stage for general relativity.

Quantum theory is a very peculiar theory and one of its results can be stated as

To probe the smaller length scales, you need to apply more energy in the system.

This is known as the uncertainty principle and is given by a simple formula, \Delta x \Delta p \geq \hbar/2. Now this is fine. We have built particle accelerators capable of achieving very high energies and verifying the theory known as the Standard Model. But since we live in a universe with gravity, there is a theoretical upper bound on the amount of energy that we can put in a region of space without creating a black hole (about which we don’t know much). And at this point gravity (a perfectly understood concept in the classical domain) comes back to haunt us in the quantum regime. This length scale is known as the Planck scale and its numerical value is 1.6×10^(-35) meters. And at this length scale we need to formulate a theory of quantum gravity.
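The quoted number can be recovered from the three constants that enter the argument. A minimal sketch, assuming the standard definition of the Planck length as \sqrt{\hbar G/c^3} (this formula is not written out above) and rounded CODATA values for the constants:

```python
import math

# CODATA values in SI units (rounded; treat the last digits as approximate).
HBAR = 1.054571817e-34   # reduced Planck constant, J s
G = 6.67430e-11          # Newton's constant, m^3 kg^-1 s^-2
C = 299792458.0          # speed of light, m/s


def planck_length(hbar=HBAR, g=G, c=C):
    """Length built from hbar, G and c alone: sqrt(hbar * G / c**3)."""
    return math.sqrt(hbar * g / c ** 3)


print(f"Planck length ~ {planck_length():.2e} m")
```

The result agrees with the value of about 1.6×10^(-35) meters quoted in the text.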

Many physicists believe that spacetime should be an emergent notion in quantum gravity. Based on this school of thought, and theoretical calculations like the covariant entropy bound, the Holographic Principle has a very nice interpretation. It basically associates a Hilbert space with each causal diamond in flat spacetime, as shown in the picture. Here we are considering a 1+1 dimensional spacetime manifold.


A causal diamond (here, rhombus A1A2 and B1B2) is roughly a region of spacetime which is causally connected and is characterized by the proper time parameter \tau (the distance along the time axes tA and tB). Of course, here we assume that the observers are at rest with respect to this coordinate system. The Holographic principle assigns the Hilbert space \mathcal{H}_A to the causal diamond A1A2 and \mathcal{H}_B to the diamond B1B2. Essentially, these Hilbert spaces have states amongst which a causal connection can be established by definition. The intersection of the causal diamonds (shaded red) is the region of spacetime causally connected to both the points A2 and B2. The Hilbert space associated with this region is \mathcal{H}_{AB}. And now the holographic claim is

the Hilbert space \mathcal{H}_{AB} is completely determined by some mathematical manipulation of the spaces \mathcal{H}_{A} and \mathcal{H}_{B}.

the geometry of red shaded region is completely governed by the \mathcal{H}_{AB}

Now in the Hilbert spaces, the observables (experimentally detectable structures) are non-local. It simply means they don’t depend on the spacetime coordinates. In fact, as we have seen, the spacetime, hence locality, emerges from the holographic picture. It was not there in the quantum theory that we started with! So whenever I mention that something is non-local, it would just mean that it is somewhere in the Hilbert space of the spacetime.

(The rest of this blog-post is based on a non-peer-reviewed article, so I shouldn’t be held responsible for any inconsistencies in the subsequent paragraphs :))

Ok, now consider an observable denoted by \hat{x} in the Hilbert space \mathcal{H}, which holographically represents the set of world lines in the spacetime manifold. Since \hat{x} is a quantum mechanical object, the set of world lines it corresponds to should exhibit quantum behavior. \hat{x} is an entirely new degree of freedom and differs from the position variable in classical spacetime (please note that classical spacetime is different from the holographic spacetime we are talking about here). Also, a priori, we don’t know the corresponding Hamiltonian or the conjugate observable. This is radically different from the String Theory treatment of quantum gravity!

Define a measure (time-domain correlation function) to quantify the deviation of the quantum characteristic of \hat{x} from its classical counterpart \bar{x} by \sigma(\tau) = \langle\Delta x(t)\Delta x(t+\tau)\rangle_{t}F(\tau). In other words, by the very definition of this function, \sigma(\tau)=0 if there is no quantum or holographic behavior in the evolution of the world lines! And if some experiment establishes this equality, we can then safely say that spacetime is perfectly classical and throw the holographic principle out of the window. A non-zero value of \sigma(\tau) represents the “jitter” or the “fuzziness” that Fermilab’s Holometer is trying to detect.
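To make the definition concrete, here is a small sketch of how such a time-averaged correlation could be estimated from sampled data. Everything specific in it is an assumption for illustration: the weight F(\tau) is simply set to 1, the "jittery" world line is independent Gaussian noise, and the function name is hypothetical. It is not the noise model used by the Holometer.

```python
import random


def sigma(delta_x, lag, weight=1.0):
    """Time-averaged product <dx(t) dx(t+lag)>_t times a weight F(lag).

    delta_x : list of sampled deviations x_hat - x_bar
    lag     : shift tau, in units of the sampling step
    weight  : stand-in for F(tau); set to 1 here as an assumption
    """
    n = len(delta_x) - lag
    if n <= 0:
        raise ValueError("lag must be smaller than the number of samples")
    average = sum(delta_x[t] * delta_x[t + lag] for t in range(n)) / n
    return weight * average


# A perfectly classical world line: no deviation, so sigma vanishes.
classical = [0.0] * 1000

# A made-up jittery world line: independent Gaussian deviations.
rng = random.Random(0)
jittery = [rng.gauss(0.0, 1.0) for _ in range(1000)]

print("classical, lag 0:", sigma(classical, 0))
print("jittery,   lag 0:", sigma(jittery, 0))
print("jittery,   lag 5:", sigma(jittery, 5))
```

For the classical samples the estimate is exactly zero at every lag, while any deviation at all gives a positive value at zero lag, which is the kind of signal the experiment looks for.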

Now a quantum mechanical state decoheres (becomes more classical) with time. This effect can and will drive the time-domain correlation function to 0, which would defeat the entire purpose of the experiment. The condition that the non-zero \sigma(\tau) be measured before decoherence kicks in gives bounds on the dimensions of the experimental apparatus (length and size of the mirrors in the interferometer).

Physicists at Fermilab have an interesting construction to fish out the holographic “jitter” using certain “models” and the details of the experiment can be found at

My observation as a graduate student

This seems to be a good project to uncover and understand the physics at the Planck scale without having to achieve tremendously high energies. The results reported so far by the physicists at Fermilab, which are based on a particular model of the correlated holographic noise (cHN), are negative. But, as with all scientific research programs, we now know what is incorrect and can move on to new and better models to gauge the cHN.

Curiously enough, the arXiv papers I consulted to study these models are not peer reviewed and contain several ambiguities (broken Lorentz invariance, for example). I am not really sure what to make of this, but again, these are just my personal views and I would follow this research only if the articles get published in a good peer reviewed journal.


Holographic duals of the twisted supersymmetric theories

The winter breaks are essentially the “slingshots” which provide exponential growth to my knowledge-base. There is nothing like sipping coffee while staring at my digital paper and thinking about how the universe works at various length-scales (especially with no semester pressure and coursework!). My research in String Theory has exposed me to several elegant “candidate ways” which describe the working of nature, and I aim to explain one of them in this blog-post. Please note that I will use jargon frequently enough to bore a sane layperson (and most physics majors!) but non-rigorously enough to annoy a decent mathematician. Clearly, the aim of my graduate career is to rectify these drawbacks and explain physics in a way which is fun without losing the mathematical rigor.

Now, there are certain quantum field theories with some extra (symmetry) constraints which provide a lucid way to discover and test the framework of String Theory. These symmetries are

  1. Conformal symmetry
  2. Supersymmetry (SUSY)

I will be focusing on \mathcal{N}=4 Super Yang-Mills (SYM) theory on a d=4 spacetime manifold with the topology given by \mathbb{R}^{1,1}\times\Sigma_2. Here \Sigma_2 is a 2-manifold with generic structure and curvature (for instance it could be a Riemann surface with constant negative curvature). For this theory the spin connection is in a U(1) subgroup of the R-Symmetry group SU(4). Now since \Sigma_2 can be a curved manifold, it can (and will) break the supersymmetries. In order to preserve at least some of them, we need to twist the theory in a specific fashion. Essentially, we couple external SO(6) gauge fields to the R-Symmetry current and identify the spin connection with the gauge connection such that we get covariantly constant spinors on the manifold (there is a more visually appealing picture in the language of branes which I will explain later in the post). In other words, the twist corresponds to the choice of embedding of the U(1) subgroup in SU(4). The aim, then, is to find the holographic gravity duals of these twisted field theories.

Twists preserving (4,4) susy

Here we will consider the twist which corresponds to picking a U(1) subgroup such that we break the R-Symmetry in the following way: SO(6)\rightarrow SO(2)\times SO(4). To see what exactly is happening, consider the spinor field \phi of the SYM with spin s under the SO(2) spin connection on \Sigma_2 and U(1) charge q. Now, the covariant derivative on the manifold is, obviously, \mathcal{D}_\mu\phi=(\partial_\mu+is\omega_\mu+iqA_\mu)\phi. Here \omega_\mu=\epsilon_{ab}\omega^{ab}_\mu/2. Now if the metric on \Sigma_2 is ds^2=e^{2h}(dx^2+dy^2), the spin connection can be computed, and once we identify the U(1) gauge connection with the spin connection, the constraint s=-q makes the two connection terms cancel. This gives us the “covariantly constant” spinors which, now, are essentially scalars. We have twisted the field theory by fixing the spin of the fields!
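The cancellation behind the twist can be checked numerically. The sketch below assumes the spin connection \omega=(\partial_y h)dx-(\partial_x h)dy for the conformally flat metric (the overall sign depends on orientation conventions that are not fixed above) and uses the upper half-plane, h=-\log y, as an example of a surface with constant negative curvature. The function names and the sample point are hypothetical choices for illustration.

```python
import math


def spin_connection(h, x, y, eps=1e-5):
    """Components (omega_x, omega_y) for ds^2 = exp(2h) (dx^2 + dy^2).

    Uses omega = (d_y h) dx - (d_x h) dy; the overall sign depends on
    orientation conventions, which are an assumption here.
    """
    h_x = (h(x + eps, y) - h(x - eps, y)) / (2 * eps)
    h_y = (h(x, y + eps) - h(x, y - eps)) / (2 * eps)
    return h_y, -h_x


def gauss_curvature(h, x, y, eps=1e-3):
    """K = -exp(-2h) * Laplacian(h), via central finite differences."""
    lap = (h(x + eps, y) + h(x - eps, y) + h(x, y + eps) + h(x, y - eps)
           - 4 * h(x, y)) / eps ** 2
    return -math.exp(-2 * h(x, y)) * lap


def connection_term(s, q, omega, gauge):
    """The non-derivative part s*omega_mu + q*A_mu of D_mu."""
    return tuple(s * w + q * a for w, a in zip(omega, gauge))


# Example: upper half-plane, h = -log(y), a surface of constant curvature -1.
def h_hyperbolic(x, y):
    return -math.log(y)


omega = spin_connection(h_hyperbolic, 0.3, 2.0)
gauge = omega                      # the twist: identify A_mu with omega_mu
print("curvature:", gauss_curvature(h_hyperbolic, 0.3, 2.0))
print("twisted (s=-q):", connection_term(0.5, -0.5, omega, gauge))
print("untwisted (q=0):", connection_term(0.5, 0.0, omega, gauge))
```

With s=-q the connection term vanishes even though the surface is curved, so the field is acted on by an ordinary derivative; with q=0 the spin connection survives and a constant spinor is no longer covariantly constant.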

Essentially, the symmetry group (associated with the \mathbb{R}^{1,1}\times\Sigma_2), SO(1,3)\times SO(6) (corresponding to the tangent bundle and the normal bundle) is decomposed as SO(1,1)\times SO(2)_{\Sigma_2}\times U(1)\times SU(2)_L \times SU(2)_R. This corresponds to having (4,4) susy in the theory.

Brane realization through an example:

Consider a manifold \mathbb{R}^6\times K3 with D3 branes wrapping some holomorphic curve (Riemann surface) in K3. In the field theory limit, we obtain the gauge theory mentioned above. The transverse \mathbb{R}^6 direction, after the twist, will have the SO(4)=SU(2)_L \times SU(2)_R rotational symmetry. Now I will make a statement without showing any mathematics, because it is tedious, but it is important for my research. When we consider the low energy limit, compared to the size of Riemann surface, then we get a 2 dimensional effective theory in IR which now becomes (4,4) SCFT! 

Lagrangian description:

Let us write down the Lagrangian for the partially twisted theory which will enable us to find the gravity dual. Since we twisted the theory by coupling it to the spin connection on \Sigma_2, we expect extra terms in the Lagrangian coming from the covariant derivatives along that direction or from fields which are charged under the U(1) part of the normal bundle with non-zero gauge connection.

Consider the Lagrangian with two twisted scalar fields given by \mathcal{Z}=X^1+iX^2. The action looks like S=\int Tr(|D_z\mathcal{Z}|^2+|D_{\bar{z}}\mathcal{Z}|^2+\frac{1}{4}R|\mathcal{Z}|^2).

Supergravity (SUGRA) duals of the field theories

Maldacena’s conjecture that SYM theories are dual to supergravity theories on AdS_5\times S^5 gives us a good starting point. Since we are dealing with the deformed SYM (defined on \mathbb{R}^{1,1}\times\Sigma_2 with coupling to SO(6) gauge fields), the information gets translated into the boundary conditions in the dual gravity theory.

We start with a reasonable ansatz for AdS_5 which at the boundary behaves like ds^2\sim \frac{-dt^2+dz^2+dr^2+e^{2h}(dx^2+dy^2)}{r^2}. We also impose that the SO(6) gauge fields in AdS_5 asymptote to the field theory gauge fields. This means that the components of the AdS_5\times S^5 metric with one index in AdS_5 and the other in S^5 behave as g_{\mu\phi}\sim A_\mu near the boundary.

Now a non-trivial condition is that we should turn on an operator in the 20 of SO(6). Since we can easily see the coupling to the curvature in the action above, we get the ansatz for the operator as \mathcal{O}=Tr(\frac{2}{3}|\mathcal{Z}|^2-\frac{1}{3}(\phi_1^2+\ldots+\phi_4^2)).

Fortunately operators that are turned on correspond to the fields in the 5d gauged supergravity multiplet. Now people have already worked out such theory with \mathcal{N}=8 SUGRA and we will start with that!

Since the connection we started off with is U(1), we can start with the U(1) truncations of the supergravity. Hence the data with which we start off is

  1. a 5d metric
  2. a scalar field
  3. a U(1) gauge field

To find the supersymmetric solutions, we start with the fermionic supersymmetric variations which give the constraint equations. Once these equations are solved, we get the complete dual supergravity theory to the twisted SYM theory. To see the explicit calculations head over to page 8 of


Determining the causal structure of spacetime (I)

In this series of blog posts, I aim to explain the causal structure of the spacetime solutions in general relativity. Currently, I am working on a special extension of Einstein’s theory (general relativity) which is known as Horndeski theory. There, I am trying to find the causal structure of the allowed solutions which allegedly permit superluminal propagation of metric perturbations. The methodology to obtain the structure is similar for all gravitational theories and I wish to demonstrate it for general relativity (which is my comfort zone).

In order to make observable predictions from a consistent physical theory, we are interested in finding how the degrees of freedom evolve and behave. For that, we try to obtain/formulate equations of motion, which capture the physics at the infinitesimal scale. Once we embed the physics in these equations, we solve them and make verifiable predictions. For instance, in Newtonian mechanics, we study a point particle moving in one dimension x. We then ask how this degree of freedom behaves or evolves in time (which, again, is an assumption). For such a theory, we have the equation of motion F=ma=m\frac{d^2 x(t)}{dt^2} which contains the spectacular physical insight from Newton (I won’t even try to elaborate on that because it deserves a separate blog post). Now this is a linear second order differential equation which requires two initial conditions. In other words, if you give me the initial position x_0 and the initial velocity u, I can tell you the entire future of x(t) by solving the differential equation. In fact, for constant F, we get x(t)=ut+\frac{1}{2}\frac{F}{m}t^2+x_0
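The idea of evolving initial data with the equation of motion can be tried out directly. A minimal sketch, assuming a constant force and made-up numbers: the initial data (x_0, u) is stepped forward in small time increments using only a=F/m, and the result is compared with the closed-form solution quoted above. The function names are hypothetical.

```python
def closed_form(t, x0, u, force, mass):
    """x(t) = u*t + (F/m)*t**2/2 + x0 for constant force."""
    return u * t + 0.5 * (force / mass) * t ** 2 + x0


def evolve(x0, u, force, mass, t_final, steps=1000):
    """Step the initial data (x0, u) forward using a = F/m at each step."""
    dt = t_final / steps
    a = force / mass
    x, v = x0, u
    for _ in range(steps):
        x += v * dt + 0.5 * a * dt ** 2   # position update over one step
        v += a * dt                       # velocity update over one step
    return x


# Illustrative numbers only.
x_numeric = evolve(x0=1.0, u=2.0, force=3.0, mass=1.5, t_final=4.0)
x_exact = closed_form(4.0, x0=1.0, u=2.0, force=3.0, mass=1.5)
print(x_numeric, x_exact)
```

Two numbers on the initial time slice, position and velocity, are enough to fix x(t) at any later time, which is the one-dimensional version of the Cauchy problem discussed in this post.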

Another way to look at it: equation(s) of motion (of a theory) give us the prescription to evolve the initial data (known values of the degrees of freedom) to final data that we are interested in.

Now Einstein’s gravity theory has the components of the metric as the degrees of freedom. We denote them by g_{\mu\nu}. Another radical difference (common to all relativistic theories) is that space and time are unified into a single parameter space called the spacetime manifold (inheriting all the structures of pseudo-Riemannian manifolds). So here we ask our favorite question: how do the g_{\mu\nu} evolve in spacetime? Mathematically, we want to know g_{\mu\nu}(x), where x is the representative of the spacetime coordinates from now on.

This time, we use the insight from Einstein to write the equation of motion for general relativity (in vacuum) as

R_{\mu\nu}-\frac{1}{2}R\,g_{\mu\nu}=0


where the symbols have their usual meaning. Here we are working on the 4 dimensional pseudo-Riemannian manifold (\mathcal{M},g) on which g is to be determined by the equation of motion. We define a Cauchy surface \mathcal{C}, which is a codimension-1 surface in \mathcal{M} on which the initial data for Einstein’s equations is defined. So what does this initial data comprise? (We will take a small detour and return to the topic in the next blog post)

Helmholtz wave

In order to understand that, we start with a simple example. Consider a 2+1 manifold as shown. Now let us consider a wave operator on this manifold given by \hat{L}=-\frac{\partial^2}{\partial t^2}+\frac{\partial^2}{\partial x^2}+\frac{\partial^2}{\partial y^2}. For a scalar degree of freedom denoted by u the operator \hat{L} acts on it to generate a wave equation \hat{L}u=0. Here we are not interested in solving the linear second-order partial differential equation. We want to deduce the properties of the wave, like its speed of propagation and the extent to which a disturbance/fluctuation in u can propagate.

Let us say that we know the value of u=u(x,y) on \phi(x,y,t)=t=0 surface (where we might have a specified source or some boundary condition). The surface is essentially a codimension-1 surface and we will call the coordinates x,y as internal coordinates (w.r.t the surface). Thus we can now easily compute the internal derivatives (from u(x,y)) and denote them by u_i=\partial_iu(x,y) where i\in (x,y). Let the exterior/normal derivative to the surface be u_\phi (x,y)=\eta(du,d\phi)=-\partial_tu (note that d\phi = dt on \mathcal{C}).

In this example, we have a Cauchy surface \mathcal{C} defined by \phi =t with the data

  • u(x,y)
  • internal derivative u_i(x,y)
  • external derivative u_\phi(x,y)

given on \mathcal{C}. Clearly the second derivatives of  u except u_{\phi\phi} can be computed from the given data. And here we use the wave equation (or equation of motion in general) to find the missing second derivative of u. The co-normal to the surface \phi=0 is given by d\phi which we can easily represent in the natural co-normal basis as d\phi=\phi_\mu dx^\mu. And then we perform the coordinate transformation to chart \lambda^\mu such that \lambda^0=\phi. Note for this particular case d\phi \parallel dt and the new coordinate chart is exactly equal to x^\mu chart. This is because \mathcal{C} is already perpendicular to the time coordinate.

It is not very difficult to show that the wave equation \hat{L}u=0, under the above transformation, converts into u_{\phi\phi}Q(\phi_\mu)+\ldots=0 where Q(\phi_\mu)=\eta^{\mu\nu}\phi_\mu\phi_\nu and we call it the characteristic form. Now there are two situations

  1. Q\neq 0: in this case, we can invert the equation u_{\phi\phi}Q(\phi_\mu)+\ldots=0 and find the second derivative of u in the direction normal to the surface. With that information we can easily evolve the data further in time.
  2. Q=0: well, we can’t invert the equation which basically implies that there is no unique evolution of u beyond that surface (which is now called the characteristic surface).

In our example, the characteristic form on the surface t=0 is Q=\eta^{tt}=-1 (just look at the coefficient of the second time derivative in \hat{L}), which is non-zero. Hence the Cauchy surface, the entire x,y plane, is not a characteristic hypersurface for the degree of freedom u obeying the wave equation \hat{L}u=0, and the data on it evolves uniquely. This is not surprising if you think about electromagnetic waves, which obey the same wave equation we wrote above and evolve uniquely from data given at an initial time.
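The characteristic form is easy to evaluate for any candidate surface. A minimal sketch, assuming the inverse metric \eta^{\mu\nu}=diag(-1,1,1) read off from the signs in \hat{L}; the function names are hypothetical. With these signs the surface t=const gives Q=-1, which is non-zero, while a null surface such as t-x=const gives Q=0.

```python
# Inverse metric eta^{mu nu} = diag(-1, +1, +1) in coordinates (t, x, y).
ETA_INV = (-1.0, 1.0, 1.0)


def characteristic_form(dphi):
    """Q = eta^{mu nu} phi_mu phi_nu for the covector dphi = (phi_t, phi_x, phi_y)."""
    return sum(e * p * p for e, p in zip(ETA_INV, dphi))


def is_characteristic(dphi, tol=1e-12):
    """True when Q vanishes, so u_{phi phi} cannot be solved for."""
    return abs(characteristic_form(dphi)) < tol


print(characteristic_form((1.0, 0.0, 0.0)))   # surface t = const
print(characteristic_form((1.0, -1.0, 0.0)))  # surface t - x = const
print(is_characteristic((1.0, -1.0, 0.0)))
```

Data given on t=const can therefore be evolved, whereas data given on a null surface does not determine the second normal derivative of u.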

Moving on, we obtained an equation for a surface in new coordinates \lambda^\mu given by Q(\phi_\mu)=0. Physically, this is the surface beyond which we can not evolve the degrees of freedom uniquely (as the second derivative is not uniquely determined). Now we need a mechanism to generate this surface.

Bicharacteristic curves

First we define the bicharacteristic curves  or rays which are related to the linear second order partial differential operator \hat{L}, where we now generalize it to \hat{L}[u]=a^{\mu\nu}u_{\mu\nu}+d where u_{\mu\nu}=\partial_\mu\partial_\nu u. It is not difficult to check that in this case the characteristic form becomes Q=a^{\mu\nu}\phi_\mu\phi_\nu. For our Helmholtz wave, we get the form Q=-\phi_t^2+\phi_x^2+\phi_y^2.

Now the bicharacteristic curves are generated by the ordinary differential equations, with a parameter s, given by

\frac{dx^\mu}{ds}=\frac{1}{2}\frac{\partial Q}{\partial \phi_\mu} and \frac{d\phi_\mu}{ds}=-\frac{1}{2}\frac{\partial Q}{\partial x^\mu}

For our Helmholtz wave, one can easily solve them and find the solutions x(t)=at+b, y(t)=ct+d, where the condition Q=0 fixes a^2+c^2=1. They actually form the rays of a cone (with appropriate boundary conditions) traveling with speed 1.
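The rays can be written down explicitly because Q does not depend on the coordinates, so the \phi_\mu stay constant along each curve. The sketch below assumes rays starting at the origin and picks an arbitrary null covector (Q=0); the names and the sample angle are hypothetical.

```python
import math


def ray(phi_t, phi_x, phi_y, s):
    """Point (t, x, y) reached at parameter s, starting from the origin.

    For Q = -phi_t**2 + phi_x**2 + phi_y**2 the ray equations give
    dt/ds = -phi_t, dx/ds = phi_x, dy/ds = phi_y, with phi_mu constant
    because Q does not depend on the coordinates.
    """
    return (-phi_t * s, phi_x * s, phi_y * s)


def speed(phi_t, phi_x, phi_y):
    """Coordinate speed sqrt(dx^2 + dy^2)/|dt| along the ray."""
    return math.hypot(phi_x, phi_y) / abs(phi_t)


# A null covector (Q = 0): phi_t = -1 and a unit spatial direction.
angle = 0.7                                   # arbitrary illustrative value
phi = (-1.0, math.cos(angle), math.sin(angle))
t, x, y = ray(*phi, s=2.0)
print("point on the ray:", (t, x, y))
print("speed:", speed(*phi))
print("x^2 + y^2 - t^2 =", x * x + y * y - t * t)
```

Sweeping the angle from 0 to 2\pi traces out the full cone x^2+y^2=t^2, and every ray on it moves at unit speed.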

We have shown that if we introduce some perturbations at some point in Minkowski manifold, those perturbations will travel at unit speed and won’t escape the cone. This essentially exhibits the causal structure of flat Lorentzian spacetime which is in concurrence with the wave equation.



BMS on my mind

I thought that I knew the Minkowski spacetime solution fairly well. But recently, to my surprise, I found that there is much more physics in that solution, especially at null infinity. There is a set of symmetry transformations acting asymptotically at null infinity which preserve some boundary conditions. This group of symmetries is known as the BMS (Bondi-Metzner-Sachs) symmetry group. I will make more precise statements below in this post. Another, more surprising fact I found is that general relativity may not be a truly diffeomorphism invariant theory. In fact these BMS transformations map one asymptotically flat spacetime solution of the constraints to another, physically inequivalent, asymptotically flat solution. But that is a topic for a later post (when I am older and wiser).

Before progressing further, let me try to explain why physicists are interested in this symmetry group. In the well established Standard Model, the particles are the unitary irreducible representations of the isometry group of the flat Minkowski spacetime (known as the Poincaré Group), and, that of internal symmetries. For curved and dynamical spacetime, a part of the isometry group breaks down and the rest gets gauged (I plan to write a blog post describing this phenomenon in the near future). The Standard Model is formulated within the framework of Quantum Field Theory which works only on the flat spacetimes or curved spacetimes with fixed geometry or curved dynamical spacetimes with classical graviton.  Gravity on the other hand is the theory of the dynamics of the spacetime. And, therefore, the Standard Model can not include the gravitons, the quantized form of gravity (like photons are quantized form of electromagnetic fields).

Now, this is the most interesting part, if we study physics on the asymptotically flat dynamical spacetime solutions, then we do have the asymptotic symmetry group of this gravitational space (it is essentially BMS). And the irreducible representation of this group will give the usual particles and some extra multiplets. These extra multiplets are termed as soft gravitons which have gained much attention recently in the quantum gravity community.

Again, in the hope of using these notes for my future reference, and to save a great fraction of my energy, I will take the liberty to be mathematically intense. But, I will try to maintain the rigor of topology, differential geometry and group theory so that my mathematician friends don’t get annoyed.

Now there are two ways to fish out the BMS group of Minkowski space. One is by Sachs and the other is by Penrose. Let us start with the first one.

Consider a normal-hyperbolic Riemannian manifold (\mathcal{M},g_{\mu\nu}) and a chart (u(=t-r),r,\theta,\phi) in the neighborhood of a point P with the following properties

  1. the hypersurfaces u=\text{constant} are tangent to the local lightcone everywhere.
  2. r is the corresponding luminosity distance.
  3. the scalars  \theta, \phi are constant along each ray defined by the tangent vector k^a=- g^{ab}\partial_b u.

For such a manifold \mathcal{M}, the line element ansatz can be written as ds^2=\frac{V}{r}e^{2\beta}du^2-2e^{2\beta}dudr+r^2h_{AB}(dx^A-U^Adu)(dx^B-U^Bdu), where A,B \in (\theta, \phi); V, \beta, U^A and h_{AB} are functions of the coordinates; and the determinant of h_{AB} is denoted by b.

After feeding this ansatz into the Einstein’s field equations one obtains the following asymptotic behavior of the functions

  •  V=-r+2M+\mathcal{O}(\frac{1}{r})
  • \beta=\frac{cc^*}{(2r)^2}+\mathcal{O}(\frac{1}{r^4})
  • h_{AB}dx^Adx^B=(d\theta^2+\sin^2\theta d\phi^2)+\mathcal{O}(\frac{1}{r})
  • U^A=\mathcal{O}(\frac{1}{r^2})

A spacetime is said to be asymptotically flat if

  1. there exists a chart u,r,\theta,\phi with the properties mentioned above
  2. the line element equation and the asymptotic behavior mentioned above hold

Also at large r the line element \lim_{r\to\infty}{ds^2}=-du^2-2dudr+r^2d\Omega^2, which is what we would expect.

At this point Bondi and Metzner studied the set of all coordinate transformations which preserve the line element ds^2 and the asymptotic behavior of the functions. Their considerations were subsequently generalized and the following group was obtained.

So consider the spacetime smoothly covered by the coordinates 0\leq r<\infty, 0\leq\theta\leq\pi, 0\leq\phi\leq 2\pi \text{ and} -\infty<u<\infty. The BMS transformations are given by (\alpha, \Lambda)

  • \bar{u}=(K_\Lambda(x))^{-1}(u+\alpha(x))+\mathcal{O}(\frac{1}{r})
  • \bar{r}=K_\Lambda(x)r+J(x,u)+\mathcal{O}(\frac{1}{r})
  • \bar{\theta}=(\Lambda x)_\theta+H_\theta(r,u)r^{-1}+\mathcal{O}(\frac{1}{r})
  • \bar{\phi}=(\Lambda x)_\phi+H_\phi(r,u)r^{-1}+\mathcal{O}(\frac{1}{r})

Here x is the coordinate on S^2 given by (\theta,\phi), \Lambda is the Lorentz transformation acting as a conformal transformation on S^2, and K_\Lambda(x) is the corresponding conformal factor. \alpha is a scalar function on S^2 related to the supertranslation subgroup. The rest of the functions are uniquely determined by imposing the following composition rules

  • (\alpha_1,\Lambda_1)(\alpha_2,\Lambda_2)=(\alpha_1+\Lambda_1\alpha_2,\Lambda_1\Lambda_2)
  • (\Lambda_1\alpha_2)(x)=(K_{\Lambda_1}(x))^{-1}\alpha_2(\Lambda_1^{-1}x)

One can immediately notice the semi-direct product structure of the BMS group here. B=N\ltimes L where N is the infinite dimensional group of supertranslations and L is the connected component of the homogeneous Lorentz group.
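The semi-direct product rule can be tried out in a simple special case. The sketch below is a toy: it restricts \Lambda to rotations about the z-axis, which are isometries of S^2, so the conformal factor K is 1 and the action on \alpha reduces to a shift in \phi. A general Lorentz transformation would need the non-trivial K_\Lambda(x). The three group elements are made up, and the function names are hypothetical. The check is that the composition law is associative.

```python
import math


def act(angle, alpha):
    """(Lambda alpha)(theta, phi) = alpha(Lambda^{-1} x) for a rotation about z.

    Rotations are isometries of S^2, so the conformal factor K is 1.
    """
    return lambda theta, phi: alpha(theta, phi - angle)


def compose(g1, g2):
    """(a1, L1)(a2, L2) = (a1 + L1 a2, L1 L2), with L stored as an angle."""
    alpha1, angle1 = g1
    alpha2, angle2 = g2
    shifted = act(angle1, alpha2)
    return (lambda theta, phi: alpha1(theta, phi) + shifted(theta, phi),
            angle1 + angle2)


# Three made-up elements: a supertranslation function on S^2 plus a rotation angle.
g1 = (lambda th, ph: math.cos(th), 0.3)
g2 = (lambda th, ph: math.sin(th) * math.cos(ph), 1.1)
g3 = (lambda th, ph: math.sin(th) ** 2 * math.sin(2 * ph), -0.4)

left = compose(compose(g1, g2), g3)
right = compose(g1, compose(g2, g3))
point = (0.9, 2.0)
print(left[0](*point), right[0](*point))
print(left[1], right[1])
```

Both ways of bracketing give the same supertranslation function and the same rotation (up to rounding), as they must for a group. Note that the rotation acts on the supertranslation of the second factor, which is what makes the product semi-direct instead of direct.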

Penrose’s derivation coming soon!


The Twin Paradox (III)

This blog post is a sequel of the blog posts 1 and 2.

Special relativity is a very simple and elegant theory of symmetry (ignore the last word if you are not familiar with it). Sometimes naive thinking leads to apparent contradictions in relativistic physics. For instance, consider a tunnel and a train of the same proper length L_0 with some constant relative velocity between them. There is an observer sitting outside the tunnel (observer A) and one sitting in the train (observer B). Observer A will see the train a little shorter than the tunnel, and observer B will see the tunnel a little shorter than the train (relativistic length contraction). Let the tunnel be a special one with two doors, one at the front (through which the train enters the tunnel) and one at the end, capable of shutting simultaneously. Observer A controls the doors. Since the train is shorter than the tunnel according to A, she decides to shut both doors at the moment she sees the train completely inside the tunnel. On the other hand, the observer sitting in the train will deny that the train can be captured by the tunnel, as the train is longer than the tunnel according to her. But relativity says that physics is the same in every inertial frame, therefore both observers must concur on the facts (whether the train passes or gets captured).

The paradox is resolved by the fact that the shutting of the doors is simultaneous in the coordinate chart of observer A but not in the frame of observer B (as we saw in the previous blog post, the notion of simultaneity is frame dependent). Observer B sees the door at the end shut first, which stops the train instantaneously (infinite acceleration), and then the door at the front close. Therefore both observers report the same fact, namely that the tunnel captures the train. Here we emphasize again that an event may not be mapped to the same coordinates in different coordinate charts. To map the coordinates of an event from one inertial coordinate system to another, we must use the Lorentz transformation.

Another common paradox arises as follows. Consider two twins, Alice and Bob, living in an inertial coordinate system at the same location. Alice starts traveling in a spaceship, moving with a constant relative speed, say 0.6 meters per light second with respect to Bob along the x-axis, for some time (in the frame of observer B) and then comes back to Bob with the same relative speed. Now, naively, one can say that according to Bob, Alice’s time runs slower than his, hence when Alice returns, he must have aged more. According to Alice, Bob’s time runs slower (moving clocks are slower for Alice too!) and therefore Alice should have aged more when she returns. But both Alice and Bob should agree on the same fact, and here lies the paradox which everyone comes across when studying relativity for the first time.

We will attack this problem from the perspective of the naive thinking which is based on the assumption that the physical phenomenon is symmetric from the point of view of Alice and Bob. So first let us think how many coordinate charts we require to study the problem?

No, not two but three. In first coordinate chart (of observer B) Bob is at rest at some position (from where Alice starts her journey). In second coordinate chart (of observer A), Alice is at rest at some position when going away from Bob and in third coordinate chart (of observer A’), Alice is at rest at some position when returning back towards Bob. Also note that Alice and Bob are not observers according to our definition in the blog post. So it is Alice who “jumps” from one coordinate chart to another whereas Bob remains in the same frame of reference. Thus the physical phenomenon is not symmetric from the point of view of Alice and Bob.

The word “jumps” has been used because we assume that she accelerated infinitely when changing her direction. Now the postulates of special relativity being valid for the inertial frames doesn’t mean that we can’t study accelerated objects. We can perfectly study the accelerated objects by using the notion of comoving frames. In our example, second and third frames are the comoving frames for Alice. Similarly one can study finitely accelerating objects by considering a series of jumps (of the system) from one comoving frame to another.

Consider first the Minkowski graph of observer B (tracking the coordinate x is not necessary). Different colors represent the time coordinate axes of different frames. In this graph, the green line is the world line of Bob, and the world line of Alice is represented by the violet line and the red line. One can clearly see that Bob ages by 30 light seconds and Alice ages by 24 light seconds.

Now let us look at the graph of observer A.
Here one can observe that in the coordinate chart of observer A, Alice ages faster than Bob for the first half of her trip. For the second half, she ages much slower than Bob (who ages 30 light seconds), such that at the end of the journey Alice has aged less (again 24 light seconds) than Bob.

And finally we see the graph of observer A’.

From these three graphs we see that all the observers (A, A’ and B) agree that Bob has aged more than Alice in this trip, by 6 light seconds. It means all the inertial observers observed the same physical phenomenon, as required by the first postulate of special relativity, thus resolving the twin paradox.
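The numbers read off the three graphs can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: units with c = 1, Alice's turnaround placed at t = 15 light seconds and x = 9 meters in Bob's chart (inferred from Bob's 30 light seconds and the speed 0.6, not stated explicitly above), and hypothetical helper names `boost`, `proper_time` and `ages`. It maps the three key events into each chart with the Lorentz transformation and adds up the proper time along each world line.

```python
import math

V = 0.6  # Alice's speed relative to Bob, in units where c = 1

def gamma(v):
    return 1.0 / math.sqrt(1.0 - v * v)

def boost(event, v):
    """Map an event (t, x) to the chart of an observer moving with velocity v."""
    t, x = event
    g = gamma(v)
    return (g * (t - v * x), g * (x - v * t))

def proper_time(e1, e2):
    """Time elapsed on a clock moving uniformly from event e1 to event e2."""
    dt = e2[0] - e1[0]
    dx = e2[1] - e1[1]
    return math.sqrt(dt * dt - dx * dx)

# Events in Bob's chart (observer B): departure, turnaround, reunion.
# The turnaround coordinates are an assumption inferred from the text.
departure, turnaround, reunion = (0.0, 0.0), (15.0, 9.0), (30.0, 0.0)

def ages(v):
    """Return (Bob's ageing, Alice's ageing) computed in the chart with velocity v."""
    d, t, r = (boost(e, v) for e in (departure, turnaround, reunion))
    bob = proper_time(d, r)
    alice = proper_time(d, t) + proper_time(t, r)
    return bob, alice

for name, v in (("B", 0.0), ("A", V), ("A'", -V)):
    bob, alice = ages(v)
    print(f"chart {name}: Bob ages {bob:.2f}, Alice ages {alice:.2f}")
```

Every chart should report roughly 30 for Bob and 24 for Alice, even though the coordinates of the turnaround and reunion events differ from chart to chart.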


Special relativity in Minkowski graph (II)

In the previous blog post we noted that the Lorentz transformation is the map from one inertial coordinate system to another. This means that the coordinates of a person standing on the platform can be mapped to the coordinates of the person sitting in the train using the transformation, which results from special relativity postulates, and quantitative analysis of the physical phenomenon can be done in both the frames.

We also note that these inertial frames (coordinate systems) are same in nature (that is the reason why we used the same constant c in both the coordinate systems). In other words, no inertial frame is to be given any sort of preference. If you like, you can think a 4 dimensional grid of space and time built from the 4 dimensional version of a unit cube. The 2 dimensional version of this grid is shown in the figure of coordinate chart below.

Now this grid must be exactly same for all the coordinate systems (henceforth, we will talk of inertial coordinate systems only). Therefore the grids in coordinate charts 1, 2 and 3 (of different inertial observers) are congruent. All that differs is the mapping of same spacetime events. For instance the green dots represent the events of aging of Bob. Now this set of spacetime events is mapped differently in different frames as shown.

More precisely, one meter “of” a coordinate system should be equal to one meter “of” any other coordinate system and so must be the case with one second (I am really not sure which preposition to use here, but I think “of” should convey the message). These units of measure, which define the grid size, are the same for all the frames and are termed “proper length” and “proper time”. However, a rod (a set of special spacetime events defined later) of one meter in one frame might not be one meter long in another coordinate system. In fact the Lorentz transformation makes sure that the length is contracted by a factor of \gamma = 1/\sqrt{1-\frac{v^2}{c^2}} (we will also explain this phenomenon later in this post).

The German mathematician Hermann Minkowski utilized the property of coordinate charts (maps from manifold to \mathbb{R}^4) in making a useful geometrical tool called Minkowski graph. The idea is that on a graph paper we have two dimensions at our disposal. And, the Lorentz transformation shows that if relative velocity between two frames is along x-axis then the map of the coordinates y and z to another coordinate system is identity. Therefore we consider the map from the events in the spacetime manifold to \mathbb{R}^2 space (our graph paper) which includes only two coordinates x and t as shown in the figure.


In this coordinate system (frame of reference), the event A has been assigned the coordinates (5,3). Also note that the time axis has been divided by the constant c and, here, we define one light second as the time taken by light to travel one meter (or inverse of c).

Now each point in this graph (coordinate system) represents an event in the spacetime manifold. As the consequence, the evolution of a particle in the spacetime is mapped to a trajectory in the graph (which is drawn by the observer at the origin). This trajectory is called the “world line” of the particle. On carefully examining the transformations, one can note that when the relative speed between two frames is greater than the constant c (which we now set to unity) then the factor \gamma becomes imaginary. Furthermore, the factor approaches infinity when the relative speed approaches unity. In relativistic dynamics the total energy of a free massive particle is directly proportional to \gamma. Hence it would require infinite amount of energy to accelerate a particle near the speed of light.

The slope of the trajectory in the Minkowski graph is the inverse of the speed of particle with respect to the observer (who is drawing the graph). Therefore the slope of the trajectory is always greater than unity in the graph because a massive particle cannot be accelerated to the speed of light. A light ray always follows the trajectory of straight line with unity slope. The Lorentz transformation makes sure that the trajectory of the light remains same in all the graphs, corresponding to various coordinate systems.

The Minkowski graph is quite helpful in comparing the physics from two frames. Consider two observers F and F’ moving with some relative speed.


Observer F uses her coordinate system to draw a Minkowski graph as shown in the figure above.

Now consider a massive particle at the origin with respect to the observer F’ (i.e x^\prime =0) from time t^\prime = 0 to t^\prime = 1. So how will F draw the world line of the particle in her coordinate system (or Minkowski graph)? As you might have guessed, we will use the Lorentz transformation (because we have to map the information given in F’ to F). In other words we have to find the equation of the particle in coordinates of F subjected to the constraints \Delta x^\prime = 0 and \Delta t^\prime = 1 in the coordinates of F’.

From the Lorentz transformation, the first constraint gives a trajectory equation \Delta t = \frac{\Delta x}{v}, which F will interpret as the time axis of the observer F’ (because the line of constant x-coordinate giving the t-axis and vice versa is a property of the orthogonal coordinate system in the \mathbb{R}^2 space). Therefore F will draw a straight line in her graph with slope equal to the inverse of the relative speed. Let v=0.6 meters per light second. The question that now arises is how she will mark the scale of that time axis in her coordinate system (or graph). The second constraint gives the result \Delta t = \frac{1}{\sqrt{1-v^2}}, which means that a unit second in the coordinates of F’ is mapped to 1.25 seconds in the coordinates of F. It means that two events with a unit time separation at the origin in frame F’ are separated by 1.25 seconds in the frame F. Thus the graph of F will look like


The violet dots correspond to the events with unit time intervals and zero space displacements in F’ (hence it is the time axis for F’) while green dots in vertical line correspond to the events with unit time intervals and zero space displacements in F. Note that in the graph drawn by the observer F, the 15 units of her time is equal to the 12 units of the time in F’. It is consistent with the fact that \Delta t = 1.25\Delta t^\prime. This is why we say that moving clocks are slower.

Similarly, we can draw the x-axis of F’ in the coordinates of F using the same technique. Consider a situation in which \Delta t^\prime = 0 (simultaneous events) and \Delta x^\prime = 1. Using the Lorentz transformation and first constraint we get \Delta t = v\Delta x. This equation gives the x-axis of the observer F’ in the frame of F. Second constraint gives the relation \Delta x = \frac{1}{\sqrt{1-v^2}}. This means that a unit length (between two simultaneous events) in F’ is mapped to 1.25 meters in F. The graph now looks like

The violet line with the smaller slope represents all the simultaneous events of F’ (with t^\prime = 0) in the coordinates of F, where they have different t coordinates. Thus simultaneity is a relative concept and depends on the frame of reference.
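The tick marks on the two violet axes can be checked numerically. The sketch below assumes c = 1 and v = 0.6 as in the text, and uses a hypothetical helper `to_F` that applies the inverse Lorentz transformation, taking coordinates in F’ to coordinates in F.

```python
import math

def gamma(v):
    return 1.0 / math.sqrt(1.0 - v * v)

def to_F(t_prime, x_prime, v):
    """Coordinates (t, x) in F of an event given as (t', x') in F' (c = 1)."""
    g = gamma(v)
    return (g * (t_prime + v * x_prime), g * (x_prime + v * t_prime))

v = 0.6
time_tick = to_F(1.0, 0.0, v)   # one unit along the t' axis (x' = 0)
space_tick = to_F(0.0, 1.0, v)  # one unit along the x' axis (t' = 0)

print("t' tick in F:", time_tick)   # lies on the line t = x / v
print("x' tick in F:", space_tick)  # lies on the line t = v * x
```

The first tick has time coordinate close to 1.25 in F and the second has space coordinate close to 1.25, which is the stretching of the scale described above.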

In the end of this post, we will explain length contraction using the Minkowski graph. The length of a rigid rod is defined as the distance (in \mathbb{R}^n space) between its end points at the same time coordinate (convince yourself!). Consider a rod of 2 meters length at rest in the frame F’. Let its first end point be at (4, t^\prime). Therefore, according to the definition of length, the coordinates of other end point will be (6, t^\prime). Now this rod will trace a “world sheet” in the Minkowski graph as shown in the figure below. The red dots on the lower violet line (the x-axis of F’) are the coordinates of the endpoints of the rod which are, according to F’, (4,0) and (6,0). They evolve in the time t^\prime such that after unit second in F’, the coordinates are (4,1) and (6,1) (the couple of next red points on each red line) in F’. And it should be, because the 2 meter rod is at rest in F’.


Now how much length will the observer F measure? Let us say that at t=8 light seconds she measures the length. Now according to the definition of the length, she will cut the world sheet of the rod with a line (to obtain simultaneous coordinates in her frame) and measure the distance between space coordinates. In this example the coordinates (pointed by the arrows) are 8 and 9.6. Thus the length of the rod is 1.6 meters in the frame F. This is called length contraction and this is why we say moving trains are shorter.
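The two coordinates pointed to by the arrows follow directly from the Lorentz transformation. As a small sketch (assuming c = 1, v = 0.6, and a hypothetical helper name `endpoint_x_in_F`), we solve x^\prime = \gamma (x - v t) for x at the fixed time t = 8 of observer F:

```python
import math

def gamma(v):
    return 1.0 / math.sqrt(1.0 - v * v)

def endpoint_x_in_F(x_prime, t, v):
    """Position in F, at F-time t, of a point at rest at x' in F' (c = 1).

    Obtained by solving x' = gamma * (x - v * t) for x.
    """
    return x_prime / gamma(v) + v * t

v, t = 0.6, 8.0
left_end = endpoint_x_in_F(4.0, t, v)
right_end = endpoint_x_in_F(6.0, t, v)
length_in_F = right_end - left_end

print(left_end, right_end, length_in_F)  # approximately 8, 9.6 and 1.6
```

The measured length does not depend on the choice of t, since both endpoints move with the same velocity in F.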

The main point to keep in mind is that in special relativity an event in spacetime might not have the same set of coordinates in different frames of reference. To map an event from one coordinate chart to another we must always use the Lorentz transformation. If this is followed honestly, then all the paradoxes of special relativity can be resolved. One such paradox is the “twin paradox”, which I will explain in the next post.

If you are wondering how I made these cool Minkowski graphs then head over here.


A modern viewpoint on special relativity (I)

In this series of blog posts, I will explain our current understanding of spacetime using the notions of relativity.

Special relativity originates from two simple principles, but its predictions challenge our day to day experiences. For instance, according to relativity, a person in the train station will see the length of a moving train as a little smaller than that of a stationary train (of the same type). Furthermore, the same person will notice that the clocks hanging in the train tick a little slower than the one on her wrist. These phenomena are termed “length contraction” and “time dilation”. The magnitude of these contractions and dilations depends directly on the relative velocity of the train and the person standing on the platform. Now, commonly, the relative speed of the train is so low that one doesn’t see any appreciable changes. Cosmic particles which travel at very high speeds are known to have longer half lives than their laboratory counterparts. But convincing the audience that relativity is a correct theory of nature is not the purpose of this post. Assuming that relativity is a valid theory, the aim of this post is to describe and discuss the theory.

In my opinion, special relativity is all about the properties of spacetime in a fairly small region and how various observers record the events occurring in that spacetime. The words in italics have precise definitions and meanings in physics, so let us spend some time in understanding them. When you read the word spacetime you might think of it as a three dimensional space with time t as a separate entity running in the background. This is a totally wrong picture of spacetime. If you are a physics major, and have had a course in relativity, you might think of it as a strange mixed space of \mathbb{R}^3 and t with 4 coordinates in which the invariant path length is given by ds^2 = dx^2+dy^2+dz^2-c^2dt^2, where c is the speed of light.

The problem with the last definition is that it doesn’t capture the essence of spacetime as such. It is like explaining to an alien that the person in the red shirt is a boy. Just like the word “boy” is assigned to a specific creature having certain characteristics and is independent of the shirt he wears, the word “spacetime” is assigned to a specific physical entity and is independent of the coordinates assigned to it by the observer (yes, an observer is someone who assigns the coordinates such that her own coordinates are fixed in the coordinate system). Please note that this is not a standard definition of the observer but it is quite helpful in explaining the subject. In relativity, spacetime is a special set, with many interesting properties, which mathematicians call a “Manifold”. And the coordinate system is a map from the spacetime manifold (say the set \mathcal{M}) to the set \mathbb{R}^4. The coordinate chart of an N dimensional manifold is shown in the figure below (courtesy Carroll’s notes)


Thus, when an observer is assigning coordinates x, y, z and t to an event, he/she is basically mapping the events in \mathcal{M} to a \mathbb{R}^4 space. When an observer assigns a coordinate system, he/she defines a frame of reference. So when we say “the frame of reference of the observer A”, we mean the coordinates assigned by A.

In physics we postulate that all the physical phenomena, comprising of events, happen in this spacetime manifold. In order to quantitatively analyze them, we require an observer who can assign the coordinates  to these events. First postulate of relativity says that the physics doesn’t depend on the coordinate systems of the observers moving with constant relative velocities with respect to each other. We call all such observers as “inertial observers”. It simply means that if we do the physics calculations using the coordinates of any inertial observer we will find the same results. This shouldn’t be hard to accept as the real physical phenomena take place in the spacetime manifold and no matter which map (coordinate system) we use for quantitative analysis, we must find the same physical results. Now be careful! this postulate talks about the maps which are inertial. So we have to be careful when working with accelerated bodies.

Vectors are mathematical  objects (maps, to be precise) which, by definition, remain unchanged under the coordinate transformations. In relativity, we have 4 dimensional vectors living in the spacetime manifold which remain same in all the inertial coordinate systems. We call such objects as invariants. In fact there is a class of invariants called “Tensors” and we assign all sorts of physical objects (energy-momentum, electromagnetic field etc) to the tensors as they, too, don’t depend on the frame of reference. Now events, which are also physical objects, are assigned to 4 vectors. The distance between two events is called the “path length” and is a very good example of an invariant (actually it is a scalar which is a tensor of rank 0). The second postulate gives us a prescription to make scalars, in terms of what is known as metric. According to this prescription the squared path length is mapped to dx^2+dy^2+dz^2-c^2dt^2 in a coordinate system constructed from x, y, z  and ct . Here c is a constant introduced to match the dimensions of the length squared when multiplied by dt^2 . Simple analysis shows that it must have the dimension of \frac{\text{length}}{\text{time}}. In another coordinate system with the coordinates x^\prime, y^\prime, z^\prime and ct^\prime the squared path length is mapped to dx^{\prime 2}+dy^{\prime 2}+dz^{\prime 2}-c^2dt^{\prime 2}. Note the same constant c has been used to maintain equality of the frames. Now, due to the invariance of path length, dx^2+dy^2+dz^2-c^2dt^2=dx^{\prime 2}+dy^{\prime 2}+dz^{\prime 2}-c^2dt^{\prime 2}. This is the starting point of the derivation of the Lorentz transformation in the standard relativity textbooks (although they arrive at this expression using light rays which is misleading sometimes). The Lorentz transformation is a relation between the coordinate systems of two inertial observers, assigned to the same events in the spacetime manifold \mathcal{M}.

At the end of this post I would like to give the celebrated Lorentz transformations. Consider two observers moving with a constant relative speed along one particular direction (say along the x axis), then the relation between the coordinates of the observers is (image courtesy Wikipedia)
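For reference, the standard textbook form of this boost along the x axis is written out here in LaTeX. The sign convention is an assumption: the primed observer is taken to move with velocity v along the positive x direction relative to the unprimed one.

```latex
\begin{aligned}
t^\prime &= \gamma\left(t - \frac{v x}{c^2}\right),\\
x^\prime &= \gamma\,(x - v t),\\
y^\prime &= y,\\
z^\prime &= z.
\end{aligned}
```

Substituting these expressions into dx^{\prime 2}+dy^{\prime 2}+dz^{\prime 2}-c^2dt^{\prime 2} reproduces dx^2+dy^2+dz^2-c^2dt^2, which is the invariance of the path length discussed above.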


Here \gamma = 1/\sqrt{1-\frac{v^2}{c^2}}. When physicists did the experiments they found that the value of the constant c is nearly 3\times 10^8 meters per second, which is the speed of light. The derivation of these transformations can be found in any standard textbook on relativity or on Wikipedia.


Spacetime odyssey

All physical phenomena we see around us take place on something that we call spacetime. Therefore it becomes important to get acquainted with spacetime and understand how it affects the physical phenomena or gets affected by them. I will chronologically introduce the notions associated with spacetime because, usually, everyone gets to learn it in the exact same fashion.

Until the 20th century, space and time were thought of as different entities. It was thought that there exists a three dimensional Euclidean space which acts as a stage for the physical phenomena. This stage was supposed to be rigid, static and flat (a rigorous definition will be given later). The space was a three dimensional extension of Euclid’s five axioms for plane geometry, which were

  1. Between two points in the space, only one line can be drawn.
  2. A finite line in a space can be extended indefinitely.
  3. A circle with arbitrary radius and center can be drawn in the space.
  4. All right angles are equal to one another.
  5. If a straight line falling on two straight lines make the interior angles on the same side less than two right angles, the two straight lines, if produced indefinitely, meet on that side on which are the angles less than the two right angles.

These axioms were based on human experience (at the time of Euclid) and were considered to be true in every part of the universe. Even now, based on high-school experience, one doesn’t need much convincing about the validity of these postulates.

Let us now define our three-dimensional Euclidean space in a formal way. It is a set of points with a well defined distance between any two points of the set. If you give me two points, say P_1(x_1,y_1,z_1) and P_2(x_2,y_2,z_2), belonging to Euclidean space, I will give you the distance between them by computing d=\sqrt{(x_1-x_2)^2+(y_1-y_2)^2+(z_1-z_2)^2}.

A couple of interesting points can be noted. First, the definition of distance between two points is static, i.e. the way to find the distance doesn’t change with time. We can represent the definition of distance between two points using matrices. Let us represent the points by column matrices P_1 = \left(\begin{array}{c}x_1\\y_1\\z_1\end{array}\right) and P_2 = \left(\begin{array}{c}x_2\\y_2\\z_2\end{array}\right) and define a 3\times 3 matrix \eta = \left(\begin{array}{ccc}  1 & 0 & 0\\  0 & 1 & 0\\  0 & 0 & 1 \end{array}\right). The distance between two points can now be given by d^2 = (P_1-P_2)^T.\eta .(P_1-P_2) (here ‘.’ represents matrix multiplication). Note that the definition of distance depends on the matrix \eta. Physicists and mathematicians call \eta the metric of the space. Now we can say that it is the metric which defines the distance between two points of a space. In fact the metric defines the notion of a ‘dot product’ in a space. Take any vector, write its components in a column matrix and use the matrix multiplication defined above to find the length of the vector.
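The matrix formula for the distance is easy to try out. Here is a minimal Python sketch; the function name `squared_distance` and the two sample points are made up for illustration only.

```python
import math

# The Euclidean metric as a 3x3 matrix (nested lists).
ETA = [[1, 0, 0],
       [0, 1, 0],
       [0, 0, 1]]

def squared_distance(p1, p2, metric=ETA):
    """Compute d^2 = (P1 - P2)^T . metric . (P1 - P2)."""
    diff = [a - b for a, b in zip(p1, p2)]
    return sum(diff[i] * metric[i][j] * diff[j]
               for i in range(3) for j in range(3))

p1 = (1.0, 2.0, 3.0)
p2 = (4.0, 6.0, 3.0)
d = math.sqrt(squared_distance(p1, p2))
print(d)  # 5.0, the same as the usual Pythagorean formula
```

Passing a different matrix as `metric` changes what "distance" means, which is exactly the sense in which the metric defines the geometry.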

The metric of Euclidean space is time independent (built from constant numbers). Although I have used a matrix to represent the metric, generally it should be thought of as a set of numbers given by \eta^{ij} where i,j\in \{1,2,3\}.

Next we note that the metric (of Euclidean space) is independent of space itself. In fact this independence is a direct consequence of Euclid’s axiom 5. This independence implies that the space is flat. I will demonstrate this and provide a formal definition of flatness in a later post.



Particle in a box (infinite square well).

Yes, I am aware of the ennui caused by this topic. We write Schrödinger’s time independent equation, solve it, apply the boundary conditions, find the energy eigenstates, normalize them and everybody is happy. If you are a physics major, then you must have done this countless times. But I always had a problem with this system because of its highly unphysical nature (I will come back to this later). That makes it a mathematical curiosity, and if it is one, then why not explore it to the full extent? Please note that in this post I will use rigorous mathematics and will define terms in proper mathematical language (I hope this post will serve as my notes for future reference). Here one can also appreciate the aesthetics of Linear Algebra.

The question I had been asking myself was, “Why do we always work in position representation? What is wrong in working in momentum representation (for this particular problem)?”. The curiosity to find the answers compelled me to plunge into the depths that are generally not touched and mentioned in physics literature.

The conventional Quantum Mechanics (for this problem) is performed on the Hilbert space L^2 ([0,1], dx), which I will denote by \mathcal{H}. Here L^2 (read L two) represents a function space of square integrable functions and [0,1] represents the range of the parameter x (position). I am interested in working with L^2 (??, dp) and want to obtain the energy eigenstates in terms of the wavefunction \varphi(p). I don’t know the range of the parameter p (momentum). I can’t take it to be the whole real line (because of the bounded nature of position). The Parseval-Plancherel theorem is not applicable here. I must first find the nature of the momentum operator. Then its spectrum might give me an idea about the range of momentum.

Just like a function has a domain, the linear operator on Hilbert space has a domain. The formal definition of the operator is (this definition is for general Hilbert Spaces and not restricted to L two spaces)

An operator on the Hilbert space is a linear map A : \mathcal{D}(A) \to \mathcal{H} such that \psi \mapsto A \psi (where \psi \in \mathcal{D}(A)).

\mathcal{D}(A) represents the linear subspace of \mathcal{H}. This subspace is the domain of the operator. Strictly speaking, Hilbert space operator is the pair (A, \mathcal{D}(A)) which is the specification of the operation with the domain on which the operation is defined. Now two operators are said to be equal if and only if

A \varphi = B \varphi for all \varphi \in \mathcal{D}(A) = \mathcal{D}(B).

Let us consider position operator denoted by Q. The operation is defined as Q \psi (x) = x \psi (x). But given the boundary conditions, the domain is defined as

\mathcal{D}(Q) = \{\psi \in \mathcal{H} \mid Q\psi \in \mathcal{H} \text{ and } \psi(0) = \psi(1) = 0 \}.

As mentioned earlier, the Quantum Mechanics is performed on the Hilbert Space of square integrable functions (denoted by \mathcal{H}), we certainly would not want a mapping which blows up the square integrability. Hence it is important to put the restriction that the output be Q\psi \in \mathcal{H} which translates into ||Q\psi||^2 = \int_{0}^{1} dx x^2|\psi (x)|^2< \infty. So, in words, the domain of Q is the set of all the square integrable functions such that

  • they satisfy boundary conditions
  • the output of linear map should be finitely square integrable

Now let us define momentum operator P. Quantum Mechanics defines the operation of this operator (on a function) as

P\psi = \frac{\hbar}{\iota} \frac{\partial \psi}{\partial x}

Certainly, we would like to have a set of those functions which besides satisfying boundary condition, also have square integrable derivatives. In formal language

\mathcal{D}(P) = \{\psi \in \mathcal{H} \mid \psi^{\prime} \in \mathcal{H} \text{ and } \psi(0) = \psi(1) = 0 \}.

Now let us see the definition of the adjoint of an operator A which means we have to define the operation and domain of the adjoint of operator (let’s denote it by A^{\dagger} ).

\mathcal{D}(A^{\dagger}) = \{\phi \in \mathcal{H} \mid \exists\, \tilde{\phi} \in \mathcal{H} \text{ such that } \langle \phi , A\psi\rangle = \langle \tilde{\phi}, \psi\rangle \ \forall \psi \in \mathcal{D}(A)\}

This definition says that the domain \mathcal{D}(A^{\dagger}) is the set of \phi (square integrable) such that there exists a \tilde{\phi} (again square integrable) which makes the equation \langle \phi , A\psi\rangle = \langle \tilde{\phi}, \psi\rangle true for all the \psi belonging to the domain of the operator A.

Since the equation must hold for every \psi \in \mathcal{D}(A), \tilde{\phi} depends only on A and \phi. The operation of A^{\dagger} is defined as

A^{\dagger} \phi = \tilde{\phi}.

So we now have a full working definition of the adjoint of an operator. Let’s find the adjoint of P. The inner product equation \langle \phi , P\psi\rangle = \langle \tilde{\phi}, \psi\rangle must hold. So

\int_{0}^{1} \overline{P^{\dagger}\phi}(x) \psi (x) dx = \int_{0}^{1} \overline{\phi}(x) \frac{\hbar}{\iota} \frac{\partial \psi}{\partial x}(x) dx .

On integrating the RHS by parts, we get

\int_{0}^{1} \overline{P^{\dagger}\phi}(x) \psi (x) dx = \frac{\hbar}{\iota}\left[\overline{\phi}(x) \psi (x)\right]_{0}^{1} -\frac{\hbar}{\iota} \int_{0}^{1}\overline{\frac{\partial \phi}{\partial x}}(x) \psi (x) dx.

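For this equation to be true for every \psi \in \mathcal{D}(P), the boundary term must vanish and we must have

\overline{P^{\dagger}\phi}(x) = -\frac{\hbar}{\iota}\overline{\frac{\partial \phi}{\partial x}}(x),

which, on taking the complex conjugate of both sides, gives

P^{\dagger}\phi(x) = \frac{\hbar}{\iota}\frac{\partial \phi}{\partial x}(x).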

The most curious and interesting point is that the function \phi(x) doesn’t need to satisfy the boundary condition (of the infinite square well), since the boundary term \left[\overline{\phi}(x) \psi (x)\right]_{0}^{1} = \overline{\phi}(1) \psi (1) - \overline{\phi}(0) \psi (0) is zero as \psi(x) already satisfies the boundary condition. Clearly \phi(x) has no restriction save square integrability (of \phi and \phi^{\prime}), which shows that the set \mathcal{D}(P^{\dagger}) has more elements than the set \mathcal{D}(P). The formal definition of \mathcal{D}(P^{\dagger}) is

\mathcal{D}(P^{\dagger}) = \{\phi \in \mathcal{H} \mid \phi^{\prime} \in \mathcal{H}\}.

It can be seen that \mathcal{D}(P) \subset \mathcal{D}(P^{\dagger}). Thus P^{\dagger} \neq P, i.e. the momentum operator is not self-adjoint. On the other hand it is quite simple to show that Q^{\dagger} = Q, i.e. the position operator is self-adjoint.
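The role of the boundary term can be checked numerically. The following Python sketch is only an illustration under stated assumptions: \hbar is set to 1, the integrals are approximated with Simpson's rule, and the test functions \psi(x) = \sin(\pi x) (which vanishes at both ends) and \phi(x) = e^{(1+2\iota)x} (which does not) are arbitrary choices of mine. The helper names `simpson`, `inner` and `P` are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # assumption: units with hbar = 1

def simpson(f, a=0.0, b=1.0, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def inner(f, g):
    """<f, g> = integral over [0, 1] of conj(f(x)) * g(x)."""
    return simpson(lambda x: complex(f(x)).conjugate() * g(x))

def P(df):
    """Given the derivative df of a function, return x -> (hbar / i) * df(x)."""
    return lambda x: (HBAR / 1j) * df(x)

# psi vanishes at x = 0 and x = 1; phi does not.
psi = lambda x: math.sin(math.pi * x)
dpsi = lambda x: math.pi * math.cos(math.pi * x)
phi = lambda x: cmath.exp((1 + 2j) * x)
dphi = lambda x: (1 + 2j) * cmath.exp((1 + 2j) * x)

# With psi in D(P), the boundary term drops out: <phi, P psi> = <P phi, psi>.
gap_in_domain = abs(inner(phi, P(dpsi)) - inner(P(dphi), psi))

# With phi on both sides, the boundary term (hbar / i) [|phi|^2]_0^1 survives.
boundary_gap = inner(phi, P(dphi)) - inner(P(dphi), phi)
expected_gap = (HBAR / 1j) * (abs(phi(1.0)) ** 2 - abs(phi(0.0)) ** 2)

print(gap_in_domain)               # close to zero
print(boundary_gap, expected_gap)  # these two should agree
```

So the differential expression behaves symmetrically only when the function it acts on obeys the boundary condition, while the other function in the inner product is free, which is the domain mismatch found above.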

Now let us see the definition of Hermitian operators.

The operator A on \mathcal{H} is Hermitian if  \langle \phi , A\psi\rangle = \langle A\phi,\psi\rangle for all \phi, \psi \in \mathcal{D}(A)

Since A operates on both the functions, needless to say, they must belong to the domain of the operator. Let us use the definition of adjoint operators. Operator A^{\dagger} is such that \langle \phi , A\psi\rangle = \langle A^{\dagger}\phi,\psi\rangle which implies \langle A\phi,\psi\rangle = \langle A^{\dagger}\phi,\psi\rangle . Thus if the operation of the operator is same as that of its adjoint, then it is Hermitian operator. The domain of both the operators need not be same.

We have seen that P and P^{\dagger} act in the same way, namely as \frac{\hbar}{\iota}\frac{\partial}{\partial x}. The momentum operator is therefore Hermitian in this case but not self-adjoint. It is easy to see that all self-adjoint operators are Hermitian, but the converse is not true.

Now the spectral theorem of linear operators states that

If the Hilbert space operator A is self-adjoint, then its spectrum is real and the eigenvectors associated to different eigenvalues are mutually orthogonal; moreover, the eigenvectors together with the generalized eigenvectors yield a complete system of (generalized) vectors of the Hilbert space.

This property does not hold for operators that are merely Hermitian.

As the momentum operator here is only Hermitian, its eigenvectors won’t form a complete basis. Thus there is no use in solving this problem in momentum space. Moreover, the L^2 (??, dp) space won’t have continuous momentum (because of the quantization imparted by the boundary conditions).

Normally one assumes that the eigenvectors of an observable form a complete basis. This means that observables must be represented by self-adjoint operators. From the above we note that the momentum operator is not an observable for this problem!

Absurd, isn’t it? Well, the boundary condition itself gives us the hint of the highly non-physical nature of the problem. The “non self-adjointness” of momentum operator further shows us how the infinite potential square well is a mathematical curiosity with tenuous nexus to real physical systems.
