Behavior Trees in Game Artificial Intelligence

Due to circumstances beyond my present control, my career trajectory has taken a sharp turn (quantifiable by a delta function). I hope Dirac would be proud of that! In order to work with the same passion and rigor, I have channeled my energy into Artificial Intelligence (AI) and parted ways with Physics. Of course it was painful and depressing, but I found that even such feelings have utility in catharsis. This blog-post is an attempt to show just that!

The gaming industry has played a pivotal role in reshaping modern networking architecture and graphics rendering (replication, realistic rendering and ray tracing). Therefore it is not unreal to expect the industry to push the AI realm forward as well. Its reach can be gauged from the sheer number of players in games like Fortnite (250 million), Halo and GTA V, among others. Any breakthrough in the field of AI can be conveniently and collectively scrutinized by millions of players, facilitated by a streamlined workflow involving developers, players and academics.

A behavior tree (BT) is an important mathematical structure which generates an appropriate series of tasks in a modular fashion, for instance for a patrolling pawn in some evil fortress. Unreal Engine (UE) is one of the very first game engines to implement BTs in a very natural way (given UE's visual scripting system, called Blueprints). I will demonstrate a BT in action using a UE project in this blog-post.

The BT can be pictured as

behaviortree

Here black nodes represent the “composites” (a form of flow control) and pink nodes represent “tasks”. I have used two categories of composites:

  1. Selector: Executes its children from left to right. It stops traversing the subtrees as soon as one branch executes successfully.
  2. Sequence: Executes its children from left to right. It keeps executing subtrees until one branch fails.

The entire BT is executed top-down in a deterministic way. A next-level implementation could assign a probability to each edge leading to a particular node, but we won’t talk about that here.

If you were to patrol, what list of tasks would you draw up? It would probably include

  1. Spotting enemy
  2. Chasing enemy if spotted
  3. Else perform random patrol in arbitrary directions

The next step is to further divide the tasks into single elemental entities. For instance, the task of spotting the enemy includes checking the actors in the line of sight and spinning towards the appropriate actor if one is found. Thus the hierarchy and placement of composites and tasks should be as shown in the figure above.
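The selector/sequence logic and the patrol task list above can be sketched in a few lines of Python. This is only a minimal illustration, not the Unreal Engine API: the class names, the task names and the dictionary used as a blackboard are all hypothetical, and the "running" status that UE tasks such as “Move To” can report is left out for brevity.

```python
from enum import Enum
from typing import Callable, Dict, List


class Status(Enum):
    SUCCESS = 1
    FAILURE = 2


class Task:
    """Leaf node: wraps a function that reads/writes the blackboard."""

    def __init__(self, name: str, action: Callable[[Dict], Status]):
        self.name = name
        self.action = action

    def tick(self, blackboard: Dict) -> Status:
        # Record the visit so the executed branch can be inspected later.
        blackboard.setdefault("trace", []).append(self.name)
        return self.action(blackboard)


class Selector:
    """Tries children left to right and stops at the first SUCCESS."""

    def __init__(self, children: List):
        self.children = children

    def tick(self, blackboard: Dict) -> Status:
        for child in self.children:
            if child.tick(blackboard) is Status.SUCCESS:
                return Status.SUCCESS
        return Status.FAILURE


class Sequence:
    """Runs children left to right and stops at the first FAILURE."""

    def __init__(self, children: List):
        self.children = children

    def tick(self, blackboard: Dict) -> Status:
        for child in self.children:
            if child.tick(blackboard) is Status.FAILURE:
                return Status.FAILURE
        return Status.SUCCESS


# Hypothetical elemental tasks for the patrolling pawn.
def spot_enemy(bb: Dict) -> Status:
    return Status.SUCCESS if bb.get("enemy_visible") else Status.FAILURE


def rotate_to_enemy(bb: Dict) -> Status:
    bb["facing_enemy"] = True
    return Status.SUCCESS


def chase(bb: Dict) -> Status:
    bb["state"] = "chasing"
    return Status.SUCCESS


def move_to_random_point(bb: Dict) -> Status:
    bb["state"] = "patrolling"
    return Status.SUCCESS


def wait(bb: Dict) -> Status:
    return Status.SUCCESS


def build_patrol_tree() -> Selector:
    chase_branch = Sequence([
        Task("SpotEnemy", spot_enemy),
        Task("RotateToEnemy", rotate_to_enemy),
        Task("Chase", chase),
    ])
    patrol_branch = Sequence([
        Task("MoveToRandomPoint", move_to_random_point),
        Task("Wait", wait),
    ])
    # The root selector falls back to patrolling when the chase branch fails.
    return Selector([chase_branch, patrol_branch])
```

Ticking the tree with `{"enemy_visible": True}` walks the chase branch, while `{"enemy_visible": False}` makes the first sequence fail at its first task, so the selector moves on to the patrol branch.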

The BT in action during the chase is shown below.

Chase

One can clearly visualize the chain of executing branches of the tree. Since Chase Player is a sequence node, we can deduce that the tasks “Rotate to face BB entry” and “BTT_ChasePlayer” have been executed and that the “Move To” task is now in progress, and indeed that is what is happening in the Editor.

Next, the BT simulation of patrolling with the task “Move To” is

petrol_moveto

and “Wait” is

petrol_wait

The complete information needed to set up the project is detailed at https://docs.unrealengine.com/en-US/Engine/ArtificialIntelligence/BehaviorTrees/BehaviorTreeQuickStart/index.html. I encourage you to try it!

Finally, I will give a teaser for an upcoming UE project: https://github.com/ravimohan1991/MAI

On-shell bosonic supersymmetric brane configuration

800px-calabi-yau

It has been, again, a long time since I last wrote a blog-post! It is not that I don’t want to write, it is just that I have been having so much fun (doing my research) and have been somewhat busy (changing my apartment and doing similar non-productive chores). Now that I am only a few steps away from my department, and have decided to spend the rest of my PhD days in this new apartment, I can devote more time to writing.

This post is about kappa symmetry which is a tool to obtain the supersymmetric brane configurations. Now, Susy (the heart of my research), is not only the most beautiful and difficult 😉 symmetry but also the strongest symmetry that I have ever encountered.  For some theories, it turns out that a supersymmetric configuration automatically implies the equations of motion (the on-shell configuration)! Therefore, supersymmetric theories without the Lagrangian formalism can be probed and studied! Furthermore, there are usually alluring geometrical interpretations associated with the configurations.

Currently I am working out the solutions of some supersymmetric brane embeddings in a curved supergravity spacetime geometry (with the topology AdS_5\times\mathcal{C}\leftarrow S^1\times S^2\times S^1), which, according to the AdS/CFT correspondence, represent line defects (analogs of Wilson and ’t Hooft lines) in the mysterious (2,0) super-conformal field theories in d=6.

Consider any SUGRA with bosonic (\mathcal{B}) and fermionic (\mathcal{F}) degrees of freedom. Now it turns out that one can set \mathcal{F}=0 on-shell. I don’t see that clearly, but it seems that supersymmetry constrains the theory to such an extent that the equations of motion render \mathcal{F} non-dynamical! This also means that setting \mathcal{F}=0 implies the equations of motion. Now the question of whether a \mathcal{B} configuration preserves supersymmetry reduces to the question of which transformation parameters \epsilon exist for on-shell bosonic configurations. Symbolically, we need \delta\mathcal{B}|_{\mathcal{F}=0}=0 and \delta\mathcal{F}|_{\mathcal{F}=0}=0. The structure of local supersymmetry in SUGRA is given by \delta\mathcal{B}\propto\mathcal{F} and \delta\mathcal{F}\propto\mathcal{P}(\mathcal{B})\epsilon, so the first condition holds automatically and only the second one is non-trivial. Here \mathcal{P} is a Clifford-valued operator that is at most first order in derivatives.

The previous statements imply \mathcal{P}(\mathcal{B})\epsilon = 0. Now we note a couple of points

  • The equation constrains the \mathcal{B} degrees of freedom via first order (in some cases I know, linear) partial differential equations, which are much simpler than the second order on-shell differential equations. Thus the complexity is greatly reduced!
  • The equation also constrains the transformation parameter \epsilon in accordance with the bosonic configuration. For a SUGRA geometry, we have what are known as Killing spinors, which are solutions of the Killing spinor equations corresponding to the bosonic degrees of freedom known as the metric. For example, in d=11, \mathcal{N}=1 supergravity, \mathcal{F} consists of the gravitino \Psi_a, the supersymmetric partner of the graviton (the metric). Then \delta\Psi_a=\left(\partial_a+\frac{1}{4}\omega_a^{bc}\Gamma_{bc}\right)\epsilon-\frac{1}{288}\left(\Gamma_a^{bcde}-8\delta_a^b\Gamma^{cde}\right)R_{bcde}\epsilon= \mathcal{P}(\mathcal{B})\epsilon. Thus the equation \delta\Psi_a=0 determines the Killing spinors.

As of now, we don’t have the complete formulation of M-theory (a unification of five superstring theories). We have a good idea of what M-theory should look like at low energies. In other words, we know the dynamical degrees of freedom with large wavelengths, and they make up a supergravity theory (that we know and understand) + M branes. We even have an action for the theory at that energy scale, which is given by

S\approx S_{SUGRA} + S_{\text{Brane}}

The first term corresponds to \mathcal{N}=2 d=10 type II A/II B supergravity or \mathcal{N}=1 d=11 supergravity. The second term describes both the brane excitations (giving rise to field theories) and their interactions with gravity. The action here is known as the brane effective action.

Now for my research, I am supposed to find the placement of an M2 brane in the SUGRA background (mentioned above) such that there is a supersymmetric bosonic configuration. The placement of the brane is encoded in \mathcal{B}. Here again we set \mathcal{F}=\theta=0, which is compatible with the on-shell configuration (the brane equations of motion). To get a supersymmetric configuration we require

\delta\theta=\delta_\kappa\theta+\epsilon+\Delta\theta+\xi^\mu\partial_\mu\theta=0

where

  • \delta_\kappa\theta is the kappa symmetry transformation
  • \xi^\mu\partial_\mu\theta is a world volume diffeomorphism
  • \Delta\theta is any other transformation besides the supersymmetry generated by \epsilon.

Now, again for reasons that are beyond me for now, the restrictions of these transformations to the bosonic configuration are

  • \delta_\kappa\theta|_{\mathcal{B}}=(1+\Gamma_\kappa|_\mathcal{B})\kappa
  • \Delta\theta|_\mathcal{B}=0 (makes sense since transformations by \epsilon are fermionic!)

Hence

\delta\theta=(1+\Gamma_\kappa|_\mathcal{B})\kappa+\epsilon

Now it turns out that not all the fermionic degrees of freedom in this theory are dynamical.  This forces us to work at the intersection of kappa symmetry gauge fixing conditions and \theta=0. So we follow a two step process

  1. Kappa symmetry invariance: \mathcal{P}\theta=0, where \mathcal{P} is a field-independent gauge-fixing projector such that \theta = \mathcal{P}\theta+(1-\mathcal{P})\theta. The restriction of the supersymmetric variation to the bosonic configuration is then
    \delta\mathcal{P}\theta|_{\mathcal{B}}=\mathcal{P}(1+\Gamma_{\kappa}|_\mathcal{B})\kappa+\mathcal{P}\epsilon
    Equating this to 0 gives \kappa = \kappa(\epsilon), the compensating kappa transformation corresponding to the background spinor.
  2. Now we have the dynamical set of fermionic configurations given by (1-\mathcal{P})\theta|_{\mathcal{B}}, which we set to 0.

Now from the above equations and a little bit of linear algebra, we finally have \Gamma_\kappa|_\mathcal{B}\epsilon=\epsilon, which is known as the kappa symmetry constraint.
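The linear algebra can be sketched as follows. The one ingredient not stated in this post is the property \Gamma_\kappa^2=1 of the kappa symmetry matrix, which I am assuming here; with it, (1\pm\Gamma_\kappa)/2 are projectors, and acting with (1-\Gamma_\kappa|_\mathcal{B}) on the condition \delta\theta=0 removes the \kappa term.

```latex
\begin{aligned}
0 &= \delta\theta|_{\mathcal{B}} = (1+\Gamma_\kappa|_{\mathcal{B}})\kappa+\epsilon \\
\Rightarrow\quad 0 &= (1-\Gamma_\kappa|_{\mathcal{B}})(1+\Gamma_\kappa|_{\mathcal{B}})\kappa
                     +(1-\Gamma_\kappa|_{\mathcal{B}})\epsilon \\
  &= (1-\Gamma_\kappa^2|_{\mathcal{B}})\kappa+(1-\Gamma_\kappa|_{\mathcal{B}})\epsilon \\
  &= (1-\Gamma_\kappa|_{\mathcal{B}})\epsilon
  \qquad (\text{assuming } \Gamma_\kappa^2=1) \\
\Rightarrow\quad \Gamma_\kappa|_{\mathcal{B}}\,\epsilon &= \epsilon .
\end{aligned}
```

Conversely, under the same assumption, any \epsilon obeying the constraint is compensated by \kappa=-\epsilon/2, since (1+\Gamma_\kappa|_\mathcal{B})(-\epsilon/2)=-\epsilon.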

The Holometer

Physics is all about understanding the phenomena that occur in nature. We essentially want to write down the equations which describe these phenomena and can be used for the benefit of humanity. Black holes are naturally occurring, mysterious objects in space. They have a very strong gravitational pull and are condensed in a very small region where quantum effects are appreciable. Therefore, we need to formulate a successful theory of quantum gravity, which will not only give us a better understanding of nature, but also provide powerful practical tools in the future.

The Holographic Principle is a physical principle of the successful theory of quantum gravity. Although we don’t quite understand quantum gravity, we can extrapolate the notions from the already well established physical theories and cleverly deduce a pattern which should be manifest in quantum gravity.  The pattern here is essentially the existence of a precise and very strong limit on the information content of the spacetime. The holographic principle intimately connects the number of quantum mechanical states with the region of spacetime and builds up the stage for a consistent theory of quantum mechanics, matter and gravity. String Theory is the theory of quantum gravity which has successfully realized this principle through the AdS/CFT conjecture. It is important to note that any experiment that directly validates the holographic principle does not necessarily validate the string theory itself.

Recently, my friend Suzanne Jacobs introduced me to a project named Holometer at Fermilab which is an attempt to experimentally verify the holographic principle. I aim to explain the concept behind the working of the Holometer in this, hopefully, self-contained, blog-post.

Now, general relativity is a theory of spacetime and matter. It is a good theory for large length scales (a rough estimate is from the radius of the Earth to that of the Milky Way galaxy and beyond!). At these scales, matter can be appropriately described by classical mechanics and spacetime can be treated as a continuum. At length scales smaller than the radius of the Earth, we have Newtonian gravity, in which space and time are treated separately and matter is again governed by classical mechanics. When you go down to the length scale of an atom, gravity becomes weak and other forces, like the electromagnetic force, become dominant. This is the regime of quantum mechanics. So for all practical purposes, we can forget about gravity and classical mechanics, and just work with the Hilbert space of quantum mechanics. In a very loose sense, Hilbert space is the stage for quantum theory just as spacetime is the stage for general relativity.

Quantum theory is a very peculiar theory and one of its results can be stated as

To probe the smaller length scales, you need to apply more energy in the system.

This follows from the uncertainty principle, which is given by the simple formula \Delta x \Delta p \geq \hbar/2. Now this is fine. We have built particle accelerators capable of achieving very high energies and verifying the theory known as the Standard Model. But since we live in a universe with gravity, there is a theoretical upper bound on the amount of energy that we can put in a region of space without creating a black hole (about which we don’t know much). At this point gravity (a perfectly understood concept in the classical domain) comes back to haunt us in the quantum regime. The corresponding length scale is known as the Planck scale and its numerical value is 1.6×10^(-35) meters. At this length scale we need to formulate a theory of quantum gravity.
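The quoted number can be reproduced from the standard definition of the Planck length, l_P=\sqrt{\hbar G/c^3}, which is not written out above. Here is a minimal Python sketch, with CODATA values for the constants typed in by hand and function names that are purely illustrative; it also evaluates the smallest momentum spread \hbar/(2\Delta x) allowed by the uncertainty relation for a given position resolution.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
G = 6.67430e-11         # Newton's constant, m^3 kg^-1 s^-2
C = 299792458.0         # speed of light, m/s


def planck_length() -> float:
    """Planck length sqrt(hbar * G / c^3) in meters."""
    return math.sqrt(HBAR * G / C**3)


def min_momentum_spread(delta_x: float) -> float:
    """Smallest Delta p (kg m/s) compatible with Delta x * Delta p >= hbar / 2."""
    return HBAR / (2.0 * delta_x)


if __name__ == "__main__":
    print(f"Planck length: {planck_length():.3e} m")
    # Probing atomic scales (about 1e-10 m) versus the Planck scale.
    for dx in (1e-10, planck_length()):
        print(f"dx = {dx:.3e} m -> dp >= {min_momentum_spread(dx):.3e} kg m/s")
```

The smaller the \Delta x, the larger the minimal momentum (and hence energy) spread, which is the statement "smaller length scales need more energy" in numbers.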

Many physicists believe that spacetime should be an emergent notion in quantum gravity. Based on this school of thought, and theoretical calculations like the covariant entropy bound, the Holographic Principle has a very nice interpretation. It basically associates a Hilbert space with each causal diamond in flat spacetime, as shown in the picture. Here we are considering a 1+1 spacetime manifold.

emergentspacetime

A causal diamond (here, rhombus A1A2 and B1B2) is roughly the region of spacetime which is causally connected and is characterized by the proper time parameter \tau (the distance along time axis tA and tB). Of course, here we assume that the observers are at rest with respect to this coordinate system. The Holographic principle assigns the Hilbert spaces \mathcal{H}_A with the causal diamond A1A2 and \mathcal{H}_B with the diamond B1B2. Essentially, these Hilbert spaces have states amongst which the causal connection can be established by definition. The intersection of the causal diamonds (shaded red) is the region of spacetime causally connected to both the points A2 and B2. The Hilbert space associated with this region is \mathcal{H}_{AB}. And now the holographic claim is

the Hilbert space \mathcal{H}_{AB} is completely determined by some mathematical manipulation of the spaces \mathcal{H}_{A} and \mathcal{H}_{B}.

the geometry of red shaded region is completely governed by the \mathcal{H}_{AB}

Now in the Hilbert spaces, the observables (experimentally detectable structures) are non-local. It simply means they don’t depend on the spacetime coordinates. In fact, as we have seen, the spacetime, hence locality, emerges from the holographic picture. It was not there in the quantum theory that we started with! So whenever I mention that something is non-local, it would just mean that it is somewhere in the Hilbert space of the spacetime.

(The matter of this blog-post from here is based on the non peer-reviewed article https://arxiv.org/abs/1506.06808. I shouldn’t be held responsible for any inconsistencies in the subsequent paragraphs :))

Ok, now consider an observable denoted by \hat{x} in the Hilbert space \mathcal{H}, which holographically represents the set of world lines in the spacetime manifold. Since \hat{x} is a quantum mechanical object, the set of world lines it corresponds to should exhibit quantum behavior. \hat{x} is an entirely new degree of freedom and differs from the position variable in classical spacetime (please note that classical spacetime is different from the holographic spacetime we are talking about here). Also we, a priori, don’t know the corresponding Hamiltonian and the conjugate observable. This is radically different from the String Theory treatment of quantum gravity!

Define a measure (a time-domain correlation function) to quantify the deviation of the quantum \hat{x} from its classical counterpart \bar{x} by \sigma(\tau) = \langle\Delta x(t)\Delta x(t+\tau)\rangle_{t}F(\tau). In other words, by the very definition of this function, \sigma(\tau)=0 if there is no quantum or holographic behavior in the evolution of the world lines! And if some experiment establishes this equality, we can safely say that spacetime is perfectly classical and throw the holographic principle out of the window. A non-zero value of \sigma(\tau) represents the “jitter” or “fuzziness” that Fermilab’s Holometer is trying to detect.
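To make the definition concrete, here is a small Python sketch of the time average for sampled data. It rests on assumptions of mine: \Delta x(t) is read as the difference between the measured position and the classical one, the lag \tau is an integer number of samples, and the factor F(\tau), which is not specified here, is simply set to 1. The function name is hypothetical.

```python
from typing import Sequence


def sigma(x: Sequence[float], x_classical: Sequence[float], lag: int) -> float:
    """Time-averaged correlation <dx(t) dx(t + lag)>_t with F(tau) set to 1."""
    if len(x) != len(x_classical):
        raise ValueError("x and x_classical must have the same length")
    dx = [a - b for a, b in zip(x, x_classical)]
    n = len(dx) - lag
    if lag < 0 or n <= 0:
        raise ValueError("lag must satisfy 0 <= lag < len(x)")
    return sum(dx[t] * dx[t + lag] for t in range(n)) / n


if __name__ == "__main__":
    classical = [0.1 * t for t in range(6)]
    # A perfectly classical world line: no deviation, so sigma vanishes.
    print(sigma(classical, classical, lag=2))
    # A toy "jitter" that flips sign at every sample.
    jittery = [c + (1.0 if t % 2 == 0 else -1.0) for t, c in enumerate(classical)]
    print(sigma(jittery, classical, lag=0), sigma(jittery, classical, lag=1))
```

A classical trajectory gives exactly zero for every lag, while any residual deviation shows up as a non-zero correlation.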

Now a quantum mechanical state decoheres (becomes more classical) with time. This effect can and will drive the time-domain correlation function to 0, which would defeat the entire purpose of the experiment. The condition that the non-zero \sigma(\tau) be measured before the decoherence kicks in gives bounds on the dimensions of the experimental apparatus (the length and size of the mirrors in the interferometer).

Physicists at Fermilab have an interesting construction to fish out the holographic “jitter” using certain “models” and the details of the experiment can be found at https://holometer.fnal.gov/faq.html.

My observation as a graduate student

This seems to be a good project to uncover and understand the physics at the Planck scale without having to achieve tremendously high energies. The results reported so far by the physicists at Fermilab, which are based on a particular model of the correlated holographic noise (cHN), are negative. But, as with all scientific research programs, we now know what is incorrect and can move on to new and better models to gauge the cHN.

Curiously enough, the arXiv papers I consulted to study these models are not peer reviewed and contain several ambiguities (broken Lorentz invariance, for example). I am not really sure what to make of this, but again, these are just my personal views and I would follow this research only if the articles get published in a good peer-reviewed journal.

Holographic duals of the twisted supersymmetric theories

The winter breaks are essentially the “slingshots” which provide exponential growth to my knowledge-base. There is nothing like sipping coffee while staring at my digital paper and thinking about how universe works at various length-scales (especially with no semester pressure and coursework!).  My research in String Theory has exposed me to the several elegant “candidate ways” which describe the working of nature, and I aim to explain one of them in this blog-post. Please note that I will use the jargon frequently enough to bore a sane layperson (and most of the physics majors!) but non-rigorously enough to annoy a decent mathematician. Clearly, the aim of my graduate career is to rectify these drawbacks and explain physics in a way which is fun without losing the mathematical rigor.

Now, there are certain quantum field theories with some extra (symmetry) constraints which provide a lucid way to discover and test the framework of String Theory. These symmetries are

  1. Conformal symmetry
  2. Supersymmetry (SUSY)

I will be focusing on \mathcal{N}=4 Super Yang-Mills (SYM) theory on a d=4 spacetime manifold with the topology \mathbb{R}^{1,1}\times\Sigma_2. Here \Sigma_2 is a 2-manifold with generic structure and curvature (for instance it could be a Riemann surface with constant negative curvature). For this theory the spin connection is in a U(1) subgroup of the R-Symmetry group SU(4). Now since \Sigma_2 can be a curved manifold, it can (and will) break the supersymmetries. In order to preserve at least some of them, we need to twist the theory, as it is known, in a specific fashion. Essentially, we couple external SO(6) gauge fields to the R-Symmetry current and identify the spin connection with the gauge connection such that we get covariantly constant spinors on the manifold (there is a more visually appealing picture in the language of branes which I will explain later in the post). In other words, the twist corresponds to the nature of the embedding of the U(1) subgroup in SU(4). The aim, then, is to find the holographic gravity duals of these twisted field theories.

Twists preserving (4,4) susy

Here we will consider the twist which corresponds to picking a U(1) subgroup such that the R-Symmetry is broken as SO(6)\rightarrow SO(2)\times SO(4). To see what exactly is happening, consider the spinor field \phi of the SYM with spin s under the SO(2) spin connection on \Sigma_2 and U(1) charge q. Now, the covariant derivative on the manifold is \mathcal{D}_\mu\phi=(\partial_\mu+is\omega_\mu+iqA_\mu)\phi. Here \omega_\mu=\epsilon_{ab}\omega^{ab}_\mu/2. Now if the metric on \Sigma_2 is ds^2=e^{2h}(dx^2+dy^2), the spin connection can be computed, and once we identify the U(1) gauge connection with the spin connection, the constraint s=-q gives us the “covariantly constant” spinors, which are now essentially scalars. We have twisted the field theory by fixing the spin of the fields!
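For completeness, here is a sketch of the spin connection computation for the conformally flat metric above. The choice of zweibein, the torsion-free structure equation de^a+\omega^{a}{}_{b}\wedge e^b=0 and the convention \epsilon_{12}=1 are my assumptions; other sign conventions flip the overall sign of \omega.

```latex
\begin{aligned}
e^1 &= e^{h}dx, \qquad e^2 = e^{h}dy, \qquad de^a+\omega^{a}{}_{b}\wedge e^b=0 \\
de^1 &= -\partial_y h\,e^{h}\,dx\wedge dy, \qquad
de^2 = \partial_x h\,e^{h}\,dx\wedge dy \\
\Rightarrow\quad \omega^{12} &= \partial_y h\,dx-\partial_x h\,dy \\
\omega_\mu dx^\mu &= \tfrac{1}{2}\epsilon_{ab}\,\omega^{ab}_\mu dx^\mu=\omega^{12} \\
A_\mu=\omega_\mu,\; s=-q \quad&\Rightarrow\quad
\mathcal{D}_\mu\phi=(\partial_\mu+is\omega_\mu+iqA_\mu)\phi=\partial_\mu\phi .
\end{aligned}
```

The last line is the twist in action: once the gauge connection is identified with the spin connection, the two terms cancel for s=-q and the field is transported like a scalar.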

Essentially, the symmetry group (associated with the \mathbb{R}^{1,1}\times\Sigma_2), SO(1,3)\times SO(6) (corresponding to the tangent bundle and the normal bundle) is decomposed as SO(1,1)\times SO(2)_{\Sigma_2}\times U(1)\times SU(2)_L \times SU(2)_R. This corresponds to having (4,4) susy in the theory.

Brane realization through an example:

Consider a manifold \mathbb{R}^6\times K3 with D3 branes wrapping some holomorphic curve (Riemann surface) in K3. In the field theory limit, we obtain the gauge theory mentioned above. The transverse \mathbb{R}^6 direction, after the twist, will have the SO(4)=SU(2)_L \times SU(2)_R rotational symmetry. Now I will make a statement without showing any mathematics, because it is tedious, but it is important for my research. When we consider the low energy limit, compared to the size of Riemann surface, then we get a 2 dimensional effective theory in IR which now becomes (4,4) SCFT! 

Lagrangian description:

Let us write down the Lagrangian for the partially twisted theory, which will enable us to find the gravity dual. Since we twisted the theory by coupling it to the spin connection on \Sigma_2, we expect extra terms in the Lagrangian coming from the covariant derivatives along that direction acting on fields which are charged under the U(1) part of the normal bundle with non-zero gauge connection.

Consider the Lagrangian with two twisted scalar fields given by \mathcal{Z}=X^1+iX^2. The action looks like S=\int Tr(|D_z\mathcal{Z}|^2+|D_{\bar{z}}\mathcal{Z}|^2+\frac{1}{4}R|\mathcal{Z}|^2).

Supergravity (SUGRA) duals of the field theories

Maldacena’s conjecture that SYM theories are dual to supergravity theories on AdS_5\times S^5 gives us a good starting point. Since we are dealing with the deformed SYM (defined on \mathbb{R}^{1,1}\times\Sigma_2 with coupling to SO(6) gauge fields), this information gets translated into boundary conditions in the dual gravity theory.

We start with a reasonable ansatz for AdS_5 which at the boundary behaves like ds^2\sim \frac{-dt^2+dz^2+dr^2+e^{2h}(dx^2+dy^2)}{r^2}. We also impose that the SO(6) gauge fields in AdS_5 asymptote to the field theory gauge fields. This means that the components of the AdS_5\times S^5 metric with one index in AdS_5 and the other in S^5 behave as g_{\mu\phi}\sim A_\mu near the boundary.

Now a non-trivial condition is that we should turn on an operator in the \mathbf{20} of SO(6). Since we can easily see the coupling to the curvature in the action above, we get the ansatz for the operator as \mathcal{O}=Tr(\frac{2}{3}|\mathcal{Z}|^2-\frac{1}{3}(\phi_1^2+\ldots+\phi_4^2)).

Fortunately, the operators that are turned on correspond to fields in the 5d gauged supergravity multiplet. People have already worked out such a theory with \mathcal{N}=8 SUGRA and we will start with that!

Since the connection we started off with is U(1), we can start with the U(1) truncations of the supergravity. Hence the data with which we start is

  1. a 5d metric
  2. a scalar field
  3. a U(1) gauge field

To find the supersymmetric solutions, we start with the fermionic supersymmetric variations which give the constraint equations. Once these equations are solved, we get the complete dual supergravity theory to the twisted SYM theory. To see the explicit calculations head over to page 8 of https://arxiv.org/pdf/hep-th/0007018v2.pdf.

Determining the causal structure of spacetime (I)

In this series of blog posts, I aim to explain the causal structure of the spacetime solutions in general relativity. Currently, I am working on a special extension of Einstein’s theory (general relativity) which is known as Horndeski theory. There, I am trying to find the causal structure of the allowed solutions which allegedly permit superluminal propagation of metric perturbations. The methodology to obtain the structure is similar for all gravitational theories and I wish to demonstrate it for general relativity (which is my comfort zone).

In order to make observable predictions from a consistent physical theory, we are interested in finding how the degrees of freedom evolve and behave. For that, we try to obtain/formulate equations of motion, which capture the physics at the infinitesimal scale. Once we embed the physics in these equations, we solve them and make verifiable predictions. For instance, in Newtonian mechanics, we study a point particle moving in one dimension x. We then ask how this degree of freedom behaves or evolves in time (which, again, is an assumption). For such a theory, we have the equation of motion F=ma=m\frac{d^2 x(t)}{dt^2}, which contains the spectacular physical insight from Newton (I won’t even try to elaborate on that because it deserves a separate blog post). Now this is a linear second order differential equation which requires two initial conditions. In other words, if you give me the initial position x_0 and initial velocity u, I can tell you the entire future of x(t) by solving the differential equation. In fact, for constant F, we get x(t)=ut+\frac{1}{2}\frac{F}{m}t^2+x_0.
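The idea of "evolving initial data with the equation of motion" can be illustrated numerically. The sketch below steps the pair (position, velocity) forward in small time increments for a constant force and compares the result with the closed-form solution; the function names and the choice of a simple constant-acceleration update are mine, picked for clarity rather than generality.

```python
def closed_form(t: float, x0: float, u: float, force: float, mass: float) -> float:
    """x(t) = u t + (1/2)(F/m) t^2 + x0 for constant force F."""
    return u * t + 0.5 * (force / mass) * t**2 + x0


def evolve(x0: float, u: float, force: float, mass: float,
           t_final: float, steps: int = 1000) -> float:
    """Evolve the initial data (x0, u) to t_final using F = m a step by step."""
    dt = t_final / steps
    a = force / mass
    x, v = x0, u
    for _ in range(steps):
        # For constant acceleration this update is exact up to rounding.
        x += v * dt + 0.5 * a * dt * dt
        v += a * dt
    return x


if __name__ == "__main__":
    args = dict(x0=1.0, u=2.0, force=3.0, mass=1.5)
    print(evolve(t_final=4.0, **args), closed_form(4.0, **args))
```

Two numbers, x_0 and u, are all the integrator needs: that is the Newtonian analog of the initial data on a Cauchy surface discussed below.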

Another way to look at it: equation(s) of motion (of a theory) give us the prescription to evolve the initial data (known values of the degrees of freedom) to final data that we are interested in.

Now Einstein’s gravity theory has the components of the metric as the degrees of freedom. We denote them by g_{\mu\nu}. Another radical difference (common to all relativistic theories) is that the space and time are unified into a single parameter space called spacetime manifold (inheriting all the structures of the pseudo-Riemannian manifolds). So here we ask our favorite question: how do g_{\mu\nu} evolve in the spacetime. Mathematically, we want to know the g_{\mu\nu}(x), where x is the representative of the spacetime coordinates from now on.

This time, we use the insight from Einstein to write the equation of motion for general relativity (in vacuum) as

E_{\mu\nu}\equiv R_{\mu\nu}-\frac{1}{2}g_{\mu\nu}R=0

where the symbols have their usual meaning. Here we are working on the 4 dimensional pseudo-Riemannian manifold (\mathcal{M},g) on which g is to be determined by the equation of motion. We define a Cauchy surface \mathcal{C}, which is a codimension-1 surface in \mathcal{M} on which the initial data for Einstein’s equations is defined. So what does this initial data consist of? (We will take a small detour and return to the topic in the next blog post)

Helmholtz wave

Cauchy.jpg

In order to understand that, we start with a simple example. Consider a 2+1 manifold as shown. Now let us consider a wave operator on this manifold given by \hat{L}=-\frac{\partial^2}{\partial t^2}+\frac{\partial^2}{\partial x^2}+\frac{\partial^2}{\partial y^2}. For a scalar degree of freedom denoted by u, the operator \hat{L} acts on it to generate a wave equation \hat{L}u=0. Here we are not interested in solving this linear second order partial differential equation. We want to deduce the properties of the wave, like its speed of propagation and the extent to which a disturbance/fluctuation in u can propagate.

Let us say that we know the value of u=u(x,y) on \phi(x,y,t)=t=0 surface (where we might have a specified source or some boundary condition). The surface is essentially a codimension-1 surface and we will call the coordinates x,y as internal coordinates (w.r.t the surface). Thus we can now easily compute the internal derivatives (from u(x,y)) and denote them by u_i=\partial_iu(x,y) where i\in (x,y). Let the exterior/normal derivative to the surface be u_\phi (x,y)=\eta(du,d\phi)=-\partial_tu (note that d\phi = dt on \mathcal{C}).

In this example, we have a Cauchy surface \mathcal{C} defined by \phi=t=0 with the data

  • u(x,y)
  • internal derivatives u_i(x,y)
  • external derivative u_\phi(x,y)

given on \mathcal{C}. Clearly the second derivatives of  u except u_{\phi\phi} can be computed from the given data. And here we use the wave equation (or equation of motion in general) to find the missing second derivative of u. The co-normal to the surface \phi=0 is given by d\phi which we can easily represent in the natural co-normal basis as d\phi=\phi_\mu dx^\mu. And then we perform the coordinate transformation to chart \lambda^\mu such that \lambda^0=\phi. Note for this particular case d\phi \parallel dt and the new coordinate chart is exactly equal to x^\mu chart. This is because \mathcal{C} is already perpendicular to the time coordinate.

It is not very difficult to show that the wave equation \hat{L}u=0, under the above transformation, converts into u_{\phi\phi}Q(\phi_\mu)+\ldots=0 where Q(\phi_\mu)=\eta^{\mu\nu}\phi_\mu\phi_\nu and we call it the characteristic form. Now there are two situations

  1. Q\neq 0: in this case, we can invert the equation u_{\phi\phi}Q(\phi_\mu)+\ldots=0 and find the second derivative of u in the direction normal to the surface. With that information we can easily evolve the data further in time.
  2. Q=0: well, we can’t invert the equation which basically implies that there is no unique evolution of u beyond that surface (which is now called the characteristic surface).

In our example, the characteristic form is non-zero everywhere on \mathcal{C}: with \phi=t we have \phi_t=1 and \phi_x=\phi_y=0, so Q=\eta^{\mu\nu}\phi_\mu\phi_\nu=-1. Hence the Cauchy surface (the entire x,y plane at t=0) is not a characteristic hypersurface for the degree of freedom u obeying the wave equation \hat{L}u=0, and the data on it can be evolved uniquely. This is not surprising if you think about plane electromagnetic waves, which obey the same wave equation we wrote above and can likewise be evolved from data given at an initial time.
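The characteristic form is easy to evaluate by hand, but a tiny Python sketch makes the two cases of the list above tangible. It assumes the signature (-,+,+) read off from \hat{L} and coordinates ordered as (t,x,y); the function name and the second example surface, \phi=t-x, are my own additions for contrast.

```python
from typing import Sequence

# Inverse Minkowski metric in 2+1 dimensions, coordinates ordered (t, x, y).
ETA_INV = (
    (-1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (0.0, 0.0, 1.0),
)


def characteristic_form(dphi: Sequence[float]) -> float:
    """Q(phi_mu) = eta^{mu nu} phi_mu phi_nu for the co-normal dphi = (phi_t, phi_x, phi_y)."""
    return sum(
        ETA_INV[m][n] * dphi[m] * dphi[n]
        for m in range(3)
        for n in range(3)
    )


if __name__ == "__main__":
    # phi = t: the constant-time Cauchy surface, Q != 0, evolution is unique.
    print(characteristic_form((1.0, 0.0, 0.0)))
    # phi = t - x: a null surface, Q = 0, i.e. a characteristic surface.
    print(characteristic_form((1.0, -1.0, 0.0)))
```

The constant-time surface gives Q=-1, so u_{\phi\phi} can be solved for, whereas the null surface \phi=t-x gives Q=0 and the wave equation no longer determines the second normal derivative there.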

Moving on, we obtained an equation for a surface in new coordinates \lambda^\mu given by Q(\phi_\mu)=0. Physically, this is the surface beyond which we can not evolve the degrees of freedom uniquely (as the second derivative is not uniquely determined). Now we need a mechanism to generate this surface.

Bicharacteristic curves

First we define the bicharacteristic curves  or rays which are related to the linear second order partial differential operator \hat{L}, where we now generalize it to \hat{L}[u]=a^{\mu\nu}u_{\mu\nu}+d where u_{\mu\nu}=\partial_\mu\partial_\nu u. It is not difficult to check that in this case the characteristic form becomes Q=a^{\mu\nu}\phi_\mu\phi_\nu. For our Helmholtz wave, we get the form Q=-\phi_t^2+\phi_x^2+\phi_y^2.

Now the bicharacteristic curves are generated by the linear ordinary differential equations, with a parameter s, given by

\frac{dx^\mu}{ds}=\frac{1}{2}\frac{\partial Q}{\partial\phi_\mu} and \frac{d\phi_\mu}{ds}=-\frac{1}{2}\frac{\partial Q}{\partial x^\mu}

For our Helmholtz wave, one can easily solve them and find the solutions x(t)=at+b, y(t)=ct+d, where a, b, c and d are constants with a^2+c^2=1 on the characteristic surface Q=0. They actually form the rays of a cone (with appropriate boundary conditions) traveling with speed 1.
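The unit speed can be checked directly from the ray equations. For Q=-\phi_t^2+\phi_x^2+\phi_y^2 the co-normal \phi_\mu is constant along a ray, and dt/ds=-\phi_t, dx/ds=\phi_x, dy/ds=\phi_y, so the slopes a=dx/dt and c=dy/dt follow by division. The Python sketch below does exactly this; the helper names and the parametrization of null co-normals by an angle are illustrative choices of mine.

```python
import math
from typing import Tuple


def null_conormal(angle: float) -> Tuple[float, float, float]:
    """A co-normal (phi_t, phi_x, phi_y) with Q = -phi_t^2 + phi_x^2 + phi_y^2 = 0."""
    return 1.0, math.cos(angle), math.sin(angle)


def ray_velocity(phi_t: float, phi_x: float, phi_y: float) -> Tuple[float, float]:
    """Slopes (dx/dt, dy/dt) of the bicharacteristic ray with constant co-normal."""
    dt_ds = -phi_t  # (1/2) dQ/dphi_t
    dx_ds = phi_x   # (1/2) dQ/dphi_x
    dy_ds = phi_y   # (1/2) dQ/dphi_y
    return dx_ds / dt_ds, dy_ds / dt_ds


if __name__ == "__main__":
    for k in range(4):
        a, c = ray_velocity(*null_conormal(k * math.pi / 2))
        print(f"a = {a:+.3f}, c = {c:+.3f}, speed = {math.hypot(a, c):.3f}")
```

Sweeping the angle through a full circle traces out all the rays through a point, which together form the cone mentioned above.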

We have shown that if we introduce some perturbations at some point in Minkowski manifold, those perturbations will travel at unit speed and won’t escape the cone. This essentially exhibits the causal structure of flat Lorentzian spacetime which is in concurrence with the wave equation.

Causal.jpg