Particle in a box (infinite square well).

Yes, I am aware of the ennui caused by this topic. We write Schrödinger’s time-independent equation, solve it, apply the boundary conditions, find the energy eigenstates, normalize them, and everybody is happy. If you are a physics major, you must have done this countless times. But I always had a problem with this system because of its highly unphysical nature (I will come back to this later). That makes it a mathematical curiosity, and if it is one, then why not explore it to the full extent? Please note that in this post I will use rigorous mathematics and will define terms in proper mathematical language (I hope this post will serve as my notes for future reference). Here one can also appreciate the aesthetics of Linear Algebra.

The question I had been asking myself was, “Why do we always work in the position representation? What is wrong with working in the momentum representation (for this particular problem)?”. The curiosity to find the answers compelled me to plunge into depths that are generally not touched upon in the physics literature.

Conventional Quantum Mechanics (for this problem) is performed on the Hilbert space L^2([0,1], dx), which I will denote by \mathcal{H}. Here L^2 (read “L two”) represents a function space of square integrable functions, and [0,1] is the range of the parameter x (position). I am interested in working with L^2(??, dp) and want to obtain the energy eigenstates in terms of a wavefunction \varphi(p). I don’t know the range of the parameter p (momentum). I can’t take it to be the whole real line (because of the bounded nature of position), and the Parseval-Plancherel theorem is not applicable here. I must first find the nature of the momentum operator; then perhaps its spectrum will give me an idea of the range of momentum.

Just like a function, a linear operator on a Hilbert space has a domain. The formal definition of an operator is (this definition is for general Hilbert spaces and is not restricted to L^2 spaces):

An operator on the Hilbert space is a linear map A : \mathcal{D}(A) \to \mathcal{H} such that \psi \mapsto A \psi (where \psi \in \mathcal{D}(A)).

\mathcal{D}(A) is a linear subspace of \mathcal{H}; this subspace is the domain of the operator. Strictly speaking, a Hilbert space operator is the pair (A, \mathcal{D}(A)), i.e., the specification of the operation together with the domain on which the operation is defined. Two operators are said to be equal if and only if

A \varphi = B \varphi for all \varphi \in \mathcal{D}(A) = \mathcal{D}(B).

Let us consider the position operator, denoted by Q. Its operation is defined as (Q\psi)(x) = x \psi(x). Given the boundary conditions, the domain is defined as

\mathcal{D}(Q) = \{\psi \in \mathcal{H} \mid Q\psi \in \mathcal{H} and \psi(0) = \psi(1) = 0 \}.

As mentioned earlier, Quantum Mechanics is performed on the Hilbert space of square integrable functions (denoted by \mathcal{H}), so we certainly would not want a mapping which destroys square integrability. Hence it is important to require that the output satisfy Q\psi \in \mathcal{H}, which translates into ||Q\psi||^2 = \int_{0}^{1} dx\, x^2|\psi (x)|^2 < \infty. (On [0,1] this is automatic, since x is bounded; the condition only bites when the position range is unbounded.) So, in words, the domain of Q is the set of all square integrable functions such that

  • they satisfy the boundary conditions,
  • their image under the map is again square integrable (a quick SymPy check follows below).
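As a minimal sanity check (my own sketch, not part of the original development), one can verify with SymPy that the ground-state-like function \psi(x) = \sqrt{2}\sin(\pi x) satisfies both conditions:

```python
# Check that psi(x) = sqrt(2) sin(pi x) lies in D(Q): it vanishes at the
# boundaries and ||Q psi||^2 is finite.
import sympy as sp

x = sp.symbols('x', real=True)
psi = sp.sqrt(2) * sp.sin(sp.pi * x)            # normalized on [0, 1]

print(psi.subs(x, 0), psi.subs(x, 1))           # boundary values: 0 0
print(sp.integrate(x**2 * psi**2, (x, 0, 1)))   # ||Q psi||^2 = 1/3 - 1/(2*pi**2)
```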

Now let us define the momentum operator P. Quantum Mechanics defines its operation (on a function) as

P\psi = \frac{\hbar}{i} \frac{\partial \psi}{\partial x}.

Certainly, we would like the set of functions which, besides satisfying the boundary conditions, also have square integrable derivatives. In formal language,

 \mathcal{D}(P) = \{\psi \in \mathcal{H} \mid \psi^{\prime} \in \mathcal{H} and \psi(0) = \psi(1) = 0 \}.

Now let us look at the definition of the adjoint of an operator A, which means we have to define both the operation and the domain of the adjoint (let’s denote it by A^{\dagger}):

\mathcal{D}(A^{\dagger}) = \{\phi \in \mathcal{H} \mid \exists \tilde{\phi} \in \mathcal{H} such that \langle \phi , A\psi\rangle = \langle \tilde{\phi}, \psi\rangle \forall \psi \in \mathcal{D}(A)\}

This definition says that \mathcal{D}(A^{\dagger}) is the set of \phi (square integrable) such that there exists a \tilde{\phi} (again square integrable) which makes the equation \langle \phi , A\psi\rangle = \langle \tilde{\phi}, \psi\rangle true for all \psi belonging to the domain of A.

The vector \tilde{\phi} depends on A and \phi (not on \psi: the equation must hold for every \psi in \mathcal{D}(A)). The operation of A^{\dagger} is defined as

A^{\dagger} \phi = \tilde{\phi}.

So we now have a full working definition of the adjoint of an operator. Let’s find the adjoint of P. The inner product equation \langle \phi , P\psi\rangle = \langle \tilde{\phi}, \psi\rangle must hold, so

\int_{0}^{1} \overline{P^{\dagger}\phi}(x)\, \psi (x)\, dx = \int_{0}^{1} \overline{\phi}(x)\, \frac{\hbar}{i} \frac{\partial \psi}{\partial x}(x)\, dx .

Integrating the RHS by parts,

\int_{0}^{1} \overline{P^{\dagger}\phi}(x)\, \psi (x)\, dx = \frac{\hbar}{i}\Big[\overline{\phi}(x)\, \psi (x)\Big]_{0}^{1} - \frac{\hbar}{i} \int_{0}^{1}\overline{\frac{\partial \phi}{\partial x}}(x)\, \psi (x)\, dx.

For this equation to be true we must have

\overline{P^{\dagger}\phi}(x) = -\frac{\hbar}{i}\overline{\frac{\partial \phi}{\partial x}}(x)

or

P^{\dagger}\phi(x) = \frac{\hbar}{i}\frac{\partial \phi}{\partial x}(x).

The most curious and interesting point is that the function \phi(x) doesn’t need to satisfy the boundary conditions (of the infinite square well), since \Big[\overline{\phi}(x)\, \psi (x)\Big]_{0}^{1} = \overline{\phi}(1) \psi (1) - \overline{\phi}(0) \psi (0) is zero already: \psi(x) satisfies the boundary conditions on its own. So \phi(x) has no restriction other than square integrability (of itself and of its derivative), which shows that the set \mathcal{D}(P^{\dagger}) has more elements than the set \mathcal{D}(P). The formal definition of \mathcal{D}(P^{\dagger}) is

\mathcal{D}(P^{\dagger}) = \{\phi \in \mathcal{H} \mid \phi^{\prime} \in \mathcal{H}\}.

It can be seen that \mathcal{D}(P) \subset \mathcal{D}(P^{\dagger}). Thus P^{\dagger} \neq P, i.e., the momentum operator is not self-adjoint. On the other hand, it is quite simple to show that Q^{\dagger} = Q, i.e., the position operator is self-adjoint.
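One can make the boundary-term argument concrete with a small SymPy sketch (my own check, with \hbar kept symbolic): take \psi(x) = \sin(\pi x), which vanishes at both endpoints, and \phi(x) = \cos(\pi x), which does not, and verify that the pairing still works.

```python
# Verify <phi, P psi> = <(hbar/i) phi', psi> even though phi violates
# the boundary conditions: the boundary term dies because psi alone
# vanishes at 0 and 1.
import sympy as sp

x = sp.symbols('x', real=True)
hbar = sp.symbols('hbar', positive=True)

psi = sp.sin(sp.pi * x)   # in D(P): psi(0) = psi(1) = 0
phi = sp.cos(sp.pi * x)   # NOT in D(P): phi(0), phi(1) nonzero

P = lambda f: hbar / sp.I * sp.diff(f, x)

lhs = sp.integrate(sp.conjugate(phi) * P(psi), (x, 0, 1))
rhs = sp.integrate(sp.conjugate(P(phi)) * psi, (x, 0, 1))
print(sp.simplify(lhs - rhs))   # 0
```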

Now let us see the definition of Hermitian operators.

The operator A on \mathcal{H} is Hermitian if  \langle \phi , A\psi\rangle = \langle A\phi,\psi\rangle for all \phi, \psi \in \mathcal{D}(A)

Since A operates on both functions, needless to say, both must belong to the domain of the operator. Let us use the definition of the adjoint: A^{\dagger} is such that \langle \phi , A\psi\rangle = \langle A^{\dagger}\phi,\psi\rangle, which together with Hermiticity implies \langle A\phi,\psi\rangle = \langle A^{\dagger}\phi,\psi\rangle. Thus, if an operator acts in the same way as its adjoint, it is a Hermitian operator. The domains of the two operators need not be the same.

We have seen that the action of both P and P^{\dagger} is the same, namely \frac{\hbar}{i}\frac{\partial}{\partial x}. So the momentum operator is Hermitian in this case, but not self-adjoint. Note that every self-adjoint operator is Hermitian, but the converse is not true.

Now the spectral theorem for linear operators states that

If the Hilbert space operator A is self-adjoint, then its spectrum is real and the eigenvectors associated with different eigenvalues are mutually orthogonal; moreover, the eigenvectors together with the generalized eigenvectors yield a complete system of (generalized) vectors of the Hilbert space.

This property does not hold for operators that are merely Hermitian.

As the momentum operator here is only Hermitian, its eigenvectors won’t form a complete basis; in fact, as sketched below, P has no eigenvectors in its domain at all. Thus there is no use in solving this problem in momentum space. Moreover, the L^2(??, dp) space won’t have continuous momentum (because of the quantization imparted by the boundary conditions).
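Here is a minimal SymPy sketch of that last claim (my own illustration): the eigenvalue equation P\psi = p\psi has only plane-wave solutions, and the boundary condition \psi(0) = 0 kills all of them.

```python
# Solve P psi = p psi and impose the well's boundary condition: only
# the trivial solution survives, so P has no eigenvectors in D(P).
import sympy as sp

x = sp.symbols('x', real=True)
p, hbar = sp.symbols('p hbar', positive=True)
C = sp.symbols('C')
f = sp.Function('psi')

sol = sp.dsolve(sp.Eq(hbar / sp.I * f(x).diff(x), p * f(x)), f(x))
print(sol)                                     # psi(x) = C1*exp(I*p*x/hbar)

psi = C * sp.exp(sp.I * p * x / hbar)          # candidate eigenfunction
print(sp.solve(sp.Eq(psi.subs(x, 0), 0), C))   # [0]: only psi = 0 works
```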

Normally one assumes that the eigenvectors of an observable form a complete basis. This means that observables must be represented by self-adjoint operators. From the above we note that the momentum operator is not an observable for this problem!

Absurd, isn’t it? Well, the boundary condition itself hints at the highly non-physical nature of the problem. The “non-self-adjointness” of the momentum operator further shows us how the infinite potential square well is a mathematical curiosity with only a tenuous nexus to real physical systems.

Linear vectors and Quantum Mechanics. (II)

So it has been more than a week since I wrote my last blog post. Although, from time to time, I was tempted to write the sequel, I had to distance myself from that temptation. I have been studying David Griffiths’s “Introduction to Elementary Particles”, and much of my time has been devoted to it. Hopefully one day I will know Particle Physics well enough to blog about it.

So by now I am sure you know what it means when we say something is linear. This is the beginning of linear algebra. It is a very beautifully constructed algebra which physicists use in their daily life (in Quantum Mechanics, the Special Theory of Relativity and Classical Mechanics, for example). I don’t intend to start off with abstract Hilbert spaces (basically sets of linear vectors with some axioms) and define linear maps, inner product structure and so on. Instead I will try to demonstrate linear algebra “at work (using Quantum Mechanics)” without worrying too much about mathematical rigour. Now I must warn you that this mathematical rigour is absolutely necessary, not only to appreciate the beauty of mathematics (I am a big fan of the water-tight derivations done by mathematicians!) but to get a better understanding of Quantum Mechanics (lest you end up in a situation where it might seem that Quantum Mechanics is inconsistent with itself). Basically, in the following demonstration I will define certain mathematical quantities in a very loose way (loose enough to drive any mathematician, with the slightest respect for mathematics, crazy) and mix them with the postulates of Quantum Mechanics without explicitly stating them. For instance, the example I gave in the last post deals with linear vectors belonging to a linear vector space equipped with an inner product structure (a Hilbert space). I believe that this way I won’t make the subject too abstract for people new to Linear Algebra and Quantum Mechanics.

So Quantum Mechanics says that the physical states are linear. This might not surprise you and you might think that it is obvious (maybe after going through the example I gave in last post). But think again! This statement is powerful enough to lead to certain results which you might not consider obvious.

Before going further I want you to see this video and try to understand the problem it states in the end.

https://www.youtube.com/watch?v=DfPeprQ7oGc

At first it might seem too confusing. But we will go step by step and try to reproduce the experimental results using Quantum Mechanics. I am considering the case when only one electron at a time is shot out from the source S. I associate a linear state vector \left|S\right> with the electron at the source. Let the states representing slits 1 and 2 be \left|1\right> and \left|2\right> respectively. The screen is basically a real number line where each point represents a state written as \left|y\right>. Note that y is a continuous parameter here.

Now recall the last post, in which I gave an example of “creating” a set of persons “using” two persons (a basis). In general a person can be written as \left|C\right> = \cos(\frac{\theta}{2})\left|A\right> + e^{i\varphi}\sin(\frac{\theta}{2})\left|B\right>, where |coefficient|^2 tells me the resemblance of person C to person A or B. Now I define the inner product: a function (a linear map) which maps an ordered pair of linear vectors to a complex scalar. This complex number gives the component of one linear vector along the other (|component|^2 gives the resemblance).

Consider the inner product of persons A and B, written \left<A|B\right>. We know that they don’t resemble each other at all, hence \left<A|B\right> should equal 0 (state B has no component along state A whatsoever). The states \left|A\right> and \left|B\right> are said to be orthogonal. We also know that state \left|A\right> completely resembles itself, and so does state \left|B\right>. Thus \left<A|A\right> = \left<B|B\right> = 1 (I could assign any number to this inner product). I have normalised the state vectors \left|A\right> and \left|B\right> (because the number I assigned is 1). They are now called orthonormal linear vectors.

To find the component of \left|C\right> along \left|A\right>, I just need to evaluate the inner product \left<A|C\right>. So

\left|C\right> = \cos(\frac{\theta}{2})\left|A\right> + e^{i\varphi}\sin(\frac{\theta}{2})\left|B\right>

\left<A|C\right> = \cos(\frac{\theta}{2})\left<A|A\right> + e^{i\varphi}\sin(\frac{\theta}{2})\left<A|B\right>

\left<A|C\right> = \cos(\frac{\theta}{2}) and \left<B|C\right> = e^{i\varphi}\sin(\frac{\theta}{2})

The resemblance of state C to state A is |\left<A|C\right>|^2 and to state B is |\left<B|C\right>|^2. This is the idea of the inner product.

In Quantum Mechanics it is used in a different way. We have state C (now representing a physical state, not a person) given in terms of states A and B (again physical states). If a system is described by state \left|C\right>, then \left<A|C\right> is the amplitude for state \left|C\right> to exist in state \left|A\right> (read it this way and understand it this way!). |\left<A|C\right>|^2 is the probability of finding a system described by state \left|C\right> in state \left|A\right>, and the probability of finding it in state \left|B\right> is |\left<B|C\right>|^2. So instead of resemblance, the component of the vector gives the probability. This is the basic rule which nature seems to follow; Max Born was the one who realised this.
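Here is a short NumPy illustration of this rule (my own sketch, with arbitrary example angles): representing the orthonormal basis as |A⟩ = (1, 0) and |B⟩ = (0, 1), the two probabilities come out of inner products and sum to 1.

```python
# Build |C> = cos(theta/2)|A> + e^{i phi} sin(theta/2)|B> and read off
# the Born-rule probabilities |<A|C>|^2 and |<B|C>|^2.
import numpy as np

theta, phi = np.pi / 3, np.pi / 4        # arbitrary example angles
A = np.array([1.0, 0.0], dtype=complex)
B = np.array([0.0, 1.0], dtype=complex)
C = np.cos(theta / 2) * A + np.exp(1j * phi) * np.sin(theta / 2) * B

amp_A = np.vdot(A, C)                    # <A|C>; vdot conjugates its first argument
amp_B = np.vdot(B, C)                    # <B|C>
print(abs(amp_A)**2, abs(amp_B)**2)      # 0.75, 0.25 for these angles
print(abs(amp_A)**2 + abs(amp_B)**2)     # 1.0
```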

A question might arise here: before measurement the system was in the state \left|C\right> = \cos(\frac{\theta}{2})\left|A\right> + e^{i\varphi}\sin(\frac{\theta}{2})\left|B\right>. Mathematically this is called a superposition of states \left|A\right> and \left|B\right>. But what is it physically? Does it mean that a system represented by state \left|C\right> exists in both states \left|A\right> and \left|B\right> simultaneously? Quantum Mechanics gives no answer to this. It is something that you have to interpret yourself.

You might try to relate this to the example I gave in the previous blog post, but you must understand that these physical states are very different from the states representing persons, in the sense that there are no “extra variables” (like lefty, Hindi and lazy) which make up the states. These states are fundamental and nothing is implicit. For instance, state \left|1\right> represents slit 1 in the double slit experiment, and that is it. So from now on, the physical states are to be considered without any extra variables.

Now I define another linear entity, the “operator”. It is again a linear map (or function), one which maps a linear vector to another linear vector. An operator is denoted by a letter with a hat, \hat{A}. The action of a linear operator on a linear vector is written \hat{A}\left|\alpha\right> = \gamma\left|\beta\right>, where \gamma is a complex number. Now I can represent this same operator \hat{A} by \left|\beta\right>\left<\alpha\right|. This is known as “outer product notation”. Let’s check whether this notation is compatible with the definition of the operator.

\hat{A}\left|\alpha\right> = \left|\beta\right>\left<\alpha\right|\left|\alpha\right> = \left|\beta\right> \left<\alpha|\alpha\right> = \gamma\left|\beta\right>

Here \left<\alpha|\alpha\right> = \gamma. State \left|\alpha\right> might not be normalised.
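A quick NumPy check of this outer-product identity (an illustrative sketch with made-up two-dimensional vectors):

```python
# The matrix |beta><alpha| acting on |alpha> gives <alpha|alpha> |beta>.
import numpy as np

alpha = np.array([1.0 + 1.0j, 2.0], dtype=complex)   # deliberately unnormalised
beta = np.array([0.0, 1.0], dtype=complex)

A_hat = np.outer(beta, alpha.conj())   # |beta><alpha| as a matrix
gamma = np.vdot(alpha, alpha)          # <alpha|alpha> = 6 here

print(A_hat @ alpha)                   # [0, 6]
print(gamma * beta)                    # the same vector
```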

In Quantum Mechanics all kinds of actions on a state vector are represented by linear operators. There are operators corresponding to time translation, space translation, space rotations and observables. By definition these operators map a linear vector (a physical state) to another linear vector (another physical state).

Suppose you are standing on the other side of the slits (opposite the electron source), and suppose that slit 2 is closed. What state would you assign to the electrons coming out towards you? Slit 1 now acts like a source, so you would say that the state of the electron is \left|1\right>. Thus the state \left|S\right> has been mapped to the state \left|1\right>. This is the handiwork of slit 1, and thus we should associate an operator with it. What should the operator look like? It should be able to tell us the amplitude for state \left|S\right> to exist in state \left|1\right>. Consider the form \left|1\right>\left<1\right|. Now (\left|1\right>\left<1\right|)\left|S\right> = \left<1|S\right>\left|1\right>. This operator maps the state \left|S\right> to the state \left|1\right> and also provides the required amplitude. So far so good! Similarly, on closing slit 1, we find that the operator corresponding to slit 2 is \left|2\right>\left<2\right|. In hat notation, I assign operators \hat{M_1} and \hat{M_2} to slits 1 and 2 respectively.

Now we have the necessary tools to work out the Quantum Mechanics of this experiment. The states we have for the double slit experiment are \left|S\right>, \left|1\right>, \left|2\right> and \left|y\right>, and the operators are \hat{M_1} and \hat{M_2}. If I shoot one electron after another, then (assuming no absorption by the slit wall) I should get a corresponding detection at some state \left|y\right> (on the screen). So one electron is emitted from the source and we detect one electron on the screen. I should evaluate the amplitude for state \left|S\right> to exist in state \left|y\right>, i.e., the number \left<y|S\right>. Then I can get the probability with which an electron will be detected at y by evaluating |\left<y|S\right>|^2.

We saw that the slits have the ability to map state \left|S\right> to state \left|1\right> or state \left|2\right>, and these are the only two alternatives. This should make us conclude that state \left|S\right> is made up of states \left|1\right> and \left|2\right>, each with a certain amplitude (whose modulus squared is the probability for state S to exist at slit 1 or 2), combined in a linear way. Thus

\left|S\right> = c_1\left|1\right> + c_2\left|2\right>

\left|S\right> = \left<1|S\right>\left|1\right> + \left<2|S\right>\left|2\right>

and

\left<y|S\right> = \left<1|S\right>\left<y|1\right> + \left<2|S\right>\left<y|2\right>

with a little re-arrangement,

\left<y|S\right> = \left<y|1\right>\left<1|S\right> +\left<y|2\right>\left<2|S\right>

Feynman’s way

In Feynman’s Lectures, Volume III, he obtains the last equation in a very beautiful and intuitive way. He first states two rules:

when an event can occur in several alternative ways, the probability amplitude of the event is the sum of the amplitudes of each alternative way considered separately (provided the experiment is not capable of determining which alternative is actually taken). If the experiment is capable of determining this, then the probabilities should be added.

and

if a particle goes through a route, then the amplitude for that route can be written as the product of the amplitude to go part of the way and the amplitude to go the rest of the way.

In our double slit experiment, the particle can travel from state \left|S\right> to the screen in two ways (which cannot be distinguished just by detecting the electron at the screen); thus the amplitudes must add:

\left<y|S\right> = \left<y|S\right>_{slit 1} + \left<y|S\right>_{slit 2}

The experiment proceeds via a superposition of the two possible alternatives. Using the second rule we can say that \left<y|S\right>_{slit 1} = \left<y|1\right>\left<1|S\right> (I have changed the order since it is better to read from right to left). Similarly \left<y|S\right>_{slit 2} = \left<y|2\right>\left<2|S\right>, and we end up with the same equation.

Now \left<1|S\right> is a complex number and so is \left<y|1\right>, but y is a continuous variable which can take any value, like 3.4, 6, 0.12 and so on, and for every y we should have a corresponding amplitude. It is better to represent \left<y|1\right> by a function, say \phi_1(y). The same goes for \left<y|2\right>. Thus

\left<y|S\right> = \phi_1(y)\left<1|S\right> +\phi_2(y)\left<2|S\right>

\left<y|S\right> = a\phi_1(y) +b\phi_2(y)

\left<y|S\right> = \phi_1(y) +\phi_2(y).

You can absorb a and b into the functions; it is not a problem because they are constants.

The probability of detecting a particle on the screen (at a certain y) is |\left<y|S\right>|^2 = |\phi_1(y) +\phi_2(y)|^2. So now do you see why we get an interference pattern? The probability contains a cross term, which is precisely the interference.
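To make the cross term visible, here is a toy NumPy model (entirely my own illustration, with made-up parameters): take the two amplitudes to be plane waves whose phases come from the path lengths between each slit and the point y on the screen.

```python
# |phi1 + phi2|^2 = |phi1|^2 + |phi2|^2 + 2 Re(phi1 * conj(phi2)):
# the last (cross) term produces the fringes.
import numpy as np

k, d, L = 30.0, 1.0, 5.0              # wavenumber, slit spacing, screen distance
y = np.linspace(-4.0, 4.0, 1001)

r1 = np.sqrt(L**2 + (y - d / 2)**2)   # path length via slit 1
r2 = np.sqrt(L**2 + (y + d / 2)**2)   # path length via slit 2
phi1 = np.exp(1j * k * r1)            # stands in for <y|1><1|S>
phi2 = np.exp(1j * k * r2)            # stands in for <y|2><2|S>

prob = np.abs(phi1 + phi2)**2
cross = 2 * np.real(phi1 * np.conj(phi2))
print(prob.min(), prob.max())         # ~0 and ~4: dark and bright fringes
```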

You see how, by associating linear vectors with physical systems (and using some additional assumptions), we found the astonishing result of interference. If you want, you can watch the video again (keeping this treatment in mind). Now I am going to discuss the physical implications of this treatment.

First, you might be wondering about the nature of the electron. Is it a particle? Or is it a wave? Well, it is neither. Wave and particle are classical concepts which are often mixed under the name “wave-particle duality” to describe quantum entities. I consider this misleading, especially for people new to Quantum Mechanics. In Quantum Mechanics we just say that it is a linear vector, by which we mean that it is an entity that is linear in nature. There is absolutely no need to draw a picture of the electron in your mind (like a rigid ball, a wavy curve, or a mixture of both).

Now, since it is a linear vector, it can exist in a superposition of various states. So it is not necessary to imagine the electron going through a particular slit (when you are not observing the electrons at the slits). It is going through both slits (if this is what you want to hear). In Quantum Mechanics you will come across such examples many times.

In the video you saw that the interference pattern disappears when one tries to observe which slit the electron goes through. So the act of measurement destroys the interference pattern. Of course we can show this using Quantum Mechanics, but it involves the concept of the tensor product (or direct product) of linear vectors, which is beyond the scope of this article. Feynman, on the other hand, shows (by following those two rules) how measurement destroys interference without explicitly talking in terms of direct products. I don’t want to repeat it, so I recommend Feynman Lectures Vol III, chapter 3.

Linear vectors and Quantum Mechanics. (I)

i\hbar\frac{\partial}{\partial t}\left|\Psi(t)\right>=\hat{H}\left|\Psi(t)\right>

This is called the Schrödinger equation, and it is used throughout Quantum Mechanics. To understand this equation one should understand the mathematics on which Quantum Mechanics is based.

Note the \left|\Psi(t)\right> appearing on both sides of the equation. Mathematicians call it a “linear vector”.

In mathematics linear vectors are recognised by the following property:

On adding two linear vectors the entity obtained is again a linear vector.

Mathematical examples of linear vectors are numbers, integers, matrices and solutions of linear differential equations (there are more, think!). You can see that if you add two entities of one type (say integers), then you end up with an entity of the same type again (an integer) and not some other kind (a matrix or an irrational number). On the other hand, solutions of non-linear differential equations are not linear vectors: if you add two solutions, the resultant function will not satisfy the equation!

In Quantum Mechanics we use linear vectors to represent the state of a physical system at a certain time t. In physics, the state of a system means the information about the various variables (in the form of numbers) related to the system. If I know these numbers, I know the physical system through and through. And if I know these numbers as functions of time, I know the dynamics of the system. One natural question which may arise is:

If we know these numbers as functions of time, does it mean we know the future of that physical system?

According to Classical Physics the answer is yes (in principle), but in Quantum Mechanics we can only estimate the odds, so the answer is no (even in principle).

So Quantum Mechanics says that I can associate a \left|\Psi(t)\right> with any physical system (starting from electrons and molecules (even myself!) up to galaxies and beyond). All the information (the set of numbers) associated with the physical system is buried inside this state vector. It is the job of the physicist to extract the information and use it.

Quantum Mechanics is telling us that the properties of physical systems are linear in nature. If I add two physical states then the resulting state will also be a physical state!

Let us consider a physical example which explicitly shows this property of linear vectors; then I will come back to explain what I said in the last paragraph. Consider a city which is inhabited by only 8 people. Each person has different characteristics which make him/her unique. For simplicity, let us say that they have only three characteristics, each with two possible choices.

  1. The hand they use to do work (lefty/righty)
  2. The language they know (Hindi/English)
  3. Habit (lazy/active)

Let us consider a person ‘A’ having three characteristics as

  1. Righty
  2. Hindi
  3. Active

and denote this person by the linear vector \left|A\right>. If we search the city for the set of characteristics mentioned above, it means we are searching for person A, and thus we should use \left|A\right> for this set of characteristics, which describes the state.

Similarly, let there be person ‘B’ having characteristics

  1. Lefty
  2. English
  3. Lazy

and denote this set, or state, by \left|B\right>. Analyse the table below and take a snapshot of it in your mind.

              Person A or \left|A\right>    Person B or \left|B\right>    Phase (\varphi)
Work          Righty                        Lefty                         0
Language      Hindi                         English                       \frac{2\pi}{3}
Habit         Active                        Lazy                          \frac{4\pi}{3}

Now my aim is to show that one can represent the other 6 persons using these two persons, or in other words, that I can represent 6 other states using these two states (\left|A\right> and \left|B\right>). For the given scenario, the only possible way is to choose one characteristic from one person and the other two from the other. This way we can build 6 other sets of characteristics (6 more states and hence 6 more people). We set the coefficients of \left|A\right> and \left|B\right> to the square root of the number of characteristics picked from that particular state (I will explain the reason a little later).

Consider a case: select 1 characteristic of person A, say work, and the rest from person B, namely language and habit. The state thus formed is {Righty, English, Lazy}, which represents neither person A nor person B. It represents a new state, or a new person, denoted by \left|C\right> = \sqrt{\frac{1}{3}}\left|A\right> + \sqrt{\frac{2}{3}}\left|B\right>. Here the coefficient of \left|A\right> is the square root of the number of characteristics picked from person A, and similarly the coefficient of \left|B\right> is the square root of the number of characteristics picked from person B. I have divided the whole linear vector by \sqrt{3} in order to normalize it. So we can represent a new person in terms of two special persons (call them a basis).

Let us pause here and think about why linear vectors are suitable for describing persons. We defined a person by a set of characteristics, and we know that if we add characteristics (in any proportion) we get another set of characteristics which describes another person (and not some animal!). This idea is important, and it is what allowed us to use linear vectors for persons.

If you are wondering why I chose the square root of the number of characteristics picked as the coefficients, now is the time to explain. We can say that |coefficient|^2 shows the similarity to the person (or state) it is written with. Thus person C is 1/3 like person A and 2/3 like person B, which is also obvious from the choices I made.

But this is not quite right; there is a flaw. If I select the language of person A and the other two characteristics from person B, then the state I get is {Lefty, Hindi, Lazy}, which represents a different person altogether (who is certainly not person C). So let us call him person D (if you did think of this on your own, then you are really paying attention). Hence I need one more variable to store one more piece of information: which choice was made first. The previously mentioned coefficients tell me how many characteristics were picked from persons A and B, so I need to keep track of the person from which only one characteristic was chosen (because this choice determines the other two choices from the other person). Hence I introduce a term (call it the phase) \varphi. Take a look at the table: the fourth column shows the phase associated with each characteristic (later I will explain the reason for this particular allotment of phases).

If I define the linear vector associated with person C as \left|C\right> = \sqrt{\frac{1}{3}}\left|A\right> + e^{i\cdot 0}\sqrt{\frac{2}{3}}\left|B\right> and person D as \left|D\right> = \sqrt{\frac{1}{3}}\left|A\right> + e^{2\pi i/3}\sqrt{\frac{2}{3}}\left|B\right>, then everything falls into place. Not only can I see the mathematical difference between persons C and D, but I am also able to preserve the definition of “likeness”, in the sense that both persons C and D are 1/3 like person A and 2/3 like person B, and yet they are different!
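A tiny NumPy check of this point (an illustrative sketch with |A⟩ = (1, 0) and |B⟩ = (0, 1)): C and D have identical likeness to A and B, yet are distinct vectors.

```python
# Persons C and D: same |coefficients|^2, different phases.
import numpy as np

A = np.array([1.0, 0.0], dtype=complex)
B = np.array([0.0, 1.0], dtype=complex)

C = np.sqrt(1 / 3) * A + np.exp(0j) * np.sqrt(2 / 3) * B
D = np.sqrt(1 / 3) * A + np.exp(2j * np.pi / 3) * np.sqrt(2 / 3) * B

print(abs(np.vdot(A, C))**2, abs(np.vdot(B, C))**2)   # 1/3, 2/3
print(abs(np.vdot(A, D))**2, abs(np.vdot(B, D))**2)   # 1/3, 2/3 again
print(np.allclose(C, D))                              # False: distinct states
```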

Now you might ask why I assigned the phases like this. I considered the number of ways in which one characteristic can be chosen from one person and two from the other (for this case it is 3). I then divided 2\pi by that number and allotted the phases accordingly.

Another question might arise at this point: what if we have a set of more than three characteristics? Will this method work? Certainly, the answer is yes. In that case you consider all the permutations and assign a phase to each permutation, as I did in this case.

But why divide the number 2\pi? Well, it clearly means that I want to divide a circle into parts, the number of parts being equal to the number of permutations counted. I also know that the coefficients should vary from 0 to 1, so I can use the sine and cosine of a parameter instead of these coefficients. Let that parameter be \theta/2, where \theta varies from 0 to \pi. And we know that the phase \varphi varies from 0 to 2\pi (in steps). What do you make of this? Have you seen these limits elsewhere? Yes, of course: these are the polar and azimuthal angles of a spherical-polar coordinate system, where the length (defined as |coefficient 1|^2 + |coefficient 2|^2) is constant as \theta and \varphi are varied. The surface traced this way is, no doubt, a sphere. Discrete points on the sphere represent the states (determined by the \theta and \varphi of each point).

In general, a state \left|\Psi\right> representing a person can be written as

\left|\Psi\right> = \cos(\frac{\theta}{2})\left|A\right> + e^{i\varphi}\sin(\frac{\theta}{2})\left|B\right>

This sphere has a special name: the Bloch sphere. The only difference is that on our sphere we have discrete points, whereas on the Bloch sphere there are uncountably many points and hence uncountably many states; there \theta and \varphi vary continuously.
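As a final sketch (my own illustration), here is the map from a point (\theta, \varphi) on the sphere to its state vector, with a check that every such state is normalized:

```python
# Sample points on the Bloch sphere and verify <psi|psi> = 1 for each.
import numpy as np

def bloch_state(theta, phi):
    """State cos(theta/2)|A> + e^{i phi} sin(theta/2)|B> as a 2-vector."""
    return np.array([np.cos(theta / 2),
                     np.exp(1j * phi) * np.sin(theta / 2)], dtype=complex)

for theta in np.linspace(0, np.pi, 5):
    for phi in np.linspace(0, 2 * np.pi, 6):
        s = bloch_state(theta, phi)
        assert np.isclose(np.vdot(s, s).real, 1.0)

print("all sampled states are normalized")
```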

Let us stop here and conclude this post with the following statements:

  1. Anything (mathematical or physical) that is linear can be represented by linear vectors.
  2. Quantum Mechanics says that all the physical systems are linear.
  3. To represent a set of linear entities (in this example a set of 8 persons) we can find a subset of those entities (persons A and B), known as a basis, to represent all the other entities.

In next post I will demonstrate how linear vectors are used in Quantum Mechanics.