On A New Technique to Solve Ordinary Differential Equations

Over the past two years I have gradually developed a very effective technique for solving ordinary differential equations. It may not be mathematically rigorous (in fact, when I showed it to a mathematics professor he was absolutely freaked out), but it works, and that’s all that matters to a physics student. I call it the theory of the differential generator, and here is a rather long summary of it.


Conceptual Quantum Mechanics (Part 1)

As part of a recollection, I have gathered here some of my conceptual analysis of Quantum Mechanics.

1. Preliminary

It started when people observed that conventional particles, for example electrons, behave like waves. This hints at a description of particles in terms of wave mechanics, so here we go: we will describe a particle by \psi(x, t). However, it’s not so easy to interpret. We could say that, since it’s a wave, maybe it has properties similar to those of a light wave. Since we can interpret the amplitude of a light wave as a probability amplitude, \psi(x, t) could possibly also be a probability amplitude of some sort.

Being a probability amplitude implies that

|\psi(x, t)|^2 \rightarrow \text{probability density function}

\int|\psi(x, t)|^2dx = 1

In other words, the second equation means that the particle must be somewhere; we call it the normalization condition.

2. Two particles

Now suppose our system has two particles that are sufficiently far apart that there is no interaction between them. The two particles are described by \psi_1(x_1, t) and \psi_2(x_2, t) respectively. One may ask, what is the probability amplitude of finding particle 1 at x_1 and particle 2 at x_2?

It’s not hard to guess that

\psi_{12}(x_1, x_2, t) = \psi_1(x_1, t)\psi_2(x_2, t)

This really follows from the common-sense rule that the probability of two independent events occurring together is the product of the probabilities of the individual events. We can check that it indeed behaves like a probability amplitude.

\int|\psi_{12}(x_1, x_2, t)|^2 dx_1 dx_2=\int|\psi_1(x_1, t)|^2|\psi_2(x_2,t)|^2dx_1dx_2=1
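The normalization of the product amplitude can also be checked numerically. The following is a minimal sketch, assuming each particle is described by a Gaussian amplitude at a fixed time; the Gaussian shape, the centers, the widths and the function names are illustrative assumptions, not part of the argument above.

```python
import math

def gaussian_amplitude(x, center, width):
    """Real Gaussian amplitude whose square integrates to 1 over the real line."""
    norm = (2.0 * math.pi * width ** 2) ** -0.25
    return norm * math.exp(-((x - center) ** 2) / (4.0 * width ** 2))

def joint_norm(c1, w1, c2, w2, n=200):
    """Midpoint-rule double integral of |psi_1(x1) psi_2(x2)|^2 over a 2D grid."""
    lo1, hi1 = c1 - 10 * w1, c1 + 10 * w1   # tails beyond 10 widths are negligible
    lo2, hi2 = c2 - 10 * w2, c2 + 10 * w2
    h1, h2 = (hi1 - lo1) / n, (hi2 - lo2) / n
    total = 0.0
    for i in range(n):
        p1 = abs(gaussian_amplitude(lo1 + (i + 0.5) * h1, c1, w1)) ** 2
        for j in range(n):
            p2 = abs(gaussian_amplitude(lo2 + (j + 0.5) * h2, c2, w2)) ** 2
            total += p1 * p2
    return total * h1 * h2

# Two well-separated particles (hypothetical centers and widths).
print(joint_norm(-5.0, 1.0, 5.0, 2.0))
```

The printed value should be very close to 1, which is just the statement that the double integral factorizes into two single-particle normalization integrals.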

3. Energy conservation

The fact that we could describe a particle by a probability amplitude implies that the information about energy must be somehow encapsulated in the function. This means

E = E[\psi(x, t)]

Now let’s consider a two-particle system as we described above. We know their energies respectively.

E_1=E[\psi_1(x_1, t)]

E_2=E[\psi_2(x_2, t)]

The total energy of the system E is

E=E_1+E_2

On the other hand, if we consider the total probability amplitude of the system, we will have

E=E[\psi_{12}(x_1, x_2, t)]

By comparison we conclude that

E[\psi_1(x_1, t)\psi_2(x_2, t)]=E[\psi_1(x_1,t)]+E[\psi_2(x_2, t)]

This shows that E must have a peculiar structure. One familiar function with the above property is the logarithm.

\log(AB)=\log(A)+\log(B)

If this is the case, we would expect

E\rightarrow \log(\psi(x, t))

\psi(x, t)\rightarrow Ae^{aE}

Since there is no justification for saying that A and a are constants, we expect

\psi(x, t) = A(x, t)e^{a(x, t)E}

It can’t be right that the probability distribution depends exponentially on the energy. The exponential dependence would have to be compensated by an extra term to ensure that the particle exists somewhere, but this obviously contradicts our assumption that all the energy dependence sits in the exponential term. It must be that a is an imaginary number. We rewrite it as

\psi(x, t) = A(x, t)e^{ib(x, t)E}

where b is a real function. The energy dependence nicely cancels out when we check our normalization condition

\int|\psi(x, t)|^2dx=\int|A(x, t)|^2dx=1
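A quick numerical illustration of this cancellation, as a sketch only: the amplitude value A, the coefficient b and the energies below are arbitrary illustrative numbers, and the function name is hypothetical.

```python
import cmath

def density(amplitude, exponent_coeff, energy):
    """Return |A * exp(coeff * E)|^2 for a real or complex coefficient."""
    psi = amplitude * cmath.exp(exponent_coeff * energy)
    return abs(psi) ** 2

A = 0.6 + 0.3j   # arbitrary complex amplitude at some point x
b = 1.7          # arbitrary real coefficient

for E in (0.0, 1.0, 5.0):
    imaginary_case = density(A, 1j * b, E)  # exponent i*b*E: a pure phase
    real_case = density(A, b, E)            # exponent b*E: grows with E
    print(E, imaginary_case, real_case)
```

With the imaginary exponent the density stays at |A|^2 for every E, while with a real exponent it changes with E, which is the contradiction described above.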

We see that the dependence on t disappears after integration over x. This can only happen if either A doesn’t depend on time, which is physically wrong, or the dependence on t cancels out like the term with E. In the latter case, one can factor out the dependence on t and absorb it into b, so we have a new form

\psi(x, t)=\phi(x)e^{ic(x, t)E}

Now we want to see how \psi(x, t) changes in time by taking the partial derivative with respect to t.

\frac{\partial\psi(x,t)}{\partial t}=i\phi(c\frac{\partial E}{\partial t}+E \frac{\partial c}{\partial t})e^{icE}=i(c\frac{\partial E}{\partial t}+E \frac{\partial c}{\partial t})\psi

In a closed system the energy doesn’t change with time, so we get

\frac{\partial\psi(x,t)}{\partial t}=iE\frac{\partial c}{\partial t}\psi

We can infer that \frac{\partial c}{\partial t} can’t depend on t because, in an arbitrary system, how \psi changes with time should be independent of when we choose to start timing. We have

\frac{\partial c}{\partial t}=C(x)

The same argument applies in space: in an arbitrary system, the change of \psi should be independent of where we place the origin of our coordinates. Therefore we conclude that

\frac{\partial c}{\partial t} = k

where k is a constant. Substituting this result back, we get the expression for the time evolution of the wave function

\frac{\partial}{\partial t}\psi = iEk\psi

We can then easily determine the constant k by fitting to experiment. Consider the example of a photon. Classically, a photon is described as a propagating, alternating electric and magnetic field. However, as proposed by Einstein, the amplitude of the classical electromagnetic field can also be interpreted as the probability amplitude of the photon. This means that for a free photon, its wave function would be of the following form.

\psi = Ae^{i(kx - wt)}

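where A is a normalization constant, the k in the exponent is the wave number (not to be confused with the constant k introduced above) and w is the angular frequency of the wave. Substituting this wave function into our time-evolution equation, where k is again the constant, we get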

\frac{\partial}{\partial t}\psi =-iw\psi= iEk\psi

We know the energy of a photon with frequency w is

E=\hbar w

Substituting this expression for the energy into the previous equation

-iw\psi= iEk\psi

-w\psi=\hbar wk\psi

k=-\frac{1}{\hbar}

We get the value of k, thus we have derived that

\frac{\partial}{\partial t}\psi =-iE\frac{1}{\hbar}\psi

i\hbar\frac{\partial}{\partial t}\psi =E\psi

This is in fact one of the most important results in Quantum Mechanics. Following the argument, one has to conclude that the emergence of complex numbers is a natural consequence of conserving energy, and in this sense Quantum Mechanics doesn’t seem that mysterious at all!
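As a sanity check, the final relation can be tested on the plane wave used above with a finite-difference time derivative. This is a sketch in natural units (setting \hbar to 1 is an assumption for readability); the function names and sample values are illustrative.

```python
import cmath

HBAR = 1.0  # natural units; an assumption made only to keep the numbers readable

def plane_wave(x, t, wave_number, w, amplitude=1.0):
    """psi = A exp(i (k x - w t)), with k the wave number."""
    return amplitude * cmath.exp(1j * (wave_number * x - w * t))

def time_derivative(x, t, wave_number, w, dt=1e-6):
    """Central finite-difference approximation of d(psi)/dt."""
    forward = plane_wave(x, t + dt, wave_number, w)
    backward = plane_wave(x, t - dt, wave_number, w)
    return (forward - backward) / (2 * dt)

def residual(x, t, wave_number, w):
    """|i hbar d(psi)/dt - E psi| with E = hbar * w."""
    lhs = 1j * HBAR * time_derivative(x, t, wave_number, w)
    rhs = HBAR * w * plane_wave(x, t, wave_number, w)
    return abs(lhs - rhs)

print(residual(x=0.3, t=1.2, wave_number=1.5, w=2.0))
```

The residual should be tiny, limited only by the finite-difference step, so the plane wave with E = \hbar w is consistent with the derived equation.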

Commutation Relations

Commutation relations in quantum mechanics have interested me ever since I discovered the existence and equivalence of two commutation relations. The story is that one day, while I was reading a book on relativity, I started to wonder whether quantum mechanics would be better formulated in a four-vector (space-time) representation. Here is my thought process. The two commutation relations are

[\hat{x}, \hat{p}] = i\hbar

[\hat{H}, \hat{t}] = i \hbar

In theory of relativity we have,

x^a = (ct, x^i)

p^a = m\frac{dx^a}{d\tau} = \gamma m(c, v^i) = (E/c, p^i)

The question is: we know [\hat{x^i}, \hat{p^i}] = i\hbar, but does this relation hold in all four dimensions, i.e. for [\hat{x^a}, \hat{p^a}]? Before that, let’s mention a simple fact about commutation relations,

[a\hat{A}, b\hat{B}] = ab[\hat{A}, \hat{B}]

Imagining we are in the rest frame, i.e. \gamma = 1, replacing all variables with the corresponding operators and replacing the energy E with H, I got the following

[\hat{x^0}, \hat{p^0}] = [c\hat{t}, \hat{H}/c] = [\hat{t}, \hat{H}] = -i\hbar

However,

[\hat{x^i}, \hat{p^i}] = i\hbar

They have different signs! This troubled me for some time because it doesn’t seem as elegant, until I realized something was wrong with my calculation! Time and space are treated on a completely equal footing, which shouldn’t happen because time is unidirectional, unlike space! This inspired me to add an extra factor of i to put the coordinates on an equal footing

x^a = (ict, x^i)

It follows then,

 [\hat{x^0}, \hat{p^0}] = [ic\hat{t}, i\hat{H}/c] = -[\hat{t}, \hat{H}] = i\hbar

[\hat{x^a}, \hat{p^a}] = i\hbar

The commutation relation indeed holds in four dimensions. It is not only remarkable but also mysterious that the addition of an imaginary number makes the picture complete.
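The sign flip comes purely from pulling the scalar factors out of the commutator, since (ic)(i/c) = -1. Here is a small sketch of that step using 2x2 matrices as stand-ins for the operators. The matrices T and H below are arbitrary non-commuting examples chosen for illustration; finite matrices cannot satisfy the canonical relation itself, so only the scalar-factor rule is being checked.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]

def scale(s, a):
    """Multiply every entry of matrix a by the scalar s."""
    return [[s * v for v in row] for row in a]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def max_abs_diff(a, b):
    """Largest entrywise difference between two matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

T = [[0, 1], [1, 0]]    # stand-in for the time operator (illustrative)
H = [[1, 0], [0, -1]]   # stand-in for the Hamiltonian (illustrative)
c = 3.0                 # arbitrary value for the speed of light

lhs = commutator(scale(1j * c, T), scale(1j / c, H))  # [ic T, i H / c]
rhs = scale(-1, commutator(T, H))                     # -[T, H]
print(max_abs_diff(lhs, rhs))
```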

After I discovered this, I was so excited that I called my friend to tell him about it. He, having taken a graduate course on quantum field theory, told me that this is indeed what people have found, but instead of solving the problem by introducing an imaginary number, people use the metric tensor to resolve the inconsistency.

We had a big argument, which I think is worth mentioning here. My friend insisted that the metric tensor approach is more fundamental because apparently some famous guy said so and all textbooks are consistent with it, and that my solution is just a mathematical trick with no fundamental truth beneath it. I didn’t agree with this statement.

There are no authorities in physics. There are countless cases where orthodox belief was proven completely wrong. If only one thing can be called fundamental, it is experimental truth: unchangeable, unprejudiced, the foundation upon which all theoretical architectures are built.

The metric tensor is really a notational invention that encapsulates the distinction between time and space in a geometric interpretation. I see no justification for the view that it is more fundamental than introducing an imaginary number.

I am personally fond of my own notation for the following reasons. It gives rise to a nice equivalence of space and time, which is elegant in its own right. If a law applies to one dimension, it applies to every other dimension.

I believe the fact that time runs along an imaginary axis has profound implications. Anything oscillating in space and time can only behave in two ways: it either oscillates in space while decaying in time, or oscillates in time while decaying in space. That’s how nature works: nothing can happen at all times in all places. Some conservation law seems to be at work, in a mysterious but elegant way.

Two Commutation Relations

In the previous post I mentioned that the First Commutation Relation and the Second Commutation Relation are equivalent. Here is a proof from a classical perspective.

We assume one of them to be true, for example the First Commutation Relation, i.e.

[\hat{x}, \hat{p}] = i\hbar\rightarrow \hat{p} = -i\hbar\frac{\partial{}}{\partial{x}}

Classical Hamiltonian is given by

\hat{H} = \frac{\hat{p}^2}{2m} + \hat{V} = -\frac{\hbar^2}{2m}\frac{\partial{}^2}{\partial{x^2}} + V

Next we compute the commutator of the Hamiltonian and the position

[\hat{H},\hat{x}] = [\frac{\hat{p}^2}{2m} + \hat{V},\hat{x}]=\frac{1}{2m}[\hat{p}^2, x] =i\hbar\frac{\hat{p}}{m}

We see the operator in the last term is precisely the velocity operator \hat{v}

\hat{v}\psi = v\psi

Therefore from definition of velocity

\hat{v}\psi = \frac{\partial{x}}{\partial{t}}\psi = [\frac{\partial{}}{\partial{t}}, \hat{x}]\psi

So we conclude that

[\hat{H},\hat{x}]\psi = i\hbar[\frac{\partial{}}{\partial{t}}, \hat{x}]\psi

\hat{H} = i\hbar\frac{\partial{}}{\partial{t}}

It then follows

[\hat{H},\hat{t}] = [i\hbar\frac{\partial{}}{\partial{t}}, \hat{t}] = i\hbar

Equivalence proved.

(Note that it’s only proved for the non-relativistic Hamiltonian; I still haven’t worked out the most general case.)

On Schrodinger Equation

There is a concept in the Schrodinger equation that has never been taught in any of my Quantum Mechanics lectures.

We assume the validity of the following commutation relation

[x, p] = i\hbar

We call it the First Commutation Relation. Equivalently, we conjecture the validity of another commutation relation

[H, t] = i\hbar

We call it the Second Commutation Relation. Based on these two assumptions, we build up Quantum Mechanics as follows

From First Commutation Relation:

[p, x] = -i\hbar \rightarrow p = -i\hbar\frac{\partial{}}{\partial{x}}

From Second Commutation Relation:

[H, t] = i\hbar \rightarrow H = i\hbar\frac{\partial{}}{\partial{t}}
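Both operator identifications can be checked by letting the commutators act on a smooth function and using finite differences. This is a rough sketch in natural units (\hbar set to 1 is an assumption); the sample function, the evaluation point and all function names are illustrative choices.

```python
import math

HBAR = 1.0  # natural units; an assumption made only to keep the numbers readable

def derivative(f, u, h=1e-5):
    """Central finite-difference approximation of df/du."""
    return (f(u + h) - f(u - h)) / (2 * h)

def momentum_operator(f):
    """Return the function (-i hbar d/dx) f."""
    return lambda x: -1j * HBAR * derivative(f, x)

def energy_operator(f):
    """Return the function (i hbar d/dt) f."""
    return lambda t: 1j * HBAR * derivative(f, t)

def commutator_with_coordinate(operator, f, u):
    """Evaluate [operator, u] f = operator(u f) - u operator(f) at the point u."""
    u_times_f = lambda s: s * f(s)
    return operator(u_times_f)(u) - u * operator(f)(u)

def sample_function(u):
    """Arbitrary smooth function of one variable (hypothetical choice)."""
    return math.exp(-u * u) * math.cos(3 * u)

point = 0.4
p_x = commutator_with_coordinate(momentum_operator, sample_function, point)
h_t = commutator_with_coordinate(energy_operator, sample_function, point)
print(p_x / sample_function(point))  # compare with -i hbar
print(h_t / sample_function(point))  # compare with +i hbar
```

The same helper serves both relations because each commutator only involves one coordinate (x for the momentum, t for the Hamiltonian) and its own derivative.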

From Classical Mechanics, the Hamiltonian for a particle is defined as

H = \frac{p^2}{2m} + V

Substituting the operators for H and p into this equation,

H=\frac{p^2}{2m} + V = -\frac{\hbar^2}{2m}\frac{\partial{}^2}{\partial{x^2}} + V = i\hbar\frac{\partial{}}{\partial{t}}

This produces the Schrodinger Equation. Next we consider the case of the photon. The Hamiltonian for a photon is given by Einstein’s Special Theory of Relativity

H^2 = p^2c^2 + m^2c^4

Since the photon has no mass

H^2 = p^2c^2

Substituting our formulas for the momentum and the Hamiltonian

H^2 = -\hbar^2\frac{\partial{}^2}{\partial{x^2}}c^2=-\hbar^2\frac{\partial{}^2}{\partial{t^2}}

We can easily see

\frac{\partial{}^2}{\partial{x^2}} = \frac{1}{c^2}\frac{\partial{}^2}{\partial{t^2}}

This reproduces the wave equation for light that follows from Maxwell’s equations. Take a standard EM plane wave

\psi = Ae^{i(kx - \omega t)}

We can find its energy with the Hamiltonian operator

H\psi = i\hbar\frac{\partial{}}{\partial{t}}Ae^{i(kx-\omega t)} = \hbar\omega\psi \rightarrow E=\hbar\omega

This gives the correct energy of each photon, as predicted by Einstein and Planck.
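The wave-equation step above can also be checked numerically. The sketch below takes the real part of the plane wave and assumes the dispersion relation \omega = ck, which is what the massless relation H^2 = p^2c^2 implies for this wave; the numerical values and function names are illustrative.

```python
import math

def wave(x, t, k, c):
    """Real part of the plane wave, with angular frequency omega = c * k."""
    return math.cos(k * x - c * k * t)

def second_derivative(f, u, h=1e-4):
    """Central finite-difference approximation of the second derivative."""
    return (f(u + h) - 2 * f(u) + f(u - h)) / h ** 2

def wave_equation_residual(x, t, k, c):
    """|d2(psi)/dx2 - (1/c^2) d2(psi)/dt2| at the point (x, t)."""
    d2x = second_derivative(lambda s: wave(s, t, k, c), x)
    d2t = second_derivative(lambda s: wave(x, s, k, c), t)
    return abs(d2x - d2t / c ** 2)

print(wave_equation_residual(x=0.3, t=0.7, k=2.0, c=2.0))
```

The residual should be small compared with the size of either second derivative, which is of order k^2.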

As a concluding remark, the assumption that the two commutation relations are valid turns out to hold generally, even in the relativistic case. Interestingly, one can also show that the two commutation relations are essentially equivalent. Dropping the First Commutation Relation, our Quantum Mechanics builds itself upon the incompatibility of energy and time. This is absolutely mysterious, utterly intriguing and extremely suggestive: some undiscovered internal structure of space-time must govern this incompatibility.

Principle of Concentration

We first define a term called an Interest to represent one thing that a person may potentially do. Then we collect all the Interests of a person and organize them in such a way that associated Interests are interconnected. This gives us a huge network of Interests. We then topologically fold our network of Interests into a multidimensional space while keeping connected Interests as adjacent points in space. The space so formed is called Interest Space.

Each person is represented as a collection of N Interest points living freely in the Interest Space. We assume each Interest point has equal probability of moving in any direction in the Interest Space. Given N>>1 we can approximate the distribution of Interest points by a continuous function which we call the Concentration Function, written C(\mathbf{x}, t),

C(\mathbf{x}, t) = C(x_1, x_2, ... , x_d, t)

\int_V C(\mathbf{x},t)d\tau = N

where d is the dimension of Interest Space, {x_1, x_2, ... , x_d} represents a coordinate and d\tau represents a volume element in Interest Space.

Random walk of Interest Point

As the name suggests, the magnitude of the Concentration Function represents the level of concentration at a specific interest point. Our assumption is that each interest point has equal probability of moving in any direction. Anyone with basic physics training will immediately realize that this resembles a random walk. In the simplified case d = 2, it implies that a person’s concentration, initially at the origin, will move around as shown in Fig. 1, which illustrates graphically how our concentration digresses.

Fig 1: “Random Walk” of Interest Point. Image adapted from Wikipedia http://en.wikipedia.org/wiki/Brownian_motion

Diffusion of Concentration

When we have a collection of N interest points diffusing simultaneously from the same interest point, statistics tells us that they follow a Gaussian distribution around the initial interest point, which flattens over time.

Fig. 2 Qualitative feature of concentration diffusion. (Note the axis labels and time scale don’t correspond to our discussion) Image adapted from Wikipedia: http://en.wikipedia.org/wiki/Brownian_motion
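The flattening can be illustrated with a toy simulation. This is a minimal sketch, assuming a one-dimensional Interest Space in which each interest point steps left or right by one unit with equal probability; the number of points, the step counts and the fixed seed are arbitrary illustrative choices.

```python
import random

def walk_positions(num_points, num_steps, seed=0):
    """Final positions of independent 1D walkers that all start at the origin."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [sum(rng.choice((-1, 1)) for _ in range(num_steps))
            for _ in range(num_points)]

def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

# The spread of the cloud of interest points grows with the number of steps.
for steps in (10, 40, 160):
    print(steps, variance(walk_positions(2000, steps)))
```

The variance grows roughly in proportion to the number of steps, which is the widening (and hence flattening) of the Gaussian described above.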

Diffusion Equation of Concentration

An important feature of our construct is the finiteness of a person’s Interest Points. In other words, we enforce a Conservation of Interest Points. Together with our assumption that each interest point has equal probability of diffusing into its adjacent points, it’s easy to show that our concentration function must follow a diffusion equation.

\nabla^2 C(\mathbf{x},t) = D\frac{\partial{}}{\partial{t}}C(\mathbf{x},t)

To simplify the problem, we first investigate a one-dimensional model. Assuming there are no boundary conditions, we can solve the equation with the standard separation of variables technique. Given the constraint that C(x,t) must remain real, we get a solution of the form

C(x,t) = const. cos(kx - \frac{k}{\sqrt{D}}t)

I call this the Freeworking Equation, as we impose no constraints on interest space and time. The implication is that during freeworking, our concentration on any specific interest point will fluctuate sinusoidally with time, and our concentration will fluctuate sinusoidally across related interest points. All of this makes sense to me.

Fig. 3. Cosine function. Illustration that concentration fluctuates through interest points. Image adapted from: http://www.biology.arizona.edu/biomath/tutorials/trigonometric/graphtrigfunctions.html

Importance of boundary conditions

From the previous discussion, when there are no constraints, concentration merely fluctuates indefinitely throughout space and time. So here comes the question: how do we raise our concentration on a specific interest point? This is where boundary conditions start to play an important role.

Suppose we impose the constraint that anything outside interest points A and B will be completely ignored. The solutions start to look like those in Fig. 4. Compared with our freeworking equation, where our concentration merely flows around indefinitely, the solutions now become localized in a certain region. A remarkable fact is that the solutions now form a complete Fourier basis, able to compose any function between A and B. A similar effect occurs if we constrain our time, so we can concentrate arbitrarily on any interest point between A and B.

Fig. 4. When we constrain our concentration so that any interest points outside A and B are ignored completely, our concentration function looks like this. Image adapted from http://webs.morningside.edu/slaven/Physics/atom/atom5.html
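One property behind the Fourier-basis remark is that the localized modes are mutually orthogonal on the interval between A and B. The sketch below assumes the modes are sine functions that vanish at both A and B, as the figure suggests; that boundary condition, the endpoint values and the function names are assumptions made for illustration.

```python
import math

def mode(n, x, a, b):
    """n-th sine mode on [a, b], vanishing at both endpoints."""
    return math.sin(n * math.pi * (x - a) / (b - a))

def overlap(m, n, a, b, samples=2000):
    """Midpoint-rule integral of mode m times mode n over [a, b]."""
    h = (b - a) / samples
    total = 0.0
    for i in range(samples):
        x = a + (i + 0.5) * h
        total += mode(m, x, a, b) * mode(n, x, a, b)
    return total * h

A_POINT, B_POINT = 1.0, 4.0  # hypothetical positions of interest points A and B

print(overlap(1, 2, A_POINT, B_POINT))  # different modes
print(overlap(3, 3, A_POINT, B_POINT))  # same mode
```

Different modes have vanishing overlap, while a mode's overlap with itself equals half the interval length, which is what allows a function between A and B to be decomposed mode by mode.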

What does it mean after all?

We have shown that to raise concentration on a specific interest point, we must constrain our interest space and time. This is exactly why we often have to constrain ourselves by setting rigid deadlines and ignoring irrelevant topics in order to concentrate at will and therefore achieve great things.

One may ask: if we have enough time, why should we care about productivity at all? The ultimate reason is the intrinsic constraint of our lifetime. We simply don’t have enough time. That’s how nature made it: neither so short that we can’t achieve anything, nor so long that our concentration starts to diffuse. Nature simply assigns each species an appropriate time scale in favor of productivity. It’s absolutely fascinating!