A Little Technical Interlude

What I’m about to tell you is not something new. These ideas have been around for many years already (since classical mechanics was still a thing). Moreover, I have been bothering my friends, students, teachers and some angry chemists with this for the last four or five years, ever since a student in San Luis Potosi made the mistake of saying the word “Hamiltonian” in front of me. This led another professor to reveal to me one of the most guarded treasures of the physicist: the almighty Lagrangian.

This post is not about an advanced subject, but it is neither for the initiated, nor for the ones that ignore the subject completely. If you have never seen the Lagrangian formalism before, maybe this is not the best place to start, but I will take you by the hand. The reason I’m doing this is because I like this a lot, and I don’t usually find the treatment as I will present it here in many courses. Moreover, if you get familiar with this it will be easier for me to share with you one of the subjects I’m most passionate about… symmetries.

I’ll go a bit axiomatic at the beginning, but I assure you all of this is kinda justified. I will quickly enunciate the minimum-action principle, the Lagrangian formalism, the Hamiltonian formalism, the Poisson bracket and a small example (the harmonic oscillator, of course). This post is not necessarily about beautiful things, but I wanted to include it here so I can use some of these tools in later posts. Let’s start with the action.

The idea is that to every physical system we can assign a function called the Lagrangian. This little monster allows us to study the behavior of systems, their symmetries, their time evolution, etc. For “regular” mechanical systems, as we are used to in classical mechanics, we can write this quantity as

\displaystyle \mathcal{ L} = T - V 

The first term corresponds to the kinetic energy, while the second is the potential energy. We would like to use this to predict the movement of a particle between two points x_1, \; x_2 . The usual prescription is to take the following integral

\displaystyle  S[x(t)]  =\int_{t_1}^{t_2} \mathrm{d}t\; \mathcal{L}[x(t)] 

This curious object is referred to as the action integral, or simply the action. Depending on which trajectory x(t) we choose, it will give a different value. This kind of assignment between functions and numbers is called a functional. However, not all trajectories x(t) are useful; in fact, classically we are only interested in one: the one that is actually followed by the system. The usual prescription is to study the variations of the action when varying the trajectory, keeping the boundaries fixed. The particle will follow the path that renders a null variation. This is the minimum-action principle. In practice this is fairly similar to taking some kind of differential of S:

\displaystyle \delta S =0 

From the construction above, we see that \mathcal{L}=T-V depends both on position and velocities. Also, note that time only works as a parametrization of the trajectory, meaning that the variation operator and the integral commute:

\displaystyle \delta S = \int_{t_1}^{t_2} \mathrm{d}t\; \delta \mathcal{L}(x,\dot{x}) = \int_{t_1}^{t_2} \mathrm{ d}t\; \left( \frac{\partial \mathcal{L} }{\partial x}\delta x + \frac{\partial \mathcal{L} }{\partial\dot{x}}\delta \dot{x}\right) =0

A common trick in these procedures is to find a way to factorize stuff; the usual way to do so is by performing an integration by parts on the second term:

\displaystyle \delta S =\left. \frac{\partial \mathcal{L} }{\partial\dot{x}}\delta x\right|_{t_1}^{t_2} + \int_{t_1}^{t_2} \mathrm{d}t\; \left( \frac{\partial \mathcal{L} }{\partial x} - \frac{\mathrm{d}}{\mathrm{d}t} \left(\frac{\partial \mathcal{L} }{\partial\dot{x}}\right) \right) \delta x =0

Some details are worth noting:

If this is your first time doing this procedure, it may look a bit informal, but I assure you every step is well justified… somewhere else… Also, you may notice this weird term \left. \frac{\partial \mathcal{L} }{\partial\dot{x}}\delta x\right|_{t_1}^{t_2} . It looks weird, but if you have had a healthy approach to calculus, you know this is only the term uv in the usual integration by parts formula. It is a common practice among physicists to get rid of this term under some boundary-condition argument. For example, here we said that we are fixing the boundaries when performing the variation, i.e. \delta x(t_1) = \delta x(t_2) = 0. However, remember that this term is there and we don’t always have a good reason to neglect it; there is actually some nice physics behind it.

For now we will get rid of these boundary terms. The stationary condition is then written as:

\displaystyle \delta S = \int_{t_1}^{t_2} \mathrm{d}t\; \left( \frac{\partial \mathcal{L} }{\partial x} - \frac{\mathrm{d}}{\mathrm{d}t} \left(\frac{\partial \mathcal{L} }{\partial\dot{x}}\right) \right) \delta x =0

This has to be the case regardless of the variation \delta x , which is only possible if the argument in parenthesis is equal to zero everywhere in the interval:

\displaystyle \frac{\partial \mathcal{L} }{\partial x} - \frac{\mathrm{d}}{\mathrm{d}t} \left(\frac{\partial \mathcal{L} }{\partial\dot{x}}\right)=0

Here I present to you the Euler-Lagrange equation. This was a nice accomplishment for physics many years ago. It has many variations and can be generalized to deal with more modern theories. One of the main points of this result is that it is a “covariant” equation, meaning that it looks the same in different coordinate systems, as long as the change of coordinates behaves correctly, which is usually the case for coordinates used in physics. The idea is that if we know the correct Lagrangian associated to a physical system, we can use it to derive what are known as equations of motion. In fact, a big part of modern theoretical physics consists in finding the Lagrangian that describes a system most accurately. To be honest, in many cases we are guessing the correct Lagrangian, but these are “informed” ansätze based on physical arguments or on the symmetries of systems.
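To make this concrete, here is a small sketch (my own illustration, not part of the original derivation) that applies the Euler-Lagrange equation symbolically with sympy; the quadratic potential is an arbitrary choice for the example:

```python
import sympy as sp

t = sp.symbols('t')
m, k = sp.symbols('m k', positive=True)  # mass and an assumed spring constant
x = sp.Function('x')

# Sample Lagrangian L = T - V with a quadratic potential (illustrative choice)
L = sp.Rational(1, 2) * m * x(t).diff(t)**2 - sp.Rational(1, 2) * k * x(t)**2

# Euler-Lagrange: dL/dx - d/dt(dL/dxdot) = 0
eom = sp.diff(L, x(t)) - sp.diff(sp.diff(L, x(t).diff(t)), t)

# The result is the familiar m*xddot = -k*x
assert sp.simplify(eom + k * x(t) + m * x(t).diff(t, 2)) == 0
```

The same two-term recipe works for any Lagrangian you feed it; only the sample potential is specific to this sketch.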

Let’s find the physics hidden in this equation. I told you before that, in classical mechanics, we usually write \mathcal{L}=T-V . Notice that the kinetic energy in Cartesian coordinates depends only on the velocity, while the potential energy depends only on position. This allows us to do the following association:

\displaystyle \frac{\partial \mathcal{L} }{\partial\dot{x}}=\frac{\partial T }{\partial\dot{x}}=\frac{\partial }{\partial\dot{x}} \left( \frac{1}{2}m \dot{x}^2 \right) = m \dot{x}

Notice that this corresponds to the usual linear momentum. In the Lagrangian formalism we define a generalized momentum conjugate to a coordinate as p_q=\frac{\partial \mathcal{L} }{\partial\dot{q}} . This association is done regardless of the choice of coordinates used to express the Lagrangian; whenever there is no confusion, we drop the subindex. With this assignment, the E-L equation can be rewritten as:

\displaystyle \dot{p}=\frac{\partial \mathcal{L} }{\partial x}

but… this looks unsettlingly familiar. Once again, consider \mathcal{L}=T-V on the right-hand side:

\displaystyle \dot{p}=\frac{\partial (T-V) }{\partial x}= -\frac{\partial V }{\partial x}

But, from your Physics 101 you recall that this last term corresponds to the force resulting from a potential energy V . Then, for this particular kind of Lagrangian, the equation of motion is:

\displaystyle  F = \frac{\mathrm{d} p}{\mathrm{d}t}

This is ye good olde Newton’s second law! See? I was not inventing stuff completely out of nowhere. Now we can start with the really interesting part: conservation and symmetry stuff.

We first write a version of this for more than one coordinate:

\displaystyle p_i=\frac{\partial \mathcal{L} }{\partial\dot{q_i}} \qquad \dot{p_i} = \frac{\partial \mathcal{L}}{\partial q_i}

Here the index i can refer to coordinates (x,y,z) , (r,\theta,\phi) , or to a many-particle system. As we like to differentiate stuff, it would be interesting to see the time evolution of the Lagrangian along a trajectory. Let’s do it:

\displaystyle \frac{\mathrm{d}\mathcal{L}}{\mathrm{d}t} = \frac{\partial\mathcal{L}}{\partial t} + \sum_i \left( \frac{\partial\mathcal{L}}{\partial q_i} \frac{\mathrm{d} q_i}{\mathrm{d}t}+\frac{\partial\mathcal{L}}{\partial \dot{q_i}} \frac{\mathrm{d} \dot{q_i}}{\mathrm{d}t}\right) = \frac{\partial\mathcal{L}}{\partial t} + \sum_i \left( \frac{\mathrm{d} p_i}{\mathrm{d}t}\dot{q_i}+p_i \frac{\mathrm{d} \dot{q_i}}{\mathrm{d}t}\right)

For the second equality we used directly the momentum definition and the E-L equation. The last term can easily be seen as the derivative of a product:

\displaystyle \frac{\mathrm{d}\mathcal{L}}{\mathrm{d}t} = \frac{\partial\mathcal{L}}{\partial t} + \frac{\mathrm{d} }{\mathrm{d}t} \left(\sum_i  p_i \dot{q_i}\right)

Putting the time derivative together we obtain an interesting identity:

\displaystyle \frac{\mathrm{d} }{\mathrm{d}t}\left( \sum_i  p_i \dot{q_i}-\mathcal{ L }\right) = - \frac{\partial\mathcal{L}}{\partial t}

The thing inside the derivative is of great importance; it is so important that we will give it a letter and a name:

\displaystyle H =  \sum_i  p_i \dot{q_i}-\mathcal{ L }

This is the Hamiltonian, and it will prove to have a great deal of relevance. The signs used are a convention, thus arbitrary, but it is better to keep them like that so people don’t start getting confused. A first interesting thing to notice is that if the Lagrangian does not depend explicitly on time, we can set its partial derivative to zero:

\displaystyle \frac{\mathrm{d} H}{\mathrm{d}t} = - \frac{\partial\mathcal{L}}{\partial t} = 0

This means that the Hamiltonian will have the same value everywhere along the trajectory. Whenever a function takes a constant value along the trajectory, we refer to this value as a constant of motion. If we go back to the sample Lagrangian we have been working with, we notice the following:

\displaystyle H =  \sum_i  p_i \dot{q_i}-\mathcal{ L } =  \sum_i (m \dot {x_i}) \dot{x_i}-(T-V) = 2T-T+V = T+V

This last expression is the total mechanical energy, as you have seen in other physics courses. We can then associate the Hamiltonian function with the classical energy. A point to stress here is that this association is not an identity. The Hamiltonian is a function of the coordinates and momenta; the space formed by these variables is referred to as phase space. The energy is the value that this function takes when it behaves as a constant of motion. From this, a straightforward observation follows:
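As a quick sanity check (mine, not the author’s), the algebra H = \sum p\dot{q} - \mathcal{L} = T + V can be verified symbolically for an oscillator-type Lagrangian:

```python
import sympy as sp

m, w, x, xdot = sp.symbols('m omega x xdot', positive=True)

T = sp.Rational(1, 2) * m * xdot**2          # kinetic energy
V = sp.Rational(1, 2) * m * w**2 * x**2      # potential energy (example choice)
L = T - V

p = sp.diff(L, xdot)      # generalized momentum: m*xdot
H = p * xdot - L          # Legendre transform

# H comes out as the total mechanical energy T + V
assert sp.simplify(H - (T + V)) == 0
```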

Observation 1: If the Lagrangian describing a system does not depend explicitly on time, the trajectory described by the system is such that the generalized coordinates and momenta lie on the level curves of the Hamiltonian function.

This may sound a bit cryptic now, but here I give you a rephrasing that may sound more… physical: the fact that your Lagrangian does not depend on time means that it does not matter when you start studying your system, you will always find the same equations of motion. If you shift in time, the particulars of your system, such as positions and velocities, may look different, but the physics behind them will be the same, i.e. the same differential equations. We then say that the system is symmetric in time:

Observation 1 (Rephrased): If the Lagrangian describing a system is symmetric in time, the trajectory described by the system will have a constant energy given by the Hamiltonian function.

This is a first taste of a marvelous thing called Noether’s theorem, but we are going to talk about that later.

There is another utility involving the Hamiltonian. To see this, let’s replace it directly in the action integral:

\displaystyle \mathcal{ L } = \sum_i  (p_i \dot{q_i})-H   \qquad \rightarrow \qquad S=\int \left(\sum_i p_i \mathrm{d}q_i - H\mathrm{d}t\right)

and once again we perform a variation with fixed boundaries now in both momentum and position space to find the stationary action:

\displaystyle \delta S=\int \sum_i \left( \delta p_i \mathrm{d}q_i + p_i \mathrm{d}(\delta q_i)  - \frac{\partial H}{\partial q_i}\delta q_i\mathrm{d}t -\frac{\partial H}{\partial p_i}\delta p_i\mathrm{d}t \right) = 0

Using again integration by parts with fixed ends we obtain:

\displaystyle \delta S=\int \sum_i \left( \delta p_i \mathrm{d}q_i  - \delta q_i  \mathrm{d} p_i  - \frac{\partial H}{\partial q_i}\delta q_i \mathrm{d}t -\frac{\partial H}{\partial p_i}\delta p_i\mathrm{d}t \right) = 0

\displaystyle \delta S=\int \sum_i \left( \delta p_i \left(\mathrm{d}q_i -\frac{\partial H}{\partial p_i}\mathrm{d}t\right) - \delta q_i\left(\mathrm{d} p_i+  \frac{\partial H}{\partial q_i}\mathrm{d}t\right) \right) = 0

Since this should be null for every variation in momentum and position space, the terms in parentheses should vanish identically. This yields two sets of first-order differential equations:

\displaystyle   \dot{q_i} = \frac{\partial H}{\partial p_i} \qquad \dot{p_i}=-  \frac{\partial H}{\partial q_i}

These are Hamilton’s equations. We can use them to describe the dynamics of the system, and a straightforward calculation shows that they are equivalent to the equations of motion obtained either from Euler-Lagrange or from Newton’s second law.
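A minimal numerical sketch (my own, with an arbitrary RK4 integrator and unit mass and frequency) of evolving a system through these two first-order equations instead of a single second-order one:

```python
import numpy as np

M, W = 1.0, 1.0  # assumed unit mass and frequency

def rhs(y):
    """Hamilton's equations for H = p^2/(2m) + m w^2 q^2 / 2."""
    q, p = y
    return np.array([p / M,            # qdot =  dH/dp
                     -M * W**2 * q])   # pdot = -dH/dq

def rk4_step(y, dt):
    # Classic fourth-order Runge-Kutta step (illustrative choice of integrator)
    k1 = rhs(y)
    k2 = rhs(y + 0.5 * dt * k1)
    k3 = rhs(y + 0.5 * dt * k2)
    k4 = rhs(y + dt * k3)
    return y + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

y = np.array([1.0, 0.0])   # q(0) = 1, p(0) = 0
dt = 0.001
for _ in range(1000):      # integrate up to t = 1
    y = rk4_step(y, dt)

# Matches the analytic solution q = cos(w t), p = -sin(w t)
assert abs(y[0] - np.cos(1.0)) < 1e-8
assert abs(y[1] + np.sin(1.0)) < 1e-8
```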

Finally, we violently introduce the notion of Poisson brackets. Consider a function f(q,p) of phase space that does not depend on time explicitly. Using the chain rule and Hamilton’s equations we obtain the following identity:

\displaystyle \frac{\mathrm{d} f(q,p)}{\mathrm{d} t} = \sum_i \left( \frac{\partial f}{\partial q_i}\frac{\mathrm{d} q_i}{\mathrm{d} t}+\frac{\partial f}{\partial p_i}\frac{\mathrm{d} p_i}{\mathrm{d} t}\right) = \sum_i \left( \frac{\partial f}{\partial q_i}\frac{\partial H}{\partial p_i}-\frac{\partial H}{\partial q_i}\frac{\partial f}{\partial p_i}\right)

Inspired by this structure, we introduce the Poisson bracket of two functions of phase space as:

\displaystyle  \{f,g\} = \sum_i \left( \frac{\partial f}{\partial q_i}\frac{\partial g}{\partial p_i}-\frac{\partial g}{\partial q_i}\frac{\partial f}{\partial p_i}\right)

which allows us to rewrite the previous expression as:

\displaystyle \frac{\mathrm{d} f(q,p)}{\mathrm{d} t} =  \{f,H\}

Of course, for those versed in quantum mechanics, this is awfully similar to the Ehrenfest theorem. I invite you to also check that \{ q_i,p_j \}=\delta_{ij} . Furthermore, this object has the same behavior as the commutator:

  • Anticommutativity: \{f,g\}=-\{g,f\}
  • Bilinearity: \{af+bg,h\}=a\{f,h\}+b\{g,h\}; \; a,b\in\mathbb{R}
  • Leibniz’s rule: \{fg,h\}=\{f,h\}g+f\{g,h\}
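These properties can be checked mechanically. A sketch (my addition) for a single degree of freedom in sympy, with arbitrary test functions:

```python
import sympy as sp

q, p = sp.symbols('q p')

def pb(f, g):
    """Poisson bracket {f, g} for one degree of freedom."""
    return sp.diff(f, q) * sp.diff(g, p) - sp.diff(g, q) * sp.diff(f, p)

# Arbitrary phase-space functions for the check
f, g, h = q**2 * p, sp.sin(q) + p**3, q * p

assert pb(q, p) == 1                                                    # canonical pair
assert sp.simplify(pb(f, g) + pb(g, f)) == 0                            # anticommutativity
assert sp.simplify(pb(f * g, h) - (pb(f, h) * g + f * pb(g, h))) == 0   # Leibniz's rule
```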

With this notation, Hamilton’s equations are written as follows:

\displaystyle   \dot{q_i} =\{q_i,H\} \qquad \dot{p_i}= \{p_i,H\}

Now these equations of motion look symmetric as fuck. Moreover, if H is a nice analytic function on phase space (which for classical mechanics tends to be a reasonable assumption), it admits a series expansion in powers of the generalized coordinates and momenta. It is therefore not hard to convince ourselves that the equations of motion can be directly calculated using the Poisson bracket of the coordinates with the momenta (\{ q_i,p_j \}=\delta_{ij} ).

The last observation raises interest in the following question: what happens if we change coordinates? Well, we can show that, as long as the new coordinates have the same Poisson bracket behavior (\{ q_i,p_j \}=\delta_{ij} ), the equations of motion derived from Hamilton’s equations in their Poisson bracket form will be consistent. Such coordinate changes are named “canonical transformations”.

We need an example:

Start with a 1D classical harmonic oscillator; its kinetic energy is given by T=\frac{1}{2}m\dot{x}^2 , while the potential energy is V=\frac{1}{2}m\omega^2 x^2 . Under the above prescription, the Lagrangian of the system is:

\displaystyle \mathcal{L}=\frac{1}{2}m\dot{x}^2  - \frac{1}{2}m\omega^2 x^2

A simple calculation tells us that the generalized momentum coincides with the usual momentum p=\frac{\partial \mathcal{L}}{\partial \dot{x}}=m\dot{x} . We could directly write the Euler-Lagrange equation from this and get an equation of motion:

\displaystyle \dot{p}=-m\omega^2 x

This is the expected Newton’s second law for a spring-mass configuration in 1D, but… a bit boring and not that insightful. However, if you already read my post on the harmonic oscillator, I bet you know there is a bit more to it. Let’s first write the Lagrangian in terms of the momentum variable:

\displaystyle \mathcal{L}=\frac{p^2}{2m}  - \frac{1}{2}m\omega^2 x^2

Then we obtain the Hamiltonian:

\displaystyle H=p\dot{x} - \mathcal{L}=\frac{p^2}{2m} + \frac{1}{2}m\omega^2 x^2 = \frac{\omega}{2} \left( \frac{p^2}{m\omega} + m\omega x^2 \right)

I guess you see where we are going. We change variables to X=\sqrt{m\omega} x, \; P=\frac{p}{\sqrt{m\omega}} . With those, our Hamiltonian takes a more aesthetic form:

\displaystyle H= \frac{\omega}{2} \left( P^2 + X^2 \right)

And these trivially preserve the Poisson bracket:

\displaystyle \{X,P\}=\left\{\sqrt{m\omega} x, \frac{p}{\sqrt{m\omega}} \right\}=\{x,p\}=1

This is already fairly interesting because, due to Observation 1, the trajectory followed in phase space is a level curve of this Hamiltonian, meaning that, for this particular case, it describes a circle with radius \sqrt{\frac{2E}{\omega}} . This hints at another canonical transformation: circles imply polar coordinates; set

\displaystyle R = \frac{1}{2} \left( P^2 + X^2 \right), \qquad \varphi = \arctan \frac{P}{X}

Why keep the squares and the one half? Honestly, that’s just because I already know the answer and we want certain similarities with the quantum version of this system. If I don’t keep the half and similar stuff, I get a Poisson bracket that is off by some multiplicative constant; keeping everything guarantees the correct brackets. You can try to do it with the square root, removing the one half, but for me it is plainly annoying to deal with the derivatives of square roots. Well, before using it, let me just show you that the Poisson brackets are consistent:

\displaystyle \{ R, \varphi \} =\frac{ \partial R}{ \partial X}\frac{\partial \varphi}{\partial P} - \frac{\partial \varphi}{\partial X} \frac{\partial R}{\partial P} = X \left( \frac{ \frac{1}{X} }{1+\frac{P^2}{X^2}} \right) - \left( \frac{ -\frac{P}{X^2} }{1+\frac{P^2}{X^2}} \right) P = 1
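If you’d rather not chase the derivatives by hand, the same bracket can be checked symbolically (my sketch, same conventions as above):

```python
import sympy as sp

X, P = sp.symbols('X P', positive=True)

R = (P**2 + X**2) / 2
phi = sp.atan(P / X)

# {R, phi} with respect to the canonical pair (X, P)
bracket = sp.diff(R, X) * sp.diff(phi, P) - sp.diff(phi, X) * sp.diff(R, P)
assert sp.simplify(bracket) == 1
```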

There you have it: our coordinate transformation is one we can trust; now we can see why this is beautiful. First of all, our Hamiltonian now is much nicer:

\displaystyle H = R \omega

Also, the equations of motion are fairly trivial:

\displaystyle \dot{R}=\{ R, H \} = \{ R, R \omega \} =0  \qquad \Rightarrow \qquad R = A

which allows us to see much more easily that the Hamiltonian yields a constant value R \omega along the trajectory. For the second coordinate:

\displaystyle \dot{\varphi}=\{ \varphi, H \} = \omega \{ \varphi, R \} = - \omega \{ R, \varphi \} = -\omega

In the last one we made direct use of the Poisson bracket between the coordinate and its momentum; that’s what I meant when saying that the equations of motion are encoded in these brackets. This last equation tells us that, for an arbitrary time:

\displaystyle \varphi = -\omega t + \phi

Going back to the original coordinates, we directly obtain:

\displaystyle x = \sqrt{\frac{2A}{m \omega }}\cos (\omega t - \phi), \qquad p= -\sqrt{2Am\omega}\sin (\omega t - \phi)

which is in accordance with the result we would get by solving the traditional equation of motion. There are some things to notice in this approach:

  • You can see the direct relation between the energy of the system and the amplitude of oscillation. That’s why, as soon as we add some damping, breaking energy conservation, we obtain a decrease in amplitude. This also shows how resonance and other such phenomena can occur when introducing an external time-dependent force, since it breaks the time-translation symmetry.
  • Regardless of the fact that we started from the action integral, no actual integral was calculated along this whole process! Not even an ansatz for the differential equation.
  • The Hamiltonian in this approach looks unsettlingly similar to the one in the quantum treatment; I invite you to have a look.
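As a closing check on the oscillator (again my own sketch, not in the original post), the solution above can be verified symbolically against the equations of motion and against the constant energy R\omega = A\omega:

```python
import sympy as sp

t = sp.symbols('t')
m, w, A, phi0 = sp.symbols('m omega A phi0', positive=True)

# The closed-form trajectory obtained from the canonical transformation
x = sp.sqrt(2 * A / (m * w)) * sp.cos(w * t - phi0)
p = -sp.sqrt(2 * A * m * w) * sp.sin(w * t - phi0)

# p = m*xdot and the equation of motion pdot = -m w^2 x
assert sp.simplify(p - m * sp.diff(x, t)) == 0
assert sp.simplify(sp.diff(p, t) + m * w**2 * x) == 0

# The Hamiltonian is the constant energy A*omega along the trajectory
H = p**2 / (2 * m) + m * w**2 * x**2 / 2
assert sp.simplify(H - A * w) == 0
```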


That’s pretty much it. I’m sorry for the long technical post, but I wanted to leave you a reference in case I use some of this in later posts. For now, as a gift, I’ll show you what happens if we don’t neglect those boundary terms I told you about before:

Consider once again the action integral we mentioned before, but this time we will only keep the initial point fixed while allowing the end point to vary a bit. However, we will focus on the value of this action integral only for the trajectories that satisfy the E-L equation. This is the same as comparing how the stationary action would have changed if, instead of arriving at the point I was interested in, I had arrived at another point next to it, while still following a physical trajectory. With this line of thought in mind, the action integral looks like this:

\displaystyle  S =\int_{t_1}^{t} \mathrm{d}t'\; \mathcal{L}(x,\dot{x},t') 

and, from what we did before, its variation will be:

\displaystyle \delta S =\left. \frac{\partial \mathcal{L} }{\partial\dot{x}}\delta x\right|_{t_1}^{t} + \int_{t_1}^{t} \mathrm{d}t'\; \left( \frac{\partial \mathcal{L} }{\partial x} - \frac{\mathrm{d}}{\mathrm{d}t'} \left(\frac{\partial \mathcal{L} }{\partial\dot{x}}\right) \right) \delta x = \frac{\partial \mathcal{L} }{\partial\dot{x}}\delta x + \int_{t_1}^{t} \mathrm{d}t'\; \left( \frac{\partial \mathcal{L} }{\partial x} - \frac{\mathrm{d}}{\mathrm{d}t'} \left(\frac{\partial \mathcal{L} }{\partial\dot{x}}\right) \right) \delta x

Since we said that these trajectories obey the Euler-Lagrange equation, the argument within the integral vanishes for every value of t, leaving us with a nice identity:

\displaystyle \delta S =\frac{\partial \mathcal{L} }{\partial\dot{x}}\delta x

Notice that the term on the right-hand side is the generalized momentum… Identifying the variation with the differential of S and taking into account the other coordinates, we obtain a familiar expression:

\displaystyle p_i=\frac{\partial S}{\partial x_i}

For those familiar with quantum mechanics this should look pretty interesting. This is not a coincidence and can be used to tie up some loose ends… but that will come later; for now let’s use this together with the Hamiltonian. From the action integral definition, it is not hard to see, when varying the arrival time, that:

\displaystyle \frac{\mathrm{d} S }{\mathrm{d} t} = \mathcal{L}

But, since we are thinking of the action as a function of the ending point:

\displaystyle \frac{\mathrm{d} S }{\mathrm{d} t} = \frac{\partial S}{\partial t} + \sum_i \frac{\partial S}{\partial x_i} \frac{\mathrm{d} x_i }{\mathrm{d} t} = \frac{\partial S}{\partial t} + \sum_i p_i \frac{\mathrm{d} x_i }{\mathrm{d} t}

where in the last equality we used the previous identity. Combining these two equalities with the definition of the Hamiltonian:

\displaystyle H = -\frac{\partial S}{\partial t}
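To see both identities at work, here is a hedged example (my own, using a standard textbook form rather than anything derived in this post): for a free particle with momentum p_0, the on-shell action can be written S = p_0 x - \frac{p_0^2}{2m} t, and the two relations fall out immediately:

```python
import sympy as sp

x, t = sp.symbols('x t')
m, p0 = sp.symbols('m p0', positive=True)

# Assumed on-shell action of a free particle with momentum p0
S = p0 * x - p0**2 / (2 * m) * t

assert sp.diff(S, x) == p0                                   # p = dS/dx
assert sp.simplify(-sp.diff(S, t) - p0**2 / (2 * m)) == 0    # H = -dS/dt = p^2/(2m)
```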

This is also equal to the energy of the system. I invite you to try by yourself the following exercise:

Consider the operators \hat{P}=-i\partial_x and \hat{E}=i\partial_t . Study the effect of those operator when acting on the function \psi=A e^{i S} . In particular, show that, for a system displaying conservation of energy and momentum, this function is an eigenvector of these operators.

This exercise is related to some nice subjects such as the WKB method, the Bohr-Sommerfeld quantization rule, path integrals, etc. However, these subjects deserve a longer discussion and now is not the time; I just invite you to notice that everything I did here was classical mechanics, so the real question is: where is the quantum in quantum mechanics, actually?

