A short personal story: While trying to understand special relativity from books, and failing, I tried doing the maths myself based on whatever I had understood until then. I ended up doing what this paper describes. I was surprised. For the next twelve years, I asked various physicists I knew what I was doing wrong, but none could give me the time needed. I then saw basically the same proof appear on Wikipedia, and so finally had my answer.
Einstein's original paper on special relativity, while mentioning the two famous postulates, also later assumes linearity of the transformations (allowing non-linear transformations takes it from special relativity to general relativity). However, the paper did not explicitly call it out as a postulate. Also, Einstein derived a particular equation in the paper by two paths and mentions that the results are consistent. This was actually a missed opportunity -- he could have forced consistency as an input assumption, worked backwards, and would then have seen that invariant speed is not needed as an input assumption anymore.
With this OP submission, I have now learnt that this has been known for more than a century now, and agree that this should be more popular in books.
You are probably on the right path IMHO. What was that quote where Einstein said the mathematicians got hold of relativity and turned it into something he didn't understand any more?
Within historical context: This was said after Minkowski introduced the geometry for special relativity, which can be visualized on the `spacetime diagram`, now recognized as the vastly simpler and more efficient approach than the wall-of-text approach. Einstein initially had some difficulty, that was when he said this, but he caught up quickly.
>Strangely enough no personal contacts resulted between his teacher of mathematics, Hermann Minkowski, and Einstein. When, later on, Minkowski built up the special theory of relativity into his 'world-geometry', Einstein said on one occasion: 'Since the mathematicians have invaded the theory of relativity, I do not understand it myself any more'. But soon thereafter, at the time of the conception of the general theory of relativity, he readily acknowledged the indispensability of the four-dimensional scheme of Minkowski
Einstein, philosopher-scientist, Schilpp
And later Einstein expressed admiration for geometry again and again
>This problem remained insoluble to me until 1912, when I suddenly realized that Gauss's theory of surfaces holds the key for unlocking this mystery. I realized that Gauss's surface coordinates had a profound significance. However, I did not know at that time that Riemann had studied the foundations of geometry in an even more profound way.
Einstein, Kyoto Lecture, 1922
>I admire the elegance of your method of computation; it must be nice to ride through these fields upon the horse of true mathematics while the like of us have to make our way laboriously on foot
Hentschel (1998). The Collected Papers of Albert Einstein, Vol. 8 (English): The Berlin Years: Correspondence, 1914-1918
Did you even read his post? What you are calling the "right" path is the "mathematician" path, assuming relativity + group structure.
What he complained about, the physics textbook approach, was the path taken by Einstein, who started from the "invariant speed" assumption to derive relativity:
>"introduce another postulate... namely, that light is always propagated in empty space with a definite velocity c which is independent of the state of motion of the emitting body."
There's also nothing really "wrong" with Einstein's approach either; it's motivated by a physical constraint from Maxwell's electromagnetism.
The mathematician's approach is motivated by imposing the logical constraint of an abstract mathematical structure (a group of transformations from one coordinate system to another).
The two turn out to be connected (physical constraint selects a math model from alternatives). Nothing really strange.
On the subject of implicit assumptions (like linear trajectories transform to linear trajectories), it has always struck me that there is another assumption: if A sees B at speed v, then B sees A at the same speed v. It’s hard to imagine this symmetry not being true but who knows what’s possible in theory space?
Another interesting velocity-related conundrum: It’s impossible to measure the one-way speed of light (from a source to a detector) because we have to synchronize the clocks somehow, which requires an algorithm based on the speed of light to account for time dilation. We can only measure the two-way speed of light (from a source to a reflector and back).
So, from the standpoint of pure math, isotropy of the speed of light is purely a convention. The argument can only be made based on, so to speak, philosophical reasoning.
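To make the convention point concrete, here is a minimal sketch in Python. It assumes the standard Reichenbach parametrization, which is not mentioned above: a synchronization parameter `eps` (hypothetical name) sets the outbound one-way speed to c/(2·eps) and the return speed to c/(2·(1−eps)), with `eps = 0.5` being the usual Einstein convention. The function names are illustrative only.

```python
# Sketch: round-trip light travel time under different one-way speed conventions.
# `eps` is the Reichenbach synchronization parameter (an assumption of this
# example, not something taken from the thread); eps = 0.5 is Einstein's choice.

C = 299_792_458.0  # two-way speed of light in m/s


def one_way_speeds(eps, c=C):
    """Return (outbound, return) one-way speeds for 0 < eps < 1."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    return c / (2.0 * eps), c / (2.0 * (1.0 - eps))


def round_trip_time(length, eps, c=C):
    """Total time for light to cover `length` out and back again."""
    c_out, c_back = one_way_speeds(eps, c)
    return length / c_out + length / c_back


if __name__ == "__main__":
    # Very different one-way speeds, same measurable round-trip time.
    for eps in (0.1, 0.5, 0.9):
        print(eps, one_way_speeds(eps), round_trip_time(1.0, eps))
```

Under this parametrization the round-trip time is always 2·length/c, whatever `eps` is, which is why a two-way measurement alone cannot pick out one value of `eps`.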
This is true — I take the upshot of this observation to be that only proper times are ever directly physically measurable. Coordinate times and concepts that rely on them (such as what you call one-way velocity of light) are always merely a bookkeeping device for organizing the results of different proper time measurements and will always entail some amount of arbitrariness. General relativity makes this more explicit: the arbitrariness is that of the choice of coordinates on spacetime, and one can choose any old coordinate system or none at all.
>if A sees B at speed v, then B sees A at the same speed v
is directly derivable from the Lorentz velocity addition formula which is the result of the usual two postulates. Why would you want that as an extra assumption...
Of course, if you already assume special relativity then this is baked in: taking the inverse of the Lorentz transformation matrix effectively sends v to -v. And velocity addition for sure already assumes you have the Lorentz transformations.
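As a quick illustration of the "baked in" point, the sketch below builds the standard 1+1-dimensional boost acting on (t, x) and checks that boosting by v and then by −v gives the identity. It presupposes the Lorentz form, so it says nothing about whether reciprocity can be derived from weaker postulates. The helper names (`boost`, `matmul`, `is_identity`) are mine.

```python
import math


def boost(v, c=1.0):
    """2x2 Lorentz boost acting on the column vector (t, x)."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    # t' = g * (t - v x / c^2),  x' = g * (x - v t)
    return [[g, -g * v / c**2],
            [-g * v, g]]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def is_identity(m, tol=1e-12):
    """True if m is the 2x2 identity up to rounding error."""
    return all(abs(m[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(2) for j in range(2))


if __name__ == "__main__":
    print(is_identity(matmul(boost(0.6), boost(-0.6))))
```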
The point is that in the derivation of relativity from elementary postulates, the “reflectivity of velocities” assumption is used. For instance the linked paper uses it between equations (12) and (13). All such derivations use it at some stage.
Intuitively, “reflectivity of velocities” is why time dilation leads to length contraction with same gamma factor.
I don’t think it follows from absence of a preferred frame since one could imagine some complicated group structure relating the velocities of the various observers relative to each other in such a way that all are on equal footing and yet v_AB is not exactly -v_BA.
As mentioned, this is certainly a reasonable assumption but who knows what whacky alternatives are out there? After all, in the different context of quantum mechanics, such “obvious” relationships as xp = px no longer hold true.
- You derived the theory correctly (the end result should match the textbook result) and thought what you were doing was wrong.
- No physicist you asked could explain where you were wrong
- No physicist you encountered could see the obvious connection of this and the usual approach
It's not really a mysterious or novel fact that you can have these shifts of perspective (at least since Kant, who heavily influenced Einstein). E.g. Newton took his three laws as the base and derived relativity as a theorem. Einstein took a modified version of relativity (together with the invariant speed assumption, which comes from electromagnetism) as the base and rebuilt the rest of the theory of motion so that it reduces to Newton's theory in the limit. Nothing is amiss here.
What the paper claims it is saying: you can get the Lorentz transformations without requiring an invariant speed.
What the paper is actually saying: assuming there isn’t a privileged reference frame, you can show that either the Galilean transformations hold (i.e. simultaneity is invariant, suggesting instantaneous propagation of forces, a special case of Lorentz transformations where c is infinity) or Lorentz transformations hold (i.e. a single invariant speed in all reference frames, suggesting propagation of forces at that speed).
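One way to see the "either Galilean or Lorentz" dichotomy numerically is to write both as a single family with one constant. This is a sketch under my own notation, not the paper's: `K` is a hypothetical non-negative constant with units of 1/speed², where `K = 0` gives the Galilean transformations and `K = 1/c²` gives the Lorentz transformations.

```python
import math


def boost(v, K):
    """Transformation on (t, x) for relative velocity v and constant K >= 0."""
    g = 1.0 / math.sqrt(1.0 - K * v * v)
    # t' = g * (t - K v x),  x' = g * (x - v t)
    return [[g, -g * K * v],
            [-g * v, g]]


def compose_velocity(u, v, K):
    """Velocity of the single boost equivalent to boost(u) after boost(v)."""
    return (u + v) / (1.0 + K * u * v)


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def close(a, b, tol=1e-12):
    """Element-wise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))


if __name__ == "__main__":
    for K in (0.0, 1.0):  # Galilean, then Lorentz in units where c = 1
        w = compose_velocity(0.3, 0.5, K)
        print(K, w, close(matmul(boost(0.3, K), boost(0.5, K)), boost(w, K)))
```

Both values of `K` pass the closure check; only the composition rule differs (plain addition for `K = 0`, the relativistic rule otherwise). Which value nature uses is the empirical question raised in the next paragraph.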
Empirically, we know from measuring the speed of light that we live in the latter world. But we can’t prove that a priori as the title suggests.
The mathematics is sound, but the reasoning around it is unclear to me. The derivation shows that Lorentz transformations and Galilean transformations are the only ones that allow for the equivalence of all inertial frames, which is a nice result. But it clearly does require the additional assumption of an invariant speed to conclude that Lorentz transformations are anything more than a mathematical curiosity.
So what have we really gained? Since we still need the extra assumption that an invariant speed actually exists, we could've just gone the other way and done the light clock calculation to get the Lorentz transformation instead.
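For reference, the light clock calculation mentioned here can be sketched as follows. It assumes the usual textbook setup, which is not spelled out in the comment: a clock arm of length `L` perpendicular to the motion, an invariant light speed `c`, and no change in perpendicular lengths. Function names are illustrative.

```python
import math


def tick_rest(L, c):
    """Round-trip light time in the clock's own rest frame."""
    return 2.0 * L / c


def tick_moving(L, v, c):
    """Round-trip light time seen from a frame where the clock moves at v.

    The light follows a diagonal path, so (c*T/2)^2 = L^2 + (v*T/2)^2.
    """
    return 2.0 * L / math.sqrt(c * c - v * v)


def gamma(v, c):
    """The usual Lorentz factor."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


if __name__ == "__main__":
    ratio = tick_moving(1.0, 0.6, 1.0) / tick_rest(1.0, 1.0)
    print(ratio, gamma(0.6, 1.0))
```

The ratio of the two tick times reproduces the Lorentz factor, but only because the invariance of `c` was put in by hand at the start.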
I agree that the paper's title is somewhat misleading, since you still do need to assume an invariant speed to rule out the Galilean transformations.
However, this derivation does greatly narrow things down before the invariant speed comes in: at the point where the invariant speed is assumed, you already know that there are only two alternatives: an invariant speed (Lorentz transformations) or Galilean transformations. So it's much easier to see why you would assume an invariant speed; the assumption isn't just pulled out of thin air at the start, it is seen to be one of only two alternatives that are compatible with the principle of relativity.
>to see why you would assume an invariant speed, the assumption isn't just pulled out of thin air at the start
It's not out of thin air, it's from a very empirically successful theory: Maxwell's electrodynamics. The problem back then was that this theory was not Galilean invariant, i.e. the speed of electromagnetic waves in Maxwell's equations was the same under all reference frames. So you either abandon the idea that the laws of physics remain the same under all reference frames, OR abandon Galilean velocity addition.
Einstein's approach was modifying the latter so that it fits with the former, by imposing invariant speed. This was written in Einstein's original paper. It's not a mystery assumption.
It's also a very common procedure: two empirically successful theories have conflict and you need to resolve them by building something larger than both and reducing to both under limit.
I also agree we have gained insight into how kinematic structure is derived from algebra + physical constraint. Though you still need the physical insight to choose which physical constraint.
> It's not out of thin air, it's from a very empirically successful theory: Maxwell's electrodynamics.
That was the source of the assumption, yes--as you point out, Einstein said so in his original paper. But from the standpoint of mechanics, as opposed to electrodynamics, it was pulled out of thin air. There was no reason based on mechanics to make any such assumption. In fact, everyone except Einstein who was working on the problem was looking at ways to modify electrodynamics, not mechanics--in other words, to come up with a theory of electrodynamics that was Galilean invariant, rather than to come up with a theory of mechanics that was Lorentz invariant.
> you either abandon the idea that the laws of physics remain the same under all reference frames, OR abandon Galilean velocity addition
Or, as above, you look for a Galilean invariant theory of electrodynamics. Of course we know today that that is a dead end, but that wasn't known then.
That is true. Still, we are kind of trading one unintuitive postulate (an invariant speed) for a different one: Why would we ever think that the time interval between two events can depend on the reference frame?
Sadly, I feel like SR can only really be "understood" as a complete theory. All the individual phenomena (time dilation, length contraction, relativity of simultaneity, constant speed of light etc.) are very hard to understand, because you cannot just take one of them and add it to classical relativity without immediately running into paradoxes. Only once the whole picture is known you see that all the pieces beautifully imply each other. This problem applies to every approach to the subject I've seen.
> Why would we ever think that the time interval between two events can depend on the reference frame?
This isn't a postulate, it's a derived theorem. That's true no matter what axiomatic formulation you use.
> All the individual phenomena (time dilation, length contraction, relativity of simultaneity, constant speed of light etc.) are very hard to understand, because you cannot just take one of them and add it to classical relativity without immediately running into paradoxes. Only once the whole picture is known you see that all the pieces beautifully imply each other.
This is all true, but all of these things you talk about (except the speed of light) are not postulates; they are derived theorems. No approach to relativity that I'm aware of has ever tried to start with any of these things as postulates. Even Einstein's original 1905 paper didn't start with any of these things as assumptions. He started with the principle of relativity and the speed of light being invariant. This paper is just showing how to derive at least a part of the second assumption from the first.
"Let us consider two events, E1 and E2, at the same spatial location in frame O, but separated by a time difference τ. In O′ the two events are separated by a time lapse T."
If you didn't know special relativity, you would never get this idea.
> If you didn't know special relativity, you would never get this idea.
You're looking at it backwards. The paper is not assuming anything here; in fact it is explicitly refusing to assume that we know the correct transformation law between frames. That means we have to leave open the possibility of the time difference changing, not because we know SR, but because we are being logically rigorous.
> Sadly, I feel like SR can only really be "understood" as a complete theory. All the individual phenomena (time dilation, length contraction, relativity of simultaneity, constant speed of light etc.) are very hard to understand, because you cannot just take one of them and add it to classical relativity without immediately running into paradoxes.
This is because you implicitly used the (wrong) postulate that "phenomena of special relativity can be iteratively added to a classical description of a non-relativistic theory".
Yes, but this paper isn't an island unto itself. It's a nice little lemma that can illustrate a point within the subject of special relativity, from a different perspective. It could be included as a small derivation in a textbook, or lecture notes, or even broken down as an exercise for the reader.
In my experience approaching the same idea from many different perspectives yields a deeper, richer understanding of the underlying concept, and offers a path for someone to reach the big picture understanding you mention.
It's curious that this simple proof, which doesn't require an invariant speed (i.e. Maxwell's equations), was discovered after Einstein's proposal, which depends on the invariant speed assumption. I wonder how the history of physics would have gone if someone had proposed this before Einstein. The maths needed for this derivation is quite simple, so I guess Newton or some mathematician before Einstein could have proposed special relativity.
> [...] On looking through the literature, we notice that several previous studies undertook the analysis of the optical force in the time domain, but at a certain point always shifted their focus to the time average force [67–69] or, alternatively, use numerical approaches to find the force in the time domain [44,70–75]. To the best of our knowledge, only a few publications conducted analytical studies of the optical force evolution. Very recent paper employs the signal theory to derive the imaginary part of the Maxwell stress tensor, which is responsible for the oscillating optical force and torque [76]. The optical force is studied under two-wave excitation acting on a half-space [40] and on cylinders [77], and a systematic analytical study of the time evolution of the optical force has not yet been reported.
If mass warps space and time nonlinearly per relevant confirmations of General Relativity, and there is observable retrocausality and also indefinite causal order, is forcing time to be the frame of reference, and to be the constant frame of reference necessary or helpful for the OT problem and otherwise?
In 1987 my physics professor presented this proof to our class -- I always wondered why this elegant derivation of the Lorentz transform was not more widely used in undergrad physics education.
As a first introduction, it's more instructive to develop physical theories from physical postulates (like the speed of light being constant), rather than constraining possible theories using more abstract mathematical principles.
Undergrads will not appreciate the latter as much as they will the former, especially because in the third semester, where relativity is first introduced, most physics undergrads don't know any group theory.
The contradiction between Maxwell’s electrodynamics and Newtonian mechanics was the raison d'etre of the relativity foundations laid out by Lorentz, Poincaré and Einstein. And since those subjects are usually studied before the theory of relativity, I think it makes sense to piggyback on them and use the historical reasoning to build the intuition behind relativity.
I don't understand why the video needs be so negative. Why dump on other videos? And unfairly so!
> Science educators love to make videos about relativity, but surprisingly few of them seem to understand what this stuff really means.
The two examples the video gives of science educators that don't "know what this stuff means" are Don Lincoln and Matt O'Dowd. Both have PhDs in physics. Lincoln was on the team that discovered the Higgs boson. O'Dowd is a professor at CUNY. They know what they're talking about.
Then the video gives as an example of a bad explanation the PBS Space Time episode on the speed of light. But that video makes pretty much the same argument: that the speed of light/causality falls out logically even if you don't postulate it. The author clearly didn't watch that video carefully https://www.youtube.com/watch?v=msVuCEs8Ydo
I would have subscribed to this channel, but this tone is a bad look.
Both the paper and the video (both of which I've skimmed but not gone through the math substantially) make a case for a maximum speed, notated as 'c'. Let me assume they make a very good mathematical case, and there is a maximum speed that satisfies the fundamental assumptions provided.
Both the paper and the video then finish off by ontologically tying this maximum speed, notation 'c', to the interesting physical E&M phenomenon we happen to call 'light'.
My question - in this framework where we are deconstructing everything (which I applaud), then why make that leap to associate this 'c' with the phenomenon known as light?
Inference, sure why not until any future evidence suggests otherwise, but why do so in the intentionally limited scope of the paper & video's derivations?
> why make that leap to associate this 'c' with the phenomenon known as light?
That's an excellent question! And there are two answers.
1. If you apply the principle of relativity to Maxwell's equations, that's the result you get. Maxwell's equations predict a wave phenomenon that propagates at a constant speed in a vacuum. If the laws of physics, including the laws of electromagnetism, are the same for all inertial observers, then the speed at which those waves propagate has to be the Lorentz-invariant speed.
2. Experiments bear out this prediction! :-)
Note that it did not have to be this way, but if it were not, then EM experiments would give different results for observers in different inertial frames, which would falsify the relativity assumption. The different results for EM experiments would allow you to identify a privileged inertial frame. For a long time, physicists thought this would turn out to be the case, that there was a medium, a "luminiferous aether", through which EM waves propagated, and this medium would define a privileged frame (at least for electromagnetism). It just turns out by experiment that this is wrong, at least in our universe.
That's all fine, but from the moment Maxwell's eq's are invoked, we enter an empirical framing, not a first principles one. (as Maxwell's results follow from observations of the behavior of charged phenomena, current, magnets, & electromagnets). The paper & video seek to postulate the existence of a universal speed limit from first principles. They do so, and without changing the framing from a first principles one to an empirical one, associate their notational 'c' with the observed (i.e. empirical) speed of light. I find this element of their rhetorical strategy at odds with their intentions.
I think what's left unsaid in the paper is that if a Lorentz-invariant speed exists, then it is unique. (This is easy enough to show.)
Therefore, once you show that the speed of light in a vacuum is Lorentz-invariant, then it has to be the unique Lorentz-invariant speed (i.e., the universal speed limit).
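The uniqueness claim can be checked with the standard one-dimensional velocity transformation u' = (u − v)/(1 − uv/c²). Requiring u' = u for every v forces u² = c², so only ±c are left unchanged. The sketch below is a numerical spot check in units where c = 1, not a proof, and the function name is my own.

```python
def transform_velocity(u, v, c=1.0):
    """Velocity u as measured from a frame moving at v (1-D Lorentz rule)."""
    return (u - v) / (1.0 - u * v / c**2)


if __name__ == "__main__":
    for u in (1.0, -1.0, 0.9, 0.1):
        # Only u = +c and u = -c come back unchanged.
        print(u, transform_velocity(u, 0.5))
```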
> That's all fine, but from the moment Maxwell's eq's are invoked, we enter an empirical framing, not a first principles one.
Yes, of course. First principles only lead you to the conclusion that there is a reference speed. They tell you nothing about its actual value. In fact, first principles cannot rule out the possibility that the reference speed is infinite. They can only lead you to suspect that it might not be.
Describing c as “the speed of light” is easier to grasp than c as “the speed of causality”, which is what it really is. It so happens that photons travel at the speed of causality, so we can get away with saying the speed of light.
I’ve always had a problem with the speed of light being the ultimate speed limit in the universe. I would think “who granted the photon this authority?”. It wasn’t until I watched a lot of PBS Space Time videos on YouTube that I learned what people really meant was that the speed of causality is the ultimate speed limit. That makes more sense to me.
Fun that the paper leans so heavily on the argument "this is clearly unphysical", while for anyone before Einstein it was clearly unphysical to consider solutions that yield Lorentz transformations.
Yeah, I was disappointed that the paper claims to use only so many premises up front, and then proceeds to smuggle in a few extra ones under the "clearly unphysical" line. I'd like to see a full and rigorous treatment of exactly what are the axioms which produce Lorentz transformations and no alternatives.
The fact that special relativity is completely consistent with an underlying preferred absolute reference frame that simply is not accessible to experiment (unless things break down at the Planck scale) seems obvious to me, but I've had a lot of grief talking to people who supposedly knew more than me.
There's a YouTube channel called dialect that has a few videos on this subject
> The 19th century Lorentzian ether theory was ruled out by the famous Michelson-Morley experiment, which ruled out the possibility that the Earth moves through the ether. What we propose here is that the Earth (and everything else) is made of ether. No experiment so far ruled out that possibility, so such a neo-Lorentzian ether theory is a viable possibility
That's in fact the viewpoint Lorentz took when he discovered the Lorentz transform and relativity. He showed that you can make an ether (i.e. a preferred reference frame) consistent with the observations by applying Lorentz transforms to the observations of anybody who is in a different reference frame.
"Ignatowsky showed that the only admissible transformations consistent with the principle of inertia, the isotropy of space, the absence of preferred inertial frames, and a group structure (i.e., closure under composition), are the Lorentz transformations, in which c can be any velocity scale, or the Galilei transformations"
The interpretation of their result is totally wrong although the result is correct, because excluding Newton's relativity (the Galilean group) is exactly why electromagnetism, or c, is necessary.
Exactly. The paper is great for constraining the possibilities, but you still need to show empirically that there is an invariant speed of light and we live in a Minkowskian not a Newtonian world.
You have it backwards. The assumption is that there is no privileged reference frame. And that assumption (along with a few others) does imply an invariant speed. That's the whole point here.
Let's assume there exist one or more privileged reference systems, and deal with the two cases one by one.
If there were one privileged reference system, and it is not involved in our deduction process, then the original deduction still holds. Therefore the special reference system is a redundant assumption.
If there were many such reference systems, we would have to carefully involve them in every possible deduction process, which is impossible.
QED