If you are copying and pasting code from open source projects into your own project, I think that is more likely to be considered copyright infringement than fair use. Fair use is generally for things like criticism, parody, teaching, etc. Obviously this would need to be judged on a case-by-case basis, but I think you are on shaky ground here.
In the first two links you sent, the 40% result looks like the baseline case getting slower, not the unit under test getting faster. The core assembly looks identical in both cases.
The order in which the tests were run was the first thing I checked in his implementation, but I looked too quickly and thought he was generating the data for each variant, so I assumed that was not the problem. [Actually, you need the same data for both tests, but generated twice]
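To make the "same data, generated twice" point concrete, here is a minimal sketch of a harness that regenerates identical input from a fixed seed for each variant and alternates which variant runs first. The two search functions are hypothetical stand-ins, not the code from the linked benchmark, and the seed, sizes and names are assumptions for illustration only.

```python
import random
import time


def make_data(seed, n):
    """Build the input from a fixed seed so every call yields identical values."""
    rng = random.Random(seed)
    return [rng.randrange(1_000_000) for _ in range(n)]


def find_plain(data, target):
    """Hypothetical baseline: bounds-checked linear search."""
    for i, value in enumerate(data):
        if value == target:
            return i
    return -1


def find_sentinel(data, target):
    """Hypothetical variant: append a sentinel so the loop needs no bounds check."""
    n = len(data)
    data.append(target)
    i = 0
    while data[i] != target:
        i += 1
    data.pop()  # restore the caller's list
    return i if i < n else -1


def compare(seed=1234, n=100_000, rounds=6):
    """Return the best observed time per variant, in seconds."""
    variants = [("plain", find_plain), ("sentinel", find_sentinel)]
    best = {name: float("inf") for name, _ in variants}
    target = -1  # never present, so both variants scan the whole list
    for r in range(rounds):
        # Alternate which variant goes first so neither always runs "cold".
        order = variants if r % 2 == 0 else variants[::-1]
        for name, fn in order:
            data = make_data(seed, n)  # same values, freshly generated
            start = time.perf_counter()
            fn(data, target)
            best[name] = min(best[name], time.perf_counter() - start)
    return best
```

Calling `compare()` returns a dict of best times keyed by variant name. If the gap between the two changes noticeably when the run order flips, that points at a measurement artifact rather than a real speedup.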
I was just going to point out that a 40% difference would mean that the version without the sentinel can be improved... I was going to check whether something is preventing branch prediction from absorbing that performance drop: memory is only being read, so nothing should be invalidated...
I once worked in an office with fake windows---just some blinds hung over a small recess in the wall.
When our team first arrived in the office, a colleague walked over to them and said something along the lines of `let's get some sunlight in here' before opening them to reveal the deception.
At a previous employer I was given the choice of being in the main open office or having an old, out-of-the-way (huge) meeting room with no windows to myself.
Boss simply couldn't understand that someone would choose to sit in a quiet, miles away from anyone else on site, air conditioned office even without windows.
Honestly, it never bothered me. With lots of plants, the strip lights replaced with bright 5000K LEDs for the overheads, and some LED lamps dotted around, it was just the same as working at night at home.
Hands down the best physical work environment I've had outside of work from home.
I joked at the time I'd program in a cave if it had good internet and was quiet.
For lowest latency applications I would avoid using RT priorities. Better to run each core 100% with busy waiting, and if you do so with RT prio you can prevent the kernel from running tasks such as vmstat, leading to lockup issues. Out of the box there is currently no way to 100% isolate cores in Linux. There is some ongoing work on that: https://lwn.net/Articles/816298/
Currently the page must be writeable at some point in order to create the trampoline.
A page fault is used as a way of executing the trampoline without the page having to be made executable/writable---the page fault handler recognises the page as a special trampoline page and handles the jump to the trampoline's target address (which was previously registered using the new syscall).
Note that AFAICS this is unrelated to Spectre. The intended use is for constructing closures for use in FFI libraries such as libffi.
EDIT: I think I perhaps misunderstood your query---are you asking why not just make a system call where the kernel creates the page for you with the desired trampoline code and execute permissions?
Double map it, like a JIT does. Once writable, and once executable. Put the pointers into different shared objects so that ASLR puts a randomized offset between them and you can't discover the write pointers from the execute pointer, and vice versa.
JavaScriptCore had an amusing scheme where they'd make an (executable) memcpy gadget with the address hardcoded, then throw away read permissions to that memory. So the pointer's address is not readable without modifying memory permissions.
Yes, the edit is precisely what I was trying to get at. Do you happen to have context around this?
edit: also, “spectre mitigations” was just a shot in the dark. It does feel like this mode of jmping (modifying saved registers and restoring them) would be more prone to interfering with speculative execution.