Hacker News | sild's comments

If you are copying and pasting code from open source projects into your own project, then I think that is more likely to be considered copyright infringement than fair use. Fair use is generally for things like criticism, parody, and teaching. Obviously this would need to be judged on a case-by-case basis, but I think you are on shaky ground here.


I put the benchmark into quick-bench but could not replicate the 40% result. The sentinel version was faster but only slightly.

https://quick-bench.com/q/314Z81FskTlcDqMCUHFVhWmDz8Q

Update 1: After moving some constants around, I get the 40% result:

https://quick-bench.com/q/lPrpQTAyDQuOoKS9MBWCTBXk1TE

No idea why it made such a big difference to the benchmark.

Update 2: If the test order is reversed, the result goes back to being only slightly faster for the sentinel version:

https://quick-bench.com/q/Ds7aqe5-6md_tTPndOK54ltYZmE
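
For readers without the links to hand, here is a minimal sketch of the kind of comparison being discussed, assuming the benchmark is a linear search with and without a sentinel. The function names `find_plain` and `find_sentinel` are hypothetical and not taken from the quick-bench code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain linear search: tests both the bound and the value on every iteration.
// Returns the index of the first match, or v.size() if the key is absent.
std::size_t find_plain(const std::vector<int>& v, int key) {
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (v[i] == key) return i;
    }
    return v.size();
}

// Sentinel variant: temporarily overwrite the last element with the key so
// the inner loop is guaranteed to stop and needs no bound check.
std::size_t find_sentinel(std::vector<int>& v, int key) {
    const std::size_t n = v.size();
    if (n == 0) return 0;            // nothing to search
    const int last = v[n - 1];       // remember the element we overwrite
    v[n - 1] = key;                  // plant the sentinel
    std::size_t i = 0;
    while (v[i] != key) ++i;         // only one comparison per iteration
    v[n - 1] = last;                 // restore the original contents
    if (i < n - 1 || last == key) return i;  // a genuine match
    return n;                        // we only hit the sentinel
}
```

Both functions return the same index for the same input, so any timing difference between them comes from the loop shape or from the benchmark setup, not from the result.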


In the first two links you sent, the 40% result looks like the baseline case getting slower, not the unit under test getting faster. The core assembly looks identical in both cases.


Well spotted; great for sild! Looks like the 40% claim was due to a bug in benchmarking (which makes sense, and can happen).


The order in which the tests were run was the first thing I checked in his implementation, but I looked too quickly and thought he was generating the data for each variant, so I assumed that was not the problem. [Actually, you need the same data for both tests, but generated twice]

I was going to just point out that a 40% difference would mean that the version without the sentinel can be improved. I was going to check whether something is preventing branch prediction from absorbing that performance drop: memory is only being read, so nothing should be invalidated.
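
The point about using the same data for both tests, but generated twice, can be sketched like this. It assumes a fixed seed is acceptable for the benchmark; `make_data` is a hypothetical helper, not something from the original implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Build an input buffer from a fixed seed. Every call with the same seed
// yields the same values, but in a freshly allocated vector, so each
// benchmark variant gets its own copy of identical data.
std::vector<std::uint32_t> make_data(std::size_t n, std::uint32_t seed) {
    std::mt19937 gen(seed);
    std::vector<std::uint32_t> data(n);
    for (auto& x : data) {
        x = static_cast<std::uint32_t>(gen());
    }
    return data;
}
```

Calling `make_data(n, seed)` once per variant means neither test runs on a buffer that the other has already touched.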


I am pretty sure that the 40% original difference was due to a bug in benchmarking - or food for thought to improve the non-sentinel version.


The snoop command on SunOS and IRIX had it in the 90s too.


If you are using gcc you can use the -Wparentheses flag to turn on warnings for this: https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#inde...
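
A small sketch of one case the flag covers, assuming the problem in question is an assignment used as a condition. The helper names are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: counts down by one.
int next_value(int x) { return x - 1; }

// Counts how many non-zero values next_value yields before reaching zero.
int count_steps(int start) {
    if (start <= 0) return 0;  // keep the loop below finite
    int steps = 0;
    int v = start;
    // Writing `while (v = next_value(v))` also compiles, but gcc with
    // -Wparentheses (enabled by -Wall) suggests parentheses around an
    // assignment used as a truth value. The extra pair of parentheses
    // below tells the compiler the assignment is intentional.
    while ((v = next_value(v))) {
        ++steps;
    }
    return steps;
}
```

If the assignment was a typo for `==`, the warning is the prompt to fix it rather than to add parentheses.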


I once worked in an office with fake windows---just some blinds hung over a small recess in the wall.

When our team first arrived in the office, a colleague walked over to them and said something along the lines of "let's get some sunlight in here" before opening them to reveal the deception.


At a previous employer I was given the choice of being in the main open office or having to myself an old, out-of-the-way (huge) meeting room that had no windows.

My boss simply couldn't understand that someone would choose to sit in a quiet, air-conditioned office miles away from anyone else on site, even without windows.

Honestly, it never bothered me. With lots of plants, the strip lights replaced with bright 5000K LEDs for the overheads, and some LED lamps dotted around, it was much the same as working at night at home.

Hands down the best physical work environment I've had outside of work from home.

I joked at the time I'd program in a cave if it had good internet and was quiet.



This page [1] describes additional tuning parameters. In particular adjusting /proc/sys/kernel/sched_rt_runtime_us can be beneficial.

[1] https://access.redhat.com/documentation/en-us/red_hat_enterp...
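
As a rough sketch of inspecting that setting from code: the helper names below are hypothetical, and it assumes the usual Linux procfs layout with `sched_rt_period_us` next to `sched_rt_runtime_us`.

```cpp
#include <cassert>
#include <fstream>
#include <optional>
#include <string>

// Read a single integer from a procfs-style file.
// Returns an empty optional if the file is missing or unreadable.
std::optional<long> read_long(const std::string& path) {
    std::ifstream in(path);
    long value = 0;
    if (in >> value) return value;
    return std::nullopt;
}

// Fraction of each scheduling period that real-time tasks may consume.
// A runtime of -1 disables the limit, which is reported here as 1.0.
double rt_share(long runtime_us, long period_us) {
    assert(period_us > 0);
    if (runtime_us < 0) return 1.0;
    return static_cast<double>(runtime_us) / static_cast<double>(period_us);
}
```

Passing the values read from `/proc/sys/kernel/sched_rt_runtime_us` and `/proc/sys/kernel/sched_rt_period_us` to `rt_share` gives the share of CPU time real-time tasks are allowed; with the common defaults of 950000 and 1000000 that is 95%.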


For lowest latency applications I would avoid using RT priorities. It is better to run each core at 100% with busy waiting, and if you do that at RT priority you can prevent the kernel from running tasks such as vmstat, leading to lockup issues. Out of the box there is currently no way to 100% isolate cores in Linux. There is some ongoing work on that: https://lwn.net/Articles/816298/
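
A minimal sketch of the busy-waiting approach at normal priority, assuming Linux and glibc. The function names are hypothetical, and a real application would poll its own work queue or device rather than a clock.

```cpp
#include <sched.h>

#include <cassert>
#include <chrono>

// Pin the calling thread to the first CPU it is currently allowed to run on.
// No scheduling policy or priority is changed, so the thread stays in the
// default (non-RT) class. Returns false if the affinity calls fail.
bool pin_to_first_allowed_cpu() {
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0) return false;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &allowed)) {
            cpu_set_t one;
            CPU_ZERO(&one);
            CPU_SET(cpu, &one);
            return sched_setaffinity(0, sizeof(one), &one) == 0;
        }
    }
    return false;
}

// Busy-wait for the given duration without sleeping or yielding.
// Returns the number of polls performed, as a stand-in for real work.
long spin_for(std::chrono::microseconds duration) {
    const auto deadline = std::chrono::steady_clock::now() + duration;
    long polls = 0;
    while (std::chrono::steady_clock::now() < deadline) {
        ++polls;
    }
    return polls;
}
```

Because the thread is not in a real-time class, the kernel can still preempt it briefly for its own housekeeping, which is the trade-off described above.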


Currently the page must be writeable at some point in order to create the trampoline.

A page fault is used as a way of executing the trampoline without the page having to be made executable or writable: the page fault handler recognises the page as a special trampoline page and handles the jump to the trampoline's target address (which was previously registered using the new syscall).

Note that AFAICS this is unrelated to Spectre. The intended use is for constructing closures for use in FFI libraries such as libffi.

EDIT: I think I perhaps misunderstood your query. Are you asking why not just have a system call where the kernel creates the page for you with the desired trampoline code and execute permissions?


Double map it, like a JIT does. Once writable, and once executable. Put the pointers into different shared objects so that ASLR puts a randomized offset between them and you can't discover the write pointers from the execute pointer, and vice versa.
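
A minimal sketch of the double mapping itself, assuming Linux with `memfd_create` available (glibc 2.27 or later). The struct and function names are hypothetical, and the sketch does not attempt the part about placing the two pointers in different shared objects.

```cpp
#include <sys/mman.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <cstring>

// Two views of the same physical pages: one writable, one executable.
struct DualMapping {
    void* write_view;
    void* exec_view;
    std::size_t size;
};

// Map an anonymous in-memory file twice. Neither view is ever writable and
// executable at the same time. Returns false if any step fails.
bool map_twice(std::size_t size, DualMapping& out) {
    int fd = memfd_create("jit-region", 0);  // the name is only a debug label
    if (fd < 0) return false;
    if (ftruncate(fd, static_cast<off_t>(size)) != 0) {
        close(fd);
        return false;
    }
    void* w = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    void* x = mmap(nullptr, size, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
    close(fd);  // the mappings keep the memory alive
    if (w == MAP_FAILED || x == MAP_FAILED) {
        if (w != MAP_FAILED) munmap(w, size);
        if (x != MAP_FAILED) munmap(x, size);
        return false;
    }
    out.write_view = w;
    out.exec_view = x;
    out.size = size;
    return true;
}
```

Bytes written through `write_view` become visible through `exec_view` because both map the same file with `MAP_SHARED`; the kernel picks the two addresses independently.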


You still need to store the write pointers somewhere. So this very quickly becomes a game of cat and mouse.


JavaScriptCore had an amusing scheme where they'd make an (executable) memcpy gadget with the address hardcoded, then throw away read permissions to that memory. So the pointer's address is thus not readable without modifying memory permissions.


Yes, the edit is precisely what I was trying to get at. Do you happen to have context around this?

edit: also, “spectre mitigations” was just a shot in the dark. It does feel like this mode of jmping (modifying saved registers and restoring them) would be more prone to interfering with speculative execution.


Would you mind sharing which course you took please? It sounds very interesting.


It sounds like Offensive Security's OSCP and OSCE


