IMHO Common Lisp is the modern one. Using the JVM is nice for certain uses, and Clojure is a fine language, but the ability to go from higher abstractions all the way down to bare metal is a power of Lisp that we should not give up. Same with macros and code generation.
Luckily all of them can coexist, so we do not need to choose.
I agree, Common Lisp is crufty but really stands the test of time. I have been using Common Lisp since about 1982 and old code still runs fine.
My relative use of Lisp languages: Common Lisp 60%, Racket 20%, Haskell 5%, Clojure 5%, and various other Schemes 10%. Unfortunately since most of my work in the last 8 years has been deep learning, LLMs, LLM chains, etc., I spend a little over half my time now using Python. So it goes...
"old code runs fine" is a poor benchmark for how well a language holds up over time. All x86 code still runs fine. Perl 5 is from, I think, 1994. It's still maintained and in development. Most C code from the '80s would still compile and run just fine (at least POSIX-based, Windows however...). You can trivially find Pascal, Fortran, COBOL, and Forth code that all "still runs fine."
I don't think the fact that some old code still runs is a particularly unique or interesting quality. After all, I can go to archive.org right now and run all of that ancient DOS, Amiga, whatever code in a 100% exact (or close to it) emulator in my browser.
I am curious how performant compiled Common Lisp is compared to GraalVM-compiled Clojure native images. You certainly don't give up macros and code generation when using Clojure, though you do give up a specific class of explicit reader macros. Some reader extensibility can be had (and is used in clojure.core) via reader functions (data_readers.clj).
How is code generation handled on modern operating systems and CPUs? Isn't there normally a strict separation between code and data to prevent exploits?
I just checked: SBCL on Linux/amd64 puts compiled functions in an r+w+x page. You could write Lisp that writes Lisp that writes self-modifying machine code, and it would (I think) run by default.
(defun f (x) (* x x))       ; any small function will do
(compiled-function-p #'f)   ; T
(disassemble 'f)
; disassembly for F
; Size: 58 bytes. Origin: #x5350CD14 ; F
; 14: 498B4510 MOV RAX, [R13+16] ; thread.binding-stack-pointer
; 18: 488945F8 MOV [RBP-8], RAX
; 1C: 4883EC10 SUB RSP, 16
; [...]
$ pmap $PID --range 5350CD14
442331: /usr/bin/sbcl
0000000053498000 122272K rwx-- [ anon ]
total 122272K
Self-modifying code, in the sense of changing one instruction to another at runtime (as was commonly done in assembler in the 1960s), is not generally possible with modern Common Lisps, mostly because modern operating systems don't allow it. And that's a good thing, because such code would be hopelessly insecure and impossible to reason about.
But if you mean "compile a new function called X and replace the old X at runtime", that's easy in Common Lisp. It's not commonly done unless you're explicitly writing some kind of compiler.
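A toy sketch of what that looks like (the function name is made up):

(defun greet () "hello")
(greet)                                  ; => "hello"
;; Build fresh source, compile it, and install it under the same name.
(compile 'greet '(lambda () "goodbye"))
(greet)                                  ; => "goodbye"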
What is commonly done is to create a lexical closure at compile time and change its bound values at runtime. IOW changing the private data of a function at runtime is more generally useful than changing its instructions.
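For instance, something like this, where NEXT-ID's private counter is mutated at runtime without touching its instructions:

(let ((counter 0))
  (defun next-id () (incf counter))
  (defun reset-ids () (setf counter 0)))

(next-id)     ; => 1
(next-id)     ; => 2
(reset-ids)
(next-id)     ; => 1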
What's most common is to write Lisp programs that emit Lisp source code, which is compiled at compile time (but usually not at run time). Such programs are called macros.
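A minimal made-up example; the backquoted template is the source the macro emits, and MACROEXPAND-1 shows it:

(defmacro unless* (test &body body)
  ;; Emits source of the form (IF test NIL (PROGN ...body))
  `(if ,test nil (progn ,@body)))

(macroexpand-1 '(unless* (zerop n) (/ 1 n)))
;; => (IF (ZEROP N) NIL (PROGN (/ 1 N)))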
One can define and compile functions in a running Common Lisp. There's a standard function, compile, that does this.
In SBCL, any evaluation of an expression is done by first compiling it. Compiled functions that are no longer accessible (including not having any frames on the stack) are garbage collected.
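For instance, at a stock SBCL REPL (if I remember the defaults right):

(compiled-function-p (lambda (x) (* x x)))        ; => T, even unnamed lambdas
(funcall (compile nil '(lambda (x) (* x x))) 8)   ; => 64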
The really interesting question is not whether users can mutate existing compiled code, but whether it's useful for the Common Lisp implementation to do so. This is because Common Lisp is a dynamic language, where generic functions and classes can be redefined on the fly. If you want to implement such things efficiently, it would be useful to be able to change existing compiled functions to reflect that (for example) this slot of this object is at this offset rather than that offset.
A scheme has been proposed to do this: put such code off to the side in little chunks accessed with unconditional branches. When a redefinition occurs, the branch is redirected to a newly created chunk; the old one is GCed when no longer referenced from the stack. You have to pay for the unconditional branches, but those are typically quite fast.
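To make the dynamism concrete, here's a small sketch of the kind of redefinition the implementation has to cope with (class and accessor names are invented):

(defclass point () ((x :initarg :x :accessor px)))
(defvar *p* (make-instance 'point :x 1))

;; Redefine the class while *P* is alive; the slot layout changes.
(defclass point () ((y :initform 0 :accessor py)
                    (x :initarg :x :accessor px)))

(list (px *p*) (py *p*))   ; => (1 0); *P* is updated to the new layout on access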
> How is code generation handled on modern operating systems and CPUs? Isn't there normally a strict separation between code and data to prevent exploits?
You compile code, which is text (data), all the time, don’t you?
The difference is that the output of the compiler is usually not loaded into the same process that did the compiling. That is not the case for applications that are developed On Lisp™.
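For example, a hedged sketch (the path is arbitrary):

;; Emit generated source to a file, compile it, and load the result
;; back into the very process that did the compiling.
(with-open-file (s "/tmp/gen.lisp" :direction :output :if-exists :supersede)
  (print '(defun generated-fn () :hello) s))
(load (compile-file "/tmp/gen.lisp"))
(generated-fn)   ; => :HELLO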
At worst you have to ensure that you separate what needs to be writable from what needs to be executable, and then flip mappings RW->RX when generating new code, just like other JIT compilers do.
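A minimal Linux/x86-64 sketch of that dance from SBCL, assuming the sb-posix module is available and accepts -1 as the fd for an anonymous mapping:

(require :sb-posix)

(let* ((len 4096)
       ;; Map one anonymous page, read+write but not executable.
       (mem (sb-posix:mmap nil len
                           (logior sb-posix:prot-read sb-posix:prot-write)
                           (logior sb-posix:map-private sb-posix:map-anon)
                           -1 0))
       ;; x86-64 for: mov eax, 42 ; ret
       (code #(#xB8 #x2A #x00 #x00 #x00 #xC3)))
  ;; "Generate" the code while the page is writable.
  (loop for byte across code
        for i from 0
        do (setf (sb-sys:sap-ref-8 mem i) byte))
  ;; Flip RW -> RX before executing, W^X style.
  (sb-posix:mprotect mem len (logior sb-posix:prot-read sb-posix:prot-exec))
  ;; Jump into the freshly generated code.
  (sb-alien:alien-funcall
   (sb-alien:sap-alien mem (function sb-alien:int))))   ; => 42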
Oh okay. I was thinking of self-modifying code, not just code generation. That was something that was done occasionally by DOS and Windows programmers before the introduction of DEP in Windows XP SP2.
Yeah, self-modifying code has become harder. As far as I know the most robust solution now is something like what Julia does, where it can compile new generations of any module to update call targets, for example. At worst, when seccomp disallows flipping a writable page to executable at runtime, you can always compile into a file and then dlopen the file. In other words, it can take many more contortions now than it used to, but it's still possible.
It is not that strict. Many years ago machine code was directly loaded into memory, and wherever the CPU's program counter pointed, that's what it would execute.
These days, a page of memory can be set to some combination of Read, Write, and Execute.
The exploit mitigation you're referring to is that programs typically set pages of memory so that they never have both write and execute permission at the same time.
However, this is ultimately controlled by the program. On Linux, the program can invoke the 'mprotect' system call to change the permissions on a page (though a program can also voluntarily use seccomp to forgo ever invoking it again).
And this is basically what browsers do. They compile the code into memory that has been set to 'write' (but not execute) and then set it to 'execute' (but not write).
The effectiveness of this mitigation is undermined by the existence of ROP techniques, which is why Intel started introducing Control-flow Enforcement Technology (CET), which is intended to ensure you can only branch to certain locations in memory.