The other day akkartik wrote an implementation of the program Knuth used to introduce literate programming to the CACM readers: https://basiclang.solarpunk.au/d/7-don-knuths-original-liter...
It just tells you the top N words by frequency in its input (default N=100), with words of the same frequency ordered alphabetically and all words converted to lowercase. Knuth's version was about 7 pages of Pascal, maybe 3 pages without comments. It took akkartik 50 lines of idiomatic, simple Lua. I tried doing it in Perl; it was 6 lines, or 13 without relying on any of the questionable Perl shorthands. Idiomatic and readable Perl would be somewhere in between.
#!/usr/bin/perl -w
use strict;
# The last command-line argument is N if more than one argument is given.
my $n = @ARGV > 1 ? pop @ARGV : 100;
my %freq;
while (my $line = <>) {
    for my $w ($line =~ /(\w+)/g) {
        $freq{lc $w}++;
    }
}
# Sort by descending count, breaking ties alphabetically, and print the top N.
for my $w (sort { $freq{$b} <=> $freq{$a} || $a cmp $b } keys %freq) {
    print "$w\t$freq{$w}\n";
    last unless --$n;
}
I think Python, Ruby, or JS would be about the same.
Then I tried writing a Common Lisp version. Opening a file, iterating over lines, hashing words and getting 0 as default, and sorting are all reasonably easy in CL, but splitting a line into words is a whole project on its own. And getting a command-line argument requires implementation-specific facilities that aren't standardized by CL! At least string-downcase exists. It was a lark, so I didn't finish.
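Just to give a flavor of it, here is a rough sketch of only the word-splitting step in portable CL (my own throwaway attempt, untested against the full problem, and SPLIT-WORDS is just a name I made up):

(defun split-words (line)
  ;; Collect the maximal alphabetic runs in LINE, downcased, in order.
  (let ((words '())
        (len (length line))
        (i 0))
    (loop
      (setf i (or (position-if #'alpha-char-p line :start i) len))
      (when (>= i len) (return (nreverse words)))
      (let ((end (or (position-if-not #'alpha-char-p line :start i) len)))
        (push (string-downcase (subseq line i end)) words)
        (setf i end)))))

That's already a dozen lines for what the Perl version does with a single /(\w+)/g match, and the command-line argument still needs something implementation-specific on top of it (sb-ext:*posix-argv* on SBCL, for example).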
(In Forth you'd almost have to write something equivalent to Knuth's Pascal, because it doesn't even come with hash tables and case conversion.)
My experience with Smalltalk is more limited but similar. You can do anything you want in it, it's super flexible, the tooling is great, but almost everything requires you to just write quite a bit more code than you would in Perl, Python, Ruby, JS, etc. And that means you have more bugs, so it takes you longer. And it doesn't really want to talk to the rest of the world—you can forget about calling a Squeak method from the Unix command line.
Smalltalk and CL have native code compilers available, which ought to be a performance advantage over things like Perl. Often enough, though, it's not. Part of the problem is that their compilers don't produce very fast code, but they certainly ought to beat a dumb bytecode interpreter, right? Well, maybe not if the program's hot loop is inside a regular expression match or a NumPy array operation.
And a decent native code compiler (GCC, HotSpot, LuaJIT, the Golang compilers, even ocamlopt) will beat any CL or Smalltalk compiler I have tried by a large margin. This is a shame because a lot of the extra hassle in Smalltalk and CL seems to be aimed at efficiency.
(Scheme might actually deliver the hoped-for efficiency in the form of Chez, but not Chicken. But Chicken can build executables and easily call C. Still, you'd need more code to solve this problem in Scheme than in Lua, much less Ruby.)
—·—
One of the key design principles of the WWW was the "principle of least power", which says that you should do each job with the least expressive language that you can. So the URL is a very stupid language, just some literal character strings glued together with delimiters. HTML is slightly less stupid, but you still can't program in it; you can only mark up documents. HTTP messages are similarly unexpressive. As much as possible of the Web is built out of these very limited languages, with only small parts being written in programming languages, where these limited DSLs can't do the job.
Lisp, Smalltalk, and Forth people tend to think this is a bad thing, because it makes some things—important things—unnecessarily hard to write. Alan Kay has frequently deplored the WWW being built this way. He would have made it out of mobile code, not dead text files with markup.
But the limited expressivity of these formats makes them easier to read and to edit.
I have two speech synthesis programs, eSpeak and Festival. Festival is written in Scheme, a wonderful, liberating, highly expressive language. eSpeak is in C++, which is a terrible language, so as much as possible of its functionality is in dumb data files that list pronunciations for particular letter sequences or entire words and whatnot. Festival does all of this configuration in Scheme files, and consequently I have no idea where to start. Fixing problems in eSpeak is easy, as long as they aren't in the C++ core; fixing problems in Festival is, so far, beyond my abilities.
(I'm not an expert in Scheme, but I don't think that's the problem—I mean, my Scheme is good enough that I wrote a compiler in it that implements enough of Scheme to compile itself.)
—·—
SQL is, or until recently was, non-Turing-complete, but expressive enough that 6 lines of SQL can often replace a page or three of straightforward procedural code—much like Perl in the example above, but more readable rather than less.
Similarly, HTML (or JSX) is often many times smaller than the code to produce the same layout with, say, GTK. And when it goes wrong, you can inspect the CSS rules applying to your DOM elements in a way that relies on them being sort of dumb, passive data. It makes them much more tractable in practice than Turing-complete layout systems like LaTeX and Qt3.
—·—
Perl and Forth both have some readability problems, but I think their main difficulty is that they are too error-prone. Forth, aside from being as typeless as conventional assembly, is one of the few languages where you can accidentally pass a parameter to the wrong call.
This sort of rhymes with what I was saying in 02001 in https://paulgraham.com/redund.html, that often we intentionally include redundancy in our expressions of programs to make them less error-prone, or to make the errors easily detectable.
The article in CACM that presents Knuth's solution [1] also includes some criticism of Knuth's approach, and provides an alternative that uses a shell pipeline.
With great respect to Doug McIlroy (in the CACM article), the shell pipeline has a serious problem that Knuth's Pascal program doesn't have. (I'm assuming Knuth's program is written in standard Pascal.) You could have compiled and run Knuth's program on an IBM PC XT running MS-DOS; indeed on any computer having a standard Pascal compiler. Not so the shell pipeline, where you must be running under an operating system with pipes and 4 additional programs: tr, sort, uniq, and sed.
McIlroy also discusses how a program "built for the ages" should have "a large factor of safety". McIlroy was worried about how Knuth's program would scale up to larger bodies of text. Also, Bentley and McIlroy's critique was published in 1986, which I think was well before there was any serious scrutiny of Unix tools and their susceptibility to buffer overruns, etc. In 1986, could people have determined the limits of tr, sort, uniq, sed, and pipes--both individually and collectively--when handling large bodies of text? With a lot of effort, yes, but if there was a problem, Knuth at least only had one program to look at. With the shell pipeline, one would have to examine the 4 programs plus the shell's implementation of pipes.
(I'm not defending Pascal; Knuth, Bentley, and McIlroy are always worth reading on any topic -- thanks for posting the link!)
Bringing this back to Forth, Bernd Paysan, who needs no introduction to the people in the Forth community, wrote "A Web-Server in Forth": https://bernd-paysan.de/httpd-en.html . It only took him a few hours, but in fairness to us mortals, it's an HTTP request processor that reads a single HTTP request from stdin, processes it, and writes its output to stdout. In other words, it's not really a full web server, because it depends on an operating system with an inetd daemon for all the networking. As with McIlroy's shell pipeline, there is a lot of heavy lifting done by operating system tools. (Paysan's article is highly recommended for people learning Forth, like me when I read it back in the 2000s.)
> splitting a line into words is a whole project on its own
Is it[1]? My version below accumulates alphabetical characters until it encounters a non-alphabetical one, then increments the count for the accumulated word and resets the accumulator.
It does look a lot like what I was thinking would be necessary. About 9 of the 19 lines are concerned with splitting the input into words. Also, I think you have omitted the secondary key sort (alphabetical ascending), although that's only about one more line of code, something like
#'(lambda (a b)
    (or (< (car a) (car b))
        (and (= (car a) (car b))
             (string> (cadr a) (cadr b)))))
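For context, here's my guess at how that comparator would slot in, assuming the counts get dumped out of the hash table into a list of (count word) lists first (FREQ and ENTRIES are my made-up names, not anything from your program):

(let ((entries '()))
  (maphash (lambda (word count) (push (list count word) entries)) freq)
  ;; Sorted ascending this way, the most frequent words end up at the
  ;; end of the list, so the top N would be read off from the back.
  (sort entries
        #'(lambda (a b)
            (or (< (car a) (car b))
                (and (= (car a) (car b))
                     (string> (cadr a) (cadr b)))))))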
Because the lines of code are longer, it's about 3× as much code as the verbose Perl version.
In SBCL on my phone it's consistently slower than Perl on my test file (the King James Bible), but only slightly: 2.11 seconds to Perl's 2.05–2.07. It's pretty surprising that they are so close.
Were I trying to optimise this, I would test to see if a hash table of alphabetical characters is better, or just checking (or (and (char>= c #\A) (char<= c #\Z)) (and (char>= c #\a) (char<= c #\z))). The accumulator would probably be better as an adjustable array with a fill pointer allocated once, filled with VECTOR-PUSH-EXTEND and reset each time. It might be better to use DO, initializing C and declaring its type.
Also worth giving it a shot with (optimize (speed 3) (safety 0)) just to see if it makes a difference.
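Concretely, the inner loop I have in mind is something like this sketch (untested; COUNT-WORDS and FREQ are made-up names, and FREQ is assumed to be an EQUAL hash table):

(declaim (optimize (speed 3) (safety 0)))

(defun count-words (stream freq)
  ;; One adjustable string with a fill pointer, reused as the word
  ;; accumulator instead of consing a fresh string per word.
  (let ((word (make-array 16 :element-type 'character
                             :adjustable t :fill-pointer 0)))
    (flet ((alpha-p (c)
             ;; The explicit range check mentioned above.
             (or (char<= #\A c #\Z) (char<= #\a c #\z)))
           (flush ()
             (when (plusp (fill-pointer word))
               (incf (gethash (string-downcase word) freq 0))
               (setf (fill-pointer word) 0))))
      (loop for c = (read-char stream nil)
            while c
            do (if (alpha-p c)
                   (vector-push-extend c word)
                   (flush))
            finally (flush)))))

Whether any of that actually beats the Perl on the King James Bible is exactly the kind of thing I'd want to measure rather than guess.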
Yes, definitely more verbose. Perl is good at this sort of task!
> You can do anything you want in it, it's super flexible, the tooling is great, but almost everything requires you to just write quite a bit more code than you would in Perl, Python, Ruby, JS, etc.
Given that Smalltalk predates JS by many years: if that's true, it was not always true.
Given that Smalltalk was early to the GUI WIMP party: if that's true, it was not always true for GUI WIMP use.