
> I haven't seen the video [do they discuss this?], but the biggest problem I have with Lua is its lack of good unicode support.

Hear hear. I too don't care much for the argument that Lua doesn't have unicode because C doesn't have unicode. For two reasons. First of all, Lua can be straightforwardly embedded into a host of other languages (for example, http://tinyurl.com/39qhxzw). Second, who in their right mind thought that a language designed for embedding into commercial applications shouldn't have good I18N? (And honestly, how in the world could someone from Brazil develop a programming language that doesn't even support their own spoken language? It's as if Ruby was developed by someone in Japan.)

A bit of history. From what I have gathered in its documentation, Lua steals quite unabashedly from NewtonScript. NewtonScript was a proto-style OO language developed in a hurry for the Newton when the Dylan language wasn't going to deliver. Like Lua, it's a language designed originally for an interpreter, which runs embedded in an outer C++ environment and must interoperate with it. But NewtonScript doesn't just use Unicode pervasively: it was the first major language to do so. And C++ interoperability with the language is just fine. So this Lua-needs-to-work-with-C-and-thus-can't-do-unicode thing is nonsense both in fact and precedent.

And while we're on the subject of NewtonScript: Lua's let's-almost-do-proto-style-OO-but-require-the-user-to-do-extra-work is really incredibly annoying. Lua should have had proto OO built into the language, like NewtonScript, and not just "available", in a hacked way, through its meta model. Lua wants to be more "general" than NewtonScript, but it just winds up being (IMO) rather less usable.
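For anyone who hasn't written it, the "extra work" in question looks roughly like the sketch below. This is a minimal illustration, not code from the Lua or NewtonScript documentation; `clone`, `Animal`, `Dog`, and `rex` are made-up names, and the only language features assumed are `setmetatable` and the `__index` metafield.

```lua
-- Hypothetical helper: make an object that delegates missing keys to `proto`.
local function clone(proto, fields)
  local obj = fields or {}
  -- Lookups that fail on obj fall through to proto via __index.
  return setmetatable(obj, { __index = proto })
end

local Animal = { sound = "..." }
function Animal.speak(self)
  return self.name .. " says " .. self.sound
end

local Dog = clone(Animal, { sound = "woof" })  -- Dog delegates to Animal
local rex = clone(Dog, { name = "Rex" })       -- rex delegates to Dog

print(rex:speak())  --> Rex says woof
```

The delegation chain itself is only one line, but every project has to write (or pick) its own `clone`, which is the complaint being made here.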



Lua is as old as NewtonScript. Its authors have mentioned Scheme, SNOBOL, awk, bibtex, Icon, and (IIRC) Self as influences, but I don't recall them mentioning NewtonScript. Javascript also has a lot in common with Lua. I think it's because the trade-offs inherent in embedded scripting languages are making them converge on a similar overall language design, rather than plagiarism.

If you're putting Lua in a commercial application, the commercial application itself will provide the i18n, and Lua can use it with very little trouble. Lua's strings are raw byte arrays - you can load arbitrary binary data in them. A library that reads UTF-8 strings (say) can work with them just fine. (And while I don't speak Portuguese, there are examples in PiL that use it without problems.)

I disagree with you about whether metatables are annoying, but it's a matter of taste. I haven't ever used NewtonScript, but I do use metatables for quite a bit more than just prototype OO - it's easy to turn a table into a proxy + cache to a function, for example. Also, my Lua redis library (http://github.com/silentbicycle/sidereal) turns table reads and writes into syntactic sugar for redis db key, list, and set operations.
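A rough sketch of the proxy + cache idea, assuming nothing beyond standard metatables. The names `cached`, `squares`, and `calls` are invented for the example and have nothing to do with sidereal's actual code.

```lua
-- Wrap function f in a table: t[k] calls f(k) on the first read,
-- then serves the stored value on later reads.
local function cached(f)
  return setmetatable({}, {
    __index = function(t, k)
      local v = f(k)
      rawset(t, k, v)  -- store it so __index is not triggered again for k
      return v
    end,
  })
end

local calls = 0
local squares = cached(function(n)
  calls = calls + 1
  return n * n
end)

print(squares[12])  --> 144
print(squares[12])  --> 144 (read straight from the table this time)
print(calls)        --> 1
```

One caveat with this sketch: a function that returns nil for some key is never cached for that key, since storing nil in a table is the same as storing nothing.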


I'll have to look up the specific direct references I saw that the Lua authors made to NewtonScript which prompted this: though note that the sole reference to Self in Programming in Lua is made in the same breath as NewtonScript. At any rate, I very strongly disagree with you about the i18n. Let's say you're making a video game. Let's call it, oh, I dunno, how about "World of Warcraft". You've decided to craft much of the level design in Lua so you don't have to write it in C++. Now you want to port it to the Mongolian market, complete with Mongolian storyline, instructions, character dialogue, you name it. All this stuff was in Lua strings in English. If you had a decent I18N system in your scripting language you'd just type Mongolian in those strings instead. This is a real problem.

> I disagree with you about whether metatables are annoying, but it's a matter of taste.

I didn't say metatables are annoying in and of themselves. I said they're annoying as a hacked-together substitute for a true proto OO.

Lua has many good things. But I18N and a usable OO ain't among them.


Lua "strings" are raw byte arrays with a saved length. You can store JPEG data in Lua strings. Storing Mongolian is not a big deal. You'll need a lib to e.g. calculate UTF-8 lengths, but all that needs to happen is the Lua community (or a project's company) agreeing on a specific Unicode library. There are no technical limitations there.
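To make the byte-array point concrete, here is a small sketch of the kind of helper such a library would provide. `utf8_len` is a hypothetical name, not a function from any particular Unicode library, and it assumes the input is already valid UTF-8.

```lua
-- Count code points in a UTF-8 string by counting every byte that is
-- not a continuation byte (continuation bytes are 0x80..0xBF).
local function utf8_len(s)
  local n = 0
  for i = 1, #s do
    local b = string.byte(s, i)
    if b < 0x80 or b >= 0xC0 then
      n = n + 1
    end
  end
  return n
end

-- "Монгол" spelled out as decimal byte escapes, so the example does not
-- depend on how this file happens to be encoded.
local word = "\208\156\208\190\208\189\208\179\208\190\208\187"

print(#word)           --> 12 (bytes)
print(utf8_len(word))  --> 6  (code points)
```

The string stores and round-trips fine; what the core language does not give you is the second number, and the same goes for case mapping, collation, and pattern matching on characters rather than bytes.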


> Storing Mongolian is not a big deal.

...

> You'll need a lib to e.g. calculate UTF-8 lengths,

...

> There are no technical limitations there.

You cannot be seriously making this argument. Why not just code everything in assembly?


I understand it's annoying that Lua doesn't have a single, officially recommended Unicode library* (in the core distribution or otherwise), but as problems go, it's easily solved by loading an existing, freely available library and getting on with life. It's much less trouble than (say) removing the GIL from Python or retroactively fixing weird operator precedence in C.

* Though there are recommendations at http://lua-users.org/wiki/LuaUnicode .

In practice, it could look like this:

   U = require "unicode"    -- hypothetical module name
   s = U"some unicode"      -- using a library sure is hard
   length = s:len()
   replaced = s:gsub("thing1", "thing2") -- global substitute
   -- etc.

You can add regular expressions and bignums to Lua pretty painlessly, too.


Agreed; only, the fact that no one has done it conclusively so far says to me that the Lua community isn't/hasn't been thinking about these sorts of things, and makes me worry about the other holes I will encounter if I do delve into the language. I'd rather invest my time and energy into learning how to use Cython to speed up Python, or Clojure to make Java more fun.



