
The differences of the various regex "dialects" came to me over the years of using regular expressions for all kinds of stuff.

Matching EOL feels natural for every line-based process.

What I find way more annoying is escaping characters and writing character groups. Why can't all regex engines support '\d' and '\w' and such? Why, in sed, is an unescaped '.' a regex-dot matching any character, but an unescaped '(' is just a regular bracket?



> Why, in sed, is an unescaped '.' a regex-dot matching any character, but an unescaped '(' is just a regular bracket?

It is because sed predates the very influential second generation Extended Regular Expression engine and by default uses the first generation Basic Regular Expression engine. So really it is for backwards compatibility.

http://man.openbsd.org/re_format#BASIC_REGULAR_EXPRESSIONS

You can usually pass sed a -r flag to get it to use EREs.
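A minimal sketch of the difference, assuming a sed that accepts -E for ERE mode (GNU and BSD sed both do; -r is the older GNU spelling). The sample strings and variable names are made up for illustration:

```shell
# BRE (sed's default): an unescaped ( is a literal parenthesis.
bre_literal=$(printf 'foo(bar)\n' | sed 's/(bar)/X/')

# BRE: grouping needs the escaped form \( ... \).
bre_group=$(printf 'foobar\n' | sed 's/\(bar\)/[\1]/')

# ERE (-E): the unescaped ( now groups ...
ere_group=$(printf 'foobar\n' | sed -E 's/(bar)/[\1]/')

# ... and the escaped \( is the literal parenthesis.
ere_literal=$(printf 'foo(bar)\n' | sed -E 's/\(bar\)/X/')

printf '%s\n' "$bre_literal" "$bre_group" "$ere_group" "$ere_literal"
```

So the escaping rule for parentheses flips between the two dialects, while '.' means "any character" in both.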

Actually I don't really know if BREs predate EREs or not. I assume they do based on the name but I might be wrong.


BRE and ERE were created at the same time. Prior to this there wasn't a clear standard for regex. From memory, this was standardised in 1992 (IEEE Std 1003.2-1992, a.k.a. POSIX.2).

The theory originally came from work by Stephen Cole Kleene in the 1950s. It made its way to Unix via the QED editor (which later became ed (and sed), then ex, then vi, then vim; all with differing authors): Ken Thompson added regex when he ported QED to CTSS (an OS developed at MIT for the IBM 709, which was later used to develop Multics, and hence led to Unix).

Also, the "grep" command got its name from "ed": "g" (the global ed command), "re" (regular expression), and "p" (the print ed command). Try :g/string/p in vi/vim; it does the same thing as the grep command.
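A rough way to see the g/re/p equivalence from a shell, using sed -n '/re/p' as a stand-in for the ed command (ed itself isn't installed everywhere). The sample text and variable names are just for illustration:

```shell
sample='apple
banana
cherry'

# "-n" suppresses default output; "/an/p" prints only the lines matching the regex.
with_sed=$(printf '%s\n' "$sample" | sed -n '/an/p')

# grep does the same selection in one word.
with_grep=$(printf '%s\n' "$sample" | grep 'an')

printf '%s\n' "$with_sed" "$with_grep"
```

Both commands select the same line, "banana".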


>you can usually pass sed a -r flag

for portability, -E is the POSIX flag for the same thing



