
I think this naming style should be considered obsolete.

- This function returns the number of bytes, not of characters or codepoints.

- str and len are both abbreviations, we should use full words when possible.

- We can also be more explicit about what the function does: it does not simply return the string length, it counts characters (or bytes in this case).

Here is how I would name it:

u32 CountBytesInString(char* string);
u32 CountCharactersInString(char* string);

On the implementation side, this work can be done with SIMD instructions and be really freaking fast, but it should still be explicit to the user that the work is O(n), not exactly free.
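To make the bytes-versus-characters distinction concrete, here is a minimal scalar sketch of the two functions named above (no SIMD). It rests on assumptions the comment does not state: `u32` is taken to be `uint32_t`, the parameter is `const char*` rather than `char*`, and "characters" is read as Unicode code points in valid UTF-8 input.

```c
#include <stdint.h>

typedef uint32_t u32; /* assumption: u32 is a 32-bit unsigned integer */

/* Counts the bytes before the terminating NUL, like strlen. O(n). */
u32 CountBytesInString(const char *string)
{
    u32 count = 0;
    while (string[count] != '\0') {
        count++;
    }
    return count;
}

/* Counts code points, ASSUMING the input is valid UTF-8: every byte that
   is not a continuation byte (bit pattern 10xxxxxx) starts a new code
   point. Still a full scan, so O(n). */
u32 CountCharactersInString(const char *string)
{
    u32 count = 0;
    const unsigned char *p = (const unsigned char *)string;
    for (; *p != '\0'; p++) {
        if ((*p & 0xC0) != 0x80) {
            count++;
        }
    }
    return count;
}
```

For pure ASCII input the two counts agree; they diverge as soon as the string contains a multi-byte sequence.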



C isn't Java. Even Niklaus Wirth in Pascal, Oberon, and the like avoided naming their identifiers too long. 'GetStrSz()' is enough to achieve (most of) what you want, assuming certain naming conventions:

- Makes it clear that this returns the number of bytes, assuming a naming convention where `sz` refers to size (in bytes) and `ln` refers to length (in some other unit which would be specified in the type). Note that in C, 'characters' refers to bytes. It's a flaw in how C names its types, yes, but I wouldn't say it should be any different just because other languages do things differently.

- It doesn't use full words because I don't think it needs to. Abbreviations are OK as long as every (invested) party agrees that they're sane, and I think they're pretty sane.

- It makes it explicit that it is performing a calculation (hence, is O(n)) via 'get'.

I don't think all this is necessary, though - I actually think 'strln()' is enough. First, because characters means bytes, I can assume that this function is getting the number of characters (bytes) in a string. I wouldn't expect it to give me anything else! Second, in C, if strings were a struct of some sort, I'd expect to be able to get their length via 'str->ln', which would be O(1). The fact that the length is found through a function in the first place signals to me that it's doing something behind the scenes to figure that out. Remember - that's just my opinion, which I admit is extreme - but I think yours is just as extreme.
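A rough sketch of the contrast drawn here, under stated assumptions: `struct str` and its `data` field are hypothetical (the comment only mentions `str->ln`), and `strln` is written as a plain byte scan equivalent to the standard `strlen`.

```c
#include <stddef.h>

/* Hypothetical string struct: the length is stored, so reading it is O(1). */
struct str {
    size_t ln;        /* length in bytes, kept up to date by whoever builds the string */
    const char *data; /* the bytes themselves (hypothetical field name) */
};

/* For a bare NUL-terminated char*, the length has to be computed by
   walking the bytes until the terminator: O(n). */
size_t strln(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}
```

Reading `s->ln` is a field access, while `strln(s->data)` walks the whole string; the function-call syntax is the hint that work is being done.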


Naming is extremely important. While strlen is a very basic and hardly ambiguous example, consistency is key, and I believe that good naming rules should be applied globally, or at least at the framework level.

I think that full words and verbs are easier to read and avoid ambiguity.

I guess this is a matter of style and preference.

This anecdote reminds me of the Mutazt type, something I found in a new codebase I was asked to debug. I had to dig for almost an hour to find exactly what this type was.

Turns out it was a char*, a C string, buried under 4-5 levels of abstraction.

Mutazt = Mutable ASCII Zero Terminal.


"str and len are both abbreviations, we should use full words when possible"

u32 and char are abbreviations.


Yes, this is true, and it is a tradeoff: I think basic types are so widely used that we can abbreviate them this way without much ambiguity.

strlen is also pretty unambiguous, but I still have to check what strstr means.



