
Realistically you don't even want to allow a good chunk of ASCII in filenames. The subset you would actually want to allow is small: lowercase and uppercase letters, digits, ., - and _. That is 65 of the 128 ASCII code points. Maybe a few more at a stretch, but even some of those, like ~, are questionable given their history.


As far as uppercase goes, do we really need a difference between the 'Documents' folder and the 'documents' folder? What is the use case for this?


How would you implement caching, or look up the decryption key for some directory in a lookup table, if `Documents` and `documents` must both resolve to the same entry? Some kind of normalization would be needed, right? Then you need to bring in encoding and [Unicode?] normalization. Shivers.


> How would you implement caching, or look up the decryption key for some directory in a lookup table, if `Documents` and `documents` must both resolve to the same entry?

Calculating a hash or equality for a string always uses some kind of comparer logic. Being able to use a "raw" comparer would be one special case of that. In C#/Windows, you'd use

    var cache = new Dictionary<string, Cached>(StringComparer.InvariantCultureIgnoreCase);
, or similar. The comparer then reports the same hash code for "documents" and "Documents", and reports the two strings as equal. You might think that this is more complex because of the case insensitivity, but it's only slightly so. E.g. if you instead assumed case sensitivity and naively used a default here:

    var cache = new Dictionary<string, Cached>();
, then you'd be leaning on an implicit default. Dictionary's default comparer for string keys is ordinal, so this particular line behaves, but the same habit bites elsewhere: other .NET string APIs such as string.Compare, CompareTo and StartsWith default to culture-sensitive comparison, where e.g. in a German locale the files weiß.txt and weiss.txt can compare equal despite being two different files. A working Linux lookup table with case sensitivity would therefore look pretty similar to the first one, with the comparer spelled out:

    var cache = new Dictionary<string, Cached>(StringComparer.Ordinal);
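
To make the difference concrete, here is a minimal sketch using .NET's ordinal comparers (StringComparer.Ordinal and StringComparer.OrdinalIgnoreCase), which compare UTF-16 code units without consulting locale data. The class name and the "key-A"/"key-B" values are placeholders for illustration, not anything from a real system.

```csharp
using System;
using System.Collections.Generic;

class ComparerDemo
{
    static void Main()
    {
        // Case-insensitive table: both spellings land on one entry.
        var insensitive = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        insensitive["Documents"] = "key-A";
        insensitive["documents"] = "key-B"; // overwrites the entry above
        Console.WriteLine(insensitive.Count); // 1

        // Case-sensitive table: the two spellings are distinct keys.
        var sensitive = new Dictionary<string, string>(StringComparer.Ordinal);
        sensitive["Documents"] = "key-A";
        sensitive["documents"] = "key-B";
        Console.WriteLine(sensitive.Count); // 2

        // The comparer, not the string itself, supplies the matching hash.
        var cmp = StringComparer.OrdinalIgnoreCase;
        Console.WriteLine(cmp.GetHashCode("Documents") == cmp.GetHashCode("documents")); // True
        Console.WriteLine(cmp.Equals("Documents", "documents")); // True
    }
}
```

Either way the table is built the same way; only the comparer passed to the constructor changes.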


What if we only allowed the following in filenames: [a-z0-9._-]?

Now there will be no 'Documents', only 'documents'.

The comment I replied to already restricted us to ASCII letters, numbers, _, ., and -. I questioned why both upper and lower case letters should be allowed.
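
A rough sketch of what checking that lowercase-only rule could look like. One detail worth noting: inside a bracket expression a hyphen sitting between two characters denotes a range, so `.-_` would match everything from `.` through `_` (uppercase letters included); putting the hyphen last keeps it literal. The class and method names here are made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

class FilenameCheck
{
    // Hyphen is placed last so it is a literal, not a range operator.
    static readonly Regex Allowed = new Regex(@"\A[a-z0-9._-]+\z");

    static bool IsAllowed(string name) => Allowed.IsMatch(name);

    static void Main()
    {
        Console.WriteLine(IsAllowed("documents"));    // True
        Console.WriteLine(IsAllowed("Documents"));    // False
        Console.WriteLine(IsAllowed("notes_v2.txt")); // True
        Console.WriteLine(IsAllowed("weiß.txt"));     // False
    }
}
```

This only checks the character set; names like "." and ".." still pass and would need separate handling.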


That is also an open question: the whole matter of case sensitivity for filenames. I don't really think it is needed.



