Pascal strings are *also* kind of bad though. All sub-string operations need all...

WalterBright · on April 6, 2023

D's strings were defined to be UTF-8 back in 2000. wstring is UTF-16, and dstring is UTF-32.

Back then it wasn't clear which encoding method would turn out to be dominant, so we did all three. (Java was built on UTF-16.)

As it eventually became clear, UTF-8 is da winnah, and the other formats are sideshows. Windows, which uses UTF-16, is handled by converting UTF-8 to -16 just before calling a Windows function, and converting anything coming back to UTF-8.
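As a rough illustration of the "convert at the boundary" approach, here is a minimal portable C++ sketch of a UTF-8 to UTF-16 converter. It is not D's runtime code; the name `utf8_to_utf16` is hypothetical, and on Windows itself one would more likely call `MultiByteToWideChar` with `CP_UTF8`.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical helper (not from D or the Windows SDK): decode UTF-8 and
// re-encode it as UTF-16. Throws std::invalid_argument on malformed input.
std::u16string utf8_to_utf16(std::string_view in) {
    // Smallest code point that legitimately needs a sequence of each length;
    // used to reject overlong encodings.
    static const std::uint32_t min_cp[5] = {0, 0, 0x80, 0x800, 0x10000};

    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t len = 0;
        if (b0 < 0x80)                { cp = b0;        len = 1; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; len = 2; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; len = 3; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (i + len > in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");

        // Fold in the continuation bytes, six payload bits each.
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (b & 0x3F);
        }

        if (cp < min_cp[len] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid code point");

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += len;
    }
    return out;
}
```

The reverse direction follows the same pattern: decode surrogate pairs into code points, then emit one to four UTF-8 bytes per code point.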

D doesn't distinguish between a string and a string view.

eco · on April 7, 2023

A lot of people don't know about this but Microsoft is taking steps to move everything over to utf-8.

They added a setting in Windows 10 to switch the code page over to utf-8 and then in Windows 11 they made it on by default. Individual applications can turn it on for themselves so they don't need to rely on the system setting being checked.

With that you can, in theory, just use the -A variants of the WinAPI with UTF-8 strings. I haven't tried it out yet, as we still support prior Windows releases, but it's nice that Microsoft has found a way out of the UTF-16 mess.

WalterBright · on April 7, 2023

The A-variants had problems years ago, which is why D abandoned them in favor of the W versions.

I don't mind seeing UTF-16 fade away. We've been considering scaling back the D support for UTF-16/32 in the runtime library, in favor of just using converters as necessary. We recommend using UTF-8 as much as practical.

saagarjha · on April 7, 2023

What’s the ownership story for string views?

eco · on April 7, 2023

They don't own anything. It's just a pointer and length. They don't allocate/deallocate.

saagarjha · on April 7, 2023

I mean clearly something needs to own the buffer for a new string.

tialaramex · on April 17, 2023

Sure, but that's not the string_view's problem. You can't just make string_views out of nothing; the string you want to borrow a view into needs to exist first.

Imagine you go to a library and insist on borrowing "My Cousin Rachel", but they don't have it. "Oh I don't care whether you have the book, I just want to borrow it" is clearly nonsense. If they don't have it, you can't borrow it.

saagarjha · on April 20, 2023

Walter is talking about D, and he said this:

> D doesn't distinguish between a string and a string view.

In C++ std::string owns the buffer and std::string_view borrows it. If there is no difference between the two in D, then how is this difference bridged?
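A minimal C++ sketch of that owner/borrower split, with hypothetical helper names: one function borrows a piece of the input as a `std::string_view`, the other copies the same piece into a `std::string` that owns its buffer and can outlive the source.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: borrow the file extension from `path` without copying.
// The view is only valid while the buffer behind `path` is alive.
std::string_view extension_view(std::string_view path) {
    const std::size_t dot = path.rfind('.');
    return dot == std::string_view::npos ? std::string_view{}
                                         : path.substr(dot + 1);
}

// Hypothetical helper: same result, but copied into a std::string that owns
// its own buffer, so it stays valid after `path` is gone.
std::string extension_owned(std::string_view path) {
    return std::string(extension_view(path));
}
```

In C++ the type tells you which case you are in; the copy into `std::string` is the explicit step that turns a borrow into ownership.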

WalterBright · on April 7, 2023

You can use automatic memory management and not worry about it. Or you can use D's prototype ownership/borrowing system. Or you can encapsulate them in something that manages the memory. Or you can do ownership/borrowing by convention (it's not hard to do).
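For readers more familiar with C++, the "encapsulate them in something that manages the memory" option might look roughly like the sketch below. This is an analogy only, not D code or anything from D's library: `SharedSlice` is a hypothetical type that pairs a view with a reference-counted handle to the buffer, so a single type can act as both "string" and "substring" without copying characters.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical type: a slice that shares ownership of its backing buffer.
// The buffer is freed when the last slice referring to it goes away.
class SharedSlice {
public:
    explicit SharedSlice(std::string text)
        : buffer_(std::make_shared<const std::string>(std::move(text))),
          view_(*buffer_) {}

    // Sub-slice: shares the same buffer and copies no characters.
    // Throws std::out_of_range if pos is past the end of this slice.
    SharedSlice slice(std::size_t pos, std::size_t count) const {
        return SharedSlice(buffer_, view_.substr(pos, count));
    }

    std::string_view view() const { return view_; }

    // Number of slices currently keeping the buffer alive.
    long owners() const { return buffer_.use_count(); }

private:
    SharedSlice(std::shared_ptr<const std::string> buffer, std::string_view view)
        : buffer_(std::move(buffer)), view_(view) {}

    std::shared_ptr<const std::string> buffer_;  // keeps the characters alive
    std::string_view view_;                      // pointer + length into *buffer_
};
```

A garbage collector plays a similar role to the reference count here: the buffer stays alive as long as anything still points into it.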

saagarjha · on April 7, 2023

Automatic memory management makes copies?

WalterBright · on April 7, 2023

No. Another word for automatic memory management is garbage collection.

saagarjha · on April 8, 2023

I guess I should rephrase. Let's say I have a string, which owns its buffer. What happens in D if I take a substring of it? Does a copy of that section occur to form a new string?