I personally think something like this is ridiculous. See all this poor child? YOUR money can help us (a multi billion dollar company) makes a difference!
>I might be the only person who likes SQL nulls
From my understanding, null values are bad because its a sign that the database design is flawed (See database normalizations)
Perhaps you have a more practical experience?
That's not right. "Null" means that the value is not known. Suppose you have a table for employees, and you want to record the last time they were paid. What do you put in that column for people who just started this morning? The alternatives are to use null, indicating that they haven't been, or to formulate a codebase-wide sentinel value like "0000-01-01" and then accounting for that in every single database operation everywhere.
Further suppose that you have an external function in your codebase to estimate how many paychecks you've paid to someone, but the author doesn't know about any "0000-01-01" conventions your office uses. Without that, you'd see that Joe New Guy has worked here about 2,020 years, so we've probably issued him about 48,000 checks. If only you'd used null, then that function would have calculated "today() - null", which in any sane language would raise a type exception and alert you to the problem.
Nulls are beautiful. They have meaning. Lots of people misuse them, but that doesn't mean they're not valid and useful.
> The alternatives are to use null, indicating that they haven't been, or to formulate a codebase-wide sentinel value like "0000-01-01" and then accounting for that in every single database operation everywhere.
Arguably, you might consider that it is a bad design. Perhaps there should be a paycheck table that you can sort by date, instead of a field in the employee table that is updated.
(Of course this is hypothetical, so it is hard to argue use cases)
But a paycheck table does help with the second example because now you don’t need to estimate number of paychecks either. Which will help with unpaid leave or sabbatical situations to give you an accurate number.
But this is what they are getting at with bad design. (Potentially) As far as normalization goes.
Oh, sure. Your proposal is a better plan for this specific situation.
I certainly won't argue that people come up with some bad designs! Heaven knows I've dealt with some of them. I just wanted to point out that there are situations where null is perfectly reasonable, even though there are plenty where it surely isn't.
But data protection might not allow you to keep this for longer than legally mandated, i.e., ~10 years. How do you handle this for long-time employees?
A quick question I use to demonstrate the usefulness of NULL is "What color is the elephant on my desk?"
That question doesn't have an answer because there is no elephant on my desk. It can't be represented by any color, the answer needs to indicate that there is no value.
Often in analytics you will create datasets which are an amalgamation of several upstream sources. For example, an Endangered Species database. You might have "wing length" and "number of eyes" as fields, and both of these need to be NULL for something like a sea urchin.
There's really no justification to avoid nulls as a principle.
Okay, but what if I `SELECT wood, lampshade, elephant_color FROM desks WHERE location = "office"`? How does it represent my mahogany office desk with a beach theme lampshade and no elephant on it, vs my teak bedroom writing desk with a pink elephant and white cream lampshade?
It is hard to model without requirements but you probably need a table for type of desks, another for desk instances with location and another one to many linking table for items/colors.
Why? I like two lamps and no elephants. By normalizing, this new requirement is simple to represent compared to jamming more fields into the desk table.
Or if another person prefers penguins you don’t end up with penguin_color columns.
It is possible the one table design works, but the introduction of nulls is a smell that indicates maybe the design is a problem.
> That's not right. "Null" means that the value is not known.
According to my old SQL Server text book (written by the SQL Server team) NULL should be avoided because it as so many different interpretations (I think forty was mentioned).
1. NOT NULL is very frequently used in schemas, because NULL is often an undesirable value (e.g. for a mandatory field). That doesn't make NULLs bad, and NULLs are still frequently used when you have optional fields or fields where NULL has some other special meaning.
2. NULLs are used by SQL functions and operators as an "unknown" value. So, for example, "NULL AND TRUE" is NULL, because we could substitute NULL with TRUE or with FALSE to get different results, but "NULL AND FALSE" is FALSE, because no matter what we substitute NULL with, the result will always be FALSE.
3. Clearly all these valid uses of NULL do not indicate "flaws" in the database design.
4. Database normalization isn't always a good thing, and beyond a certain level it's almost always a bad thing, so using normalization methods as a standard for whether something is "flawed" is probablyn ot the best idea.
5. No database normalization method, as far as I know, actually tries to eliminate NULLs, so I don't know what "(See database normalizations)" refers to. Can you clarify?
Re: NOT NULL is very frequently used in schemas, because NULL is often an undesirable value
Talking strings, usually if you don't want a null string, you also don't want blanks (white space) either. I'd like to see auto-trim in the standard, and also a minimum length specifier. We then wouldn't need to deal with nulls. A single min-length-when-trimmed-and-denulled integer value would replace a lot of repetitious hubbub in typical CRUD apps. D.R.Y. it! You'd have one attribute "cell" that would replace the equivalent of:
IF length(trim(denull(inputValue,'')))
< this_fields_min_length THEN
raise_data_too_short_error(...);
That's the way you'd want it done in the vast vast majority of CRUD systems (if not doing weird things).
The fact that a database is not fully normalized is not a sign that it's flawed. In fact, there are cases where some tables being fully denormalized makes sense (although that's less common in my experience).
This furthers my hypothesis that it is entirely possible we live in a completely different "reality". My red is different than your "red", so on and so forth. Just like its incomprehensible for me to imagine what additional color is there outside of my human range; your reality could be unimaginable to me too.
That is, I do not know what its like to be you. I don't think this necessarily mean that we live in different physical reality, its just that we come to interact with it through completely different means.