The fundamental reason had to do with how rdbms structured its pages of data and having arbitrary sized blobs directly in the record broke the storage optimization and made performance tank.
It was a design constraint back in the day.
I haven’t looked at this in decades, but I think now it’s all just pointers to the file system and not actually bytes in the record.
So it was fundamentally the wrong thing to do based on how databases stored data for performant recall.
But that’s back when disks were expensive and distributed nodes were kind of hard.
> I think now it’s all just pointers to the file system
It depends. InnoDB, assuming the DYNAMIC row type, will store TEXT/BLOB on-page up until 40 bytes, at which point it gets sent off-page with a 20 byte pointer on-page. However, it comes with a potentially severe trade-off before MySQL 8.0.13: any queries with those columns that would generate a temporary table (CTEs, GROUP BY with a different ORDER BY predicate, most UNIONS, many more) can’t use in-memory temp tables and instead go to disk. Even after 8.0.13, if the size of the temp table exceeds a setting (default of 16 MiB), it spills to disk.
tl;dr - be very careful with MySQL if storing TEXT or BLOB, and don’t involve those columns in queries unless necessary.
Postgres, in comparison, uses BYTEA as a normal column that gets TOASTed (sent off-page in chunks) after a certain point (I think 2 KiB?), so while you might need to tune the column storage strategy for compression - depending on what you’re storing - it might be fine. There are some various size limits (1 GiB?) and row count limits for TOAST, though. The other option is with the Large Binary Object extension which requires its own syntax for storage and retrieval, but avoids most of the limitations mentioned.
Or, you know, chuck binary objects into object storage and store a pointer or URI in the DB.
It was a design constraint back in the day.
I haven’t looked at this in decades, but I think now it’s all just pointers to the file system and not actually bytes in the record.
So it was fundamentally the wrong thing to do based on how databases stored data for performant recall.
But that’s back when disks were expensive and distributed nodes were kind of hard.