Yeah, but that's because there is human creative work in the write-ups of the copyrighted work. A court might liken this more to a phone book, which doesn't add any (human) creative expression on top of it.
Not to mention, the final product of encyclopedic work and journalism does not internalize the original it describes in its entirety, the way a language model does. In some sense, a language model takes all that there is to be taken from a given resource and incorporates it into its weights.