There's an exceedingly simple way of doing this that's pretty much bulletproof: just check the given text against the database of stored generations (which they no doubt keep) and you'll have a near-perfect result.
Logistically, searching untold terabytes of text might be challenging, but to say there are fundamental difficulties is just not accurate imo.
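For illustration, here's a rough sketch of that kind of lookup. Everything in it is an assumption for the sake of example, not anyone's actual system: `GenerationStore`, `shingles`, and the 5-word window are hypothetical names and parameters, and a real store would live in a proper index rather than an in-memory set. It hashes overlapping word n-grams of each stored generation and reports what fraction of a query text's n-grams have been seen before.

```python
import hashlib
import re


def shingles(text, n=5):
    """Return the set of hashed word n-grams (shingles) for a text."""
    # Lowercase and keep only word tokens so punctuation and case don't matter.
    words = re.findall(r"\w+", text.lower())
    if len(words) < n:
        grams = [" ".join(words)] if words else []
    else:
        grams = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    return {hashlib.sha1(g.encode("utf-8")).hexdigest() for g in grams}


class GenerationStore:
    """Hypothetical in-memory stand-in for a database of stored generations."""

    def __init__(self, n=5):
        self.n = n
        self.seen = set()

    def add(self, generation):
        # Record every shingle of a stored generation.
        self.seen |= shingles(generation, self.n)

    def overlap(self, text):
        # Fraction of the query's shingles that appear in any stored generation.
        query = shingles(text, self.n)
        if not query:
            return 0.0
        return len(query & self.seen) / len(query)


store = GenerationStore()
store.add("The quick brown fox jumps over the lazy dog near the river bank")
score = store.overlap("The quick brown fox jumps over the lazy dog near the river bank")
```

Note that this only measures verbatim or near-verbatim overlap: changing a word in the query drops every shingle that contained it, so the score falls as the text is edited.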
That's the problem: no, you won't. You've bought so far into the stochastic parrot view of LLMs that you're viewing them as a fancy content storage/retrieval system.