If they're doing that, perhaps it's because they want to leverage the "preprint server" exceptions in copyright agreements to amass a corpus of otherwise-locked papers. If the author doesn't upload the paper themselves, it's scraping. If the author does upload it, it's at least a grey area, and may be permissible under many agreements.