Well back to my premise, could you not sort of figure that out by the access patterns? Subsequently, the people who give the permission have contacted you about it, so what else do we need? Perhaps you could have the notion of "private" shares.
Would it be reasonable to identify your users? Perhaps require an SMS or something to gain membership, and then when there is this deduplication case, why couldn't you contact them to find out if it's valid or not?
>Well back to my premise, could you not sort of figure that out by the access patterns?
You "could", but you could be wrong and cheese off your users. Who's to say that an indie label isn't using your service for cheap file hosting? Just because $file is getting a lot of hits (and "lot" needs to be defined here) doesn't mean that those uses are necessarily infringing.
And then we're back to the links + deduping problem anyways. Let's say you've determined that 3 of the 9 non-DMCA'd links are coming from CoolHotWarez.ru (first remembering that the law doesn't require you to do this research, and second remembering that any warez site worth a crap uses an anonymizer to strip referrer URLs ANYWAYS...) - You take them down.
The 6 left don't have much traffic, so your metric kind of falls apart. And all this would result in is warez kiddies uploading two copies of each file, keeping one secret and sharing the other.
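To make that failure mode concrete, here's a rough sketch. It assumes a deduplicating host where every upload gets its own link pointing at one shared blob, and where "suspicious" just means a link crossed some hit count. `DedupStore`, `take_down_busy_links`, and the threshold are all made up for illustration; this isn't how any particular service works.

```python
import hashlib
from collections import defaultdict


class DedupStore:
    """Toy deduplicating store: one blob per unique file, many links per blob."""

    def __init__(self):
        self.blobs = {}               # sha256 hex digest -> file bytes
        self.links = {}               # link id -> sha256 hex digest
        self.hits = defaultdict(int)  # link id -> download count
        self._next_id = 0

    def upload(self, data: bytes) -> str:
        """Store the bytes once per unique content, but always mint a new link."""
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)
        link = f"link{self._next_id}"
        self._next_id += 1
        self.links[link] = digest
        return link

    def download(self, link: str) -> bytes:
        """Count the hit and return the blob; raises KeyError for a removed link."""
        digest = self.links[link]
        self.hits[link] += 1
        return self.blobs[digest]

    def take_down_busy_links(self, threshold: int) -> list:
        """Hypothetical heuristic: remove every link with at least `threshold` hits."""
        busy = [link for link in self.links if self.hits[link] >= threshold]
        for link in busy:
            del self.links[link]
        return busy


store = DedupStore()
shared = store.upload(b"same file")  # posted publicly
secret = store.upload(b"same file")  # same bytes, link kept private

for _ in range(1000):
    store.download(shared)

removed = store.take_down_busy_links(threshold=100)
# Only the popular link is gone; the quiet one still serves the same blob.
still_available = store.download(secret)
```

The heuristic only ever sees the link that got shared around. The second link has no traffic to flag, so the file stays up until someone reports that link too.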
So... you'd ask the pirates if they were valid or not? The copyright holder is literally the only person who knows who does and does not have permission, so they're the only one who can reliably inform others. Though some of the big copyright holders have trouble keeping track of what they have and have not approved, as was shown in the Viacom case.
Big websites operate at scale. Think millions of users and thousands of new ones each day. Things like this are not compatible with manual processes. And if you look at something like YouTube, I think they get more than a week's worth of video uploaded every day.
Half of the problem is that there isn't a reasonable way to enforce this.