> I hope that GitHub is at least limiting any training data to a sensible whitelist of licenses (MIT, BSD, Apache, and similar)
Yes, and even those licences require preservation of the original copyright attribution and licence text. MIT gives some wiggle room with the phrase "substantial portions", so the usable whitelist might just be MIT and WTFPL.