This is a great point. If I recall correctly, prior to Microsoft's acquisition o...

heavyset_go · on June 29, 2021

> In any case, I hope that GitHub is at least limiting any training data to a sensible whitelist of licenses (MIT, BSD, Apache, and similar). Otherwise, I think it would probably be too much risk to use this for anything important/revenue-generating.

I'm going to assume that there is no sensible whitelist of licenses until someone at GitHub is willing to go on the record that one exists.

CookieMon · on July 2, 2021

> I hope that GitHub is at least limiting any training data to a sensible whitelist of licenses (MIT, BSD, Apache, and similar)

Yes, and even those licences require preservation of the original copyright attribution and licence. MIT gives some wiggle room with the phrase "substantial portions", so it might just be MIT and WTFPL.

fomine3 · on June 30, 2021

Interesting to see, since Nat was a founder of Xamarin.