> People look at the beta release of the software, and interpret flaws in an early version as critical, fundamental problems.
Maybe because they realize the flaws are fundamentally inherent in the very core of the product. They're using a GPT-3 derivative here. DNN models are not the right tool for this job.
Why wouldn't the following work for licenses? Train 3 models:
1. Permissive only - Include only repos with permissive licenses (MIT, Apache) in the training set.
2. Copy-left - Everything in model 1, plus GPL, excluding AGPL and other "hardcore copy-left licenses".
3. All - Include all code, even unlicensed and AGPL.
Users could choose which version they prefer based on the profile of their project and their company. The majority of GitHub repos have a LICENSE file, so it doesn't seem implausible?
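A rough sketch of how the three training sets could be partitioned, in case it helps make the idea concrete. Everything here is an assumption for illustration: the tier names, the `allowed_in_tier` and `select_repos` helpers, the choice of SPDX identifiers for each tier, and the premise that each repo's license has already been detected and is available as an SPDX identifier (or `None` when there is no license).

```python
from typing import List, Optional, Tuple

# Hypothetical tier definitions using SPDX license identifiers.
# Which licenses belong in which tier is an assumption, not a legal judgment.
PERMISSIVE = {"MIT", "Apache-2.0"}
COPYLEFT = PERMISSIVE | {"GPL-2.0-only", "GPL-3.0-only"}  # AGPL deliberately excluded


def allowed_in_tier(license_id: Optional[str], tier: str) -> bool:
    """Return True if a repo with this license may go into the given tier's training set."""
    if tier == "all":
        return True  # includes unlicensed (None) and AGPL repos
    if tier == "copyleft":
        return license_id in COPYLEFT
    if tier == "permissive":
        return license_id in PERMISSIVE
    raise ValueError(f"unknown tier: {tier}")


def select_repos(repos: List[Tuple[str, Optional[str]]], tier: str) -> List[str]:
    """Filter (repo_name, license_id) pairs down to the names allowed in a tier."""
    return [name for name, license_id in repos if allowed_in_tier(license_id, tier)]


if __name__ == "__main__":
    # Made-up repo names, purely for demonstration.
    sample = [
        ("repo-a", "MIT"),
        ("repo-b", "GPL-3.0-only"),
        ("repo-c", "AGPL-3.0-only"),
        ("repo-d", None),
    ]
    for tier in ("permissive", "copyleft", "all"):
        print(tier, select_repos(sample, tier))
```

Note that this only decides which repos feed each model. It does not record which repo any given piece of generated output came from.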
Almost all permissively licensed code still requires preserving copyright notices or other attribution. So where Copilot is creating copyright violations, restricting its training to MIT- or Apache-licensed code will not resolve the issue.