Early days, but I'm trying to solve the lack-of-data problem with a project called "Oxen".
At its core it's a data version control system, built from the ground up to handle the size and scale of some of these ML datasets (unlike git-lfs or DVC, which I have found to be relatively slow and hard to work with). I'm also building out a web hub, similar to GitHub, for collaborating on the data with a nice UI.
Would love any feedback on the project as it grows! Here's the GitHub repo:
https://github.com/Oxen-AI/oxen-release#-oxen