

decodebytes 3 months ago | on: DeepFabric – Generate high-quality synthetic datas...

Sure, I'm just starting to get some up on HF. A good example might be GSM8K, as it shows the structured output where every result is strictly formatted. I am using it right now to train models and managing to get a small Qwen model up into the 60% range on GSM8K, which, wildly, is higher than Llama 2 and xAI Grok 1.

GSM8K: https://huggingface.co/datasets/lukehinds/deepfabric-GSM8K-c...

Some others:

Infra failures reasoning / CoT: https://huggingface.co/datasets/lukehinds/deepfabric-devops-...

Medical (multi-turn): https://huggingface.co/datasets/lukehinds/deepfabric-7k-medi...

Programming challenges: https://huggingface.co/datasets/lukehinds/programming-challe...

If there is anything in particular you need, drop me a message or feel free to open an issue and I can create something for you.
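
To illustrate why strictly formatted results matter for GSM8K-style training and scoring, here is a minimal sketch of an answer checker. It assumes the final answer follows the original GSM8K convention of ending with `#### <number>`; whether the DeepFabric dataset linked above uses exactly that marker is an assumption, and `extract_answer` and `accuracy` are hypothetical helper names, not part of any library.

```python
import re

# Assumption: the final answer is the last thing in the text, written as
# "#### <number>" (the convention used by the original GSM8K dataset).
FINAL_ANSWER = re.compile(r"####\s*(-?[\d,]*\.?\d+)\s*$")


def extract_answer(text):
    """Return the final numeric answer as a string, or None if the marker is missing."""
    match = FINAL_ANSWER.search(text.strip())
    if match is None:
        return None
    # Drop thousands separators so "1,234" and "1234" compare equal.
    return match.group(1).replace(",", "")


def accuracy(predictions, references):
    """Fraction of predictions whose final answer matches the reference numerically."""
    correct = 0
    for pred, ref in zip(predictions, references):
        p, r = extract_answer(pred), extract_answer(ref)
        # A prediction that breaks the format counts as wrong.
        if p is not None and r is not None and float(p) == float(r):
            correct += 1
    return correct / len(references) if references else 0.0
```

Because a response that breaks the format is simply scored as wrong, a model trained on strictly formatted samples can be evaluated with a check this small.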

dcreater 3 months ago

Thanks, what LLMs were used to create these?

decodebytes 3 months ago

I think it was gpt4-mini, but local models do surprisingly well too.
