check out cs 336 stanford, they cover DPO/GRPO and relevant parts needed to trai... | Hacker News

nafizh 14 days ago | on: CS234: Reinforcement Learning Winter 2025

Check out Stanford's CS 336; it covers DPO/GRPO and the relevant parts needed to train LLMs.

storus 13 days ago

It's also covered by CS329H.
