Neat. I built a language-learning web app that used podcasts and YouTube transcriptions too. The problem with YouTube was that their API was very limiting, so I ended up having to use a proxy and an unofficial API to scrape the videos. The whole thing felt very sketchy, so I removed the YouTube functionality and just focused on podcasts (https://langturbo.com).
I hope you have better luck.