
How are these AI recaps generated? Are they fed a video file of the entire game and it spits out a summary, or maybe a score tally with timestamps for goals (written by a human) which the AI then pads with language and makes into a story?


Back when I worked on it, before AI, you had all the information from a game in an API and you just filled in a template with it.
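
For anyone curious, the template approach could look roughly like this. It is only a minimal sketch: the payload and its field names (`home_team`, `away_score`, `venue`, and so on) are made up for illustration and are not the schema of any real feed.

```python
from string import Template

# Hypothetical API payload; the field names are assumptions for illustration,
# not the schema of any real sports data feed.
game = {
    "home_team": "Home City",
    "away_team": "Away Town",
    "home_score": 3,
    "away_score": 1,
    "venue": "Example Arena",
}

RECAP_TEMPLATE = Template(
    "$winner beat $loser $winner_score-$loser_score at $venue."
)


def fill_recap(game: dict) -> str:
    """Pick winner and loser from the scores, then fill the template.

    Ties are not handled here; a real system would need a separate template.
    """
    home_won = game["home_score"] > game["away_score"]
    winner, loser = ("home", "away") if home_won else ("away", "home")
    return RECAP_TEMPLATE.substitute(
        winner=game[f"{winner}_team"],
        loser=game[f"{loser}_team"],
        winner_score=game[f"{winner}_score"],
        loser_score=game[f"{loser}_score"],
        venue=game["venue"],
    )


print(fill_recap(game))
```

A production version would presumably have many templates (blowouts, comebacks, overtime) and pick one based on the data.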

Now, I imagine they take that raw API call and just use a prompt like, "write a summary article for a game using this data" and it spits it out. And I assume the prompt is more thought out than that (or not? It is ESPN after all).
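
To be clear, this is a guess at the shape of it, not what ESPN actually runs. A sketch of turning a raw API response into a prompt might be as simple as the following; `build_recap_prompt` is a hypothetical helper, and the extra "only state facts" instruction is my own assumption about what a more thought-out prompt would include. The call to the model itself is left out.

```python
import json


def build_recap_prompt(game_data: dict) -> str:
    """Wrap raw game data in a plain instruction for a text-generation model."""
    instructions = (
        "Write a summary article for a game using this data. "
        "Only state facts that appear in the data."
    )
    # Serialize the payload deterministically so the same game always
    # produces the same prompt text.
    payload = json.dumps(game_data, indent=2, sort_keys=True)
    return f"{instructions}\n\nGAME DATA (JSON):\n{payload}"


# Hypothetical payload, for illustration only.
example = {"home": "Home City", "away": "Away Town", "score": [3, 1]}
print(build_recap_prompt(example))
```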

I don't ever remember "retiring_players" being part of an API response, though, ;P

edit: Oh and yes, the play-by-play recap is documented EXTREMELY well. You would be surprised. The more popular sports like Gridiron Football and Basketball would literally have player locations by the second. This data all comes from feeds like Sportradar.

They probably wouldn't pipe the fine-grained stuff like that into a prompt, but you still have a decent summary like how many 3-pointers someone had and where they shot them from.
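
As a rough illustration of that kind of rollup, here is a sketch that condenses shot events into a per-player 3-pointer summary. The event fields (`player`, `points`, `made`, `zone`) are assumptions for the example; real feeds will have their own schema.

```python
from collections import defaultdict

# Hypothetical shot events; players, zones, and field names are invented.
shots = [
    {"player": "Player A", "points": 3, "made": True, "zone": "left corner"},
    {"player": "Player A", "points": 3, "made": False, "zone": "top of key"},
    {"player": "Player A", "points": 3, "made": True, "zone": "left corner"},
    {"player": "Player B", "points": 2, "made": True, "zone": "paint"},
    {"player": "Player B", "points": 3, "made": True, "zone": "right wing"},
]


def summarize_threes(shots: list) -> dict:
    """Count 3-point attempts and makes per player, plus where makes came from."""
    summary = defaultdict(
        lambda: {"made": 0, "attempted": 0, "zones": defaultdict(int)}
    )
    for shot in shots:
        if shot["points"] != 3:
            continue  # ignore 2-point attempts
        stats = summary[shot["player"]]
        stats["attempted"] += 1
        if shot["made"]:
            stats["made"] += 1
            stats["zones"][shot["zone"]] += 1
    # Convert nested defaultdicts to plain dicts so the result serializes cleanly.
    return {
        player: {
            "made": s["made"],
            "attempted": s["attempted"],
            "zones": dict(s["zones"]),
        }
        for player, s in summary.items()
    }


print(summarize_threes(shots))
```

A few dozen numbers like these are a lot cheaper to put in a prompt than per-second location data.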


If I were going to build this prototype I'd start with just a semistructured textual play-by-play recap as the input. Also including roster, injury, and schedule information with a fairly basic prompt would probably go a long way.
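
Something like this is what I have in mind for the input, as a sketch only. The event fields, section names, and both helper functions are hypothetical choices of mine, not any provider's format.

```python
def render_play_by_play(events: list) -> str:
    """Render structured events as one semistructured text line per play."""
    return "\n".join(
        f"[P{e['period']} {e['clock']}] {e['team']}: {e['text']} ({e['score']})"
        for e in events
    )


def build_model_input(events: list, roster: list, injuries: list, schedule: list) -> str:
    """Combine roster, injury, schedule, and play-by-play text into labeled sections."""
    sections = [
        ("ROSTER", "\n".join(f"- {name}" for name in roster)),
        ("INJURIES", "\n".join(f"- {note}" for note in injuries) or "- none reported"),
        ("SCHEDULE", "\n".join(f"- {item}" for item in schedule)),
        ("PLAY BY PLAY", render_play_by_play(events)),
    ]
    return "\n\n".join(f"{title}:\n{body}" for title, body in sections)


# Invented sample data, for illustration only.
events = [
    {"period": 1, "clock": "11:42", "team": "HOME",
     "text": "Player A makes 3-pt shot", "score": "3-0"},
    {"period": 1, "clock": "11:15", "team": "AWAY",
     "text": "Player C misses layup", "score": "3-0"},
]
print(build_model_input(events, ["Player A", "Player B"], [], ["Next: at Away Town"]))
```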

This data exists for most live games at this point via various web services. I'm sure ESPN has significant resources internally to source that info.


I don't think ESPN does anything that takes significant resources. That's all handled by Sportradar or ... there's another big provider but their name eludes me. They basically firehose you all the game information as structured data and you can use it programmatically however you'd like.


I assume this is what lets baseball games show obscure factoids like "3rd in the NL West when facing left-handed pitchers on Tuesday"?


You have the Elias Sports Bureau to thank for all the fun baseball stats out there: https://en.wikipedia.org/wiki/Elias_Sports_Bureau


Definitely. I have no experience in live game statistics, but from my sports content experience I bet there are data scientists and applications behind the scenes that specifically pull this data to be read on-air.


I imagine the primary customers of the data feeds are gambling companies who let people bet on matches that are in progress.


Yeah it feels like the ideal way is to feed in a transcript of the announcer audio + some standard stats. That would ensure you catch both the human stories & the factual content.

But I wonder if there are licensing issues with using the audio/transcript to generate your summary. I know that the raw stats are public domain but I wouldn't be surprised if they can't use the transcripts or audio.


There are a couple of companies that provide real-time sports data via API (or recaps after) so I’d bet they use that.


They use the box score and play-by-play events.


The gap between expectations and reality of “genAI” is too vast at this point to ignore. If a multibillion dollar system breaks down because of “human error”, then maybe its capabilities are way overstated. If it needs carefully crafted queries (“prompt engineering”), 100% error-proof data, a megaton of power, and humans still need to re-check the output, what have we gained?

Can we all just admit this AI phase is just a bubble?



