Shower thought: Is anyone looking at using neural networks to fill in the pillarbox area of 4:3 content? It seems like you'd be able to turn old 4:3 video into 16:9 by learning what was outside the active region from adjacent frames and filling it in.
Yes, this is an established task known as outpainting. For single frames it's called image outpainting; what you're describing is video outpainting, since you'd want to use the surrounding frames for context.
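To make the single-frame version concrete, here's a minimal sketch of how pillarbox filling can be framed as masked inpainting on a widened canvas. It assumes a Hugging Face diffusers inpainting pipeline; the model id, prompt, and GPU setup are illustrative, and a real video outpainting method would additionally condition on adjacent frames for temporal consistency.

```python
# Sketch: treat 4:3 -> 16:9 pillarbox filling as masked inpainting on a
# widened canvas. The model id and prompt are illustrative assumptions;
# a true video outpainting model would also use neighbouring frames.
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

def outpaint_to_16_9(frame: Image.Image, pipe, prompt: str = "") -> Image.Image:
    w, h = frame.size                      # e.g. 960 x 720 for 4:3
    target_w = round(h * 16 / 9)           # widen to 16:9 at the same height
    pad = (target_w - w) // 2              # pillarbox width on each side

    # Paste the original frame into the centre of a wider canvas.
    canvas = Image.new("RGB", (target_w, h), (0, 0, 0))
    canvas.paste(frame, (pad, 0))

    # Mask: white where the model should generate (the pillarbox bars),
    # black where the original pixels must be preserved.
    mask = Image.new("L", (target_w, h), 255)
    mask.paste(Image.new("L", (w, h), 0), (pad, 0))

    # Diffusion pipelines expect dimensions divisible by 8; resizing to
    # satisfy that (and back) is glossed over here for brevity.
    result = pipe(prompt=prompt, image=canvas, mask_image=mask,
                  width=target_w, height=h)
    return result.images[0]

if __name__ == "__main__":
    # Assumes a CUDA GPU; the model id is just one publicly available
    # inpainting checkpoint, not the only option.
    pipe = StableDiffusionInpaintPipeline.from_pretrained(
        "stabilityai/stable-diffusion-2-inpainting",
        torch_dtype=torch.float16,
    ).to("cuda")
    frame = Image.open("frame_0001.png")   # one 4:3 frame pulled from the video
    wide = outpaint_to_16_9(frame, pipe, prompt="continuation of the scene")
    wide.save("frame_0001_wide.png")
```

Running this per frame independently would flicker badly in the generated bars, which is exactly why the video formulation (conditioning on adjacent frames, as you suggest) is the interesting version of the problem.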