This is the thing that made no sense to me about it as a premise. Doing correct program synthesis is really hard even when you have really opinionated and well-defined models of the domain (e.g. the Termite project for generating Linux device drivers). The domain model for Copilot is somewhere between non-existent and so open-ended (i.e. all the diverse code on GitHub, et al.) as to be functionally non-existent.

A bare-minimum baseline validation check for Copilot would be to see whether it ever gives you code that won't compile in context. If it does, then it's not even taking into account the well-specified domain model of your chosen programming language's semantics. And even passing that check is still miles away from taking into account the domain of the actual problem you're using software to solve.
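To make that baseline concrete, here's a rough sketch of what such a check could look like for Python. Everything in it is my own illustration, not anything Copilot actually does: `compiles_in_context` is a hypothetical helper, and the `prefix`/`suggestion` strings are made-up stand-ins for the code around the cursor and the tool's completion. It also only checks that the spliced result parses, which is weaker than a real compile in a typed language (no name resolution, no type checking).

```python
import ast


def compiles_in_context(prefix: str, suggestion: str, suffix: str = "") -> bool:
    """Return True if prefix + suggestion + suffix parses as valid Python.

    Hypothetical helper: this only catches syntax-level breakage, not
    undefined names, wrong types, or wrong behavior.
    """
    candidate = prefix + suggestion + suffix
    try:
        ast.parse(candidate)
    except (SyntaxError, ValueError):
        # IndentationError is a subclass of SyntaxError; ValueError covers
        # sources that older Python versions reject for containing null bytes.
        return False
    return True


# Made-up editor context and two made-up suggestions.
prefix = "def area(width, height):\n"
good_suggestion = "    return width * height\n"
bad_suggestion = "    return width * \n"  # dangling operator

print(compiles_in_context(prefix, good_suggestion))
print(compiles_in_context(prefix, bad_suggestion))
```

A suggestion that fails even this is ignoring the language's grammar, never mind its semantics or your problem domain.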

The only place where the approach taken, as-is, makes sense to me is for truly rote boilerplate code. However, that then raises the question: how is this machine learning approach more effective than the targeted heuristic approaches already taken by existing IDE tooling, etc.?

FWIW, I don't think any of this is lost on GitHub. I think Copilot is more likely a tremendously marketable half-step and a small piece of a larger, longer-term strategy unfolding at Microsoft/GitHub to leverage an incredible asset they're holding, i.e. practically everybody's source code. The combination of detailed changelogs, CI results (e.g. GitHub Actions), Copilot, and a couple of other key pieces makes for a pretty incredible basis for reinforcement learning to multiple ends.