Right now, the API just returns the raw text for simplicity's sake, but it would be possible to add an option for returning a bit of HTML structure, which would address the problem of sections, inline images, tables, etc.
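
To make the idea concrete, here is a rough sketch of what such an option might do. It is only an illustration, not how the API works today: the function name `simplify_html`, the tag whitelist, and the kept attributes are all assumptions made up for this example. It uses Python's standard `html.parser` to keep structural tags (headings, paragraphs, lists, tables, images) and drop everything else.

```python
import re
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: structural tags kept in the simplified output.
KEEP = {"h1", "h2", "h3", "h4", "h5", "h6", "p", "ul", "ol", "li",
        "table", "tr", "th", "td", "img"}
VOID = {"img"}                           # kept tags that have no closing tag
KEEP_ATTRS = {"img": ("src", "alt")}     # attributes preserved per tag
SKIP = {"script", "style"}               # tags whose content is dropped entirely


class Simplifier(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in KEEP:
            return  # unknown tag: drop the tag itself, keep its text
        allowed = KEEP_ATTRS.get(tag, ())
        parts = []
        for name, value in attrs:
            if name in allowed:
                parts.append(' %s="%s"' % (name, escape(value or "", quote=True)))
        self.out.append("<%s%s>" % (tag, "".join(parts)))

    def handle_endtag(self, tag):
        if tag in SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in KEEP or tag in VOID:
            return
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if self.skip_depth:
            return
        # Collapse runs of whitespace but keep word boundaries intact.
        self.out.append(escape(re.sub(r"\s+", " ", data), quote=False))


def simplify_html(html_text):
    """Return html_text reduced to the whitelisted structural tags."""
    parser = Simplifier()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


sample = ('<div class="x"><h2 id="t">Title</h2>'
          '<p style="c">Some <b>bold</b> text'
          '<img src="a.png" alt="A" width="3"></p>'
          '<script>var x=1;</script></div>')
print(simplify_html(sample))
```

A whitelist seemed like the safer choice here: anything not explicitly listed is reduced to its text, so the output stays predictable even for messy input. It does not try to repair unbalanced markup.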
It would be great if you could return a normalized and simplified version of the HTML structure. I know a lot of people who would be interested in this.
The combination of the two APIs is a great idea.