Batching in vLLM doesn't combine prompts into the same context: each request keeps its own KV cache and attention state, and requests are processed in parallel while sharing compute resources. There's no perplexity tradeoff, just a throughput gain.
It's worth noting that the reason this works is that basically every LLM architecture currently in use is limited by memory bandwidth during decoding, not by compute. Each decode step is dominated by streaming the weights from VRAM, so the GPU can apply those weights to several requests at once essentially for free while it waits for the next block of weights to arrive.
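A quick back-of-envelope roofline model makes this concrete. The numbers below are assumptions (A100-like: ~312 TFLOP/s fp16, ~2 TB/s HBM bandwidth), the layer is a single fp16 matmul, and the sketch ignores activations and KV-cache traffic; it only illustrates why per-step time stays flat as batch size grows while the kernel is bandwidth-bound.

```python
# Roofline sketch: time for one decode step through a (d, d) fp16 weight matrix.
# Hardware numbers are ASSUMED, A100-like; not measured.
PEAK_FLOPS = 312e12   # fp16 tensor-core peak, FLOP/s
PEAK_BW = 2.0e12      # HBM bandwidth, bytes/s

def step_time(batch: int, d_model: int = 8192) -> float:
    """Estimated time for a (batch, d) x (d, d) fp16 matmul: the kernel
    can't finish faster than either its compute or its memory traffic allows."""
    flops = 2 * batch * d_model * d_model       # one multiply-add per weight per request
    bytes_moved = 2 * d_model * d_model         # weights streamed from VRAM once, 2 bytes each
    return max(flops / PEAK_FLOPS, bytes_moved / PEAK_BW)

# While memory-bound, 64 requests take the same wall time as 1:
print(step_time(64) / step_time(1))
```

With these numbers the per-request arithmetic intensity works out to roughly `batch` FLOPs per byte, so the kernel stays bandwidth-bound until the batch reaches the hardware's compute/bandwidth ratio (~156 here); only beyond that does adding requests start to cost wall time. Real serving is messier (KV-cache reads do grow with batch size), but this is the core effect.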