The data you're sending out is just the position and motion vectors of the pupils. You probably only need about 16 bits for each of these numbers, for each of the two eyes. So that's the equivalent of two 16-bit (half-precision) floating point numbers along any particular channel, or 32 bits at minimum. Any lag can be compensated for by simply extrapolating along the motion vectors.
It actually makes a lot of sense!
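To put rough numbers on it, here's a minimal sketch in Python. The packet layout, the function names, and the units (normalized coordinates, velocity in units per second) are all my own assumptions for illustration, not anything from a real eye tracker. It packs x, y and their velocities for both eyes as 16-bit floats, then predicts ahead by the lag on the receiving end.

```python
import struct

# Hypothetical wire format: 8 half-precision (16-bit) floats, little-endian.
# Order: left eye (x, y, vx, vy), then right eye (x, y, vx, vy).
PACKET = struct.Struct("<8e")

def pack_eyes(left, right):
    """Pack two (x, y, vx, vy) tuples into one 16-byte payload."""
    return PACKET.pack(*left, *right)

def unpack_eyes(payload):
    """Recover the two (x, y, vx, vy) tuples from a payload."""
    values = PACKET.unpack(payload)
    return values[:4], values[4:]

def predict(eye, lag_seconds):
    """Linearly project the pupil position forward by the known lag."""
    x, y, vx, vy = eye
    return (x + vx * lag_seconds, y + vy * lag_seconds)

payload = pack_eyes((0.5, 0.25, 1.0, -1.0), (-0.5, 0.25, 1.0, -1.0))
left, right = unpack_eyes(payload)
print(len(payload))         # 16 bytes per update for both eyes
print(predict(left, 0.05))  # left pupil position estimated 50 ms ahead
```

With this layout a full update for both eyes is 16 bytes, and each position/velocity pair along one axis is the 32 bits mentioned above. The prediction is just a straight-line guess, so it would drift during fast eye movements until the next packet arrives.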