I love the RAM layout, a great benefit of being able to pick your pin layout on an FPGA. If you can do that it is a no-brainer, but in general you don't really have to worry about routing your 100MHz RAM so carefully. That is pretty slow as things go, and with a 3 meter wavelength the transmission line effects over a few inches with a couple of vias don't do all that much. There are a lot of dev boards out there with that speed class of RAM routed relatively far and ugly.
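To put rough numbers on the wavelength argument, here is a minimal sketch. The effective dielectric constant of 3.3 (a ballpark for outer-layer traces on FR4) and the 3 inch trace length are my assumptions, not figures from the comment above.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def wavelength_m(freq_hz, eps_eff=1.0):
    """Wavelength in a medium with effective dielectric constant eps_eff."""
    return C / (freq_hz * eps_eff ** 0.5)


def fraction_of_wavelength(length_m, freq_hz, eps_eff=1.0):
    """How much of one wavelength a trace of the given length spans."""
    return length_m / wavelength_m(freq_hz, eps_eff)


if __name__ == "__main__":
    f = 100e6
    trace = 3 * 0.0254  # assumed 3 inch trace, in meters
    print(f"free space: {wavelength_m(f):.2f} m")
    print(f"on board (eps_eff=3.3, assumed): {wavelength_m(f, 3.3):.2f} m")
    print(f"trace spans {fraction_of_wavelength(trace, f, 3.3):.3f} wavelengths")
```

Note this only looks at the 100MHz fundamental; the harmonics in the clock edges have shorter wavelengths.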
I've laid out a stripline transmitter for 100 MHz, and the transmission line effects are substantial enough that you can easily turn circuit traces into coils and resonant circuits. The big difference is that that was in the analog domain and this is digital, but during the rise and fall of the clock those effects are very much present.
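The point about rise and fall times can be sketched with two common rules of thumb: signal bandwidth is roughly 0.35 divided by the rise time, and a trace starts to behave like a transmission line once its one-way delay exceeds about a sixth of the rise time. The 1 ns rise time and the effective dielectric constant of 3.3 are assumptions for illustration only; real parts and stackups vary.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def bandwidth_hz(rise_time_s):
    """Rule-of-thumb signal bandwidth for a given 10-90% rise time."""
    return 0.35 / rise_time_s


def critical_length_m(rise_time_s, eps_eff=3.3, fraction=1 / 6):
    """Trace length above which transmission line effects start to matter.

    Uses the rule of thumb that the one-way delay should stay below
    `fraction` of the rise time. eps_eff=3.3 is an assumed FR4 value.
    """
    velocity = C / eps_eff ** 0.5
    return rise_time_s * velocity * fraction


if __name__ == "__main__":
    t_r = 1e-9  # assumed 1 ns edge
    print(f"bandwidth: {bandwidth_hz(t_r) / 1e6:.0f} MHz")
    print(f"critical length: {critical_length_m(t_r) * 1e3:.1f} mm")
```

So a 100MHz clock with fast edges carries energy well above 100MHz, which is why the edge rate matters more than the clock frequency itself.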
Interestingly, if you are doing a standard-height four-layer stackup with something like signal-power-ground-signal, your outer-layer traces sit much closer to the plane, and default trace widths give a characteristic impedance much closer to what you want. On two-layer, even if most of one side is a ground plane, the distance from trace to plane leaves the impedance much further out of spec. Luckily I haven't had to do any impedance-controlled traces on a two-layer board yet, where a standard 1.6mm PCB means something crazy like 3mm trace widths to hit 50 ohms characteristic impedance at ~100-150MHz.
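The 3mm figure can be sanity-checked with the widely used Hammerstad closed-form approximation for microstrip impedance. This is a rough sketch, not a substitute for a field solver or your fab's stackup calculator: it ignores copper thickness and solder mask, and the FR4 dielectric constant of 4.4 and the 0.2mm outer prepreg thickness for the four-layer case are my assumptions.

```python
import math


def microstrip_z0(width_mm, height_mm, eps_r=4.4):
    """Approximate microstrip impedance (Hammerstad), zero trace thickness.

    width_mm:  trace width
    height_mm: dielectric thickness between trace and reference plane
    eps_r:     relative permittivity of the substrate (4.4 assumed for FR4)
    """
    u = width_mm / height_mm
    if u <= 1:
        eps_eff = (eps_r + 1) / 2 + (eps_r - 1) / 2 * (
            (1 + 12 / u) ** -0.5 + 0.04 * (1 - u) ** 2
        )
        return 60 / math.sqrt(eps_eff) * math.log(8 / u + u / 4)
    eps_eff = (eps_r + 1) / 2 + (eps_r - 1) / 2 * (1 + 12 / u) ** -0.5
    return 120 * math.pi / (
        math.sqrt(eps_eff) * (u + 1.393 + 0.667 * math.log(u + 1.444))
    )


if __name__ == "__main__":
    # Two-layer 1.6mm board, wide trace vs. a typical default width
    print(f"3.00mm over 1.6mm: {microstrip_z0(3.0, 1.6):.0f} ohm")
    print(f"0.25mm over 1.6mm: {microstrip_z0(0.25, 1.6):.0f} ohm")
    # Four-layer board with an assumed 0.2mm outer prepreg
    print(f"0.35mm over 0.2mm: {microstrip_z0(0.35, 0.2):.0f} ohm")
```

With these assumptions the 3mm trace on a two-layer board and the 0.35mm trace on a four-layer board both land near 50 ohms, while a default-width trace on the two-layer board comes out well above 100 ohms.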
I think about 150mm is the furthest I've routed non-terminated 100MHz uSD lines, and I've been lucky enough so far to get away with the only real design goal being all trace lengths within +/- 5% of the clock line, even with the traces having a couple hundred ohms of Z when the spec calls for 50. I do wonder whether my error rate went from something like once per 20 years to once per 5 years. I'm also guessing the error rate follows some sort of inverse S-curve of probability.
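A length-matching rule like that is easy to check in a script. The sketch below flags any trace outside +/- 5% of the clock length and estimates the skew a given length difference causes. The trace lengths in the example are made up, and the effective dielectric constant of 3.3 is an assumption.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def length_mismatches(clock_mm, traces_mm, tolerance=0.05):
    """Return the traces whose length is outside clock_mm +/- tolerance."""
    lo = clock_mm * (1 - tolerance)
    hi = clock_mm * (1 + tolerance)
    return {
        name: length
        for name, length in traces_mm.items()
        if not lo <= length <= hi
    }


def skew_ps(delta_mm, eps_eff=3.3):
    """Propagation delay difference for a length difference, in picoseconds."""
    velocity = C / eps_eff ** 0.5  # m/s, eps_eff assumed
    return delta_mm * 1e-3 / velocity * 1e12


if __name__ == "__main__":
    # Hypothetical lengths for a 150mm clock line
    traces = {"CMD": 152.0, "DAT0": 148.5, "DAT1": 160.0}
    print(length_mismatches(150.0, traces))
    print(f"5% of 150mm is about {skew_ps(7.5):.0f} ps of skew")
```

For scale, a 100MHz clock has a 10 ns period, so a few tens of picoseconds of skew from a 5% mismatch is a small slice of the timing budget.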
> Xilinx FPGAs offer a wide range of configuration methods with excellent documentation.
You can always tell an experienced engineer if he says something like "Doing <X> was not hard, it's described here.", and points to a 300+ page manual.
I remember working on a similar project in college. We had to create our own XGA driver in Verilog, display a chessboard and a few chess pieces. Code running on the ARM in the SoC would handle some simple commands to send new positions of the sprites to the FPGA. We couldn't use a frame buffer, so each pixel was output based on combinational logic. That meant your logic had to be really fast to make sure that you could get that pixel figured out within 15ns before the driver hopped to the next one.
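The 15ns budget falls out of the standard XGA timing: 1024x768 at 60Hz uses 1344 x 806 total clocks per frame including blanking, which gives a pixel clock of about 65MHz. Below is a small Python model of that arithmetic, plus a sketch of the kind of per-pixel function the combinational logic would implement. The board geometry (96-pixel squares, board centered horizontally, which square is light) is hypothetical, not taken from the project described above.

```python
H_TOTAL = 1344    # XGA horizontal total, including blanking
V_TOTAL = 806     # XGA vertical total, including blanking
REFRESH_HZ = 60   # nominal refresh rate


def pixel_period_ns(h_total=H_TOTAL, v_total=V_TOTAL, refresh_hz=REFRESH_HZ):
    """Time available per pixel, in nanoseconds."""
    return 1e9 / (h_total * v_total * refresh_hz)


def board_pixel_is_light(x, y, square=96, x_offset=128):
    """Chessboard color for screen pixel (x, y), computed without a frame buffer.

    Returns None outside the board. The 8x8 board of 96-pixel squares,
    offset 128 pixels from the left edge, is a hypothetical layout.
    """
    bx = x - x_offset
    if not (0 <= bx < 8 * square and 0 <= y < 8 * square):
        return None
    return ((bx // square) + (y // square)) % 2 == 0


if __name__ == "__main__":
    print(f"pixel period: {pixel_period_ns():.2f} ns")
    print(board_pixel_is_light(128, 0), board_pixel_is_light(224, 0))
```

In hardware the divisions would not be computed per pixel; you would typically keep row and column counters that tick along with the scan position.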
I haven't done anything in this area (except an FPGA prac at uni), but I think these days you can get a board with an FPGA and an ARM core on it from a number of suppliers, for example the SmartFusion 2 SoC. That would make this kind of work more accessible.
The miniZed board⁽¹⁾ uses a Xilinx chip, but it doesn't seem to come with a license for software except for SDSoC⁽²⁾, which doesn't seem to provide any software for HDL, just C++ and OpenCL.
I've made a simple PCB layout before, and worked with FPGAs (generated some simple HDMI output, and communicated via SPI). Is a similar project feasible for me, or likely to be a step too far?