I wonder if this architectural change makes it easier to use other vision models such as the ones in Llama 3 and 4, or possibly a future Llama 5.
I wonder if this architectural change makes it easier to use other vision models such as the ones in Llama 3 and 4, or possibly a future Llama 5.