If I want to use an LLM for translation, should I use a base model or an instruction-tuned version? I've had mixed results using chat models with a simple "Translate this to <language>: " prompt.
For a 9B model like EuroLLM, fine-tuning the base model is pretty viable. You don't need many samples; on the order of 300 high-quality examples can produce good results, and the GPU time is manageable on rented GPU instances.
Just the base model with a completion template like "English: {text}\n{language}:" can also work, with a bit of filter-and-retry logic: base models tend to keep completing past the answer, so you truncate the output and retry when it comes back empty or degenerate.
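A minimal sketch of that filter-and-retry loop. `generate` is a hypothetical callable standing in for whatever model/API you use to get a raw completion; the function name, retry count, and filters are illustrative, not from this thread:

```python
# Base-model translation via a completion template plus filter/retry logic.
# `generate(prompt) -> str` is assumed to return the raw continuation
# from a base model (sampled with temperature > 0 so retries differ).

def translate(text, language, generate, max_retries=3):
    prompt = f"English: {text}\n{language}:"
    for _ in range(max_retries):
        completion = generate(prompt)
        # Base models keep completing past the answer (often starting a
        # new "English: ..." line), so keep only the first line.
        candidate = completion.split("\n")[0].strip()
        # Filter degenerate outputs: empty strings, or the model echoing
        # the source text back unchanged.
        if candidate and candidate.lower() != text.lower():
            return candidate
    return None  # every retry produced unusable output
```

Swap `generate` for a call into your inference stack; the same wrapper works whether you run the base model locally or behind an API.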