
As I explained in a sister comment, it is not possible to rate translation quality objectively, as opinions and positions about what constitutes a good translation vary. But in my tests of reasoning models since the release of o1-preview, they have not seemed as reliable as the straight nonreasoning versions of ChatGPT, Claude, or Gemini. The translation process itself usually doesn’t seem to require the kind of multistep thinking those reasoning models can be good at.

For more than a year, regular LLMs, when properly prompted, have been able to produce translations that would be indistinguishable from those of some professional translators for some types of translation.

General-purpose LLMs are best for translating straight expository prose without much technical or organization-specific vocabulary. Results are mixed for texts containing slang, dialogue, poetry, archaic language, etc.—partly because people’s tastes differ for how such texts should be translated.

Because most translators are freelancers, it’s hard to get a handle on what impact LLMs have been having on their workloads overall. Some experienced translators tell me their work has dropped off precipitously and they have had to change careers, while others report an increase in their workloads over the past two years.

Many translation jobs involve confidential material, and some translators may be hanging onto their jobs because their clients or employers do not allow the use of cloud-based LLMs. That safety net won’t be in place forever, though.

I suspect that those who work directly with translation clients and who are personally known and trusted by their clients will be able to keep working, using LLMs as appropriate to speed up and improve the quality of their work. That’s the position I am fortunate to be in now.

But translators who do piecework through translation agencies or online referrers like Fiverr will have a hard time competing with the much faster and cheaper—and often equally good—LLMs.

I made a few videos about LLMs and translation a couple of years ago. Parts of them are out of date, but my basic thinking hasn’t changed too much since then. If you’re interested:

“Translating with ChatGPT”

https://youtu.be/najKN2bXqCo

“Can GPT-4 translate literature?”

https://youtu.be/5KKDCp3OaMo

“What do translators think about GPT?”

https://www.youtube.com/watch?v=8JUepj7wIl0

I’m planning to make a few more videos on the topic soon, this time focusing on how I use LLMs in my own translation work.


Not that you'd want to have to do more steps, but how do they do if you split the text into separate parts and translate them individually, then feed everything back in with source and translated parts interleaved, and ask the model to keep the style but fix any errors caused by it originally lacking the full context?

Or another approach: feed the whole text into context but tell the model to wait and not translate yet, then feed it in an additional time part by part, asking it to translate each part, with the translation style instructions repeated each time.
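The first workflow suggested here could be sketched roughly as follows. This is a minimal illustration, not anyone's actual setup: `call_llm` is a placeholder stub (a real chat-completion client would go there), and the chunk size and prompt wording are arbitrary assumptions.

```python
def call_llm(prompt: str) -> str:
    # Placeholder: echoes the last line of the prompt so the sketch runs
    # without an API key. Replace with a real chat-completion call.
    return prompt.strip().splitlines()[-1]

def chunk_text(text: str, max_chars: int = 2000) -> list[str]:
    """Split on paragraph boundaries, keeping each chunk under max_chars."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks

def translate_two_pass(text: str,
                       style: str = "Keep the register of the original.") -> str:
    chunks = chunk_text(text)
    # Pass 1: translate each chunk independently, without surrounding context.
    drafts = [call_llm(f"Translate into English. {style}\n\n{c}")
              for c in chunks]
    # Pass 2: interleave source chunks with their draft translations and ask
    # the model to fix errors caused by the missing context.
    interleaved = "\n\n".join(
        f"SOURCE:\n{c}\n\nDRAFT:\n{d}" for c, d in zip(chunks, drafts)
    )
    return call_llm(
        "Below are source chunks and draft translations made without full "
        "context. Keep the style, but fix any errors caused by the missing "
        f"context, and output the corrected full translation.\n\n{interleaved}"
    )
```

Whether the second pass actually repairs context-dependent errors (pronoun reference, terminology consistency, register drift across chunks) is exactly the open question being asked here.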

tkgally OP
That might work to prevent the glitches I noticed toward the end of the long texts.

In general, though, I haven’t seen any sign yet that reasoning models are better at translation than nonreasoning ones even with shorter texts. LLMs in general have reached the level where it is difficult to assess the quality of their translations with A/B comparisons or on a single linear scale, as most of the variation is within the range where reasonable people disagree about translation quality.
