Game Translation - English, Japanese, Chinese, Korean, Dutch and other languages
Don't localize. Loekalize.

Other language pairs available upon request
SEGA: "You are faster than Sonic! It's easy to read, and you clearly have experience with these types of texts."

Recent projects

  • Shadow Gambit: The Cursed Crew (Japanese, Simplified Chinese, Traditional Chinese)
  • EA SPORTS™ F1® 23 (Dutch)
  • Multiple AAA titles for Electronic Arts (Dutch)
  • Someday You'll Return: Director's Cut (Japanese)
  • Stray (Dutch)
  • Syberia: The World Before (Dutch)
  • Arma Reforger (Japanese)
  • Cyber Shadow (Simplified/Traditional Chinese)
  • Pathway (Japanese and Chinese)
  • DayZ (Japanese)
  • Draugen (Japanese and Chinese)
  • Swag and Sorcery (Japanese, Chinese and Korean)
  • Return of the Obra Dinn (Simplified/Traditional Chinese and Korean)
  • Graveyard Keeper (Japanese, Chinese and Korean)
  • Moonlighter (Japanese and Chinese)
  • Beat Cop (Japanese and Chinese)
  • Dota 2 (Japanese)
  • Motorsport Manager (Dutch)
  • Gremlins Inc. (Japanese and Chinese)
  • Punch Club (Japanese)
  • Arma 3 (Japanese)
  • Mario & Sonic at the Olympic Games (Dutch)
Why machine translation sucks

Man versus machine

Machine translation (MT) is the latest buzzword these days. Who needs dusty old translators who can only translate a few thousand words per day, when you've got machines that never complain and can do millions of words an hour? A translator's instinct says that machine translation will never work, but instinct is never a convincing argument.

Translators tend to be great at language, and somewhat less great at technology. They're therefore an easy target for the programmers and salesmen behind MT technology, who slap old-fashioned linguists in the face with numbers, statistics, formulas and more to show the world that machine translation is here to stay, and will eventually catch up with us.

According to this formula, machine translation works!
Now, this translator happens to be a hobbyist programmer himself, and also somewhat of a gadget freak. I love technology and welcome it with open arms, so I am anything but a Luddite. Anything I can automate, I will automate (provided quality is never compromised), and I estimate that largely because of this, I've been able to more or less double my income over the past ten years. There's one condition though: I always let robots and computers work for me, and never the other way around.

When Slate Desktop, a new desktop application based on the famous Moses machine translation engine, was released, I grabbed my chance and bought the 549-dollar software right away. Slate Desktop can be integrated with my favourite translation tool (currently memoQ), so that I could easily compare the results with my current workflow to see if I could benefit. The idea was to switch to Slate if I benefited (knowing the current state of MT technology, I never expected perfect results in the first place), and to drop the whole plan if Slate only slowed me down.

After two weeks of careful evaluation, I eventually decided to drop the whole idea. Below I'll explain why, but before I do that, I want to stress that the company behind Slate is very passionate, very service-minded and very honest about the possibilities and impossibilities of their product. In the end they also gave me a full refund. I have seldom worked with a company that was so honest and that spent so much time on a single client. These are great people and what they state is absolutely true. In theory. However, as usual, reality is stubborn.


Slate's Customer Service to me: Good luck with your "tinkering" while the rest of us serve customers.

Addendum: In November 2017, when it turned out that this article was literally costing Slate sales (even though Slate themselves had decided to quote parts of this article on their very own website), Slate's attitude made a 180-degree turn.

"In the months following your blog last year, five would-be customers reported they did not purchase Slate because of your blog. As a result, I removed your "testimonial" from my website over 6 months ago. Since posting my two comments today [about me testing other software like Moses and OpenNMT], 3 of those lost-customers you cause (sic) me have contacted me to apologize. One included a note that your reply comments are inauthentic. All three purchased Slate within minutes. ... I have no desire to have you as a customer. Good luck with your "tinkering" while the rest of us serve customers."

Friendly people, eh? Great customer service too. I'd like to stress the following, so that you can draw your own conclusions: I sent Slate the entire Excel sheet showing all data, and they told me themselves everything was correct. They are the experts. If my data was inauthentic, it is because they approved it. Also, it was them who decided to put my testimonial on their website. I never asked for that. That too was their business decision, not mine. Why does Slate blame me for their own decisions?

If you're inclined to believe this story anyway, ask yourself one thing: how much money do I make by exposing the truth about machine translation, and how much money does Slate lose by me exposing the truth about machine translation? As always, follow the money.

The best way, however, to prove that I'm right and that Slate is wrong is to use Slate's own words against them. From an e-mail sent by Tom Hoar, CEO of Slate Rocks, LLC, on May 5th 2016, 16:04 CET:

Reviewing your comments here was very promising. While disappointed, I'm not totally surprised at your results. (...) Several years ago at a GALA event, I had some great conversations with an American gaming localization project manager working for a gaming company in China. I bumped into him again at a LocWorld event. From his descriptions of the requirements, I think it's almost impossible to develop a corpus that will guide SMT to work well. Your comments here confirm that a non-gaming (sic) related corpus doesn't work.


Before we go on, it's important to know my current workflow. As said, I use memoQ, which is basically a tool that remembers everything I have ever translated in my entire life. As soon as it encounters a sentence that resembles a sentence I have translated before, it generates a so-called fuzzy match. If the differing or missing word is found in my terminology database (basically a dictionary with one-on-one translations of thousands of words), it can even try to patch up the fuzzy match to come up with something even better.

Dumb assembly and continuous improvement

memoQ also has a feature called Assemble, which basically translates texts word by word using my terminology database and nothing else. The result is a one-on-one, robotic, very Google Translatish but also very predictable "translation" that at least saves me a lot of typing work. By not only adding words, but also phrases, the results of Assemble are gradually improving. This is how it works:

Source text   | Assembled result  | Operation
you have felt | you have felt     | add you = je
you have felt | je have felt      | add have = hebben
you have felt | je hebben felt    | add felt = gevoeld
you have felt | je hebben gevoeld | add you have = je hebt
you have felt | je hebt gevoeld   | (correct translation)


By constantly looking at the results of Assemble and improving them by adding new words and phrases (akin to the Japanese concept of kaizen), I am slowly building my own translation engine, albeit a very robotic and stupid one. It does exactly as told and nothing else; it definitely cannot "think", nor will it try to. It does get better and better though, as I've been improving my terminology database for years.
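The mechanism in the table above can be sketched in a few lines of code. This is a minimal illustration of phrase-over-word assembly, not memoQ's actual implementation: the longest phrase starting at the current position wins, and unknown words are passed through in the source language (mirroring how memoQ inserts the English word when no term is found).

```python
# Illustrative sketch of terminology-based "assembly": longest phrase
# match wins, unknown words pass through untranslated.

def assemble(source: str, termbase: dict[str, str]) -> str:
    words = source.split()
    out = []
    i = 0
    while i < len(words):
        # Try the longest phrase starting at position i first,
        # so "you have" beats "you" + "have".
        for length in range(len(words) - i, 0, -1):
            phrase = " ".join(words[i:i + length])
            if phrase in termbase:
                out.append(termbase[phrase])
                i += length
                break
        else:
            out.append(words[i])  # no match: keep the source word
            i += 1
    return " ".join(out)

termbase = {"you": "je", "have": "hebben", "felt": "gevoeld",
            "you have": "je hebt"}
print(assemble("you have felt", termbase))  # → je hebt gevoeld
```

Before the phrase "you have" is added to the termbase, the same call yields the robotic "je hebben gevoeld", exactly as in the table: each new entry nudges the output closer to a correct translation.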

In my current workflow, memoQ will first try to find a fuzzy match (patched up with terms from the terminology database if needed). If the sentence I'm translating shows no resemblance to any sentence I did before, the dumb assembled match kicks in. I will call this workflow assembled translation (Fuzzy + Assemble).

Smart machine translation

At my office. Meet Hayabusa, Japanese for peregrine falcon.
The new workflow I wanted to try out was similar to assembled translation; however, instead of assembling fuzzy matches, I would have the source text machine-translated by Slate (Fuzzy + Machine, from now on "machine translation"). To do this, Slate first needed to know how I translate. I therefore fed it my specialized game translation memory of about 2.5 million words. Basically, Slate tries to detect patterns in word order to map my brain, so that the same patterns (= my brain) can be applied to new texts. Building an engine this way took about 4-5 hours on one of the fastest laptops currently available: Hayabusa, my 7,000-euro game laptop with an i7-6700K processor, two GTX 980 graphics cards and 64 GB of on-board memory. If that sounds like Chinese to you: as of 2016, this is a bit of a monster and a very high-end laptop. I should add that because the texts fed to the engine were game-related, the engine was used to generate game-related translations only. In fact, it was even narrower than that, as it was used for the very same game franchises too. It doesn't get any better than that.

Now, I already knew how it felt to turn assembled translation into an acceptable end product, that is, a translation that reads like an original (we'll call this human translation). It's what I've been doing for many years. What I wanted to know is how it felt to go from machine translation to the same end product.

I won't beat around the bush and will get right to the point: it felt a lot less pleasant. I had to make many corrections, I had to go back and forth, I had to check and double-check, and generally speaking it was a painful process, to the extent that I wondered whether it was actually saving time or costing time.

The Excel sheet with the Levenshtein analysis.
To find out the cause of this, I compared assembled translation, machine translation and human translation. The idea was that the more I had to change to go from assembled translation or machine translation to human translation, the worse the assembled translation or machine translation was. The number of characters changed during an edit is called the "edit distance", and the de facto way to express the difference between two texts (before and after) is a so-called Levenshtein analysis. I found a VBA macro on the net for Levenshtein analyses, put it in Excel and had it calculate the edit distance of one typical 4,000-word game text, both from assembled translation to human translation and from machine translation to human translation. Surprisingly, machine translation to human translation turned out to be 28% more efficient than assembled translation to human translation, expressed in Levenshtein distance. However, this didn't match my personal experience at all. How come a 28% statistical advantage completely evaporated in practice?
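For readers who prefer code over Excel macros, a plain Python version of the same metric looks like this. It is the textbook dynamic-programming algorithm, equivalent in spirit to the VBA macro mentioned above: the minimum number of single-character insertions, deletions and substitutions needed to turn one text into another.

```python
# Levenshtein edit distance via one-row dynamic programming:
# prev[j] holds the distance between a[:i-1] and b[:j].

def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

print(levenshtein("kitten", "sitting"))  # → 3
```

The smaller the distance between a draft (assembled or machine-translated) and the final human translation, the less editing that draft required; note, however, that the metric only counts keystrokes, not the reading and checking time discussed below.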

There was only one way to find out: I had to compare the kind of differences between assembled translation-human translation and machine translation-human translation.

On the one hand, the differences between assembled translation and human translation were predictable. All words in the assembled translation were there (especially because memoQ automatically inserts the original source word, in this case English, if no match is found in the attached terminology database), so there was never anything missing. Generally the sentences sounded very English (even in Dutch) and robotic, and often the translations were completely wrong: as assembled translations are one-on-one, things go terribly wrong when words can have multiple translations. Assemble cannot think, so it will just pick whatever you defined in your terminology database, no matter the context. This explains why in one instance, enemy compound was translated as vijandelijk mengsel (enemy mixture), as compound (in the sense of mixture) can be translated as mengsel in Dutch. However, the errors are so obvious, predictable and easy to fathom that correcting them is a matter of seconds. On top of that, the context issue can be partially solved by stacking specialized terminology databases on top of general ones: for example, a war game database on top of a general game database, which tells Assemble that in war games, compound tends to refer to buildings rather than materials.
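That stacking idea maps neatly onto a chained lookup: the specialized database is consulted first, and only if it has no entry does the general one answer. A hypothetical sketch (the Dutch translations and database names here are illustrative, not actual memoQ data):

```python
# Hypothetical sketch of "stacking" termbases: the specialized
# war-game database shadows the general one, so a context-sensitive
# word like "compound" gets the right domain translation.
from collections import ChainMap

general_terms = {"compound": "mengsel",   # mixture sense
                 "enemy": "vijandelijk"}
war_game_terms = {"compound": "basis"}    # buildings sense

# ChainMap checks the first mapping before falling back to the next.
stacked = ChainMap(war_game_terms, general_terms)

print(stacked["compound"])  # → basis
print(stacked["enemy"])     # → vijandelijk
```

The lookup order is the whole trick: the translator controls context by deciding which database sits on top for a given project, which is exactly the kind of precise, case-by-case steering that a statistical engine does not offer.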

Machine translation, on the other hand, generally read less robotically and was a lot closer to human translation than assembled translation, but there was one huge problem: its total unpredictability. As the whole logic behind the engine is fuzzy (pattern-based instead of rule-based), the errors in machine translation were fuzzy too. This led to all kinds of issues: missing words, inserted words, wrong translations, added phrases, missing phrases, wrong interpretations, and so on. Please have a look at the table below to see the kind of errors I was dealing with.
Source | Assembled match | Machine translation
64 players | 64 players | 64 agents and criminals
new way of storytelling | new style of storytelling | a story about new style
Whether you're the cops or the criminals, you must | Whether you're the cops or the criminals, you must | If you the police of the criminals of the target
all-out-war on an epic scale | all-from-war on an epic scale | an epic bowl of the total war
a deeper infantry experience from the jungles to the beaches of Vietnam | a deeper infantry experience out the jungles to the beaches of Vietnam | and a deeper and enjoy the jungles of Vietnam
captaining a PT boat through a dangerous jungle river passage | captain a PT boat through a dangerous jungle river trip | an exciting captaining dangerous: a ship passage the jungle
two | two | three
players also control | players also the control | the game has no control
the intensity and excitement | the intensity and excitement | the excitement
Ready and eager to deliver Battlefield to an ever-growing audience | Ready and would love delivering Battlefield to an once-growing audience | And to the battlefield to ever-growing people
The multiplayer is filled with new innovations | The multiplayer is filled with new innovations | is filled with the new innovations
gadgets like the grappling hook | gadgets like the grappling hook | players communicate with gadgets
new modes | new modes | new multiplayer modes


As you can see, the problems are numerous and very diverse. Very alarming, however, is the fact that MT sometimes states exactly the opposite of the source (players also control <> the game has no control), sometimes states something very different (two <> three), sometimes invents completely new words (modes <> multiplayer modes) and generally obfuscates the true meaning of the source text. (What does "and to the battlefield to ever-growing people" mean?) Yes, assembled translation often sounds dumb and stupid, but it almost never obfuscates or changes the meaning of the source text. Machine translation tries to be smart, but obviously is not smart enough to be smart, and is therefore actually dangerous.

Machine translation is not smart enough to be smart, and therefore dangerous

As the errors made by machine translation are so dangerous and unpredictable, in the end every single word output by machine translation needs to be checked manually against the source text. Assembled translation, on the other hand, is so literal that a comparison with the source text is only needed when the assembled translation is incomprehensible, which is almost never. This explains why the so-called 28% advantage of machine translation is wiped out right away: just comparing the translation to the source text means that I need to read about twice as much text, which is a 50% disadvantage when it comes to checking alone. This is true even if the machine translation is perfect, as I can only conclude it's perfect after checking it. And I haven't even taken into account the careful double- and triple-checking that is needed, as it's so easy to follow in MT's wrong footsteps.

Other things I found were:

1. The somewhat optimistic numbers given by MT vendors are mainly due to internal fuzzies in the memory that is fed to generate the MT algorithm. As soon as a new text is fed with no (fuzzy) repetitions, the numbers look very different (the above-mentioned 28% speed improvement in a best-case scenario, based on Levenshtein analysis, confirmed by the vendor).

2. It's important to realize that the above is based on translations from English to Dutch. Dutch is the closest language on Earth to English (Frisian excepted). For Japanese, the results were absolutely disastrous (if it's any comfort: even assembled translation is absolutely meaningless when it comes to this language).

3. When using machine translation, you need to rebuild your entire engine whenever new translations are made if you want to benefit from the newly generated knowledge. As mentioned before, this takes about four to five hours on the fastest systems available. Every single time. Slate is working on an update that will enable engine add-ons, but the result will always be less precise than a full rebuild, and it will still be a process that needs to be initiated manually, every single time.

On the other hand, if you use assembled matches, new terminology can be added on the fly as you translate. The result is quick, immediate and continuous.

Conclusions

No matter what technology you opt for, if any, based on the above I think we can agree about the following:

1. Assembled translation > machine translation. Apart from that, it can also be steered much more precisely, as the translator determines how words and phrases should be translated, case by case.

2. If you believe that machine translation works against you, as I do, you should never ever give discounts on machine-translated matches delivered by your clients. Why should you give discounts on something that works against you instead of for you?

3. If you believe that everything I wrote is nonsense and that machine translation is here to stay, you'll need to make sure that the total discount given on machine-translated matches delivered by all your clients together never exceeds 549 USD, as you could have generated said added value yourself by investing 549 USD in Slate Desktop. Unless you want to become your own slave, of course.

No matter your angle: giving discounts on machine-translated matches is always a terrible idea. You're robbing your own purse.

Loek van Kooten
Your English/Japanese-Dutch game translator

Addendum: DeepL

You've probably already read about DeepL, the next revolution, Google Translate's big brother and the mother of all computers that will end the world. Though the results of machine translation by this new supercomputer in Iceland are definitely impressive when it comes to robotic texts like newspaper articles (DeepL chose these to market themselves for a reason), we ran some tests ourselves and definitely recommend against using it for your games, unless you're up for a good laugh of course! Using English-Dutch game translation as an example (once more, this is as easy as it gets language-wise), DeepL turned out to deliver translations that were 97% further off than the above-mentioned dumb assembly. Just a few jewels we found:

Appease quickly and floor goals. Spawn quickly, zip line to a waiting car, and floor it to your next objective. Snel gepaaid, rits naar een wachtende auto, en vloer het naar uw volgende doelpunt.

Environments on trampolines. The heart of downtown Los Angeles is a jumping environment – the streets, the buildings, the cars – are all open to criminal deeds. Het hart van het centrum van Los Angeles is een springende omgeving - de straten, de gebouwen, de auto's - staan allemaal open voor criminele daden.

Criminals breaking into each other. Once the criminals break in, they have to nab two bags of cash and jam out with them back to each of the two base points. Als de criminelen eenmaal in elkaar zijn gebroken, moeten ze twee zakken contant geld binnenhalen en met hen naar elk van de twee basispunten terugstoten.

Battlefield loves staplers. A Battlefield staple, Conquest is based on the idea of controlling a base. Een Battlefield nietjes, Conquest is gebaseerd op het idee om een basis te besturen.

Top gangs jerking off. The top gangs are drawing up plans, stockpiling gear, and risking it all to pull off the boldest, craziest heists ever in Battlefield Hardline: Robbery, the second expansion pack for Battlefield Hardline. De topbendes maken plannen, leggen voorraden aan en riskeren om de moedigste, gekste heistes ooit af te trekken in Battlefield Hardline: Robbery, het tweede uitbreidingspakket voor Battlefield Hardline.


About Me | Contact Me | ©2006-2024 Loek van Kooten