Tuesday, April 15, 2014

Translate In-Game Japanese Text with OCR (Optical Character Recognition)

I know that a lot of my viewers are interested in playing imported games, particularly newly released Japanese titles or earlier titles that unfortunately never made it to English or other international audiences. While I think the idea is wholly crazy for reading-intensive games such as RPGs, it's otherwise alright for games that are rather light on text, such as action games in the vein of KOEI's Warriors series.

To make your experience less of a misery, I'm going to share a useful tool you can use yourself to quickly translate in-game Japanese text to English, or whatever other language you desire. If you are already fluent in Japanese, as I mostly am, you can still use it for a dozen other languages, and not just for games either. I've used this tool myself to translate Chinese, Korean, German and Russian text (none of which I'm adept in, unfortunately) into English from games, movies, video clips and television shows.

The tool in question is called OCR, or Optical Character Recognition. If you haven’t heard of it yet, it essentially lets you grab specific sections of text in a picture or screenshot and convert them to a machine-encoded, computer-readable format: text that you can manipulate freely on your PC, including copying and pasting it into a translator.

OCR is widely available both as online web services and as downloadable software. There are also free and paid versions: free versions offer basic functionality with lower accuracy, while paid versions offer higher accuracy and more features.

Before we dive into anything, let’s briefly go over the prerequisites. You’re going to need the following:

1.) A computer that preferably has access to the internet.
2.) OCR software.
I favor ABBYY Screenshot Reader, included in ABBYY FineReader. Although it is paid software, it has worked almost flawlessly for me, converting virtually any text in any language in screenshots to computer-readable text with high accuracy. If you visit their website, you can download a free, fully featured trial of the software. Despite its rather hefty price tag, I highly recommend purchasing it, as it is one of the best OCR programs available.

Do not download or purchase ABBYY Screenshot Reader on its own, as you will not receive Asian language support with it.

If you prefer to go the free route, there are numerous free OCR options available, both as online web services and as downloadable software. Be aware that a lot of them do not support Asian languages.

Capture2text is a free downloadable OCR program with Asian language support and the ability to capture a specific section or area of text:

Here are some free web-based OCR services that have Asian language support:
http://www.i2ocr.com/free-online-japanese-ocr (Japanese only) 

Keep in mind that I have no experience with or knowledge of Capture2text or any online OCR services, let alone free ones. All I ever use is the fantastic ABBYY Screenshot Reader. Nevertheless, a simple Google search will yield more results if the ones above don't work for you.

The following is necessary if you desire game console support (e.g. PS3, Xbox 360, etc.):
3.) A capture card or device compatible with your game console.
The capture card or device is necessary because we need a way to take screenshots of the game showing the text we want converted and translated, so we can then run OCR on them. You may get away with substituting a camera, but since most OCR software requires the text in the picture to be very clear and sharp, a capture card or device will undoubtedly work much better.

If you want to do this with PC games, you can simply use the "Print Screen" command.

Alright, now with the prerequisites and boring things aside, let’s get into the actual process!

Let’s say, for example, that you're playing Musou Orochi 2 Ultimate on your PS3 right now. Pretend you don’t know Japanese and would really like to know what the highlighted text says.

Now, being that it’s a game, you can’t really just copy the text and paste it into a translator, can you? You could manually convert the text into computer-readable form by working out which character is which (using some sort of Japanese dictionary or character list) and typing them out one by one on the computer. But let’s face it, that’s going to take until Hell freezes over and cause endless eye strain, especially given the number of kanji that exist!

Well, fear not, as this is where OCR comes to save the day! It will easily convert the text to computer-readable form for you in the time it takes you to type a sentence, or less.

So the first thing you have to do is take a screenshot of the screen showing the text you want converted, using your capture card or device. If you're playing a PC game, Print Screen should work just dandy. Remember, you can also try a camera if you're on a game console and have no access to a capture device. That may or may not work, depending on how large, clear and prominent the text comes out in the photo.

The next step is to run OCR on the screenshot. Methods and instructions will vary depending on the OCR software being used. I'm going to be using ABBYY Screenshot Reader exclusively, and I'll show you how easy and fast it is to use.

ABBYY Screenshot Reader can be launched directly from the Start Menu after installing ABBYY FineReader:

In order to change the language to Japanese or any other Asian language, click on the Language drop-down menu and select "More languages...":

In the Language Editor window, Asian languages, including Japanese, can now be selected under "Asian Languages". Be sure to uncheck any other languages that are already enabled, such as the default English. If you have more than one language selected, the software will attempt to read all of those languages in the screenshot, which may produce results you don't expect.
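The same one-language-at-a-time advice carries over if you go the free route with a command-line OCR engine. Here is a minimal sketch, assuming the open-source Tesseract engine (not the software used in this guide) is installed along with its Japanese language data. The helper names build_ocr_command and run_ocr and the file name screenshot.png are made up for illustration.

```python
import shutil
import subprocess


def build_ocr_command(image_path, language="jpn"):
    """Build a Tesseract command line that reads exactly one language."""
    if "+" in language:
        # Tesseract combines languages with "+", e.g. "jpn+eng".
        raise ValueError("select a single language for cleaner results")
    # "stdout" makes Tesseract print the recognized text instead of writing a file.
    return ["tesseract", image_path, "stdout", "-l", language]


def run_ocr(image_path, language="jpn"):
    """Run Tesseract if it is installed; return the text, or None if it is missing."""
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        build_ocr_command(image_path, language),
        capture_output=True,
        encoding="utf-8",
        check=True,
    )
    return result.stdout.strip()


# Show the command that would be run for a hypothetical screenshot file.
print(build_ocr_command("screenshot.png"))
```

Calling run_ocr("screenshot.png") on a real screenshot would return the recognized text, or None if Tesseract is not on your PATH.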

After clicking OK and confirming that you have only one language selected, you can go ahead and click on the capture screenshot button (the big button with the notepad and clipboard icon to the right). However, before doing that, you have to make sure the screenshot the software needs to read from is already open in a picture viewer, zoomed to 100% and not hidden or obstructed on the screen, because once you click on that button, it will essentially freeze everything on the screen.

Click the capture screenshot button to start the reading process, but do not click it until your screenshot is open in a separate window, zoomed to 100% and not hidden or obstructed on the screen.
After clicking that button, the ABBYY Screenshot Reader window will disappear and your cursor will transform into a crosshair.

What you want to do here is simply drag to select the section of text you want the screenshot reader to read and copy.
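Conceptually, that drag boils down to a rectangle that gets cut out of the screenshot before any text is recognized. Here is a rough sketch of the idea in plain Python. It is not how ABBYY Screenshot Reader is implemented, and selection_box and crop are hypothetical helper names.

```python
def selection_box(start, end):
    """Turn two drag points (x, y) into a (left, top, right, bottom) box."""
    (x1, y1), (x2, y2) = start, end
    return (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))


def crop(pixels, box):
    """Crop a row-major grid of pixels to the box (right and bottom are exclusive)."""
    left, top, right, bottom = box
    return [row[left:right] for row in pixels[top:bottom]]


# A tiny 4x4 "screenshot" where each pixel is just an "x,y" label.
screenshot = [[f"{x},{y}" for x in range(4)] for y in range(4)]

# Dragging from bottom-right to top-left gives the same box as the reverse.
box = selection_box((3, 3), (1, 1))
print(box)                    # (1, 1, 3, 3)
print(crop(screenshot, box))  # [['1,1', '2,1'], ['1,2', '2,2']]
```

The tighter the box hugs the text, the less stray scenery the OCR has to make sense of.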

It might take a few lousy seconds, but once it has worked its magic, the bit of text in the screenshot should be correctly converted and also copied to the clipboard.

We can check that by simply opening a program that accepts text input (such as Notepad) and pressing Control + V to paste.

So here are the results of attempting to run ABBYY Screenshot Reader on that specific section of Japanese text in that Musou Orochi 2 Ultimate screenshot:

Compared to the actual text that appears in the game, it read it 100% perfectly!

The next step, as you have guessed, is to copy that bit of text and paste it into a translator. My main web translator of choice is Google Translate, but if the results fare poorly for you, I also recommend Babylon Translation or Babelfish.
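If you find yourself doing this a lot, the copy-and-paste step can be shortened by building a translator link from the recognized text. Here is a minimal sketch, assuming Google Translate's web page accepts sl (source language), tl (target language) and text query parameters. That URL format is an assumption that may change, and google_translate_url is a made-up helper name.

```python
from urllib.parse import urlencode


def google_translate_url(text, source="ja", target="en"):
    """Build a Google Translate link with the text already filled in."""
    # Assumed query parameters: sl = source language, tl = target language.
    params = {"sl": source, "tl": target, "text": text, "op": "translate"}
    # urlencode percent-encodes the Japanese characters as UTF-8.
    return "https://translate.google.com/?" + urlencode(params)


# Example with a short Japanese greeting standing in for OCR output.
print(google_translate_url("こんにちは"))
```

Passing the resulting link to webbrowser.open from Python's standard library would open it in your default browser.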

After running that bit of Japanese text through Google Translate, this is what it returned:

Google Translate doesn't do a fantastic job on translation, especially from Japanese to English, generating broken Engrish a fair majority of the time. Regardless, it should usually be enough for you to get a general idea of what the text means. Here, if I pretend that I can only read the English, my understanding is that turning this on will cause enemy morale to rise greatly at the start of the battle.

Now if you still don't understand it after the above translation, you are welcome to ask me for a proper translation that doesn't make anyone's mind melt in confusion (Japanese only of course). No OCR necessary, just send me the screenshot or text directly. :)

And that pretty much concludes this guide on OCR. You can imagine the endless other possibilities OCR opens up, given all the languages it can read. Of course, as I already mentioned, it doesn't have to be Japanese or a game. I honestly think every person who enjoys foreign media and games should know how to use OCR effectively, and most certainly ABBYY Screenshot Reader.

One last thing I want to mention is that no OCR, not even ABBYY Screenshot Reader, is a miracle product by any means. It's not going to perfectly read all the text you hand to it. If you desire perfect or near-perfect results, you have to make sure the text you want read in your screenshots is as crisp, clear and large as possible. The text should also have no background behind it, be straight (not crooked) and be a single color. Vertical text is fine as long as it's straight.
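One common way to get a screenshot closer to those conditions is to threshold it to pure black and white before running OCR, so a busy background collapses into a single flat color. This is a general image-processing trick rather than anything specific to ABBYY Screenshot Reader. The sketch below assumes a grayscale image stored as rows of 0-255 values and a hand-picked threshold of 128, and binarize is a made-up helper name.

```python
def binarize(gray, threshold=128):
    """Map each grayscale pixel (0-255) to pure black (0) or pure white (255)."""
    return [[255 if value >= threshold else 0 for value in row] for row in gray]


# Light text (around 230) over a mottled dark background (20 to 90).
gray = [
    [20, 230, 90, 40],
    [60, 235, 225, 30],
    [35, 80, 228, 50],
]

for row in binarize(gray):
    print(row)
# [0, 255, 0, 0]
# [0, 255, 255, 0]
# [0, 0, 255, 0]
```

A fixed threshold only works when the text is clearly lighter or darker than everything behind it, so multi-colored text over a bright scene will still cause trouble.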