The WebTranslateIt Blog

i18n news and Product Updates about WebTranslateIt

New in Web Translate It: Translation validations

By Edouard on July 1, 2011

I just added translation validation checks to Web Translate It.

When translating software, language files often contain what we call “interpolated variables” among the text to translate. Take this segment, for instance:

%{person_name} could not be found.

%{person_name} is a variable. The text of the variable must not be translated: it has to match the source text exactly, or it will break the software being translated. However, translators often translate the variable by mistake.

By default, Web Translate It now checks that the variables in the translation match those in the source text, and prevents the translator from saving a broken string.

If you are really sure of what you are doing, you can bypass this validation check by unticking the box.

Web Translate It currently recognizes the following variables: {variable}, ruby-style %{variable} or {{variable}}, and C-style variables such as %d or %s.
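To give an idea of what such a check involves, here is a minimal sketch in Python. It is not Web Translate It’s actual code: the regular expression and the function names `placeholders` and `is_valid_translation` are assumptions made for illustration, and the pattern only covers the variable styles listed above.

```python
import re
from collections import Counter

# Hypothetical pattern for the variable styles listed in the post.
# Longer forms come first so that %{name} and {{name}} are not
# mistaken for the bare {name} form.
PLACEHOLDER = re.compile(
    r"""
    %\{\w+\}        # ruby-style %{variable}
  | \{\{\w+\}\}     # {{variable}}
  | \{\w+\}         # {variable}
  | %[ds]           # C-style %d or %s
    """,
    re.VERBOSE,
)


def placeholders(text):
    """Return a multiset (Counter) of the variables found in text."""
    return Counter(PLACEHOLDER.findall(text))


def is_valid_translation(source, target):
    """True when target contains exactly the same variables as source."""
    return placeholders(source) == placeholders(target)
```

With this sketch, translating "%{person_name} could not be found." while keeping %{person_name} intact passes the check, whereas a translation that renames the variable fails it.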

I hope you will find this improvement useful. Translation validation was by far the most requested feature on Web Translate It.

Edit 01/08/2011: I released an improvement to this feature.

Downtime post-mortem: June 21st

By Edouard on June 21, 2011

At 07:11 AM GMT today, Web Translate It stopped responding. I was notified immediately, woke up and found that the server wasn’t reachable by any means.

I immediately reported back on Twitter and contacted our hosting company, DigitalOne, to find out whether they were having issues.

The word from the hosting company was that the service disruption was due to a bug in one of their core routers, and they appeared to be working on it. In the meantime, I noticed that services such as Pinboard and Instapaper (which are partially or fully hosted at the same ISP) were also having network issues.

Since the service wasn’t completely down but kept going offline and coming back, I decided against moving it to the backup server, which is much weaker than the main server.

Around 10:30 AM GMT the service came back online, and it has been stable throughout the day. We suffered no data loss.

Right now the word is that the FBI raided our hosting company’s datacenter and pulled three racks of servers. Some servers were removed, others (such as Web Translate It’s) were temporarily disconnected.

I am really sorry for any and all problems this has caused. In the coming months I will take steps to make Web Translate It able to fail over more easily in such extraordinary events.

web_translate_it gem v1.8.1.2 released

By Edouard on June 20, 2011

I just released a new version of the web_translate_it gem, the synchronization tool for Web Translate It.

This release fixes two bugs that occur when running wti under Microsoft Windows.

Here’s the changelog:

  • Bug fix: Disable colors when running under MS Windows. #58

  • Bug fix: Don’t verify SSL certificate when running under MS Windows. #57

Install or Upgrade

To install web_translate_it, please refer to the gem documentation.

To upgrade web_translate_it to its latest version, type in a terminal: gem install web_translate_it.

File encoding detection improvements

By Edouard on June 10, 2011

Today I released an update to Web Translate It to improve the file encoding detection.

Internally, Web Translate It uses UTF-8, but we have to accept files in any encoding. This is needed because, while most files are UTF-8 encoded, Java .properties files are ISO 8859-1 encoded and Apple .strings files are sometimes UTF-16LE encoded. We also support uploading text files, which can have any encoding.

Under the hood, Web Translate It detects your file’s encoding, saves it to the database and converts your strings to UTF-8 before importing them into the database. When you download your translated file, Web Translate It converts your strings back to your file’s original encoding.
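The round trip can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not the real implementation: the functions `import_strings` and `export_strings` and the dictionary standing in for a database record are hypothetical.

```python
def import_strings(raw_bytes, source_encoding):
    """Decode the uploaded bytes and store them as UTF-8.

    The original encoding is kept next to the content so the file
    can be converted back on download.
    """
    text = raw_bytes.decode(source_encoding)
    return {"encoding": source_encoding, "utf8": text.encode("utf-8")}


def export_strings(record):
    """Convert the stored UTF-8 content back to the original encoding."""
    return record["utf8"].decode("utf-8").encode(record["encoding"])


# Usage: an ISO 8859-1 upload comes back byte-for-byte identical.
uploaded = "Enregistré".encode("iso-8859-1")
record = import_strings(uploaded, "iso-8859-1")
downloaded = export_strings(record)
```

Everything hinges on the stored encoding being right: if the upload is decoded with the wrong encoding, the text is corrupted before it ever reaches the database.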

In some rare cases, file imports were failing or strings were imported wrongly because Web Translate It wasn’t able to recognise your file encoding.

To remedy this issue, I improved the file encoding detection strategy. Here’s what Web Translate It does now:

  1. If your file contains a BOM, the BOM tells us for sure which Unicode encoding your file uses: UTF-8, UTF-16LE, UTF-16BE, UTF-32, etc.

  2. If your file doesn’t contain a BOM, it is either encoded in something other than UTF-something, or it is a UTF-8 file without a BOM. So we scan the content of your file and look for a hint. For instance, if your Gettext .po file contains in its header "Content-Type: text/plain; charset=UTF-8\n", then we assume your file is UTF-8 encoded.

  3. If we can’t find any indication of the encoding of your file, a character detection algorithm is used (we use rchardet). rchardet takes a sequence of bytes from your file (of unknown encoding) and attempts to determine the encoding so that we can read the text and import it into the database.

  4. Finally, there is only so much we can do. If rchardet couldn’t reliably detect your file encoding, then our fallback strategy is to assume you’re using UTF-8.
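The four steps above can be sketched in Python as follows. This is a simplified illustration and not the code running on Web Translate It: the function `detect_encoding`, the `CHARSET_HINT` pattern and the optional `statistical_detector` callback (standing in for a library such as rchardet) are all assumptions.

```python
import codecs
import re

# BOMs mapped to encoding names. UTF-32 is checked before UTF-16
# because the UTF-32LE BOM starts with the same bytes as the UTF-16LE BOM.
BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]

# Hypothetical hint pattern: matches declarations such as the
# "charset=UTF-8" found in a Gettext .po header.
CHARSET_HINT = re.compile(rb"charset=([A-Za-z0-9._-]+)", re.IGNORECASE)


def detect_encoding(data, statistical_detector=None):
    """Guess the encoding of data (bytes) using the four-step strategy."""
    # Step 1: a BOM identifies the Unicode encoding with certainty.
    for bom, name in BOMS:
        if data.startswith(bom):
            return name

    # Step 2: look for an explicit hint inside the file content.
    hint = CHARSET_HINT.search(data)
    if hint:
        return hint.group(1).decode("ascii")

    # Step 3: ask a statistical detector, if one was supplied.
    # It should return an encoding name, or None when unsure.
    if statistical_detector is not None:
        guess = statistical_detector(data)
        if guess:
            return guess

    # Step 4: nothing conclusive, so assume UTF-8.
    return "UTF-8"
```

The order of the steps reflects how trustworthy each signal is: a BOM is unambiguous, an in-file declaration is usually right, and a statistical guess is only consulted when nothing better is available.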

Detecting character encodings is tricky. If you’re having character encoding problems when uploading a file to Web Translate It, please don’t hesitate to open a support ticket and we’ll work with you to correctly import your file.

Now testing: support for Microsoft Word .docx and PowerPoint .pptx

By Edouard on June 8, 2011

Web Translate It works pretty well to translate software language files. Could it translate documents too?

Web Translate It now supports two new file formats: Microsoft Word .docx and PowerPoint .pptx. Note that we currently don’t support the older Microsoft Word .doc and PowerPoint .ppt formats.

The biggest challenge in supporting these files is to minimise the amount of Open XML’s infamous “tag soup”. Web Translate It does a pretty good job of keeping this markup code out of your way.

For instance, here’s what a .docx file might look like in a word processor:

The underlying code generated by Microsoft Word looks like this:

And this is the same paragraph imported in Web Translate It:

Web Translate It removes unnecessary markup tags and “folds” complex and untranslatable markup tags into smaller tags.

We’re currently testing these new file formats. If you notice anything strange when translating these files, please open a support ticket and I will investigate and sort it out.

Web Translate It now supports more than 20 different file formats. If you’d like to use a file format Web Translate It doesn’t support yet, open a support ticket and let me know.