Improvement on Web Translate It: better YAML importer
By Edouard on 14 January 2010
Today I launched an improved parser for the YAML file importer on Web Translate It. The previous parser was buggy and inefficient.
The default Ruby on Rails i18n backend, called “Simple”, uses YAML files to store its translations.
A YAML file looks like this:
fr:
  some_key:
    key: value
    # a comment
    hello: bonjour
The YAML parser used until today was home-made. It read the YAML file line by line and extracted keys, values and comments.
Really good YAML parsers exist for pretty much every language, but they do not let me extract comments and display them on the translation interface, which was a no-go when I implemented this feature.
My parser worked fine with really simple YAML files. In practice, many customers had problems importing their YAML files because it did not support some edge cases.
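To give an idea of what can go wrong with a line-by-line approach, here is a minimal sketch. It is not the original Web Translate It parser: `naive_scan` and its regular expressions are hypothetical, and the multi-line block scalar is just one edge case picked for illustration.

```ruby
require "yaml"

# Hypothetical line-by-line extractor, written for illustration only.
# It is NOT the original Web Translate It parser.
def naive_scan(source)
  entries = source.each_line.map do |raw|
    line = raw.chomp
    if (m = line.match(/\A\s*#\s*(.*)\z/))
      { comment: m[1] }                      # a comment line
    elsif (m = line.match(/\A\s*([^:\s][^:]*):\s*(.*)\z/))
      { key: m[1], value: m[2] }             # a "key: value" line
    end
  end
  entries.compact                            # drop lines that matched nothing
end

simple = <<~DOC
  fr:
    some_key:
      key: value
      # a comment
      hello: bonjour
DOC

# Edge case chosen for this sketch: a multi-line block scalar.
tricky = <<~DOC
  en:
    intro: |
      Note: read this first
DOC

simple_entries = naive_scan(simple)   # keys, values and the comment are found
tricky_entries = naive_scan(tricky)   # "Note" is wrongly reported as a key

p simple_entries
p tricky_entries
p YAML.load(tricky)                   # the real parser keeps "Note: ..." as text
```

The scanner copes with the simple file, comment included, but it treats the text inside the block scalar as another key, while Ruby's own parser returns it as the value of `intro`.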
As Web Translate It grows, more customers need to import YAML files. As a consequence, a few more import jobs failed for a customer last night and today. I am really sorry about that.
Being able to actually import YAML files is obviously more important than being able to extract comments. On top of that, my parser had become incredibly complex, and I felt I would not be able to fix the issue in a simple way or to maintain such a hullabaloo.
So I switched to the vanilla Ruby YAML parser, which not only parses files really well, but also parses them really fast. The only downside is that it is no longer possible to extract and import comments from the YAML file.
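As a rough illustration of that trade-off, the sketch below loads the sample file with Ruby's standard `yaml` library. The nesting of the sample is an assumption, and this is not the actual importer code.

```ruby
require "yaml"

# The sample file shown earlier, comment included (nesting assumed).
source = <<~DOC
  fr:
    some_key:
      key: value
      # a comment
      hello: bonjour
DOC

translations = YAML.load(source)

# The result is a plain nested Hash: keys and values are there,
# but the comment has been discarded by the parser.
p translations["fr"]["some_key"].keys
p translations["fr"]["some_key"].values
```

Nothing in the returned hash refers to the comment, which is why it can no longer be shown on the translation interface.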
A note of caution with YAML
Be cautious when writing copy in your YAML file, because everything you write in the file is interpreted by the parser (by your application as well as by Web Translate It). Words like true, false, yes and no are interpreted as the booleans True or False by YAML parsers.
Consider this YAML file:
en:
  date:
    true: foo
    false: bar
    yes: baz
    no: boo
It will be interpreted as:
en:
  date:
    true: foo
    false: bar
Because yes == true and no == false in YAML, the parser considers the pairs yes: baz and no: boo to be duplicates of true: foo and false: bar respectively. If you must use yes and no as keys, wrap them in quotes, like so:
en:
  date:
    true: foo
    false: bar
    "yes": baz
    "no": boo
The same goes for values. This:
en:
  date:
    a_key: yes
    b_key: no
will be interpreted as:
en:
  date:
    a_key: true
    b_key: false
So wrap them in quotes:
en:
  date:
    true: "yes"
    false: "no"
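If you want to check this behaviour yourself, here is a small sketch using Ruby's standard `yaml` library. It assumes the parser bundled with Ruby, which treats yes and no as booleans; other parsers may behave differently. Which of two colliding values survives is up to the parser, so the sketch only looks at the keys that remain.

```ruby
require "yaml"

# Unquoted yes/no used as keys: they collide with true/false.
unquoted_keys = YAML.load(<<~DOC)
  en:
    date:
      true: foo
      false: bar
      yes: baz
      no: boo
DOC

# Quoted "yes"/"no" keys stay plain strings.
quoted_keys = YAML.load(<<~DOC)
  en:
    date:
      true: foo
      false: bar
      "yes": baz
      "no": boo
DOC

# Unquoted yes/no used as values become booleans.
unquoted_values = YAML.load(<<~DOC)
  en:
    date:
      a_key: yes
      b_key: no
DOC

# Quoted values are kept as the strings "yes" and "no".
quoted_values = YAML.load(<<~DOC)
  en:
    date:
      a_key: "yes"
      b_key: "no"
DOC

p unquoted_keys["en"]["date"].keys      # only two keys are left
p quoted_keys["en"]["date"].keys        # all four keys are kept
p unquoted_values["en"]["date"].values  # booleans
p quoted_values["en"]["date"].values    # strings
```

Running a quick check like this on a translation file before importing it is a cheap way to spot keys or values that were silently turned into booleans.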
Thank you for your patience with this issue, and thank you for using Web Translate It.