Introducing Ordbok - An open source localization library for SketchUp Ruby Extensions

localization
translation
dictionary

#1

After some weeks of passive thinking and some days of tinkering I’m ready to announce a half-baked version of Ordbok, a localisation library for SketchUp extensions.

I think a lot of people are quite frustrated with LangHandler, both its unusual style and its lack of features. There have been other attempts at localization in the past, e.g. @thomthom’s Babelfish (which is where I learned string interpolation from).

This library aims to take SketchUp extension localization one step further, with features inspired by rails/i18n.

Some notable features are:

  • String interpolation
OB = Ordbok.new
OB[:message_notification, count: 14]
# => "You have 14 new messages."
  • Simple pluralization rules
    (Pick different phrases based on count)
OB = Ordbok.new
OB[:message_notification, count: 1]
# => "You have 1 new message."
# Note 'message' being in singular!
  • Phrase grouping
    (Required for pluralization)
OB = Ordbok.new
OB["errors.file_missing", count: 1]
# => "Could not perform operation.\n\n1 file is missing."
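
For readers new to these features, here is a minimal sketch of how interpolation and a simple one/other plural choice can work together. It is not Ordbok's implementation: the PHRASES table, the :one/:other form names and the lookup method are all made up for illustration, and the %{count} placeholder style is simply what Ruby's built-in format understands.

```ruby
# Hypothetical phrase table; Ordbok's real file format may differ.
PHRASES = {
  message_notification: {
    one: "You have %{count} new message.",
    other: "You have %{count} new messages."
  },
  greeting: "Hello, %{name}!"
}.freeze

# Look up a phrase, pick a plural form if the entry has several,
# then fill in the %{...} placeholders with the given parameters.
def lookup(key, **params)
  entry = PHRASES.fetch(key)
  if entry.is_a?(Hash)
    form = params[:count] == 1 ? :one : :other
    entry = entry.fetch(form)
  end
  format(entry, params)
end

lookup(:message_notification, count: 14) # => "You have 14 new messages."
lookup(:message_notification, count: 1)  # => "You have 1 new message."
lookup(:greeting, name: "Ada")           # => "Hello, Ada!"
```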

The library is also built around descriptive keys rather than full English strings, which both makes the code more readable and prevents translation errors due to homonyms in the source language (different phrases that just happen to be spelled the same).

The library is not 100% finished but ready to be tested out. Enjoy!


#3

Documentation is now available at http://www.rubydoc.info/github/Eneroth3/ordbok


#4

The lib has also been updated today to let end users select a language (if the extension developer decides to expose the option) and to save the language preference between sessions.


#5

Hi, I am having trouble understanding what this library is used for… Can you explain it in layman’s terms?

Thanks and keep the good work!


#6

Thanks for asking

It’s used to pull in text strings from external, localized (translated) files. This allows for adding multiple languages to an extension. SketchUp is shipped with LangHandler that also does this, but without fancy features such as string interpolation and pluralization.


#7

I need to look into this because when I see my analytics of who buys my extensions I see people from all over the globe.

What would be the best way to have the users themselves add translations into their native languages to the extension?

I’ve seen game developers use Excel Spreadsheet to handle multiple languages because it makes it easy for everyone to edit.

Anyways, you’ve piqued my interest with this and I’ll look into it more. Thanks!


#8

I haven’t decided on the format yet. For now translations are stored in JSON files, which allows for nested keys.
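
As an illustration of what nested keys in JSON make possible, here is a small sketch that resolves a dotted key such as "errors.disk.full" against a parsed file. The file content, the key names and the resolve helper are assumptions made for this example, not Ordbok's actual format or API.

```ruby
require "json"

# Hypothetical translation file content; the keys are made up.
json = <<~JSON
  {
    "errors": {
      "file_missing": "Could not find the file.",
      "disk": { "full": "The disk is full." }
    }
  }
JSON

phrases = JSON.parse(json)

# Resolve a dotted key by walking one nesting level per key part.
# Returns nil when any part of the path is missing.
def resolve(phrases, dotted_key)
  dotted_key.split(".").reduce(phrases) do |node, part|
    node.is_a?(Hash) ? node[part] : nil
  end
end

resolve(phrases, "errors.file_missing") # => "Could not find the file."
resolve(phrases, "errors.disk.full")    # => "The disk is full."
resolve(phrases, "errors.nope")         # => nil
```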

Spreadsheets could be useful, especially as they allow translators/users to see the phrase in several languages at once. However, there may be a different number of phrases in different languages. In English (and most other Germanic languages), nouns (and sometimes other words) have a singular and a plural form, meaning you’d need to specify the phrases “There is 1 new message” and “There are X new messages” separately (unless you can stand writing “is/are” and “message(s)”). Some languages, like Russian, have more grammatical numbers, requiring more phrases to be specified. In a spreadsheet there would therefore be rows that don’t apply to all languages, which may look odd and be confusing.
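
To make the difference concrete, here is a rough sketch of per-language plural selection for whole numbers. The rule names (:one, :few, :many, :other) follow the common CLDR-style categories; the PLURAL_FORM and DAYS tables and the days method are hypothetical and not taken from Ordbok.

```ruby
# Each rule maps a whole number to the name of the form to use.
PLURAL_FORM = {
  "en" => ->(n) { n == 1 ? :one : :other },
  "ru" => ->(n) {
    if n % 10 == 1 && n % 100 != 11
      :one
    elsif (2..4).cover?(n % 10) && !(12..14).cover?(n % 100)
      :few
    else
      :many
    end
  }
}.freeze

# English needs two phrases, Russian needs three.
DAYS = {
  "en" => { one: "%{count} day", other: "%{count} days" },
  "ru" => { one: "%{count} день", few: "%{count} дня", many: "%{count} дней" }
}.freeze

def days(lang, count)
  form = PLURAL_FORM.fetch(lang).call(count)
  format(DAYS.fetch(lang).fetch(form), count: count)
end

days("en", 1)  # => "1 day"
days("en", 5)  # => "5 days"
days("ru", 3)  # => "3 дня"
days("ru", 11) # => "11 дней"
days("ru", 21) # => "21 день"
```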

Anyhow, the format the strings are saved in inside of the plugin and the format used when writing the translations don’t need to be the same. As long as both are clearly specified you can always convert between them.
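
As a sketch of such a conversion, the snippet below flattens a nested phrase table into one dotted key per row, which is roughly the shape a spreadsheet would want. The sample data and the flatten_phrases helper are made up for illustration.

```ruby
# Turn a nested phrase table into a flat { "dotted.key" => "phrase" } hash.
def flatten_phrases(node, prefix = nil, rows = {})
  node.each do |key, value|
    path = prefix ? "#{prefix}.#{key}" : key.to_s
    if value.is_a?(Hash)
      flatten_phrases(value, path, rows)
    else
      rows[path] = value
    end
  end
  rows
end

nested = {
  "greeting" => "Hello!",
  "messages" => { "one" => "1 new message", "other" => "%{count} new messages" }
}

rows = flatten_phrases(nested)
# rows now has the keys "greeting", "messages.one" and "messages.other".

# One tab-separated line per phrase, ready to paste into a spreadsheet.
puts rows.map { |key, phrase| "#{key}\t#{phrase}" }
```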

I’m thinking about using either Google Docs (spreadsheet) or a public GitHub repo for my translations when making localized extensions in the future, but haven’t decided yet.


#9

Excel is certainly not professionally used for this (probably as a workaround to involve people who are just familiar with Excel). It is the input “form” that these developers convert into their translation files.

In any case, a process that can be automated (by a script) is ideal. This saves manual work and can be repeated endless times without human intervention (continuous integration).

There are also collaborative localization tools, either by the usual open source suspects (1, 2, 3) or for general purpose, like Zanata and Weblate. Apart from .po files they also support some JSON formats. It is also interesting that there already exists a JSON format used by Firefox and Chrom(e|ium) extensions.


#10

I have scripts that scrape my Ruby, JS and HTML sources for strings to be localised. Using static analysis libraries I parse the AST of both Ruby and JS to reliably extract the strings. The strings are collected into a single source file which another script uploads to CrowdIn. A third script is then run whenever I need to update the strings from CrowdIn. That’s been working very well - especially since services like CrowdIn allow you to upload screenshots of where the various strings are used. Makes it easier for me as well as the translators.
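
The scripts themselves are not shown in this thread, so the following is only a rough idea of what AST-based extraction can look like for Ruby. It uses Ripper from Ruby's standard library as a stand-in for whatever static analysis library is actually used, and it collects every plain string literal; a real script would presumably keep only the strings passed to a translation call. The collect_strings helper and the sample source are made up.

```ruby
require "ripper"

# Walk a Ripper S-expression and collect the text of string literals.
# String content shows up as nodes of the form [:@tstring_content, "text", [line, col]].
def collect_strings(node, found = [])
  return found unless node.is_a?(Array)

  if node[0] == :@tstring_content
    found << node[1]
  else
    node.each { |child| collect_strings(child, found) }
  end
  found
end

# The source is only parsed, never executed.
source = <<~'RUBY'
  UI.messagebox("File not found")
  label = "Cabinet Type:"
RUBY

collect_strings(Ripper.sexp(source))
# => ["File not found", "Cabinet Type:"]
```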


#11

Hi,

I found the video I saw a while back about this Game Developer who used Excel…

Programmers Dilemma…

Also, listen to the following podcast from 1:04:06

Choose whatever technology works for you because it is not really that important.

Cheers!


#12

I’m quite happy with my own Language translator. I developed a translator that I can use with my embedded html editors and with my C++ applications.

I’m using simple key-value pairs in a text file. This allows users in different countries to easily edit a file using Notepad++.

Translations are almost always context sensitive - therefore a user who speaks fluent English and the target language (be it Spanish, Russian etc.) and who has sufficient knowledge of the specific application is always the best choice as a proficient translator.

I also have a configuration editor where the user can choose a language. I provide an English language template. The user simply copies it using their own 2 character language code and begins translating. The English template looks like this. I should mention that empty strings simply return the original string.

"Cabinet Backs" = ""
"Cabinet Hangers" = ""
"Cabinet Library " = ""
"Cabinet List" = ""
"Cabinet Style:" = ""
"Cabinet Type:" = ""
"Cabinet" = ""
"Cabinets" = ""
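
The translator described here is written for HTML editors and C++ applications and its code is not shown, so the following Ruby sketch only illustrates the described behaviour: read "key" = "value" lines and fall back to the original string when the value is empty or the key is missing. It assumes plain straight quotes in the file; parse_lang, translate and the sample translation are made up.

```ruby
# Parse lines of the form "key" = "value" into a hash.
# Lines that don't match the pattern are ignored.
def parse_lang(text)
  table = {}
  text.each_line do |line|
    match = line.match(/\A\s*"([^"]*)"\s*=\s*"([^"]*)"\s*\z/)
    table[match[1]] = match[2] if match
  end
  table
end

# An empty or missing translation returns the original string.
def translate(table, phrase)
  translated = table[phrase]
  translated.nil? || translated.empty? ? phrase : translated
end

sample = <<~LANG
  "Cabinet" = "Armoire"
  "Cabinets" = ""
LANG

table = parse_lang(sample)
translate(table, "Cabinet")      # => "Armoire"
translate(table, "Cabinets")     # => "Cabinets"
translate(table, "Cabinet List") # => "Cabinet List"
```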

Interestingly enough, I have some English users who copy the “en.lang” translation over to their own, e.g. “e1.lang”, so they can modify woodworking terms to suit their background. For instance, in the UK woodworkers call the “kick” a “plinth” and others call a “sink stretcher” a “sink rail”.

Then there are other users - many in Quebec who do not use the French translation - they choose to stay with English as it speeds up customer support.

Lots to think about when you start considering all the use cases.


#13

That depends! Technology does matter. That something “works” does not prove it to be good. We have to deal a lot with customers that abuse Excel in areas where it is inappropriate, error prone and leads to unmaintainable calculations that cannot be reused in an automated manner. As soon as you grow to do something at a bigger scale, you better switch to subject-specific solutions.

But it’s curious to see this creative solution!

There even exist language codes for that, like “en-GB-WLS” for the language “English” in the region “Great Britain” and the variant “Welsh”. But most software cannot parse such differentiated codes.


#14

I learned some weeks ago there is a language code for Scanian, my local accent :smiley: . I may translate some future plugins to it, but sadly I don’t know much of the written language as we use standard Swedish. The language code was even removed in a later revision of the ISO specification :frowning: . But for a moment we were almost independent of the Swedish tyranny!


#15

I’ve made a few updates and published version 1.0.0.

There are now 5 different pluralization rules, and I think all languages SketchUp is available in are supported. The file format has changed slightly: it now contains the language name and can optionally specify a pluralization rule. I’ve also added a menu that can optionally be implemented in an extension to let the user change language, and the language setting is preserved between sessions (on a per-extension basis).

I’ve been looking briefly at Crowdin but can’t make very much sense of what formats they support. The list includes JSON but I can’t find anywhere how pluralization is treated, e.g. if a translator can add more versions of a phrase than the original has (day, days -> день, дня, дней).

For now Ordbok still uses JSON but perhaps I’ll switch over to YAML for consistency with rails-i18n (or support both).