Making Web Pages with Built-in Word Translation

Project Proposal by Martin Stacey


Software

XHTML; probably XML or JSON; JavaScript; possibly Java or another programming language; maybe a database management system

Covers

Programming, systems design, web interface design, possibly databases

Skills Required

Programming, web publishing, maybe databases, preferably some interest in HCI and learning foreign languages.

Challenge

Conceptual ** Technical *** Programming ***

Brief Description

A major problem for people learning foreign languages is the time and effort involved in looking words up in hardcopy dictionaries. Online dictionaries are a bit quicker, but still more of a slog than you want when you're doing casual reading. The result is that learners don't read enough, or don't improve their vocabulary because they skip checking words they're not sure of. There are programs such as TranslateIt! that provide very quick one-point or one-click lookup of words in downloaded dictionaries, but these cost money and take some effort to set up.

The aim of this project is to develop a system that takes a different approach to supporting people learning, or just wanting to read, foreign languages: manufacturing web pages with built-in word translation support, so that users don't need their own translation software. The idea is that the reader reads the text of the web page in, say, German, and points at or clicks on any word he or she is unsure of, and a translation appears in another window, a sidebar, or a region at the bottom of the window.
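As a rough illustration of the point-or-click idea, the sketch below looks up a clicked word in a small dictionary embedded in the page. Everything named here is an assumption made for the example rather than part of the proposal: the `dictionary` contents, the `word` class on clickable spans, and the `translation-panel` element id are all hypothetical.

```javascript
// Hypothetical in-page dictionary; a generated page would embed a real one.
const dictionary = new Map([
  ["Haus", ["house"]],
  ["lesen", ["to read"]],
  ["und", ["and"]],
]);

// Strip leading and trailing non-letters so that "Haus," still matches "Haus".
function normalizeWord(raw) {
  return raw.replace(/^[^\p{L}]+|[^\p{L}]+$/gu, "");
}

// Try the word as written first (German nouns are capitalised),
// then fall back to lower case for sentence-initial words.
function lookupWord(dict, raw) {
  const word = normalizeWord(raw);
  if (dict.has(word)) return dict.get(word);
  const lower = word.toLowerCase();
  if (dict.has(lower)) return dict.get(lower);
  return [];
}

// Browser wiring (skipped when run outside a browser): show the
// translation of any clicked element that carries the assumed "word" class.
if (typeof document !== "undefined") {
  document.addEventListener("click", (event) => {
    const target = event.target;
    if (!target.classList || !target.classList.contains("word")) return;
    const panel = document.getElementById("translation-panel");
    if (panel) {
      const found = lookupWord(dictionary, target.textContent);
      panel.textContent = found.join("; ") || "(no entry)";
    }
  });
}
```

Embedding the dictionary in the page keeps the reader free of any installed software, at the cost of larger pages; fetching entries from a server on demand would be the obvious alternative.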

This will involve building a system that takes a web page (a text file with particular characteristics) and generates another text file that looks the same or similar when displayed in a web browser, but supports one-point or one-click word translation. Alternatively, the system might make use of templates that the person producing the web pages can add text to, with the construction of the finished web page being done automatically, as part of a content management system. Another approach would be an application that can process .txt files, and/or accept text copied and pasted via the Windows clipboard. The system will need to include an interface for the generator that enables a user to create adapted web pages quickly and easily. (It's reasonable to demand that the source web pages are subject to certain restrictions, like having all the to-be-translated text within HTML elements that have the class "translate".)
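One way the generation step could work is to wrap every word of the to-be-translated text in its own element, so that the browser can tell which word was pointed at. The sketch below does this for a plain-text string, as would suit the .txt or clipboard route; the function names and the `word` class are assumptions for illustration, and a real generator handling full web pages would need a proper HTML parser rather than string processing.

```javascript
// Escape characters that have special meaning in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Wrap each run of letters in <span class="word">...</span>.
// Splitting on a capturing group keeps the words in the result:
// odd-numbered chunks are words, even-numbered chunks are the
// spaces and punctuation between them.
function wrapWords(text) {
  return text
    .split(/(\p{L}+)/u)
    .map((chunk, index) =>
      index % 2 === 1
        ? `<span class="word">${escapeHtml(chunk)}</span>`
        : escapeHtml(chunk)
    )
    .join("");
}

console.log(wrapWords("Das Haus."));
```

Note that this simple rule splits hyphenated compounds and words containing apostrophes into separate spans, which may or may not be what the reader wants.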

What's involved in building the program will depend on whether you're writing your own dictionary (avoid this if you can) or making use of a free or commercially available dictionary. You might want to use the dictionary directly, and do the conversion from the dictionary's own data format at runtime for each word, or you might want to write a separate program that converts the dictionary into an XML or JSON file or a database that your program can then use. There is a free German-English dictionary available from the TU Chemnitz at http://ftp.tu-chemnitz.de/pub/Local/urz/ding/de-en/; this appears to be a plain-text file in its own delimiter-based notation (somewhat reminiscent of Backus-Naur form) rather than XML, but using it with a little editing or converting it to an XML or JSON version shouldn't be difficult, and it looks like it will do the job.
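A separate conversion program of the kind described might look roughly like the sketch below. It assumes a line format in which `::` separates the German side from the English side, `|` separates related sub-entries, `;` separates alternatives, `#` starts a comment line, and annotations sit in braces, brackets, or parentheses. That layout is an assumption to be checked against the actual dictionary file, and the sample lines are made up for the example.

```javascript
// Parse one dictionary line into { headword, translations } entries.
// The delimiters used here are assumptions about the file format.
function parseDictionaryLine(line) {
  if (line.startsWith("#") || !line.includes("::")) return [];
  const [germanSide, englishSide] = line.split("::");
  const germanParts = germanSide.split("|");
  const englishParts = englishSide.split("|");
  const entries = [];
  germanParts.forEach((germanPart, index) => {
    const translations = (englishParts[index] || "")
      .split(";")
      .map((item) => item.trim())
      .filter(Boolean);
    germanPart.split(";").forEach((variant) => {
      // Drop annotations such as {n}, [ugs.] or (von) from the headword.
      const headword = variant
        .replace(/\{[^}]*\}|\[[^\]]*\]|\([^)]*\)/g, "")
        .replace(/\s+/g, " ")
        .trim();
      if (headword && translations.length > 0) {
        entries.push({ headword, translations });
      }
    });
  });
  return entries;
}

// Merge all lines into one lookup object, ready for JSON.stringify.
function buildDictionary(lines) {
  const dict = Object.create(null); // no inherited keys such as "constructor"
  for (const line of lines) {
    for (const { headword, translations } of parseDictionaryLine(line)) {
      const merged = new Set([...(dict[headword] || []), ...translations]);
      dict[headword] = [...merged];
    }
  }
  return dict;
}

// Made-up sample lines in the assumed format.
const sampleLines = [
  "# sample comment line",
  "Haus {n} | Häuser {pl} :: house | houses",
  "lesen :: to read; to peruse",
];
console.log(JSON.stringify(buildDictionary(sampleLines), null, 2));
```

Converting once, ahead of time, means the generated pages only ever deal with JSON; the runtime alternative mentioned above would instead call something like `parseDictionaryLine` for each lookup.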

I seriously want one of these I can use in earnest, so you have a customer!

Variants/Extensions

A version of the system that generates pages suitable for reading on small mobile devices.

A version of the system that produces pages offering the reader a choice of languages to translate words into.

Tricky: Some support for recognising and offering translations of separable verbs, standard phrases, etc.
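To give a feel for why separable verbs are tricky, here is a deliberately crude sketch: if the clause ends in a known separable prefix, it glues that prefix onto guessed infinitives of the clicked verb form and checks each guess against the dictionary. The prefix list, the ending-stripping rule, and the function names are all simplifying assumptions; this is a heuristic that will miss irregular forms, not a proper morphological analysis.

```javascript
// A few common separable prefixes (a real list would be longer).
const SEPARABLE_PREFIXES = new Set([
  "ab", "an", "auf", "aus", "ein", "mit", "nach", "vor", "zu",
]);

// Guess possible infinitives by swapping a conjugation ending for "-en".
function candidateInfinitives(verbForm) {
  const lower = verbForm.toLowerCase();
  const candidates = new Set([lower]);
  for (const ending of ["est", "et", "st", "en", "e", "t"]) {
    if (lower.endsWith(ending) && lower.length > ending.length + 1) {
      candidates.add(lower.slice(0, -ending.length) + "en");
    }
  }
  return [...candidates];
}

// clauseWords: the words of one clause, punctuation already removed.
// Returns the recombined verb and its translations, or null.
function findSeparableVerb(clauseWords, clickedIndex, dict) {
  if (clauseWords.length < 2 || clickedIndex === clauseWords.length - 1) {
    return null;
  }
  const prefix = clauseWords[clauseWords.length - 1].toLowerCase();
  if (!SEPARABLE_PREFIXES.has(prefix)) return null;
  for (const infinitive of candidateInfinitives(clauseWords[clickedIndex])) {
    const combined = prefix + infinitive;
    if (dict.has(combined)) {
      return { infinitive: combined, translations: dict.get(combined) };
    }
  }
  return null;
}

// Hypothetical dictionary entry for the example clause "Ich rufe dich an".
const verbDictionary = new Map([["anrufen", ["to call (by phone)"]]]);
console.log(findSeparableVerb(["Ich", "rufe", "dich", "an"], 1, verbDictionary));
```

A page would presumably offer the recombined verb alongside, not instead of, the plain translation of the clicked word, since the heuristic can produce false matches.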

