
Using xyaku with Mozilla

THE INFORMATION ON THIS PAGE IS OBSOLETE

xyaku is much easier to use if you convert your system to UTF-8; see my UTF-8 page. Convert the EDICT dictionary file itself from EUC-JP to UTF-8 using iconv, and xyaku works without the workaround described on this page. Select a word in any program, including modern versions of Mozilla, and xyaku displays the translation(s), because copy/paste from Mozilla no longer produces the 'quoted' form (e.g. \x{3042} for あ).
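
The iconv conversion mentioned above might look roughly like the sketch below. The helper name to_utf8 and the sample file are assumptions made for illustration; the demonstration runs on a one-line sample rather than on the real dictionary.

```shell
#!/bin/sh
# Hypothetical helper: convert an EUC-JP file to UTF-8.
# Usage: to_utf8 SRC DST
to_utf8() {
    iconv -f EUC-JP -t UTF-8 "$1" > "$2"
}

# Build a one-line sample in EUC-JP: bytes A4 A2 (octal 244 242)
# are hiragana "a", followed by an invented gloss.
tmp=$(mktemp -d)
printf '\244\242 /sample entry/\n' > "$tmp/sample.euc"

to_utf8 "$tmp/sample.euc" "$tmp/sample.utf8"

# Show the first three bytes of the result in hex (expected: e38182).
first=$(od -An -tx1 "$tmp/sample.utf8" | tr -d ' \n' | cut -c1-6)
printf '%s\n' "$first"
```

For the real dictionary the call would be something like to_utf8 /usr/share/edict/edict edict.utf8; the file name inside /usr/share/edict is an assumption, so check what your installation actually contains.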
 

1. Introduction

xyaku is a Japanese-English dictionary utility for Linux, written by Seiichiro Inoue. It is a front end for the EDICT dictionary: you highlight a word on the screen (either English or Japanese) and press a hotkey, and a window appears showing the EDICT entries that contain the selected word.

This works for English words, and for Japanese words displayed in a traditional encoding, i.e. JIS or EUC, as for instance in kterm. Modern versions of Mozilla, however, display Japanese using Unicode. If you select, for instance, a hiragana 'A' from the screen, you get this:

\x{3042}
instead of the EUC code for hiragana A (which looks like ¤¢ on a Latin-1 terminal). EDICT, which is EUC-coded, cannot be searched with Unicode strings. This effectively rules out using xyaku (and other EDICT-based programs, like xjdic) with Mozilla.
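
To make the mismatch concrete, the sketch below dumps the bytes of hiragana A in both encodings. iconv is used here purely as a stand-in for illustration, and the helper name hex_bytes is made up.

```shell
#!/bin/sh
# Hypothetical helper: print stdin as one run of lowercase hex digits.
hex_bytes() {
    od -An -tx1 | tr -d ' \n'
}

# Hiragana "a" (U+3042) in UTF-8: bytes E3 81 82 (octal 343 201 202).
utf8=$(printf '\343\201\202' | hex_bytes)

# The same character converted to EUC-JP, the encoding EDICT uses.
euc=$(printf '\343\201\202' | iconv -f UTF-8 -t EUC-JP | hex_bytes)

printf 'UTF-8:  %s\n' "$utf8"   # e38182
printf 'EUC-JP: %s\n' "$euc"    # a4a2
```

The EUC-JP bytes A4 A2 are exactly the ¤ and ¢ characters that a Latin-1 terminal shows.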

Here I present a solution for xyaku which involves two intermediate steps:

  1. Converting the quoted Unicode code point to UTF-8 with a small utility which I wrote. This produces 3 bytes for each Japanese character. For instance, hiragana A (=\x{3042}) becomes E3 81 82.
  2. Converting the UTF-8 to EUC using a program called lv, written by Tomio Narita.
The result is then passed to EDICT by xyaku, and the dictionary look-up works.
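
The arithmetic behind step 1 can be checked by hand. The sketch below is not the source of my converter; utf8_hex is a hypothetical helper that only handles code points from U+0800 to U+FFFF, the range that needs three UTF-8 bytes and covers kana and the common kanji.

```shell
#!/bin/sh
# Hypothetical helper: print the three UTF-8 bytes (in hex) for a code
# point given as hex digits, valid for U+0800..U+FFFF only.
utf8_hex() {
    cp=$((0x$1))
    b1=$(( 0xE0 | (cp >> 12) ))          # 1110xxxx: top 4 bits
    b2=$(( 0x80 | ((cp >> 6) & 0x3F) ))  # 10xxxxxx: middle 6 bits
    b3=$(( 0x80 | (cp & 0x3F) ))         # 10xxxxxx: low 6 bits
    printf '%02X %02X %02X\n' "$b1" "$b2" "$b3"
}

utf8_hex 3042   # hiragana A: E3 81 82
```

Code points outside that range need one, two or four bytes and are not handled by this sketch.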

2. Getting the programs

  1. Get EDICT. On Debian, run "apt-get install edict". The EDICT dictionary file itself will end up in /usr/share/edict.
  2. Get xyaku from its homepage. Get the source, then compile and install it according to the instructions on the page. Debian has an xyaku package, but it contains an old version; compiling from source is better.
  3. Get lv from its homepage. Download the newest version, then compile and install it according to the instructions on the page. On Debian, "apt-get install lv" may be sufficient, but I have not tried this myself.
  4. Get the Unicode-UTF8 converter (unuquote.c) from here. Compile with:
    cc -Wall -o unuquote unuquote.c
    and put the resulting binary in /usr/local/bin.
There may be other solutions for steps 3 and 4: simpler ones in the case of 3, better ones in the case of 4!

3. Changing xyaku's configuration

  1. In ~/.xyakurc change the line
    Key F1 C true 0 0 0 edict.sh
    
    to
    Key F10 M true 0 0 0 edict2.sh
    
    The 'hotkey' for translation is now ALT-F10, one of the few keys not already claimed by KDE. The translation script is now edict2.sh instead of edict.sh (we still have to create edict2.sh). Also, comment out all other lines starting with Key or AutoKey. Finally, comment out the line
    ModulePath  /home/inoue/work/xyaku/addin
    
  2. Go to the directory where xyaku's translation scripts are (in my case /usr/local/libexec/xyaku). Become root, and copy edict.sh to edict2.sh. Edit edict2.sh as follows: change
    # target word from standard input
    read target
    if [ -z $target ]
    then
            exit 0
    fi
    
    to
    # target word from standard input
    read -r targetA
    if [ -z "$targetA" ]
    then
            exit 0
    fi
    target=`echo "$targetA"|unuquote|lv -Tut8 -Oej|cat`
    
    Mind the backquotes! Mind the -r option on the read command (this protects the backslashes in the Unicode sequences). We also change the name of the 'target' string temporarily.
With this, running xyaku with
LANG=ja_JP xyaku &
should work with words selected from the Mozilla screen. It also still works with words selected from kterm. xyaku's dictionary lookup is somewhat slow, perhaps because of these extra filters. Anyway, enjoy.
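
The effect of the -r option on read can be checked in isolation, without xyaku or any of the filters. A minimal sketch, with made-up variable names:

```shell
#!/bin/sh
# Feed the same quoted sequence to read with and without -r.
sample='\x{3042}'

# Without -r, read treats the backslash as an escape character and drops it.
plain=$(printf '%s\n' "$sample" | { read word; printf '%s' "$word"; })

# With -r, the backslash is kept, so the sequence stays intact.
raw=$(printf '%s\n' "$sample" | { read -r word; printf '%s' "$word"; })

printf 'read:    %s\n' "$plain"   # x{3042}
printf 'read -r: %s\n' "$raw"     # \x{3042}
```

Without the backslash the sequence no longer has the quoted form, so it presumably reaches the converter as plain text.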
