User's Guide For Simplified Chinese Input Methods




Table of Contents:

I. Introduction

II. Input Window Areas

Preedit Area
Status Area
Lookup Choice Area
Auxiliary Window


III. Basic Functions For Simplified Chinese Input Methods

1. Opening and Closing Input Methods
2. Selecting An Input Method
3. Toggling Between Input Methods
4. Switching Between Half_width Character Mode and Full_width Character Mode
5. Switching Between Chinese Punctuation Mode and English Punctuation Mode
IV. Utilities For Simplified Chinese Input Methods
1. Selecting the Utility Menu
2. Input Method Selection Tool
3. Input Method Options Setting Tool
4. Lookup Table
Lookup table with native encoding
Lookup table with UNICODE encoding
Lookup table for special characters
5. Virtual Keyboard
PC Keyboard
Greek Characters Lookup Keyboard
Russian Characters Lookup Keyboard
ZhuYin Characters Lookup Keyboard
Chinese Punctuation Characters Lookup Keyboard
Number Symbols Lookup Keyboard
Mathematical Symbols Lookup Keyboard
Table Symbols Lookup Keyboard
Special Symbols Lookup Keyboard
6. User-Defined Character (UDC)
7. Input Method Help


V. Function Specification for Simplified Chinese Input Methods

1. ASCII Input Mode
2. New QuanPin and New ShuangPin Input Mode
3. QuanPin Input Mode
4. ShuangPin Input Mode
5. English_Chinese Input Mode
6. NeiMa Input Mode
7. Wubi Input Method


VI. CodeTable Input Method Interface

1. Introduction
2. Creating a Codetable
3. Convert the codetable text file to binary format
4. Convert the binary codetable file to text format
5. Creating a new codetable input method

 
 

I. Introduction

Solaris 9 provides friendlier and more extensible input methods and management tools for Chinese Solaris users, including users in Mainland China, Hong Kong, and Taiwan. An input method auxiliary window has also been developed, which supports the following new functions and utilities:

Two kinds of input methods are supported:

Input methods with an auxiliary window support all the Simplified Chinese locales:

In all Simplified Chinese locales, the following input methods are supported:

II. Input Window Areas

Four separate areas of an application subwindow are involved in entering characters. These areas are typically displayed, named, and used as follows:

Preedit Area: Highlighted (such as inverse video or underlined) entry display area.
Status Area: Indicating the current input/conversion mode.
Lookup Choice Area: Displaying multiple character choices.
Auxiliary Windows: Utilities for input method management.

input window areas

Preedit Area

The highlighted preedit area (for example, reverse video and underlined) displays characters as they are typed or converted. It displays text characters before they are converted to Simplified Chinese characters or symbols and put into the applications.

Status Area

The status area shows which input conversion mode is in effect. In the above example, it is located in the lower left corner of the window margin.

Lookup Choice Area

The lookup choice area displays multiple Simplified Chinese or special symbol character choices available for conversion of the character(s)/radical(s) in the preedit area. In the above example, it is a pop-up.

Auxiliary Windows

The auxiliary windows provide tools and utilities to manage input methods or to make the input easier.
 

 

III. Basic Functions For Simplified Chinese Input Methods

1. Opening and Closing Input Methods

Press [Control+Spacebar] to open the input methods. An auxiliary window appears as below:

Input bar
 

Press [Control+Spacebar] again to close the input methods. The auxiliary bar disappears.
 
 

2. Selecting An Input Method

In the Chinese status window, press the function key "F2" to switch to the first input method, "F3" to switch to the second, "F4" to switch to the third, and so on.
Alternatively, click the input method selection button on the auxiliary bar to display the input method selection menu as below:

Input method list menu

And then select the input method you want to use.
 
 

3. Toggling Between Input Methods

In the Chinese status window, press [Control+Escape] to toggle between input methods.
 
 

4. Switching Between Half_width Character Mode and Full_width Character Mode

In the Chinese status window, press [Shift+Spacebar] to switch between Half_width Character Mode and Full_width Character Mode, or click the Half_width/Full_width button on the auxiliary bar.

The input method system is in Full_width Character Mode when the button appears as below:
FullWidth Mode
The input method system is in Half_width Character Mode when the button appears as below:
HalfWidth Mode

In Full_width mode, the full-width form of each typed character is committed to the application.

For example, typing 'a' in Full_width mode commits the full-width character of 'a' to the application, as shown below:

Commit Fullwidth Character
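
The half-width to full-width relationship can be sketched in a few lines of Python. This is only an illustration and not the input method's actual code: it assumes the full-width forms are the Unicode characters U+FF01 through U+FF5E (each printable ASCII code plus 0xFEE0) and that the full-width space is U+3000. The function name to_fullwidth is made up for this example.

```python
def to_fullwidth(text: str) -> str:
    """Map printable ASCII characters to their Unicode full-width forms."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch == " ":
            out.append("\u3000")            # ideographic (full-width) space
        elif 0x21 <= code <= 0x7E:
            out.append(chr(code + 0xFEE0))  # U+FF01..U+FF5E
        else:
            out.append(ch)                  # leave everything else unchanged
    return "".join(out)


print(to_fullwidth("a"))  # prints the full-width form of 'a'
```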
 
 

5. Switching Between Chinese Punctuation Mode and English Punctuation Mode

In the Chinese status window, press [Control+.] to switch between Chinese Punctuation Mode and English Punctuation Mode, or click the Chinese/English Punctuation button on the auxiliary bar.

The input method system is in Chinese Punctuation Mode when the button appears as below:

Chinese Punctuation Mode

The input method system is in English Punctuation Mode when the button appears as below:

English Punctuation Mode

When you type a punctuation key in Chinese Punctuation mode, the corresponding Chinese punctuation character is committed to the application.

For example, when you type "$" in Chinese Punctuation mode, the Simplified Chinese currency symbol character is committed to the application, as shown below:

Chinese Punctuation Mode

The punctuation keys include these characters: , . / <> :;'"\$!^&_-

And the mapping between English and Chinese punctuation is as follows:

Chinese Punctuation Map
 


IV. Utilities For Simplified Chinese Input Methods

Solaris 9 provides auxiliary tools and utilities that help users manage input methods, set input method properties, and enter special characters more easily.

The following tools are supported:

1. Selecting the Utility Menu

Click the utilities menu button on the auxiliary bar to display the utilities menu as below:

Utilities Menu

And select one of the input method tools from the menu. 



2. Input Method Selection Tool

The input method selection tool allows you to select a list of input methods. You can also set the default input method and the sequence of the input methods.

Click the input method selection item from the utilities menu, and the input method selection panel appears as below:

input method selection

After selecting some input methods and clicking "OK" or "Apply", the setting will be activated. The first input method selected becomes the default input method.
Press [Control+Spacebar] in the application window to activate Chinese input. The default input method will be selected as the current input method.

Press "F2" to switch to the first selected input method, "F3" to switch to the second selected one, "F4" to switch to the third selected one, and so on.
 
 


3. Input Method Options Setting Tool

Click the input method options setting item from the utilities menu, and the input method options setting panel appears as follows:

Option Setting

With the options setting tool, you can set input method options. After you set the options in this panel and click "OK" or "Apply", the settings take effect.

For input methods based on code table structure, there are 4 options that can be set as described below:

This option helps users learn the input method, for example by showing the external codes of a Chinese character in that input method.
 
  • "Automatically commit if only one candidate"


    4. Lookup Table

    The lookup table tools can be used to search for Chinese characters. Select a Chinese character, then double-click it or click the "OK" button to commit the selected character.

    Three kinds of lookup tables are provided:

    In the zh/zh_CN/zh_CN.EUC locales, a lookup table with GB2312 encoding is provided; in the zh_CN.GBK locale, a lookup table with GBK encoding is provided; and in the zh_CN.GB18030 locale, a lookup table with GB18030 encoding is provided.

    The lookup table panel with GB18030 encoding appears as below:

    Native language Lookup Table
     

    The lookup table panel with UNICODE encoding appears as below:

    Unicode Lookup Table
     

    The lookup panel for special characters, such as Greek characters and mathematical symbols, appears as below:

    Special Symbols Lookup Table
     


    5. Virtual Keyboard

    The virtual keyboard tools can be used as lookup utilities to simplify the input of special symbols. They can also display the keyboard mapping of input methods that are based on radicals, such as the Wubi input method, which is a typical input method based on Chinese radicals.
     

    The Simplified Chinese Environment supports the following virtual keyboards:

    The virtual keyboards let you input a character by clicking the corresponding button on the virtual keyboard. The PC Keyboard appears as below:

    PC virtual keyboard
     

    You can click the keyboard button on the auxiliary bar and select the "Greek Keyboard" item. The Greek character keyboard appears as below:

    Greek virtual keyboard
     

    You can click the keyboard button on the auxiliary bar and select the "Russia Keyboard" item. The Russian character keyboard appears as below:

    Russia virtual keyboard
     

    You can click the keyboard button on the auxiliary bar and select the "ZhuYin Keyboard" item. The ZhuYin character keyboard appears as below:

    ZhuYin virtual keyboard
     

    You can click the keyboard button on the auxiliary bar and select the "Chinese Punctuation Keyboard" item. The Chinese punctuation keyboard appears as below:

    Chinese punctuation virtual keyboard
     

    You can click the keyboard button on the auxiliary bar and select the "Number Symbol Keyboard" item. The number symbol keyboard appears as below:

    number symbol virtual keyboard
     

    You can click the keyboard button on the auxiliary bar and select the "Mathmatic symbols Keyboard" item. The mathematical symbols keyboard appears as below:

    math symbol virtual keyboard
     

    You can click the keyboard button on the auxiliary bar and select the "Table Symbol Keyboard" item. The table symbol keyboard appears as below:

    Table symbol virtual keyboard
     

    You can click the keyboard button on the auxiliary bar and select the "Special Symbol Keyboard" item. The special symbols keyboard appears as below:

    Special symbol virtual keyboard
     


    6. User-Defined Character (UDC)

    The UDC editor tool allows you to draw and save new characters. After you assign a character to an input method, it can be input and displayed in an application.

    Click the user-defined character item in the utilities menu to invoke the UDC tool, which appears as below:

    UDC
     
     


    7. Input Method Help

    Select the input method help item from the utilities menu. A browser such as Netscape or HotJava appears and displays the help information for input methods.
     

     

    V. Function Specification for Simplified Chinese Input Methods

    The following input methods and conversion modes are available for entering ASCII/English, Simplified Chinese and other text:

    In zh/zh_CN/zh_CN.EUC locales:

    In zh.GBK/zh_CN.GBK locales:

    In zh_CN.GB18030/zh.UTF-8/zh_CN.UTF-8 locales:


    Press [Control+Spacebar] to toggle Chinese input conversion on or off.
    Press [Control+Escape] to toggle through Chinese input modes.
     
     


    1. ASCII Input Mode

    Each application starts in ASCII input mode, which you can toggle on or off by pressing [Control+Spacebar]. You can use this mode to type ASCII characters, as shown below:

    Input ASCII code

    In ASCII input mode, the ASCII status is displayed in the window's status area. When ASCII input mode is off, the current conversion mode symbol appears instead.
      



    2. New QuanPin and New ShuangPin Input Mode

    This section describes the features in the New QuanPin and New ShuangPin input methods, and how to use some of the features in the zh/zh_CN.EUC/zh_CN.GBK/zh_CN.GB18030/zh_CN.UTF-8 locales.

    PinYin is a popular input method in the PRC, and there are various PinYin-based input methods. Two of them, New QuanPin and New ShuangPin, contain the following features:

    These features are described in detail in the following sections.

    (1). Defining Phrases for Later Use

    The following example shows how to define the phrase "ke lin dun" and store it for later use.

    Type the phrase "kelindun" without spaces. The New QuanPin and New ShuangPin input methods will insert spaces for you automatically:

    NewPinYin input

    Then type the number representing the first character you want to select. The following example shows the second candidate selected:

    NewPinYin input

    Input the second and third characters of the phrase in the same way, as below:

    NewPinYin input

    The new phrase is then defined and added to the user dictionary file. The next time you type "ke lin dun", the phrase you defined appears in the lookup choice area:

    NewPinYIn
     
     

    (2). Selecting Frequently-Used Candidates

    Candidates that you have selected before are presented at the beginning of the candidate list so that they can be found more readily.

    The following example shows how this works:

    Type "sh yi". Notice the order of the five available candidates:

    NewPinYin input

    Then select the fifth candidate and type "sh yi" again:

    NewPinYin input

    Notice that the fifth candidate has moved to the first position because you previously selected it. In this way, frequently used candidates are promoted for faster selection.
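
The promotion behaviour in this example can be sketched as a simple move-to-front step. This is only an assumption about the policy, written to match the example above (the selected candidate moves to the first position and the others keep their order); the real input method may weigh frequencies differently. The names promote, c1 ... c5 are placeholders.

```python
def promote(candidates, chosen_index):
    """Return a new list with the chosen candidate moved to the front."""
    chosen = candidates[chosen_index]
    rest = candidates[:chosen_index] + candidates[chosen_index + 1:]
    return [chosen] + rest


# Placeholder labels stand in for the five candidates of "sh yi".
candidates = ["c1", "c2", "c3", "c4", "c5"]
candidates = promote(candidates, 4)   # the user picks the fifth candidate
print(candidates)
```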
     

    (3). Inputting Long PinYin Strings

    The New QuanPin input method accepts PinYin strings up to 222 Chinese characters long.

    The following example shows how to input a long Chinese phrase:

    >>meiguoztongkelindunzhengzaitaolunhaiwanjushiwenti<<

    NewPinYin

    The result is the following Chinese phrase:

    long sentence
     

    (4). Inputting Phrases with ShengMu

    You can also input a Chinese phrase by typing only the ShengMu, as shown in the following example:

    NewPinYin
     
     

    (5). GBK Support

    In the zh.GBK/zh_CN.GBK locales, the New QuanPin and New ShuangPin input methods support GBK by default, as shown in the following example:

    NewPinYin

    The second Chinese character in the phrase shown above is defined only in the GBK standard.

    Single GBK candidates are placed at the end of the list of GB2312 candidates. Press [Return] to scroll to the GBK area. For easier selection next time, you can define the GBK candidate as a phrase (for more information, see Defining Phrases for Later Use). Once a phrase is defined, you can input it easily.

    Both New QuanPin and New ShuangPin support GBK Chinese characters by default in the zh.GBK/zh_CN.GBK locales. However, because many Chinese characters share the same ShengMu (the first part of the Pinyin), New QuanPin and New ShuangPin do not display GBK candidates if you provide only the ShengMu.

    For example, typing the string "rong" displays GBK candidates because it is a complete Pinyin string. However, typing "r" alone does not display any GBK candidates because it is only a ShengMu.
     
     

    (6). Keyboard Definition

    The following table shows the definitions of the edit keys.

    Key         Definition
    [a-z]       PinYin character
    Home        Moves to the start of the preedit line.
    End         Moves to the end of the preedit line.
    Left        Moves the caret in the preedit line to the left. If the item to the left is a Chinese character, the original PinYin is recovered.
    Right       Moves the caret in the preedit line to the right.
    Delete      Deletes the PinYin character following the caret on the preedit line.
    Backspace   Deletes the PinYin character preceding the caret on the preedit line.

    The candidates of a Pinyin string belong to the following groups:

    G1 - Highest frequency Hanzi + Long (3 or more) Cizu + Double Chinese Cizu

    G2 - GB Single Hanzi

    G3 - GBK Single Hanzi (in the zh_CN.GBK locale)

    Some Pinyin strings may have more candidates than can be displayed in the same window. In that case, use the keys described in the following table to scroll through the candidates.

    Page Scroll Key Definitions

    Key      Definition
    - =      Scrolls to previous/next candidate(s)
    [ ]      Scrolls to previous/next candidate(s)
    , .      Scrolls to previous/next candidate(s)
    Return   Quickly scrolls through all candidates

    New QuanPin and New ShuangPin use the numeric selection keys. In accord with the national Pinyin standard, the separator (') is supported to avoid ambiguous interpretations of Pinyin strings.

    For example, the Pinyin string [jiang] can be interpreted as [jiang] or [ji][ang], and both are valid.  In New QuanPin,
    however, [jiang] is interpreted only as [jiang]. You must use the separator and enter [ji'ang] for it to be interpreted as
    [ji] and [ang]. New ShuangPin does not require the use of separators.
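
The effect of the separator can be illustrated with a small segmentation sketch. It assumes a greedy longest-match strategy over a tiny, hand-picked syllable set; this is an assumption chosen to reproduce the behaviour described above, not the documented algorithm of New QuanPin. SYLLABLES, MAX_LEN, and segment are names invented for this example.

```python
# Tiny illustrative subset of Pinyin syllables; a real table is much larger.
SYLLABLES = {"ji", "ang", "jiang", "ke", "lin", "dun"}
MAX_LEN = 6  # length of the longest Pinyin syllable, such as "zhuang"


def segment(pinyin: str) -> list[str]:
    """Split a Pinyin string into syllables, honouring the ' separator."""
    result = []
    for chunk in pinyin.split("'"):       # a separator forces a boundary
        i = 0
        while i < len(chunk):
            # Try the longest possible syllable first (greedy match).
            for length in range(min(MAX_LEN, len(chunk) - i), 0, -1):
                piece = chunk[i:i + length]
                if piece in SYLLABLES:
                    result.append(piece)
                    i += length
                    break
            else:
                raise ValueError(f"cannot segment {chunk[i:]!r}")
    return result


print(segment("jiang"))     # one syllable
print(segment("ji'ang"))    # two syllables, forced by the separator
print(segment("kelindun"))  # spaces inserted automatically
```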
     
     

    (7).  Dictionary Files

    New QuanPin and New ShuangPin share two dictionary files, PyCiku.dat and UdCiku.dat. They are installed as:
    /usr/lib/im/locale/zh_CN/data/PyCiku.dat
    /usr/lib/im/locale/zh_CN/data/UdCiku.dat

    Users cannot normally write to these files. However, because users can affect the way New QuanPin and New ShuangPin work through features such as frequency adjustment and user-defined phrases, the dictionary files need to be updated frequently.

    A user's dictionary is normally located in ~/.Xlocale/PyCiku.dat or ~/.Xlocale/UdCiku.dat (~ indicates the home directory of the user who starts the htt command). When New QuanPin and New ShuangPin are started, they locate and read the dictionary files in the user's home directory. If the user dictionary files are not found, the system default dictionary files are used (that is, /usr/lib/im/locale/zh_CN/data/... ).
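
The lookup order described above can be sketched as follows. The two directory locations come from this section; the function name resolve_dictionary and the idea of passing the home directory as a parameter are assumptions made for the example, and the sketch does not claim to mirror how the input method server is implemented.

```python
from pathlib import Path

# System default location named in this section.
SYSTEM_DIR = Path("/usr/lib/im/locale/zh_CN/data")


def resolve_dictionary(name: str, home: Path) -> Path:
    """Prefer the user's copy in ~/.Xlocale, else fall back to the system file."""
    user_copy = home / ".Xlocale" / name
    return user_copy if user_copy.is_file() else SYSTEM_DIR / name


for name in ("PyCiku.dat", "UdCiku.dat"):
    print(resolve_dictionary(name, Path.home()))
```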
     
     

    (8).  New ShuangPin Features

    ShuangPin is an abbreviated form of QuanPin. It is faster but more difficult to use than QuanPin. New ShuangPin supports all of the features, keyboard definitions, and dictionary files of New QuanPin.

    There are various ShuangPin keyboard mapping designs in PRC. The most popular three are ZiRanMa, Chinese Star, and Intelligent_ABC. The New ShuangPin input method supports all three of these keyboard mappings.

    The following table contains the keyboard mappings for the ZiRanMa keyboard. The first three rows are ShengMu keys; the remaining rows are YunMu keys.

    Key   Definition
    i     ch
    u     sh
    v     zh
    a     a
    b     ou
    c     iao
    d     uang, iang
    e     e
    f     en
    g     eng
    h     ang
    i     i
    j     an
    k     ao
    l     ai
    m     ian
    n     in
    o     o, uo
    p     un
    q     iu
    r     uan, er
    s     iong, ong
    t     ue
    u     u
    v     v, ui
    w     ua, ia
    x     ie
    y     uai, ing
    z     ei

     
     


    3. QuanPin Input Mode

    The QuanPin input method requires up to 12 keystrokes to type each Chinese PinYin character. QuanPin maps PinYin phonetics to single lowercase Roman letters. You can type the PinYin of the Chinese character to input this character.

    A lookup area shows the characters that match the QuanPin keystrokes. If more than one character matches the keystroke sequence, press the period (.) or [PageDown] key to display the next page of candidates, and press the comma (,) or [PageUp] key to display the previous page of candidates. Select the Chinese character you want by typing the corresponding number key.

    This section describes how to use the QuanPin input method to input Chinese characters.

    (1). Open a new Terminal and press [Control+Spacebar] to turn on Chinese input conversion.

    (2). Press F5 to turn on QuanPin input mode, or click the Input method selection button on the auxiliary window and select QuanPin input method. The status area shows that QuanPin input mode is on, as below:

    QuanPin Status

    (3). Type zhang.

    The QuanPin input converter finds six matching characters and a lookup choice appears as below:

    input with QuanPin

    (4). Type a number key to select the appropriate character, such as '1' to select the first candidate. The application appears as below:

    input with QuanPin
     
     


    4. ShuangPin Input Mode

    The ShuangPin input method requires up to 12 keystrokes to type each Chinese PinYin character. Its input rule is a simplified version of the QuanPin input rule: each Chinese character is represented by only two PinYin letters, one for the ShengMu (the first part of the PinYin) and one for the YunMu (the remaining part), so you can input a Chinese character with just two keystrokes.

    For example, for the Chinese character [Example Hanzi], the QuanPin representation is "zhang", while its ShuangPin representation is "vh".

    The following table defines the keyboard mappings for the ShuangPin rule.

    Key   Definition
    i     ch
    u     sh
    v     zh
    a     a
    b     b
    c     iao
    d     uang, iang
    e     e
    f     en
    g     eng
    h     ang
    i     i
    j     an
    k     ao
    l     ai
    m     ian
    n     in
    o     o, uo
    p     un
    q     iu
    r     uan, er
    s     iong, ong
    t     ue
    u     u
    v     v, ui, ue
    w     ua, ia
    x     ie
    y     uai
    z     ei
    ;     ing
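
As a rough illustration of how a two-key ShuangPin code expands to QuanPin, the sketch below uses a few rows copied from the table above. It is not the input method's implementation: the names SHENGMU_KEYS, YUNMU_KEYS, and expand are invented here, only a subset of keys is included, and the sketch returns every combination without checking whether it is a valid Pinyin syllable (a real converter would filter the candidates).

```python
# ShengMu keys that stand for two-letter initials (from the table above).
SHENGMU_KEYS = {"i": "ch", "u": "sh", "v": "zh"}

# A subset of the YunMu keys from the table above.
YUNMU_KEYS = {
    "h": ["ang"],
    "j": ["an"],
    "k": ["ao"],
    "l": ["ai"],
    "d": ["uang", "iang"],
    "s": ["iong", "ong"],
    "o": ["o", "uo"],
}


def expand(keys: str) -> list[str]:
    """Expand a two-key ShuangPin code into candidate QuanPin spellings."""
    if len(keys) != 2:
        raise ValueError("a ShuangPin code has exactly two keys")
    first, second = keys
    shengmu = SHENGMU_KEYS.get(first, first)  # other keys map to themselves
    return [shengmu + yunmu for yunmu in YUNMU_KEYS[second]]


print(expand("vh"))  # the example from this section: "vh" stands for "zhang"
```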

    You can use the ShuangPin input method to type individual Chinese characters in the zh_CN.EUC, zh_CN.GBK, and zh_CN.GB18030 locales.

    A lookup area shows the characters that match the ShuangPin keystrokes. If more than one character matches the keystroke sequence, press the period (.) or [PageDown] key to display the next page of candidates, and press the comma (,) or [PageUp] key to display the previous page of candidates. Select the Chinese character you want by typing the corresponding number key.

    This section describes how to use the ShuangPin input method to input Chinese characters.

    (1). Open a new Terminal, type [Control+Spacebar ] to turn on Chinese input conversion.

    (2). Press F6 to turn on ShuangPin input mode, or click the Input method selection button on the auxiliary window and select ShuangPin input method. The status area shows that ShuangPin input mode is on, as below:

    ShuangPin Status

    (3). Type vh.

    The ShuangPin input converter finds six matching characters and a lookup choice appears as below:

    input with ShuangPin

    (4). Type a number key to select the appropriate character, such as '1' to select the first candidate. The application appears as below:

    input with ShuangPin
     
     
     


    5. English_Chinese Input Mode

    The English_Chinese input method requires up to 15 keystrokes to type each Chinese word. An English word can map to several Chinese phrases. The lookup area lists all the Chinese phrases that match the English word, and each Chinese phrase is followed by the English word.

    If more than one Chinese phrase matches the English word, press the period (.) or [PageDown] key to display the next page of candidates, and press the comma (,) or [PageUp] key to display the previous page of candidates. Select the Chinese phrase you want by typing the corresponding number key.

    The following steps show how to use this input method to type the Chinese phrase representing the English word "hello". The word requires five keystrokes.

    (1). Open a new Terminal, type [Control+Spacebar ] to turn on Chinese input conversion.

    (2). Press F7 to turn on English_Chinese input mode, or click the Input method selection button on the auxiliary window and select English_Chinese input method. The status area shows that English_Chinese input mode is on, as below:

    English_Chinese status

    (3). Type hello, as follows:

    input with English_Chinese

    (4). Type a number key to select the appropriate phrase, such as '1' to select the first candidate. The application appears as below:

    input with English_Chinese

    (5). Wildcard characters ( * or ? ) can be used to search the dictionary. '*' stands for one or several letters, and '?' stands for exactly one letter. For example, to search for all English words that end with 'lution', input '*lution'. The lookup choices appear as shown below:

    English_Chinese Wild character

    To search for all three-letter English words that begin with 'c', input 'c??'. The lookup choices appear as below:

    English_Chinese wild character
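
The wildcard rules above can be sketched with a regular expression. The translation follows the wording of step (5), where '*' stands for one or several letters and '?' for exactly one; the word list, and the names wildcard_to_regex and search, are hypothetical and used only for illustration.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate the * and ? wildcards into a regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append("[a-z]+")   # one or several letters
        elif ch == "?":
            parts.append("[a-z]")    # exactly one letter
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def search(pattern: str, words: list[str]) -> list[str]:
    """Return the words that match the whole wildcard pattern."""
    regex = wildcard_to_regex(pattern)
    return [word for word in words if regex.fullmatch(word)]


# Hypothetical dictionary entries for illustration only.
WORDS = ["solution", "revolution", "lution", "cat", "cup", "cake"]
print(search("*lution", WORDS))
print(search("c??", WORDS))
```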
      



    6. NeiMa Input Mode

    In the zh/zh_CN.EUC locales, the GB2312 NeiMa input method is available; in the zh_CN.GBK locale, the GBK NeiMa input method is available; and in the zh_CN.GB18030 locale, the GB18030 NeiMa input method is available. The NeiMa input method uses the internal code to input Chinese characters. Each Chinese character or symbol is identified by an internal code of four or eight hexadecimal digits.

    This section describes how to use the GB2312 internal codes to input Chinese characters and symbols in the zh and zh_CN.EUC locales.

    (1). Open a new Terminal, type [Control+Spacebar ] to turn on Chinese input conversion.

    (2). Click the Input method selection button on the auxiliary window and select GB2312 NeiMa input method. The status area shows that GB2312 NeiMa input mode is on, as below:

    GBCode Status

    (3). Press the first three of the four keys that represent a character. For example, for b0a1, the input appears as below:

    input with GB2312Code

    (4). Type the fourth key, '1'. The character is automatically committed to the application, as below:

    input with GB2312Code
     

    This section describes how to use the GBK internal codes to input Chinese characters and symbols in the zh.GBK/zh_CN.GBK locales.

    (1). Open a new Terminal and press [Control+Spacebar] to turn on Chinese input conversion.

    (2). Click the Input method selection button on the auxiliary window and select the GBK NeiMa input method. The status area shows that GBK NeiMa input mode is on, as below:

    GBKCode Status

    (3). Press the first three of the four keys that represent a character. For example, for 8141, the input appears as below:

    input with GBKCode

    (4). Type the fourth key, '1'. The character is automatically committed to the application, as below:

    input with GBKCode
     
     

    This section describes how to use the GB18030 internal codes to input Chinese characters and symbols in the zh_CN.GB18030 and zh_CN.UTF-8 locales.

    (1). Open a new Terminal and press [Control+Spacebar] to turn on Chinese input conversion.

    (2). Click the Input method selection button on the auxiliary window and select the GB18030 NeiMa input method. The status area shows that GB18030 NeiMa input mode is on, as below:

    GB18030 Status

    (3). To input a Chinese character with a 2-byte GB18030 internal code, for example 0x8141, press the first three keys, as below:

    input with GB18030 Code

    (4). Type the fourth key, '1'. The character is automatically committed to the application, as below:

    input with GB18030 Code

    (5). To input a Chinese character with a 4-byte GB18030 internal code, for example 0x8139ef30, press the first seven keys, as below:

    input GB18030 8_byte character

    (6). Type the last key, '0'. The character is then committed to the application, as below:

    input GB18030 8_byte character
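
The relationship between an internal code and the character it selects can be checked outside the input method with Python's standard codecs. This is only a sketch: it assumes that Python's gb2312, gbk, and gb18030 codecs agree with the encodings used by the NeiMa input methods for the example codes in this section, and the function name neima_to_char is invented here.

```python
def neima_to_char(hex_code: str, encoding: str) -> str:
    """Decode a hexadecimal internal code into the single character it names."""
    raw = bytes.fromhex(hex_code)
    text = raw.decode(encoding)
    if len(text) != 1:
        raise ValueError(f"{hex_code} does not encode exactly one character")
    return text


# The example codes used in this section.
print(neima_to_char("b0a1", "gb2312"))       # four hexadecimal digits
print(neima_to_char("8141", "gbk"))          # four hexadecimal digits
print(neima_to_char("8139ef30", "gb18030"))  # eight hexadecimal digits
```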


    7.  Wubi Input Method

    The Wubi input method is the most popular input method in China. Its encoding rule is based on the radicals or stroke shapes of Chinese characters.

    Wubi's primary advantage is that users can input characters rapidly, since a Wubi code seldom has more than one candidate to select from. And because the Wubi input method is based on shape, almost every CJK character can be encoded by its encoding rule, which is very difficult for a phonetic-based input method.

    For the Wubi encoding rule, refer to the document 《Tutorial Book For Standard Wubi》.

    The Solaris WangMa Wubi input method supports the following functions:

  •  Support GB18030 charset.
  •  Support Wubi simplified code.
  •  Support Wubi mistake compatible code.
  •  Support three levels of identified code.
  •  Support "z/Z" as help key.
  •  Support phrase input and optional professional phrase libraries.
  •  Support character/phrase association.
  •  Support input method properties setting.

    (1). Support GB18030 charset.

    GB18030 is a Chinese character encoding standard issued in 2000. It is mandatory: products that do not conform to this standard cannot legally be sold in China.

    GB 18030 has the following significant properties:

    GB 18030 encodes characters into one, two, and four bytes. The following are valid byte sequences (byte values are hexadecimal):

    Currently the GB18030 charset includes 27533 Chinese characters, of which 21003 are encoded with two bytes and 6530 with four bytes. The Wubi input method can input all of these Chinese characters.
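
The one-, two-, and four-byte structure can be sketched as a small classifier. The byte ranges in the comments are the commonly documented GB 18030 layout and are supplied here as an assumption, since they are not listed in this guide; the function name gb18030_sequence_length is invented for the example.

```python
def gb18030_sequence_length(data: bytes, pos: int = 0) -> int:
    """Return the length (1, 2, or 4) of the GB 18030 sequence at data[pos]."""
    b1 = data[pos]
    if b1 <= 0x7F:
        return 1                                  # single byte: 00-7F
    if not 0x81 <= b1 <= 0xFE or pos + 1 >= len(data):
        raise ValueError("invalid or truncated lead byte")
    b2 = data[pos + 1]
    if 0x40 <= b2 <= 0xFE and b2 != 0x7F:
        return 2                                  # 81-FE, then 40-7E or 80-FE
    if 0x30 <= b2 <= 0x39 and pos + 3 < len(data):
        b3, b4 = data[pos + 2], data[pos + 3]
        if 0x81 <= b3 <= 0xFE and 0x30 <= b4 <= 0x39:
            return 4                              # 81-FE, 30-39, 81-FE, 30-39
    raise ValueError("invalid GB 18030 byte sequence")


print(gb18030_sequence_length(b"a"))
print(gb18030_sequence_length(bytes.fromhex("8141")))
print(gb18030_sequence_length(bytes.fromhex("8139ef30")))
```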

    In addition, the Wubi input method also supports the GB2312 and GBK charsets.

    The Solaris WangMa Wubi input method divides the GB18030 charset into three levels: GB2312, GBK, and GB18030. The GB2312 level includes 6763 frequently used Chinese characters, the GBK level includes 21003 Chinese characters, and the GB18030 level includes 27533.

    While inputting, the user can switch between these three levels, much like extending or retracting an antenna.
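
To see which of the three levels a given character falls into, one can test which codec is the smallest that can encode it. This sketch assumes that Python's gb2312 and gbk codecs cover the same repertoires as the levels described above, which may not hold exactly; charset_level is a name made up for the example.

```python
def charset_level(ch: str) -> str:
    """Return the smallest level (GB2312, GBK, GB18030) that contains ch."""
    levels = (("GB2312", "gb2312"), ("GBK", "gbk"), ("GB18030", "gb18030"))
    for level, codec in levels:
        try:
            ch.encode(codec)
            return level
        except UnicodeEncodeError:
            continue   # not in this level, try the next larger one
    raise ValueError(f"{ch!r} cannot be encoded in GB18030")


print(charset_level("\u554a"))  # a frequently used character
print(charset_level("\u4e02"))  # a character added in GBK
print(charset_level("\u3401"))  # a character that needs four bytes
```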

    For example, type "gigg" and scroll to the last page of candidates; you will find a GB18030 character there.


     

    (2).  Support Wubi simplified code.

    Some frequently used Chinese characters can be input by pressing only the first one, two, or three radical keys and then the space key.

    Wubi simplified codes are divided into 3 levels:

    + Level 1: The first level includes the 25 most frequently used characters:
                 我人有的和主产不为这工要在地一上是中国经以发了民同
               To input one of these characters, press only the corresponding radical key and the space key.

    + Level 2: These characters are frequently used. To input one, press the first two radical keys and the space key.

    + Level 3: These characters are frequently used. To input one, press the first three radical keys and the space key.

    For example, type "di" and then press the spacebar. The character "耗", whose level-2 simplified code is "di", is input.
     

    (3).   Support Wubi mistake compatible code.

    According to the Wubi rule, each Chinese character has only one Wubi code. However, depending on a user's handwriting habits, some Chinese characters can plausibly be encoded in other ways. These alternative codes are called mistake compatible codes.

    For example, for the character "长", "tayi" is the correct Wubi code, but "atyi" is also accepted as a Wubi code for this character. The user can input the character with either of these two codes.

     
    (4).   Support three levels of identified code.

    Under the Wubi encoding rule, some Chinese characters have an identified code that distinguishes them from other characters with a similar shape.

    For example, according to the Wubi coding rule, "吧" and "邑" have the same code "KC". An identified code can be assigned to each of them to tell them apart. Identified codes are assigned according to the shape or the last radical of the character.

    The Solaris WangMa Wubi input method supports three levels of identified codes:

    +  "A" mode:   every character whose Wubi code is no more than 4 bytes long must be input with an identified code.
    +  "B" mode:   only characters whose shape is left_to_right must be input with an identified code.
    +  "C" mode:   all characters are input without an identified code.
    "A" mode is the default mode.

    For example, when the identified code mode is set to "C" mode, typing "tkg" and the space key lists two Chinese characters, "和" and "程", while in "A" mode only "和" is selected and committed to the application.
     

    (5).   Support "z/Z" as help key.

    When the user does not know the Wubi code of a Chinese character, "z/Z" can be used as a help key to search for it.
    For example: the user can type "azzd" to search for all characters and phrases whose Wubi code begins with "a" and ends with "d".
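
    The help key behaves like a single-key wildcard. The Python sketch below shows one way to implement such a search, assuming that each "z" or "Z" stands for exactly one of the 25 code keys "a" to "y"; the function names and the SAMPLE table with its placeholder words are hypothetical.

```python
import re


def z_pattern(keys):
    """Build a regex in which each 'z' or 'Z' matches any single code key."""
    parts = ["[a-y]" if k.lower() == "z" else re.escape(k) for k in keys]
    return re.compile("".join(parts))


def search(keys, codetable):
    """Return (code, word) pairs whose full code matches the help-key pattern."""
    pattern = z_pattern(keys)
    return [(code, word) for code, word in codetable.items()
            if pattern.fullmatch(code)]


# Hypothetical codes with placeholder words, for illustration only.
SAMPLE = {"abcd": "word1", "axyd": "word2", "abce": "word3", "bbcd": "word4"}
```

    With this table, search("azzd", SAMPLE) keeps only the entries for "abcd" and "axyd".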


     

    (6).   Support phrase input and optional professional phrase libraries.

    The Solaris WangMa Wubi input method supports entering phrases with Wubi codes. Besides the 90000 basic phrases, the Wubi input method also provides 11 professional phrase libraries. The user can activate one of them according to his or her professional domain.

      The professional phrase libraries are as follows (each has about 3000 - 20000 phrases):

  • Transportation
  • Computer
  • Economics and Finance
  • Agriculture
  • Medicine
  • Mineralogy
  • Trade
  • Martial
  • Law
  • Gazetteer
  • Idioms

    For example: when the user chooses the "Medicine" phrase library and types "mino", some medical phrases are listed for selection.


     

    (7).   Support character/phrase association.

     When the user enters a Wubi code that represents a single character and commits it to the application, the phrases that begin with this character are listed in the candidate area for selection.
     
    For example: type "iuxx", and the Chinese character "滋" is automatically committed to the application. After the character appears in the application window, a new candidate window pops up listing the phrases that begin with this character.
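
    Association amounts to a prefix search over the phrase list, triggered by the character that was just committed. The Python sketch below illustrates the idea; the function name and the PHRASES list are hypothetical, and the real phrase libraries are not reproduced here.

```python
def associated_phrases(committed_char, phrases):
    """Return the phrases that begin with the character just committed."""
    return [p for p in phrases
            if p.startswith(committed_char) and len(p) > len(committed_char)]


# Hypothetical phrase list, for illustration only.
PHRASES = ["滋味", "滋润", "味道"]
```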

    (8).   Support input method properties setting.

    The following properties of the Solaris WangMa Wubi input method can be set:

     The input method properties setting panel looks as below:


     
     

    For example:  Switch between the three levels of Chinese character set, as below:


     
     

    For example:  Switch between professional phrase libraries, as below:


     

    For example: Switch between the three levels of identified code,  as below:


     


    VI. CodeTable Input Method Interface

    1. Introduction

    Solaris 9 provides a codetable input method interface that allows Chinese users to add new codetable-based input methods to their system.
     
     

    2. Creating a Codetable

    A codetable file in text format contains several function-specific sections and a list of code-to-word mapping items.

    Here is an example that shows the format of a codetable text file:

    Codetable example
     

    A codetable text file contains the following function-specific sections:

    Each section is briefly described below:

    This section describes some attributes of the codetable, such as the encoding, the name, the valid characters, the maximum number of codes for one input item, and the wildcard characters.

    This section contains the following entries:
    (1). "Name:",     specifies the name of this codetable.
    (2). "Encode:",   specifies the encoding of this codetable. It can be UTF-8, GB, GB2312, GBK, GB18030, EUC_TW, BIG5, or BIG5HK.
    (3). "WildChar:",  specifies the wildcard characters for input codes. The default values are '*' and '?'.
    (4). "UsedCodes:",  specifies the valid characters for input.
    (5). "MaxCodes:",  specifies the maximum number of input codes for one item.
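
    The entries above are simple "Key: value" pairs, so a quick sanity check can be scripted before a file is converted. The Python sketch below is a minimal illustration and not the parser used by the codetable interface: the function name, the SAMPLE values, and the way the default wildcard characters are stored as one string are all assumptions.

```python
KNOWN_ENTRIES = ("Name", "Encode", "WildChar", "UsedCodes", "MaxCodes")
KNOWN_ENCODINGS = {"UTF-8", "GB", "GB2312", "GBK", "GB18030",
                   "EUC_TW", "BIG5", "BIG5HK"}


def parse_description(lines):
    """Parse 'Key: value' lines into a dict, ignoring unknown keys."""
    # Documented defaults '*' and '?', joined into one string (an assumption).
    entries = {"WildChar": "*?"}
    for line in lines:
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in KNOWN_ENTRIES:
            entries[key] = value.strip()
    if "Encode" in entries and entries["Encode"] not in KNOWN_ENCODINGS:
        raise ValueError("unsupported encoding: " + entries["Encode"])
    if "MaxCodes" in entries:
        entries["MaxCodes"] = int(entries["MaxCodes"])
    return entries


# Hypothetical entry values, for illustration only.
SAMPLE = [
    "Name: new_codetable_im",
    "Encode: UTF-8",
    "UsedCodes: abcdefghijklmnopqrstuvwxyz",
    "MaxCodes: 4",
]
```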

    This section can be used to enter comments or explanatory information.

    This section describes the prompt string of an input key. The prompt string is displayed in the Preedit Area of the application.

    This section describes the key definitions of some function keys, such as the PageUp key to scroll up the candidate items, the PageDown key to scroll down the candidate items, the BackSpace key to delete an input code, and the ClearAll key to cancel the input keys.

    This section contains the following entry items:
    (1). "PageUp:"
    (2). "PageDown:"
    (3). "BackSpace:"
    (4). "ClearAll:"

    Note: '^' means the [ Control ] key. For example, '^N' means the [ Control+N ] key.
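
    In the usual ASCII convention, caret notation maps a letter to the control character obtained by clearing the upper bits of its code. The Python sketch below shows that conversion; whether the codetable interface stores function keys this way internally is an assumption, and the function name is hypothetical.

```python
def caret_to_control(notation):
    """Convert caret notation such as '^N' to the corresponding control character."""
    if len(notation) != 2 or notation[0] != "^":
        raise ValueError("expected caret notation like '^N'")
    letter = notation[1].upper()
    if not "A" <= letter <= "Z":
        raise ValueError("expected a letter after '^'")
    # Control+<letter> is the letter's ASCII code with the upper bits cleared.
    return chr(ord(letter) & 0x1F)
```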

    This section describes the options of the codetable input method, such as whether to display help information for each candidate item, whether to display the prompt string of the input key in the preedit area, whether to display the lookup candidates key by key or only when the Space key is pressed, whether to commit the candidate automatically when there is only one lookup result, and the select key mode: Number mode, Lower case mode, or Upper case mode.

    This section contains the following entry items:

    (1). "HelpInfo_Mode:" Values: "ON" or "OFF"
    (2). "KeyByKey_Mode:" Values: "ON" or "OFF"
    (3). "KeyPrompt_Mode:" Values: "ON" or "OFF"
    (4). "AutoSelect_Mode:" Values: "ON" or "OFF"
    (5). "SelectKey_Mode:" Values: "Number", "Lower" or "Upper"
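
    Because each option accepts only a small set of values, a typo is easy to catch before conversion. The Python sketch below checks a dictionary of options against the values listed above; the function name is hypothetical and the check is not part of the Solaris tools.

```python
ALLOWED_VALUES = {
    "HelpInfo_Mode": {"ON", "OFF"},
    "KeyByKey_Mode": {"ON", "OFF"},
    "KeyPrompt_Mode": {"ON", "OFF"},
    "AutoSelect_Mode": {"ON", "OFF"},
    "SelectKey_Mode": {"Number", "Lower", "Upper"},
}


def invalid_options(options):
    """Return a sorted list of option names that are unknown or have a bad value."""
    return sorted(
        name for name, value in options.items()
        if value not in ALLOWED_VALUES.get(name, set())
    )
```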

    This section describes the input codes and their corresponding single Chinese characters. These Chinese characters must not be separated by spaces.

    The format of each line is as follows:

    keystroke_sequence CharacterList

    Note: "CharacterList" means a list of Chinese characters with no spaces between them.

    This section describes the input codes and their corresponding phrases. These Chinese phrases must be separated by spaces.

    The format of each line is as follows:

    keystroke_sequence word1 word2 word3 ...
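
    The difference between the two line formats is how the text after the keystroke sequence is split: character by character for single characters, and on spaces for phrases. The Python sketch below illustrates both cases; the function names are hypothetical and the sample lines in the note are for illustration only.

```python
def parse_single_line(line):
    """Split 'keystroke_sequence CharacterList' into (code, [characters])."""
    code, characters = line.split(None, 1)
    return code, list(characters.strip())


def parse_phrase_line(line):
    """Split 'keystroke_sequence word1 word2 ...' into (code, [words])."""
    code, *words = line.split()
    return code, words
```

    For instance, parse_single_line("tkg 和程") yields the code "tkg" with two separate characters, while parse_phrase_line("abcd word1 word2") yields the code "abcd" with two phrases.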
     
     

    3. Convert the codetable text file to binary format

    The utility tool "txt2bin" converts a text codetable file to a binary file that the codetable input method interface can recognize.

    The "txt2bin" tool is in the directory "/usr/lib/im/locale/zh_CN/common/".

    The command syntax is:

    # /usr/lib/im/locale/zh_CN/common/txt2bin source_codetable_file binary_codetable_file
     
     

    4. Convert the binary codetable file to text format

    The utility tool "bin2txt" converts a binary codetable file to text format.

    The "bin2txt" tool is in the directory "/usr/lib/im/locale/zh_CN/common/".

    The command syntax is:

    # /usr/lib/im/locale/zh_CN/common/bin2txt binary_codetable_file source_codetable_file
     
     

    5. Creating a new codetable input method

    (1). Create and edit codetable source file:

    Prepare the codetable source file that describes the new input method, following the format specified above.

    (2). Convert the source codetable file to binary format:

    Use the utility tool "txt2bin" to convert the prepared text codetable file to a binary file.

    The command syntax is:

    # /usr/lib/im/locale/zh_CN/common/txt2bin source_codetable_file binary_codetable_file

    (3). Copy the binary codetable file to the path "/usr/lib/im/locale/zh_CN/common/data".

    (4). Add the codetable information to the input method specification file "/usr/lib/im/locale/zh_CN/sysime.cfg".

    (5). Restart the input method server (htt) and log in to the system again to enable the new input method.
    To restart the input method server (htt), run the following commands as root:

    # /etc/init.d/IIim stop
    # /etc/init.d/IIim start

    Then your new input method is ready to use.
     
     

    For example: to add a new codetable input method named "new_codetable_im":
    (1). First create a codetable text file named "new_codetable_im.txt".
    (2). Use the "txt2bin" tool to convert it to the binary file "new_codetable_im.data".
    (3). Copy the binary file to the path "/usr/lib/im/locale/zh_CN/common/data".
    (4). Add the codetable name "new_codetable_im" to the input method configuration file "/usr/lib/im/locale/zh_CN/sysime.cfg".
    (5). Restart the input method server (htt).
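
    The command-line steps above can be collected in a small helper. The Python sketch below only builds the argument lists for steps (2), (3), and (5) and does not execute anything; the function name install_commands is hypothetical, and step (4) is left as a manual edit because the syntax of "sysime.cfg" is not described in this guide.

```python
COMMON_DIR = "/usr/lib/im/locale/zh_CN/common"


def install_commands(name):
    """Return the commands for steps (2), (3) and (5) as argument lists."""
    text_file = name + ".txt"
    binary_file = name + ".data"
    return [
        [COMMON_DIR + "/txt2bin", text_file, binary_file],  # step (2)
        ["cp", binary_file, COMMON_DIR + "/data"],           # step (3)
        ["/etc/init.d/IIim", "stop"],                        # step (5)
        ["/etc/init.d/IIim", "start"],
    ]
```

    To run the steps, each list could be passed to subprocess.run as root after step (4) has been done by hand.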
     
