Ideographic generator

Reduced keyboard text input system for the Japanese language

6646573

Abstract

A reduced keyboard system for the Japanese language which uses word-level disambiguation of entered keystroke sequences, and which enables the user to select the desired interpretation of an ambiguous input key sequence as kana, and then select the desired textual interpretation associated with the selected kana interpretation. The system uses a highly compressed database format which has several advantages in terms of reduced size and minimal processing requirements during operation. Also disclosed is a reduced keyboard system which uses sequences of two keystrokes to specify each syllable, including the syllables with palatalized vowels that are written with two kana each. Input sequences of keystrokes are interpreted as ordered pairs of keystrokes which select a character according to its position in a two-dimensional matrix. The first keystroke of each ordered pair specifies the row of the matrix in which the desired character appears, and the second keystroke of each pair specifies the column. The organization of the characters in the first five columns of the matrix conforms to the manner in which the Japanese syllabary is learned and conceptualized by a native Japanese speaker. An additional three columns are organized in a manner that corresponds with the natural model of how the syllables with palatalized vowels are formed (each as a combination of two kana). Up to two more specialized columns are added to handle two special cases that do not fit into the simple patterns of the first eight columns.


Claims

The embodiments of the invention in which an exclusive property or privilege is claimed are defined as follows:

1. A system for generating input sequences of Japanese phonetic kana characters entered by a user, the input system comprising:

a. a user input device having a plurality of input means, each of the plurality of input means associated with a plurality of characters, an ordered input sequence pair being generated each time a sequence of two input means is selected by manipulating the user input device;

b. a memory associating each syllable of the Japanese syllabary with one or more ordered input sequence pairs, wherein:

each of the following sets of Japanese syllables is associated with a set of ordered input sequence pairs, all members of each said associated set of ordered input sequence pairs having the same input means as the first element, and wherein the first element is a different input means for each set, said sets of syllables comprising: {A, I, U, E, O}, {KA, KI, KU, KE, KO, KYA, KYU, KYO}, {SA, SHI, SU, SE, SO, SHA, SHU, SHO}, {TA, CHI, TSU, TE, TO, CHA, CHU, CHO}, {NA, NI, NU, NE, NO, NYA, NYU, NYO}, {HA, HI, FU, HE, HO, HYA, HYU, HYO}, {MA, MI, MU, ME, MO, MYA, MYU, MYO}, {YA, YU, YO}, {RA, RI, RU, RE, RO, RYA, RYU, RYO}, and {WA, WO}; and

each of the following sets of Japanese syllables is associated with a set of ordered input sequence pairs, all members of each said associated set of ordered input sequence pairs having the same input means as the second element, and wherein the second element is a different input means for each set, said sets of syllables comprising: {A, KA, SA, TA, NA, HA, MA, YA, RA, WA}, {I, KI, SHI, CHI, NI, HI, MI, RI}, {U, KU, SU, TSU, NU, FU, MU, YU, RU}, {E, KE, SE, TE, NE, HE, ME, RE}, {O, KO, SO, TO, NO, HO, MO, YO, RO, WO}, {YA, KYA, SHA, CHA, NYA, HYA, MYA, RYA}, {YU, KYU, SHU, CHU, NYU, HYU, MYU, RYU}, and {YO, KYO, SHO, CHO, NYO, HYO, MYO, RYO};

c. a display to depict system output to the user; and

d. a processor coupled to the user input device, memory and display, said processor comprising:

a sequence identifying component for classifying each selection of an input means as a first or second selection of an ordered pair of selections, and identifying from the memory the Japanese syllable associated with each completed input sequence pair; and

an output component for displaying the identified Japanese syllable associated with each generated input sequence pair as the textual interpretation of the generated input sequence.
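The row-and-column lookup recited in claim 1 can be illustrated with a minimal sketch. The digit keys chosen for rows and columns, the name MATRIX and the function decode are assumptions made for illustration; the claim fixes only which syllables share a first key and which share a second key. YA, YU and YO are placed in two columns each, which is one reading of the claim language that a syllable may be associated with more than one ordered pair.

```python
# Hypothetical key assignment: the digits below are an assumption, not part
# of the claim. The first key of a pair picks the row, the second the column.
COLUMN_KEYS = "12345678"  # columns: A, I, U, E, O, YA, YU, YO

MATRIX = {
    "1": ["A", "I", "U", "E", "O", None, None, None],
    "2": ["KA", "KI", "KU", "KE", "KO", "KYA", "KYU", "KYO"],
    "3": ["SA", "SHI", "SU", "SE", "SO", "SHA", "SHU", "SHO"],
    "4": ["TA", "CHI", "TSU", "TE", "TO", "CHA", "CHU", "CHO"],
    "5": ["NA", "NI", "NU", "NE", "NO", "NYA", "NYU", "NYO"],
    "6": ["HA", "HI", "FU", "HE", "HO", "HYA", "HYU", "HYO"],
    "7": ["MA", "MI", "MU", "ME", "MO", "MYA", "MYU", "MYO"],
    # YA, YU, YO appear both in the A/U/O columns and in the YA/YU/YO columns.
    "8": ["YA", None, "YU", None, "YO", "YA", "YU", "YO"],
    "9": ["RA", "RI", "RU", "RE", "RO", "RYA", "RYU", "RYO"],
    "0": ["WA", None, None, None, "WO", None, None, None],
}


def decode(keys: str) -> list[str]:
    """Interpret a string of keystrokes as ordered (row, column) pairs."""
    if len(keys) % 2:
        raise ValueError("incomplete pair: odd number of keystrokes")
    syllables = []
    for i in range(0, len(keys), 2):
        row_key, col_key = keys[i], keys[i + 1]
        if row_key not in MATRIX or col_key not in COLUMN_KEYS:
            raise ValueError(f"unassigned key in pair {row_key}{col_key}")
        syllable = MATRIX[row_key][COLUMN_KEYS.index(col_key)]
        if syllable is None:
            raise ValueError(f"no syllable at pair {row_key}{col_key}")
        syllables.append(syllable)
    return syllables
```

Under these assumed key assignments, decode("2132") yields KA followed by SHI. The special columns for small TSU and N (claim 6) are omitted from the sketch.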

2. The system of claim 1, wherein upon receiving the first input of a generated input sequence pair, the processor causes the output display to display the set of Japanese syllables associated with the set of ordered input sequence pairs, all members of said associated set of ordered input sequence pairs having the received input means as the first element.

3. The system of claim 2, wherein each syllable in the displayed set of Japanese syllables is displayed in association with an indication of the input means comprising the second element of the ordered input sequence pair associated with said syllable.

4. The system of claim 1, wherein there are ten input means associated with a plurality of characters.

5. The system of claim 1, wherein there are nine input means associated with a plurality of characters, such that the two sets of characters {A, I, U, E, O} and {YA, YU, YO} are both associated with ordered input sequence pairs having the same input means as the first element.

6. The system of claim 1, wherein the Japanese syllables small TSU and N are associated with sets of ordered input sequence pairs having distinct input means as the second element, respectively.

7. A disambiguating system for disambiguating ambiguous input sequences entered by a user and generating textual output in the Japanese language, the disambiguating system comprising:

a. a user input device having a plurality of inputs, each of a plurality of said plurality of inputs is associated with a plurality of characters, an input sequence being generated each time an input is selected by manipulating the user input device, the generated input sequence having a textual interpretation that is ambiguous due to the plurality of characters associated with said inputs;

b. a memory containing data used to construct a plurality of Yomikata objects, each of the plurality of Yomikata objects associated with an input sequence and a frequency of use, wherein each of the plurality of Yomikata objects comprises a sequence of kana which corresponds to the phonetic reading to be output to the user, said Yomikata objects including completed word and phrase objects, stem objects comprising a sequence of the initial kana of a yet uncompleted word or phrase object, and objects that are both a completed word or phrase and a stem of a word or phrase, and wherein all word, phrase and stem objects are constructed from data stored in the memory in a tree structure comprised of a plurality of nodes, each node associated with an input sequence and with one or more Yomikata objects;

c. a display to depict system output to the user; and

d. a processor coupled to the user input device, memory and display, the processor constructing the one or more Yomikata objects from the data in the memory associated with each generated input sequence and identifying at least one candidate object with the highest frequency of use, wherein said candidate object is a word or phrase object when at least one word or phrase object is associated with the generated input sequence, and wherein said candidate object is a stem object when no word or phrase object is associated with the generated input sequence, and generating an output signal causing the display to display the at least one identified candidate object associated with each generated input sequence as a textual interpretation of the generated input sequence.
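The candidate rule in element (d) of claim 7 (prefer a completed word or phrase, fall back to a stem, and take the highest frequency of use within that group) might be sketched as follows. The Yomikata class, its field names and the function pick_candidate are hypothetical names introduced for illustration, and the readings are written in romanized form for readability.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Yomikata:
    kana: str        # phonetic reading (romanized here for readability)
    frequency: int   # frequency of use
    is_word: bool    # True for a completed word or phrase, False for a pure stem


def pick_candidate(objects: list[Yomikata]) -> Optional[Yomikata]:
    """Return the most frequent word object, or the most frequent stem if
    no word object is associated with the input sequence."""
    words = [obj for obj in objects if obj.is_word]
    pool = words if words else objects
    return max(pool, key=lambda obj: obj.frequency) if pool else None
```

An object that is both a completed word and a stem of a longer word would simply carry is_word=True in this sketch.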

8. The system of claim 7 in which one or more kana which include special diacritic marks including dakuten and handakuten are associated with the same input with which the corresponding kana without special diacritic marks is associated.

9. The system of claim 7 in which the user input device includes an additional input that is associated with the special diacritic marks including dakuten and handakuten, and in which text objects that include one or more of the kana which include these special diacritic marks are associated with input sequences that include one or more activations of this additional input.

10. The system of claim 7, wherein one or more Yomikata objects in the tree structure in memory are associated with one or more Midashigo objects, wherein each Midashigo object is a textual interpretation of the associated Yomikata object, and wherein each Midashigo object is comprised of a sequence of characters comprised of any combination of kanji, hiragana, katakana, symbols, letters and numbers, and wherein each Midashigo object is associated with a frequency of use.

11. The system of claim 10, wherein the frequency of use associated with each Yomikata object corresponds to the sum of the frequencies of use of all Midashigo objects associated with said Yomikata object.

12. The disambiguating system of claim 10, wherein

a. one of the plurality of inputs is an unambiguous Selection input, wherein the user may accept the Yomikata object having the highest frequency of use as the textual interpretation of the entered input sequence by selecting said unambiguous Selection input;

b. the user may select an alternate Yomikata object as the interpretation of the input sequence by additional selections of said unambiguous Selection input, each selection of said unambiguous Selection input selecting a Yomikata object from the identified one or more Yomikata objects in the memory associated with the generated input sequence, said alternate Yomikata object having a decreasing frequency of use;

c. one of the plurality of inputs is an unambiguous Conversion input, wherein the user may select the Midashigo object having the highest frequency of use associated with the Yomikata object having the highest frequency of use as the textual interpretation of the entered input sequence by selecting said unambiguous Conversion input;

d. the user may select an alternate Midashigo object associated with the Yomikata object having the highest frequency of use as the textual interpretation of the entered input sequence by additional selections of said unambiguous Conversion input, each selection of said unambiguous Conversion input selecting a Midashigo object from the identified one or more Midashigo objects in the memory associated with the Yomikata object having the highest frequency of use, said alternate Midashigo object having a decreasing frequency of use;

e. after the user has selected an alternate Yomikata object as the interpretation of the input sequence by additional selections of said unambiguous Selection input, the user may select the Midashigo object having the highest frequency of use associated with said selected Yomikata object as the textual interpretation of the entered input sequence by selecting said unambiguous Conversion input; and

f. after the user has selected an alternate Yomikata object as the interpretation of the input sequence by additional selections of said unambiguous Selection input, the user may select an alternate Midashigo object associated with said selected Yomikata object as the textual interpretation of the entered input sequence by additional selections of said unambiguous Conversion input, each selection of said unambiguous Conversion input selecting a Midashigo object from the identified one or more Midashigo objects in the memory associated with the selected Yomikata object, said alternate Midashigo object having a decreasing frequency of use.
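The interaction of the Selection and Conversion inputs in claims 10 through 12 can be pictured as a small state machine. The sketch below is illustrative only: the Session class, its method names and the sample data are assumptions, and wrapping back to the first candidate after the last one is a design choice of the sketch that the claims do not state. Following claim 11, each Yomikata is ranked by the sum of the frequencies of its Midashigo.

```python
class Session:
    """Tracks which Yomikata and Midashigo are currently chosen for one
    ambiguous input sequence (hypothetical helper, not from the claims)."""

    def __init__(self, readings):
        # readings: {yomikata: [(midashigo, frequency), ...]}
        self.midashigo = {
            y: sorted(ms, key=lambda m: -m[1]) for y, ms in readings.items()
        }
        # Claim 11: a Yomikata's frequency is the sum over its Midashigo.
        self.yomikata = sorted(
            readings, key=lambda y: -sum(f for _, f in readings[y])
        )
        self.y_index = None  # no Selection press yet
        self.m_index = None  # no Conversion press yet

    def select(self):
        """Selection input: first press accepts the most frequent Yomikata,
        each further press moves to the next one in decreasing frequency."""
        if self.y_index is None:
            self.y_index = 0
        else:
            self.y_index = (self.y_index + 1) % len(self.yomikata)
        self.m_index = None
        return self.yomikata[self.y_index]

    def convert(self):
        """Conversion input: first press gives the most frequent Midashigo of
        the current Yomikata, each further press moves to the next one."""
        if self.y_index is None:
            self.y_index = 0
        options = self.midashigo[self.yomikata[self.y_index]]
        if self.m_index is None:
            self.m_index = 0
        else:
            self.m_index = (self.m_index + 1) % len(options)
        return options[self.m_index][0]


# Sample data (invented frequencies): two readings of one key sequence.
READINGS = {
    "KIMI": [("君", 60)],
    "KAMI": [("神", 30), ("紙", 50), ("髪", 20)],
}
```

With the sample data, pressing Conversion without any Selection press starts from KAMI (summed frequency 100) and offers 紙 first, while pressing Selection twice and then Conversion offers 君.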

13. The disambiguating system of claim 12, wherein

a. a selection of any one of the plurality of inputs associated with one or more characters following one or more selections of said unambiguous Selection input is processed as the first input of a new input sequence; and

b. a selection of any one of the plurality of inputs associated with one or more characters following one or more selections of said unambiguous Conversion input is processed as the first input of a new input sequence.

14. The system of claim 12, wherein one or more kana syllables are specified unambiguously by additional subsequences of one or more inputs, such that when the user selects the unambiguous Selection input following an input sequence of one or more inputs associated with one or more characters including one or more subsequences which unambiguously specify one or more kana syllables, the processor identifies from the one or more constructed Yomikata objects only those Yomikata objects which contain the same unambiguously specified one or more kana syllables in the same positions.

15. The system of claim 14 wherein the one or more kana syllables are specified unambiguously by an ordered pair of keystrokes.

16. The system of claim 14 wherein the one or more kana syllables are specified unambiguously by selecting the input with which the syllable is associated one or more times, wherein each syllable that is associated with each input is associated with a unique number of times that the input is to be selected to unambiguously generate the syllable.
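Claims 14 and 16 can be read together as a filter over the constructed Yomikata objects: a multi-tap subsequence pins down one kana, and only candidates carrying that kana in the same position survive. The following sketch assumes a two-key excerpt of a keypad (KEY_KANA) and hypothetical function names; the order of kana on each key is likewise an assumption.

```python
# Hypothetical excerpt of a keypad: each key lists its kana in tap order.
KEY_KANA = {
    "2": ["KA", "KI", "KU", "KE", "KO"],
    "7": ["MA", "MI", "MU", "ME", "MO"],
}


def multitap(key: str, presses: int) -> str:
    """Claim 16: each kana on a key has a unique press count that selects it."""
    kana = KEY_KANA[key]
    if not 1 <= presses <= len(kana):
        raise ValueError(f"key {key} has no kana for {presses} presses")
    return kana[presses - 1]


def filter_candidates(candidates, fixed):
    """Claim 14: keep only candidates whose kana match every unambiguously
    specified position. `fixed` maps a position to the required kana."""
    return [
        c for c in candidates
        if all(pos < len(c) and c[pos] == kana for pos, kana in fixed.items())
    ]
```

For example, with candidates KA-MI, KI-MI and KU-MO, fixing position 0 to multitap("2", 2), which is KI under the assumed tap order, leaves only KI-MI.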

17. The disambiguating system of claim 7, wherein one of the plurality of inputs is an unambiguous Selection input, wherein the user may accept the Yomikata object having the highest frequency of use as the textual interpretation of the entered input sequence by selecting said unambiguous Selection input, such that a following selection of one of the plurality of inputs associated with one or more characters is processed as the first input of a new input sequence.

18. The disambiguating system of claim 17, wherein the user may select an alternate Yomikata object as the interpretation of the input sequence by additional selections of said unambiguous Selection input, each selection of said unambiguous Selection input selecting a Yomikata object from the identified one or more Yomikata objects in the memory associated with the generated input sequence, said alternate Yomikata object having a decreasing frequency of use, and such that a selection of one of the plurality of inputs associated with one or more characters following one or more selections of said unambiguous Selection input is processed as the first input of a new input sequence.

19. The disambiguating system of claim 18, wherein each of the plurality of inputs that is associated with a plurality of characters is also associated with a numeric digit, such that each generated input sequence has a numeric textual interpretation that is composed of said numeric digits due to the numeric digit associated with each input, and wherein said numeric interpretation is included among the Yomikata objects that the user may select as the interpretation of the input sequence by additional selections of said unambiguous Selection input.

20. The disambiguating system of claim 19, wherein said numeric interpretation is presented to the user following all of the Yomikata objects that the user may select as the interpretation of the input sequence by additional selections of said unambiguous Selection input.

21. The system of claim 18 wherein one or more kana syllables to be output are specified unambiguously by additional subsequences of one or more inputs, such that when the user selects the unambiguous Selection input following an input sequence of one or more inputs associated with one or more characters including one or more subsequences which unambiguously specify one or more kana syllables, the processor identifies from the one or more constructed Yomikata objects only those Yomikata objects which contain the same unambiguously specified one or more kana syllables in the same positions.

22. The system of claim 21 wherein the one or more kana syllables are specified unambiguously by an ordered pair of keystrokes.

23. The system of claim 21 wherein the one or more kana syllables are specified unambiguously by selecting the input with which the syllable is associated one or more times, wherein each syllable that is associated with each input is associated with a unique number of times that the input is to be selected to unambiguously generate the syllable.

24. The disambiguating system of claim 7, wherein the plurality of nodes are connected by a plurality of paths, each of the plurality of paths linking a parent node associated with a base input sequence with a child node associated with the base input sequence of the parent node and an additional input.

25. The disambiguating system of claim 24, wherein the Yomikata objects associated with a child node are based on the Yomikata objects associated with the corresponding parent node to which the child node is linked.

26. The disambiguating system of claim 25, wherein the Yomikata objects associated with a child node are constructed using a code pre-stored in memory to modify Yomikata objects associated with the corresponding parent node.

27. The disambiguating system of claim 26, wherein the code used to construct Yomikata objects associated with a child node by modifying Yomikata objects associated with the corresponding parent node comprises a specification of the numerical index of the Yomikata object associated with the corresponding parent node and a specification of the numerical index of one of the characters associated with the additional input linking the parent node to the child node.

28. The disambiguating system of claim 27, wherein the code used to construct Yomikata objects associated with a child node by modifying Yomikata objects associated with the corresponding parent node further includes a specification of whether the code is the final code of the sequence of codes which create objects associated with the child node.

29. The disambiguating system of claim 27, wherein the number and identity of additional inputs which correspond to child nodes linked to a parent node is indicated in the parent node by a field of valid key bits that indicate the number and identity of said child nodes.

30. The disambiguating system of claim 29, wherein each set of one or more codes used to create the Yomikata objects associated with a child node is immediately followed by a pointer to said child node, and wherein the one or more sets of one or more codes and following pointer are placed sequentially in memory within the parent node in the same order as the valid key bits that indicate the number and identity of said child nodes.
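The node layout described in claims 24 through 30 might be modelled as in the sketch below. It is an illustration under stated assumptions, not the patented storage format: the Node class, the use of Python lists in place of packed memory, the key numbering and the kana order in KEY_KANA are all hypothetical. Each code is a triple of parent object index, character index and a final-code flag (claims 27 and 28); the valid-key bit field locates a child's entry by counting the set bits below its key (claims 29 and 30).

```python
from dataclasses import dataclass, field

# Hypothetical key numbering and kana order (claim 32 would order the kana on
# each key by frequency instead).
KEY_KANA = {
    1: ["KA", "KI", "KU", "KE", "KO"],
    6: ["MA", "MI", "MU", "ME", "MO"],
}


@dataclass
class Node:
    valid_keys: int = 0  # bit k set means a child exists for key k
    # One (codes, child) entry per set bit, stored in ascending bit order.
    entries: list = field(default_factory=list)


def child_entry(node, key):
    """Find the (codes, child) entry for `key`, or None if the bit is clear."""
    if not (node.valid_keys >> key) & 1:
        return None
    position = bin(node.valid_keys & ((1 << key) - 1)).count("1")
    return node.entries[position]


def build_objects(parent_objects, codes, key):
    """Apply each code: copy a parent object and append one kana of `key`."""
    objects = []
    for parent_index, char_index, is_final in codes:
        objects.append(parent_objects[parent_index] + (KEY_KANA[key][char_index],))
        if is_final:  # claim 28: flag marks the last code of the sequence
            break
    return objects


def lookup(root, keys):
    """Walk the tree along `keys`, rebuilding Yomikata objects at each node."""
    objects, node = [()], root  # the root holds a single empty object
    for key in keys:
        entry = child_entry(node, key)
        if entry is None:
            return []
        codes, node = entry
        objects = build_objects(objects, codes, key)
    return objects


# Tiny example tree: key 1 then key 6 yields KA-MI and KI-MI.
leaf = Node()
node_k = Node(valid_keys=1 << 6,
              entries=[([(0, 1, False), (1, 1, True)], leaf)])
root = Node(valid_keys=1 << 1,
            entries=[([(0, 0, False), (0, 1, True)], node_k)])
```

Because a child stores only indices into its parent's objects, no kana string is stored more than once along a path, which is the source of the compression the claims describe.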

31. The disambiguating system of claim 27, wherein the sequence of codes which create Yomikata objects associated with a child node are ordered in memory such that Yomikata objects are created in a sequence that is sorted with respect to the frequency of use of said objects.

32. The disambiguating system of claim 27, wherein the indices of the characters associated with each of the inputs are assigned sequentially to the characters in descending order of the frequency of occurrence of the characters in Yomikata objects in memory.

33. The disambiguating system of claim 32, wherein the code used to construct Yomikata objects associated with a child node by modifying Yomikata objects associated with the corresponding parent node also includes a specification of an object type associated with the constructed object associated with the child node.

34. The disambiguating system of claim 33, wherein the object type that is specified includes information regarding the part of speech of the constructed object.

35. The disambiguating system of claim 33, wherein the object type that is specified includes information regarding the inflectional endings and suffixes that may be appended to the constructed object.

36. The disambiguating system of claim 33, wherein the object type that is specified includes a code that uniquely identifies the constructed object among the objects in memory.

37. The disambiguating system of claim 33, wherein the object type that is specified includes information regarding the frequency of use of the constructed object.

38. The disambiguating system of claim 33, wherein the object type that is specified includes information regarding whether the constructed object is a completed word.

39. The disambiguating system of claim 27, wherein the indices of the characters associated with each input means are assigned sequentially to the characters in descending order of the frequency of occurrence of the characters following the immediately preceding character in the Yomikata object associated with the corresponding parent node to which the indexed character is appended to form a Yomikata object associated with the child node.

40. The disambiguating system of claim 27, wherein where two parent nodes of the tree are redundant in that all codes associated with a given input means that are present in both of said redundant parent nodes are identical in that the codes occur in the same sequence and specify the same numerical Yomikata object index and the same numerical character index, and further in that for all inputs for which child nodes are linked to each of the two redundant parent nodes said child nodes are also redundant in the same recursive sense, one of said redundant parent nodes is omitted from the tree structure in memory and the remaining redundant parent node is augmented by any codes and links to child nodes that were present only in the omitted redundant parent node.
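The recursive redundancy test and merge of claim 40 can be sketched with nodes represented as plain dictionaries. This representation, and the names redundant and merge, are assumptions made for illustration. Each node maps a key to a pair of (codes, child node), where codes is a tuple of (Yomikata object index, character index) pairs compared in sequence, as the stricter definition in claim 40 requires.

```python
def redundant(a: dict, b: dict) -> bool:
    """Two nodes are redundant if, for every key present in both, the codes
    are identical in content and order and the children are redundant too."""
    for key in a.keys() & b.keys():
        codes_a, child_a = a[key]
        codes_b, child_b = b[key]
        if codes_a != codes_b or not redundant(child_a, child_b):
            return False
    return True


def merge(a: dict, b: dict) -> dict:
    """Fold node `b` into node `a`, keeping codes and child links that were
    present in only one of the two. Assumes redundant(a, b) is True."""
    merged = dict(a)
    for key, (codes, child) in b.items():
        if key in merged:
            merged[key] = (codes, merge(merged[key][1], child))
        else:
            merged[key] = (codes, child)
    return merged
```

Keys that appear in only one of the two nodes never block the merge; they are carried over into the surviving node.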

41. The disambiguating system of claim 40, wherein one or more codes that are associated with a given input means and are present in both of said redundant parent nodes are defined as being identical when the codes specify the same numerical Yomikata object index and the same numerical character index, even when said codes occur in a different sequence in the two redundant parent nodes.

42. The disambiguating system of claim 40, wherein one or more codes that are associated with a given input means and are present in both of said redundant parent nodes are defined as being identical when the codes specify the same numerical Yomikata object index and the same numerical character index and the same object type, even when said codes occur in a different sequence in the two redundant parent nodes.

43. The disambiguating system of claim 40, wherein one or more codes used to construct Yomikata objects associated with child nodes by modifying Yomikata objects associated with the corresponding parent node also include a specification of an object type associated with the constructed Yomikata object associated with the child node, and wherein two codes are defined as being identical when they specify the same numerical Yomikata object index and the same numerical character index, wherein the code that is present in the remaining redundant parent node that is augmented by any codes and links to child nodes that were present only in the omitted redundant parent node includes the specification of all said object types that were specified in either redundant node.

44. The system of claim 27, wherein each Yomikata object constructed in each node of the tree structure in memory is associated with zero or more Midashigo objects, wherein each Midashigo object is a textual interpretation of the associated Yomikata object, and wherein each Midashigo object is comprised of a sequence of characters comprised of any combination of kanji, hiragana, katakana, symbols, letters and numbers.

45. The disambiguating system of claim 44, wherein the Midashigo objects associated with a Yomikata object in a child node are based on the Midashigo objects associated with one or more of the corresponding parent nodes of which the child node is a descendant.

46. The disambiguating system of claim 45, wherein a Midashigo object associated with a Yomikata object in a child node is constructed using a code pre-stored in memory to modify a Midashigo object associated with one of the corresponding ancestor nodes.

47. The disambiguating system of claim 46, wherein the code used to construct a Midashigo object associated with a Yomikata object in a child node by modifying a Midashigo object associated with one of the corresponding ancestor nodes comprises a specification of the number of nodes that must be traversed back up in the tree to reach the corresponding ancestor node and a specification of the numerical index of the Midashigo object associated with the Yomikata object in the corresponding ancestor node from which the Yomikata object in the child node was constructed and a specification of a numerical character code designating a kanji or other character to be appended to the specified Midashigo object in the ancestor node to construct a Midashigo object associated with the Yomikata object in the child node.

48. The disambiguating system of claim 46, wherein the code used to construct a Midashigo object associated with a Yomikata object in a child node by modifying a Midashigo object associated with one of the corresponding ancestor nodes comprises a specification of the number of nodes that must be traversed back up in the tree to reach the corresponding ancestor node and a specification of the numerical index of the Midashigo object associated with the Yomikata object in the corresponding ancestor node from which the Yomikata object in the child node was constructed and a specification that indicates that the sequence of one or more kana that is appended to the Yomikata object in the corresponding ancestor node in constructing the Yomikata object in the child node is also to be appended as hiragana to the specified Midashigo object in the ancestor node to construct a Midashigo object associated with the Yomikata object in the child node.
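The two kinds of Midashigo construction code in claims 47 and 48 might be sketched as follows. The function build_midashigo, the AS_KANA marker and the list-based path representation are hypothetical. A code is modelled as a triple of (number of nodes to go back up, Midashigo index at that ancestor, suffix), where the suffix is either a character to append (claim 47) or a marker meaning that the kana appended to the Yomikata since that ancestor are appended as hiragana (claim 48).

```python
AS_KANA = object()  # marker: append the intervening kana as hiragana


def build_midashigo(path_midashigo, appended_kana, code):
    """Construct one Midashigo for a child node.

    path_midashigo: Midashigo lists for each node from the root down to the
        child's parent (so index -1 is the parent, -2 the grandparent).
    appended_kana: the kana appended at each step from the root down to the
        child; it has the same length as path_midashigo.
    code: (levels_up, midashigo_index, suffix)
    """
    levels_up, index, suffix = code
    base = path_midashigo[-levels_up][index]
    if suffix is AS_KANA:
        # Claim 48: reuse the kana added between the ancestor and the child.
        suffix = "".join(appended_kana[-levels_up:])
    return base + suffix
```

As a usage example, with the path root, に, にほ holding the Midashigo lists [""], ["日"] and [], the code (2, 0, "本") applied at the child にほん goes two nodes up, takes 日 and appends 本 to give 日本. With the path root, か holding [""] and ["書"], the code (1, 0, AS_KANA) applied at the child かく gives 書く.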

49. The disambiguating system of claim 46, wherein the code used to construct Midashigo objects associated with a Yomikata object in a child node by modifying Midashigo objects associated with the corresponding parent node further includes a specification of whether the code is the final code of the sequence of codes which create objects associated with the Yomikata object in the child node.

50. The disambiguating system of claim 47, wherein the code used to construct a Midashigo object associated with a Yomikata object in a child node by modifying a Midashigo object associated with one of the corresponding ancestor nodes appends a kanji character, wherein the specification of said kanji character to be appended comprises:

a. a specification of the number of nodes that must be traversed back up in the tree to reach the corresponding ancestor node;

b. a specification of the numerical index of the Midashigo object associated with the Yomikata object in the corresponding ancestor node from which the Yomikata object in the child node was constructed;

c. a specification that indicates that the sequence of one or more kana that is appended to the Yomikata object in the corresponding ancestor node in constructing the Yomikata object in the child node comprises the root Yomikata object which can be found in said tree structure starting from the root node and with which said kanji character is associated as a Midashigo object; and

d. a specification of the numerical index of the Midashigo object that is associated with said root Yomikata object and that corresponds to said kanji character to be appended to construct said Midashigo object associated with a Yomikata object in a child node.

51. A disambiguating system for disambiguating ambiguous input sequences entered by a user and generating textual output in the Japanese language, the disambiguating system comprising:

a. a user input device having a plurality of inputs, each of a plurality of said plurality of inputs is associated with a plurality of romaji characters, an input sequence being generated each time an input is selected by manipulating the user input device, the generated input sequence having a textual interpretation that is ambiguous due to the plurality of characters associated with each input;

b. a memory containing data used to construct a plurality of Yomikata objects, each of the plurality of Yomikata objects associated with an input sequence and a frequency of use, wherein each of the plurality of Yomikata objects comprises a sequence of romaji characters corresponding to the kana which comprise the phonetic reading to be output to the user, said Yomikata objects including completed word and phrase objects, stem objects comprising a sequence of the romaji corresponding to the initial kana of a yet uncompleted word or phrase object, and objects that are both a completed word or phrase and a stem of a word or phrase, and wherein all word, phrase and stem objects are constructed from data stored in the memory in a tree structure comprised of a plurality of nodes, each node associated with an input sequence and with one or more Yomikata objects;

c. a display to depict system output to the user; and

d. a processor coupled to the user input device, memory and display, the processor constructing the one or more Yomikata objects from the data in the memory associated with each generated input sequence and identifying at least one candidate object with the highest frequency of use, wherein said candidate object is a word or phrase object when at least one word or phrase object is associated with the generated input sequence, and wherein said candidate object is a stem object when no word or phrase object is associated with the generated input sequence, and generating an output signal causing the display to display the at least one identified candidate object associated with each generated input sequence as a textual interpretation of the generated input sequence.

52. The system of claim 51, wherein each Yomikata object in the tree structure in memory is associated with one or more Midashigo objects, wherein each Midashigo object is a textual interpretation of the associated Yomikata object, and wherein each Midashigo object is comprised of a sequence of characters comprised of any combination of kanji, hiragana, katakana, symbols, letters and numbers.

53. The disambiguating system of claim 52, wherein

a. one of the plurality of inputs is an unambiguous Selection input, wherein the user may accept the Yomikata object having the highest frequency of use as the textual interpretation of the entered input sequence by selecting said unambiguous Selection input;

b. the user may select an alternate Yomikata object as the interpretation of the input sequence by additional selections of said unambiguous Selection input, each selection of said unambiguous Selection input selecting a Yomikata object from the identified one or more Yomikata objects in the memory associated with the generated input sequence, said alternate Yomikata object having a decreasing frequency of use;

c. one of the plurality of inputs is an unambiguous Conversion input, wherein the user may select the Midashigo object having the highest frequency of use associated with the Yomikata object having the highest frequency of use as the textual interpretation of the entered input sequence by selecting said unambiguous Conversion input;

d. the user may select an alternate Midashigo object associated with the Yomikata object having the highest frequency of use as the textual interpretation of the entered input sequence by additional selections of said unambiguous Conversion input, each selection of said unambiguous Conversion input selecting a Midashigo object from the identified one or more Midashigo objects in the memory associated with the Yomikata object having the highest frequency of use, said alternate Midashigo object having a decreasing frequency of use;

e. after the user has selected an alternate Yomikata object as the interpretation of the input sequence by additional selections of said unambiguous Selection input, the user may select the Midashigo object having the highest frequency of use associated with said selected Yomikata object as the textual interpretation of the entered input sequence by selecting said unambiguous Conversion input; and

f. after the user has selected an alternate Yomikata object as the interpretation of the input sequence by additional selections of said unambiguous Selection input, the user may select an alternate Midashigo object associated with said selected Yomikata object as the textual interpretation of the entered input sequence by additional selections of said unambiguous Conversion input, each selection of said unambiguous Conversion input selecting a Midashigo object from the identified one or more Midashigo objects in the memory associated with the selected Yomikata object, said alternate Midashigo object having a decreasing frequency of use.

54. The disambiguating system of claim 53, wherein

a. a selection of any one of the plurality of inputs associated with one or more romaji characters following one or more selections of said unambiguous Selection input is processed as the first input of a new input sequence; and

b. a selection of any one of the plurality of inputs associated with one or more romaji characters following one or more selections of said unambiguous Conversion input is processed as the first input of a new input sequence.

55. A disambiguating system for disambiguating ambiguous input sequences entered by a user and generating textual output in the Japanese language, the disambiguating system comprising:

a. a user input device having a plurality of inputs, each of the plurality of inputs is associated with a plurality of characters, an input sequence being generated each time an input is selected by manipulating the user input device, the generated input sequence having a textual interpretation that is ambiguous due to the plurality of characters associated with said inputs;

b. a memory containing data used to construct a plurality of Yomikata objects, each of the plurality of Yomikata objects associated with an input sequence and a frequency of use, wherein each of the plurality of Yomikata objects comprises a sequence of kana which corresponds to the phonetic reading to be output to the user, said Yomikata objects including completed word and phrase objects, stem objects comprising a sequence of the initial kana of a yet uncompleted word or phrase object, and objects that are both a completed word or phrase and a stem of a word or phrase, and wherein all word, phrase and stem objects are constructed from data stored in the memory in a tree structure comprised of a plurality of nodes, each node associated with an input sequence and with one or more Yomikata objects;

c. a display to depict system output to the user; and

d. a processor coupled to the user input device, memory and display, the processor constructing the one or more Yomikata objects from the data in the memory associated with each generated input sequence and identifying at least one candidate object with the highest frequency of use, wherein said candidate object is a word or phrase object when at least one word or phrase object is associated with the generated input sequence, and wherein said candidate object is a stem object when no word or phrase object is associated with the generated input sequence, and generating an output signal causing the display to display the at least one identified candidate object associated with each generated input sequence as a textual interpretation of the generated input sequence;

wherein one or more Yomikata objects in the tree structure in memory are associated with one or more Midashigo objects, wherein each Midashigo object is a textual interpretation of the associated Yomikata object, and wherein each Midashigo object is comprised of a sequence of characters comprised of any combination of kanji, hiragana, katakana, symbols, letters and numbers, and wherein each Midashigo object is associated with a frequency of use;

and wherein

aa. one of the plurality of inputs is an unambiguous Selection input, wherein the user may accept the Yomikata object having the highest frequency of use as the textual interpretation of the entered input sequence by selecting said unambiguous Selection input;

bb. the user may select an alternate Yomikata object as the interpretation of the input sequence by additional selections of said unambiguous Selection input, each selection of said unambiguous Selection input selecting a Yomikata object from the identified one or more Yomikata objects in the memory associated with the generated input sequence, said alternate Yomikata object having a decreasing frequency of use;

cc. one of the plurality of inputs is an unambiguous Conversion input, wherein the user may select the Midashigo object having the highest frequency of use associated with the Yomikata object having the highest frequency of use as the textual interpretation of the entered input sequence by selecting said unambiguous Conversion input;

dd. the user may select an alternate Midashigo object associated with the Yomikata object having the highest frequency of use as the textual interpretation of the entered input sequence by additional selections of said unambiguous Conversion input, each selection of said unambiguous Conversion input selecting a Midashigo object from the identified one or more Midashigo objects in the memory associated with the Yomikata object having the highest frequency of use, said alternate Midashigo object having a decreasing frequency of use;

ee. after the user has selected an alternate Yomikata object as the interpretation of the input sequence by additional selections of said unambiguous Selection input, the user may select the Midashigo object having the highest frequency of use associated with said selected Yomikata object as the textual interpretation of the entered input sequence by selecting said unambiguous Conversion input; and

ff. after the user has selected an alternate Yomikata object as the interpretation of the input sequence by additional selections of said unambiguous Selection input, the user may select an alternate Midashigo object associated with said selected Yomikata object as the textual interpretation of the entered input sequence by additional selections of said unambiguous Conversion input, each selection of said unambiguous Conversion input selecting a Midashigo object from the identified one or more Midashigo objects in the memory associated with the selected Yomikata object, said alternate Midashigo object having a decreasing frequency of use;

and wherein each Yomikata object in the tree structure in memory is associated with one or more Midashigo objects, and wherein each Midashigo object is comprised of a sequence of characters comprised of any combination of kanji, hiragana, katakana, symbols, letters and numbers, and wherein a corresponding katakana-only Midashigo object comprised only of katakana is generated by the processor for each Yomikata object, and wherein said katakana-only Midashigo object is included among the Midashigo objects that the user may select as the interpretation of the input sequence by additional selections of said unambiguous Conversion input.

56. The disambiguating system of claim 55, wherein said katakana-only Midashigo object is presented to the user following all of the Midashigo objects that the user may select as the interpretation of the input sequence by additional selection of said unambiguous Conversion input.


Description

FIELD OF THE INVENTION

The invention relates generally to reduced keyboard systems, and more specifically to reduced keyboard systems generating text composed of the hiragana, katakana and kanji characters of the Japanese language.

BACKGROUND OF THE INVENTION

For many years, portable computers have been getting smaller and smaller. The principal size-limiting component in the effort to produce a smaller portable computer has been the keyboard. If standard typewriter-size keys are used, the portable computer must be at least as large as the keyboard. Miniature keyboards have been used on portable computers, but the miniature keyboard keys have been found to be too small to be easily or quickly manipulated by a user.

Incorporating a full-size keyboard in a portable computer also hinders true portable use of the computer. Most portable computers cannot be operated without placing the computer on a flat work surface to allow the user to type with both hands. A user cannot easily use a portable computer while standing or moving. In the latest generation of small portable computers, called Personal Digital Assistants (PDAs) or palm-sized computers, companies have attempted to address this problem by incorporating handwriting recognition software in the device. A user may directly enter text by writing on a touch-sensitive panel or screen. This handwritten text is then converted by the recognition software into digital data. Unfortunately, in addition to the fact that printing or writing with a pen is in general slower than typing, the accuracy and speed of the handwriting recognition software have to date been less than satisfactory. In the case of the Japanese language, with its large number of complex characters, the problem becomes especially difficult. To make matters worse, today's handheld computing devices which require text input are becoming smaller still. Recent advances in two-way paging, cellular telephones, and other portable wireless technologies have led to a demand for small and portable two-way messaging systems, and especially for systems which can both send and receive electronic mail ("e-mail").

It would therefore be advantageous to develop a keyboard for entry of text into a computer device that is both small and operable with one hand while the user is holding the device with the other hand. Prior development work has considered use of a keyboard that has a reduced number of keys. As suggested by the keypad layout of a touch-tone telephone, many of the reduced keyboards have used a 3-by-4 array of keys. A number of the keys in the array are associated with multiple characters. There is therefore a need for a method for the user to indicate which of the characters associated with a given key is the desired character.

One suggested approach for unambiguously specifying hiragana characters entered on a reduced keyboard requires the user to enter two or more keystrokes to specify each kana. The keystrokes may be entered either simultaneously (chording) or in sequence (multiple-stroke specification). Neither chording nor multiple-stroke specification has produced a keyboard having adequate simplicity and efficiency of use. Multiple-stroke specification is inefficient, and chording is often complicated to learn and use.

Each syllable in the Japanese syllabary consists of either a single vowel, or a consonant followed by a vowel. There are two exceptions: the syllable {character pullout} which has no vowel, and the "small" {character pullout} which is used to indicate the "doubling" or "hardening" of the pronunciation of the following consonant. These syllables can be written as hiragana (commonly used when writing native Japanese words) or katakana (commonly used when writing words of foreign origin). The term kana is used to refer to either hiragana or katakana. The syllabary is commonly represented as a table of rows and columns (shown in Table 1), wherein each row may have up to five entries in columns corresponding to the five Japanese vowels {character pullout}, and {character pullout}. Each row corresponds to an initial consonant, although a given consonant may undergo sound changes for certain entries in a row (e.g. s(a){character pullout}sh(i){character pullout}; t(a){character pullout} ts(u){character pullout}; etc.). The first row consists of five syllables corresponding to each of the five vowels with no initial consonant. The 8th row consists of the palatalized vowels {character pullout}, and {character pullout} (YI and YE are not used in modern Japanese). The diacritic marks " and ° are used to indicate changes in the pronunciation of the consonant, generally indicating a change from an unvoiced to a voiced consonant. Table 2 shows the basic syllables formed by adding the diacritic marks " and ° to syllables in Table 1. Smaller versions of the syllables {character pullout}, and {character pullout} are also used in combination with syllables in the second, or "{character pullout}" columns of Tables 1 and 2 to represent syllables consisting of the corresponding consonant and the palatalized vowel (e.g. {character pullout} followed by "small" {character pullout} to represent {character pullout}).
These syllables with palatalized vowels are thus written as a pair of kana, as shown in Table 3, which includes forms written with diacritic marks.

Lexicographic order in Japanese is generally represented by the sequence of syllables in the first column (corresponding to the vowel A) of Table 1: {character pullout}, {character pullout}, and {character pullout}, where each of these syllables (except {character pullout}) represents a sub-class of up to five syllables composed from the vowels {character pullout}, and {character pullout}, in that order. Currently, products such as cellular telephones that require Japanese text input generally use a multiple-stroke specification method wherein each of nine keys is associated with each of the first nine rows ({character pullout} through {character pullout}). Multiple strokes on a key are used to indicate which of the syllables of the corresponding row is intended, wherein each additional stroke on a key sequentially changes the character to be output to the character appearing in the next column of Table 1 or 2. A separating key or a timeout method is used to allow entry of consecutive characters associated with the same key. A tenth key is used for the syllables {character pullout}, and the katakana "bo" symbol, which indicates a vowel-only syllable that repeats the vowel of the preceding syllable. The "small" {character pullout}, and {character pullout} are also associated with the {character pullout} key, requiring additional keystrokes to be selected. An additional key is commonly used to add the diacritic marks following a syllable.
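The multiple-stroke specification method described above can be illustrated with a minimal sketch. The key numbering, the use of hiragana for the nine basic rows, and the wrap-around to the first column after the last entry of a row are assumptions made for illustration only; the separating key or timeout is modeled simply by supplying the input as (key, press count) pairs, and the tenth key and the diacritic key are omitted.

```python
# Illustrative sketch of multiple-stroke (multi-tap) kana entry.
# Key assignments are assumed for illustration: keys 1-9 carry the
# first nine rows of the basic syllabary.
ROWS = {
    "1": "あいうえお",
    "2": "かきくけこ",
    "3": "さしすせそ",
    "4": "たちつてと",
    "5": "なにぬねの",
    "6": "はひふへほ",
    "7": "まみむめも",
    "8": "やゆよ",
    "9": "らりるれろ",
}


def multitap_decode(groups):
    """Decode a list of (key, press_count) pairs into kana.

    Each additional press on a key advances to the next column of the
    row. Wrapping back to the first column is an assumption.
    """
    out = []
    for key, presses in groups:
        row = ROWS[key]
        out.append(row[(presses - 1) % len(row)])
    return "".join(out)


def total_presses(groups):
    """Count the keystrokes used, excluding any separator or timeout."""
    return sum(presses for _, presses in groups)


def mean_presses(row):
    """Mean press count over a row if each entry is equally likely."""
    return sum(range(1, len(row) + 1)) / len(row)
```

Under these assumptions, `multitap_decode([("2", 1), ("5", 1)])` yields the two kana of the word "kana" in two keystrokes, while a syllable in the fifth column of a row costs five. For a five-entry row with equally likely entries the mean is (1+2+3+4+5)/5 = 3 presses per kana, before any separator or diacritic keystroke is counted.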

Entering Japanese hiragana (or katakana) using a reduced keyboard continues to be a challenging problem. With the current multi-stroking approach as described above, generating a single kana syllable requires an average of at least three keystrokes. Syllables with palatalized vowels which are represented by two characters (i.e. those in Table 3 consisting of a syllable from the second, or "{character pullout}", column of Tables 1 and 2, followed by a "small" {character pullout}, or {character pullout}) can require up to eight keystrokes to generate. It would therefore be desirable to develop a reduced keyboard system that tends to minimize the number of keystrokes required to enter hiragana, and is also simple and intuitive to use.

Typing standard Japanese text, which includes Chinese characters (kanji) in addition to kana, on a reduced keyboard is an even more challenging problem. Entering text on a standard computer with a full keyboard and a large display is generally achieved by first typing the pronunciation of the desired text using the letters of the Latin alphabet (called "romaji" in Japanese) corresponding to each hiragana syllable as shown in Tables 1-3. As the letters are typed, the input is automatically converted to the corresponding hiragana syllables and displayed on the screen. In many cases, the user then needs to convert the text which is initially displayed as hiragana into the specific textual interpretation desired. The hiragana that are displayed represent the phonetic reading of the combination of kanji and hiragana that the user actually wants to enter, and which conveys the user's intended meaning. Due to the large number of homophones in the Japanese language, there can be a number of possible meaningful combinations of kanji and hiragana that correspond to the hiragana input by the user. On a standard computer, a number of these alternative conversions can be displayed where, for example, each alternative is associated with a numeric key so that pressing the key converts the input hiragana to the displayed kanji interpretation. Additional complications arise when trying to implement this process on a small hand-held device due to the limited display size and the small number of keys available.

An alternative approach for specifying hiragana entered on a reduced keyboard allows the user to enter each hiragana with a single keystroke. Each key of the reduced keyboard is associated with multiple hiragana characters. As a user enters a sequence of keys, there is therefore ambiguity in the resulting output since each keystroke may indicate one of several hiragana. The system must therefore provide a means by which the user can efficiently indicate which of the possible interpretations of each keystroke was intended. Several approaches have been suggested for resolving the ambiguity of the keystroke sequence.

A number of suggested approaches for determining the correct character sequence that corresponds to an ambiguous keystroke sequence are summarized in the article "Probabilistic Character Disambiguation for Reduced Keyboards Using Small Text Samples," published in the Journal of the International Society for Augmentative and Alternative Communication by John L. Arnott and Muhammad Y. Javad (hereinafter the "Arnott article"). The Arnott article notes that the majority of disambiguation approaches employ known statistics of character sequences in the relevant language to resolve character ambiguity in a given context. That is, existing disambiguating systems statistically analyze ambiguous keystroke groupings as they are being entered by a user to determine the appropriate interpretation of the keystrokes. The Arnott article also notes that several disambiguating systems have attempted to use word level disambiguation to decode text from a reduced keyboard. Word level disambiguation processes complete words by comparing the entire sequence of received keystrokes with possible matches in a dictionary after the receipt of an unambiguous character signifying the end of the word. The Arnott article discusses many of the disadvantages of word-level disambiguation. For example, word level disambiguation oftentimes fails to decode a word correctly, because of the limitations in identifying unusual words and the inability to decode words that are not contained in the dictionary. Because of the decoding limitations, word level disambiguation does not give error-free decoding of unconstrained English text with an efficiency of one keystroke per character. The Arnott article therefore concentrates on character level disambiguation rather than word level disambiguation, and indicates that character level disambiguation appears to be the most promising disambiguation technique. 
However, in contrast to alphabetic languages, each hiragana character in Japanese represents a syllable, rather than a single letter that represents what is essentially a phoneme. For this reason, character level disambiguation is inefficient in the Japanese language because there are almost no constraints on possible sequences of hiragana, and the probability distribution of hiragana sequences is not skewed enough for this approach to be effective.

Still another suggested approach is disclosed in a textbook entitled Principles of Computer Speech, authored by I. H. Witten, and published by Academic Press in 1982 (hereinafter the "Witten approach"). Witten discusses a system for reducing ambiguity from text entered using a telephone touch pad. Witten recognizes that for approximately 92% of the words in a 24,500 word English dictionary, no ambiguity will arise when comparing the keystroke sequence with the dictionary. When ambiguities do arise, however, Witten notes that they must be resolved interactively by the system presenting the ambiguity to the user and asking the user to make a selection between the number of ambiguous entries. A user must therefore respond to the system's prediction at the end of each word. Such a response slows the efficiency of the system and increases the number of keystrokes required to enter a given segment of text.

In the case of the Japanese language, users of word processing software are accustomed to having to select from a number of ambiguous interpretations following the entry of a word due to the large number of homophones in the language. The same sequence of kana can frequently be converted to two or more different kanji interpretations. Thus, after entering a sequence of kana, the user is generally required to select the desired kanji conversion from a set of possible choices, and often is also required to somehow confirm that the correct conversion was selected. When the hiragana are entered using a reduced keyboard, there is also ambiguity as to what the user actually intends as the sequence of hiragana to be converted to kanji. As a result, the number of possible interpretations is greatly increased.

Disambiguating an ambiguous keystroke sequence continues to be a challenging problem. As noted in the publications discussed above, satisfactory solutions that minimize the number of keystrokes required to enter a segment of text have failed to achieve the necessary efficiencies to be acceptable for use in a portable computer. It would therefore be desirable to develop a disambiguating system that resolves the ambiguity of entered keystrokes while minimizing the total number of keystrokes required, within the context of a simple and easy to understand user interface. Such a system would thereby maximize the efficiency of text entry.

An effective reduced keyboard input system for the Japanese language must satisfy all of the following criteria. First, the arrangement of the syllables of the Japanese language (kana) on the keyboard, and the method by which they are generated, must be easy for a native speaker to understand and learn to use. Second, the system must tend to minimize the number of keystrokes required to enter text in order to enhance the efficiency of the reduced keyboard system. Third, the system must reduce the cognitive load on the user by reducing the amount of attention and decision-making required during the input process. Fourth, the approach should minimize the amount of memory and processing resources needed to implement a practical system.

Kisaichi et al. [JP 8-314920; U.S. Pat. No. 5,786,776; EP 0 732 646 A2] disclose an approach wherein the keys 1-0 of a telephone keypad are labeled with the hiragana syllables {character pullout}, {character pullout}, {character pullout}, {character pullout}, {character pullout}, {character pullout}, {character pullout}, {character pullout}, {character pullout}, and {character pullout}, respectively. This corresponds to what is the de facto standard for Japanese telephone keypads wherein the keys 1-9 of the telephone keypad are labeled with the hiragana syllables {character pullout}, and {character pullout}, respectively. The single hiragana appearing on each key represents the complete set of hiragana assigned to that key, corresponding to the entire row of hiragana appearing in Table 1 in which the single hiragana appears in the first column. The 0 key is often labeled explicitly with {character pullout}.

Kisaichi et al. disclose a word-level disambiguation approach, wherein the user ambiguously inputs a sequence of characters (hiragana) by pressing the key with which each character is associated a single time. At the end of each input sequence, the user presses a "Conversion/Next Candidate" key to display the first textual interpretation of one of the possible sequences of hiragana associated with the input key sequence. Kisaichi et al. disclose a dictionary structure wherein all textual interpretations of a given input key sequence are stored consecutively in a contiguous block of memory. Each additional press of the "Conversion/Next Candidate" key displays the next textual interpretation stored in the dictionary, if one exists. If no more textual interpretations exist, an error message is displayed and optional anomaly processing may be performed. When the desired textual interpretation is displayed, a special "Confirmation" key must be pressed to confirm that the desired text has been displayed before the user can go on to enter the next text object.

There are a number of difficulties with the approach disclosed by Kisaichi et al. One is that, because there is ambiguity both in specification of the hiragana string and in the conversion of each possible hiragana candidate string, there tend to be a very large number of possible textual interpretations of a given key sequence. This can require the user to step through a large number of interpretations using the "Conversion/Next Candidate" key in order to find the desired interpretation. Further, in stepping through the possible interpretations, the user sees various kanji and/or hiragana strings that correspond to a variety of hiragana strings due to the ambiguity in the input. This can be distracting, and require additional attention from the user in trying to find the desired interpretation. In addition, the database of textual interpretations is arranged such that all data consists only of complete words, and all data for all key sequences of a given length is also stored consecutively in a contiguous block of memory. Kisaichi et al. do not disclose any approach to enable the display of an appropriate stem corresponding to a longer, yet uncompleted word at those points in an input sequence that do not correspond to any completed word. At such points in the input, the system of Kisaichi et al. can only display a default indication of each key entered, such as a numeral or a default letter or character. This is confusing to the user, and fails to provide feedback which is effective in helping the user to confirm that the intended keys have been entered. Finally, the user is required to press the "Confirmation" key for each word input, having to enter an additional keystroke for each input. Thus, the system disclosed by Kisaichi et al. fails to satisfy the criteria discussed above.

Another significant challenge facing any application of word-level disambiguation is successfully implementing it on the kinds of hardware platforms on which its use is most advantageous. As mentioned above, such devices include two-way pagers, cellular telephones, and other hand-held wireless communications devices. These systems are battery powered, and consequently are designed to be as frugal as possible in hardware design and resource utilization. Applications designed to run on such systems must minimize both processor bandwidth utilization and memory requirements. These two factors tend in general to be inversely related. Since word-level disambiguation systems require a large database of words to function, and must respond quickly to input keystrokes to provide a satisfactory user interface, it would be a great advantage to be able to compress the required database without significantly impacting the processing time required to utilize it. In the case of the Japanese language, additional information must be included in the database to support the conversion of sequences of kana to the kanji intended by the user.

Another challenge facing any application of word-level disambiguation is providing sufficient feedback to the user about the keystrokes being input. With an ordinary typewriter or word processor, each keystroke represents a unique character which can be displayed to the user as soon as it is entered. But with word-level disambiguation this is often not possible, since each keystroke represents multiple characters, and any sequence of keystrokes may match multiple words or word stems. It would therefore be desirable to develop a disambiguating system that minimizes the ambiguity of entered keystrokes, and also maximizes the efficiency with which the user can resolve any ambiguity which does arise during text entry. One way to increase the user's efficiency is to provide appropriate feedback following each keystroke, which includes displaying the most likely word following each keystroke, and in cases where the current keystroke sequence does not correspond to a completed word, displaying the most likely stem of a yet uncompleted word.

In order to create an effective reduced keyboard input system for the Japanese language, a system has been designed that does meet all of the criteria mentioned above. First, the arrangement of the syllables of the Japanese language (kana) on the keyboard, and the method by which they are generated, are easy for a native speaker to understand and learn to use. Second, the system tends to minimize the number of keystrokes required to enter text. Third, the system reduces the cognitive load on the user by reducing the amount of attention and decision-making required during the input process, and by the provision of appropriate feedback. Fourth, the approach disclosed herein tends to minimize the amount of memory and processing resources required to implement a practical system.

SUMMARY OF THE INVENTION

The present invention provides a reduced keyboard using word level disambiguation to resolve ambiguities in keystrokes to enter text in the Japanese language. The keyboard may be constructed with full-size mechanical keys, preferably twelve keys arrayed in three columns and four rows as on a standard telephone keypad. Alternatively, the keyboard can be implemented on a display panel which is touch sensitive, wherein contact with the surface of the display generates input signals to the system corresponding to the location of contact.

A plurality of kana characters and symbols are assigned to each of at least several of the keys, so that keystrokes by a user are ambiguous. A user enters a keystroke sequence wherein each keystroke is intended to be the entry of one kana. Each keystroke sequence is thus intended to represent the phonetic reading of a word or common phrase (hereinafter referred to by the Japanese term "Yomikata"). Because individual keystrokes are ambiguous, the keystroke sequence could potentially match more than one Yomikata with the same number of kana.
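The ambiguity described in the preceding paragraph can be made concrete with a minimal sketch. The assignment of syllabary rows to keys 1-9 and the four sample readings are assumptions chosen for illustration; they are not the vocabulary or key layout of the disclosed system.

```python
# Illustrative sketch: one keystroke per kana, several kana per key.
# The key assignments below are assumed for illustration only.
KEY_ROWS = {
    "1": "あいうえお",
    "2": "かきくけこ",
    "3": "さしすせそ",
    "4": "たちつてと",
    "5": "なにぬねの",
    "6": "はひふへほ",
    "7": "まみむめも",
    "8": "やゆよ",
    "9": "らりるれろ",
}

# Reverse map: each kana to the single key that carries it.
KANA_TO_KEY = {kana: key for key, row in KEY_ROWS.items() for kana in row}


def key_sequence(reading):
    """Return the ambiguous key sequence that would enter this reading."""
    return "".join(KANA_TO_KEY[kana] for kana in reading)


def matching_yomikata(keys, vocabulary):
    """Return every reading in the vocabulary entered by the same keys.

    All matches necessarily have the same number of kana as there are
    keystrokes, since each keystroke stands for exactly one kana.
    """
    return [reading for reading in vocabulary if key_sequence(reading) == keys]


SAMPLE_READINGS = ["かな", "こな", "きぬ", "さけ"]  # hypothetical vocabulary
```

With these assumed assignments, the two-keystroke sequence "25" matches three of the four sample readings, each two kana long, while the fourth reading corresponds to the distinct sequence "32".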

The keystroke sequence is processed by comparing the keystroke sequence with one or more stored vocabulary modules to match the sequence with corresponding Yomikata. The various Yomikata associated with a given key sequence are stored in the vocabulary module in the order determined by their expected frequency of occurrence in general usage, where the expected frequency of a Yomikata is calculated as the sum of the frequencies of occurrence in general usage of all possible textual interpretations of that Yomikata (including words composed of kanji, hiragana, katakana, or any combination thereof; each such textual interpretation is hereinafter referred to by the Japanese term "Midashigo"). In another preferred embodiment, the Yomikata and Midashigo are initially stored in the order determined by their expected frequency of occurrence in general usage, and this order is modified to reflect the frequency of actual usage by the system user. The most frequent Yomikata that matches the sequence of keystrokes and corresponds to at least one completed word or phrase is automatically identified and presented to the user on a display as each keystroke is received. If there is no completed word or phrase whose Yomikata matches the sequence of keystrokes, the most commonly occurring stem of a yet uncompleted word or phrase is automatically identified and presented to the user on the display. The term "selection list" is used generically to refer to any list of textual interpretations (either Yomikata or Midashigo) generated by the system corresponding to an input keystroke sequence. On devices with sufficient display area available (hereinafter referred to as "large-screen devices"), the selection list may be shown (in whole or in part) in a "selection list region" of the display.
On such devices, as each keystroke is received, the various Yomikata corresponding to the input sequence are simultaneously presented to the user in a list on the display in descending order of expected frequency in the selection list region. On devices with limited display area, the selection list is maintained internally, and text objects in the list are displayed one at a time in response to activations of a Select key as described below. For each Yomikata in the stored vocabulary module that is associated with one or more alternate textual interpretations, or Midashigo, the Midashigo are stored in order of decreasing frequency of expected occurrence in general usage, so that the most commonly used Midashigo is presented first. As briefly noted above, in an alternate embodiment, the system keeps track of which Midashigo are selected for output by the user most frequently, and modifies the order of presentation to first present the Midashigo most frequently selected.
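
By way of illustration only, the frequency ordering described above may be sketched in Python as follows. The key assignment in ROWS is an assumed phone-style layout (FIG. 1a is not reproduced here), the vocabulary entries and their frequency counts are invented for the example, and the names ROWS, VOCAB, key_sequence and build_index are hypothetical.

```python
from collections import defaultdict

# Assumed phone-style kana rows per data key (illustrative only).
ROWS = {
    "1": "あいうえお", "2": "かきくけこ", "3": "さしすせそ", "4": "たちつてと",
    "5": "なにぬねの", "6": "はひふへほ", "7": "まみむめも", "8": "やゆよ",
    "9": "らりるれろ", "0": "わをん",
}
KEY_OF_KANA = {kana: key for key, row in ROWS.items() for kana in row}

# Toy vocabulary of (Yomikata, Midashigo, frequency); the counts are invented.
VOCAB = [
    ("かく", "書く", 50), ("かく", "各", 30),
    ("きく", "聞く", 60), ("きく", "菊", 5),
    ("かき", "柿", 10),
    ("さく", "咲く", 20),
]


def key_sequence(yomikata):
    """Ambiguous key sequence for a reading, one keystroke per kana."""
    return "".join(KEY_OF_KANA[kana] for kana in yomikata)


def build_index(vocab):
    """Map each key sequence to its Yomikata, most frequent first."""
    totals = defaultdict(lambda: defaultdict(int))
    for yomikata, _midashigo, freq in vocab:
        # A Yomikata's frequency is the sum over all of its Midashigo.
        totals[key_sequence(yomikata)][yomikata] += freq
    return {
        seq: sorted(by_yomi, key=lambda y: -by_yomi[y])
        for seq, by_yomi in totals.items()
    }


INDEX = build_index(VOCAB)
```

In this toy data the single most frequent Midashigo belongs to one Yomikata, yet a different Yomikata is ranked first for the key sequence "22" because its summed frequency is higher.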

In accordance with one aspect of the invention, the user presses an unambiguous Select key to delimit an entered keystroke sequence. After receiving the Select key, the disambiguating system automatically selects the most frequently occurring Yomikata and adds the kana to the text being constructed when the user continues to enter additional text. By default, the Yomikata is shown on the display in the form of hiragana, unless the kana is one of the few katakana for which there is no corresponding hiragana (e.g. {character pullout}). In another embodiment, the Yomikata is displayed in the form of katakana, or, in some of the alternate keyboard arrangements described below, in the form of romaji.

In accordance with another aspect of the invention, the Select key that is pressed to delimit the end of a keystroke sequence is also used to select less commonly occurring Yomikata. If the most commonly occurring Yomikata presented on the display is not the desired Yomikata, the user presses the Select key again to advance from the most frequently occurring Yomikata to the second most frequently used Yomikata, which replaces the Yomikata first displayed. If this is not the desired Yomikata, the user presses the Select key again to advance to the third most frequently used Yomikata, and so on. By repetitively pressing the Select key, the user may therefore select the desired Yomikata from the stored vocabulary module. In accordance with another aspect of the invention, upon reaching the end of the Yomikata found in the stored vocabulary, the first and most frequently occurring Yomikata is again displayed, and the cycle repeats.

In accordance with yet another aspect of the invention, each keystroke of the input sequence is also interpreted as a numeral associated with the key, so that the last item displayed following the associated Yomikata is the number corresponding to the input key sequence. This number may be selected for output, thus eliminating the need for a separate numeric mode in the system.
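
The Select key behavior of the preceding paragraphs, including the numeric interpretation placed after the last Yomikata, may be sketched as follows. This is a minimal Python sketch under stated assumptions: the function names are hypothetical, and the example Yomikata list is invented.

```python
def selection_cycle(yomikata, key_sequence):
    """Items reachable with the Select key: the Yomikata, then the digits."""
    return list(yomikata) + [key_sequence]


def after_select_presses(yomikata, key_sequence, presses):
    """Item displayed after a given number of Select key presses.

    Before any press the default (first) item is shown provisionally.
    The first press selects that same item; each later press advances,
    wrapping back to the first item after the numeric interpretation.
    """
    items = selection_cycle(yomikata, key_sequence)
    return items[max(presses - 1, 0) % len(items)]
```

For example, with the Yomikata list ["かく", "きく", "かき"] and the key sequence "22", the fourth press of the Select key displays "22" and the fifth press returns to the first Yomikata.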

In accordance with another aspect of the invention, once the user has selected the desired Yomikata corresponding to the input key sequence, if the desired Midashigo is identical to the Yomikata which has been selected and is already shown on the display as kana text (i.e. no conversion is required), the user may simply proceed to press the keys corresponding to the next desired text to be input. No additional confirmation or conversion keystroke is required: the selected text has already been provisionally sent to the display for output, and unless it is explicitly changed (e.g. by additional presses of the Select key), it will remain as part of the output text. The user can immediately proceed to enter additional following text, move the text cursor location, or select some other system function. If the desired Midashigo is not the selected Yomikata on the display, i.e., the desired Midashigo consists of kanji, kanji plus hiragana, or katakana, the user presses a Convert key until the desired Midashigo is displayed. If the first (default) Yomikata displayed after the entry of a key sequence is the desired Yomikata, a user need not press the Select key, and may immediately press the Convert key to obtain the desired Midashigo.

The present invention also discloses a method for unambiguously generating the syllables of the Japanese language from ordered pairs of keystrokes on a reduced keyboard in such a way as to meet two of the criteria mentioned above. First, the arrangement of the syllables of the Japanese language (kana) on the keyboard, and the method by which they are generated, are easy for a native speaker to understand and learn to use. Second, the arrangement tends to minimize the number of keystrokes required to unambiguously enter the syllables of the Japanese language. In this aspect of the invention, a sequence of two keystrokes is entered to specify each syllable unambiguously, including the syllables with palatalized vowels shown in Table 3 that are written with two kana each.

Input sequences of keystrokes are interpreted as ordered pairs of keystrokes which select a character according to its position in a two-dimensional matrix. The first keystroke of each ordered pair specifies the row of the matrix in which the desired character appears, and the second keystroke of each pair specifies the column. The organization of the characters in the first five columns of the matrix conforms to the manner in which the Japanese syllabary is learned and conceptualized by a native Japanese speaker, as they are shown in Table 1. An additional three columns may be organized in a manner that corresponds with the natural model of how the syllables with palatalized vowels are formed (each as a combination of two kana), although these are generally regarded as a separate matrix from the basic syllabary (Table 3). Two more columns may be added to handle two special cases (small {character pullout} and {character pullout}) that do not fit into the simple patterns of the first eight columns. These two characters can also be handled somewhat differently in a variety of alternate embodiments. The simplicity and logical organization of this matrix approach makes it possible to use the matrix even when there is no display available to provide feedback to the user. When a display is available, the matrix can be used to organize how feedback is provided to make the operation of the system transparent to the user.
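
As a hedged illustration of the ordered-pair decoding described above, the following Python sketch selects a syllable from a row given by the first keystroke and a column given by the second. The row contents follow the conventional ordering of the syllabary. The column numbering (keys 1 through 5 for the vowel columns, keys 6 through 8 for the palatalized columns), the omission of the irregular rows and of the two special-case columns, and the names ROWS, decode_pair and decode are assumptions made for illustration and are not taken from the matrices of the figures.

```python
# Regular rows only; the irregular rows and special cases are omitted.
ROWS = {
    "1": "あいうえお", "2": "かきくけこ", "3": "さしすせそ", "4": "たちつてと",
    "5": "なにぬねの", "6": "はひふへほ", "7": "まみむめも", "9": "らりるれろ",
}
SMALL_Y = "ゃゅょ"  # second kana of each palatalized syllable


def decode_pair(first, second):
    """Return the syllable selected by one ordered pair of keystrokes."""
    row = ROWS[first]
    col = int(second)
    if 1 <= col <= 5:
        return row[col - 1]
    if 6 <= col <= 8 and first != "1":
        # Palatalized syllable: the row's i-column kana plus a small y-kana.
        return row[1] + SMALL_Y[col - 6]
    raise ValueError(f"no syllable at row {first!r}, column {second!r}")


def decode(keys):
    """Decode an even-length keystroke string, two keystrokes per syllable."""
    if len(keys) % 2:
        raise ValueError("incomplete ordered pair")
    return "".join(decode_pair(keys[i], keys[i + 1])
                   for i in range(0, len(keys), 2))
```

Under these assumptions the three-syllable word きょう is entered with six keystrokes, "28" "13" for its first two syllables being decoded as きょ and う respectively.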

The Japanese syllabary includes 108 syllables (counting the "small" {character pullout}, and {character pullout} as separate syllables from the full-sized {character pullout}, and {character pullout} since they are written and pronounced in a distinct fashion). There are some additional seldom used syllables, such as the "small" versions of the vowel syllables {character pullout}, and {character pullout} that are primarily used in katakana. These seldom used syllables may also be easily generated by using the matrix system as discussed above. Of the 108 syllables, 37 are generated by simply adding one of the diacritic marks, dakuten (゛) or handakuten (゜), to one of the other 71 syllables. These 71 syllables without diacritic marks can be logically organized into a single matrix of nine or ten rows and eight to ten columns. A plurality of the keys on the reduced keyboard of the present invention may be labeled with two kana, one representing the consonant associated with a given row of the matrix, and a second kana representing the vowel associated with a given column of the matrix.

The organization is logical and intuitive for a native speaker of Japanese for 106 of the 108 syllables, and the method for generating the remaining two syllables, i.e., small {character pullout}, and {character pullout}, is simple and easily learned. Every syllable is generated by a single pair of keystrokes, including syllables with palatalized vowels that are represented by two separate kana. This results in significantly fewer keystrokes than is required by the currently used multiple-stroke method for entering kana on a reduced keyboard. Thus, the present invention provides a reduced keyboard that is easily understood and quickly learned by native speakers of Japanese, and that is efficient in terms of reducing the length of input keystroke sequences.

In yet another aspect of the invention, both ambiguous and unambiguous methods of specification of syllables as described above may be combined to achieve greater efficiencies in the input method. In one preferred embodiment, the first syllable of each word or phrase to be generated is specified unambiguously by entering an ordered pair of keystrokes using the matrix approach as discussed above, while the remaining syllables of the word or phrase are specified ambiguously with a single keystroke for each syllable using the word level disambiguation method.

In accordance with yet another aspect of the invention, multiple interpretations of the keystroke sequence are provided to the user in the selection list. The keystroke sequence is interpreted as forming one or more words, where the most frequently occurring corresponding word is displayed, and where other corresponding words may also be displayed in a selection list. Simultaneously, the keystroke sequence is interpreted as a number (as explained above), as a word entered using the two-stroke method of the present system or the well-known multiple-stroke specification method, and as a stem of an uncompleted word. On large-screen devices, the multiple interpretations are simultaneously presented to the user in a selection list region for each keystroke of a keystroke sequence entered by the user. On any device, the user may select from the alternate interpretations by pressing the Select key a number of times.

In accordance with still another aspect of the invention, the database of words and phrases that is used to disambiguate input key sequences is stored in a vocabulary module using tree data structures. Words corresponding to a particular keystroke sequence are constructed from data stored in the tree structure in the form of instructions. The instructions modify the set of words and word stems associated with the immediately preceding keystroke sequence (i.e., the particular keystroke sequence without the last keystroke) to create a new set of words and word stems associated with the keystroke sequence to which the current keystroke has been appended. Constructing words in this manner reduces the storage space of the vocabulary module, since the instructions to build word stems are stored only once, at the top of the tree structure, and are shared by all words constructed from the word stems. The tree structure also greatly reduces processing requirements, since no searching is required to locate stored objects such as words and word stems. The objects stored in the tree data structure may contain frequency or other ranking information which indicates which object is to be displayed first to the user, thus further reducing processing requirements. Furthermore, this tree data structure may be modified using a special algorithm which further compresses the total size required for the database, without engendering an additional processing burden when the database is utilized to retrieve objects associated with keystroke sequences.
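
The manner in which instructions stored in a tree build each set of objects from the set associated with the preceding keystroke sequence may be sketched as follows. This Python sketch is a simplification under stated assumptions: the Node layout, the (parent index, kana) form of an instruction, and the toy tree contents are hypothetical and do not reproduce the node format of the figures.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Each instruction is (index into the parent's object list, kana to append),
    # listed in the order in which the resulting objects should be presented.
    instructions: list
    children: dict = field(default_factory=dict)


def retrieve(root, keys):
    """Rebuild the objects for a key sequence by walking the tree once."""
    objects = [""]  # the empty stem at the root
    node = root
    for key in keys:
        node = node.children.get(key)
        if node is None:
            return []  # no stored word or stem matches this sequence
        # Build the new object list from the previous one; no searching needed.
        objects = [objects[i] + kana for i, kana in node.instructions]
    return objects


# Toy tree: the stems built at key "2" are stored once and shared below.
ROOT = Node([], {
    "2": Node([(0, "か"), (0, "き")], {
        "2": Node([(0, "く"), (1, "く"), (0, "き")]),
    }),
})
```

Each keystroke costs one child lookup and one pass over that node's instructions, which is consistent with the statement above that no searching is required to locate stored objects.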

In yet another aspect of the present invention, the tree data structure includes two types of instructions. Primary instructions create the Yomikata of the words and phrases stored in a vocabulary module which consist of sequences of kana corresponding to the pronunciation of the words and phrases. Corresponding to each Yomikata is a list of secondary instructions which create the Midashigo associated with each Yomikata. Each Yomikata is created by a primary instruction which modifies one of the Yomikata associated with the immediately preceding keystroke sequence. Likewise, each Midashigo is created by a secondary instruction which modifies one of the Midashigo associated with the Yomikata which was modified by the primary instruction with which the secondary instruction is associated.
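
The relationship between primary and secondary instructions may be illustrated by the following simplified Python sketch. It assumes that a primary instruction has the form (parent Yomikata index, kana, list of secondary instructions) and that a secondary instruction has the form (parent Midashigo index, text to append). These forms, the function name step, and the example data are hypothetical; the several distinct types of secondary instruction are not modeled.

```python
def step(state, primaries):
    """Apply one node's instructions to the state of the preceding sequence.

    state is a list of (Yomikata, [Midashigo, ...]) pairs.
    """
    new_state = []
    for parent_index, kana, secondaries in primaries:
        yomikata, midashigo = state[parent_index]
        # The primary instruction extends one Yomikata of the parent state.
        # Its secondary instructions extend Midashigo of that same Yomikata.
        new_state.append((
            yomikata + kana,
            [midashigo[m] + text for m, text in secondaries],
        ))
    return new_state


ROOT_STATE = [("", [""])]

# Invented instructions for the key "2" and then the key sequence "22".
KEY_2 = [
    (0, "か", [(0, "か"), (0, "書")]),
    (0, "き", [(0, "き"), (0, "聞")]),
]
KEY_22 = [
    (0, "く", [(1, "く"), (0, "く")]),
    (1, "く", [(1, "く")]),
]

STATE_2 = step(ROOT_STATE, KEY_2)
STATE_22 = step(STATE_2, KEY_22)
```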

The internal, logical representation of the keys in an embodiment need not mirror the physical arrangement represented by the labels on the actual keys. For example, in a database constructed to represent a Japanese vocabulary module, four additional characters ({character pullout}) may also be associated with the key labeled only with the single character {character pullout}. Similarly, characters with special diacritic marks such as the dakuten and handakuten (゛ and ゜) can also be associated with a key. For example, the characters ({character pullout}, and {character pullout}) may also be associated with the key labeled only with the single character {character pullout}. This allows the user to easily recall and type words containing characters with diacritic marks, performing only one key activation per character, simply by activating the logically associated physical key for the associated accented character.
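
One possible way to derive such a logical association for characters bearing diacritic marks is sketched below in Python. The sketch relies on Unicode canonical decomposition, under which a kana with dakuten or handakuten decomposes into its base kana followed by a combining mark. This is a swapped-in technique for illustration and not the table of the figures; the key assignment in ROWS and the function names are assumptions.

```python
import unicodedata

# Assumed phone-style kana rows per data key (illustrative only).
ROWS = {
    "1": "あいうえお", "2": "かきくけこ", "3": "さしすせそ", "4": "たちつてと",
    "5": "なにぬねの", "6": "はひふへほ", "7": "まみむめも", "8": "やゆよ",
    "9": "らりるれろ", "0": "わをん",
}
KEY_OF_BASE = {kana: key for key, row in ROWS.items() for kana in row}


def logical_key(kana):
    """Key for a kana, treating marked kana as their unmarked base kana."""
    # NFD splits e.g. が into か followed by the combining dakuten U+3099.
    base = unicodedata.normalize("NFD", kana)[0]
    return KEY_OF_BASE[base]


def logical_key_sequence(word):
    """One key activation per character, diacritics included."""
    return "".join(logical_key(kana) for kana in word)
```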

Furthermore, in yet another aspect of the invention, greater efficiencies in database compression are achieved by storing each kanji character only once in the database structure for any particular associated reading. In general, a database may include a number of different instances of the same kanji with the same reading (e.g. the kanji {character pullout} (read as {character pullout}) in {character pullout} ({character pullout}) and {character pullout} ({character pullout})). In one preferred embodiment, starting immediately from the root of the tree structure, each reading associated with a given kanji is included in the database, together with a full specification of the code for the kanji. Other occurrences of the kanji having the same reading in the database (not starting immediately from the root of the tree structure) are defined through an indirect reference, which specifies the relative position of the fully specified kanji in the list of Midashigo associated with the reading starting immediately from the root of the tree structure.

The combined effects of the assignment of multiple characters to keys, the delimiting of words using a Select key, the selection of the desired Yomikata using a Select key optionally followed by the selection of the desired Midashigo using a Convert key, the presentation of the most commonly occurring word or word stem as the first word in the selection list, the inclusion of multiple interpretations in the selection list, the automatic addition of a selected word to a sentence by the first keystroke of the following word, the ability to compress a large database for disambiguation without incurring any significant processing penalties, and the ability to generate words with characters with diacritic marks by typing the key associated with the characters without a diacritic mark produce a surprising result: for the Japanese language, well over 99% of words found in a representative corpus of text material can be typed on the system with extremely high efficiency. On average, in comparison with entering text on a full keyboard containing one key for each possible basic kana (i.e. 50 keys including those shown in Table 1 plus the "small" {character pullout}, and {character pullout}), only 0.61 additional keystrokes per word are required using the reduced keyboard of the present invention with only twelve keys. Compared with a conventional keyboard where kana are entered by spelling the desired word using romaji, on average the system actually requires even fewer keystrokes. When words include characters with diacritics, additional keystroke savings can be achieved. When the words are presented in frequency of use order, the desired word is most often the first word presented and in many cases is the only word presented. The user can then proceed to enter the next word with no additional keystrokes required. High speed entry of text is therefore achieved using a keyboard having a small number of keys.

The reduced keyboard disambiguation system disclosed herein reduces the size of the computer or other device that incorporates the system. The reduced number of keys allows a device to be constructed to be held by the user in one hand, while being operated with the other hand. The disclosed system is particularly advantageous for use with cellular telephones, PDAs, two-way pagers, or other small electronic devices that benefit from accurate, high-speed text entry. The system efficiently compresses a large database for disambiguating keystroke sequences without requiring additional processing bandwidth when utilizing the compressed database. The system can provide both efficiency and simplicity when implemented on a touchscreen based device or a device with a limited number of mechanical keys that may also have limited display screen area.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing aspects and many of the attendant advantages of this invention will become more readily appreciated as the same becomes better understood by reference to the following detailed description, when taken in conjunction with the accompanying drawings, wherein:

FIG. 1a is a schematic view of an embodiment of a cellular telephone incorporating a reduced keyboard disambiguating system of the present invention;

FIG. 1b is a schematic view of a cellular telephone keypad, similar to FIG. 1a, but wherein each of a plurality of keys is associated with one or more romaji characters;

FIG. 1c is a schematic view of a cellular telephone keypad incorporating an embodiment of the reduced keyboard system of the present invention with limited or no display capabilities;

FIG. 1d is a schematic view of a cellular telephone keypad incorporating an embodiment of the reduced keyboard system of the present invention with display capabilities, showing the display following an activation of key 2 as the first of an ordered pair of keystrokes in the unambiguous two-stroke method;

FIG. 2a is a hardware block diagram of the reduced keyboard disambiguating system of FIG. 1a;

FIG. 2b is a schematic view of an embodiment of a portable computer touchscreen incorporating a reduced keyboard system of the present invention, showing the keypad displayed prior to the first keystroke of an ordered pair of keystrokes in the unambiguous two-stroke method;

FIG. 2c is a schematic view of the touchscreen of FIG. 2b, showing the keypad displayed following an activation of the key associated with the syllables {character pullout}, and {character pullout} as the first of an ordered pair of keystrokes;

FIG. 3 is a flow chart of an embodiment of word-level disambiguating software for a reduced keyboard disambiguating system for the Japanese language;

FIG. 4 is a schematic view of an alternate embodiment of a portable computer touchscreen incorporating a reduced keyboard system of the present invention;

FIG. 5 is a schematic view of yet another alternate embodiment of a portable computer touchscreen incorporating a reduced keyboard system of the present invention having nine keys;

FIG. 6 compares the physical association of symbols to keys with an instance of a logical association including additional accented variations of the characters appearing on the physical key;

FIG. 7 is an example of a table associating logical symbols to key indices;

FIG. 8A depicts an internal arrangement of data in a node of a tree of a vocabulary module;

FIG. 8B depicts the semantic components of an embodiment of a primary instruction to build a Yomikata text object;

FIG. 8C depicts the semantic components of an embodiment of four different types of secondary instructions used to build Midashigo text objects;

FIG. 8D depicts the semantic components of another preferred embodiment of two of the four different types of secondary instructions used to build Midashigo text objects;

FIG. 9 depicts four examples of possible internal data items in the structure of nodes in an embodiment;

FIG. 10 depicts the preferred tree structure of an uncompressed vocabulary module;

FIG. 11 depicts example states of object lists, which are the preferred embodiment for intermediate storage of objects in the process of being retrieved from the vocabulary modules;

FIG. 12 is a flowchart of a preferred embodiment of a software process for retrieving text objects from a vocabulary module given a list of key presses;

FIG. 13 is a flowchart of an embodiment of a software process for traversing the tree structure of the vocabulary module given a single key press and altering the state of the object lists;

FIG. 14 is a flowchart of an embodiment of a software process for building a folded, compressed vocabulary module;

FIG. 15 is a flowchart of an embodiment of a software process for folding the tree data structure of a vocabulary module;

FIG. 16 is a flowchart of an embodiment of a software process for locating a second node in a tree of a vocabulary module which has the greatest redundancy in comparison to a given node;

FIG. 17 is a flowchart of an embodiment of a software process for computing the redundancy between two nodes of a tree in a vocabulary module;

FIG. 18 is a chart showing the partial contents of the database of the present invention for a sequence of three consecutive keystrokes on a key, which is ambiguously associated with the syllables {character pullout}, and {character pullout};

FIG. 19 shows three representative examples of system operation, showing the contents of the text display of the system illustrated in FIG. 1a following each keystroke on a sequence of keys entered while inputting text;

FIG. 20 is a table showing basic Japanese syllabary;

FIG. 21 is a table showing additional Japanese syllabary using diacritics;

FIG. 22 is a table showing Japanese syllabary with palatalized vowels;

FIG. 23a is a table showing a classifying matrix for Japanese syllabary;

FIG. 23b is a table showing an alternative classifying matrix for Japanese syllabary; and

FIG. 24 is a table showing another alternative classifying matrix for Japanese syllabary.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

I. System Construction and Basic Operation

With reference to FIG. 1a, a reduced keyboard disambiguating system formed in accordance with the present invention is depicted as incorporated in a portable cellular telephone 52 having a display 53. Portable cellular telephone 52 contains a reduced keyboard 54 implemented on the standard telephone keys. For purposes of this application, the term "keyboard" is defined broadly to include any input device including a touch screen having defined areas for keys, discrete mechanical keys, membrane keys, etc. An arrangement of the kana on each key in the keyboard 54 is depicted in FIG. 1a, corresponding to what has become a de facto standard for Japanese telephones. Note that keyboard 54 thus has a reduced number of data entry keys as compared to a standard QWERTY keyboard, or a keyboard which includes at least 46 keys, where one key is assigned for each of the kana in the basic Japanese syllabary shown in Table 1. More specifically, the preferred keyboard shown in this embodiment contains ten data keys numbered 1 through 0 arranged in a 3-by-4 array, together with a Select key 60, and a Convert key 62. Optionally, the keyboard may also include a Clear key 64 to delete previous keystrokes; a Mode key 67 for entering modes to type unambiguous characters, numbers, and symbols; and a Diacritic key 68 to add dakuten and handakuten to previously entered kana.

Data are input into the disambiguation system via keystrokes on the reduced keyboard 54. In a first preferred embodiment, as a user enters a keystroke sequence using the keyboard, text is displayed on the telephone display 53. When the system is implemented on a device with limited display space, such as the cellular telephones depicted in FIGS. 1a-d, only the currently selected or most likely word object is displayed at the insertion point 88 in the text being generated. As keys are pressed in sequence to enter a desired word, the most likely word corresponding to the input sequence is displayed in some distinguishing format. In the preferred embodiment depicted in FIG. 1a, the current word is displayed with a dotted underline. As explained in more detail below, after the Select key 60 or Convert key 62 is pressed, the dotted underline is changed to a solid underline.

In a second preferred embodiment illustrated in FIGS. 2b and 2c, two regions are defined on the display 53 to display information to the user. A text region 66 displays the text entered by the user just as described for the first embodiment described above, serving as a buffer for text input and editing. As shown in FIGS. 2b and 2c, a selection list region 77, typically located below the text region 66, shows a list of words and other interpretations corresponding to the keystroke sequence entered by a user. The selection list region aids the user in resolving the ambiguity in the entered keystrokes by simultaneously showing both the most frequently occurring interpretation of the input keystroke sequence, and other less frequently occurring alternate interpretations displayed in descending order of frequency.

A block diagram of the reduced keyboard disambiguating system hardware is provided in FIG. 2a. The keyboard 54 and the display 53 are coupled to a processor 100 through appropriate interfacing circuitry. Optionally, a speaker 102 is also coupled to the processor. The processor 100 receives input from the keyboard, and manages all output to the display and speaker. Processor 100 is coupled to a memory 104. The memory includes a combination of temporary storage media, such as random access memory (RAM), and permanent storage media, such as read-only memory (ROM), floppy disks, hard disks, or CD-ROMs. Memory 104 contains all software routines to govern system operation. Preferably, the memory contains an operating system 106, disambiguating software 108, and associated vocabulary modules 110 that are discussed in additional detail below. Optionally, the memory may contain one or more application programs 112, 114. Examples of application programs include word processors, software dictionaries, and foreign language translators. Speech synthesis software may also be provided as an application program, allowing the reduced keyboard disambiguating system to function as a communication aid.

Returning to FIG. 1a, the reduced keyboard disambiguating system allows a user to quickly enter text or other data using only a single hand. Data are entered using the reduced keyboard 54. Each of the data keys 1 through 0 has multiple meanings, represented on the top of the key by characters, numbers, and other symbols. (For the purposes of this disclosure, each data key will be identified by the number and character(s) appearing on the data key, e.g., 3{character pullout} to identify the upper right data key.) Since individual keys have multiple meanings, keystroke sequences are ambiguous as to their meaning. As the user enters data, the various keystroke interpretations are therefore displayed in multiple regions on the display to aid the user in resolving any ambiguity. On large-screen devices, a selection list of possible interpretations of the entered keystrokes is also displayed to the user in the selection list region. The first entry in the selection list is selected as a default interpretation and displayed in the text region 66 at an insertion point 88. In the preferred embodiment, this entry is displayed with a dotted underline drawn beneath it at the insertion point 88 (and in the selection list region on large-screen devices). The formatting signifies that this object is implicitly selected by virtue of being the most frequently occurring object in the current selection list. If the display includes a selection list region as in FIG. 2b, this formatting also establishes a visual relationship between the object at the insertion point 88 and the same object displayed in the selection list region 77. In FIG. 1a, no selection list is displayed, and only the default object (the object that would be displayed first in the selection list prior to any activation of the Select key), or the currently selected object if one has been explicitly selected, is displayed at the insertion point 88.

The selection list of the possible interpretations of the entered keystrokes may be ordered in a number of ways. In a normal mode of operation, the keystrokes are initially interpreted as the entry of kana to spell a Yomikata corresponding to the desired word (hereinafter the "word interpretation"). For example, as shown in FIG. 2b, a keystroke sequence {character pullout}, and {character pullout} has been entered by a user. As keys are entered, a vocabulary module look-up is simultaneously performed to locate Yomikata that match the keystroke sequence. The Yomikata are returned from the vocabulary module according to frequency of use, with the most commonly used Yomikata listed first. Using the example keystroke sequence, the Yomikata "{character pullout}", "{character pullout}", and "{character pullout}" are identified from the vocabulary module as being the three most probable Yomikata corresponding to the keystroke sequence. Of the eight identified Yomikata in this selection list, "{character pullout}" is the most frequently used, so it is taken as the default interpretation and provisionally posted as hiragana text at the insertion point 88. As shown in FIG. 1a, prior to pressing the Select key 60, this first Yomikata is taken as the default interpretation and is posted at the insertion point 88 using a distinctive format. This format indicates that a subsequent keystroke on one of the data keys will be appended to the current key sequence rather than start a new sequence. For example, as in FIG. 1a, the distinctive formatting consists of displaying the Yomikata as hiragana text with a dotted underline. The list of other potential matching Yomikata is kept in memory, sorted according to their relative frequency.

In the preferred embodiment, following entry of the keystroke sequence corresponding to the desired Yomikata, the user simply presses the Select key 60. The dotted underline beneath the default Yomikata "{character pullout}" displayed at the insertion point 88 is replaced with a solid underline. If the default Yomikata displayed is not the desired Yomikata, the Select key 60 is pressed repeatedly until the desired Yomikata appears. In one preferred embodiment, after all of the Yomikata in memory that match the key sequence have been displayed through repeated activation of the Select key 60, the key sequence is interpreted as a number, where each keystroke generates the digit appearing on the label of the key. This allows numbers to be generated without a separate numeric mode, and also serves as an easily recognizable indicator of the end of the selection list of Yomikata interpretations. The next press of the Select key 60 cycles back to the first Yomikata in the selection list.

Once the desired Yomikata is displayed, if the desired Midashigo (textual interpretation) is in fact identical to the Yomikata that is already displayed in hiragana text, the user proceeds to press the data key corresponding to the first kana of the next Yomikata to be entered. On the other hand, if the desired Midashigo consists of kanji, kanji plus hiragana, katakana, or some combination thereof, the user presses the Convert key 62. This causes the displayed Yomikata to be replaced with the most frequently occurring Midashigo that is associated with that Yomikata in the vocabulary module. Repeated presses of the Convert key 62 replace the displayed Midashigo with other associated Midashigo in descending order of frequency. In one preferred embodiment, after all of the Midashigo in memory that are associated with the selected Yomikata have been displayed through repeated activation of the Convert key 62, the selected Yomikata is displayed as katakana. This allows katakana words to be generated without a separate mode, and also serves as an easily recognizable indicator of the end of the selection list of Midashigo interpretations. In another preferred embodiment, if the user wishes to choose a Midashigo that is associated with the first (default) Yomikata associated with the input key sequence, the Convert key 62 can be pressed immediately to obtain the desired Midashigo without first pressing the Select key 60.
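
The Convert key behavior described above, in which the katakana form of the selected Yomikata follows the last associated Midashigo, may be sketched in Python as follows. The function names and the example Midashigo list are hypothetical. The wrap-around after the katakana interpretation is an assumption made by analogy with the Select key and is not stated above. The hiragana-to-katakana conversion relies on the fixed offset of 0x60 between the two Unicode blocks.

```python
def to_katakana(hiragana):
    """Convert hiragana to katakana; other characters pass through."""
    return "".join(
        chr(ord(c) + 0x60) if "ぁ" <= c <= "ゖ" else c
        for c in hiragana
    )


def conversion_cycle(yomikata, midashigo):
    """Items reachable with the Convert key: Midashigo, then katakana."""
    return list(midashigo) + [to_katakana(yomikata)]


def after_convert_presses(yomikata, midashigo, presses):
    """Text displayed after a given number of Convert key presses."""
    if presses == 0:
        return yomikata  # still shown as hiragana, no conversion yet
    items = conversion_cycle(yomikata, midashigo)
    # Assumed: the cycle wraps to the first Midashigo after the katakana form.
    return items[(presses - 1) % len(items)]
```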

After one or more presses of either or both the Select key 60 and the Convert key 62, any data key that is pressed removes the special formatting (solid underline in the preferred embodiment) of the displayed Yomikata or Midashigo, and becomes the first keystroke of a new key sequence to be interpreted by the system. No special keystroke is required to confirm the interpretation of the preceding keystroke sequence.

In the preferred embodiment described above, pressing the Select key 60 cycles forward through the Yomikata in memory that are associated with the current key sequence (in descending order of frequency). In another preferred embodiment, pressing-and-holding the Select key 60 past a predetermined time threshold cycles backward through the Yomikata in memory in ascending order of frequency. Thus, when the numeric interpretation is included at the end of the sequence of associated Yomikata in memory as described above, a press-and-hold of the Select key 60 prior to any regular press of the Select key 60 cycles backward immediately to the numeric interpretation. Repeatedly pressing-and-holding the Select key 60 then cycles back up through the associated Yomikata in ascending order of frequency. Likewise, pressing-and-holding the Convert key 62 cycles backward through the Midashigo associated with the currently selected Yomikata in ascending order of frequency. Similarly, a first press-and-hold of the Convert key 62 prior to any regular press of the Convert key 62 cycles backward immediately to the katakana interpretation.
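Forward and backward cycling can be expressed with one modular index, as in this sketch. The `step` function and the use of `None` for "no Select press yet" are hypothetical conventions chosen for the example.

```python
def step(index, length, held):
    """Return the next list position for a tap or a press-and-hold.

    index  -- current position, or None before the first press
    length -- number of entries (Yomikata plus the numeric entry)
    held   -- True for press-and-hold (backward), False for a tap (forward)
    """
    if index is None:
        # A first press-and-hold jumps straight to the last entry (the
        # numeric interpretation); a first tap confirms the default.
        return length - 1 if held else 0
    return (index + (-1 if held else 1)) % length


entries = ["きかく", "みかく", "545159"]
position = step(None, len(entries), held=True)
print(entries[position])  # last entry: the numeric interpretation
```

The same function would serve the Convert key, with the katakana interpretation as the last entry of the Midashigo list.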

Still referring to FIG. 1a, in another preferred embodiment, when entering data keys, the Clear key 64 can be pressed to delete the previously entered data key. If all data keys in the current key sequence are thus deleted, pressing the Clear key 64 deletes the character on the text display 53 to the left of the insertion point 88, where a standard text cursor is displayed when the current selection list is empty. After one or more presses of either or both the Select key 60 and the Convert key 62, pressing the Clear key 64 replaces the currently selected textual interpretation at the insertion point 88 with the default Yomikata interpretation of the current key sequence, but does not delete any of the data keys from the key sequence. In other words, the first press of the Clear key 64 after any number of presses of the Select key 60 and/or the Convert key 62 effectively "deletes" all of the activations of the Select key 60 and the Convert key 62, returning the system to the state immediately prior to the first press of either the Select key 60 or the Convert key 62. In another preferred embodiment, after one or more presses of the Convert key 62, pressing the Select key 60 replaces the currently selected Midashigo at the insertion point 88 with the Yomikata with which the Midashigo is associated. Further presses of the Select key 60 continue to cycle forward from that point through the other Yomikata in memory that are associated with the current key sequence (in descending order of frequency).
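The three cases handled by the Clear key can be summarized in a short sketch. The `EntryState` record and its field names are hypothetical, and the sketch models only the behavior described in the preceding paragraph.

```python
from dataclasses import dataclass, field


@dataclass
class EntryState:
    keys: list = field(default_factory=list)  # data keys of the current sequence
    cycling: bool = False  # True after any Select or Convert press
    text: str = ""         # committed text left of the insertion point


def press_clear(state):
    if state.cycling:
        # Undo every Select/Convert activation but keep the key sequence,
        # so the default Yomikata is shown again.
        state.cycling = False
    elif state.keys:
        state.keys.pop()  # delete the previously entered data key
    else:
        state.text = state.text[:-1]  # delete the character left of the cursor
    return state


state = EntryState(keys=["5", "4"], cycling=True, text="あ")
for _ in range(4):
    press_clear(state)
print(state)
```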

In another preferred embodiment, activation of any other means which explicitly generates an unambiguous character (such as entering a special Symbols mode and pressing a key that is unambiguously associated with a single specific character) serves to terminate the current key sequence. As a result, any special formatting (dotted or solid underline in the preferred embodiment) of the displayed Yomikata or Midashigo at the insertion point 88 is removed, and the specific unambiguous character is appended to the output word at a new insertion point 88.

Provisionally posting the selected Yomikata or Midashigo to the text region at the insertion point 88 allows the user to maintain their attention on the text region without having to refer to the selection list. At the user's option, the system can also be configured such that, upon receipt of the first press of the Select key 60 (or Convert key 62), the Yomikata (or Midashigo) provisionally posted at the insertion point 88 can expand (vertically or horizontally) to display a copy of the current selection list. The user may select the maximum number of words to be displayed in this copy of the selection list. Alternatively, the user may elect to have the selection list always displayed at the insertion point 88, even prior to the first activation of the Select key. The disambiguating system interprets the start of the next word (signaled by the activation of an ambiguous data key or the generation of an explicit unambiguous character) as an affirmation that the currently selected entry is the desired entry. The selected word therefore remains at the insertion point 88 as the choice of the user; the underline disappears completely and the word is redisplayed in normal text without special formatting.

In the majority of text entry, keystroke sequences are intended by the user as kana forming a Yomikata. It will be appreciated, however, that the multiple characters and symbols associated with each key allow the individual keystrokes and keystroke sequences to have several interpretations. In the preferred reduced keyboard disambiguating system, various different interpretations are automatically determined and displayed to the user at the same time as the keystroke sequence is interpreted and displayed to the user as a list of words.

For example, the keystroke sequence is interpreted in terms of the word stems corresponding to possible valid sequences of kana that a user may be entering (hereinafter the "stem interpretation"). Unlike word interpretations, word stems are incomplete words. By indicating the possible interpretations of the last keystrokes, the word stems allow the user to easily confirm that the correct keystrokes have been entered, or to resume typing when his or her attention has been diverted in the middle of the word. There are key sequences which correspond to the partial entry of a long word or phrase, but which do not correspond to any completed word or phrase. In such cases, the most useful feedback that can be provided to the user is to show the kana that correspond to the stem of the word that has been entered up to that point. In the example shown in FIG. 2b, the keystroke sequence {character pullout} can be interpreted as forming the valid stem "{character pullout}" (leading to the word "{character pullout}"). The stem interpretations are therefore provided as entries in the selection list. Preferably, the stem interpretations are sorted according to the composite frequency of the set of all possible words that can be generated from each stem by additional keystrokes on the data keys. The maximum number and the minimum composite frequency of such entries to be displayed may be selected by the user or configured in the system, so that some stem interpretations may not be displayed. When listing a stem interpretation in the selection list, the stem is omitted if a stem interpretation duplicates a word that appears in the selection list. When the stem is omitted, however, the word corresponding to the omitted stem may be marked with a symbol to show that there are also longer words which have this word as their stem. Stem interpretations provide feedback to the user by confirming that the correct keystrokes have been entered to lead to the entry of a desired word.
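The stem ordering described above can be sketched as follows. Everything here is an illustrative assumption: the assignment of one digit per kana row, the sample vocabulary and its frequencies, and the function names. The sketch sums the frequencies of all longer words reachable from each stem, sorts stems by that composite frequency, and omits any stem that duplicates a complete word, reporting it separately so the word can be marked.

```python
from collections import defaultdict

# Hypothetical key assignment: one digit per kana row (only rows used below).
ROW_KEYS = {"1": "あいうえお", "2": "かきくけこ", "5": "なにぬねの", "9": "らりるれろ"}
KEY_OF = {kana: key for key, row in ROW_KEYS.items() for kana in row}


def keys_for(word):
    return "".join(KEY_OF[kana] for kana in word)


def stem_interpretations(vocabulary, key_sequence):
    n = len(key_sequence)
    complete = {w for w in vocabulary if keys_for(w) == key_sequence}
    composite = defaultdict(int)
    for word, frequency in vocabulary.items():
        if len(word) > n and keys_for(word[:n]) == key_sequence:
            composite[word[:n]] += frequency
    # Stems that duplicate a complete word are omitted from the stem list.
    stems = sorted((s for s in composite if s not in complete),
                   key=lambda s: -composite[s])
    marked_words = sorted(s for s in composite if s in complete)
    return stems, marked_words


vocabulary = {"きく": 50, "かき": 40, "きかく": 30, "きかい": 25,
              "こころ": 20, "かきね": 5}
stems, marked_words = stem_interpretations(vocabulary, "22")
print(stems, marked_words)
```

In this sample, the stem "きか" outranks "ここ" because two longer words contribute to its composite frequency, while "かき" is not listed as a stem because it already appears as a word.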

FIG. 3 is a flow chart of a main routine of the disambiguation software that processes a selection list and determines what is to be displayed at the insertion point 88 to aid the user in disambiguating ambiguous keystroke sequences. At a block 150, the system waits to receive a keystroke from the keyboard 54. Upon receipt of a keystroke, at a decision block 151, a test is made to determine if the received keystroke is a mode selection key. If so, at a block 172 the system sets a flag to indicate the current system mode. At a decision block 173, a test is made to determine if the system mode has changed. If so, at a block 171 the display is updated as needed to reflect the current system mode. If block 151 determines the keystroke is not a mode selection key, then at a decision block 152, a test is made to determine if the received keystroke is the Select key. If the keystroke is not the Select key, then at decision block 152A, a test is made to determine if the received keystroke is the Convert key. If the keystroke is not the Convert key, then at decision block 153, a test is made to determine if the system is in a special explicit character mode such as the explicit Symbols mode. If so, at decision block 166 a test is performed to determine if any provisionally selected item is present in the selection list. If so, at a block 167 the item is accepted and is output as normal text. Then, at a block 168, the explicit character corresponding to the keystroke is output to the text area. Then, at decision block 169, a test is made to determine if the system mode should be automatically changed, as in the case of Symbols mode. If so, execution proceeds to block 170 and the system mode is returned to the previously active mode, otherwise execution returns to block 150.

If at block 153 no explicit character mode is active, at a block 154 the keystroke is added to a stored keystroke sequence. At block 156, objects corresponding to the keystroke sequence are identified from the vocabulary modules in the system. Vocabulary modules are libraries of objects that are associated with keystroke sequences. An object is any piece of stored data that is to be retrieved based on the received keystroke sequence. For example, objects within the vocabulary modules may include numbers, characters, words, components of words, stems, phrases, or system functions and macros. Each of these objects is briefly described in the following table:

    Object          Corresponding data
    Numbers         A number, each digit of which corresponds to a single
                    keystroke, e.g., the two-digit sequence "42".
    Characters      A character or sequence of characters corresponding
                    to pairs of keystrokes, e.g., the three character
                    sequence "{character pullout}". Each pair of keystrokes
                    is used to disambiguate using the two-stroke
                    specification method of inputting characters
                    unambiguously.
    Word            A Yomikata or Midashigo corresponding to single or
                    multiple keystrokes, e.g., the four character
                    word "{character pullout}".
    Stem            A sequence of kana representing a valid portion of a
                    longer sequence of kana forming a word, e.g.,
                    "{character pullout}" as a stem of the word
                    "{character pullout}."
    Phrase          A user-defined or system-defined phrase corresponding
                    to single or multiple keystrokes, e.g.,
                    "{character pullout}".
    System Macro    A word and associated code describing a system or
                    user-defined function, e.g., "<clear>" to clear the
                    current text region. In addition to the descriptive
                    word, in the vocabulary module the system macro
                    object is associated with the executable code necessary
                    for performing the specified function.
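
One possible in-memory representation of these objects is sketched below. The class names, fields, and sample entries are hypothetical. The description states only that a vocabulary module associates objects with keystroke sequences.

```python
from dataclasses import dataclass
from enum import Enum


class ObjectType(Enum):
    NUMBER = "number"
    CHARACTERS = "characters"
    WORD = "word"
    STEM = "stem"
    PHRASE = "phrase"
    SYSTEM_MACRO = "system macro"


@dataclass(frozen=True)
class VocabularyObject:
    kind: ObjectType
    text: str
    frequency: int = 0


class VocabularyModule:
    """A library of objects indexed by keystroke sequence."""

    def __init__(self, name):
        self.name = name
        self._by_keys = {}

    def add(self, key_sequence, obj):
        self._by_keys.setdefault(key_sequence, []).append(obj)

    def lookup(self, key_sequence):
        # Return a copy so callers cannot alter the stored list.
        return list(self._by_keys.get(key_sequence, []))


words = VocabularyModule("words")
words.add("545159", VocabularyObject(ObjectType.WORD, "きかく", 30))
words.add("545159", VocabularyObject(ObjectType.WORD, "みかく", 10))
print([obj.text for obj in words.lookup("545159")])
```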


While the preferred vocabulary objects are discussed above, it will be appreciated that other objects may be contemplated. For example, a graphic object may be associated with a stored graphic image, or a speech object may be associated with a stored segment of speech. A spelling object may also be envisioned that would link the keystroke sequence of commonly misspelled words and typing errors with the correct spelling of the word. To simplify processing, each vocabulary module preferably contains similar objects. It will be appreciated, however, that various objects may be mixed within a vocabulary module.

Returning to FIG. 3, at block 156 those objects that correspond to the received keystroke sequence are identified in each vocabulary module. At blocks 158-165 the objects found by looking up the keystroke sequence in the vocabulary modules are prioritized to determine the order in which objects are displayed to the user. To determine the sequence of objects displayed in the selection list, priorities are established between each vocabulary module and also between the returned objects from each vocabulary module.

To prioritize the object lists identified from the various vocabulary modules, at block 158 the mode of operation of the reduced keyboard disambiguating system is examined. As discussed above, in a normal mode of operation the word interpretations (Yomikata and Midashigo) are displayed first in the selection list. The object list from a word vocabulary module would therefore be assigned a higher priority than the object list from the other vocabulary modules. Conversely, if the disambiguating system were in the numeric mode of operation, the numeric interpretations would be assigned a higher priority than the other vocabulary modules. The mode of the disambiguating system therefore dictates the priority between vocabulary module object lists. It will be appreciated that in certain modes, the object lists from certain vocabulary modules may be omitted from the selection list entirely.

Object lists generated from vocabulary modules may contain only a single entry, or they may contain multiple entries. At block 160, the priority between the objects from the same vocabulary module is therefore resolved if the object list contains multiple entries. The objects that match a particular keystroke sequence that are looked-up in a given vocabulary module are also given a priority that determines their relative presentation with respect to each other. As noted above, preferably the default presentation order is by decreasing frequency of use in a representative corpus of usage. The priority data associated with each object is therefore used to order the objects in the selection list.

Many of the properties associated with the presentation of the objects looked-up in a vocabulary module are user-programmable by accessing appropriate system menus. For example, the user can specify the order of individual objects or classes of objects in the selection list. The user may also set the priority level that determines the priority between vocabulary modules and between the objects identified from each vocabulary module.

After the priorities between the objects have been resolved, at a block 165 a selection list is constructed from the identified objects and presented to the user. As a default interpretation of the ambiguous keystroke sequence entered by the user, the first entry in the selection list is provisionally posted and highlighted at the insertion point 88 in the text region 53 as illustrated in FIGS. 1a and 1c. The disambiguating software routine then returns to block 150 to wait for the next keystroke.
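The two-level prioritization of blocks 158 through 165 can be sketched as follows. The mode names, module names, their relative order, and the sample frequencies are assumptions for illustration. The description specifies only that the mode sets the priority between modules and that frequency sets the priority within a module.

```python
# Hypothetical priority of vocabulary modules for each system mode.
MODULE_PRIORITY = {
    "normal": ["word", "stem", "number"],
    "numeric": ["number", "word", "stem"],
}


def build_selection_list(mode, object_lists):
    """Merge per-module object lists into one ordered selection list.

    object_lists maps a module name to a list of (text, frequency) pairs.
    """
    selection = []
    for module in MODULE_PRIORITY[mode]:
        entries = object_lists.get(module, [])
        # Within a module, present objects by decreasing frequency of use.
        ordered = sorted(entries, key=lambda entry: -entry[1])
        selection.extend(text for text, _ in ordered)
    return selection


found = {"word": [("みかく", 10), ("きかく", 30)], "number": [("545159", 0)]}
selection = build_selection_list("normal", found)
default_entry = selection[0]  # provisionally posted at the insertion point
print(selection, default_entry)
```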

If the detected keystroke is the Select key 60, the "yes" branch is taken from decision block 152 to decision block 163, where a test determines if the current selection list is empty. If so, then execution returns to block 150. If at decision block 163 the selection list is not empty, the "no" branch is taken to a block 174. At block 174, the dotted-underline under the default Yomikata displayed at the insertion point 88 where it has been provisionally posted is changed to a solid-underline. At a block 175, the system then waits to detect the next keystroke entered by the user. Upon receipt of a keystroke, at a decision block 176, a test is made to determine if the next keystroke is the Select key. If the next keystroke is the Select key, at a block 178 the system advances to the next Yomikata in the selection list and marks it as the currently selected item. At block 179, the currently selected entry is provisionally displayed at the insertion point with a solid-underline. The routine then returns to block 175 to detect the next keystroke entered by the user. It will be appreciated that the loop formed by blocks 175-179 allows the user to select various Yomikata interpretations of the entered ambiguous keystroke sequence having a lesser frequency of use by depressing the Select key multiple times.

If the next keystroke is not the Select key, at a decision block 177, a test is made to determine if the next keystroke is the Convert key. If the detected keystroke is the Convert key, the "yes" branch is taken from decision block 177 to block 190 where the first Midashigo associated with the current Yomikata is marked as the selected item and the Midashigo text is provisionally displayed at the insertion point 88 with a solid-underline. At a block 191, the system then waits to detect the next keystroke entered by the user. Upon receipt of a keystroke, at a decision block 192, a test is made to determine if the next keystroke is the Select key. If the next keystroke is the Select key, at a block 196 the system changes the currently selected item back to the Yomikata with which the currently selected Midashigo is associated, and marks it as the currently selected item, and then proceeds at block 179 as before. If at decision block 192 the next keystroke is not the Select key, at a decision block 193, a test is made to determine if the next keystroke is the Convert key. If it is the Convert key, then at a block 194 the currently selected object is advanced to the next Midashigo associated with the current Yomikata, and marked as the selected item. At block 195, the now selected Midashigo is provisionally displayed at the insertion point 88 with a solid-underline. The system then returns to block 191 to wait to detect the next keystroke entered by the user.

If at decision blocks 177 or 193 the next keystroke is not the Convert key, the routine continues to a block 180 where the provisionally displayed entry is selected as the keystroke sequence interpretation and is converted to normal text formatting in the text region. At a block 184, the old keystroke sequence is cleared from the system memory, since the receipt of an ambiguous keystroke following the Select key or the Convert key indicates to the system the start of a new ambiguous sequence. The newly received keystroke is then used to start the new keystroke sequence at block 154. Because the Yomikata interpretation having the highest frequency of use is presented as the default choice, the main routine of the disambiguation software allows a user to continuously enter text with a minimum number of instances which require additional activations of the Select key.
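The Select and Convert loops of FIG. 3 can be approximated by a small state machine, sketched below under stated assumptions. The `run` function, the lexicon layout (key sequence mapped to a frequency-ordered list of Yomikata with their Midashigo), and the sample entries are hypothetical. For brevity the sketch wraps around at the end of each list and omits the mode keys, the Clear key, and the numeric and katakana entries.

```python
SELECT, CONVERT = "SELECT", "CONVERT"


def run(keystrokes, lexicon):
    """Return (committed text, provisionally displayed entry)."""
    committed = []
    sequence, yomi, mida, selecting = "", 0, None, False

    def displayed():
        entries = lexicon.get(sequence)
        if not entries:
            return ""
        reading, midashigo = entries[yomi]
        return reading if mida is None else midashigo[mida]

    for key in keystrokes:
        entries = lexicon.get(sequence, [])
        if key == SELECT:
            if not entries:
                continue  # empty selection list: ignore (block 163)
            if selecting and mida is None:
                yomi = (yomi + 1) % len(entries)  # next Yomikata (block 178)
            mida, selecting = None, True  # show a Yomikata (blocks 174, 196)
        elif key == CONVERT:
            if not entries or not entries[yomi][1]:
                continue
            options = entries[yomi][1]
            # First or next Midashigo of the current Yomikata (blocks 190, 194).
            mida = 0 if mida is None else (mida + 1) % len(options)
            selecting = True
        else:
            if selecting:
                committed.append(displayed())  # accept the entry (block 180)
                sequence, yomi, mida, selecting = "", 0, None, False  # block 184
            sequence += key  # add to the stored sequence (block 154)
    return "".join(committed), displayed()


lexicon = {"545159": [("きかく", ["企画", "規格"]), ("みかく", ["味覚"])]}
print(run(list("545159") + [SELECT, SELECT, CONVERT], lexicon))
```

Pressing a data key after any Select or Convert press commits the displayed entry without a separate confirmation keystroke, which is the behavior the paragraph above attributes to blocks 180 and 184.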

FIG. 1b is a schematic view of a cellular telephone keypad, similar to FIG. 1a. A reduced keyboard 54' includes a plurality of data entry keys 21'-30'. One or more of the data entry keys are associated with a plurality of romaji characters (Latin letters used to phonetically spell the pronunciations of Japanese kana characters), and are labeled with each of the romaji characters associated with the key. An input sequence is generated each time an input is selected by user manipulation of the input device. The generated input sequence has a textual interpretation that is ambiguous due to the plurality of romaji characters associated with one or more of the data entry keys. This embodiment of the system is conceptually very similar to that shown in FIG. 1a, but does not require a diacritic key 68 since the kana with diacritic marks are specified in romaji through the use of different Latin letters. For example the kana {character pullout} is specified in romaji as "KA," while the same kana with the diacritic dakuten attached ({character pullout}) is specified in romaji as "GA."

In a normal mode of operation, the keystrokes are initially interpreted as the entry of a sequence of romaji corresponding to the kana to spell a Yomikata which corresponds to the desired word interpretation. For example, as shown in FIG. 1b, a keystroke sequence 5 KLM, 4 HIJ, 5 KLM, 1 ABC, 5 KLM and 9 TUV has been entered by a user. As keys are entered, a vocabulary module look-up is simultaneously performed to locate Yomikata that match the keystroke sequence. The Yomikata are returned from the vocabulary module according to frequency of use, with the most commonly used Yomikata listed first. Using the example keystroke sequence, the Yomikata KIKAKU ("{character pullout}") and MIKAKU ("{character pullout}") are identified from the vocabulary module as being the two most probable Yomikata corresponding to the keystroke sequence. Of the two identified Yomikata in this selection list, KIKAKU "{character pullout}" is the most frequently used, so it is taken as the default interpretation and provisionally posted as hiragana text at the insertion point 88'. As shown in FIG. 1b, prior to pressing the Select key 60', this first Yomikata is taken as the default interpretation and is posted at the insertion point 88' using a distinctive format. Specifying a Yomikata in romaji requires on average approximately twice as many characters (and consequently twice as many key selections) as the corresponding specification in kana. Consequently, the system shown in FIG. 1b will generally result in fewer ambiguous choices than that shown in FIG. 1a, since statistically more information is specified when twice as many keystrokes are entered that are distributed among the same number of input keys (ten).
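The ambiguity in this example can be reproduced with a few lines of code. The sketch uses only the four key labels named in the example (the remaining keys are omitted because their letters are not listed in this passage), and the frequencies are hypothetical values chosen so that KIKAKU ranks first, as stated above.

```python
# Key labels taken from the example; other keys are intentionally left out.
KEY_LETTERS = {"1": "ABC", "4": "HIJ", "5": "KLM", "9": "TUV"}
DIGIT_OF = {letter: digit
            for digit, letters in KEY_LETTERS.items() for letter in letters}


def key_sequence(romaji):
    """Digits pressed to enter a romaji spelling on the reduced keyboard."""
    return "".join(DIGIT_OF[letter] for letter in romaji)


def candidates(vocabulary, keys):
    """Yomikata matching the key sequence, most frequent first."""
    matches = [word for word in vocabulary if key_sequence(word) == keys]
    return sorted(matches, key=lambda word: -vocabulary[word])


vocabulary = {"MIKAKU": 10, "KIKAKU": 30}  # hypothetical frequencies
print(key_sequence("KIKAKU"), candidates(vocabulary, "545159"))
```

Both spellings map to the same digits because K and M share key 5, so the vocabulary look-up returns both and frequency decides the default.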

II. Unambiguous Text Entry Method

The present invention also provides a method for a reduced keyboard for the Japanese language which allows the user to unambiguously specify each desired kana as an ordered pair of keystrokes. The Japanese syllabary includes 108 syllables (counting the "small" {character pullout}, and {character pullout} as separate syllables from the full-size {character pullout}, and {character pullout} since they are written and pronounced in a distinct fashion). There are some additional seldom used syllables, such as the "small" versions of the vowel syllables {character pullout}, and {character pullout} that are primarily used only in katakana. These seldom used syllables may also be easily generated by the system when used in conjunction with a display as discussed below. Of the 108 standard syllables, 37 are generated by simply adding one of the diacritic marks ゛ (dakuten) or ゜ (handakuten) to one of the other 71 syllables. These 71 syllables without diacritic marks can be logically organized into a single matrix of nine or ten rows and eight to ten columns, as explained in detail below. A plurality of the keys on the keyboard of the present invention are labeled with two kana, one representing the consonant associated with a given row of the matrix, and a second kana representing the vowel associated with a given column of the matrix.

The organization is logical and intuitive for a native speaker of Japanese for 106 of the 108 syllables, and the method for generating the remaining two syllables (small {character pullout}, and {character pullout}) is simple and easily learned. Every syllable is generated by a single pair of keystrokes, including syllables with palatalized vowels that are represented by two separate kana (for example, KYA, KYU, and KYO). This results in significantly fewer keystrokes than are required by the currently used multiple-stroke method for entering kana on a reduced keyboard. Thus, the present invention provides a reduced keyboard that is easily understood and quickly learned by native speakers of Japanese, and that is efficient in terms of reducing the length of input keystroke sequences.

In the preferred embodiment, 71 syllables of the Japanese syllabary are organized in the matrix shown in Table 4a. In the general case which includes all 69 syllables appearing in the first eight columns of Table 4a, the first keystroke of the corresponding ordered pair determines the consonant of the syllable to be output, and the second keystroke determines the vowel. The two remaining syllables ("small" {character pullout} and {character pullout}) are exceptional cases discussed below. The remaining 37 syllables not shown in Table 4a are output by generating the corresponding base syllable appearing in the matrix of Table 4a, then adding a diacritic mark using a separate key. FIG. 1c shows a schematic view of a cellular telephone with limited or no display capabilities which has a keypad incorporating an embodiment of the reduced keyboard system of the present invention for the Japanese language. Each of the ten keys 121 through 130 is labeled with one of the kana labels from the row headers in the column labeled "Key 1" in Table 4a (hereinafter a "row label kana") in the upper left region of the key, and with one of the kana labels from the column headers in the row labeled "Key 2" in Table 4a (hereinafter a "column label kana") in the lower right region of the key. In the preferred embodiment, a syllable appearing in Table 4a is generated when the two key sequence is entered wherein the first key is the key labeled at the upper left with the row label kana corresponding to the row in which the syllable appears, and the second key is the key labeled at the lower right with the column label kana corresponding to the column in which the syllable appears. The ten row label kana appear in their standard lexicographic sequence at the upper left of the ten keys 121-130 of the keyboard shown in FIG. 1c. 
The first five column label kana appear in their standard lexicographic sequence at the lower right of the first five keys 121-125 of the keyboard, followed by the "small" {character pullout}, and {character pullout} (also in standard lexicographic sequence) on the next three keys 126-128. Finally, the "small" {character pullout} appears on the next key 129 followed by {character pullout} on key 130. A diacritic mark may be added to any syllable by activating the Diacritic key 131 once to generate the diacritic ゛ and twice in succession to generate the diacritic ゜. When a diacritic is added to a syllable with a palatalized vowel (represented by two output kana consisting of a syllable from the "{character pullout}" column of Table 4 followed by a "small" {character pullout}, or {character pullout}), the diacritic is added to the output at its correct location immediately following the first of the two kana.
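The two-stroke lookup for the first eight columns can be sketched as below. Table 4a is not reproduced in this passage, so the layout is an assumption: the sketch uses the standard gojūon ordering, forms each palatalized syllable from the row's い-column kana plus a small ゃ, ゅ, or ょ, and places the small kana themselves in the や row. The exceptional small っ and ん columns are left out. One further departure from the text: the description outputs the diacritic as a separate mark after the first kana, while the sketch uses Unicode NFC composition to produce the precomposed voiced kana.

```python
import unicodedata

COLUMN_LABELS = "あいうえおゃゅょ"  # column label kana, first eight columns
BASE_ROWS = {  # keyed by row label kana; "-" marks an empty cell
    "あ": "あいうえお", "か": "かきくけこ", "さ": "さしすせそ",
    "た": "たちつてと", "な": "なにぬねの", "は": "はひふへほ",
    "ま": "まみむめも", "や": "や-ゆ-よ", "ら": "らりるれろ",
    "わ": "わ---を",
}
PALATALIZED_ROWS = "かさたなはまら"


def build_matrix():
    matrix = {}
    for row, cells in BASE_ROWS.items():
        for column, kana in zip(COLUMN_LABELS[:5], cells):
            if kana != "-":
                matrix[(row, column)] = kana
        for small in COLUMN_LABELS[5:]:
            if row in PALATALIZED_ROWS:
                matrix[(row, small)] = cells[1] + small  # e.g. き + ゃ
            elif row == "や":
                matrix[(row, small)] = small  # assumed placement
    return matrix


MATRIX = build_matrix()


def decode(pairs):
    """Translate ordered (row key, column key) pairs into kana."""
    return "".join(MATRIX[pair] for pair in pairs)


def add_diacritic(syllable, presses):
    """One press adds dakuten, two presses add handakuten, to the first kana."""
    mark = {1: "\u3099", 2: "\u309A"}[presses]
    return unicodedata.normalize("NFC", syllable[0] + mark) + syllable[1:]


print(len(MATRIX), decode([("か", "い"), ("か", "あ"), ("か", "う")]))
print(add_diacritic(MATRIX[("か", "ゃ")], 1), add_diacritic(MATRIX[("は", "ょ")], 2))
```

Under these assumptions the matrix holds 69 entries, which agrees with the count the description gives for the first eight columns.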

Thus, for the 69 syllables appearing in the first eight columns of Table 4a, the first keystroke of the corresponding ordered pair determines the consonant of the syllable to be output, and the second keystroke determines the vowel. In another preferred embodiment, a display is used to provide feedback to the user. Upon receipt of the first keystroke of an ordered pair, the system displays the various syllables that can be generated by each key that is valid as the second keystroke of an ordered pair. The association between which key generates which syllable can be indicated by labeling each syllable with the number (or other identifier) associated with the key that generates the syllable. Alternatively, the syllables can be displayed (with or without a numeric label) in a geometric arrangement that corresponds with the arrangement of the corresponding keys. For example, in the case of a device such as a cellular telephone with a text display, the syllables can be displayed in a three-by-three matrix corresponding to the arrangement of the 1 through 9 keys of the telephone keypad. When a display is used in this fashion, even rarely used syllables such as the "small" versions of the vowel syllables {character pullout}, and {character pullout} can be easily generated. For example, using the matrix shown in Table 4b, upon pressing the first key of an ordered pair wherein the first key corresponds to the vowel-only syllables in the top row of Table 4b, the display shows the normal-sized versions of the vowel syllables {character pullout}, and {character pullout} associated with a second keystroke on keys 1 through 5, and the "small" versions of the vowel syllables {character pullout}, and {character pullout} associated with keys 6 through 0.
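The display feedback shown after the first keystroke of a pair could be laid out as in this sketch, which arranges the candidate syllables in the same three-by-three pattern as keys 1 through 9 and lists key 0 on a fourth line. The function name and the text layout are hypothetical. The sample row follows the vowel-row example above, with the normal-sized vowels on keys 1 through 5 and the small vowels on keys 6 through 0.

```python
def render_second_key_display(syllables_by_key):
    """Label each candidate syllable with the digit of the key that selects it."""
    lines = []
    for start in (1, 4, 7):  # keypad rows 1-2-3, 4-5-6, 7-8-9
        cells = []
        for digit in range(start, start + 3):
            key = str(digit)
            if key in syllables_by_key:
                cells.append(key + syllables_by_key[key])
        if cells:
            lines.append(" ".join(cells))
    if "0" in syllables_by_key:
        lines.append("0" + syllables_by_key["0"])
    return "\n".join(lines)


vowel_row = dict(zip("1234567890", "あいうえおぁぃぅぇぉ"))
print(render_second_key_display(vowel_row))
```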

FIG. 1d shows a schematic view of a cellular telephone that includes text display capabilities, with a keypad incorporating an embodiment of the reduced keyboard system of the present invention for the Japanese language. Each of the ten keys 121 through 130 is labeled only with one of the row label kana from Table 4b in a central region of the key. The ten row label kana appear in their standard lexicographic sequence on the ten keys 121-130 of the keyboard shown in FIG. 1d. FIG. 1d shows the appearance of the display following the activation of key 122 (ambiguously associated with {character pullout}, and {character pullout}) as the first key of an ordered pair of keystrokes. In the preferred embodiment, the syllables appearing in the corresponding row of Table 4b are displayed when the key labeled with the row label kana is activated. The display shows the association between the kana that will be generated and each key that may be pressed as a second key of an ordered pair of keystrokes to generate the kana. The columns in Table 4b are labeled with the numbers which also appear on each key of the cellular telephone shown in FIG. 1d. Note that in this preferred embodiment, the three syllables with palatalized vowels ({character pullout}, and {character pullout}) are associated with keys 127-129 (labeled with the digits 7, 8 and 9) and thus these can be displayed in a single row of the display. This makes it easier for the user to associate these related syllables to the keys that will generate them, since the association can be made not only in terms of the number labels on keys and displayed syllables, but also in terms of the same three-by-three matrix arrangement of syllables on