Patent 1172335 Summary

(12) Patent: (11) CA 1172335
(21) Application Number: 347137
(54) English Title: MEANS FOR ENCODING IDEOGRAPHIC CHARACTERS
(54) French Title: APPAREIL DE CODAGE DE CARACTERES IDEOGRAPHIQUES
Status: Expired
Bibliographic Data
(52) Canadian Patent Classification (CPC):
  • 197/119
  • 340/175
(51) International Patent Classification (IPC):
  • B41J 3/01 (2006.01)
(72) Inventors :
  • LEUNG, DANIEL L. (Canada)
  • LEUNG, LAI-WO S. (Canada)
(73) Owners :
  • LEUNG, DANIEL L. (Not Available)
  • LEUNG, LAI-WO S. (Not Available)
(71) Applicants :
(74) Agent: ARTHURS & GARRETT
(74) Associate agent:
(45) Issued: 1984-08-07
(22) Filed Date: 1980-03-06
Availability of licence: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): No

(30) Application Priority Data: None

Abstracts

English Abstract



File 1358 P/2 CA

APPARATUS FOR ENCODING IDEOGRAPHIC CHARACTERS
ABSTRACT OF THE DISCLOSURE

A word processing system for Chinese type characters
includes a keyboard with a generally standard key arrangement
for encoding the characters in accordance with their basic stroke
type and sequence. Up to eight basic stroke types may be
employed, although a five stroke system is preferred. Recurrent
code sequences of two, three, four and five strokes are
identified. An "end of character" code may be generated with the
space bar. Preferred sequences are assigned key positions so as
to provide an ergonometrically efficient keyboard. Average typing
speeds using the keyboard are comparable on a character/word basis
to those for English.


Claims

Note: Claims are shown in the official language in which they were submitted.




THE EMBODIMENTS OF THE INVENTION IN WHICH AN EXCLUSIVE PROPERTY OR
PRIVILEGE IS CLAIMED ARE DEFINED AS FOLLOWS:
1. Apparatus for encoding Chinese type characters in
accordance with their basic stroke type and sequence comprising
a keyboard having a plurality of up to eight inclusive of
monographic keys, and a plurality of digraphic keys, key
responsive means responsive to the actuation of each said
monographic key for generating a code signal representative
of a basic stroke and to the actuation of each said digraphic key
for generating a code signal representative of a sequence
of said basic strokes of two.
2. Apparatus in accordance with Claim 1, wherein the
number of monographic keys is five.
3. Apparatus in accordance with Claim 1 or 2, wherein the
number of digraphic keys is about fifteen.
4. Apparatus in accordance with Claim 1, further including
a plurality of trigraphic keys, and wherein said key responsive
means is responsive to the actuation of each said trigraphic key
for generating a code signal representative of a sequence of
said basic strokes of three.
5. Apparatus in accordance with Claim 1, 2 or 4, wherein
the number of said trigraphic keys is about five.
6. Apparatus in accordance with Claim 1, 2 or 4, wherein
the stroke sequence assigned to each said digraphic and trigraphic
keys is selected from two and three stroke sequences generally
having the highest stroke-saving function values.
7. Apparatus in accordance with Claim 1, 2 or 4, further
including at least one tetragraphic key and at least one
pentagraphic key, and wherein said key responsive means
respectively generates a code signal representative of a sequence
of said basic strokes of four and five.
8. Apparatus as defined in Claim 1, 2 or 4, wherein said
keys are assigned to positions locating in three horizontal
ranks, each rank consisting of about 10 said keys.
9. Apparatus as defined in Claim 1, 2 or 4, wherein said
keys are assigned to positions locating in three horizontal ranks,
and wherein said monographic keys locate in home finger positions.
10. Apparatus as defined in Claim 1, 2 or 4, wherein each
said character encoding key has an indicium thereon in the form
of an arabic numeral, each said monographic key having a single
digit numeral indicative of the basic stroke type represented by
said key, and each said key having a graphicity of greater than
one having a sequence of single digit numerals in accordance with
the sequence of basic strokes represented by that key.
11. Apparatus as defined in Claim 1, 2 or 4, including a
space bar, and wherein said key responsive means is responsive
to the actuation of said space bar to generate an end of code
signal.
12. A keyboard for encoding Chinese type characters in
accordance with five basic strokes and the sequence thereof
comprising three horizontal ranks, each rank comprising about
ten keys wherein all character coding keys locate, wherein five
said keys locating in home finger positions are monographic and
representative of a basic stroke, and about twenty three said keys
are polygraphic, representative of a sequence of basic strokes of
two or more, at least a portion of said polygraphic keys being
digraphic, means responsive to the actuation of said keys to generate
a code signal representative of the stroke or sequence of strokes
represented thereby.
13. A keyboard apparatus in accordance with Claim 12, wherein
the polygraphic code sequence represented by said polygraphic keys are
those code sequences selected from all possible code sequences
generally having the highest stroke saving function values.
14. A keyboard in accordance with Claim 12, wherein said
polygraphic keys include about fifteen digraphic keys, about
five trigraphic keys, at least one tetragraphic key and at least
one pentagraphic key.
15. A keyboard apparatus in accordance with Claim 12, 13 or 14,
wherein the keys are assigned to positions on the keyboard as
determined generally in accordance with their monostrike
frequency and dual strike frequency.
16. A keyboard apparatus in accordance with Claim 12, 13 or 14,
wherein said monographic keys are assigned home finger positions.
17. A keyboard apparatus in accordance with Claim 12, 13 or 14,
including a space bar representative of an "end of character".
18. A keyboard apparatus in accordance with Claim 12,
wherein the character coding keys are arranged substantially as
illustrated in Figure 6, or the mirror image thereof.
19. Apparatus as defined in Claim 1, 2 or 4 wherein said code
signal is a binary code signal.
20. Apparatus as defined in Claim 1, 2 or 4 further including
output information means responsive to the input of said code signals,
writing means responsive to said output information means.
21. Apparatus as defined in Claims 12, 13 or 14, wherein
said code signal is a binary code signal.
22. Apparatus defined in Claim 12, 13 or 14, further
including output information means responsive to said code signals,
writing means responsive to said output information means.
23. Apparatus as defined in Claim 1, 2 or 4, further
including output information means including telegraphic signal
means responsive to the input of said code signals.
24. Apparatus as defined in Claim 1, 2 or 4, further
including output information means responsive to the input of said
code signals wherein said output information means generates a
minimum redundancy code signal.
25. Apparatus as defined in Claim 12, 13 or 14, further
including output information means including telegraphic
signal means responsive to the input of said code signals.
26. Apparatus as defined in Claim 12, 13 or 14, further
including output information means responsive to the input
of said code signal wherein said output information means
generates a minimum redundancy code signal.
27. In a method for writing ideographic characters using
a keyboard wherein the characters are encoded in accordance
with a system of up to 8 basic stroke elements and the sequence
thereof the improvement wherein not less than about 25 percent
of all possible digraphic sequences are inputtable from said
keyboard using digraphic keys.
28. A method in accordance with Claim 27, wherein said
system comprises 5 basic stroke elements and wherein not less
than about 50 percent of all possible digraphic sequences are
inputtable using digraphic keys.
29. A method in accordance with Claim 27 or 28, wherein
at least a portion of all possible trigraphic sequences are
inputtable using trigraphic keys.
30. A method in accordance with Claim 27 or 28, wherein
said basic stroke elements are represented on said keys with
single digit Arabic numerals, and wherein polygraphic sequences
of said basic stroke elements are represented on said keys with
corresponding sequences of said single digit Arabic numerals.
31. A method in accordance with Claim 27 or 28, including
the step of outputting said encoded characters in minimum
redundancy code for signal transmission.


Description

Note: Descriptions are shown in the official language in which they were submitted.






MEANS FOR ENCODING IDEOGRAPHIC CHARACTERS
FIELD OF INVENTION
This invention relates to improvements in method and
apparatus for information processing. It particularly relates
to improvements in such method and apparatus applicable for use
in connection with Chinese type characters currently used in
Chinese, Japanese and Korean script, which are commonly referred
to as ideographs.
BACKGROUND OF THE INVENTION
The Chinese language is reported to comprise about
30,000 characters. Some 8,000 are listed in a commonly used
Chinese-English dictionary, these being sufficient for modern
Chinese prose. A vocabulary of about 3,000 characters accounts
for 95% of the characters in everyday use. Telegraph code
books are limited to about 9,600 characters.
The 30,000 characters that comprise the writing system
of the Chinese language are a heterogeneous set, and were created
at different stages of the development of the language. The
pronunciation, in general, has been assigned arbitrarily to the
characters, and the strokes from which the characters are
composed have no syntactic meaning in themselves. The
characters are not amenable to classification in a well structured
system. The traditional Chinese dictionary arrangement method is
in accordance with radicals and strokes comprising the character.
The system has many deficiencies. There are 214 different radicals
listed in the Kang Hsi dictionary, and it is sometimes difficult
to determine the radical group to which a character is related,
especially when it is not a phonetic compound or a complex
ideograph. Looking up a character is tedious and involves some
six steps. Also, there is considerable degeneracy, with up to
30 characters having the same radical-stroke number characteristic.
A second system that is sometimes used is one wherein the
four corners of the character are assigned a number in accordance
with the stroke types and configuration of the strokes in that
corner. The rules are relatively complicated, and mis-coding
frequently occurs. Degeneracy is also a problem; for example,
in the 2000-2999 section of the four corner code table in the
Xinhua Zidian (New Chinese Dictionary) 1971, there are 1599
characters defined by 885 codes.
The Pinyin method of classification was introduced in
Beijing (Peking) in the 1950's. The method involves a standardized
phonetic system of representation using the Latin letters and
tone indicators. The assignment of phonetic values necessitates
a knowledge of the official dialect (Mandarin), and also subtle
differences in the sound and tone must be discerned in many
characters. Pinyin spelling of characters involves considerable
degeneracy.
Other systems of classification are also known, serving
for different purposes. The telegraph code lists some 9,600
characters numerically, thus avoiding degeneracy entirely.
However, the list is accessed by operator search, or memory in the
case of commonly used characters, hence the system is slow and
requires considerable training. More recently, Caldwell, US Patent
2,950,800, proposed a system based upon the type of stroke from
which the characters are constructed, and the sequence thereof.
Some 21 "basic" strokes were identified. Some degeneracy was
observed, but this was relatively small in comparison to the
more traditional systems. Moreover, the method did not
necessitate a fine knowledge of the Chinese written language
or a particular dialectic manner of pronunciation, hence it
could be open for widespread use.
Once a Chinese character is converted into a code signal,
such signal may be employed in an information processing system
such as communication, printing, translation and machine control.
Thus, Caldwell described an electro-mechanical keyboard device
for inputting the code elements into an accumulator. The
concatenated code elements in the accumulator were then
converted into X-Y coordinates so as to select and control the
position of a film matrix upon which the preformed characters
were stored, whereby the selected coded character could be
optically printed. More recently micro-processor developments
would readily permit the construction of electronic analogues
embodying Caldwell's system, such as shown by Shashoua et al,
US Patent 3,325,786. Still more recently in accordance with
well known procedures, writing instructions converted from code
signals may be for CRT, LED or "liquid crystal" display, or for
printing such as impact printing, matrix wire printing, hot point
printing or jet printing, for example. Also, whilst such
instructions may relate to writing pre-formed characters, they may
relate to instructions for synthesising such characters. A
simple synthesis was proposed by Li, US Patent 3,950,734
wherein a "prefix" and "suffix" were combined to form a
character. More complex systems of synthesis in accordance with
the stroke type and spatial configuration of the strokes are
also known, for example as in the electronic system designed
by Wakamatsu, US Patent 4,144,405, or in the various mechanical
systems that have heretofore been proposed.
It is important to note here that the kinds of strokes
for synthesis of the character for writing purposes are not
well-defined. Most strokes are not known by name to the average
Chinese writer, and the classification of such strokes into types
is quite arbitrary. Whilst Caldwell defined and employed 21
such basic writing stroke types for encoding purposes, it has been
recognized heretofore that a small number of stroke types would
suffice for this purpose. A summary of different stroke types
for coding systems which have heretofore been proposed is given
by Stallings, "Pattern Recognition", Pergamon Press, Vol 8
pp 87-98 (1976). Cheung and Chan, in "Computer-aided instruction
in Chinese characters" Proc. 1st Int. Symposium on Computers and
Chinese Input/Output Systems, 599-616 (1973) identify some 31
different stroke types. Liu, in "Real Time Chinese Hand Writing
Recognition Machine" MIT Cambridge, E.E. Thesis, 1966 identifies
19 stroke types. Yoshida and Eden, in "Handwritten Chinese
Character Recognition by an Analysis-by-Synthesis Method". Proc.
1st Int. Conference on Pattern Recognition, 197-204 (1973)
identify 7 stroke types, and Groner et al, "On-line computer
classification of handprinted Chinese characters as a translation
aid" IEEE Trans. Elect. Comput. 16, pp 856-860 (1967) propose
5 types. The 7 stroke types and the 5 stroke types coding
methods are referred to in greater detail subsequently herein.
A keyboard for encoding characters in accordance with
stroke type and sequence may permit touch typing of the
characters. Using his definition of 21 stroke types, Caldwell
designed a keyboard with 21 "stroke keys", each assigned to one
stroke type. However it was at once apparent that the speed
attainable with such design would be, character for word, low
in comparison to the average typing speed in English language
on a Qwerty keyboard, the average strokes per character being
about 10, and the average number of keystrikes per English
word being about 5.
Caldwell reduced the number of keystrokes per
character by two expedients. The first was termed "minimum
spelling", whereby the length of the code word (that is, the
sequence of code elements corresponding to the stroke types) for
a character was truncated so as to just distinguish the character
from other characters comprising the vocabulary list, whilst
avoiding redundancy. For example, when an operator keyed in the
code word BGD EGV BDP BDP BGE OE, the keyboard would lock after
the seventh key had been hit, as the further information was
not required to distinguish the character from the remaining
characters comprising the vocabulary list. In an expanded vocabulary
list containing the code word BDG EGV BDP BDP BGE GF, which differs
from the above example in the last code element only, all of the code
elements are required to avoid degeneracy. It is apparent that the
applicability of "minimum spelling" in reducing the length of
code words is very much dependent upon vocabulary size. The
second expedient was the addition of "entity keys", which keys
generate a signal corresponding to a sequence of strokes as opposed
to one stroke. Some 20 different "entities" were described,
each representing several strokes in specific spatial arrangement,
often that of a radical or having other syntactical significance.
From his relatively small vocabulary of 2,333 characters, Caldwell
reported a reduction of the median value of 10.2 strokes per
character to 6.7 using the "minimum spelling" method. When using
a small sample drawn from the aforementioned vocabulary, Caldwell
estimated the average number of keystrokes necessary to enter a
character making full use of the "entity keys" was 4.7, which
coincides quite closely to the average word length in the English
language. However, after a suitable period of training, the typing
speed on such a keyboard was reported by Caldwell to be only 14
characters per minute. Such typing speed is, of course, much less
than is considered average for typing English words.
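The "minimum spelling" expedient described above can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the function name minimum_spelling and the digit code words are hypothetical and are not taken from Caldwell's patent, and each code word is assumed to be a plain string of code elements. The sketch returns the shortest leading portion of a code word that no other word in the vocabulary shares.

```python
def minimum_spelling(word, vocabulary):
    """Return the shortest prefix of `word` shared by no other vocabulary word.

    Hypothetical helper: code words are plain strings of code elements.
    If `word` is itself a prefix of another word, the full word is returned.
    """
    others = [w for w in vocabulary if w != word]
    for length in range(1, len(word) + 1):
        prefix = word[:length]
        if not any(other.startswith(prefix) for other in others):
            return prefix
    return word


# Hypothetical vocabularies; the second adds one near-identical code word.
small = ["25152", "25111", "354251"]
large = small + ["25153"]

print(minimum_spelling("25152", small))
print(minimum_spelling("25152", large))
```

With the smaller list the code word can be cut short by one element; once a code word differing only in its last element is added, every element is needed, which is the dependence on vocabulary size noted in the text.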
We consider that stroke coding systems for Chinese type
characters which employ highly discriminating "basic" strokes have
inherent disadvantages which tend to limit the attainment of good
typing speeds. For example, certain strokes have a close
resemblance to other strokes; this may be conducive to error in
coding, and considerable effort must be expended on the part of the
typist to distinguish between the types. Further, whilst there is
no theoretical limit to the number of word encoding keys which may
locate on a keyboard, there would appear to be a practical limit
beyond which touch typing becomes increasingly difficult. As
a first approximation it is not believed to be desirable to exceed
the 26 letter keys of a Qwerty keyboard. Thus, whilst Caldwell
identified some 20 different "entities", only 6 were assigned a
key position, together with the 21 "basic" stroke keys. This
restriction on the number of keys severely limits the
applicability of the entity keys, since the percentage of
characters of an expanded vocabulary list which may be encoded using
the assigned "entity" keys is necessarily limited. Still further,
in accordance with Information Theory, an optimal coding system
should have a set of code elements each of which is used an
approximately equal number of times when coding an average text.
A 21 basic stroke code system is far from optimal since, as stated
by Caldwell, "90% of all Chinese writing is accounted for by only
9 of the 21 basic strokes" (op. cit.). The shortest uniform
length binary signals that could be assigned to each of these
stroke types would be 5 bits, and would be highly redundant. Hence,
Caldwell employed Huffman's method of constructing minimum
redundancy codes of non-uniform lengths (D.A. Huffman, "A method for
the construction of Minimum-Redundancy Codes", Proc. I.R.E., 40,
pp. 1098-1101, 1952). Such non-uniform length signals for code
elements are used in serial transmission of information, and pose
no problems for large computers with large accumulators. However
in smaller information processing systems where the accumulators
commonly have 8 to 16 bits, additional circuits and components are
required before such code signals can be processed.
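The redundancy argument above can be made concrete with a short Python sketch. The usage distribution below is an assumption made purely for illustration (9 strokes sharing 90% of occurrences equally and 12 sharing the remaining 10%); the text gives only the 90% figure, not the individual frequencies. The function names are likewise hypothetical.

```python
import math


def uniform_code_bits(n_elements):
    """Bits needed by a fixed-length binary code for n_elements code elements."""
    return (n_elements - 1).bit_length()


def entropy_bits(probabilities):
    """Shannon entropy in bits per code element."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Assumed (hypothetical) usage: 9 strokes carry 90% of the text, 12 carry 10%.
skewed_21 = [0.90 / 9] * 9 + [0.10 / 12] * 12
# A five-element code whose elements are used equally often.
even_5 = [1 / 5] * 5

for name, probs in (("21 strokes, skewed", skewed_21), ("5 strokes, even", even_5)):
    bits = uniform_code_bits(len(probs))
    print(f"{name}: {bits} bits uniform, {entropy_bits(probs):.2f} bits of information")
```

Under these assumptions the 21-element code carries well under 4 bits of information per element while a uniform signal needs 5, whereas an evenly used five-element code fits in 3 bits with much less waste.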
OBJECT OF INVENTION
It is an object of the present invention to provide
an efficient system of coding to facilitate the inputting of
information representing Chinese type characters into an information
processing system.
It is an object of our invention to provide an improved
information processing system for writing Chinese type characters.
It is a further object of our invention to provide
improved keyboard apparatus for use in the above system.
SUMMARY OF INVENTIVE ASPECTS
In accordance with one embodiment of the invention, an
apparatus for encoding Chinese type characters in accordance with
their basic stroke type and sequence comprises a keyboard having
not more than eight "monographic" keys and a plurality of "digraphic"
keys. A "monographic" key is defined here to be representative of a single
basic stroke type. A "digraphic" key is defined to be representative
of a sequence of two basic strokes, which may be identical or
otherwise. Means is provided responsive to the actuation of
each "monographic" key for generating a code signal representative
of the basic stroke associated therewith, which means is responsive
to the actuation of each "digraphic" key for generating a code
signal representative of the sequential basic strokes associated
therewith.


In a preferred aspect of the invention, the number of
basic keys representing basic stroke types is limited to five,
and the stroke types are classified by simple geometric properties,
as is described herein.
In accordance with another aspect of the invention, the
keyboard includes in addition to the monographic and digraphic keys
a plurality of keys having a graphicity of more than two, e.g.
three, four or five. In a preferred form, the selection of the
polygraphic keys (which expression includes digraphic) for
inclusion on the keyboard is made primarily in accordance with
their stroke saving function, as defined herein. In accordance
with a still further aspect of the invention, the selected keys
are assigned positions on the keyboard so as to provide a keyboard
of good ergonometric efficiency.
The above mentioned and other features and objects of
our invention and the manner of obtaining them will become more
apparent and the invention itself will be best understood by
reference to the following description of an embodiment of the
invention taken together with the accompanying drawing wherein:
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 - shows the stroke types of a first prior art
code arrangement;
Fig. 2 - shows the stroke types of a second prior art
code arrangement;
Fig. 3 - shows the stroke types which we preferably
employ herein for character coding
purposes;
Fig. 4 - shows certain strokes which are treated
exceptionally for character coding
purposes;
Fig. 5a, 5b and 5c - show examples of Chinese characters
encoded in accordance with the code system
illustrated in Fig. 3;
Fig. 6 - shows a keyboard arrangement in accordance
with this invention, and
Fig. 7 - shows in schematic form a word processing
system embodying the present invention.
DESCRIPTION OF A PREFERRED EMBODIMENT
Referring to Fig. 1, there is shown therein the five
basic stroke types proposed by Groner et al, loc. cit. for the
purpose of encoding Chinese type characters. In Fig. 2 there is
shown the seven elements proposed by Yoshida et al, loc. cit.
for this purpose. It is to be remarked that the above authors
employed a tablet inputting means necessitating pattern
recognition techniques by the computer to identify the stroke
types. The above basic strokes are not necessarily suitable for
use in connection with the present invention, but they are
illustrative of codes which may be represented by three characteristic
bits.
The particular classification of basic stroke types that
we identify for coding purposes and which seems well suited to our
invention is shown in Fig. 3. These basic stroke types are as
follows:

TYPE    STROKE(S)
1       horizontal
2       vertical, optionally with left hook
3       left or right obliques and curves
4       dot
5       angulated strokes sustaining an acute or
        right angle
Three strokes are illustrated in Fig. 4 that are
treated as an exception. These strokes are conventionally
considered as one-stroke executions; however, in accordance with our
preferred method of coding, these are considered as being composed
of two basic strokes of the types indicated.
Examples of characters encoded in accordance with our
basic stroke types are shown in Fig. 5:
Fig. 5a: (to) call; code sequence 25152
Fig. 5b: bird; code sequence 354251
Fig. 5c: (to) bolt; code sequence 4251
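As a reading aid, the five stroke types and the three example code sequences above can be held in a small lookup table. The sketch below is illustrative only; the names STROKE_TYPES, EXAMPLES and describe are hypothetical, and the stroke descriptions are abbreviated from the list of types given earlier.

```python
# Type numbers and (abbreviated) descriptions of the five basic stroke types.
STROKE_TYPES = {
    "1": "horizontal",
    "2": "vertical, optionally with left hook",
    "3": "left or right oblique or curve",
    "4": "dot",
    "5": "angulated stroke (acute or right angle)",
}

# Code sequences of the Fig. 5 examples, as given in the text.
EXAMPLES = {
    "(to) call": "25152",
    "bird": "354251",
    "(to) bolt": "4251",
}


def describe(code_sequence):
    """Expand a code sequence into the stroke type written at each step."""
    return [STROKE_TYPES[digit] for digit in code_sequence]


for meaning, code in EXAMPLES.items():
    print(f"{meaning}: {code}")
    for step, stroke in enumerate(describe(code), start=1):
        print(f"  {step}. {stroke}")
```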
The above described five basic stroke types are defined
in such manner as to facilitate easy recognition, and also such
that the frequencies of occurrence for each of the five stroke
types are approximately equal so as to optimize the efficiency of
the coding system and allow for the efficient use of uniform length
signals for representation of code elements.
An analysis of Chinese type characters coded in
accordance with the code elements defined in Fig. 3 indicates that
certain code sequences of two or more elements are recurrent. The
samples that we employed for analysis were excerpted from recently
published materials of diverse contents from books, newspapers and
magazines. Separate samples of 5,250 characters were analysed
as described below, and we found that all the frequencies
pertinent to our design are stabilized at this sample size; that
is, these samples of about 5,000 characters are statistically
representative of modern Chinese prose. Four such samples
totalling 21,000 characters were analysed in detail.
A value we refer to as the Stroke Saving Function
(SSF) in accordance with the following definition was calculated
for each code sequence of significance that was identified in the
above analysis:
SSF = f (n - 1)
where f is the frequency of occurrence of the given sequence and
n is the number of code elements in the sequence. In calculating
the SSF value, any character with distinct left and right parts is
considered to comprise two disconnected code sequences rather
than one continuous code sequence. For example, the character
illustrated in Fig. 5a having a code sequence 25152 is treated
for the purpose of calculating the SSF as comprising separate
sequences 251 and 52. This is the natural manner that an operator
would tend to treat such character, and is analogous to the
preference to spell English words in syllables. The hundred or
so most commonly occurring code sequences identified in the above
analysis were investigated and preliminary SSF values calculated
therefor. The sequence having the highest SSF value was then
"selected" and identified by a specific designation whereby in
subsequent calculation shorter code sequences comprised in the
"selected" sequence would not be encountered, and whereby in longer
code sequences which include the selected sequence, the selected
sequence would be considered as comprising a single code element.
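The definition SSF = f (n - 1), together with the rule for characters having distinct left and right parts, can be sketched as follows. This is a minimal illustration, not the analysis procedure actually used: the function names are hypothetical, the input is assumed to have been split into left and right parts beforehand, and every contiguous run of two or more code elements is counted each time it appears.

```python
from collections import Counter


def ssf(frequency, n_elements):
    """Stroke Saving Function: SSF = f x (n - 1)."""
    return frequency * (n_elements - 1)


def sequence_frequencies(parts):
    """Count contiguous sequences of two or more code elements.

    `parts` is an iterable of (code_sequence, occurrences) pairs in which a
    character with distinct left and right parts has already been split, so
    that no counted sequence spans the two parts.
    """
    freq = Counter()
    for seq, occurrences in parts:
        for start in range(len(seq)):
            for end in range(start + 2, len(seq) + 1):
                freq[seq[start:end]] += occurrences
    return freq


# The Fig. 5a character (25152), treated as the separate sequences 251 and 52
# and assumed, for illustration, to occur once.
freq = sequence_frequencies([("251", 1), ("52", 1)])
values = {seq: ssf(f, len(seq)) for seq, f in freq.items()}
print(values)
```

Because the character is split before counting, the sequence 15, which would straddle the left and right parts, never enters the tally.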
To illustrate the concept, assume that in a hypothetical
sample of 40 characters, the character with code word "251"
occurs 20 times, the character with code word "2511" occurs 10
times, and the character with code word "4125" occurs 10 times.
Preliminary SSF values are determined as follows:

CODE SEQ.   FREQUENCY OF OCCURRENCE       S.S.F. VALUE
"25"        20 from code word "251"       (20+10+10)x(2-1)=40
            10 from code word "2511"
            10 from code word "4125"
"51"        20 from code word "251"       (20+10)x(2-1)=30
            10 from code word "2511"
"251"       20 from code word "251"       (20+10)x(3-1)=60
            10 from code word "2511"
"2511"      10 from code word "2511"      (10)x(4-1)=30
"4125"      10 from code word "4125"      (10)x(4-1)=30

From the above, it is seen that the greatest SSF value is
60, being that attaching to the code sequence "251". If it now be
assumed that this sequence is "selected" and represented by the
symbol "*", the sequences of the above example may be
identified as "25", "51", "*", "*1", "4125".
The SSF values attaching to the newly defined sequences
are determined as follows:

CODE SEQ.   FREQUENCY OF OCCURRENCE       S.S.F. VALUE
"25"        0 from code word "*"
            0 from code word "*1"
            10 from code word "4125"      (10)x(2-1)=10
"51"        0 from code word "*"
            0 from code word "*1"         (0)x(2-1)=0
"*" (previously "251")
            20 from code word "*"         (20)x(1-1)=0
"*1" (previously "2511")
            10 from code word "*1"        (10)x(2-1)=10
"4125"      10 from code word "4125"      (10)x(4-1)=30
In accordance with the newly calculated SSF values the
code sequence "4125" would be next "selected" as having the highest
SSF value, and the process of calculation repeated for the remaining
sequences. It should be understood that in the above example the
frequencies and the SSF values calculated therefrom are illustrative
only of the concept, and that they do not bear any quantitative
significance. In practice the SSF value of the code sequence "4125"
is relatively low, and the sequence is not observed amongst the
code sequences having the top 30 SSF values. It may also be noted
that in practice the SSF value of "selected" sequences would not be
recalculated in the manner shown, since a sequence, once selected,
is defined for recalculation purposes as comprising a single code
element (n=1) for which the SSF value has no significance.
As will be appreciated from the above, the determination of
the Stroke-saving Function values is a non-linear process heavily
dependent on which code sequences have already been selected, and
for test purposes the number of sequences investigated is desirably
greater than the number to be selected.
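The select-and-recalculate procedure illustrated by the hypothetical 40-character sample can be sketched in Python as below. The sketch rests on assumptions of our own for illustration: code words are held as tuples of code elements, a selected sequence is replaced wherever it occurs by a single new element named after its digits (rather than by "*"), and ties between equal SSF values are broken arbitrarily. The function names are hypothetical.

```python
from collections import Counter


def ssf_values(words):
    """Map each sequence of two or more code elements to f x (n - 1).

    `words` maps a code word (a tuple of code elements) to its occurrences.
    """
    freq = Counter()
    for word, count in words.items():
        for start in range(len(word)):
            for end in range(start + 2, len(word) + 1):
                freq[word[start:end]] += count
    return {seq: f * (len(seq) - 1) for seq, f in freq.items()}


def merge(word, seq, token):
    """Rewrite `word` so each occurrence of `seq` becomes the single element `token`."""
    out, i = [], 0
    while i < len(word):
        if word[i:i + len(seq)] == seq:
            out.append(token)
            i += len(seq)
        else:
            out.append(word[i])
            i += 1
    return tuple(out)


def select_sequences(words, how_many):
    """Repeatedly select the sequence with the highest SSF, then recalculate."""
    selected = []
    for _ in range(how_many):
        values = ssf_values(words)
        if not values:
            break
        best = max(values, key=values.get)
        token = "".join(best)
        selected.append((token, values[best]))
        merged = Counter()
        for word, count in words.items():
            merged[merge(word, best, token)] += count
        words = merged
    return selected


# The hypothetical sample from the text: "251" x20, "2511" x10, "4125" x10.
sample = {tuple("251"): 20, tuple("2511"): 10, tuple("4125"): 10}
print(select_sequences(sample, 2))
```

On this sample the sketch selects "251" with an SSF of 60 and then "4125" with an SSF of 30, matching the two tables above; as the text stresses, these figures illustrate the procedure only.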
A list of 28 code sequences in generally descending order
of SSF values as determined in accordance with the foregoing
principles is given in Table 1 below.

251   33   121  12   44   35     53    11   52   32
2511  41   51   21   354  32511  1233  31   45
54    132  123  331  453  533    111   551  25
TABLE 1: Code sequences having highest SSF values
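The make-up of Table 1 can be tallied with a short Python sketch. The sequences are copied from the table as transcribed here; the variable names are hypothetical, and the share figure assumes the five-stroke system, in which 5 x 5 = 25 digraphic sequences are possible.

```python
from collections import Counter

# Table 1 as transcribed above.
TABLE_1 = """
251 33 121 12 44 35 53 11 52 32
2511 41 51 21 354 32511 1233 31 45
54 132 123 331 453 533 111 551 25
""".split()

NAMES = {2: "digraphic", 3: "trigraphic", 4: "tetragraphic", 5: "pentagraphic"}

# Graphicity is simply the number of code elements in the sequence.
by_graphicity = Counter(len(seq) for seq in TABLE_1)
for n in sorted(by_graphicity):
    print(f"{NAMES[n]}: {by_graphicity[n]}")

digraphs = {seq for seq in TABLE_1 if len(seq) == 2}
share = len(digraphs) / 5 ** 2
print(f"share of the 25 possible digraphic sequences: {share:.0%}")
```

The tally shows that digraphic sequences make up the largest group in the table and cover more than half of all possible two-stroke sequences.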
It should be noted that ther:e are relatively ~small
differences only in the SSF values associated with the last several
sequences listed, and it is again stressed that the values
associated with the sequences will be dependent upon the "selected"
sequences, hence some change in the order, particularly tcwards
the lower end, may be found.
The polygraphic sequences of code elements such as are
given in Table 1 are not intended to define the spatial arrangement
of the strokes, and the significance thereof may vary from character
to character. This is illustrated by the codes of the characters
defined in Figs. 5a, 5b and 5c where in each instance the
trigraphic code sequence of 251 occurs. In Fig. 5a this
sequence represents the radical 口 "mouth". In Fig. 5b this same
sequence represents an arrangement of strokes having no syntactical
significance. In Fig. 5c the same sequence represents a still
further arrangement of strokes differing from those of Fig. 5a
and 5b and again having no syntactical significance. In this
respect, then, the sequences do not correspond to the "entities"
of strokes defined by Caldwell, such "entities", it being recalled,
being representative of a defined and specific spatial arrangement
of strokes normally having a syntactical significance. Thus whilst
Caldwell proposed an entity key representative of the radical
"mouth" and such key would be of utility in encoding the
character of Fig. 5a, such utility would not extend to encoding
the strokes comprising the characters of Figs. 5b and 5c.
Whilst the limitation of the number of basic stroke types
used for code purposes has considerable significance in assisting
stroke type identification as earlier discussed, such limitation
has further significance in relation to an ergonomically efficient
keyboard. Given a standard Qwerty keyboard with 44 keys, let it be
assumed that 10 such keys are assigned to input numerals and 4 to
input special instructions and punctuations; there will then
remain 30 keys for character and machine function coding purposes.
If 21 such keys are required for encoding basic strokes, there will
be available only 9 keys to which polygraphic code sequences
may be assigned. In a 21 basic stroke coding system there is
a total of 441 digraphic sequences (21²) that are theoretically
possible, hence it will be apparent that only a relatively small
proportion of the digraphic sequences could be assigned a key
position. Also, in such system the stroke saving functions of
the various code sequences are generally low due to the comparatively
low value of f, the frequency with which a sequence recurs.
In comparison, in our preferred 5 basic stroke system, there may
be up to 25 keys available for assignment to polygraphic sequences,
whereby all of the 25 possible digraphic code sequences might be
assigned key positions. However we prefer a more efficient
keyboard where those polygraphic sequences generally having the
highest stroke-saving function values are assigned to the
available keys. Twenty eight of these sequences are listed in
Table 1.
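
The key-budget arithmetic of the preceding paragraph can be checked with
a short sketch. The function name key_budget and its parameter names are
illustrative assumptions and do not appear in the disclosure; the figures
of 44, 10 and 4 keys are those assumed in the text.

```python
def key_budget(total_keys, numeral_keys, special_keys, basic_strokes):
    """Return (keys left for polygraphic sequences, possible digraphic sequences).

    Hypothetical helper: it only restates the subtraction and squaring
    described in the text for a keyboard of a given size.
    """
    coding_keys = total_keys - numeral_keys - special_keys  # 30 in the text
    polygraphic_keys = coding_keys - basic_strokes          # keys not needed for single strokes
    digraphic_sequences = basic_strokes ** 2                # ordered pairs of basic strokes
    return polygraphic_keys, digraphic_sequences


print(key_budget(44, 10, 4, 21))  # (9, 441)  21 basic stroke system
print(key_budget(44, 10, 4, 5))   # (25, 25)  preferred 5 basic stroke system
```

With 21 basic strokes only 9 of the 441 possible digraphic sequences could
receive a key, whereas with 5 basic strokes the 25 remaining keys equal the
number of possible digraphic sequences.
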
Referring to Fig. 6, a "standard" keyboard is identified
therein by the numeral 10. As used herein, "standard" refers
to a key arrangement wherein there are provided three horizontally
arranged ranks of keys identified as upper rank 12, middle rank 14
and lower rank 16, wherein all character coding keys are located.
Generally there are some ten keys in each rank. In a Qwerty key
arrangement, the twenty six letters of the Latin alphabet are assigned
standard positions in these three ranks, the four remaining keys
being assigned punctuation functions. Other keys to the left of
the left hand keys and to the right of the right hand keys may
also be present; these keys are generally assigned machine
operation, punctuation or special symbol functions. These
additional keys are not normally used for character coding purposes.
Still further keys may be present and are here shown as a rank 18
superior to upper rank 12, and are used for inputting the
numerals 0 - 9 and/or symbols. This rank of keys is shown without
specific designation appearing thereon so as to avoid any
confusion in the ensuing description. Such numerical value inputting
keys may commonly be formed as a separate array in a computer input
keyboard. There is no fixed limit to the number of keys on a
"standard" keyboard, but the maximum number is usually about 50.
For touch typing of words (which expression here includes
Chinese type characters) the word writing keys are to be considered
as preferably consisting of a left hand and a right hand sphere
of operation, the keys being divided accordingly by an imaginary
line 19. "Home" finger positions are located on middle rank 14
and comprise the four keys of each hand commencing one key
removed from line 19. In the Qwerty keyboard assignment such eight
home keys are identified as "A,S,D,F" and "J,K,L,;". Our standard
Chinese character coding keyboard includes monographic keys
for entering basic strokes, and polygraphic keys for entering
sequences of basic strokes, selection of the sequences for
inclusion on the keyboard being determined from the ranking of
their SSF values as earlier discussed. The assignment of the
exact location for each of the aforementioned keys, both
monographic and polygraphic, is determined by studies
of the "mono-strike" and "dual-strike" frequencies of these
keys. The "mono-strike" frequency of a key, whether a monographic or a
polygraphic key, is the frequency of that key being hit in coding
Chinese characters from a properly selected sample reflecting
the average Chinese prose. The "dual-strike" frequency is the
frequency of occurrence of a sequence of two keys, whether
monographic or polygraphic, in a similar sample as described.
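
The two frequency measures just defined can be illustrated by a minimal
sketch. The function name strike_frequencies and the three-character
sample are hypothetical; the sketch further assumes that key pairs are
counted within a character only, ignoring the space bar and pairs that
span two characters, which the text does not specify.

```python
from collections import Counter


def strike_frequencies(coded_characters):
    """Count mono-strike and dual-strike occurrences.

    coded_characters: one list of key labels per character, where a label is
    a monographic key such as "2" or a polygraphic key such as "251".
    """
    mono = Counter()
    dual = Counter()
    for keys in coded_characters:
        mono.update(keys)                 # each key struck once per occurrence
        dual.update(zip(keys, keys[1:]))  # successive key pairs inside one character
    return mono, dual


# Hypothetical sample, not taken from the disclosure.
sample = [["251", "1"], ["3", "251"], ["251", "1", "4"]]
mono, dual = strike_frequencies(sample)
print(mono["251"])         # 3
print(dual[("251", "1")])  # 2
```

Dividing each count by the total number of strikes (or pairs) in the
sample would turn the counts into relative frequencies.
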
To achieve maximum ergonomic efficiency in the keyboard
design, the work loads of both hands are distributed approximately
equally, that is, the sum total of the "mono-strike" frequencies
of all the keys for the left hand is about the same as the right.
Also, the work load, i.e. the sum of "mono-strike" frequencies, for
each finger is distributed directly proportionally to the strength
and tapping speed of that finger. Furthermore, the keys with the
highest "mono-strike" frequency, which in our studies include all
five basic keys, are assigned to the most accessible keys, namely
the home keys. Lastly, pairs of keys that have high "dual-strike"
frequencies are arranged so that the two keys of each pair are
assigned to opposite hands, such that the operation of successive
keys by alternate hands be maximized.
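
Two of the layout criteria above, equal hand loads and alternation of
hands, lend themselves to a simple numerical check. The sketch below is
only one possible way of scoring a candidate layout; the function names,
the frequency figures and the hand assignment are all hypothetical.

```python
def hand_balance(mono, hand_of):
    """Fraction of all key strikes made by the left hand (0.5 is balanced)."""
    total = sum(mono.values())
    left = sum(n for key, n in mono.items() if hand_of[key] == "L")
    return left / total


def alternation_rate(dual, hand_of):
    """Fraction of successive key pairs struck by alternate hands."""
    total = sum(dual.values())
    alternating = sum(n for (a, b), n in dual.items() if hand_of[a] != hand_of[b])
    return alternating / total


# Hypothetical counts and hand assignment, for illustration only.
mono = {"1": 40, "2": 30, "251": 20, "33": 10}
dual = {("1", "251"): 12, ("2", "33"): 5, ("1", "33"): 3}
hand_of = {"1": "L", "2": "R", "251": "R", "33": "L"}

print(hand_balance(mono, hand_of))      # 0.5
print(alternation_rate(dual, hand_of))  # 0.85
```

A layout search could compare candidate assignments by these two scores;
the per-finger weighting mentioned in the text is not modelled here.
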
We prefer to identify the keys in accordance with a code
or code sequence corresponding to the basic stroke or sequence of
basic strokes assigned to the keys, arabic numerals being preferred
for this purpose in view of their easy recognition. It will be
appreciated that the single digit number used to identify a basic
stroke is arbitrary.
The various code elements assigned to the character
coding keys of keyboard 10 in general accordance with the
above principles may be seen in Fig. 6. It may be observed from
a comparison of this Figure with Table 1 that the sequences "25"
and "123" have been assigned keyboard positions in preference to
others of nominally superior SSF value; in practice it was found
that such keys were preferred by an operator to other possible
keys of approximately equal SSF value. Also in keyboard 10 an
"end of code" designation is assigned to the space bar of the
keyboard. Keyboard 10 used in a word processing system for the
typing of Chinese type characters to be described was subject
to a performance test by an operator. A text limited to 350
characters was selected, such characters being of varying
degrees of complexity such as would be found in a text of
wider scope. After several sessions totalling only 20 hours
of practice, the operator attained a speed of 50 characters per
minute. This is almost 4 times faster than the only reported
speed for the operation of a Chinese keyboard device of 14
characters/minute. Inputting of the coding required an average
of 3.76 coding key-strikes/character plus 1 strike for the
space bar, to denote the end of the character code, hence the
keying speed compares very favourably to typing an English
language text after an extended training period. It was found
that the frequency of use of the polygraphic keys by the trained
operator was quite close to that of the optimal result computed.
The average number of strokes/character in the selected text was
about 7.29, hence each key strike represented about 1.94 strokes
of a Chinese character.
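
The figures quoted for the performance test can be cross-checked with a
few lines of arithmetic. The variable names below are illustrative; the
input values are those stated in the text, and the key strikes per minute
figure is derived here rather than quoted from the disclosure.

```python
chars_per_minute = 50            # speed attained by the operator
coding_strikes_per_char = 3.76   # average coding key strikes per character
strokes_per_char = 7.29          # average strokes per character in the test text
reported_prior_speed = 14        # characters/minute, the only reported prior speed

strikes_per_char = coding_strikes_per_char + 1        # plus the space bar
strikes_per_minute = chars_per_minute * strikes_per_char
strokes_per_strike = strokes_per_char / coding_strikes_per_char
speed_ratio = chars_per_minute / reported_prior_speed

print(round(strikes_per_minute))     # 238
print(round(strokes_per_strike, 2))  # 1.94
print(round(speed_ratio, 2))         # 3.57
```

The value of 1.94 strokes per key strike is reproduced when the space bar
strike is excluded from the divisor, and the ratio of 3.57 corresponds to
the "almost 4 times" stated above.
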
An exemplary Chinese character word processing system
including keyboard 10 is indicated schematically in Fig. 7. The
system further comprises a converter 20 which generates a bit
pattern corresponding to the code or code sequences represented
by a key struck on keyboard 10. Such bit pattern is preferably
generated in simple binary code.
The bit pattern generated by the actuation of a
monographic key can be assigned to 3 significant bits in our
preferred arrangement wherein there are 5 stroke types plus one
"stop" code to designate the end of a code word, and there will
be 2 bit patterns left undefined in the 8 possible bit patterns.
Alternately, the code elements can be concatenated in an
accumulator, and three such code elements can be assigned to one
byte, taking up 216 (i.e. 6³) of the 256 possible bit patterns,
leaving 40 bit patterns for alphabets, numerals, or other coded
instructions. Other alternatives are possible, such as
starting each Chinese code word at the beginning of a byte,
and assigning 3 code elements to a byte, thus using up only 180
of the 256 possible bit patterns, leaving 76 for other codes.
Arrangements as such can be varied and are generally known in
the art.
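
The packing of three code elements into one byte can be sketched as a
base-6 encoding. The function names pack3 and unpack3 and the choice of 0
as the numeric value of the stop code are assumptions made for
illustration; the disclosure does not fix a particular bit assignment.

```python
STOP = 0  # assumed value of the "stop" code; stroke types are taken as 1 to 5


def pack3(a, b, c):
    """Pack three code elements (each 0 to 5) into one value in 0..215."""
    for element in (a, b, c):
        if not 0 <= element <= 5:
            raise ValueError("code element must be in 0..5")
    return (a * 6 + b) * 6 + c


def unpack3(value):
    """Inverse of pack3: recover the three code elements from a byte value."""
    if not 0 <= value < 216:
        raise ValueError("not a packed code byte")
    a, rest = divmod(value, 36)
    b, c = divmod(rest, 6)
    return a, b, c


used = 6 ** 3
print(used, 256 - used)          # 216 40
print(pack3(2, 5, 1))            # 103
print(unpack3(pack3(2, 5, 1)))   # (2, 5, 1)
print(5 * 6 * 6, 256 - 5 * 6 * 6)  # 180 76
```

The last line shows that the figure of 180 equals 5 × 6 × 6, which would
follow if the first element of a byte is never the stop code; the text
does not spell this out, so that reading should be treated as an
assumption.
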
To take full advantage of the increasingly popular
and economical 8-bit microprocessors and integrated memory
circuits, uniform length code signals of not longer than 8 bits
are required. All the above listed possible binary code signal
patterns are of uniform lengths. In a coding system of 6 code
elements (5 stroke types plus a stop code), the average number
of bits per code element is 2.25
((6 bit patterns / 8 possible bit patterns) × 3 bits).
In the prior art system of 21 strokes, the average number of bits
per code element is 3.4375
((22 bit patterns / 32 possible bit patterns) × 5 bits).
Thus, our invention provides a 1.53 (3.4375 ÷ 2.25) times
improvement in memory space requirement and processing speed when
uniform length code signals are used. This significant improvement
in efficiency is achieved by a well designed stroke type
classification which extracts the most distinguishing properties
of the strokes in Chinese characters. Also, this efficiency is
achieved by a relatively uniform distribution of stroke type
frequencies, which is considered a more optimal code system
according to information theory.
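
The comparison of average bits per code element can be reproduced as
follows. The function name average_bits is an assumption; the formula is
the one used in the text, namely the fraction of bit patterns actually
used multiplied by the number of bits in the uniform length code.

```python
def average_bits(code_elements):
    """Average bits per code element as computed in the text:
    (used bit patterns / possible bit patterns) * bits per pattern."""
    bits = (code_elements - 1).bit_length()  # smallest width holding all elements
    return code_elements / 2 ** bits * bits


ours = average_bits(6)    # 5 stroke types plus a stop code, 3 bits
prior = average_bits(22)  # 21 stroke types plus a stop code, 5 bits
print(ours, prior, round(prior / ours, 2))  # 2.25 3.4375 1.53
```
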
The generated bit patterns are routed via bus 22 to an
input buffer 24 which provides temporary storage for editing
purposes, and in the case of several keyboards sharing the same
output information generator 30, provides storage until the latter
is available.
The keyboard 10 also permits keyboard input of control
instructions via the central control unit 28 to the various
processing units of the word processing system, via control lines
26 for functions such as deletion or addition of certain bit
patterns during editing and correction, or control of
information flow to and from various units.
The binary machine words representing the code words for
Chinese characters are converted in the output information generator
30 into the appropriate forms of information. The matching of the
binary code word to the output information may be by one of the
many well documented algorithms, such as the "hashing" method,
for example. The output information will also vary according to
the purpose of the information processing system, and may comprise for
example X-Y coordinates of the location of a character stored in
tangible form on film, disc or tape, or may be writing instructions
for producing hardcopies, such as instructions to matrix printers
and ink jet printers, for example. In the case of a keyboard
controlled movable type printing system, the output information
can be the control signals to select the type of a particular Chinese
character; or in the case of telecommunication, the output
information can be the corresponding signals in minimum redundancy
codes for telegraphic transmission. Combinations of the various
information types may of course be utilized in the same Chinese
information processing system.
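
The matching of a code word to its output information by hashing can be
sketched with a hash table keyed on the code word. The sketch below uses
a Python dictionary, which is itself a hash table; the function names,
the code words and the X-Y coordinates are hypothetical placeholders and
do not describe the actual contents of output information generator 30.

```python
# Code word (string of stroke digits) -> list of output records.
# A list is used because several characters may share an identical code.
table = {}


def register(code_word, record):
    """Store one output record under a code word."""
    table.setdefault(code_word, []).append(record)


def lookup(code_word):
    """Return every output record stored under the code word (possibly none)."""
    return table.get(code_word, [])


# Hypothetical entries: placeholder character names and film coordinates.
register("2511", {"character": "character A", "xy": (3, 7)})
register("12", {"character": "character B", "xy": (0, 1)})
register("12", {"character": "character C", "xy": (0, 2)})

print(len(lookup("12")))           # 2
print(lookup("2511")[0]["xy"])     # (3, 7)
print(lookup("555"))               # []
```

When a lookup returns more than one record, the candidates would be
presented to the operator for selection.
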



The information from the output information generator
is received in an output buffer 34, which stores the information
temporarily, permitting suitable material to be viewed on a
video monitor 36, or light emitting diode array, or liquid crystal
display, for example. This is desirable in permitting the operator
to examine the outputted material before any permanently printed
copy is made, or the information transmitted, so as to allow text
editing, correction and selection between characters having identical
codes. Output buffer 34 further permits time storing of storage
means, and production of a hardcopy by a printer 37. Storage
means 38 is connected to both input buffer 24 and output buffer
34 whereby information in either code word form or in output
information form that had been earlier generated in accordance with
the foregoing may be stored and later examined and/or printed or
transmitted.



Administrative Status

Title Date
Forecasted Issue Date 1984-08-07
(22) Filed 1980-03-06
(45) Issued 1984-08-07
Expired 2001-08-07

Abandonment History

There is no abandonment history.

Payment History

Fee Type Anniversary Year Due Date Amount Paid Paid Date
Application Fee $0.00 1980-03-06
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
LEUNG, DANIEL L.
LEUNG, LAI-WO S.
Past Owners on Record
None
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents



Document Description   Date (yyyy-mm-dd)   Number of pages   Size of Image (KB)
Drawings               1993-12-09          2                 51
Claims                 1993-12-09          5                 187
Abstract               1993-12-09          1                 21
Cover Page             1993-12-09          1                 15
Description            1993-12-09          23                926