Database access using data field translations to find unique database entries
5778344
Abstract
A database with entries having possibly indistinguishable tuples of fields in one or more related vocabularies is accessed using vocabulary translations of fields to identify unique tuples thereof.
Claims
What is claimed is:
Description
BACKGROUND OF THE INVENTION
______________________________________
Database          Spoken
representation    representation
______________________________________
IPSWICH           IPSWICH
IPSWICH           KESGRAVE
IPSWICH           FOXHALL
KESGRAVE          KESGRAVE
KESGRAVE          IPSWICH
FOXHALL           FOXHALL
FOXHALL           IPSWICH
______________________________________
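As an illustration only, the rows of the table above could be held in a structure that can be looked up in either direction. The following is a minimal sketch, assuming an in-memory store; the names `TOWN_ROWS`, `build_table`, `to_spoken` and `to_database` are hypothetical and do not appear in the specification.

```python
from collections import defaultdict

# Rows of the town table above: (database representation, spoken representation).
TOWN_ROWS = [
    ("IPSWICH", "IPSWICH"),
    ("IPSWICH", "KESGRAVE"),
    ("IPSWICH", "FOXHALL"),
    ("KESGRAVE", "KESGRAVE"),
    ("KESGRAVE", "IPSWICH"),
    ("FOXHALL", "FOXHALL"),
    ("FOXHALL", "IPSWICH"),
]


def build_table(rows):
    """Index the rows in both directions (hypothetical helper)."""
    to_spoken = defaultdict(set)    # database word -> spoken words
    to_database = defaultdict(set)  # spoken word -> database words
    for database_word, spoken_word in rows:
        to_spoken[database_word].add(spoken_word)
        to_database[spoken_word].add(database_word)
    return to_spoken, to_database


to_spoken, to_database = build_table(TOWN_ROWS)
```

With these rows, a caller who says "KESGRAVE" is matched against database entries recorded under either KESGRAVE or IPSWICH, since `to_database["KESGRAVE"]` contains both.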
(If desired, any word used as a database representation which has a 1:1 correspondence with, and is the same as, a spoken vocabulary word may be omitted from the table, since no translation is required.) The spoken translation table 9 has a separate area for each type of field and may be accessed by the processor 5 to determine the database representation(s) corresponding to a given vocabulary word and vice versa. If desired (or if the database representations are not in text form) all items may be translated.

The pronunciation table 10 is a store containing a look-up table (and, if desired, a set of rules to reduce the number of entries in the look-up table) so that the processor 5 may access it (for synthesis purposes or for identifying homophones) to obtain, for a given spoken vocabulary word, a phonetic representation of one or more ways of pronouncing it, and, conversely (for recognition purposes), to obtain, for a given phonetic representation, one or more spoken vocabulary words which correspond to that pronunciation. A separate area for each type of field may be desirable.

The operation of the apparatus is illustrated in the flow-chart of FIGS. 3a-3c, which is implemented as a program stored in the memory 6. The first steps involve the generation, using the synthesiser, of questions to the user, and recognition of the user's responses. Thus in steps 100, 104, 108 the processor 5 sends to the synthesiser 4 commands instructing it to play announcements requesting the user to speak, respectively, the surname, forename and town of the person whose telephone number he seeks. In steps 102, 106 and 110 the processor sends to the recogniser 5 commands instructing it to recognise the user's responses by reference to phonetic vocabularies corresponding to those fields.
The recogniser may access the tables 9, 10 to determine the vocabularies to be used for each recognition step, or may internally store or generate its own vocabularies; in the latter case the vocabularies used must correspond to those determined by the tables 9, 10 (and, if appropriate, the database) so that it can output only words included in the phonetic vocabulary. The recogniser is arranged so that it will produce as output, for each recognition step, as many phonetic representations as meet a predetermined criterion of similarity to the word actually spoken by the user. (The recogniser could of course perform a translation to spoken vocabulary representations, and many recognisers are capable of doing so.) Preferably the recogniser also produces a "score" or confidence measure for each representation indicating the relative probability of correspondence to the word actually spoken.

The preliminary steps 100-110 will not be discussed further as they are described elsewhere; for example, reference may be made to our co-pending International patent application no. PCT/GB/02524.

Following step 110, the processor 5 has available to it, for each of the three fields, one or more phonetic representations deemed to have been recognised. What is required now is a translation to spoken vocabulary representations--i.e. the translation illustrated to the left of FIG. 1. Thus in step 112 the processor accesses the pronunciation table 10 to determine, for each phonetic representation, one or more corresponding spoken vocabulary representations, so that it now has three sets of spoken vocabulary representations. The score for each spoken vocabulary representation is the score for the phonetic representation from which it was translated. If two phonetic representations translate to the same vocabulary representation, the more confident of the two scores may be taken.

In step 114, the processor 5 now performs a translation to database representations--i.e.
the translation illustrated in the centre of FIG. 1--using the translation table 9 to determine, for each word, one or more corresponding database representations, so that it now has three sets of database representations. Scores may be propagated as for the earlier translation. These sets represent a number of triples (the actual number being the product of the number of representations in each of the three sets). The score for a triple is typically the product of the scores of the individual representations of which it is composed.

At step 116, the processor generates a list of these triples and passes it to the database, which returns a count of the number of database entries corresponding to these triples. If (step 118) this number is zero, then the processor in step 120 sends a command to the synthesiser to play an announcement to the effect that no entry has been found, and terminates the program (step 122). Alternatively, other action may be taken, such as transferring the user to a manual operator.

If there are entries, then in step 124 the full tuples are retrieved from the database in turn to determine whether there are three or fewer distinguishable entries. The meaning of "distinguishable" and the method of its determination will be explained presently. Once a count of four is reached the test is terminated. If (step 126) the number of distinguishable entries is three or fewer, then in step 128 the processor retrieves these entries from the database and forwards them to the synthesiser 4, which reads them to the user, using the tables 9, 10 for translation from database representation to phonetic representation.

If there are more than three distinguishable entries, then the process enters a confirmation phase in which an attempt is made to identify lists of extracted tuples which contain three or fewer distinguishable tuples, and to offer the tuples in turn to the user for confirmation. In this example the tuples are the duple corresponding to the name (i.e.
forename+surname), and the single corresponding to the town. Note that, although this is the case in this example, it is not in principle necessary that the constituent words of these tuples correspond to fields for which the user has already been asked.

Firstly, therefore, a list of extracted duples is prepared from the list of triples. (If desired, the number of nonidentical duples in the list may be counted, and if this exceeds a predetermined limit, e.g. 3, the detailed examination process may be skipped, proceeding to step 144.) This process is iterative; thus in step 130 a check is made as to whether the name duple has already been offered for confirmation; on the first pass the answer will always be "no", and so at step 132 the name duples from the list are examined in similar fashion to that in step 124 to determine whether there are three or fewer distinguishable duples. If there are (step 134), then each duple is translated into a single phonetic representation and fed to the synthesiser in step 136, so that the synthesiser speaks the question (e.g. "is the name John Smith?--please answer yes or no") for the duples one at a time, with the recogniser forwarding each reply to the processor for testing (step 138) for "yes" or "no".

If the user replies "yes", then, in step 140:
(a) the surname and forename fields are marked "confirmed" so that further offering of them for confirmation is bypassed by the test at step 130;
(b) all members of the list of triples, other than those which are related to the confirmed duple (see below), are deleted.
The process may then recommence from step 124.

If (step 142) the user has answered "no" to all the offered tuples, this is considered a failure and the process is terminated via steps 120 and 122.
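The handling of triples in steps 114 to 116 and the name confirmation of steps 130 to 142 can be illustrated with a short sketch. It is not the patented implementation: the function names, the (surname, forename, town) ordering, the ranking of duples by their best triple score, and the `ask` callback standing in for the synthesiser and recogniser are all assumptions made for illustration.

```python
from itertools import product


def scored_triples(surnames, forenames, towns):
    """Combine three {database_representation: score} sets into scored triples.

    The score of a triple is the product of the scores of its members.
    """
    triples = {}
    for (s, ps), (f, pf), (t, pt) in product(
        surnames.items(), forenames.items(), towns.items()
    ):
        triples[(s, f, t)] = ps * pf * pt
    return triples


def name_duples(triples):
    """Extract the (surname, forename) duples, best-scoring first (assumed order)."""
    best = {}
    for (s, f, _town), score in triples.items():
        best[(s, f)] = max(score, best.get((s, f), 0.0))
    return sorted(best, key=best.get, reverse=True)


def confirm(candidates, ask):
    """Offer candidates one at a time; return the first accepted, else None."""
    for candidate in candidates:
        if ask(candidate):  # ask() stands in for the spoken yes/no exchange
            return candidate
    return None  # every offer was rejected: treated as a failure


def prune(triples, related_duples):
    """Keep only the triples whose name duple is related to the confirmed one."""
    return {k: v for k, v in triples.items() if (k[0], k[1]) in related_duples}
```

In use, `confirm` would be called only when the list of distinguishable duples is short enough (three or fewer in the example above), and `prune` would be followed by a fresh count of distinguishable entries, as in the return to step 124.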
If in the test at step 134 the number of distinguishable name duples is too large for confirmation, or at step 130 on a second or subsequent pass the name confirmation has already occurred, and assuming (step 144) the town name has not yet been offered for confirmation, then a town name confirmation process is commenced, comprising steps 146 to 154, which are in all respects analogous to the steps 132 to 142 already described.

If these processes fail to reduce the number of distinguishable entries at the test 126, then the process eventually terminates with an announcement 156 that too many entries have been found for a response to be given. Alternatively, a further procedure may follow in which one or more further questions are asked (as in step 100) to obtain information on further fields.

This process shown in FIG. 3b from step 116 onwards has, for clarity, been described in terms of confirmation of a duple and a single. A more generalised algorithm might proceed as follows.

Start:
  If there are no database entries still active:
    Give "none" message. Finish algorithm.
Jump:
  If there are three or fewer distinguishable database entries:
    Offer them. Finish algorithm.
  If not, then:
    Do the following for successive prioritised fields or combinations of fields that have not already been confirmed, until no such fields remain:
      If for this there is a tuple list with 3 or fewer distinguishable tuples then:
        Attempt to confirm on this list.
        If positive confirmation, confirm it and go to Jump.
        If negative confirmation, give "wrong entry" message.
      Go back to "Do the following".
    In a prioritised list, get the next vocabulary which may be asked.
    If there is an un-asked and un-confirmed vocabulary remaining:
      Ask for it. Go to Start.
    If not:
      Give "too many" message. Finish algorithm.

In the above procedures, it is required to examine a list of tuples in database representation to determine how many distinguishable tuples there are.
The tuple in question may be an entire database entry (as in step 124 above), it may be an extracted tuple containing representations from two (or more) fields (as in step 132) or it may be an extracted single (as in step 146).

Two representations are considered indistinguishable if:
(a) they are identical; or
(b) they translate to identical spoken vocabulary words (e.g. they are synonyms or are geographically confused); or
(c) they translate to spoken vocabulary words which are homophones (i.e. those words translate to identical phonetic representations).

Two tuples are considered indistinguishable if every field of one tuple is indistinguishable (as defined above) from the corresponding field of the other tuple.

Suppose that we have a list of tuples in database representation where the first tuple in the list is D(1) and the tuple currently occupying the n'th position in the list is D(n), where n=1, . . . , N, there being N tuples in the list. Each tuple consists of M fields, designated d, so that the m'th field of tuple D(n) is d(n,m)--i.e. D(n)={d(n,m)}, m=1, . . . , M. Preferably the list is ordered by score; i.e. the tuple having the highest confidence is D(1), the next D(2) and so on.

The process to be described is illustrated in the flowchart of FIG. 4 and involves taking the first tuple from the list, and comparing it with the tuple below it in the list to ascertain whether the two are distinguishable. If they are not, the tuple occupying the lower position is deleted from the list. This is repeated until all tuples have been examined. The same steps are then performed for the tuple now occupying the second position in the list, and so on; eventually every tuple remaining in the list is distinguishable from every other. If desired, the process may be terminated as soon as it is certain that the number of distinguishable tuples exceeds that which can be handled by subsequent steps (i.e., in this example, 3). In FIG.
4, i points to a tuple in the list and j points to a tuple lower down the list. I is the number of tuples in the list. In step 200, i is initialised to 1 and I is set to N, and in step 202 D(i) is read from the database. Step 204 sets j to point to the following tuple and in step 206 D(j) is read. A field pointer m is then initialised to 1 in step 208, and this is followed by a loop in which each field of the two tuples is taken in turn.

Field m of tuple D(i) is (step 210) translated, with the aid of the table 9, into one or more spoken vocabulary words s1(a), where a=1, . . . , A and A is, effectively, the number of synonyms found. The spoken vocabulary word(s) s1(a) is/are then translated (step 212) with the aid of the table 10 into a total of B phonetic representations p1(b) (b=1, . . . , B). B is the number of such representations, i.e. A multiplied by the number of homophones. Analogous steps 214, 216 perform a two-stage translation of the corresponding field of D(j) to produce one or more phonetic representations p2(d) (d=1, . . . , D). In step 218, each of the phonetic representations p1(b) is compared with each of the representations p2(d) (i.e. B×D comparisons in total).

If equality is found in none of these comparisons, then the two tuples are considered distinguishable. If (step 226) j has not reached the last tuple in the list, it is incremented (step 228) prior to reading a further tuple in a repeat of step 206; otherwise the tuple pointer i is tested at step 230 as to whether it has reached the penultimate member of the list and either (if it has not) is incremented (step 232) prior to a return to step 202, or (if it has) the process ends. At this point, the list contains only mutually distinguishable tuples--I in number--and thus the result k is set to I in step 233 prior to exit from this part of the process at step 234.
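The comparison loop of FIG. 4, including the deletion and early-exit branches, can be sketched compactly. The version below is a single-pass restructuring rather than a step-by-step transcription of the flowchart: each tuple is kept only if it is distinguishable from every tuple already kept, which retains the same tuples as deleting lower-placed duplicates in place. The dictionary-based tables and the rule that a word missing from a table translates to itself are assumptions for illustration.

```python
def phonetic_forms(word, to_spoken, to_phonetic):
    """Two-stage translation: database word -> spoken words -> phonetic forms.

    A word absent from a table is assumed to translate to itself.
    """
    forms = set()
    for spoken in to_spoken.get(word, {word}):
        forms |= to_phonetic.get(spoken, {spoken})
    return forms


def indistinguishable(t1, t2, to_spoken, to_phonetic):
    """True if every pair of corresponding fields shares a phonetic form."""
    return all(
        phonetic_forms(a, to_spoken, to_phonetic)
        & phonetic_forms(b, to_spoken, to_phonetic)
        for a, b in zip(t1, t2)
    )


def distinguishable(tuples, to_spoken, to_phonetic, limit=3):
    """Return (kept, k) for a list assumed to be ordered best score first.

    k is the number of distinguishable tuples, capped at limit + 1 so that
    the scan can stop as soon as there are known to be too many.
    """
    kept = []
    for candidate in tuples:
        if any(
            indistinguishable(candidate, earlier, to_spoken, to_phonetic)
            for earlier in kept
        ):
            continue  # represented by a higher-placed tuple; drop it
        kept.append(candidate)
        if len(kept) > limit:
            return kept, limit + 1  # early exit
    return kept, len(kept)
```

Because the list is scanned best score first, the tuple retained from each indistinguishable group is the most confident one, which matches the preference for an ordered list stated above.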
If, on the other hand, the comparison at 218 indicates identity between one of the phonetic representations generated for one field of one tuple and one of the phonetic representations generated for the same field of the other tuple, then it is necessary to increment m (step 236) and repeat steps 210 to 218 for a further field. If all fields of the two tuples have been compared and all are indistinguishable, then this is recognised at step 238 and the tuples are deemed to be indistinguishable. In this case, the lower tuple D(j) is removed from the list and I is decremented so that it continues to represent the number of tuples remaining in the list (steps 240, 242). j is then tested at step 244 to determine whether it points beyond the end of the (now shortened) list, and if not a further tuple is examined, continuing from step 206. Otherwise the process proceeds to step 230, already described.

Each time step 232 increments i to point to a tuple, it is known that there are at least i tuples which will not be removed from the list by step 240. Thus at this point i can be tested (step 246) to see if it has reached 3, and if so the process may be interrupted, k set to 4, and thence to the exit 234.

In order to clarify the relationship between the algorithm of FIG. 4 and the steps of FIG. 3, it should be mentioned that:
(a) the algorithm represents the execution of step 124, with the list at the conclusion of FIG. 4 being used to access (from the database) the entries to be offered in step 128;
(b) the algorithm represents the execution of step 132, with the list at the conclusion of FIG. 4 representing the list of name duples (in database representation) to be offered to the user in step 136;
(c) the algorithm represents the execution of step 146, with the list at the conclusion of FIG. 4 representing the list of towns to be offered to the user in step 150.

It remains to explain the removal which occurs in steps 140 and 154 in FIG. 3b.
Taking step 140 as an example, the principle followed is that where the user has confirmed a tuple (in this case a duple) which is one of a pair (or group) of tuples deemed indistinguishable, then this is considered to constitute confirmation also of the other tuple(s) of the pair or group. For example, suppose the list of name duples contains:
Dave Smith
David Smyth
and these are considered by step 132 to be indistinguishable, so that only the first entry ("Dave Smith") is offered to the user for confirmation in step 136. If the user says "yes", then in step 140 all tuples containing "Dave Smith" and all tuples containing "David Smyth" are retained.

Whilst this could be done using the results of the translations performed in step 132, we prefer to proceed as follows. Each field p of the confirmed duple in phonetic representation (i.e. the one generated in step 136) is translated using the tables 9, 10 into one or more database representations. All duples represented by combinations of these representations are to be confirmed--i.e. any of the list of triples which contains one of these duples is retained, and the other triples are deleted.

It is perhaps worth clarifying the relationship between the Entity Relationship Diagram of FIG. 1 and the processes set out in FIGS. 3a-3c and 4. In these processes, translations occur from database representation to spoken vocabulary representation to phonetic representation (i.e. right to left in FIG. 1) and in the opposite direction, viz. from phonetic representation to spoken vocabulary representation to database representation (i.e. left to right in FIG. 1). The existence of alternative paths in the diagram (e.g. may be spoken; is primarily spoken) implies a choice of translation routes. For synthesis, the "primarily spoken" routes would normally be used; for other purposes, variations are possible according to whether one desires to include or exclude synonyms or homophones in the translation.
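The retention rule described above for step 140 (back-translating each field of the confirmed duple and keeping every triple whose name matches any combination of the results) could be sketched as follows. The table contents, the phonetic strings and the (surname, forename, town) ordering are invented for illustration and are not taken from the specification.

```python
from itertools import product


def database_forms(phonetic, to_spoken_words, to_database_words):
    """Phonetic form -> spoken words -> database representations.

    An item absent from a table is assumed to translate to itself.
    """
    forms = set()
    for spoken in to_spoken_words.get(phonetic, {phonetic}):
        forms |= to_database_words.get(spoken, {spoken})
    return forms


def related_duples(confirmed_phonetic, to_spoken_words, to_database_words):
    """All database duples formed by combining the per-field back-translations."""
    per_field = [
        database_forms(p, to_spoken_words, to_database_words)
        for p in confirmed_phonetic
    ]
    return set(product(*per_field))


def retain(triples, duples):
    """Keep the triples whose (surname, forename) part is a confirmed duple."""
    return [t for t in triples if (t[0], t[1]) in duples]


# Hypothetical tables: "Smith"/"Smyth" are homophones, "Dave"/"David" synonyms.
TO_SPOKEN_WORDS = {"/smIT/": {"SMITH", "SMYTH"}, "/deIv/": {"DAVE"}}
TO_DATABASE_WORDS = {"DAVE": {"DAVE", "DAVID"}}

confirmed = related_duples(("/smIT/", "/deIv/"), TO_SPOKEN_WORDS, TO_DATABASE_WORDS)
```

Here `confirmed` holds four duples (SMITH or SMYTH combined with DAVE or DAVID), so triples for both "Dave Smith" and "David Smyth" survive, while a triple for any other surname would be deleted by `retain`.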
The routes used for the different purposes are tabulated below, with an example set of routes for forenames.
______________________________________
Direction     Used in     Description              Typical route for forenames
              steps
______________________________________
phonetic to   112, 114    Mappings used to         MayBeSpoken/MayBePronounced
database                  convert a recognition    (i.e. include synonyms and
                          result into database     homophones)
                          representation

database to   124, 132,   Mappings used to         IsPrimarilySpoken/MayBePronounced
phonetic      146         decide on                (i.e. exclude synonyms but include
                          distinguishable          homophones in the translation,
                          database                 resulting in synonyms but not
                          representations          homophones being included in the
                                                   resulting list of distinguishable
                                                   tuples)

database to   136, 150    Mappings used for        IsPrimarilySpoken/
phonetic                  synthesis                IsPrimarilyPronounced

phonetic to   140, 154    Mappings used to         MayBeSpoken/MayBePronounced
database                  confirm a                (i.e. include both synonyms and
                          pronunciation back       homophones)
                          into database
                          representation
______________________________________
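To make the effect of the route choice concrete, the sketch below translates one forename from database representation to phonetic representation along three of the routes in the table. The mapping names follow the table, but their contents (the forename, its synonym and the pronunciation strings) are invented purely for illustration, as is the `translate` helper.

```python
# Hypothetical mapping contents; each maps a word to a set of results.
MAY_BE_SPOKEN = {"ANTHONY": {"ANTHONY", "TONY"}}
IS_PRIMARILY_SPOKEN = {"ANTHONY": {"ANTHONY"}}
MAY_BE_PRONOUNCED = {
    "ANTHONY": {"an-tuh-nee", "an-thuh-nee"},
    "TONY": {"toh-nee"},
}
IS_PRIMARILY_PRONOUNCED = {"ANTHONY": {"an-tuh-nee"}, "TONY": {"toh-nee"}}


def translate(word, spoken_route, pronounced_route):
    """Database word -> spoken words -> phonetic forms along the chosen routes."""
    forms = set()
    for spoken in spoken_route.get(word, {word}):
        forms |= pronounced_route.get(spoken, {spoken})
    return forms


# Synthesis (steps 136, 150): one primary spoken form, one primary pronunciation.
for_synthesis = translate("ANTHONY", IS_PRIMARILY_SPOKEN, IS_PRIMARILY_PRONOUNCED)

# Distinguishability (steps 124, 132, 146): no synonyms, all pronunciations.
for_comparison = translate("ANTHONY", IS_PRIMARILY_SPOKEN, MAY_BE_PRONOUNCED)

# Widest route: synonyms and all pronunciations.
widest = translate("ANTHONY", MAY_BE_SPOKEN, MAY_BE_PRONOUNCED)
```

The three results grow from one form, to two, to three, which is the sense in which a route "includes" or "excludes" synonyms and homophones.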
Note that in practice the stores 9, 10 may contain separate "tables" for each mapping. If spellings are to be used during confirmation rather than pronunciations, then in all the routes mentioned above `Spelt` is substituted for `Pronounced`, and the algorithms all still apply. It should also be mentioned that spoken input is not essential, since the considerations concerning the offering and confirming still arise. For example, keypad input could be used. In this case a third vocabulary, of keypad input codes, is required.
