Dynamic multi-lingual software translation system
6092037
Abstract
A software system facilitates the translation of text strings into multiple languages. The software system includes a macro which substitutes for a text string and a message collection and source update utility which scans the source code, locates the macro in the source code, derives a key relating to the text string and updates a database with the text string and key. The macro is included in the source code.
Claims
What is claimed is:
Description
BACKGROUND OF THE INVENTION
______________________________________
#ifdef MULTILINGUAL
#define XLATE(a,b) (b)
typedef unsigned int MCH_MSG;
#else
#define XLATE(a,b) (a)
typedef char* MCH_MSG;
#endif
MCH_MSG table[]={
XLATE ("xxx",0x0000),
XLATE ("yyy",0x0000)
};
.....
.....
printf(fGetMessage(XLATE("zzz",0x0000)));
printf(fGetMessage(table[0]));
______________________________________
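The effect of the two build modes can be illustrated with a minimal, self-contained sketch. The catalog contents, the key values and the body of fGetMessage below are assumptions made for illustration only; the listing above does not define them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#ifdef MULTILINGUAL
/* Multilingual build: the macro keeps only the key. */
#define XLATE(text, key) (key)
typedef unsigned int MCH_MSG;

/* Hypothetical catalog for one loaded language; keys are made up. */
static const struct { unsigned int key; const char *text; } catalog[] = {
    { 0x1A2Bu, "Bonjour" },
    { 0x3C4Du, "Au revoir" },
};

const char *fGetMessage(MCH_MSG msg)
{
    for (size_t i = 0; i < sizeof catalog / sizeof catalog[0]; ++i)
        if (catalog[i].key == msg)
            return catalog[i].text;
    return "";  /* unknown key: nothing to show */
}
#else
/* Native-language build: the macro keeps only the text. */
#define XLATE(text, key) (text)
typedef const char *MCH_MSG;

const char *fGetMessage(MCH_MSG msg)
{
    assert(msg != NULL);
    return msg;
}
#endif

/* The call site is written once and compiles in both builds. */
void greet(void)
{
    printf("%s\n", fGetMessage(XLATE("Hello", 0x1A2B)));
}
```

Compiling the same file with MULTILINGUAL defined switches every call site from literal text to key lookup without touching the calling code.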
One example of a string that utilizes the macro 210 for various languages is illustrated as follows:
______________________________________
ASM MACRO/USAGE
MACRO XLATE text, key
DW key
#ifdef MULTILINGUAL
DB 0
#else
DB text
#endif
#endm XLATE
message: XLATE "xxx", 00000h
....
....
PUSH DS
PUSH OFFSET message
CALL fGetMessage
POP BX
POP ES
....
....
______________________________________
A source code module including the macro 210 and the dummy place holder 222 is complete for usage with the native language alone if multilingual capabilities are not desired. For native language usage alone, the source code module including the macro 210 and dummy place holder 222 is used simply to facilitate software program debugging. Referring to FIG. 3, a flow chart illustrates an overview of a system 300 for managing software modules in multiple languages. Insert token step 310 inserts macros 210 and dummy place holders 222 into source code of source files 252. Message collection and source update step 320 scans the source files 252, locates the macros 210 and verifies the format of the macros 210. A message translation step 330 translates a message into multiple languages. A load module step 340 selects languages that are to be included in a module. A dynamic message selection step 350 functions at execution time to determine which language is operational. Insert token step 310 and message collection and source update step 320 are described in detail with reference to FIG. 4 in conjunction with FIG. 2. In FIG. 4 a flow chart depicts steps in a process 400 performed by the message edit utility program 230. The message edit utility program 230 is used during the process of building an executable file 250 from one or more source files 252. The message edit utility program 230 includes a plurality of functions for collecting messages, reconciling a multilingual database 260 and constructing various types of files. These functions are requested by a software developer during the build process. First, in a scan and verify step 410, the message edit utility program 230 scans each of the source files 252, locates the macro 210 within the source files and verifies the format of the macro 210 for the type of source 252. 
A utility program that performs the scan and verify step 410 has a syntax such as:
MSGEDIT keylist <keylist file> from <source_list> using <database>
MSGEDIT create <devmod> from <source_list>
MSGEDIT create <devmod> from <source_list> using <database>
MSGEDIT reconcile <database> with <source_list>
where <devmod> designates a module-specific language file that is built to include multilingual support, <source_list> is a list of source files that are included in the software module, and <database> designates a language database such as a single language database of the multilingual database 260. In a derive message step 412, the message edit utility program 230 operates on the verified macro 210 at the command of a software developer to derive a message key 220. The message edit utility program 230 accesses a native language text string 264 from a native language database 262 which is similar to but separate from the multilingual database 260 in some embodiments. The message edit utility program 230 derives the message key 220 based on the native language text string in the native language database 262 that corresponds to a particular macro 210. The message key 220 represents the text string in the multilingual database 260. In one embodiment, the message edit utility program 230 uses an algorithm based on a cyclic redundancy check (CRC-16) error checking procedure to derive the message key 220. When the message key 220 has been derived, a reconcile text string step 414 reconciles the text string 264 and the message key 220 with the native language database 262, adding new messages as they are encountered and adding error reports as error conditions are detected. In update source code step 416, the source code of a source file 252 is updated with the derived message key 220 so that any changes to the text string are detected automatically during subsequent build processes.
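A key derivation of this kind can be sketched as follows. The text specifies only "an algorithm based on CRC-16"; the particular variant used here (CRC-16/ARC, reflected polynomial 0xA001, initial value 0) and the function name derive_message_key are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* CRC-16/ARC: reflected polynomial 0xA001, initial value 0, no final XOR.
   The choice of variant is an assumption; the text only says "CRC-16". */
uint16_t derive_message_key(const char *text)
{
    assert(text != 0);
    uint16_t crc = 0;
    for (const unsigned char *p = (const unsigned char *)text; *p != 0; ++p) {
        crc ^= *p;                      /* fold the next byte into the low bits */
        for (int bit = 0; bit < 8; ++bit) {
            if (crc & 1u)
                crc = (uint16_t)((crc >> 1) ^ 0xA001u);
            else
                crc = (uint16_t)(crc >> 1);
        }
    }
    return crc;
}
```

With this variant the standard check string "123456789" yields 0xBB3D. Because the key depends only on the text, editing the string changes the key, which is how a later build can detect the change.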
In detect duplicate key step 418, the message edit utility program 230 detects whether a single message key has been derived from a plurality of different text strings. Any hashing algorithm, which transforms input data into a shorter key, can produce such duplications. When detect duplicate key step 418 detects such an error condition, duplication error notification step 420 interactively signals the error condition to the software developer, permitting data to be entered or flow of the utility program 230 to be modified during execution. The software developer responds to the duplicate key error condition by modifying the text string slightly so that the key is changed. The message edit utility program 230 operates on object code and executable modules as well as source code. For object code and executable file load modules, for which the source code is not available, the message edit utility program 230 extracts native language text strings from the load module or, if a special key list file was used to build the load module, from the special key list file. In circumstances where object code or executable modules are accessed by the message edit utility program 230, updating of the source code is achieved by supplying the developer of the source code that is compiled into the object code or executable file with a copy of the message extraction and source updating utility of the message edit utility program 230. Referring to FIG. 5 in conjunction with FIG. 2, a flow chart depicts steps in a process performed by a message collection and reconciliation utility routine 500 of the message edit utility program 230 operating on ASM and C/C++ source files. In determine source type step 510, the message collection and reconciliation utility routine determines whether the source file is an ASM or C/C++ type file. In a read source step 512, the source file is read until an XLATE token is detected or an end-of-file is encountered.
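For a C/C++ source line, locating an XLATE token and checking its shape might look like the following sketch. The function name parse_xlate and its return convention are hypothetical, and a real scanner would also have to handle comments, continuation lines and the ASM form.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static const char *skip_blanks(const char *p)
{
    while (*p == ' ' || *p == '\t')
        ++p;
    return p;
}

/* Returns 1 and fills text/key when the line holds a well-formed
   XLATE("...", <number>) token; returns 0 otherwise. */
int parse_xlate(const char *line, char *text, size_t cap, unsigned int *key)
{
    assert(line != NULL && text != NULL && key != NULL && cap > 0);
    const char *p = strstr(line, "XLATE");
    if (p == NULL)
        return 0;
    p = skip_blanks(p + 5);
    if (*p++ != '(')
        return 0;
    p = skip_blanks(p);
    if (*p++ != '"')
        return 0;
    size_t n = 0;
    while (*p != '\0' && *p != '"') {
        if (*p == '\\' && p[1] != '\0') {   /* copy an escape pair verbatim */
            if (n + 2 >= cap)
                return 0;
            text[n++] = *p++;
        }
        if (n + 1 >= cap)
            return 0;
        text[n++] = *p++;
    }
    if (*p++ != '"')
        return 0;
    text[n] = '\0';
    p = skip_blanks(p);
    if (*p++ != ',')
        return 0;
    char *end;
    unsigned long value = strtoul(p, &end, 0);  /* base 0 accepts 0x.... */
    if (end == p)
        return 0;
    if (*skip_blanks(end) != ')')
        return 0;
    *key = (unsigned int)value;
    return 1;
}
```

The extracted text would then be passed to the key derivation, and the extracted key compared with the newly derived one.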
In verify format step 514, after an XLATE token is found, the message collection and reconciliation utility routine continues scanning the source file to verify that the format of the source file is correct for the particular type of source file. In derive key step 516, a new message key 220 is derived based on the text string. Check key step 518 determines whether the newly derived message key matches the existing message key that previously was stored to the source file. If the newly derived key matches the existing message key, then the source has not been updated, and no update step 520 terminates the message collection and reconciliation utility routine. Otherwise, the newly derived and previously existing keys are different and reconcile database step 522 reconciles the database. If the newly derived key already exists in the native language database 262, then verify string step 524 verifies that the message string used to create the message key matches the text string in the native language database 262 to ensure that the key derivation algorithm has not failed. Otherwise, when the newly derived message key does not exist in the native language database 262, update database step 526 writes the new message key and the text string used to derive the key to the native language database 262. In update database step 526, all additional language entries in the multilingual database 260 are set to a NULL value since no translations exist for the new message. To avoid completely losing any translations for a simple message change, the previously existing key is utilized. For example, if the source file has an existing previous message key with a nonzero value, then the translations are copied from the previously existing record to the new record in the multilingual database 260.
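The decision sequence of steps 518 through 526 can be sketched over a small in-memory table. All type names, sizes and flag values below are hypothetical stand-ins; in particular, marking copied translations with a "needs review" flag is an assumption about how carried-over translations might be tracked.

```c
#include <assert.h>
#include <string.h>

#define MAX_LANGS 4        /* hypothetical; language 0 is the native language */
#define MAX_RECS  16       /* hypothetical table capacity */
#define FLAG_EXISTS   0x80 /* text present for this language */
#define FLAG_NEEDXLAT 0x40 /* text present but should be re-examined */

struct record {
    unsigned int key;
    const char *text[MAX_LANGS];       /* NULL when no translation exists */
    unsigned char flags[MAX_LANGS];
};
struct database { struct record recs[MAX_RECS]; int count; };

enum outcome { UNCHANGED, ALREADY_PRESENT, ADDED, KEY_COLLISION, DB_FULL };

static struct record *find(struct database *db, unsigned int key)
{
    for (int i = 0; i < db->count; ++i)
        if (db->recs[i].key == key)
            return &db->recs[i];
    return NULL;
}

enum outcome reconcile(struct database *db, unsigned int old_key,
                       unsigned int new_key, const char *text)
{
    assert(db != NULL && text != NULL);
    if (new_key == old_key)
        return UNCHANGED;                         /* keys match: no update */
    struct record *hit = find(db, new_key);
    if (hit != NULL)                              /* key known: verify the string */
        return strcmp(hit->text[0], text) == 0 ? ALREADY_PRESENT : KEY_COLLISION;
    if (db->count == MAX_RECS)
        return DB_FULL;
    struct record *prev = old_key != 0 ? find(db, old_key) : NULL;
    struct record *rec = &db->recs[db->count++];
    *rec = (struct record){0};                    /* all translations start as NULL */
    rec->key = new_key;
    rec->text[0] = text;
    rec->flags[0] = FLAG_EXISTS;
    for (int lang = 1; lang < MAX_LANGS; ++lang) {
        if (prev != NULL && (prev->flags[lang] & FLAG_EXISTS)) {
            rec->text[lang] = prev->text[lang];   /* carry the old translation over */
            rec->flags[lang] = FLAG_EXISTS | FLAG_NEEDXLAT;
        }
    }
    return ADDED;
}
```

A KEY_COLLISION result corresponds to the case in which two different strings derive the same key; the caller would report it to the developer rather than overwrite the existing record.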
The translations in the multilingual database 260 may have varying degrees of correctness so each translation that is copied from the previously existing record to the new record is flagged as being possibly incorrect. Furthermore, when the newly derived message key and previously existing message key are different, update source step 528 updates the source file with the newly derived message key. The message collection and reconciliation utility routine then continues with read source step 512. Referring to FIG. 6 in conjunction with FIG. 2, a flow chart depicts steps in a process performed by a message collection and reconciliation utility routine 600 of the message edit utility program 230 operating on object code and executable load modules. Open module step 610 opens the load module and, once the module is open, locates and verifies a header of the module. Read module step 612 reads a key list, index table and text corresponding to a first language, language 0. Repeating for each message key in the key list of the module, as directed by next key step 614, derive key step 616 derives a message key based on the text string and verifies whether the derived key matches the key from the load module. Verification ensures that the load module and native language database 262 were created using the same key derivation algorithm. In reconcile entry step 618, the newly derived message key is reconciled with the native language database 262. Special key files are library modules which are typically used when source code is not available. A key file is created when a library routine is used so that the system developer who uses the library routine determines which messages constitute the library routine without having access to the source code. For a special key file to be used, the native language database 262 must already include the messages corresponding to any message keys. 
The special key files contain utility information determining which messages are to be included with a load module when the multilingual database 260 is constructed. Referring to FIG. 7 in conjunction with FIG. 2, a flow chart depicts steps in a process performed by a message collection and reconciliation utility routine 700 of the message edit utility program 230 operating with special key files. Open keylist step 710 opens a keylist in a special key file, then locates and verifies the header of the key file. Repeating for each key in the keylist as directed by next key step 712, verify key step 714 verifies that the message key exists in the native language database 262. A utility program that performs the message collection and reconciliation operations and creates special keyfiles has a syntax such as: MSGEDIT keylist <keylist> from <source_list> using <database>, where <keylist> designates a particular key list, <source_list> is a list of source files that are included in a software module and <database> designates a language database such as a single language database of the multilingual database 260. A key file has a format, as follows:
______________________________________
Header
Key
Key
The header has a format, as follows:
#define MKY_FILE_REV 1
#define MKY_FILE_SIG "Message Key"
struct sMkyHeader
{
char m_signature[18]; // "Message Key"
char m_eof; // 0x1A
char m_revision; // File revision
int m_keyListNum; // Number of keys in the file
int m_checksum; // Header checksum
};
______________________________________
Each key is a single datum element, described as follows: unsigned int key; If the native language database 262 does not contain the necessary messages, the library routine may alternatively be made available in combination with a Device Module Data File. The Device Module Data File furnishes a run-time module with multilingual messages. An external provider of software that supplies object-only modules uses the Device Module Data File to furnish a software recipient with the keys and messages that make up the object module or modules. When the Device Module Data File is used with an executable load module, the Device Module Data File is copied to the end of the load module where application program interfaces (APIs) can locate the Device Module Data File. The format of the Device Module Data File is, as follows:
Header
Key List
Language 0/Index Table
Language 0/Messages
. .
Language 15/Index Table
Language 15/Messages
File Size
The header contains a text string and other identification information, including a language pointer array (m_langPointer) and a language size array (m_langSize). The language pointer array (m_langPointer) is used to determine where the index and text for each language is stored in the Device Module Data File. The language size array (m_langSize) is used to determine the number of bytes a language occupies in the Device Module Data File, including an index table and text. The format of the header is, as follows:
______________________________________
#define MSG_DBASE_REVISION 1
#define MSG_DBASE_SIGNATURE "signature"
struct sMdVHeader
{
char m_signature[14]; // "signature"
char m_eof; // 0x1A
char m_revision; // File revision
char m_unused[10];
int m_keyListSize; // Size (in bytes) of key list area
int m_keyListNum; // Number of keys in the file
int m_checksum;
long m_langPointer[MSG_MAX_LANGUAGES];
int m_langSize[MSG_MAX_LANGUAGES];
};
______________________________________
The key list includes a key for each message in the Device Module Data File. The position of a key in the key list is an index into a Language Specific Index Table. The format of a key is, as follows: unsigned int key; A Language Specific Index Table is created for each language that has at least one defined message. The Language Specific Index Table is located using the m_langPointer array in the header. The Language Specific Index Table storage contains a list of offset values for each message in a Language Specific Message storage. This offset is the storage offset from the beginning of the Language Specific Message storage to the text string that the key represents. If a key is not translated, the Language Specific Index Table storage is constructed to contain the native language message. Accordingly, the Language Specific Index Table storage corresponds one-for-one with the entries in the Language Specific Index Table for other languages. The Language Specific Index Table for each language is positioned in storage to always end on a 16-byte boundary. The format of the Language Specific Index Table is, as follows: unsigned int textoffset; The Language Specific Message Area is a list of C/C++ text strings that follows directly behind the Language Specific Index Table that describes the Language Specific Message Area. Information in the Language Specific Message Area is translated to a target language and may include "printf" type format identifiers as part of the text string. The Language Specific Message Area for each language is positioned in storage to always end on a 16-byte boundary. The entire size of the Device Module Data File is designated by the File Size. The File Size is used to allow an application program interface (API) to locate the header when the data file is copied to the end of a load module. The format of a File Size is, as follows: unsigned long fileSize; All keys, text and translated text are stored in a Language Database.
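The relationship among the key list, the Language Specific Index Table and the message area can be sketched as a lookup over in-memory arrays. The struct language_view and the function lookup_message are hypothetical names for illustration; reading the tables from a file and locating them through m_langPointer is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* In-memory view of one loaded language (hypothetical layout). */
struct language_view {
    const unsigned int *keys;         /* key list, shared by all languages */
    size_t key_count;
    const unsigned int *text_offset;  /* index table: one offset per key */
    const char *messages;             /* start of the message area */
};

/* Returns the message for a key, or NULL when the key is not in the list. */
const char *lookup_message(const struct language_view *lang, unsigned int key)
{
    assert(lang != NULL);
    for (size_t i = 0; i < lang->key_count; ++i) {
        if (lang->keys[i] == key) {
            /* The position in the key list selects the index table entry,
               which holds the offset of the text in the message area. */
            return lang->messages + lang->text_offset[i];
        }
    }
    return NULL;
}
```

Because every language has an index table entry for every key, the same position found in the key list is valid for whichever language is currently loaded.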
For many implementations, a single Language Database is shared among many applications so that duplicate messages are not translated multiple times. Keys and text entries are automatically added to the Language Database by a module build utility. Upon completion of all key and text entries, the Language Database is sent to a language translation specialist who supplies translated text corresponding to each key and text pair using another utility. The Device Module Data File furnishes storage for storing multilingual messages. In one embodiment, the Language Database file has a simple proprietary format. In other embodiments, the Language Database File is implemented using one of a plurality of popular relational databases and database utilities made available with the relational databases. The relational database utilities are modified to use a suitable interface to the Language Database. The Language Database has a format, as follows:
Header
Record 0
Record 1
Record N
The header contains a text string and other associated identification information. An m_langMask field is a bitmap that is used to determine which language of a plurality of languages can be placed into the Language Database. The m_langMask field is primarily used during creation of separate databases that are sent out for translation. The format of the header is, as follows:
______________________________________
#define MSGDBAS_DBASE_REVISION 1
#define MSGDBAS_DBASE_SIGNATURE "database"
struct sMsgdbasFileHeader
{
char m_signature[20]; // "signature"
char m_eof; // 0x1A
char m_revision; // File revision
int m_checksum; // Checksum of the header
unsigned int m_langMask; // Legal language mask
char m_unused[6];
};
______________________________________
The Language Database optionally contains several types of records. Each record contains a portion that is common to all records in the Language Database. Records are accessed only by program code that manages the Language Database. Thus, data in the records is not accessible by call instructions. The format of the Language Database records is, as follows:
______________________________________
#define MSGDBAS_TYPE_KEY 0x55 // Key record
#define MSGDBAS_TYPE_KEY 0x77 // Text record
#define MSGDBAS_TYPE_KEY 0x99 // Invalid record
#define MSGDBAS_TYPE_KEY 0x44 // Free space record
struct sMsgdbasCommonHeader
{
unsigned char m_type; // Record type
unsigned char m_rsvd; // Reserved (not yet used) - 0 value
int m_size; // Size of record, including the header
};
______________________________________
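Because the size field of the common header includes the header itself, records can be walked sequentially from one header to the next. The sketch below assumes a 16-bit size field stored in host byte order and uses hypothetical, distinct names for the four record type values shown in the listing; neither assumption is stated in the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Type values from the listing; the distinct names are hypothetical. */
enum { REC_KEY = 0x55, REC_TEXT = 0x77, REC_INVALID = 0x99, REC_FREE = 0x44 };

/* Assumed on-disk form of the common header (16-bit size field). */
struct common_header {
    uint8_t type;
    uint8_t rsvd;
    uint16_t size;   /* size of the record, including this header */
};

/* Counts records of one type in a record area; returns -1 if the area
   is malformed (truncated header or a size that runs past the end). */
int count_records(const uint8_t *buf, size_t len, uint8_t type)
{
    assert(buf != NULL);
    int count = 0;
    size_t pos = 0;
    while (pos < len) {
        struct common_header h;
        if (len - pos < sizeof h)
            return -1;
        memcpy(&h, buf + pos, sizeof h);
        if (h.size < sizeof h || h.size > len - pos)
            return -1;
        if (h.type == type)
            ++count;
        pos += h.size;   /* the next header starts right after this record */
    }
    return count;
}
```

A free space record is skipped in the same way as any other record, which is why it must carry a valid size.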
When a record is deleted or reallocated to another location in the Language Database, a free space record is created at the location of the deleted or reallocated record. Another record may be written to the free space record location, removing or reducing size of the free space record. Message information is stored in a text record. A text record includes a data portion that directly follows the common header. The data portion is typically a C/C++ text string. A key record defines the linkage between a key and a message text string. Many members of a key record are declared as private members and are therefore accessed only by program code that manages the Language Database. The private members of the key record are, in fact, used to manage the Language Database. The private members of the key record are not accessible by call instructions. The key record is used to manage access to elements stored in the Language Database. For example, the first element of a particular message, the first element designated by m_langPointers, is reserved for text used to derive the key. Modification of this first element is not allowed. The key record includes a public key portion (sMsgdbasPubKeyRec) which stores data that is exchanged from a caller when a caller requests a record. The key record includes language flags (m_langFlags) that serve as flag bytes for each language. The format of the key record is, as follows:
__________________________________________________________________________
struct sMsgdbasKeyRecord
{
sMsgdbasCommonHeader m_header; // Type / size portion
sMsgdbasPubKeyRec m_pubKeyRec; // User modifiable record
long m_langPointers[MSG_MAX_LANGUAGES]; // Language text location
};
struct sMsgdbasPubKeyRec
{
unsigned int m_key; // Key value
unsigned char m_langFlags[MSG_MAX_LANGUAGES]; // Flag byte for each lang
};
//Language flags
#define MSGDBAS_LFLAG_EXISTS 0x80 // Language exists for key
#define MSGDBAS_LFLAG_NEEDXLAT 0x40 // Language to be examined
#define MSGDBAS_LFLAG_USER1 0x02 // Reserved for user
#define MSGDBAS_LFLAG_USER0 0x01 // Reserved for user
__________________________________________________________________________
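The per-language flag bytes lend themselves to a simple test for which languages of a message still need attention. In the sketch below, the value 16 for MSG_MAX_LANGUAGES is inferred from the "Language 0" through "Language 15" layout, and the convention that bit n of a language mask selects language n is an assumption; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MSG_MAX_LANGUAGES 16          /* assumed from Language 0..15 */
#define MSGDBAS_LFLAG_EXISTS   0x80   /* Language exists for key */
#define MSGDBAS_LFLAG_NEEDXLAT 0x40   /* Language to be examined */

/* A language needs work when no text exists or the text is marked
   for examination. */
int needs_translation(unsigned char flag)
{
    return !(flag & MSGDBAS_LFLAG_EXISTS) || (flag & MSGDBAS_LFLAG_NEEDXLAT) != 0;
}

/* Returns a bitmask of the languages selected by lang_mask that still
   need translation. Language 0 holds the native text and is skipped. */
unsigned int pending_languages(const unsigned char *flags, unsigned int lang_mask)
{
    assert(flags != NULL);
    unsigned int pending = 0;
    for (int lang = 1; lang < MSG_MAX_LANGUAGES; ++lang) {
        if ((lang_mask & (1u << lang)) && needs_translation(flags[lang]))
            pending |= 1u << lang;
    }
    return pending;
}
```

A translation tool could use a nonzero result to decide whether a message is shown when only untranslated messages are requested.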
Referring again to FIG. 3, the message translation step 330 is operational after all messages have been collected. Message translation step 330 translates a message into multiple languages, as desired. The message translation step 330 is realized using a message translation utility 332. The message translation utility 332 includes an interactive option which determines whether all messages are displayed or only messages that are to be translated are displayed, permitting data to be entered or flow of the message translation utility 332 to be modified during execution. This option allows a software developer to efficiently perform translations without examining every message. The multilingual database 260 is optionally created to contain only certain selected languages or can be restricted so that messages in the native language database 262 cannot be edited, modified, added or deleted. Initially, a separate database may be created for each language. Having multiple single-language databases is advantageous for avoiding the error of supplying the wrong language to a module. Having multiple single-language databases is particularly advantageous for source programs that are developed by an external source and implement only a few languages, such as the native language alone or the native language and one other language. The multiple single-language databases are later merged into the multilingual database 260 and then discarded. A further advantage of having multiple single-language databases is that a database may be sent to several translators to allow multiple translations to be performed in parallel, thereby improving a product's time-to-market. Various utility programs are used to configure the various language databases of the multilingual database 260.
For example, a new database <new_database> is initialized with a language mask <language_mask> to define the language of the database using a database initialization utility having a syntax, as follows: MSGEDIT initialize <new_database> using <language_mask>. A "rebuild" utility is similar to the initialize utility but is typically requested to include a source file that already has defined tokens into a database. The rebuild utility is useful for flagging errors that occur when messages are unintentionally deleted and for recreating a damaged database without requiring editing of the source file and initializing the keys to a starting point. The default behavior of utilities including create, reconcile, keylist and the like is to generate an error notification when a message that has a defined key cannot be located in the database. The rebuild utility overrides the default behavior. The rebuild utility has a syntax, as follows: MSGEDIT rebuild <database> with <source_list>. An existing database <database> is compressed into a new database <new_database> using a compress utility having a syntax, as follows: MSGEDIT compress <database> into <new_database>. Message entries of a single language database <tran_database> are written from a multiple language database <database> with a language mask <language_mask> designating the single language using a database export utility having a syntax, as follows: MSGEDIT export <database> into <tran_database> using <language_mask>. Message entries of a single language database <tran_database> are added to a multiple language database <database> using a database import utility having a syntax, as follows: MSGEDIT import <tran_database> from <database>.
The load module step 340 selects languages that are to be included in a module, extracts translated text strings from the multilingual database 260 and builds a simple reference file 342 that is attached to the module. Messages that are included with each module are determined by the source files that are combined to form the module. The load module step 340 is realized using a load module utility 344 which extracts information from the source files and the special key files. The load module utility 344 is automatically invoked so that intervention of the software developer is not necessary. The dynamic message selection step 350 functions at execution time to determine which language is operational. To determine which language a software developer intends to implement, typically either an option menu is presented to a software developer awaiting a response or a default selection is made in accordance with the operating system under which the software translation management system 200 is implemented. The module loads the text for the selected language from the multilingual database 260. When the executable software program of the module executes the code that was inserted by the macro 210, the software program simply passes the message key 220 corresponding to the macro 210 to a locate translated message utility 352 which locates the translated message in the loaded language area of the multilingual database 260. The located message is then available for passing to formatting routines and print routines as needed. While the invention has been described with reference to various embodiments, it will be understood that these embodiments are illustrative and that the scope of the invention is not limited to them. Many variations, modifications, additions and improvements of the embodiments described are possible.
