Health care management (e.g., record management, ICDA billing)
Mobile communication and computing system and method
6401085
Abstract
A system is disclosed that facilitates web-based information retrieval and display. A wireless phone or similar hand-held wireless device with Internet Protocol capability is combined with other peripherals to provide a portable portal into the Internet. The wireless device prompts a user to input information of interest to the user. This information is transmitted as a query to a service routine (running on a Web server). The service routine then queries the Web to find price, shipping and availability information from various Web suppliers. This information is then available for use by application programs such as word processors, e-mail, accounting, graphical editors and other user tools. The system provides an innovative collaborative interface to many popular user applications that are useful in a mobile environment.
Claims
What is claimed is:
1. A method for creating an information summary on a mobile computing environment, comprising the steps of:
a) creating a query for an information summary based in part on a user input;
b) querying a network of information having a content database utilizing a wireless communication device;
c) receiving contents pertaining to the user from the network of information on the mobile computing environment; and
d) parsing and summarizing the contents utilizing an application tool, wherein the application tool is a wordprocessor.
2. A method for creating an information summary on a mobile computing environment, comprising the steps of:
a) creating a query for an information summary based in part on a user input;
b) querying a network of information having a content database utilizing a wireless communication device;
c) receiving contents pertaining to the user from the network of information on the mobile computing environment; and
d) parsing and summarizing the contents utilizing an application tool, wherein the application tool provides healthcare services.
3. A computer program embodied on a computer-readable medium that creates an information summary, comprising:
a) a code segment that creates a query for an information summary based in part on a user input;
b) a code segment that queries a network of information having a content database utilizing a wireless communication device;
c) a code segment that receives contents pertaining to the user from the network of information on the mobile computing environment; and
d) a code segment that parses and summarizes the contents utilizing an application tool, wherein the application tool is a wordprocessor.
4. A computer program embodied on a computer-readable medium that creates an information summary, comprising:
a) a code segment that creates a query for an information summary based in part on a user input;
b) a code segment that queries a network of information having a content database utilizing a wireless communication device;
c) a code segment that receives contents pertaining to the user from the network of information on the mobile computing environment; and
d) a code segment that parses and summarizes the contents utilizing an application tool, wherein the application tool provides healthcare services.
5. An apparatus that creates an information summary, comprising:
a) a processor;
b) a memory that stores information under the control of a processor;
c) logic that creates a query for an information summary based in part on a user input;
d) logic that queries a network of information having a content database utilizing a wireless communication device;
e) logic that receives contents pertaining to the user from the network of information on the mobile computing environment; and
f) logic that parses and summarizes the contents utilizing an application tool, wherein the application tool is a wordprocessor.
6. An apparatus that creates an information summary, comprising:
a) a processor;
b) a memory that stores information under the control of a processor;
c) logic that creates a query for an information summary based in part on a user input;
d) logic that queries a network of information having a content database utilizing a wireless communication device;
e) logic that receives contents pertaining to the user from the network of information on the mobile computing environment; and
f) logic that parses and summarizes the contents utilizing an application tool, wherein the application tool provides healthcare services.
Description
FIELD OF THE INVENTION
The present invention relates to agent based systems and more particularly to a mobile computing environment that accesses the Internet to obtain product information for a user and provides tools for collaborative computing.
BACKGROUND OF THE INVENTION
Computer assistance in all environments is increasingly necessary as computer technology becomes increasingly embedded in society. Mobile computing technology addresses this issue by allowing the individual to access computer related information at all times and in all environments.
One of the first major advances in mobile computer technology was the Personal Digital Assistant (PDA). A PDA allowed a user to access computer related information, yet fitted in the palm of the hand. Utilizing a PDA the user could organize personal affairs, write notes, calculate equations, and record contact numbers in an address book. In addition, PDAs were usually capable of interfacing with a desktop computer, typically through a wire connection. The connection allowed the PDA to download information from, and upload information to, the desktop computer. Later developments gave the PDA wireless capabilities. The wireless capabilities allowed the PDA to interact with other computers that were not physically connected to the PDA.
Wireless PDAs could communicate with computers that were connected to the World Wide Web, and soon led to PDAs capable of Web browsing. One of the first companies to develop Web browsing capabilities for PDAs was Intercom. Intercom's Falcon Mobile Server allowed PDAs with Web functions to directly connect to a host computer. Just by installing the software onto the host server, PDA terminals were able to access information through the World Wide Web. Currently, more integration in mobile computing is desired. Nokia, an Irving, Tex. company, has partially addressed the integration issue by developing the Nokia 9000 wireless voice phone. The Nokia 9000 includes a small keyboard, a specialized Web browser from microbrowser vendor Unwired Planet, Inc., and a small VGA monitor. Nokia worked with Ericsson Inc., Motorola Inc. and Unwired Planet to establish the Wireless Application Protocol (WAP), a standardized browser technology and server format. WAP gave manufacturers a standard way to put data capability into wireless phones, and allowed carriers to do more over-the-air management. For example, if a carrier wanted a field trial of a new data service, the carrier could implement the service on a server, deliver it to a phone through the microbrowser and adjust the service if it found the service unsatisfactory.
Prior Art FIG. 1A is a diagram of prior art mobile computing solutions based on web portal networks. In the Prior Art, the user 10 must deal separately with each participant of the network. In the Prior Art mobile computing solution, the user 10 utilizes an Internet service provider (ISP) 12 to gain access to a web portal 14. The web portal 14 accesses third party services 16 which provide information directly to the user 10. However, in addition to dealing with the Internet Service Provider 12, the user 10 must purchase the wireless device from the device manufacturers or retailers 18. In most cases the user 10 would also have to purchase the browser from the browser provider 20. Generally, the user would have to pay the wireless communication cost, leading to the user needing to deal with the phone company 22. And finally, any web purchases would lead to the user 10 needing to deal with the credit card company 24. It is obvious that a coordinated and packaged service would be an ideal mobile computing solution. Furthermore, a coordinated and packaged service which made use of agents would be highly desired.
Agent based technology has become increasingly important for use with applications designed to interact with a user for performing various computer based tasks in foreground and background modes. Agent software comprises computer programs that are set on behalf of users to perform routine, tedious and time-consuming tasks. To be useful to an individual user, an agent must be personalized to the individual user's goals, habits and preferences. Thus, there exists a substantial requirement for the agent to efficiently and effectively acquire user-specific knowledge from the user and utilize it to perform tasks on behalf of the user.
The concept of agency, or the use of agents, is well established. An agent is a person authorized by another person, typically referred to as a principal, to act on behalf of the principal. In this manner the principal empowers the agent to perform any of the tasks that the principal is unwilling or unable to perform. For example, an insurance agent may handle all of the insurance requirements for a principal, or a talent agent may act on behalf of a performer to arrange concert dates.
With the advent of the computer, a new domain for employing agents has arrived. Significant advances in the realm of expert systems enable computer programs to act on behalf of computer users to perform routine, tedious and other time-consuming tasks. These computer programs are referred to as "software agents."
Moreover, there has been a recent proliferation of computer and communication networks. These networks permit a user to access vast amounts of information and services without, essentially, any geographical boundaries. Thus, a software agent has a rich environment to perform a large number of tasks on behalf of a user. For example, it is now possible for an agent to make an airline reservation, purchase the ticket, and have the ticket delivered directly to a user. Similarly, an agent could scan the Internet and obtain information ranging from the latest sports or news to a particular graduate thesis in applied physics. Current solutions fail to apply agent technology to provide targeted acquisition of information for a user's upcoming events.
SUMMARY OF THE INVENTION
A system is disclosed that facilitates web-based information retrieval and display. A wireless phone or similar hand-held wireless device with Internet Protocol capability is combined with other peripherals to provide a portable portal into the Internet. The wireless device prompts a user to input information of interest to the user. This information is transmitted as a query to a service routine (running on a Web server). The service routine then queries the Web to find price, shipping and availability information from various Web suppliers. This information is formatted and displayed on the hand-held device's screen. The user may then use the hand-held device to place an order interactively. The system provides an innovative collaborative interface to many popular user applications that are useful in a mobile environment.
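By way of illustration only, the query, lookup and display steps summarized above might be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the names Offer, SUPPLIERS, service_routine and format_for_display are hypothetical, the supplier data is an in-memory stand-in for real Web suppliers, and sorting by total cost is an assumed display policy.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    supplier: str
    price: float
    shipping: float
    available: bool


# Hypothetical in-memory stand-in for the Web suppliers a service routine would query.
SUPPLIERS = {
    "SupplierA": {"widget": Offer("SupplierA", 19.99, 4.00, True)},
    "SupplierB": {"widget": Offer("SupplierB", 18.50, 6.50, True)},
    "SupplierC": {"widget": Offer("SupplierC", 17.00, 3.00, False)},
}


def service_routine(query):
    """Collect price, shipping and availability information for the queried item."""
    item = query.strip().lower()
    return [catalog[item] for catalog in SUPPLIERS.values() if item in catalog]


def format_for_display(offers):
    """Format available offers as short lines for a small screen, lowest total first."""
    in_stock = sorted(
        (o for o in offers if o.available),
        key=lambda o: o.price + o.shipping,
    )
    return [f"{o.supplier}: ${o.price + o.shipping:.2f} total" for o in in_stock]


# The user's input is sent as a query; the formatted result is what the device shows.
lines = format_for_display(service_routine("Widget"))
```

In this sketch the hand-held device only sends the query text and renders the returned lines, while the lookup itself runs in the service routine, mirroring the division of work described above.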
DESCRIPTION OF THE DRAWINGS
The foregoing and other objects, aspects and advantages are better understood from the following detailed description of a preferred embodiment of the invention with reference to the drawings, in which:
Prior Art FIG. 1A is a diagram of Prior Art mobile computing solutions based on web portal networks;
FIG. 1 is a block diagram of a representative hardware environment in accordance with a preferred embodiment;
FIG. 2 is a flowchart of the system in accordance with a preferred embodiment;
FIG. 3 is a flowchart of a parsing unit of the system in accordance with a preferred embodiment;
FIG. 4 is a flowchart for pattern matching in accordance with a preferred embodiment;
FIG. 5 is a flowchart for a search unit in accordance with a preferred embodiment;
FIG. 6 is a flowchart for overall system processing in accordance with a preferred embodiment;
FIG. 7 is a flowchart of topic processing in accordance with a preferred embodiment;
FIG. 8 is a flowchart of meeting record processing in accordance with a preferred embodiment;
FIG. 9 is a block diagram of process flow of a pocket bargain finder in accordance with a preferred embodiment;
FIGS. 10A and 10B are a block diagram and flowchart depicting the logic associated with creating a customized content web page in accordance with a preferred embodiment;
FIG. 11 is a flowchart depicting the detailed logic associated with retrieving user-centric content in accordance with a preferred embodiment;
FIG. 12 is a data model of a user profile in accordance with a preferred embodiment;
FIG. 13 is a persona data model in accordance with a preferred embodiment;
FIG. 14 is an intention data model in accordance with a preferred embodiment;
FIG. 15 is a flowchart of the processing for generating an agent's current statistics in accordance with a preferred embodiment;
FIG. 16 is a flowchart of the logic that determines the personalized product rating for a user in accordance with a preferred embodiment;
FIG. 17 is a flowchart of the logic for accessing the centrally stored profile in accordance with a preferred embodiment;
FIG. 18 is a flowchart of the interaction logic between a user and the integrator for a particular supplier in accordance with a preferred embodiment;
FIG. 19 is a flowchart of the agent processing for generating a verbal summary in accordance with a preferred embodiment;
FIG. 20 illustrates a display login in accordance with a preferred embodiment;
FIG. 21 illustrates a managing daily logistics display in accordance with a preferred embodiment;
FIG. 22 illustrates a user main display in accordance with a preferred embodiment;
FIG. 23 illustrates an agent interaction display in accordance with a preferred embodiment;
FIG. 24 is a block diagram of an active knowledge management system in accordance with a preferred embodiment;
FIG. 25 is a block diagram of a back end server in accordance with a preferred embodiment;
FIG. 26 is a flow chart illustrating how the hardware and software of one embodiment of the present invention operates;
FIG. 27A illustrates a display of the browser mode in accordance with a preferred embodiment; and
FIG. 27B is an illustration of a Mobile Portal platform in accordance with a preferred embodiment.
DETAILED DESCRIPTION
A preferred embodiment of a system in accordance with the present invention is preferably practiced in the context of a personal computer such as an IBM compatible personal computer, Apple Macintosh computer or UNIX based workstation. A representative hardware environment is depicted in FIG. 1, which illustrates a typical hardware configuration of a workstation in accordance with a preferred embodiment having a central processing unit 110, such as a microprocessor, and a number of other units interconnected via a system bus 112. The workstation shown in FIG. 1 includes a Random Access Memory (RAM) 114, Read Only Memory (ROM) 116, an I/O adapter 118 for connecting peripheral devices such as disk storage units 120 to the bus 112, a user interface adapter 122 for connecting a keyboard 124, a mouse 126, a speaker 128, a microphone 132, and/or other user interface devices such as a touch screen (not shown) to the bus 112, communication adapter 134 for connecting the workstation to a communication network (e.g., a data processing network) and a display adapter 136 for connecting the bus 112 to a display device 138. The workstation typically has resident thereon an operating system such as the Microsoft Windows NT or Windows/95 Operating System (OS), the IBM OS/2 operating system, the MAC OS, or UNIX operating system. Those skilled in the art will appreciate that the present invention may also be implemented on platforms and operating systems other than those mentioned.
A preferred embodiment is written using JAVA, C, and the C++ language and utilizes object oriented programming methodology. Object oriented programming (OOP) has become increasingly used to develop complex applications. As OOP moves toward the mainstream of software design and development, various software solutions require adaptation to make use of the benefits of OOP. A need exists for these principles of OOP to be applied to a messaging interface of an electronic messaging system such that a set of OOP classes and objects for the messaging interface can be provided.
OOP is a process of developing computer software using objects, including the steps of analyzing the problem, designing the system, and constructing the program. An object is a software package that contains both data and a collection of related structures and procedures. Since it contains both data and a collection of structures and procedures, it can be visualized as a self-sufficient component that does not require other additional structures, procedures or data to perform its specific task. OOP, therefore, views a computer program as a collection of largely autonomous components, called objects, each of which is responsible for a specific task. This concept of packaging data, structures, and procedures together in one component or module is called encapsulation.
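Encapsulation as described above may be illustrated with a short sketch. The Angle class below is a hypothetical example chosen for illustration and does not appear in the disclosure; it packages a value together with the procedures that operate on it, so that other code changes the value only through those procedures.

```python
class Angle:
    """Packages a value (in degrees) with the procedures that operate on it."""

    def __init__(self, degrees):
        # Internal state, always kept normalized to the range [0, 360).
        self._degrees = degrees % 360

    def rotate(self, delta):
        """Change the angle only through this method, so it stays normalized."""
        self._degrees = (self._degrees + delta) % 360

    @property
    def degrees(self):
        """Read-only view of the encapsulated value."""
        return self._degrees


heading = Angle(350)
heading.rotate(20)  # wraps past 360 without the caller handling it
```

Because callers never touch the stored value directly, the normalization rule lives in one place and cannot be bypassed by accident.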
In general, OOP components are reusable software modules which present an interface that conforms to an object model and which are accessed at run-time through a component integration architecture. A component integration architecture is a set of architecture mechanisms which allow software modules in different process spaces to utilize each other's capabilities or functions. This is generally done by assuming a common component object model on which to build the architecture.
It is worthwhile to differentiate between an object and a class of objects at this point. An object is a single instance of the class of objects, which is often just called a class. A class of objects can be viewed as a blueprint, from which many objects can be formed.
OOP allows the programmer to create an object that is a part of another object. For example, the object representing a piston engine is said to have a composition-relationship with the object representing a piston. In reality, a piston engine comprises a piston, valves and many other components; the fact that a piston is an element of a piston engine can be logically and semantically represented in OOP by two objects.
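The distinction between a class and its objects, and the composition-relationship between a piston engine and its pistons, might be sketched as follows. The class and attribute names are hypothetical and serve only to illustrate the two ideas.

```python
class Piston:
    """One component of an engine."""

    def __init__(self, material="metal"):
        self.material = material


class PistonEngine:
    """A class is the blueprint; each PistonEngine(...) call forms one object."""

    def __init__(self, piston_count):
        # Composition: the engine object is made up of piston objects.
        self.pistons = [Piston() for _ in range(piston_count)]


# Two distinct objects formed from the same class.
four_cylinder = PistonEngine(4)
six_cylinder = PistonEngine(6)
```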
OOP also allows creation of an object that "depends from" another object. If there are two objects, one representing a piston engine and the other representing a piston engine wherein the piston is made of ceramic, then the relationship between the two objects is not that of composition. A ceramic piston engine does not make up a piston engine. Rather it is merely one kind of piston engine that has one more limitation than the piston engine; its piston is made of ceramic. In this case, the object representing the ceramic piston engine is called a derived object, and it inherits all of the aspects of the object representing the piston engine and adds further limitation or detail to it. The object representing the ceramic piston engine "depends from" the object representing the piston engine. The relationship between these objects is called inheritance.
When the object or class representing the ceramic piston engine inherits all of the aspects of the objects representing the piston engine, it inherits the thermal characteristics of a standard piston defined in the piston engine class. However, the ceramic piston engine object overrides these with ceramic-specific thermal characteristics, which are typically different from those associated with a metal piston. It skips over the original and uses new functions related to ceramic pistons. Different kinds of piston engines have different characteristics, but may have the same underlying functions associated with them (e.g., how many pistons in the engine, ignition sequences, lubrication, etc.). To access each of these functions in any piston engine object, a programmer would call the same functions with the same names, but each type of piston engine may have different/overriding implementations of functions behind the same name. This ability to hide different implementations of a function behind the same name is called polymorphism and it greatly simplifies communication among objects.
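The inheritance and polymorphism described for the piston engine example might be sketched as follows. The method names and the temperature figures are placeholders assumed for illustration; they are not taken from the disclosure.

```python
class PistonEngine:
    def __init__(self, piston_count):
        self.piston_count = piston_count

    def count_pistons(self):
        return self.piston_count

    def max_piston_temperature_c(self):
        # Placeholder figure for a metal piston (assumed, for illustration only).
        return 300


class CeramicPistonEngine(PistonEngine):
    """Derived class: inherits every aspect and overrides the thermal characteristic."""

    def max_piston_temperature_c(self):
        # Placeholder figure for a ceramic piston (assumed, for illustration only).
        return 1000


# Polymorphism: the same call reaches a different implementation for each engine type.
engines = [PistonEngine(4), CeramicPistonEngine(4)]
limits = [engine.max_piston_temperature_c() for engine in engines]
```

The calling code never tests which kind of engine it holds; count_pistons is inherited unchanged, while the overridden thermal function is selected by the object itself.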
With the concepts of composition-relationship, encapsulation, inheritance and polymorphism, an object can represent just about anything in the real world. In fact, our logical perception of the reality is the only limit on determining the kinds of things that can become objects in object-oriented software. Some typical categories are as follows:
Objects can represent physical objects, such as automobiles in a traffic-flow simulation, electrical components in a circuit-design program, countries in an economics model, or aircraft in an air-traffic-control system.
Objects can represent elements of the computer-user environment such as windows, menus or graphics objects.
An object can represent an inventory, such as a personnel file or a table of the latitudes and longitudes of cities.
An object can represent user-defined data types such as time, angles, and complex numbers, or points on the plane.
With this enormous capability of an object to represent just about any logically separable matters, OOP allows the software developer to design and implement a computer program that is a model of some aspects of reality, whether that reality is a physical entity, a process, a system, or a composition of matter. Since the object can represent anything, the software developer can create an object which can be used as a component in a larger software project in the future.
If 90% of a new OOP software program consists of proven, existing components made from preexisting reusable objects, then only the remaining 10% of the new software project has to be written and tested from scratch. Since 90% already came from an inventory of extensively tested reusable objects, the potential domain from which an error could originate is 10% of the program. As a result, OOP enables software developers to build objects out of other, previously built, objects.
This process closely resembles complex machinery being built out of assemblies and sub-assemblies. OOP technology, therefore, makes software engineering more like hardware engineering in that software is built from existing components, which are available to the developer as objects. All this adds up to an improved quality of the software as well as an increased speed of its development.
Programming languages are beginning to fully support the OOP principles, such as encapsulation, inheritance, polymorphism, and composition-relationship. With the advent of the C++ language, many commercial software developers have embraced OOP. C++ is an OOP language that offers fast, machine-executable code. Furthermore, C++ is suitable for both commercial-application and systems-programming projects. For now, C++ appears to be the most popular choice among many OOP programmers, but there is a host of other OOP languages, such as Smalltalk, Common Lisp Object System (CLOS), and Eiffel. Additionally, OOP capabilities are being added to more traditional popular computer programming languages such as Pascal.
The benefits of object classes can be summarized, as follows:
Objects and their corresponding classes break down complex programming problems into many smaller, simpler problems.
Encapsulation enforces data abstraction through the organization of data into small, independent objects that can communicate with each other. Encapsulation protects the data in an object from accidental damage, but allows other objects to interact with that data by calling the object's member functions and structures.
Subclassing and inheritance make it possible to extend and modify objects through deriving new kinds of objects from the standard classes available in the system. Thus, new capabilities are created without having to start from scratch.
Polymorphism and multiple inheritance make it possible for different programmers to mix and match characteristics of many different classes and create specialized objects that can still work with related objects in predictable ways.
Class hierarchies and containment hierarchies provide a flexible mechanism for modeling real-world objects and the relationships among them.
Libraries of reusable classes are useful in many situations, but they also have some limitations. For example:
Complexity. In a complex system, the class hierarchies for related classes can become extremely confusing, with many dozens or even hundreds of classes.
Flow of control. A program written with the aid of class libraries is still responsible for the flow of control (i.e., it must control the interactions among all the objects created from a particular library). The programmer has to decide which functions to call at what times for which kinds of objects.
Duplication of effort. Although class libraries allow programmers to use and reuse many small pieces of code, each programmer puts those pieces together in a different way. Two different programmers can use the same set of class libraries to write two programs that do exactly the same thing but whose internal structure (i.e., design) may be quite different, depending on hundreds of small decisions each programmer makes along the way. Inevitably, similar pieces of code end up doing similar things in slightly different ways and do not work as well together as they should.
Class libraries are very flexible. As programs grow more complex, more programmers are forced to reinvent basic solutions to basic problems over and over again. A relatively new extension of the class library concept is to have a framework of class libraries. This framework is more complex and consists of significant collections of collaborating classes that capture both the small scale patterns and major mechanisms that implement the common requirements and design in a specific application domain. Frameworks were first developed to free application programmers from the chores involved in displaying menus, windows, dialog boxes, and other standard user interface elements for personal computers.
Frameworks also represent a change in the way programmers think about the interaction between the code they write and code written by others. In the early days of procedural programming, the programmer called libraries provided by the operating system to perform certain tasks, but basically the program executed down the page from start to finish, and the programmer was solely responsible for the flow of control. This was appropriate for printing out paychecks, calculating a mathematical table, or solving other problems with a program that executed in just one way.
The development of graphical user interfaces began to turn this procedural programming arrangement inside out. These interfaces allow the user, rather than program logic, to drive the program and decide when certain actions should be performed. Today, most personal computer software accomplishes this by means of an event loop which monitors the mouse, keyboard, and other sources of external events and calls the appropriate parts of the programmer's code according to actions that the user performs. The programmer no longer determines the order in which events occur. Instead, a program is divided into separate pieces that are called at unpredictable times and in an unpredictable order. By relinquishing control in this way to users, the developer creates a program that is much easier to use. Nevertheless, individual pieces of the program written by the developer still call libraries provided by the operating system to accomplish certain tasks, and the programmer must still determine the flow of control within each piece after being called by the event loop. Application code still "sits on top of" the system.
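An event loop of the kind described above might be sketched as follows. This is a simplified illustration, assuming events arrive as dictionaries in a queue; the handler names and the event format are hypothetical, and a real graphical environment would obtain events from the operating system.

```python
from collections import deque


def on_click(event, log):
    log.append(f"clicked at {event['pos']}")


def on_key(event, log):
    log.append(f"typed {event['key']}")


# The programmer registers pieces of code; the loop decides when each one runs.
HANDLERS = {"click": on_click, "key": on_key}


def run_event_loop(events, log):
    """Dispatch each external event to the matching piece of programmer code."""
    queue = deque(events)
    while queue:
        event = queue.popleft()
        handler = HANDLERS.get(event["type"])
        if handler is not None:  # events with no registered handler are ignored
            handler(event, log)


log = []
run_event_loop(
    [
        {"type": "key", "key": "a"},
        {"type": "click", "pos": (10, 20)},
        {"type": "scroll"},
    ],
    log,
)
```

The order of the handler calls is fixed by the order of the incoming events rather than by the program text, which is the inversion of control the paragraph above describes.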
Even event loop programs require programmers to write a lot of code that should not need to be written separately for every application. The concept of an application framework carries the event loop concept further. Instead of dealing with all the nuts and bolts of constructing basic menus, windows, and dialog boxes and then making these things all work together, programmers using application frameworks start with working application code and basic user interface elements in place. Subsequently, they build from there by replacing some of the generic capabilities of the framework with the specific capabilities of the intended application.
Application frameworks reduce the total amount of code that a programmer has to write from scratch. However, because the framework is really a generic application that displays windows, supports copy and paste, and so on, the programmer can also relinquish control to a greater degree than event loop programs permit. The framework code takes care of almost all event handling and flow of control, and the programmer's code is called only when the framework needs it (e.g., to create or manipulate a proprietary data structure).
A programmer writing a framework program not only relinquishes control to the user (as is also true for event loop programs), but also relinquishes the detailed flow of control within the program to the framework. This approach allows the creation of more complex systems that work together in interesting ways, as opposed to isolated programs, having custom code, being created over and over again for similar problems.
Thus, as is explained above, a framework basically is a collection of cooperating classes that make up a reusable design solution for a given problem domain. It typically includes objects that provide default behavior (e.g., for menus and windows), and programmers use it by inheriting some of that default behavior and overriding other behavior so that the framework calls application code at the appropriate times.
There are three main differences between frameworks and class libraries:
Behavior versus protocol. Class libraries are essentially collections of behaviors that you can call when you want those individual behaviors in your program. A framework, on the other hand, provides not only behavior but also the protocol or set of rules that govern the ways in which behaviors can be combined, including rules for what a programmer is supposed to provide versus what the framework provides.
Call versus override. With a class library, the code the programmer writes instantiates objects and calls their member functions. It's possible to instantiate and call objects in the same way with a framework (i.e., to treat the framework as a class library), but to take full advantage of a framework's reusable design, a programmer typically writes code that overrides and is called by the framework. The framework manages the flow of control among its objects. Writing a program involves dividing responsibilities among the various pieces of software that are called by the framework rather than specifying how the different pieces should work together.
Implementation versus design. With class libraries, programmers reuse only implementations, whereas with frameworks, they reuse design. A framework embodies the way a family of related programs or pieces of software work. It represents a generic design solution that can be adapted to a variety of specific problems in a given domain. For example, a single framework can embody the way a user interface works, even though two different user interfaces created with the same framework might solve quite different interface problems.
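The call versus override distinction drawn above might be sketched as follows. All class and function names are hypothetical; the Application class stands in for a framework only in the limited sense that it fixes the order of steps and calls back into code supplied by the application programmer.

```python
# Class-library style: application code owns the flow of control and calls in.
class TextFormatter:
    def format(self, text):
        return text.strip().title()


def library_style(text):
    formatter = TextFormatter()    # the application instantiates the object
    return formatter.format(text)  # and decides when to call it


# Framework style: the framework owns the flow of control and calls back.
class Application:
    def run(self, text):
        # Protocol fixed by the framework: open, handle, close.
        steps = [self.on_open()]
        steps.append(self.handle(text))
        steps.append(self.on_close())
        return steps

    def on_open(self):
        return "open"   # default behavior, inherited unchanged

    def on_close(self):
        return "close"  # default behavior, inherited unchanged

    def handle(self, text):
        raise NotImplementedError("application code must override handle()")


class ShoutingApp(Application):
    def handle(self, text):  # the only piece the application programmer writes
        return text.upper()


library_result = library_style("  hello world ")
framework_result = ShoutingApp().run("hello world")
```

In the framework style the application never calls handle itself; it is invoked by run at the point the framework's protocol dictates, while the inherited defaults supply the rest of the behavior.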
Thus, through the development of frameworks for solutions to various problems and programming tasks, significant reductions in the design and development effort for software can be achieved.
A preferred embodiment of the invention utilizes HyperText Markup Language (HTML) to implement documents on the Internet together with a general-purpose secure communication protocol for a transport medium between the client and the Newco. HTTP or other protocols could be readily substituted for HTML without undue experimentation. Information on these products is available in T. Berners-Lee, D. Connolly, "RFC 1866: Hypertext Markup Language--2.0" (November 1995); and R. Fielding, H. Frystyk, T. Berners-Lee, J. Gettys and J. C. Mogul, "Hypertext Transfer Protocol--HTTP/1.1: HTTP Working Group Internet Draft" (May 2, 1996). HTML is a simple data format used to create hypertext documents that are portable from one platform to another. HTML documents are SGML documents with generic semantics that are appropriate for representing information from a wide range of domains. HTML has been in use by the World-Wide Web global information initiative since 1990. HTML is an application of ISO Standard 8879:1986 Information Processing Text and Office Systems; Standard Generalized Markup Language (SGML).
To date, Web development tools have been limited in their ability to create dynamic Web applications which span from client to server and interoperate with existing computing resources. Until recently, HTML has been the dominant technology used in development of Web-based solutions. However, HTML has proven to be inadequate in the following areas:
Poor performance;
Restricted user interface capabilities;
Can only produce static Web pages;
Lack of interoperability with existing applications and data; and
Inability to scale.
Sun Microsystems' Java language solves many of the client-side problems by:
Improving performance on the client side;
Enabling the creation of dynamic, real-time Web applications; and
Providing the ability to create a wide variety of user interface components.
With Java, developers can create robust User Interface (UI) components. Custom "widgets" (e.g., real-time stock tickers, animated icons, etc.) can be created, and client-side performance is improved. Unlike HTML, Java supports the notion of client-side validation, offloading appropriate processing onto the client for improved performance. Using the above-mentioned custom UI components, dynamic, real-time Web pages can also be created.
Sun's Java language has emerged as an industry-recognized language for "programming the Internet." Sun defines Java as: "a simple, object-oriented, distributed, interpreted, robust, secure, architecture-neutral, portable, high-performance, multithreaded, dynamic, buzzword-compliant, general-purpose programming language. Java supports programming for the Internet in the form of platform-independent Java applets." Java applets are small, specialized applications that comply with Sun's Java Application Programming Interface (API) allowing developers to add "interactive content" to Web documents (e.g., simple animations, page adornments, basic games, etc.). Applets execute within a Java-compatible browser (e.g., Netscape Navigator) by copying code from the server to client. From a language standpoint, Java's core feature set is based on C++. Sun's Java literature states that Java is basically "C++, with extensions from Objective C for more dynamic method resolution".
Another technology that provides similar function to Java is Microsoft's ActiveX Technologies, which give developers and Web designers the wherewithal to build dynamic content for the Internet and personal computers. ActiveX includes tools for developing animation, 3-D virtual reality, video and other multimedia content. The tools use Internet standards, work on multiple platforms, and are being supported by over 100 companies. The group's building blocks are called ActiveX Controls, small, fast components that enable developers to embed parts of software in hypertext markup language (HTML) pages. ActiveX Controls work with a variety of programming languages including Microsoft Visual C++, Borland Delphi, Microsoft Visual Basic programming system and, in the future, Microsoft's development tool for Java, code named "Jakarta." ActiveX Technologies also includes ActiveX Server Framework, allowing developers to create server applications. One of ordinary skill in the art readily recognizes that ActiveX could be substituted for Java without undue experimentation to practice the invention.
In accordance with a preferred embodiment, BackgroundFinder (BF) is implemented as an agent responsible for preparing an individual for an upcoming meeting by helping him/her retrieve relevant information about the meeting from various sources. BF receives input text in character form indicative of the target meeting. The input text is generated in accordance with a preferred embodiment by a calendar program that includes the time of the meeting. As the time of the meeting approaches, the calendar program is queried to obtain the text of the target event and that information is utilized as input to the agent. Then, the agent parses the input meeting text to extract its various components such as title, body, participants, location, time etc. The system also performs pattern matching to identify particular meeting fields in a meeting text. This information is utilized to query various sources of information on the web and obtain relevant stories about the current meeting to send back to the calendaring system. For example, if an individual has a meeting with Netscape and Microsoft to talk about their disputes, the system would obtain this initial information from the calendaring system. It will then parse out the text to realize that the companies in the meeting are "Netscape" and "Microsoft" and the topic is "disputes." Then, the system queries the web for relevant information concerning the topic. Thus, in accordance with an objective of the invention, the system updates the calendaring system and eventually the user with the best information it can gather to prepare the user for the target meeting. In accordance with a preferred embodiment, the information is stored in a file that is obtained via selection from a link embedded in the calendar system.
PROGRAM ORGANIZATION
A computer program in accordance with a preferred embodiment is organized in five distinct modules: BF.Main, BF.Parse, BackgroundFinder.Error, BF.PatternMatching and BF.Search. There is also a frmMain which provides a user interface used only for debugging purposes. The executable programs in accordance with a preferred embodiment never execute with the user interface and should only return to the calendaring system through Microsoft's Winsock control. A preferred embodiment of the system executes in two different modes which can be specified on the command line sent to it by the calendaring system. When the system runs in simple mode, it forms a keyword query to submit to external search engines. When executed in complex mode, the system performs pattern matching before it forms a query to be sent to a search engine.
DATA STRUCTURES
The system in accordance with a preferred embodiment utilizes three user defined structures:
1. tMeetingRecord;
2. tAPatternElement; and
3. tAPatternRecord.
The user-defined structure, tMeetingRecord, is used to store all the pertinent information concerning a single meeting. This info includes userID, an original description of the meeting, the extracted list of keywords from the title and body of meeting etc. It is important to note that only one meeting record is created per instance of the system in accordance with a preferred embodiment. This is because each time the system is spawned to service an upcoming meeting, it is assigned a task to retrieve information for only one meeting. Therefore, the meeting record created corresponds to the current meeting examined. ParseMeetingText populates this meeting record and it is then passed around to provide information about the meeting to other functions.
If GoPatternMatch can bind any values to a particular meeting field, the corresponding entries in the meeting record are also updated. The structure of tMeetingRecord with each field described in parentheses is provided below in accordance with a preferred embodiment.
Public Type tMeetingRecord
    sUserID As String (user id given by Munin)
    sTitleOrig As String (original non stop listed title we need to keep around to send back to Munin)
    sTitleKW As String (stoplisted title with only keywords)
    sBodyKW As String (stoplisted body with only keywords)
    sCompany() As String (companies identified in title or body through pattern matching)
    sTopic() As String (topics identified in title or body through pattern matching)
    sPeople() As String (people identified in title or body through pattern matching)
    sWhen() As String (time identified in title or body through pattern matching)
    sWhere() As String (location identified in title or body through pattern matching)
    sLocation As String (location as passed in by Munin)
    sTime As String (time as passed in by Munin)
    sParticipants() As String (all participants engaged as passed in by Munin)
    sMeetingText As String (the original meeting text w/o userid)
End Type
There are two other structures which are created to hold each individual pattern utilized in pattern matching. The record tAPatternRecord is an array containing all the components/elements of a pattern. The type tAPatternElement is an array of strings which represent an element in a pattern. Because there may be many "substitutes" for each element, we need an array of strings to keep track of what all the substitutes are. The structures of tAPatternElement and tAPatternRecord are presented below in accordance with a preferred embodiment.
Public Type tAPatternElement
    elementArray() As String
End Type
Public Type tAPatternRecord
    patternArray() As tAPatternElement
End Type
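The two pattern structures can be illustrated with a minimal sketch. It is written in Python rather than the Visual Basic of the preferred embodiment, and the class names, the is_placeholder helper and the sample pattern are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PatternElement:
    """One element of a pattern, stored with all of its substitutes."""
    element_array: List[str] = field(default_factory=list)

    def is_placeholder(self) -> bool:
        # Placeholders are written as $NAME$, e.g. "$PEOPLE$" (hypothetical helper).
        return any(len(s) > 2 and s.startswith("$") and s.endswith("$")
                   for s in self.element_array)


@dataclass
class PatternRecord:
    """An ordered list of elements that together form one pattern."""
    pattern_array: List[PatternElement] = field(default_factory=list)


# "$PEOPLE$ of $COMPANY$" and "$PEOPLE$ from $COMPANY$" fit in one record
# because "of" and "from" are substitutes for the same element.
people_company = PatternRecord([
    PatternElement(["$PEOPLE$"]),
    PatternElement(["of", "from"]),
    PatternElement(["$COMPANY$"]),
])

kinds = ["placeholder" if e.is_placeholder() else "indicator"
         for e in people_company.pattern_array]
print(kinds)
```

Keeping the substitutes inside the element, rather than creating one record per wording, is what lets a single record stand for a whole group of similar patterns.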
COMMON USER DEFINED CONSTANTS
Many constants are defined in the declaration section of each module of the program, and they may need to be updated periodically as part of the process of maintaining the system in accordance with a preferred embodiment. The constants are kept accessible so that the system can be reconfigured when the code is maintained.
Included in the following tables are lists of constants from each module which are considered most likely to be modified from time to time. However, there are also other constants used in the code not included in the following list. It does not mean that these non-included constants will never be changed. It means that they will change much less frequently.
CONSTANT              PRESET VALUE     USE
MSGTOMUNIN_TYPE       6                Define the message number used to identify messages between BF and Munin.
IP_ADDRESS_MUNIN      "10.2.100.48"    Define the IP address of the machine on which Munin and BF are running so they can transfer data through UDP.
PORT_MUNIN            7777             Define the remote port on which we are operating.
TIMEOUT_AV            60               Define constants for setting time out in inet controls.
TIMEOUT_NP            60               Define constants for setting time out in inet controls.
CMD_SEPARATOR         "\"              Define delimiter to tell which part of Munin's command represents the beginning of our input meeting text.
OUTPARAM_SEPARATOR    "::"             Define delimiter for separating out different portions of the output. The separator is for delimiting the msg type, the user id, the meeting title and the beginning of the actual stories retrieved.
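As a rough illustration of how the two separators above could be used, the following Python sketch splits a command line at CMD_SEPARATOR and assembles a reply with OUTPARAM_SEPARATOR. The function names, the command-line layout in the usage note, and the exact order of the reply portions are assumptions; the source states only what each separator delimits.

```python
CMD_SEPARATOR = "\\"        # a single backslash, per the table above
OUTPARAM_SEPARATOR = "::"
MSGTOMUNIN_TYPE = 6


def extract_meeting_text(command_line: str) -> str:
    """Return the text that follows the first CMD_SEPARATOR, or "" if absent."""
    _, separator, meeting_text = command_line.partition(CMD_SEPARATOR)
    return meeting_text if separator else ""


def build_reply(user_id: str, title: str, stories: str) -> str:
    """Join message type, user id, meeting title and stories (assumed order)."""
    return OUTPARAM_SEPARATOR.join([str(MSGTOMUNIN_TYPE), user_id, title, stories])
```

For example, extract_meeting_text applied to a hypothetical command line "simple\09::Meet with Chad" yields "09::Meet with Chad", and build_reply("09", "Meet with Chad", "story text") yields "6::09::Meet with Chad::story text".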
For the Search Module (BF.Search):
CONSTANT              CURRENT VALUE    USE
PAST_NDAYS            5                Define number of days you want to look back for AltaVista articles. Doesn't really matter now because we aren't really doing a news search in alta vista. We want all info.
CONNECTOR_AV_URL      "+AND+"          Define how to connect keywords. We want all our keywords in the string so for now use AND. If you want to do an OR or something, just change connector.
CONNECTOR_NP_URL      "+AND+"          Define how to connect keywords. We want all our keywords in the string so for now use AND. If you want to do an OR or something, just change connector.
NUM_NP_STORIES        3                Define the number of stories to return back to Munin from NewsPage.
NUM_AV_STORIES        3                Define the number of stories to return back to Munin from AltaVista.
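The role of the connector and story-count constants can be sketched as follows. This is a Python illustration only: join_keywords and top_stories are hypothetical helpers, and URL-encoding the keywords with quote_plus is an assumption that the source does not describe.

```python
from urllib.parse import quote_plus

CONNECTOR_AV_URL = "+AND+"
NUM_AV_STORIES = 3


def join_keywords(keywords, connector=CONNECTOR_AV_URL):
    """URL-encode each non-empty keyword and join them with the connector."""
    encoded = [quote_plus(k.strip()) for k in keywords if k.strip()]
    return connector.join(encoded)


def top_stories(ranked_stories, limit=NUM_AV_STORIES):
    """Keep only the first `limit` stories from an already ranked list."""
    return list(ranked_stories)[:limit]
```

Changing the connector argument (for instance to "+OR+") switches the Boolean operator without touching the rest of the query-building code, which is the reason the table gives for keeping it in a constant.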
For the Parse Module (BF.Parse):
CONSTANT                 CURRENT VALUE    USE
PORTION_SEPARATOR        "::"             Define the separator between different portions of the meeting text sent in by Munin. For example in "09::Meet with Chad::about life::Chad | Denise::::::" "::" is the separator between different parts of the meeting text.
PARTICIPANT_SEPARATOR    "|"              Define the separator between each participant in the participant list portion of the original meeting text. Refer to example above.
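The example meeting text in the table can be split with the two separators as sketched below in Python. The field order in FIELD_NAMES is an assumption: the source names the fields (user id, title, body, participants, location, time) but does not state their positions.

```python
PORTION_SEPARATOR = "::"
PARTICIPANT_SEPARATOR = "|"

# Assumed field order, for illustration only.
FIELD_NAMES = ["user_id", "title", "body", "participants", "location", "time"]


def parse_meeting_text(text):
    """Split the meeting text into named portions and a participant list."""
    portions = text.split(PORTION_SEPARATOR)
    # Pad so that a short input still yields every field.
    portions += [""] * (len(FIELD_NAMES) - len(portions))
    record = dict(zip(FIELD_NAMES, portions))
    record["participants"] = [
        name.strip()
        for name in record["participants"].split(PARTICIPANT_SEPARATOR)
        if name.strip()
    ]
    return record


record = parse_meeting_text("09::Meet with Chad::about life::Chad | Denise::::::")
print(record["title"], record["participants"])
```

Like the ParseMeetingText function described later, this sketch does no error checking and assumes the text is correctly formatted by the caller.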
For the Pattern Matching Module (BF.PatternMatch): There are no constants in this module which require frequent updates.
General Process Flow
The best way to depict the process flow and the coordination of functions between each other is with the five flowcharts illustrated in FIGS. 2 to 6. FIG. 2 depicts the overall process flow in accordance with a preferred embodiment. Processing commences at the top of the chart at function block 200 which launches when the program starts. Once the application is started, the command line is parsed to remove the appropriate meeting text to initiate the target of the background find operation in accordance with a preferred embodiment as shown in function block 210. A global stop list is generated after the target is determined as shown in function block 220. Then, all the patterns that are utilized for matching operations are generated as illustrated in function block 230. Then, by tracing through the chart, function block 200 invokes GoBF 240, which is responsible for the logical processing associated with wrapping the correct search query information for the particular target search engine. For example, function block 240 flows to function block 250, which then calls GoPatternMatch as shown in function block 260. To see the process flow of GoPatternMatch, we swap to the diagram titled "Process Flow for BF's Pattern Matching Unit."
One key thing to notice is that functions depicted at the same level of the chart are called in sequential order from left to right (or top to bottom) by their common parent function. For example, Main 200 calls ProcessCommandLine 210, then CreateStopList 220, then CreatePatterns 230, then GoBackgroundFinder 240. FIGS. 2 to 5 detail the logic for the entire program, the parsing unit, the pattern matching unit and the search unit respectively. FIG. 6 details the logic determinative of data flow of key information through BackgroundFinder, and shows the functions that are responsible for creating or processing such information.
DETAILED SEARCH ARCHITECTURE UNDER THE SIMPLE QUERY MODE
SEARCH ALTA VISTA
(Function block 270 of FIG. 2)
The Alta Vista search identifies and returns general information about topics related to the current meeting as shown in function block 270 of FIG. 2. The system in accordance with a preferred embodiment takes all the keywords from the title portion of the original meeting text and constructs an advanced query to send to Alta Vista. The keywords are logically combined together in the query. The results are also ranked based on the same set of keywords. One of ordinary skill in the art will readily comprehend that a date restriction or publisher criteria could be applied to the articles we want to retrieve. A set of top ranking stories is returned to the calendaring system in accordance with a preferred embodiment.
NEWS PAGE
(Function block 275 of FIG. 2)
The NewsPage search system is responsible for giving us the latest news topics related to a target meeting. The system takes all of the keywords from the title portion of the original meeting text and constructs a query to send to the NewsPage search engine. The keywords are logically combined together in the query. Only articles published recently are retrieved. The NewsPage search system provides a date restriction criterion that is settable by a user according to the user's preference. The top ranking stories are returned to the calendaring system.
FIG. 3 is a user profile data model in accordance with a preferred embodiment. Processing commences at function block 300 which is responsible for invoking the program from the main module. Then, at function block 310, a wrapper function is invoked to prepare for the keyword extraction processing in function block 320. After the keywords are extracted, processing flows to function block 330 to determine if the delimiters are properly positioned. Then, at function block 340, the number of words in a particular string is calculated, and at function block 350 the delimiters for the particular field are identified and that field is retrieved from the meeting text. Then, at function block 380, the delimiters of the string are again checked to assure they are placed appropriately. Finally, at function block 360, the extraction of each word from the title and body of the message is performed a word at a time utilizing the logic in function block 362 which finds the next closest word delimiter in the input phrase, function block 364 which strips unnecessary materials from a word and function block 366 which determines if a word is on the stop list and returns an error if the word is on the stop list.
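The word-at-a-time extraction against a stop list can be sketched as follows. This Python illustration stores the stop list as one string with a comma on each side of every word, so a plain substring test is enough to check membership. The stop words, the punctuation stripped, and the function names are assumptions, and a stop-listed word is simply skipped here rather than reported as an error.

```python
# Hypothetical stop words; the source does not list the real stop list.
STOP_WORDS = ["a", "about", "meet", "the", "to", "with"]


def create_stop_list(words):
    """Build ",a,about,...," so that ",word," can be found by substring search."""
    return "," + ",".join(w.lower() for w in words) + ","


def extract_keywords(text, stop_list):
    """Return the words of `text` that are not on the stop list."""
    keywords = []
    for raw in text.split():
        word = raw.strip(".,;:!?()\"'")          # strip unnecessary material
        if word and ("," + word.lower() + ",") not in stop_list:
            keywords.append(word)
    return keywords


stop_list = create_stop_list(STOP_WORDS)
print(extract_keywords("Meet with Chad about the budget.", stop_list))
```

Wrapping each candidate in commas before the search avoids false hits on partial words, for example matching "the" inside "theory".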
PATTERN MATCHING IN ACCORDANCE WITH A PREFERRED EMBODIMENT
The limitations associated with a simple searching method include the following:
1. Because it relies on a stoplist of unwanted words in order to extract from the meeting text a set of keywords, it is limited by how comprehensive the stoplist is. Instead of trying to figure out what parts of the meeting text we should throw away, we should focus on what parts of the meeting text we want.
2. A simple search method in accordance with a preferred embodiment only uses the keywords from a meeting title to form queries to send to Alta Vista and NewsPage. This ignores an alternative source of information for the query, the body of the meeting notice. We cannot include the keywords from the meeting body to form our queries because this often results in queries which are too long and so complex that we often obtain no meaningful results.
3. There is no way for us to tell what each keyword represents. For example, we may extract "Andy" and "Grove" as two keywords. However, a simplistic search has no way of knowing that "Andy Grove" is in fact a person's name. Imagine the possibilities if we could somehow intelligently guess that "Andy Grove" is a person's name. We could then look for information such as where he is employed and where he currently resides.
4. In summary, by relying solely on a stoplist to parse out unnecessary words, we suffer from "information overload".
PATTERN MATCHING OVERCOMES THESE LIMITATIONS IN ACCORDANCE WITH A PREFERRED EMBODIMENT
Here is how the pattern matching system can address each of the corresponding issues above in accordance with a preferred embodiment.
1. By doing pattern matching, we match up only parts of the meeting text that we want and extract those parts.
2. By performing pattern matching on the meeting body, we extract only the parts of the meeting body that we want, so the meeting body does not go to complete waste.
3. Pattern matching is based on a set of templates that we specify, allowing us to identify people names, company names and other items from a meeting text.
4. In summary, with pattern matching, we no longer suffer from information overload. Of course, the big problem is how well our pattern matching works. If we rely exclusively on artificial intelligence processing, we do not have a 100% hit rate. We are able to identify about 20% of all company names presented to us.
PATTERNS
A pattern in the context of a preferred embodiment is a template specifying the structure of a phrase we are looking for in a meeting text. The patterns supported by a preferred embodiment are selected because they are templates of phrases which have a high probability of appearing in someone's meeting text. For example, when entering a meeting in a calendar, many would write something such as "Meet with Bob Dutton from Stanford University next Tuesday." A common pattern would then be something like the word "with" followed by a person's name (in this example it is Bob Dutton) followed by the word "from" and ending with an organization's name (in this case, it is Stanford University).
PATTERN MATCHING TERMINOLOGY
The common terminology associated with pattern matching is provided below.
Pattern: a pattern is a template specifying the structure of a phrase we want to bind the meeting text to. It contains sub units.
Element: a pattern can contain many sub-units. These subunits are called elements. For example, in the pattern "with $PEOPLE$ from $COMPANY$", "with" "$PEOPLE$" "from" "$COMPANY$" are all elements.
Placeholder: a placeholder is a special kind of element to which we want to bind a value. Using the above example, "$PEOPLE$" is a placeholder.
Indicator: an indicator is another kind of element which we want to find in a meeting text but no value needs to bind to it. There may be often more than one indicator we are looking for in a certain pattern. That is why an indicator is not an "atomic" type.
Substitute: substitutes are a set of indicators which are all synonyms of each other. Finding any one of them in the input is good.
There are five fields which are identified for each meeting:
♦ Company ($COMPANY$)
♦ People ($PEOPLE$)
♦ Location ($LOCATION$)
♦ Time ($TIME$)
♦ Topic ($TOPIC_UPPER$) or ($TOPIC_ALL$)
In parentheses are the placeholders used in the code to represent the corresponding meeting fields.
Each placeholder has the following meaning:
$COMPANY$: binds a string of capitalized words (e.g., Meet with Joe Carter of <Andersen Consulting>)
$PEOPLE$: binds a series of strings of two capitalized words potentially connected by "," "and" or "&" (e.g., Meet with <Joe Carter> of Andersen Consulting, Meet with <Joe Carter and Luke Hughes> of Andersen Consulting)
$LOCATION$: binds a string of capitalized words (e.g., Meet Susan at <Palo Alto Square>)
$TIME$: binds a string containing the format #:## (e.g., Dinner at <6:30 pm>)
$TOPIC_UPPER$: binds a string of capitalized words for our topic (e.g., <Stanford Engineering Recruiting> Meeting to talk about new hires).
$TOPIC_ALL$: binds a string of words without really caring if it's capitalized or not. (e.g., Meet to talk about <ubiquitous computing>)
Here is a table representing all the patterns supported by BF. Each pattern belongs to a pattern group. All patterns within a pattern group share a similar format and they only differ from each other in terms of what indicators are used as substitutes. Note that the patterns which are grayed out are also commented in the code. BF has the capability to support these patterns but we decided that matching these patterns is not essential at this point.
PAT GRP  PAT #  PATTERN                          EXAMPLE
1        a      $PEOPLE$ of $COMPANY$            Paul Maritz of Microsoft
         b      $PEOPLE$ from $COMPANY$          Bill Gates, Paul Allen and Paul Maritz from Microsoft
2        a      $TOPIC_UPPER$ meeting            Push Technology Meeting
         b      $TOPIC_UPPER$ mtg                Push Technology Mtg
         c      $TOPIC_UPPER$ demo               Push Technology demo
         d      $TOPIC_UPPER$ interview          Push Technology interview
         e      $TOPIC_UPPER$ presentation       Push Technology presentation
         f      $TOPIC_UPPER$ visit              Push Technology visit
         g      $TOPIC_UPPER$ briefing           Push Technology briefing
         h      $TOPIC_UPPER$ discussion         Push Technology discussion
         i      $TOPIC_UPPER$ workshop           Push Technology workshop
         j      $TOPIC_UPPER$ prep               Push Technology prep
         k      $TOPIC_UPPER$ review             Push Technology review
         l      $TOPIC_UPPER$ lunch              Push Technology lunch
         m      $TOPIC_UPPER$ project            Push Technology project
         n      $TOPIC_UPPER$ projects           Push Technology projects
3        a      $COMPANY$ corporation            Intel Corporation
         b      $COMPANY$ corp.                  IBM Corp.
         c      $COMPANY$ systems                Cisco Systems
         d      $COMPANY$ limited                IBM limited
         e      $COMPANY$ ltd                    IBM ltd
4        a      about $TOPIC_ALL$                About intelligent agents technology
         b      discuss $TOPIC_ALL$              Discuss intelligent agents technology
         c      show $TOPIC_ALL$                 Show the client our intelligent agents technology
         d      re: $TOPIC_ALL$                  re: intelligent agents technology
         e      review $TOPIC_ALL$               Review intelligent agents technology
         f      agenda                           The agenda is as follows:
                                                 --clean up
                                                 --clean up
                                                 --clean up
         g      agenda: $TOPIC_ALL$              Agenda:
                                                 --demo client intelligent agents technology.
                                                 --demo ecommerce.
5        a      w/$PEOPLE$ of $COMPANY$          Meet w/Joe Carter of Andersen Consulting
         b      w/$PEOPLE$ from $COMPANY$        Meet w/Joe Carter from Andersen Consulting
6        a      w/$COMPANY$ per $PEOPLE$         Talk w/Intel per Jason Foster
7        a      At $TIME$                        At 3:00 pm
         b      Around $TIME$                    Around 3:00 pm
8        a      At $LOCATION$                    At LuLu's restaurant
         b      In $LOCATION$                    in Santa Clara
9        a      Per $PEOPLE$                     per Susan Butler
10       a      call w/$PEOPLE$                  Conf call w/John Smith
         b      call with $PEOPLE$               Conf call with John Smith
11       a      prep for $TOPIC_ALL$             Prep for London meeting
         b      preparation for $TOPIC_ALL$      Preparation for London meeting
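A few of the patterns in the table can be tried out with the following sketch. It is a Python illustration that swaps in regular expressions for the word-by-word matching of the preferred embodiment; the regular expressions chosen for each placeholder, the three sample patterns, and the function names are all assumptions.

```python
import re

CAP = r"[A-Z][\w'-]*"                  # one capitalized word
NAME = rf"{CAP}\s+{CAP}"               # two capitalized words, e.g. "Joe Carter"
CAP_PHRASE = rf"{CAP}(?:\s+{CAP})*"    # a run of capitalized words

# Assumed regular expressions standing in for the placeholder rules.
PLACEHOLDER_REGEX = {
    "$PEOPLE$": rf"{NAME}(?:\s*(?:,|and|&)\s*{NAME})*",
    "$COMPANY$": CAP_PHRASE,
    "$TOPIC_UPPER$": CAP_PHRASE,
    "$TIME$": r"\d{1,2}:\d{2}(?:\s*[AaPp][Mm])?",
}

# Each pattern is a list of elements; each element lists its substitutes.
PATTERNS = [
    [["$PEOPLE$"], ["of", "from"], ["$COMPANY$"]],        # group 1
    [["$TOPIC_UPPER$"], ["meeting", "mtg", "demo"]],      # part of group 2
    [["at", "around"], ["$TIME$"]],                       # group 7
]


def compile_pattern(pattern):
    """Turn one pattern into a regex with a named group per placeholder."""
    parts = []
    for element in pattern:
        if element[0] in PLACEHOLDER_REGEX:
            name = element[0].strip("$").lower()
            parts.append(f"(?P<{name}>{PLACEHOLDER_REGEX[element[0]]})")
        else:
            # Indicators match case-insensitively and only as whole words.
            alternatives = "|".join(re.escape(word) for word in element)
            parts.append(rf"(?<!\w)(?i:{alternatives})(?!\w)")
    return re.compile(r"\s*".join(parts))


COMPILED = [compile_pattern(p) for p in PATTERNS]


def match_fields(text):
    """Return every placeholder value that any pattern binds in `text`."""
    found = {}
    for regex in COMPILED:
        for match in regex.finditer(text):
            for name, value in match.groupdict().items():
                values = found.setdefault(name, [])
                if value not in values:
                    values.append(value)
    return found


print(match_fields("Meet with Bill Gates, Paul Allen and Paul Maritz from Microsoft"))
```

The indicator is what anchors each match: without "of" or "from" next to it, a capitalized phrase such as "Meet" at the start of a sentence would be indistinguishable from a company name.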
FIG. 4 is a detailed flowchart of pattern matching in accordance with a preferred embodiment. Processing commences at function block 400 where the main program invokes the pattern matching application and passes control to function block 410 to commence the pattern match processing. Then, at function block 420, the wrapper function loops through to process each pattern, which includes determining if a part of the text string can be bound to a pattern as shown in function block 430. Then, at function block 440, various placeholders are bound to values if they exist; in function block 441, a list of names separated by punctuation is bound; and at function block 442 a full name is processed by finding two capitalized words as a full name and grabbing the next letter after a space after a word to determine if it is capitalized. Then, at function block 443, time is parsed out of the string in an appropriate manner and the next word after a blank space is obtained in function block 444. Then, at function block 445, the continuous phrases of capitalized words such as company, topic or location are bound and in function block 446, the next word after the blank is obtained for further processing in accordance with a preferred embodiment. Following the match meeting field processing, function block 450 is utilized to locate an indicator which is the head of a pattern; the next word after the blank is obtained as shown in function block 452 and the word is checked to determine if the word is an indicator as shown in function block 454. Then, at function block 460, the string is parsed to locate an indicator which is not at the end of the pattern, and the next word after unnecessary white space such as that following a line feed or a carriage return is processed as shown in function block 462 and the word is analyzed to determine if it is an indicator as shown in function block 464.
Then, in function block 470, the temporary record is reset to the null set to prepare it for processing the next string and at function block 480, the meeting record is updated and at function block 482 a check is performed to determine if an entry is already made to the meeting record before parsing the meeting record again.
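The update of the meeting record and the duplicate check can be sketched as below. This is a Python illustration in which both records are plain dictionaries of lists; the function name, the dictionary layout, and the choice to clear the temporary record after copying are assumptions.

```python
def update_meeting_record(meeting_record, temp_record):
    """Copy newly bound values into the meeting record, skipping duplicates."""
    for field_name, values in temp_record.items():
        existing = meeting_record.setdefault(field_name, [])
        for value in values:
            if value not in existing:      # entry already made? then skip it
                existing.append(value)
    temp_record.clear()                    # ready for the next string
    return meeting_record


meeting = {"company": ["Microsoft"]}
temp = {"company": ["Microsoft", "Netscape"], "topic": ["disputes"]}
update_meeting_record(meeting, temp)
print(meeting)
```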
USING THE IDENTIFIED MEETING FIELDS
Now that we have identified fields within the meeting text which we consider important, there are quite a few things we can do with them. One of the most important applications of pattern matching is of course to improve the query we construct which eventually gets submitted to Alta Vista and News Page. There are also a lot of other options and enhancements which exploit the results of pattern matching that we can add to BF. These other options will be described in the next section. The goal of this section is to give the reader a good sense of how the results obtained from pattern matching can be used to help us obtain better search results.
FIG. 5 is a flowchart of the detailed processing for preparing a query and obtaining information from the Internet in accordance with a preferred embodiment. Processing commences at function block 500 and immediately flows to function block 510 to process the wrapper functionality to prepare for an Internet search utilizing a web search engine. If the search is to utilize the Alta Vista search engine, then at function block 530, the system takes information from the meeting record and forms a query in function blocks 540 to 560 for submittal to the search engine. If the search is to utilize the NewsPage search engine, then at function block 520, the system takes information from the meeting record and forms a query in function blocks 521 to 528.
Alta Vista Search Engine
The strength of the Alta Vista search engine is that it provides enhanced flexibility. Using its advanced query method, one can construct all sorts of Boolean queries and rank the search results however one wants. However, one of the biggest drawbacks with Alta Vista is that it is not very good at handling a large query and is likely to give back irrelevant results. If we can identify the topic and the company within a meeting text, we can form a pretty short but comprehensive query which will hopefully yield better results. We also want to focus on the topics found. It may not be of much merit to the user to find out info about a company, especially if the user already knows the company well and has had numerous meetings with them. It is the topics that the user wants to research.
News Page Search Engine
The strength of the News Page search engine is that it does a great job searching for the most recent news if you are able to give it a valid company name. Therefore when we submit a query to the News Page web site, we send whatever company name we can identify and only if we cannot find one do we use the topics found to form a query. If neither one is found, then no search is performed. The algorithm utilized to form the query to submit to Alta Vista is illustrated in FIG. 7. The algorithm that we will use to form the query to submit to News Page is illustrated in FIG. 8.
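The query selection rules stated in the text can be sketched as follows. This Python illustration follows the News Page rule as described (company names first, topics only as a fallback, no search if neither is found). The Alta Vista function is an assumption that places topics ahead of companies, since FIG. 7 is not reproduced here, and the "+AND+" connector and function names are likewise illustrative.

```python
from typing import List, Optional

CONNECTOR = "+AND+"


def _join(terms: List[str]) -> str:
    """Join terms with the connector, replacing inner spaces with '+'."""
    return CONNECTOR.join(term.replace(" ", "+") for term in terms)


def newspage_query(companies: List[str], topics: List[str]) -> Optional[str]:
    """Prefer company names; fall back to topics; None means no search."""
    terms = companies if companies else topics
    return _join(terms) if terms else None


def altavista_query(companies: List[str], topics: List[str]) -> Optional[str]:
    """Assumed rule: topics lead the query and companies follow."""
    terms = list(topics) + list(companies)
    return _join(terms) if terms else None


print(newspage_query(["Microsoft"], ["disputes"]))
print(altavista_query(["Netscape", "Microsoft"], ["disputes"]))
```

Returning None rather than an empty string makes the "no search is performed" case explicit for the caller.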
The following table describes in detail each function in accordance with a preferred embodiment. The order in which functions appear mimics the process flow as closely as possible. When there are situations in which a function is called several times, this function will be listed after the first function which calls it and its description is not duplicated after every subsequent function which calls it.
Procedure
Name Type Called By Description
Main Public None This is the main
function
(BF.Main) Sub where the program
first
launches. It
initializes BF
with the
appropriate
parameters (e.g.,
Internet
time-out, stoplist
. . . ) and
calls GoBF to
launch the
main part of the
program.
ProcessCommandLine Private Main This function
parses the
(BF.Main) Sub command line. It
assumes
that the delimiter
indicating
the beginning of
input from
Munin is stored in
the
constant
CMD_SEPARATOR.
CreateStopList Private Main This function sets
up a stop
(BF.Main) Function list for future use
to parse out
unwanted words from
the
meeting text.
There are commas on
each
side of each word
to enable
straight checking.
CreatePatterns Public Main This procedure is
called once
(BF.Pattern Sub when BF is first
initialized to
Match) create all the
potential
patterns that
portions of the
meeting text can
bind to. A
pattern can contain
however
many elements as
needed.
There are
two types of
elements. The
first type of
elements are
indicators. These
are real
words which delimit
the
potential of a
meeting field
(e.g. company) to
follow.
Most of these
indicators are
stop words as
expected
because
stop words are
words
usually common to
all
meeting text so it
makes
sense they form
patterns. The
second type of
elements are
special strings
which
represent
placeholders.
A placeholder is
always in
the form of $*$
where * can
be either PEOPLE,
COMPANY,
TOPIC_UPPER,
TIME, LOCATION or
TOPIC_ALL. A
pattern can
begin with either
one of the
two types of
elements and
can be however
long,
involving however
any
number/type of
elements.
This procedure
dynamically
creates a new
pattern record
for
each pattern in the
table and
it also dynamically
creates
new
tAPatternElements for
each element within
a
pattern. In
addition, there is
the concept of
being able to
substitute
indicators within a
pattern. For
example, the
pattern $PEOPLE$ of
$COMPANY$ is
similar to
the pattern
$PEOPLE$ from
$COMPANY$. "from"
is a
substitute for
"of". Our
structure should be
able to
express such a need
for
substitution.
GoBF Public Main This is a wrapper
procedurer
(BF.Main) Sub that calls both the
parsing
and the searching
subroutines
of the
BF. It is also
responsible for
sending data back
to Munin.
ParseMeetingText (BF.Parse) | Public Function | GoBackGroundFinder | This function takes the initial meeting text and identifies the userID of the record as well as other parts of the meeting text including the title, body, participant list, location and time. In addition, we call a helper function ProcessStopList to eliminate all the unwanted words from the original meeting title and meeting body so that only keywords are left. The information parsed out is stored in the MeetingRecord structure. Note that this function does no error checking and for the most part assumes that the meeting text string is correctly formatted by Munin. The important variable, this Meeting Record, is the temporary holder for all information regarding the current meeting. It is eventually returned to the caller.
FormatDelimitation (BF.Parse) | Private | ParseMeetingText, DetermineNumWords, GetAWordFromString | There are 4 ways in which the delimiters can be placed. We take care of all these cases by reducing them down to Case 4 in which there are no delimiters around but only between fields in a string (e.g., A::B::C).
DetermineNumWords (BF.Parse) | Public Function | ParseMeetingText, ProcessStopList | This function determines how many words there are in a string (stInEvalString). The function assumes that each word is separated by a designated separator as specified in stSeparator. The return type is an integer that indicates how many words have been found assuming each word in the string is separated by stSeparator. This function is always used along with GetAWordFromString and should be called before calling GetAWordFromString.
GetAWordFromString (BF.Parse) | Public Function | ParseMeetingText, ProcessStopList | This function extracts the ith word of the string (stInEvalString) assuming that each word in the string is separated by a designated separator contained in the variable stSeparator. In most cases, use this function with DetermineNumWords. The function returns the wanted word. This function checks to make sure that iInWordNum is within bounds so that i is not greater than the total number of words in the string or less than or equal to zero. If it is out of bounds, we return an empty string to indicate we can't get anything. We try to make sure this doesn't happen by calling DetermineNumWords first.
ParseAndCleanPhrase (BF.Parse) | Private Function | ParseMeetingText | This function first grabs the word and sends it to CleanWord in order to strip the stuff that nobody wants. There are things in parseWord that will kill the word, so we need a method of looping through the body and rejecting words without killing the whole function; the approach is to keep CleanWord and check a return value. Once we have a word, we need to send it down the parse chain. This chain goes ParseAndCleanPhrase --> CleanWord --> EvaluateWord. If the word gets through the entire chain without being killed, it will be added at the end to our keyword string. First would be the function that checks for "/" as a delimiter and extracts the parts of that. This I will call "StitchFace" (Denise is more normal and calls it GetAWordFromString). If this finds words, then each of these will be sent, in turn, down the chain. If these get through the entire chain without being added or killed, then they will be added rather than tossed.
FindMin (BF.Parse) | Private Function | ParseAndCleanPhrase | This function takes in 6 input values and evaluates to see what the minimum non zero value is. It first creates an array as a holder so that we can sort the five input values in ascending order. Thus the minimum value will be the first non zero value element of the array. If we go through the entire array without finding a non zero value, we know that there is an error and we exit the function.
CleanWord (BF.Parse) | Private Function | ParseAndCleanPhrase | This function tries to clean up a word in a meeting text. It first determines if the string is of a valid length. It then passes it through a series of tests to see whether it is clean and, when needed, it will edit the word and strip unnecessary characters off of it. Such tests include getting rid of file extensions, non chars, numbers etc.
EvaluateWord (BF.Parse) | Private Function | ParseAndCleanPhrase | This function tests to see if this word is in the stop list so it can determine whether to eliminate the word from the original meeting text. If a word is not in the stoplist, it should stay around as a keyword and this function exits beautifully with no errors. However, if the word is a stopword, an error must be returned. We must properly delimit the input test string so we don't accidentally retrieve substrings.
GoPatternMatch (BF.PatternMatch) | Public Sub | GoBF | This procedure is called when our QueryMethod is set to complex query, meaning we do want to do all the pattern matching stuff. It's a simple wrapper function which initializes some arrays and then invokes pattern matching on the title and the body.
MatchPatterns (BF.PatternMatch) | Public Sub | GoPatternMatch | This procedure loops through every pattern in the pattern table and tries to identify different fields within a meeting text specified by sInEvalString. For debugging purposes it also tries to tabulate how many times a certain pattern was triggered and stores it in gTabulateMatches to see which pattern fired the most. gTabulateMatches is stored as a global because we want to be able to run a batch file of 40 or 50 test strings and still be able to know how often a pattern was triggered.
MatchAPattern (BF.PatternMatch) | Private Function | MatchPatterns | This function goes through each element in the current pattern. It first evaluates to determine whether the element is a placeholder or an indicator. If it is a placeholder, then it will try to bind the placeholder with some value. If it is an indicator, then we try to locate it. There is a trick, however. Depending on whether the current element is the head of the pattern or not, we want to take different actions. If we are at the head, we want to look for the indicator or the placeholder. If we can't find it, then we know that the current pattern doesn't exist and we quit. However, if it is not the head, then we continue looking, because there may still be a head somewhere. We retry in this case.
MatchMeetingField (BF.PatternMatch) | Private Function | MatchAPattern | This function uses a big switch statement to first determine what kind of placeholder we are talking about. Depending on the type of placeholder, we have specific requirements and different binding criteria as specified in the subsequent functions called, such as BindNames, BindTime etc. If binding is successful we add it to our guessing record.
BindNames (BF.PatternMatch) | Private Function | MatchMeetingField | In this function, we try to match names to the corresponding placeholder $PEOPLE$. Names are defined as any two consecutive words which are capitalized. We also want to retrieve a series of names which are connected by "and", "or" or "&", so we look until we don't see any of these 3 separators anymore. Note that we don't want to bind single word names because a single word is probably too general, and we don't want to produce broad but irrelevant results. This function calls BindAFullName, which binds one name, so in a sense BindNames collects all the results from BindAFullName.
BindAFullName (BF.PatternMatch) | Private Function | BindNames | This function tries to bind a full name. If the $PEOPLE$ placeholder is not the head of the pattern, we know that it has to come right at the beginning of the test string because we've been deleting stuff off the head of the string all along. If it is the head, we search until we find something that looks like a full name. If we can't find it, then there's no such pattern in the text and we quit entirely from this pattern. This should eventually return us to the next pattern in MatchPatterns.
GetNextWordAfterWhiteSpace (BF.PatternMatch) | Private Function | BindAFullName, BindTime, BindCompanyTopicLoc | This function grabs the next word in a test string. It looks for the next word after white spaces, @ or /. The word is defined to end when we encounter another one of these white spaces or separators.
BindTime (BF.PatternMatch) | Private Function | MatchMeetingField | Get the immediate next word and see if it looks like a time pattern. If so, we've found a time and so we want to add it to the record. We probably should add more time patterns. But people don't seem to like to enter the time in their titles these days, especially since we now have tools like OutLook.
BindCompanyTopicLoc (BF.PatternMatch) | Private Function | MatchMeetingField | This function finds a continuous capitalized string and binds it to stMatch, which is passed by reference from MatchMeetingField. A continuous capitalized string is a sequence of capitalized words which are not interrupted by things like "," or ".", etc. There's probably more stuff we can add to the list of interruptions.
LocatePatternHead (BF.PatternMatch) | Private Function | MatchAPattern | This function tries to locate an element which is an indicator. Note that this indicator SHOULD BE AT THE HEAD of the pattern, otherwise it would have gone to the function LocateIndicator instead. Therefore, we keep on grabbing the next word until either there's no word for us to grab (quit) or we find one of the indicators we are looking for.
ContainInArray (BF.PatternMatch) | Private Function | LocatePatternHead, LocateIndicator | This function is really simple. It loops through all the elements in the array to find a matching string.
LocateIndicator (BF.PatternMatch) | Private Function | MatchAPattern | This function tries to locate an element which is an indicator. Note that this indicator is NOT at the head of the pattern, otherwise it would have gone to LocatePatternHead instead. Because of this, if our pattern is to be satisfied, the next word we grab HAS to be the indicator or else we would have failed. Thus we only grab one word, test to see if it is a valid indicator and then return the result.
InitializeGuessesRecord (BF.PatternMatch) | Private Sub | MatchAPattern | This function reinitializes our temporary test structure. Because we have already transferred the info to the permanent structure, we can reinitialize the temporary arrays so they each have one element.
AddToMeetingRecord (BF.PatternMatch) | Private Sub | MatchAPattern | This function is only called when we know that the information stored in tInCurrGuesses is valid, meaning that it represents legitimate guesses of meeting fields ready to be stored in the permanent record, tInMeetingRecord. We check to make sure that we do not store duplicates and we also want to clean up what we store so that there's no clutter such as punctuation, etc. The reason why we don't clean up until now is to save time. We don't waste resources calling ParseAndCleanPhrase until we know for sure that we are going to add it permanently.
NoDuplicateEntry (BF.PatternMatch) | Private Function | AddToMeetingRecord | This function loops through each element in the array to make sure that the test string aString is not the same as any of the strings already stored in the array. Slightly different from ContainInArray.
SearchAltaVista (BF.Search) | Public Function | GoBackGroundFinder | This function prepares a query to be submitted to the AltaVista search engine. It submits it and then parses the returning result in the appropriate format containing the title, URL and body/summary of each story retrieved. The number of stories retrieved is specified by the constant NUM_AV_STORIES. Important variables include stURLAltaVista, used to store the query to submit, and stResultHTML, used to store the html from the page specified by stURLAltaVista.
ConstructAltaVistaURL (BF.Search) | Private Function | SearchAltaVista | This function constructs the URL string for the AltaVista search engine using the advanced query search mode. It includes the keywords to be used, the language and how we want to rank the search. Depending on whether we want to use the results of our pattern matching unit, we construct our query differently.
ConstructSimpleKeyWord (BF.Search) | Private Function | ConstructAltaVistaURL, ConstructNewsPageURL | This function marches down the list of keywords stored in the stTitleKW or stBodyKW fields of the input meeting record and links them up into one string with each keyword separated by a connector as determined by the input variable stInConnector. Returns this newly constructed string.
ConstructComplexAVKeyWord (BF.Search) | Private Function | ConstructAltaVistaURL | This function constructs the keywords to be sent to the AltaVista site. Unlike ConstructSimpleKeyWord, which simply takes all the keywords from the title to form the query, this function will look at the results of BF's pattern matching process and see if we are able to identify any specific company names or topics for constructing the queries. The query will include the company and topic identified and default to the simple query if we cannot identify either company or topic.
JoinWithConnectors (BF.Search) | Private Function | ConstructComplexAVKeyWord, ConstructComplexNPKeyWord, RefineWithRank | This function simply replaces the spaces between the words within the string with a connector which is specified by the input.
RefineWithDate (NOT CALLED AT THE MOMENT) (BF.Search) | Private Function | ConstructAltaVistaURL | This function constructs the date portion of the AltaVista query and returns this portion of the URL as a string. It makes sure that AltaVista searches for articles within the past PAST_NDAYS.
RefineWithRank (BF.Search) | Private Function | ConstructAltaVistaURL | This function constructs the string that needs to be passed to AltaVista in order to rank an advanced query search. If we are constructing the simple query we will take in all the keywords from the title. For the complex query, we will take in words from company and topic, much the same way we formed the query in ConstructComplexAVKeyWord.
IdentifyBlock (BF.Parse) | Public Function | SearchAltaVista, SearchNewsPage | This function extracts the block within a string marked by the beginning and the ending tags given as inputs, starting at a certain location (iStart). The block retrieved does not include the tags themselves. If the block cannot be identified with the specified delimiters, we return unsuccessful through the parameter iReturnSuccess passed to us by reference. The return type is the block retrieved.
IsOpenURLError (BF.Error) | Public Function | SearchAltaVista, SearchNewsPage | This function determines whether the error encountered is that of a timeout error. It restores the mouse to the default arrow and then returns true if it is a timeout or false otherwise.
SearchNewsPage (BF.Search) | Public Function | GoBackGroundFinder | This function prepares a query to be submitted to the NewsPage search engine. It submits it and then parses the returning result in the appropriate format containing the title, URL and body/summary of each story retrieved. The number of stories retrieved is specified by the constant NUM_NP_STORIES.
ConstructNewsPageURL (BF.Search) | Private Function | SearchNewsPage | This function constructs the URL to send to the NewsPage site. It uses the information contained in the input meeting record to determine what keywords to use. Also, depending on whether we want a simple or complex query, we call different functions to form strings.
ConstructComplexNPKeyWord (BF.Search) | Private Function | ConstructNewsPageURL | This function constructs the keywords to be sent to the NewsPage site. Unlike ConstructKeyWordString, which simply takes all the keywords from the title to form the query, this function will look at the results of BF's pattern matching process and see if we are able to identify any specific company names or topics for constructing the queries. Since NewsPage works best when we have a company name, we'll use only the company name, and only if there is no company will we use the topic.
ConstructOverallResult (BF.Main) | Private Function | GoBackGroundFinder | This function takes in as input an array of strings (stInStories) and a MeetingRecord which stores the information for the current meeting. Each element in the array stores the stories retrieved from each information source. The function simply constructs the appropriate output to send to Munin, including a return message type to let Munin know that it is the BF responding and also the original user_id and meeting title so Munin knows which meeting BF is talking about.
ConnectAndTransferToMunin (BF.Main) | Public Sub | GoBackGroundFinder | This function allows Background Finder to connect to Munin and eventually transport information to Munin. We will be using the UDP protocol instead of the TCP protocol, so we have to set up the remote host and port correctly. We use a global string, gResultOverall, to store the result because, although it is unnecessary with UDP, it is needed with TCP, and if we ever switch back we don't want to change code.
DisconnectFromMuninAndQuit (BF.Main) | Public Sub
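The word-level helpers listed in the table (CreateStopList, DetermineNumWords, GetAWordFromString and EvaluateWord) can be illustrated with a minimal sketch. The Python below is not the code of the preferred embodiment; the function names are lowercase stand-ins for the table entries, and `extract_keywords` is a hypothetical driver added only to show how the pieces fit together. The one detail taken directly from the table is that each stop word is wrapped in commas so that a plain substring test cannot match part of a longer word.

```python
def create_stop_list(words):
    """Build a comma-delimited stop list such as ",the,of,and,"."""
    return "," + ",".join(w.lower() for w in words) + ","


def determine_num_words(text, separator):
    """Count the words in `text`, assuming each is separated by `separator`."""
    if not text:
        return 0
    return len(text.split(separator))


def get_a_word_from_string(text, word_num, separator):
    """Return the 1-based `word_num`-th word, or "" when out of bounds."""
    if word_num <= 0 or word_num > determine_num_words(text, separator):
        return ""
    return text.split(separator)[word_num - 1]


def is_stop_word(word, stop_list):
    """Delimit the word with commas so substrings of other entries do not match."""
    return "," + word.lower() + "," in stop_list


def extract_keywords(text, stop_list, separator=" "):
    """Hypothetical driver: keep every word that is not in the stop list."""
    keywords = []
    for i in range(1, determine_num_words(text, separator) + 1):
        word = get_a_word_from_string(text, i, separator)
        if word and not is_stop_word(word, stop_list):
            keywords.append(word)
    return keywords
```

With a stop list built from "the", "with", "of", "meeting" and "there", the word "he" is not rejected even though it occurs inside "the" and "there", which is the purpose of the comma delimiters.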
FIG. 6 is a flowchart of the actual code utilized to prepare and submit searches to the Alta Vista and Newspage search engines in accordance with a preferred embodiment. Processing commences at function block 610 where a command line is utilized to update a calendar entry with specific calendar information. The message is next posted in accordance with function block 620 and a meeting record is created to store the current meeting information in accordance with function block 630. Then, in function block 640 the query is submitted to the Alta Vista search engine and in function block 650, the query is submitted to the Newspage search engine. When a message is returned from the search engine, it is stored in a results data structure as shown in function block 660 and the information is processed and stored in summary form in a file for use in preparation for the meeting as detailed in function block 670.
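The function table describes IdentifyBlock as the helper that both search routines use to pull a tagged block out of returned HTML. A minimal Python sketch of that idea follows; it is an illustration under stated assumptions, not the embodiment's code. The original reports failure through a by-reference parameter (iReturnSuccess), whereas this sketch returns a `(block, success)` tuple, and the sample HTML in the usage note is invented for the example.

```python
def identify_block(text, begin_tag, end_tag, start=0):
    """Return (block, success): the text between the two tags, tags excluded.

    The search begins at index `start`. If either tag cannot be found,
    the result is ("", False).
    """
    begin = text.find(begin_tag, start)
    if begin == -1:
        return "", False
    block_start = begin + len(begin_tag)
    end = text.find(end_tag, block_start)
    if end == -1:
        return "", False
    return text[block_start:end], True
```

For a hypothetical result line such as `<dt><a href="http://example.com/a">Story A</a></dt>`, calling `identify_block(html, '<a href="', '"')` yields the URL and `identify_block(html, '">', '</a>')` yields the story title.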
FIG. 7 provides more detail on creating the query in accordance with a preferred embodiment. Processing commences at function block 710 where the meeting record is parsed to obtain potential companies, people, topics, location and a time. Then, in function block 720, at least one topic is identified; in function block 730, at least one company name is identified; and finally, in function block 740, a decision is made on what material to transmit to the file for ultimate consumption by the user.
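The parsing step that obtains people and companies relies on the patterns described in the function table, for example $PEOPLE$ of $COMPANY$ with "from" as a substitute for "of". The following Python sketch shows one way such a pattern could be represented and matched. It is a simplification under stated assumptions: a pattern is a list whose elements are either a placeholder string or a set of interchangeable indicator words, only two placeholders are supported, punctuation is ignored, and the head of the pattern is located by sliding the start position over the text rather than by the head/non-head control flow of MatchAPattern. All names are hypothetical.

```python
def is_capitalized(word):
    return word[:1].isupper()


def bind_people(words, pos):
    """Bind a full name, taken here to be two consecutive capitalized words."""
    if (pos + 1 < len(words)
            and is_capitalized(words[pos])
            and is_capitalized(words[pos + 1])):
        return " ".join(words[pos:pos + 2]), pos + 2
    return None, pos


def bind_capitalized_run(words, pos):
    """Bind a continuous run of capitalized words."""
    end = pos
    while end < len(words) and is_capitalized(words[end]):
        end += 1
    if end == pos:
        return None, pos
    return " ".join(words[pos:end]), end


BINDERS = {"$PEOPLE$": bind_people, "$COMPANY$": bind_capitalized_run}


def match_from(pattern, words, pos):
    """Match every element of `pattern` starting exactly at word index `pos`."""
    bindings = {}
    for element in pattern:
        if isinstance(element, str):      # placeholder: try to bind a value
            value, pos = BINDERS[element](words, pos)
            if value is None:
                return None
            bindings[element] = value
        else:                             # indicator set, including substitutes
            if pos >= len(words) or words[pos].lower() not in element:
                return None
            pos += 1
    return bindings


def match_a_pattern(pattern, text):
    """Slide the pattern head over the text and return the first set of bindings."""
    words = text.split()
    for start in range(len(words)):
        bindings = match_from(pattern, words, start)
        if bindings is not None:
            return bindings
    return None


# "$PEOPLE$ of $COMPANY$", with "from" accepted as a substitute for "of".
PEOPLE_OF_COMPANY = ["$PEOPLE$", frozenset({"of", "from"}), "$COMPANY$"]
```

Storing the substitutes in the same set as the original indicator means a single pattern record covers both "of" and "from", which is one way to express the substitution need noted for CreatePatterns.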
FIG. 8 is a variation on the query theme presented in FIG. 7. A meeting record is parsed in function block 800, a company is identified in function block 820, a topic is identified in function block 830 and finally, in function block 840, the topic and/or the company is utilized in formulating the query.
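The choice between a company/topic query and a plain keyword query can be sketched as follows. This Python is illustrative only: the `MeetingRecord` fields, the lowercase function names and the " AND " connector are assumptions made for the example and do not reproduce the record layout or the query syntax of either search engine. The fallback rule follows the table's description of ConstructComplexAVKeyWord, read here as "use the simple query when neither a company nor a topic was identified".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MeetingRecord:
    """Hypothetical, reduced stand-in for the meeting record structure."""
    title_keywords: List[str] = field(default_factory=list)
    companies: List[str] = field(default_factory=list)
    topics: List[str] = field(default_factory=list)


def join_with_connectors(phrase, connector):
    """Replace the spaces between the words of `phrase` with `connector`."""
    return connector.join(phrase.split())


def construct_simple_keywords(record, connector):
    """Link all title keywords into one string."""
    return connector.join(record.title_keywords)


def construct_complex_keywords(record, connector):
    """Use identified companies and topics; fall back to the simple query."""
    parts = record.companies + record.topics
    if not parts:
        return construct_simple_keywords(record, connector)
    return connector.join(join_with_connectors(p, connector) for p in parts)
```

For a record with company "Acme Widgets" and topic "merger", the sketch produces "Acme AND Widgets AND merger"; with no company or topic it produces the title keywords joined by the same connector.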
Alternative embodiments for adding various specific features for specific user requirements are discussed below.
Enhance Target Rate for Pattern Matching
To increase BF's performance, more patterns and pattern groups are added to the procedure "CreatePatterns." The existing code for declaring patterns can be used as a template for future patterns. Because everything is stored as dynamic arrays, it is convenient to reuse code by cutting and pasting. The functions BindNames, BindTime and BindCompanyTopicLoc, which are responsible for associating a value with a placeholder, can also be enhanced. The enhancement is realized by broadening the set of criteria for binding a certain meeting field in order to increase the number of binding values. For example, BindTime currently accepts and binds all values in the form of ##:## or #:##. To increase the times we can bind, we may want BindTime to also accept the numbers 1 to 12 followed by the more aesthetic time terminology "o'clock." Further enhancements include vocabulary-based recognition algorithms and assigning an accuracy rate to each guess BF makes, so that only guesses which meet a certain threshold are treated as valid.
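The two enhancements mentioned for BindTime, accepting "o'clock" times and filtering guesses by an accuracy threshold, can be sketched together. The Python below is an illustration under stated assumptions: the confidence values 0.9 and 0.8 are invented for the example, `accept_guesses` is a hypothetical helper, and trailing punctuation is stripped in a simplified way.

```python
import re

# Matches the currently accepted forms ##:## and #:##.
CLOCK_RE = re.compile(r"^\d{1,2}:\d{2}$")


def bind_time(words, pos):
    """Return (time_text, confidence) for the word at `pos`, or (None, 0.0).

    The confidence values are illustrative assumptions, not measured rates.
    """
    word = words[pos].strip(".,;")
    if CLOCK_RE.match(word):
        return word, 0.9
    following = words[pos + 1].strip(".,;").lower() if pos + 1 < len(words) else ""
    if word.isdecimal() and 1 <= int(word) <= 12 and following == "o'clock":
        return word + " o'clock", 0.8
    return None, 0.0


def accept_guesses(guesses, threshold):
    """Keep only the guessed values whose confidence meets the threshold."""
    return [value for value, confidence in guesses if confidence >= threshold]
```

Applied to the words of "Review at 3 o'clock", the sketch binds "3 o'clock" at index 2, while "13 o'clock" is rejected because 13 falls outside the range 1 to 12.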
Depending on what location the system identifies through pattern matching or alternatively depending on what location the user indicates as the meeting place, a system in accordance with a preferred embodiment sugges |