Collaborative agent interaction control and synchronization system
7007235
Abstract
A system for supporting a coordinated distributed design process and which allows individuals to hold meetings over the internet and work together in a coordinated fashion on shared design problems is described.
Claims
What is claimed is:
Description
GOVERNMENT RIGHTS
Referring now to FIG. 9 and again to FIG. 3, a model of a design of a tool to support distributed collaborative meetings is shown. The tool provides an environment for structured information exchange across the internet in real-time. The system includes synchronous communication support, coordinated interaction support, system modularity and extensibility with a variety of media and tools, robust and reliable communication, and a multi-user interface for collaboration as described further hereinafter. Information exchanged in a shared environment requires a variety of media. Typical exchanges between members of a group involve speech, gestures, documents and sketches. Such interactions need to occur in real time for an on-line meeting. In an effort to enhance group design and collaboration processes in a distributed environment, the following three objectives have been included in the conferencing system. First, the relaxation of time and space constraints in traditional meeting settings. Secondly, the facilitation of distributed negotiation through the formalization of meeting control methodologies and the application of intelligent agent mechanisms to select the appropriate methodology. Thirdly, the capture of process and rationale that generated a product design through the documentation of meetings. To achieve these objectives a distributed negotiation meeting process is shown to include four critical components: co-location, cooperation, coordination, and documentation. This model maps the necessary physical meeting elements into a general requirement list. These requirements are based on the fact that physical meetings require four key components in order to exist. First, a physical meeting room in which the participant can meet (co-location). Secondly, a common language and a shared understanding of materials to be presented in the meeting (cooperation). 
Thirdly, an agenda and an individual or set of individuals that ensures the agenda is maintained and the group is focused on resolving the issues outlined in the agenda (coordination). Finally, a Group memory which is comprised of each individual's memory and notes as well as the formally defined group memory incorporated in the minutes of the meeting (documentation). In FIG. 9, the interaction and relationship of the layers of a computer-based conferencing system required for an effective collaborative environment is shown. Co-location involves dealing with the network infrastructure to provide seamless communication among distributed clients in a conference. This layer should provide naming services to identify client locations as well as interaction with the network protocols to transmit data across the network between the clients. Cooperation involves the sharing of information among clients in a team. Due to differences in software and capabilities of the various clients, translations need to be performed in order to provide a coherent view of the data among the clients. Coordination involves control of the work flow and communication process. This allows for efficient control mechanisms to coordinate group effort. The coordination layer acts as a "virtual manager" of the conferring clients. Documentation involves the capture and storage of conversation elements exchanged during a meeting. The documentation process provides a mechanism for the retention of group memory. The internet is a collection of interconnected nodes (machines) that interact via a common protocol that is TCP/IP. Due to the nature of the protocol as well as the packet transmission and routing mechanisms prevalent on the internet, the internet is a non-deterministic network. Hence, inter-packet arrival time is unpredictable due to varying network traffic. In a real time application, i.e. 
an application with pre-specified time dependence, such random delay patterns can render the application ineffective. Ensuring real time communication via the internet requires a series of delay compensation techniques. These heuristics reduce the amount of variability in the underlying network as well as provide the end user with near real time performance. Synchronization of the various media inherent in a multimedia conference requires real time scheduling support by the conference support tools. Most current operating systems do not provide adequate support for real time scheduling. Real time system theory addresses the scheduling of multiple independent channels or streams of data. As shown in FIG. 10, these channels may have unpredictable arrival rates. Real time scheduling assures that all media channels are communicated within a given time period or frame, eliminating potential "lip-sync" effects. Due to the possibility of losing packets or delays in packet transmission by the medium, a queuing mechanism is provided to enforce the real time constraints.
Referring now to FIG. 11, a control infrastructure model is shown to control the various portions of the conferencing system in order to support both individual interaction and process control in group interactions. A forum server 140 acts as the communication control mechanism for the conferencing system. The forum server's primary function is the allocation of communication channels among individuals in the group, here client A 136a, client B 136b and client C 136c. Communication among these clients is not channeled through this server but is rather controlled by the forum process. Forum processes are initiated by a forum manager tool that allows the definition of meeting membership, meeting control strategies, meeting agenda and meeting notification.
The meeting may be defined as open (i.e., any person can enter the meeting room) or closed, in which case all participants in the meeting must be predefined in the agenda tool 144. Each meeting member is also assigned particular access rights that include: agenda editing, chairperson control, and control of the meeting proceedings as shown and represented by member box 142. The agenda tool 144 is also used to define the meeting agenda items which are each assigned a floor control strategy by the meeting initiator. Once the agenda is complete, the system automatically sends notification messages to the participants and a forum server process is created with the appropriate membership and agenda.
The forum class 140 processes messages from the client systems which represent each participant in the meeting. The forum class 140 is also responsible for maintaining meeting membership and temporal control of the meeting. This includes meeting notification, agenda traversal, and maintaining and traversing meeting logs. Communication requests received by the forum class 140 from the clients 136a, 136b, 136c are handled by one of the subclasses of the control class 146. The control class 146 includes functions to manipulate tokens and manipulate the speaker queue 148. The speaker queue 148 includes all members that are interested in acquiring the floor. The ordering of the queue 148 is based on the particular control strategy used including chairperson 150a, brainstorming 150b, lecture 150c and dynamic interaction 150d. For example, the chairperson 150a strategy would allow explicit ordering of the queue by the chairperson. The brainstorming 150b strategy would simply be a first-in first-out (FIFO) queue. Ordering of the queue 148 can also be based on more complex inputs provided by the user interface mechanisms described hereinafter. The token control mechanism and the forum processes are described in further detail hereinafter.
The conferencing system also provides documentation.
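Returning briefly to the speaker queue 148 described above, the following is a minimal sketch of how the brainstorming (FIFO) and chairperson (explicit ordering) strategies might differ. The class name `SpeakerQueue`, its methods, and the strategy strings are hypothetical and are not taken from the described system; the lecture and dynamic interaction strategies are not modeled.

```python
from collections import deque


class SpeakerQueue:
    """Hypothetical model of the speaker queue 148 (illustrative only)."""

    def __init__(self, strategy="brainstorming"):
        self.strategy = strategy      # "brainstorming" or "chairperson" (assumed labels)
        self.pending = deque()        # members waiting to acquire the floor

    def request_floor(self, member):
        # Every strategy starts by appending the request in arrival order.
        if member not in self.pending:
            self.pending.append(member)

    def reorder(self, chairperson_order):
        # Only the chairperson strategy allows explicit ordering of the queue.
        if self.strategy != "chairperson":
            raise PermissionError("explicit ordering requires the chairperson strategy")
        chosen = [m for m in chairperson_order if m in self.pending]
        rest = [m for m in self.pending if m not in chosen]
        self.pending = deque(chosen + rest)

    def next_speaker(self):
        # Brainstorming is plain FIFO; chairperson uses whatever order was set.
        return self.pending.popleft() if self.pending else None


fifo = SpeakerQueue("brainstorming")
for name in ("A", "B", "C"):
    fifo.request_floor(name)

chaired = SpeakerQueue("chairperson")
for name in ("A", "B", "C"):
    chaired.request_floor(name)
chaired.reorder(["C"])
```

In this sketch the strategy only changes who may reorder the queue; a fuller model would also grant and revoke tokens when a speaker is dequeued.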
Documentation of meeting proceedings serves two purposes. First, it provides a convenient snapshot of negotiation proceedings for late participants or for follow-up meetings, and second, it retains group memory by saving design rationale knowledge encoded in the speech exchange during a negotiation. Two key mechanisms have been designed to support negotiation documentation requirements: conversation indexing mechanisms and conversation browsing tools. For the benefit of information retrieval and the retention of a meeting context for time-delayed participants, a conversation model has been developed. The model is based on the four critical dimensions of design negotiation conversations: Problem space/Product-Process model, Information flow, Authority and role of participant, and Absolute time and time dependence among conversation elements. The conversation model provides a basis for indexing the free-form conversation occurring in a typical meeting event. The model is further refined by incorporating a semi-structured design rationale model that includes designer intent, recommendations and justifications. This model supports both structured and unstructured design rationale data.
In the conferencing system, each conversation element in a negotiation is represented as an intent (shown as a rectangle), a recommendation (shown as a rounded rectangle) or a justification (shown as an elongated hexagon). Each of these boxes contains several words that describe the conversation clip. Clicking on the box will cause a multimedia presentation of the issue to appear on the system console. A combination of an index and a causal/temporal model of the interaction is used to generate a directed graph of the proceedings of the meeting. Such a graph forms the core of the user interface and allows quick visualization of the meeting proceedings. Users can then browse the conversation data based on a single graph or on the intersection of several graphs.
For example, the user may wish to look at all conversations regarding the functional specification phase (according to the process model) of a specific joint (from the product model) generated by the mechanical engineer (from the designer index). The scope of the graph can also be limited in accordance with user preferences derived by an intelligent user interface agent currently being developed.
It should be appreciated that social interaction and meeting control strategies must meet user approval to provide a sound basis for a communication tool to be effective. At a minimum, multiple media channels are required since group communication is generally comprised of audio, textual, and visual data. Multimedia channel synchronization is essential due to random delays inherent in the underlying network. A conference control mechanism is required to provide efficient group interaction. The system must be adaptable to different conference styles (from informal, unstructured conversation to a stringent and formal conversation control mechanism). The system must be able to support groups in the various stages of formation, i.e. it must allow hierarchically structured groups that are easily expandable. The system must also be able to retain group memory to build corporate experience as specified by the adjourning phase in the group life cycle.
As shown in FIG. 3, the conferencing system includes several interlinked modules and servers. Each participant engaged in a collaboration conference spawns a Collaboration Manager (shown as a dashed box) which is comprised of media drivers (shown as pictograms of the media, i.e. video camera, microphone and X display) and message servers (indicated by the acronym 'MSG'). The message servers package data for transmission over the network and enforce synchronization constraints during media play-back. Forum servers are processes that maintain control of a conference among several individuals and enforce membership constraints.
Furthermore, forum servers log all conference proceedings. Forum servers are spawned by forum managers (not shown) that define a specific control methodology. Forum managers also provide mechanisms for converting a forum server's control strategy. Finally, the name server maintains a directory of all participants, forum managers and forum servers within the conferencing system. It allows each participant to easily address any other member or forum in the conferencing system.
The Collaboration Manager incorporates the user interface and maintains lists of available media resources and forum servers as shown in FIG. 14. The Collaboration Manager also has a snapshot facility that allows each participant to retain portions of the meeting for his/her own personal notes. It also enforces conference controls associated with the forums in which the user is participating. For example, a specific forum may not allow any conversations with users outside of the forum or may not permit any private side conversations with other members of the forum.
Media drivers handle all I/O interaction between the multimedia conferencing system and the underlying media channel. Each driver is tailored specifically to the underlying media represented. Each driver is responsible for data acquisition and frame compilation for transmission and replay. This module must also provide the multimedia server with synchronization information, frame size, and delay and error tolerances. Several media drivers have been implemented that enable distributed schedule coordination, shared whiteboards for sketching, a text tool for chatting, and audio and video drivers using Microsoft NetMeeting technology.
The conferencing system is designed to support multiple media channels in a conversation. As described hereinabove in connection with FIG. 10, due to delays in the transmission of the packets across the internet, packet arrival times are unpredictable.
Therefore, each multimedia frame does not arrive at the destination as one chunk. The receiver must then reassemble the frame and ensure that play-back of the frame is synchronized such that it reflects the initial input from the source.
Referring now to FIG. 12, an overview of the media channel synchronization subsystem is illustrated. Media synchronization is based on the synchronization parameters supplied by each media driver. Each media driver also supplies temporal relations with respect to the other media drivers in the system. Given these parameters, the system can compensate for skews in the delivery time of messages. A synchronization engine combines the synchronization heuristics and parameters to play back the multimedia data to the receiver in a form as similar as possible to the original data. Each channel in a multimedia conference must include a parameter that describes its temporal relation to each of the other channels. As each media device registers with the message server system 158, it is checked for consistency with the preferred parameters. Once all task computation times and deadlines are determined, the scheduler operates on an earliest deadline first policy. That is, within a given time unit, the highest priority tasks to be scheduled are those that have the earliest deadlines.
Multimedia frames transmitted by a source participant are encoded with a frame sequence number and a time stamp. Furthermore, the initial and final frames in a conversation are uniquely tagged to aid the synchronization and scheduling mechanism. Temporal constraints are encoded with respect to a single frame. Each frame is composed of multiple channels of media data for a given period of time. In order to ensure the arrival of all packets in a single frame, a delay in play-back at the destination is introduced. The conferencing system 30 enforces a delay of 0.5 seconds although this may be varied as the network infrastructure changes.
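The earliest deadline first policy and the 0.5 second play-back delay described above can be illustrated with a short sketch. It assumes, purely for illustration, that each replay task's deadline is the source time stamp plus the play-back delay; the names `ReplayTask`, `schedule`, and `PLAYBACK_DELAY` are hypothetical and do not come from the described system.

```python
import heapq
from dataclasses import dataclass, field

PLAYBACK_DELAY = 0.5  # seconds; the delay stated in the text


@dataclass(order=True)
class ReplayTask:
    deadline: float                       # only field used for ordering
    channel: str = field(compare=False)
    seq: int = field(compare=False)


def schedule(frames):
    """Order replay tasks earliest deadline first.

    frames: iterable of (channel, sequence_number, source_timestamp).
    Assumption: deadline = source_timestamp + PLAYBACK_DELAY.
    """
    heap = [ReplayTask(ts + PLAYBACK_DELAY, ch, seq) for ch, seq, ts in frames]
    heapq.heapify(heap)
    # Popping from a min-heap yields the task with the earliest deadline first.
    return [heapq.heappop(heap) for _ in range(len(heap))]


tasks = schedule([("video", 2, 0.10), ("audio", 1, 0.00), ("text", 3, 0.05)])
```

A real scheduler would also account for task computation times, which this sketch leaves out.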
The synchronization engine enforces three types of temporal constraints: before, after, and during. All three constraints are determined on the transmission side and the appropriate frame sequence numbers are chosen for each channel to reflect the constraint. For example, if text was required to appear after audio, and audio was sampled in frames i to i+10, then the text sequence number would be i+11. All packets arriving on the receiving end are placed in input buffer queues by the media drivers. The queues store up to fmax frames (fmax=100 in the conferencing system prototype). Incoming data is placed in the queue according to the frame sequence number. The queue index is equal to the frame sequence number modulo fmax. Each media channel has its own queue structure, e.g. audio queues have 10 audio clips per queue element, text queues have 1 string per queue element. The queue structure is a list of lists. The top-level list is indexed by sequence. For each sequence index in the top-level list a secondary list contains the media packets indexed by source (this allows a receiver to listen to data from multiple sources). Hence, a single sequence index can be associated with multiple elements. Each element in the queue is also tagged with a time-stamp and a source specification.
Referring now also to FIG. 13, a scheduler 168 operates on the basis of frames. The scheduler 168 is invoked periodically based on the frame time-out period. The frame time-out period is arbitrarily set at 1/4 second. Each frame contains several packets on each media channel. At each interval the scheduler 168 polls each queue and retrieves a list of complete frames. If a complete frame exists and it has the smallest sequence number, the frame is scheduled for replay. However, if the frame with the smallest sequence number is incomplete, the scheduler employs an applicable delay compensation heuristic.
If none of the heuristics is applicable, the user is notified that the communication channel cannot support the quality of service requested, and decreases in the quality thresholds are suggested. There are two exceptions to the behavior of the scheduler 168. There are two special frame identifiers, initial and final. The scheduler 168 should not replay a frame unless three frames are available for replay, unless the final frame is among the last three frames. This buffering of frames ensures that data will usually be available for replay at the frame time-out. The synchronizer then takes a single frame and passes it on to the real time scheduler. The scheduler 168 then posts the appropriate events to replay the frame. The events are posted based on an earliest deadline first policy. The scheduler is implemented on top of the X event handler.
The conferencing system 30 maintains the input and output buffers and ensures that all channels are assembled before play-back by the media drivers. In FIG. 13, the essential functionality of the synchronization engine as well as its relation to the input queue and multimedia frame output queue is shown. A multimedia frame output queue stores multiple frames prior to transmission on the Ethernet. This is required to allow for complete channel frame transmission. An input media channel queue provides storage of incoming data packets on each media channel. These are stored until they can be compiled into a multimedia frame. A connection manager takes care of low level calls to the TCP/IP layer for maintaining socket connections and sending datagrams across the internet. A correspondence cache maintains a cache of addresses associated with all participants the user will broadcast to given that he/she is a member of a specific forum. Update requests are periodically transmitted to maintain cache coherence between the forum server and multimedia message server. All messages are TCP/IP datagrams and are asynchronous.
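The input queue indexing (frame sequence number modulo fmax, with a secondary collection keyed by source) and the three-frame buffering rule described above might be sketched as follows. This is one possible reading, not the described implementation: `ChannelQueue` and `ready_to_replay` are hypothetical names, a dictionary per slot stands in for the secondary list, and the buffering rule is simplified to "replay once three complete frames are buffered, or once the final frame is buffered".

```python
FMAX = 100  # fmax=100 in the prototype, per the text


class ChannelQueue:
    """Hypothetical input queue for one media channel."""

    def __init__(self, fmax=FMAX):
        self.fmax = fmax
        # Top level indexed by (sequence number mod fmax); each slot maps
        # source -> (sequence number, time stamp, payload).
        self.slots = [dict() for _ in range(fmax)]

    def put(self, seq, source, timestamp, payload):
        self.slots[seq % self.fmax][source] = (seq, timestamp, payload)

    def get(self, seq):
        # Ignore stale entries left in the slot by an older sequence number.
        slot = self.slots[seq % self.fmax]
        return {src: entry for src, entry in slot.items() if entry[0] == seq}


def ready_to_replay(complete_seqs, final_seq=None, min_buffered=3):
    """Return the sequence number to replay next, or None to keep buffering."""
    buffered = sorted(complete_seqs)
    if not buffered:
        return None
    final_buffered = final_seq is not None and final_seq in buffered
    if len(buffered) >= min_buffered or final_buffered:
        return buffered[0]  # smallest sequence number is replayed first
    return None


queue = ChannelQueue()
queue.put(101, "clientA", 0.00, b"audio-A")
queue.put(101, "clientB", 0.01, b"audio-B")
```

Keying the slot by source is what lets a receiver hold data from several senders under the same sequence index.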
Each component of the system has an interrupt handler that manages incoming and outgoing messages and routes the messages to the appropriate objects.
Referring to FIG. 11, forum managers contain information on a particular type of meeting. They spawn off instances of forums that comply with the forum manager control mechanisms but with varying memberships. Currently, four such forum managers have been included in the conferencing system; however, the system is extensible and additional forum managers need only comply with a predefined message protocol. Control 146 manages the various control modes and the underlying primitive control structures that are required. Among the necessary provisions are membership request processing, membership grant, token request, token grant, and participant privilege explication. These parameters allow a forum manager to specify membership constraints as well as floor controls for a conference.
A forum is a structured group of participants involved in a collaborative effort. The forum server maintains a list of all participants in a specified forum as well as the privileges associated with each participant. Each forum member is listed in one of three states in the forum: active (logged in and listening to conference), speaking (actively participating in conferencing, i.e. has control over the floor), or non-active (not logged in and not receiving any conference communications). Forum servers have two key functions: subscription control and speaker control. Subscription control may be a predefined list of allowable conference participants or it could be through a vote by existing participants or it may be a forum maintainer with the right to revoke and grant membership to potential members. Speaker control is the process by which a forum server maintains an orderly conversation among the members of the forum.
Speaker control or floor control of the forum is achieved through the granting and revoking of conversation tokens as described herein. All restrictive controls on the participants in a forum are provided via token access. A Collaboration Manager (not shown) cannot issue any communications without having received a token granting access privilege to that specific speaker. Token controllers on both the Collaboration Managers and Forum Servers must be secure and trusted code. Forum Servers issue two commands related to tokens: a Grant Token command (specifying write or read rights to a communication channel with another participant) and a Retrieve Token command (retracting read or write rights specified by a Grant Token). The Collaboration Manager responds with an Accept Token or Reject Token message depending on conflicts with other active forums on that user's workstation (e.g. engagement in another forum that does not permit multiple parallel forums). Tokens have internal time-out counts after which tokens expire. Specialized tokens denote ability to participate in side conversations, external conversations, and interjection rights. These side and external conversation tokens can be used to maintain confidentiality within a conference and to minimize group distractions. Interjection tokens allow for emergency situations. Tokens are granted upon request submitted to the Forum Server by the Collaboration Manager. Such tokens can be granted automatically using a predetermined computer moderation scheme or can be granted manually by a moderator. Furthermore, conference logging is achieved via a specialized token requesting communication sent to the Forum server where all interactions are logged for future browsing and editing. This mechanism provides a process history support. The token structure provides a centralized control yet distributed communication structure for conferencing. 
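The Grant Token and Retrieve Token commands and the token time-out described above can be sketched roughly as follows. The class and method names (`Token`, `ForumTokenController`, `grant`, `retrieve`, `may_send`) and the 60 second default time-out are hypothetical; the text does not specify the token format, the time-out value, or the Accept/Reject handshake details, which are omitted here.

```python
from dataclasses import dataclass


@dataclass
class Token:
    holder: str        # participant receiving the right
    peer: str          # participant on the other end of the channel
    right: str         # "read" or "write"
    expires_at: float  # tokens expire after an internal time-out

    def is_valid(self, now):
        return now < self.expires_at


class ForumTokenController:
    """Hypothetical forum-server side of the token protocol."""

    def __init__(self, timeout_s=60.0):   # time-out value is an assumption
        self.timeout_s = timeout_s
        self.tokens = {}

    def grant(self, holder, peer, right, now):
        # Grant Token: give `holder` read or write rights toward `peer`.
        token = Token(holder, peer, right, now + self.timeout_s)
        self.tokens[(holder, peer, right)] = token
        return token

    def retrieve(self, holder, peer, right):
        # Retrieve Token: retract rights previously granted.
        return self.tokens.pop((holder, peer, right), None) is not None

    def may_send(self, holder, peer, now):
        # No communication is allowed without a valid write token.
        token = self.tokens.get((holder, peer, "write"))
        return token is not None and token.is_valid(now)


controller = ForumTokenController(timeout_s=10.0)
controller.grant("ann", "bob", "write", now=0.0)
```

Passing `now` explicitly keeps the sketch deterministic; a running system would read a clock instead.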
Hence, all high bandwidth communication is decentralized and direct, while all floor control requests are centralized by the forum server. Structuring and control of group meetings enhances the efficiency of a collaborative team. Forums maintain a conference among individuals. Each forum is associated with a forum moderator that defines the control behavior of the conference. The forum server processes requests for membership to the forum as well as requests to speak by participants within the forum. As shown in FIG. 2, a forum is comprised of individuals and other forums. The forum Management that is a member of another forum Project must be at least as restrictive as the forum Project. Any restrictions on membership and communication must be upheld by the child forum, Management. During a meeting or conversation a particular participant can be in one of three states: active (i.e. speaking or demonstrating), pending (i.e. awaiting his/her turn to speak), or inactive (i.e. passive observer or listener). Each participant's state is relative to a specific forum and is stored in the forum server. The Speaker Request is equivalent to a professional raising his/her hand in a meeting situation. It indicates to the forum moderator and to the other members of the forum the participant's intent to speak. A speaker request is accompanied by a qualification of the speech act the speaker intends to perform. The forum server would then place the participant on a list of pending speakers depending on his/her qualifications. In a democratic forum a participant becomes active if a majority agrees to his/her request to speak. Furthermore, the computer can automatically moderate (i.e. choose the active speakers from the pending queue) a forum based on pre-compiled speaker qualification data. Interjection is a mode of conversation in which the participant can interrupt an ongoing conversation for a limited amount of time. Group meetings can take on multiple characters and structures. 
Group formation and meeting cycles require various group control procedures and paradigms. A list of primitive controls on each forum from which a more complex collaboration control mechanism may be devised shall now be discussed. The forum creator may choose to over-ride any of these primitives for a particular forum member.
A Chairperson is a designation of a participant or group of participants who hold a privileged status within the forum. They may preempt speakers and arbitrarily choose active speakers. They can also control the interjection duration, a parameter specified for a forum that sets the length of time allowed for interjections. An interjection time of zero indicates no interjections are allowed. Conversely, an infinite interjection time allows for completely unstructured free-form conversation.
Maximum speech duration is a parameter specified for a forum that sets the length of time allocated to a single member to hold the floor of the conference.
Maximum number of active speakers is a parameter that indicates the number of concurrent speakers allowable during the conference.
Side Conversations are two-way or multi-way conversations among a subset of the forum members. Forums may be created that do not allow such side conversations to exist.
External conversations are conversations between a member of a forum and other non-members while a forum is active. This form of conversation may also be restricted by the forum.
Currently the conferencing system only provides either continuous logging or no logging at all of the ongoing conference.
Speaker evaluation is a voting mechanism that has been implemented to evaluate participant acceptance of a specific topic or to determine participant value to a conference. The results of this evaluation may be used to determine the order of speaker priority for a conference.
Speaker ordering is the ordering of the pending speaker queue which may be on a first come first serve basis or other evaluation criteria.
These include: ordering of speakers based on value determined by the participants, as described in speaker evaluation; or ordering based on chairperson choice in a chairperson controlled conference. This control mechanism provides for free form and structured conferencing.
The collaboration primitives discussed above are combined to form a collaboration scheme or mechanism. The conferencing system can easily be extended to provide many different collaboration schemes. For example:
Free: all participants may talk at any time. The conference is completely uncontrolled and all speakers may speak at once. That is, Chairperson='none', Side Conversation=ALL, External Conversation=ALL, Speaker Ordering='first-come first-serve'.
Democracy: the choice of the active speaker is based on a vote by all other participants. That is, Chairperson='none', Side Conversation=ALL/NONE, External Conversation=ALL/NONE, Speaker Ordering='highest vote'.
Chalk-Passing: the last active speaker chooses the next person to be a designated active speaker. Each speaker may only speak for the time allotted by the Maximum Speech Duration parameter specified above. In this mode: Chairperson='last speaker', side conversation=ALL/NONE, external conversation=ALL/NONE, Speaker Ordering='chosen by chairperson'.
Chairperson Control: a specific privileged participant (Mr. X) has the ability to choose the participant who should address the conference at any specific time. In this mode: Chairperson='Mr. X', side conversation=ALL/NONE, external conversation=ALL/NONE, Speaker Ordering='chosen by chairperson'.
Modified Delphi: the system polls all participants in the collaboration on their views regarding a specific problem. The results are compiled and presented to the conferring experts and the participants are then re-polled. This process is repeated by the questioner until the experts provide a consistent analysis. The Delphi method is used extensively in polling experts on directions in hi-tech industry.
In this control strategy there exists a moderator as well as a questioner. A quicker, more dynamic method using our collaboration methods is proposed. In this mode: Chairperson='moderator/questioner', side conversation=ALL/NONE, external conversation=ALL/NONE, Speaker Ordering='round robin'.
As shown in FIG. 3, the name server 32 is an independent server that acts as a global directory for the collaboration conference system 30. The following information is stored in the name server for each participant and each forum and may be queried by any participant or forum server.
(i) The Participant Name and Location, including media driver locations and media descriptors.
(ii) Participant Status, recording whether each participant is in an active or non-active state. Active denotes that the user is logged into the conference system 30 via a Collaboration Manager on his/her workstation. Non-active status is given to users who are subscribers to the conference system 30 but are not reachable.
(iii) Forum Manager Name and Location, including a brief description of control style.
(iv) Forum Name and Location, including a listing of shared media drivers.
(v) Forum Status, recording whether each forum is in an active or non-active state. Active forums imply a conversation is occurring among the participants of the forum. Non-active forums are structured meeting skeletons with membership lists for a meeting that is not currently in session.
The collaboration control manager of the conferencing system 30 includes several interacting servers and modules. A brief description of the operations of these modules/servers will now be discussed.
Forum Creation: a forum is initiated by invoking a forum manager. The forum manager tool can be invoked by executing the appropriate forum manager program or by choosing the New Forum command from the control panel menu 180 as shown in FIG. 14.
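The directory entries (i) to (v) held by the name server 32, as listed above, could be modeled along the following lines. This is only an illustrative sketch: the record and method names, the `host:port` address strings, and the in-memory dictionaries are assumptions, since the text does not describe the name server's storage or query interface.

```python
from dataclasses import dataclass, field


@dataclass
class ParticipantRecord:
    name: str
    location: str                                       # address format is assumed
    media_drivers: dict = field(default_factory=dict)   # driver name -> location
    active: bool = False                                # active / non-active status


@dataclass
class ForumRecord:
    name: str
    location: str
    shared_media: list = field(default_factory=list)    # shared media drivers
    active: bool = False                                # in session or skeleton only


class NameServer:
    """Hypothetical in-memory global directory."""

    def __init__(self):
        self.participants = {}
        self.forum_managers = {}   # name -> (location, control style description)
        self.forums = {}

    def register_participant(self, record):
        self.participants[record.name] = record

    def register_forum_manager(self, name, location, control_style):
        self.forum_managers[name] = (location, control_style)

    def register_forum(self, record):
        self.forums[record.name] = record

    def lookup(self, name):
        # Participants and forums share one namespace in this sketch.
        return self.participants.get(name) or self.forums.get(name)

    def list_forums(self, active_only=False):
        return sorted(n for n, f in self.forums.items() if f.active or not active_only)


ns = NameServer()
ns.register_participant(ParticipantRecord("ann", "hostA:7000", {"audio": "hostA:7010"}, True))
ns.register_forum_manager("chairperson", "hostM:6000", "chairperson-controlled meeting")
ns.register_forum(ForumRecord("design-review", "hostB:7001", ["whiteboard"]))
```

A Collaboration Manager starting up would, in this model, call `list_forums()` to populate its forum list box.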
A series of dialog boxes and menus then guide the forum initiator through the creation process. As described earlier, FIG. 6 shows the forum manager user interface. The forum creation process involves specifying the group primitives described earlier as well as specifying the members of the forum and their associated privileges. The specified parameters are then stored in a forum specification file that is used by the forum server when instantiated. Forum managers can also be used to transfer an existing forum from one control mode to another. The forum manager loads the forum specification file from the existing forum and prompts the user for any additional information required by the new forum control mode. Forum Servers are initiated by forum managers. As described earlier, forum managers extract the necessary parameters for forum instantiation from the forum creator. The forum manager stores all parameters in a file according to a preferred format. The forum server is then started as an independent process. Upon startup the server reads the parameter file and initializes all internal objects accordingly. The server then registers itself with the name server. It is then ready to accept any login or membership requests from users of the conferencing system 30. The forum server maintains a membership list that includes an identification of each member's state. A forum member can be in any of the following four states: Member, Logged In, Waiting to Speak or Speaking. A Member is a user that has been specified as a person who is allowed to join the forum. Logged In (or active) is a user who is actively engaged in a forum discussion. Waiting to Speak is a user who has requested the right to speak but has not yet acquired the enabling token. Speaking is a user who has the floor (i.e. the user possesses a speech token) and has the ability to transmit information to any number of forum members. 
Each state described above assumes that the user was in the preceding state before transition. Users of the conferencing system 30 must each start a collaboration manager (CM) process on their workstations. The manager provides an interface/control panel to the distributed collaboration conferencing system or conferencing system 30. Upon startup, the CM registers with the name server. The CM then requests a list of the available forum managers and forum servers from the name server. Finally, the information is displayed in the first two list boxes in the control panel 180 as shown in FIG. 14. The control panel 180 also provides the following functionality: (i) Local conference logging control (including recording and retrieval); (ii) Screen capture; (iii) Forum server creation via the forum managers; and (iv) Instantiation of media drivers according to the local workstation's capabilities. Once the two key components (i.e. forum servers and collaboration managers) are running, conferences can be started on the conferencing system 30. The initial step in entering a conference is accessing a specified forum. This can be done by clicking on the appropriate forum name in the forum list box in the control panel 180. Once a forum is selected a login message is sent to the forum server, whose address has been supplied by the name server. The forum server then determines if the participant logging in has the appropriate access rights (i.e. the participant is on the membership list for a closed forum). An acknowledgment is returned to the collaboration manager if the user has been successfully logged in, otherwise a rejection message is transmitted to the user. Furthermore, if the login was successful, the forum server's active list is updated and all active members of the forum are informed of the addition to the community. 
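By way of illustration only, the login handling and the ordered member states described above might be sketched as follows in Python. The names ForumServer, MemberState, login and transition are hypothetical and do not appear in the figures. The in-memory notification list merely stands in for the messages sent to collaboration managers, and the treatment of open forums and of repeated logins is an assumption rather than a described behavior.

```python
from enum import IntEnum


class MemberState(IntEnum):
    """The four forum member states, in the order in which they are entered."""
    MEMBER = 0
    LOGGED_IN = 1
    WAITING_TO_SPEAK = 2
    SPEAKING = 3


class ForumServer:
    """Hypothetical in-memory stand-in for a forum server's membership list."""

    def __init__(self, members, closed=True):
        self.closed = closed
        self.states = {name: MemberState.MEMBER for name in members}
        # (recipient, message) pairs standing in for network messages.
        self.notifications = []

    def active_members(self):
        """Names of all members that are logged in, waiting or speaking."""
        return [n for n, s in self.states.items() if s >= MemberState.LOGGED_IN]

    def login(self, name):
        """Return True for an acknowledgment, False for a rejection."""
        if name not in self.states:
            if self.closed:
                return False  # not on the membership list of a closed forum
            # Assumption: an open forum admits any user as a new member.
            self.states[name] = MemberState.MEMBER
        if self.states[name] != MemberState.MEMBER:
            return False  # assumption: a repeated login is rejected
        self.states[name] = MemberState.LOGGED_IN
        # Inform every active member (including the newcomer) of the addition.
        for recipient in self.active_members():
            self.notifications.append((recipient, f"{name} joined"))
        return True

    def transition(self, name, new_state):
        """Move a member to the next state; skipping a state is refused."""
        current = self.states[name]
        if new_state != current + 1:
            raise ValueError(
                f"{name}: cannot go from {current.name} to {new_state.name}"
            )
        self.states[name] = MemberState(new_state)
```

In this sketch a user must be logged in before requesting to speak, and must be waiting to speak before speaking, which mirrors the statement that each state assumes the user was in the preceding state.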
The active member list box on the right side of the control panel 180 shows the currently logged in members of the forums highlighted in the forum list box. As described in the section above the forum server automatically updates all active members when any forum members have logged in or logged out. When Requesting to Speak, speech requests on the conferencing system 30 involve two steps: selecting the audience and describing the speech intent. Audience selection simply involves selecting the appropriate recipients from the active member list box on the control panel 180. Forums that do not allow side conversations will automatically have all items highlighted in the active member list box. A speech intent is indicated by pressing one of the speech request buttons. As soon as a speech request button is depressed token requests are sent to the forum server. A token request is sent for each highlighted member in the active member list box. The forum server then processes the token requests. The server's response is dependent on the forum control mode that is encoded in the forum server. According to the control mode the forum server decides whether to place the speaker request on the pending queue or to automatically grant tokens to the requester. For example, in a chairperson controlled mode, all requests are placed on the pending queue. When the chairperson allows a specific user to speak, his/her name is transferred from the pending queue to the speaking queue and tokens are granted to the user. Any changes in the contents of either the pending queue or speaker queue are automatically broadcast to all members of the forum. Once the previous steps have been completed successfully (i.e. a participant logs onto an existing forum server and is granted one or more communication tokens) real time multimedia information can be shared with other members of the forum. The user can then use any of the media drivers available (i.e. 
audio, text, X whiteboard) at his/her workstation to send data via all connections for which the user has tokens (the tokens act as keys that unlock a point to point connection). The data generated by the drivers is transformed into TCP/IP packets and tagged with a time stamp and frame sequence number. The data receiver then replays the packet as per the algorithm described earlier. It should be appreciated that all conference communication and control mechanisms described above are generic and can be applied to any conference control scheme. Although only a limited set of control modes has been discussed, simple tools are provided for control mode extensions to the conferencing system 30. Furthermore the tokenized control mechanism described is highly efficient and eliminates any bottlenecks associated with a centralized communication and routing center. Developing a user interface for distributed collaboration required a detailed study of the purpose and use of the system. The experiments conducted in group design processes provide a baseline to measure how closely the system developed conveys the information exchanged in a meeting (written, vocal, gesture and physical actions). The primary function of a distributed interaction interface 178 as shown in FIG. 14 is to convey the actions of others engaged in the distributed conference. Awareness of the state of conferring individuals and their respective ownership or generation of shared objects is necessary to keep track of the interaction in a non-physical setting. Who is doing what? Who is gesturing? Who is pointing? Who is signalling? Who is working on an artifact? Whose video image is this? Whose cursor is this? Whose voice is this? Where in an artifact are people working? What is the emotional state of individuals? Member actions? Who has been there? Who is there? Who is coming? Who is where? Are you aware of what I am doing? Membership awareness? Who owns artifacts? Who can access artifact?
Who can change an artifact? What artifacts are being worked on? What roles are there? Who is playing what role? What are related roles? Who is interested? Who wants to speak? Who is speaking to whom? What type of protocols exist? What is the current protocol? The foregoing questions represent the requirements for a user interface for the conferencing system 30. Representing all four dimensions of awareness is a formidable task to accomplish within limited screen real estate. The choice of metaphor for representation becomes critical to reduce the cognitive demand on the system users. The conferencing system does not address all the awareness questions presented above. However, a large portion of the critical questions are addressed in the interaction metaphors provided by the conferencing system 30. The conferencing system 30 is implemented using metaphors chosen to combine elements from the physical setting (i.e., a meeting room) with standard "window" metaphors that are prevalent in modern operating systems. These metaphors were chosen to provide simple cognitive mappings between the intended use and the concept represented by the interface metaphor. In determining the metaphors for group engagement several criteria were examined including expressiveness, naturalness, input mapping, transparency and dynamism. Expressiveness is the degree to which the metaphor embodies the action or control represented and does not project any unintended meaning. Naturalness is the extent to which the metaphor complies with typical conferencing norms. Input mapping is the degree to which the metaphor can be logically mapped to keyboard strokes, mouse movements or any other generic input device. Transparency means that the metaphor must not interfere with the task in which the conferees are engaged. Transparency in the interface also implies that the conference controls occupy a small section of valuable screen real estate.
Dynamism means that the interface must represent the dynamic nature of the group interaction. The controls cannot be binary since the speaker state is not binary. A three dimensional interface is preferred to indicate the spatial relationship among the conferencing individuals. The initial interface was two dimensional; however, that system limited the representation of gaze and addressing in the control space. Metaphors were derived for the following concepts: Meeting Entry and Exit; Floor State; Member State and Addressability; Focus of Attention; Degree of Engagement; and Gaze/Addressability. Meeting Entry and Exit includes a hallway metaphor 184 as shown in FIG. 15 to represent the multiple meetings available on the conferencing system 30. This provides a simple metaphor that maps virtual distributed meetings to physical doors and hallways. This metaphor can be extended to include meeting hierarchies that include floors, buildings, blocks and cities. We have found no need to provide such a degree of hierarchy although if the system is scaled to a large organization such structures may be necessary. Each door in the hallway represents an individual distributed meeting. The door metaphor provides additional cues regarding meeting structure. A padlocked door indicates that the particular meeting has restricted membership. A red tab on the door indicates whether a meeting is currently active and the window at the top of the door indicates the number of people in the meeting. Finally, a descriptive name for the meeting is placed above the door. The Floor State metaphor has multiple representations in the user interface 190 as shown in FIG. 16. The item list at the top of the interface shows the current members of the conference and highlights pending and active speakers. Furthermore, the images of individuals in the interface are appropriately highlighted to show their different states. Additionally, the floor can be split into multiple "virtual" rooms.
This allows individuals to create sub-meetings or side chats within the main meeting. Side chats are shown as tabbed folders in the meeting interface. Member State and Addressability is accomplished by several mechanisms which are employed to describe the distributed member state. Members choose the people they are to address by clicking on the appropriate members or clicking on the table to speak to everyone. FIG. 17 shows Feniosky requesting to speak to the group. As shown on the user interface 200, once a speech request is made, the pending speaker is shown by red highlighting of the name and a red halo around the pending speaker's image. In a chairperson controlled forum, the chairperson can then allow a pending speaker to speak or leave him/her on the pending queue. FIG. 18 shows the chairperson, Karim, accepting the speech request from Feniosky. FIG. 19 shows the chairperson, Karim, granting the speech request from Feniosky. Finally, Feniosky gains the floor and is able to address the group as shown in FIG. 19A. The group members can determine the source of speaking by a green highlighting of the speaker's name and a green halo around his/her image as shown on the left side of FIG. 19A. The speaker can determine his/her audience by bullet points that are displayed next to those that are listening to him/her as shown on the right side of FIG. 19A. Tool and Artifact Manipulation is accomplished by several tools which are available for interaction and which can be accessed from the menu system or by clicking on their appropriate icon in the room (e.g., clicking on the whiteboard will bring up a shared drawing tool). As users interact with objects in the different tools the owner of a particular object is highlighted. The interface with a variety of tools is shown in FIG. 14. Focus of Attention is supported by highlighting the currently active interaction tool.
Furthermore, the current person speaking is also highlighted in the control console of the conferencing system 30 as shown in FIG. 19A. In the case where a tool is started by another individual in the conference, the tool will automatically be started on each distributed client to whom that individual is speaking. This creates an implicit focus of attention. More intrusive attention mechanisms were attempted (such as moving the focal window to the center of the screen), however, it was discovered that users resisted the loss of control over their environment that these automatic actions caused. To show degree of engagement, either a spring metaphor, a heat metaphor or a shadow metaphor can be used. The Spring Metaphor reflects the tension and dynamism of the participants as they attempt to control the floor. The springs are attached to a central object on the table which acts as the focus of attention for the interaction. As a participant becomes increasingly engaged the springs tense up thereby indicating to the whole group the degree to which the participant is interested in addressing the group. Active speakers can be represented through color (e.g. active coils could be red) or they can be represented by the full stretch of the spring (i.e. the spring becomes a straight wire). The Heat Metaphor utilizes color to show degree of engagement of a participant in a conference. The Shadow Metaphor represents engagement as a shadow that emanates from each participant and shows their presence at the conference table. The metaphor has important benefits in that it portrays a sense of physical presence. Before proceeding with a discussion of FIGS. 20-24, certain terminology is explained. The conferencing system of the present invention may be implemented using "object-oriented" computer programming techniques. Object-oriented computer programming techniques involve the definition, creation, use and destruction of software entities referred to as "objects." 
Each object is an independent software entity comprised of data generally referred to as "attributes" and software routines generally referred to as "member functions" or "methods" which manipulate the data. One characteristic of an object is that only methods of that object can change the data contained in the object. The term "encapsulation" describes the concept of packaging the data and methods together in an object. Objects are thus said to encapsulate or hide the data and methods included as part of the object. Encapsulation protects an object's data from arbitrary and unintended use by other objects and therefore prevents an object's data from corruption. To write an object-oriented computer program, a computer programmer conceives and writes computer code which defines a set of "object classes" or more simply "classes." Each of these classes serves as a template which defines a data structure for holding the attributes and program instructions which perform the method of an object. Each class also includes a means for instantiating or creating an object from the class template. The means for creating is a method referred to as a "constructor." Similarly, each class also includes a means for destroying an object once it has been instantiated. The means for destroying is a method referred to as a "destructor." An abstract object class refers to any incomplete class that cannot therefore be used to instantiate semantically meaningful objects. An abstract class is used as a base class to provide common features, provide a minimum protocol for polymorphic substitution or declare missing common features that its derived class must supply prior to instantiation of an object. When a processor of a computer executes an object-oriented computer program, the processor generates objects from the class information using the constructor methods. 
During program execution, one object is constructed, which object may then construct other objects which may, in turn, construct other objects. Thus, a collection of objects which are constructed from one or more classes form the executing computer program. Inheritance refers to a characteristic of object oriented programming techniques which allows software developers to re-use pre-existing computer code for classes. The inheritance characteristic allows software developers to avoid writing computer code from scratch. Rather, through inheritance, software developers can derive so-called subclasses from a base class. The subclasses inherit behaviors from base classes. The software developer can then customize the data attributes and methods of the subclasses to meet particular needs. With a base-class/sub-class relationship, a first method having a particular name may be implemented in the base-class and a second different method with the same name may be implemented differently in the sub-class. When the program is executing, the first or second method may be called by means of a statement having a parameter which represents an object. The particular method which is called depends upon whether the object was created from the class or the sub-class. This concept is referred to as polymorphism. For example, assume a computer program includes a class called Employee. Further assume that class Employee includes a member function which defines a series of method steps to be carried out when a worker retires from the company. In an object-oriented implementation, the retire method is automatically inherited by sub-classes of class Employee. Thus, if a class called Executive is a sub-class of the class called Employee, then class Executive automatically inherits the retire method which is a member function of the class Employee. 
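Purely as an illustrative sketch, the inheritance relationship just described might be written as follows in Python. Only the class names Employee and Executive and the retire method come from the example above; the name and retired attributes and the returned message are hypothetical placeholders for whatever steps a real retirement procedure would carry out.

```python
class Employee:
    """Base class holding the general retirement procedure."""

    def __init__(self, name):
        self.name = name
        self.retired = False

    def retire(self):
        """Placeholder for the series of steps carried out on retirement."""
        self.retired = True
        return f"{self.name} retired under the general employee procedure"


class Executive(Employee):
    """Defines no retire method of its own, so Employee.retire is inherited."""
    pass
```

Calling retire on an Executive object in this sketch runs the code written once in Employee, which is the re-use that inheritance is intended to provide.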
A company or organization, however, may have different methods for retiring an employee who is an executive and an employee who is not an executive. In this case, the sub-class Executive could include its own retire method which is performed when retiring an employee who is an executive. In this situation, the method for retiring executive employees contained in the Executive class overrides the method for retiring employees in general contained in the Employee class. With this base class/sub-class arrangement another object may include a method which invokes a retirement method. The actual retirement method which is invoked depends upon the object type used in the latter call. If an Executive object type is used in the call, the overriding retirement method is used. Otherwise, the retirement method in the base-class is used. The example is polymorphic because the retire operation has a different method of implementation depending upon whether the object used in the call is created from the Employee class or the Executive class and this is not determined until the program runs. Since the implementation and manner in which data attributes and member functions within an object are hidden, a method call can be made without knowing which particular method should be invoked. Polymorphism thus extends the concept of encapsulation. Object-oriented computer programming techniques allow computer programs to be constructed of objects that have a specified behavior. Several different objects can be combined in a particular manner to construct a computer program which performs a particular function or provides a particular result. Each of the objects can be built out of other objects that, in turn, can be built out of other objects. This resembles complex machinery being built out of assemblies, subassemblies and so on. For example, a circuit designer would not design and fabricate a video cassette recorder (VCR) transistor by transistor. 
Rather, the circuit designer would use circuit components such as amplifiers, active filters and the like, each of which may contain hundreds or thousands of transistors. Each circuit component can be analogized to an object which performs a specific operation. Each circuit component has specific structural and functional characteristics and communicates with other circuit components in a particular manner. The circuit designer uses a bill of materials which lists each of the different types of circuit components which must be assembled to provide the VCR. Similarly, computer programs can be assembled from different types of objects each having specific structural and functional characteristics. The term "client object," or more simply "client," refers to any object that uses the resources of another object which is typically referred to as the "server object" or "server." The term "framework" can refer to a collection of inter-related classes that can provide a set of services (e.g., services for network communication) for a particular type of application program. Alternatively, a framework can refer to a set of interrelated classes that provide a set of services for a wide variety of application programs (e.g., foundation class libraries for providing a graphical user interface for a Windows system). A framework thus provides a plurality of individual classes and mechanisms which clients can use or adapt. An application framework refers to a set of classes which are typically compiled, linked and loaded with one particular application program and which are used by the particular application program to implement certain functions in the particular application program. A system framework, on the other hand, is provided as part of a computer operating system program. Thus, a system framework is not compiled, linked and loaded with one particular application program. 
Rather, a system framework provides a set of classes which are available to every application program being executed by the computer system which interacts with the computer operating system. Also it should be appreciated that FIGS. 20-24 are a series of object class diagrams illustrating the relationships between components (here represented as object classes) of the conferencing system. The rectangular elements (typified by element 222), herein denoted as "object icons," represent object classes having a name 222a, attributes 222b and member functions 222c. The object diagrams do not depict syntax of any particular programming language. Rather, the object diagrams illustrate the functional information one of ordinary skill in the art requires to generate computer software to perform the processing required of a conferencing system such as conferencing system 30 (FIG. 3). It should be noted that many routine program elements, such as constructor and destructor member functions are not shown. Turning now to FIG. 20, an object diagram illustrating the overall server connectivity includes a name server object class 222, a forum server object class 224, a participant class 226 and a control policy class 228. Referring now to FIG. 21, an object diagram illustrating an agenda editor and a wizard includes an agenda editor object class 232, an agendaedititem object class 234, and an agendaitem object class 236. The agendaeditor object class 232 also uses a wizardrunner object class 238, and a plurality of different wizard object classes 240-248. Referring now to FIG. 22, an object diagram 250 illustrating components of a conferencing system participant includes a collaboration manager 252 and a series of lists 254a-254d related to the collaboration manager object class 252 through a control condition 253a.
Also related to the collaboration manager object class 252 through control conditions 253b-253d, respectively are a communication object class 256, a media driver object class 258 and a synchronization engine object class 260. Send and receive messages pass between objects instantiated from the Communication and Media Driver object classes 256, 258. Associated with the media driver object class 258 through a control 253e is a media Q object class 262. Access messages pass between objects instantiated from the media driver and media Q object classes 258, 262. Referring now to FIG. 23, an object diagram 266 illustrating components of the media driver object classes includes a media driver object class 268 having associated therewith an audio media driver object class (Audio MD) 270a, a white board media driver object class (White MD) 270b and a text media driver object class (Text MD) 270c. Other media drivers could also be associated with the Media Driver object class 268. Referring now to FIG. 24, an object diagram 280 illustrating components of a queue structure hierarchy includes a list object class 282. Associated with the list object class 282 are a member list object class 284 and a media queue object class 290. An M element object class 286 is associated with the member list 284 through a control 285a. A speaker Q, Participant Q and a pending Q 288a-288c are all associated with the member list class 284. Similarly, Audio Q, Text Q and X Q classes 292a-292c are associated with the Media Queue class. Audio clip, text string and X event classes 296a-296c each have a respective control 285c-285d to the classes 292a-292c. An element class 298 is coupled to the list class 282 through a control 285e and a Q element class 300 is coupled to the media queue class 290 through a control 285f. The Audio clip, text string and X event classes 296a-296c are each associated with the Q element class 300.
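As a non-limiting sketch, the queue structure hierarchy of FIG. 24 might be expressed as follows in Python. The figure description states only that the classes are "associated" or "coupled"; modelling those associations as inheritance, the element_type attribute, the append and pop_front methods, and all field names are assumptions introduced here for illustration. The timestamp and sequence fields reflect the time stamp and frame sequence number with which media data is tagged.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """Base element held by any list (element class 298)."""


@dataclass
class MElement(Element):
    """Member list element (M element class 286); field name is assumed."""
    member_name: str


@dataclass
class QElement(Element):
    """Media queue element (Q element class 300)."""
    timestamp: float
    sequence: int


@dataclass
class AudioClip(QElement):
    samples: bytes


@dataclass
class TextString(QElement):
    text: str


@dataclass
class XEvent(QElement):
    event: str


class List:
    """Base list (list class 282); element_type restricts what is stored."""
    element_type = Element

    def __init__(self):
        self._items = []

    def append(self, item):
        # Refuse elements of the wrong kind for this list.
        if not isinstance(item, self.element_type):
            raise TypeError(f"{type(self).__name__} cannot hold {type(item).__name__}")
        self._items.append(item)

    def pop_front(self):
        """Remove and return the oldest element (first in, first out)."""
        return self._items.pop(0)

    def __len__(self):
        return len(self._items)


class MemberList(List):
    element_type = MElement


class SpeakerQ(MemberList): pass
class ParticipantQ(MemberList): pass
class PendingQ(MemberList): pass


class MediaQueue(List):
    element_type = QElement


class AudioQ(MediaQueue):
    element_type = AudioClip


class TextQ(MediaQueue):
    element_type = TextString


class XQ(MediaQueue):
    element_type = XEvent
```

Under these assumptions an AudioQ accepts only AudioClip elements, while the three member queues share the behavior of the member list and differ only in the role they play in the forum.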
The message protocols for the system of the present invention are shown below in Tables 1-6.
