Automatic transport detection by attempting to establish communication session using list of possible transports and corresponding media dependent modules
5754765
Abstract
The transports available in a local computer system for communicating with a remote computer system are automatically determined at either install time or run time. At install time, a list of transports supported by the local computer system is used to determine which supported transports are actually installed in the local computer system and the media dependent modules (MDMs) that correspond to those installed transports. At run time, a list of the installed transports and corresponding MDMs is used to determine which installed transports (and corresponding MDMs) can actually be used for an impending communications session with the remote computer system.
Claims
What is claimed is:
1. A computer-implemented process for transport detection, comprising the steps of:
(a) providing a list of possible transports over which a local computer system communicates with at least one remote computer system; and
(b) automatically identifying, at run time, a subset of the list of possible transports that are available for communications with the remote computer system, wherein:
the list of possible transports corresponds to transports installed in the local computer system and corresponding MDMs;
the subset of the list of possible transports corresponds to transports that can be used by the local computer system for an impending communications session with the remote computer system; and
for each installed transport, step (b) comprises the steps of:
(1) attempting, by a network independent layer of the local computer system, to begin the communications session using a specified transport by calling a begin-session function into a network dependent layer of the local computer system;
(2) attempting, by the network dependent layer, to begin the communications session using the list of installed transports and corresponding MDMs, by loading an MDM corresponding to the specified transport and instructing the MDM to attempt to communicate with a corresponding network stack;
(3) if the MDM's attempt to communicate with the corresponding network stack is successful, then identifying the specified transport as one of the transports that can be used by the local computer system for the impending communications session with the remote computer system; and
(4) if the MDM's attempt to communicate with the corresponding network stack is unsuccessful, then rejecting the specified transport as one of the transports that can be used by the local computer system for the impending communications session with the remote computer system.
2. The process of claim 1, wherein:
step (b)(1) comprises the steps of:
(A) calling, by an application of the local computer system, into a conference manager of the local computer system to begin the communications session using the specified transport;
(B) calling, by the conference manager, into a conferencing application programming interface (API) of the local computer system to begin the communications session using the specified transport;
(C) calling, by the conferencing API, into a communications API of the local computer system to begin the communications session using the specified transport;
(D) causing, by the communications API, a data link manager corresponding to the specified transport to be loaded; and
(E) calling, by the communications API, into the data link manager to begin the communications session;
step (b)(2) comprises the steps of:
(A) causing, by the data link manager, the MDM to be loaded using the list of installed transports and corresponding MDMs;
(B) calling, by the data link manager, into the MDM to begin the communications session; and
(C) attempting, by the MDM, to communicate with the corresponding network stack;
step (b)(3) comprises the step of passing a success message from the MDM to the data link manager to the communications API to the conferencing API to the conference manager to the application; and
step (b)(4) comprises the steps of:
(A) passing a failure message from the MDM to the data link manager to the communications API to the conferencing API to the conference manager to the application;
(B) causing, by the data link manager, the MDM to be unloaded; and
(C) causing, by the communications API, the data link manager to be unloaded.
3. The process of claim 1, further comprising the step of displaying, by the application, a list of callee addresses for the impending communications session, wherein each callee address corresponds to one of the transports that can be used for the impending communications session as identified in step (b).
4. The process of claim 1, wherein the installed transports and corresponding MDMs are identified in an initialization file.
5. The process of claim 1, wherein:
the installed transports comprise at least one of an ISDN transport and a LAN transport;
the installed transports comprise one or more LAN transports conforming to one or more LAN-transport standards; and
the one or more LAN transports comprise two or more LAN products conforming to a single LAN-transport standard.
6. An apparatus for transport detection, comprising:
(a) a list of possible transports over which a local computer system communicates with at least one remote computer system; and
(b) a network independent layer and a network dependent layer of the local computer system, adapted to automatically identify, at run time, a subset of the list of possible transports that are available for communications with the remote computer system, wherein:
the list of possible transports corresponds to transports installed in the local computer system and corresponding MDMs;
the subset of the list of possible transports corresponds to transports that can be used by the local computer system for an impending communications session with the remote computer system; and
for each installed transport:
the network independent layer attempts to begin the communications session using a specified transport by calling a begin-session function into the network dependent layer;
the network dependent layer attempts to begin the communications session using the list of installed transports and corresponding MDMs, by loading an MDM corresponding to the specified transport and instructing the MDM to attempt to communicate with a corresponding network stack;
if the MDM's attempt to communicate with the corresponding network stack is successful, then the network independent layer identifies the specified transport as one of the transports that can be used by the local computer system for the impending communications session with the remote computer system; and
if the MDM's attempt to communicate with the corresponding network stack is unsuccessful, then the network independent layer rejects the specified transport as one of the transports that can be used by the local computer system for the impending communications session with the remote computer system.
7. The apparatus of claim 6, wherein:
the network independent layer comprises an application, a conference manager, a conferencing application programming interface (API), and a communications API;
the network dependent layer comprises a data link manager, the MDM, and the corresponding network stack;
for each installed transport:
the application calls into the conference manager to begin the communications session using the specified transport;
the conference manager calls into the conferencing API to begin the communications session using the specified transport;
the conferencing API calls into the communications API to begin the communications session using the specified transport;
the communications API causes the data link manager to be loaded; and
the communications API calls into the data link manager to begin the communications session;
the data link manager causes the MDM to be loaded using the list of installed transports and corresponding MDMs;
the data link manager calls into the MDM to begin the communications session; and
the MDM attempts to communicate with the corresponding network stack;
if the MDM's attempt to communicate with the corresponding network stack is successful, then a success message is passed from the MDM to the data link manager to the communications API to the conferencing API to the conference manager to the application; and
if the MDM's attempt to communicate with the corresponding network stack is unsuccessful, then:
a failure message is passed from the MDM to the data link manager to the communications API to the conferencing API to the conference manager to the application;
the data link manager causes the MDM to be unloaded; and
the communications API causes the data link manager to be unloaded.
8. The apparatus of claim 6, wherein the application displays a list of callee addresses for the impending communications session, wherein each callee address corresponds to one of the transports that can be used for the impending communications session.
9. The apparatus of claim 6, wherein the installed transports and corresponding MDMs are identified in an initialization file.
10. The apparatus of claim 6, wherein:
the installed transports comprise at least one of an ISDN transport and a LAN transport;
the installed transports comprise one or more LAN transports conforming to one or more LAN-transport standards; and
the one or more LAN transports comprise two or more LAN products conforming to a single LAN-transport standard.
11. A computer program embodied in a tangible medium, wherein, when the computer program is loaded into and executed by a local computer system:
the local computer system provides a list of possible transports over which the local computer system communicates with at least one remote computer system; and
the local computer system automatically identifies, at run time, a subset of the list of possible transports that are available for communications with the remote computer system, wherein:
the list of possible transports corresponds to transports installed in the local computer system and corresponding MDMs;
the subset of the list of possible transports corresponds to transports that can be used by the local computer system for an impending communications session with the remote computer system; and
the local computer system comprises a network independent layer of the local computer system and a network dependent layer of the local computer system; and
for each installed transport:
the network independent layer attempts to begin the communications session using a specified transport by calling a begin-session function into the network dependent layer;
the network dependent layer attempts to begin the communications session using the list of installed transports and corresponding MDMs, by loading an MDM corresponding to the specified transport and instructing the MDM to attempt to communicate with a corresponding network stack;
if the MDM's attempt to communicate with the corresponding network stack is successful, then the local computer system identifies the specified transport as one of the transports that can be used by the local computer system for the impending communications session with the remote computer system; and
if the MDM's attempt to communicate with the corresponding network stack is unsuccessful, then the local computer system rejects the specified transport as one of the transports that can be used by the local computer system for the impending communications session with the remote computer system.
12. The computer program of claim 11, wherein:
the network independent layer comprises an application, a conference manager, a conferencing application programming interface (API), and a communications API;
the network dependent layer comprises a data link manager, the MDM, and the corresponding network stack;
for each installed transport:
the application calls into the conference manager to begin the communications session using the specified transport;
the conference manager calls into the conferencing API to begin the communications session using the specified transport;
the conferencing API calls into the communications API to begin the communications session using the specified transport;
the communications API causes the data link manager to be loaded; and
the communications API calls into the data link manager to begin the communications session;
the data link manager causes the MDM to be loaded using the list of installed transports and corresponding MDMs;
the data link manager calls into the MDM to begin the communications session; and
the MDM attempts to communicate with the corresponding network stack;
if the MDM's attempt to communicate with the corresponding network stack is successful, then a success message is passed from the MDM to the data link manager to the communications API to the conferencing API to the conference manager to the application; and
if the MDM's attempt to communicate with the corresponding network stack is unsuccessful, then:
a failure message is passed from the MDM to the data link manager to the communications API to the conferencing API to the conference manager to the application;
the data link manager causes the MDM to be unloaded; and
the communications API causes the data link manager to be unloaded.
13. The computer program of claim 11, wherein the application displays a list of callee addresses for the impending communications session, wherein each callee address corresponds to one of the transports that can be used for the impending communications session.
14. The computer program of claim 11, wherein the installed transports and corresponding MDMs are identified in an initialization file.
15. The computer program of claim 11, wherein:
the installed transports comprise at least one of an ISDN transport and a LAN transport;
the installed transports comprise one or more LAN transports conforming to one or more LAN-transport standards; and
the one or more LAN transports comprise two or more LAN products conforming to a single LAN-transport standard.
16. A computer-implemented process for transport detection, comprising the steps of:
(a) providing a list of possible transports over which a local computer system communicates with at least one remote computer system; and
(b) automatically identifying, at install time, a subset of the list of possible transports that are available for communications with the remote computer system, wherein:
the list of possible transports corresponds to transports supported by the local computer system;
the subset of the list of possible transports corresponds to transports installed in the local computer system; and
for each supported transport, step (b) comprises the steps of:
(1) loading a media dependent module (MDM) corresponding to the supported transport;
(2) attempting to initialize a network transport stack corresponding to the MDM;
(3) if the attempt to initialize is successful, then identifying the supported transport as one of the installed transports and identifying the MDM as corresponding to the installed transport;
(4) if the attempt to initialize is unsuccessful, then determining if there is another MDM for the supported transport;
(5) if there is another MDM for the supported transport, then repeating steps (1)-(4) for the another MDM; and
(6) if there is not another MDM for the supported transport, then determining that the supported transport is not one of the installed transports.
17. The process of claim 16, wherein step (b) comprises the step of saving the installed transports and corresponding MDMs in an initialization file.
18. The process of claim 16, wherein:
the supported transports comprise at least one of an ISDN transport and a LAN transport;
the supported transports comprise one or more LAN transports conforming to one or more LAN-transport standards; and
the one or more LAN transports comprise two or more LAN products conforming to a single LAN-transport standard.
19. An apparatus for transport detection, comprising:
(a) a list of possible transports over which a local computer system communicates with at least one remote computer system; and
(b) a network independent layer adapted to automatically identify, at install time, a subset of the list of possible transports that are available for communications with the remote computer system, wherein:
the list of possible transports corresponds to transports supported by the local computer system;
the subset of the list of possible transports corresponds to transports installed in the local computer system; and
for each supported transport, the network independent layer:
(1) loads a media dependent module (MDM) corresponding to the supported transport;
(2) attempts to initialize a network transport stack corresponding to the MDM;
(3) identifies the supported transport as one of the installed transports and identifies the MDM as corresponding to the installed transport, if the attempt to initialize is successful;
(4) determines if there is another MDM for the supported transport, if the attempt to initialize is unsuccessful;
(5) repeats (1)-(4) for the another MDM, if there is another MDM for the supported transport; and
(6) determines that the supported transport is not one of the installed transports, if there is not another MDM for the supported transport.
20. The apparatus of claim 19, wherein the network independent layer saves the installed transports and corresponding MDMs in an initialization file.
21. The apparatus of claim 19, wherein:
the supported transports comprise at least one of an ISDN transport and a LAN transport;
the supported transports comprise one or more LAN transports conforming to one or more LAN-transport standards; and
the one or more LAN transports comprise two or more LAN products conforming to a single LAN-transport standard.
22. A computer program embodied in a tangible medium, wherein, when the computer program is loaded into and executed by a local computer system:
the local computer system provides a list of possible transports over which the local computer system communicates with at least one remote computer system; and
the local computer system automatically identifies, at install time, a subset of the list of possible transports that are available for communications with the remote computer system, wherein:
the list of possible transports corresponds to transports supported by the local computer system;
the subset of the list of possible transports corresponds to transports installed in the local computer system; and
for each supported transport, the local computer system:
(1) loads a media dependent module (MDM) corresponding to the supported transport;
(2) attempts to initialize a network transport stack corresponding to the MDM;
(3) identifies the supported transport as one of the installed transports and identifies the MDM as corresponding to the installed transport, if the attempt to initialize is successful;
(4) determines if there is another MDM for the supported transport, if the attempt to initialize is unsuccessful;
(5) repeats (1)-(4) for the another MDM, if there is another MDM for the supported transport; and
(6) determines that the supported transport is not one of the installed transports, if there is not another MDM for the supported transport.
23. The computer program of claim 22, wherein the local computer system saves the installed transports and corresponding MDMs in an initialization file.
24. The computer program of claim 22, wherein:
the supported transports comprise at least one of an ISDN transport and a LAN transport;
the supported transports comprise one or more LAN transports conforming to one or more LAN-transport standards; and
the one or more LAN transports comprise two or more LAN products conforming to a single LAN-transport standard.
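By way of illustration only, the install-time detection recited in claims 16, 19, and 22 may be sketched in Python as follows. The `Mdm` class, its `init_stack` callable, the functions `detect_installed` and `to_ini`, the MDM names, and the `[Transports]` section layout are all hypothetical names chosen for this sketch; none of them is defined in this document.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Mdm:
    """Hypothetical stand-in for a media dependent module (MDM)."""
    name: str
    init_stack: Callable[[], bool]  # True if the network transport stack initialized


def detect_installed(supported: Dict[str, List[Mdm]]) -> Dict[str, str]:
    """Map each installed transport to the first MDM whose stack initializes."""
    installed: Dict[str, str] = {}
    for transport, candidates in supported.items():
        for mdm in candidates:                   # steps (1), (4), (5): try each MDM in turn
            if mdm.init_stack():                 # step (2): attempt to initialize the stack
                installed[transport] = mdm.name  # step (3): record transport and its MDM
                break
        # step (6): if no MDM succeeded, the transport is simply not recorded
    return installed


def to_ini(installed: Dict[str, str]) -> str:
    """Render the result as initialization-file text (assumed layout)."""
    lines = ["[Transports]"]
    lines += [f"{transport}={mdm}" for transport, mdm in installed.items()]
    return "\n".join(lines)


# Example: no ISDN stack is present; the second of two LAN products responds.
supported = {
    "ISDN": [Mdm("isdn_mdm", lambda: False)],
    "LAN": [Mdm("lan_product_a", lambda: False),
            Mdm("lan_product_b", lambda: True)],
}
installed = detect_installed(supported)
print(to_ini(installed))
```

In this sketch a transport with several candidate MDMs is bound to the first one that initializes, which mirrors the repeat-until-success ordering of steps (1) through (6).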
Description
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to audio/video conferencing, and, in particular, to systems for real-time audio, video, and data conferencing in windowed environments on personal computer systems.
2. Description of the Related Art
It is desirable to provide real-time audio, video, and data conferencing between personal computer (PC) systems operating in windowed environments such as those provided by versions of the Microsoft® Windows™ operating system. There are difficulties, however, with providing real-time conferencing in non-real-time windowed environments. It is also desirable to provide conferencing between PC systems over two or more different transports.
It is accordingly an object of this invention to overcome the disadvantages and drawbacks of the known art and to provide real-time audio, video, and data conferencing between PC systems operating in non-real-time windowed environments over two or more different transports.
It is a particular object of the present invention to provide real-time audio, video, and data conferencing between PC systems operating under a Microsoft® Windows™ operating system over ISDN and LAN networks.
Further objects and advantages of this invention will become apparent from the detailed description of a preferred embodiment which follows.
SUMMARY OF THE INVENTION
The present invention comprises a computer-implemented process, apparatus, and computer program for transport detection. A list of possible transports over which a local computer system communicates with at least one remote computer system is provided. A subset of the list of possible transports that are available for communications with the remote computer system is automatically identified. When implemented at install time, the invention determines which supported transports are installed and identifies the corresponding media dependent modules. When implemented at run time, the invention determines which of the installed transports can be used for an impending communications session with the remote computer system.
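By way of illustration only, the run-time determination summarized above may be sketched in Python as follows. The sketch assumes that the installed transports and their MDMs are already known, models each attempt to reach a network stack as a callable returning success or failure, and records load and unload events in a log. The function names `try_transport` and `detect_usable`, the MDM names, and the log strings are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def try_transport(transport: str,
                  installed: Dict[str, str],
                  stack_reachable: Dict[str, Callable[[], bool]],
                  log: List[str]) -> bool:
    """Make one begin-session attempt for a single installed transport."""
    log.append(f"load DLM for {transport}")   # data link manager is loaded first
    mdm = installed[transport]                # MDM looked up in the installed list
    log.append(f"load MDM {mdm}")
    ok = stack_reachable[mdm]()               # MDM tries to talk to its network stack
    if not ok:
        # On failure the MDM is unloaded, then the data link manager.
        log.append(f"unload MDM {mdm}")
        log.append(f"unload DLM for {transport}")
    return ok


def detect_usable(installed: Dict[str, str],
                  stack_reachable: Dict[str, Callable[[], bool]]
                  ) -> Tuple[List[str], List[str]]:
    """Return the transports usable for the impending session, plus the event log."""
    log: List[str] = []
    usable = [t for t in installed
              if try_transport(t, installed, stack_reachable, log)]
    return usable, log


# Example: ISDN is installed but its stack does not respond; the LAN stack does.
installed = {"ISDN": "isdn_mdm", "LAN": "lan_mdm"}
stack_reachable = {"isdn_mdm": lambda: False, "lan_mdm": lambda: True}
usable, log = detect_usable(installed, stack_reachable)
print(usable)
```

An application could then offer only callee addresses whose transport appears in `usable`.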
BRIEF DESCRIPTION OF THE DRAWINGS
Other objects, features, and advantages of the present invention will become more fully apparent from the following detailed description, the appended claims, and the accompanying drawings in which:
FIG. 1 is a block diagram representing real-time point-to-point audio, video, and data conferencing between two PC systems;
FIG. 2 is a block diagram of the hardware configuration of the conferencing system of each PC system of FIG. 1;
FIG. 3 is a block diagram of the hardware configuration of the video board of the conferencing system of FIG. 2;
FIG. 4 is a block diagram of the hardware configuration of the audio/comm (ISDN) board of the conferencing system of FIG. 2;
FIG. 5 is a block diagram of the software configuration of the conferencing system of each PC system of FIG. 1;
FIG. 6 is a block diagram of the hardware configuration of the audio/comm (ISDN) board of FIG. 4;
FIG. 7 is a block diagram of the conferencing interface layer between the conferencing applications of FIG. 5, on one side, and the comm, video, and audio managers of FIG. 5, on the other side;
FIG. 8 is a representation of the conferencing call finite state machine (FSM) for a conferencing session between a local conferencing system (i.e., caller) and a remote conferencing system (i.e., callee);
FIG. 9 is a representation of the conferencing stream FSM for each conferencing system participating in a conferencing session;
FIG. 10 is a representation of the video FSM for the local video stream and the remote video stream of a conferencing system during a conferencing session;
FIG. 11 is a block diagram of the software components of the video manager of the conferencing system of FIG. 5;
FIG. 12 is a representation of a sequence of N walking key frames;
FIG. 13 is a representation of the audio FSM for the local audio stream and the remote audio stream of a conferencing system during a conferencing session;
FIG. 14 is a block diagram of the architecture of the audio subsystem of the conferencing system of FIG. 5;
FIG. 15 is a block diagram of the interface between the audio task of FIG. 5 and the audio hardware of the audio/comm (ISDN) board of FIG. 2;
FIG. 16 is a block diagram of the interface between the audio task and the comm task of FIG. 5;
FIG. 17 is a block diagram of the comm subsystem of the conferencing system of FIG. 5;
FIG. 18 is a block diagram of the comm subsystem architecture for two conferencing systems of FIG. 5 participating in a conferencing session over an ISDN connection;
FIG. 19 is a representation of the comm subsystem application FSM for a conferencing session between a local site and a remote site;
FIG. 20 is a representation of the comm subsystem connection FSM for a conferencing session between a local site and a remote site;
FIG. 21 is a representation of the comm subsystem control channel handshake FSM for a conferencing session between a local site and a remote site;
FIG. 22 is a representation of the comm subsystem channel establishment FSM for a conferencing session between a local site and a remote site;
FIG. 23 is a representation of the comm subsystem processing for a typical conferencing session between a caller and a callee;
FIG. 24 is a representation of the structure of a video packet as sent to or received from the comm subsystem of the conferencing system of FIG. 5;
FIG. 25 is a representation of the compressed video bitstream for the conferencing system of FIG. 5;
FIG. 26 is a representation of a compressed audio packet for the conferencing system of FIG. 5;
FIG. 27 is a representation of the reliable transport comm packet structure;
FIG. 28 is a representation of the unreliable transport comm packet structure;
FIG. 29 shows diagrams indicating typical TII-DLM connection setup and teardown sequences;
FIGS. 30 and 31 are diagrams of the architecture of the audio/comm (ISDN) board;
FIG. 32 is a diagram of the audio/comm (ISDN) board environment;
FIG. 33 is a flow diagram of the on-demand application invocation processing of the conferencing system of FIG. 5;
FIG. 34 is a flow diagram of an example of the processing implemented within the conferencing system of FIG. 5 to manage two conferencing applications in a single conferencing session with a remote conferencing system;
FIG. 35 represents the flow of bits between two remote high-resolution counters used to maintain clock values over a conferencing network;
FIG. 36 is a flow diagram of the processing of the conferencing system of FIG. 1 to control the flow of signals over reliable channels;
FIG. 37 is a flow diagram of the preemptive priority-based transmission processing implemented by the communications subsystem of the conferencing system of FIG. 1;
FIG. 38 is a state diagram for the complete rate negotiation processing;
FIG. 39 is a state diagram for the rate negotiation processing for a called node during a 64 KBPS upgrade;
FIG. 40 is a state diagram for the rate negotiation processing for a calling node during a 64 KBPS upgrade;
FIG. 41 is a state diagram for the rate negotiation processing in loopback mode during a 64 KBPS upgrade;
FIG. 42 is a flow diagram of the processing by the conferencing system of FIGS. 5 and 17 during the automatic transport detection implemented at install time;
FIG. 43 is a block diagram showing the network connections made by the conferencing system of FIGS. 5 and 17 during the automatic transport detection implemented at run time;
FIG. 44 is a representation of the DLMLAN packet header format;
FIG. 45 is a representation of the MDM packet header format for LAN transmissions;
FIG. 46 is a representation of the connection messages for a typical conferencing session from the perspective of the MDMs on the local and remote nodes;
FIG. 47 is a flow diagram of the video negotiation processing between two conferencing systems of FIG. 1;
FIG. 48 is a flow diagram of the call-progress processing when the placement of a conference call is successful;
FIG. 49 is a representation of the interrupt-time processing for receiving data signals by the audio/video conferencing system of FIG. 5;
FIG. 50 is a representation of the interrupt-time processing for transmitting data signals by the audio/video conferencing system of FIG. 5;
FIG. 51 is a representation of the auto registration environment for video conferencing;
FIG. 52 is a representation of the architecture for auto registration and remote confidence testing for the new node of FIG. 51;
FIG. 53 is a flow diagram of the processing for the auto registration and remote confidence testing of the auto registration environment of FIG. 51;
FIG. 54 is a flow diagram of the processing implemented by the client (i.e., a new node) for the auto registration processing of FIG. 53;
FIG. 55 is a flow diagram of the processing implemented by a confidence test server for the auto registration processing of FIG. 53;
FIG. 56 is a representation of the auto registration file format; and
FIG. 57 shows connection diagrams of the interactions between a DLM and an MDM in connection and session establishment and tear-down.
DESCRIPTION OF THE PREFERRED EMBODIMENT(S)
Point-To-Point Conferencing Network
Referring now to FIG. 1, there is shown a block diagram representing real-time point-to-point audio, video, and data conferencing between two PC systems, according to a preferred embodiment of the present invention. Each PC system has a conferencing system 100, a camera 102, a microphone 104, a monitor 106, and a speaker 108. The conferencing systems communicate via network 110, which may be an integrated services digital network (ISDN), a local area network (LAN), or a wide area network (WAN). Each conferencing system 100 receives, digitizes, and compresses the analog video signals generated by camera 102 and the analog audio signals generated by microphone 104. The compressed digital video and audio signals are transmitted to the other conferencing system via network 110, where they are decompressed and converted for play on monitor 106 and speaker 108, respectively. In addition, each conferencing system 100 may generate and transmit data signals to the other conferencing system 100 for play on monitor 106. The video and data signals are displayed in different windows on monitor 106. Each conferencing system 100 may also display the locally generated video signals in a separate window.
Camera 102 may be any suitable camera for generating NTSC or PAL analog video signals. Microphone 104 may be any suitable microphone for generating analog audio signals. Monitor 106 may be any suitable monitor for displaying video and graphics images and is preferably a VGA monitor. Speaker 108 may be any suitable device for playing analog audio signals and is preferably a headset.
Conferencing System Hardware Configuration
Referring now to FIG. 2, there is shown a block diagram of the hardware configuration of each conferencing system 100 of FIG. 1. Each conferencing system 100 comprises host processor 202, video board 204, audio/comm (ISDN) board 206, LAN board 210, and ISA bus 208.
Referring now to FIG. 3, there is shown a block diagram of the hardware configuration of video board 204 of FIG. 2. Video board 204 comprises industry standard architecture (ISA) bus interface 310, video bus 312, pixel processor 302, video random access memory (VRAM) device 304, video capture module 306, and video analog-to-digital (A/D) converter 308.
Referring now to FIG. 4, there is shown a block diagram of the hardware configuration of audio/comm (ISDN) board 206 of FIG. 2. Audio/comm (ISDN) board 206 comprises ISDN interface 402, memory 404, digital signal processor (DSP) 406, ISA bus interface 408, and audio input/output (I/O) hardware 410.
LAN board 210 of FIG. 2 may be any conventional LAN card that supports standard driver interfaces and is preferably an Intel® EtherExpress™ 16C LAN Combo Card.
Conferencing System Software Configuration
Referring now to FIG. 5, there is shown a block diagram of the software configuration of each conferencing system 100 of FIG. 1. Video microcode 530 resides and runs on pixel processor 302 of video board 204 of FIG. 3. Comm task 540 and audio task 538 reside and run on DSP 406 of audio/comm (ISDN) board 206 of FIG. 4. The one or more network stacks 560 reside and run partially on host processor 202 of FIG. 2 and partially on LAN board 210 of FIG. 2. All of the other software modules depicted in FIG. 5 reside and run on host processor 202.
Video, Audio, and Data Processing
Referring now to FIGS. 3, 4, and 5, audio/video conferencing application 502 running on host processor 202 provides the top-level local control of audio and video conferencing between a local conferencing system (i.e., local site or endpoint) and a remote conferencing system (i.e., remote site or endpoint). Audio/video conferencing application 502 controls local audio and video processing and establishes links with the remote site for transmitting and receiving audio and video over the ISDN or LAN network 110. Similarly, data conferencing application 504, also running on host processor 202, provides the top-level local control of data conferencing between the local and remote sites. Conferencing applications 502 and 504 communicate with the audio, video, and comm subsystems using conference manager 544, conferencing application programming interface (API) 506, LAN management interface (LMI) API 556, LMI manager 558, video API 508, comm API 510, and audio API 512. The functions of conferencing applications 502 and 504 and the APIs they use are described in further detail later in this specification.
Audio Processing
During conferencing, audio I/O hardware 410 of audio/comm (ISDN) board 206 digitizes analog audio signals received from microphone 104 and stores the resulting uncompressed digital audio to memory 404 via ISA bus interface 408. Audio task 538, running on DSP 406, controls the compression of the uncompressed audio and stores the resulting compressed audio back to memory 404.
Audio Processing for ISDN-Based Conferencing
For ISDN-based conferencing, comm task 540, also running on DSP 406, formats the locally-generated compressed audio for ISDN transmission and transmits the compressed ISDN-formatted audio to ISDN interface 402 for transmission to the remote site over ISDN network 110.
During ISDN-based conferencing, ISDN interface 402 also receives from ISDN network 110 compressed ISDN-formatted audio generated by the remote site and stores the compressed ISDN-formatted audio to memory 404. Comm task 540 then reconstructs the compressed audio format and stores the compressed audio back to memory 404. Audio task 538 controls the decompression of the compressed audio and stores the resulting decompressed audio back to memory 404. ISA bus interface 408 then transmits the decompressed audio to audio I/O hardware 410, which digital-to-analog (D/A) converts the decompressed audio and transmits the resulting analog audio signals to speaker 108 for play.
Thus, for ISDN-based conferencing, audio capture/compression and decompression/playback are performed entirely within audio/comm (ISDN) board 206 without going through the host processor. As a result, audio is continuously played during an ISDN-based conferencing session regardless of what other applications are running on host processor 202.
Audio Processing for LAN-Based Conferencing
For LAN-based conferencing, audio task 538 passes the locally-generated compressed audio to the audio manager 520, which sends the compressed audio via comm API 510 to the comm manager 518 for transmission by the network stack 560 to the remote site via the LAN network 110.
During LAN-based conferencing, the network stack 560 also receives from LAN network 110 compressed LAN-formatted audio generated by the remote site and passes the compressed LAN-formatted audio to comm manager 518. Comm manager 518 then reconstructs the compressed audio format and passes the compressed audio via audio API 512 to audio manager 520, which stores the compressed audio into memory 404 of the audio/comm (ISDN) board 206 of FIG. 4. As in ISDN-based conferencing, audio task 538 controls the decompression of the compressed audio and stores the resulting decompressed audio back to memory 404. ISA bus interface 408 then transmits the decompressed audio to audio I/O hardware 410, which digital-to-analog (D/A) converts the decompressed audio and transmits the resulting analog audio signals to speaker 108 for play.
Video Processing
Concurrent with the audio processing, video A/D converter 308 of video board 204 digitizes analog video signals received from camera 102 and transmits the resulting digitized video to video capture module 306. Video capture module 306 decodes the digitized video into YUV color components and delivers uncompressed digital video bitmaps to VRAM 304 via video bus 312. Video microcode 530, running on pixel processor 302, compresses the uncompressed video bitmaps and stores the resulting compressed video back to VRAM 304. ISA bus interface 310 then transmits via ISA bus 208 the compressed video to video/host interface 526 running on host processor 202.
Video/host interface 526 passes the compressed video to video manager 516 via video capture driver 522. Video manager 516 calls audio manager 520 using audio API 512 for synchronization information. Video manager 516 then time-stamps the video for synchronization with the audio. Video manager 516 passes the time-stamped compressed video to comm manager 518 via comm API 510.
Video Processing for ISDN-Based Conferencing
For ISDN-based conferencing, comm manager 518 passes the locally-generated compressed video through digital signal processing (DSP) interface 528 to ISA bus interface 408 of audio/comm (ISDN) board 206, which stores the compressed video to memory 404. Comm task 540 then formats the compressed video for ISDN transmission and transmits the ISDN-formatted compressed video to ISDN interface 402 for transmission to the remote site over ISDN network 110.
During ISDN-based conferencing, ISDN interface 402 also receives from ISDN network 110 ISDN-formatted compressed video generated by the remote site system and stores the ISDN-formatted compressed video to memory 404. Comm task 540 reconstructs the compressed video format and stores the resulting compressed video back to memory 404. ISA bus interface 408 then transmits the compressed video to comm manager 518 via ISA bus 208 and DSP interface 528. Comm manager 518 passes the compressed video to video manager 516 via video API 508. Video manager 516 passes the compressed video to video decode driver 548 for decompression processing. Video decode driver 548 passes the decompressed video to video playback driver 550, which formats the decompressed video for transmission to the graphics device interface (GDI) (not shown) of the Microsoft® Windows™ operating system for eventual display in a video window on monitor 106.
Video Processing for LAN-Based Conferencing
For LAN-based conferencing, comm manager 518 formats the locally-generated compressed video for LAN transmission and transmits the LAN-formatted compressed video to the network stack 560 for transmission to the remote site over LAN network 110.
During LAN-based conferencing, the network stack 560 also receives from LAN network 110 LAN-formatted compressed video generated by the remote site system and passes the LAN-formatted compressed video to comm manager 518. Comm manager 518 then reconstructs the compressed video format and passes the compressed video via video API 508 to video manager 516. As in ISDN-based conferencing, video manager 516 passes the compressed video to video decode driver 548 for decompression processing. Video decode driver 548 passes the decompressed video to video playback driver 550, which formats the decompressed video for transmission to the graphics device interface (GDI) (not shown) of the Microsoft® Windows™ operating system for eventual display in a video window on monitor 106.
Data Processing
For data conferencing, concurrent with audio and video conferencing, data conferencing application 504 generates and passes data to comm manager 518 using conferencing API 506 and comm API 510.
Data Processing for ISDN-Based Conferencing
For ISDN-based conferencing, comm manager 518 passes the locally-generated data through board DSP interface 532 to ISA bus interface 408, which stores the data to memory 404. Comm task 540 formats the data for ISDN transmission and stores the ISDN-formatted data back to memory 404. ISDN interface 402 then transmits the ISDN-formatted data to the remote site over ISDN network 110.
During ISDN-based conferencing, ISDN interface 402 also receives from ISDN network 110 ISDN-formatted data generated by the remote site and stores the ISDN-formatted data to memory 404. Comm task 540 reconstructs the data format and stores the resulting data back to memory 404. ISA bus interface 408 then transmits the data to comm manager 518, via ISA bus 208 and DSP interface 528. Comm manager 518 passes the data to data conferencing application 504 using comm API 510 and conferencing API 506. Data conferencing application 504 processes the data and transmits the processed data to Microsoft® Windows™ GDI (not shown) for display in a data window on monitor 106.
Data Processing for LAN-Based Conferencing
For LAN-based conferencing, comm manager 518 formats the locally-generated data for LAN transmission and transmits the LAN-formatted data to the network stack 560 for transmission to the remote site over LAN network 110.
During LAN-based conferencing, the network stack 560 also receives from LAN network 110 LAN-formatted data generated by the remote site system and passes the LAN-formatted data to comm manager 518. Comm manager 518 then reconstructs the data and passes the data to data conferencing application 504 using comm API 510 and conferencing API 506. As in ISDN-based conferencing, data conferencing application 504 processes the data and transmits the processed data to Microsoft® Windows™ GDI (not shown) for display in a data window on monitor 106.
Hardware Configuration for Conferencing System
LAN board 210 of FIG. 2 may be any suitable board for transmitting and receiving digital packets over a local (or wide) area network and is preferably an Intel® EtherExpress™ 16 card with appropriate control and network protocol software. Conferencing system 100 is capable of supporting LAN-based conferencing under different LAN transport standards (e.g., Novell IPX, Internet User Datagram Protocol (UDP), and/or NetBIOS standards). Furthermore, conferencing system 100 is capable of supporting LAN-based conferencing with different LAN products for a single LAN transport standard (e.g., LAN WorkPlace (LWPUDP) by Novell and FTPUDP by FTP Software, Inc., both of which conform to the LAN UDP standard). Thus, LAN board 210 corresponds to the LAN transports that are supported in conferencing system 100. Those skilled in the art will understand that more than one network stack 560 may be used to interface with a single LAN board 210.
Referring now to FIG. 6, there is shown a block diagram of the hardware configuration of audio/comm (ISDN) board 206 of FIG. 4. Referring now to FIGS. 30 and 31, there are shown diagrams of the architecture of the audio/comm (ISDN) board 206. Referring now to FIG. 32, there is shown a diagram of the audio/comm (ISDN) board environment. The description for the rest of this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954.
Software Architecture for Conferencing System
The software architecture of conferencing system 100 of FIGS. 2 and 5 has three layers of abstraction. A computer supported collaboration (CSC) infrastructure layer comprises the hardware (i.e., video board 204, audio/comm (ISDN) board 206, and LAN board 210) and host/board driver software (i.e., video/host interface 526, DSP interface 528, and network stack 560) to support video, audio, and comm, as well as the encode method for video (running on video board 204) and encode/decode methods for audio (running on audio/comm (ISDN) board 206). The capabilities of the CSC infrastructure are provided to the upper layer as a device driver interface (DDI).
A CSC system software layer provides services for instantiating and controlling the video and audio streams, synchronizing the two streams, and establishing and gracefully ending a call and associated communication channels. This functionality is provided in an application programming interface (API). This API comprises the extended audio and video interfaces and the communications APIs (i.e., conference manager 544, conferencing API (VCI) 506, LAN management interface (LMI) API 556, LMI manager 558, video API 508, video manager 516, video capture driver 522, video decode driver 548, video playback driver 550, comm API 510, comm manager 518, Wave API 514, Wave driver 524, PWave API 552, audio API 512, and audio manager 520).
A CSC applications layer brings CSC to the desktop. The CSC applications may include video annotation to video mail, video answering machine, audio/video/data conferencing (i.e., audio/video conferencing application 502 and data conferencing application 504), and group decision support systems.
Audio/video conferencing application 502 and data conferencing application 504 rely on conference manager 544 and conferencing API 506, which in turn rely upon video API 508, comm API 510, and audio API 512 to interface with video manager 516, comm manager 518, and audio manager 520, respectively. Comm API 510 and comm manager 518 provide a transport-independent interface (TII) that provides communications services to conferencing applications 502 and 504. The communications software of conferencing system 100 may be designed to support different transport mechanisms, such as ISDN, SW56, and LAN (e.g., SPX/IPX, TCP/IP, or NetBIOS). The TII isolates the conferencing applications from the underlying transport layer (i.e., transport-medium-specific DSP interface 528). The TII hides the network/connectivity specific operations. In conferencing system 100, the TII hides the ISDN and LAN layers. The DSP interface 528 is hidden in a datalink module (DLM). The LAN interface is hidden within a media dependent module (MDM). The TII provides services to the conferencing applications for opening communication channels (within the same session) and dynamically managing the bandwidth. The bandwidth is managed through a transmission priority scheme.
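The layering described above can be illustrated with a minimal Python sketch. It is not the implementation of conferencing system 100: every class and method name below is hypothetical, and the priority rule (a lower number is transmitted first) is an assumption, since the specification states only that bandwidth is managed through a transmission priority scheme.

```python
import heapq
from abc import ABC, abstractmethod


class Transport(ABC):
    """Hypothetical transport-specific back end hidden behind the TII."""

    @abstractmethod
    def transmit(self, channel_id, payload):
        ...


class DatalinkModule(Transport):
    """Stands in for the DLM that hides DSP interface 528 (ISDN path)."""

    def transmit(self, channel_id, payload):
        return f"DLM/ISDN ch{channel_id}: {len(payload)} bytes"


class MediaDependentModule(Transport):
    """Stands in for an MDM that hides a LAN network stack."""

    def transmit(self, channel_id, payload):
        return f"MDM/LAN ch{channel_id}: {len(payload)} bytes"


class TransportIndependentInterface:
    """Applications call this and never see which transport is in use."""

    def __init__(self, transport):
        self._transport = transport
        self._priority = {}  # channel id -> priority (assumed: lower sends first)
        self._queue = []     # heap of (priority, sequence, channel id, payload)
        self._seq = 0        # tie-breaker that keeps FIFO order within a priority

    def open_channel(self, channel_id, priority):
        self._priority[channel_id] = priority

    def send(self, channel_id, payload):
        entry = (self._priority[channel_id], self._seq, channel_id, payload)
        heapq.heappush(self._queue, entry)
        self._seq += 1

    def flush(self):
        """Hand queued payloads to the transport in priority order."""
        sent = []
        while self._queue:
            _, _, channel_id, payload = heapq.heappop(self._queue)
            sent.append(self._transport.transmit(channel_id, payload))
        return sent


def demo():
    tii = TransportIndependentInterface(MediaDependentModule())
    tii.open_channel(1, priority=0)  # e.g., audio
    tii.open_channel(2, priority=1)  # e.g., video
    tii.send(2, b"video-frame")
    tii.send(1, b"audio")
    return tii.flush()
```

Swapping MediaDependentModule for DatalinkModule in demo() changes only the transport, which is the isolation the TII is meant to provide; the calling code is otherwise unchanged.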
In an embodiment in which conferencing system 100 performs software video decoding, AVI capture driver 522 is implemented on top of video/host interface 526 (the video driver). In an alternative embodiment in which conferencing system 100 performs hardware video decoding, an AVI display driver is also implemented on top of video/host interface 526.
The software architecture of conferencing system 100 comprises three major subsystems: video, audio, and communication. The audio and video subsystems are decoupled and treated as "data types" (similar to text or graphics) with conventional operations like open, save, edit, and display. The video and audio services are available to the applications through video-management and audio-management extended interfaces, respectively.
Conferencing system 100 is implemented mostly in the C++ computer language using the Microsoft® Foundation Classes (MFC) with portions implemented in the C7.0 computer language.
Audio/Video Conferencing Application
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
CMIF.LIB
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
CCm
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Loading and Unloading
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Registering and Unregistering
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Call Support
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Channel Pair Support
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Stream Support
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
CMDLL Callback
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
NO VCI Support
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Miscellaneous
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
CImageSize
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
CImageState
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
PSVIDEO.EXE
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Frame, View, and Image
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Class Descriptions
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
CCyApp
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
CCyFrameWnd
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
CCyAppFrame
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
CVideoFrame
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
CVideoController
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Auto-Sizing of Video Windows
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Split and Combined Modes
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Control Channel Management
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Mute Message
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
High-Quality Snapshot Message
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Application Launch
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Application Launch Response
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
CChanPair
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Video View Class Relationships
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Handset Class Relationships
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Dialog Boxes
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Helper Classes
Dialog Helper
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Fast Bitmap Buttons
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Data Conferencing Application
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Conference Manager
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Conference Manager Overview
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Implementation Details
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned. Additional information on the conference manager API is found in APPENDIX A of this specification.
Conference Application Installation
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Conference Application Registration
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
VCI Call Handler Callback
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Channel Pair Establishment
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Critical Sections
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Call Notification and Caller ID
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Audible Call Progress
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
On Demand Application Invocation
Referring now to FIG. 33, there is shown a flow diagram of the on-demand application invocation processing of conferencing system 100 of FIG. 5. The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Managing Multiple Applications
Referring now to FIG. 34, there is shown a flow diagram of an example of the processing implemented within conferencing system 100 of FIG. 5 to manage two conferencing applications in a single conferencing session with a remote conferencing system. The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Conferencing API
Referring now to FIG. 7, there is shown a block diagram of conference manager 544 and conferencing API 506 between conferencing applications 502 and 504, on one side, and comm API 510, LMI API 556, video API 508, and audio API 512, on the other side. The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned. Additional information on the conferencing API is found in APPENDIX B of this specification.
Interfacing with the Comm Subsystem
Conferencing API 506 supports the following comm services with the comm subsystem:
Comm initialization--initialize a session in the comm subsystem on which the call will be made.
Call establishment--place a call to start a conference.
Channel establishment--establish two comm channels for video conferencing control information, two comm channels for audio (incoming/outgoing), and four comm channels for video (incoming data and control and outgoing data and control).
Call termination--hang up a call and close all active channels.
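The channel counts listed under channel establishment can be tallied with a short Python sketch. The tuple layout and names are hypothetical, and the incoming/outgoing split of the two conferencing control channels is an assumption based on the call establishment steps, in which each side opens an outgoing control channel.

```python
from collections import Counter

# One (media, kind, direction) entry per comm channel opened for a call.
CALL_CHANNELS = [
    ("conference", "control", "outgoing"),
    ("conference", "control", "incoming"),
    ("audio", "data", "outgoing"),
    ("audio", "data", "incoming"),
    ("video", "data", "outgoing"),
    ("video", "data", "incoming"),
    ("video", "control", "outgoing"),
    ("video", "control", "incoming"),
]


def count_by_media(channels):
    """Return how many comm channels each media type uses."""
    return Counter(media for media, _kind, _direction in channels)
```

Under these assumptions a single call uses eight comm channels in total: two for conferencing control, two for audio, and four for video.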
Comm Initialization/Uninitialization
Initialization of a session in the comm subsystem on which a call may be made by the user of conferencing system A of FIG. 1 and the user of conferencing system B of FIG. 1 is implemented as follows:
Conferencing APIs A and B call LMI_AddLANTransport to initialize their LAN management interface (LMI) subsystems.
Conferencing APIs A and B receive an LMI_ADDTRANS_RESPONSE callback from the LMI subsystem.
Conferencing APIs A and B call BeginSession to initialize their comm subsystems.
Conferencing APIs A and B receive a SESS_BEGIN callback from the comm subsystem.
Conferencing APIs A and B then notify the conferencing applications with a CFM_INIT_TRANSP_NTFY callback.
Uninitialization of a session in the comm subsystem is implemented as follows:
Conferencing APIs A and B call LMI_DeleteLANTransport to uninitialize their LAN management interface (LMI) subsystems.
Conferencing APIs A and B receive an LMI_DELTRANS_RESPONSE callback from the LMI subsystem.
Conferencing APIs A and B call EndSession to uninitialize their comm subsystems.
Conferencing APIs A and B receive a SESS_CLOSED callback from the comm subsystem.
Conferencing APIs A and B then notify the conferencing applications with a CFM_UNINIT_TRANSP_NTFY callback.
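The ordering of calls and callbacks in the initialization and uninitialization sequences above can be sketched in Python as follows. The function and callback names are taken from the steps above, written with plain underscores; everything else (run_sequence, stub_subsystem, the trace format) is hypothetical, and the stub answers synchronously, whereas the real LMI and comm subsystems presumably deliver their callbacks asynchronously.

```python
# (function called, callback expected) pairs, in the order given above.
INIT_STEPS = [
    ("LMI_AddLANTransport", "LMI_ADDTRANS_RESPONSE"),
    ("BeginSession", "SESS_BEGIN"),
]
UNINIT_STEPS = [
    ("LMI_DeleteLANTransport", "LMI_DELTRANS_RESPONSE"),
    ("EndSession", "SESS_CLOSED"),
]

# Hypothetical stub: maps each call to the callback a subsystem answers with.
STUB_RESPONSES = dict(INIT_STEPS + UNINIT_STEPS)


def stub_subsystem(function_name):
    """Pretend to be the LMI or comm subsystem and return its callback."""
    return STUB_RESPONSES[function_name]


def run_sequence(steps, app_notification, subsystem=stub_subsystem):
    """Issue each call, check its callback, then notify the application."""
    trace = []
    for function_name, expected in steps:
        trace.append(f"call {function_name}")
        callback = subsystem(function_name)
        if callback != expected:
            raise RuntimeError(
                f"{function_name}: expected {expected}, got {callback}"
            )
        trace.append(f"callback {callback}")
    trace.append(f"notify {app_notification}")
    return trace
```

Calling run_sequence(INIT_STEPS, "CFM_INIT_TRANSP_NTFY") walks the initialization steps, and run_sequence(UNINIT_STEPS, "CFM_UNINIT_TRANSP_NTFY") walks the uninitialization steps; in both cases the application is notified only after both subsystems have responded.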
Call Establishment
Establishment of a call between the user of conferencing system A of FIG. 1 and the user of conferencing system B of FIG. 1 is implemented as follows:
Conferencing API A calls LMI_RequestPermission to request permission to make the conference call from the management computer.
Conferencing API A receives an LMI_PERM_RESPONSE callback from the LMI subsystem. If permission is denied, conferencing API A notifies the conferencing application with a CFM_REJECT_NTFY callback. If permission is granted, establishment of the call is continued.
Conferencing API A calls LMI_CallCommit to indicate to LMI that the call will be made.
Conferencing API A calls MakeConnection to dial conferencing API B's number.
Conferencing API B receives a CONN_REQUESTED callback from the comm subsystem.
Conferencing API B calls LMI_RequestPermission to request permission to accept the conference call from the management computer.
Conferencing API B receives an LMI_PERM_RESPONSE callback from the LMI subsystem. If permission is denied, conferencing API B rejects the call with RejectConnection, and notifies the conferencing application with a CFM_DENIAL_NTFY callback. If permission is granted, establishment of the call is continued.
Conferencing API B sends the call notification to the graphic user interface (GUI) with a CFM_CALL_NTFY callback; and, if user B accepts the call via the GUI, conferencing API B proceeds with the following steps.
Conferencing API B calls LMI_CallCommit to indicate to LMI that the call will be accepted.
Conferencing API B calls AcceptConnection to accept the incoming call from conferencing API A.
Conferencing APIs A and B receive a CONN_ACCEPTED callback from the comm subsystem.
Conferencing API A calls OpenChannel to open its outgoing conferencing control channel.
Conferencing API B receives the CHAN_REQUESTED callback for the incoming control channel and accepts it via AcceptChannel. Then conferencing API B calls OpenChannel to open its outgoing conferencing control channel.
Conferencing API A receives the CHAN_ACCEPTED callback for its outgoing control channel and calls RegisterChanHandler to receive channel callbacks from the comm subsystem. Then conferencing API A receives the CHAN_REQUESTED callback for the incoming control channel and accepts it via AcceptChannel.
Conferencing API B receives the CHAN_ACCEPTED callback for its outgoing control channel and calls RegisterChanHandler to receive channel callbacks from the comm subsystem.
Conferencing API A sends a Login Request on the control channel, which conferencing API B receives.
Conferencing API B sends a Login Response on the control channel, which conferencing API A receives.
Conferencing APIs A and B negotiate conference capabilities between themselves. Capabilities that are negotiated include: negotiation protocol version, audio compression algorithm, video compression algorithm, video frame rate, video capture resolution, video bitrate, and data sharing capabilities.
Conferencing API A sends a Capabilities Request on the control channel, specifying conference requirements, which conferencing API B receives.
Conferencing API B sends a Capabilities Response on the control channel, accepting or modifying conference requirements, which conferencing API A receives.
When conferencing APIs A and B agree upon conference capabilities, the capabilities are saved and will be communicated to the application via the CFM_ACCEPT_NTFY callback.
Conferencing API A calls OpenChannel to open its outgoing audio channel.
Conferencing API B receives the CHAN_REQUESTED callback for the incoming audio channel and accepts it via AcceptChannel.
Conferencing API A receives the CHAN_ACCEPTED callback for the outgoing audio channel.
The last three steps are repeated for the video data channel and the video control channel.
Conferencing API B then repeats the above four steps in the opposite direction (i.e., opens its outbound channels for audio, video data, and video control).
Conferencing API A sends Participant Information on the control channel, which conferencing API B receives.
Conferencing API B sends Participant Information on the control channel, which conferencing API A receives.
Conferencing APIs A and B call LMI.sub.-- ConferenceCommit to indicate to LMI that the conference is in progress.
Conferencing APIs A and B then notify the conferencing applications with a CFM.sub.-- ACCEPT.sub.-- NTFY callback.
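The capabilities exchange in the steps above (a request specifying requirements, answered by a response that accepts or modifies them) can be illustrated with a minimal C sketch. The structure ConfCaps, its fields, and the functions caps_respond and caps_accepted are hypothetical names introduced only for illustration; the actual message layouts are not given in this section, and the compression-algorithm fields are omitted for brevity. The sketch assumes that "modifying" a requirement means lowering it to the responder's own limit.

```c
#include <assert.h>

/* Hypothetical capability set; field names are illustrative only. */
typedef struct {
    unsigned protocol_version;    /* negotiation protocol version */
    unsigned video_frame_rate;    /* frames per second */
    unsigned video_bitrate_kbps;  /* kilobits per second */
    int      data_sharing;        /* nonzero if data sharing is supported */
} ConfCaps;

static unsigned min_u(unsigned a, unsigned b) { return a < b ? a : b; }

/* Build B's Capabilities Response: keep each requirement B can meet,
 * otherwise lower it to B's own limit (an assumed policy). */
ConfCaps caps_respond(const ConfCaps *requested, const ConfCaps *local)
{
    ConfCaps r;
    assert(requested != 0 && local != 0);
    r.protocol_version   = min_u(requested->protocol_version, local->protocol_version);
    r.video_frame_rate   = min_u(requested->video_frame_rate, local->video_frame_rate);
    r.video_bitrate_kbps = min_u(requested->video_bitrate_kbps, local->video_bitrate_kbps);
    r.data_sharing       = requested->data_sharing && local->data_sharing;
    return r;
}

/* Returns nonzero when the response left the request unchanged. */
int caps_accepted(const ConfCaps *requested, const ConfCaps *response)
{
    assert(requested != 0 && response != 0);
    return requested->protocol_version   == response->protocol_version
        && requested->video_frame_rate   == response->video_frame_rate
        && requested->video_bitrate_kbps == response->video_bitrate_kbps
        && (requested->data_sharing != 0) == (response->data_sharing != 0);
}
```

Under these assumptions, the agreed capability set is simply the response, which each side would save for delivery with the acceptance notification.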
Channel Establishment
Video and audio channel establishment is done implicitly as part of call establishment, as described above, and need not be repeated here. For establishing other channels, such as data conferencing channels, the conferencing API passes the request through to the comm manager and forwards the comm manager's callback to the user's channel manager.
Call Termination
Termination of a call between users A and B is implemented as follows (assuming user A hangs up):
Conferencing API A unlinks local/remote video/audio streams from the network.
Conferencing API A calls LMI.sub.-- ConferenceLeave to indicate to LMI that the conference is being closed.
Conferencing API A then calls the comm subsystem's CloseConnection.
The comm subsystem implicitly closes all channels and sends CHAN.sub.-- CLOSED callbacks to conferencing API A.
Conferencing API A closes its remote audio/video streams on receipt of the CHAN.sub.-- CLOSED callback for its inbound audio/video channels, respectively.
Conferencing API A then receives the CONN.sub.-- CLOSE.sub.-- RESP callback after the call is cleaned up completely. Conferencing API A notifies its conferencing application with a CFM.sub.-- HANGUP.sub.-- NTFY callback.
In the meantime, conferencing API B would have received the CHAN.sub.-- CLOSED callbacks from the comm subsystem for all the closed channels.
Conferencing API B closes its remote audio/video streams on receipt of the CHAN.sub.-- CLOSED callback for its inbound audio/video channels, respectively.
Conferencing API B unlinks its local audio/video streams from the network on receipt of the CHAN.sub.-- CLOSED callback for its outbound audio/video channels, respectively.
Conferencing API B then receives a CONN.sub.-- CLOSED callback from the comm subsystem.
Conferencing API B calls LMI.sub.-- ConferenceLeave to indicate to LMI that the conference is being closed.
Conferencing API B then notifies its conferencing application with a CFM.sub.-- HANGUP.sub.-- NTFY callback.
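The way the called side reacts to each CHAN.sub.-- CLOSED callback in the steps above can be sketched as a small C handler. The types ChanDir and MediaState and the function on_chan_closed are hypothetical names used only for illustration; the sketch merely records the effect described in the text (inbound channel closed: close the remote stream; outbound channel closed: unlink the local stream).

```c
#include <assert.h>

/* Direction of the channel as seen by the local conferencing API. */
typedef enum { DIR_INBOUND, DIR_OUTBOUND } ChanDir;

/* Hypothetical per-medium state (one instance each for audio and video). */
typedef struct {
    int remote_stream_open;   /* playing media received from the peer */
    int local_stream_linked;  /* local capture stream linked to the network */
} MediaState;

/* Apply the effect of one channel-closed callback to one medium. */
void on_chan_closed(MediaState *m, ChanDir dir)
{
    assert(m != 0);
    if (dir == DIR_INBOUND) {
        m->remote_stream_open = 0;   /* close the remote stream */
    } else {
        m->local_stream_linked = 0;  /* unlink the local stream */
    }
}

/* Nonzero once both directions of this medium have been torn down. */
int media_idle(const MediaState *m)
{
    assert(m != 0);
    return !m->remote_stream_open && !m->local_stream_linked;
}
```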
Interfacing with the Audio and Video Subsystems
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Capture/Monitor/Transmit Local Streams
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Receive/Play Remote Streams
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Control Local/Remote Streams
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Snap an Image from Local Video Streams
Referring now to FIG. 8, there is shown a representation of the conferencing call finite state machine (FSM) for a conferencing session between a local conferencing system (i.e., caller) and a remote conferencing system (i.e., callee). Referring now to FIG. 9, there is shown a representation of the conferencing stream FSM for each conferencing system participating in a conferencing session. The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned. Differences include changes to the CF.sub.-- Init function and new functions CF.sub.-- Uninit, CF.sub.-- InitTransport, CF.sub.-- UninitTransport, and CF.sub.-- ChangeTransportMaxVideoBitrate, as follows:
______________________________________
CF.sub.-- Init
    Initializes the LAN management interface (LMI), audio and video subsystems, and initializes data structures required for conferencing.
CF.sub.-- Uninit
    Uninitializes the LMI, audio, and video subsystems. If a conference call is in progress, it is gracefully destroyed.
CF.sub.-- InitTransport
    Initializes a LAN or ISDN transport stack so that conference calls may be made or received on a particular transport type. The maximum video bitrate allowed on this transport is specified.
CF.sub.-- UninitTransport
    Uninitializes a transport stack, so calls may no longer be made or received on a particular transport type.
CF.sub.-- ChangeTransportMaxVideoBitrate
    Changes the maximum video bitrate allowed on a transport.
______________________________________
These functions are defined in further detail later in this specification in APPENDIX B.
In addition, conferencing API 506 supports the following additional messages returned to conferencing applications 502 and 504 from the video, comm, and audio subsystems in response to some of the above-listed functions:
______________________________________
CFM.sub.-- INIT.sub.-- TRANSP.sub.-- NTFY
    Indicates that transport stack initialization has completed successfully or unsuccessfully.
CFM.sub.-- UNINIT.sub.-- TRANSP.sub.-- NTFY
    Indicates that transport stack uninitialization has completed.
CFM.sub.-- UNINIT.sub.-- NTFY
    Indicates that the conferencing API subsystem uninitialization has completed.
CFM.sub.-- DENIAL.sub.-- NTFY
    Indicates that a call request initiated from the remote site has been received, but the local site was denied permission to accept the call by the management computer.
CFM.sub.-- ERROR.sub.-- NTFY
    Indicates that an error has occurred in the comm subsystem.
CFM.sub.-- KILL.sub.-- NTFY
    Indicates that the management computer has demanded the conference call be terminated.
______________________________________
Video Subsystem
The video subsystem of conferencing system 100 of FIG. 5 comprises video API 508, video manager 516, video decode driver 548, video playback driver 550, video capture driver 522, and video/host interface 526 running on host processor 202 of FIG. 2 and video microcode 530 running on video board 204.
In an embodiment of the invention of U.S. patent application Ser. No. 08/157,694 (filed Nov. 24, 1993), now U.S. Pat. No. 5,506,954, the video subsystem encoded and decoded video according to a single video compression technique that, for purposes of this patent application, may be referred to as the ISDN-rate video (IRV) technique. The video processing and video bitstream format defined in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954, corresponded to that IRV technique.
The video subsystem of the present invention, however, is capable of encoding and decoding video according to more than one video compression technique. In one embodiment, the video system is capable of encoding and decoding video using both the IRV technique and a multi-rate video (MRV) technique. The following sections of this specification refer primarily to the IRV technique. The MRV technique is described in further detail in later sections of this specification starting with the section entitled "Compressed Multi-Rate Video Bitstream."
Video API
Referring now to FIG. 10, there is shown a representation of the video FSM for the local video stream and the remote video stream of a conferencing system during a conferencing session. The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954. Additional information on the video API is found in APPENDIX C of this specification.
Video Manager
Referring now to FIG. 11, there is shown a block diagram of the software components of video manager (VM) 516 of FIG. 5. The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954.
Capture/Playback Video Effects
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954.
Video Stream Restart
Referring now to FIG. 12, there is shown a representation of a sequence of N walking key frames. The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954.
Audio/Video Synchronization
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Alternative Timestamp Driver
FIG. 35 represents the flow of bits between two remote high-resolution counters used to maintain clock values over a conferencing network. The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Bit Rate Throttling
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Multiple Video Formats
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Normal Display Resolution
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Quarter Display Resolution
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Video Frame Format/Capture Interface
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Playback Implementation
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Self-Calibration
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Measurement
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
File-Based Capture (File Playback)
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Playback Statistics
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
VCost Function
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
VM DLL
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954.
VCapt EXE
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954.
VPlay EXE
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954.
Palette Creation
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Extra RealizePalette Logic
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Netw DLL
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
AVSync DLL
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954.
Video Capture Driver
Video capture driver 522 of FIG. 5 follows driver specifications set forth in the Microsoft.RTM. Video for Windows.TM. (VfW) Developer Kit documentation. This documentation specifies a series of application program interfaces (APIs) to which video capture driver 522 responds. Microsoft.RTM. Video for Windows.TM. (VfW) is a Microsoft.RTM. extension to the Microsoft.RTM. Windows.TM. operating system. VfW provides a common framework to integrate audio and video into an application program. Video capture driver 522 extends the basic Microsoft.RTM. API definitions by providing nine "custom" APIs that provide direct control of enhancements to the standard VfW specification to enable and control bit rate throttling and local video monitoring. Video capture driver 522 captures images in the "raw" YVU9 format and compresses them using either the IRV or the MRV compression technique. Video capture driver 522 controls bit rate throttling and local video monitoring differently for IRV and MRV compression.
Bit rate throttling controls the bit rate of a transmitted video conference data stream. Bit rate throttling is based on the quality of the captured video image and the image capture frame rate. A high-quality image has more fine detail information than a low-quality image. A user of conferencing system 100 is able to vary the relative importance of image quality and frame capture rate with a custom capture driver API.
The data bandwidth capacity of the video conference communication channel is fixed. The amount of captured video data to be transmitted is variable, depending upon the amount of motion that is present in the video image. The capture driver is able to control the amount of data that is captured by changing the quality of the next captured video frame and by not capturing the next video frame (i.e., "dropping" the frame).
The image quality is determined on a frame-by-frame basis using the following equation: ##EQU1## Quality is the relative image quality of the next captured frame. A lower Quality number represents a lower image quality (less image detail). TargetSize is the desired size of a captured and compressed frame. TargetSize is based on a fixed, desired capture frame rate.
Normally, video capture driver 522 captures new video frames at a fixed, periodic rate which is set by the audio/video conference application program. Video capture driver 522 keeps a running total of the available communication channel bandwidth. When video capture driver 522 is ready to capture the next video frame, it first checks the available channel bandwidth and if there is insufficient bandwidth (due to a large, previously captured frame), then video capture driver 522 delays capturing the next video frame until sufficient bandwidth is available. Finally, the size of the captured video frame is subtracted from the available channel bandwidth total.
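The running-total bookkeeping described above can be sketched in C as follows. This is an illustrative sketch under stated assumptions, not the driver's implementation: the names Throttle, throttle_init, throttle_tick, throttle_can_capture, and throttle_frame_captured are hypothetical; one kilobit is assumed to be 1000 bits; the per-period credit is assumed to equal the channel capacity divided by the desired frame rate; and unused credit is assumed to be capped at one frame period.

```c
#include <assert.h>

/* Hypothetical bandwidth bookkeeping for the capture driver. */
typedef struct {
    long target_size;  /* bytes the channel can carry per capture period */
    long available;    /* running total of available channel bytes */
} Throttle;

/* Derive the per-frame budget from the channel rate and frame rate. */
void throttle_init(Throttle *t, long kbits_per_sec, long fps)
{
    assert(t != 0 && kbits_per_sec > 0 && fps > 0);
    t->target_size = kbits_per_sec * 1000 / 8 / fps;  /* assumes 1 kbit = 1000 bits */
    t->available   = t->target_size;
}

/* Call once per capture period: the channel has drained one period of data. */
void throttle_tick(Throttle *t)
{
    assert(t != 0);
    t->available += t->target_size;
    if (t->available > t->target_size) {
        t->available = t->target_size;  /* assumed cap on banked credit */
    }
}

/* Nonzero if there is bandwidth left to capture a frame now. */
int throttle_can_capture(const Throttle *t)
{
    assert(t != 0);
    return t->available > 0;
}

/* Subtract the size of the frame that was just captured and compressed. */
void throttle_frame_captured(Throttle *t, long frame_bytes)
{
    assert(t != 0 && frame_bytes >= 0);
    t->available -= frame_bytes;
}
```

With an 84 kbit/s channel at 10 frames per second, the budget is 1050 bytes per period, so a single 3000-byte frame leaves the total negative and capture is delayed for two periods until the total becomes positive again.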
A user of conferencing system 100 may control the relationship between reduced image quality and dropped frames by setting image quality characteristics. For IRV compression, the user may set a minimum image quality value which controls the range of permitted image qualities, from a wide range down to a narrow range of only the best image qualities. For MRV compression, the user may set image quality using three parameters: motion estimation, spatial filtering, and temporal filtering. The effects of these parameters on image quality are discussed in U.S. patent application Ser. No. 08/235,955 (filed Apr. 28, 1994), now U.S. Pat. No. 5,493,514.
Bit rate throttling is implemented inside of the video capture driver and is controlled by the following VfW extension APIs:
______________________________________
CUSTOM.sub.-- SET.sub.-- DATA.sub.-- RATE
    Sets the data rate of the communications channel.
CUSTOM.sub.-- SET.sub.-- FPS
    Sets the desired capture frame rate.
CUSTOM.sub.-- SET.sub.-- QUAL.sub.-- PERCENT
    Sets the minimum image quality value (IRV only).
CUSTOM.sub.-- SET.sub.-- MOTION.sub.-- EST
    Enables or disables motion estimation (MRV only).
CUSTOM.sub.-- SET.sub.-- SPATIAL.sub.-- FILT
    Enables or disables spatial filtering (MRV only).
CUSTOM.sub.-- SET.sub.-- TEMPORAL.sub.-- FILT
    Sets the level of temporal filtering (MRV only).
______________________________________
The local video monitoring extension to VfW gives the video capture driver 522 the ability to output simultaneously both a compressed and a non-compressed image data stream to the application, while remaining fully compatible with the Microsoft.RTM. VfW interface specification. Without this capability, audio/video conferencing application 502 would have to decompress and display the image stream generated by the capture driver in order to provide local video monitoring, which would place an additional burden on the host processor and may decrease the frame update rate of the displayed image.
According to the VfW interface specification, the compressed image data is placed in an output buffer. When local video monitoring is active, an uncompressed copy of the same image frame is appended to the output buffer immediately following the compressed image data. The capture driver generates control information associated with the output buffer. This control information reflects only the compressed image block of the output buffer and does not indicate the presence of the uncompressed image block, making local video monitoring fully compatible with other VfW applications. A reserved, 32-bit data word in the VfW control information block indicates to a local video monitor aware application that there is a valid uncompressed video image block in the output buffer. The application program may then read and directly display the uncompressed video image block from the output buffer.
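The buffer layout described above can be sketched in C as follows. The structure CaptureBuffer is a hypothetical stand-in for the VfW control information block: the text does not say which reserved word carries the indication or what value it holds, so the field monitor_flag and the convention "nonzero means an uncompressed block is present" are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one capture output buffer and its control info. */
typedef struct {
    const uint8_t *data;          /* start of the output buffer */
    uint32_t       bytes_used;    /* size of the compressed block only */
    uint32_t       monitor_flag;  /* reserved word; assumed nonzero when an
                                     uncompressed block follows */
} CaptureBuffer;

/* Return a pointer to the uncompressed image block, or NULL if absent.
 * An application that ignores monitor_flag sees only the compressed data,
 * which is why the extension stays compatible with other VfW applications. */
const uint8_t *uncompressed_block(const CaptureBuffer *b)
{
    assert(b != NULL && b->data != NULL);
    if (b->monitor_flag == 0) {
        return NULL;
    }
    return b->data + b->bytes_used;  /* immediately after the compressed data */
}
```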
For the IRV technique, the uncompressed image data may be in either Device Independent Bitmap (DIB) or YVU9 format. For the MRV technique, the YVU9 format is used for the uncompressed image data. DIB format images are a fixed size, whereas YVU9 format images may be increased in size while retaining image quality. For both IRV and MRV techniques, the YVU9 images are converted into DIB format by the video display driver before they are displayed on the computer monitor.
The capture driver allows the uncompressed video image to be captured either normally or mirrored (reversed left to right). In normal mode, the local video monitoring image appears as it is viewed by a video camera--printing appears correctly in the displayed image. In mirrored mode, the local video monitoring image appears as if it were being viewed in a mirror.
The CUSTOM.sub.-- SET.sub.-- DIB.sub.-- CONTROL extension API controls the local video monitoring capabilities of the video capture driver.
Custom APIs for Video Capture Driver
The CUSTOM.sub.-- SET.sub.-- FPS message sets the frame rate for a video capture. This message is used while in streaming capture mode.
The CUSTOM.sub.-- SET.sub.-- KEY message instructs the capture driver to produce one key frame as soon as possible. The capture driver will typically produce one delta frame before the key frame. Once the key frame has been encoded, delta frames will typically follow.
The CUSTOM.sub.-- SET.sub.-- DATA.sub.-- RATE message instructs the capture driver to set an output data rate. This data rate value is in KBits per second and typically corresponds to the data rate of the communications channel over which the compressed video data will be transmitted.
The CUSTOM.sub.-- SET.sub.-- QUAL.sub.-- PERCENT message controls the relationship between reducing the image quality and dropping video frames when the IRV compressed video data stream size exceeds the data rate set by the CUSTOM.sub.-- SET.sub.-- DATA.sub.-- RATE message. For example, a CUSTOM.sub.-- SET.sub.-- QUAL.sub.-- PERCENT value of 0 means that the driver should reduce the image quality as much as possible before dropping frames and a value of 100 means that video frames should be dropped before the image quality is lowered. The CUSTOM.sub.-- SET.sub.-- QUAL.sub.-- PERCENT message is used only with IRV compression.
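One way to read the trade-off described above is that the CUSTOM_SET_QUAL_PERCENT value acts as a floor on the permitted image quality: if the quality needed to fit the data rate falls below the floor, the frame is dropped instead. The C sketch below illustrates that reading only. The names FrameAction and choose_action are hypothetical, quality is assumed to be expressed on a 0 to 100 scale, and the computation of the needed quality is taken as an input because it is not reproduced in this section.

```c
#include <assert.h>

typedef enum { ACTION_CAPTURE, ACTION_DROP } FrameAction;

/* Decide what to do with the next frame.
 *   needed_quality: quality (0..100) estimated to fit the data rate
 *   qual_percent:   minimum permitted quality (0..100)
 *   quality_out:    receives the quality to use when capturing */
FrameAction choose_action(int needed_quality, int qual_percent, int *quality_out)
{
    assert(quality_out != 0);
    assert(needed_quality >= 0 && needed_quality <= 100);
    assert(qual_percent >= 0 && qual_percent <= 100);

    if (needed_quality < qual_percent) {
        return ACTION_DROP;         /* would fall below the permitted range */
    }
    *quality_out = needed_quality;  /* within range: lower quality, keep frame */
    return ACTION_CAPTURE;
}
```

With a floor of 0, every frame is captured at whatever quality fits; with a floor of 100, any frame that cannot be sent at full quality is dropped.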
The CUSTOM.sub.-- SET.sub.-- DIB.sub.-- CONTROL message controls the uncompressed DIB or YVU9 format image output. With IRV compression, the uncompressed image may be in DIB format at either (80.times.60) or (160.times.120) pixel resolution or may be in YVU9 format at (160.times.120) resolution. With MRV compression, only the (160.times.120) YVU9 image format is supported. All images are available either mirrored (reversed left to right) or normal. This API controls the following four parameters:
Uncompressed image enable/disable
Mirrored/normal image
The uncompressed image size
Image data format (DIB or YVU9)
The default condition is for the uncompressed image to be disabled. Once set, these control flags remain in effect until changed by another CUSTOM.sub.-- SET.sub.-- DIB.sub.-- CONTROL message. The uncompressed image data is appended to the video data buffer immediately following the compressed image data. The uncompressed DIB or YVU9 data have the bottom scan-line data first and the top scan-line data last in the buffer.
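The bottom-up ordering described above affects how an application locates a given scan line. The C sketch below shows the offset arithmetic for a single image plane, together with the total size of a YVU9 image (one full-resolution luma plane plus two chroma planes subsampled by four in each direction). The function names are hypothetical, the row stride is passed in because the pixel depth of the DIB output is not stated here, and how the bottom-up rule applies to the individual YVU9 planes is not specified in the text.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of a scan line within a bottom-up image plane.
 * row_from_top is 0 for the top line as displayed. */
size_t scanline_offset(unsigned row_from_top, unsigned height, size_t stride)
{
    assert(row_from_top < height);
    return (size_t)(height - 1u - row_from_top) * stride;
}

/* Total bytes in a YVU9 image: a width*height luma plane followed by
 * two (width/4)*(height/4) chroma planes. */
size_t yvu9_size(unsigned width, unsigned height)
{
    assert(width % 4u == 0u && height % 4u == 0u);
    return (size_t)width * height + 2u * ((size_t)(width / 4u) * (height / 4u));
}
```

For the (160.times.120) YVU9 output this gives 19200 + 2 x 1200 = 21600 bytes, and the top displayed line of a 160-byte-wide plane starts at offset 119 x 160 = 19040.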
The CUSTOM.sub.-- SET.sub.-- VIDEO message controls the video demodulator CONTRAST, BRIGHTNESS, HUE (TINT), and SATURATION parameters. These video parameters are also set by the capture driver at initialization and via a video control dialog box.
The CUSTOM.sub.-- SET.sub.-- MOTION.sub.-- EST message allows MRV motion estimation to be enabled or disabled to improve image quality. This message is used only with MRV compression.
The CUSTOM.sub.-- SET.sub.-- SPATIAL.sub.-- FILT message allows MRV spatial filtering to be enabled or disabled to improve image quality. This message is used only with MRV compression.
The CUSTOM.sub.-- SET.sub.-- TEMPORAL.sub.-- FILT message allows the MRV temporal filter strength to be altered to improve image quality. This message is used only with MRV compression.
Video Microcode
The video microcode 530 of FIG. 5 running on video board 204 of FIG. 2 performs video compression. The preferred video compression techniques are disclosed in later sections of this specification starting with the section entitled "Compressed Video Bitstream."
Audio Subsystem
Referring now to FIG. 13, there is shown a block diagram of the architecture of the audio subsystem. The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954. In addition, referring again to FIG. 13, if the network connection is over a LAN, then the audio task 538 on System A sends the packetized, time-stamped audio data to the commstub task 1308, which sends it to the audio manager 520 on the host processor 202. The audio manager 520 passes the data to TII 510 for delivery to the remote system. The audio data from System B is delivered by TII 510 to the audio manager 520 on System A (on the host). The audio manager 520 sends the packet to the commstub task 1308 which passes it on to the audio task 538.
Audio API
Referring now to FIG. 14, there is shown a representation of the audio FSM for the local audio stream and the remote audio stream of a conferencing system during a conferencing session. The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954. Additional information on the audio API is found in APPENDIX D of this specification.
Audio Manager
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954.
Audio Manager Device Driver Interface
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954, except for the following. The expected messages (generated by a Microsoft.RTM. OpenDriver SDK call to installable device drivers) and the driver's responses are as follows:
__________________________________________________________________________
DRV.sub.-- LOAD
    Reads any configuration parameters associated with the driver. Allocates any memory required for execution. This call is only made the first time the driver is opened.
DRV.sub.-- ENABLE
    Ensures that an audio/comm board is installed and functional. For audio/comm board 206 of FIG. 2, this means the DSP interface 532 is accessible. This call is only made the first time the driver is opened.
DRV.sub.-- OPEN
    This call is made each time OpenDriver is called. The audio manager can be opened once for input, once for output (i.e., it supports one full duplex conversation), and any number of times for device capabilities query. This call allocates the per-application data. This includes information such as the callback and the application instance data and buffers for transferring audio between the host and the audio board for LAN connections. If this is an input or output call, it starts the DSP audio task and sets up communication between host and DSP audio task (e.g., set up mailboxes, register callbacks). If this is the first open of an input or output stream, it starts the commstub task.
__________________________________________________________________________
The installable device driver will respond to the close protocol messages defined by Microsoft.RTM.. The expected messages (generated by the Microsoft.RTM. SDK CloseDriver call to installable device drivers) and the driver's responses are as follows:
______________________________________
DRV.sub.-- CLOSE
    Frees the per-application data allocated in the DRV.sub.-- OPEN message.
DRV.sub.-- DISABLE
    Ignored.
DRV.sub.-- FREE
    Ignored.
______________________________________
This call sequence is symmetric with respect to the call sequence generated by OpenDriver. It has the same characteristics and behavior as the open sequence does. Namely, it receives one to three messages from the CloseDriver call dependent on the driver's state and it generates one callback per CloseDriver call. Three messages are received when the driver's final instance is being closed. Only the DRV.sub.-- CLOSE message is generated for other CloseDriver calls.
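The one-to-three message pattern described above can be modeled with a short C sketch. The enumeration DrvMsg and the functions open_messages and close_messages are hypothetical names that model only the ordering of the messages; they are not the Windows driver entry point, and the instance counter stands in for whatever state the system keeps.

```c
#include <assert.h>

/* Stand-ins for the driver messages named in the tables above. */
typedef enum {
    MSG_LOAD, MSG_ENABLE, MSG_OPEN, MSG_CLOSE, MSG_DISABLE, MSG_FREE
} DrvMsg;

/* Messages generated by one OpenDriver call. Returns the message count. */
int open_messages(int *open_count, DrvMsg out[3])
{
    int n = 0;
    assert(open_count != 0 && *open_count >= 0);
    if (*open_count == 0) {        /* first open: load and enable first */
        out[n++] = MSG_LOAD;
        out[n++] = MSG_ENABLE;
    }
    out[n++] = MSG_OPEN;           /* sent on every open */
    ++*open_count;
    return n;
}

/* Messages generated by one CloseDriver call. Returns the message count. */
int close_messages(int *open_count, DrvMsg out[3])
{
    int n = 0;
    assert(open_count != 0 && *open_count > 0);
    out[n++] = MSG_CLOSE;          /* sent on every close */
    if (--*open_count == 0) {      /* final instance: disable and free */
        out[n++] = MSG_DISABLE;
        out[n++] = MSG_FREE;
    }
    return n;
}
```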
The DRV.sub.-- CLOSE message closes the audio thread that corresponds to the audio stream indicated by HASTRM. The response to the close message is generated only after the board sends back a message indicating that the driver has closed. Therefore, this call is asynchronous.
AM.sub.-- LINKIN Message
The AM.sub.-- LINKIN message is sent to the driver whenever the audio manager function ALinkIn is called. Param1 is a pointer to the following structure:
______________________________________
typedef struct _ALinkStruct {
    BOOL   ToLink;   /* TRUE to link in, FALSE to unlink */
    CHANID ChanId;   /* channel to link with the playback stream */
} ALinkStruct, FAR * lpALinkStruct;
______________________________________
ToLink contains a BOOL value that indicates whether the stream is being linked in or unlinked (TRUE is linked in and FALSE is unlinked). If no error is detected and ToLink is TRUE, the channel and the playback stream should be linked together. The driver calls TII to determine whether the transport associated with the channel is ISDN. If so, the driver calls TII to determine the ID of the channel on the board associated with the TII channel ID. It then sends the Audio Task the ALINKIN.sub.-- TMSG message with the board channel ID as a parameter. This causes the Audio Task to link up with the specified comm channel and begin playing incoming audio. If the transport associated with the channel is not ISDN, the driver prepares to receive data from the specified TII channel and send the data to the commstub task. It then sends the Audio Task the ALINKIN.sub.-- HOST.sub.-- TMSG. This causes the Audio Task to link up with the commstub task to receive the audio data and play it.
Breaking the link between the audio stream handle and the channel ID is done when the ToLink field is set to FALSE. The audio manager sends the ALINKIN.sub.-- TMSG to the task along with the channel ID. The Audio Task responds to this message by unlinking the specified channel ID (i.e., it does not play any more audio).
Errors that the host task will detect are as follows:
The channel ID does not represent a valid read stream.
The audio stream handle is already linked or unlinked (detected on host).
The audio stream handle is not a playback handle.
If those or any interface errors (e.g., message pending) are detected, the callback associated with this stream is notified immediately. If no errors are detected, the ALINKIN.sub.-- TMSG or ALINKIN.sub.-- HOST.sub.-- TMSG is issued to the DSP interface and the message pending flag is set for this stream. Upon receiving the callback for this message, the callback associated with this stream is made, and finally the message pending flag is unset.
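The host-side checks and the choice between the two link-in task messages can be sketched in C as follows. All type and function names here (Transport, LinkInMsg, LinkStatus, AudioStream, linkin_check, linkin_message) are hypothetical and introduced only for illustration; the order in which the checks are made is an assumption, since the text lists the errors without ordering them.

```c
#include <assert.h>

typedef enum { TRANSPORT_ISDN, TRANSPORT_LAN } Transport;
typedef enum { TMSG_ALINKIN, TMSG_ALINKIN_HOST } LinkInMsg;
typedef enum {
    LINK_OK, LINK_ERR_NOT_PLAYBACK, LINK_ERR_BAD_CHANNEL, LINK_ERR_STATE
} LinkStatus;

/* Hypothetical host-side record for one audio stream handle. */
typedef struct {
    int is_playback;  /* nonzero if this is a playback handle */
    int linked;       /* nonzero if currently linked to a channel */
} AudioStream;

/* Host-side validation of a link (to_link != 0) or unlink request. */
LinkStatus linkin_check(const AudioStream *s, int chan_is_read_stream, int to_link)
{
    assert(s != 0);
    if (!s->is_playback)      return LINK_ERR_NOT_PLAYBACK;
    if (!chan_is_read_stream) return LINK_ERR_BAD_CHANNEL;
    if ((s->linked != 0) == (to_link != 0)) {
        return LINK_ERR_STATE;  /* already linked or already unlinked */
    }
    return LINK_OK;
}

/* ISDN channels live on the board; other transports go via the commstub task. */
LinkInMsg linkin_message(Transport t)
{
    return (t == TRANSPORT_ISDN) ? TMSG_ALINKIN : TMSG_ALINKIN_HOST;
}
```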
AM.sub.-- LINKOUT Message
The AM.sub.-- LINKOUT message is sent to the driver whenever the audio manager function ALinkOut is called. Param1 is a pointer to the following structure:
______________________________________
typedef struct _ALinkStruct {
    BOOL   ToLink;   /* TRUE to link out, FALSE to unlink */
    CHANID ChanId;   /* channel to link with the audio-in stream */
} ALinkStruct, FAR * lpALinkStruct;
______________________________________
ToLink contains a BOOL value that indicates whether the stream is being linked out or unlinked (TRUE is linked out and FALSE is unlinked). If no error is detected and ToLink is TRUE, the channel and the audio in stream should be linked together. The driver calls TII to determine whether the transport associated with the channel is ISDN. If so, the driver calls TII to determine the ID of the channel on the board associated with the TII channel ID. It then sends the Audio Task the ALINKOUT.sub.-- TMSG message with the board channel ID as a parameter. This causes the Audio Task to link up with the specified comm channel and send it captured audio. If the transport associated with the channel is not ISDN, the driver prepares to receive data from the commstub task and send it to the specified TII channel. It then sends the Audio Task the ALINKOUT.sub.-- HOST.sub.-- TMSG. This causes the Audio Task to link up with the commstub task to send it captured audio data.
Breaking the link between the audio stream handle and the channel ID is done when the ToLink field is set to FALSE. The audio manager sends the ALINKOUT.sub.-- TMSG to the task along with the channel ID. The Audio Task responds to this message by unlinking the specified channel ID (i.e., it does not send any more audio).
Errors that the host task will detect are as follows:
The channel ID does not represent a valid write stream.
The audio stream handle is already linked or unlinked (detected on host).
The audio stream handle is not an audio-in handle.
If those or any interface errors (e.g., message pending) are detected, the callback associated with this stream is notified immediately. If no errors are detected, the ALINKOUT.sub.-- TMSG or ALINKOUT.sub.-- HOST.sub.-- TMSG is issued to the DSP interface and the message pending flag is set for this stream. Upon receiving the callback for this message, the callback associated with this stream is made, and finally the message pending flag is unset.
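The AM.sub.-- LINKOUT handling described above can be sketched in C as follows. This is only an illustrative sketch: the type names, error codes, and function names are hypothetical and do not appear in this specification, the TII queries are reduced to boolean parameters, and choosing the message by transport for both linking and unlinking is an assumption.

```c
#include <stdbool.h>

/* Hypothetical identifiers for the two task messages named in the text. */
typedef enum { MSG_NONE, MSG_ALINKOUT_TMSG, MSG_ALINKOUT_HOST_TMSG } TaskMsg;

/* Hypothetical status codes for the host-detected errors. */
typedef enum {
    LINK_OK,
    LINK_ERR_BAD_CHANNEL,    /* channel ID is not a valid write stream */
    LINK_ERR_STATE,          /* stream already linked or already unlinked */
    LINK_ERR_NOT_AUDIO_IN,   /* stream handle is not an audio in handle */
    LINK_ERR_MSG_PENDING     /* interface error: a message is outstanding */
} LinkStatus;

/* Hypothetical per-stream state kept by the host driver. */
typedef struct {
    bool is_audio_in;
    bool linked;
    bool message_pending;
} AudioStream;

/* Validate an AM_LINKOUT request and choose the task message to issue.
 * On error the caller would notify the stream's callback immediately. */
LinkStatus handle_linkout(AudioStream *s, bool to_link, bool chan_is_valid_write,
                          bool transport_is_isdn, TaskMsg *out_msg)
{
    *out_msg = MSG_NONE;
    if (s->message_pending)   return LINK_ERR_MSG_PENDING;
    if (!chan_is_valid_write) return LINK_ERR_BAD_CHANNEL;
    if (!s->is_audio_in)      return LINK_ERR_NOT_AUDIO_IN;
    if (s->linked == to_link) return LINK_ERR_STATE;

    /* ISDN: the board's comm task carries the audio; otherwise the commstub task does. */
    *out_msg = transport_is_isdn ? MSG_ALINKOUT_TMSG : MSG_ALINKOUT_HOST_TMSG;
    s->message_pending = true;   /* unset when the task's callback arrives */
    return LINK_OK;
}

/* Called when the task acknowledges the message: update state, clear the flag. */
void on_task_callback(AudioStream *s, bool to_link)
{
    s->linked = to_link;
    s->message_pending = false;
}
```

In this sketch the message pending flag serializes requests per stream: a second request issued before the callback arrives is rejected rather than queued.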
Audio Manager Interface with the DSP Interface
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954.
Host Processor to Audio/Comm Board Messages
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954, except for the following:
______________________________________
ALINKIN.sub.-- TMSG: Connects/disconnects the audio task with a virtual circuit supported by the network task. The local and remote channel IDs (valid on the board) are passed to the audio task in the first two DWORDs of the dwArgs array. The flag specifying whether to link or unlink is passed in the third DWORD.
ALINKIN.sub.-- HOST.sub.-- TMSG: Connects/disconnects the audio task with the commstub task to receive audio from the host. The flag specifying whether to link or unlink is passed to the audio task in the third DWORD of the dwArgs array. The first two DWORDs are ignored.
ALINKOUT.sub.-- TMSG: Connects/disconnects the audio task with a virtual circuit supported by the network task. The local and remote channel IDs (valid on the board) are passed to the audio task in the first two DWORDs of the dwArgs array. The flag specifying whether to link or unlink is passed in the third DWORD.
ALINKOUT.sub.-- HOST.sub.-- TMSG: Connects/disconnects the audio task with the commstub task to send captured audio to the host. The flag specifying whether to link or unlink is passed to the audio task in the third DWORD of the dwArgs array. The first two DWORDs are ignored.
______________________________________
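As an illustration of the argument layout listed above, the following C sketch fills a three-element dwArgs array. The function names, the DWORD32 stand-in type, the ordering of local before remote channel ID, and the encoding of the flag as 1 (link) or 0 (unlink) are assumptions made for the example; the specification states only which DWORD positions carry which values.

```c
#include <stdint.h>

typedef uint32_t DWORD32;   /* hypothetical stand-in for the DWORD type */

/* ALINKIN_TMSG / ALINKOUT_TMSG: channel IDs in the first two DWORDs,
 * link/unlink flag in the third. */
void pack_link_args(DWORD32 dwArgs[3], DWORD32 local_chan,
                    DWORD32 remote_chan, int to_link)
{
    dwArgs[0] = local_chan;
    dwArgs[1] = remote_chan;
    dwArgs[2] = to_link ? 1u : 0u;
}

/* ALINKIN_HOST_TMSG / ALINKOUT_HOST_TMSG: only the third DWORD is used;
 * the first two are ignored by the audio task, so they are zeroed here. */
void pack_host_link_args(DWORD32 dwArgs[3], int to_link)
{
    dwArgs[0] = 0u;
    dwArgs[1] = 0u;
    dwArgs[2] = to_link ? 1u : 0u;
}
```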
Audio/Comm Board to Host Processor Messages
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954.
Wave Audio Implementation
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954.
Audio Subsystem Audio/Comm (ISDN) Board-Resident Implementation
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954. In addition, the audio task 538 of FIG. 13 connects with the commstub task 1308. This interface allows the audio task to exchange compressed data packets of audio samples with the host 202, which is responsible for delivering them to the remote system when the network is not ISDN (e.g., LAN). As the name implies, this task is a stand-in for the comm task. The interface is the same as that between the audio task 538 and the comm task 540.
Audio Task Interface with Host Device Driver
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954.
Audio Task Interface with Audio Hardware
Referring now to FIG. 15, there is shown a block diagram of the interface between the audio task 538 and the audio hardware of audio/comm (ISDN) board 206 of FIG. 13. The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954.
Timestamp Driver
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954.
(De)Compression Drivers
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954.
Mixer/Splitter Driver
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954.
Mixer Internal Operation
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954.
Echo Suppression Driver
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Spectral Equalization
In one embodiment of the present invention, the microphone 104 and speaker 108 of a conferencing node of FIG. 1 are part of a single earpiece component, such as an Enterprise.TM. headset sold by Plantronics. Because the microphone is located away from the mouth and in physical contact with the user's head near the ear, the audio signals may become distorted. These distortions may be due to reverberation signals that reflect off the user's cheek, sounds from the user's mouth that become out of phase at the microphone, and/or the directionality/loss of the higher frequencies. These distortions may combine with artifacts of the audio coder to degrade the quality of the audio portion of a conferencing session.
Digital filtering is applied to the audio signals to attempt to correct for the distortions that result from using a combined microphone/speaker earpiece. When using the Plantronics Enterprise.TM. microphone, the digital filter is implemented using a cascade of a second-order high-pass Chebyshev Type I Infinite Impulse Response filter and a sixth-order Infinite Impulse Response filter designed using the Steiglitz approximation, which produces a 3 dB bump at 2 kHz to enhance perception.
This digital filtering is implemented as part of the equalizer stackable driver 1514 in the capture side audio processing as shown in FIG. 15. The equalizer driver 1514 can be selectively enabled or disabled. When the user selects a combined earpiece headset, then the equalizer driver 1514 is enabled and each audio frame is digitally filtered before being passed to the next driver on the audio stack (i.e., echo/suppression stackable driver 1512 of FIG. 15). When the user selects another configuration of microphone and speaker (e.g., a speakerphone or a directional boom microphone headset), then the equalizer driver 1514 is disabled and each audio frame is passed on to the echo/suppression driver 1512 without any processing. The equalizer driver 1514 is implemented as a driver under the Spectron Microsystems SPOX.TM. operating system.
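A selectively enabled equalizer of this kind can be sketched in C as a cascade of second-order IIR sections: one section for the second-order high-pass filter and three for the sixth-order filter. This is a sketch under stated assumptions, not the driver's implementation: the names are hypothetical, samples are shown as doubles for clarity, and the coefficients are pass-through placeholders because the specification does not give the actual Chebyshev or Steiglitz-derived values.

```c
#include <stddef.h>

#define EQ_SECTIONS 4   /* 1 section (2nd-order high-pass) + 3 sections (6th-order filter) */

/* One second-order IIR section, a0 normalized to 1. */
typedef struct {
    double b0, b1, b2, a1, a2;  /* placeholder coefficients, not from the specification */
    double z1, z2;              /* filter state (transposed direct form II) */
} Biquad;

typedef struct {
    int enabled;                /* nonzero when a combined earpiece headset is selected */
    Biquad s[EQ_SECTIONS];
} Equalizer;

/* Process one sample through one section. */
static double biquad_step(Biquad *q, double x)
{
    double y = q->b0 * x + q->z1;
    q->z1 = q->b1 * x - q->a1 * y + q->z2;
    q->z2 = q->b2 * x - q->a2 * y;
    return y;
}

/* Initialize every section to unity gain so the cascade passes audio unchanged. */
void eq_init_passthrough(Equalizer *eq)
{
    eq->enabled = 1;
    for (int k = 0; k < EQ_SECTIONS; k++) {
        Biquad unity = { 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 };
        eq->s[k] = unity;
    }
}

/* Filter a frame in place; when disabled the frame is passed on untouched. */
void eq_process_frame(Equalizer *eq, double *frame, size_t n)
{
    if (!eq->enabled) return;
    for (size_t i = 0; i < n; i++) {
        double v = frame[i];
        for (int k = 0; k < EQ_SECTIONS; k++)
            v = biquad_step(&eq->s[k], v);
        frame[i] = v;
    }
}
```

Splitting a higher-order IIR filter into second-order sections is a common way to limit coefficient sensitivity, which is why the sketch uses a cascade rather than a single eighth-order difference equation.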
Audio Task Interface with Comm Task
Referring again to FIG. 13, the audio task 538 sends and receives audio packets from either the comm task 540 or the commstub task 1308, depending on whether the network connection is over ISDN or LAN. The interface the audio task uses is the same in either case. Throughout this section, references to comm task 540 also apply to commstub task 1308.
The interface between the audio task and the audio hardware is based on SPOX streams. Unfortunately, SPOX streams connect tasks to source and sink device drivers, not to each other. Audio data are contained within SPOX array objects and associated with streams. To avoid unnecessary buffer copies, array objects are passed back and forth between the comm and audio subsystems running on the audio/comm board using SPOX streams and a pipe driver. The actual pipe driver used will be based on a SPOX driver called NULLDEV. Like Spectron's version, this driver simply redirects buffers it receives as an IO.sub.-- SINK to the IO.sub.-- SOURCE stream; no buffer copying is performed. Unlike Spectron's pipe driver, however, NULLDEV does not block the receiving task if no buffers are available from the sending stream and discards buffers received from the IO.sub.-- SOURCE stream if no task has made the IO.sub.-- SINK stream connection to the driver. In addition, NULLDEV will not block or return errors to the sender. If no free buffers are available for exchange with the sender's live buffer, NULLDEV returns a previously queued live buffer. This action simulates a dropped packet condition.
Setup and teardown of these pipes will be managed by a message protocol between the comm task and audio task threads utilizing the existing TMB mailbox architecture built into the Mikado DSP interface. The interface assumes that the comm task or commstub task is running, a network connection has been established, and channel ID's (i.e., virtual circuit ID's) have been allocated to the audio subsystem by the conferencing API. The interface requires the comm task and commstub task each to make available to the audio threads the handle to its local mailbox TMB.sub.-- MYMBOX. This is the mailbox a task uses to receive messages from the host processor. The mailbox handle is copied to a global memory location and retrieved by the threads using the global data package discussed later in this specification. The audio task chooses which mailbox to use, and thus whether to communicate with the comm task or the commstub task, based on which message it receives from the host. ALINKOUT.sub.-- TMSG and ALINKIN.sub.-- TMSG cause it to use the comm task mailbox, and ALINKOUT.sub.-- HOST.sub.-- TMSG and ALINKIN.sub.-- HOST.sub.-- TMSG cause it to use the commstub task mailbox. In the case of an ISDN connection, the audio task becomes the channel handler for the audio channels. Otherwise, the audio driver on the host becomes the channel handler.
Message Protocol
Referring now to FIG. 16, there is shown a block diagram of the interface between the audio task 538 and the comm task 540 of FIGS. 5 and 13. The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954, which applies to conferencing over an ISDN connection. In addition, for a LAN connection, the processing is analogous to that for the ISDN connection, with the following differences:
The commstub task replaces the comm task.
The ALINKOUT.sub.-- HOST.sub.-- TMSG message replaces the ALINKOUT.sub.-- TMSG message.
The ALINKIN.sub.-- HOST.sub.-- TMSG message replaces the ALINKIN.sub.-- TMSG message.
The commstub task sends buffers to and receives buffers from the host.
Global Data Package
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954.
NULLDEV Driver
The SPOX image for the audio/comm board contains a device driver that supports interprocess communication through the stream (SS) package. The number of distinct streams supported by NULLDEV is controlled by a defined constant NBRNULLDEVS in NULLDEV.H. NULLDEV supports three streams. One is used for the audio task capture thread to communicate with the comm task for ISDN connection. Another is used by the playback thread to communicate with the comm task. The third is for the audio capture task to communicate with the commstub task for LAN connection. The assignment of device names to tasks is done by the following three constants in ASTASK.H:
______________________________________
#define AS_CAPTURE_PIPE      "/null"
#define AS_PLAYBACK_PIPE     "/null2"
#define AS_HOST_CAPTURE_PIPE "/null3"
______________________________________
Support for additional streams may be obtained by changing the NBRNULLDEVS constant and recompiling NULLDVR.C. The SPOX config file is also adjusted by adding additional device name strings to this section as follows:
______________________________________
driver NULLDEV_driver {
    "/null": devid = 0;
    "/null2": devid = 1;
    "/null3": devid = 2;
};
______________________________________
The next device in the sequence has devid=3.
SS.sub.-- get() calls to NULLDEV receive an error if NULLDEV's ready queue is empty. It is possible to SS.sub.-- put() to a NULLDEV stream that has not been opened for SS.sub.-- get() on the other end. Data written to the stream in this case is discarded. In other words, input live buffers are simply appended to the free queue. SS.sub.-- put() never returns an error to the caller. If no buffers exist on the free queue for exchange with the incoming live buffer, NULLDEV removes the buffer at the head of the ready queue and returns it as the free buffer.
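The buffer-exchange rules just described can be modeled with two queues. The following C sketch is a simplified model only: the type and function names are hypothetical, it does not use the real SPOX SS or driver interfaces, and it assumes that a get call hands in a free buffer in exchange for the live one and that the device is primed with at most QCAP buffers.

```c
#include <stddef.h>

#define QCAP 8   /* hypothetical queue capacity */

/* Simple ring queue of buffer pointers. */
typedef struct { void *items[QCAP]; size_t head, count; } BufQueue;

static int q_push(BufQueue *q, void *b)
{
    if (q->count == QCAP) return -1;
    q->items[(q->head + q->count) % QCAP] = b;
    q->count++;
    return 0;
}

static void *q_pop(BufQueue *q)
{
    if (q->count == 0) return NULL;
    void *b = q->items[q->head];
    q->head = (q->head + 1) % QCAP;
    q->count--;
    return b;
}

/* Model of one NULLDEV stream pair. */
typedef struct {
    BufQueue free_q;    /* empty buffers available for exchange */
    BufQueue ready_q;   /* live buffers waiting for the reader */
    int reader_open;    /* nonzero once the stream is opened for get on the other end */
} NullDev;

/* Put: never fails and never blocks; always hands a buffer back to the sender. */
void *nulldev_put(NullDev *d, void *live)
{
    void *exchange = q_pop(&d->free_q);
    if (!d->reader_open) {
        /* No reader: data is discarded, so the live buffer just joins the free queue. */
        if (exchange == NULL) return live;
        q_push(&d->free_q, live);
        return exchange;
    }
    if (exchange == NULL)
        exchange = q_pop(&d->ready_q);   /* no free buffer: drop the oldest queued packet */
    if (exchange == NULL) return live;   /* device holds no buffers: the new packet is dropped */
    q_push(&d->ready_q, live);
    return exchange;
}

/* Get: returns -1 (error) if the ready queue is empty; otherwise exchanges buffers. */
int nulldev_get(NullDev *d, void *free_buf, void **out_live)
{
    void *live = q_pop(&d->ready_q);
    if (live == NULL) return -1;
    q_push(&d->free_q, free_buf);
    *out_live = live;
    return 0;
}
```

Because every call exchanges one buffer for another, the number of buffers held by the device stays constant, and neither side ever has to wait for the other.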
PWave Subsystem
The PWave subsystem provides high-priority playback of digital audio signals contained in Microsoft.RTM. standard Wave files.
PWave API
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
High Priority Playback Task
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
PWave Protocol
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/340,171, filed Nov. 15, 1994, now abandoned.
Comm Subsystem
The communications (comm) subsystem of conferencing system 100 of FIG. 5 comprises:
Comm API 510, comm manager 518, DSP interface 528, and portions of the network stacks 560 running on host processor 202 of FIG. 2,
Portions of the network stacks 560 running on LAN board 210, and
Comm task 540 running on audio/comm (ISDN) board 206.
The comm subsystem provides connectivity functions to the conferencing application programs 502 and 504. It maintains and manages the session, connection, and virtual channel states. All connection control, as well as all data communication, is done through the comm subsystem.
Referring now to FIG. 17, there is shown a block diagram of the comm subsystem of conferencing system 100 of FIG. 5. The comm subsystem consists of the following layers that reside on host processor 202, the audio/comm (ISDN) board 206, and LAN board 210:
Transport independent interface 510 (TII.DLL),
Datalink module 1702 (DLM.DLL+KPDAPI.DLL, where KPDAPI.DLL is the back-end of the DLM which communicates with the DSP interface 528),
Reliable datalink module 1704 (RDLM.DLL),
Global dynamic loader 1706 (GDL.DLL),
Global dynamic loader executable 1708 (GDLE.EXE),
Control (D channel) 1710,
D channel driver 1712,
Data comm tasks 1714,
B channel drivers 1716,
LAN datalink module 1718 (DLMLAN.DLL),
The appropriate LAN media dependent modules 1720 (MDM.DLLs),
The appropriate comm stacks 560, and
The MDM helper task 1722 (MDMHELPR.DLL).
TII 510, DLM 1702, DSP interface 528, RDLM 1704, DLMLAN 1718, the MDMs 1720, portions of the comm stacks 560, MDMHELPR 1722, GDL 1706, and GDLE.EXE 1708 reside entirely on the host processor. Control (D channel) 1710, D channel driver 1712, data comm tasks 1714, and B channel drivers 1716 reside on audio/comm (ISDN) board 206. Portions of the comm stacks 560 reside on the LAN board 210.
The comm interface provides a "transport independent interface" for the conferencing applications. This means that the comm interface hides all the network dependent features of the conferencing system. For ISDN connections, conferencing system 100 uses the ISDN Basic Rate Interface (BRI) which provides 2*64 KBits/sec data (B) channels and one signaling (D) channel (2B+D). Conferencing system 100 also uses conventional LAN connections.
The comm subsystem provides an interface by which the conferencing applications can gain access to the communication hardware. The goal of the interface is to hide the implementation of the connectivity mechanism and provide an easy to use interface. This interface provides a very simple (yet functional) set of connection control features, as well as data communication features. The conferencing applications use virtual channels for data communication. Virtual channels are simplex, which means that two virtual channels are open for full duplex communication between peers. Each conferencing application opens its outgoing channel which is write-only. The incoming (read-only) channels are created by "accepting" an "open channel" request from the peer.
ISDN-Based Conferencing
Referring now to FIG. 18, there is shown a block diagram of the comm subsystem architecture for two conferencing systems 100 participating in a conferencing session over an ISDN connection. The comm subsystem provides an asynchronous interface between the audio/comm (ISDN) board 206 and the conferencing applications 502 and 504.
The comm subsystem provides all the software modules that manage the two ISDN B channels. The comm subsystem provides a multiple virtual channel interface for the B channels. Each virtual channel is associated with a transmission priority. The data queued for the higher priority channels are transmitted before the data in the lower priority queues. The virtual channels are unidirectional. The conferencing applications open write-only channels. The conferencing applications acquire read-only channels as a result of accepting an open channel request from the peer. The DLM supports the virtual channel interface.
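The priority rule for virtual channels can be illustrated with a short C sketch that selects which channel to service next. The names are hypothetical, and two details are assumptions not stated in the specification: a larger number means a higher priority, and ties go to the lower channel index.

```c
#include <stddef.h>

/* Hypothetical per-channel transmit state. */
typedef struct {
    unsigned priority;   /* assumption: larger value = higher priority */
    size_t   queued;     /* number of packets waiting to be transmitted */
} VChannel;

/* Return the index of the channel to transmit from next, or -1 if no
 * channel has queued data. Higher priority channels are always served
 * before lower priority ones. */
int next_channel(const VChannel *ch, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (ch[i].queued == 0) continue;
        if (best < 0 || ch[i].priority > ch[best].priority)
            best = (int)i;
    }
    return best;
}
```

A strict rule like this lets time-critical traffic pre-empt bulk data, at the cost that a busy high priority channel can starve lower ones.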
During an ISDN-based conferencing session, the comm subsystem software handles all the multiplexing and inverse multiplexing of virtual channels over the B channels. The number of available B channels (and the fact that there is more than one physical channel available) is not a concern to the application.
The comm subsystem provides the D channel signaling software to the audio/comm (ISDN) board. The comm subsystem is responsible for providing the ISDN B channel device drivers for the audio/comm (ISDN) board. The comm subsystem provides the ISDN D channel device drivers for the audio/comm (ISDN) board. The comm software is certifiable in North America (U.S.A., Canada) and Europe. The signaling software is compatible with NI1, AT&T Custom, and Northern Telecom DMS-100.
LAN-Based Conferencing
For LAN-based conferencing, the comm subsystem provides an asynchronous interface between the LAN board 210 and the conferencing applications 502 and 504. The comm subsystem provides all the software modules that manage the LAN communication network 110. The comm subsystem provides a multiple virtual channel interface for the LAN interconnection between the conferencing machines. Each virtual channel is associated with a transmission priority. The data queued for the higher priority channels are transmitted before the data in the lower priority queues. The virtual channels are unidirectional. The conferencing applications open write-only channels. The conferencing applications acquire read-only channels as a result of accepting an open channel request from the peer. The DLMLAN module supports the virtual channel interface.
During a LAN-based conferencing session, the comm subsystem handles all the multiplexing and inverse multiplexing of virtual channels over the typically singular LAN interconnection. The number of network `sockets` or connection points is not a concern to the application.
When the video conferencing connection is across the LAN, comm stack 560 receives the compressed audio generated by the remote site and stores it to host memory. The appropriate LAN MDM 1720 of FIG. 17 and DLMLAN 1718 then reconstruct the compressed audio stream as the sequence of packets supplied by the audio manager on the remote site to that site's LAN comm subsystem. The comm manager 518 then passes the audio packets to the audio manager 520, which sends the packets to the audio task on audio/comm (ISDN) board 206 for playback.
qMUX MULTIPLE CHANNEL STREAMING MODULE
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954. In addition, for LAN-based conferencing, the LAN implementation of the DLM interface (DLMLAN) 1718 provides the same functionality on the LAN that DLM 1702 does for ISDN-based conferencing, i.e., virtual channels and transport independent message sizes. The DLMLAN implementation is supported on another abstraction layer, the media dependent modules (MDMs) 1720. The MDMs have a common MDM API and they implement the required functionality on top of an existing LAN protocol stack (e.g., IPX, TCP/IP). A single MDM helper task (MDMHELPR) 1722 assists the MDMs by generating threads of execution for callbacks and data transmission.
Comm API
The description for this section is the same as the description for the section of the same name in U.S. patent application Ser. No. 08/157,694, now U.S. Pat. No. 5,506,954. In addition, sessions and connections have associated addresses, represented by the TADDR structure. A TADDR consists of a transport type and up to 80 bytes of addressing information. The transport type specifies if this is an ISDN or LAN address. Referring again to FIG. 17, TII 510 determines which DLM will be servicing a given address by passing it to the Global Dynamic Loader (GDL) module 1706. GDL 1706 and its associated helper task GDLE 1708 load the appropriate module into memory and return all of the DLM entry points to TII 510. If this is a LAN address, the DLMLAN 1718 will then consult GDL 1706 in order to load the appropriate MDM 1720. DLMLAN 1718 receives back from GDL 1706 a list of the appropriate MDM entry points. GDL 1706 and GDLE 1708 determine the appropriate DLM and MDM to load by reading the file GDL.INI which is written when the product is installed. This file specifies the MDMs that are appropriate based on the configuration of the user's machine. Further description of the operations of global dynamic loaders and global dynamic loader executables is presented in U.S. patent application Ser. No. 08/133,612, now U.S. Pat. No. 5,410,698. Additional information on the comm API is found in APPENDIX E of this specification.
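The address handling described above can be sketched in C as follows. The actual TADDR layout is defined in APPENDIX E and is not reproduced here, so the field names, the enum, and both functions are hypothetical; only the 80-byte limit, the ISDN/LAN distinction, and the module names DLM.DLL and DLMLAN.DLL come from this specification. The real selection is made by GDL 1706 from GDL.INI rather than by a hard-coded test.

```c
#include <stddef.h>
#include <string.h>

#define TADDR_MAX_ADDR 80   /* up to 80 bytes of addressing information */

/* Hypothetical transport type values. */
typedef enum { TRANSPORT_ISDN, TRANSPORT_LAN } TransportType;

/* Hypothetical layout: a transport type plus opaque address bytes. */
typedef struct {
    TransportType  type;
    unsigned short addr_len;
    unsigned char  addr[TADDR_MAX_ADDR];
} TAddr;

/* Fill in an address; returns -1 if the address does not fit in 80 bytes. */
int taddr_init(TAddr *t, TransportType type, const void *addr, size_t len)
{
    if (len > TADDR_MAX_ADDR) return -1;
    memset(t, 0, sizeof *t);
    t->type = type;
    t->addr_len = (unsigned short)len;
    memcpy(t->addr, addr, len);
    return 0;
}

/* Name the datalink module that would service this address. */
const char *dlm_for(const TAddr *t)
{
    return t->type == TRANSPORT_ISDN ? "DLM.DLL" : "DLMLAN.DLL";
}
```

Keeping the address bytes opaque at this level is what allows TII to stay transport independent: only the selected DLM, and for LAN addresses the MDM it loads, needs to interpret them.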
Automatic Transport Detection
Conferencing system 100 of FIG. 1 is capable of supporting conferencing over different types of transports (e.g., ISDN and LAN). Moreover, conferencing system 100 is capable of supporting LAN-based conferencing under different LAN transport standards (e.g., Novell IPX, Internet User Datagram Protocol (UDP), and/or NetBIOS standards). Further still, conferencing system 100 is capable of supporting LAN-based conferencing with different LAN products for a single LAN transport standard (e.g., LAN WorkPlace (LWPUDP) by Novell and FTPUDP by FTP Software, Inc., both of which conform to the LAN UDP standard).
In order for a particular conferencing system 100 to be able to exercise the full range of its conferencing options, it knows which of the supported transports are installed. Conferencing system 100 is able to determine automatically which supported transports are installed. This automatic transport detection may be implemented at install time (i.e., when conferencing system 100 is installed in a PC node) and/or at run time (i.e., when conferencing system 100 is ready to begin conferencing).
Although different LAN products that conform to the same transport standard will generate data with the same packet format, they may have different APIs for generating and interpreting those packets. Thus, automatic transport detection determines which transport products are installed as well as which transport types and standards are supported. Each different supported transport will typically have a corresponding media dependent module (MDM). A goal of automatic transport detection is to identify (and store in the GDL.INI file) the specific MDMs to be used to communicate with the specific network transport stacks that are supported and installed in conferencing system 100.
Install-Time Processing
Conferencing systems 100 may be configured to support conferencing over different sets of transports. For example, a particular conferencing system 100 may support conferencing over ISDN, Novell IPX, and UDP, but not NetBIOS. The supported transports are presented to the conferencing application 502 of conferencing system 100 as a list of numbers corresponding to the supported transports. Possible supported |