Synchronization of diverse media

Method and an apparatus for reproducing bitstream having non-sequential system clock data seamlessly therebetween

5923869

Abstract

A system stream contiguous reproduction apparatus to which are input one or more system streams interleaving at least moving picture data and audio data, and system stream connection information includes a system clock STC generator for producing the system clock that is used as the system stream reproduction reference clock. The system stream contiguous reproduction apparatus further includes one or more signal processing decoders that operate referenced to the system clock STC, decoder buffers for temporarily storing the system stream data transferred to the corresponding signal processing decoders, and STC selectors for selecting a system clock STC referenced by the signal processing decoders when decoding the first system stream, and another system clock STC referenced by the signal processing decoders when decoding a second system stream reproduced contiguously to the first system stream.


Claims

What is claimed is:

1. A data stream reproduction apparatus for reproducing data streams, including a plurality of data units, after storing the data streams in a buffer, the data units being respectively associated with first time codes for indicating transfer timings at which the respective data units are input to the buffer, at least one data unit being associated with a second time code for indicating a presentation timing at which the at least one data unit is reproduced, the apparatus comprising:

a decoder, which includes the buffer, for storing the data streams input thereto and then decoding the stored data streams with reference to a reference clock, so as to reproduce the stored data streams based on the second time code in such a manner that the at least one data unit with the second time code is reproduced at the presentation timing;

a data stream supplying means for supplying the data streams to the buffer with reference to the reference clock so as to input the data units at the transfer timings based on the first time codes, respectively, and

a controller for supplying the reference clock to the decoder and the data stream supplying means,

wherein the controller comprises:

a system clock generator for generating a first clock and a second clock different from the first clock; and

a system clock selector for selectively outputting the first and second clocks in such a manner that, during a first period, one of the first and second clocks is supplied as the reference clock both to the data stream supplying means and the decoder and, during a second period, one of the first and second clocks is supplied to the data stream supplying means while the other of the first and second clocks is supplied to the decoder.

2. A data stream reproduction apparatus according to claim 1, wherein the data streams include a first data stream and a second data stream following the first data stream, the first and second data streams to be reproduced contiguously, and

wherein the first period terminates after a last data unit of the first data stream is supplied to the buffer of the decoder, while the second period follows the first period and terminates after the last data unit of the first data stream is reproduced.

3. An optical disk for use with a data stream reproduction apparatus, the data stream reproduction apparatus is for reproducing data streams, including a plurality of data units, after storing the data streams in a buffer, the data units being respectively associated with first time codes for indicating transfer timings at which the respective data units are input to the buffer, at least one data unit being associated with a second time code for indicating a presentation timing at which the at least one data unit is reproduced,

the apparatus including:

a decoder, which includes the buffer, for storing the data streams input thereto and then decoding the stored data streams with reference to a reference clock, so as to reproduce the stored data streams based on the second time code in such a manner that the at least one data unit with the second time code is reproduced at the presentation timing;

a data stream supplying means for supplying the data streams to the buffer with reference to the reference clock so as to input the data units at the transfer timings based on the first time codes, respectively, and

a controller for supplying the reference clock to the decoder and the data stream supplying means,

wherein the controller comprises:

a system clock generator for generating a first clock and a second clock different from the first clock; and

a system clock selector for selectively outputting the first and second clocks in such a manner that, during a first period, one of the first and second clocks is supplied as the reference clock both to the data stream supplying means and the decoder and, during a second period, one of the first and second clocks is supplied to the data stream supplying means while the other of the first and second clocks is supplied to the decoder;

the optical disk comprising:

a data region for storing the data streams including first and second data streams; and

a management information region for storing connection information which is utilized during data stream reproduction and indicates that the first data stream is followed by the second data stream, wherein

the connection information includes system clock selection information for use during data stream reproduction so that the first data stream is supplied to the buffer and then decoded with reference to one of the first and second clocks while the second data stream is supplied to the buffer and then decoded with reference to the other of the first and second clocks.

4. An optical disk according to claim 3, wherein the first and second data streams are to be reproduced contiguously, and

wherein the first period terminates after a last data unit of the first data stream is supplied to the buffer of the decoder, while the second period follows the first period and terminates after the last data unit of the first data stream is reproduced.

5. A data stream reproduction method for reproducing data streams, including a plurality of data units, after storing the data streams in a buffer, the data units being respectively associated with first time codes for indicating transfer timings at which the respective data units are input to the buffer, at least one data unit being associated with a second time code for indicating a presentation timing at which the at least one data unit is reproduced, the method comprising:

storing the data streams in a decoder, which includes the buffer, and then decoding the stored data streams with reference to a reference clock, so as to reproduce the stored data streams based on the second time code in such a manner that the at least one data unit with the second time code is reproduced at the presentation timing;

supplying the data streams to the buffer via a data stream supplying means with reference to the reference clock so as to input the data units at the transfer timings based on the first time codes, respectively, and

supplying the reference clock to the decoder and the data stream supplying means,

wherein supplying the reference clock includes:

generating a first clock and a second clock different from the first clock; and

selectively outputting the first and second clocks in such a manner that, during a first period, one of the first and second clocks is supplied as the reference clock both to the data stream supplying means and the decoder and, during a second period, one of the first and second clocks is supplied to the data stream supplying means while the other of the first and second clocks is supplied to the decoder.

6. A data stream reproduction method according to claim 5, wherein the data streams include a first data stream and a second data stream following the first data stream, the first and second data streams to be reproduced contiguously, and

wherein the first period terminates after a last data unit of the first data stream is supplied to the buffer of the decoder, while the second period follows the first period and terminates after the last data unit of the first data stream is reproduced.


Description

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a method and apparatus for seamlessly reproducing a bitstream having non-sequential system clock data therein and, more specifically, to a bitstream for use in an authoring system for variously processing a data bitstream comprising the video data, audio data, and sub-picture data constituting each of plural program titles containing related video data, audio data, and sub-picture data content to generate a bitstream from which a new title containing the content desired by the user can be reproduced, and efficiently recording and reproducing the generated bitstream using a particular recording medium.

2. Description of the Prior Art

Authoring systems used to produce program titles comprising related video data, audio data, and sub-picture data by digitally processing, for example, multimedia data comprising video, audio, and sub-picture data recorded to laser disk or video CD formats are currently available. Systems using video CDs in particular are able to record video data to a CD format disk, which was originally designed with an approximately 600 MB recording capacity for storing digital audio data only, by using such high efficiency video compression techniques as MPEG. As a result of the increased effective recording capacity achieved using data compression techniques, karaoke titles and other conventional laser disk applications are gradually being transferred to the video CD format.

Users today expect both sophisticated title content and high reproduction quality. To meet these expectations, each title must be composed from bitstreams with an increasingly deep hierarchical structure. The data size of multimedia titles written with bitstreams having such deep hierarchical structures, however, is ten or more times greater than the data size of less complex titles. The need to edit small image (title) details also makes it necessary to process and control the bitstream using low order hierarchical data units.

It is therefore necessary to develop and prove a bitstream structure and an advanced digital processing method including both recording and reproduction capabilities whereby a large volume, multiple level hierarchical digital bitstream can be efficiently controlled at each level of the hierarchy. Also needed are an apparatus for executing this digital processing method, and a recording medium to which the bitstream digitally processed by the apparatus can be efficiently recorded for storage and from which the recorded information can be quickly reproduced.

Means for increasing the storage capacity of conventional optical disks have been widely researched to address the recording medium aspect of this problem. One way to increase the storage capacity of the optical disk is to reduce the spot diameter D of the optical (laser) beam. If the wavelength of the laser beam is λ and the numerical aperture of the objective lens is NA, then the spot diameter D is proportional to λ/NA, and the storage capacity can be efficiently improved by decreasing λ and increasing NA.
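
The proportionality stated above can be written out as follows, with λ denoting the laser wavelength and NA the numerical aperture of the objective lens. The second relation is an added first-order assumption, not taken from the text: it supposes that the recordable capacity of a given disk area scales inversely with the area of the spot.

```latex
D \propto \frac{\lambda}{\mathrm{NA}},
\qquad
\text{capacity} \propto \frac{1}{D^{2}} \propto \left(\frac{\mathrm{NA}}{\lambda}\right)^{2}
```

Under that assumption, halving λ or doubling NA would each roughly quadruple the capacity, which is why both parameters are attractive targets.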

As described, for example, in U.S. Pat. No. 5,235,581, however, coma caused by a relative tilt between the disk surface and the optical axis of the laser beam (hereafter "tilt") increases when a large aperture (high NA) lens is used. To prevent tilt-induced coma, the transparent substrate must be made very thin. The problem is that the mechanical strength of the disk is low when the transparent substrate is very thin.

MPEG1, the conventional method of recording and reproducing video, audio, and graphic signal data, has also been replaced by the more robust MPEG2 method, which can transfer large data volumes at a higher rate. It should be noted that the compression method and data format of the MPEG2 standard differ somewhat from those of MPEG1. The specific content of and differences between MPEG1 and MPEG2 are described in detail in the ISO-11172 and ISO-13818 MPEG standards, and further description thereof is omitted below.

Note, however, that while the structure of the encoded video stream is defined in the MPEG2 specification, the hierarchical structure of the system stream and the method of processing lower hierarchical levels are not defined.

As described above, it is therefore not possible in a conventional authoring system to process a large data stream containing sufficient information to satisfy many different user requirements. Moreover, even if such a processing method were available, the processed data recorded thereto cannot be repeatedly used to reduce data redundancy because there is no large capacity recording medium currently available that can efficiently record and reproduce high volume bitstreams such as described above.

More specifically, particular significant hardware and software requirements must be satisfied in order to process a bitstream using a data unit smaller than the title. These specific hardware requirements include significantly increasing the storage capacity of the recording medium and increasing the speed of digital processing; software requirements include inventing an advanced digital processing method including a sophisticated data structure.

Therefore, the object of the present invention is to provide an effective authoring system for controlling a multimedia data bitstream with advanced hardware and software requirements using a data unit smaller than the title to better address advanced user requirements.

To share data between plural titles and thereby efficiently utilize optical disk capacity, multi-scene control whereby scene data common to plural titles and the desired scenes on the same time-base from within multi-scene periods containing plural scenes unique to particular reproduction paths can be freely selected and reproduced is desirable.

However, when plural scenes unique to a reproduction path within the multi-scene period are arranged on the same time-base, the scene data must be contiguous. Unselected multi-scene data is therefore unavoidably inserted between the selected common scene data and the selected multi-scene data. The problem this creates when reproducing multi-scene data is that reproduction is interrupted by this unselected scene data.

In other words, except when a video object VOB, which is normally a single-stream title editing unit, is divided into discrete streams, seamless reproduction cannot be achieved by simply connecting and reproducing individual VOBs. This is because while the reproduction of video, audio, and sub-picture streams forming each VOB must be synchronized, the means for achieving this synchronization is enclosed in each VOB. As a result, the synchronization means will not function normally at VOB connections if the VOBs are simply connected together.

The object of the present invention is therefore to provide a reproduction apparatus enabling seamless reproduction whereby scene data can be reproduced without intermittence even from these multi-scene periods.

A further object of the present invention is to provide an optical disk medium from which data can be seamlessly reproduced without interruption of the audio or video even in such multi-scene periods, and a reproducing apparatus implementing the recording and reproducing method.

The present application is based upon Japanese Patent Application Nos. 7-276710 and 8-041583, which were filed on Sep. 29, 1995 and Feb. 28, 1996, respectively, the entire contents of which are expressly incorporated by reference herein.

SUMMARY OF THE INVENTION

The present invention has been developed with a view to substantially solving the above described disadvantages and has for its essential object to provide an improved method and apparatus for reproducing a bitstream having non-sequential system clock data seamlessly therebetween.

In order to achieve the aforementioned objective, a system stream contiguous reproduction apparatus to which are input one or more system streams interleaving at least moving picture data and audio data, and system stream connection information comprises a system clock STC generator for producing the system clock that is used as the system stream reproduction reference clock, one or more signal processing decoders that operate referenced to the system clock STC, decoder buffers for temporarily storing the system stream data transferred to the corresponding signal processing decoders, and STC selectors for selecting a system clock STC referenced by the signal processing decoders when decoding the first system stream, and another system clock STC referenced by the signal processing decoders when decoding a second system stream reproduced contiguously to the first system stream.
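
The clock selection summarized above, and recited in claims 1 and 2, can be sketched as a small selection function. This is only an illustrative sketch under stated assumptions: the names `Period`, `current_period`, and `select_clocks` are hypothetical, the clocks are modelled as plain integers, and the choice that the supplying means moves to the second clock before the decoder does is an assumption made for the example rather than something this passage states.

```python
from enum import Enum
from typing import Tuple


class Period(Enum):
    FIRST = 1   # last data unit of the first stream not yet supplied to the buffer
    SECOND = 2  # first stream fully supplied, but not yet fully reproduced
    AFTER = 3   # first stream fully reproduced (beyond what the claims spell out)


def current_period(last_unit_supplied: bool, last_unit_reproduced: bool) -> Period:
    """Classify the current moment relative to the two periods of claim 2."""
    if not last_unit_supplied:
        return Period.FIRST
    if not last_unit_reproduced:
        return Period.SECOND
    return Period.AFTER


def select_clocks(period: Period, first_clock: int, second_clock: int) -> Tuple[int, int]:
    """Return (clock for the supplying means, clock for the decoder).

    Assumption: the supplying means switches first, because it begins
    transferring the second stream while the decoder is still draining
    the first stream from its buffer.
    """
    if period is Period.FIRST:
        return first_clock, first_clock      # both sides share one reference clock
    if period is Period.SECOND:
        return second_clock, first_clock     # the two sides use different clocks
    return second_clock, second_clock        # both sides on the second clock
```

For example, `select_clocks(current_period(True, False), 100, 5)` returns `(5, 100)`: the supplying side already follows the second clock while the decoder still follows the first.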

BRIEF DESCRIPTION OF THE DRAWINGS

These and other objects and features of the present invention will become clear from the following description taken in conjunction with the preferred embodiments thereof with reference to the accompanying drawings throughout which like parts are designated by like reference numerals, and in which:

FIG. 1 is a graph schematically showing a structure of a multimedia bit stream according to the present invention,

FIG. 2 is a block diagram showing an authoring encoder according to the present invention,

FIG. 3 is a block diagram showing an authoring decoder according to the present invention,

FIG. 4 is a side view of an optical disk storing the multimedia bit stream of FIG. 1,

FIG. 5 is an enlarged view showing a portion confined by a circle of FIG. 4,

FIG. 6 is an enlarged view showing a portion confined by a circle of FIG. 5,

FIG. 7 is a side view showing a variation of the optical disk of FIG. 4,

FIG. 8 is a side view showing another variation of the optical disk of FIG. 4,

FIG. 9 is a plan view showing one example of track path formed on the recording surface of the optical disk of FIG. 4,

FIG. 10 is a plan view showing another example of track path formed on the recording surface of the optical disk of FIG. 4,

FIG. 11 is a diagonal view schematically showing one example of a track path pattern formed on the optical disk of FIG. 7,

FIG. 12 is a plan view showing another example of track path formed on the recording surface of the optical disk of FIG. 7,

FIG. 13 is a diagonal view schematically showing one example of a track path pattern formed on the optical disk of FIG. 8,

FIG. 14 is a plan view showing another example of track path formed on the recording surface of the optical disk of FIG. 8,

FIG. 15 is a graph in assistance of explaining a concept of parental control according to the present invention,

FIG. 16 is a graph schematically showing the structure of a multimedia bit stream for use in Digital Video Disk system according to the present invention,

FIG. 17 is a graph schematically showing the encoded video stream according to the present invention,

FIG. 18 is a graph schematically showing an internal structure of a video zone of FIG. 16,

FIG. 19 is a graph schematically showing the stream management information according to the present invention,

FIG. 20 is a graph schematically showing the structure of the navigation pack NV of FIG. 17,

FIG. 21 is a graph in assistance of explaining a concept of parental lock playback control according to the present invention,

FIG. 22 is a graph schematically showing the data structure used in a digital video disk system according to the present invention,

FIG. 23 is a graph in assistance of explaining a concept of Multi-angle scene control according to the present invention,

FIG. 24 is a graph in assistance of explaining a concept of multi scene data connection,

FIG. 25 is a block diagram showing a DVD encoder according to the present invention,

FIG. 26 is a block diagram showing a DVD decoder according to the present invention,

FIG. 27 is a graph schematically showing an encoding information table generated by the encoding system controller of FIG. 25,

FIG. 28 is a graph schematically showing an encoding information table,

FIG. 29 is a graph schematically showing an encoding parameters used by the video encoder of FIG. 25,

FIG. 30 is a graph schematically showing an example of the contents of the program chain information according to the present invention,

FIG. 31 is a graph schematically showing another example of the contents of the program chain information according to the present invention,

FIG. 32 is a block diagram showing a synchronizer of FIG. 26 according to the present invention,

FIG. 33 is a graph in assistance of explaining a concept of multi-angle scene control according to the present invention,

FIG. 34 is a flow chart, formed by FIGS. 34A and 34B, showing an operation of the DVD encoder of FIG. 25,

FIG. 35 is a flow chart showing details of the encode parameter production sub-routine of FIG. 34,

FIG. 36 is a flow chart showing the details of the VOB data setting routine of FIG. 35,

FIG. 37 is a flow chart showing the encode parameters generating operation for a seamless switching,

FIG. 38 is a flow chart showing the encode parameters generating operation for a system stream,

FIG. 39 is a block diagram showing the STC generator of FIG. 32,

FIG. 40 is a graph in assistance of explaining the relationship between the SCR, APTS, VDTS, and VPTS values,

FIG. 41 is a block diagram showing a modification of the synchronizer of FIG. 32,

FIG. 42 is a block diagram showing a synchronization controller of FIG. 41,

FIG. 43 is a flow chart showing an operation of the synchronization controller of FIG. 42,

FIG. 44 is a graph in assistance of explaining the relationship between the system clock reference SCR, the audio playback starting time information APTS, the decoder reference clock STC, and the video playback starting time VPTS,

FIG. 45 is a graph in assistance of explaining the relationship between the recording positions and values of SCR, APTS, and VPTS when VOB #1 and VOB #2 are seamlessly reproduced,

FIG. 46 is a graph in assistance of explaining the relationship between the SCR, APTS, and VPTS values and recording positions in each VOB,

FIG. 47 is a graph in assistance of explaining the relationship between the SCR, APTS, and VPTS values and recording positions in the VOB,

FIG. 48 is a graph showing a time line from input of the VOB in FIG. 47 to the system decoder to output of the last audio and video reproduction data,

FIG. 49 is a flow chart showing the operation of the DVD encoder of FIG. 26,

FIG. 50 is a flow chart showing details of the multi-angle non-seamless switching control routine of FIG. 49,

FIG. 51 is a flow chart showing details of the multi-angle seamless switching control routine of FIG. 49,

FIG. 52 is a flow chart showing details of the parental lock sub-routine of FIG. 49,

FIG. 53 is a flow chart showing details of the single scene subroutine of FIG. 49,

FIGS. 54 and 55 are graphs showing decoding information table produced by the decoding system controller of FIG. 26,

FIG. 56 is a flow chart showing the operation of the DVD decoder DCD of FIG. 26,

FIG. 57 is a flow chart showing details of reproduction extracted PGC routine of FIG. 56,

FIG. 58 is a flow chart showing details of decoding data process of FIG. 57, performed by the stream buffer,

FIG. 59 is a flow chart showing details of the decoder synchronization process of FIG. 58,

FIG. 60 is a flow chart showing an operation of the STC selection controller of FIG. 39 during a non-seamless reproduction operation,

FIG. 61 is a flow chart showing the operation of the STC selection controller of FIG. 39 during a seamless reproduction operation,

FIG. 62 is a flow chart showing the data transferring operation of FIG. 57,

FIG. 63 is a flow chart showing details of the non multi-angle decoding process of FIG. 62,

FIG. 64 is a flow chart showing details of the non-multi-angled interleave process of FIG. 63,

FIG. 65 is a flow chart showing details of the non-multi-angled contiguous block process of FIG. 63,

FIG. 66 is a flow chart showing a modification of FIG. 63,

FIG. 67 is a flow chart showing details of the seamless multi-angle decoding process of FIG. 62,

FIG. 68 is a flow chart showing details of non-seamless multi-angle decoding process of FIG. 62,

FIG. 69 is a block diagram showing details of the stream buffer of FIG. 26,

FIG. 70 is a flow chart showing the encode parameters generating operation for a system stream containing a single scene,

FIG. 71 is a graph schematically showing an actual arrangement of data blocks recorded to a data recording track on a recording medium according to the present invention,

FIG. 72 is a graph schematically showing contiguous block regions and interleaved block regions array,

FIG. 73 is a graph schematically showing a content of a VTS title VOBS according to the present invention, and

FIG. 74 is a graph schematically showing an internal data structure of the interleaved block regions according to the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Data Structure of the Authoring System

The logic structure of the multimedia data bitstream processed using the recording apparatus, recording medium, reproduction apparatus, and authoring system according to the present invention is described first below with reference to FIG. 1.

In this structure, one title refers to the combination of video and audio data expressing program content recognized by a user for education, entertainment, or other purpose. Referenced to a motion picture (movie), one title may correspond to the content of an entire movie, or to just one scene within said movie.

A video title set (VTS) comprises the bitstream data containing the information for a specific number of titles. More specifically, each VTS comprises the video, audio, and other reproduction data representing the content of each title in the set, and control data for controlling the content data.

The video zone VZ is the video data unit processed by the authoring system, and comprises a specific number of video title sets. More specifically, each video zone is a linear sequence of K+1 video title sets numbered VTS #0-VTS #K where K is an integer value of zero or greater. One video title set, preferably the first video title set VTS #0, is used as the video manager describing the content information of the titles contained in each video title set.

The multimedia bitstream MBS is the largest control unit of the multimedia data bitstream handled by the authoring system of the present invention, and comprises plural video zones VZ.
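
The containment hierarchy described above (multimedia bitstream MBS, video zones VZ, video title sets VTS #0 to VTS #K, titles) can be sketched as nested data types. This is a minimal illustration only: the class names, the `titles` field, and the helper `make_zone` are hypothetical and do not describe the recorded data format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VideoTitleSet:
    index: int                                   # the k in "VTS #k"
    titles: List[str] = field(default_factory=list)


@dataclass
class VideoZone:
    title_sets: List[VideoTitleSet]              # VTS #0 .. VTS #K

    def __post_init__(self) -> None:
        # K is zero or greater, so a zone always holds at least one title set.
        if not self.title_sets:
            raise ValueError("a video zone holds K+1 title sets with K >= 0")

    @property
    def video_manager(self) -> VideoTitleSet:
        # The text prefers the first title set, VTS #0, as the video manager.
        return self.title_sets[0]


@dataclass
class MultimediaBitstream:
    zones: List[VideoZone]                       # plural video zones VZ


def make_zone(k: int) -> VideoZone:
    """Build a zone with K+1 empty title sets numbered 0..K."""
    return VideoZone([VideoTitleSet(i) for i in range(k + 1)])
```

Calling `make_zone(2)` yields a zone with three title sets whose `video_manager` is the one with index 0.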

Authoring Encoder EC

A preferred embodiment of the authoring encoder EC according to the present invention for generating a new multimedia bitstream MBS by re-encoding the original multimedia bitstream MBS according to the scenario desired by the user is shown in FIG. 2. Note that the original multimedia bitstream MBS comprises a video stream St1 containing the video information, a sub-picture stream St3 containing caption text and other auxiliary video information, and the audio stream St5 containing the audio information.

The video and audio streams are the bitstreams containing the video and audio information obtained from the source within a particular period of time. The sub-picture stream is a bitstream containing momentary video information relevant to a particular scene. The sub-picture data encoded to a single scene may be captured to video memory and displayed continuously from the video memory for plural scenes as may be necessary.

When this multimedia source data St1, St3, and St5 is obtained from a live broadcast, the video and audio signals are supplied in real-time from a video camera or other imaging source; when the multimedia source data is reproduced from a video tape or other recording medium, the audio and video signals are not real-time signals.

While the multimedia source stream is shown in FIG. 2 as comprising these three source signals, this is for convenience only, and it should be noted that the multimedia source stream may contain more than three types of source signals, and may contain source data for different titles. Multimedia source data with audio, video, and sub-picture data for plural titles are referred to below as multi-title streams.

As shown in FIG. 2, the authoring encoder EC comprises a scenario editor 100, encoding system controller 200, video encoder 300, video stream buffer 400, sub-picture encoder 500, sub-picture stream buffer 600, audio encoder 700, audio stream buffer 800, system encoder 900, video zone formatter 1300, recorder 1200, and recording medium M.

The video zone formatter 1300 comprises video object (VOB) buffer 1000, formatter 1100, and volume and file structure formatter 1400.

The bitstream encoded by the authoring encoder EC of the present embodiment is recorded, by way of example only, to an optical disk.

The scenario editor 100 of the authoring encoder EC outputs the scenario data, i.e., the user-defined editing instructions. The scenario data controls editing the corresponding parts of the multimedia bitstream MBS according to the user's manipulation of the video, sub-picture, and audio components of the original multimedia title. This scenario editor 100 preferably comprises a display, speaker(s), keyboard, CPU, and source stream buffer. The scenario editor 100 is connected to an external multimedia bitstream source from which the multimedia source data St1, St3, and St5 are supplied.

The user is thus able to reproduce the video and audio components of the multimedia source data using the display and speaker to confirm the content of the generated title. The user is then able to edit the title content according to the desired scenario using the keyboard, mouse, and other command input devices while confirming the content of the title on the display and speakers. The result of this multimedia data manipulation is the scenario data St7.

The scenario data St7 is basically a set of instructions describing what source data is selected from all or a subset of the source data containing plural titles within a defined time period, and how the selected source data is reassembled to reproduce the scenario (sequence) intended by the user. Based on the instructions received through the keyboard or other control device, the CPU codes the position, length, and the relative time-based positions of the edited parts of the respective multimedia source data streams St1, St3, and St5 to generate the scenario data St7.
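
The kind of information carried by the scenario data St7 (position, length, and relative time-base position of each edited part) can be pictured with a small record type. This is a hedged sketch only: `EditSegment`, `build_scenario`, the field names, and the use of seconds as the unit are all assumptions made for illustration, not the encoding actually used for St7.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EditSegment:
    stream: str      # "St1" (video), "St3" (sub-picture) or "St5" (audio)
    position: float  # start of the edited part within the source (assumed seconds)
    length: float    # duration of the edited part (assumed seconds)
    offset: float    # relative time-base position within the edited title


def build_scenario(segments: List[EditSegment]) -> List[EditSegment]:
    """Validate the segments and order them by their place in the edited title."""
    for seg in segments:
        if seg.length <= 0:
            raise ValueError("segment length must be positive")
    return sorted(segments, key=lambda s: (s.offset, s.stream))
```

A downstream controller could then walk the ordered list to decide when encoding of each segment should start and end.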

The source stream buffer has a specific capacity, and is used to delay the multimedia source data streams St1, St3, and St5 a known time Td and then output streams St1, St3, and St5.

This delay is required for synchronization with the editor encoding process. More specifically, when data encoding and user generation of scenario data St7 are executed simultaneously, i.e., when encoding immediately follows editing, time Td is required to determine the content of the multimedia source data editing process based on the scenario data St7 as will be described further below. As a result, the multimedia source data must be delayed by time Td to synchronize the editing process during the actual encoding operation. Because this delay time Td is limited to the time required to synchronize the operation of the various system components in the case of sequential editing as described above, the source stream buffer is normally achieved by means of a high speed storage medium such as semiconductor memory.

During batch editing in which all multimedia source data is encoded at once ("batch encoded") after scenario data St7 is generated for the complete title, delay time Td must be long enough to process the complete title or longer. In this case, the source stream buffer may be a low speed, high capacity storage medium such as video tape, magnetic disk, or optical disk.

The structure (type) of media used for the source stream buffer may therefore be determined according to the delay time Td required and the allowable manufacturing cost.
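The delaying role of the source stream buffer can be illustrated with a minimal sketch. The class name, the integer tick clock, and the unit labels below are assumptions made for illustration only; the specification does not define an implementation.

```python
from collections import deque


class SourceStreamBuffer:
    """Hypothetical FIFO delay line: a unit pushed at time t is released at t + Td."""

    def __init__(self, delay_td):
        self.delay_td = delay_td
        self._queue = deque()  # entries are (release_time, unit), in push order

    def push(self, now, unit):
        # Record when this unit becomes available to the encoders.
        self._queue.append((now + self.delay_td, unit))

    def pop_ready(self, now):
        # Release every unit whose delay Td has elapsed, oldest first.
        ready = []
        while self._queue and self._queue[0][0] <= now:
            ready.append(self._queue.popleft()[1])
        return ready


buf = SourceStreamBuffer(delay_td=3)
for t, unit in enumerate(["St1-a", "St1-b", "St1-c"]):
    buf.push(t, unit)

released_at_2 = buf.pop_ready(2)  # nothing has waited Td yet
released_at_4 = buf.pop_ready(4)  # the units pushed at t=0 and t=1
```

A short Td keeps the queue small enough for semiconductor memory, while a Td covering a complete title implies a queue large enough to need tape or disk, which matches the trade-off described above.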

The encoding system controller 200 is connected to the scenario editor 100 and receives the scenario data St7 therefrom. Based on the time-base position and length information of the edit segment contained in the scenario data St7, the encoding system controller 200 generates the encoding parameter signals St9, St11, and St13 for encoding the edit segment of the multimedia source data. The encoding signals St9, St11, and St13 supply the parameters used for video, sub-picture, and audio encoding, including the encoding start and end timing. Note that multimedia source data St1, St3, and St5 are output after delay time Td by the source stream buffer, and are therefore synchronized to encoding parameter signals St9, St11, and St13.

More specifically, encoding parameter signal St9 is the video encoding signal specifying the encoding timing of video stream St1 to extract the encoding segment from the video stream St1 and generate the video encoding unit. Encoding parameter signal St11 is likewise the sub-picture stream encoding signal used to generate the sub-picture encoding unit by specifying the encoding timing for sub-picture stream St3. Encoding parameter signal St13 is the audio encoding signal used to generate the audio encoding unit by specifying the encoding timing for audio stream St5.

Based on the time-base relationship between the encoding segments of streams St1, St3, and St5 in the multimedia source data contained in scenario data St7, the encoding system controller 200 generates the timing signals St21, St23, and St25 for arranging the encoded multimedia stream in the specified time-base relationship.

The encoding system controller 200 also generates the reproduction time information IT defining the reproduction time of the title editing unit (video object, VOB), and the stream encoding data St33 defining the system encode parameters for multiplexing the encoded multimedia stream containing video, audio, and sub-picture data. Note that the reproduction time information IT and stream encoding data St33 are generated for the video object VOB of each title in one video zone VZ.

The encoding system controller 200 also generates the title sequence control signal St39, which declares the formatting parameters for formatting the title editing units VOB of each of the streams in a particular time-base relationship as a multimedia bitstream. More specifically, the title sequence control signal St39 is used to control the connections between the title editing units (VOB) of each title in the multimedia bitstream MBS, or to control the sequence of the interleaved title editing unit (VOBs) interleaving the title editing units VOB of plural reproduction paths.

The video encoder 300 is connected to the source stream buffer of the scenario editor 100 and to the encoding system controller 200, and receives therefrom the video stream St1 and video encoding parameter signal St9, respectively. Encoding parameters supplied by the video encoding signal St9 include the encoding start and end timing, bit rate, the encoding conditions for the encoding start and end, and the material type. Possible material types include NTSC or PAL video signal, and telecine converted material. Based on the video encoding parameter signal St9, the video encoder 300 encodes a specific part of the video stream St1 to generate the encoded video stream St15.

The sub-picture encoder 500 is similarly connected to the source stream buffer of the scenario editor 100 and to the encoding system controller 200, and receives therefrom the sub-picture stream St3 and sub-picture encoding parameter signal St11, respectively. Based on the sub-picture encoding parameter signal St11, the sub-picture encoder 500 encodes a specific part of the sub-picture stream St3 to generate the encoded sub-picture stream St17.

The audio encoder 700 is also connected to the source stream buffer of the scenario editor 100 and to the encoding system controller 200, and receives therefrom the audio stream St5 and audio encoding parameter signal St13, which supplies the encoding start and end timing. Based on the audio encoding parameter signal St13, the audio encoder 700 encodes a specific part of the audio stream St5 to generate the encoded audio stream St19.

The video stream buffer 400 is connected to the video encoder 300 and to the encoding system controller 200. The video stream buffer 400 stores the encoded video stream St15 input from the video encoder 300, and outputs the stored encoded video stream St15 as the time-delayed encoded video stream St27 based on the timing signal St21 supplied from the encoding system controller 200.

The sub-picture stream buffer 600 is similarly connected to the sub-picture encoder 500 and to the encoding system controller 200. The sub-picture stream buffer 600 stores the encoded sub-picture stream St17 output from the sub-picture encoder 500, and then outputs the stored encoded sub-picture stream St17 as time-delayed encoded sub-picture stream St29 based on the timing signal St23 supplied from the encoding system controller 200.

The audio stream buffer 800 is similarly connected to the audio encoder 700 and to the encoding system controller 200. The audio stream buffer 800 stores the encoded audio stream St19 input from the audio encoder 700, and then outputs the encoded audio stream St19 as the time-delayed encoded audio stream St31 based on the timing signal St25 supplied from the encoding system controller 200.

The system encoder 900 is connected to the video stream buffer 400, sub-picture stream buffer 600, audio stream buffer 800, and the encoding system controller 200, and is respectively supplied thereby with the time-delayed encoded video stream St27, time-delayed encoded sub-picture stream St29, time-delayed encoded audio stream St31, and the stream encoding data St33. Note that the system encoder 900 is a multiplexer that multiplexes the time-delayed streams St27, St29, and St31 based on the stream encoding data St33 (timing signal) to generate title editing unit (VOB) St35. The stream encoding data St33 contains the system encoding parameters, including the encoding start and end timing.
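The multiplexing performed by the system encoder 900 can be sketched as a merge of three time-ordered packet lists. This is an illustration under stated assumptions, not the method of the specification: the function name, the (timestamp, payload) packet shape, and the rule of ordering purely by timestamp are all hypothetical, whereas the actual multiplexing is governed by the stream encoding data St33.

```python
import heapq


def multiplex(video, sub_picture, audio):
    """Merge three timestamp-sorted packet lists into one interleaved list.

    Each input is a list of (timestamp, payload) tuples sorted by timestamp.
    Each output item is (timestamp, kind, payload).
    """
    tagged = [
        [(ts, "video", payload) for ts, payload in video],
        [(ts, "sub_picture", payload) for ts, payload in sub_picture],
        [(ts, "audio", payload) for ts, payload in audio],
    ]
    # Merge on the timestamp only, so payloads are never compared.
    return list(heapq.merge(*tagged, key=lambda packet: packet[0]))


st27 = [(0, "v0"), (40, "v1")]   # time-delayed encoded video
st29 = [(0, "s0")]               # time-delayed encoded sub-picture
st31 = [(0, "a0"), (24, "a1")]   # time-delayed encoded audio

vob = multiplex(st27, st29, st31)
timestamps = [packet[0] for packet in vob]
```

Every input packet appears exactly once in the result, and the timestamps in the merged list never decrease.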

The video zone formatter 1300 is connected to the system encoder 900 and the encoding system controller 200 from which the title editing unit (VOB) St35 and title sequence control signal St39 (timing signal) are respectively supplied. The title sequence control signal St39 contains the formatting start and end timing, and the formatting parameters used to generate (format) a multimedia bitstream MBS. The video zone formatter 1300 rearranges the title editing units (VOB) St35 in one video zone VZ in the scenario sequence defined by the user based on the title sequence control signal St39 to generate the edited multimedia stream data St43.

The multimedia bitstream MBS St43 edited according to the user-defined scenario is then sent to the recorder 1200. The recorder 1200 processes the edited multimedia stream data St43 to the data stream St45 format of the recording medium M, and thus records the formatted data stream St45 to the recording medium M. Note that the multimedia bitstream MBS recorded to the recording medium M contains the volume file structure VFS, which includes the physical address of the data on the recording medium generated by the video zone formatter 1300.

Note that the encoded multimedia bitstream MBS St35 may be output directly to the decoder to immediately reproduce the edited title content. It will be obvious that the output multimedia bitstream MBS will not in this case contain the volume file structure VFS.

Authoring Decoder

A preferred embodiment of the authoring decoder DC used to decode the multimedia bitstream MBS edited by the authoring encoder EC of the present invention, and thereby reproduce the content of each title unit according to the user-defined scenario, is described next below with reference to FIG. 3. Note that in the preferred embodiment described below the multimedia bitstream St45 encoded by the authoring encoder EC is recorded to the recording medium M.

As shown in FIG. 3, the authoring decoder DC comprises a multimedia bitstream producer 2000, scenario selector 2100, decoding system controller 2300, stream buffer 2400, system decoder 2500, video buffer 2600, sub-picture buffer 2700, audio buffer 2800, synchronizer 2900, video decoder 3800, sub-picture decoder 3100, audio decoder 3200, synthesizer 3500, video data output terminal 3600, and audio data output terminal 3700.

The bitstream producer 2000 comprises a recording media drive unit 2004 for driving the recording medium M; a reading head 2006 for reading the information recorded to the recording medium M and producing the binary read signal St57; a signal processor 2008 for variously processing the read signal St57 to generate the reproduced bitstream St61; and a reproduction controller 2002.

The reproduction controller 2002 is connected to the decoding system controller 2300 from which the multimedia bitstream reproduction control signal St53 is supplied, and in turn generates the reproduction control signals St55 and St59 respectively controlling the recording media drive unit (motor) 2004 and signal processor 2008.

So that the user-defined video, sub-picture, and audio portions of the multimedia title edited by the authoring encoder EC are reproduced, the authoring decoder DC comprises a scenario selector 2100 for selecting and reproducing the corresponding scenes (titles). The scenario selector 2100 then outputs the selected titles as scenario data to the authoring decoder DC.

The scenario selector 2100 preferably comprises a keyboard, CPU, and monitor. Using the keyboard, the user then inputs the desired scenario based on the content of the scenario input by the authoring encoder EC. Based on the keyboard input, the CPU generates the scenario selection data St51 specifying the selected scenario. The scenario selector 2100 is connected by an infrared communications device, for example, to the decoding system controller 2300, to which it inputs the scenario selection data St51.

Based on the scenario selection data St51, the decoding system controller 2300 then generates the bitstream reproduction control signal St53 controlling the operation of the bitstream producer 2000.

The stream buffer 2400 has a specific buffer capacity used to temporarily store the reproduced bitstream St61 input from the bitstream producer 2000, extract the address information and initial synchronization data SCR (system clock reference) for each stream, and generate bitstream control data St63. The stream buffer 2400 is also connected to the decoding system controller 2300, to which it supplies the generated bitstream control data St63.

The synchronizer 2900 is connected to the decoding system controller 2300 from which it receives the system clock reference SCR contained in the synchronization control data St81 to set the internal system clock STC and supply the reset system clock St79 to the decoding system controller 2300.

Based on this system clock St79, the decoding system controller 2300 also generates the stream read signal St65 at a specific interval and outputs the read signal St65 to the stream buffer 2400.

Based on the supplied read signal St65, the stream buffer 2400 outputs the reproduced bitstream St61 at a specific interval to the system decoder 2500 as bitstream St67.

Based on the scenario selection data St51, the decoding system controller 2300 generates the decoding signal St69 defining the stream IDs for the video, sub-picture, and audio bitstreams corresponding to the selected scenario, and outputs it to the system decoder 2500.

Based on the instructions contained in the decoding signal St69, the system decoder 2500 respectively outputs the video, sub-picture, and audio bitstreams input from the stream buffer 2400 to the video buffer 2600, sub-picture buffer 2700, and audio buffer 2800 as the encoded video stream St71, encoded sub-picture stream St73, and encoded audio stream St75.
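The routing performed by the system decoder 2500 can be sketched as a lookup on stream ID. The function name, the packet shape, and the ID strings ("V0", "A1", and so on) are hypothetical; they merely stand in for the stream IDs carried by the decoding signal St69.

```python
def demultiplex(packets, selected_ids):
    """Route packets of the selected streams to per-type buffers.

    packets: iterable of (stream_id, payload) in bitstream order.
    selected_ids: dict mapping a selected stream ID to "video",
    "sub_picture", or "audio". Packets with other IDs are dropped.
    """
    buffers = {"video": [], "sub_picture": [], "audio": []}
    for stream_id, payload in packets:
        kind = selected_ids.get(stream_id)
        if kind is not None:
            buffers[kind].append(payload)
    return buffers


# Hypothetical bitstream St67 carrying two audio streams; the scenario picks A1.
st67 = [("V0", "v0"), ("A0", "a0-en"), ("A1", "a0-fr"), ("S0", "s0"), ("V0", "v1")]
st69 = {"V0": "video", "A1": "audio", "S0": "sub_picture"}

buffers = demultiplex(st67, st69)
```

Packets of streams that the selected scenario does not use (here "A0") never reach a decoder buffer.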

The system decoder 2500 detects the presentation time stamp PTS and decoding time stamp DTS of the smallest control unit in each bitstream St67 to generate the time information signal St77. This time information signal St77 is supplied to the synchronizer 2900 through the decoding system controller 2300 as the synchronization control data St81.

Based on this synchronization control data St81, the synchronizer 2900 determines the decoding start timing whereby each of the bitstreams will be arranged in the correct sequence after decoding, and then generates and inputs the video stream decoding start signal St89 to the video decoder 3800 based on this decoding timing. The synchronizer 2900 also generates and supplies the sub-picture decoding start signal St91 and audio stream decoding start signal St93 to the sub-picture decoder 3100 and audio decoder 3200, respectively.
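One way to picture the synchronizer 2900 is as a clock that is preset from the SCR and then compared against the decoding time stamp of each stream. The sketch below is an assumption-laden illustration: the class and method names, the integer tick units, and the rule "start decoding once STC has reached DTS" are not taken from the specification.

```python
class Synchronizer:
    """Toy model: holds the system clock STC and reports which streams may start decoding."""

    def __init__(self):
        self.stc = None  # unset until a system clock reference (SCR) arrives

    def set_from_scr(self, scr):
        # Initialise the internal system clock from the SCR.
        self.stc = scr

    def advance(self, ticks):
        self.stc += ticks

    def decoding_start_signals(self, pending):
        """pending maps stream name -> DTS of its next unit.

        Returns the names of streams whose DTS has been reached, earliest DTS first.
        """
        due = [(dts, name) for name, dts in pending.items() if dts <= self.stc]
        return [name for dts, name in sorted(due)]


sync = Synchronizer()
sync.set_from_scr(1000)
pending = {"video": 1003, "sub_picture": 1010, "audio": 1001}
sync.advance(5)  # STC is now 1005
signals = sync.decoding_start_signals(pending)
```

In this example the audio and video start signals are due, in that order, while the sub-picture stream waits until the clock reaches 1010.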

The video decoder 3800 generates the video output request signal St84 based on the video stream decoding start signal St89, and outputs to the video buffer 2600. In response to the video output request signal St84, the video buffer 2600 outputs the video stream St83 to the video decoder 3800. The video decoder 3800 thus detects the presentation time information contained in the video stream St83, and disables the video output request signal St84 when the length of the received video stream St83 is equivalent to the specified presentation time. A video stream equal in length to the specified presentation time is thus decoded by the video decoder 3800, which outputs the reproduced video signal St104 to the synthesizer 3500.

The sub-picture decoder 3100 similarly generates the sub-picture output request signal St86 based on the sub-picture decoding start signal St91, and outputs to the sub-picture buffer 2700. In response to the sub-picture output request signal St86, the sub-picture buffer 2700 outputs the sub-picture stream St85 to the sub-picture decoder 3100. Based on the presentation time information contained in the sub-picture stream St85, the sub-picture decoder 3100 decodes a length of the sub-picture stream St85 corresponding to the specified presentation time to reproduce and supply to the synthesizer 3500 the sub-picture signal St99.

The synthesizer 3500 superimposes the video signal St104 and sub-picture signal St99 to generate and output the multi-picture video signal St105 to the video data output terminal 3600.

The audio decoder 3200 generates and supplies to the audio buffer 2800 the audio output request signal St88 based on the audio stream decoding start signal St93. The audio buffer 2800 thus outputs the audio stream St87 to the audio decoder 3200. The audio decoder 3200 decodes a length of the audio stream St87 corresponding to the specified presentation time based on the presentation time information contained in the audio stream St87, and outputs the decoded audio stream St101 to the audio data output terminal 3700.

It is thus possible to reproduce a user-defined multimedia bitstream MBS in real-time according to a user-defined scenario. More specifically, each time the user selects a different scenario, the authoring decoder DC is able to reproduce the title content desired by the user in the desired sequence by reproducing the multimedia bitstream MBS corresponding to the selected scenario.

It is therefore possible by means of the authoring system of the present invention to generate a multimedia bitstream according to plural user-defined scenarios by real-time or batch encoding multimedia source data in a manner whereby the substreams of the smallest editing units (scenes), which can be divided into plural substreams, expressing the basic title content are arranged in a specific time-base relationship.

The multimedia bitstream thus encoded can then be reproduced according to the one scenario selected from among plural possible scenarios. It is also possible to change scenarios while playback is in progress, i.e., to select a different scenario and dynamically generate a new multimedia bitstream according to the most recently selected scenario. It is also possible to dynamically select and reproduce any of plural scenes while reproducing the title content according to a desired scenario.

It is therefore possible by means of the authoring system of the present invention to encode and not only reproduce but to repeatedly reproduce a multimedia bitstream MBS in real-time.

A detail of the authoring system is disclosed in Japanese Patent Application filed Sep. 27, 1996, and entitled and assigned to the same assignee as the present application.

Digital Video Disk (DVD)

An example of a digital video disk (DVD) with only one recording surface (a single-sided DVD) is shown in FIG. 4.

The DVD recording medium RC1 in the preferred embodiment of the invention comprises a data recording surface RS1 to and from which data is written and read by emitting laser beam LS, and a protective layer PL1 covering the data recording surface RS1. A backing layer BL1 is also provided on the back of data recording surface RS1. The side of the disk on which protective layer PL1 is provided is therefore referred to below as side SA (commonly "side A"), and the opposite side (on which the backing layer BL1 is provided) is referred to as side SB ("side B"). Note that digital video disk recording media having a single data recording surface RS1 on only one side such as this DVD recording medium RC1 is commonly called a single-sided single layer disk.

A detailed illustration of area C1 in FIG. 4 is shown in FIG. 5. Note that the data recording surface RS1 is formed by applying a metallic thin film or other reflective coating as a data layer 4109 on a first transparent layer 4108 having a particular thickness T1. This first transparent layer 4108 also functions as the protective layer PL1. A second transparent substrate 4111 of a thickness T2 functions as the backing layer BL1, and is bonded to the first transparent layer 4108 by means of an adhesive layer 4110 disposed therebetween.

A printing layer 4112 for printing a disk label may also be disposed on the second transparent substrate 4111 as necessary. The printing layer 4112 does not usually cover the entire surface area of the second transparent substrate 4111 (backing layer BL1), but only the area needed to print the text and graphics of the disk label. The area of second transparent substrate 4111 to which the printing layer 4112 is not formed may be left exposed. Light reflected from the data layer 4109 (metallic thin film) forming the data recording surface RS1 can therefore be directly observed where the label is not printed when the digital video disk is viewed from side SB. As a result, the background looks like a silver-white over which the printed text and graphics float when the metallic thin film is an aluminum thin film, for example.

Note that it is only necessary to provide the printing layer 4112 where needed for printing, and it is not necessary to provide the printing layer 4112 over the entire surface of the backing layer BL1.

A detailed illustration of area C2 in FIG. 5 is shown in FIG. 6. Pits and lands are molded to the common contact surface between the first transparent layer 4108 and the data layer 4109 on side SA from which data is read by emitting a laser beam LS, and data is recorded by varying the lengths of the pits and lands (i.e., the length of the intervals between the pits). More specifically, the pit and land configuration formed on the first transparent layer 4108 is transferred to the data layer 4109. The lengths of the pits and lands are shorter, and the pitch of the data tracks formed by the pit sequences is narrower, than with a conventional Compact Disc (CD). The surface recording density is therefore greatly improved.

Side SA of the first transparent layer 4108 on which data pits are not formed is a flat surface. The second transparent substrate 4111 is for reinforcement, and is a transparent panel made from the same material as the first transparent layer 4108 with both sides flat. Thicknesses T1 and T2 are preferably equal and commonly approximately 0.6 mm, but the invention shall not be so limited.

As with a CD, information is read by irradiating the surface with a laser beam LS and detecting the change in the reflectivity of the light spot. Because the objective lens aperture NA can be large and the wavelength λ of the light beam small in a digital video disk system, the diameter of the light spot LS used can be reduced to approximately 1/1.6 the light spot needed to read a CD. Note that this means the resolution of the laser beam LS in the DVD system is approximately 1.6 times the resolution of a conventional CD system.

The optical system used to read data from the digital video disk uses a short 650 nm wavelength red semiconductor laser and an objective lens with a numerical aperture NA of 0.6. By thus also reducing the thickness T of the transparent panels to 0.6 mm, more than 5 GB of data can be stored to one side of a 120 mm diameter optical disk.
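The 1/1.6 spot-size figure can be checked with a short calculation, since the diameter of a diffraction-limited spot scales with wavelength divided by numerical aperture. The DVD values (650 nm, NA taken as 0.6) come from the text; the CD values (780 nm, NA 0.45) are typical figures assumed here and do not appear in the specification. The function name is illustrative.

```python
def spot_diameter_scale(wavelength_nm, numerical_aperture):
    """Quantity proportional to the diffraction-limited spot diameter (lambda / NA)."""
    return wavelength_nm / numerical_aperture


dvd = spot_diameter_scale(650, 0.6)    # values given in the text
cd = spot_diameter_scale(780, 0.45)    # assumed typical CD values

ratio = dvd / cd            # DVD spot diameter relative to the CD spot
resolution_gain = cd / dvd  # how much finer the DVD spot is
```

Under these assumptions the ratio comes out at about 0.625, i.e. 1/1.6, in agreement with the figure quoted above.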

It is therefore possible to store motion picture (video) images having an extremely large per unit data size to a digital video disk system disk without losing image quality because the storage capacity of a single-sided, single-layer recording medium RC1 with one data recording surface RS1 as thus described is nearly ten times the storage capacity of a conventional CD. As a result, while the video presentation time of a conventional CD system is approximately 74 minutes if image quality is sacrificed, high quality video images with a video presentation time exceeding two hours can be recorded to a DVD.

The digital video disk is therefore well-suited as a recording medium for video images.

A digital video disk recording medium with plural recording surfaces RS as described above is shown in FIGS. 7 and 8. The DVD recording medium RC2 shown in FIG. 7 comprises two recording surfaces, i.e., first recording surface RS1 and semi-transparent second recording surface RS2, on the same side, i.e. side SA, of the disk. Data can be simultaneously recorded or reproduced from these two recording surfaces by using different laser beams LS1 and LS2 for the first recording surface RS1 and the second recording surface RS2. It is also possible to read/write both recording surfaces RS1 and RS2 using only one of the laser beams LS1 or LS2. Note that recording media thus comprised are called "single-sided, dual-layer disks."

It should also be noted that while two recording surfaces RS1 and RS2 are provided in this example, it is also possible to produce digital video disk recording media having more than two recording surfaces RS. Disks thus comprised are known as "single-sided, multi-layer disks."

Though comprising two recording surfaces similarly to the recording media shown in FIG. 7, the DVD recording medium RC3 shown in FIG. 8 has the recording surfaces on opposite sides of the disk, i.e., has the first data recording surface RS1 on side SA and the second data recording surface RS2 on side SB. It will also be obvious that while only two recording surfaces are shown on one digital video disk in this example, more than two recording surfaces may also be formed on a double-sided digital video disk. As with the recording medium shown in FIG. 7, it is also possible to provide two separate laser beams LS1 and LS2 for recording surfaces RS1 and RS2, or to read/write both recording surfaces RS1 and RS2 using a single laser beam. Note that this type of digital video disk is called a "double-sided, dual-layer disk." It will also be obvious that a double-sided digital video disk can be comprised with two or more recording surfaces per side. This type of disk is called a "double-sided, multi-layer disk."

A plan view from the laser beam LS irradiation side of the recording surface RS of the DVD recording medium RC is shown in FIG. 9 and FIG. 10. Note that a continuous spiral data recording track TR is provided from the inside circumference to the outside circumference of the DVD. The data recording track TR is divided into plural sectors each having the same known storage capacity. Note that for simplicity only the data recording track TR is shown in FIG. 9 with more than three sectors per revolution.

As shown in FIG. 9, the data recording track TR is normally formed clockwise inside to outside (see arrow DrA) from the inside end point IA at the inside circumference of disk RCA to the outside end point OA at the outside circumference of the disk with the disk RCA rotating counterclockwise RdA. This type of disk RCA is called a clockwise disk, and the recording track formed thereon is called a clockwise track TRA.

Depending upon the application, the recording track TRB may be formed clockwise from outside to inside circumference (see arrow DrB in FIG. 10) from the outside end point OB at the outside circumference of disk RCB to the inside end point IB at the inside circumference of the disk with the disk RCB rotating clockwise RdB. Because the recording track appears to wind counterclockwise when viewed from the inside circumference to the outside circumference on disks with the recording track formed in the direction of arrow DrB, these disks are referred to as counterclockwise disk RCB with counterclockwise track TRB to distinguish them from disk RCA in FIG. 9. Note that track directions DrA and DrB are the track paths along which the laser beam travels when scanning the tracks for recording and playback. Direction of disk rotation RdA in which disk RCA turns is thus opposite the direction of track path DrA, and direction of disk rotation RdB in which disk RCB turns is thus opposite the direction of track path DrB.

An exploded view of the single-sided, dual-layer disk RC2 shown in FIG. 7 is shown as disk RC2o in FIG. 11. Note that the recording tracks formed on the two recording surfaces run in opposite directions. Specifically, a clockwise recording track TRA as shown in FIG. 9 is formed in clockwise direction DrA on the (lower) first data recording surface RS1, and a counterclockwise recording track TRB formed in counterclockwise direction DrB as shown in FIG. 10 is provided on the (upper) second data recording surface RS2. As a result, the outside end points OA and OB of the first and second (top and bottom) tracks are at the same radial position relative to the center axis of the disk RC2o. Note that track paths DrA and DrB of tracks TR are also the data read/write directions to disk RC. The first and second (top and bottom) recording tracks thus wind opposite each other with this disk RC, i.e., the track paths DrA and DrB of the top and bottom recording layers are opposite track paths.

Opposite track path type, single-sided, dual-layer disks RC2o rotate in direction RdA corresponding to the first recording surface RS1 with the laser beam LS travelling along track path DrA to trace the recording track on the first recording surface RS1. When the laser beam LS reaches the outside end point OA, the laser beam LS can be refocused to end point OB on the second recording surface RS2 to continue tracing the recording track from the first to the second recording surface uninterrupted. The physical distance between the recording tracks TRA and TRB on the first and second recording surfaces RS1 and RS2 can thus be instantaneously eliminated by simply adjusting the focus of the laser beam LS.
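The continuity of the opposite track path layout can be sketched by mapping a logical sector number to a recording surface and a position along the track. The function name, the zero-based numbering, and the assumption that both surfaces hold the same number of sectors are hypothetical simplifications; positions are counted in sectors from the inside circumference.

```python
def locate_opposite_track_path(logical_sector, sectors_per_layer):
    """Map a logical sector to (layer, position) on an opposite track path disk.

    Layer 0 (RS1) is read from the inside end point to the outside end point;
    layer 1 (RS2) is read from the outside end point back to the inside.
    """
    if not 0 <= logical_sector < 2 * sectors_per_layer:
        raise ValueError("logical sector out of range")
    if logical_sector < sectors_per_layer:
        return 0, logical_sector
    offset = logical_sector - sectors_per_layer
    return 1, sectors_per_layer - 1 - offset


last_on_rs1 = locate_opposite_track_path(99, 100)
first_on_rs2 = locate_opposite_track_path(100, 100)
end_of_disk = locate_opposite_track_path(199, 100)
```

The last sector on RS1 and the first sector on RS2 sit at the same position, so moving between them requires only a change of focus and no radial seek.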

It is therefore possible with an opposite track path type, single-sided, dual-layer disk RC2o to easily process the recording tracks disposed to physically discrete top and bottom recording surfaces as a single continuous recording track. It is therefore also possible in an authoring system as described above with reference to FIG. 1 to continuously record the multimedia bitstream MBS that is the largest multimedia data management unit to two discrete recording surfaces RS1 and RS2 on a single recording medium RC2o.

It should be noted that the tracks on recording surfaces RS1 and RS2 can be wound in the directions opposite those described above, i.e., the counterclockwise track TRB may be provided on the first recording surface RS1 and the clockwise track TRA on the second recording surface RS2. In this case the direction of disk rotation is also changed to a clockwise rotation RdB, thereby enabling the two recording surfaces to be used as comprising a single continuous recording track as described above. For simplification, a further example of this type of disk is therefore neither shown nor described below.

It is therefore possible by thus constructing the digital video disk to record the multimedia bitstream MBS for a feature-length title to a single opposite track path type, single-sided, dual-layer disk RC2o. Note that this type of digital video disk medium is called a single-sided dual-layer disk with opposite track paths.

Another example of the single-sided, dual-layer DVD recording medium RC2 shown in FIG. 7 is shown as disk RC2p in FIG. 12. The recording tracks formed on both first and second recording surfaces RS1 and RS2 are clockwise tracks TRA as shown in FIG. 9. In this case, the single-sided, dual-layer disk RC2p rotates counterclockwise in the direction of arrow RdA, and the direction of laser beam LS travel is the same as the direction of the track spiral, i.e., the track paths of the top and bottom recording surfaces are mutually parallel (parallel track paths). The outside end points OA of both top and bottom tracks are again preferably positioned at the same radial position relative to the center axis of the disk RC2p as described above. As also described above with disk RC2o shown in FIG. 11, the access point can be instantaneously shifted from outside end point OA of track TRA on the first recording surface RS1 to the outside end point OA of track TRA on the second recording surface RS2 by appropriately adjusting the focus of the laser beam LS at outside end point OA.

However, for the laser beam LS to continuously access the clockwise recording track TRA on the second recording surface RS2, the recording medium RC2p must be driven in the opposite direction (clockwise, opposite direction RdA). Depending on the radial position of the laser beam LS, however, it is inefficient to change the rotational direction of the recording medium. As shown by the diagonal arrow in FIG. 12, the laser beam LS is therefore moved from the outside end point OA of the track on the first recording surface RS1 to the inside end point IA of the track on the second recording surface RS2 to use these physically discrete recording tracks as one logically continuous recording track.
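The cost of the diagonal move in the parallel track path layout can be sketched with the same kind of logical-sector mapping. As before, the function names, the zero-based numbering, and the equal sector count per surface are assumptions for illustration; positions are counted in sectors from the inside circumference.

```python
def locate_parallel_track_path(logical_sector, sectors_per_layer):
    """Map a logical sector to (layer, position) on a parallel track path disk.

    Both layers are read from the inside end point IA to the outside end point OA.
    """
    if not 0 <= logical_sector < 2 * sectors_per_layer:
        raise ValueError("logical sector out of range")
    layer, position = divmod(logical_sector, sectors_per_layer)
    return layer, position


def seek_at_layer_change(sectors_per_layer):
    """Positions the beam must cross when continuing from RS1 onto RS2."""
    _, last_pos = locate_parallel_track_path(sectors_per_layer - 1, sectors_per_layer)
    _, first_pos = locate_parallel_track_path(sectors_per_layer, sectors_per_layer)
    return abs(first_pos - last_pos)


last_on_rs1 = locate_parallel_track_path(99, 100)
first_on_rs2 = locate_parallel_track_path(100, 100)
seek = seek_at_layer_change(100)
```

Here the beam has to travel from the outside end of RS1 to the inside end of RS2, nearly the full width of the recorded area, which is why this layout suits independently accessed titles better than one long continuous stream.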

Rather than using the recording tracks on top and bottom recording surfaces as one continuous recording track, it is also possible to use the recording tracks to record the multimedia bitstreams MBS for different titles.

This type of digital video disk recording medium is called a "single-sided, dual-layer disk with parallel track paths."

Note that if the direction of the tracks formed on the recording surfaces RS1 and RS2 is opposite that described above, i.e., counterclockwise recording tracks TRB are formed, disk operation remains the same as that described above except for the direction of disk rotation, which is clockwise as shown by arrow RdB.

Whether using clockwise or counterclockwise recording tracks, the single-sided, dual-layer disk RC2p with parallel track paths thus described is well-suited to storing on a single disk encyclopedia and similar multimedia bitstreams comprising multiple titles that are frequently and randomly accessed.

An exploded view of the dual-sided single-layer DVD recording medium RC3 comprising one recording surface layer RS1 and RS2 on each side as shown in FIG. 8 is shown as DVD recording medium RC3s in FIG. 13. Clockwise recording track TRA is provided on the one recording surface RS1, and a counterclockwise recording track TRB is provided on the other recording surface RS2. As in the preceding recording media, the outside end points OA and OB of the recording tracks on each recording surface are preferably positioned at the same radial position relative to the center axis of the DVD recording medium RC3s.

Note that while the recording tracks on these recording surfaces RS1 and RS2 rotate in opposite directions, the track paths are symmetrical. This type of recording medium is therefore known as a double-sided dual layer disk with symmetrical track paths. This double-sided dual layer disk with symmetrical track paths RC3s rotates in direction RdA when reading/writing the first recording surface RS1. As a result, the track path on the second recording surface RS2 on the opposite side is opposite the direction DrB in which the track winds, i.e., direction DrA. Accessing both recording surfaces RS1 and RS2 using a single laser beam LS is therefore not realistic irrespective of whether access is continuous or non-continuous. In addition, a multimedia bitstream MBS is separately recorded to the recording surfaces on the first and second sides of the disk.

A different example of the double-sided single layer disk RC3 shown in FIG. 8 is shown in FIG. 14 as disk RC3a. Note that this disk comprises clockwise recording tracks TRA as shown in FIG. 9 on both recording surfaces RS1 and RS2. As with the preceding recording media, the outside end points OA and OA of the recording tracks on each recording surface are preferably positioned at the same radial position relative to the center axis of the DVD recording medium RC3a. Unlike the double-sided dual layer disk with symmetrical track paths RC3s described above, the tracks on these recording surfaces RS1 and RS2 are asymmetrical. This type of disk is therefore known as a double-sided dual layer disk with asymmetrical track paths.

This double-sided dual layer disk with asymmetrical track paths RC3a rotates in direction RdA when reading/writing the first recording surface RS1. As a result, the track path on the second recording surface RS2 on the opposite side is opposite the direction DrA in which the track winds, i.e., direction DrB.

This means that if a laser beam LS is driven continuously from the inside circumference to the outside circumference on the first recording surface RS1, and then from the outside circumference to the inside circumference on the second recording surface RS2, both sides of the recording medium RC3a can be read/written without turning the disk over and without providing different laser beams for the two sides.

The track paths for recording surfaces RS1 and RS2 are also the same with this double-sided dual layer disk with asymmetrical track paths RC3a. As a result, it is also possible to read/write both sides of the disk without providing separated laser beams for each side if the recording medium RC3a is turned over between sides, and the read/write apparatus can therefore be constructed economically.

It should be noted that this recording medium remains functionally identical even if counterclockwise recording track TRB is provided in place of clockwise recording track TRA on both recording surfaces RS1 and RS2.

As described above, the true value of a DVD system whereby the storage capacity of the recording medium can be easily increased by using a multiple layer recording surface is realized in multimedia applications whereby plural video data units, plural audio data units, and plural graphics data units recorded to a single disk are reproduced through interactive operation by the user.

It is therefore possible to achieve one long-standing desire of software (programming) providers, specifically, to provide programming content such as a commercial movie on a single recording medium in plural versions for different language and demographic groups while retaining the image quality of the original.

Parental Control

Content providers of movie and video titles have conventionally had to produce, supply, and manage the inventory of individual titles in multiple languages, typically the language of each distribution market, and multi-rated title packages conforming to the parental control (censorship) regulations of individual countries in Europe and North America. The time and resources required for this are significant. While high image quality is obviously important, the programming content must also be consistently reproducible.

The digital video disk recording medium is close to solving these problems.

Multiple Angles

One interactive operation widely sought in multimedia applications today is for the user to be able to change the position from which a scene is viewed during reproduction of that scene. This capability is achieved by means of the multiple angle function.

This multiple angle function makes possible applications whereby, for example, a user can watch a baseball game from different angles (or virtual positions in the stadium), and can freely switch between the views while viewing is in progress. In this example of a baseball game, the available angles may include a position behind the backstop centered on the catcher, batter, and pitcher; one from behind the backstop centered on a fielder, the pitcher, and the catcher; and one from center field showing the view to the pitcher and catcher.

To meet these requirements, the digital video disk system uses MPEG, the same basic standard format used with Video-CDs to record the video, audio, graphics, and other signal data. Because of the differences in storage capacity, transfer rates, and signal processing performance within the reproduction apparatus, DVD uses MPEG2, the compression method and data format of which differ slightly from the MPEG1 format used with Video-CDs.

It should be noted that the content of and differences between the MPEG1 and MPEG2 standards have no direct relationship to the intent of the present invention, and further description is therefore omitted below (for more information, see MPEG specifications ISO-11172 and ISO-13818).

The data structure of the DVD system according to the present invention is described in detail below with reference to FIGS. 16, 17, 18, 19, 20, and 21.

Multi-scene Control

A fully functional and practical parental lock playback function and multi-angle scene playback function must enable the user to modify the system output in minor, subtle ways while still presenting substantially the same video and audio output. If these functions are achieved by preparing and recording separate titles satisfying each of the many possible parental lock and multi-angle scene playback requests, titles that are substantially identical and differ in only minor ways must be recorded to the recording medium. This results in identical data being repeatedly recorded to the larger part of the recording medium, and significantly reduces the utilization efficiency of the available storage capacity. More particularly, it is virtually impossible to record discrete titles satisfying every possible request even using the massive capacity of the digital video disk medium. While it may be concluded that this problem can be easily solved by increasing the capacity of the recording medium, this is an obviously undesirable solution when the effective use of available system resources is considered.

Using multi-scene control, the concept of which is described in another section below, in a DVD system, it is possible to dynamically construct titles for numerous variations of the same basic content using the smallest possible amount of data, and thereby effectively utilize the available system resources (recording medium). More specifically, titles that can be played back with numerous variations are constructed from basic (common) scene periods containing data common to each title, and multi-scene periods comprising groups of different scenes corresponding to the various requests. During reproduction, the user is able to freely and at any time select particular scenes from the multi-scene periods to dynamically construct a title conforming to the desired content, e.g., a title omitting certain scenes using the parental lock control function.

Note that multi-scene control enabling a parental lock playback control function and multi-angle scene playback is described in another section below with reference to FIG. 21.

Data Structure of the DVD System

The data structure used in the authoring system of a digital video disc system according to the present invention is shown in FIG. 22. To record a multimedia bitstream MBS, this digital video disk system divides the recording medium into three major recording areas, the lead-in area LI, the volume space VS, and the lead-out area LO.

The lead-in area LI is provided at the inside circumference area of the optical disk. In the disks described with reference to FIGS. 9 and 10, the lead-in area LI is positioned at the inside end points IA and IB of each track. Data for stabilizing the operation of the reproducing apparatus when reading starts is written to the lead-in area LI.

The lead-out area LO is correspondingly located at the outside circumference of the optical disk, i.e., at outside end points OA and OB of each track in the disks described with reference to FIGS. 9 and 10. Data identifying the end of the volume space VS is recorded in this lead-out area LO.

The volume space VS is located between the lead-in area LI and lead-out area LO, and is recorded as a one-dimensional array of n+1 (where n is an integer greater than or equal to zero) 2048-byte logic sectors LS. The logic sectors LS are sequentially numbered #0, #1, #2, . . . #n. The volume space VS is also divided into a volume and file structure management area VFS and a file data structure area FDS.

The volume and file structure management area VFS comprises m+1 logic sectors LS#0 to LS#m (where m is an integer greater than or equal to zero and less than n). The file data structure FDS comprises n-m logic sectors LS #m+1 to LS #n.
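The sector arithmetic described above may be sketched as follows. This is a minimal illustration only: the function names and the dictionary keys are hypothetical, and only the 2048-byte sector size and the division into LS#0 to LS#m and LS#m+1 to LS#n are taken from the description.

```python
SECTOR_SIZE = 2048  # bytes per logic sector LS, as stated in the description


def volume_layout(n: int, m: int) -> dict:
    """Return the sector ranges of the VFS and FDS areas (hypothetical helper).

    Sectors are numbered #0..#n; the VFS occupies LS#0..LS#m and the
    FDS occupies LS#m+1..LS#n.
    """
    if not (0 <= m < n):
        raise ValueError("require 0 <= m < n")
    return {
        "total_sectors": n + 1,
        "vfs_sectors": range(0, m + 1),      # m + 1 sectors
        "fds_sectors": range(m + 1, n + 1),  # n - m sectors
        "fds_byte_offset": (m + 1) * SECTOR_SIZE,
    }


def sector_to_byte_offset(sector: int) -> int:
    """Byte offset of a logic sector from the start of the volume space."""
    return sector * SECTOR_SIZE
```

For example, with n = 9 and m = 2 the volume space holds ten sectors, three in the VFS and seven in the FDS.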

Note that this file data structure area FDS corresponds to the multimedia bitstream MBS shown in FIG. 1 and described above.

The volume file structure VFS is the file system for managing the data stored to the volume space VS as files, and is divided into logic sectors LS#0-LS#m where m is the number of sectors required to store all data needed to manage the entire disk, and is a natural number less than n. Information for the files stored to the file data structure area FDS is written to the volume file structure VFS according to a known specification such as ISO-9660 or ISO-13346.

The file data structure area FDS comprises n-m logic sectors LS#m+1 to LS#n, each comprising a video manager VMG sized to an integer multiple of the logic sector (2048.times.I, where I is a known integer), and k video title sets VTS #1-VTS#k (where k is a natural number less than 100).

The video manager VMG stores the title management information for the entire disk, and information for building a volume menu used to set and change reproduction control of the entire volume.

Any video title set VTS #k is also called a "video file" representing a title comprising video, audio, and/or still image data.

The internal structure of each video title set VTS shown in FIG. 22 is shown in FIG. 16. Each video title set VTS comprises VTS information VTSI describing the management information for the entire disk, and the VTS title video objects VOB (VTSTT.sub.-- VOBS), i.e., the system stream of the multimedia bitstream. The VTS information VTSI is described first below, followed by the VTS title VOBS.

The VTS information primarily includes the VTSI management table VTSI.sub.-- MAT and VTSPGC information table VTS.sub.-- PGCIT.

The VTSI management table VTSI.sub.-- MAT stores such information as the internal structure of the video title set VTS, the number of selectable audio streams contained in the video title set VTS, the number of sub-pictures, and the video title set VTS location (storage address).

The VTSPGC information table VTS.sub.-- PGCIT records i (where i is a natural number) program chain (PGC) data blocks VTS.sub.-- PGCI #1-VTS.sub.-- PGCI #i for controlling the playback sequence. Each of the table entries VTS.sub.-- PGCI #i is a data entry expressing the program chain, and comprises j (where j is a natural number) cell playback information blocks C.sub.-- PBI #1-C.sub.-- PBI #j. Each cell playback information block C.sub.-- PBI #j contains the playback sequence of the cell and playback control information.

The program chain PGC is a conceptual structure describing the story of the title content, and therefore defines the structure of each title by describing the cell playback sequence. Note that these cells are described in detail below.

If, for example, the video title set information relates to the menus, the video title set information VTSI is stored to a buffer in the playback device when playback starts. If the user then presses a MENU button on a remote control device, for example, during playback, the playback device references the buffer to fetch the menu information and display the top menu #1. If the menus are hierarchical, the main menu stored as program chain information VTS.sub.-- PGCI #1 may be displayed, for example, by pressing the MENU button, VTS.sub.-- PGCI #2-#9 may correspond to submenus accessed using the numeric keypad on the remote control, and VTS.sub.-- PGCI #10 and higher may correspond to additional submenus further down the hierarchy. Alternatively, VTS.sub.-- PGCI #1 may be the top menu displayed by pressing the MENU button, while VTS.sub.-- PGCI #2 and higher may be voice guidance reproduced by pressing the corresponding numeric key.

The menus themselves are expressed by the plural program chains defined in this table. As a result, the menus may be freely constructed in various ways, and shall not be limited to hierarchical or non-hierarchical menus or menus containing voice guidance.

In the case of a movie, for example, the video title set information VTSI is stored to a buffer in the playback device when playback starts, the playback device references the cell playback sequence described by the program chain PGC, and reproduces the system stream.

The "cells" referenced here may be all or part of the system stream, and are used as access points during playback. Cells can therefore be used, for example, as the "chapters" into which a title may be divided.

Note that each of the PGC information entries C.sub.-- PBI #j contains both cell playback processing information and a cell information table. The cell playback processing information comprises the processing information needed to reproduce the cell, such as the presentation time and number of repetitions. More specifically, this information includes the cell block mode CBM, cell block type CBT, seamless playback flag SPF, interleaved allocation flag IAF, STC resetting flag STCDF, cell presentation time C.sub.-- PBTM, seamless angle change flag SACF, first cell VOBU start address C.sub.-- FVOBU.sub.-- SA, and the last cell VOBU start address C.sub.-- LVOBU.sub.-- SA.

Note that seamless playback refers to the reproduction in a digital video disk system of multimedia data including video, audio, and sub-picture data without intermittent breaks in the data or information. Seamless playback is described in detail in another section below with reference to FIG. 23 and FIG. 24.

The cell block mode CBM indicates whether plural cells constitute one functional block. The cell playback information of each cell in a functional block is arranged consecutively in the PGC information. The cell block mode CBM of the first cell playback information in this sequence contains the value of the first cell in the block, and the cell block mode CBM of the last cell playback information in this sequence contains the value of the last cell in the block. The cell block mode CBM of each cell arrayed between these first and last cells contains a value indicating that the cell is a cell between these first and last cells in that block.

The cell block type CBT identifies the type of the block indicated by the cell block mode CBM. For example, when a multiple angle function is enabled, the cell information corresponding to each of the reproducible angles is programmed as one of the functional blocks mentioned above, and the type of these functional blocks is defined by a value identifying "angle" in the cell block type CBT for each cell in that block.
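The way consecutive cell playback information entries form a functional block may be sketched as follows. This is an illustration under stated assumptions: the description does not give the encoded values of the cell block mode CBM, so the enumeration below uses purely symbolic, hypothetical values, and the function name is likewise hypothetical.

```python
from enum import Enum


class CBM(Enum):
    # Symbolic values only; the description gives no numeric encodings.
    NOT_IN_BLOCK = "none"
    FIRST = "first"
    MIDDLE = "middle"
    LAST = "last"


def group_cell_blocks(modes):
    """Group consecutive cell indices into functional blocks.

    Returns a list of lists. A cell outside any block forms its own
    single-element group. MIDDLE or LAST entries that do not follow a
    FIRST entry are ignored as malformed.
    """
    groups, current = [], None
    for index, mode in enumerate(modes):
        if mode is CBM.NOT_IN_BLOCK:
            groups.append([index])
        elif mode is CBM.FIRST:
            current = [index]
        elif current is not None:
            current.append(index)
            if mode is CBM.LAST:
                groups.append(current)
                current = None
    return groups
```

In a multi-angle title, for example, the cells of one such group would all carry a cell block type CBT identifying "angle".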

The seamless playback flag SPF simply indicates whether the corresponding cell is to be linked and played back seamlessly with the cell or cell block reproduced immediately therebefore. To seamlessly reproduce a given cell with the preceding cell or cell block, the seamless playback flag SPF is set to 1 in the cell playback information for that cell; otherwise SPF is set to 0.

The interleaved allocation flag IAF stores a value identifying whether the cell exists in a contiguous or interleaved block. If the cell is part of an interleaved block, the flag IAF is set to 1; otherwise it is set to 0.

The STC resetting flag STCDF identifies whether the system time clock STC used for synchronization must be reset when the cell is played back; when resetting the system time clock STC is necessary, the STC resetting flag STCDF is set to 1.

The seamless angle change flag SACF stores a value indicating whether a cell in a multi-angle period should be connected seamlessly at an angle change. If the angle change is seamless, the seamless angle change flag SACF is set to 1; otherwise it is set to 0.

The cell presentation time C.sub.-- PBTM expresses the cell presentation time with video frame precision.

The first cell VOBU start address C.sub.-- FVOBU.sub.-- SA is the VOBU start address of the first cell in a block, and is also expressed as the distance from the logic sector of the first cell in the VTS title VOBS (VTSTT.sub.-- VOBS) as measured by the number of sectors.

The last cell VOBU start address C.sub.-- LVOBU.sub.-- SA is the VOBU start address of the last cell in the block. The value of this address is expressed as the distance from the logic sector of the first cell in the VTS title VOBS (VTSTT.sub.-- VOBS) as measured by the number of sectors.
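The flags and addresses of the cell playback information may be gathered into a single record as sketched below. The class and function names are hypothetical, the identifiers write the subscripted field names with underscores (for example C_FVOBU_SA), and the action strings are illustrative only; the flag meanings and the sector-based addressing follow the description.

```python
from dataclasses import dataclass

SECTOR_SIZE = 2048  # bytes per logic sector, as stated for the volume space


@dataclass
class CellPlaybackInfo:
    """Hypothetical in-memory form of one C_PBI entry."""
    spf: int         # seamless playback flag: 1 = link seamlessly to previous cell
    iaf: int         # interleaved allocation flag: 1 = cell is in an interleaved block
    stcdf: int       # STC resetting flag: 1 = reset the system time clock
    sacf: int        # seamless angle change flag: 1 = angle change is seamless
    c_fvobu_sa: int  # first cell VOBU start address, in sectors
    c_lvobu_sa: int  # last cell VOBU start address, in sectors


def playback_actions(cell: CellPlaybackInfo) -> list:
    """List the actions a player might take before reproducing the cell."""
    actions = []
    if cell.stcdf == 1:
        actions.append("reset STC")
    if cell.spf == 1:
        actions.append("connect seamlessly to previous cell")
    if cell.iaf == 1:
        actions.append("read as interleaved block")
    return actions


def first_vobu_byte_offset(cell: CellPlaybackInfo) -> int:
    """Byte offset of the first VOBU from the start of VTSTT_VOBS."""
    return cell.c_fvobu_sa * SECTOR_SIZE
```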

The VTS title VOBS (VTSTT.sub.-- VOBS), i.e., the multimedia system stream data, is described next. The system stream data VTSTT.sub.-- VOBS comprises i (where i is a natural number) system streams SS, each of which is referred to as a "video object" (VOB). Each video object VOB #1-VOB #i comprises at least one video data block interleaved with up to a maximum eight audio data blocks and up to a maximum 32 sub-picture data blocks.

Each video object VOB comprises q (where q is a natural number) cells C#1-C#q. Each cell C comprises r (where r is a natural number) video object units VOBU #1-VOBU #r.

Each video object unit VOBU comprises plural groups.sub.-- of.sub.-- pictures GOP, and the audio and sub-pictures corresponding to the playback of said plural groups.sub.-- of.sub.-- pictures GOP. Note that the group.sub.-- of.sub.-- pictures GOP corresponds to the video encoding refresh cycle. Each video object unit VOBU also starts with an NV pack, i.e., the control data for that VOBU.

The structure of the navigation packs NV is described with reference to FIG. 19.

Before describing the navigation pack NV, the internal structure of the video zone VZ (see FIG. 22), i.e., the system stream St35 encoded by the authoring encoder EC described with reference to FIG. 25, is described with reference to FIG. 17. Note that the encoded video stream St15 shown in FIG. 17 is the compressed one-dimensional video data stream encoded by the video encoder 300. The encoded audio stream St19 is likewise the compressed one-dimensional audio data stream multiplexing the right and left stereo audio channels encoded by the audio encoder 700. Note that the audio signal shall not be limited to a stereo signal, and may also be a multichannel surround-sound signal.

The system stream (title editing unit VOB) St35 is a one dimensional array of packs with a byte size corresponding to the logic sectors LS #n having a 2048-byte capacity as described using FIG. 22. A stream control pack is placed at the beginning of the title editing unit (VOB) St35, i.e., at the beginning of the video object unit VOBU. This stream control pack is called the "navigation pack NV", and records the data arrangement in the system stream and other control information.

The encoded video stream St15 and the encoded audio stream St19 are packetized in byte units corresponding to the system stream packs. These packets are shown in FIG. 17 as packets V1, V2, V3, V4 . . . and A1, A2, A3 . . . . As shown in FIG. 17, these packets are interleaved in the appropriate sequence as system stream St35, thus forming a packet stream, with consideration given to the decoder buffer size and the time required by the decoder to expand the video and audio data packets. In the example shown in FIG. 17, the packet stream is interleaved in the sequence V1, V2, A1, V3, V4, A2 . . . .
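The packet order of this example may be reproduced by the sketch below. It is not the multiplexing method of the system: an actual multiplexer schedules packets in view of the decoder buffer size and decoding time, whereas this hypothetical function simply assumes a fixed ratio of two video packets per audio packet so that the sequence of FIG. 17 results.

```python
def interleave(video, audio, video_per_audio=2):
    """Interleave packet labels at a fixed video-to-audio ratio (assumption)."""
    out = []
    audio_iter = iter(audio)
    for count, video_packet in enumerate(video, start=1):
        out.append(video_packet)
        if count % video_per_audio == 0:
            audio_packet = next(audio_iter, None)
            if audio_packet is not None:
                out.append(audio_packet)
    out.extend(audio_iter)  # append any audio packets left over
    return out
```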

Note that the sequence shown in FIG. 17 interleaves one video data unit with one audio data unit. Significantly increased recording/playback capacity, high speed recording/playback, and performance improvements in the signal processing LSI enable the DVD system to record plural audio data and plural sub-picture data (graphics data) to one video data unit in a single interleaved MPEG system stream, and thereby enable the user to select the specific audio data and sub-picture data to be reproduced during playback. The structure of the system stream used in this type of DVD system is shown in FIG. 18 and described below.

As in FIG. 17, the packetized encoded video stream St15 is shown in FIG. 18 as V1, V2, V3, V4, . . . . In this example, however, there is not just one encoded audio stream St19, but three encoded audio streams St19A, St19B, and St19C input as the source data. There are also two encoded sub-picture streams St17A and St17B input as the source data sub-picture streams. These six compressed data streams, St15, St19A, St19B, St19C, St17A and St17B, are interleaved to a single system stream St35.

The video data is encoded according to the MPEG specification with the group.sub.-- of.sub.-- pictures GOP being the unit of compression. In general, each group.sub.-- of.sub.-- pictures GOP contains 15 frames in the case of an NTSC signal, but the specific number of frames compressed to one GOP is variable. The stream management pack, which describes the management data containing, for example, the relationship between interleaved data, is also interleaved at the GOP unit interval. Because the group.sub.-- of.sub.-- pictures GOP unit is based on the video data, changing the number of video frames per GOP unit changes the interval of the stream management packs. This interval is expressed in terms of the presentation time on the digital video disk within a range from 0.4 sec. to 1.0 sec. referenced to the GOP unit. If the presentation time of contiguous plural GOP units is less than 1 sec., the management data packs for the video data of the plural GOP units is interleaved to a single stream.

These management data packs are referred to as navigation packs NV in the digital video disk system. The data from one navigation pack NV to the packet immediately preceding the next navigation pack NV forms one video object unit VOBU. In general, one contiguous playback unit that can be defined as one scene is called a video object VOB, and each video object VOB contains plural video object units VOBU. Data sets of plural video objects VOB form a VOB set (VOBS). Note that these data units were first used in the digital video disk.
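The rule that one video object unit VOBU extends from one navigation pack NV to the pack immediately preceding the next navigation pack NV may be sketched as follows. The function name and the use of plain string labels for packs are assumptions made for illustration.

```python
def split_into_vobus(packs):
    """Split a pack sequence into VOBUs, each beginning with an "NV" pack.

    Packs appearing before the first navigation pack are ignored.
    """
    vobus = []
    for pack in packs:
        if pack == "NV":
            vobus.append([pack])      # a navigation pack opens a new VOBU
        elif vobus:
            vobus[-1].append(pack)    # other packs belong to the current VOBU
    return vobus
```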

When plural of these data streams are interleaved, the navigation packs NV defining the relationship between the interleaved packs must also be interleaved at a defined unit known as the pack number unit. Each group.sub.-- of.sub.-- pictures GOP is normally a unit containing approximately 0.5 sec. of video data, which is equivalent to the presentation time required for 12-15 frames, and one navigation pack NV is generally interleaved with the number of data packets required for this presentation time.

The stream management information contained in the interleaved video, audio, and sub-picture data packets constituting the system stream is described below with reference to FIG. 19. As shown in FIG. 19, the data contained in the system stream is recorded in a format packed or packetized according to the MPEG2 standard. The packet structure is essentially the same for video, audio, and sub-picture data. One pack in the digital video disk system has a 2048 byte capacity as described above, and contains a pack header PKH and one packet PES; each packet PES contains a packet header PTH and data block.

The pack header PKH records the time at which that pack is to be sent from stream buffer 2400 to system decoder 2500 (see FIG. 26), i.e., the system clock reference SCR defining the reference time for synchronized audio-visual data playback. The MPEG standard assumes that the system clock reference SCR is the reference clock for the entire decoder operation. With such disk media as the digital video disk, however, time management specific to individual disk players can be used, and a reference clock for the decoder system is therefore separately provided.

The packet header PTH similarly contains a presentation time stamp PTS and a decoding time stamp DTS, both of which are placed in the packet before the access unit (the decoding unit). The presentation time stamp PTS defines the time at which the video data or audio data contained in the packet should be output as the playback output after being decoded, and the decoding time stamp DTS defines the time at which the video stream should be decoded. Note that the presentation time stamp PTS effectively defines the display start timing of the access unit, and the decoding time stamp DTS effectively defines the decoding start timing of the access unit. If the PTS and DTS are the same time, the DTS is omitted.
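The handling of an omitted decoding time stamp DTS may be sketched as follows. The function names and the numeric time stamps in the usage note are hypothetical and serve only to show that, when the DTS is absent, the PTS is taken as the decoding time.

```python
from typing import Optional


def effective_dts(pts: int, dts: Optional[int]) -> int:
    """Return the decoding time; the DTS is omitted when it equals the PTS."""
    return pts if dts is None else dts


def decode_order(units):
    """Sort (name, pts, dts) access units by their effective decoding time."""
    ordered = sorted(units, key=lambda unit: effective_dts(unit[1], unit[2]))
    return [name for name, _pts, _dts in ordered]
```

Given three access units B (PTS 6006, no DTS), I (PTS 3003, DTS 0), and P (PTS 12012, DTS 3003), the decoding order under these assumed values is I, P, B, although the presentation order is I, B, P.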

The packet header PTH also contains an 8-bit field called the stream ID identifying the packet type, i.e., whether the packet is a video packet containing a video data stream, a private packet, or an MPEG audio packet.

Private packets under the MPEG2 standard are data packets of which the content can be freely defined. Private packet 1 in this embodiment of the invention is used to carry audio data other than the MPEG audio data, and sub-picture data; private packet 2 carries the PCI packet and DSI packet.
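Classification of a packet by its 8-bit stream ID may be sketched as follows. The numeric values are not stated in this description; they are the stream_id assignments of the MPEG2 systems specification (ISO-13818), and the function name and returned labels are hypothetical.

```python
def classify_stream_id(stream_id: int) -> str:
    """Classify a packet by stream ID (values per ISO-13818, not this text)."""
    if stream_id == 0xBD:
        return "private_stream_1"   # non-MPEG audio and sub-picture data
    if stream_id == 0xBF:
        return "private_stream_2"   # PCI and DSI packets
    if 0xC0 <= stream_id <= 0xDF:
        return "mpeg_audio"
    if 0xE0 <= stream_id <= 0xEF:
        return "video"
    return "other"
```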

Private packets 1 and 2 each comprise a packet header, private data area, and data area. The private data area contains an 8-bit sub-stream ID indicating whether the recorded data is audio data or sub-picture data. The audio data defined by private packet 1 may be defined as any of eight types #0-#7 of linear PCM or AC-3 encoded data. Sub-picture data may be defined as one of up to 32 types #0-#31.

The data area is the field to which data compressed according to the MPEG2 specification is written if the stored data is video data; linear PCM, AC-3, or MPEG encoded data is written if audio data is stored; or graphics data compressed by runlength coding is written if sub-picture data is stored.

MPEG2-compressed video data may be compressed by constant bit rate (CBR) or variable bit rate (VBR) coding. With constant bit rate coding, the video stream is input continuously to the video buffer at a constant rate. This contrasts with variable bit rate coding in which the video stream is input intermittently to the video buffer, thereby making it possible to suppress the generation of unnecessary code. Both constant bit rate and variable bit rate coding can be used in the digital video disk system.

Because MPEG video data is compressed with variable length coding, the data quantity in each group.sub.-- of.sub.-- pictures GOP is not constant. The video and audio decoding times also differ, and the time-base relationship between the video and audio data read from an optical disk, and the time-base relationship between the video and audio data output from the decoder, do not match. The method of time-base synchronizing the video and audio data is therefore described in detail below with reference to FIG. 26, but is described briefly below based on constant bit rate coding.

The navigation pack NV structure is shown in FIG. 20. Each navigation pack NV starts with a pack header PKH, and contains a PCI packet and DSI packet.

As described above, the pack header PKH records the time at which that pack is to be sent from stream buffer 2400 to system decoder 2500 (see FIG. 26), i.e., the system clock reference SCR defining the reference time for synchronized audio-visual data playback.

Each PCI packet contains PCI General Information (PCI.sub.-- GI) and Angle Information for Non-seamless playback (NMSL.sub.-- AGLI).

The PCI General Information (PCI.sub.-- GI) declares the display time of the first video frame (the Start PTM of VOBU (VOBU.sub.-- S.sub.-- PTM)), and the display time of the last video frame (End PTM of VOBU (VOBU.sub.-- E.sub.-- PTM)), in the corresponding video object unit VOBU with system clock precision (90 kHz).
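Because these times are counted at the 90 kHz system clock, the interval between the start and end presentation times of a video object unit VOBU converts to seconds as sketched below. The function name is hypothetical, and wraparound of the clock value is not handled in this simple illustration.

```python
SYSTEM_CLOCK_HZ = 90_000  # system clock precision stated in the description


def ptm_interval_seconds(vobu_s_ptm: int, vobu_e_ptm: int) -> float:
    """Seconds between VOBU_S_PTM and VOBU_E_PTM, both in 90 kHz ticks."""
    return (vobu_e_ptm - vobu_s_ptm) / SYSTEM_CLOCK_HZ
```

An interval of 45000 ticks, for example, corresponds to 0.5 sec.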

The Angle Information for Non-seamless playback (NMSL.sub.-- AGLI) states the read start address of the corresponding video object unit VOBU when the angle is changed expressed as the number of sectors from the beginning of the video object VOB. Because there are nine or fewer angles in this example, there are nine angle address declaration cells: Destination Address of Angle Cell #1 for Non-seamless playback (NMSL.sub.-- AGL.sub.-- C1.sub.-- DSTA) to Destination Address of Angle Cell #9 for Non-seamless playback (NMSL.sub.-- AGL.sub.-- C9.sub.-- DSTA).

Each DSI packet contains DSI General Information (DSI.sub.-- GI), Seamless Playback Information (SML.sub.-- PBI), and Angle Information for Seamless playback (SML.sub.-- AGLI).

The DSI General Information (DSI.sub.-- GI) declares the address of the last pack in the video object unit VOBU, i.e., the End Address for VOBU (VOBU.sub.-- EA), expressed as the number of sectors from the beginning of the video object unit VOBU.

While seamless playback is described in detail later, it should be noted that the continuously read data units must be interleaved (multiplexed) at the system stream level as an interleaved unit ILVU in order to seamlessly reproduce split or combined titles. Plural system streams interleaved with the interleaved unit ILVU as the smallest unit are defined as an interleaved block.

The Seamless Playback Information (SML.sub.-- PBI) is declared to seamlessly reproduce the stream interleaved with the interleaved unit ILVU as the smallest data unit, and contains an Interleaved Unit Flag (ILVU flag) identifying whether the corresponding video object unit VOBU is an interleaved block. The ILVU flag indicates whether the video object unit VOBU is in an interleaved block, and is set to 1 when it is. Otherwise the ILVU flag is set to 0.

When a video object unit VOBU is in an interleaved block, a Unit END flag is declared to indicate whether the video object unit VOBU is the last VOBU in the interleaved unit ILVU. Because the interleaved unit ILVU is the data unit for continuous reading, the Unit END flag is set to 1 if the VOBU currently being read is the last VOBU in the interleaved unit ILVU. Otherwise the Unit END flag is set to 0.

An Interleaved Unit End Address (ILVU.sub.-- EA) identifying the address of the last pack in the ILVU to which the VOBU belongs, and the starting address of the next interleaved unit ILVU, Next Interleaved Unit Start Address (NT.sub.-- ILVU.sub.-- SA), are also declared when a video object unit VOBU is in an interleaved block. Both the Interleaved Unit End Address (ILVU.sub.-- EA) and Next Interleaved Unit Start Address (NT.sub.-- ILVU.sub.-- SA) are expressed as the number of sectors from the navigation pack NV of that VOBU.
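How a reader might use these fields to choose the next sector to read may be sketched as follows. This is an illustration under stated assumptions, not the reading method of the apparatus: the class and function names are hypothetical, the navigation pack NV is assumed to occupy the first sector of its VOBU, VOBU_EA is assumed to be a zero-based sector offset, and the next VOBU is assumed to follow immediately when no jump is required.

```python
from dataclasses import dataclass


@dataclass
class SeamlessPlaybackInfo:
    """Hypothetical in-memory form of the SML_PBI fields named in the text."""
    ilvu_flag: int      # 1 if the VOBU is in an interleaved block
    unit_end_flag: int  # 1 if the VOBU is the last VOBU of its ILVU
    ilvu_ea: int        # last pack of the ILVU, in sectors from this NV pack
    nt_ilvu_sa: int     # start of the next ILVU, in sectors from this NV pack


def next_read_sector(nv_sector: int, pbi: SeamlessPlaybackInfo, vobu_ea: int) -> int:
    """Sector to read after the current VOBU has been read."""
    if pbi.ilvu_flag == 1 and pbi.unit_end_flag == 1:
        # Last VOBU of the interleaved unit: jump to the next ILVU.
        return nv_sector + pbi.nt_ilvu_sa
    # Otherwise continue with the pack following the end of this VOBU.
    return nv_sector + vobu_ea + 1
```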

When two system streams are seamlessly connected but the audio components of the two system streams are not contiguous, particularly immediately before and after the seam, it is necessary to pause the audio output to synchronize the audio and video components of the system stream following the seam. Note that non-contiguous audio may result from different audio signals being recorded with the corresponding video blocks. With an NTSC signal, for example, the video frame cycle is approximately 33.33 msec while the AC-3 audio frame cycle is 32 msec.
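The mismatch between the two frame cycles may be illustrated with the following arithmetic sketch. It uses the approximate 33.33 msec video frame cycle given above (taken here as exactly 1000/30 msec, an assumption made for exact arithmetic) and the 32 msec AC-3 audio frame cycle; the function name is hypothetical.

```python
from fractions import Fraction

VIDEO_FRAME_MS = Fraction(1000, 30)  # approx. 33.33 msec, per the description
AUDIO_FRAME_MS = Fraction(32)        # AC-3 audio frame cycle, per the description


def audio_remainder_ms(video_frames: int) -> Fraction:
    """Time left over after fitting whole audio frames into the video period."""
    total = video_frames * VIDEO_FRAME_MS
    whole_audio_frames = total // AUDIO_FRAME_MS
    return total - whole_audio_frames * AUDIO_FRAME_MS
```

With 15 video frames (500 msec), only 15 whole audio frames (480 msec) fit, leaving 20 msec unaccounted for; with 24 video frames (800 msec), exactly 25 audio frames fit. The video and audio boundaries therefore coincide only occasionally, which is why the audio on either side of a seam may not be contiguous.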

To enable this resynchronization, audio reproduction stopping times 1 and 2, i.e., Audio Stop PTM 1 in VOB (VOB.sub.-- A.sub.-- STP.sub.-- PTM1), and Audio Stop PTM 2 in VOB (VOB.sub.-- A.sub.-- STP.sub.-- PTM2), indicating the time at which the audio is to be paused; and audio reproduction stopping periods 1 and 2, i.e., Audio Gap Length 1 in VOB (VOB.sub.-- A.sub.-- GAP.sub.-- LEN1) and Audio Gap Length 2 in VOB (VOB.sub.-- A.sub.-- GAP.sub.-- LEN2), indicating for how long the audio is to be paused, are also declared in the DSI packet. Note that these times are specified at the system clock precision (90 kHz).
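The frame-cycle mismatch that gives rise to an audio gap can be worked through numerically. The sketch below is an assumption-laden illustration, not the method of the specification: it takes the approximate 33.33 msec video frame cycle as exactly 1/30 second, expresses both cycles in 90 kHz clock ticks, and computes how much of a video block is left uncovered by whole audio frames. The function and constant names are hypothetical.

```python
CLOCK_HZ = 90_000  # system clock precision stated in the text


def ms_to_ticks(ms: float) -> int:
    """Convert milliseconds to 90 kHz clock ticks."""
    return round(ms * CLOCK_HZ / 1000)


# Assumption: the "approximately 33.33 msec" video frame is taken as 1/30 s.
VIDEO_FRAME_TICKS = CLOCK_HZ // 30   # 3000 ticks
AUDIO_FRAME_TICKS = ms_to_ticks(32)  # 2880 ticks for a 32 msec AC-3 frame


def audio_shortfall(video_frames: int) -> int:
    """Ticks of a video block not covered by a whole number of audio frames."""
    video_ticks = video_frames * VIDEO_FRAME_TICKS
    whole_audio_frames = video_ticks // AUDIO_FRAME_TICKS
    return video_ticks - whole_audio_frames * AUDIO_FRAME_TICKS
```

Under these assumptions a 10-frame video block leaves 1200 ticks (about 13.33 msec) uncovered, while a 24-frame block is covered exactly by 25 audio frames.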

The Angle Information for Seamless playback (SML.sub.-- AGLI) declares the read start address when the angle is changed. Note that this field is valid when seamless, multi-angle control is enabled. This address is also expressed as the number of sectors from the navigation pack NV of that VOBU. Because there are nine or fewer angles, there are nine angle address declaration cells: Destination Address of Angle Cell #1 for Seamless playback (SML.sub.-- AGL.sub.-- C1.sub.-- DSTA) to Destination Address of Angle Cell #9 for Seamless playback (SML.sub.-- AGL.sub.-- C9.sub.-- DSTA).

Note also that each title is edited in video object (VOB) units. Interleaved video objects (interleaved title editing units) are referenced as "VOBS"; and the encoded range of the source data is the encoding unit.

DVD Encoder

A preferred embodiment of a digital video disk system authoring encoder ECD in which the multimedia bitstream authoring system according to the present invention is applied to a digital video disk system is described below and shown in FIG. 25. It will be obvious that the authoring encoder ECD applied to the digital video disk system, referred to below as a DVD encoder, is substantially identical to the authoring encoder EC shown in FIG. 2. The basic difference between these encoders is the replacement in the DVD encoder ECD of the video zone formatter 1300 of the authoring encoder EC above with a VOB buffer 1000 and formatter 1100. It will also be obvious that the bitstream encoded by this DVD encoder ECD is recorded to a digital video disk medium M. The operation of this DVD encoder ECD is therefore described below in comparison with the authoring encoder EC described above.

As in the above authoring encoder EC, the encoding system controller 200 generates control signals St9, St11, St13, St21, St23, St25, St33, and St39 based on the scenario data St7 describing the user-defined editing instructions input from the scenario editor 100, and controls the video encoder 300, sub-picture encoder 500, and audio encoder 700 in the DVD encoder ECD. Note that the user-defined editing instructions in the DVD encoder ECD are a superset of the editing instructions of the authoring encoder EC described above.

Specifically, the user-defined editing instructions (scenario data St7) in the DVD encoder ECD similarly describe what source data is selected from all or a subset of the source data containing plural titles within a defined time period, and how the selected source data is reassembled to reproduce the scenario (sequence) intended by the user. The scenario data St7 of the DVD encoder ECD, however, further contains such information as: the number of streams contained in the editing units, which are obtained by splitting a multi-title source stream into blocks at a constant time interval; the number of audio and sub-picture data cells contained in each stream, and the sub-picture display time and period; whether the title is a multi-rated title enabling parental lock control; whether the user content is selected from plural streams including, for example, multiple viewing angles; and the method of connecting scenes when the angle is switched among the multiple viewing angles.

The scenario data St7 of the DVD encoder ECD also contains control information on a video object VOB unit basis. This information is required to encode the media source stream, and specifically includes such information as whether there are multiple angles or parental control features. When multiple angle viewing is enabled, the scenario data St7 also contains the encoding bit rate of each stream considering data interleaving and the disk capacity, the start and end times of each control, and whether a seamless connection should be made between the preceding and following streams.

The encoding system controller 200 extracts this information from the scenario data St7, and generates the encoding information table and encoding parameters required for encoding control. The encoding information table and encoding parameters are described with reference to FIGS. 27, 28, and 29 below.

The stream encoding data St33 contains the system stream encoding parameters and system encoding start and end timing values required by the DVD system to generate the VOBS. These system stream encoding parameters include the conditions for connecting one video object VOB with those before and after, the number of audio streams, the audio encoding information and audio IDs, the number of sub-pictures and the sub-picture IDs, the video playback starting time information VPTS, and the audio playback starting time information APTS.

The title sequence control signal St39 supplies the multimedia bitstream MBS formatting start and end timing information and formatting parameters declaring the reproduction control information and interleave information.

Based on the video encoding parameter and encoding start/end timing signal St9, the video encoder 300 encodes a specific part of the video stream St1 to generate an elementary stream conforming to the MPEG2 Video standard defined in ISO-13818. This elementary stream is output to the video stream buffer 400 as encoded video stream St15.

Note that while the video encoder 300 generates an elementary stream conforming to the MPEG2 Video standard defined in ISO-13818, specific encoding parameters are input via the video encoding parameter signal St9, including the encoding start and end timing, bit rate, the encoding conditions for the encoding start and end, the material type, including whether the material is an NTSC or PAL video signal or telecine converted material, and whether the encoding mode is set for either open GOP or closed GOP encoding.

The MPEG2 coding method is basically an interframe coding method using the correlation between frames for maximum signal compression, i.e., the frame being coded (the target frame) is coded by referencing frames before and/or after the target frame. However, intra-coded frames, i.e., frames that are coded based solely on the content of the target frame, are also inserted to avoid error propagation and enable accessibility from mid-stream (random access). The coding unit containing at least one intra-coded frame ("intra-frame") is called a group.sub.-- of.sub.-- pictures GOP.

A group.sub.-- of.sub.-- pictures GOP in which coding is closed completely within that GOP is known as a "closed GOP." A group.sub.-- of.sub.-- pictures GOP containing a frame coded with reference to a frame in a preceding or following group.sub.-- of.sub.-- pictures GOP is an "open GOP" (ISO-13818 does not limit P- and B-picture coding to referencing past frames). It is therefore possible to play back a closed GOP using only that GOP. Reproducing an open GOP, however, also requires the presence of the referenced GOP, generally the GOP preceding the open GOP.
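The distinction between closed and open GOPs can be illustrated with a minimal sketch. The representation below, in which each frame lists the frames it references as (GOP index, frame index) pairs, is a hypothetical model chosen for illustration and is not a structure defined by the MPEG2 standard or by this specification.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    kind: str                                 # "I", "P", or "B"
    refs: list = field(default_factory=list)  # (gop_index, frame_index) pairs


def is_closed_gop(gop_index: int, frames: list) -> bool:
    """A GOP is closed when every reference stays inside the same GOP."""
    return all(ref_gop == gop_index
               for frame in frames
               for (ref_gop, _) in frame.refs)


# GOP 1 references only its own frames, so it can be played back alone.
closed_gop = [Frame("I"), Frame("P", [(1, 0)]), Frame("B", [(1, 0), (1, 1)])]

# The leading B frame also references frame 5 of GOP 0, so GOP 0 must be present.
open_gop = [Frame("B", [(0, 5), (1, 1)]), Frame("I"), Frame("P", [(1, 1)])]
```

In this model, `is_closed_gop(1, closed_gop)` is true and `is_closed_gop(1, open_gop)` is false.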

The GOP is often used as the access unit. For example, the GOP may be used as the playback start point for reproducing a title from the middle, as a transition point in a movie, or for fast-forward play and other special reproduction modes. High speed reproduction can be achieved in such cases by reproducing only the intra-frame coded frames in a GOP or by reproducing only frames in GOP units.

Based on the sub-picture stream encoding parameter signal St11, the sub-picture encoder 500 encodes a specific part of the sub-picture stream St3 to generate a variable length coded bitstream of bitmapped data. This variable length coded bitstream data is output as the encoded sub-picture stream St17 to the sub-picture stream buffer 600.

Based on the audio encoding parameter signal St13, the audio encoder 700 encodes a specific part of the audio stream St5 to generate the encoded audio data. This encoded audio data may be data based on the MPEG1 audio standard defined in ISO-11172 and the MPEG2 audio standard defined in ISO-13818, AC-3 audio data, or PCM (LPCM) data. Note that the methods and means of encoding audio data according to these standards are known and commonly available.

The video stream buffer 400 is connected to the video encoder 300 and to the encoding system controller 200. The video stream buffer 400 stores the encoded video stream St15 input from the video encoder 300, and outputs the stored encoded video stream St15 as the time-delayed encoded video stream St27 based on the timing signal St21 supplied from the encoding system controller 200.

The sub-picture stream buffer 600 is similarly connected to the sub-picture encoder 500 and to the encoding system controller 200. The sub-picture stream buffer 600 stores the encoded sub-picture stream St17 input from the sub-picture encoder 500, and then outputs the stored encoded sub-picture stream St17 as time-delayed encoded sub-picture stream St29 based on the timing signal St23 supplied from the encoding system controller 200.

The audio stream buffer 800 is similarly connected to the audio encoder 700 and to the encoding system controller 200. The audio stream buffer 800 stores the encoded audio stream St19 input from the audio encoder 700, and then outputs the encoded audio stream St19 as the time-delayed encoded audio stream St31 based on the timing signal St25 supplied from the encoding system controller 200.

The system encoder 900 is connected to the video stream buffer 400, sub-picture stream buffer 600, audio stream buffer 800, and the encoding system controller 200, and is respectively supplied thereby with the time-delayed encoded video stream St27, time-delayed encoded sub-picture stream St29, time-delayed encoded audio stream St31, and the system stream encoding parameter data St33. Note that the system encoder 900 is a multiplexer that multiplexes the time-delayed streams St27, St29, and St31 based on the stream encoding data St33 (timing signal) to generate title editing units (VOBs) St35.

The VOB buffer 1000 temporarily stores the video objects VOBs produced by the system encoder 900. The formatter 1100 reads the delayed video objects VOB from the VOB buffer 1000 based on the title sequence control signal St39 to generate one video zone VZ, and adds the volume file structure VFS to generate the edited multimedia stream data St43.

The multimedia bitstream MBS St43 edited according to the user-defined scenario is then sent to the recorder 1200. The recorder 1200 processes the edited multimedia stream data St43 to the data stream St45 format of the recording medium M, and thus records the formatted data stream St45 to the recording medium M.

DVD Decoder

A preferred embodiment of a digital video disk system authoring decoder DCD in which the multimedia bitstream authoring system of the present invention is applied to a digital video disk system is described below and shown in FIG. 26. The authoring decoder DCD applied to the digital video disk system, referred to below as a DVD decoder DCD, decodes the multimedia bitstream MBS edited using the DVD encoder ECD of the present invention, and recreates the content of each title according to the user-defined scenario. It will also be obvious that the multimedia bitstream St45 encoded by this DVD encoder ECD is recorded to a digital video disk medium M.

The basic configuration of the DVD decoder DCD according to this embodiment is the same as that of the authoring decoder DC shown in FIG. 3. The differences are that a different video decoder 3801 (shown as 3800 in FIG. 23) is used in place of the video decoder 3800, and a reordering buffer 3300 and selector 3400 are disposed between the video decoder 3801 and synthesizer 3500.

Note that the selector 3400 is connected to the synchronizer 2900, and is controlled by a switching signal St103.

The operation of this DVD decoder DCD is therefore described below in comparison with the authoring decoder DC described above.

As shown in FIG. 26, the DVD decoder DCD comprises a multimedia bitstream producer 2000, scenario selector 2100, decoding system controller 2300, stream buffer 2400, system decoder 2500, video buffer 2600, sub-picture buffer 2700, audio buffer 2800, synchronizer 2900, video decoder 3801, reordering buffer 3300, sub-picture decoder 3100, audio decoder 3200, selector 3400, synthesizer 3500, video data output terminal 3600, and audio data output terminal 3700.

The bitstream producer 2000 comprises a recording media drive unit 2004 for driving the recording medium M; a reading head 2006 for reading the information recorded to the recording medium M and producing the binary read signal St57; a signal processor 2008 for variously processing the read signal St57 to generate the reproduced bitstream St61; and a reproduction controller 2002.

The reproduction controller 2002 is connected to the decoding system controller 2300 from which the multimedia bitstream reproduction control signal St53 is supplied, and in turn generates the reproduction control signals St55 and St59 respectively controlling the recording media drive unit (motor) 2004 and signal processor 2008.

So that the user-defined video, sub-picture, and audio portions of the multimedia title edited by the authoring encoder EC are reproduced, the authoring decoder DC comprises a scenario selector 2100 for selecting and reproducing the corresponding scenes (titles). The scenario selector 2100 then outputs the selected titles as scenario data to the DVD decoder DCD.

The scenario selector 2100 preferably comprises a keyboard, CPU, and monitor. Using the keyboard, the user then inputs the desired scenario based on the content of the scenario input by the DVD encoder ECD. Based on the keyboard input, the CPU generates the scenario selection data St51 specifying the selected scenario. The scenario selector 2100 is connected to the decoding system controller 2300 by an infrared communications device, for example, and inputs the generated scenario selection data St51 to the decoding system controller 2300.

The stream buffer 2400 has a specific buffer capacity used to temporarily store the reproduced bitstream St61 input from the bitstream producer 2000, extract the volume file structure VFS, the initial synchronization data SCR (system clock reference) in each pack, and the VOBU control information (DSI) in the navigation pack NV, to generate the bitstream control data St63. The stream buffer 2400 is also connected to the decoding system controller 2300, to which it supplies the generated bitstream control data St63.

Based on the scenario selection data St51 supplied by the scenario selector 2100, the decoding system controller 2300 then generates the bitstream reproduction control signal St53 controlling the operation of the bitstream producer 2000. The decoding system controller 2300 also extracts the user-defined playback instruction data from the bitstream reproduction control signal St53, and generates the decoding information table required for decoding control. This decoding information table is described further below with reference to FIGS. 26 and 32. The decoding system controller 2300 also extracts the title information recorded to the optical disk M from the file data structure area FDS of the bitstream control data St63 to generate the title information signal St200. Note that the extracted title information includes the video manager VMG, VTS information VTSI, the PGC information entries C.sub.-- PBI #j, and the cell presentation time C.sub.-- PBTM.

Note that the bitstream control data St63 is generated in pack units as shown in FIG. 19, and is supplied from the stream buffer 2400 to the decoding system controller 2300, to which the stream buffer 2400 is connected.

The synchronizer 2900 is connected to the decoding system controller 2300 from which it receives the system clock references SCR contained in the synchronization control data St81 to set the internal system clock STC and supply the reset system clock St79 to the decoding system controller 2300.

Based on this system clock St79, the decoding system controller 2300 also generates the stream read signal St65 at a specific interval and outputs the read signal St65 to the stream buffer 2400. Note that the read unit in this case is the pack.

The method of generating the stream read signal St65 is described next.

The decoding system controller 2300 compares the system clock reference SCR contained in the stream control data extracted from the stream buffer 2400 with the system clock St79 supplied from the synchronizer 2900, and generates the read request signal St65 when the system clock St79 is greater than the system clock reference SCR of the bitstream control data St63. Pack transfers are controlled by executing this control process on a pack unit.
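The comparison described above can be sketched as follows. This is a minimal illustration assuming that the pending packs are held in a queue ordered by SCR and that each pack is represented as an (SCR, payload) pair; the function name and the data representation are hypothetical.

```python
from collections import deque


def transfer_due_packs(pending: deque, stc: int) -> list:
    """Release every pack at the head of the queue whose SCR is below the STC.

    pending holds (scr, payload) pairs in stream order. A pack is released
    only when the system clock is strictly greater than its SCR, mirroring
    the condition under which the read request signal is generated.
    """
    released = []
    while pending and stc > pending[0][0]:
        released.append(pending.popleft())
    return released


# Usage: three packs wait in the stream buffer and the system clock reads 250.
queue = deque([(100, "pack a"), (200, "pack b"), (300, "pack c")])
first_batch = transfer_due_packs(queue, 250)
```

Here `first_batch` contains the packs with SCR 100 and 200, and the pack with SCR 300 remains queued until the system clock exceeds 300.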

Based on the scenario selection data St51, the decoding system controller 2300 generates the decoding signal St69 defining the stream IDs for the video, sub-picture, and audio bitstreams corresponding to the selected scenario, and outputs to the system decoder 2500.

When a title contains plural audio tracks, e.g., audio tracks in Japanese, English, French, and/or other languages, and plural sub-picture tracks for subtitles in Japanese, English, French, and/or other languages, for example, a discrete ID is assigned to each of the language tracks. As described above with reference to FIG. 19, a stream ID is assigned to the video data and MPEG audio data, and a substream ID is assigned to the sub-picture data, AC-3 audio data, linear PCM data, and navigation pack NV information. While the user need never be aware of these ID numbers, the user can select the language of the audio and/or subtitles using the scenario selector 2100. If English language audio is selected, for example, the ID corresponding to the English audio track is sent to the decoding system controller 2300 as scenario selection data St51. The decoding system controller 2300 then adds this ID to the decoding signal St69 output to the system decoder 2500.

Based on the instructions contained in the decoding signal St69, the system decoder 2500 respectively outputs the video, sub-picture, and audio bitstreams input from the stream buffer 2400 to the video buffer 2600, sub-picture buffer 2700, and audio buffer 2800 as the encoded video stream St71, encoded sub-picture stream St73, and encoded audio stream St75. Thus, when the stream ID input from the scenario selector 2100 and the pack ID input from the stream buffer 2400 match, the system decoder 2500 outputs the corresponding packs to the respective buffers (i.e., the video buffer 2600, sub-picture buffer 2700, and audio buffer 2800).
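The ID matching performed by the system decoder 2500 can be sketched as a simple routing step. The ID strings used below (for example "A_en") are placeholders invented for this illustration and are not actual stream ID or substream ID values; the function name and the (ID, payload) pack representation are likewise assumptions.

```python
def route_packs(packs: list, selected: dict) -> dict:
    """Distribute packs to per-type buffers when their ID matches a selection.

    selected maps a buffer name to the ID chosen for it; packs is a list of
    (pack_id, payload) pairs in stream order. Packs whose ID matches none of
    the selected IDs are discarded.
    """
    buffers = {name: [] for name in selected}
    name_by_id = {pack_id: name for name, pack_id in selected.items()}
    for pack_id, payload in packs:
        name = name_by_id.get(pack_id)
        if name is not None:
            buffers[name].append(payload)
    return buffers


# Usage: English audio and English subtitles are selected; Japanese audio is dropped.
selection = {"video": "V0", "sub_picture": "SP_en", "audio": "A_en"}
stream = [("V0", "v1"), ("A_ja", "a-ja"), ("A_en", "a-en"),
          ("SP_en", "s1"), ("V0", "v2")]
routed = route_packs(stream, selection)
```

Only the packs carrying a selected ID reach the video buffer, sub-picture buffer, or audio buffer; the unselected language track never enters a decoder buffer.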

The system decoder 2500 detects the presentation time stamp PTS and decoding time stamp DTS of the smallest control unit in each bitstream St67 to generate the time information signal St77. This time information signal St77 is supplied to the synchronizer 2900 through the decoding system controller 2300 as the synchronization control data St81.

Based on this synchronization control data St81, the synchronizer 2900 determines the decoding start timing whereby each of the bitstreams will be arranged in the correct sequence after decoding, and then generates and inputs the video stream decoding start signal St89 to the video decoder 3801 based on this decoding timing. The synchronizer 2900 also generates and supplies the sub-picture decoding start signal St91 and audio stream decoding start signal St93 to the sub-picture decoder 3100 and audio decoder 3200, respectively.

The video decoder 3801 generates the video output request signal St84 based on the video stream decoding start signal St89, and outputs to the video buffer 2600. In response to the video output request signal St84, the video buffer 2600 outputs the video stream St83 to the video decoder 3801. The video decoder 3801 thus detects the presentation time information contained in the video stream St83, and disables the video output request signal St84 when the length of the received video stream St83 is equivalent to the specified presentation time. A video stream equal in length to the specified presentation time is thus decoded by the video decoder 3801, which outputs the reproduced video signal St95 to the reordering buffer 3300 and selector 3400.

Because the encoded video stream is coded using the interframe correlations between pictures, the coded order and display order do not necessarily match on a frame unit basis, and the video cannot be displayed in the decoded order. The decoded frames are therefore temporarily stored to the reordering buffer 3300. The synchronizer 2900 controls the switching signal St103 so that the reproduced video signal St95 output from the video decoder 3801 and the reordering buffer output St97 are appropriately selected and output in the display order to the synthesizer 3500.
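The reordering can be illustrated with the following sketch. It assumes the conventional MPEG2 arrangement in which B pictures are displayed as soon as they are decoded while each I or P picture is held until the next I or P picture arrives; the specification does not state which pictures bypass the reordering buffer, so this policy, the function name, and the picture labels are assumptions.

```python
def display_order(decoded: list) -> list:
    """Convert pictures from decode order to display order.

    decoded is a list of (kind, label) pairs in decode order. The local
    variable held models the reordering buffer; choosing between held and
    the freshly decoded picture models the selector.
    """
    output = []
    held = None  # reference picture waiting in the reordering buffer
    for kind, label in decoded:
        if kind == "B":
            output.append(label)      # take the decoder output directly
        else:
            if held is not None:
                output.append(held)   # take the reordering buffer output
            held = label              # hold the new I or P picture
    if held is not None:
        output.append(held)           # flush the last reference picture
    return output


# Usage: numbers in the labels give the intended display position.
decode_sequence = [("I", "I0"), ("P", "P3"), ("B", "B1"), ("B", "B2"),
                   ("P", "P6"), ("B", "B4"), ("B", "B5")]
shown = display_order(decode_sequence)
```

For the sequence above, the pictures emerge in the order I0, B1, B2, P3, B4, B5, P6.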

The sub-picture decoder 3100 similarly generates the sub-picture output request signal St86 based on the sub-picture decoding start signal St91, and outputs to the sub-picture buffer 2700. In response to the sub-picture output request signal St86, the sub-picture buffer 2700 outputs the sub-picture stream St85 to the sub-picture decoder 3100. Based on the presentation time information contained in the sub-picture stream St85, the sub-picture decoder 3100 decodes a length of the sub-picture stream St85 corresponding to the specified presentation time to reproduce and supply to the synthesizer 3500 the sub-picture signal St99.

The synthesizer 3500 superimposes the selector 3400 output with the sub-picture signal St99 to generate and output the video signal St105 to the video data output terminal 3600.

The audio decoder 3200 generates and supplies to the audio buffer 2800 the audio output request signal St88 based on the audio stream decoding start signal St93. The audio buffer 2800 thus outputs the audio stream St87 to the audio decoder 3200. The audio decoder 3200 decodes a length of the audio stream St87 corresponding to the specified presentation time based on the presentation time information contained in the audio stream St87, and outputs the decoded audio stream St101 to the audio data output terminal 3700.

It is thus possible to reproduce a user-defined multimedia bitstream MBS in real-time according to a user-defined scenario. More specifically, each time the user selects a different scenario, the DVD decoder DCD is able to reproduce the title content desired by the user in the desired sequence by reproducing the multimedia bitstream MBS corresponding to the selected scenario.

It should be noted that the decoding system controller 2300 may supply the title information signal St200 to the scenario selector 2100 by means of the infrared communications device mentioned above or another means. Interactive scenario selection controlled by the user can also be made possible by the scenario selector 2100 extracting the title information recorded to the optical disk M from the file data structure area FDS of the bitstream control data St63 contained in the title information signal St200, and displaying this title information on a display for user selection.

Note, further, that the stream buffer 2400, video buffer 2600, sub-picture buffer 2700, audio buffer 2800, and reordering buffer 3300 are expressed above and in the figures as separate entities because they are functionally different. It will be obvious, however, that a single buffer memory can be controlled to provide the same discrete functionality by time-share controlled use of a buffer memory with an operating speed plural times faster than the read and write rates of these separate buffers.

Multi-scene Control

The concept of multiple angle scene control according to the present invention is described below with reference to FIG. 21. As described above, titles that can be played back with numerous variations are constructed from basic scene periods containing data common to each title, and multi-scene periods comprising groups of different scenes corresponding to the various scenario requests. In FIG. 21, scenes 1, 5, and 8 are the common scenes of the basic scene periods. The multi-angle scenes (angles 1, 2, and 3) between scenes 1 and 5, and the parental locked scenes (scenes 6 and 7) between scenes 5 and 8, are the multi-scene periods.

Scenes taken from different angles, i.e., angles 1, 2, and 3 in this example, can be dynamically selected and reproduced during playback in the multi-angle scene period. In the parental locked scene period, however, only one of the available scenes, scenes 6 and 7, having different content can be selected, and must be selected statically before playback begins.

Which of these scenes from the multi-scene periods is to be selected and reproduced is defined by the user operating the scenario selector 2100 and thereby generating the scenario selection data St51. In scenario 1 in FIG. 21 the user can freely select any of the multi-angle scenes, and scene 6 has been preselected for output in the parental locked scene period. Similarly in scenario 2, the user can freely select any of the multi-angle scenes, and scene 7 has been preselected for output in the parental locked scene period.

The contents of the program chain information VTS.sub.-- PGCI are described next with reference to FIGS. 30 and 31. FIG. 30 shows the scenarios requested by the user in terms of the VTSI data structure. Scenario 1 and scenario 2 shown in FIG. 21 are described as program chain information VTS.sub.-- PGCI#1 and VTS.sub.-- PGCI#2. VTS.sub.-- PGCI#1 describing scenario 1 consists of cell playback information C.sub.-- PBI#1 corresponding to scene 1; C.sub.-- PBI#2, C.sub.-- PBI#3, and C.sub.-- PBI#4 within a multi-angle cell block; C.sub.-- PBI#5 corresponding to scene 5; C.sub.-- PBI#6 corresponding to scene 6; and C.sub.-- PBI#7 corresponding to scene 8.

VTS.sub.-- PGCI#2 describing scenario 2 consists of cell playback information C.sub.-- PBI#1 corresponding to scene 1; C.sub.-- PBI#2, C.sub.-- PBI#3, and C.sub.-- PBI#4 within a multi-angle cell block corresponding to a multi-angle scene; C.sub.-- PBI#5 corresponding to scene 5; C.sub.-- PBI#6 corresponding to scene 7; and C.sub.-- PBI#7 corresponding to scene 8. According to the digital video system data structure, a scene, which is the control unit of a scenario, is described as a cell, which is the unit below it, so that a scenario requested by a user can be realized.
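The two program chains can be modeled as ordered cell lists, as in the sketch below. The list-and-tuple representation, the function names, and the scene label strings are hypothetical; only the order of the cells and the difference between the two scenarios are taken from the description.

```python
# The multi-angle cell block holds one cell per angle (C_PBI#2 to C_PBI#4).
ANGLE_BLOCK = ("angle 1", "angle 2", "angle 3")


def build_pgc(parental_scene: str) -> list:
    """Return the cell sequence of a program chain as scene labels."""
    return ["scene 1", ANGLE_BLOCK, "scene 5", parental_scene, "scene 8"]


def cell_count(pgc: list) -> int:
    """Count cell playback entries, with one entry per angle in the block."""
    return sum(len(cell) if isinstance(cell, tuple) else 1 for cell in pgc)


def playback_path(pgc: list, angle: int) -> list:
    """Resolve the scenes actually reproduced for a chosen angle (1 to 3)."""
    return [cell[angle - 1] if isinstance(cell, tuple) else cell
            for cell in pgc]


pgc_1 = build_pgc("scene 6")  # scenario 1
pgc_2 = build_pgc("scene 7")  # scenario 2
```

Each chain contains seven cell playback entries, and the two chains differ only in the sixth entry; `playback_path(pgc_2, 3)` yields scene 1, angle 3, scene 5, scene 7, and scene 8.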

FIG. 31 shows the scenarios requested by the user shown in FIG. 21 in terms of the VOB data structure VTSTT.sub.-- VOBS. As specifically shown in FIG. 31, the two scenarios 1 and 2 use the same VOB data in common. The single scenes shared by both scenarios, i.e., VOB#1 corresponding to scene 1, VOB#5 corresponding to scene 5, and VOB#8 corresponding to scene 8, are arranged in non-interleaved blocks, i.e., contiguous blocks.

With respect to the multi-angle data shared by scenarios 1 and 2, each angle scene is constructed from a single VOB. Specifically, angle 1 is constructed from VOB#2, angle 2 from VOB#3, and angle 3 from VOB#4. The multi-angle data thus constructed is formed as an interleaved block to enable switching between angles and seamless reproduction of each angle's data. Scenes 6 and 7, which are specific to scenarios 1 and 2, respectively, are formed as an interleaved block to enable seamless reproduction both with the common scenes before and after them and between each scene.

As described above, the scenario requested by the user shown in FIG. 21 can be realized by utilizing the video title playback control information shown in FIG. 30 and the title playback VOB data structure shown in FIG. 31.

Seamless Playback

The seamless playback capability briefly mentioned above with regard to the digital video disk system data structure is describ