Apparatus and methods for including codes in audio signals and decoding
5764763
Abstract
Apparatus and methods for including a code having at least one code frequency component in an audio signal are provided. The abilities of various frequency components in the audio signal to mask the code frequency component to human hearing are evaluated and based on these evaluations an amplitude is assigned to the code frequency component. Methods and apparatus for detecting a code in an encoded audio signal are also provided. A code frequency component in the encoded audio signal is detected based on an expected code amplitude or on a noise amplitude within a range of audio frequencies including the frequency of the code component.
Claims
What is claimed is:
1. An apparatus for including a code with an audio signal having a plurality of audio signal frequency components, the code comprising a plurality of code frequency component sets, each of the code frequency component sets representing a respectively different code symbol and including a plurality of code frequency components, comprising:
means for producing the code frequency component sets, the code frequency components of the code frequency component sets forming component clusters spaced from one another within the frequency domain, each of the component clusters having a respective predetermined frequency range and consisting of one frequency component from each of the code frequency component sets falling within its respective predetermined frequency range, component clusters which are adjacent within the frequency domain being separated by respective frequency amounts, and wherein the predetermined frequency range of each respective component cluster is smaller than the frequency amounts separating the respective component cluster from its adjacent component clusters;
first masking evaluation means for evaluating a masking ability of a first set of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing to produce a first masking evaluation;
second masking evaluation means for evaluating a masking ability of a second set of the plurality of audio signal frequency components different from the first set thereof to mask the at least one code frequency component to human hearing to produce a second masking evaluation;
amplitude assigning means for assigning an amplitude to the at least one code frequency component based on a selected one of the first and second masking evaluations; and
code inclusion means for including the code frequency component sets with the audio signal.
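The cluster arrangement recited in claim 1 can be illustrated with a minimal sketch. The function names, the base frequencies and the 50 Hz step are hypothetical values chosen for illustration and are not taken from the claims. The sketch assumes each symbol contributes exactly one component to every cluster, and checks that each cluster is narrower than the gap to its neighbouring clusters.

```python
def build_code_sets(cluster_bases, num_symbols, step_hz):
    """Return one list of code frequencies per symbol (hypothetical layout).

    Symbol s takes the frequency base + s * step_hz from every cluster, so
    each cluster holds exactly one component from each symbol's set.
    """
    return [[base + s * step_hz for base in cluster_bases]
            for s in range(num_symbols)]


def clusters_are_well_separated(code_sets):
    """Check that every cluster is narrower than the gaps to its neighbours."""
    # Regroup by cluster: the k-th component of every symbol's set.
    clusters = [sorted(freqs) for freqs in zip(*code_sets)]
    clusters.sort(key=lambda c: c[0])
    for i, cluster in enumerate(clusters):
        width = cluster[-1] - cluster[0]
        # Gap to the cluster below, if any.
        if i > 0 and width >= cluster[0] - clusters[i - 1][-1]:
            return False
        # Gap to the cluster above, if any.
        if i + 1 < len(clusters) and width >= clusters[i + 1][0] - cluster[-1]:
            return False
    return True


# Assumed example: 4 symbols, 3 clusters, 50 Hz spacing inside each cluster.
code_sets = build_code_sets([1000.0, 1500.0, 2000.0], num_symbols=4, step_hz=50.0)
```

With these assumed values each cluster spans 150 Hz and adjacent clusters are 350 Hz apart, so the separation condition holds.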
2. An apparatus for including a code having at least one code frequency component with an audio signal having a plurality of audio signal frequency components, comprising:
first masking evaluation means for evaluating a masking ability of a first set of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing to produce a first masking evaluation, the first masking evaluation means being operative to detect signal power of audio signal frequency components of the first set within a specified frequency range, to determine first and second masking factors on the conditions that the signal power is at each of first and second frequencies, respectively, within the specified frequency range, the second frequency being different than the first frequency, to select that one of the first and second masking factors which represents a smaller amplitude of at least one code frequency component, and to determine the masking ability of the first set of the plurality of audio signal frequency components based on the selected masking factor;
second masking evaluation means for evaluating a masking ability of a second set of the plurality of audio signal frequency components different from the first set thereof to mask the at least one code frequency component to human hearing to produce a second masking evaluation;
amplitude assigning means for assigning an amplitude to the at least one code frequency component based on a selected one of the first and second masking evaluations; and
code inclusion means for including the at least one code frequency component with the audio signal.
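The selection step of claim 2 can be sketched as a worst-case evaluation: the detected power is assumed to sit first at one frequency and then at another within the specified range, and the masking factor yielding the smaller code amplitude is kept. The masking curve below (a fixed slope in dB per octave) is a placeholder assumption for illustration, not the masking relationship used in the patent; the function and parameter names are likewise hypothetical.

```python
import math


def masking_factor(masker_hz, code_hz, slope_db_per_octave=24.0):
    """Hypothetical masking curve: maskable code power per unit masker power.

    The factor is 1.0 when masker and code coincide and falls off by
    slope_db_per_octave with their distance in octaves (an assumption).
    """
    octaves = abs(math.log2(code_hz / masker_hz))
    return 10.0 ** (-(slope_db_per_octave * octaves) / 10.0)


def conservative_code_power(band_power, first_hz, second_hz, code_hz):
    """Assume the power lies at each of two frequencies and keep the worse case."""
    first = masking_factor(first_hz, code_hz)
    second = masking_factor(second_hz, code_hz)
    # The smaller factor corresponds to the smaller permitted code amplitude.
    return band_power * min(first, second)
```

Choosing the smaller factor keeps the code component masked wherever the audio power actually lies between the two assumed frequencies, at the cost of a lower code amplitude.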
3. A method for including a code with an audio signal having a plurality of audio signal frequency components, the code comprising a plurality of code frequency component sets, each of the code frequency component sets representing a respectively different code symbol and including a plurality of code frequency components, comprising the steps of:
producing the code frequency component sets, the code frequency components of the code frequency component sets forming component clusters spaced from one another within the frequency domain, each of the component clusters having a respective predetermined frequency range and consisting of one frequency component from each of the code frequency component sets falling within its respective predetermined frequency range, component clusters which are adjacent within the frequency domain being separated by respective frequency amounts, and wherein the predetermined frequency range of each respective component cluster is smaller than the frequency amounts separating the respective component cluster from its adjacent component clusters;
evaluating a masking ability of a first set of the plurality of audio signal frequency components to mask at least one code frequency component to human hearing to produce a first masking evaluation;
evaluating a masking ability of a second set of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing to produce a second masking evaluation;
assigning an amplitude to the at least one code frequency component based on a selected one of the first and second masking evaluations; and
including the code frequency component sets with the audio signal.
4. A method for including a code having at least one code frequency component with an audio signal having a plurality of audio signal frequency components, comprising the steps of:
evaluating a masking ability of a first set of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing to produce a first masking evaluation, by detecting signal power of audio signal frequency components of the first set within a specified frequency range, determining first and second masking factors on the conditions that the signal power is at each of first and second frequencies, respectively, within the specified frequency range, the second frequency being different than the first frequency, selecting that one of the first and second masking factors which represents a smaller amplitude of the at least one code frequency component, and determining the masking ability of the first set of the plurality of audio signal frequency components based on the selected masking factor;
evaluating a masking ability of a second set of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing to produce a second masking evaluation;
assigning an amplitude to the at least one code frequency component based on a selected one of the first and second masking evaluations; and
including the at least one code frequency component with the audio signal.
5. An apparatus for including a code with an audio signal having a plurality of audio signal frequency components, the code comprising a plurality of code frequency component sets, each of the code frequency component sets representing a respectively different code symbol and including a plurality of code frequency components, comprising:
a digital processor having an input for receiving the audio signal, the digital processor being programmed to produce the code frequency components such that said components form component clusters spaced from one another within the frequency domain, each of the component clusters having a respective predetermined frequency range and consisting of one frequency component from each of the code frequency component sets falling within its respective predetermined frequency range, component clusters which are adjacent within the frequency domain being separated by respective frequency amounts, and wherein the predetermined frequency range of each respective component cluster is smaller than the frequency amounts separating the respective component cluster from its adjacent component clusters, the digital processor being further programmed to evaluate respective masking abilities of first and second sets of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing to produce respective first and second masking evaluations, the second set of the plurality of audio signal frequency components differing from the first set thereof, the digital processor being further programmed to assign an amplitude to the at least one code frequency component based on a selected one of the first and second masking evaluations; and
means for including the code frequency component sets with the audio signal.
6. An apparatus for including a code having at least one code frequency component with an audio signal having a plurality of audio signal frequency components, comprising:
a digital processor having an input for receiving the audio signal, the digital processor being programmed to evaluate respective masking abilities of first and second sets of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing to produce respective first and second masking evaluations, the second set of the plurality of audio signal frequency components differing from the first set thereof, the digital processor being operative to evaluate the masking ability of the first set by detecting signal power of audio signal frequency components of the first set within a specified frequency range, detecting first and second masking factors on the conditions that the signal power is at each of first and second frequencies, respectively, within the specified frequency range, the second frequency being different than the first frequency, selecting that one of the first and second masking factors which represents a smaller amplitude of the at least one code frequency component, and determining the masking ability of the first set of the plurality of audio signal frequency components based on the selected masking factor, the digital processor being further programmed to assign an amplitude to the at least one code frequency component based on a selected one of the first and second masking evaluations; and
means for including the at least one code frequency component with the audio signal.
7. An apparatus for including a code having a plurality of code frequency components with an audio signal having a plurality of audio signal frequency components, the plurality of code frequency components comprising a plurality of code frequency component sets, each of the code frequency component sets representing a respectively different code symbol and including a plurality of respectively different code frequency components, the plurality of code frequency components including a first code frequency component having a first frequency and a second code frequency component having a second frequency different from the first frequency, comprising:
means for producing the code frequency components, the code frequency components of the code frequency component sets forming component clusters spaced from one another within the frequency domain, each of the component clusters having a respective predetermined frequency range and consisting of one frequency component from each of the code frequency component sets falling within its respective predetermined frequency range, component clusters which are adjacent within the frequency domain being separated by respective frequency amounts, and wherein the predetermined frequency range of each respective component cluster is smaller than the frequency amounts separating the respective component cluster from its adjacent component clusters;
first masking evaluation means for evaluating a masking ability of at least one of the plurality of audio signal frequency components to mask a code frequency component having the first frequency to human hearing to produce a first respective masking evaluation;
second masking evaluation means for evaluating a masking ability of at least one of the plurality of audio signal frequency components to mask a code frequency component having the second frequency to human hearing to produce a second respective masking evaluation;
amplitude assigning means for assigning a respective amplitude to the first code frequency component based on the first respective masking evaluation and for assigning a respective amplitude to the second code frequency component based on the second respective masking evaluation; and
code inclusion means for including the plurality of code frequency components with the audio signal.
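The final two elements of claim 7, assigning a respective amplitude to each code frequency component and including the components with the audio signal, can be sketched as additive synthesis. This is only one possible reading: the claim does not specify how inclusion is performed, and the function name and the (frequency, amplitude) pair format are assumptions.

```python
import math


def include_code(audio, sample_rate, components):
    """Return a copy of the audio with one sinusoid added per code component.

    components is a list of (frequency_hz, amplitude) pairs, where each
    amplitude is assumed to come from that component's own masking evaluation.
    """
    out = list(audio)  # leave the caller's samples untouched
    for n in range(len(out)):
        t = n / sample_rate
        out[n] += sum(a * math.sin(2.0 * math.pi * f * t) for f, a in components)
    return out
```

Because each component carries its own amplitude, a component at a well-masked frequency can be added at a higher level than one at a poorly masked frequency.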
8. An apparatus for including a code having a plurality of code frequency components with an audio signal having a plurality of audio signal frequency components, the plurality of code frequency components including a first code frequency component having a first frequency and a second code frequency component having a second frequency different from the first frequency, comprising:
first masking evaluation means for evaluating a masking ability of at least one of the plurality of audio signal frequency components to mask a code frequency component having the first frequency to human hearing to produce a first respective masking evaluation, the first masking evaluation means being operative to detect signal power of the at least one of the plurality of audio signal frequency components within a specified frequency range, to determine first and second masking factors on the conditions that the signal power is at each of first and second frequencies, respectively, within the specified frequency range, the second frequency being different than the first frequency, to select that one of the first and second masking factors which represents a smaller amplitude of the at least one code frequency component, and to determine the masking ability of the at least one of the plurality of audio signal frequency components based on the selected masking factor;
second masking evaluation means for evaluating a masking ability of at least one of the plurality of audio signal frequency components to mask a code frequency component having the second frequency to human hearing to produce a second respective masking evaluation;
amplitude assigning means for assigning a respective amplitude to the first code frequency component based on the first respective masking evaluation and for assigning a respective amplitude to the second code frequency component based on the second respective masking evaluation; and
code inclusion means for including the plurality of code frequency components with the audio signal.
9. A method for including a code having a plurality of code frequency components with an audio signal having a plurality of audio signal frequency components, the plurality of code frequency components comprising a plurality of code frequency component sets, each of the code frequency component sets representing a respectively different code symbol and including a plurality of respectively different code frequency components, the plurality of code frequency components including a first code frequency component having a first frequency and a second code frequency component having a second frequency different from the first frequency, comprising the steps of:
producing the code frequency components, the code frequency components of the code frequency component sets forming component clusters spaced from one another within the frequency domain, each of the component clusters having a respective predetermined frequency range and consisting of one frequency component from each of the code frequency component sets falling within its respective predetermined frequency range, component clusters which are adjacent within the frequency domain being separated by respective frequency amounts, and wherein the predetermined frequency range of each respective component cluster is smaller than the frequency amounts separating the respective component cluster from its adjacent component clusters;
evaluating a masking ability of at least one of the plurality of audio signal frequency components to mask a code frequency component having the first frequency to human hearing to produce a first respective masking evaluation;
evaluating a masking ability of at least one of the plurality of audio signal frequency components to mask a code frequency component having the second frequency to human hearing to produce a second respective masking evaluation;
assigning a respective amplitude to the first code frequency component based on the first respective masking evaluation and a respective amplitude to the second code frequency component based on the second respective masking evaluation; and
including the plurality of code frequency components with the audio signal.
10. A method for including a code having a plurality of code frequency components with an audio signal having a plurality of audio signal frequency components, the plurality of code frequency components including a first code frequency component having a first frequency and a second code frequency component having a second frequency different from the first frequency, comprising the steps of:
evaluating a masking ability of at least one of the plurality of audio signal frequency components to mask a code frequency component having the first frequency to human hearing to produce a first respective masking evaluation, by detecting signal power of audio signal frequency components within a specified frequency range, determining first and second masking factors on the conditions that the signal power is at each of first and second frequencies, respectively, within the specified frequency range, the second frequency being different than the first frequency, selecting that one of the first and second masking factors which represents a smaller amplitude of the at least one code frequency component, and determining the masking ability of the at least one of the plurality of audio signal frequency components to mask a code frequency component having the first frequency based on the selected masking factor;
evaluating a masking ability of at least one of the plurality of audio signal frequency components to mask a code frequency component having the second frequency to human hearing to produce a second respective masking evaluation;
assigning a respective amplitude to the first code frequency component based on the first respective masking evaluation and a respective amplitude to the second code frequency component based on the second respective masking evaluation; and
including the plurality of the code frequency components with the audio signal.
11. An apparatus for including a code having a plurality of code frequency components with an audio signal having a plurality of audio signal frequency components, the plurality of code frequency components including a first code frequency component having a first frequency and a second code frequency component having a second frequency different from the first frequency, comprising:
a digital processor having an input for receiving the audio signal, the digital processor being programmed to evaluate a masking ability of at least one of the plurality of audio signal frequency components to mask a code frequency component having the first frequency to human hearing to produce a first respective masking evaluation and to evaluate a masking ability of at least one of the plurality of audio signal frequency components to mask a code frequency component having the second frequency to human hearing to produce a second respective masking evaluation;
the digital processor being further programmed to produce the code as a plurality of code frequency component sets, each of the code frequency component sets representing a respectively different code symbol and including a plurality of respectively different code frequency components, the code frequency components of the code frequency component sets forming component clusters spaced from one another within the frequency domain, each of the component clusters having a respective predetermined frequency range and consisting of one frequency component from each of the code frequency component sets falling within its respective predetermined frequency range, component clusters which are adjacent within the frequency domain being separated by respective frequency amounts, and wherein the predetermined frequency range of each respective component cluster is smaller than the frequency amounts separating the respective component cluster from its adjacent component clusters;
the digital processor being further programmed to assign a corresponding amplitude to the first code frequency component based on the first respective masking evaluation and to assign a corresponding amplitude to the second code frequency component based on the second respective masking evaluation; and
means for including the plurality of code frequency components with the audio signal.
12. An apparatus for including a code having a plurality of code frequency components with an audio signal having a plurality of audio signal frequency components, the plurality of code frequency components including a first code frequency component having a first frequency and a second code frequency component having a second frequency different from the first frequency, comprising:
a digital processor having an input for receiving the audio signal, the digital processor being programmed to produce a first respective masking evaluation by evaluating a masking ability of at least one of the plurality of audio signal frequency components to mask a code frequency component having the first frequency to human hearing, wherein the digital processor is programmed to evaluate the masking ability of the at least one of the plurality of audio signal frequency components by detecting signal power of audio signal frequency components within a specified frequency range, determining first and second masking factors with respect to the code frequency component having the first frequency on the conditions that the signal power is at each of first and second frequencies, respectively, within the specified frequency range, the second frequency being different than the first frequency, selecting that one of the first and second masking factors which represents a smaller amplitude of the at least one code frequency component, and determining the masking ability of the at least one of the plurality of audio signal frequency components based on the selected masking factor;
the digital processor being further programmed to evaluate a masking ability of at least one of the plurality of audio signal frequency components to mask a code frequency component having the second frequency to human hearing to produce a second respective masking evaluation;
the digital processor being further programmed to assign a corresponding amplitude to the first code frequency component based on the first respective masking evaluation and to assign a corresponding amplitude to the second code frequency component based on the second respective masking evaluation; and
means for including the plurality of code frequency components with the audio signal.
13. An apparatus for including a code with an audio signal having a plurality of audio signal frequency components, wherein the code comprises a plurality of code frequency component sets, each of the code frequency component sets representing a respectively different code symbol and including a plurality of respectively different code frequency components comprising:
means for producing the code frequency component sets, the code frequency components of the code frequency component sets forming component clusters spaced from one another within the frequency domain, each of the component clusters having a respective predetermined frequency range and consisting of one frequency component from each of the code frequency component sets falling within its respective predetermined frequency range, component clusters which are adjacent within the frequency domain being separated by respective frequency amounts, and wherein the predetermined frequency range of each respective component cluster is smaller than the frequency amounts separating the respective component cluster from its adjacent component clusters;
tonal signal producing means for producing a first tonal signal representing a first substantially single one of the plurality of audio signal frequency components;
masking evaluation means for evaluating a masking ability of the first substantially single one of the plurality of audio signal frequency components to mask at least one code frequency component to human hearing based on the first tonal signal to produce a first masking evaluation;
amplitude assigning means for assigning an amplitude to the at least one code frequency component based on the first masking evaluation; and
code inclusion means for including the code frequency component sets with the audio signal.
14. An apparatus for including a code having at least one code frequency component with an audio signal having a plurality of audio signal frequency components, comprising:
tonal signal producing means for producing a first tonal signal representing signal power of a first substantially single one of the plurality of the audio signal frequency components within a specified frequency range;
masking evaluation means for evaluating a masking ability of the first substantially single one of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing based on the first tonal signal to produce a first masking evaluation, the masking evaluation means being operative to determine first and second masking factors on the conditions that the signal power represented by the first tonal signal is at each of first and second frequencies, respectively, within the specified frequency range, the second frequency being different than the first frequency, to select that one of the first and second masking factors which represents a smaller amplitude of the at least one code frequency component, and to determine the masking ability of the first substantially single one of the plurality of the audio signal frequency components based on the selected masking factor;
amplitude assigning means for assigning an amplitude to the at least one code frequency component based on the first masking evaluation; and
code inclusion means for including the at least one code frequency component with the audio signal.
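A tonal signal of the kind recited in claim 14, representing the signal power of a substantially single audio frequency component within a specified range, might be produced as sketched below. The sketch assumes a direct discrete Fourier transform over one block of samples and takes the strongest bin in the range as the tone; the claims do not prescribe this method, and the function names are hypothetical.

```python
import cmath
import math


def bin_power(samples, k):
    """Normalised power of DFT bin k, computed directly from the samples."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
              for i, x in enumerate(samples))
    return abs(acc) ** 2 / n ** 2


def strongest_tone(samples, sample_rate, lo_hz, hi_hz):
    """Return (frequency_hz, power) of the strongest bin in [lo_hz, hi_hz].

    Returns None when no bin below the Nyquist frequency lies in the range.
    """
    n = len(samples)
    best = None
    for k in range(1, n // 2):
        freq = k * sample_rate / n
        if lo_hz <= freq <= hi_hz:
            power = bin_power(samples, k)
            if best is None or power > best[1]:
                best = (freq, power)
    return best
```

The returned power can then be evaluated at two different frequencies within the range, as the masking evaluation means of the claim requires, since the bin resolution leaves the exact tone frequency uncertain.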
15. An apparatus for including a code having at least one code frequency component with an audio signal having a plurality of audio signal frequency components, comprising:
tonal signal producing means for producing a first tonal signal representing a first substantially single one of the plurality of audio signal frequency components;
masking evaluation means for evaluating a masking ability of the first substantially single one of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing based on the first tonal signal to produce a first masking evaluation;
amplitude assigning means for assigning an amplitude to the at least one code frequency component based on the first masking evaluation; and
code inclusion means for including the at least one code frequency component with the audio signal, wherein said masking evaluation means is operative to produce said first masking evaluation only when said at least one code frequency component is within a critical band of said first substantially single one of the plurality of audio signal frequency components.
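The critical-band condition of claim 15 can be sketched as a gate on the masking evaluation. The bandwidth formula below is the Zwicker and Terhardt approximation, which is brought in here as an assumption; the patent text in this section does not state how the critical band is computed. The masking ratio of 0.01 and the function names are also illustrative placeholders.

```python
def critical_bandwidth_hz(center_hz):
    """Approximate critical bandwidth (Zwicker and Terhardt) around center_hz."""
    khz = center_hz / 1000.0
    return 25.0 + 75.0 * (1.0 + 1.4 * khz * khz) ** 0.69


def within_critical_band(tone_hz, code_hz):
    """True when the code component lies inside the band centred on the tone."""
    return abs(code_hz - tone_hz) <= critical_bandwidth_hz(tone_hz) / 2.0


def masking_evaluation(tone_hz, tone_power, code_hz, ratio=0.01):
    """Produce an evaluation only for code components in the tone's critical band.

    ratio is a placeholder masking ratio, not a value from the patent.
    """
    if not within_critical_band(tone_hz, code_hz):
        return None
    return tone_power * ratio
```

Under this approximation the critical band at 1 kHz is roughly 160 Hz wide, so a tone at 1 kHz would yield an evaluation for a code component at 1050 Hz but not for one at 1500 Hz.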
16. An apparatus for including a code with an audio signal having a plurality of audio signal frequency components, wherein said code includes a plurality of code frequency components, comprising:
tonal signal producing means for producing a first tonal signal representing a first substantially single one of the plurality of audio signal frequency components;
masking evaluation means for evaluating a masking ability of the first substantially single one of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing based on the first tonal signal to produce a first masking evaluation;
amplitude assigning means for assigning an amplitude to the at least one code frequency component based on the first masking evaluation and based on a number of the code frequency components within a critical band of the at least one code frequency component; and
code inclusion means for including the at least one code frequency component with the audio signal.
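Claim 16 bases the assigned amplitude partly on the number of code frequency components within a critical band of the component in question. One possible interpretation, offered only as a sketch, is to share the maskable power equally among those components. The equal split, the fixed bandwidth argument and the function names are all assumptions.

```python
import math


def count_in_band(code_freqs_hz, target_hz, bandwidth_hz):
    """Count code components within bandwidth_hz centred on target_hz."""
    half = bandwidth_hz / 2.0
    return sum(1 for f in code_freqs_hz if abs(f - target_hz) <= half)


def shared_amplitude(allowed_power, code_freqs_hz, target_hz, bandwidth_hz):
    """Split the maskable power equally among the code components in the band."""
    # Guard against an empty band so the division is always defined.
    n = max(count_in_band(code_freqs_hz, target_hz, bandwidth_hz), 1)
    return math.sqrt(allowed_power / n)
```

The more code components crowd into one critical band, the lower each amplitude becomes, so that their combined power stays within what the audio signal is assumed to mask.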
17. An apparatus for including a code having at least one code frequency component with an audio signal having a plurality of audio signal frequency components, comprising:
tonal signal producing means for producing a first tonal signal representing a first substantially single one of the plurality of audio signal frequency components and a second tonal signal representing a second substantially single one of the plurality of audio signal frequency components;
masking evaluation means for evaluating a masking ability of the first substantially single one of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing based on the first tonal signal to produce a first masking evaluation; said masking evaluation means being operative to evaluate an ability of said second substantially single one of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing based on the second tonal signal to produce a second masking evaluation;
amplitude assigning means for assigning an amplitude to the at least one code frequency component based on the first and second masking evaluations; and
code inclusion means for including the at least one code frequency component with the audio signal.
18. The apparatus of claim 17, wherein said amplitude assigning means is operative to assign the amplitude to the at least one code frequency component based on a distribution of power between said first and second tonal signals.
19. A method for including a code with an audio signal having a plurality of audio signal frequency components, wherein the code comprises a plurality of code frequency component sets, each of the code frequency component sets representing a respectively different code symbol and including a plurality of respectively different code frequency components, comprising the steps of:
producing the code frequency component sets, the code frequency components of the code frequency component sets forming component clusters spaced from one another within the frequency domain, each of the component clusters having a respective predetermined frequency range and consisting of one frequency component from each of the code frequency component sets falling within its respective predetermined frequency range, component clusters which are adjacent within the frequency domain being separated by respective frequency amounts, and wherein the predetermined frequency range of each respective component cluster is smaller than the frequency amounts separating the respective component cluster from its adjacent component clusters;
producing a first tonal signal representing a first substantially single one of the plurality of audio signal frequency components;
evaluating a masking ability of the first substantially single one of the plurality of audio signal frequency components to mask at least one code frequency component to human hearing based on the first tonal signal to produce a first masking evaluation;
assigning an amplitude to the at least one code frequency component based on the first masking evaluation; and
including the at least one code frequency component with the audio signal.
20. A method for including a code having at least one code frequency component with an audio signal having a plurality of audio signal frequency components, comprising the steps of:
producing a first tonal signal representing signal power of a first substantially single one of the plurality of audio signal frequency components within a specified frequency range;
evaluating a masking ability of the first substantially single one of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing based on the first tonal signal to produce a first masking evaluation, by determining first and second masking factors on the conditions that the signal power represented by the first tonal signal is at each of first and second frequencies, respectively, within the specified frequency range, the second frequency being different than the first frequency, selecting that one of the first and second masking factors which represents a smaller amplitude of the at least one code frequency component, and determining the masking ability of the first substantially single one of the plurality of audio signal frequency components based on the selected masking factor;
assigning an amplitude to the at least one code frequency component based on the first masking evaluation; and
including the at least one code frequency component with the audio signal.
21. A method for including a code having at least one code frequency component with an audio signal having a plurality of audio signal frequency components, comprising the steps of:
producing a first tonal signal representing a first substantially single one of the plurality of audio signal frequency components;
evaluating a masking ability of the first substantially single one of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing based on the first tonal signal to produce a first masking evaluation;
assigning an amplitude to the at least one code frequency component based on the first masking evaluation; and
including the at least one code frequency component with the audio signal, wherein the step of evaluating a masking ability occurs only when said at least one code frequency component is within a critical band of said first substantially single one of the plurality of audio signal frequency components.
22. A method for including a code with an audio signal having a plurality of audio signal frequency components, wherein said code includes a plurality of code frequency components, comprising the steps of:
producing a first tonal signal representing a first substantially single one of the plurality of audio signal frequency components;
evaluating a masking ability of the first substantially single one of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing based on the first tonal signal to produce a first masking evaluation;
assigning an amplitude to the at least one code frequency component based on the first masking evaluation and based on a number of the code frequency components within a critical band of the at least one code frequency component; and
including the at least one code frequency component with the audio signal.
23. A method for including a code having at least one code frequency component with an audio signal having a plurality of audio signal frequency components, comprising the steps of:
producing a first tonal signal representing a first substantially single one of the plurality of audio signal frequency components and a second tonal signal representing a second substantially single one of the plurality of audio signal frequency components;
evaluating a masking ability of the first substantially single one of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing based on the first tonal signal to produce a first masking evaluation;
evaluating a masking ability of said second substantially single one of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing based on the second tonal signal to produce a second masking evaluation;
assigning an amplitude to the at least one code frequency component based on the first and second masking evaluations; and
including the at least one code frequency component with the audio signal.
24. The method of claim 23, wherein the step of assigning assigns the amplitude to the at least one code frequency component based on a distribution of power between said first and second tonal signals.
25. An apparatus for including a code with an audio signal having a plurality of audio signal frequency components, comprising:
a digital processor having an input for receiving the audio signal, the digital processor being programmed to produce the code as a plurality of code frequency component sets, each of the code frequency component sets representing a respectively different code symbol and including a plurality of respectively different code frequency components, the code frequency components of the code frequency component sets forming component clusters spaced from one another within the frequency domain, each of the component clusters having a respective predetermined frequency range and consisting of one frequency component from each of the code frequency component sets falling within its respective predetermined frequency range, component clusters which are adjacent within the frequency domain being separated by respective frequency amounts, and wherein the predetermined frequency range of each respective component cluster is smaller than the frequency amounts separating the respective component cluster from its adjacent component clusters;
the digital processor being further programmed to produce a first tonal signal representing a first substantially single one of the plurality of audio signal frequency components and to evaluate a masking ability of the first substantially single one of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing based on the first tonal signal to produce a first masking evaluation;
the digital processor being further programmed to assign an amplitude to the at least one code frequency component based on the first masking evaluation;
the apparatus further comprising code inclusion means for including the at least one code frequency component with the audio signal.
26. An apparatus for including a code having at least one code frequency component with an audio signal having a plurality of audio signal frequency components, comprising:
a digital processor having an input for receiving the audio signal, the digital processor being programmed to produce a first tonal signal representing signal power of a first substantially single one of the plurality of the audio signal frequency components within a specified frequency range, the digital processor being further programmed to evaluate a masking ability of the first substantially single one of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing based on the first tonal signal to produce a first masking evaluation, by determining first and second masking factors on the conditions that the signal power represented by the first tonal signal is at each of first and second frequencies, respectively, within the specified frequency range, the second frequency being different than the first frequency, selecting that one of the first and second masking factors which represents a smaller amplitude of at least one code frequency component, and determining the masking ability of the first substantially single one of the plurality of audio signal frequency components based on the selected masking factor;
the digital processor being further programmed to assign an amplitude to the at least one code frequency component based on the first masking evaluation;
the apparatus further comprising code inclusion means for including the at least one code frequency component with the audio signal.
27. An apparatus for including a code having at least one code frequency component with an audio signal having a plurality of audio signal frequency components, comprising:
a digital processor having an input for receiving the audio signal, the digital processor being programmed to produce a first tonal signal representing a first substantially single one of the plurality of audio signal frequency components and to evaluate a masking ability of the first substantially single one of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing based on the first tonal signal to produce a first masking evaluation wherein the digital processor is programmed to produce said first masking evaluation only when said at least one code frequency component is within a critical band of said first substantially single one of the plurality of audio signal frequency components, the digital processor being further programmed to assign an amplitude to the at least one code frequency component based on the first masking evaluation; and
code inclusion means for including the at least one code frequency component with the audio signal.
28. An apparatus for including a code with an audio signal having a plurality of audio signal frequency components, wherein said code includes a plurality of code frequency components, comprising:
a digital processor having an input for receiving the audio signal, the digital processor being programmed to produce a first tonal signal representing a first substantially single one of the plurality of audio signal frequency components and to evaluate a masking ability of the first substantially single one of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing based on the first tonal signal to produce a first masking evaluation, said digital processor being further programmed to assign an amplitude to the at least one code frequency component based on the first masking evaluation and based on a number of the code frequency components within a critical band of the at least one code frequency component; and
code inclusion means for including the at least one code frequency component with the audio signal.
29. An apparatus for including a code having at least one code frequency component with an audio signal having a plurality of audio signal frequency components, comprising:
a digital processor having an input for receiving the audio signal, the digital processor being programmed to produce a first tonal signal representing a first substantially single one of the plurality of audio signal frequency components and to produce a second tonal signal representing a second substantially single one of the plurality of audio signal frequency components; the digital processor being further programmed to evaluate a masking ability of the first substantially single one of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing based on the first tonal signal to produce a first masking evaluation and to evaluate an ability of said second substantially single one of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing based on the second tonal signal to produce a second masking evaluation; the digital processor being further programmed to assign an amplitude to the at least one code frequency component based on the first and second masking evaluations; and
code inclusion means for including the at least one code frequency component with the audio signal.
30. The apparatus of claim 29, wherein said digital processor is programmed to assign the amplitude to the at least one code frequency component based on a distribution of power between said first and second tonal signals.
31. An apparatus for encoding an audio signal, comprising:
means for generating a code comprising a plurality of code frequency component sets, each of the code frequency component sets representing a respectively different code symbol and including a plurality of respectively different code frequency components, the code frequency components of the code frequency component sets forming component clusters spaced from one another within the frequency domain, each of the component clusters having a respective predetermined frequency range and consisting of one frequency component from each of the code frequency component sets falling within its respective predetermined frequency range, component clusters which are adjacent within the frequency domain being separated by respective frequency amounts, the predetermined frequency range of each respective component cluster being smaller than the frequency amounts separating the respective component cluster from its adjacent component clusters; and
code inclusion means for combining the code with the audio signal.
32. A method for encoding an audio signal, comprising:
generating a code comprising a plurality of code frequency component sets, each of the code frequency component sets representing a respectively different code symbol and including a plurality of respectively different code frequency components, the code frequency components of the code frequency component sets forming component clusters spaced from one another within the frequency domain, each of the component clusters having a respective predetermined frequency range and consisting of one frequency component from each of the code frequency component sets falling within its respective predetermined frequency range, component clusters which are adjacent within the frequency domain being separated by respective frequency amounts, the predetermined frequency range of each respective component cluster being smaller than the frequency amounts separating the respective component cluster from its adjacent component clusters; and
combining the code with the audio signal.
33. An apparatus for encoding an audio signal, comprising:
a digital processor having an input for receiving the audio signal, the digital processor being programmed to produce a code comprising a plurality of code frequency component sets, each of the code frequency component sets representing a respectively different code symbol and including a plurality of respectively different code frequency components, the code frequency components of the code frequency component sets forming component clusters spaced from one another within the frequency domain, each of the component clusters having a respective predetermined frequency range and consisting of one frequency component from each of the code frequency component sets falling within its respective predetermined frequency range, component clusters which are adjacent within the frequency domain being separated by respective frequency amounts, the predetermined frequency range of each respective component cluster being smaller than the frequency amounts separating the respective component cluster from its adjacent component clusters; and
means for combining the code with the audio signal.
34. A method for including a code having a plurality of code frequency components with an audio signal, comprising the steps of:
producing a first code frequency component;
producing a second code frequency component separately from the first code frequency component;
evaluating a first ability of the audio signal to mask the first code frequency component to produce a first masking evaluation;
evaluating a second ability of the audio signal to mask the second code frequency component to produce a second masking evaluation;
assigning a first amplitude to the first code frequency component based on the first masking evaluation;
assigning a second amplitude to the second code frequency component based on the second masking evaluation; and
including the first and second code frequency components with the audio signal.
35. The method of claim 34, wherein each of the first and second code frequency components is initially generated so that its amplitude is selected for masking by the audio signal.
36. The method of claim 34, wherein the respective amplitudes are assigned to the first and second code frequency components after the first and second frequency components are generated.
37. The method of claim 34, wherein the first and second frequency components are produced in response to data representing one symbol.
38. An apparatus for including a code having a plurality of code frequency components with an audio signal, comprising:
means for producing a first code frequency component;
means for producing a second code frequency component separately from the first code frequency component;
means for evaluating a first ability of the audio signal to mask the first code frequency component to produce a first masking evaluation;
means for evaluating a second ability of the audio signal to mask the second code frequency component to produce a second masking evaluation;
means for assigning a first amplitude to the first code frequency component based on the first masking evaluation;
means for assigning a second amplitude to the second code frequency component based on the second masking evaluation; and
means for including the first and second code frequency components with the audio signal.
39. The apparatus of claim 38, wherein the means for producing the first and second frequency components are operative to produce the first and second frequency components in response to data representing one symbol.
40. An apparatus for including a code having a plurality of code frequency components with an audio signal, comprising:
a digital processor having an input for receiving the audio signal, the digital processor being programmed to produce a first code frequency component, to produce a second code frequency component separately from the first code frequency component, to evaluate a first ability of the audio signal to mask the first code frequency component to produce a first masking evaluation, to evaluate a second ability of the audio signal to mask the second code frequency component to produce a second masking evaluation, to assign a first amplitude to the first code frequency component based on the first masking evaluation, and to assign a second amplitude to the second code frequency component based on the second masking evaluation; and
means for including the first and second code frequency components with the audio signal.
41. The apparatus of claim 40, wherein the digital processor is programmed to produce the first and second code frequency components in response to data representing one symbol.
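The cluster geometry recited in claims 19, 25 and 31 to 33 can be checked numerically. The following is a minimal sketch only, not the claimed apparatus: the function name, the data layout (one list of frequencies per code symbol, one entry per cluster) and the example frequencies are all assumptions made for illustration.

```python
def clusters_are_well_separated(symbol_sets):
    """Check the cluster spacing condition for a set of code symbols.

    symbol_sets[s][c] is the frequency in Hz of the component that
    symbol s contributes to cluster c (hypothetical layout).
    Returns True when every cluster is narrower than the gap to each
    of its adjacent clusters.
    """
    n_clusters = len(symbol_sets[0])
    if any(len(s) != n_clusters for s in symbol_sets):
        raise ValueError("every symbol needs one component per cluster")

    # Each cluster consists of one component from each symbol set.
    clusters = [sorted(s[c] for s in symbol_sets) for c in range(n_clusters)]
    clusters.sort(key=lambda cluster: cluster[0])

    widths = [cluster[-1] - cluster[0] for cluster in clusters]
    gaps = [clusters[i + 1][0] - clusters[i][-1] for i in range(n_clusters - 1)]

    for i, width in enumerate(widths):
        neighbour_gaps = []
        if i > 0:
            neighbour_gaps.append(gaps[i - 1])
        if i < n_clusters - 1:
            neighbour_gaps.append(gaps[i])
        # The cluster's own range must be smaller than each adjacent gap.
        if any(width >= gap for gap in neighbour_gaps):
            return False
    return True


# Four hypothetical symbols, three clusters near 1, 2 and 3 kHz.
good = [[1000, 2000, 3000], [1010, 2010, 3010],
        [1020, 2020, 3020], [1030, 2030, 3030]]
# Two hypothetical symbols whose clusters are wider than the gap between them.
bad = [[1000, 1100], [1090, 1190]]
print(clusters_are_well_separated(good), clusters_are_well_separated(bad))
```

In the first example each cluster spans 30 Hz and adjacent clusters are 970 Hz apart, so the condition holds; in the second each cluster spans 90 Hz with only a 10 Hz gap, so it fails.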
Description
BACKGROUND OF THE INVENTION
The present invention relates to apparatus and methods for including codes in audio signals and decoding such codes.
For many years, techniques have been proposed for mixing codes with audio signals so that (1) the codes can be reliably reproduced from the audio signals, while (2) the codes are inaudible when the audio signals are reproduced as sound. The accomplishment of both objectives is essential for practical application. For example, broadcasters and producers of broadcast programs, as well as those who record music for public distribution will not tolerate the inclusion of audible codes in their programs and recordings.
Techniques for encoding audio signals have been proposed at various times going back at least to U.S. Pat. No. 3,004,104 to Hembrooke issued Oct. 10, 1961. Hembrooke showed an encoding method in which audio signal energy within a narrow frequency band was selectively removed to encode the signal. A problem with this technique arises when noise or signal distortion reintroduces energy into the narrow frequency band so that the code is obscured.
In another method, U.S. Pat. No. 3,845,391 to Crosby proposed to eliminate a narrow frequency band from the audio signal and insert a code therein. This technique evidently encountered the same problems as Hembrooke, as recounted in U.S. Pat. No. 4,703,476 to Howard which, as indicated thereon, was commonly assigned with the Crosby patent. However, the Howard patent sought only to improve Crosby's method without departing from its fundamental approach.
It has also been proposed to encode binary signals by spreading the binary codes into frequencies extending throughout the audio band. A problem with this proposed method is that, in the absence of audio signal components to mask the code frequencies, they can become audible. This method, therefore, relies on the asserted noiselike character of the codes to suggest that their presence will be ignored by listeners. However, in many cases this assumption may not be valid, for example, in the case of classical music including portions with relatively little audio signal content or during pauses in speech.
A further technique has been suggested in which dual tone multifrequency (DTMF) codes are inserted in an audio signal. The DTMF codes are purportedly detected based on their frequencies and durations. However, audio signal components can be mistaken for one or both tones of each DTMF code, so that either the presence of a code can be missed by the detector or signal components can be mistaken for a DTMF code. It is noted in addition that each DTMF code includes a tone common to another DTMF code. Accordingly, a signal component corresponding to a tone of a different DTMF code can combine with the tone of a DTMF code which is simultaneously present in the signal to result in a false detection.
OBJECTS AND SUMMARY OF THE INVENTION
Accordingly, it is an object of the present invention to provide coding and decoding apparatus and methods which overcome the disadvantages of the foregoing proposed techniques.
It is a further object of the present invention to provide coding apparatus and methods for including codes with audio signals so that, as sound, the codes are inaudible to the human ear but can be detected reliably by decoding apparatus.
A further object of the present invention is to provide decoding apparatus and methods for reliably recovering codes present in audio signals.
In accordance with a first aspect of the present invention, apparatus and methods for including a code having at least one code frequency component with an audio signal having a plurality of audio signal frequency components, comprise the means for and the steps of: evaluating an ability of a first set of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing to produce a first masking evaluation; evaluating an ability of a second set of the plurality of audio signal frequency components differing from the first set thereof to mask the at least one code frequency component to human hearing to produce a second masking evaluation; assigning an amplitude to the at least one code frequency component based on a selected one of the first and second masking evaluations; and including the at least one code frequency component with the audio signal.
In accordance with another aspect of the present invention, an apparatus for including a code having at least one code frequency component with an audio signal having a plurality of audio signal frequency components, comprises: a digital computer having an input for receiving the audio signal, the digital computer being programmed to evaluate respective abilities of first and second sets of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing to produce respective first and second masking evaluations, the second set of the plurality of audio signal frequency components differing from the first set thereof, the digital computer being further programmed to assign an amplitude to the at least one code frequency component based on a selected one of the first and second masking evaluations; and means for including the at least one code frequency component with the audio signal.
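By way of illustration only, the two-evaluation scheme of the foregoing aspects might be sketched as follows. The masking model (a fixed fraction of the RMS amplitude of each set), the parameter `masking_ratio`, and the rule of selecting the evaluation that permits the larger amplitude are assumptions of this sketch; the passage above states only that a selected one of the two evaluations is used.

```python
import math


def masking_evaluation(component_powers, masking_ratio=0.1):
    """Toy masking evaluation (assumed model, not taken from the source).

    Returns the largest code amplitude that the given set of audio
    signal frequency components is judged able to mask, taken here as
    a fixed fraction of the set's RMS amplitude.
    """
    return masking_ratio * math.sqrt(sum(component_powers))


def assign_code_amplitude(first_set, second_set, masking_ratio=0.1):
    """Evaluate two different sets of audio components and select one result."""
    first = masking_evaluation(first_set, masking_ratio)
    second = masking_evaluation(second_set, masking_ratio)
    # Assumed selection rule: use the evaluation permitting the larger amplitude.
    return max(first, second)


# Hypothetical powers: a single component versus a pair of components.
amplitude = assign_code_amplitude([4.0], [9.0, 16.0])
print(amplitude)
```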
In accordance with a further aspect of the present invention, apparatus and methods for including a code having a plurality of code frequency components with an audio signal having a plurality of audio signal frequency components, the plurality of code frequency components including a first code frequency component having a first frequency and a second code frequency component having a second frequency different from the first frequency, comprise the means for and the steps of, respectively: evaluating an ability of at least one of the plurality of audio signal frequency components to mask a code frequency component having the first frequency to human hearing to produce a first respective masking evaluation; evaluating an ability of at least one of the plurality of audio signal frequency components to mask a code frequency component having the second frequency to human hearing to produce a second respective masking evaluation; assigning a respective amplitude to the first code frequency component based on the first respective masking evaluation and assigning a respective amplitude to the second code frequency component based on the second respective masking evaluation; and including the plurality of code frequency components with the audio signal.
In accordance with yet another aspect of the present invention, an apparatus for including a code having a plurality of code frequency components with an audio signal having a plurality of audio signal frequency components, the plurality of code frequency components including a first code frequency component having a first frequency and a second code frequency component having a second code frequency different from the first frequency, comprises: a digital computer having an input for receiving the audio signal, the digital computer being programmed to evaluate an ability of at least one of the plurality of audio signal frequency components to mask a code frequency component having the first frequency to human hearing to produce a first respective masking evaluation and to evaluate an ability of at least one of the plurality of audio signal frequency components to mask a code frequency component having the second frequency to human hearing to produce a second respective masking evaluation; the digital computer being further programmed to assign a corresponding amplitude to the first code frequency component based on the first respective masking evaluation and to assign a corresponding amplitude to the second code frequency component based on the second respective masking evaluation; and means for including the plurality of code frequency components with the audio signal.
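The per-component amplitude assignment described in the two preceding aspects may be illustrated by the following sketch. The masking model is a deliberately simple placeholder: the 20 dB offset, the 30 dB per octave slope and all function names are hypothetical and are not taken from this document.

```python
import math


def masked_amplitude(code_freq, audio_components,
                     offset_db=20.0, slope_db_per_octave=30.0):
    """Largest code amplitude at code_freq masked by any one audio component.

    audio_components is a list of (frequency_hz, amplitude) pairs.
    Toy model (assumption): a masker hides a code component that lies
    offset_db below it, with a further slope_db_per_octave of loss for
    every octave of separation between masker and code frequency.
    """
    best = 0.0
    for freq, amplitude in audio_components:
        octaves = abs(math.log2(code_freq / freq))
        attenuation_db = offset_db + slope_db_per_octave * octaves
        best = max(best, amplitude * 10 ** (-attenuation_db / 20))
    return best


def assign_amplitudes(code_freqs, audio_components):
    """Evaluate masking separately at each code frequency."""
    return {f: masked_amplitude(f, audio_components) for f in code_freqs}


# One hypothetical audio component at 1 kHz, two code frequencies.
amplitudes = assign_amplitudes([1000.0, 2000.0], [(1000.0, 1.0)])
print(amplitudes)
```

Because each code frequency is evaluated on its own, a component close to a strong audio component receives a larger amplitude than one farther away.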
In accordance with a still further aspect of the present invention, apparatus and methods for including a code having at least one code frequency component with an audio signal including a plurality of audio signal frequency components, comprise the means for and the steps of, respectively: evaluating an ability of at least one of the plurality of audio signal frequency components within a first audio signal interval on a time scale of the audio signal when reproduced as sound during a corresponding first time interval to mask the at least one code frequency component to human hearing when reproduced as sound during a second time interval corresponding to a second audio signal interval offset from the first audio signal interval to produce a first masking evaluation; assigning an amplitude to the at least one code frequency component based on the first masking evaluation; and including the at least one code frequency component in a portion of the audio signal within the second audio signal interval.
In accordance with yet still another aspect of the present invention, an apparatus for including a code having at least one code frequency component with an audio signal including a plurality of audio signal frequency components, comprises: a digital computer having an input for receiving the audio signal, the digital computer being programmed to evaluate an ability of at least one of the plurality of audio signal frequency components within a first audio signal interval on a time scale of the audio signal when reproduced as sound during a corresponding first time interval to mask the at least one code frequency component to human hearing when reproduced as sound during a second time interval corresponding to a second audio signal interval offset from the first audio signal interval, to produce a first masking evaluation; the digital computer being further programmed to assign an amplitude to the at least one code frequency component based on the first masking evaluation; and means for including the at least one code frequency component in a portion of the audio signal within the second audio signal interval.
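The time-offset masking of the two preceding aspects can be sketched as below. This is a minimal illustration under stated assumptions: the code interval is taken to follow the masking audio interval directly, and `carry_over` and `masking_ratio` are hypothetical parameters describing how much masking persists into the next interval.

```python
def temporal_code_amplitudes(interval_amplitudes,
                             carry_over=0.5, masking_ratio=0.1):
    """Assign a code amplitude to each interval from the preceding interval.

    interval_amplitudes[i] is the amplitude of the masking audio
    component in audio signal interval i. The code amplitude for
    interval i + 1 is based on the audio in interval i (assumed offset
    of exactly one interval).
    """
    amplitudes = [0.0]  # no earlier interval is available for the first one
    for previous in interval_amplitudes[:-1]:
        amplitudes.append(masking_ratio * carry_over * previous)
    return amplitudes


# Loud audio in interval 0 permits a code component in interval 1,
# even though interval 1 itself is silent in this hypothetical input.
print(temporal_code_amplitudes([1.0, 0.0, 2.0]))
```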
In accordance with a still further aspect of the present invention, apparatus and methods for including a code having at least one code frequency component with an audio signal having a plurality of audio signal frequency components, comprise the means for and the steps of, respectively: producing a first tonal signal representing substantially a first single one of the plurality of audio signal frequency components; evaluating an ability of the first single one of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing based on the first tonal signal to produce a first masking evaluation; assigning an amplitude to the at least one code frequency component based on the first masking evaluation; and including the at least one code frequency component with the audio signal.
In accordance with another aspect of the present invention, an apparatus for including a code having at least one code frequency component with an audio signal having a plurality of audio signal frequency components, comprises: a digital computer having an input for receiving the audio signal, the digital computer being programmed to produce a first tonal signal representing substantially a first single one of the plurality of audio signal frequency components and to evaluate an ability of the first single one of the plurality of audio signal frequency components to mask the at least one code frequency component to human hearing based on the first tonal signal to produce a first masking evaluation; the digital computer being further programmed to assign an amplitude to the at least one code frequency component based on the first masking evaluation; and means for including the at least one code frequency component with the audio signal.
In accordance with yet still another aspect of the present invention, apparatus and methods for detecting a code in an encoded audio signal, the encoded audio signal including a plurality of audio frequency signal components and at least one code frequency component having an amplitude and an audio frequency selected for masking the code frequency component to human hearing by at least one of the plurality of audio frequency signal components, comprise the means for and the steps of, respectively: establishing an expected code amplitude of the at least one code frequency component based on the encoded audio signal; and detecting the code frequency component in the encoded audio signal based on the expected code amplitude thereof.
In accordance with a yet still further aspect of the present invention, a programmed digital computer is provided for detecting a code in an encoded audio signal, the encoded audio signal including a plurality of audio frequency signal components and at least one code frequency component having an amplitude and an audio frequency selected for masking the code frequency component to human hearing by at least one of the plurality of audio frequency signal components, the digital computer comprising: an input for receiving the encoded audio signal; a processor programmed to establish an expected code amplitude of the at least one code frequency component based on the encoded audio signal, to detect the code frequency component in the encoded audio signal based on the expected code amplitude and to produce a detected code output signal based on the detected code frequency component; and an output coupled with the processor for providing the detected code output signal.
In accordance with another aspect of the present invention, apparatus and methods are provided for detecting a code in an encoded audio signal, the encoded audio signal having a plurality of frequency components including a plurality of audio frequency signal components and at least one code frequency component having a predetermined audio frequency and a predetermined amplitude for distinguishing the at least one code frequency component from the plurality of audio frequency signal components, comprising the means for and the steps of, respectively: determining an amplitude of a frequency component of the encoded audio signal within a first range of audio frequencies including the predetermined audio frequency of the at least one code frequency component; establishing a noise amplitude for the first range of audio frequencies; and detecting the presence of the at least one code frequency component in the first range of audio frequencies based on the established noise amplitude thereof and the determined amplitude of the frequency component therein.
In accordance with a further aspect of the present invention, a digital computer is provided for detecting a code in an encoded audio signal, the encoded audio signal having a plurality of frequency components including a plurality of audio frequency signal components and at least one code frequency component having a predetermined audio frequency and a predetermined amplitude for distinguishing the at least one code frequency component from the plurality of audio frequency signal components, comprising: an input for receiving the encoded audio signal; a processor coupled with the input to receive the encoded audio signal and programmed to determine an amplitude of a frequency component of the encoded audio signal within a first range of audio frequencies including the predetermined audio frequency of the at least one code frequency component; the processor being further programmed to establish a noise amplitude for the first range of audio frequencies and to detect the presence of the at least one code frequency component in the first range of audio frequencies based on the established noise amplitude thereof and the determined amplitude of the frequency component therein; the processor being operative to produce a code output signal based on the detected presence of the at least one code frequency component; and an output terminal coupled with the processor to provide the code signal thereat.
In accordance with yet a further aspect of the present invention, apparatus and methods are provided for encoding an audio signal, comprising the means for and the steps of, respectively: generating a code comprising a plurality of code frequency component sets, each of the code frequency component sets representing a respectively different code symbol and including a plurality of respectively different code frequency components, the code frequency components of the code frequency component sets forming component clusters spaced from one another within the frequency domain, each of the component clusters having a respective predetermined frequency range and consisting of one frequency component from each of the code frequency component sets falling within its respective predetermined frequency range, component clusters which are adjacent within the frequency domain being separated by respective frequency amounts, the predetermined frequency range of each respective component cluster being smaller than the frequency amounts separating the respective component cluster from its adjacent component clusters; and combining the code with the audio signal.
In accordance with yet still another aspect of the present invention, a digital computer is provided for encoding an audio signal, comprising: an input for receiving the audio signal, a processor programmed to produce a code comprising a plurality of code frequency component sets, each of the code frequency component sets representing a respectively different code symbol and including a plurality of respectively different code frequency components, the code frequency components of the code frequency component sets forming component clusters spaced from one another within the frequency domain, each of the component clusters having a respective predetermined frequency range and consisting of one frequency component from each of the code frequency component sets falling within its respective predetermined frequency range, component clusters which are adjacent within the frequency domain being separated by respective frequency amounts, the predetermined frequency range of each respective component cluster being smaller than the frequency amounts separating the respective component cluster from its adjacent component clusters; and means for combining the code with the audio signal.
The above, and other objects, features and advantages of the invention, will be apparent in the following detailed description of certain advantageous embodiments thereof which is to be read in connection with the accompanying drawings forming a part hereof, and wherein corresponding elements are identified by the same reference numerals in the several views of the drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a functional block diagram of an encoder in accordance with an aspect of the present invention;
FIG. 2 is a functional block diagram of a digital encoder in accordance with an embodiment of the present invention;
FIG. 3 is a block diagram of an encoding system for use in encoding audio signals supplied in analog form;
FIG. 4 provides spectral diagrams for use in illustrating frequency compositions of various data symbols as encoded by the embodiment of FIG. 3;
FIGS. 5 and 6 are functional block diagrams for use in illustrating the operation of the embodiment of FIG. 3;
FIGS. 7A through 7C are flow charts for illustrating a software routine employed in the embodiment of FIG. 3;
FIGS. 7D and 7E are flow charts for illustrating an alternative software routine employed in the embodiment of FIG. 3;
FIG. 7F is a graph showing a linear approximation of a single tone masking relationship;
FIG. 8 is a block diagram of an encoder employing analog circuitry;
FIG. 9 is a block diagram of a weighting factor determination circuit of the embodiment of FIG. 8;
FIG. 10 is a functional block diagram of a decoder in accordance with certain features of the present invention;
FIG. 11 is a block diagram of a decoder in accordance with an embodiment of the present invention employing digital signal processing;
FIGS. 12A and 12B are flow charts for use in describing the operation of the decoder of FIG. 11;
FIG. 13 is a functional block diagram of a decoder in accordance with certain embodiments of the present invention;
FIG. 14 is a block diagram of an embodiment of an analog decoder in accordance with the present invention;
FIG. 15 is a block diagram of a component detector of the embodiment of FIG. 14; and
FIGS. 16 and 17 are block diagrams of apparatus in accordance with an embodiment of the present invention incorporated in a system for producing estimates of audiences for widely disseminated information.
DETAILED DESCRIPTION OF CERTAIN ADVANTAGEOUS EMBODIMENTS
Encoding
The present invention implements techniques for including codes in audio signals in order to optimize the probability of accurately recovering the information in the codes from the signals, while ensuring that the codes are inaudible to the human ear when the encoded audio is reproduced as sound even if the frequencies of the codes fall within the audible frequency range.
With reference first to FIG. 1, a functional block diagram of an encoder in accordance with an aspect of the present invention is illustrated therein. An audio signal to be encoded is received at an input terminal 30. The audio signal may represent, for example, a program to be broadcast by radio, the audio portion of a television broadcast, or a musical composition or other kind of audio signal to be recorded in some fashion. Moreover, the audio signal may be a private communication, such as a telephone transmission, or a personal recording of some sort. However, these are examples of the applicability of the present invention and there is no intention to limit its scope by providing such examples.
As indicated by the functional block 34 in FIG. 1, the ability of one or more components of the received audio signal to mask sounds having frequencies corresponding with those of the code frequency component or components to be added to the audio signal is evaluated. Multiple evaluations may be carried out for a single code frequency, a separate evaluation for each of a plurality of code frequencies may be carried out, multiple evaluations for each of a plurality of code frequencies may be effected, one or more common evaluations for multiple code frequencies may be carried out or a combination of one or more of the foregoing may be implemented. Each evaluation is carried out based on the frequency of the one or more code components to be masked and the frequency or frequencies of the audio signal component or components whose masking abilities are being evaluated. In addition, if the code component and the masking audio component or components do not fall within substantially simultaneous signal intervals, such that they would be reproduced as sound at significantly different time intervals, the effects of differences in signal intervals between the code component or components being masked and the masking program component or components are also to be taken into consideration.
Advantageously, in certain embodiments multiple evaluations are carried out for each code component by separately considering the abilities of different portions of the audio signal to mask each code component. In one embodiment, the ability of each of a plurality of substantially single tone audio signal components to mask a code component is evaluated based on the frequency of the audio signal component, its "amplitude" (as defined herein) and timing relative to the code component, such masking being referred to herein as "tonal masking".
The term "amplitude" is used herein to refer to any signal value or values which may be employed to evaluate masking ability, to select the size of a code component, to detect its presence in a reproduced signal, or as otherwise used, including values such as signal energy, power, voltage, current, intensity and pressure, whether measured on an absolute or relative basis, and whether measured on an instantaneous or accumulated basis. As appropriate, amplitude may be measured as a windowed average, an arithmetic average, by integration, as a root-mean-square value, as an accumulation of absolute or relative discrete values, or otherwise.
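By way of illustration only, the following minimal Python sketch computes three of the measures named above over a window of samples: accumulated signal energy, a root-mean-square value and an arithmetic average of absolute values. The function names are hypothetical and the selection of measures is an assumption made for this example; any of the other measures mentioned may be substituted.

```python
import math


def energy(window):
    """Accumulated signal energy: the sum of squared sample values."""
    return sum(x * x for x in window)


def rms(window):
    """Root-mean-square value of the samples in the window."""
    return math.sqrt(energy(window) / len(window))


def mean_abs(window):
    """Arithmetic average of the absolute sample values."""
    return sum(abs(x) for x in window) / len(window)
```

Whichever measure is chosen, it should be applied consistently when evaluating masking ability and when sizing the code components.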
In other embodiments, in addition to tonal masking evaluations or in the alternative, the ability of audio signal components within a relatively narrow band of frequencies sufficiently near a given code component to mask the component is evaluated (referred to herein as "narrow band" masking). In still other embodiments, the ability of multiple program audio components within a relatively broad band of frequencies to mask the component is evaluated. As necessary or appropriate, the abilities of program audio components in signal intervals preceding or following a given component or components to mask the same on a non-simultaneous basis are evaluated. This manner of evaluation is particularly useful where audio signal components in a given signal interval have insufficiently large amplitudes to permit the inclusion of code components of sufficiently large amplitudes in the same signal interval so that they are distinguishable from noise.
Preferably, a combination of two or more of tonal masking abilities, narrow band masking abilities and broadband masking abilities (and, as necessary or appropriate, non-simultaneous masking abilities) is evaluated for multiple code components. Where code components are sufficiently close in frequency, separate evaluations need not be carried out for each.
In certain other advantageous embodiments, a sliding tonal analysis is carried out instead of separate tonal, narrow band and broadband analyses, avoiding the need to classify the program audio as tonal, narrow band or broadband.
Preferably, where a combination of masking abilities is evaluated, each evaluation provides a maximum allowable amplitude for one or more code components, so that by comparing all of the evaluations that have been carried out and which relate to a given component, a maximum amplitude may be selected therefor which will ensure that each component will nevertheless be masked by the audio signal when it is reproduced as sound so that all of the components become inaudible to human hearing. By maximizing the amplitude of each component, the probability of detecting its presence based on its amplitude is likewise maximized. Of course, it is not essential that the maximum possible amplitude be employed, as it is only necessary when decoding to be able to distinguish a sufficiently large number of code components from audio signal components and other noise.
The results of the evaluations are output as indicated at 36 in FIG. 1 and made available to a code generator 40. Code generation may be carried out in any of a variety of different ways. One particularly advantageous technique assigns a unique set of code frequency components to each of a plurality of data states or symbols, so that, during a given signal interval, a corresponding data state is represented by the presence of its respective set of code frequency components. In this manner, interference with code detection by audio signal components is reduced since, in an advantageously high percentage of signal intervals, a sufficiently large number of code components will be detectable despite program audio signal interference with the detection of other components. Moreover, the process of implementing the masking evaluations is simplified where the frequencies of the code components are known before they are generated.
Other forms of encoding may also be implemented. For example, frequency shift keying (FSK), frequency modulation (FM), frequency hopping, spread spectrum encoding, as well as combinations of the foregoing can be employed. Still other encoding techniques which may be used in practicing the present invention will be apparent from its disclosure herein.
The data to be encoded is received at an input 42 of the code generator 40 which responds by producing its unique group of code frequency components and assigning an amplitude to each based upon the evaluations received from the output 36. The code frequency components as thus produced are supplied to a first input of a summing circuit 46 which receives the audio signal to be encoded at a second input. The circuit 46 adds the code frequency components to the audio signal and outputs an encoded audio signal at an output terminal 50. The circuit 46 may be either an analog or digital summing circuit, depending on the form of the signals supplied thereto. The summing function may also be implemented by software and, if so, a digital processor used to carry out the masking evaluation and to produce the code can also be used to sum the code with the audio signal. In one embodiment, the code is supplied as time domain data in digital form which is then summed with time domain audio data. In another, the audio signal is converted to the frequency domain in digital form and added to the code which likewise is represented as digital frequency domain data. In most applications, the summed frequency domain data is then converted to time domain data.
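The time domain summation described above may be sketched in software as follows. This is a minimal Python illustration under stated assumptions, not the implementation of the circuit 46: the function names, the use of sinusoidal code components and the sample values are all hypothetical, and the amplitudes are taken as already assigned by the masking evaluation.

```python
import math


def code_components(freqs_hz, amplitudes, sample_rate, n_samples):
    """Time domain samples of the summed code frequency components.

    Each component is modelled as a sinusoid (an assumption) whose
    amplitude has already been assigned by the masking evaluation.
    """
    return [
        sum(a * math.sin(2 * math.pi * f * i / sample_rate)
            for f, a in zip(freqs_hz, amplitudes))
        for i in range(n_samples)
    ]


def add_code(audio, code):
    """Sample-by-sample sum of audio and code, the role of a summing circuit."""
    if len(audio) != len(code):
        raise ValueError("audio and code must cover the same signal interval")
    return [x + c for x, c in zip(audio, code)]
```

The same sample-wise addition applies to frequency domain data, in which case the sum is ordinarily converted back to time domain data afterwards.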
From the following, it will be seen that masking evaluation as well as code producing functions may be carried out either by digital or analog processing, or by combinations of digital and analog processing. In addition, while the audio signal may be received in analog form at the input terminal 30 and added to the code components in analog form by the circuit 46 as shown in FIG. 1, in the alternative, the audio signal may be converted to digital form when it is received, added to the code components in digital form and output in either digital or analog form. For example, when the signal is to be recorded on a compact disk or on a digital audio tape, it may be output in digital form, whereas if it is to be broadcast by conventional radio or television broadcasting techniques, it may be output in analog form. Various other combinations of analog and digital processing may also be implemented.
In certain embodiments, the code components of only one code symbol at a time are included in the audio signal. However, in other embodiments, the components of multiple code symbols are included simultaneously in the audio signal. For example, in certain embodiments the components of one symbol occupy one frequency band and those of another occupy a second frequency band simultaneously. In the alternative, the components of one symbol can reside in the same band as another or in an overlapping band, so long as their components are distinguishable, for example, by assigning them to respectively different frequencies or frequency intervals.
An embodiment of a digital encoder is illustrated in FIG. 2. In this embodiment, an audio signal in analog form is received at an input terminal 60 and converted to digital form by an A/D converter 62. The digitized audio signal is supplied for masking evaluation, as indicated functionally by the block 64 pursuant to which the digitized audio signal is separated into frequency components, for example, by Fast Fourier Transform (FFT), wavelet transform, or other time-to-frequency domain transformation, or else by digital filtering. Thereafter, the masking abilities of audio signal frequency components within frequency bins of interest are evaluated for their tonal masking ability, narrow band masking ability and broadband masking ability (and, if necessary or appropriate, for non-simultaneous masking ability). Alternatively, the masking abilities of audio signal frequency components within frequency bins of interest are evaluated with a sliding tonal analysis.
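The separation of the digitized audio signal into frequency components may be illustrated with a direct discrete Fourier transform of a single bin. The Python sketch below is an assumption-laden example only: the function names are hypothetical, and a practical encoder would use an FFT or digital filtering as stated above rather than this slower direct computation.

```python
import cmath
import math


def bin_for_frequency(freq_hz, sample_rate, n_samples):
    """Index of the transform bin whose centre is nearest freq_hz."""
    return round(freq_hz * n_samples / sample_rate)


def dft_bin(samples, k):
    """Complex value of bin k of the discrete Fourier transform of samples."""
    n = len(samples)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n)
               for i, x in enumerate(samples))


def bin_energy(samples, k):
    """Squared magnitude of bin k, usable as an 'amplitude' for masking evaluation."""
    return abs(dft_bin(samples, k)) ** 2
```

For a unit-amplitude sinusoid that completes a whole number of cycles in the analysis window, the magnitude of its bin is half the window length and the other bins are essentially empty.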
Data to be encoded is received at an input terminal 68 and, for each data state corresponding to a given signal interval, its respective group of code components is produced, as indicated by the signal generation functional block 72, and subjected to level adjustment, as indicated by the block 76 which is also supplied with the relevant masking evaluations. Signal generation may be implemented, for example, by means of a look-up table storing each of the code components as time domain data or by interpolation of stored data. The code components can either be permanently stored or generated upon initialization of the system of FIG. 2 and then stored in memory, such as in RAM, to be output as appropriate in response to the data received at terminal 68. The values of the components may also be computed at the time they are generated.
Level adjustment is carried out for each of the code components based upon the relevant masking evaluations as discussed above, and the code components whose amplitude has been adjusted to ensure inaudibility are added to the digitized audio signal as indicated by the summation symbol 80. Depending on the amount of time necessary to carry out the foregoing processes, it may be desirable to delay the digitized audio signal, as indicated at 82 by temporary storage in memory. If the audio signal is not delayed, after an FFT and masking evaluation have been carried out for a first interval of the audio signal, the amplitude adjusted code components are added to a second interval of the audio signal following the first interval. If the audio signal is delayed, however, the amplitude adjusted code components can instead be added to the first interval and a simultaneous masking evaluation may thus be used. Moreover, if the portion of the audio signal during the first interval provides a greater masking capability for a code component added during the second interval than the portion of the audio signal during the second interval would provide to the code component during the same interval, an amplitude may be assigned to the code component based on the non-simultaneous masking abilities of the portion of audio signal within the first interval. In this fashion both simultaneous and non-simultaneous masking capabilities may be evaluated and an optimal amplitude can be assigned to each code component based on the more advantageous evaluation.
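The choice between the simultaneous and non-simultaneous evaluations may be sketched as follows. This minimal Python example assumes that each evaluation has already been reduced to a single permitted amplitude for the code component and that the larger permitted amplitude is the more advantageous one; the function name and the numeric values are hypothetical.

```python
def assign_amplitude(simultaneous_limit, non_simultaneous_limit):
    """Return the more advantageous permitted amplitude and its source.

    simultaneous_limit: amplitude permitted by the audio in the same interval.
    non_simultaneous_limit: amplitude permitted by the audio in the
    preceding interval (non-simultaneous masking).
    """
    if non_simultaneous_limit > simultaneous_limit:
        return non_simultaneous_limit, "non-simultaneous"
    return simultaneous_limit, "simultaneous"


# Hypothetical values: a loud first interval followed by a quiet second one.
amplitude, source = assign_amplitude(0.01, 0.03)
```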
In certain applications, such as in broadcasting, or analog recording (as on a conventional tape cassette), the encoded audio signal in digital form is converted to analog form by a digital-to-analog converter (DAC) 84. However, when the signal is to be transmitted or recorded in digital form, the DAC 84 may be omitted.
The various functions illustrated in FIG. 2 may be implemented, for example, by a digital signal processor or by a personal computer, workstation, mainframe, or other digital computer.
FIG. 3 is a block diagram of an encoding system for use in encoding audio signals supplied in analog form, such as in a conventional broadcast studio. In the system of FIG. 3, a host processor 90 which may be, for example, a personal computer, supervises the selection and generation of information to be encoded for inclusion in an analog audio signal received at an input terminal 94. The host processor 90 is coupled with a keyboard 96 and with a monitor 100, such as a CRT monitor, so that a user may select a desired message to be encoded while choosing from a menu of available messages displayed by the monitor 100. A typical message to be encoded in a broadcast audio signal could include station or channel identification information, program or segment information and/or a time code.
Once the desired message has been input to the host processor 90, the host proceeds to output data representing the symbols of the message to a digital signal processor (DSP) 104 which proceeds to encode each symbol received from the host processor 90 in the form of a unique set of code signal components as described hereinbelow. In one embodiment, the host processor generates a four state data stream, that is, a data stream in which each data unit can assume one of four distinct data states each representing a unique symbol including two synchronizing symbols termed "E" and "S" herein and two message information symbols "1" and "0" each of which represents a respective binary state. It will be appreciated that any number of distinct data states may be employed. For example, instead of two message information symbols, three data states may be represented by three unique symbols which permits a correspondingly larger amount of information to be conveyed by a data stream of a given size.
For example, when the program material represents speech, it is advantageous to transmit a symbol for a relatively longer period of time than in the case of program audio having a substantially more continuous energy content, in order to allow for the natural pauses or gaps present in speech. Accordingly, to ensure that information throughput is sufficiently high in this case, the number of possible message information symbols is advantageously increased. For symbols representing up to five bits, symbol transmission lengths of two, three and four seconds provide increasingly greater probabilities of correct decoding. In some such embodiments, an initial symbol ("E") is decoded when (i) the energy in the FFT bins for this symbol is greatest, (ii) the average energy minus the standard deviation of the energy for this symbol is greater than the average energy plus the average standard deviation of the energy for all other symbols, and (iii) the shape of the energy versus time curve for this symbol has a generally bell shape, peaking at the intersymbol temporal boundary.
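Conditions (i) and (ii) above may be sketched as follows. This Python example is one possible reading only: it assumes that the average and standard deviation are taken across the energies of the FFT bins belonging to each symbol, it omits the bell shape test of condition (iii), and the function name and data layout are hypothetical.

```python
from statistics import mean, pstdev


def passes_energy_tests(bin_energies, candidate):
    """Apply conditions (i) and (ii) to a candidate symbol.

    bin_energies maps each symbol to the list of energies measured in
    that symbol's FFT bins (an assumed data layout).
    """
    cand = bin_energies[candidate]
    others = [e for sym, e in bin_energies.items() if sym != candidate]

    # (i) the candidate's bins hold the greatest total energy
    greatest = all(sum(cand) > sum(o) for o in others)

    # (ii) candidate mean minus its standard deviation exceeds the mean
    # energy plus the mean standard deviation of all other symbols
    other_mean = mean(mean(o) for o in others)
    other_std = mean(pstdev(o) for o in others)
    separated = mean(cand) - pstdev(cand) > other_mean + other_std

    return greatest and separated
```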
In the embodiment of FIG. 3, as the DSP 104 has received the symbols of a given message to be encoded, it responds by generating a unique set of code frequency components for each symbol which it supplies at an output 106. With reference also to FIG. 4, spectral diagrams are provided for each of the four data symbols S, E, 0 and 1 of the exemplary data set described above. As shown in FIG. 4, in this embodiment the symbol S is represented by a unique group of ten code frequency components f.sub.1 through f.sub.10 arranged at equal frequency intervals in a range extending from a frequency value slightly greater than 2 kHz to a frequency value slightly less than 3 kHz. The symbol E is represented by a second unique group of ten code frequency components f.sub.11 through f.sub.20 arranged in the frequency spectrum at equal intervals from a first frequency value slightly greater than 2 kHz up to a frequency value slightly less than 3 kHz, wherein each of the code components f.sub.11 through f.sub.20 has a unique frequency value different from all others in the same group as well as from all of the frequencies f.sub.1 through f.sub.10. The symbol 0 is represented by a further unique group of ten code frequency components f.sub.21 through f.sub.30 also arranged at equal frequency intervals from a value slightly greater than 2 kHz up to a value slightly less than 3 kHz and each of which has a unique frequency value different from all others in the same group as well as from all of the frequencies f.sub.1 through f.sub.20. Finally, the symbol 1 is represented by a further unique group of ten code frequency components f.sub.31 through f.sub.40 also arranged at equal frequency intervals from a value slightly greater than 2 kHz to a value slightly less than 3 kHz, such that each of the components f.sub.31 through f.sub.40 has a unique frequency value different from any of the other frequency components f.sub.1 through f.sub.40. 
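The uniformly spaced arrangement of FIG. 4 may be illustrated by generating four interleaved groups of ten frequencies. In the Python sketch below, the starting frequency of 2010 Hz, the 100 Hz spacing between tones of one symbol and the 10 Hz offset between symbols are hypothetical values chosen only to satisfy the constraints stated above; they are not taken from the embodiment.

```python
SYMBOLS = ["S", "E", "0", "1"]


def uniform_code_frequencies(base_hz=2010.0, tone_step_hz=100.0,
                             symbol_offset_hz=10.0, tones_per_symbol=10):
    """Return {symbol: [ten equally spaced frequencies in Hz]}.

    Each symbol's group is shifted by symbol_offset_hz from the previous
    one, so no frequency is shared within or between groups.
    """
    return {
        sym: [base_hz + s * symbol_offset_hz + t * tone_step_hz
              for t in range(tones_per_symbol)]
        for s, sym in enumerate(SYMBOLS)
    }


frequencies = uniform_code_frequencies()
```

With these assumed values, the group for the symbol S runs from 2010 Hz to 2910 Hz and the group for the symbol 1 runs from 2040 Hz to 2940 Hz, so that all forty components lie between 2 kHz and 3 kHz.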
By using multiple code frequency components for each data state so that the code components of each state are substantially separated from one another in frequency, the presence of noise (such as non-code audio signal components or other noise) in a common detection band with any one code component of a given data state is less likely to interfere with detection of the remaining components of that data state.
In other embodiments, it is advantageous to represent the symbols by multiple frequency components, for example ten code tones or frequency components, which are not uniformly spaced in frequency, and which do not have the same offset from symbol to symbol. Avoiding an integral relationship between code frequencies for a symbol by clustering the tones reduces the effects of interfrequency beating and room nulls, that is, locations where echoes from room walls interfere with correct decoding. The following sets of code tone frequency components for the four symbols (0, 1, S and E) are provided for alleviating the effects of room nulls, where f.sub.1 through f.sub.10 represent respective code frequency components of each of the four symbols (expressed in Hertz):
______________________________________
        "0"       "1"       "S"       "E"
______________________________________
f1     1046.9    1054.7    1062.5    1070.3
f2     1195.3    1203.1    1179.7    1187.5
f3     1351.6    1343.8    1335.9    1328.1
f4     1492.2    1484.4    1507.8    1500.0
f5     1656.3    1664.1    1671.9    1679.7
f6     1859.4    1867.2    1843.8    1851.6
f7     2078.1    2070.3    2062.5    2054.7
f8     2296.9    2289.1    2304.7    2312.5
f9     2546.9    2554.7    2562.5    2570.3
f10    2859.4    2867.2    2843.8    2851.6
______________________________________
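The tabulated frequencies can be checked against the cluster property recited in the summary, namely that the frequency range of each component cluster is smaller than the frequency amounts separating it from its adjacent clusters. The following Python sketch performs that check on the values of the table; the function names are hypothetical and the check is offered only as an illustration.

```python
# Code tone frequencies in Hz from the table: one row per cluster (f1..f10),
# columns in the order "0", "1", "S", "E".
CLUSTERS = [
    (1046.9, 1054.7, 1062.5, 1070.3),
    (1195.3, 1203.1, 1179.7, 1187.5),
    (1351.6, 1343.8, 1335.9, 1328.1),
    (1492.2, 1484.4, 1507.8, 1500.0),
    (1656.3, 1664.1, 1671.9, 1679.7),
    (1859.4, 1867.2, 1843.8, 1851.6),
    (2078.1, 2070.3, 2062.5, 2054.7),
    (2296.9, 2289.1, 2304.7, 2312.5),
    (2546.9, 2554.7, 2562.5, 2570.3),
    (2859.4, 2867.2, 2843.8, 2851.6),
]


def cluster_ranges(clusters):
    """Width of each cluster: highest minus lowest tone."""
    return [max(c) - min(c) for c in clusters]


def cluster_gaps(clusters):
    """Separation between each pair of adjacent clusters."""
    return [min(b) - max(a) for a, b in zip(clusters, clusters[1:])]


def clusters_well_separated(clusters):
    """True if every cluster is narrower than the gaps on either side of it."""
    ranges = cluster_ranges(clusters)
    gaps = cluster_gaps(clusters)
    for i, width in enumerate(ranges):
        # Cluster i borders gap i-1 (below) and gap i (above), where present.
        neighbours = gaps[max(i - 1, 0): i + 1]
        if any(width >= g for g in neighbours):
            return False
    return True
```

For the tabulated values, every cluster is less than 24 Hz wide, while the smallest separation between adjacent clusters exceeds 109 Hz.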
Generally speaking, in the examples provided above, the spectral content of the code varies relatively little when the DSP 104 switches its output from any of the data states S, E, 0 and 1 to any other thereof. In accordance with one aspect of the present invention in certain advantageous embodiments, each code frequency component of each symbol is paired with a frequency component of each of the other data states so that the difference therebetween is less than the critical bandwidth therefor. For any pair of pure tones, the critical bandwidth is a frequency range within which the frequency separation between the two tones may be varied without substantially increasing loudness. Since the frequency separation between adjacent tones in the case of each of data states S, E, 0 and 1 is the same, and since each tone of each of the data states S, E, 0 and 1 is paired with a respective tone of each of the others thereof so that the difference in frequency therebetween is less than the critical bandwidth for that pair, there will be substantially no change in loudness upon transition from any of the data states S, E, 0 and 1 to any of the others thereof when they are reproduced as sound. Moreover, by minimizing the difference in frequency between the code components of each pair, the relative probabilities of detecting each data state when it is received are not substantially affected by the frequency characteristics of the transmission path. A further benefit of pairing components of different data states so that they are relatively close in frequency is that a masking evaluation carried out for a code component of a first data state will be substantially accurate for a corresponding component of a next data state when switching of states takes place.
Alternatively, in the non-uniform code tone spacing scheme to minimize the effects of room nulls, it will be seen that the frequencies selected for each of the code frequency components f.sub.1 through f.sub.10 are clustered around a frequency; for example, the frequency components for f1, f2 and f3 are located in the vicinity of 1055 Hz, 1180 Hz and 1340 Hz, respectively. Specifically, in this exemplary embodiment, the tones are spaced apart by two times the FFT resolution (for example, for a resolution of 4 Hz, the tones are shown as spaced apart by 8 Hz) and are chosen to be in the middle of the frequency range of an FFT bin. Also, the order of the various frequencies which are assigned to the code frequency components f.sub.1 through f.sub.10 for representing the various symbols 0, 1, S and E is varied in each cluster. For example, the frequencies selected for the components f1, f2 and f3 correspond to the symbols (0, 1, S, E), (S, E, 0, 1) and (E, S, 1, 0), respectively, from lowest to highest frequency, that is, (1046.9, 1054.7, 1062.5, 1070.3), (1179.7, 1187.5, 1195.3, 1203.1), (1328.1, 1335.9, 1343.8, 1351.6). A benefit of this scheme is that even if there is a room null which interferes with correct reception of a code component, in general the same tone is eliminated from each of the symbols, so it is easier to decode a symbol from the remaining components. In contrast, if a room null eliminates a component from one symbol but not from another symbol, it is more difficult to correctly decode the symbol.
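The varying order of symbols within the first three clusters, and the tone spacing of approximately 8 Hz, can be reproduced from the tabulated frequencies. The Python sketch below is illustrative only; the function names are hypothetical, and the frequencies are listed per cluster in the column order "0", "1", "S", "E".

```python
SYMBOLS = ("0", "1", "S", "E")

# Frequencies in Hz for components f1, f2 and f3, in the order of SYMBOLS.
FIRST_CLUSTERS = {
    "f1": (1046.9, 1054.7, 1062.5, 1070.3),
    "f2": (1195.3, 1203.1, 1179.7, 1187.5),
    "f3": (1351.6, 1343.8, 1335.9, 1328.1),
}


def symbol_order(freqs):
    """Symbols sorted from lowest to highest assigned frequency."""
    return tuple(sym for _, sym in sorted(zip(freqs, SYMBOLS)))


def tone_spacings(freqs):
    """Differences between neighbouring tones of a cluster, rounded to 0.1 Hz."""
    ordered = sorted(freqs)
    return [round(b - a, 1) for a, b in zip(ordered, ordered[1:])]


orders = {name: symbol_order(f) for name, f in FIRST_CLUSTERS.items()}
```

Here `orders` holds the three symbol orders quoted above for f1, f2 and f3, and every spacing returned by `tone_spacings` lies just under 8 Hz.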
It will be appreciated that, in the alternative, either more or less than four separate data states or symbols may be employed for encoding. Moreover, each data state or symbol may be represented by more or less than ten code tones, and while it is preferable that the same number of tones be used to represent each of the data states, it is not essential in all applications that the number of code tones used to represent each data state be the same. Preferably, each of the code tones differs in frequency from all of the other code tones to maximize the probability of distinguishing each of the data states upon decoding. However, it is not essential in all applications that none of the code tone frequencies are shared by two or more data states.
FIG. 5 is a functional block diagram to which reference is made in explaining the encoding operation carried out by the embodiment of FIG. 3. As noted above, the DSP 104 receives data from the host processor 90 designating the sequence of data states to be output by the DSP 104 as respective groups of code frequency components. Advantageously, the DSP 104 generates a look-up table of time domain representations for each of the code frequency components f.sub.1 through f.sub.40 which it then stores in a RAM thereof, represented by the memory 110 of FIG. 5. In response to the data received from the host processor 90, the DSP 104 generates a respective address which it applies to an address input of the memory 110, as indicated at 112 in FIG. 5, to cause the memory 110 to output time domain data for each of the ten frequency components corresponding to the data state to be output at that time.
With reference also to FIG. 6, which is a functional block diagram for illustrating certain operations carried out by the DSP 104, the memory 110 stores a sequence of time-domain values for each of the frequency components of each of the symbols S, E, 0 and 1. In this particular embodiment, since the code frequency components range from approximately 2 kHz up to approximately 3 kHz, a sufficiently large number of time domain samples are stored in the memory 110 for each of the frequency components f.sub.1 through f.sub.40 so that they may be output at a rate higher than the Nyquist frequency of the highest frequency code component. The time domain code components are output at an appropriately high rate from the memory 110 which stores time-domain components for each of the code frequency components representing a predetermined duration so that (n) time-domain components are stored for each of the code frequency components f.sub.1 through f.sub.40 for (n) time intervals t.sub.1 through t.sub.n, as shown in FIG. 6. For example, if the symbol S is to be encoded during a given signal interval, during the first interval t.sub.1, the memory 110 outputs the time-domain components f.sub.1 through f.sub.10 corresponding to that interval, as stored in the memory 110. During the next interval, the time-domain components f.sub.1 through f.sub.10 for the interval t.sub.2 are output by the memory 110. This process continues sequentially for the intervals t.sub.3 through t.sub.n and back to t.sub.1 until the duration of the encoded symbol S has expired.
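The look-up table arrangement of FIG. 6 may be sketched as follows. This is a minimal illustration only: the use of sine waves with zero starting phase, the sample rate, and the function names build_tone_table and read_interval are assumptions made for the sketch and are not specified by the embodiment.

```python
import math


def build_tone_table(freqs_hz, sample_rate_hz, n_samples):
    """Return one row of unweighted time-domain samples per code frequency.

    Raises ValueError if the output rate is not above twice the highest
    code frequency, per the sampling requirement noted in the text.
    """
    if sample_rate_hz <= 2.0 * max(freqs_hz):
        raise ValueError("sample rate too low for the highest code component")
    return [
        [math.sin(2.0 * math.pi * f * k / sample_rate_hz) for k in range(n_samples)]
        for f in freqs_hz
    ]


def read_interval(table, component_rows, k):
    """Samples of the selected components at interval t_k, wrapping to t_1."""
    n = len(table[0])
    return [table[row][k % n] for row in component_rows]
```

In such a sketch the wrap from t.sub.n back to t.sub.1 is free of discontinuities only if the stored duration contains a whole number of cycles of every component, which is an added assumption about how (n) is chosen.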
In certain embodiments, instead of outputting all ten code components, e.g., f1 through f10, during a time interval, only those of the code components lying within the critical bandwidth of the tones of the audio signal are output. This is a generally conservative approach to ensuring inaudibility of the code components.
With reference again to FIG. 5, the DSP 104 also serves to adjust the amplitudes of the time-domain components output by the memory 110 so that, when the code frequency components are reproduced as sound, they will be masked by components of the audio signal in which they have been included such that they are inaudible to human hearing. Consequently, the DSP 104 is also supplied with the audio signal received at the input terminal 94 after appropriate filtering and analog-to-digital conversion. More specifically, the encoder of FIG. 3 includes an analog band pass filter 120 which serves to substantially remove audio signal frequency components outside of a band of interest for evaluating the masking ability of the received audio signal, which in the present embodiment extends from approximately 1.5 kHz to approximately 3.2 kHz. The filter 120 also serves to remove high frequency components of the audio signal which may cause aliasing when the signal is subsequently digitized by an analog-to-digital converter (A/D) 124 operating at a sufficiently high sampling rate.
As indicated in FIG. 3, the digitized audio signal is supplied by the A/D 124 to DSP 104 where, as indicated at 130 in FIG. 5, the program audio signal undergoes frequency range separation. In this particular embodiment, frequency range separation is carried out as a Fast Fourier Transform (FFT) which is performed periodically with or without temporal overlap to produce successive frequency bins each having a predetermined frequency width. Other techniques are available for segregating the frequency components of the audio signals, such as a wavelet transform, discrete Walsh Hadamard transform, discrete Hadamard transform, discrete cosine transform, as well as various digital filtering techniques.
Once the DSP 104 has separated the frequency components of the digitized audio signal into the successive frequency bins, as mentioned above, it then proceeds to evaluate the ability of various frequency components present in the audio signal to mask the various code components output by the memory 110 and to produce respective amplitude adjustment factors which serve to adjust the amplitudes of the various code frequency components such that they will be masked by the program audio when reproduced as sound so that they will be inaudible to human hearing. These processes are represented by the block 134 in FIG. 5.
For audio signal components that are substantially simultaneous with the code frequency components they are to mask (but which precede the code frequency components by a short period of time), the masking ability of the program audio components is evaluated on a tonal basis, as well as on a narrow band masking basis and on a broadband masking basis, as described below. For each code frequency component which is output at a given time by the memory 110, a tonal masking ability is evaluated for each of a plurality of audio signal frequency components based on the energy level in each of the respective bins in which these components fall as well as on the frequency relationship of each bin to the respective code frequency component. The evaluation in each case (tonal, narrow band and broadband) may take the form of an amplitude adjustment factor or other measure enabling a code component amplitude to be assigned so that the code component is masked by the audio signal. Alternatively, the evaluation may be a sliding tonal analysis.
In the case of narrow band masking, in this embodiment for each respective code frequency component the energy content of frequency components below a predetermined level within a predetermined frequency band including the respective code frequency component is evaluated to derive a separate masking ability evaluation. In certain implementations narrow band masking capability is measured based on the energy content of those audio signal frequency components below the average bin energy level within the predetermined frequency band. In this implementation, the energy levels of the components below the average bin energy (as a component threshold) are summed to produce a narrow band energy level in response to which a corresponding narrow band masking evaluation for the respective code component is identified. A different narrow band energy level may instead be produced by selecting a component threshold other than the average energy level. Moreover, in still other embodiments, the average energy level of all audio signal components within the predetermined frequency band instead is used as the narrow band energy level for assigning a narrow band masking evaluation to the respective code component. In still further embodiments, the total energy content of audio signal components within the predetermined frequency band instead is used, while in other embodiments a minimum component level within the predetermined frequency band is used for this purpose.
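The alternative measures of narrow band energy level described above may be compared in a short sketch. The function name narrow_band_energy and the method labels are hypothetical, and the input is assumed to be a non-empty list of bin energies for the predetermined frequency band.

```python
def narrow_band_energy(bin_energies, method="below_average"):
    """Narrow band energy level of one predetermined frequency band.

    bin_energies: energies of the FFT bins falling within the band
    (assumed non-empty). The method labels are illustrative names for
    the alternatives described in the text.
    """
    mean = sum(bin_energies) / len(bin_energies)
    if method == "below_average":
        # Sum only the components below the average bin energy threshold.
        return sum(e for e in bin_energies if e < mean)
    if method == "average":
        return mean
    if method == "total":
        return sum(bin_energies)
    if method == "minimum":
        return min(bin_energies)
    raise ValueError("unknown method: " + method)
```

For a band with bin energies 1, 2, 3 and 10, the average is 4, so the below-average measure sums 1, 2 and 3 and excludes the single strong component, which is treated as a tone rather than as noise.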
Finally, in certain implementations the broadband energy content of the audio signal is determined to evaluate the ability of the audio signal to mask the respective code frequency component on a broadband masking basis. In this embodiment, the broadband masking evaluation is based on the minimum narrow band energy level found in the course of the narrow band masking evaluations described above. That is, if four separate predetermined frequency bands have been investigated in the course of evaluating narrow band masking as described above, and broadband noise is taken to include the minimum narrow band energy level among all four predetermined frequency bands (however determined), then this minimum narrow band energy level is multiplied by a factor equal to the ratio of the range of frequencies spanned by all four narrow bands to the bandwidth of the predetermined frequency band having the minimum narrow band energy level. The resulting product indicates a permissible overall code power level. If the overall permissible code power level is designated P, and the code includes ten code components, each is then assigned an amplitude adjustment factor to yield a component power level which is 10 dB less than P. In the alternative, broadband noise is calculated for a predetermined, relatively wide band encompassing the code components by selecting one of the techniques discussed above for assessing the narrow band energy level but instead using the audio signal components throughout the predetermined, relatively wide band. Once the broadband noise has been determined in the selected manner, a corresponding broadband masking evaluation is assigned to each respective code component.
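The first broadband evaluation described above reduces to a few arithmetic steps, sketched here under stated assumptions. The range spanned by the narrow bands is taken to run from the lowest band edge to the highest band edge, and the function names are hypothetical.

```python
import math


def permissible_code_power(narrow_bands):
    """Overall permissible code power P from the narrow band results.

    narrow_bands: list of (low_hz, high_hz, narrow_band_energy) tuples.
    The minimum narrow band energy is scaled by the ratio of the total
    frequency span to the width of the band in which it was found.
    """
    low, high, energy = min(narrow_bands, key=lambda band: band[2])
    span = max(b[1] for b in narrow_bands) - min(b[0] for b in narrow_bands)
    return energy * span / (high - low)


def per_component_power(total_power, n_components=10):
    """Divide P equally; for ten components each is 10 dB below P."""
    return total_power / n_components


def db_below(total_power, component_power):
    """Level of a component relative to the total, in dB below the total."""
    return 10.0 * math.log10(total_power / component_power)
```

With four adjacent 100 Hz bands whose narrow band energies are 4, 2, 8 and 6, the minimum of 2 is multiplied by 400/100 to give P = 8, and each of ten components is assigned a power of 0.8.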
The amplitude adjust factor for each code frequency component is then selected based upon that one of the tonal, narrow band and broadband masking evaluations yielding the highest permissible level for the respective component. This maximizes the probability that each respective code frequency component will be distinguishable from non-audio signal noise while at the same time ensuring that the respective code frequency component will be masked so that it is inaudible to human hearing.
The amplitude adjust factors are selected for each of tonal, narrow band and broadband masking based on the following factors and circumstances. In the case of tonal masking, the factors are assigned on the basis of the frequencies of the audio signal components whose masking abilities are being evaluated and the frequency or frequencies of the code components to be masked. Moreover, a given audio signal over any selected interval provides the ability to mask a given code component within the same interval (i.e., simultaneous masking) at a maximum level greater than that at which the same audio signal over the selected interval is able to mask the same code component occurring before or after the selected interval (i.e., non-simultaneous masking). The conditions under which the encoded audio signal will be heard by an audience or other listening group, as appropriate, preferably are also taken into consideration. For example, if television audio is to be encoded, the distorting effects of a typical listening environment are preferably taken into consideration, since in such environments certain frequencies are attenuated more than others. Receiving and reproduction equipment (such as graphic equalizers) can cause similar effects. Environmental and equipment related effects can be compensated by selecting sufficiently low amplitude adjust factors to ensure masking under anticipated conditions.
In certain embodiments only one of the tonal, narrow band and broadband masking capabilities is evaluated. In other embodiments two of such different types of masking capabilities are evaluated, and in still others all three are employed.
In certain embodiments, a sliding tonal analysis is employed to evaluate the masking capability of the audio signal. A sliding tonal analysis generally satisfies the masking rules for narrow band noise, broadband noise and single tones without requiring audio signal classification. In the sliding tonal analysis, the audio signal is regarded as a set of discrete tones, each being centered in a respective FFT frequency bin. Generally, the sliding tonal analysis first computes the power of the audio signal in each FFT bin. Then, for each code tone, the masking effects of the discrete tones of the audio signal in each FFT bin separated in frequency from such code tone by no more than the critical bandwidth of the audio tone are evaluated based on the audio signal power in each such bin using the masking relationships for single tone masking. The masking effects of all of the relevant discrete tones of the audio signal are summed for each code tone, then adjusted for the number of tones within the critical bandwidth of the audio signal tones and the complexity of the audio signal. As explained below, in certain embodiments, the complexity of the program material is empirically based on the ratio of the power in the relevant tones of the audio signal and the root sum of squares power in such audio signal tones. The complexity serves to account for the fact that narrow band noise and broadband noise each provide much better masking effects than are obtained from a simple summation of the tones used to model narrow band and broadband noise.
In certain embodiments which employ a sliding tonal analysis, a predetermined number of samples of the audio signal first undergo a large FFT, which provides high resolution but requires longer processing time. Then, successive portions of the predetermined number of samples undergo a relatively smaller FFT, which is faster but provides less resolution. The amplitude factors found from the large FFT are merged with those found from the smaller FFTs, which generally corresponds to time weighting the higher "frequency accuracy" large FFT by the higher "time accuracy" of the smaller FFT.
In the embodiment of FIG. 5, once an appropriate amplitude adjust factor has been selected for each of the code frequency components output by the memory 110, the DSP 104 adjusts the amplitude of each code frequency component accordingly, as indicated by the functional block "amplitude adjust" 114. In other embodiments, each code frequency component is initially generated so that its amplitude conforms to its respective adjust factor. With reference also to FIG. 6, the amplitude adjust operation of the DSP 104 in this embodiment multiplies the ten selected ones of the time domain code frequency components values f.sub.1 through f.sub.40 for the current time interval t.sub.1 through t.sub.n by a respective amplitude adjust factor G.sub.A1 through G.sub.A10 and then the DSP 104 proceeds to add the amplitude adjusted time domain components to produce a composite code signal which it supplies at its output 106. With reference to FIGS. 3 and 5, the composite code signal is converted to analog form by a digital-to-analog converter (DAC) 140 and supplied thereby to a first input of a summing circuit 142. The summing circuit 142 receives the audio signal from the input terminal 94 at a second input and adds the composite analog code signal to the analog audio signal to supply an encoded audio signal at an output 146 thereof.
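The amplitude adjust and summing operations may be sketched for a single time interval as follows. The function names are hypothetical. In the embodiment described the addition of the code signal to the audio signal is performed in analog form by the summing circuit 142, so the second function is only a digital stand-in for that step.

```python
def composite_code_sample(component_values, adjust_factors):
    """Weight each time-domain code component value and sum the results.

    component_values: the values of the selected components for the
    current interval; adjust_factors: G_A1 .. G_A10 in the same order.
    """
    if len(component_values) != len(adjust_factors):
        raise ValueError("one adjust factor is required per component")
    return sum(g * v for g, v in zip(adjust_factors, component_values))


def encoded_sample(audio_sample, component_values, adjust_factors):
    """Digital stand-in for the analog summing circuit 142."""
    return audio_sample + composite_code_sample(component_values, adjust_factors)
```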
In radio broadcasting applications, the encoded audio signal modulates a carrier wave and is broadcast over the air. In NTSC television broadcasting applications, the encoded audio signal frequency modulates a subcarrier and is mixed with a composite video signal so that the combined signal is used to modulate a broadcast carrier for over-the-air broadcast. The radio and television signals, of course, may also be transmitted by cable (for example, conventional or fiber optic cable), satellite or otherwise. In other applications, the encoded audio can be recorded either for distribution in recorded form or for subsequent broadcast or other wide dissemination. Encoded audio may also be employed in point-to-point transmissions. Various other applications, and transmission and recording techniques will be apparent.
FIGS. 7A through 7C provide flow charts for illustrating a software routine carried out by the DSP 104 for implementing the evaluation of tonal, narrow band and broadband masking functions thereof described above. FIG. 7A illustrates a main loop of the software program of the DSP 104. The program is initiated by a command from the host processor 90 (step 150), whereupon the DSP 104 initializes its hardware registers (step 152) and then proceeds in step 154 to compute unweighted time domain code component data as illustrated in FIG. 6 which it then stores in memory to be read out as needed to generate the time domain code components, as mentioned hereinabove. In the alternative, this step may be omitted if the code components are stored permanently in a ROM or other nonvolatile storage. It is also possible to calculate the code component data when required, although this adds to the processing load. Another alternative is to produce unweighted code components in analog form and then adjust the amplitudes of the analog components by means of weighting factors produced by a digital processor.
Once the time domain data has been computed and stored, in step 156 the DSP 104 communicates a request to the host processor 90 for a next message to be encoded. The message is a string of characters, integers, or other set of data symbols uniquely identifying the code component groups to be output by the DSP 104 in an order which is predetermined by the message. In other embodiments, the host, knowing the output data rate of the DSP, determines on its own when to supply a next message to the DSP by setting an appropriate timer and supplying the message upon a time-out condition. In a further alternative embodiment, a decoder is coupled with the output of the DSP 104 to receive the output code components in order to decode the same and feed back the message to the host processor as output by the DSP so that the host can determine when to supply a further message to the DSP 104. In still other embodiments, the functions of the host processor 90 and the DSP 104 are carried out by a single processor.
Once the next message has been received from the host processor, pursuant to step 156, the DSP proceeds to generate the code components for each symbol of the message in order and to supply the combined, weighted code frequency components at its output 106. This process is represented by a loop identified by the tag 160 in FIG. 7A.
Upon entering the loop symbolized by the tag 160, the DSP 104 enables timer interrupts 1 and 2 and then enters a "compute weighting factors" subroutine 162 which will be described in connection with the flow charts of FIGS. 7B and 7C. With reference first to FIG. 7B, upon entering the compute weighting factors subroutine 162 the DSP first determines whether a sufficient number of audio signal samples have been stored to permit a high-resolution FFT to be carried out in order to analyze the spectral content of the audio signal during a most recent predetermined audio signal interval, as indicated by step 163. Upon start up, a sufficient number of audio signal samples must first be accumulated to carry out the FFT. However, if an overlapping FFT is employed, during subsequent passes through the loop correspondingly fewer data samples need be stored before the next FFT is carried out.
As will be seen from FIG. 7B, the DSP remains in a tight loop at the step 163 awaiting the necessary sample accumulation. Upon each timer interrupt 1, the A/D 124 provides a new digitized sample of the program audio signal which is accumulated in a data buffer of the DSP 104, as indicated by the subroutine 164 in FIG. 7A.
Returning to FIG. 7B, once a sufficiently large number of sample data have been accumulated by the DSP, processing continues in a step 168 wherein the above-mentioned high resolution FFT is carried out on the audio signal data samples of the most recent audio signal interval. Thereafter, as indicated by a tag 170, a respective weighting factor or amplitude adjust factor is computed for each of the ten code frequency components in the symbol currently being encoded. In a step 172, that one of the frequency bins produced by the high resolution FFT (step 168) which provides the ability to mask the highest level of the respective code component on a single tone basis (the "dominant tonal") is determined in the manner discussed above.
With reference also to FIG. 7C, in a step 176, the weighting factor for the dominant tonal is determined and retained for comparison with relative masking abilities provided by narrow band and broadband masking and, if found to be the most effective masker, is used as the weighting factor for setting the amplitude of the current code frequency component. In a subsequent step 180, an evaluation of narrow band and broadband masking capabilities is carried out in the manner described above. Thereafter, in a step 182, it is determined whether narrow band masking provides the best ability to mask the respective code component and if so, in a step 184, the weighting factor is updated based on narrow band masking. In a subsequent step 186, it is determined whether broadband masking provides the best ability to mask the respective code frequency component and, if so, in a step 190, the weighting factor for the respective code frequency component is adjusted based on broadband masking. Then, in step 192 it is determined whether weighting factors have been selected for each of the code frequency components to be output presently to represent the current symbol and, if not, the loop is re-initiated to select a weighting factor for the next code frequency component. If, however, the weighting factors for all components have been selected, then the subroutine is terminated as indicated in step 194.
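The comparison carried out in steps 176 through 190 may be sketched as follows, on the assumption that each of the three evaluations has already been reduced to a weighting factor for which a larger value means that a higher code component level is masked. The function names are hypothetical.

```python
def select_weighting_factor(tonal, narrow_band, broadband):
    """Keep whichever evaluation permits the highest code component level."""
    factor = tonal                 # step 176: start from the dominant tonal
    if narrow_band > factor:       # steps 182 and 184
        factor = narrow_band
    if broadband > factor:         # steps 186 and 190
        factor = broadband
    return factor


def weighting_factors_for_symbol(evaluations):
    """Repeat the selection for every code component of the symbol (step 192).

    evaluations: list of (tonal, narrow_band, broadband) tuples, one per
    code frequency component of the current symbol.
    """
    return [select_weighting_factor(t, n, b) for t, n, b in evaluations]
```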
Upon the occurrence of timer interrupt 2, processing continues to a subroutine 200 wherein the functions illustrated in FIG. 6 above are carried out. That is, in the subroutine 200 the weighting factors calculated during the subroutine 162 are used to multiply the respective time domain values of the current symbol to be output and then the weighted time domain code component values are added and output as a weighted, composite code signal to the DAC 140. Each code symbol is output for a predetermined period of time upon the expiration of which processing returns to the step 156 from the step 202.
FIGS. 7D and 7E show flowcharts illustrating an implementation of the sliding tonal analysis technique for evaluating the masking effects of an audio signal. At step 702, variables are initialized such as the size in samples of a large FFT and a smaller FFT, the number of smaller FFTs per large FFT and the number of code tones per symbol, for example, 2048, 256, 8 and 10, respectively.
At steps 704-708, a number of samples corresponding to a large FFT is analyzed. At step 704, audio signal samples are obtained. At step 706, the power of the program material in each FFT bin is obtained. At step 708, the permissible code tone power in each corresponding FFT bin, accounting for the effects of all of the relevant audio signal tones on that bin, is obtained, for each of the tones. The flowchart of FIG. 7E shows step 708 in more detail.
At steps 710-712, a number of samples corresponding to a smaller FFT is analyzed, in similar fashion to steps 706-708 for a large FFT. At step 714, the permissible code powers found from the large FFT in step 708 and the smaller FFT in step 712 are merged for the portion of the samples which have undergone a smaller FFT. At step 716, the code tones are mixed with the audio signal to form encoded audio, and at step 718, the encoded audio is output to DAC 140. At step 720, it is decided whether to repeat steps 710-718, that is, whether there are portions of audio signal samples which have undergone a large FFT but not a smaller FFT. Then, at step 722, if there are any more audio samples, a next number of samples corresponding to a large FFT is analyzed.
FIG. 7E provides detail for steps 708 and 712, computing the permissible code power in each FFT bin. Generally, this procedure models the audio signal as comprising a set of tones (see examples below), computes the masking effect of each audio signal tone on each code tone, sums the masking effects and adjusts for the density of code tones and complexity of the audio signal.
At step 752, the band of interest is determined. For example, let the bandwidth used for encoding be 800 Hz-3200 Hz, and the sampling frequency be 44100 samples/sec. The starting bin begins at 800 Hz, and the ending bin ends at 3200 Hz.
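For the example values given, the starting and ending bins may be found as sketched below. The sketch assumes that bin k covers the frequencies from k times the bin width up to, but not including, (k+1) times the bin width; the text does not state the bin edge convention. The FFT size of 2048 is the large FFT size given as an example at step 702, and the function name is hypothetical.

```python
import math


def band_of_interest_bins(low_hz, high_hz, sample_rate_hz, fft_size):
    """Return (start_bin, end_bin, bin_width_hz) for the encoding band.

    Assumes bin k spans [k * width, (k + 1) * width), which is an
    illustrative convention only.
    """
    bin_width = sample_rate_hz / fft_size
    start_bin = math.floor(low_hz / bin_width)    # bin containing low_hz
    end_bin = math.ceil(high_hz / bin_width) - 1  # bin containing high_hz
    return start_bin, end_bin, bin_width
```

With a sampling frequency of 44100 samples/sec and a 2048 point FFT the bin width is about 21.5 Hz, and under the stated convention the 800 Hz to 3200 Hz band runs from bin 37 to bin 148.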
At step 754, the masking effect of each relevant audio signal tone on each code in this bin is determined using the masking curve for a single tone, and compensating for the non-zero audio signal FFT bin width by determining (1) a first masking value based on the assumption that all of the audio signal power is at the upper end of the bin, and (2) a second masking value based on the assumption that all of the audio signal power is at the lower end of the bin, and then choosing that one of the first and second masking values which is smaller.
FIG. 7F shows an approximation of a single tone masking curve for an audio signal tone at a frequency of fPGM which is about 2200 Hz in this example, following Zwislocki, J. J., "Masking: Experimental and Theoretical Aspects of Simultaneous, Forward, Backward and Central Masking", 1978, in Zwicker et al., ed., Psychoacoustics: Facts and Models, pages 283-316, Springer-Verlag, New York. The width of the critical band (CB) is defined by Zwislocki as:
critical band=0.002*f.sub.PGM.sup.1.5 +100
With the following definitions, and letting "masker" be the audio signal tone,
______________________________________
BRKPOINT = 0.3        / +/- 0.3 critical bands /
PEAKFAC  = 0.025119   / -16 dB from masker /
BEATFAC  = 0.002512   / -26 dB from masker /
mNEG     = -2.40      / -24 dB per critical band /
mPOS     = -0.70      / -7 dB per critical band /
cf       = code frequency
mf       = masker frequency
cband    = critical band around f.sub.PGM
______________________________________
then the masking factor, mfactor, can be computed as follows:
brkpt=cband*BRKPOINT
if on negative slope of curve of FIG. 7F,
mfactor=PEAKFAC*10**(mNEG*(mf-brkpt-cf)/cband)
if on flat part of curve of FIG. 7F,
mfactor=BEATFAC
if on positive slope of curve of FIG. 7F,
mfactor=PEAKFAC*10**(mPOS*(cf-brkpt-mf)/cband)
Specifically, a first mfactor is computed based on the assumption that all of the audio signal power is at the lower end of its bin, then a second mfactor is computed assuming that all of the audio signal power is at the upper end of its bin, and the smaller of the first and second mfactors is chosen as the masking value provided by that audio signal tone for the selected code tone. At step 754, this processing is performed for each relevant audio signal tone for each code tone.
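The masking factor computation of step 754 may be sketched in Python as follows. The constants and the critical band expression are those given above. The sketch assumes that the "negative slope" applies when the code frequency lies more than brkpt below the masker and the "positive slope" when it lies more than brkpt above, and it places the parentheses of the exponent so that the frequency offset is divided by the critical band; both are interpretations rather than statements of the text. The function names are hypothetical.

```python
BRKPOINT = 0.3       # +/- 0.3 critical bands
PEAKFAC = 0.025119   # -16 dB from masker
BEATFAC = 0.002512   # -26 dB from masker
mNEG = -2.40         # -24 dB per critical band
mPOS = -0.70         # -7 dB per critical band


def critical_band(f_pgm):
    """Critical band width in Hz around a masker at f_pgm Hz."""
    return 0.002 * f_pgm ** 1.5 + 100.0


def mfactor(cf, mf):
    """Masking factor of a masker tone at mf Hz for a code tone at cf Hz."""
    cband = critical_band(mf)
    brkpt = cband * BRKPOINT
    if cf < mf - brkpt:   # assumed: code tone below the flat region
        return PEAKFAC * 10.0 ** (mNEG * (mf - brkpt - cf) / cband)
    if cf > mf + brkpt:   # assumed: code tone above the flat region
        return PEAKFAC * 10.0 ** (mPOS * (cf - brkpt - mf) / cband)
    return BEATFAC        # flat part of the curve


def bin_mfactor(cf, bin_low_hz, bin_high_hz):
    """Smaller of the factors found with the masker at either bin edge."""
    return min(mfactor(cf, bin_low_hz), mfactor(cf, bin_high_hz))
```

For a masker at 2200 Hz the critical band evaluates to about 306 Hz, so the flat region extends roughly 92 Hz to either side of the masker.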
At step 756, each code tone is adjusted by each of the masking factors corresponding to the audio signal tones. In this embodiment, the masking factor is multiplied by the audio signal power in the relevant bin.
At step 758, the result of multiplying the masking factors by the audio signal power is summed for each bin, to provide an allowable power for each code tone.
At step 760, the allowable code tone powers are adjusted for the number of code tones within a critical bandwidth on either side of the code tone being evaluated, and for the complexity of the audio signal. The number of code tones within the critical band, CTSUM, is counted. The adjustment factor, ADJFAC, is given by:
ADJFAC=GLOBAL*(PSUM/PRSS).sup.1.5 /CTSUM
where GLOBAL is a derating factor accounting for encoder inaccuracy due to time delays in FFT performance, (PSUM/PRSS).sup.1.5 is an empirical complexity correction factor, and 1/CTSUM represents simply dividing the audio signal power over all the code tones it is to mask. PSUM is the sum of the masking tone power levels assigned to the masking of the code tone whose ADJFAC is being determined. The root sum of squares power (PRSS) is given by
PRSS=SQRT(P.sub.1.sup.2 +P.sub.2.sup.2 + . . . +P.sub.N.sup.2)
where P.sub.1 through P.sub.N are the power levels of the N masking tones summed in PSUM. For example, assuming a total masking tone power of 10 in a band equally spread among one, two and three tones, then
______________________________________
no. tones  tone power       PSUM            PRSS
______________________________________
1          10               1 * 10 = 10     10
2          5, 5             2 * 5 = 10      SQRT (2*5.sup.2) = 7.07
3          3.3, 3.3, 3.3    3 * 3.3 = 10    SQRT (3*3.3.sup.2) = 5.77
______________________________________
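The tabulated values and the adjustment factor of step 760 may be reproduced with the following sketch. PRSS is computed as the square root of the sum of the squared tone powers, consistent with the SQRT entries of the table. No value of GLOBAL is given in the text, so it appears here as a parameter whose default of 1.0 is purely a placeholder; the function names are hypothetical.

```python
import math


def psum(tone_powers):
    """Sum of the masking tone power levels."""
    return sum(tone_powers)


def prss(tone_powers):
    """Root sum of squares of the masking tone power levels."""
    return math.sqrt(sum(p * p for p in tone_powers))


def adjfac(tone_powers, ctsum, global_derate=1.0):
    """ADJFAC = GLOBAL * (PSUM / PRSS) ** 1.5 / CTSUM.

    global_derate stands in for GLOBAL; 1.0 is a placeholder value only.
    """
    return global_derate * (psum(tone_powers) / prss(tone_powers)) ** 1.5 / ctsum
```

With the same total power of 10, the ratio PSUM/PRSS is 1 for a single tone, about 1.41 for two equal tones and about 1.73 for three, so the complexity correction grows as the masking power is spread over more tones.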
Thus, PRSS measures masking power peakiness (increasin |