Understanding Dynamic Range Compression and Expansion in Audio Processing - Prof. Jennifer, Study notes of Database Programming

These notes cover the concepts of dynamic range compression and expansion in audio processing. They discuss the different stages where dynamics processing can be applied, the effects of normalization and compression, and the importance of attack and release times. The text also touches on the difference between possible dynamic range, actual dynamic range, and perceived loudness of a piece. The material is aimed at students and professionals in audio engineering, music production, or related fields.

Typology: Study notes

Pre 2010

Uploaded on 08/17/2009

koofers-user-9ut-1

Excerpt from Chapter 5 of The Science of Digital Media by Jennifer Burg

5.2 Dynamics Processing

Dynamics processing is the process of adjusting the dynamic range of an audio selection, either to reduce or to increase the difference between the loudest and softest passages. An increase in amplitude is called gain or boost. A decrease in amplitude is called attenuation or, informally, a cut.

Dynamics processing can be done at different stages while audio is being prepared, and by a variety of methods. The maximum amplitude can be limited with a hardware device during initial recording, gain can be adjusted manually in real time with analog dials, and hardware compressors and expanders can be applied after recording. In music production, vocals and instruments can be recorded at different times, each on its own track, and each track can be adjusted dynamically in real time or after recording. When the tracks are mixed down to a single track, the dynamics of the mix can be adjusted again. Finally, in the mastering process, dynamic range can be adjusted for the purpose of including multiple tracks on a CD and giving the tracks a consistent sound. In summary, audio can be manipulated through hardware or software; the hardware can be analog or digital; the audio can be processed in segments or holistically; and processing can happen in real time or after recording.

The information in this section is based on digital dynamics processing tools: hard limiting, normalization, compression, and expansion. These tools alter the amplitude of an audio signal and therefore change its dynamics, that is, the difference between the softest and the loudest part of the signal. Limiting sets a maximum amplitude.
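
As a rough illustration of hard limiting, the following Python sketch clamps samples to a ceiling. It assumes samples are floats in the range −1.0 to 1.0 (1.0 being full scale); the names db_to_linear and hard_limit and the −6 dB ceiling are illustrative choices, not taken from the text or from any particular audio tool.

```python
def db_to_linear(db):
    """Convert a level in dBFS to a linear amplitude (1.0 = full scale)."""
    return 10 ** (db / 20)


def hard_limit(samples, ceiling_db=-3.0):
    """Clamp each sample so its magnitude never exceeds the ceiling.

    Assumption: samples are floats in [-1.0, 1.0].
    """
    ceiling = db_to_linear(ceiling_db)
    return [max(-ceiling, min(ceiling, s)) for s in samples]


# Hypothetical signal: two samples exceed the -6 dBFS ceiling (about 0.501).
signal = [0.1, -0.5, 0.9, -1.0, 0.3]
limited = hard_limit(signal, ceiling_db=-6.0)
print(limited)
```

Samples already below the ceiling pass through unchanged; only the peaks are cut, which is why limiting alone reduces the distance between the loudest and softest parts.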
Normalization finds the maximum amplitude sample in the signal, boosts it to the maximum possible amplitude (or an amplitude chosen by the user), and boosts all other amplitudes proportionately. Dynamic compression decreases the dynamic range of a selection. (This type of compression has nothing to do with file size.) Dynamic expansion increases it.

The purpose of adjusting dynamic range is to improve the texture or balance of sound. The texture of music arises in part from its differing amplitude levels. Instruments and voices have their characteristic amplitude or dynamic range. The difference between peak level amplitude and average amplitude of the human voice, for example, is about 10 dB. In a musical composition, instruments and voices can vary in amplitude over time: a flute is played softly in the background, vocals emerge at medium amplitude, and a drum is suddenly struck at high amplitude. Classical music typically has a wide dynamic range. Sections of low amplitude are contrasted with impressive high amplitude sections full of instruments and percussion. You probably are familiar with Beethoven’s Fifth Symphony. Think of the contrast between the first eight notes and what follows: BUM BUM BUM BAH! BUM BUM BUM BAH! Then softer… In contrast, “elevator music” or “Muzak” is intentionally produced with a small dynamic range. Its purpose is to lie in the background, pleasantly but almost imperceptibly.

Aside: You should be careful to distinguish among the following: possible dynamic range as a function of bit depth in digital audio; actual dynamic range of a particular piece of audio; and perceived loudness of a piece. The possible dynamic range for a piece of digital audio is determined by the bit depth in which that piece is encoded. In Chapters 1 and 4, we derived a formula that tells us that the possible dynamic range is equal to approximately 6 * n dB where n is the number of bits per sample. For CD quality audio, which uses 16-bit samples, this would be 96 dB. However, a given piece of music doesn't necessarily use that full possible dynamic range. The dynamic range of a piece is the difference between its highest amplitude and lowest amplitude sample. The overall perceived loudness of a piece, which is a subjective measurement, is related to the average RMS of the piece. The higher the average RMS, the louder a piece seems to the human ear. (RMS, root-mean-square, is explained in Chapter 4.)

Musicians and music editors have words to describe the character of different pieces that arise from their variance in dynamic range. A piece can sound “punchy,” “wimpy,” “smooth,” “bouncy,” “hot,” or “crunchy,” for example. Audio engineers train their ears to hear subtle nuances in sound and to use their dynamics processing tools to create the effects they want.

Deciding when and how much to compress or expand dynamic range is as much art as science. Compressing the dynamic range is desirable for some types of sound and listening environments and not for others. It’s generally a good thing to compress the dynamic range of music intended for radio. You can understand why if you think about the way radio sounds in a car, which is where radio music is often heard. With the background noise of your tires humming on the highway, you don’t want music that has big differences between the loudest and softest parts. Otherwise, the soft parts will be drowned out by the background noise. For this reason, radio music is dynamically compressed, and then the amplitude is raised overall. The result is that the sound has a higher average RMS, and overall it is perceived to be louder.

There’s a price to be paid for dynamic compression. Some sounds, like percussion instruments or the beginning notes of vocal music, have a fast attack time. The attack time of a sound is the time it takes for the sound to change amplitude.
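
The quantities from the aside and the normalization step described earlier can be checked numerically. The sketch below is only illustrative: it assumes float samples in the range −1.0 to 1.0, and the function names and the sample values in quiet are hypothetical. It computes the exact value behind the "approximately 6 * n dB" rule, an RMS level, and a peak normalization.

```python
import math


def possible_dynamic_range_db(bits):
    """Exact 20*log10(2**bits); about 6.02 dB per bit."""
    return 20 * math.log10(2 ** bits)


def rms(samples):
    """Root-mean-square amplitude of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def normalize(samples, target_peak=1.0):
    """Scale all samples by one gain so the largest magnitude hits target_peak."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # silence: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


print(round(possible_dynamic_range_db(16), 2))  # roughly 96 dB for 16-bit audio

quiet = [0.1, -0.25, 0.5, -0.05]   # hypothetical samples, peak at 0.5
louder = normalize(quiet)          # every sample is multiplied by 2.0
rms_change_db = 20 * math.log10(rms(louder) / rms(quiet))
print(round(rms_change_db, 2))     # RMS rises by the same gain as the peak
```

Because normalization applies a single gain to every sample, the peak and the RMS level rise by the same number of decibels; the piece gets louder, but the difference between its loud and soft parts is unchanged.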
With a fast attack time, the sound reaches high amplitude in a sudden burst, and then it may drop off quickly. Fast-attack percussion sounds like drums or cymbals are called transients. Increasing the perceived loudness of a piece by compressing the dynamic range and then increasing the overall amplitude can leave little headroom, that is, room for transients to stand out with higher amplitude. The entire piece of music may sound louder, but it can lose much of its texture and musicality. Transients give brightness or punchiness to sound, and suppressing them too much can make music sound dull and flat. Allowing the transients to be sufficiently loud without compromising the overall perceived loudness and dynamic range of a piece is one of the challenges of dynamics processing.

While dynamic compression is more common than expansion, expansion has its uses also. Expansion allows more of the potential dynamic range (the range made possible by the bit depth of the audio file) to be used. This can brighten a music selection.

Using downward expansion, it’s possible to lower the amplitude of signals below the point where they can be heard. The point below which a digital audio signal is no longer audible is called the noise floor. Say that your audio processing software represents amplitude in dBFS (decibels full scale), where the maximum amplitude of a sample is 0 and the minimum possible amplitude, a function of bit depth, is somewhere between 0 and −∞. For 16-bit audio, the minimum possible amplitude is approximately −96 dBFS. Ideally, this is the noise floor, but in most recording situations there is a certain amount of low amplitude background noise that masks low amplitude sounds.

[...] shows the traditional view. Compression above the threshold is typically represented as a ratio a:b.
If you indicate that you want a compression ratio of a:b, then you’re saying that, above the threshold, for each a decibels that the signal increases in amplitude, you want it to increase only by b decibels. For example, if you specify a dynamic range compression ratio of 2:1 above the threshold, then if the amplitude rises by 1 dB from one sample to the next, it will actually go up (after compression) by only 0.5 dB. Notice that, beginning at an input of −40 dB and continuing to the end, the slope of the line is b/a = 1/2.

Figure 5.5 Graph of transfer function for downward compression (from Audition)

Figure 5.6 Downward compression, traditional view (from Audition)

Often, a gain makeup is applied after downward compression. You can see in Figure 5.6 that there is a place to set Output Gain. The Output Gain is set to 0 in the figure. If you set the output gain to a value g dB greater than 0, this means that after the audio selection is compressed, the amplitudes of all samples are increased by g dB. Gain makeup can also be done by means of normalization, as described above. The result is to increase the perceived loudness of the entire piece. However, if the dynamic range has been decreased, the perceived difference between the loud and soft parts is reduced.

Figure 5.7 Upward compression by 2:1 below −30 dB (from Audition)

Upward compression is accomplished by indicating that you want compression of sample values that are below a certain threshold, or decibel limit. For example, Figure 5.7 shows how you indicate that you want samples that are below −30 dB to be compressed by a ratio of 2:1. If you look at the graph, you can see that this means that sample values will get larger. For example, a sample value of −80 dB becomes −54 dB after compression. This may seem counterintuitive at first, since you may think of compressing something as making it smaller. But remember that it is the dynamic range, not the sample values themselves, that you are compressing.
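
The ratio, threshold, and output gain described above can be sketched as a static transfer curve that maps an input level in dB to an output level in dB. This is a simplified, idealized piecewise-linear model, not Audition's implementation: the function name compress_db and its parameters are hypothetical, and attack and release behavior is ignored.

```python
def compress_db(level_db, threshold_db, ratio, direction="down", makeup_db=0.0):
    """Idealized static compression curve, working entirely in dB.

    direction="down": levels above the threshold rise only 1/ratio dB
                      for each 1 dB of input (downward compression).
    direction="up":   levels below the threshold are pulled up toward it
                      with the same slope (upward compression).
    makeup_db is added to every output level (the "output gain").
    """
    distance = level_db - threshold_db
    if direction == "down" and distance > 0:
        out = threshold_db + distance / ratio
    elif direction == "up" and distance < 0:
        out = threshold_db + distance / ratio
    else:
        out = level_db  # outside the compressed region: unchanged
    return out + makeup_db


# 2:1 downward compression above -40 dB: 1 dB more input gives 0.5 dB more output.
print(compress_db(-20, -40, 2), compress_db(-19, -40, 2))

# 2:1 upward compression below -30 dB raises a quiet -80 dB level.
print(compress_db(-80, -30, 2, direction="up"))
```

With these assumptions, the −80 dB input comes out at −55 dB, near but not identical to the −54 dB the text quotes from Figure 5.7; the sketch does not attempt to reproduce the exact curve drawn in the figure.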
If you want to compress the dynamic range by changing values that are below a certain threshold, then you have to make them larger, moving them toward the higher amplitude values at the top. This is what is meant by upward compression.

With some tools, it’s possible to achieve both downward and upward compression with one operation. Figure 5.8 shows the graph for downward compression above −20 dB, no compression between −20 and −60 dB, and upward compression below −60 dB. To this, an output gain of 4 dB is added. An audio file before and after such dynamics processing is shown in Figure 5.9. The dynamic range has been reduced by both downward and upward compression. Sometimes, normalization is used after dynamic range compression. If we downward and upward compress the same audio file and follow this with normalization, we get the audio file pictured in Figure 5.10.

Figure 5.8 Downward and upward compression (from Audition)

Figure 5.9 Audio file before and after dynamics processing (a. Uncompressed audio; b. Compressed audio)

Figure 5.10 Downward and upward compression followed by normalization

It is also possible to compress the dynamic range at both ends, by making high amplitudes lower and low amplitudes higher. Following is an example of compressing the dynamic range by “squashing” at both the low and high amplitudes. The compression is performed on an audio file that has three single-frequency tones at 440 Hz. The amplitude of the first is −5 dB, the second is −3 dB, and the third is −12 dB. Values above −4 dB are made smaller (downward compression). Values below −10 dB are made larger (upward compression). The settings are given in Figure 5.12. The audio file