After writing this blogpost and viewing the videos, it became clear that the resolution of the screen recordings is too low to read the exact frequency numbers. The grid of the spectrum view has vertical line markers every 500 Hz, which should give the viewer an idea of where the frequency peaks are. I will edit later to add the information that cannot be seen immediately.
The amount of information available on vocal acoustics is exhaustive and exhausting. To understand the really profound stuff, one should have at least a strong physics background. However, some basic mathematics is enough to understand the stuff that makes a difference to how we singers think.
The first thing to understand about a tone is that it is much more than what it is named. Take the tone A4, so called because it lies in the fourth octave of standard pitch notation (it is the fifth A from the bottom of the standard 88-key piano). Its frequency is 440 Hz (Hz = Hertz, or oscillations per second, the unit of frequency). It is the pitch orchestras tune by, and it lies in the soprano middle range and the tenor high range. When we sing or play A4, we are singing that tone (called the fundamental [F0] or the first harmonic [H1]) plus all its overtones, also called natural harmonics (H2 for the second harmonic, H3 for the third, etc.). The harmonics are whole-number multiples of the fundamental. If A4 (440 Hz) is the fundamental (H1), then H2 is 880 Hz, twice the frequency, and H3 is 1320 Hz, three times the frequency of the fundamental.
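The arithmetic above can be sketched in a few lines of Python. This is only an illustration of "harmonics are multiples of the fundamental"; the function name `harmonic_series` is my own and not part of any audio library.

```python
def harmonic_series(f0_hz, count):
    """Return the first `count` harmonics (H1, H2, ...) of a fundamental in Hz."""
    # H1 is the fundamental itself; Hn is simply n times the fundamental.
    return [n * f0_hz for n in range(1, count + 1)]


# A4 = 440 Hz, the example used in the text.
for n, freq in enumerate(harmonic_series(440.0, 5), start=1):
    print(f"H{n}: {freq:.0f} Hz")
```

Changing the first argument to any other fundamental gives the harmonics for that pitch; the spacing between neighboring harmonics always equals the fundamental.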
At the end of the video, I freeze the spectrum view and point the cursor to each harmonic: the fundamental (F0, also called H1 for first harmonic) is about 1000 Hz, H2 is about 2000 Hz, H3 is about 3000 Hz and H4 is about 4000 Hz. The frozen display clearly shows the harmonics decreasing in intensity as they get higher in frequency. This is standard for a tube like the Irish flute but not necessarily so for a piano.
The lower half of the acoustic piano has keys that activate a felt-covered hammer to strike three strings of equal length and thickness. Even though highly calibrated, the strings are not always struck with equal intensity. Furthermore, the shape of the piano's soundboard (its resonator) favors certain frequencies over others. It is a complex instrument in that way. Different pianos playing the same note may produce slightly different acoustic patterns. The rate of decay of the piano's sound also affects how the acoustic pattern appears, depending on when the display is frozen. An early freezing of the display (i.e. right after the note is struck) gives us a pattern closer to the acoustic nature of the piano. The higher notes of a piano, however, are very thin and are produced with two strings instead of three. The pattern there resembles that of the Irish flute a bit. That is because the soundboard (resonator) of a piano favors the lower notes. Like a violin (most instruments, in fact), the piano has a fixed resonator that acts on different notes with different intensity.
All pitched instruments (instruments that produce a regular oscillation of sound), regardless of make-up and including pitched percussion, will produce the same harmonics on a given note. What gives each instrument its unique color (timbre) can be observed in the relative strength of the harmonics. Each instrument has a different pattern of harmonic strengths for a given note. In fact, different notes on the same instrument may produce different relative strengths in the harmonics. In a straight tube, like the Irish flute in the video above, the harmonics tend to decrease in strength the higher they are. However, depending on the shape of the instrument's resonator, certain acoustic regions will be stronger than others, and so certain higher harmonics will be stronger, reflecting the specific nature of the instrument's make-up. Because its acoustic make-up is fixed, an instrument is expected to produce very predictable resonance.
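To make "same harmonics, different relative strengths" concrete, here is a small sketch that compares two amplitude patterns on the same note. The amplitude numbers are hypothetical, chosen only for illustration and not measured from any instrument, and `harmonic_levels_db` is a helper name I made up.

```python
import math


def harmonic_levels_db(amplitudes):
    """Express each harmonic's amplitude in dB relative to the strongest one."""
    peak = max(amplitudes)
    return [round(20 * math.log10(a / peak), 1) for a in amplitudes]


F0_HZ = 440.0  # both patterns sit on the same note, so the harmonic frequencies match

# Hypothetical amplitudes for H1..H5 (illustrative only, not measurements).
tube_like = [1.0, 0.5, 0.25, 0.125, 0.0625]  # each harmonic weaker than the one below
shaped = [0.5, 1.0, 0.25, 0.5, 0.125]        # a resonator that favors H2 and H4

for name, amps in (("tube-like", tube_like), ("shaped", shaped)):
    print(name)
    for n, level in enumerate(harmonic_levels_db(amps), start=1):
        print(f"  H{n} at {n * F0_HZ:.0f} Hz: {level} dB")
```

Both patterns list exactly the same frequencies; only the levels differ, and that difference is what the ear hears as a different color.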
The human vocal tract is flexible and changeable, so it can change its acoustics from one moment to another. Those changes have a direct impact on the note being produced at any given moment. That is the virtue that makes the human voice unique. It is also what makes it difficult to make consistent.
On a given fundamental frequency (F0), the harmonics are predictable. However, there are two fundamental variables: 1) how the source tone is produced (there are many components to the laryngeal tone) and 2) how the vocal tract is shaped.
If, for the time being, we assume the source tone is optimally produced, we are left with the possible variations in the vocal tract. We have to consider which components can vary and how they affect the sound.
The optimal volume of the vocal tract depends on five variables: A) laryngeal depth, B) opening of the jaw, C) variations in tongue position, D) the shape of the lips and E) the position of the soft palate (closing or opening of the velar port: nasal or not nasal).
There are many theories about the soft palate and how high it needs to be. It basically responds to a desire not to be nasal. Its function is related to laryngeal functions, including the depth of the larynx (which itself depends on phonation), as well as to tongue migration. The pieces are interconnected. A weakness in the source tone can affect the ability of the velum to close the passage to the nasal cavity. A nasal tone has been shown to have an adverse effect on the resonance of the vocal tract, producing a weaker and less balanced tone (the balance of low and high harmonics is what is often called chiaroscuro, or balance between bright and dark).
The concept of a “low larynx” is commonly accepted as beneficial to resonance. A high larynx contributes to many vocal faults AND is caused by vocal faults. The question is rather how to achieve a low larynx without losing other fundamental functions, such as a flexible tongue and a raised palate. Each function must be achievable satisfactorily without disturbing the others.
The optimum spatial nature of the vocal tract could appear to depend on taste. Some teachers insist that the jaw has to be relatively closed and that releasing the jaw even slightly will cause a loss of high harmonics. This is obviously false. Other teachers insist on the jaw being opened three fingers wide. Others insist that the jaw must be pushed back. The jaw should open to what I call its natural maximum.
The singers in the photo were told to push the jaw back, as in the pictures on the left. This was beginning to cause both of them discomfort, as is obvious from their expressions. When they were advised to allow the jaw to release according to its natural contour, the result was the photos on the right. There, the alignment of the lower jaw seems appropriate to each structure, whereas in the left photos, when they attempted to push the jaw toward the back, one can see that the lower jaw is crooked toward the right. An inappropriate opening or forced closure of the jaw during singing does not make for a high-quality tone. There are those who have great source tones and can get away with inappropriate resonance adjustments. These types of singers make a conversation about efficiency difficult to sustain. I posted these pictures because I encounter many singers with TMD (Temporomandibular Joint Disorder) who acquired it after they began singing lessons. Many teachers, afraid of a protruding jaw, suggest that the jaw should be pushed back.
A jaw released to its natural maximum (different for each physiognomy), regardless of vowel and through the articulation of most consonants, contributes to a resonance atmosphere of regularity and constancy. A fully open vocal tract creates the conditions for optimal resonance of lower harmonics, which leaves the tongue as the principal element to partition the vocal tract, creating conditions for a balance between lower and higher harmonics. When the jaw is released to its natural maximum and the larynx is released low, the tongue must migrate further to create the [i] to [E] spectrum of vowels. In speech we combine subtler tongue migration with closing of the jaw to achieve an [i], so singers assume this is natural. Yet they usually open the jaw when they produce the same vowel on higher fundamental frequencies (pitches). An [i] vowel is better balanced when the jaw is released and the larynx is low, creating conditions for optimum resonance of the [i]'s very low first formant (F1; more on this later). The coordination of released jaw, low larynx and high tongue position is not easy to achieve. Those who seek immediate gratification and quick results usually go the easy route and close the jaw and/or allow the larynx to rise for [i]. The [a] vowel that comes after such an [i] will usually be weak, because it requires adjustments that do not occur very quickly.
The lips are refiners, and rounding them should only be used for vowels that require rounding, such as [o] and [u] and mixed vowels like [ø] or [y]. There are some specific situations where a slight rounding makes for a better resonance adjustment, but overused lip rounding often replaces a low larynx as a way to produce a warmer sound. The rounding of the lips does not produce the same results as a larynx relaxed to its lower position. Lip rounding has a way of dampening high harmonics, rendering the tone warmer but at the expense of the high resonance that is needed for the voice to be heard over loud accompaniments. A low larynx enriches low partials, giving the voice warmth without eliminating high partials, as long as the tongue is able to migrate naturally and does not muffle the resonance by pushing down on the epiglottis.
This brings us to the tongue. It is the most agile, multi-faceted and complex muscle we deal with as singers. If it is not handled with specific expectations and intent, it tends to do what it wants to compensate for weaknesses elsewhere. When the rest of the vocal tract is optimized (i.e. low larynx, closed velum, released jaw, and relaxed lips ready to be shaped “as needed” and not rounded when not needed), the tongue becomes the most important agent of resonance change. The tongue repartitions the vocal tract to create the fundamental vowel spectrum from [i] through [e], [E], [ae] to [a]. The lips then round to continue from [a] through [O], [o], [U] to [u]. Combining lips and tongue creates mixed vowels such as [y], [Y], [ø], [oe]. Through all these changes the jaw remains at its natural maximum, the larynx floats low and the velar port remains shut. Here is a very clear, concise and thorough discussion of the tongue's intrinsic and extrinsic muscles and how they interact.
The vocal tract, like any space, has resonant frequency bands called formants. Depending on the shape of the vocal tract (what we recognize as vowels), these formant areas move around. Looking at a spectrograph, vowel formants may be identified by where the strongest harmonics are. For our purposes, the voice displays 5 formant areas. The lowest two have the strongest impact on vowel recognition. The upper three combine to produce strong higher harmonics that make the voice seem more present. Formant bandwidths vary with frequency. The lowest vowel formant (the first formant of the vowel [i]), around 250 Hz, has a bandwidth of around 50 Hz, whereas the highest formant, around 4000 Hz, has a bandwidth of around 200 Hz.
The formant frequencies for a given vowel are similar for all singers. However, they do vary subtly between voice types, and probably to a certain degree for each individual, since we do not all have the same size and shape of vocal tract. A simple way to find formant frequencies is by producing a gentle vocal fry (also called pulse tone). A vocal fry requires little air pressure, a fact that reduces the strength of the harmonics so much that only the formants are seen:
In this video, I freeze the spectrum view (bottom of the screen) to allow the viewer to see the formant peaks for each of the cardinal vowels ([a, e, i, o, u]). The peaks also give the exact frequency numbers. In the video that follows, I sing all 5 vowels on the pitch f3 = 267 Hz (so named because it is the third f from the bottom of the standard keyboard). You will see that the peaks in the spectrum are pretty close to what was seen in the fry tones for the respective vowels.
What we should take from this is the following:
Female singers sing approximately an octave above their male counterparts (that is, alto to bass or soprano to tenor). If a soprano sings G5, a fourth below her high C, the fundamental frequency is about 800 Hz (784 Hz when A4 is tuned to 440 Hz). This means the harmonics would be approximately as follows: H2 = 1600 Hz, H3 = 2400 Hz, H4 = 3200 Hz, etc. The singer's formant (SF) for a soprano is thought to be between 2900 Hz and 3200 Hz, depending on the specific singer. Even if the bandwidth of the SF were around 200 Hz, its frequency would have to be at least 3000 Hz in order to catch H4 (the fourth harmonic). Because the harmonics are so far apart, the singer's formant does not always have an effect on the soprano or mezzo voice. However, there is no reason other than pharyngeal size that would prevent a woman from having the SF in the middle range quite consistently. There are, however, problems in modern training. The discovery of the acoustic passaggio (where the first formant loses dominance to the second) in the female lower voice has caused teachers to think of the middle voice as a separate register from a source tone perspective. Today's female singers often do not develop the source tone enough in the middle range to achieve harmonics strong enough to carry the influence of the SF resonance.
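The point about widely spaced harmonics can be checked with a short sketch. It uses the rounded 800 Hz fundamental and the 2900 to 3200 Hz range quoted above; the 200 Hz comparison pitch is my own assumption, added only to show a low voice for contrast, and `harmonics_in_band` is a made-up helper name.

```python
import math


def harmonics_in_band(f0_hz, low_hz, high_hz):
    """Return (harmonic number, frequency) pairs lying inside [low_hz, high_hz]."""
    first = max(1, math.ceil(low_hz / f0_hz))  # lowest harmonic at or above low_hz
    last = math.floor(high_hz / f0_hz)         # highest harmonic at or below high_hz
    return [(n, n * f0_hz) for n in range(first, last + 1)]


SF_LOW_HZ, SF_HIGH_HZ = 2900, 3200  # soprano SF range quoted in the text

# 800 Hz is the rounded G5 from the text; 200 Hz is a hypothetical low male pitch.
for label, f0 in (("soprano, about G5", 800), ("low male voice (assumed)", 200)):
    hits = harmonics_in_band(f0, SF_LOW_HZ, SF_HIGH_HZ)
    print(f"{label}, F0 = {f0} Hz: {hits}")
```

With an 800 Hz fundamental only H4 reaches the range, and only at its very top edge, whereas the lower pitch puts two harmonics inside it. The higher the fundamental, the fewer harmonics are available for the SF to reinforce.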