Sound quality standards and sound quality evaluation methods

● Sound quality standard

Sound quality refers to the fidelity of an audio signal after transmission and processing. The industry currently recognizes four levels of sound quality: CD-DA (compact disc digital audio) quality, with a signal bandwidth of 10 Hz to 20 kHz; FM broadcast quality, 20 Hz to 15 kHz; AM broadcast quality, 50 Hz to 7 kHz; and telephone voice quality, 200 Hz to 3400 Hz. CD-DA quality is therefore the highest and telephone voice quality the lowest. Besides frequency range, other methods and indicators are often used to describe sound quality for particular purposes.

For analog audio, the more frequency components the reproduced sound contains and the lower its distortion and interference, the higher the fidelity and the better the sound quality. In communications, for example, sound quality is measured not only by the frequency range of the audio signal but also by indicators such as distortion and signal-to-noise ratio. For digital audio, the more frequency components the reproduced sound contains and the lower the bit error rate, the better the sound quality. Digital sound quality is usually gauged by bit rate (or storage capacity): the higher the sampling frequency, the more quantization bits, and the more channels, the greater the bit rate and storage requirement, and the higher the fidelity.

Different types of sound have different sound quality requirements. For speech, fidelity mainly means a clear, undistorted sound with a flat (planar) sound image. Music demands higher fidelity and a spatial sound image, created mainly by multi-channel analog stereo surround sound or by virtual two-channel 3D surround sound, in order to reproduce every sound image of the original source.

Audio signals used for different purposes are compressed to different quality standards. Telephone-quality audio uses the ITU-T G.711 standard: 8 kHz sampling, 8-bit quantization, and a bit rate of 64 kbit/s. AM-broadcast-quality audio uses the ITU-T G.722 standard: 16 kHz sampling and 14-bit quantization, which gives a source rate of 224 kbit/s before coding. For high-fidelity stereo audio, the ISO/IEC MPEG audio standard (CD 11172-3) supports sampling at 48 kHz, 44.1 kHz, and 32 kHz, with a bit rate of 32 kbit/s to 448 kbit/s per channel, and is suitable for CD-DA-quality audio.
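
The uncompressed rates quoted above follow from multiplying the sampling rate, the bits per sample, and the channel count. The following is a minimal Python sketch of that arithmetic. The function name `pcm_bit_rate` is illustrative, and the CD-DA parameters (44.1 kHz, 16 bit, two channels) are the standard CD-DA format rather than figures taken from this article.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Return the uncompressed PCM bit rate in bit/s."""
    return sample_rate_hz * bits_per_sample * channels


# Telephone quality: 8 kHz sampling, 8-bit quantization.
telephone_bps = pcm_bit_rate(8_000, 8)
# AM-broadcast quality: 16 kHz sampling, 14-bit quantization.
wideband_bps = pcm_bit_rate(16_000, 14)
# CD-DA (assumed standard format, not stated in the article): 44.1 kHz, 16 bit, stereo.
cd_da_bps = pcm_bit_rate(44_100, 16, channels=2)

print(telephone_bps // 1000, "kbit/s")  # 64 kbit/s
print(wideband_bps // 1000, "kbit/s")   # 224 kbit/s
print(cd_da_bps / 1000, "kbit/s")       # 1411.2 kbit/s
```

The same product, multiplied by the duration in seconds and divided by 8, gives the storage requirement in bytes.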

Demanding more sound quality than necessary makes the equipment needlessly complex, while too little fails to meet the needs of the application. The general principle is "enough, without waste".

● Sound quality evaluation method

Reproduced sound quality can be evaluated in two ways: subjectively and objectively. For example:

1. Voice quality

Speech coding quality can be assessed subjectively or objectively. Subjective assessment is currently the more common and uses the mean opinion score (MOS), which has five levels: 5 (excellent), distortion is imperceptible; 4 (good), distortion is just perceptible but not annoying; 3 (fair), distortion is perceptible and slightly annoying; 2 (poor), annoying but not objectionable; 1 (bad), extremely annoying and objectionable. Generally, if the bandwidth of the reproduced speech extends above 7 kHz, the MOS can reach 5. This scale is widely used in multimedia and communications applications such as videophones, video conferencing, voice e-mail, and voice mail.
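
A MOS figure is simply the arithmetic mean of the individual listener ratings on the five-level scale. The sketch below illustrates this under the assumption that each listener gives one integer rating from 1 to 5. The function name and the sample ratings are hypothetical, and the level labels follow common MOS usage.

```python
from statistics import mean

# Five-level opinion scale (labels follow common MOS usage).
MOS_SCALE = {5: "excellent", 4: "good", 3: "fair", 2: "poor", 1: "bad"}


def mean_opinion_score(ratings):
    """Average a list of listener ratings on the 1-5 scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    for rating in ratings:
        if rating not in MOS_SCALE:
            raise ValueError(f"rating {rating!r} is not on the 1-5 scale")
    return mean(ratings)


# Hypothetical ratings from five listeners for one coded speech sample.
ratings = [5, 4, 4, 3, 4]
score = mean_opinion_score(ratings)
print(f"MOS = {score:.1f}")  # MOS = 4.0
```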

2. Music sound quality

The quality of musical sound depends on many factors: the characteristics of the sound source (sound pressure, frequency, spectrum, etc.); the signal characteristics of the audio equipment (distortion, frequency response, dynamic range, signal-to-noise ratio, transient characteristics, stereo separation, etc.); the characteristics of the sound field (direct sound, early reflections, reverberation, interaural correlation, reference vibration, sound absorption, etc.); and the characteristics of hearing (loudness curves, audible range, the various listening sensations, etc.). Evaluating the sound quality of audio equipment is therefore difficult.

Two methods are generally used: measuring technical indicators with instruments, and judging various sound effects by subjective listening. Because musical sound quality is complex, subjective evaluation is strongly colored by personal preference, while existing audio test techniques reflect fidelity only from certain angles. To date, therefore, there is no internationally recognized evaluation standard that truly reflects the fidelity of musical sound quality. It has been reported, however, that the International Telecommunication Union (ITU-T) recently approved a new objective measurement method, called the "electronic ear", which can be used for objective listening evaluation of any audio equipment and for detecting defects in the speech coding systems used in telephone communication.

Now, the evaluation methods of musical sound quality are summarized as follows:

(1) Sound effects judged by subjective listening

In general, the attributes of sound quality are evaluated subjectively according to changes in, and combinations of, loudness, pitch, and pleasantness. For example, strong low frequencies give a full sound and strong high frequencies a bright sound, while weak low frequencies give a smooth sound and weak high frequencies a clear sound. Several typical listening sensations are introduced below in relation to the sound source, the sound field, and the signal characteristics.

① Three-dimensional sense (stereo)

This listening sensation consists mainly of the sense of space (envelopment), positioning (direction), and layering (thickness); sound that conveys these sensations is called stereo. Natural sound fields are full of three-dimensionality, which is the most important feature to capture when simulating the sound image of a source. The De Boer effect demonstrates a physiological characteristic of human hearing. With the listener on the axis of symmetry between two sound sources, when the sound pressure difference Δp = 0 dB and the time difference Δt = 0 ms, the two sources are perceived as a single sound image and cannot be told apart. When Δp > 15 dB or Δt > 3 ms, the listener perceives two sources, and the sound image shifts toward the louder or leading source; every 5 dB of sound pressure difference is equivalent to a time difference of 1 ms. The Haas effect further shows that when Δt = 5 ms to 35 ms, the listener perceives two sound sources; and when the time difference of an early reflection (also called a near or first reflection), a lagging direct sound, or a second source exceeds 50 ms, the perceived direction of the source is still determined by the direct or leading sound, even if the reflected or lagging sound is many times louder.

Given these physiological characteristics, suppose sound intensity, delay, reverberation, and spatial effects are properly controlled and processed so that the sound waves artificially produced at the two ears have a time difference Δt, phase difference Δθ, and sound pressure difference Δp matching those produced at the ears by the original source. The listener can then faithfully and completely perceive the three-dimensionality of the reproduced sound. Compared with mono, stereo typically offers a spread sound image, a well-balanced volume distribution across the parts, high clarity, and low background noise.
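
The De Boer figures quoted above (5 dB of level difference equivalent to 1 ms of time difference, with limits of 15 dB and 3 ms) can be turned into a small lookup. This is only an illustrative sketch: the function names are hypothetical, and the description of the intermediate case, where the image is assumed to shift without splitting, is an interpretation rather than something the text states explicitly.

```python
DB_PER_MS = 5.0        # 5 dB of level difference is equivalent to 1 ms
LEVEL_LIMIT_DB = 15.0  # above this level difference, two sources are perceived
TIME_LIMIT_MS = 3.0    # above this time difference, two sources are perceived


def equivalent_time_ms(level_diff_db: float) -> float:
    """Convert a level difference in dB to its equivalent time difference in ms."""
    return level_diff_db / DB_PER_MS


def describe_image(level_diff_db: float, time_diff_ms: float) -> str:
    """Classify the perceived sound image for a listener on the symmetry axis."""
    dp, dt = abs(level_diff_db), abs(time_diff_ms)
    if dp == 0 and dt == 0:
        return "single fused image"
    if dp > LEVEL_LIMIT_DB or dt > TIME_LIMIT_MS:
        return "two sources perceived"
    # Assumption: between the two stated cases the image shifts but stays fused.
    return "image shifted toward louder or leading source"


print(equivalent_time_ms(15.0))   # 3.0
print(describe_image(0.0, 0.0))   # single fused image
print(describe_image(20.0, 0.0))  # two sources perceived
```

Note that the two limits are consistent with the trading ratio: 15 dB corresponds to exactly 3 ms.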

② Positioning

If sound sources to the left, right, above, below, and behind are recorded and transmitted, the received and reproduced sound should restore the direction of each source in the original sound field; this is the sense of localization. Physiologically, the direct sound from a single source reaches the two ears with a maximum time difference of 0.44 ms to 0.5 ms, together with a certain sound pressure difference and phase difference. Psychophysiology shows that bass from 20 Hz to 200 Hz is localized mainly by the interaural phase difference, midrange from 300 Hz to 4 kHz mainly by the sound pressure difference, and higher treble mainly by the time difference. The sense of localization is thus determined mainly by the direct sound that reaches the ears first, while the first reflections that arrive later and the reverberation reflected many times from all directions mainly convey the spatial envelopment of the sound image.
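
The maximum interaural time difference of 0.44 ms to 0.5 ms can be related to the extra distance the sound travels to reach the far ear. The sketch below does that conversion. The speed of sound (343 m/s, roughly dry air at 20 °C) is an assumption not given in the article, and the function name is illustrative.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at about 20 degrees Celsius


def path_difference_m(time_diff_ms: float) -> float:
    """Extra path length to the far ear for a given interaural time difference."""
    return SPEED_OF_SOUND_M_S * time_diff_ms / 1000.0


for itd_ms in (0.44, 0.5):
    extra_cm = path_difference_m(itd_ms) * 100.0
    print(f"{itd_ms} ms corresponds to about {extra_cm:.0f} cm of extra path")
```

Under this assumption the quoted range corresponds to roughly 15 cm to 17 cm, which is on the order of the width of a human head.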

③ Sense of space

Although first and multiple reflections lag the direct sound and have little effect on perceived direction, they always reach the two ears from all directions and strongly influence the listener's judgment of the size of the surrounding space, giving a feeling of being enveloped; this is the sense of space. The sense of space is more important than the sense of positioning.

④ Layering

The sound has good high-, mid-, and low-frequency response: the treble harmonics are balanced, clear, and delicate without harshness; the midrange is bright and prominent, plump and full without stiffness; and the bass is thick without a nasal quality.

⑤ Thickness

The bass is steady and powerful, thick but not muddy; the treble is not lacking; the volume is moderate with a certain brightness; the reverberation is appropriate; and the distortion is small.

In addition, there are many listening sensations for evaluating sound quality, such as sense of strength, brightness, presence, softness, tightness, width and so on.

(2) Technical indicators measured by objective tests

① Distortion

Harmonic distortion mainly makes the sound hard and explosive, while steady-state or transient intermodulation distortion mainly makes it rough, sharp, and muddy. Both degrade sound quality, markedly so once distortion exceeds 3%. Within a sound system the loudspeakers contribute the most distortion; even at best it generally exceeds 1%.
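
To make the percentage figures concrete, the sketch below computes total harmonic distortion (THD) using the common definition: the root-sum-square of the harmonic amplitudes divided by the amplitude of the fundamental. The article does not spell out this definition, so it is an assumption here, and the measurement values are hypothetical.

```python
import math


def thd_percent(fundamental_rms: float, harmonic_rms: list) -> float:
    """Total harmonic distortion in percent (common root-sum-square definition)."""
    if fundamental_rms <= 0:
        raise ValueError("fundamental amplitude must be positive")
    harmonic_total = math.sqrt(sum(h * h for h in harmonic_rms))
    return 100.0 * harmonic_total / fundamental_rms


# Hypothetical measurement: 1 V fundamental, 30 mV and 40 mV harmonics.
thd = thd_percent(1.0, [0.03, 0.04])
print(f"THD = {thd:.1f}%")  # THD = 5.0%
# 3% is the threshold quoted in the text for markedly degraded sound.
print("markedly degraded" if thd > 3.0 else "below the 3% threshold")
```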

Phase distortion mainly blurs low-frequency sound below 1 kHz, and it also affects mid-frequency layering and sound image localization.

Wow and flutter is caused mainly by unstable motor speed, uneven capstan and pinch-roller pressure, the tape slapping against the head, and similar faults. These make the tape vibrate and its transport uneven, which frequency-modulates the signal so that the tone sounds muddy and wavering. Wow and flutter is usually expressed as the root-mean-square value of the pitch variation. Typical figures are below 0.1% for an ordinary tape recorder, below 0.005% for a Hi-Fi recorder, below 0.3% for an ordinary video recorder, and below 0.001% for a video disc player.
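
The root-mean-square figure can be illustrated with a short calculation. The sketch below takes a series of instantaneous frequency readings of a test tone and returns the RMS of the relative deviation from the nominal frequency, in percent. It is an unweighted calculation with hypothetical readings and a hypothetical 1 kHz test tone; practical meters usually apply a perceptual weighting that is not modeled here.

```python
import math


def wow_flutter_percent(measured_hz: list, nominal_hz: float) -> float:
    """Unweighted RMS of the relative frequency deviation, in percent."""
    if not measured_hz:
        raise ValueError("at least one frequency reading is required")
    deviations = [(f - nominal_hz) / nominal_hz for f in measured_hz]
    rms = math.sqrt(sum(d * d for d in deviations) / len(deviations))
    return 100.0 * rms


# Hypothetical readings of a 1 kHz test tone played back from tape.
readings_hz = [1001.0, 999.0, 1001.0, 999.0]
wf = wow_flutter_percent(readings_hz, 1000.0)
print(f"wow and flutter = {wf:.2f}%")  # wow and flutter = 0.10%
```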

② Frequency response and transient response

Frequency response describes how the gain or sensitivity of audio equipment varies with signal frequency. It is expressed as the passband width together with the in-band unevenness (for example, a high-quality power amplifier may be specified as 1 Hz to 200 kHz ±1 dB). The wider the bandwidth, the better the high- and low-frequency response; the smaller the unevenness, the better the frequency balance. In general, the 30 Hz to 150 Hz low band gives the sound its thickness, and the 150 Hz to 500 Hz band gives it strength; if the 300 Hz to 500 Hz range is boosted too much the sound becomes muddy, and if it is cut too much the sound becomes weak. The 500 Hz to 5 kHz band gives the sound its brightness; too much boost makes it stiff, and too much cut makes it diffuse and floating. The 5 kHz to 10 kHz high band gives the sound its layering and color; too much boost makes it sharp, and too much cut makes it dull and stuffy. Following these rules, the frequency response of a sound system can be adjusted quantitatively to suit the desired listening sensation.
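
A specification such as "±1 dB" can be checked directly against a set of gain measurements. The sketch below does so, assuming the gains are expressed in dB relative to a reference gain (here the gain at 1 kHz). The function names and the readings are hypothetical.

```python
def max_deviation_db(gains_db, reference_db: float = 0.0) -> float:
    """Largest absolute deviation of the measured gains from the reference."""
    return max(abs(g - reference_db) for g in gains_db)


def within_tolerance(gains_db, tolerance_db: float = 1.0,
                     reference_db: float = 0.0) -> bool:
    """True if every measured gain lies within +/- tolerance_db of the reference."""
    return max_deviation_db(gains_db, reference_db) <= tolerance_db


# Hypothetical readings: frequency in Hz -> gain in dB relative to the 1 kHz gain.
readings = {20: -0.4, 100: -0.1, 1_000: 0.0, 10_000: 0.2, 20_000: -0.8}
print(max_deviation_db(readings.values()))  # 0.8
print(within_tolerance(readings.values()))  # True
```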

Transient response is the ability of a sound system to follow abrupt signal changes. In essence it reflects the amount of high-order harmonic distortion of pulse signals, which strongly affects the transparency and layering of the sound. It is usually expressed as the slew rate in V/μs; the higher the figure, the lower the harmonic distortion. A typical amplifier, for example, has a slew rate above 10 V/μs.
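
One way to put the 10 V/μs figure in context is to compute the slew rate an amplifier needs in order to reproduce a full-amplitude sine wave without slope limiting. The maximum slope of V·sin(2πft) is 2πfV. The sketch below applies that standard relation; the 20 kHz, 40 V peak operating point is a hypothetical example, not a figure from the article.

```python
import math


def required_slew_rate_v_per_us(frequency_hz: float, peak_voltage: float) -> float:
    """Maximum slope of the sine wave V*sin(2*pi*f*t), converted to V/us."""
    return 2.0 * math.pi * frequency_hz * peak_voltage / 1e6


# Hypothetical operating point: 20 kHz sine wave with a 40 V peak output.
needed = required_slew_rate_v_per_us(20_000, 40.0)
print(f"required slew rate = {needed:.2f} V/us")  # required slew rate = 5.03 V/us
print(needed < 10.0)  # True: a 10 V/us amplifier can follow this signal
```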

③ Signal-to-noise ratio

The signal-to-noise ratio, written S/N or SNR and expressed in dB, is the difference in decibels between the signal level and the noise level. The frequency of the noise and the strength of the signal both change how noise is perceived: the human ear is generally most sensitive to noise at 4 kHz to 8 kHz, and weak signals are affected by noise more than strong ones. Requirements differ between types of equipment: Hi-Fi audio requires SNR > 70 dB, for example, and a CD player requires SNR > 90 dB.
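
As a worked illustration, the sketch below computes SNR in dB from RMS signal and noise voltages using 20·log10 of the amplitude ratio, then compares the result with the two requirements quoted above. The voltage values and function names are hypothetical.

```python
import math


def snr_db(signal_rms: float, noise_rms: float) -> float:
    """Signal-to-noise ratio in dB from RMS amplitudes (voltage ratio)."""
    if signal_rms <= 0 or noise_rms <= 0:
        raise ValueError("amplitudes must be positive")
    return 20.0 * math.log10(signal_rms / noise_rms)


def meets_requirement(snr: float, minimum_db: float) -> bool:
    """True if the measured SNR exceeds the required minimum."""
    return snr > minimum_db


# Hypothetical measurement: 1 V signal with 0.1 mV of noise.
snr = snr_db(1.0, 0.0001)
print(f"SNR = {snr:.1f} dB")         # SNR = 80.0 dB
print(meets_requirement(snr, 70.0))  # True: exceeds the Hi-Fi requirement
print(meets_requirement(snr, 90.0))  # False: below the CD player requirement
```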

④ Channel separation and balance

Channel separation is the degree of isolation between stereo channels, expressed as the difference between the signal level in one channel and the level of that signal appearing in the other channel; the larger the difference, the better. Hi-Fi audio generally requires a separation above 50 dB. Channel balance is the consistency of gain and frequency response between the two channels; poor balance shifts the sound image.
