The online documentation isn’t completely clear on how it works:
Voice Activity Detector. Attempts to trim silence and quiet background sounds from the ends of (fairly high resolution i.e. 16-bit, 44-48 kHz) recordings of speech.
Options: Default values are shown in parentheses.
-t num (7)
The measurement level used to trigger activity detection. This might need to be changed depending on the noise level, signal level and other characteristics of the input audio.
What is a “measurement level”? What does the default number 7 represent?
Hello Damon, per the manual, this is a measurement of cepstral power. Roughly speaking, it is the loudness (in arbitrary units) of a vowel sound needed to trigger voice detection, so a lower number triggers more readily than a higher one. The -p option may help preserve initial consonant sounds. Also remember that vad performs best on normalised audio (see the norm effect).
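To make the advice above concrete, here is a minimal sketch that assembles a sox command line applying norm before vad, with -t and -p exposed as parameters. The helper name build_vad_command, the file names, and the values 5 and 0.2 are hypothetical, chosen only for illustration; suitable values depend on your recordings.

```python
import os
import shutil
import subprocess


def build_vad_command(infile, outfile, trigger=7.0, preserve=0.0):
    """Build a sox argument list: normalise, then trim leading silence with vad.

    Hypothetical helper for illustration; the defaults mirror the documented
    vad defaults (-t 7, -p 0).
    """
    return [
        "sox", infile, outfile,
        "norm",               # vad performs best on normalised audio
        "vad",
        "-t", str(trigger),   # trigger level; lower values trigger more readily
        "-p", str(preserve),  # seconds of audio to keep before the trigger point
    ]


if __name__ == "__main__":
    # Example values only (assumptions): a lower trigger and a short pre-roll.
    cmd = build_vad_command("speech.wav", "trimmed.wav", trigger=5, preserve=0.2)
    print(" ".join(cmd))
    # Run the command only if sox is installed and the input file exists.
    if shutil.which("sox") and os.path.exists("speech.wav"):
        subprocess.run(cmd, check=True)
```

If initial consonants are still being clipped, one thing to try is raising the -p value a little before lowering -t further, since a lower trigger also makes background noise more likely to count as speech.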