Real-Time Audio-Visual Analysis for Multiperson Videoconferencing
Figure 10
Low-delay association and fusion. Voice activity is associated with the direction of arrival and the detected face at the moment the voice activity is confirmed, rather than at the moment the voice activity ends.
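
The timing described in this caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Frame` structure, the `confirm_frames` threshold (a run of consecutive voiced frames standing in for "confirmation"), and the nearest-angle face matching are all assumptions made for illustration, and angle wrap-around is ignored for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Frame:
    """One analysis frame (hypothetical structure, not from the paper)."""
    voiced: bool                         # raw per-frame voice activity decision
    doa_deg: float                       # direction-of-arrival estimate, degrees
    face_angles_deg: Tuple[float, ...]   # angular positions of detected faces


def nearest_face(doa_deg: float, face_angles_deg: Sequence[float]) -> Optional[int]:
    """Index of the face closest in angle to the DOA, or None if no face."""
    if not face_angles_deg:
        return None
    return min(range(len(face_angles_deg)),
               key=lambda i: abs(face_angles_deg[i] - doa_deg))


def associate_on_confirmation(
    frames: Sequence[Frame], confirm_frames: int = 3
) -> List[Tuple[int, float, Optional[int]]]:
    """Emit (frame_index, doa, face_index) as soon as activity is confirmed.

    Confirmation is assumed here to mean `confirm_frames` consecutive voiced
    frames. The association is made at that frame, not when the run ends.
    """
    events = []
    run = 0  # length of the current run of voiced frames
    for i, frame in enumerate(frames):
        run = run + 1 if frame.voiced else 0
        if run == confirm_frames:  # fires once per voiced segment
            face = nearest_face(frame.doa_deg, frame.face_angles_deg)
            events.append((i, frame.doa_deg, face))
    return events


# Two silent frames, six voiced frames, two silent frames.
faces = (-20.0, 28.0)
frames = (
    [Frame(False, 0.0, faces)] * 2
    + [Frame(True, 30.0, faces)] * 6
    + [Frame(False, 0.0, faces)] * 2
)
events = associate_on_confirmation(frames, confirm_frames=3)
```

In this toy sequence the association is emitted at frame index 4, whereas waiting for the end of the voiced segment would defer it to frame index 8. The cost of the earlier decision is that it relies on the direction-of-arrival and face estimates available at confirmation time only.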