Real-time Sociofeedback: A VoIP Application and a Correlation Study

Presented by Debsubhra CHAKRABORTY
PhD student, Institute for Media Innovation / Interdisciplinary Graduate School
Supervisor: Asst. Prof. Justin Dauwels (EEE)
Co-Supervisor: Prof. Daniel Thalmann (IMI/SCE)
21st May, 2015

Outline
• Introduction
• VoIP Application
• Corpora
• Correlation Study
• Conclusion

Introduction
• Computers: from tools, to facilitating human-human interaction, to social robotics
• Social Signal Processing (SSP) – endow computers with social intelligence

Vinciarelli, A., Pantic, M., & Bourlard, H. (2009). Social signal processing: Survey of an emerging domain. Image and Vision Computing, 27(12), 1743-1759.
Albrecht, K. (2006). Social Intelligence: The new science of success. John Wiley & Sons.

Introduction (continued)
• Social signals – key to SSP
• Social signals contain non-verbal cues
• Existing methods – offline analysis, one or two social signals, no additional video data used, rule-based, no feedback

Figure source: Vinciarelli, A., Pantic, M., Heylen, D., Pelachaud, C., Poggi, I., D'Errico, F., & Schröder, M. (2012). Bridging the gap between social animal and unsocial machine: A survey of social signal processing. IEEE Transactions on Affective Computing, 3(1), 69-87.
Poggi, I. (2007). Mind, hands, face and body: a goal and belief view of multimodal communication. Weidler.

Our Approach
• Applications: Skype interview or online course
• Sociometrics shown in the figure: Politeness, Friendliness, Frustration, Respect, Empathy, Confusion, Hostility, Agreement, Dominance, Interest

VoIP Application: A Demo

VoIP Application
• Records each speaker on separate .mp4 files
• Segments of 1 minute for Skype
• File event handler analyzes each new file saved
• Separates audio from video
• Uses trained machine learning models on audio to get sociometrics
• Uses the Skype API to conveniently display the sociometrics
Components labelled in the diagram: SuperTintin, Matlab, Skype

Machine Learning
• Different machine learning algorithms compared to obtain the best classification result:
  Support Vector Machine, Support Vector Ordinal Regression, Adaptive Boosting, Bagging, Artificial Neural Network, k-Nearest Neighbors, Random Subspace Ensembles, Naïve Bayes, Least Squares
• Model training is performed on two corpora: AC and AVC

Corpora
Audio Corpus (AC)
• Audio on separate microphones
• No video
• 150 recordings
• 2.5-3 minutes long
• 22 participants (17 M, 5 F)
• Both speakers are participants
Audio-Visual Corpus (AVC)
• Audio on separate microphones
• Video on separate Kinects
• 100 recordings
• 1 minute long
• 21 participants (16 M, 5 F)
• One speaker is a participant, the other a control

Feature Extraction
• Speech detection using HMM
• Prosodic cues computed over 30 ms
• Visual cues from RGB + depth data from Kinect

Conversational       Prosodic       Visual
Speaking Duration    Frequencies    Postures
Speaking Turns       MFCC           Head Movement
Interruption         Amplitude      Hand Gestures
Interjection                        Head Pose

Basu, S. (2003, April). A linked-HMM model for robust voicing and speech detection. In Proceedings of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'03) (Vol. 1, pp. I-816). IEEE.
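
The 30 ms framing mentioned on the Feature Extraction slide can be illustrated with a minimal sketch. It assumes non-overlapping frames and uses RMS as a stand-in for the "Amplitude" cue; the slides do not state the hop size or the exact amplitude measure, so both are assumptions. The function name `frame_rms` and the synthetic test tone are hypothetical and are not taken from the presented system.

```python
import math


def frame_rms(samples, sample_rate, frame_ms=30.0):
    """Return the RMS amplitude of each non-overlapping frame.

    Assumption: frames do not overlap and a trailing partial frame is dropped.
    """
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")
    rms = []
    # Step through the signal one full frame at a time.
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return rms


# Synthetic 1-second, 440 Hz tone at 8 kHz, standing in for one
# speaker's microphone channel (hypothetical input).
sample_rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate)
        for n in range(sample_rate)]
amplitudes = frame_rms(tone, sample_rate)
```

At 8 kHz a 30 ms frame holds 240 samples, so one second of audio yields 33 full frames; a per-recording feature could then be a summary statistic (for example the mean) over these frame values.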

Annotation Protocol
• Each recording annotated by multiple judges
• Annotation on a Likert scale ranging from 1 (low) to 3 (high)

Corpora    Maximum SD    Minimum SD    Mean SD    Median SD
AC         0.55          0.24          0.45       0.47
AVC        0.51          0.24          0.41       0.43

Tahir, Y., Chakraborty, D., Maszczyk, T., Dauwels, S., Dauwels, J., Thalmann, N., Thalmann, D. (2015). Real-Time Sociometrics from Audio-Visual Features for Two-Person Dialogs. Accepted in DSP 2015.

Classification Results
• Leave-one-person-out cross-validation technique

Sociometric     AC      AVC
Agreement       84 %    81 %
Dominance       86 %    90 %
Interest        85 %    92 %
Politeness      81 %    76 %
Friendliness    51 %    63 %
Frustration     50 %    67 %
Empathy         59 %    67 %
Respect         59 %    62 %
Confusion       81 %    89 %
Hostility       77 %    72 %

Source: Tahir et al. (2015).

Correlation Study
• Understand interrelationships between indicators (sociometrics) and features
• Express them in a visually engaging manner
• Identify redundancies
• Boost existing classification

Correlation: Between Indicators
[Figure: correlation plots between indicators; panels: AC, AVC]
Oleg Komarov, "Schemaball," [Software], June 2013. Available from http://www.mathworks.com/matlabcentral/fileexchange/42279-schemaball.

Correlation: Indicators & Features
[Figure: correlation plots between indicators and features; panels: AC, AVC]

Conclusion
• Feedback of social states in real-time is important in certain situations
• Such feedback can be provided through VoIP
• Such feedback is based on machine learning of annotated corpora
• Correlation study of corpora can help us understand important relationships

Acknowledgements
• Asst. Prof. Justin Dauwels for his thoughtful insights
• Yasir Tahir and Tomasz Maszczyk for their generous help
• IMI and IGS at NTU for supporting this research
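
As a supplementary note on the leave-one-person-out protocol named on the Classification Results slide, the splitting step can be sketched as follows. This is only an illustration under an assumption: each sample is taken to carry a single speaker label, which the slides do not confirm. The function name `leave_one_person_out` and the speaker IDs are hypothetical.

```python
def leave_one_person_out(speaker_ids):
    """Yield (held_out_person, train_indices, test_indices), one fold per person.

    Assumption: speaker_ids[i] is the single person associated with sample i.
    """
    for person in sorted(set(speaker_ids)):
        # All samples of the held-out person form the test set.
        test_idx = [i for i, s in enumerate(speaker_ids) if s == person]
        # Every other sample is available for training.
        train_idx = [i for i, s in enumerate(speaker_ids) if s != person]
        yield person, train_idx, test_idx


# Hypothetical speaker labels for six samples.
speakers = ["p1", "p2", "p1", "p3", "p2", "p3"]
folds = list(leave_one_person_out(speakers))
```

Each fold trains on everyone except one person and tests on that person alone, so no speaker appears on both sides of a split; per-fold accuracies would then be averaged to give a figure comparable in kind to those in the results table.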