2nd Semester 2023/24: Multiscale Dynamics of Multimodal Communication

James Trujillo

Conversation consists of a complex dynamic of communicative behaviors, unfolding over multiple time-scales. Individual speakers produce gestures, facial expressions, and linguistic information in the form of speech or signs. But these multimodal utterances are embedded in the back-and-forth of turn-taking. During turn-taking, speakers may align their behaviors with one another, such as by using similar words or gestures. How can we understand, and even model, the complex dynamics of conversation, which involve multiple communicative modalities unfolding at different time-scales?

In this course, you will learn how to apply NLP, information theoretic measures, and computer vision to model the dynamics of conversation.


The course will start with several lecture/discussion sessions to introduce key conceptual work, as well as an introduction to (and potentially hands-on practice with) methodologies that can be used to address the key questions described above (Week 1). We will collaboratively develop concrete research questions, splitting into several groups depending on the number of students (Week 2). The rest of Weeks 2 and 3 will be dedicated to working on the research project, with regular opportunities for feedback and discussion. At the end of Week 4, groups will present their project and results.


Some programming experience (Python or R) is expected. NLP or computer vision knowledge/skill is an advantage, but not required. Methodological tutorials will be available, and a group’s specific approach can be based on the group’s collective skills.


At the end of the month, students submit individual reports on their project, styled as a short research paper. Assessment is based on the content and clarity of the report.


Giulianelli, M., Wallbridge, S., & Fernández, R. (2023). Information Value: Measuring Utterance Predictability as Distance from Plausible Alternatives. arXiv preprint arXiv:2310.13676.

Fusaroli, R., Rączaszek-Leonardi, J., & Tylén, K. (2014). Dialog as interpersonal synergy. New Ideas in Psychology, 32, 147-157.

Louwerse, M. M., Dale, R., Bard, E. G., & Jeuniaux, P. (2012). Behavior matching in multimodal communication is synchronized. Cognitive Science, 36(8), 1404-1426.

Wiltshire, T. J., Hudson, D., Belitsky, M., Lijdsman, P., Wever, S., & Atzmueller, M. (2021, September). Examining team interaction using dynamic complexity and network visualizations. In 2021 IEEE 2nd International Conference on Human-Machine Systems (ICHMS) (pp. 1-6). IEEE.