Multimodality

Not all media types are perceived in the same way. Visual objects, for instance, are processed differently than textual objects because visuals convey spatial and contextual meaning, which allows users to interpret images intuitively (Raptis et al., 2021). These insights matter when it comes to engaging users with digital artifacts such as 3D models.

Multimodality is the integration of different modes of communication, such as language, images, audio and gestures, into texts or communication events (Van Leeuwen, 2015). It encompasses the use of multiple media forms, including 3D models, textual descriptions or interactive narratives, often to present historical content. Combining several media forms is important for enhancing user experience and engagement (Schreibman & Schoueri, 2024). Considering multimodality when conceptualising and building a 3D scholarly edition therefore helps to create meaningful content that deepens the viewer's understanding.

Multimodality deserves particular attention today because it enables multivocality, representation and inclusion. Regarding multivocality, multimodality supports different interaction styles: users can engage with content through speech, gestures or gaze, which allows multiple forms of interpretation (Raptis et al., 2021). In terms of representation, multimodality enhances accessibility for people with disabilities by integrating multiple interaction methods, for instance a speech assistant for visually impaired users. It also supports experiences with cultural heritage by offering alternative ways of engaging with artifacts (Raptis et al., 2021). Moreover, multimodality fosters inclusion, as the experience of an artifact can be adapted to user capabilities and preferences, which increases engagement across demographics with different levels of technological literacy (Raptis et al., 2021). In short, multimodal systems improve comprehension by allowing users to absorb information efficiently and by enabling natural interaction (Raptis et al., 2021).

For our 3D scholarly edition, we incorporated multimodality by building a 3D scene and combining multiple forms of media in Voyager tours. Integration, representation and user engagement were the guiding concepts for creating these tours. To build multimodal narratives, we used 3D images, additional 3D models, audio recordings and images. We enriched the articles with images to break up large blocks of text, and external 3D models make the scene more realistic and visually engaging. The tours themselves integrate and combine many different media types: the user clicks through the tour steps, which introduces interactivity, while audio and images prompt further engagement with our model and the narrative around it. This counteracts the fatigue that arises when a user faces large amounts of text with no opportunity for interaction. Since users' attention spans tend to be short, we offer the viewer many possibilities for interaction, which enhances engagement and the absorption of information by increasing interest in our model and topic.
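To illustrate how a single tour step can bundle several media types, the following Python sketch groups a textual narrative, an image, an audio file and optional external 3D models per step. The structure and all field names are hypothetical and serve only as an illustration of the combination of media described above; they do not reflect Voyager's actual scene file format.

```python
# A minimal sketch of how one tour step might bundle several media types.
# The TourStep structure and all field names are hypothetical illustrations,
# not Voyager's actual scene file format.
from dataclasses import dataclass, field, asdict
from typing import List, Optional
import json


@dataclass
class TourStep:
    """One step of a guided tour combining text, image, audio and 3D assets."""
    title: str
    narrative: str                     # textual description shown alongside the model
    image_uri: Optional[str] = None    # illustration that breaks up the text
    audio_uri: Optional[str] = None    # spoken narration or ambient sound
    external_models: List[str] = field(default_factory=list)  # additional 3D assets


# Hypothetical example: two steps of a tour around the main model.
tour = [
    TourStep(
        title="Overview",
        narrative="Introduces the artifact and its historical context.",
        image_uri="media/overview.jpg",
        audio_uri="media/overview.mp3",
    ),
    TourStep(
        title="Detail view",
        narrative="Zooms in on a decorative element.",
        external_models=["models/context-object.glb"],
    ),
]

# Serialise the tour so it can be inspected or handed to an authoring pipeline.
print(json.dumps([asdict(step) for step in tour], indent=2))
```

In such a sketch, each step carries its own mix of media, which mirrors the idea that interactivity, images and audio are layered onto the textual narrative rather than replacing it.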