Sound has been used intensively to facilitate learning since the expansion of computer technology in education in the 1970s. Yet the primary focus has been on using audio to replace text or to supplement graphics in multimedia instruction. Nonverbal sound, one of the critical information sources in daily life, is largely overlooked in learning environments. This experimental study investigated how integrating nonverbal sound into embodied interaction affected learning in a digital environment. A language learning website was designed to host an interactive learning activity in which nonverbal sound guided learners’ actions during a character writing task. A total of 140 undergraduate students participated in the experiment. The findings suggest that using nonverbal sound to facilitate embodied interactions led to a better interactive experience and higher intrinsic motivation.
Introduction
Interactive computer-based activities such as simulations and educational video games have properties that can effectively facilitate learning (Lindgren et al., 2016; Rutten et al., 2012; Smetana & Bell, 2012). Learners operate computer interface controls such as the mouse and keyboard to respond to learning content, thus constructing their knowledge (Anastopoulou et al., 2011; Lindgren et al., 2016; Mihalca & Miclea, 2007). The design rationale of such interactive activities draws on the embodied cognition perspective that the coordination of sensory, motor, and cognitive functions creates human experience and an understanding of one’s surroundings (Barsalou, 1999; Glenberg, 2010; Johnson, 2013; Skulmowski & Rey, 2018). In other words, the sensorimotor system plays a crucial role in information encoding and retention during knowledge construction (Goldin-Meadow et al., 2009; Mizelle & Wheaton, 2010). Moreover, involving physical movement and the sensory system in the learning process can promote concrete experiences, which are driving forces in developing an understanding of abstract concepts (Abrahamson & Lindgren, 2014; Black et al., 2012; Carver et al., 2007).
In computer-based learning environments, we rely heavily on visual and auditory senses (Ale et al., 2022; Dinh et al., 1999). Instructional messages are delivered in bimodal presentations so that the brain processes information in separate channels, which helps improve information processing capacity (Mayer, 2005; Mayer et al., 2001; Moreno & Mayer, 2007; Paas & Sweller, 2014). However, auditory presentation refers predominantly to narration in most cognitive theory of multimedia learning (CTML) research studies (Bishop et al., 2008; Bishop & Sonnenschein, 2012; Li & Finch, 2021). Occasionally, sound effects and environmental sound are presented together with narrations, yet most research still focuses on narration (Ale et al., 2022; Li & Finch, 2021). Among the handful of studies emphasizing nonverbal sound, the results on its effects on learning are mixed. For example, environmental sound can create a “cinematic style” (Bishop & Sonnenschein, 2012) atmosphere that engages learners. On the contrary, random background music can be distracting and can negatively affect cognitive task performance (Chou, 2010; Furnham & Bradley, 1997; Grice & Hughes, 2009; Perham & Vizard, 2011; Perham & Currie, 2014). The inconsistent effects of nonverbal sound reported in existing studies have, to some extent, validated the concern raised by Mayer (2005) and other cognitive researchers (Moreno, 2006; Moreno & Mayer, 2007) that the extra cognitive load caused by nonverbal sound could interfere with the learning process. Still, it is worth noting that existing studies examined the use of nonverbal sound in static environments where learner-interface interactions were limited.
A well-known example of using nonverbal sound to facilitate interactions is video games. Players depend highly on nonverbal auditory signals to navigate the fast-paced gaming process. For example, nonverbal sound enables players to monitor background processes, locate critical information, and collect environmental information (Jørgensen, 2009). All these details can be processed in the auditory channel without diverting players from their primary visual tasks. Research indicates that players’ performance drops accordingly when nonverbal sound is removed from a game (Jørgensen, 2009).
Given that learning environments are increasingly immersive and interactive, it is worthwhile to investigate how to integrate nonverbal sound into the learning process to promote active learning. In existing studies, design by intuition is the most common approach to integrating sound into learning content (Pfeiffer, 2008). Such an intuition-based design method can lead to inconsistent learning outcomes across research studies because decisions on sound use are based on personal assumptions (Ferati et al., 2012; Mann, 1979; Pfeiffer, 2008; Vickers & Alty, 2002). In this study, we perceive sound as an essential element that promotes the embodied cognitive process, and nonverbal sound can play a vital role in the embodied interaction process. The design of nonverbal sound follows interaction design principles from the field of user experience (UX): learners perform actions on an interface that generates nonverbal sound, and the sonic feedback in turn shapes learners’ subsequent actions. We attempt to find out: Will the inclusion of nonverbal sounds in the interaction process lead to a better learning experience in a computer-based writing task? Can sound stimulate motivation by immersing learners in embodied interaction, leading to better task performance?
Literature Review and Theoretical Framework
Three types of nonverbal sounds have been used for learning purposes: environmental sounds, sound effects, and electronically generated tones (Li & Finch, 2021). Environmental sounds were those associated with the surroundings that helped learners understand where the action took place (Bajaj et al., 2015; Mann, 1979). For example, in Bishop et al.’s (2008) study, environmental sounds were presented together with the visual content in a computer-based learning module: an ominous orchestral theme played as background music when the module title “Thinking Zone” scrolled across the screen, and crickets chirping, a constant dry wind, and a wolf howling in the distance were played along with an image of a desert on the screen.
Sound effects were sounds that helped build an intuitive linkage between objects and events in a virtual world, typically resembling sounds people recognized from the real world (Brewster, 1997; Buxton, 1989; Calvert & Scott, 1989). For instance, in Bajaj et al.’s (2015) study, sound effects were used in an instructional video to highlight particular learning concepts, such as the sound of water flowing from a tap when explaining how to mix water into a chemical compound. In Bishop et al.’s (2008) study, sound effects of clinking dishes and muffled conversations were embedded with an image of a restaurant booth to augment the learning experience in an online learning environment.
Environmental sounds and sound effects were primarily used in instructional videos and audiobooks, and occasionally in computer-based courses. These nonverbal sounds carried strong affective significance that could influence and shape learners’ behaviors in the learning process. Research studies revealed that the integration of nonverbal sound better sustained learners’ attention and improved learner engagement through a sense of audio immersion (Bajaj et al., 2015; Bishop et al., 2008; Calvert & Scott, 1989; Mann, 1979). In addition, researchers observed a positive effect of these two types of nonverbal sounds on memory retrieval, which they attributed to the strong linkage between sound and narrative content (Bajaj et al., 2015; Boltz et al., 1991; Calvert & Scott, 1989; Hativa & Reingold, 1987; Mann, 1979).
The third type of nonverbal sound was abstract synthetic tones: short tones parameterized to deliver messages in an auditory format (Calvert & Scott, 1989; Pfeiffer, 2008; Vickers & Alty, 2002). Unlike environmental sounds and sound effects, synthetic tones were not directly associated with any specific concept, event, or object. Hence, their effectiveness depended on how researchers established connections between the selected tones and learning concepts or events. In previous studies, synthetic tones were usually embedded before or after critical information identified by the instructors, as pre-cues or post-cues. For example, Hativa and Reingold (1987) highlighted important scenes using a set of synthetic video-game-like sounds in a computer-based lesson. Calvert and Scott (1989) used a one-second whistle-like sound to mark critical events in a video program. One major problem in using synthetic tones was that they could become irrelevant and even distracting if learners failed to understand the connection between the tones and the content (Bajaj et al., 2015). Therefore, a systematic application of selected synthetic tones was critical to helping learners recognize the pattern of sound use, thereby enabling the tones to serve as effective memory anchors (Bajaj et al., 2015; Li & Finch, 2021).
Although the number of research studies is limited, we can still trace the value of nonverbal sound in them: immersing learners, sustaining learners’ attention, and organizing information as cues. As mentioned earlier, we perceive learning through the lens of embodied cognition, which emphasizes the role of embodied interactions in knowledge construction. We were interested in how these virtues of nonverbal sound can facilitate learning in an interactive learning environment. We selected language writing as the activity because it allows a high degree of sensorimotor engagement based on Johnson-Glenberg’s taxonomy of embodied learning (Johnson-Glenberg et al., 2014). We chose synthetic tones because our activity was a simple language writing task without narrative content; moreover, synthetic tones lower the possibility of activating inappropriate prior knowledge compared to environmental sounds and sound effects. We adapted interaction design principles from the UX field to guide the activity design and the systematic application of synthetic tones in the interactions (Aleksy et al., 2016; Borchers, 2000). As shown in Figure 1, a simple interaction cycle consists of a user action and interface feedback (a targeted response to the action). Nonverbal sound can be integrated as feedback in the interaction to guide learner actions in the writing task.
Figure 1
Adding Nonverbal Sound in the Interaction
Given the potential of nonverbal sound discussed earlier, we assume that nonverbal sound may be an effective tool for facilitating learning in an interactive environment: it encourages learners to stay engaged with the learning activity and enhances the learning experience, thereby leading to better learning achievement.
Method
Participants and Procedure
The participants came from a large public university in Texas, USA. A total of 140 undergraduate students participated in this study; they ranged from sophomores to seniors and were enrolled in the College of Education. As the writing task explored how to write selected Chinese characters, participants were required to have no prior knowledge of spoken or written Chinese. The study used the split testing method to compare two versions of the language writing activity: the nonverbal sound version and the visual equivalent version. Following random assignment to the testing groups, all participants received an email containing a link to their assigned version of the Learning Chinese program and instructions on how to use it. Participants were asked to use their own electronic devices to explore the program at a location and time of their choosing. They were not limited in the number of attempts and were allowed to exit the program at any point. Once participants accessed the program, the software began recording the time spent on each page. Before leaving the program, participants were asked to complete a brief five-minute survey.
Learning Chinese Program Development
The Learning Chinese program was a customized website that allowed learners to construct an understanding of the strokes and correct stroke order of eight selected Chinese characters by interacting with a digital writing canvas. A drag-and-drop knowledge check task was included at the end of the writing activity to assess how well learners could recall the stroke orders of those characters. Tracking code was embedded into each page to capture the time spent on each activity.
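As an illustration, below is a minimal sketch of how such page-level time tracking might be implemented; the paper only states that tracking code was embedded in each page, so the endpoint name and payload shape here are hypothetical.

```javascript
// Minimal sketch of per-page dwell-time tracking (assumed implementation).
// Record when the page is entered, then report the elapsed time when the
// learner leaves. navigator.sendBeacon delivers the record reliably even
// while the page is unloading; the '/log-time' endpoint is hypothetical.
const pageEnteredAt = Date.now();

window.addEventListener('beforeunload', () => {
  const dwellTimeMs = Date.now() - pageEnteredAt;
  navigator.sendBeacon(
    '/log-time',
    JSON.stringify({ page: window.location.pathname, dwellTimeMs })
  );
});
```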
The eight Chinese characters were pictographs, meaning each character’s shape represents its meaning. These characters were displayed in the writing program in the following order: 门 (Door), 山 (Mountain), 子 (Child), 女 (Woman), 马 (Horse), 夫 (Man), 水 (Water), 日 (Sun). Every character contains several strokes in a particular order. For example, 门 (Door) has three strokes. The first is a dot at the top. Then a vertical stroke is drawn from the top downwards. The last is a compound stroke called horizontal turning with a hook: it starts with a horizontal stroke from left to right, turns downward, and ends with a hook (Figure 2).
Figure 2
An Example of 门 (Door) on Writing Canvas
An interaction analysis was performed to identify areas where learner-interface interaction occurred (Figure 3). Three types of nonverbal sounds were integrated into the interaction process to assist learners in exploring the basic rules of writing Chinese characters. A short, low-pitched “Ding” sound (S1) confirmed the action of locating the first stroke of a character. A short but heavy “Dur” sound (S2) informed the learner of an incorrect movement. A prolonged, high-pitched “Ding” sound (S3) reinforced the learner’s action upon the successful completion of a character.
Figure 3
The Interactive Process of Writing Activity - An Example of 门 (Door)
We used the Web Audio application programming interface (Web Audio API) to generate the synthetic tones. The Web Audio API is a high-level JavaScript API; generating tones in code eliminated the problems caused by large audio file sizes and avoided time spent waiting for files to load. For example, the sonic feedback indicating a correct movement (S1) was coded as playSound(‘sine’, 1440.13, 0, 0.8, true). This technique improved the synchronization of visual, auditory, and physical movement in the interaction process.
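To make the approach concrete, here is a minimal sketch of how a playSound helper like the one above could be built with the Web Audio API, together with a mapping from interaction outcomes to the three tones. The parameter order follows the call shown above (waveform, frequency in Hz, start delay, duration in seconds, fade-out flag), but the function body, the event names, and the S2 and S3 parameter values are our assumptions, not the authors’ actual code.

```javascript
// Sketch of a Web Audio API tone generator matching the call
// playSound('sine', 1440.13, 0, 0.8, true) shown for S1.
const audioCtx = new (window.AudioContext || window.webkitAudioContext)();

function playSound(waveform, frequency, delay, duration, fadeOut) {
  const oscillator = audioCtx.createOscillator();
  const gain = audioCtx.createGain();

  oscillator.type = waveform;              // 'sine', 'square', 'sawtooth', ...
  oscillator.frequency.value = frequency;  // pitch in Hz

  oscillator.connect(gain);
  gain.connect(audioCtx.destination);

  const start = audioCtx.currentTime + delay;
  gain.gain.setValueAtTime(1, start);
  if (fadeOut) {
    // Ramp the gain toward zero so the tone ends without an audible click.
    gain.gain.exponentialRampToValueAtTime(0.001, start + duration);
  }
  oscillator.start(start);
  oscillator.stop(start + duration);
}

// Dispatch from the three interaction outcomes to tones. The frequencies and
// durations for S2 and S3 are hypothetical; only S1's call appears above.
function onWritingEvent(event) {
  if (event === 'firstStrokeLocated') {
    playSound('sine', 1440.13, 0, 0.8, true); // S1: short confirmation "ding"
  } else if (event === 'incorrectMovement') {
    playSound('square', 220, 0, 0.4, true);   // S2: short, heavy "dur"
  } else if (event === 'characterCompleted') {
    playSound('sine', 1760, 0, 1.6, true);    // S3: prolonged high-pitched "ding"
  }
}
```

Because the tones are synthesized at call time, no audio files need to be downloaded, which is what allows the feedback to stay tightly synchronized with the learner’s strokes.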
For the visual equivalent version, we followed the Spatial Contiguity Principle and placed text feedback at the bottom of the canvas to guide interaction (Figure 4). ‘Correct! Continue!’ confirmed the correct selection of the first stroke, ‘Incorrect! Start Over!’ informed the learner of an incorrect drawing, and ‘You Got It! Next!’ signaled the completion of one character. In addition, the stroke color turned grey to indicate out-of-track drawing.
Figure 4
An Example of Visual Equivalent
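For comparison with the audio sketch above, the visual-equivalent feedback might be wired up as follows; the element id, event names, and color-setting callback are assumptions for illustration, not the authors’ code.

```javascript
// Sketch of the visual-equivalent feedback: text beneath the canvas replaces
// the three tones, and an out-of-track stroke is re-rendered in grey.
// The 'feedback' element id and event names are hypothetical.
const feedback = document.getElementById('feedback');

function onWritingEvent(event, setStrokeColor) {
  if (event === 'firstStrokeLocated') {
    feedback.textContent = 'Correct! Continue!';
  } else if (event === 'incorrectMovement') {
    feedback.textContent = 'Incorrect! Start Over!';
    setStrokeColor('grey'); // grey marks the out-of-track drawing
  } else if (event === 'characterCompleted') {
    feedback.textContent = 'You Got It! Next!';
  }
}
```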
Instruments
The instruments were the User Experience Questionnaire (UEQ) (Schrepp et al., 2017), the Situational Motivation Scale (SIMS) (Guay et al., 2000), and web data. The UEQ measures the user-interface interaction experience in online environments, capturing users’ feelings and impressions when interacting with a website. The questionnaire contains 26 items across six factors, rated on a 7-point scale. Attractiveness examines a purely emotional reaction of acceptance or rejection of the website. Efficiency, Perspicuity, and Dependability measure whether the website enables the learner to complete the task efficiently and effectively. Stimulation and Novelty assess non-task-oriented aspects of the interaction experience, such as the originality of the website design.
The SIMS measures both intrinsic and extrinsic motivation using a 7-point rating scale. The subscales of intrinsic motivation and identified regulation were used in this study, as we attempted to explore whether nonverbal sounds could stimulate intrinsic motivation. These items examined how satisfied learners were while performing the writing activity.
Time spent on a product is considered an objective indicator of user engagement with the product (Liu et al., 2010). The web data captured the amount of time spent on each webpage to understand how well the website held learners’ attention and kept them engaged in the writing activity.
Additionally, time spent on the drag-and-drop task was collected to gauge learning achievement, that is, how well participants could recall the correct stroke orders.
Data Analysis
A multivariate analysis of variance (MANOVA) was performed to examine whether nonverbal sounds led to a better interactive experience and increased intrinsic motivation, and whether any interaction effect occurred by learning preference. MANOVA examines whether one or more independent variables affect two or more dependent variables (O’Brien & Kaiser, 1985). Wilks’ lambda was used to detect statistically significant differences between the group means on a combination of dependent variables. Follow-up univariate analyses were conducted to determine how specific dependent variables contributed to a significant overall effect. Moreover, a p-value was obtained to detect whether a statistical interaction effect existed between website version and self-perceived learning preference.
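For reference, Wilks’ lambda is the standard likelihood-ratio statistic

\Lambda = \frac{\det(\mathbf{E})}{\det(\mathbf{H} + \mathbf{E})},

where H is the hypothesis (between-groups) sum-of-squares-and-cross-products matrix and E is the error (within-groups) matrix. Values near 1 indicate that group membership explains little of the multivariate variance, while values near 0 indicate a strong group effect.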
Results
Cronbach’s alpha coefficients were calculated for each subscale of the UEQ and SIMS using the Scale – Reliability function in SPSS (Statistical Package for the Social Sciences). The alphas of all six UEQ subscales were higher than .75, and the alphas of intrinsic motivation and identified regulation on the SIMS were .92 and .86, respectively, indicating a high level of inter-item reliability. Box’s Test of Equality of Covariance Matrices (.418) was not significant, p = .51 > .05, indicating that Wilks’ lambda could appropriately be used to determine the significance level. Bartlett’s Test of Sphericity was significant, p < .001, indicating sufficient correlation among the dependent variables, so the MANOVA would not risk increasing the likelihood of a Type I error.
The multivariate test results indicated a statistically significant difference between the groups (nonverbal sound and visual equivalent) on the combination of three dependent variables, Wilks’ Λ = .841, F(3, 121) = 7.619, η² = .159, α = .05, 1-β = .986. The η² of .159 indicated that approximately 16% of the multivariate variance of the dependent variables was associated with program version.
The univariate tests showed statistically significant group differences for interactive experience (F = 7.994, η² = .061, p = .005, 1-β = .801) and motivation (F = 14.619, η² = .106, p < .001, 1-β = .967). The η² results indicated that 6.1% of the variance in interactive experience and 10.6% of the variance in motivation were associated with group membership. However, the univariate tests did not find statistical significance for time spent on the learning program (F = .701, η² = .017, p = .708, 1-β = .348) or for the drag-and-drop task (F = .699, η² = .950, p = .709, 1-β = .348).
Discussion
The split testing results indicate that the inclusion of nonverbal sound in the interaction process has a positive effect on participants’ learning experience and intrinsic motivation. The UEQ measures the learning experience participants built through interactions with the program interface. In the experiment, learners explored the correct stroke orders of Chinese characters on the digital canvas with either auditory or visual feedback. Compared to the visual-only version, the inclusion of nonverbal sound in the interaction process activates a multisensory learning experience. This design simulates our experience in the real world, in which visual and auditory senses are coordinated to guide our behavior in many daily tasks. In addition, the synchronization of visual, auditory, and writing movement keeps learners better focused on the learning activity. When drawing on the digital canvas, learners maintain their attention on the writing activity with constant guidance delivered by nonverbal auditory signals. By contrast, learners in the visual group have to constantly switch their attention between the writing task and the text feedback. Although the text is presented close to the canvas in a succinct way, frequent switching of attention could increase the visual perceptual load and interfere with information encoding and retention (Calvert & Scott, 1989). According to the UEQ, learners’ perceptions of and feelings toward the interactive experience primarily come from their impressions of the program’s design and use (Schrepp et al., 2017). The positive effect on the learning experience may therefore be attributable to the reasons described above.
The SIMS examines the internal pleasure and satisfaction learners experience while working through the writing activity. The test results indicate that learners in the nonverbal sound group tended to be more motivated than those in the visual group. One possible reason is that the multisensory learning experience improves learners’ feeling of presence in an online environment, and presence is one of the determinants that instigates intrinsic motivation (Makransky & Petersen, 2021). As mentioned earlier, having nonverbal sound facilitate interactions activates a multisensory experience, so learners may have been more immersed; the degree of immersion is associated with the level of presence in an interactive environment (Makransky & Petersen, 2021). Another possible reason is that the synchronization of visual, auditory, and hand movements increases a sense of control over the writing activity. Learning activities performed with high perceived control over actions can trigger enjoyment in learners, thus activating intrinsic motivation (Makransky & Petersen, 2021; Moore & Fletcher, 2012; Pekrun, 2006). When working on the writing activity, learners in the nonverbal sound group can simultaneously focus on their drawing movements on the canvas and process the nonverbal sound feedback. Learners in the visual group, by contrast, must constantly shift their attention from drawing actions to visual feedback, which interrupts the drawing and potentially decreases perceived control over the writing activity.
Unlike the positive effects on learning experience and motivation, the observed difference between the two testing groups in the amount of time spent on the writing activity and the drag-and-drop task is not statistically significant. The survey results suggest that nonverbal sound could be an effective motivator for engaging learners with the writing activity. However, the web data collected from the program indicate that participants in the nonverbal sound group did not spend more time exploring the character writing activity than those in the visual group. In addition, participants in the nonverbal sound group did not complete the drag-and-drop task faster than those in the visual group. A possible explanation is that the time required to discover the correct stroke orders of all eight characters may have been short, because these are entry-level characters for first-grade students in Chinese-speaking regions.
Conclusion
We have depended on our visual systems in learning for so long that we almost forget how powerful the auditory system can be. As one of the primary information sources, nonverbal sound is essential in guiding us to function appropriately and effectively in daily life. However, it remains underrated as an information representation format in multimedia design for learning. Our study takes a first step toward incorporating nonverbal sound into multimedia design to facilitate active learning in interactive environments. The author team contributed actively through the different phases of the study, starting from the design of the Learning Chinese program: one member specializes in UX design, two come from educational technology, and one from research measurement and evaluation. This range of backgrounds enabled the authors to view the study and interpret the results from diverse perspectives.
This study did have some limitations. The first is the activity design of the Learning Chinese program. The learning content should have included a more sophisticated set of Chinese characters, and the drag-and-drop activity was not the best strategy for measuring recall of the correct stroke orders. Initially, the design team identified 15 Chinese characters of varying difficulty, and the final assessment was to have participants write on a blank canvas. The version used in testing lacked content complexity and simplified the assessment method, which may limit the ability to detect true group differences. Second, using computers as the delivery tool was not optimal for the language writing task; that decision was based on cost and convenience. Since the focus of this study was for participants to explore the stroke orders of Chinese characters, the ideal delivery tools would have been touch-screen devices that allowed participants to draw the characters with their fingers instead of a computer mouse. Third, environmental distractions may have affected learners in the visual group more than those in the nonverbal sound group. Participants completed the study at their preferred locations, and auditory signals are better suited than visual signals to attracting attention in a busy environment. If learners were distracted by their surroundings, nonverbal sounds could better redirect their attention to the writing activity; in that case, nonverbal sound played a role in stimulating attention rather than facilitating embodied interaction.
We attempt to bring nonverbal sound to the design community’s attention; even in this preliminary investigation, we see its great potential for facilitating learning. Several gaps remain in our understanding of using nonverbal sound in interactive environments. Based on the findings of the current study, future research might investigate the following aspects:
- More empirical studies are needed to discover effective designs. Future research might consider multiple testing versions, such as text only, audio only, non-text visual only, text + audio, non-text visual + audio, and text + non-text visual + audio, to identify whether the effect of nonverbal sound observed in this study still applies.
- An investigation of different types of nonverbal sounds in interactive environments. The current study focused on using synthetic tones to respond to learners’ actions in the interaction process. More studies are needed to explore different types of nonverbal sounds, such as environmental sounds and sound effects, and how they affect learning in interactive environments.
- More empirical studies are needed to investigate how the inclusion of nonverbal sound affects knowledge or skills in different subject areas. In this study, a well-designed post-test should have been conducted to examine how integrating nonverbal sound into the interaction process affected knowledge construction and retention of the basic rules of Chinese stroke orders. Due to the simplified assessment activity, we learned little about the effects of nonverbal sound on supporting language learning. More empirical studies exploring the effects on different learning outcomes would help us develop a deeper understanding of nonverbal sound for learning.
References
Abrahamson, D., & Lindgren, R. (2014). Embodiment and embodied design. In R. K. Sawyer (Ed.). The Cambridge handbook of the learning sciences (2nd ed., pp. 358-376). Cambridge University Press. https://doi.org/10.1017/CBO9781139519526
Aleksy, M., Bronmark, J., & Mahate, S. (2016, March). Microinteractions in Mobile and Wearable Computing. In Advanced Information Networking and Applications (AINA), 2016 IEEE 30th International Conference on (pp. 495-500). IEEE.
Ale, M., Sturdee, M., & Rubegni, E. (2022). A systematic survey on embodied cognition: 11 years of research in child-computer interaction. International Journal of Child-Computer Interaction, 33, 100478. https://doi.org/10.1016/j.ijcci.2022.100478
Anastopoulou, S., Sharples, M., & Baber, C. (2011). An evaluation of multimodal interactions with technology while learning science concepts. British Journal of Educational Technology, 42 (2), 266-290. https://doi.org/10.1111/j.1467-8535.2009.01017.x
Bajaj, J., Harlalka, A., Kumar, A., Punekar, R. M., Sorathia, K., Deshmukh, O., & Yadav, K. (2015). Audio cues: Can sound be worth a hundred words? In P. Zaphiris, & A. Ioannou (Eds.), Learning and collaboration technologies (pp. 14-23). Springer.
Barsalou, L. W. (1999). Perceptual symbol systems. Behavioral and Brain Sciences, 22 (4), 577-660. https://doi.org/10.1017/s0140525x99002149
Bishop, M. J., Amankwatia, T. B., & Cates, W. M. (2008). Sound’s use in instructional software to enhance learning: A theory-to-practice content analysis. Educational Technology Research and Development, 56 (4), 467-486. https://doi.org/10.1007/s11423-006-9032-3
Bishop, M. J., & Sonnenschein, D. (2012). Designing with sound to enhance learning: Four recommendations from the film industry. Journal of Applied Instructional Design, 2 (1), 5-15.
Black, J. B., Segal, A., Vitale, J., & Fadjo, C. L. (2012). Embodied cognition and learning environment design. In S. Land, & D. Jonassen (Eds.). Theoretical foundations of learning environments (pp. 198-223). Routledge. https://doi.org/10.4324/9780203813799
Boltz, M., Schulkind, M., & Kantra, S. (1991). Effects of background music on the remembering of filmed events. Memory & Cognition, 19 (6), 593-606.
Borchers, J. O. (2000, August). A pattern approach to interaction design. In Proceedings of the 3rd conference on Designing interactive systems: processes, practices, methods, and techniques (pp. 369-378). ACM.
Brewster, S. A. (1997). Using non-speech sound to overcome information overload. Displays. 17 (3-4), 179-189. https://doi.org/10.1016/S0141-9382(96)01034-7
Buxton, W. (1989). Introduction to this special issue on non-speech audio. Human-Computer Interaction, 4, 1-9.
Calvert, S. L., & Scott, M. C. (1989). Sound effects for children's temporal integration of fast‐paced television content. Journal of Broadcasting & Electronic Media, 33 (3), 233-246.
Carver, R., King, R., Hannum, W., & Fowler, B. (2007). Toward a model of experiential e-learning. MERLOT Journal of Online Learning and Teaching, 3 (3), 247-256. https://jolt.merlot.org/vol3no3/hannum.htm
Chou, P. T. M. (2010). Attention drainage effect: How background music effects concentration in Taiwanese college students. Journal of the Scholarship of Teaching and Learning, 10 (1), 36-46.
Dinh, H. Q., Walker, N., Hodges, L. F., Song, C., & Kobayashi, A. (1999, March). Evaluating the importance of multi-sensory input on memory and the sense of presence in virtual environments. In Proceedings IEEE Virtual Reality (Cat. No. 99CB36316) (pp. 222-228). IEEE.
Ferati, M., Pfaff, M. S., Mannheimer, S., & Bolchini, D. (2012). Audemes at work: Investigating features of non-speech sounds to maximize content recognition. International Journal of Human-Computer Studies, 70 (12), 936-966. https://doi.org/10.1016/j.ijhcs.2012.09.003
Furnham, A., & Bradley, A. (1997). Music while you work: The differential distraction of background music on the cognitive test performance of introverts and extraverts. Applied Cognitive Psychology, 11 (5), 445-455. https://doi.org/10.1002/(sici)1099-0720(199710)11:5%3C445::aid-acp472%3E3.0.co;2-r
Glenberg, A. M. (2010). Embodiment as a unifying perspective for psychology. Wiley Interdisciplinary Reviews: Cognitive Science, 1 (4), 586-596. https://doi.org/10.1002/wcs.55
Goldin-Meadow, S., Cook, S. W., & Mitchell, Z. A. (2009). Gesturing gives children new ideas about math. Psychological Science, 20 (3), 267-272. https://doi.org/10.1111/j.1467-9280.2009.02297.x
Grice, S., & Hughes, J. (2009). Can music and animation improve the flow and attainment in online learning? Journal of Educational Multimedia and Hypermedia, 18 (4), 385-403.
Guay, F., Vallerand, R. J., & Blanchard, C. (2000). On the assessment of situational intrinsic and extrinsic motivation: The Situational Motivation Scale (SIMS). Motivation and Emotion, 24 (3), 175-213. https://doi.org/10.1023/A:1005614228250
Hativa, N., & Reingold, A. (1987). Effects of audiovisual stimuli on learning through microcomputer-based class presentation. Instructional Science, 16 (3), 287-306. http://dx.doi.org/10.1007/BF00120254
Johnson-Glenberg, M. C., Birchfield, D. A., Tolentino, L., & Koziupa, T. (2014). Collaborative embodied learning in mixed reality motion-capture environments: Two science studies. Journal of Educational Psychology, 106 (1), 86. https://doi.org/10.1037/a0034008
Johnson, M. (2013). The body in the mind: The bodily basis of meaning, imagination, and reason. University of Chicago Press.
Jørgensen, K. (2009). A comprehensive study of sound in computer games: How audio affects player action. Edwin Mellen Press.
Lindgren, R., Tscholl, M., Wang, S., & Johnson, E. (2016). Enhancing learning and engagement through embodied interaction within a mixed reality simulation. Computers & Education, 95, 174-187. https://doi.org/10.1016/j.compedu.2016.01.001
Liu, C., White, R. W., & Dumais, S. (2010, July). Understanding web browsing behaviors through Weibull analysis of dwell time. In Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval (pp. 379-386). ACM.
Li, Y., & Finch, S. (2021). Using sound to enhance interactions in an online learning environment. In K. K. Seo, & S. Gibbons (Eds.). Learning technologies and user interaction (pp. 118-133). Routledge. https://doi.org/10.4324/9781003089704-9
Makransky, G., & Petersen, G. B. (2021). The cognitive affective model of immersive learning (CAMIL): A theoretical research-based model of learning in immersive virtual reality. Educational Psychology Review, 33 (3), 937-958. https://doi.org/10.1007/s10648-020-09586-2
Mann, R. E. (1979). The effect of music and sound effects on the listening comprehension of fourth grade students [Unpublished doctoral dissertation]. College of Education, University of North Texas.
Mayer, R. E. (2005). Cognitive theory of multimedia learning. In R. E. Mayer (Ed.), The Cambridge handbook of multimedia learning (pp. 31-48). Cambridge University Press. https://doi.org/10.1017/CBO9780511816819.004
Mayer, R. E., Heiser, J., & Lonn, S. (2001). Cognitive constraints on multimedia learning: When presenting more material results in less understanding. Journal of Educational Psychology, 93 (1), 187. https://doi.org/10.1037/0022-0663.93.1.187
Mihalca, L., & Miclea, M. (2007). Current trends in educational technology research. Cognition, Brain, Behavior, 11 (1), 115-129. https://silo.tips/download/current-trends-in-educational-technology-research-loredana-mihalca-mircea-miclea
Mizelle, J. C., & Wheaton, L. A. (2010). The neuroscience of storing and molding tool action concepts: How “plastic” is grounded cognition? Frontiers in Psychology, 1, 195. https://doi.org/10.3389/fpsyg.2010.00195
Moore, J. W., & Fletcher, P. C. (2012). Sense of agency in health and disease: a review of cue integration approaches. Consciousness and Cognition, 21 (1), 59-68. https://doi.org/10.1016/j.concog.2011.08.010
Moreno, R. (2006). Does the modality principle hold for different media? A test of the method‐affects‐learning hypothesis. Journal of Computer Assisted Learning, 22 (3), 149-158. https://doi.org/10.1111/j.1365-2729.2006.00170.x
Moreno, R., & Mayer, R. (2007). Interactive multimodal learning environments. Educational Psychology Review, 19 (3), 309-326. https://psycnet.apa.org/doi/10.1007/s10648-007-9047-2
O'Brien, R. G., & Kaiser, M. K. (1985). MANOVA method for analyzing repeated measures designs: an extensive primer. Psychological Bulletin, 97 (2), 316. https://psycnet.apa.org/doi/10.1037/0033-2909.97.2.316
Paas, F., & Sweller, J. (2014). Implications of cognitive load theory for multimedia learning. In R. E. Mayer (Ed.). The Cambridge handbook of multimedia learning (2nd ed., pp. 27– 42). Cambridge University Press. https://doi.org/10.1017/CBO9781139547369.004
Pekrun, R. (2006). The control-value theory of achievement emotions: Assumptions, corollaries, and implications for educational research and practice. Educational Psychology Review, 18 (4), 315-341. https://doi.org/10.1007/s10648-006-9029-9
Perham, N., & Currie, H. (2014). Does listening to preferred music improve reading comprehension performance? Applied Cognitive Psychology, 28 (2), 279-284. https://doi.org/10.1002/acp.2994
Perham, N., & Vizard, J. (2011). Can preference for background music mediate the irrelevant sound effect? Applied Cognitive Psychology, 25 (4), 625-631. https://psycnet.apa.org/doi/10.1002/acp.1731
Pfeiffer, D. G. (2008). Listen and learn: An investigation of sonification as an instructional variable to improve understanding of complex environments. Computers in Human Behavior, 24 (2), 475-485. https://doi.org/10.1016/j.chb.2007.02.006
Rutten, N., Van Joolingen, W. R., & Van Der Veen, J. T. (2012). The learning effects of computer simulations in science education. Computers & Education, 58 (1), 136-153. https://doi.org/10.1016/j.compedu.2011.07.017
Schrepp, M., Thomaschewski, J., & Hinderks, A. (2017). Construction of a benchmark for the user experience questionnaire (UEQ). International Journal of Interactive Multimedia and Artificial Intelligence, 4 (4), 40-44. http://doi.org/10.9781/ijimai.2017.445
Skulmowski, A., & Rey, G. D. (2018). Embodied learning: Introducing a taxonomy based on bodily engagement and task integration. Cognitive Research: Principles and Implications, 3 (1), 1-10. https://doi.org/10.1186/s41235-018-0092-9
Smetana, L. K., & Bell, R. L. (2012). Computer simulations to support science instruction and learning: A critical review of the literature. International Journal of Science Education, 34 (9), 1337-1370. https://doi.org/10.1080/09500693.2011.605182
Vickers, P., & Alty, J. L. (2002). Using music to communicate computing information. Interacting with Computers, 14 (5), 435-456. https://doi.org/10.1016/s0953-5438(02)00003-6