A Sound Studies Blog

Exploring the Narrative Aspect of Video Game Soundscapes

by Melisa Sen

Once upon a time there was a curious reader who stumbled upon a blog post about how sound conveys narrative in video games. Whenever a story is told, many dimensions go into its narration. Whether it is a fairy tale, a movie or a video game, the sonic dimension is one of the most important. The most obvious sonic form of storytelling is spoken narration. While recorded media can have a narrator’s voice tell the story, a more distinctive form of narration emerges when different layers of sound are involved. Most modern video games have a story world with its own narrative time and space, shaped by the game’s sound dimension. Additionally, a game has rules and those rules have narrative meaning, which suggests that when a sound is attached to a rule, the sound itself acquires narrative meaning.

The goal of combining visual and aural information while adding interactivity is arguably best realized in the video game format. This in turn creates different layers of sound which, according to Sebastian Domsch, are able to further advance the narrative. First, there is the diegetic layer. It consists of all the sounds one might hear within the fictional world, such as heavy rainfall and footsteps. It is also important to distinguish between a non-interactive sound native to the game world (heavy rain) and an interactive sound created by the player (footsteps). The main purpose of diegetic sound is to convey the message: “something is happening or has happened in the game world”.

Related to that is the extradiegetic layer, which consists of sounds that only the player is meant to hear and that serve as background, for example a musical theme. Determining interactivity is a bit trickier in this case, as it varies from game to game. Some games loop a continuous soundtrack, others change it depending on the player’s actions. In the latter case, however, the player is usually not supposed to notice the change. By creating multiple short themes that can be looped and transitioned into one another, a video game can build a sound system flexible enough to follow unpredictable player choices with ease. Music can be a very powerful means of narrative storytelling. A highly energetic piece accompanying an intense battle not only makes for a better gaming experience; it also marks the significance that beating the enemy has for the overall story.
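The idea of looping short themes and switching between them can be sketched in a few lines of Python. This is only a minimal illustration, not how Breath of the Wild or any other game actually implements its music: the theme names, the game states and the `MusicDirector` class are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from game state to a short, loopable theme.
THEMES = {
    "exploring": "field_loop",
    "spotted": "guardian_loop",
    "victory": "guardian_ending",
}


@dataclass
class MusicDirector:
    """Chooses which theme to loop based on the latest game state."""
    state: str = "exploring"
    log: list = field(default_factory=list)

    def update(self, new_state: str) -> str:
        # Unknown states (such as a menu) keep the current theme looping.
        if new_state in THEMES and new_state != self.state:
            self.log.append(f"{THEMES[self.state]} -> {THEMES[new_state]}")
            self.state = new_state
        return THEMES[self.state]


director = MusicDirector()
# Player actions arrive in an unpredictable order; the director only
# reacts when the state actually changes to one it has a theme for.
for event in ["exploring", "spotted", "menu", "spotted", "victory"]:
    current_theme = director.update(event)
```

The point of the sketch is that the music never needs to know the player's choices in advance; it only needs a loop for each state and a rule for when to move on.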

Unique to video games is the ludic sound layer. It contains sounds that signify the player’s actions, for example entering a menu. These sounds are heard only by the player and indicate the impact the player has on various game situations. Making the player’s actions audible gives them meaning. For the most part, ludic sounds are interactive. In some cases, however, they are non-interactive and meant to alert the player to important game information, for example the sound that plays when a character’s health gets dangerously low. The purpose of ludic sound is to convey the message: “you have done something”.

Truly, sound in video games has come a long way. Due to technical limitations, it was not always possible to render diegetic sounds in a quasi-realistic way, which caused some of them to be perceived as ludic when they were not meant to be. Nowadays, this has shifted. More and more ludic sounds are presented as diegetic, which has opened up a gray area between the layers of sound. Such sounds can be described as ‘ludic-diegetic’, and their purpose is to convey the message: “you have made something happen in the game world”.
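For readers who think in code, the layers and the message each one conveys can be written down as a small lookup. This is a sketch of the typology as summarised here, not something taken from Domsch’s chapter; the names `Layer`, `SoundEvent` and `describe` are made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Layer(Enum):
    DIEGETIC = "diegetic"
    EXTRADIEGETIC = "extradiegetic"
    LUDIC = "ludic"
    LUDIC_DIEGETIC = "ludic-diegetic"


# Messages as quoted in the text. No one-line message is given for the
# extradiegetic layer, so it is deliberately left out.
MESSAGES = {
    Layer.DIEGETIC: "something is happening or has happened in the game world",
    Layer.LUDIC: "you have done something",
    Layer.LUDIC_DIEGETIC: "you have made something happen in the game world",
}


@dataclass(frozen=True)
class SoundEvent:
    description: str
    layer: Layer
    interactive: bool  # True if the player's input triggers the sound


def describe(event: SoundEvent) -> str:
    """Format an event as 'description (layer/mode)', plus its message if any."""
    mode = "interactive" if event.interactive else "non-interactive"
    message: Optional[str] = MESSAGES.get(event.layer)
    label = f"{event.description} ({event.layer.value}/{mode})"
    return f"{label}: {message}" if message else label


rain = SoundEvent("heavy rainfall", Layer.DIEGETIC, interactive=False)
menu = SoundEvent("entering a menu", Layer.LUDIC, interactive=True)
```

Keeping the layer and the interactivity as two separate fields mirrors the text: the layer says who the sound is for, while interactivity says who caused it.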

To make these concepts easier to grasp, a short video game clip serves as an example in which each of the above-mentioned sound layers can be distinguished. It is advisable to first watch the video and listen for distinct sounds before replaying it while reading the analysis below. The analysis attempts to identify every new sound as it is introduced, with its respective time stamp. The game in question is The Legend of Zelda: Breath of the Wild (Nintendo 2017), played on the Nintendo Switch. The playable character’s name is Link and the enemy he is fighting is called a Guardian.

00:00 Ambient sound of heavy rainfall (diegetic/non-interactive)

00:00 Link’s footsteps transitioning from a wet wooden bridge to a wet and grassy terrain (diegetic/interactive)

00:02 Link is sheathing and unsheathing a weapon (diegetic/interactive)

00:04 Distinct sound of wind blowing (diegetic/non-interactive)

00:06 Link making noises while moving (diegetic/interactive)

00:07 A bird’s shrill screeching sound (diegetic/non-interactive)

00:08 Link’s footsteps being quieter as he is crouching (diegetic/interactive)

00:12 Link’s gear rattles while he sprints (diegetic/interactive)

00:21 Link’s footsteps while walking down a hill and then through puddles (the splashes are more distinct) (diegetic/interactive)

00:27 Player targeting the Guardian (as indicated by an orange arrow pointing at the enemy) (ludic/interactive)

00:31 The Guardian theme starts playing when Link is noticed (extradiegetic/interactive)

00:33 Sound of the Guardian preparing its attack (indicated by the red laser) (diegetic/non-interactive)

00:39 Sound of the Guardian attacking Link (diegetic/non-interactive)

00:39 Sound of an explosion as Link is hit (diegetic/non-interactive)

00:40 Sound to indicate that Link’s health bar (in the top left corner) is very low (ludic/non-interactive)

00:41 Player opens inventory (ludic/interactive)

00:45 Player is selecting which food item to eat (ludic/interactive)

00:49 Player selects item (ludic/interactive)

00:49 Player confirms their selection (ludic/interactive)

00:50 Link is heard making eating sounds (diegetic/interactive)

00:58 Link perfectly parries the Guardian’s attack (indicated by a slow-motion effect) (ludic-diegetic/interactive)

01:12 Sound of the Fairy (an item that is able to resurrect Link) being activated (ludic/non-interactive)

01:19 Link does not perform a perfect parry and the explosion is redirected to the ground (diegetic/interactive)

01:19 Sound to indicate that the shield broke (ludic/non-interactive)

01:22 Player enters the small gear specific menu (ludic/interactive)

01:23 Player confirms selection (ludic/interactive)

01:24 Sound to indicate that the shield is badly damaged (ludic/non-interactive)

01:24 Link draws his bow (diegetic/interactive)

01:26 Link fires his bow but misses (diegetic/interactive)

01:39 Link makes a noise while he jumps (diegetic/interactive)

01:43 Link hits the Guardian in the eye (critical spot) with an arrow (ludic-diegetic/interactive)

01:49 Link hits the Guardian with his weapon (diegetic/interactive)

01:51 Link hits the Guardian with an arrow (diegetic/interactive)

02:11 Link defeats the Guardian (ludic-diegetic/interactive)

02:12 The Guardian theme concludes with a prominent ending (extradiegetic/interactive; ludic-diegetic)

02:22 Link is picking up items (ludic/interactive)

The analysis shows a pattern that broadly substantiates Domsch’s typology. The diegetic and non-interactive sounds are always ambient noises that paint the soundscape of the story world. Link’s sounds are always diegetic and interactive, since the evolution of video game sound has made it possible to create realistic sounds and the player is actively controlling Link. As for the extradiegetic layer of the music, Breath of the Wild also relies on an interactive system: the music started when the fight began, continued while the player was in the menu and ended when the player defeated the Guardian. All of the player’s own sounds are ludic and interactive, since only the player hears them and they are designed to give aural feedback on the player’s actions. Whenever an entry describes a sound rather than an action, it was non-interactive, as the player was not involved.

In the special case of ludic-diegetic, the sound was always interactive and only appeared when the player controlled Link and performed special interactions with the Guardian. When Link parried the attack, the game went into slow motion for a few seconds, which attributes narrative meaning to the sound that occurs. Whenever the Guardian was hit in its eye, a critical spot that stuns it for a few seconds, a distinctive sound was played. Both of these cues are the game’s way of telling the player that this is how one is supposed to defeat this special enemy. Upon completion of this task, the destruction of the Guardian is accompanied not only by a huge explosion but also by the climactic ending of the soundtrack.
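Patterns like these can also be checked mechanically. The following sketch transcribes only a handful of entries from the time-stamped list, with shortened descriptions, and counts the combinations of layer and interactivity; the variable names are arbitrary and the selection is illustrative, not a complete encoding of the clip.

```python
from collections import Counter

# A handful of entries transcribed from the time-stamped list:
# (time stamp, short description, layer, interactive?)
ENTRIES = [
    ("00:00", "heavy rainfall", "diegetic", False),
    ("00:00", "Link's footsteps", "diegetic", True),
    ("00:27", "targeting the Guardian", "ludic", True),
    ("00:31", "Guardian theme starts", "extradiegetic", True),
    ("00:40", "low-health alert", "ludic", False),
    ("00:58", "perfect parry", "ludic-diegetic", True),
    ("01:43", "arrow hits the Guardian's eye", "ludic-diegetic", True),
    ("02:11", "Guardian defeated", "ludic-diegetic", True),
]

# Count how often each (layer, interactive) combination occurs.
counts = Counter((layer, interactive) for _, _, layer, interactive in ENTRIES)

# Check the claim that ludic-diegetic sounds are always interactive.
ludic_diegetic_always_interactive = all(
    interactive
    for _, _, layer, interactive in ENTRIES
    if layer == "ludic-diegetic"
)
```

Extending `ENTRIES` to the full list would let the same two lines confirm or challenge each generalisation made in the analysis.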

In conclusion, video games make it possible to create highly complex sound structures that explore sound’s storytelling abilities. As technology continues to advance, game audio is improving with it. The short clip from The Legend of Zelda alone demonstrates the many layers of game sound, from diegetic and extradiegetic to ludic and ludic-diegetic. By further differentiating between interactive and non-interactive sounds, the clip also illustrates the narrative function of each of these layers. The game in question has a carefully crafted audio system that goes to great lengths to elevate storytelling. It is inspiring and exciting to experience, whether playing or watching. Hopefully, more games in the future will pay this much attention to detail when it comes to sound.

Sebastian Domsch, Hearing Storyworlds: How Video Games Use Sound to Convey Narrative, in: Audionarratology. Interfaces of Sound and Narrative, ed. by J. Mildorf and T. Kinzel, Berlin and Boston 2016, pp. 185-198.
