Interactive Music

Rob Bridgett for Computer Arts Magazine 2002

Introduction
What is Interactivity and why should it bother us linear musicians?
The music we compose has more often than not been based solely on the traditional narrative structure of beginning, middle and end. Whether this is a three-minute pop song or a larger-scale orchestral work, the notion that the performance is going to be heard in exactly the order in which it was composed is one that we all feel comfortable with. Yet increasingly the way we structure our music is changing. Most of us have written loop-based tracks for use on the internet and have provided several musical 'spot effects' that work within that piece of music when a button is pushed, and so on. The tools we use, in terms of sequencers and editors, are also breaking down the notion of linearity in music composition, in that we can take any section of our composition and move it easily around within the piece via cut, copy and paste. Indeed, those of us who already use these tools and techniques will probably find no difficulty in preparing pieces of music for use in an interactive game or application. The only general barrier to production is conceiving of how these sections will function in the application, and preparing the end material.
So why bother with these ideas of interactivity? The notion of music being self-generative and self-modifying is in fact an ancient one. Consider the Aeolian harp, an instrument which, like a wind chime, creates random notes when blown by the wind, or, as a more common example, the ringing of church bells, in which a specified number of different bells (or notes) are played in a different, seemingly random order every time they are heard. So these techniques and approaches have been around for a long time, and we have probably not really been aware that they are familiar to us. Unfortunately a lot of the current writing and thinking about interactive music tends to be overly academic, which puts up a rather intimidating barrier to those of us who just want to produce great-sounding music, whether it's for games, the web or any other interactive application.


Applications / Media and their influence over musical structures
Film and TV have influenced the structure of music for the last century, theatre and opera before them. Interactive applications and entertainment are now set to re-write the rulebook.
Certainly within the last 100 years, film has been the single most important influence on narrative music, having been responsible for commissioning some of the most interesting, thought-provoking and memorable music of the last century. Although the influence of film music can be felt heavily within the new wave of games consoles through increased use of 'orchestral' soundtracks, the actual structure of the music and the purposes for which the pieces are written are distinctly different. This re-structuring of music is due to the modular and interactive nature of the new media of game and web.
The first and most commonly used type of interactive music still takes a loosely narrative approach, in that it has any number of beginnings, middles and ends, but these can be played in a different order each time, dependent on game-play or user interaction.
This type of interactive score incorporates several narrative modules for several purposes. These different modules can be thought of in simply defined terms as either narrative, continual, evolving or resolving.

A narrative piece of music is one which plays straight from beginning to end without needing to be interrupted by user input. A good example of this would be a cut-scene or FMV movie in a game. As these sections work in a predictable and linear fashion, the old rules of film composition can be applied: the exact timings and lengths of events are known.

The second type of module, the continual, is basically a piece of music that needs to keep playing until a user input interrupts it. A good example would be a simple music loop on a menu awaiting input, or a particular section of a game in which the length of time for that section is an unknown parameter dependent upon many other factors.


The third type of module, the evolving, is more complex. In narrative terms it is an unknown quantity: the length of time this music is to play for could be anything from 10 seconds to an hour, the same as for the simple continual loop. However, there may be many other factors which influence this piece of music. A good example would be a combat situation within a game. Entering the combat would trigger the continual music, but then variables describing how well or badly the player is doing in combat need to modify the music and so give audio feedback to the player. If the fight is joined by two more enemies, the music will need to become more intense, and if the player runs to the side of the arena away from the fight to recover for a few moments, the music needs to reflect this less intense period of activity. This evolving sound can be achieved through manipulation of any number of musical variables, from pitch, timbre and orchestration to physical modelling or effects filtering. And finally, when the combat is over, we need to get smoothly out of the combat music and back into a non-combat state, which brings us onto the final module, resolving.

The resolving module is a transitional piece, usually played only once, which can be played at any time during the looping continual or evolving states and which signifies an end to the particular piece of music. If the player was successful in combat, the combat music could fade out while the triumphant ending to the piece is played over the top; if the player was unsuccessful, a more tragic piece could be played. What you hear and when you hear it is therefore totally dependent upon the way the user interacts with the system.
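
The four module types can be thought of as states that the score moves between in response to events. What follows is only a minimal sketch of that idea in Python, not the method of any particular game or tool: the event names (cutscene_finished, combat_started and so on) and the cue names are invented for illustration, and a real game would define its own.

```python
from enum import Enum


class Module(Enum):
    NARRATIVE = "narrative"   # plays once from start to finish (e.g. a cut-scene)
    CONTINUAL = "continual"   # loops until user input interrupts it
    EVOLVING = "evolving"     # loops, but is modified by game variables
    RESOLVING = "resolving"   # one-shot ending played over the loop


# Hypothetical events and the module each one leads to.
TRANSITIONS = {
    (Module.NARRATIVE, "cutscene_finished"): Module.CONTINUAL,
    (Module.CONTINUAL, "combat_started"): Module.EVOLVING,
    (Module.EVOLVING, "combat_won"): Module.RESOLVING,
    (Module.EVOLVING, "combat_lost"): Module.RESOLVING,
    (Module.RESOLVING, "resolution_finished"): Module.CONTINUAL,
}

# Hypothetical cue names: which ending is played depends on the outcome.
RESOLUTION_CUES = {
    "combat_won": "triumphant_ending",
    "combat_lost": "tragic_ending",
}


class Score:
    """Tracks which module is playing and which resolving cue, if any."""

    def __init__(self, start=Module.NARRATIVE):
        self.module = start
        self.cue = None

    def handle(self, event):
        next_module = TRANSITIONS.get((self.module, event))
        if next_module is None:
            # Events that mean nothing in the current module are ignored.
            return self.module
        if next_module is Module.RESOLVING:
            self.cue = RESOLUTION_CUES[event]
        else:
            self.cue = None
        self.module = next_module
        return self.module
```

Feeding the events "cutscene_finished", "combat_started" and "combat_won" to a Score in turn walks it from the narrative module through the continual and evolving modules to the resolving one, with the triumphant cue selected.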

The notion that a composition can be built from several modules can be seen not only in games and interactive applications, but has also been taken up within live TV and game shows. In the British TV show 'Millionaire', for example, the background music is structured using a continual module and then a resolving piece of music. A tense 'loop' of music is played while the contestant ponders the answers, and then when the answer is revealed another 'finale' piece of music is played over this. Because no-one knows exactly when the contestant is going to give the answer, the looping tension-building music must play continually without foreseeable end. The only real ending to this music comes from the other piece, which is played when the answer is revealed. Again there are two possible types of ending for this resolution, the correct answer and the wrong answer, and musical equivalents for these have been composed. So here we have an example of music being composed and structured in two distinct modules, each with a distinct purpose.


Modular Interactive Music Making
As we are now aware of the modular nature of interactive music, we can begin to think of ways in which these modules can be designed in order that they work together. There have until now been very few applications which allow the composer to work directly in the interactive format, but recently Microsoft's Direct Music Producer (DMP) has emerged, a free application which allows interactive content to be developed for Games or the Web, or any interactive medium.


DMP is designed specifically for building interactive content, and to this end it is structured around building modular variations, intro and outro modules, and variations on themes, all of which can respond to user input or be predefined or scripted. Indeed this application contains some very clever ways in which an interactive score can be built up and smooth transitions created between the various modules.
One of the more interesting features is the ability to program variants on things like 'one shot' sound effects, so for every footstep or sound event you can have up to 32 variations of the same event. These can be played in order from the first, in order from a random starting point, in random order, with no repeats, or by shuffling. This feature can also be applied to the music that can be written within DMP.
One of the main ways of creating music within DMP is via the creation of Downloadable Sounds (DLS instruments), which bear a striking similarity to software samplers in that you can take any sound or group of sounds and map them onto a keyboard. This enables you to create any number of instruments, a sampled piano for example, and to control them via a MIDI score within the main composition window.
This brings us onto one of the main flaws of DMP: it only really supports General MIDI information unless you are using DLS instruments, which are unfortunately very memory intensive within interactive applications. The gaming and web audience currently expects to hear fully orchestrated and professionally recorded audio on a par with film scores, and this is where DMP becomes tricky to handle if you are intending to manipulate pre-recorded modules of music. Even though the ability to stream in conventional Wave data does exist, it is still incredibly difficult and not very intuitive to fade between sections of music and preview the results in real-time. What you in fact have to do to get looping wave data to work properly is to assign the loops as DLS instruments occupying one key on the keyboard, and by the time you start working with lots of different modules things become extremely unmanageable and memory intensive.
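
To make the five playback orders concrete, here is a rough Python sketch of how each might choose among the variations of a single event. This is one reading of the mode names, not DMP's documented behaviour; the function name, the mode strings and the exact rules (for instance, that 'no repeats' only forbids the same variation twice in a row) are all assumptions.

```python
import random


def variation_order(count, mode, plays, rng=None):
    """Return the variation indices chosen over `plays` triggers.

    count: number of variations available for the event (DMP allows up to 32).
    mode:  one of the hypothetical mode names handled below.
    """
    rng = rng or random.Random()
    if mode == "in_order_from_first":
        return [i % count for i in range(plays)]
    if mode == "in_order_from_random":
        # Sequential, but starting from a randomly chosen variation.
        start = rng.randrange(count)
        return [(start + i) % count for i in range(plays)]
    if mode == "random":
        return [rng.randrange(count) for _ in range(plays)]
    if mode == "no_repeats":
        # Assumed meaning: random, but never the same variation twice running.
        order, last = [], None
        for _ in range(plays):
            candidates = [i for i in range(count) if i != last] or [last]
            last = rng.choice(candidates)
            order.append(last)
        return order
    if mode == "shuffle":
        # Assumed meaning: play every variation once in random order,
        # then reshuffle and go round again.
        order = []
        while len(order) < plays:
            deck = list(range(count))
            rng.shuffle(deck)
            order.extend(deck)
        return order[:plays]
    raise ValueError(f"unknown mode: {mode}")
```

For a footstep with four recorded variations, variation_order(4, "shuffle", 8) would give two complete passes through all four sounds, each pass in a different random order.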
As an extension to the problems with DMP's manipulation of wave data, the ability to manipulate timbre is also seriously lacking from the application.
Overall the application is very difficult to use and feels like it is at a very early beta stage, as there are so many features that the program lacks, and it does seem to have been developed for a purely General MIDI driven set-up. However, Microsoft must be praised for the development of this tool, as it is more than a leap in the right direction for musicians wanting to create music for interactive applications.


Generative Music
A second approach, and one less reliant upon the familiar modular narratives of film and linear music is that of generative music. This bears more in common with the Aeolian harp and church bells mentioned earlier, in that some basic parameters are set and then the music is allowed to evolve randomly, never the same from one performance to the next.
This approach is often considered an overly mathematical way of composing music; however, control over all the parameters remains totally within the grasp of the composer. These generative sections of music can either be built over time in a linear fashion, or can be employed using a continual loop. The generative score can also be thought of as an extension of the works and ideas put forward by John Cage on aleatory music, and more recently by Brian Eno, who has worked extensively with what is currently the best known of the interactive music applications, SSEYO's KOAN.

SSEYO KOAN
SSEYO have recently won an Interactive BAFTA technology award for their KOAN plug-in, and the application that allows you to create content for this platform is equally intriguing and exciting.
Established as far back as 1986, this application has been geared towards presenting the user with a simple and effective way of creating ever-changing, self-generating music. The number of parameters that can be set by the musician is vast, and includes all the basic musical values we are all familiar with, such as scale, chord progressions and harmonic rules, and even, refreshingly, timbre.

The results, which can be heard by visiting SSEYO's home page ( http://www.sseyo.com/ ), are very refreshing in that they are in no way like General MIDI. This is due mainly to KOAN's use of built-in FM synthesis, all of whose parameters may form part of the interactive nature of the music.
There are several main features within KOAN that range from predefined to algorithmically defined.

1) Rhythmic voice. This value offers a great deal of interactive scope, with some of the parameters being 'phrase pitch range', 'phrase gap range', 'phrase length', etc.

2) Ambient voice. This involves similar constraints as the rhythmic voice, yet allows longer more ambient notes to be generated, as well as being capable of freely ignoring the rhythm track.

3) Follow voice. This follows, as the name suggests, activity in a previously defined voice. You can also set further parameters, such as following the lead-line at an interval of a minor third, or doing so four notes behind to create a canon-like structure.

4) Repeat bar. Simply affects how 'repeat' functions.

5) Fixed pattern. This feature will basically play a pre-defined MIDI sequence; however, it can also be told to behave in a generatively adaptive manner or follow adaptive occurrences in other voices.

6) Listen voice. Again this simply takes another pre-defined voice and generatively creates an accompaniment to that line based on whatever rules you wish to define for it.
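
As an illustration of the 'follow voice' idea from the list above, the sketch below takes a lead line, transposes it up a minor third (three semitones) and delays it by four notes. It is plain Python operating on MIDI note numbers, not KOAN's own engine or file format; the function name and its parameters are assumptions made for the example, as is the choice to transpose upwards.

```python
def follow_voice(lead, interval=3, delay=4):
    """Return a follower line aligned note-for-note with `lead`.

    lead:     list of MIDI note numbers (60 = middle C).
    interval: transposition in semitones (3 = a minor third up).
    delay:    how many notes the follower lags behind the lead.
    Entries of None mean the follower is still resting.
    """
    rests = [None] * min(delay, len(lead))
    # The follower only gets to play the lead notes it has time to reach.
    echoed = [note + interval for note in lead[: max(len(lead) - delay, 0)]]
    return rests + echoed


# A six-note lead line: C D E F G A.
lead_line = [60, 62, 64, 65, 67, 69]
follower = follow_voice(lead_line)
```

Here the follower rests for the first four notes and then enters with the opening of the lead line a minor third higher, which is what gives the canon-like effect.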

One of the downsides of the power of the KOAN application is that after some time the lists of numbers and parameters can get extremely confusing. However a quick and easy way to get started is to make the KOAN application generate a random sound set-up, which you can then tinker about with to get the hang of what parameters change what sounds.


Aesthetic Problems
Interactive music is normally viewed from a technical rather than an aesthetic perspective. This means that composers have needed to work in ways constrained by the software that they are using, rather than being able to purely realise musical ideas without being limited by the sounds used by the hardware. Although many of the methods of generating interactive music are still reliant on the purely modular approach, what is required is an outlook based more on a halfway point between this modular approach and the generative approach.

Getting beyond general MIDI
The problem with many of the interactive music creation applications currently available is that they rely heavily on general MIDI or native computer MIDI sound banks from which they draw their sounds. This is of course bad for the person who is using the end application or game, as the end results can sound not a million miles away from the works of John Shuttleworth.
Ways to circumvent this do exist, such as utilisation of the DLS libraries (or downloadable sounds) within Direct Music Producer, which allow the mapping of any particular WAV sound across the virtual MIDI keyboard in order that the synthesiser can then use these sounds. This method is flawed at present due to the vast amount of memory that such DLS sound files take up. By using greatly compressed file formats such as MP3 or Ogg Vorbis, these timbral flaws can begin to be eliminated.
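
A little arithmetic shows the scale of the memory problem. The sketch below compares the size of uncompressed PCM audio with the same duration at a fixed compressed bitrate. The figures of CD-quality stereo and a 128 kbps stream are assumptions chosen for illustration, and the function names are made up. Bear in mind too that compressed audio generally has to be decoded before it can be played, so the saving applies to storage and download size, and to runtime memory only if the decoding is streamed.

```python
def pcm_bytes(seconds, sample_rate=44_100, bit_depth=16, channels=2):
    """Size in bytes of uncompressed PCM audio of the given duration."""
    return int(seconds * sample_rate * (bit_depth // 8) * channels)


def compressed_bytes(seconds, bitrate_kbps=128):
    """Approximate size in bytes of a constant-bitrate compressed stream."""
    return int(seconds * bitrate_kbps * 1000 / 8)


# A ten-second loop, uncompressed and at an assumed 128 kbps.
loop_seconds = 10
raw = pcm_bytes(loop_seconds)            # 1,764,000 bytes
packed = compressed_bytes(loop_seconds)  # 160,000 bytes
ratio = raw / packed                     # roughly 11 to 1
```

Under these assumptions a single ten-second stereo loop falls from about 1.7 MB to about 160 KB, which is why a score built from dozens of such loops quickly becomes unmanageable without compression.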

Where next? The Future of Interactive Music.
This is a brief look at apps and advancements on the horizon for interactive composers, where they may be heading and what would make life easier for linear systems to adapt the notions of interactivity into their make-up.
In terms of future developments for interactive music production and implementation, a number of advancements are still sorely needed. The built-in dedicated audio processing hardware, which is native to the XBOX console, is a step in the right direction for real-time effects filtering and manipulation. Other consoles need to make these real-time advancements possible; DirectSound on the PC has been making these advances in terms of CD-ROM and PC applications. However, real-time processing is a severely neglected element of interactive web software such as Macromedia Flash and Shockwave, in that there are no real-time DSP effects such as simple reverb and delay. The introduction of these effects would immediately increase the scope with which interactive and real-time sound and music can be filtered and altered, whilst reducing memory size and download time. The limitations of Flash's sound-object ActionScript are also all too clearly shown up for the musician who is not a gifted programmer.

Another major advancement in terms of real-time modified timbre is that of physical modelling, whereby the timbre of a synthesised sound is radically altered based on an LFO or other real-time alteration of the sound. The Native Instruments Absynth module gives one some idea of how this kind of sound manipulation works when tied to a particular parameter, and the creative scope for an application that allows this to be simply and effectively executed is quite vast.

There are various superb examples of how amazingly realistic the timbre of physical modelling can sound to be found on TaoSynth's homepage http://web.ukonline.co.uk/taosynth/
However, as this is only available on Linux and demands a high understanding of code in order to implement, it will be a long time until these features can be wrapped in a user interface that non-coder musicians can actually experiment with. The one thing this does point to is a not-too-distant future where physical modelling will sit alongside, and possibly replace altogether, the use of real samples, as physically modelled sounds can be controlled completely by the computer. Physical modelling also promises great things for vocal synthesis, whereby all kinds of weird and never-before-heard creatures will be conjured forth. It perhaps also opens a way to the holy grail of voice-overs, whereby the voice of a long-dead actor, such as Orson Welles, could be given new lines of dialogue. This technology also has a sinister side when one considers the potential uses for vocal physical modelling, in that anyone would be able to mimic anyone else's voice within a computer… think of the scene in Terminator 2 when the T-1000 perfectly mimics the voice of the boy's foster mother… it's quite frightening, but this is where current research is heading.

Other less sinister areas for future enhancement of interactive applications lie in the type of music data they can deal with. This again refers to the drawbacks offered by pure general-MIDI support for applications like DMP. Although the actual MIDI clock features are essential in creating timings for interactive modules to function within, by enabling support for music loops via programs such as Sonic Foundry's Acid, and support for these in various file formats, the timbral nightmares of general MIDI can begin to disappear and make way for use of real recordings.
Again on the subject of MIDI, DLS libraries will eventually evolve into something similar to what we currently know as software samplers, such as Nemesys' GigaStudio. These will enable the MIDI functionality to become much more realistic. One can imagine the power of the GigaPiano made into a fully accessible DLS instrument on a games console, integrating elements of pre-scored modular themes with elements of random generative music, and combined with MP3 or lossless sound file compression to make the DLS samples much more memory efficient. Of course there is ample scope for all of the applications I have described to be combined in one application, comprising modular arrangement of MIDI-powered software samplers and looped sample information, sections of Koan's generative software synth, physical modelling, real-time effects and third-party plug-ins. Eventually the interactive score will be a case of setting up the rules by which an application generates sound from every conceivable parameter.

Taking all these concepts to their logical conclusions, one can foresee a time when the computer or application is creating every sound in real-time and the composer's job is merely to create and define the parameters within which the music and sounds will function - and even this task may well eventually be best left up to the computer. However, one feels that this will somehow be lacking in the emotional 'human' feel that we get from music. Even in the harshest electronic compositions there is a human element, and until computers can effectively mimic these emotional cues to such a degree that we cannot tell the difference, music will remain a strictly human endeavour.

Approaching Interactivity: Creating your own simple interactivity and generative scores.

Interactivity can be based on very complex rules, but it need not be any more complicated than creating a track in ACID or some other loop based sequencer and exporting each individual piece of the score in the modular forms described earlier. However, getting a musical composition that doesn't sound like it is continuously looping can be achieved quite easily through generative processes.
Generative audio need not be reliant on a complex set of rules; often very simple approaches generate mathematically and musically beautiful pieces. These techniques may also be applied as a starting point to traditional linear composition (aleatory composition).
One such approach would be to simply create five WAVs of varying lengths, each consisting of a separate note. Played together these notes may make up a chord, but when all the sounds are started simultaneously and looped, their differing lengths mean that they coincide at different time intervals. An example made from several different keyboard notes can be created quickly and simply using Macromedia Flash.

This is simply done by making each note a separate Flash movie with a loop timing that differs from those of the other notes. There are five notes in all and they continually generate an ever-changing chord. This can be achieved with any sound (even loops that fade in and out can be simply and effectively implemented within Flash) and with any number of sounds, all with fairly low memory usage, and as they are separate elements they don't all have to load before anything is heard.
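
The reason five short loops can sound ever-changing is simple arithmetic: the combined pattern only repeats exactly when every loop restarts at the same instant, which happens at the lowest common multiple of the loop lengths. The Python sketch below works this out and lists when each note restarts. It models the timing only and is not Flash code; the loop lengths in milliseconds are made-up values, and the function names are assumptions.

```python
from functools import reduce
from math import gcd


def lcm(a, b):
    """Lowest common multiple of two positive integers."""
    return a * b // gcd(a, b)


def repeat_period_ms(loop_lengths_ms):
    """Time until all loops restart together again, in milliseconds."""
    return reduce(lcm, loop_lengths_ms)


def onsets(loop_lengths_ms, until_ms):
    """Sorted (time_ms, loop_index) pairs for every loop restart before until_ms."""
    events = []
    for index, length in enumerate(loop_lengths_ms):
        events.extend((t, index) for t in range(0, until_ms, length))
    return sorted(events)


# Five hypothetical note loops of slightly different lengths.
note_loops_ms = [3000, 3700, 4100, 5300, 6100]
period_ms = repeat_period_ms(note_loops_ms)
first_minute = onsets(note_loops_ms, 60_000)
```

With these invented lengths the five notes would take about 170 days to line up again exactly as they started, so in practice the listener never hears the same sequence of overlaps twice. Choosing lengths that share few common factors is what stretches the period out.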

…possibilities
As can be seen, the interactive nature of music and sound effects can be as simple or complex as we decide. The main problem is indeed one of conceiving how the elements all function in relation to one another, and not one of production. The music we already produce on linear systems can be simply and effectively adapted for use in interactive systems. We will perhaps not see interactivity in every musical and performance situation, but we have already begun to undermine the artist's carefully considered album running orders via use of random access CD playlists and MP3 playlists… perhaps it will eventually permeate every musical situation we are currently comfortable with.