Music is what it is to us through media. Their influence can be major or minor, obvious or subtle. Yet with the same astuteness with which Niklas Luhmann declares mass media absolute for our cognition of the world, one can say of everything acoustic that mass media make it what it is, and with increasing intensity: as electronic media expand into ever more central positions in everyday and cultural life, music becomes more heavily influenced by media phenomena, makes more frequent use of media technologies for its conception and production, and increasingly reflects the functioning and the effects of media.
Even the earliest media that were important for music were not exclusively acoustic: musical instruments of every kind; singing and instrumental techniques that accompanied dances or other ritual or social activities; written and graphic notation; the printing and publishing industries; conventions of musical performance in various epochs—all of these media and dispositives articulate themselves and are handed down not only through sound, but to a large extent also by way of narrative and text-based representations; they are conveyed via the senses of touch, smell, and taste, but in particular through their visual manifestation. As soon as music is regarded not merely as an acoustic stimulus structure, but within the context of its genesis and effect, it becomes in many ways intermedial.
This essay examines music for its ‹natural,› visible components, for the visual representations by which it is conveyed and handed down, and for the complex interrelations it enters into with images, both in everyday media products and in the experimental field of the arts.
Non-acoustic media are essential for music. Written notation, for instance, produced an enhancement of both current and historical musical memory without which neither the conception nor the performance of complex polyphony would have been possible. The publishing industry created a new livelihood for nineteenth-century composers who, with their newly won civic freedom, had given up their dependence on church posts and princely allowances; as a result, those genres that could be performed in the home—primarily piano and chamber music for small ensembles—became particularly widespread.
So-called ‹eye music,› often exemplified by reference to works by Johann Sebastian Bach, illustrates how extensively the visual communication of musical structure represents an independent aesthetic level: certain symbolic meanings only become apparent by reading the score. The composer Dieter Schnebel took this thought to the extreme in his book «Mo-No: Music to Read.» Stimuli to musically appreciate the world of noises, together with visual text/graphic compositions, are meant to be taken in by reading alone—mono—and silently, so that an imagined music develops out of text and image in the mind of the reader. Nam June Paik, La Monte Young, and George Brecht also composed music for the imagination, which they laid down as so-called «events» in brief textual instructions.
The examination of the specific situations in which music is produced reinforces the diagnosis: Making music also feeds on a variety of visual stimuli—from the flow of reading the score to observing the instrumentalist's motor activity to the gesticulatory coordination of acoustic occurrences amongst several musicians. This continues on the recipient side: The players' gestures and facial expressions convey information to the listeners about context (for example, intended emotion), melody, harmony, and rhythm. The expressive quality of the music—but also the virtuosity demanded of the musician—become more transparent through the visual comprehension of the instrumental performance and supply material that is included in the aesthetic experience and evaluation of music.
Music is clearly an intermedial, above all audiovisual phenomenon to a much larger extent than one might spontaneously believe—more extensively than can be said for fine art. A remark made by Marcel Duchamp provides further explanation: «On peut regarder voir, on ne peut pas entendre entendre,» which in English means «One can watch (someone) seeing; one cannot hear (someone) hearing.» This insight can be extended: one can in fact hear neither seeing nor hearing. When the audience falls silent just before a concert, one may actually hear how the people begin to listen; however, this silence cannot be distinguished from the silence of visual concentration in a museum. Conversely, one can visually follow someone listening or watching: just as one can read from a person's eyes the sequence of elements being looked at and the emotional reaction to them, one can also see how people listening to something turn their gaze inwards, because this is where the sounds occur, and one can gather their various reactions to what is being heard from their eyes.
During the reception process, the rapt silence in the spaces of a museum promotes the focus on visual contemplation. For his ironic project «Audio Recordings of Great Works of Art,» the sound artist Ed Osborn collected, until 1999, sound recordings of the background noise in the immediate proximity of central works of European art (including, of course, the «Mona Lisa» in the Louvre) and discovered a complete indifference of the sound environment to the content of the works of art.
It is somewhat different for the artistic production process. The sound of the brush, the graver, or the hammer and chisel, as well as the acoustic environment, certainly affect the practicing artist. But in painting it was classic modernism that first found its way to an explicitly structural parallel to music, beyond the representational, often symbolic depiction of musicians, instruments, and listeners. Čiurlionis and Kandinsky, for example, translated musical experience into abstract color patterns that reproduce sound impressions and temporal proportions.
These artists of the early twentieth century made reference to a fundamental connection between the arts: that as in music, similar temporal processes occur in paintings, sculptures, and photographs. From this point of view, the production and reception of fine art are both activities that are acted out in time and that exhibit parallels to following a narration, a spatial-representational dramaturgy, or even a musical score.
In the further development, as artistic design methods continued to open up to time-based processes (performance, installation), the sound being produced almost automatically established itself as an integral part of these works of art—as in Jean Tinguely's sounding kinetic sculptures, or in works such as «Box with the Sound of Its Own Making» (1961) by Robert Morris. Conversely, composers take up the openness of the visual reception process as a model for a new concept of musical interpretation by producing graphic scores that leave the sequence of sound elements open. For his «Fontana Mix,» John Cage used graphic models and formulated rules: from two overlaid line structures, precise details for pitch, duration of the sound, and so on can be inferred. The composer conceived the score as an open system of rules—clearly defined, yet with a large degree of variability.
In popular music, at the moment when transmission, storage, and synthesis allow music to dispense with the score, with the necessity of renewed interpretation, and with the public performance context, several waves of visualization of musical context information occur: by the 1950s at the latest, the image of stars such as Elvis Presley took on a fundamental role in music marketing. At about the same time, record cover art began to develop an aesthetics of its own, with the jazz label «Blue Note» among its pioneers. In the 1960s and 1970s, the rock concert became a brilliant visual stage spectacle accompanied by light and smoke, and the invisibility of the star at mass events was counteracted by gigantic video walls. In the 1980s, the video clip developed into the unavoidable visual counterpart of every pop song.
Musical experimentalism, the central figure of whose development was John Cage, is based on an expanded understanding of material in which, for example, noises are used on a level equal in value to the sounds of instruments. Primarily, however, it is structures alien to music—technical, social, and non-musical aesthetic contexts—that are decisive for the conception of the works, among which the graphic score described above occupies a central position. Besides social, biological, and game-like rule processes, it was above all the technical functioning as well as the contents of media apparatuses that served as ‹structure generators,› for example in Cage's «Imaginary Landscape No. 4,» in which the tuning and the volume of twelve radios are adjusted according to a time schedule.
At the beginning of the 1950s, the realization of such works was still confined to professional musicians. During this period John Cage, for instance, backed away from any interaction with listeners because, in his opinion, only trained musicians could resist the temptation of sinking into musical clichés when realizing open compositions. Cage's objection illustrates a central problem: How can the competencies of music experts (composers, interpreters) be brought together with those of listeners? How can the meaning of their interaction be conveyed to listeners without musical training in such a way that they act in a ‹musically meaningful› way?
The solution the experimentalists had for this was to expose the intermedial structure in their interaction: the use of technical or other generally familiar systems that could also be understood and operated by the musically untrained. With this approach, the audience can also take on the role of the interpreter. «Maxfeed» (1967) by Max Neuhaus resembles a transistor radio; however, it produces the screeching and hissing sounds itself. The audience consists of only one person, who determines the sound sequences using the dial settings he is already familiar with from his own radio.
As a structure generator, intermediality in the first place aids the composer in finding new structures. At the same time, intermediality is used as a point of contact for the recipient so that his interaction can follow a consistent line. The logical consequence of this development is the step from the concert performance to the installation situation, which the audience masters independently and individually.
How important ‹visual support› is when decoding a musical figure (What facial expression does the musician have? What are his physical gestures when producing a particular sound?) can be demonstrated by the example of the electronic music of the Cologne School of the 1950s, in which the absence of human interpreters on stage, and consequently of all visual elements, was perceived as a flagrant communication problem. Electronic tape pieces such as Karlheinz Stockhausen's «Studie I» were frequently described as lifeless, rigid, and unmusical. The return of interpreters to the stage in so-called live electronic music since the 1960s countered this problem, either by transforming the sounds produced by the instrumentalists during the performance or by producing and modifying purely electronic sounds live. Stockhausen's «Kontakte» made the musical effects of electro-acoustic technology visible: one could again see the human being making the music.
However, this approach did not always solve the problem to everyone's general satisfaction. Today's audiences often lack a sense of what performers who operate electronic equipment by means of buttons and dials are actually pushing and turning. Unlike the classic playing of instruments, hardly anything is conveyed through physical gestures, because what is hidden behind the individual buttons is different at every concert, while the playing of a violin can be universally interpreted on the basis of the violinist's gestures. It was not without a certain amount of irony that Nic Collins built his «Trombone Propelled Electronics,» with which he controls an electronic piece of equipment with the familiar gestures of a trombonist without, however, having a marked talent for playing the trombone itself. This problem has become even more acute in the case of the computer performers of recent years, who are often nearly rigid and frequently sit at their laptops without a single discernible sign of emotion. Listeners obviously expect more from music than just ordered sound.
One solution for this communication problem is for performances to take place in a new kind of spatial setting or in a lounge situation, in which one can regularly withdraw one's visual attention from the stage and allow it to wander through the room and turn it towards what is occurring in it. Other solutions are visualizations of musical contexts in installation settings or visuals shown parallel to the music and whose content or aesthetics can also be independent of the music. A further solution is the interactive inclusion of the audience. All three solutions are based on intermediality.
However, the inclusion of visual media in musical practice does not feed only on the lack of visual stimuli in electronic music; it is an aesthetic tendency bound up with the symptoms of dissolution of the concert format and of the evaluation criteria associated with it. The reception situation in concert halls—the elevated stage at the front of the hall—which developed in the nineteenth century and was coupled with strict behavioral norms such as sitting still and clearly defined applause rituals, is to a large extent subordinate to visual criteria. The listener with closed eyes may be highly respected by a musicology fixated on abstract form and structure; in the audience, however, s/he is in the minority. The majority of concertgoers intently follow the conductor and the musicians with their eyes. Frontal orientation in concert halls has been declining since the mid-twentieth century: so-called concert installations, which frequently integrate multimedia elements, distribute not only the instrumentalists but also the members of the audience throughout the space, or seat the musicians in the center.
The emphasis on the body and on the physical production of sound, often accentuated through the use of media, seems almost antithetical to the disembodied performances of electronic music and laptop concerts. In the twentieth century, the fine-motor virtuosity of instrumental musicians, a central criterion in the nineteenth century, experienced a shift towards an emphasis on overall physicality in the production of music. Helmut Lachenmann made the physical, instrumental production of music the content of his works (for example, «Pression,» 1969/70). The pieces Nam June Paik composed for the performer Charlotte Moorman (e.g., «TV-Bra for Living Sculpture,» 1969), as well as performances such as Stelarc's «Fractal Flesh,» in which he incorporates himself into an electrical system, are likewise media stagings of instrumental sound generation that emphasize the body.
Because the ideal of ‹absolute music› continues to have an effect up to the present day,  what has always been a matter of course in popular music had to be expressly reestablished in art music: The music of Jimi Hendrix or The Who, heavy metal or hip hop would not be the same without the ‹body work› of the musicians.
However, more recent compositions that adhere to the concert and to the genre of instrumental music are also undergoing intermedialization. The playing of instruments in live performance, for instance, is enhanced or visually mirrored by video projections. For «MACH SIEBEN,» Michael Beil made a video recording of the pianist Ernst Surberg playing the piece; in performance, the video projection and the live interpretation combine to become the temporal mirror of a symmetrically planned composition. Vadim Karassikov enlarges instrumental gestures that create sound bordering on inaudibility, on projection surfaces hung above the musicians. In «Eyear» (2003) by Pascal Battus and Kamel Maad, the creation of differentiated drum sounds, produced by rubbing stones and sand on the drum skins, is filmed in close-up in real time and projected onto a screen. The drum becomes the stage, whose sound occurrences experience an aesthetic visual doubling in the video.
Visualizations accentuate qualities of musical production, taking up elements already present in the composition and in the playing of the instrument and putting them into concrete form by media-technical means. The basic media qualities of instrumental music production remain intact in this process. The examples cited above can be read to suggest that the composers regard the visuality of music-making as an integral part of music, and merely make us aware of it with the aid of the video image.
While the video image is still a rarity in art-music concert practice, clubs are hard to imagine without video monitors. In the 1990s a new audiovisual format emerged in clubs when DJs and VJs began performing together. For the most part, musicians and video artists have only loose contact during their performances; that is, the connection between image and sound is rarely based on a technical link between the respective instruments or on concepts drawn up in detail, but primarily on rough outlines or improvisation, for example in performances by 242.pilots. Connections between music and image are primarily atmospheric, and it is rare for either level to deal with specific content or narrative structures. Their different proximity to reality often results in an imbalance: while real images are frequently used in the visuals, e.g. urban scenes in «Deathsentences» (2003) by the group Negativland, the music is hardly ever concrete or narrative. Sometimes it is the use of just a few concrete noises, assignable to real objects, that mediates between narrative images and electronic sound processes.
When linking image and sound, the question arises again and again of the process or code according to which structural features should be translated from one sensory level to the other: What does it mean for the music when the image is blue, pale, or moving? And what does it mean for the image when the timbre of the music is dull, or when the musical structure is harmonically complicated or melodically dominated by large interval leaps?
The problem results from the fact that on the one hand, when processing image and sound cognitively they have an inseparable reciprocal effect: For example, it is sometimes difficult to say whether it was the images or more likely the music in a film that prompted us to have a certain opinion about the overall experience. On the other hand, however, the individual sensory stimuli or the gestalt on which cognitive processing is based are incommensurable, i.e. incomparable in every way: There are as few colors and bodies in the sounding music as there is a major-minor relation or a melodic line in the visual design.
The most extensive integrative role in the audiovisual reception process is played by the temporal structure. Rhythm is a feature that can be perceived in music as well as in the image. Whether as image allocation, movement of the figures, or the editing rhythm of the film; whether as musical pulse or as melodic-rhythmic figure: temporal structure can always be experienced physiologically or physically. Synchronicity of the auditory and visual levels is therefore an important means for creating an integrated experience. This is why DJ/VJ works start at precisely this point. The extent to which rhythm is designed to correspond across image and sound is an individual artistic decision. Image and sound can be brought together at junctions, or they can be synchronized or counterpointed throughout.
Analogies between image and sound, however, are also based on parallel perceptual experiences in everyday life: for physical reasons, only large bodies are capable of radiating long-wave frequencies, which is why we associate great volume and power with low tones. High tones, by contrast, are especially well suited as warning signals owing to the acute sensitivity of the human ear to them; this is why film uses them, for example, to signal horror, with the same effect across all cultures.
References between visual and auditory structures can be created in two ways. One possibility is the use of structural or atmospheric analogies or relative synaesthesia: the dark image to which low tones are played; the glaring, mangled surfaces of color accompanied by screeching, high-pitched sounds; or the subdued staccato sequence of notes occurring with the movement of small graphic elements, etc. The other possibility is the narrative assignment of noises to visible objects, and vice versa.
The Pythagoreans sought correspondences between colors or light and sound. They connected their concept of the harmony of the spheres, which arises out of the harmonious relations between planetary movements, with the colors of light. The tertium comparationis, the mediating value between color and pitch, remained purely speculative, however. Leonhard Euler's theory of the oscillation of light, published in 1746, as well as Goethe's theory of colors, restimulated the search for scientifically sound, uniform concepts of light and sound, and thus for the specific assignment of individual colors to particular pitches. Various mystical and esoteric conceptions of the world played a role in this quest for a common source and for the fusion of visual and acoustic aesthetics.
Even in the twentieth century, attempts were made to integrate colors and light into complex systems, either of a cosmic nature or related to the language of music. Based on his own experiences as a synaesthete, in 1911 the Russian composer Alexander Skrjabin (1872–1915), who sympathized with the symbolists and with theosophical thought, introduced a color classification system that he regarded as the «reunion of the arts separated over the course of time.» «By overcoming all differences, the synthetic Gesamtkunstwerk is meant to immensely expand consciousness and lead to a state of ecstasy. It shall therefore be structured in such a way that all of the five senses fuse together into a union. Arts dependent on ‹will› (music, dance, language), however, are superior to the play of color and scents.» The background for these audiovisual references is an integral system of analogy rooted in the subconscious, one that influences the form and structure of music and its enmeshment with symbolic contents and with colors and scents.
Skrjabin wrote a light staff, ‹Luce,› at the top of his symphonic poem «Promethée. Le Poème du feu,» op. 60, from 1911. However, he did not construct a one-to-one translation of the music into colored light: the part for color piano develops its own formal structure, which is brought together with the music only at several junctions. This light version of Prometheus was not performed during Skrjabin's lifetime, but it was important for the discourse surrounding «Der Blaue Reiter» and Kandinsky. Skrjabin's «Mysterium,» in which both colored lights and scents were meant to contribute to a synthesis of the arts and a holistic sensory experience, remained unfinished. Neither the scent organ nor the hemispherical building intended for the performance of «Mysterium» was ever built.
Instead of keys, Olivier Messiaen created a system of so-called modes on which to base his music. Much as certain functions were attached to the ecclesiastical modes within a liturgy, each of Messiaen's modes is based on an individual scale and is assigned certain colors. In addition, Messiaen established references to ornithology and elaborated this extensive system in his seven-volume «Traité de rythme, de couleur, et d'ornithologie.» His works, however, are pure sound; the union of color and tone, of form and time, constitutes only an ideal basis. Perception of the «magic of the impossible, thus the identity of time and color» is left to the recipient's power of imagination.
The Russian Ivan Wyschnegradsky followed the systemic approaches developed by Skrjabin and Messiaen. He constructed a microtonal scale divided into seventy-two equally tempered intervals (instead of the twelve semitones of our normal scale) and abandoned octave periodicity. Wyschnegradsky developed a system of ‹total analogy› by dissecting colored circles into concentric rings and cells according to tone-systemic rules, thus arriving at 5,184 color-tone cells. His vision was of a cosmic, spherical «Temple of Light,» which, like Skrjabin's «Mysterium,» was never realized.
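The arithmetic of such an equal division is simple to state: each of the seventy-two steps multiplies the frequency by the seventy-second root of two, about 16.7 cents. A minimal sketch (the 440 Hz reference pitch is an illustrative assumption, not something the text specifies):

```python
# Frequencies in a 72-tone equal temperament: the octave is divided into
# 72 equal steps instead of the usual 12 semitones, so each step spans
# 1200/72 ≈ 16.7 cents. The reference pitch is an assumed example value.

def edo72_frequency(step: int, reference_hz: float = 440.0) -> float:
    """Frequency of the given 72-EDO step above (or below) the reference."""
    return reference_hz * 2.0 ** (step / 72.0)

if __name__ == "__main__":
    print(edo72_frequency(72))           # 72 steps up is exactly one octave: 880.0
    print(round(edo72_frequency(6), 2))  # 6 steps equal one ordinary semitone
```

Six of these micro-steps reproduce the familiar equal-tempered semitone, which is why the system can be heard as a refinement of, rather than a break with, the twelve-tone scale.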
At the 1970 World's Fair in Osaka, a hemispherical auditorium was built for the first time, according to designs drawn up by Karlheinz Stockhausen. Sounds could move spatially through the auditorium over fifty loudspeakers, and the speaker assignment and the illumination of thirty-five sources of white light were controlled intuitively using special ‹spherical sensors.› In the decades to come, Stockhausen consolidated his synthesizing approach to art into a personal mythology and executed it in his comprehensive work cycle «LICHT.»
The Fluxus artist La Monte Young stands only partially in the tradition of these composers. His system of sound and harmony is based on mathematical principles and concentrates on the phenomenon of time, since oscillations are temporal phenomena. His first concept for «Dream House,» a «living organism with a life and tradition of its own» developed in collaboration with the light artist Marian Zazeela, dates back to 1962. «Dream House» transforms a space into a finely balanced sound-light environment. The sounds La Monte Young produces consist exclusively of pure sine oscillations, whose frequencies he tunes to the proportions of the space in such a way that standing waves develop. Using a set of different speakers and mathematically related wavelengths, a sound space is created. The recipient strides through the space, tracking down junctions and at the same time ‹interrupting› the balance of the standing waves. The wavelength of the magenta-colored light is connected to the tone frequencies. As with Skrjabin and Wyschnegradsky, symmetrical arrangements play an essential role in the complex mathematical structuring of the sound space. In these examples, the transformation logic, i.e. the rules for assigning color to tone, is derived from music (compositional structure, music theory) or from the mathematical relations of musical acoustics.
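The acoustics behind tuning frequencies to spatial proportions can be sketched briefly: between two parallel walls a distance L apart, a standing wave forms whenever an integer number of half-wavelengths fits the distance, i.e. at f = n·c/(2L). The room length below is an illustrative assumption, not a dimension of any actual «Dream House»:

```python
# Axial room-mode frequencies between two parallel walls: a standing wave
# forms when an integer number of half-wavelengths fits the wall spacing L,
# i.e. f_n = n * c / (2 * L). The 10 m room length is an example value.

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C

def axial_modes(length_m: float, count: int) -> list[float]:
    """First `count` standing-wave frequencies for a wall spacing of length_m."""
    return [n * SPEED_OF_SOUND / (2.0 * length_m) for n in range(1, count + 1)]

if __name__ == "__main__":
    # For a 10 m spacing the fundamental mode lies at 17.15 Hz, and every
    # higher mode is an integer multiple of it.
    print([round(f, 2) for f in axial_modes(10.0, 4)])
```

A sine tone held at one of these frequencies produces fixed pressure nodes and antinodes in the room, which is what lets a listener walking through the space literally hear positions, as described above.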
Work by the composer and artist Christina Kubisch demonstrates how closely visual and auditory design are interrelated in the area of sound art. Sound installations may always have something to do with space, and thus with seeing and with orientation within the space; however, they are not invariably arranged visually as well. With Kubisch, besides the sound level there is almost always a light component, whereby she primarily works with the bluish element of black light. With her, too, the phenomenon of time binds the senses together in an expressly non-synaesthetic context. More precisely, it is silence, regarded as something motionless in time and experienced visually as well as auditorily, that becomes the connecting link. Thus in «Klang Fluss Licht Quelle—Vierzig Säulen und ein Raum» (1999), both media convene in a sensual gesture, in a palpability and subtlety, as an expressive parameter of the space they occupy. The design of painterly and tonal elements neither adheres to a ‹calculated› transformation logic nor strives for unity; rather, it is based on intuitive decisions.
With the arrival of electronic media, the practice developed of translating music directly into an image as an electric signal, and vice versa, thus generating a rhythmic-formal parallel structure. Approaches of this kind can already be seen in the early video work of Nam June Paik. The installation «Skate» (2004) by the London-based media artist Janek Schaefer is founded on a simple electric analogy: he manufactured special records whose grooves are interrupted, so that the needle hops from one groove fragment to the next. The three-armed record player on which the records are played is connected to a set of red light bulbs, the lamps lighting up in the arbitrary rhythms created by the playing.
Experiments conducted in the first half of the twentieth century with optical sound recording, which sought to acquire image and sound from one and the same material and thus achieve an audiovisual fusion, can be regarded as forerunners of today's visuals. Based on his own research, Oskar Fischinger assumed for «Tönende Ornamente» (1932) that there are unconscious connections between culturally disseminated ornaments and the sounds they produce when played on optical sound equipment. In Leningrad in 1930, Arsenij Avraamov and Jevgenij Scholpo began experimenting with the synthesis of sound out of graphic forms. For his idea of the ‹drawn sound,› Scholpo constructed a special machine whose principle is similar to the so-called wavetables used in digital sound synthesis: if one periodically reads out an individual image with a particular curve form using optical sound equipment, a continuous sound is created. Scholpo explored, for instance, to what extent the sound of a scissor-cut-like facial profile reflects the character of the type of person shown. This approach is paralleled by the audiovisual works of the duo Granular, who process image and sound according to the same principle (of granular synthesis).
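The wavetable principle mentioned above can be sketched in a few lines: a single ‹drawn› curve, stored as a table of amplitude values, is read out in a loop, and the reading speed determines the pitch. The curve shape here is an arbitrary stand-in for a traced profile, not a reconstruction of Scholpo's material:

```python
# Minimal wavetable sketch of the 'drawn sound' idea: one drawn curve,
# read out periodically, becomes a continuous pitched tone. The curve is
# an arbitrary illustrative shape (a ramp with a bump), not a real trace.
import math

def drawn_curve(size: int) -> list[float]:
    """Stand-in for a drawn shape: deliberately not a pure sine."""
    return [2.0 * i / size - 1.0 + 0.3 * math.sin(4 * math.pi * i / size)
            for i in range(size)]

def wavetable_tone(table, frequency_hz, sample_rate, num_samples):
    """Loop-read the table at the given pitch (nearest-neighbour lookup)."""
    size = len(table)
    phase_step = frequency_hz * size / sample_rate  # table cells per sample
    return [table[int(i * phase_step) % size] for i in range(num_samples)]

if __name__ == "__main__":
    table = drawn_curve(256)
    # 250 Hz at 8 kHz: the tone repeats every 32 samples.
    tone = wavetable_tone(table, frequency_hz=250.0, sample_rate=8000, num_samples=64)
```

Whatever the curve looks like, periodic readout yields a periodic waveform; the drawing shapes the timbre, while the readout rate alone sets the pitch.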
Interest in the use of electric signals for the generation of music and image has existed since the initial years of electronic design methods, for example with the visualization of sounds on the oscillograph.  In addition, forms of visualization were sought which, like traditional notation, could serve as a suitable foundation for the musicological analysis of electronic sound processes (sonagram, graphic notation). In the process, it was realized that these visualizations have their own aesthetic appeal.
Opportunities for the visualization of sound multiplied with the computerization, and thus the digitization, of sound. The principle of the sonagram is taken up by ‹analysis-resynthesis software,› with which one can graphically manipulate the visually represented analysis data of acoustic sequences. The visualization ‹corrected› in this way can in turn be transformed back into sound. Guy van Belle and the group «code31» applied a similar process in the networked performance «Anyware» (2004), in which synthetic sounds and a dynamic colored area are controlled by the same parameters and sent to the other performers via the Internet. The analysis of the incoming sounds and images influences the renewed synthesis of sound and color, resulting in continuous feedback between image and sound as well as between the players.
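The analysis-resynthesis principle can be illustrated with a toy example (not the actual software named in the text): a short signal is analysed into spectral bins, the analysis data is ‹graphically› edited by silencing one partial, and the edited spectrum is transformed back into sound.

```python
# Toy analysis-resynthesis loop: analyse a signal into spectral bins with
# a plain DFT, 'correct' the visual analysis data by erasing one partial,
# then resynthesize. Illustrative only; real tools use FFTs and phase
# vocoders, not this naive O(n^2) transform.
import cmath, math

def dft(signal):
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n)) for k in range(n)]

def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]

if __name__ == "__main__":
    n = 32
    # Two partials; the analysis shows them as bins 2 and 5 (plus mirrors).
    signal = [math.sin(2 * math.pi * 2 * t / n)
              + 0.5 * math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
    spectrum = dft(signal)
    # Edit the 'image' of the sound: erase the second partial entirely.
    spectrum[5] = spectrum[n - 5] = 0
    edited = idft(spectrum)  # back to sound: only the first partial remains
```

The round trip is lossless apart from the deliberate edit, which is exactly what makes the visualization a workable editing surface rather than a mere display.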
In «Wounded Man´yo 2/2000,» the Fluxus and media artist Yasunao Tone translates Japanese characters, which he draws with a mouse using the audio software «Sound Designer II,» into acoustic oscillation sequences. In this way, noise-like sound structures become audible which, in their sonic conciseness, can be associated with Japanese characters.
The Argentinean composer Ana Maria Rodriguez attempts to create a union of sound and image by analyzing narrowly defined raw material and then using it as a source for both image and sound. In her work «Code Switching» (2004) with the Australian video artist Melita Dahl, she makes the transformation from one code to another the explicit subject. Facial expression as a visually mediated element, and phonemes as acoustically produced elements from one and the same head and mouth, constitute the raw material for this audiovisual installation. The faces of the performer Ute Wassermann are projected onto four large screens and transformed into one another using a morphing technique. Individual phonemes have been isolated and likewise compressed into an abstract sound/language space. Both processes applied to the raw material are based on the same principle.
For the audiovisual online instrument «nebula.m81,» Netochka Nezvanova uses nothing but the definitions of technical formats: without requiring any data transformation or conversion, the program simply reinterprets the internationally standardized «ISO Latin 1» character set of arbitrarily selected Web sites as the audio format «Sun µlaw,» with an 8-bit resolution and a sampling frequency of 8 kHz, and plays it as sound. The technical format definitions determine the aesthetic result.
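This bare reinterpretation of text bytes as audio can be sketched as follows. The decoding follows the standard G.711 µ-law expansion underlying the «Sun µlaw» format; the sample text is, of course, only illustrative:

```python
def ulaw_decode(byte):
    """Expand one 8-bit G.711 mu-law code into a linear audio sample."""
    byte = ~byte & 0xFF                     # mu-law bytes are stored inverted
    sign = byte & 0x80
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -sample if sign else sample

# Treat arbitrary «ISO Latin 1» text bytes directly as 8 kHz mu-law audio:
text = "Netochka Nezvanova".encode("latin-1")
samples = [ulaw_decode(b) for b in text]    # one audio sample per character
```

No translation step intervenes: each character code simply *is* a sample value once the format definition is swapped, which is why the aesthetic result is determined entirely by the two standards involved.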
If one considers the current status of technical development, at first glance image media appear to be more advanced: We are amazed at the visual effects in the latest movies and in commercials, and as a rule we are unaware of how much technological refinement was required to construct their sound backdrop. In fact, today's audio media possess far more advanced possibilities for reconstructing a space at another location, or even creating one synthetically, than visual virtual reality does.
In the chronology of the development of image and sound media, Daguerre's photographic process, introduced in 1839, was the first—even if it was an ‹unnatural› still image that simply ignored time. When Edison invented the phonograph in 1877, he had no choice but to take time into account, as not only music but even physical sound is per se a temporal phenomenon that dissolves into nothing when it stands still in time. Thus sound recording was in the world before image recording became complete in the same way, through the inclusion of time in film.
For the most part, the technical lead of sound over the image has been retained: radio established itself as a mass medium long before television did; the gramophone spread into private households long before film projectors did; there were portable cassette recorders before there were video cameras. Sixteen-mm film, which was invented before the tape recorder and became a common piece of household equipment before it did, constitutes an exception. Among the more extravagant technologies used mainly by the scientific or artistic elite, on the other hand, sound holds the lead: electromechanical sound synthesis with Thaddeus Cahill's «Telharmonium» and the various stages of electronic sound synthesis with the Theremin, and later with Robert Moog's voltage-controlled synthesizer, existed before specific methods of image synthesis, or in the absence of the appropriate visual technologies were simply ‹misappropriated› for image synthesis, e.g. by Nam June Paik. Due to problems with the much larger volumes of data involved in image processing, the digital creation and processing of sound in the computer was introduced first.
Upon closer examination, the relationship between the availability of certain technologies and the artistic application of their principles proves to be complex. This becomes apparent when, for example, one considers that despite the lead of audio technology, sound artists frequently reacted more slowly to the emerging media than visual artists did: with the aid of film, artistic montage was developed in the visual area. But because music only hesitantly overcame its fixation on abstract sounds (instead of the concrete noises that became manageable with the introduction of the gramophone), it was not until the end of the 1940s—with Pierre Schaeffer—that montage led to new musical results. Elementary sound phenomena may never have been denied the potential of naturally given beauty: Igor Stravinsky, for instance, emphasized the aesthetic value of sounds such as the rustling of leaves or the singing of birds and described them as being related to music in that they «… caress the ear and give us pleasure ….» However, in his view this does not yet mean that they possess the status of art: «[B]eyond this passive enjoyment we discover music, which allows us to actively participate in the working of a mind that is ordering, invigorating, and creative.»
In the area between literature, theater, and music, i.e. the radio play and its offspring («Weekend,» Walter Ruttmann's film for radio from 1930), experiments with the montage of noise were successful before those carried out with music. Ruttmann's use of cinematic optical sound recording shows that it was apparently the inaccuracy and awkwardness of montage using the gramophone (by switching between the audio signals coming from two different records) that posed a technical obstacle, whereas film could be relatively accurately and simply assembled using scissors and glue. The collage first established itself in fine art around 1920, although the gramophone would have already long since made the acoustic collage possible. Here, too, it was primarily the practical disadvantages of the gramophone that prevented this development. In 1917—thus long before he applied montage, collage, and double exposure techniques in the film «The Man With the Camera»—Dziga Vertov carried out comparable experiments with gramophone recordings; however, because they were difficult to handle and the sound quality was poor, he gave them up to devote himself instead to film. 
Surprisingly, the first ostensible use of collage techniques in music occurred independently of technical media. Charles Ives processed impressions from the urban world of noises as early as 1906 in «Central Park in the Dark.» In this piece for orchestra, a tradition of the simultaneity of musical styles, sound elements, and rhythms manifested itself, one that has continued into contemporary media composition through Bernd Alois Zimmermann and Heiner Goebbels.
It was also in 1906 that Thaddeus Cahill took up the practice of ‹telephone concerts,› which since the 1880s had been becoming more and more popular in many large cities, when he introduced the world to the «Telharmonium.» His decision was primarily based on economic considerations: no other infrastructure reached as many potential listeners as the telephone, thus creating the means for raising the immense amounts of capital required for this new type of instrument. With this first machine worldwide that could synthesize complex sounds and even meet sophisticated musical demands such as playability in just intonation and any number of new scales, the principle of a music necessarily materialized that, with the aid of media technology, could be played into spaces in such a way that it is taken in only as mere background to other perceptions.
In 1915, Thomas Alva Edison also carried out experiments with piping selected pieces of music into factory buildings via a phonograph in order to cover the noise and raise the morale of the workers, which would ultimately increase profits. Empirical studies soon verified that workers produce more—and consumers buy more—when background music is played. Consequently, the attempt begun in 1922 by General George Owen Squier to specifically market the functional value of broadcast music was successful: he began transmitting gramophone music via telephone lines into restaurants and offices. Since then, Squier's company «Muzak,» whose name was to become a synonym for functional background music, has come to line countless locations. Background music lures clientele into boutiques, and it lends more calm to the candlelight in a restaurant and more tension to the red light in the world of prostitutes and pimps.
The so-called ‹Mood-Song,› which was distributed on record in countless versions in the 1950s and 1960s, was, like background music, based on discreet arrangements of popular hits: recordings such as «Music to Work or Study by,» «Music to Watch Girls by,» «Music to Read James Bond by,» or simply «Music to Live by» were aimed at the musical lining of special situations, whose visual ambience was depicted on the respective record cover.
Brian Eno's idea for ‹ambient music› examined what had since become the ubiquitous phenomenon of lining an otherwise primarily visually marked experience of the world (see below) with music.  Installations such as «Generative Roomscape 1» (1996) treat sound and image equally: Both of them are interpreted as atmospheric arrangements that complement each other without confirming the tendency towards the hierarchical subordination of the musical level.
In film, image and sound come together to form a media-technical construct. Movements in the image and of the image frame itself as well as the assembly of these images enter into complex relationships with the spoken language, music, and noises. While film music exhibited advanced methods of reinforcing narration and the aesthetic effect of the images soon after sound film was established,  for a long time sound abided in the subordinate function of reflecting the events visible in the image: «see a dog, hear a dog,» so the motto went. It was not until the 1970s that directors from the ‹New Hollywood› began to give sounds a fundamental and aesthetically independent role, e.g. in films such as «Star Wars» (George Lucas, 1977) or «Apocalypse Now» (Francis Ford Coppola, 1979), in which sounds—in part invented—were ‹composed› by sound designers in elaborate and time-consuming processes.  Since then, image and sound have had a similar status in the production process, even though the image continues to play the ultimate role in the moviegoer's consciousness. The goal of the envisaged fusion of the image and sound levels is to produce the perfect illusion.
In his film «Nouvelle Vague» (1990), Jean-Luc Godard allows sound and image to fall apart time and again, thus breaking down the illusionist image/sound relation characteristic of Hollywood films. He exaggerates noises, draws attention to intervention by the sound engineer through accelerated fading, allows dialogue to sink into background noises, and mounts music into an image in a way that is irritating and contradicts visible emotional contents.
In interactive media, the linear soundtrack inevitably becomes a dynamic ‹sound field,› an open structure. Here, the principle of linear narration in classic film and the fixed form sequence predominant in occidental music no longer works. The authoring software «Koan,» which was developed for the dynamic addition of sound to free navigation through Web sites, is a pragmatic concretion of this condition. On the basis of the «Flash» format, zeitblom realized a dynamic sound background of a similar nature for a Web site by the Freunde Guter Musik, which reflects the degree of the visitor's activity.
Functional acoustic information—the output of audible responses to user activity—which one can refer to as an ‹acoustic user interface,› is similar to the soundtrack. Research in the area of the ‹auditory display› is searching for suitable models for the audification of computing processes and for acoustic orientation aids in large data stocks, for instance through so-called ‹earcons,› which, like visual icons, should help to distinguish data types and means of navigation quickly and intelligibly.
Principles like ‹rollover› have spread with the development of formats such as «Flash/Shockwave» and «Beatnik.» Consequently, in the area of net music,  artistic projects such as «electrica» have developed which emit acoustic signals to indicate what lies behind a graphic structure. While this kind of audiovisual navigation can function well on art Web sites, in everyday practice the problem arises that sound permanently reaches the ear, while visual stimuli can be selected depending on the direction of vision. This causes signal-like sounds such as the acoustic rollover to quickly become annoying. In addition, earcons are apparently difficult to decipher: Producers of radio jingles point out that in fast-moving, everyday radio, only about fifteen different types of noises (the ringing of a telephone, the jingle of coins, the flushing of a toilet, etc.) can be clearly identified by listeners. The reasons for this poor ability to identify sounds probably lie in the dominance of the visual within Western culture. 
The dominance of the image can be demonstrated using the computer as an example. The computer (and thus the interface to the Internet) was conceived as a text medium with a visual interface. Its era began with the spartan display of letters; all one could hear was the irritating hum of the cooling fans. The establishment of the graphical user interface, based on the intuitively usable desktop metaphor, by the «Apple Macintosh» in 1984 reinforced this visual fixation. HTML, the basic code of the World Wide Web, is also text/image-oriented. Hyperlinks are symbolized by highlighted words and images.
The underrepresentation of the acoustic may also have resulted from the low computing performance of early PCs; above all, however, it was a result of the visual dominance of our age: in Western culture a preference for visual perception developed, based on writing as the fixation of language. In contrast to hearing, its objects tend to be lasting and quantifiable and thus embody objectivity and truth. Sound, on the other hand, is ephemeral and strongly dependent on subjective judgement. And so it is not really suited to the electronic superbrain, the machine, which primarily serves the efficiency-oriented elimination of uncertainties.
Accordingly, like in a library, all computer data, regardless of whether they contain text, sound, or image, are archived alphabetically according to their language designation. Even today there are no sound databases that allow archiving according to tonal criteria or even performing an auditory search. 
Interactive audio applications deal with this kind of visually guided navigation through sound, and thus also with issues of notation. The direct transfer of the two central notation coordinates—time on the horizontal and pitch on the vertical axis, read out linearly by a cursor—places the user in the precarious position of having to arrange concrete sound events in time, and therefore to ‹compose.› Because s/he is in most cases inexperienced, it is no wonder that many examples of this kind yield unsatisfactory results.
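The notation scheme just described, with time on the horizontal and pitch on the vertical axis, can be sketched as a minimal grid score read out by a cursor; the grid contents and note names here are purely illustrative:

```python
# A grid score: rows are pitches (top = high), columns are time steps.
PITCHES = ["C5", "A4", "F4", "D4"]          # vertical axis, illustrative
grid = [
    [1, 0, 0, 1],   # C5
    [0, 1, 0, 0],   # A4
    [0, 0, 1, 0],   # F4
    [1, 0, 0, 0],   # D4
]

def cursor_read(grid, pitches):
    """Move a cursor left to right; at each step, sound every marked cell."""
    events = []
    for step in range(len(grid[0])):
        sounding = [pitches[row] for row in range(len(grid))
                    if grid[row][step]]
        events.append((step, sounding))
    return events

events = cursor_read(grid, PITCHES)
# Step 0 sounds C5 and D4 together; step 1 sounds A4; and so on.
```

Every mark the user sets becomes a concrete sound event at a concrete time, which is precisely the compositional burden such interfaces place on the inexperienced.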
In the modules of the work «Small Fish» on CD-ROM by Kiyoshi Furukawa, Masaki Fujihata, and Wolfgang Münch, the linear traversal of the cursor has been replaced by the simulation of the behavior of organic systems. Users produce and influence the music by manipulating graphic objects that lead an individual, ‹independent existence.› With the aid of object-oriented programming, the behavior of the individual objects has been structured in such a way that they exchange information among themselves. The user's interactions and collisions with other objects alter the state of individual objects and thus the overall system. None of the objects knows all of the connections, i.e. there is no point in the system where all information runs together. A dynamic system of autonomous objects is thus coupled to the autonomous actions performed by the user. Here the tradition of musical notation on a surface is interwoven with the behavior of organisms and physical systems. Because the systems' behavior is oriented towards natural and mechanical processes, they can—as with the experimentalists' concepts—be comprehended and used practically on the basis of everyday experience, despite their complexity.
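The decentralized architecture described for «Small Fish»—objects that exchange information only among themselves, with no point where all information runs together—can be sketched as follows; all class and attribute names are invented for illustration and are not taken from the actual software:

```python
class SoundObject:
    """An autonomous object: it knows only itself and whoever collides with it."""
    def __init__(self, pitch, velocity):
        self.pitch = pitch          # what it plays when struck
        self.velocity = velocity    # its current motion/loudness state

    def collide(self, other):
        # A collision changes both partners; no central score is consulted.
        self.pitch, other.pitch = other.pitch, self.pitch
        self.velocity = (self.velocity + other.velocity) / 2
        other.velocity = self.velocity
        return self.pitch           # the note this collision sounds

    def user_drag(self, new_velocity):
        # User interaction alters only the touched object directly ...
        self.velocity = new_velocity

objects = [SoundObject(60, 0.2), SoundObject(64, 0.8), SoundObject(67, 0.5)]
objects[0].user_drag(1.0)           # ... but its effects spread via collisions:
note = objects[0].collide(objects[1])
```

Since state changes propagate only pairwise through collisions, no single object (and no single function) ever holds the whole system's information, mirroring the coupling of autonomous objects and autonomous user actions described above.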
Media-artistic examinations of music time and again thematicize the intermediality of the acoustic experience. In the process, imbalances in the perception of everyday media life and the hierarchical relation between image and sound are questioned. The production, dissemination, and reception of music have changed since the introduction of electronic and digital media. Technical media as a common source of image and sound have stimulated a wealth of intermedia connections. What continues to be behind this is the centuries-old desire to fuse sensory impressions together into a synaesthetic experience.
Besides this explicitly art-synthetic approach, it is often overlooked that because of its special production and reception conditions, music actually already possesses intermedia characteristics. Musical practice of the twentieth century reflects these relationships, emphasizes them, or questions them in that the concert is being enhanced by a variety of forms of visualization, and customary notation is being replaced by graphic symbols and visually influenced forms of interaction. Intermediality in music is therefore not a consequence of the advance of technology; rather, it is a phenomenon inherent in music itself, which with the aid of media can be molded in particularly effective and diverse ways.
© Media Art Net 2004