Nonspeech Audio in Television User Interfaces

Richard van de Sluis, Berry Eggen and Jouke Rypkema

Philips Research Laboratories
Prof. Holstlaan 4
5656 AA Eindhoven
The Netherlands
sluisvd@natlab.research.philips.com

INTRODUCTION

a gong), sports (sound of a cheering audience), or films (part of the tune of a James Bond film). Since the EPG offers the possibility to set a reminder for a coming TV programme, the category sounds are also used as auditory reminders indicating that a TV programme of this category is about to start. Furthermore, certain characteristics of the category sounds are manipulated to represent the urgency of a reminder. For instance, a reminder for a TV programme which starts in 30 minutes seems to come from a source that is spatially further away than a cue for a programme that is going to start within five minutes.
EXPERIMENT 1: USABILITY OF CATEGORY SOUNDS
The first experiment investigated how well users can learn category sounds during 'normal' EPG use, and whether they can use this knowledge to identify auditory reminders that indicate the kind of TV programme about to start. Twenty subjects participated in the experiment.
Phase one of the experiment focused on learnability. First, it was tested whether users could learn the category sounds during normal use of the EPG (unintentional learning), without explicit instructions to learn the sounds. Subjects had to perform seven tasks that explored control of the EPG. These tasks were reasonably simple, e.g., "try to find an interesting documentary and set a reminder for it". When all tasks were completed, the category sounds were presented in random order (audio only) and subjects were asked to write down the category name of each sound. Subjects who did not score 100% correct after unintentional learning were instructed to use the EPG to listen to the category sounds again in order to learn them all (intentional learning), after which a second identification test was performed. If the category sounds were still not all recognised correctly, the intentional learning task was repeated until the 100%-correct level was reached.
Phase two measured the effectiveness of category sounds as reminders. Subjects were presented with 22 short fragments of TV programmes. At the beginning of each fragment a category name was shown on the screen, together with a spoken presentation of this category name. Subjects were instructed to say this category name aloud as soon as they heard the corresponding 'target' category sound. In each fragment, three or four sounds were played, one or two of which were target sounds. At the end of each fragment a question was asked to verify whether subjects had really been watching the TV fragment. The questions appeared on the screen and subjects were instructed to answer them aloud.
In phase three of the experiment, subjects' satisfaction with the category sounds was investigated in an interview.
Learnability: The average number of recognised sounds in the 'unintentional learning' test was 7.1 (standard deviation 1.8) out of 11 sounds (65%). The category sounds for news and sports were easily recognised: all subjects correctly identified the sounds for these categories. The category sounds for talk shows, comedies and films were also well identified. Magazines, game shows and hobby programmes were the most difficult to recognise. Only one subject recognised all category sounds in the unintentional learning test. The other 19 subjects performed an additional 'intentional learning' test, after which the average score was 96%. It took at most two extra learning sessions of 1-3 minutes to learn all the category sounds.
Effectiveness: Subjects were able to detect almost all auditory reminders correctly while watching TV. The average number of target sounds detected correctly was 25.3 out of 26 (97%). On average, subjects answered 20.6 of the 22 questions at the end of the fragments correctly (94%), which indicates that they had attended to the TV programme.
Satisfaction: About 75% of the subjects were positive about the use of category sounds as auditory feedback in the EPG. However, some subjects said that the category sounds could become irritating and that their use should therefore be made optional. Most subjects (80%) were positive about the use of category sounds as a reminder and found it useful that the reminder also indicated the category of the TV programme.
EXPERIMENT 2: ENCODING URGENCY IN REMINDERS
The second experiment explored whether the urgency of a reminder can be encoded into the category sound. For example, a reminder might occur 30 minutes, 5 minutes and 10 seconds before the programme actually starts. In this experiment the distance in space between listener and sound source is used as a metaphor for distance in time. When the distance to a sound source changes, the acoustic characteristics of the sound change (Nielsen, 1991), and people are quite sensitive to these changes. The audible information about the distance of a sound source might therefore inform the user about the distance in time as well. Two variables were tested: the intuitiveness of the sound-distance metaphor, and the effectiveness with which subjects were able to decode the urgency of the reminder. Seventeen subjects participated in the experiment.
For each of the category sounds, three versions were used: a 'nearby', a 'half far' and a 'far away' category sound. To achieve this, three sound parameters were manipulated: overall intensity, intensity of high frequencies and sound reverberation.
In the first task, included to estimate the intuitive understanding of the metaphor, subjects were presented with 11 pairs of category sounds. For each category, the name was presented vocally, followed by the 'nearby' sound and the 'far away' sound of the same category in a randomised order. Subjects were told that the sounds represented TV programmes. They were asked to select the sound that, according to their intuition, represented the programme that would start first.
After learning the category names corresponding to the category sounds, the subjects performed the second task, which estimated the recognition of both category and urgency. In this task, after a sound was presented, subjects immediately had to determine both the category and the urgency, as in a real TV-watching situation. An example of the three urgency variants was presented first. In the experimental phase, there were three versions of each of the eleven category sounds, yielding a total of 33 sounds, which were presented in random order.
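The three manipulated sound parameters can be illustrated with a minimal sketch. The paper does not report the signal processing or the parameter values that were used, so the preset numbers, the one-pole low-pass filter and the feedback comb filter below are all assumptions chosen for illustration; only the three version names and the three manipulated parameters come from the text.

```python
import math

# Hypothetical presets: the paper does not report the actual values.
PRESETS = {
    "nearby":   {"gain": 1.00, "cutoff_hz": 8000.0, "reverb_mix": 0.0},
    "half far": {"gain": 0.50, "cutoff_hz": 4000.0, "reverb_mix": 0.3},
    "far away": {"gain": 0.25, "cutoff_hz": 2000.0, "reverb_mix": 0.6},
}


def low_pass(samples, sample_rate, cutoff_hz):
    """One-pole low-pass filter: reduces the intensity of high frequencies."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out


def comb_reverb(samples, sample_rate, mix, delay_s=0.05, feedback=0.5):
    """Feedback comb filter, a crude stand-in for room reverberation."""
    delay = max(1, int(delay_s * sample_rate))
    wet = [0.0] * len(samples)
    for i, x in enumerate(samples):
        echo = wet[i - delay] if i >= delay else 0.0
        wet[i] = x + feedback * echo
    # Blend the dry and the reverberated signal.
    return [(1.0 - mix) * d + mix * w for d, w in zip(samples, wet)]


def apply_distance_cue(samples, sample_rate, version):
    """Return the 'nearby', 'half far' or 'far away' version of a sound."""
    p = PRESETS[version]
    filtered = low_pass(samples, sample_rate, p["cutoff_hz"])
    reverbed = comb_reverb(filtered, sample_rate, p["reverb_mix"])
    return [p["gain"] * s for s in reverbed]  # overall intensity


def rms(samples):
    """Root-mean-square level, used here as a simple intensity measure."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


if __name__ == "__main__":
    rate = 16000
    # Half a second of a 440 Hz tone as a placeholder category sound.
    tone = [math.sin(2.0 * math.pi * 440.0 * n / rate) for n in range(rate // 2)]
    for name in PRESETS:
        level = rms(apply_distance_cue(tone, rate, name))
        print(name, round(level, 3))
```

With these assumed presets, the 'far away' version is quieter, duller and more reverberant than the 'nearby' version, which matches the direction of the manipulation described above.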

In the first task, 14 of the 17 subjects responded correctly to all eleven pairs presented. The overall average of correct responses was 10.5 out of 11 (96%). In the second task, all subjects were able to determine the category names 100% correctly. Determining the urgency, however, proved somewhat more difficult. Nevertheless, four subjects succeeded in determining the urgency of all 33 sounds correctly. The average correct score was 29.8 out of 33 sounds (90%).
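The percentages reported for both experiments can be checked against the mean scores they are derived from. The short sketch below does this under one assumption that is not stated in the paper: each mean is taken to be rounded to one decimal and each percentage to a whole number, so a reported percentage counts as consistent if it falls within the range those roundings allow. The labels and function names are illustrative.

```python
# (label, reported mean, maximum score, reported percentage), as given in the text.
REPORTED = [
    ("Exp. 1, unintentional learning", 7.1, 11, 65),
    ("Exp. 1, target sounds detected", 25.3, 26, 97),
    ("Exp. 1, questions answered", 20.6, 22, 94),
    ("Exp. 2, intuitive pair choice", 10.5, 11, 96),
    ("Exp. 2, urgency identified", 29.8, 33, 90),
]


def percentage_range(mean, maximum, half_step=0.05):
    """Percentages compatible with a mean that was rounded to one decimal."""
    low = 100.0 * (mean - half_step) / maximum
    high = 100.0 * (mean + half_step) / maximum
    return low, high


def is_consistent(mean, maximum, reported_pct):
    """True if the reported whole-number percentage fits the rounded mean."""
    low, high = percentage_range(mean, maximum)
    # Allow another half point because the percentage itself is rounded.
    return low - 0.5 <= reported_pct <= high + 0.5


if __name__ == "__main__":
    for label, mean, maximum, pct in REPORTED:
        low, high = percentage_range(mean, maximum)
        status = "ok" if is_consistent(mean, maximum, pct) else "check"
        print(f"{label}: {low:.1f}-{high:.1f}% vs reported {pct}% ({status})")
```

The range matters for the pair-choice task: 10.5 out of 11 is about 95.5%, and the reported 96% is consistent only because the unrounded mean may be slightly higher than 10.5.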

Figure 2 shows a bar diagram of the average scores. The values on the x-axis represent the actual urgency, that is, the time before the TV programme starts. The times above the bars represent the urgency as interpreted by the subjects. The diagram shows that a '30 min.-reminder' ('far away' sound) was never interpreted as a '10 sec.-reminder' ('nearby' sound), and vice versa. It should be noted, though, that the second experiment was a sound-only experiment. In a real living-room environment, subjects would have to judge urgency while watching TV. In such a dual-task situation, urgency judgement might be more difficult.
CONCLUSIONS
People can easily learn to match the category sounds to the corresponding TV-programme categories, category sounds are effective as reminders, and they were appreciated by a large majority of the subjects. The distance of a sound source is a useful metaphor in an auditory reminder for indicating how much time remains before a programme starts.
REFERENCES
Nielsen, S.H. (1991), Distance Perception in Hearing, Acoustic Laboratory, Institute of Electronic Systems, Aalborg University Press, Aalborg, 125-126.
Westerink, J.H.D.M., M. van der Korst, G. Roberts (1998), Evaluating the use of pictographical representations for TV menus, Adjunct Proceedings of CHI'98, 217-218.
