Towards a Formal Representation of Multi-Modal Systems for Usability Assessment
Joanne Hyde
School of Computing Science, Middlesex University,
Bounds Green Road, London, N11 2NQ.
j.k.hyde@mdx.ac.uk
INTRODUCTION
The user interface is being called upon to handle an increasingly diverse range of users, and some designers feel that interaction can be facilitated by the use of systems that exploit more than one means of input or output at a time: so-called "multi-modal" systems. However, their design appears to be device-led, with developers interested in the novel nature of a device rather than its implications for the increased usability of the interface. Existing research tends towards an empirical approach in analysing the success of a particular device. There is thus an absence of appropriate multi-modal usability theory, complicated by the disagreement between various communities over the precise meaning of the term "modality" (e.g. Bernsen, 1995; Coutaz et al., 1993). This impedes research into the complex usability problems posed by multi-modal systems, and results in a lack of appropriate notations to describe multi-modal activity.
This research explores the problem of defining modality. Existing modelling notations and techniques are scoped to see how well they capture multi-modal interface usability issues. A new definition of modality is proposed, with an associated taxonomy to allow the identification and categorisation of instances of modalities at the interface. Further work involves the identification of usability properties of multi-modal systems, and their formal application.
MODALITY RESEARCH
There are several, often competing, definitions of modality, which can result in confusion when designers from different backgrounds come together in the design of interactive computer systems. The computer systems perspective defines modality in terms of input or output devices (e.g. Bernsen, 1995), for example, a keyboard, mouse, monitor or microphone. The user perspective defines modality in terms of the sensory channel through which it is expressed, for example, tactile or visual (e.g. Purchase, 1998). Other definitions use elements from both viewpoints, but have difficulties in integrating them, because of the irreconcilable differences between what is being considered as a modality (e.g. Coutaz et al., 1995). It is this conflict which makes many current definitions unusable in a wider context. There is therefore a need for a definition of modality which is not as broad as one based on sensory channels, yet avoids being device-dependent, and is able to provide a clear basis for relating certain types of communication devices to the capabilities of the user.
NOTATIONAL RESEARCH
Five different techniques for representing user interaction, covering a wide range of formal system and user approaches, were examined in order to scope their ability to identify potential multi-modal usability problems. They included a hierarchical goal-based technique (GOMS), a natural language goal-based method (Cognitive Walkthrough), a means-end planning based technique (Programmable User Model), a diagrammatic representation (State Transition Diagram), and one based on set theory and first-order predicate logic (Z).
The study, based around the examination of a task performed by a prototype robotic arm (see Hyde et al., 1998), found that these techniques were able to identify a wide range of usability problems, but had difficulty in identifying usability issues directly related to multi-modality. Although Critical-Path-Method (CPM) GOMS (John and Kieras, 1996) initially seemed to be best able to handle multi-modal interaction issues, it yielded little beyond comparative information about the two methods of input. The limitations of these techniques with regard to multi-modal usability issues centre around their emphasis on interaction ordering, and their inability to deal with simultaneous complexity. Notations which utilise different interaction paradigms may be more appropriate for describing multi-modal systems.
NEW MODALITY DEFINITION AND TAXONOMY
Key attributes contained within the concept of modality were identified from the literature and the work on notations, and incorporated into a new definition of modality: a temporally based instance of information perceived by a particular sensory channel. This produces a new three-dimensional taxonomy (sensory channel, temporal nature, and information form) to aid modality classification.
The sensory channel refers to three human senses: audio, visual and haptic, since they are the three main channels through which information is perceived and communicated. The temporal nature describes whether a modality is discrete (unchanging within its occurrence, which is brief), continuous (repeated identically more than once) or dynamic (changing in content within its occurrence, which may last for some time). The form of the information relates to its presentation, and can be divided into: lexical (in the form of text); concrete (in the form of the reproduction of a real life object); and symbolic (a representation of something rather than an actual reproduction of it). With three values on each of the three dimensions, the taxonomy yields twenty-seven cells, which cover a large part of the interaction space while the classification remains small enough to be easily applied.
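The structure of the taxonomy can be illustrated with a minimal sketch, which enumerates the cells as the product of the three dimensions. The class and function names (SensoryChannel, TemporalNature, InformationForm, Modality, all_cells) are assumptions introduced purely for illustration, and the classification of a warning beep is a hypothetical example rather than one taken from this work.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class SensoryChannel(Enum):
    AUDIO = "audio"
    VISUAL = "visual"
    HAPTIC = "haptic"


class TemporalNature(Enum):
    DISCRETE = "discrete"      # unchanging within a brief occurrence
    CONTINUOUS = "continuous"  # repeated identically more than once
    DYNAMIC = "dynamic"        # content changes within its occurrence


class InformationForm(Enum):
    LEXICAL = "lexical"    # text
    CONCRETE = "concrete"  # reproduction of a real life object
    SYMBOLIC = "symbolic"  # representation rather than reproduction


@dataclass(frozen=True)
class Modality:
    """One cell of the taxonomy (illustrative name, not from the paper)."""
    channel: SensoryChannel
    temporal: TemporalNature
    form: InformationForm


def all_cells():
    """Enumerate every combination of the three dimensions."""
    return [
        Modality(c, t, f)
        for c, t, f in product(SensoryChannel, TemporalNature, InformationForm)
    ]


cells = all_cells()
print(len(cells))  # 3 * 3 * 3 = 27

# Hypothetical classification: a short warning beep treated as
# audio, discrete and symbolic. This reading is an assumption.
beep = Modality(SensoryChannel.AUDIO, TemporalNature.DISCRETE, InformationForm.SYMBOLIC)
print(beep in cells)  # True
```

Representing each dimension as a closed enumeration keeps the classification small and makes it impossible to construct a modality that falls outside the twenty-seven cells.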
FURTHER WORK
Further work will investigate applications of the taxonomy, identify significant properties of multi-modal systems, and develop a notation for the multi-modal usability analysis of user interfaces.
ACKNOWLEDGEMENTS
This work is supported by a postgraduate studentship from the School of Computing Science, Middlesex University.
REFERENCES
Bernsen, N.O. (1995) A toolbox of output modalities: representing output information
in multi-modal interfaces. In Bernsen, Jensager, Lu & Verjans (eds): Modality
theory and information mapping. Amodeus project deliverable D15.
Coutaz, J., Nigay, L., & Salber, D. (1993) The MSM framework: a design space for
multi-sensory-motor systems. In Bass, J., Gornostaev, J., & Unger, C. (eds):
Lecture notes in computer science no. 753, Human computer interaction,
Springer-Verlag Moscow
Hyde, J.K., Blandford, A.E., & Goodman, H.S. (1998) A Comparison of Five
Techniques for Investigating General and Multi-Modal Usability Issues. To be
published as School of Computing Science Technical Report.
John, B.E. & Kieras, D.E. (1996) The GOMS family of user interface analysis
techniques: comparison and contrast. In: ACM Transactions on Computer-
Human Interaction Vol 3 No 4, pp 320-351
Purchase, H. (1998) Separating text from technology: a semiotic definition of
multimedia communication. To appear in Semiotica 1998.
