An heuristic approach to the evaluation of educational multimedia software.

David Squires

Paper presented at the CAL 97 Conference "Superhighways, Super CAL, Super Learning?", University of Exeter, 23rd - 26th March 1997

School of Education, King’s College, Waterloo Road, London, SE1 8WA, England.

Tel +44 171 872 3107; Fax: +44 171 872 3182



Many educationalists would agree that learning is situated. If learning is situated it follows that (i) it is not possible to evaluate an educational software package as an object in its own right, but only to evaluate the actual or perceived use of a package; (ii) learners construct their own concepts which they develop and refine through experience, leading to idiosyncratic use of software; and (iii) all the components of a learning environment (people and artefacts) interact and contribute to the learning process. Established predictive evaluation techniques, e.g. writing software reviews or the application of standard checklists, fail to take account of the importance of context. They are also often time consuming to use. It is suggested in this paper that the heuristic approach to evaluating software usability established in the HCI area could be adapted to provide an efficient, context-sensitive approach to the evaluation of educational multimedia software. This approach is illustrated by suggesting some candidate heuristics for the predictive evaluation of educational multimedia software.

1 Introduction

It is possible to distinguish between predictive and interpretative evaluation of educational software. Predictive evaluation is concerned with the assessment of the quality and potential uses of a software application prior to its use with students. This type of evaluation is usually done by teachers when they are making purchasing decisions or preparing lessons and by review agencies. Interpretative evaluation is concerned with evaluating the observed use of an application by students. In this paper an heuristic approach to the predictive evaluation of educational software will be advocated.

Many educationalists would agree that learning is situated. Such a view has three consequences for software evaluation. First, it is not possible to evaluate an educational software package as an object in its own right; it is only possible to evaluate the actual or perceived use of a package. As Reeves (1992) puts it, "Learning is highly tuned to the situation in which it takes place". Thus the evaluation of an educational software package must be conducted with reference to an existing or intended educational environment (Squires and McDougall, 1996). Second, learners construct their own concepts which they develop and refine through experience. Thus the use of a package is typically idiosyncratic, depending on the way learners interpret its use in an educational setting. Third, all the components of a learning environment (people and artefacts) interact and contribute to the learning process. As Norman (1991) pointed out, the introduction of a cognitive artefact will change the user's perception of the task through a distribution of cognition between the user and the artefact.

There is an inherent problem with predictive evaluation: how can the evaluator assess the use of a package with reference to an appropriate context defined in terms of learner characteristics, resource provision and teaching approaches, when by definition the evaluation is conducted outside the intended context? It is suggested that the heuristic approach to evaluating software usability (Molich and Nielsen 1990; Nielsen 1990) could be adapted to provide an efficient, context-sensitive approach to the evaluation of educational multimedia software.

2 Established approaches to educational software evaluation

Established predictive evaluation techniques, e.g. writing software reviews or the application of standard checklists, fail to take account of the importance of context and are time consuming. These problems are identified more specifically in this section.

2.1 Reviews

A review is an evaluation in which software selection advice is given with a diverse audience in mind, typically for publication in professional educational and computing publications. There are some problems with reviews as a form of evaluation:

• The use of a non-interactive medium, paper based text, to report on the interactive medium of computer software makes it very difficult, if not impossible, to convey the essential aspects of a computer program (Squires and McDougall 1994).

• The accuracy of reviews is debatable, with some magazines reluctant to publish negative reviews, in part for fear of alienating potential advertisers (Office of Technology Assessment 1988).

• Given the aim of satisfying the needs of a diverse audience it is very difficult to reflect contextual issues.

2.2 Checklists

A checklist aims to provide a comprehensive set of questions dealing with both educational and usability issues. Checklists date back to the early days of educational software use, e.g. (Heck et al. 1981), and they are still popular, e.g. (Rowley 1993), with new lists appearing for current software environments such as CD-ROM based packages (Steadman et al. 1993) and hypertext software (Tolhurst 1992). However, in a critical examination of the checklist approach McDougall and Squires (1995) identify a number of problems which evaluators have found with the use of checklists as predictive evaluation tools:

• it is difficult to indicate relative weightings for questions (Winship, 1988)

• selection amongst packages of the same type emphasises similarities rather than differences (Squires and McDougall 1994)

• the focus is on technical rather than educational issues (Office of Technology Assessment 1988)

• it is not possible to cope with the evaluation of innovative software (Heller 1991)

• it is not possible to allow for different teaching strategies (Winship 1988)

• off-computer teacher generated uses are not considered (Squires and McDougall 1994)

• evaluation in different subject areas requires different sets of selection criteria (Komoski 1987)

2.3 Frameworks

Typically frameworks attempt to help evaluators by:

• categorising software packages, e.g. the frameworks developed by Beech (1983), Hofmeister (1984), Salvas and Thomas (1984), Newman (1988), Wellington (1985), USA Office of Technology Assessment (1988), Organisation for Economic Co-operation and Development (1989), and Pelgrum and Plomp (1991).

• describing the roles that software is intended to fulfil, e.g. the identification of the three roles for educational software as tutor, tool, and tutee (Taylor, 1980) or the description of software as acting as surrogate teacher or a learning resource (O'Shea and Self, 1983).

• identifying links to commonly accepted educational rationales with a focus on how curriculum tasks can be perceived, e.g. the identification of instructional, revelatory, conjectural and emancipatory paradigms by Kemmis et al. (1977).

• describing a package in terms of the software environment presented to the user, leading to descriptors such as multimedia, hypermedia, hypertext, direct manipulation, windows, CD-ROM, and Internet (Squires and McDougall 1996).

These various frameworks have their strengths and weaknesses. Squires and McDougall (1996) have specifically criticised them for not supporting a situated, context-oriented view of evaluation.

3 An heuristic approach

Molich and Nielsen (1990) and Nielsen (1990) have introduced the notion of a set of heuristics that can be used by expert evaluators to identify usability problems in the design of a software package. Typical heuristics include "Visibility of system status: The system should always keep users informed about what is going on, through appropriate feedback within reasonable time", and "Consistency and standards. Users should not have to wonder whether different words, situations or actions mean the same thing. Follow platform conventions" (Nielsen 1994). Research has shown that the use of these heuristics by five expert evaluators will typically lead to the identification of about 75% of the design problems associated with a package (Nielsen 1992).
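The relationship between the number of evaluators and the share of problems they find together can be illustrated with a simple model. The sketch below rests on an assumption that is not stated in this paper: each evaluator independently detects any given problem with the same probability p, so that n evaluators together are expected to find 1 - (1 - p)^n of the problems. The function names are hypothetical, and the value of p is back-calculated from the 75% figure quoted above purely for illustration; it is not a measured result.

```python
def proportion_found(p: float, n: int) -> float:
    """Expected share of problems found by n evaluators.

    Assumes each evaluator independently finds any given problem
    with the same probability p (a simplifying assumption).
    """
    return 1.0 - (1.0 - p) ** n


def per_evaluator_rate(target: float, n: int) -> float:
    """Per-evaluator detection rate implied by n evaluators finding `target`."""
    return 1.0 - (1.0 - target) ** (1.0 / n)


# Back-calculate p from "five evaluators find about 75%" (illustrative only).
p = per_evaluator_rate(0.75, 5)

# Show how the expected share grows as evaluators are added.
for n in range(1, 11):
    print(n, round(proportion_found(p, n), 2))
```

Under this assumption each additional evaluator contributes less than the one before, which is one way of reading the claim that a small group of expert evaluators is an efficient choice.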

In an educational context expert evaluators should be teachers; they have the experience and understanding of practical issues to enable realistic predictions of likely classroom uses for software. However, this experience needs to be utilised in a principled framework which takes account of curriculum issues and concerns originating from learning and teaching research. In addition this framework will need to address usability issues and the relationship between usability concerns and educational issues (Squires and Preece 1996). Given this framework teachers can then be considered as expert evaluators as required in the notion of heuristic evaluation advocated by Nielsen. In this context it is proposed that a set of heuristics should be developed for the evaluation of educational software, which would enable experienced teachers to act as expert evaluators.

Although not formally articulated as such, the heuristic approach is becoming evident in the educational hyper- and multi-media evaluation literature. For example, Thornton and Phillips (1996) give eight evaluation questions "to which answers need to be found if multimedia is to improve and become an effective and efficient mainstream learning tool". Their questions are simply expressed enquiries, such as "Do students find it stimulating" and "How can the interactivity be improved". In the following section an embryonic attempt is made to articulate this approach more clearly.

4 Some suggested heuristic evaluation guidelines for educational multimedia

As in the formulation of usability heuristics by Nielsen, a credible set of educational software evaluation heuristics will only emerge as the result of a large-scale study of evaluation criteria used by expert practitioners. The heuristic framework suggested below is only indicative of the form an heuristic approach could take. It is not claimed that this is a comprehensive set of heuristics; rather, the heuristics are proposed to illustrate an heuristic approach.

4.1 Is the complexity of the multimedia environment appropriate?

A multimedia collage of text, graphics, video and sound may appear to be complex to the user. However, it may be the case that complex presentation is being used to present very limited educational material. In this sense the environment is only complex in a superficial sense. A deeper complexity is one that relates the structure of the multimedia environment to conceptual development.

4.2 Is the learner active?

The possibilities of multimedia presentation can lead to the development of environments in which the user is conceived as a passive recipient of a multimedia collage. Users are encouraged to become absorbed by the environment, rather than control it. The use of multimedia in this way clearly does not lead to a sense of involvement and ownership.

4.3 Is fantasy used in an appropriate way?

Many multimedia environments are fantastic in the sense that they mimic real life situations in an extreme fashion. Such environments are often labelled as examples of virtual reality. While the fantasy associated with these environments may be intrinsically motivating, it may militate against the development of authenticity.

4.4 How appropriate is the content to the curriculum?

Curriculum issues may be explicit, implicit, or even absent, in the content of a package. Software in which curriculum issues are explicit is typified by packages that have been deliberately designed to be used with a defined curriculum. Implicit curriculum issues stem from cultural assumptions made by designers. An absence of curriculum issues arises when software not originally intended for use in education is "hijacked" for use in schools, e.g. word-processors.

4.5 How navigable is the software?

The software must be easy to use, and not create unnecessary usability problems. Some idea of the structure of the environment should be provided, perhaps in the form of a map. The design should be consistent and icons should be meaningful. The interaction should be authentic, with the task matched to the interface design.

4.6 What form of learner feedback is provided?

Learner feedback can be extrinsic, as in a tutorial package, or intrinsic, as in a simulation. Is the type of feedback appropriate for the intended learning outcomes and any assumed theory of learning?

4.7 What is the level of learner control?

It has been acknowledged for some time that the extent to which learners can develop a sense of ownership in an educational software environment is determined by the level of control they have in their interaction with the software environment (Chandler, 1984; Wellington, 1985; Blease, 1985; McDougall and Squires, 1986; Goforth, 1994). This implies that the use of multimedia software which provides high levels of learner control will help students feel that they are instrumental in determining the pattern and process of the learning experience, i.e. in developing a sense of ownership.

4.8 Are learners motivated when they use the software?

As with feedback, motivation can be either intrinsic or extrinsic. From a constructivist point of view the motivation should be intrinsic, with the task itself providing sufficient motivation. Extrinsic feedback needs to be viewed with care, as it can be misleading or irrelevant, e.g. when more attractive feedback is provided for incorrect rather than correct answers in a drill and practice exercise.
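One way a group of teacher-evaluators might record and collate their judgements against the eight candidate heuristics of this section is sketched below. The paper does not prescribe any recording format: the `collate` function, the note structure and the sample comments are all hypothetical and are included only to make the procedure concrete.

```python
from collections import defaultdict

# The eight candidate heuristics, worded as in Sections 4.1 to 4.8.
HEURISTICS = [
    "Is the complexity of the multimedia environment appropriate?",
    "Is the learner active?",
    "Is fantasy used in an appropriate way?",
    "How appropriate is the content to the curriculum?",
    "How navigable is the software?",
    "What form of learner feedback is provided?",
    "What is the level of learner control?",
    "Are learners motivated when they use the software?",
]


def collate(notes):
    """Group (evaluator, heuristic_index, comment) notes by heuristic."""
    by_heuristic = defaultdict(list)
    for evaluator, index, comment in notes:
        if not 0 <= index < len(HEURISTICS):
            raise ValueError(f"unknown heuristic index: {index}")
        by_heuristic[HEURISTICS[index]].append((evaluator, comment))
    return dict(by_heuristic)


# Hypothetical notes from two teachers reviewing the same package.
sample_notes = [
    ("Teacher A", 1, "Long video sequences leave pupils watching, not doing."),
    ("Teacher B", 1, "Few points at which the learner makes a decision."),
    ("Teacher B", 4, "No map of the environment is provided."),
]

report = collate(sample_notes)
for heuristic, comments in report.items():
    print(heuristic)
    for evaluator, comment in comments:
        print(f"  {evaluator}: {comment}")
```

The notes are kept as free-text comments rather than numeric scores, so that each evaluator can describe the intended classroom context and no relative weighting of questions is needed.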

6 References

Beech, G. (1983) Computer Based Learning: Practical Microcomputing Methods. Wilmslow: Sigma Technical Press.

Blease, D. (1988). Evaluating Educational Software. London: Croom Helm.

Chandler, D. (1984). Young Learners and the Microcomputer. Milton Keynes: Open University Press.

Goforth, D. (1994) Learner control = decision making + information: a model and meta-analysis. Journal of Educational Computing Research, 11(1), 1-26.

Heller, R. (1991) Evaluating software: a review of the options. Computers and Education, 17 (4), 285-291.

Hofmeister, A. (1984) Microcomputer Applications in the Classroom. New York: Holt, Rinehart and Winston.

Kemmis, S., Atkin, R. and Wright, E. (1977) How Do Students Learn? Working Papers on CAL, Occasional Paper No. 5, Centre for Applied Research in Education, University of East Anglia, UK.

Komoski, P. K. (1987). Educational Microcomputer Software Evaluation. In J.Moonen and T. Plomp (Eds.), Eurit86: Developments in Educational Software and Courseware, (pp. 399-404). Oxford: Pergamon Press.

McDougall, A. and Squires, D. (1986) Student Control in Computer Based Learning Environments. In Salvas, A. D. and Dowling, C. (eds.) Computers in Education: On the Crest of a Wave? Melbourne: Computer Education Group of Victoria, 269-72.

McDougall, A., and Squires, D. (1995). A critical examination of the checklist approach in software selection. Journal of Educational Computing Research, 12(3), 263-274.

Molich, R. and Nielsen, J. (1990) Improving a human-computer dialogue. Communications of the ACM, 33(3), 338-348.

Newman, J. (1988) Software Libraries: The Backbone of Schools’ Computing. Proceedings of the Australian Computer Education Conference. Perth: Educational Computing Association of Western Australia, 242-251.

Nielsen, J. (1990) Traditional dialogue design applied to modern user interfaces. Communications of the ACM, 33(10), 109-118.

Nielsen, J. (1992). Finding usability problems through heuristic evaluation. In P. Bauersfeld, J. Bennett, and G. Lynch (Eds.), Human Factors in Computing Systems CHI'92 Conference Proceedings, (pp. 373-380). New York: ACM Press.

Nielsen, J. (1994) Heuristic evaluation and usability inspection methods. CSTG-L Discussion List (24.2.94).

Norman, D. A. (1991). Cognitive Artefacts. In J. M. Carroll (Ed.), Designing Interaction: Psychology at the Human-Computer Interface (pp. 17-38). Cambridge: Cambridge University Press.

O'Shea, T. and Self, J. (1983) Learning and Teaching with Computers. Brighton: The Harvester Press.

Office of Technology Assessment (1988) Power On! New Tools for Teaching and Learning. Washington, DC: U.S. Government Printing Office.

Organisation for Economic Co-operation and Development. (1989) Information Technologies in Education: The Quest for Quality Software. Paris: Organisation for Economic Co-operation and Development.

Pelgrum, J. and Plomp, T. (1991) The Use of Computers Worldwide. Oxford: Pergamon Press.

Reeves, T. C. (1992) Evaluating interactive multimedia. Educational Technology, (May) 47-52.

Rowley, J. E. (1993) Selection and evaluation of software. Aslib Proceedings 45(3), 77-81.

Salvas, A. D. and Thomas, G. J. (1984) Evaluation of Software. Melbourne: Education Department of Victoria.

Squires, D. and McDougall, A. (1994) Choosing and Using Educational Software: a Teachers’ Guide. London: Falmer Press.

Squires, D. and Preece, J. (1996) Usability and learning: Evaluating the potential of educational software, Computers and Education 27(1) 15-22.

Squires, D and McDougall, A. (1996) Software evaluation: a situated approach. Journal of Computer Assisted Learning, 12 (3), 146-161.

Steadman, S., Nash, C., and Eraut, M. (1992). CD-ROM in Schools Scheme Evaluation Report. Coventry: National Council for Educational Technology.

Taylor, R. P. (ed.) (1980) The Computer in the School: Tutor, Tool, Tutee. New York: Teachers College Press.

Thornton, D. and Phillips, R. (1996). Evaluation. In R. Phillips (Ed.), Developers Guide to Interactive Multimedia (pp. 105-122). Perth: Curtin University of Technology.

Tolhurst, D. (1992) A checklist for evaluating content-based hypertext computer software. Educational Technology, 32(3), 17-21.

Wellington, J. J. (1985) Children, Computers and the Curriculum. New York: Harper and Row.

Winship, J. (1988). Software Review or Evaluation: Are They Both Roses Or Is One a Lemon? Paper presented at the Australian Computer Education Conference, Perth.

This document was added to the Education-line database 12 April 1999