Seeking quality in criterion referenced assessment
Lee Dunn, Sharon Parry and Chris Morgan
Southern Cross University, Australia
Paper presented at the Learning Communities and Assessment Cultures Conference organised by the EARLI Special Interest Group on Assessment and Evaluation, University of Northumbria, 28-30 August 2002
Over the past decade, traditional norm referenced methods of assessment have come into question, and criterion referenced assessment in undergraduate education has gathered considerable momentum as a method of marking, grading and reporting students' achievements. The value of criterion referencing lies in its capacity to achieve greater transparency in marking and the descriptors it gives us for the abilities and achievements of learners. While the notion of marking and grading against explicit criteria and standards may seem a relatively simple concept, it is complex conceptually and involves a range of problematic assumptions. This paper explores some of the difficulties with implementing criterion referenced assessment, including difficulties in articulating clear and appropriate standards, problems with the alignment of criteria with other elements of the subject or program, and the competence and confidence of university teachers in exercising professional judgement. It is argued that quality and authenticity in criterion referenced assessment are elusive goals and that understanding its guiding principles is not enough. Criterion referenced assessment must be placed in its disciplinary context.
The strengths of criterion referenced assessment
Norm and criterion referenced assessment are two distinctly different methods of awarding grades that express quite different values about teaching, learning and student achievement. Norm referenced assessment, or 'grading on the curve' as it is commonly known, places groups of students into predetermined bands of achievement. Students compete for limited numbers of grades within these bands, which range between fail and excellence. This form of grading speaks to traditional and rather antiquated notions of 'academic rigour' and 'maintaining standards'. It says very little about the nature or quality of teaching and learning, or the learning outcomes of students. Grading is formulaic and the procedure for calculating a final grade is largely invisible to students.
Criterion referenced assessment has been widely adopted in recent times because it seeks a fairer and more accountable assessment regime than norm referencing. Students are measured against identified standards of achievement rather than being ranked against each other. In criterion referenced assessment the quality of achievement is not dependent on how well others in the cohort have performed, but on how well the individual student has performed as measured against specific criteria and standards. Underlying this grading scheme is a concern for accountability regarding the qualities and achievements of students, for transparency and negotiability in the process by which grades are awarded, and for an acknowledgement of subjectivity and the exercise of professional judgement in marking.
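The contrast between the two grading methods can be illustrated with a minimal sketch. The grade labels, cut-off marks, quota percentages and the sample cohort below are all hypothetical assumptions chosen for illustration; they are not drawn from any institution's policy or from the studies cited in this paper.

```python
from typing import Dict

# Hypothetical fixed standards: minimum mark required for each grade.
CRITERION_CUTOFFS = [(85, "HD"), (75, "D"), (65, "C"), (50, "P"), (0, "F")]

# Hypothetical predetermined bands: percentage of the cohort per grade.
NORM_QUOTAS = [("HD", 10), ("D", 20), ("C", 40), ("P", 20), ("F", 10)]


def criterion_referenced(marks: Dict[str, float]) -> Dict[str, str]:
    """Grade each student against fixed standards, ignoring the cohort."""
    grades = {}
    for student, mark in marks.items():
        for cutoff, grade in CRITERION_CUTOFFS:
            if mark >= cutoff:
                grades[student] = grade
                break
    return grades


def norm_referenced(marks: Dict[str, float]) -> Dict[str, str]:
    """Rank students and fill the predetermined bands from the top down."""
    ranked = sorted(marks, key=marks.get, reverse=True)
    grades = {}
    start = 0
    cumulative = 0
    for grade, percent in NORM_QUOTAS:
        cumulative += percent
        end = cumulative * len(ranked) // 100  # integer arithmetic only
        for student in ranked[start:end]:
            grades[student] = grade
        start = end
    return grades


# A hypothetical cohort in which every student performs well.
cohort = {"s1": 95, "s2": 92, "s3": 90, "s4": 88, "s5": 86,
          "s6": 84, "s7": 82, "s8": 80, "s9": 79, "s10": 78}

print(criterion_referenced(cohort))
print(norm_referenced(cohort))
```

In this sketch the lowest-ranked student, with a mark of 78, receives a D against the fixed standards but an F under the quotas, simply because the failing band must be filled. The same mark therefore carries a different meaning depending on how the rest of the cohort has performed.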
Criterion based pitfalls
Although both methods are commonly used in higher education, criterion referenced assessment is successfully supplanting norm referencing as the preferred marking scheme in many universities. Yet for academics who are making this transition, it is not proving to be easy. For those new to the practice, it may be very difficult to desist from normally distributing grades and thereby moderating the performance of students because the notion of a small percentage achieving low grades and a small percentage achieving high grades is well entrenched in higher education.
On a practical level, criterion referencing requires considerable negotiation to arrive at agreed criteria and standards, not only amongst academic colleagues, but also with industry bodies, professional associations and other educational institutions that may have a stake in the learning outcomes. There is a view that criterion referenced assessment is inextricably linked with the competence movement, and is thereby attempting to reduce the assessment of complex professional practice into a series of discrete, observable, lower order tasks (Morgan and O'Reilly 1999). However, Wenger (1998) argues that assessment criteria are complex and that they cannot convey every possible meaning. Hagar et al (1994) further assert that there are a variety of issues to be considered, among them that criterion referenced assessment only assesses trivial and atomistic tasks. They argue that it is unreliable because it involves inference and subjective professional judgement; that it focuses on outcomes at the expense of process; and that it represents a departure from tried and true methods. Of all these concerns, it is perhaps its departure from pretensions of scientific rigour which is the most problematic for many academics whose disciplinary culture, fears about standards, and own experiences as undergraduates, provide a heavy mantle to cast off.
There is a growing body of literature that documents efforts to introduce criterion referencing in higher education (see for example Carlson et al 2000; Price and Rust 1999; O'Donovan, Price and Rust 2000). Issues of interest in this body of literature include (a) the critical necessity of shared understandings about criteria and standards between all stakeholders in the assessment process; (b) issues of fuzziness in the expression of criteria associated with high inference and low inference assessment tasks; (c) the inherent subjectivity of interpreting criteria; and (d) concerns of academic staff about the implications of grades being skewed noticeably from a normal distribution. The establishment of clear and appropriate criteria and standards of achievement is complex indeed.
Assessment criteria and standards
How to establish appropriate criteria and standards for student achievement is far from clear among academics. According to the available literature, policies have changed to criterion referenced assessment in many instances before academics have embraced the new concepts or - in many cases - even understood them. Some of the issues concern how to write clear and appropriate criteria and whether criteria and standards are synonymous terms or whether they need to be separated conceptually and in practice (for example Carlson et al 2000; Barrie, Brew and McCulloch 1999; Brooker, Muller, Mylonas and Hansford 1998).
A confounding feature of criterion referenced assessment concerns varying definitions of 'criteria' and 'standards'. Sometimes the terms are used interchangeably, or the word 'criterion' includes both what is to be assessed and how it will be measured. Conceptually, the terms are complementary but they have separate meanings. A criterion is a characteristic by which quality can be judged, and a standard is a statement about the degree of quality to be attained. Barrie, Brew and McCulloch (1999), for example, found a diversity of understanding and some confusion about the elements of criterion referencing in the academic literature, and they identified seven qualitatively different approaches to writing assessment criteria. Although criterion referenced assessment is now widely adopted, academics tend to confuse the meanings of the two terms, making it difficult to make standards explicit to students. Carlson et al (2000) found that academics have more trouble defining standards than they do writing assessment criteria.
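The conceptual separation of a criterion from its standards can be made concrete with a small data structure. This is only an illustrative sketch: the class name, the level labels 'high' and 'low', and the helper function are hypothetical, and the verbal descriptors are the example phrases this paper itself uses when discussing the semantics of assessment criteria.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Criterion:
    """A characteristic by which quality can be judged."""
    name: str
    # Standards: one verbal statement per degree of quality to be attained.
    # The level labels used as keys are hypothetical.
    standards: Dict[str, str]


critical_reasoning = Criterion(
    name="Evidence of critical reasoning",
    standards={
        "high": "Logically argued; shows an imaginative approach",
        "low": "Disorganised and rambling",
    },
)


def describe(criterion: Criterion, level: str) -> str:
    """Return the standard attached to one level of a criterion."""
    return f"{criterion.name} [{level}]: {criterion.standards[level]}"


print(describe(critical_reasoning, "low"))
```

Keeping the criterion (what is judged) apart from the standards (how well it has been done) in this way shows why the two terms are complementary but not interchangeable: a criterion with no attached standards names a quality without saying what degree of it is expected.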
One of the advantages sought by proponents of criterion referenced assessment is that it depends fundamentally upon criteria that are clear and appropriate. But if academic staff have difficulty with the concepts and practice, students are likely to have even more difficulty. Sadler (1987) spelt out some of the difficulties of achieving explicit assessment criteria, many of which continue to challenge academics today. Sadler argued that fuzziness in verbal descriptions of criteria comes from the capacity for different interpretations of their meaning and from problems articulating where the boundaries of standards lie. He identified assessment criteria and standards as being 'sharp' or as having 'matters of degree' (1987:198).
In this vein, O'Donovan, Price and Rust (2000) found that students have difficulty with vague criteria where the matters of degree are not made explicit. Conversely, Brooker, Muller, Mylonas and Hansford (1998) identify a reductionist approach to writing 'sharp' criteria that can become little more than checklists and do not provide much formative feedback to students. However, the extent to which criteria should be precisely specified in advance depends upon the type of learning outcomes being sought, and this is where disciplinarity comes into play. Professional judgement of the 'I know good work when I see it' kind has been overturned, not least because the current environment of accountability and quality assurance has required assessment decisions that are able to be justified.
Preconceived expectations of performance
The practice of norm referencing continues because academics lose confidence when grades turn out to be markedly higher or lower than would occur with a normal distribution. Additionally, some academics continue to apply norm referencing because they believe that academic standards will be lowered when competition is totally removed from assessment systems. They are concerned that academic rigour will be lost (Rowntree 1987). In fact, increased numbers of higher grades might mean that teaching has improved, students could have more clearly understood the assessment requirements, or marking using criteria and standards might be more reliable. A cluster of lower grades could mean that some students have not achieved pre-requisite learning, or are otherwise unprepared for assessment tasks.
The semantics of assessment criteria
Assessment criteria such as 'evidence of critical reasoning' accompanied by standards articulated in such terms as 'shows an imaginative approach', 'logically argued' or 'disorganised and rambling' are open to a variety of interpretations. A teacher could easily have a different understanding of the criteria and standards from the student who needs to interpret them. Nevertheless, there are settings in which it is appropriate for criteria to allow students some latitude. In addition, opportunities for higher order thinking or creativity can be constrained when criteria are developed so precisely as to be reductive in their effect on student performance. One way to offset misunderstanding and confusion is through expanded examples, models and definitions that give clear messages to students about the range of acceptable performance. Several studies (Carlson et al 2000; O'Donovan, Price and Rust 2000; Sambell and Johnson 1999; Brooker, Muller, Mylonas and Hansford, 1998) demonstrate how this can work in practice and in addition, can provide common parameters for staff in a marking team. In these cases, it is necessary to reflect upon the meanings and implications of the criteria, the standards and their linkages to course and subject goals and to the goals of particular assessment tasks.
Linking assessment criteria and learning objectives
For many authors it is important to see assessment as an integral part of the learning process (Carlson et al 2000:108). Black and Wiliam (1998) go further to argue that formative assessment is vital to learning processes. Notions of authentic assessment and constructivism are helpful here, particularly when issues about the reductive nature of criterion referenced assessment are taken into consideration.
The idea of authentic assessment
Taking the idea of assessment criteria being used to guide learning, Cumming and Maxwell (1999) argue that the trend towards criterion referenced assessment has led to two considerations. They are (1) the use of learning outcomes as indicators of learning and (2) the notion that learning and assessment need to be meaningful for students because learning depends on context and motivation. The push for the close alignment of a syllabus to assessment tasks (Biggs, 1999) is consistent with this thinking and also with the aims of 'authentic' assessment that promotes the practice of directly assessing students on 'worthy intellectual tasks', as opposed to assessment that makes inferences about students' abilities through indirect assessment. Authentic assessment mirrors real contexts and ill-structured challenges (Practical Assessment and Evaluation online, accessed August 2002). Authentic assessment tasks help students to focus on demonstrating their ability to discern critical knowledge and to act effectively in situations that make sense in their future professional contexts. Learning outcomes, teaching and learning activities and assessment match as to their descriptions, content, depth and curriculum goals. Authentic assessment goes beyond the concept of validity; it is holistic and professionally valued.
Linked to the notion of authentic assessment is the notion of constructive learning. In this perspective people make their own meaning to construct learning outcomes. This idea contains several assumptions: that learning is a result of constructive activity by students; that social and cultural contexts and communities influence learning and that learning is a social and collaborative activity. Within this framework, teachers support the construction of learning and provide an environment where learning is able to take place. People learn through direct experience, and must be allowed to make errors and look for solutions to intellectual and practical problems. Biggs (for example 1992, 1999) links this theory of learning with the principles of curriculum alignment to form what he calls 'constructive alignment' of the curriculum. The model is compatible with the principles of authentic assessment because the focus is on designing the formal learning experience holistically to enable students to achieve learning outcomes in a way that has meaning for them as individuals and groups. The model of constructive alignment specifically links learning outcomes to assessment tasks and assessment criteria. It also embeds Biggs' (1992) SOLO taxonomy of learning and includes formative feedback via verbal descriptions of standards of attainment as well as criteria.
While the notion of constructive assessment seems entirely consistent with constructive learning processes, it actually refers to the meaning constructed by the learner in response to the assessment task. It is necessary to distinguish in 'authentic assessment' between assessing the 'performance' and assessing the understanding that is constructed by the learner.
How clear is too clear in criterion referenced assessment?
Although constructive alignment and authentic assessment practices are student-focussed and enable students to make their own meanings, there is a view that by their very nature pre-set criteria may not allow students to push the parameters of existing knowledge. There may not be room for unexpected learning outcomes, especially if, as Carlson et al (2000) argue, university teachers have difficulty articulating assessment standards. Other critics (see for example Edwards, 1997) are concerned that pre-determined learning outcomes are counter to the principles of adult learning where learners go beyond the course goals and view learning in terms of their own life goals. Riley and Stern (1998) assert that there is a danger that learning activities that relate to authentic assessment tasks can become an end in themselves and that the development of important knowledge bases can be hidden if learning objectives are written with the focus on authentic 'performance' as an indicator of the achievement of learning outcomes.
These issues are not able to be addressed at a generic level, for their foundations lie in epistemology. To better understand how university teachers might develop competence and confidence in criterion referenced assessment, it is necessary to direct our attention to disciplinary differentiation in undergraduate assessment, grading criteria and achievement standards.
Assessing with competence and confidence: the importance of academic disciplines
Angelo and Cross (1993:4) argue that "...A defining characteristic of any profession is that it depends on the wise and effective use of judgement and knowledge..." However, professional judgement, when it comes to setting assessment criteria, can vary across settings. Sadler (1987) argued that precisely defined standards could be 'sharp' or they could be 'matters of degree' where precision is not called for. This distinction is perhaps best explained by a disciplinary perspective on professional judgement, which, as Becher (1989) has shown, is shaped by the nature of its knowledge base.
The disciplinary groupings of hard, soft, pure and applied fields of knowledge derived by Becher (1989) from the work of Biglan (1973a; 1973b) and Kolb (1981) point to very different kinds of professional judgement based on different characteristics of knowledge. Exercising professional judgement in undergraduate assessment concerns measuring students' knowledge and skills in matters considered significant or important in the field.
Making judgements: hard pure disciplines
Hard pure knowledge (which may be exemplified by physics and chemistry) is typified as being cumulative and atomistic in structure, concerned with universals, simplification and a quantitative emphasis. Knowledge communities tend to be competitive but gregarious: joint or multiple authorship is commonplace (Neumann, Parry and Becher 2002). Professional judgement relies upon a concrete knowledge base that is shared by the knowledge community. Answers to assessment tasks tend to be either correct or incorrect, with little or no room for interpretation. They are therefore more likely to be low inference tasks where criteria are concrete. They are also more likely to be specific, closely focused examination questions or multiple choice questions (Neumann, Parry and Becher 2002). Marking and grading may be confidently undertaken with sufficient command of the knowledge base, and in any case, there is less likelihood in hard pure fields that judgement will be questioned. Warren Piper, Nulty and O'Grady (1996), for example, found that professional judgement was less likely to be questioned the more mathematical the discipline.
When it comes to confidence, however, the nature of the knowledge base provides particular constraints on professional judgement. Here there is likely to be the desire to test everything in the curriculum since fields such as physics and chemistry depend in undergraduate years upon learning foundation knowledge in cumulative fashion. Over-assessment is frequently identified in these settings (see also Warren Piper, Nulty and O'Grady 1996). Parry, Hayden and Speedy (2000) found that guidelines for marking and grading are relatively less common and that professional judgement - which is typically objective and disinterested - is more likely to be norm referenced (see also Warren Piper, Nulty and O'Grady 1996).
Making judgements: soft pure disciplines
Soft pure knowledge (of which history and anthropology are worthy examples) is in contrast reiterative, holistic, concerned with particulars and based on interpretation. Unlike hard pure fields, knowledge seeks to provide new insights into existing phenomena. Scholarly enquiry is unlikely to be a collective endeavour because researchers tend to pursue individual interests at a deep level. Competent professional judgement in these settings is more likely to be conferred by the knowledge community and based upon familiarity with expectations, conventions, values and theoretical influences in the field. Ultimately, professional judgement is sophisticated, complex and subjective; assessment tasks are likely to be high inference. In these settings, undergraduate assessment is more likely to be a continuous process that highlights the student's intellectual development. Consistent with these features, Warren Piper, Nulty and O'Grady (1996) found that essays, short answer papers and project reports were the main assessment tasks and that guides to marking criteria in criterion referenced assessment were relatively more common, a finding that fits the high inference nature of the assessment tasks. In addition, examinations are relatively less common because undergraduate students need to learn how to develop and shape an argument, so continuous and formative assessment are more prevalent (Neumann, Parry and Becher 2002).
Making judgements: hard applied disciplines
Hard applied knowledge (such as in engineering and the technologies) is concerned with mastery of the physical environment and geared towards products and techniques. Knowledge is purposive and pragmatic, producing know-how via hard knowledge. Hard applied knowledge communities, according to Biglan (1973b), are also gregarious, with multiple influences and interactions on both their teaching and research activity. In these fields, the emphasis in assessment is likely to be on problem-solving and practical skills, and there is a strong value placed on the integration and application of existing knowledge (Smart and Etherington 1995). Professional judgement in developing assessment criteria derives from a cumulative knowledge base, but is strongly influenced by professional standards. As in hard pure disciplines, assessment tasks are likely to be low inference, even though the kinds of tasks are different. Parry, Hayden and Speedy (2000) found that assessment tasks in hard applied fields were more likely to involve project work and simulation once the initial knowledge building blocks are established early in the degree program.
Making judgements: soft applied disciplines
Similarly, soft applied knowledge (such as education and management studies) is dependent on soft pure knowledge, but given expression through professional practice. Here, too, as in soft pure disciplines, high inference assessment tasks predominate. In addition, however, there is a focus upon protocols and procedures, with the aim being the enhancement of professional practice. As in hard applied fields, assessment tasks emphasise knowledge application and integration, usually in essay or explanatory form (Neumann, Parry and Becher 2002).
So far, the discussion has concentrated on explicit features of disciplinary knowledge and how these are manifested through undergraduate assessment. However, in exploring the implications for criterion referenced assessment, there are two inexplicit features of disciplinarity that must be taken into account. One is the dynamism of disciplines, including the evolutionary nature of specialised fields, and the other is the nature of disciplinary culture - not all conventions, values or expectations are explicitly obvious to scholars, let alone to their students.
Implications for setting standards and making judgements
Becher (1989) has shown how fields of study, like their parent disciplines, are constantly changing, evolving as new knowledge is made. In disciplines centred on hard pure knowledge that is competitive and cumulative, the pace of knowledge production is rapid so that competent judgement about achievement criteria in assessment depends to a considerable degree upon a thorough knowledge of the field, including recent developments. In soft pure fields which are interpretive, reiterative and individualistic, knowledge production is slower and less competitive. Competent judgement depends less upon keeping up to date with very recent developments, and more upon having a deep and sophisticated knowledge of theoretical developments in the field and how to build an argument that provides new insight into existing phenomena (Parry 1998).
In both hard and soft applied fields, there is the need for a grounding in pure knowledge but an emphasis upon integration and application. For this, a strong understanding of the values and expectations of the profession concerned is vital. In applied settings, university teachers must draw upon very different kinds of expertise in making professional judgements about student assessment. The capacity to set appropriate learning aims and assessment tasks depends upon the assessor's knowledge of values and conventions in the field. Whether assessment tasks are low or high inference by nature is also important. Where tasks are high inference, the assessment criteria are likely to evolve over time as the same task is marked (Nulty, accessed online 2002). Where there are multiple markers, the process is even more problematic. The assessor's expertise in the values and conventions of the field is confounded by the dynamic nature of academic disciplines too, because knowledge is constantly evolving.
A second and pervasive consideration is that many, if not most, disciplinary conventions, values and expectations are inexplicit and are learned by tacit means (see, for example, Gerholm 1990; Parry 1998). Competent student assessment depends on a sound knowledge of those inexplicit norms such as writing style, citation and acknowledgement, structure of argument, positioning with the audience and command of the tacit knowledge of the field (Bazerman 1988; Parry 1998). Not only is it essential for assessment criteria to relate explicitly to learning aims and what is taught, but undergraduate students need to know what is expected of them, including any implicit expectations such as writing style or citation practices.
Making expectations clear and explicit is problematic in soft, interpretive and applied fields where grading criteria cannot be too precise or they will constrain student performance. Not surprisingly, it is in these settings that exemplars of good work such as projects and portfolios are most likely to be used to inform students about how they will be assessed (Parry, Hayden and Speedy 2000). Art history or, in the applied domain, clinical aspects of nursing are likely to be taught this way, with criteria leaving room for individual interpretation, application and performance. While Carlson et al (2000) identified the difficulty many academics have in establishing clear achievement standards, they did not take into account the constraints of disciplines and in particular the inherent fuzziness in soft disciplines.
Miller and Parlett (1976) identified three categories of students based on their approaches to studying for examinations. One is students who are 'cue-conscious'; they know there are implicit expectations about how they should perform and they are aware that they should find them out. Another category is 'cue-deaf'; these students are unaware that there are implicit expectations of their performance. The third category is 'cue-seeking'. These students are very tuned in to inexplicit conventions, traditions, aspects of style and so on, and they actively seek to find them out.
Miller and Parlett's (1976) categorisations point to the importance, in undergraduate assessment, of university teachers providing appropriate cues to students through assessment criteria and explicit standards. Competent assessment depends upon the extent to which disciplinary conventions and values are highlighted through assessment criteria. While the conventions vary markedly across disciplinary groupings (see, for example, Parry 1998), so too do disciplinary values (see Lattuca and Stark 1995). Braxton (1993) identified characteristics in soft fields such as valuing student character development, and emphasising the development of critical thinking skills (analysis and synthesis) as being important, so we are left with the question of how one confidently or competently builds these notions into criterion referenced assessment.
There is much work to be done on effective assessment within the context of particular disciplinary settings. Many of the issues and concerns associated with criterion or standards based assessment cannot be addressed until more empirical studies are undertaken and academic departments make collective, course-wide decisions about the kinds of values they expect to see embedded in their students' assessment tasks.
This paper has highlighted some key concerns about achieving quality in criterion referenced assessment practices: that academics are slow to change their attitudes to a positive view of criterion referenced assessment and may, therefore, default to norm referencing when in doubt; that the intensive level of negotiation required to formulate criteria and standards is difficult and time consuming; and that academics find it hard to clarify and articulate assessment standards. In addition, it explains how and why assessment tasks might not be appropriately authentic or enable students to construct their own meaning. It also explains why academics need to be able to characterise the nature of their field of knowledge, because these characteristics constrain the extent to which assessment criteria can be sharply defined (low inference) or are interpretive (high inference). This kind of understanding is needed to properly inform effective assessment, but it remains an area where there is a paucity of empirical research.
This paper emphasises the necessity for academics to be reflective and to recognise that assessment is always a problematic activity.
References
Angelo, T. A. and Cross, K. P. (1993) Classroom Assessment Techniques A Handbook for College Teachers (2nd Edition). New York, Jossey-Bass.
Barrie, S., Brew, A. and McCulloch, M. (1999) "Qualitatively different conceptions of criteria used to assess student learning" paper presented to Australian Association for Research in Education (AARE) conference Melbourne. http://www.aare.edu.au/99pap/bre99209.htm accessed August 2002.
Bazerman, Charles. (1981) "What Written Knowledge Does". Philosophy of the Social Sciences 2: 361-387.
Becher, Tony. (1989) Academic Tribes and Territories: Intellectual Enquiry and the Cultures of the Disciplines. Milton Keynes: Open University Press.
Biggs, J. (1999) Teaching for Quality Learning at University Society for Research into Higher Education and Open University Press Buckingham UK.
Biggs, J. (1992) "A qualitative approach to grading students" in HERDSA News Vol 14 No. 3:3-6.
Biglan, A. (1973a) "The Characteristics of Subject Matter in Different Scientific Areas." Journal of Applied Psychology 57: 195-203.
Biglan, A. (1973b) "Relationships Between Subject Matter Characteristics and the Structure and Output of University Departments." Journal of Applied Psychology 57: 204-213.
Black, P. and Wiliam, D. (1998) Inside the black box: raising standards through classroom assessment Department of Education and Professional Studies UK.
Braxton, J. M. (1993) "Selectivity and Rigor in Research Universities." Journal of Higher Education 64:657-675.
Brooker, R, Muller, R, Mylonas, A and Hansford, B. (1998) 'Improving the Assessment of Practice Teaching: a criteria and standards framework' in Assessment & Evaluation in Higher Education Vol 23, No. 1: 5-20.
Carlson, T., Macdonald, D., Gorely, T., Hanrahan, S. and Burgess-Limerick, R. (2000) "Implementing Criterion-referenced Assessment within a Multi-disciplinary University Department" in Higher Education Research & Development Vol 19, No. 1:104-116.
Cumming, J.J. and Maxwell, G.S. (1999) "Contextualising Authentic Assessment" in Assessment in Higher Education Vol 6, No. 2:177-194.
Edwards, R. (1997) Changing Places? Flexibility, Lifelong Learning and a Learning Society Routledge New York.
Gerholm, Thomas. (1990) "On Tacit Knowledge in Academia." European Journal of Education 25: 263-271.
Hagar, P., Gonczi, A. and Athanasou, J. (1994) "General Issues about the Assessment of Competence" in Assessment & Evaluation in Higher Education Vol 19, No. 1:3-15.
Kolb, D. A. (1981) "Learning Styles and Disciplinary Differences." In The Modern American College, pp. 232-55. Edited by A Chickering. San Francisco: Jossey Bass.
Lattuca, L. R. and Stark, J. S. (1995) "Modifying the Major: Discretionary thoughts from Ten Disciplines." Review of Higher Education, 18 (3), pp. 315-344.
Miller, C. M. L. and Parlett, M. (1976) "Cue Consciousness." In The Process of Schooling, pp. 143-50. Edited by M. Hammersley and P. Woods. London: Routledge and Kegan Paul.
Morgan, C. and O'Reilly, M. (1999) Assessing Open and Distance Learners London: Kogan Page.
Neumann, R., Parry, S. and Becher, T. (2002) "Teaching and learning in their Disciplinary Contexts: A Conceptual Analysis". Higher Education. October (forthcoming).
Nulty, Duncan D. (accessed online September 2002) Three ways to pursue academic quality? Teaching and Learning Development Unit, Queensland University of Technology. http://www.tedi.uq.edu.au/conferences/A_conf/papers/Nulty.html
O'Donovan, B., Price, M. and Rust, C. (2000) "The Student Experience of Criterion-referenced Assessment (through the introduction of a common criteria assessment grid.)" in Innovations in Education and Teaching International IETI 38,1. http://www.tandf.co.uk/journals
Parry, S. (1998) "Disciplinary discourse in Doctoral Theses." Higher Education, 36 (3) pp. 273-299.
Parry, S. Hayden, M. and Speedy, G. (2000) "Report on a CUTSD-Funded Staff Development Project on Student Assessment Practices". Report to the Commonwealth Department of Education, Training and Youth Affairs, Lismore, Southern Cross University.
Practical Assessment, Research & Evaluation (a peer reviewed electronic journal) http://www.ericae.net/pare/getvn.asp?v=2&n=2 accessed August 2002.
Smart, J. C. and Etherington, C. A. (1995) "Disciplinary and Intellectual Differences in Undergraduate Education Goals." In Hativa, N. and Marincovich, M. Disciplinary Differences in Teaching and Learning: Implications for Practice. No 64 Winter. San Francisco: Jossey-Bass, pp. 49-57.
Riley, K.L. and Stern, B.S. (1998) "Using Authentic Assessment and Qualitative Methodology to Bridge Theory and Practice" in The Education Forum Volume 62, Winter pp. 178-185.
Rowntree, D. (1987) Assessing Students: How shall we know them? Kogan Page London.
Sadler, D.R. (1987) "Specifying and promulgating achievement standards" in Oxford Review of Education Vol 13, No. 2:191-207.
Warren Piper, D. Nulty, D. D. and O'Grady, G. (1996) Examination Practices and Procedures in Australian Universities. Canberra: Department of Employment, Education, Training and Youth Affairs.
Wenger, E. (1998) Communities of Practice: Learning, Meaning and Identity Cambridge University Press, Cambridge.