
Screen or page: will the use of computer aided instruction improve phonological skills in year 1 classes?

Mary Wild
Oxford Brookes University

Paper presented to British Educational Research Association Annual Conference, University of Manchester, 16-18 September 2004

The paper reports the results of a randomised control trial that investigated the use of computer-aided instruction with Year 1 children. The focus was on the potential benefits of using computers for practising phonological awareness skills.

A total of six primary or first schools, all located in Oxfordshire but with differing pupil profiles, participated in the study. The total number of children involved in the study was 127. Each school was involved in the study for 10-12 weeks, over a single term, during the course of the academic year 2001-2002.

The main strategy was experimental, adopting a randomised control trial design, in which Year 1 pupils were stratified according to gender and broad academic ability and then randomly allocated to one of three groups. Two of the groups were taught in pairs using the same phonological awareness programme; one group undertook practice exercises using a computer and the other group undertook practice exercises using a more traditional paper-based format. The third group, the control group, experienced a practical maths programme with no explicit literacy or ICT components. Children in all three groups were tested before and after the intervention using a range of literacy and mathematical assessments, and their relative rates of progress were assessed.

Statistical analysis of the children’s attainments at pre- and post-test indicated that there was a significant learning advantage accruing to children in the computer-based group as compared to the paper-based group and the control group. There were no significant learning differences between children in the paper-based and control groups. Analysis by gender indicated that the girls in the computer group made significantly more progress than the boys. Boys and girls made similar progress in the other two groups.

“Determining the actual, as opposed to the possible, impact of the new technology on literacy could be one of the most interesting research challenges in the twenty-first century”

(Hannon, 2000, p28).

 The rationale for conducting this study was based on the continuing impetus within educational policy and curriculum guidance for more extensive use of ICT at all levels of education (Blair, 1997; DES, 2000; DfES, 2003), juxtaposed with a particular paucity of research into educational ICT for the Early Years. There has been no shortage of contributions to the long-established debate about whether or not, or in what ways, ICT might be beneficial in children’s education. Strangely enough however, there has been comparatively little research that is directly linked to specific learning outcomes, and even less that has attempted to make explicit comparisons with other more traditional instructional media. BECTa (2001) for example, conducted a literature search into studies that concentrate on the use of ICT to support the teaching of specific aspects of English, Maths and Science at Key Stage 2 and found only 36 studies. The scarcity of such studies for the 0-8 year age range is even more acute (Lankshear and Knobel, 2003).

Furthermore, as Higgins (2003) has pointed out, although there is some evidence that ICT can aid the learning of pupils, “there is not a simple message in such evidence that ICT will make a difference simply by being used” (p5). He too specifically highlights the need for more comparative research, in which the effectiveness of ICT can be judged against alternative approaches to teaching. In so doing he echoes the calls of policy makers (Blunkett, 2001) and fellow researchers. Reynolds (2001), for example, has suggested that “thus far the research into whether ICT has effects, the appropriate utilisation of IT, and “what works” ………is an area of more assertion than evidence.” Moreover, as Higgins and Moseley (2002) have pointed out, “there was (and still is) virtually no evidence to justify the expenditure needed to replace paper and pencil by computer for daily reading and writing activities in primary schools” (p31).

Rather, much of the research to date may have been “overly optimistic” and has exhibited a preference for large-scale surveys or individual case studies of particular innovations in particular schools (Selwyn, 1997). Specifically, in an extensive report prepared for the Teacher Training Agency, Moseley, Higgins, Newton, Tymms, Henderson and Stout (1999) point out that whilst a huge number of studies have been conducted into computer aided learning (or CAL), a “disappointingly small proportion of this is empirically based and an even smaller proportion looks at pupil improvement through identified gain, particularly for the primary age range” (Appendix 2 of the report).

The present study sought to provide some such empirically based evidence in respect of the measurable impact of using ICT on the phonological skills of children within Key Stage 1, specifically Year 1 classes. It did so through the prism of tightly focused comparative research questions and the application of a rigorous methodology. In particular the following research questions were addressed:

A further subsidiary question is also addressed in this paper

The rationale for the skills focus on phonological awareness, for the selected methodological approach, and for the gender dimension is considered in separate sections below, as is some of the evidence to date on the effectiveness of ICT in comparison with more traditional learning media.

Why focus on phonological awareness?

Phonological awareness is a literacy skill that has assumed a growing importance within the school curriculum at Key Stage 1 over recent years, particularly since the Government introduced the National Literacy Strategy (DfEE, 1998). From the outset the NLS has been explicit in promoting the use of “systematic” phonics as a teaching method in school-based literacy and continues to stress the importance of phonics in the curriculum (DES Standards Site, 2003; NLS, 2003).

Though such an overt emphasis on the teaching of phonics has not been uncontroversial and the “great debate” (Chall, 1967) over the best way in which to teach reading has a lengthy history, the belief that the learning of phonics has a valid place in the teaching of reading has received support from a number of meta-analyses of the available evidence (Adams, 199l; Snow, Burns and Griffin, 1998; Bus and van IJzendoorn, 1999; Ehri, Nunes, Willows, Schuster, Yaghoub-Zadeh and Shanahan, 2001; Ehri, 2003).

Consequently, it would seem that phonological awareness would be a useful focus for a study into the comparative effectiveness of ICT as a teaching medium. Indeed, Underwood (1994) has previously suggested that ICT applications, even of the drill and practice orientation, may usefully have a role to play in learning situations where pupils are required to master highly structured learning goals. Relating this idea specifically to early literacy goals, Underwood (2000) has also suggested that software of the structured practice format might be especially useful in supporting the acquisition of basic literacy skills at the sub-word level and for phonics reinforcement.

Adopting a comparative methodology

To demonstrate whether an ICT approach would be “especially useful” compared to other classroom methods demands research of a specifically comparative nature. It has been argued (Boruch, 1997) that in order to gauge the effectiveness of any new educational approach it is necessary to compare the “condition of the individuals who have received the new service against the condition they would have been in had they not received the new service.” Furthermore, for any such comparison to be fair the comparison groups should not differ systematically from one another in either sample composition or programme implementation, other than in the nature of the programme itself. Within this context, Randomised Control Trials (RCTs), which “involve the random allocation of eligible individuals or entities to each of two or more treatment conditions” (Boruch, Snyder and DeMoya, 1994), might represent a valid and useful investigative approach.

Such an approach to educational research is not without critics (Hammersley, 1997; Edwards, 2000; Pirrie, 2001), who argue that there can be no such thing as a “science of education.” However, after bitter debate (Sylva, 2000), there has been an emerging recognition that, rather than approaches being generically appropriate or inappropriate, the choice of approach must be informed by the question that the research seeks to address (Pring, 2000; Smeyers, 2001).

The chief benefit of adopting an RCT design is that the random allocation of participants to experimental or intervention groups militates against any systematic bias in group composition and thereby provides “strong evidence” that any differences in so-called “outcomes” are causally related to the different interventions experienced by the participants in each group (Robson, 1993). This is of course strengthened in a design in which other potentially confounding variables are also controlled for. Additionally, the adoption of an RCT design confers statistical advantages, in that the random allocation of participants to groups allows for the use of more robust statistical techniques (Boruch, 1997).

What have previous studies shown about the comparative effectiveness of ICT?

 There have been relatively few comparatively oriented studies to date that investigate the use of ICT to support literacy skills by focusing on learning outcomes rather than learning processes. Some of the more renowned studies (BECTa 2000; 2001) and the Impact studies (Watson, Cox and Johnson, 1993; Harrison, Comber, Fisher, Haw, Lewin, Lunzer, McFarlane, Mavers, Scrimshaw, Somekh and Watling, 2002) have been characterised by a school-level approach rather than considering the issue at the classroom level.

One relatively large study, which combined elements of case study methodology with attempts to quantify any learning advantages for children, was the TTA sponsored study referred to earlier (Moseley et al, 1999). In addition to the large-scale surveys, the project involved a development phase in which a sub-sample of 16 class teachers undertook specific ICT projects, related either to literacy or numeracy. These development projects were closely monitored and their impact on pupil attainment was assessed using standardised tests and criterion-referenced measures. Results from this final phase indicated significant gains in 14 out of 16 development classes. However, as the researchers involved freely admit throughout the report, the absence of control groups for the development projects means that it is impossible to deduce a causal link between the ICT application and the progress in measured outcomes. The findings are at best only strongly indicative of such a link.

 Underwood has conducted a series of studies into the specific potential of Integrated Learning Systems (Underwood, 2000). These studies focused on children at Key Stage 2 and secondary level and compared ILS to traditional classroom practices. Interestingly, she found that overall both groups made similar levels of progress, but at an individual school level differential performance was recorded. Similar findings emerged from an earlier evaluation of two types of ILS programme (NCET, 1996). This suggests that the dynamics within individual schools exerted a more powerful influence on the results than the use of the different teaching media. 

Another study that specifically targeted children deemed to be at risk of learning disabilities was able to demonstrate benefits for 5-6 year old pupils of the ICT approach over more traditional instructional methods (Mioduser, Tur-Kaspa and Leitner, 2000). On the other hand, contrasting results were obtained in an intervention study conducted to compare the progress of children identified as at risk of reading failure who used the RITA computer-based literacy support system with that of children using a more traditional reading support programme (Nicolson, Fawcett and Nicolson, 2000). Children aged 6 and 8 in four schools formed the main sample groups and were matched by reading and chronological age to control groups from other schools who received only the standard classroom support for reading. Although the intervention groups made significant progress over the control groups, there were no significant differences between the intervention groups. This suggests that it was additional help rather than the type of help that proved most beneficial.

Focusing on children who were not deemed to have potential reading difficulties, there have been studies suggesting that ICT may be beneficial for children who are in the early stages of learning to read. Davidson, Elcock, Noyes and Terrell (1991) undertook a small-scale comparative study involving just 20 children in total. Half of the children used a computer programme with digitised speech as part of their classroom reading scheme. The other 10 children followed the regular scheme. The experimental group significantly increased their scores on a standardised reading test compared to the control group. Though such a small sample means that the findings must be interpreted cautiously, a later, slightly larger, study involved 60 children aged 5-7 years and also appeared to show that computers could provide effective reading practice for young children (Davidson, Elcock and Noyes, 1996). However, the overall amount of practice each child received was not controlled and the study relied on teacher delivery, meaning results could be due to teacher style or preferences.

More recently, Van Daal and Reitsma (2000) report on a study involving kindergarten children (since this originated in the Netherlands this includes children who would be considered Key Stage 1 age in England and Wales). The programme in the study was a drill and practice type programme to aid initial reading and spelling skills, called “Leescircus”. Two classes were selected and, within these, two sub-groups were randomly allocated to either experimental or control groups. Comparisons of pre- and post-test scores showed significant gains for the computer group on letter knowledge and on reading single words and non-words as opposed to the control group. Unfortunately the authors acknowledge that the implementation of the programme was not uniform, and there were large variations in the amount of time individual children spent on the computer.

 Why include a gender dimension?

This final strand of the research was designed to probe the commonly held assumption that boys are more interested in, and responsive to, computer technology, and that such technology could therefore be utilised to address boys’ perceived underachievement in reading in comparison with girls (Millard, 1996, 1997; Noble, 2000). Interestingly, though there has been a great deal of research into the attitudes of boys and girls towards computers, it is far from clear that boys are uniformly more positively disposed towards computers (Yelland, 1995; Fitzpatrick and Hardman, 2000). There has also been some evidence to suggest that in some respects girls are more positively disposed than boys to use computers educationally rather than for amusement, at least within the home environment (Furlong, Furlong, Facer and Sutherland, 2000; Murphy and Beggs, 2003).

However, Brosnan (1998) presents evidence that a more positive attitude towards computers amongst boys aged 6-11 does translate into higher levels of computer-related attainment, albeit in a relatively small study involving only 48 children in total. Similarly Passig and Levin (2000), working in three kindergarten classes (involving 90 children), found that different styles of computer interface were related to differing levels of child satisfaction and that this was related to gender. The famous “Honeybears” research (Littleton, Light, Joiner, Messer and Barnes, 1992), later extended by Joiner (1998), also suggests an interaction between gender and computer use.

Other research has considered the inter-gender dynamics of children using computers in classrooms and has provided evidence that in mixed-gender pairs boys tend to be more assertive, whereas in same-gender pairs children are equally assertive (Underwood, G. 1994; Underwood, McCaffrey and Underwood 1990; Fitzpatrick and Hardman 2000). However, Fitzpatrick and Hardman (2000), Underwood and Underwood (1998) and Underwood, Underwood and Wood (2000) have suggested that differential patterns of interaction do not necessarily translate into differential task performance, though the Underwood et al study (2002), based around interactive storybooks, indicates that girl-girl pairings did demonstrate better performance as measured by subsequent and delayed (over several weeks) story recall.

The interaction between gender and learning medium is clearly a complex one, and the evidence is at least suggestive that gender might impinge on the effectiveness of using a computer-based instructional programme. For this reason the present study incorporated a gender dimension.

Methodology

The research was conducted during the academic year 2001-2002, following a small pilot study in summer 2001. It involved a total of 127 children across 6 primary or first schools in Oxfordshire. The six schools required for the study were selected randomly from the Oxfordshire Schools Admissions and Transfers Booklet (OCC, 2000), though schools with fewer than 100 pupils on roll were not included since such schools might have insufficient pupils in the Year 1 age group to ensure adequate statistical power at the data analysis stage. Evidence was collected as to the demographic profile of each of the schools and the most recent Ofsted data for each school were checked. Analysis of both sets of data suggested that although the schools were all within the same county they had quite different demographics, strengths and weaknesses.

Two schools were involved in the study per term, and each was visited by the researcher for the equivalent of two days per week for 10-12 weeks. The research involved only Year 1 children. In five of the schools the researcher worked with single classes. In one school there were two parallel Year 1 classes and the researcher worked within both classes with children who were judged by their teachers to be working at Year 1 ability levels. Thus although the research took place across six schools, seven classes were involved.

Within each class, there was random allocation of participants to one of three groups. Therefore any idiosyncrasies of school or classroom dynamics that might act as confounding variables could be expected to be equally distributed across the three groups. There were two main comparison groups, wherein participants followed the same phonological awareness programme, but one group undertook practice exercises using computer software and the second group undertook more traditional paper-based practice exercises using a series of comparable worksheets produced by the publishers of the computer software. There was also a third control group. However, this was not a classical no-treatment control group. In order to mitigate any potential “halo” effects in the two main intervention groups of working with the peripatetic researcher/teacher, the control group also worked with the peripatetic researcher/teacher but on a series of unconnected practical maths games, which contained no explicit literacy or ICT components. Although participants were randomly allocated to the groups, there was an element of stratification at the level of gender and broad ability levels. Ability stratification was determined according to the patterns of ability grouping structures applied by the class teachers of the classes involved in the study and was corroborated by using the British Picture Vocabulary Scales II (Dunn, Dunn, Whetton and Burley, 1997) at the data analysis stage.
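The allocation procedure described above can be illustrated with a short sketch. It is not the procedure actually used in the study, whose mechanics the paper does not specify; the function name `allocate`, the tuple layout and the ability labels are all assumptions made for illustration. Children are grouped into gender-by-ability strata, shuffled within each stratum, and then dealt in turn to the three groups so that each stratum is spread as evenly as possible.

```python
import random
from collections import defaultdict

GROUPS = ("computer", "paper", "control")


def allocate(children, seed=None):
    """Stratified random allocation (illustrative only).

    children: list of (child_id, gender, ability_band) tuples; the field
    names and band labels are hypothetical, not taken from the study.
    Returns a dict mapping child_id to one of GROUPS.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for child_id, gender, ability in children:
        strata[(gender, ability)].append(child_id)

    allocation = {}
    position = rng.randrange(len(GROUPS))  # random starting group
    for key in sorted(strata):
        members = strata[key]
        rng.shuffle(members)  # random order within the stratum
        for child_id in members:
            allocation[child_id] = GROUPS[position % len(GROUPS)]
            position += 1  # carry over so group sizes stay balanced overall
    return allocation


# A made-up class of 30: five children in each gender-by-ability stratum.
example_class = [
    (f"child_{g}_{a}_{i}", g, a)
    for g in ("boy", "girl")
    for a in ("lower", "middle", "higher")
    for i in range(5)
]
allocation = allocate(example_class, seed=1)
group_sizes = {g: list(allocation.values()).count(g) for g in GROUPS}
print(group_sizes)
```

Dealing children to groups in turn after shuffling keeps the three groups within one child of each other in size, both overall and within each stratum, while leaving the identity of the children in each group to chance.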

The participation, selection and allocation procedures resulted in a total sample size of 127 children, of whom 44 children were in the computer-based intervention group, 43 children in the paper-based intervention group and 40 in the control group. Groups were almost equally balanced in terms of gender. The computer group had 22 boys and 22 girls; the paper-based group had 21 boys and 22 girls; and the control group had 21 boys and 19 girls. Children with EAL or SEN were found to be equally distributed amongst groups.

All three groups were tested before and after the intervention programmes on a range of literacy and mathematical learning outcomes in order to determine the degree to which each group had progressed relative to each other. It was therefore a classical pre-test/post-test design and an example of a “value-added” design. The core programmes or interventions spanned a total of six weeks in each school and commenced after initial pre-testing was completed in the first two weeks in each school.

The Intervention Programmes

 It was considered essential that the programme experienced by the two main intervention groups should be as comparable as possible, except in respect of the variable of primary interest i.e. the practice media experienced by the participant children.

The programme selected for the purposes of the intervention study was the “Rhyme and Analogy” programme (Goswami & Kirtley, 1996), published by Oxford University Press/Sherston Software. The authors are renowned experts within the field of literacy and phonological awareness and the software company are an established and respected provider of such educational programmes. The educational software was awarded the BETT award for 2000. The educational pedigree of the selected programme was therefore sound. For the purposes of the study, it was not intended that the programme should be delivered in its entirety; rather the two main intervention groups would experience alternative elements within it. The research therefore was not an evaluation of the “Rhyme and Analogy” programme per se.

The overall programme comprises two packs of six Story Rhyme books and within the time constraints of the research study it was possible to follow only one set of six books: Pack A, together with the associated photocopiable worksheets (paper-based group) or CD-ROM activities (computer-based group). The product information detailed on the cover of the activity software explicitly states that the CD-ROM exercises are “based on the Story Rhyme Photocopy masters”. This level of comparability is reflected in both the visual representation presented to the child on screen and also in respect of the nature of the tasks set.

The intervention programmes were delivered over a six-week period within a single term in each of the schools, involving two days per week in each school. Within this timetable it was necessary to rotate the order in which the researcher-tutor worked with each of the groups in order to minimise any confounding effects of time of presentation on performance.

All of the sessions were taught away from the main body of the classroom to minimise diffusion effects (see Plewis and Hurry (1998), and Craven, Marsh, Debus and Jayasinghe (2001) for a discussion of diffusion effects), and the venue was the same for each group within any one school. The overall timing of the sessions was determined by the pre-programmed duration of the CD-ROM exercises per book, and each group was consequently allowed an average of 20 minutes per practice session.

All of the computer sessions were conducted using the researcher’s own laptop, a Dell Inspiron 4000. This ensured that any technological idiosyncrasies were standardised across all of the classrooms in the study and had the additional advantage of not putting schools in the position of having to allocate specific resources to the project. The laptop computer incorporated a mouse control mechanism within the keyboard. However, it was felt that children would be more familiar with using a hand-held computer mouse and so such a device was used for all of the computer sessions. Children were not required to be able to use the keyboard in order to follow the computer exercises.

Each session began with the researcher reading the relevant Story Rhyme book, before the children undertook the practice exercises in the format determined by their group allocation. The reading of the Story Rhyme book was conducted in exactly the same fashion for both the computer-based and paper-based groups. Thus the only difference in content experienced was the form of the practice exercises. Once the Story Rhyme book had been read, the children were asked to work on the relevant practice exercises on either the computer or on worksheets depending on the intervention group to which they had been allocated. A series of delivery “scripts” was designed to ensure that task instructions and child-initiated comments were dealt with in a standardised format across the different intervention groups. During the first term an independent researcher was asked to corroborate that the principal researcher did not unduly favour any one of the participating groups, and a system of taping sessions was undertaken to ensure continuing equity of approach.

Children undertook the practice sessions in pairs as this had been shown to work well during the pilot study and was also a common modus operandi within all of the participating classrooms. In order to avoid issues relating to specific personality dynamics the working dyads were rotated within each group for each session. The researcher-tutor was present with the children whilst they followed the relevant exercises, but did not provide any overt teaching. Again this was a common approach to skills practice within the participating classrooms.
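The paper does not say how the changing pairs were drawn up. One possible way of rotating dyads so that partners change from session to session is the round-robin “circle” method, sketched below; the function name `rotating_pairs` and the example names are hypothetical.

```python
def rotating_pairs(children, sessions):
    """Return one list of pairs per session using the round-robin circle method.

    Illustrative sketch only. With an odd number of children a None
    placeholder is added, so one child per session is paired with None
    and would need to join another pair or work alone.
    """
    roster = list(children)
    if len(roster) % 2:
        roster.append(None)
    n = len(roster)
    schedule = []
    for _ in range(sessions):
        # Pair the first with the last, the second with the second-last, etc.
        pairs = [(roster[i], roster[n - 1 - i]) for i in range(n // 2)]
        schedule.append(pairs)
        # Keep the first child fixed and rotate everyone else by one place.
        roster = [roster[0], roster[-1]] + roster[1:-1]
    return schedule


# Six made-up children over five sessions.
for session, pairs in enumerate(rotating_pairs(list("ABCDEF"), 5), start=1):
    print(session, pairs)
```

For a group of six, the first five sessions produce every possible pairing exactly once; a sixth session would repeat the pairings of the first.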

The control group followed a series of practical mathematical games. Whilst not comprising a unitary programme, the games selected were loosely grouped around the mathematical concepts of number: conservation, matching and sequencing, and simple number operations. There was no use of ICT involved and no requirement on the children to read or write anything. Delivery of the control programme was designed to mirror many of the aspects of the delivery of the main intervention programmes, so that children in the control group differed only by virtue of not having received the explicit additional phonological training.

As with the two main intervention groups, children undertook the control activities in changing dyads and in the same location, separated from the main body of the classroom to minimise diffusion. The duration of each session was in line with the duration of the main intervention sessions, i.e. c. 20 minutes each. In keeping with the delivery strategy of the main interventions the researcher-tutor provided initial instructions for each activity, but remained predominantly in a passive role for the duration of the activity itself.

The degree to which the class teacher might inadvertently incorporate aspects of any of the interventions into wider classroom activities was minimised by ensuring that the researcher was solely responsible for delivering the programmes.

Child Assessments

 In order to assess the children’s progress in their phonological awareness, two tests of phonological awareness skills were used at both pre and post test: the Phonological Assessment Battery (PhAB) (Frederickson, Frith and Reason, 1997) and the Marie Clay Dictation Test (Clay, 1979).

The Phonological Assessment Battery was selected because it encompassed a range of sub-tests designed to provide assessments for a range of different facets of phonological awareness. Thus it was selected in preference to other more established tests such as the Bryant and Bradley (1985) test of phonological awareness, which focuses on a smaller number of component skills. Although the PhAB test battery was relatively new and therefore little used to date in research, it had been subjected to rigorous trial processes to ensure both reliability and construct validity as well as establishing performance norms.

The PhAB test battery consists of the following six component tests or sub-scales: the alliteration test; the naming speed test; the rhyme test; the spoonerisms test; the fluency test; the non-word reading test. Five of the six sub-scales were used. The naming speed test was omitted in view of the young age of the children concerned, for whom it was considered that phonological recognition was in itself a potentially new and developing skill, with speed of processing likely to develop subsequently.

The PhAB tests were administered individually and in practice the administration time averaged 20-25 minutes per child. As a reflection of the young age of the participant children, the tests were never referred to as tests, but were introduced as “games”. Apart from these minor adaptations, the researcher administered the tests in accordance with the scripted instructions printed in the test manual.

It was decided that in addition to considering the overall score obtained by each child at the data analysis stage, it would be useful to consider the children’s performance under a range of related component skills. Thus the scores from the alliteration test and the alliterative fluency section of the fluency test were combined to produce an indicator of each child’s alliterative abilities. Similarly the score from the rhyme test was combined with the rhyming fluency section of the fluency test to produce an indicator of rhyming ability. The fluency elements of the PhAB battery were also isolated to provide an indication of the extent to which children were able to apply their phonological awareness in a generative sense, since it could be argued that this is far more educationally worthwhile than simple recognition. The PhAB test battery was of course used at both pre-test and post-test and therefore would be indicative of children’s progress in respect of the underlying skills. There was little direct evidence that individual children remembered elements of the test at post-test, and in any case they had not been made aware of the “correct” response at pre-test. Nevertheless, there are statistical implications of such a pattern of testing and these were accounted for in the statistical procedures adopted at the analysis stage.
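The composite indicators described above might be derived from the sub-test raw scores along the following lines. This is a sketch under stated assumptions: the paper says only that scores were “combined”, so simple addition is assumed here, and the dictionary keys are hypothetical labels rather than names taken from the test manual.

```python
def phab_composites(scores):
    """Build composite indicators from PhAB sub-test raw scores.

    Assumes (this is not confirmed by the paper) that combining means
    adding raw scores. Keys are hypothetical labels.
    """
    return {
        # alliteration test + alliterative section of the fluency test
        "alliteration_ability": scores["alliteration"] + scores["fluency_alliteration"],
        # rhyme test + rhyming section of the fluency test
        "rhyme_ability": scores["rhyme"] + scores["fluency_rhyme"],
        # the generative (fluency) elements taken on their own
        "fluency": scores["fluency_alliteration"] + scores["fluency_rhyme"],
    }


# Made-up raw scores for one child.
child = {
    "alliteration": 7,
    "rhyme": 9,
    "spoonerisms": 3,
    "fluency_alliteration": 4,
    "fluency_rhyme": 5,
    "non_word_reading": 6,
}
print(phab_composites(child))
```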

In order to check for any inadvertent bias on the part of the researcher, the services of another independent researcher were procured for the final term of the intervention and this researcher undertook the assessments for 23% of the PhAB tests in this term. Comparing scoring for the PhAB tests, the PhAB Total mean score obtained by the principal researcher was 44.56; for the independent researcher it was 40.00. ANOVA detected no significant difference between the two researchers’ scores (F(1,126) = 0.473, p > .05). Analysed per researcher, per intervention group, the pattern of no significant differences between researchers’ scores was sustained (computer group: F(1,126) = 1.288, p > .05; paper group: F(1,126) = 0.125, p > .05; control group: F(1,126) = 0.380, p > .05).

The second phonological test was the Marie Clay Dictation Test, which was selected as an indicator of the children’s ability to use their phonological awareness skills. The Marie Clay Dictation test explicitly tests a child’s ability to apply phonological awareness to an orthographic task. However the amount of writing required by the test is limited to two sentences and would therefore be likely to be manageable for most of the children within the study who were only in their first full year of formal schooling.

The Marie Clay Dictation test is a very well established test, forming a component part of the Marie Clay Diagnostic Survey, and has been used in a great many research projects. The dictation test, as well as being relatively short and therefore quick and easy to administer, is also highly structured and specific. There are five alternative stories and it is advisable that alternative versions are presented if a child is re-tested, as in the present study when the children were given the dictation test at both pre- and post-test. Form A of the test was therefore used at pre-test and Form B at post-test.

Given that the dictation test is a written test and is relatively short and structured, it was possible to secure the services of an independent researcher who blind-scored all of the post-tests. When these scores were compared to those obtained by the principal researcher for the same test scripts, there was found to be 99% agreement (r = .99, p < .01 for Marie Clay Dictation).
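A comparison of two scorers' marks for the same scripts can be sketched as below. The sketch computes a Pearson correlation and, separately, the proportion of scripts given identical marks, since these are distinct quantities. The function names and the score lists are illustrative assumptions, not the study's data or procedure.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def exact_agreement(x, y):
    """Proportion of scripts on which both scorers gave the same mark."""
    if len(x) != len(y) or not x:
        raise ValueError("need two non-empty lists of equal length")
    return sum(a == b for a, b in zip(x, y)) / len(x)


# Hypothetical marks for five scripts (not data from the study).
principal = [30, 25, 37, 18, 33]
independent = [30, 25, 36, 18, 33]

r = pearson_r(principal, independent)
agreement = exact_agreement(principal, independent)
```

In this made-up example the two scorers differ on one script out of five, yet the correlation remains very close to 1, which is why the two figures are worth reporting separately.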

Methodology of Data Analysis

Descriptive statistics were obtained to provide demographic features, as well as mean scores, standard deviations, ranges and variances for the two intervention groups and the control group across all the learning outcomes at both pre- and post-test. This information enabled the relative rates of educational progress from pre- to post-test to be compared between groups.

The test scores utilised were predominantly raw scores. This reflected the fact that the standardisation procedures for the selected tests related to children aged from 6 years. Since many of the children in the study were only 5 years old there were no standardised figures available to convert the raw scores for these children. For the purposes of consistency therefore raw scores were used for all children. The exception to this was the BPVSII scores, used at pre-test only as a corroboration of ability stratification, for which standardised scores from age 3 years were available.

Having obtained descriptive statistics as indicated above, the composition of each group within the sample was analysed in order to check that the random allocation of individuals to each group had not inadvertently resulted in systematically biased groups. The scores obtained on the educational measures at pre-test were also compared, using ANOVA, to check whether there were any statistically significant differences between the groups before the intervention programmes were implemented. Having ascertained the comparative profiles of the three groups at pre-test in respect of gender, ability, precise age and performance on educational pre-test measures, the next stage was to consider the post-test educational performance of the three groups using the pre-test findings as a comparative baseline wherever statistically possible.

The scores obtained for each educational outcome at post-test were carefully checked to establish whether the necessary assumptions for ANCOVA were fulfilled. Where the assumptions for ANCOVA (Field, 2000) were met, ANCOVAs were carried out in order to establish the relationship, if any, between membership of the intervention and/or control groups and post-test score. “Group” was used as the between-subjects factor and gender, ability as measured by pre-test BPVS score, and age were entered as covariates likely to impinge on educational performance. Where the tests used at pre- and post-test were alternative versions of the same test it was possible to enter the pre-test scores as a covariate using the univariate ANCOVA procedure. Where the tests used at pre- and post-test were identical this fact violates the assumption of independence necessary for ANCOVA (Firth, 2002a) and so a repeated measures ANCOVA procedure was required instead. Further exploration of any detected interactions was performed using one-way ANOVA on the dependent variable scores after these had been adjusted in order to control for the possible confounding effects of covariates identified as significant in ANCOVA, particularly pre-test performance. The procedure used followed the model adopted in Tsitridou-Evangelou (2001), and can also be found in the first ImpacT study into ICT use in schools (Johnson, Cox and Watson, 1994).
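One common way of producing covariate-adjusted scores and then comparing groups with a one-way ANOVA is sketched below. It is an assumption that the adjustment takes this form: post-test scores are adjusted for pre-test performance using the pooled within-group regression slope, which is the textbook ANCOVA adjustment and not necessarily the exact procedure followed in the study. All function names and the data layout (a dictionary mapping group name to a list of scores) are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def pooled_within_slope(pre, post):
    """Pooled within-group slope of post-test on pre-test scores."""
    num = den = 0.0
    for g in pre:
        mp, mq = mean(pre[g]), mean(post[g])
        num += sum((p - mp) * (q - mq) for p, q in zip(pre[g], post[g]))
        den += sum((p - mp) ** 2 for p in pre[g])
    return num / den


def adjust_scores(pre, post):
    """Adjust each post-test score to the grand mean of the pre-test."""
    b = pooled_within_slope(pre, post)
    grand_pre = mean([p for g in pre for p in pre[g]])
    return {g: [q - b * (p - grand_pre) for p, q in zip(pre[g], post[g])]
            for g in pre}


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a dict of score lists."""
    samples = list(groups.values())
    pooled = [v for s in samples for v in s]
    grand = mean(pooled)
    ss_between = sum(len(s) * (mean(s) - grand) ** 2 for s in samples)
    ss_within = sum((v - mean(s)) ** 2 for s in samples for v in s)
    df_b, df_w = len(samples) - 1, len(pooled) - len(samples)
    return (ss_between / df_b) / (ss_within / df_w), df_b, df_w
```

Note that a plain ANOVA on adjusted scores does not deduct a degree of freedom for the covariate, so a full ANCOVA routine in a statistics package may report slightly different F values.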

For educational outcomes which violated the assumptions of ANCOVA, and where logarithmic transformation of the data failed to mitigate this, non-parametric tests were carried out instead. Given the fact that the study involved three groups the appropriate non-parametric test was the Kruskal-Wallis test, which is the distribution-free equivalent of one-way ANOVA.
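For readers unfamiliar with the Kruskal-Wallis test, the H statistic can be sketched as follows. The sketch ranks all scores together (tied scores share the average rank), sums the ranks per group, and applies the usual tie correction. It is a plain standard-library illustration with hypothetical function names, not the statistics package routine used in the study, and it does not compute the p-value, which is normally obtained by referring H to a chi-square distribution with (number of groups minus 1) degrees of freedom.

```python
def average_ranks(values):
    """Return (ranks, tie_sizes); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def kruskal_wallis_h(*groups):
    """Tie-corrected Kruskal-Wallis H for two or more groups of scores."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks, ties = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    correction = 1 - sum(t ** 3 - t for t in ties) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all scores are identical; H is undefined")
    return h / correction
```

Because the test works on ranks, it compares the groups' rank distributions rather than their raw means.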

Finally, Ordinary Least Squares regressions were carried out for those outcomes at post-test which fulfilled the necessary assumptions for linear regression (Field, 2000). Multiple regression analysis was carried out using group membership as the predictor and pre-test performance, gender, ability as measured by BPVS scores at pre-test, and age as covariate factors. The covariates were chosen as being factors likely to impinge on educational performance for which data was available. Regression models were run with and without outlier cases.
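A regression of this general shape can be sketched as below. The sketch assumes that group membership is represented by a 0/1 dummy variable for the computer group and, for brevity, includes only one covariate (pre-test score); the coding scheme, the function names and the toy data are all illustrative assumptions. It solves the normal equations directly with the standard library rather than calling a statistics package.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ols_fit(rows, y):
    """Return [intercept, b1, b2, ...] minimising squared error."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Toy data: (computer-group dummy, pre-test score) per child.
rows = [(0, 1), (0, 2), (1, 1), (1, 2), (0, 3), (1, 3)]
# Post-test scores built as 1 + 2*dummy + 3*pre, so the fit is exact.
post = [1 + 2 * d + 3 * p for d, p in rows]
intercept, b_group, b_pre = ols_fit(rows, post)
```

Here `b_group` plays the role of the unstandardised B coefficient for group membership: the expected difference in post-test score for computer-group children once pre-test score is held constant.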

Having considered the results obtained for the post-test educational outcome measures by group, further analysis by gender was undertaken for those educational outcomes for which significant regression models in respect of the predictive power of group had been demonstrated.

Ethical Considerations

At every point within the research study due consideration and attention to ethical matters was maintained. The basis for this was adherence to the ethical guidelines produced by the BPS (Robson, 1993) and BERA (1992). Particular consideration was given to ensuring confidentiality and preserving anonymity, gaining and maintaining “informed” consent and to ensuring that all of the intervention/control programmes were inherently of educational value.

Results

Pre-test Scores

Across the range of educational outcomes at pre-test there were no statistically significant differences in the performance levels of the three groups. Nevertheless as the pre-test summary table (Table 1) demonstrates, the computer group scored lower at pre-test than the other two groups on all but the PhAB Combined Rhyming measure.

Table 1: Summary of Mean Scores Attained at Pre-test

Comparison of Mean Scores at Pre and Post-test

For each of the outcomes, tables comparing the scores attained at pre- and post-test are presented below. Graphs depicting the means and 95% confidence intervals at pre- and post-test are additionally presented in Appendix A in order to highlight the respective advantage accruing to the various groups between the two measurement time-points.

Table 2: Comparison of Scores Attained on PhAB (Total) Outcome at Pre-test and Post-test

Having had the lowest mean score at pre-test the computer group now has the highest mean score. In comparison with the pre-test scores therefore, the overall progress made, as measured by the increase in mean score, is greatest for the computer group.

Table 3: Comparison of Scores Attained on PhAB Combined Alliteration at Pre-test and Post-test

Once again, having had the lowest mean score at pre-test the computer group now has the highest mean score. In comparison with the pre-test scores therefore, the overall progress made, as measured by the increase in mean score, is greatest for the computer group.

Table 4: Comparison of Scores Attained on PhAB Combined Rhyming at Pre-test and Post-test

As at pre-test, the control group had the lowest mean score but by post-test the computer group has overtaken the paper group to achieve the highest mean score. In comparison with the pre-test scores therefore, the overall progress made, as measured by the increase in mean score, is greatest for the computer group.

Table 5: Comparison of Scores Attained on PhAB Combined Fluency at Pre-test and Post-test

Having had the lowest mean score at pre-test the computer group again has the highest mean score. In comparison with the pre-test scores therefore, the overall progress made, as measured by the increase in mean score, is greatest for the computer group.  

Table 6: Comparison of Scores Attained on Marie Clay Dictation at Pre-test and Post-test

Although the mean scores achieved at post-test differed by less than 2 marks between groups, it can be seen that compared to the pre-test scores the computer group had made the greatest positive change in their mean score by post-test. 

Results of ANCOVA/ANOVA Procedures

Where statistically appropriate ANCOVA/ANOVA was carried out and the results per outcome are noted below. For each outcome a box-plot was generated for the adjusted scores to highlight the comparative performance of each of the groups. These are included in Appendix B.

PhAB Total Outcome

A repeated measures ANCOVA was conducted on the post-test scores for the PhAB (Total) Outcome with group as the between-subjects factor and gender, age and underlying ability (as measured by the BPVS at pre-test) as covariates. ANCOVA indicated no main between-subjects effect of group, but did show a significant within-subjects interaction of group and test score (F(2,126) = 13.75, p < .01). Further exploration of this interaction through one-way ANOVA on the scores adjusted for pre-test performance was undertaken and the resulting ANOVA revealed a significant between-groups effect (F(2,126) = 13.336, p < .01).

PhAB Combined Alliteration Outcome 

A repeated measures ANCOVA, with the between-subjects factor as group and the covariates as gender, age and underlying ability (as measured by the BPVS at pre-test), indicated no main between-subjects effect of group, but did show a significant within-subjects interaction of group and test score (F(2,126) = 5.94, p < .01). Further exploration of this interaction through one-way ANOVA was undertaken on the scores adjusted for pre-test performance and the resulting ANOVA for the PhAB Combined Alliteration Adjusted revealed a significant between-groups effect (F(2,126) = 3.531, p < .05).

PhAB Combined Fluency Outcome

A repeated measures ANCOVA, with the between-subjects factor as group and the covariates as gender, age and underlying ability (as measured by the BPVS at pre-test), indicated no main between-subjects effect of group, but did show a significant within-subjects interaction of group and test score (F(2,126) = 3.64, p < .05). Further exploration of this interaction through one-way ANOVA was undertaken on scores adjusted for pre-test performance using the procedure already outlined above. The resulting ANOVA for the PhAB Combined Fluency Adjusted revealed a significant between-groups effect (F(2,126) = 3.179, p < .05).

Non-Parametric Comparison of Means

PhAB Combined Rhyming Outcome

At the time of post-test the distribution of scores for the PhAB Combined Rhyming outcome was insufficiently “normal” to justify use of ANCOVA and the non-parametric Kruskal-Wallis test was therefore used. This indicated a significant difference between the means of the groups at post-test whereas at pre-test no such difference had been detected. The Kruskal-Wallis test does not however allow attributive claims to be made in respect of effects due to particular groups.

Marie Clay Dictation Test   

At the time of post-test the distribution of scores for the Marie Clay Dictation test violated the assumptions of ANCOVA, even after attempting to transform the data logarithmically, and the non-parametric Kruskal-Wallis test was therefore used to evaluate the effect, if any, of the intervention on the outcome measure. Kruskal-Wallis showed no significant difference between the means of the groups at post-test.

Regression Modelling  

The final regression models reported for each outcome include, as covariates, only those factors found to be significant in univariate analysis.

Table 7: Analytical Regression Model for PhAB (Total) Outcome

Running this regression model produced the following synopsis of model effects, where R² shows how much of the variability in scores is accounted for by all the factors in the model; adjusted R² adjusts this to reflect the sample size of the study and the Beta value indicates the Effect Size, or advantage, of belonging to the computer group expressed in standard deviations. The B statistic reports the mean advantage of belonging to the computer group in relation to the measurement scale employed.
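The model statistics named above follow conventional definitions, which can be sketched as below. The sketch assumes the standard textbook formulas: adjusted R² penalises R² for the number of predictors relative to the sample size, and the standardised Beta rescales the unstandardised B by the ratio of the predictor's standard deviation to the outcome's. The function names and the example figures are hypothetical and are not values from the study.

```python
from statistics import stdev


def r_squared(y, y_hat):
    """Proportion of variance in y accounted for by the fitted values."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p):
    """Adjust R² for sample size n and number of predictors p."""
    if n - p - 1 <= 0:
        raise ValueError("need more cases than predictors plus one")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


def standardised_beta(b, x, y):
    """Convert an unstandardised B into a standardised Beta."""
    return b * stdev(x) / stdev(y)


# Hypothetical figures: R² = 0.5 from 11 cases and 2 predictors.
adj = adjusted_r_squared(0.5, 11, 2)
```

With these made-up figures the adjusted value falls from 0.5 to 0.375, illustrating how small samples with several predictors shrink the estimate.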

Table 8: Regression Model Statistics for PhAB (Total), Outliers Removed

It can be seen therefore, that having controlled for performance at the PhAB pre-test and underlying ability as measured by the BPVS at pre-test, there was a significant advantage in belonging to the computer group. The advantage was equivalent to an effect size of .25 standard deviations. The model reported excludes 4 outlier cases. However, inclusion of the outliers in the model still produced a statistically significant Beta value of .21 (p<.01).

Table 9: Analytical Model for PhAB Combined Alliteration

Running this regression model produced the following synopsis of model effects:

Table 10: Regression Model Statistics for PhAB Combined Alliteration, Outlier Retained

The model included one outlier, but since exclusion of this case made no significant difference to the overall model, the model reported retains the outlier. For this outcome the advantage of belonging to the computer group was equivalent to an effect size of .19 standard deviations.

Table 11: Analytical Model for PhAB Combined Rhyming

 Having ascertained that the requisite assumptions for regression, as distinct from those for ANCOVA, (Field, 2000) were met, multiple regression modelling for the PhAB Combined Rhyming outcome was undertaken using the following model:

Running this regression model produced the following synopsis of model effects: 

Table 12: Regression Model Statistics for PhAB Combined Rhyming, Outlier Retained

There was only 1 outlier for the PhAB Combined Fluency measure and since exclusion of this case made no significant difference to the overall model, the model reported retains the outlier.

This time the advantage in belonging to the computer group was equivalent to an effect size of .14 standard deviations.

Marie Clay Dictation Outcome

 It was not possible to attempt regression modelling for the Marie Clay Dictation outcome as the pattern of scores markedly violated the required statistical assumptions. This remained the case even after logarithmic transformations were applied.

Regression Results by Gender

Statistical support for analysis by gender was derived from a repeated measures ANCOVA on the PhAB (Total) scores, which indicated a significant three-way interaction of PhAB Total scores, group and gender (F(3,126) = 7.160, p < .01). Univariate exploration of this interaction confirmed a significant group by gender effect (F(3,126) = 3.241, p < .02) for PhAB scores and a significant group by gender effect (F(3,126) = 7.250, p < .01) for PhAB Total scores adjusted for covariates.

Repeated measures ANCOVA on PhAB Combined Alliteration scores indicated a significant three-way interaction of PhAB Combined Alliteration scores, group and gender (F(3,126) = 3.358, p < .05). Univariate exploration of this interaction confirmed a significant group by gender effect (F(3,126) = 0.024, p < .05) for PhAB Combined Alliteration, although when PhAB Combined Alliteration scores were adjusted for covariates statistical significance was narrowly missed (F(3,126) = 2.429, p = .069).

ANCOVA techniques were not possible for the PhAB Combined Rhyming outcome owing to the non-normal distribution. However application of the non-parametric Kruskal-Wallis test to the PhAB Combined Rhyming scores, split by gender, revealed a non-significant difference between groups for boys but a significant difference between groups for girls.

Repeated measures ANCOVA for PhAB Combined Fluency showed no significant interaction of gender and group for the PhAB Combined Fluency scores (F(3,126) = .537, p > .05).

Separate regression models for boys and girls will be reported for each of the two Phonological Awareness (PhAB) scales for which statistical analysis above suggested a potentially differential effect of gender i.e. PhAB (Total), and PhAB Combined Rhyming. Regression models by gender were not undertaken for the PhAB Combined Fluency scores where ANCOVA demonstrated no significant interaction of scores with gender and group, nor for the PhAB Combined Alliteration scores where the interaction indicated by ANCOVA was not sustained when the Adjusted scores were used.

Regression of Total PhAB Outcome by Gender

The final analytical models for the regression by gender of the PhAB (Total) outcome measure are presented below.

Table 14: Analytical Model for PhAB (Total) for Boys

Table 15: Analytical Model for PhAB (Total) for Girls

Running these regression models produced the following synopses of model effects:

Table 16: Regression Model Statistics for PhAB (Total), Outliers Retained, for Boys

Table 17: Regression Model Statistics for PhAB (Total), Outliers Retained, for Girls

It can be seen therefore, that having controlled for performance at the PhAB pre-test and underlying ability as measured by the BPVS at pre-test, there was an advantage in belonging to the computer group. The advantage was equivalent to an effect size of .15 standard deviations for the boys but a greater advantage to the girls with an effect size of .25 standard deviations.

Regression of PhAB Combined Rhyming by Gender 

Multiple regression modelling for the PhAB Combined Rhyming outcome measure for boys revealed no predictive power attributable to group membership (Beta = 0.82, p > .05). The final analytical model for the regression by gender for girls of the PhAB Combined Rhyming outcome measure is presented below.

Table 18: Analytical model for PhAB Combined Rhyming for Girls

Running this regression model produced the following synopsis of model effects:

Table 19: Regression Model Statistics for PhAB Combined Rhyming, Outliers Retained, for Girls

It can be seen therefore that there was an advantage in belonging to the computer group, equivalent to an effect size of .34 standard deviations for the girls.

Discussion 

The results of this study show structured use of literacy software in the Year 1 classroom led to greater improvements in the phonological awareness skills of the children who used ICT to support their practice of such skills compared to those who used a comparable programme based on a more traditional learning medium. Furthermore it appeared to be the case that the computer medium elicited differential learning according to gender.

Children in the computer group made significantly more educational progress on the four PhAB outcomes reported on. Specifically the Beta effect sizes of belonging to the computer group were as follows:

Table 20: Effect Sizes Related to Membership of the Computer Group

There was no statistically significant difference in the comparative scores of the children in the paper-based group and the “control” group. This was despite the fact that the paper-based group had followed the same phonological programme as the computer group, though the practice medium was different, whereas the control group had received no phonological awareness input within the intervention activities.

Though not producing a statistically significant difference, the results of the Marie Clay Dictation outcome also suggested an advantage for children in the computer group compared to the other two groups.

Whilst the effect sizes reported are not huge they are consistent with the typically modest effect sizes found in educational research (Coe, 2002). Within Coe’s paper, for example, “targeted interventions” for “at risk” students emerge as the most effective practice with an average effect size of 0.6. The next highest effect sizes are for practice test taking (0.32) and reducing class size from 23 to 15 pupils (0.30). Within this context the effect sizes found in this study are not inconsequential.

Of the four phonological awareness outcomes for which membership of the computer group conferred some educational advantage, two outcomes proved to be amenable to analysis by gender. These Beta effect sizes, for the advantage of belonging to the computer group per gender, are reprised below:

Table 21: Effect Sizes by Gender

Clearly the girls gained far more advantage than the boys from the computer medium for these two outcomes. There was no gender difference for the other two groups. This suggests that the girls’ superior advantage was not due to the girls within the study being either generically more able than the boys, or indeed more phonologically skilled. Similarly, if the phonological programme itself were inherently more advantageous to girls then it would be expected that the performance of the girls in the paper-based practice group would have been better than that of the boys. This was not the case. These results suggest that it was a factor, or factors, intrinsic to the computer application that enabled the girls to benefit more from the computer medium.

An extension to the present study would be required to explore the data systematically and so determine the factors and processes underlying the results obtained. Indeed this was the purpose of collecting additional, more qualitative data during the course of the study. Observational records of children participating in the study, as well as interview data, were collected. It is hoped that future papers will be able to exploit these data in order to illuminate the quantitative evidence presented here.

Initial analysis of these qualitative data sources would suggest that the amount of talk generated was greater for the computer group and that this talk was predominantly task related. There was also an emergent tendency for the computer group children to refer to the task in collaborative language, e.g. “We done it”; “We need a word for this”. This was in contrast to the more egocentric comments regarding the paper-based tasks, e.g. “You colour that and I’ll colour this one”; “I done them all”.

There was some evidence within the observation data reviewed to date, that children working with the computers were motivated by the extrinsic motivators associated with the computer programme. The incidence of children expressing enjoyment through laughter was higher for the computer group and there was evidence of positive comments concerning the features of the computer programme. However the evidence in favour of the computer being wholly motivational was not incontrovertible. There were also incidences of children becoming restless and bored by the computer programme.

Clearly in order to be able to tell a coherent tale about why the computer group made better progress in the study it would be necessary to fully and systematically explore the observational data. This would enable an informed judgment as to the underlying patterns of learning behaviours and responses to the different learning media. At present the observational data that has begun to be explored offers only tantalising but sometimes contradictory clues.

The finding that the girls benefited much more than the boys from the computer-based instruction is perhaps the most surprising finding to emerge from this study. Intriguingly, there were certainly individual examples of boys within the present study, who did appear to be highly motivated by the computer, but for whom this motivation did not translate into better skills progress. Having said that, it was far from clear that there was a straightforward divide in terms of attitudes towards computers between the boys and the girls.

A number of linked projects, referred to in the introduction (Littleton et al, 1992; Joiner, 1998), have led to the suggestion that boys were more likely to be sensitive to such features as the main characters or story theme of software. However the Rhyme and Analogy software included many examples of both that might be expected to appeal to boys: a speedboat in Book 1, a mechanical trap in Book 3, trolls in Book 5. In any case, the boys in the computer group did make more progress than the boys in the other two groups; it was just that the girls made much more progress. The anticipated analysis of the additional qualitative data collected during the project will hopefully allow some insights into the reasons for the detected gender difference.

Though there are always dangers in over-extrapolating from any one research study, the findings would support the use of educational ICT within schools to support the development of children’s phonological awareness in KS1. This conclusion is reinforced by the parallel finding in respect of the relative lack of progress made by the group who made use of paper-based exercises to support their developing phonological awareness. This study may be a beginning in addressing the point made by Higgins and Moseley (2002, ibid, p31), that there has been to date very little evidence to “justify…replace(ing) paper and pencil by computer for daily reading and writing activities in primary schools”. On the other hand, the findings in respect of gender do, at the very least, suggest that ICT usage is not necessarily a panacea for the gender “gap” that is often a feature of our educational system.

The current study already contains within itself the potential for future fruitful development, specifically the investigation of the observational and interview data. In addition it would be constructive to design future similar studies in which specific facets and features of the ICT application were differently presented to groups of participants, in order to determine what were the underlying explanations for the noted results. Future similarly constructed studies would also benefit from a more longitudinal design, in which the benefits of particular ICT approaches could be considered in both the long and the short term.

As the current study shows, there appears to be some gender interaction with the beneficial impact of ICT and it would therefore seem desirable that future research would seek to probe the nature of this interaction further. In order to do so effectively the experimental paradigm could be extended to cover different permutations of types of computer applications, and gender groupings. As in the present study, a useful strategy would be to experimentally manipulate these permutations and compare the differing effects on children’s learning outcomes.  

Supported by a robust methodology, this study provides a credible piece of evidence rather than assertion (Reynolds, 2001, ibid) that the computer is a comparatively beneficial element in the educative process, and in the process suggests that this was particularly so for girls. In so doing it is a small beginning in answering the call for more comparative, outcomes-related research into the educative use of ICT.

References

Adams, M.J. (1990). Beginning to Read. Thinking and Learning About Print. Cambridge, Massachusetts. MIT.

BECTa (2000). Preliminary Report for the DfEE on the Relationship Between ICT and Primary School Standards. http://www.becta.org.uk/news/reports/contents/html 

BECTa (2001). Primary Schools of the Future- Achieving Today. Coventry BECTa.

BERA (1992). Ethical Guidelines for Educational Research. Adopted at AGM of the British Educational Research Association on 28th August 1992.

Blair, T. (1997). Foreword. Connecting the Learning Society. Consultation Paper. London. DfEE.

Blunkett, D. (2001). Foreword. Curriculum Online-A Consultation Paper. London. DfEE.

Boruch, R.F. (1997). Randomized Experiments for Planning and Evaluation. A Practical Guide. London. Sage.

Boruch, R., Snyder, B. and DeMoya, D. (1999). The Importance of Randomised Field Trials. Paper presented at Evidence-based Policies and Indicator Systems Conference. Durham University.

Brosnan, M.J. (1998a). The Implications for Academic Attainment of Perceived Gender-appropriateness Upon Spatial Task Performance. British Journal of Educational Psychology. 68, 203-215.

Bryant, P. and Bradley, L. (1985). Children’s Reading Problems. Oxford. Blackwell.

Bus, A. G. and van IJzendoorn, M.H. (1999). Phonological Awareness and Early Reading: A Meta-analysis of Experimental Training Studies. Journal of Educational Psychology. 91 (3), 403-414.

Chall, J. S. (1967). Learning to Read: The Great Debate. New York. McGraw-Hill.

Clay, M. (1979). The Early Detection of Reading Difficulties. (3rd Ed.). Auckland. New Zealand.

Coe, R. (2002). It’s the Effect Size, Stupid. What Effect Size Is and Why It Is Important. Paper presented at BERA Annual Conference, Exeter, 12-14 September 2002.

Craven, R.G., Marsh, H.W., Debus, R.L. and Jayasinghe, U. (2001). Diffusion Effects: Control Group Contamination Threats to the Validity of Teacher-Administered Interventions. Journal of Educational Psychology. 93 (3), 639-645.

Davidson, J., Coles, D. Noyes, P. and Terrell, C. (1991). Using Computer-delivered Natural Speech to Assist in the Teaching of Reading. British Journal of Educational Technology. 22 (2), 110-128.

Davidson, J., Elcock, J. and Noyes, P. (1996). A Preliminary Study of the Effect of Computer-assisted Practice on Reading Attainment. Journal of Research in Reading. 19 (2), 102-110.

DES Standards Site 2003 http://www-standards.des.govuk

DfEE (1998). The National Literacy Strategy. London. DfEE Publications.

DfES (2003a). Fulfilling the Potential. Transforming Teaching and Learning Through ICT in Schools. DfES. Crown Copyright.

Dunn, L., Dunn, L., Whetton, C. and Burley, J. (1997). British Picture Vocabulary Scale. (2nd Ed.). Berkshire. NFER.

Edwards, T. (2000). “All the Evidence Shows…: Reasonable Expectations of Educational Research" Oxford Review of Education. 26 (3/4), 299-311.

Ehri, L.C. (2003). Systematic Phonics Instruction: Findings of the National Reading Panel. Paper presented to conference reviewing the National Literacy Strategy. February 2003. London

Ehri, L.C., Nunes, S.R., Willows, D.M., Schuster, B.V., Yaghoub-Zadah, Z. and Shanahan, T. (2001). Phonemic Awareness Instruction Helps Children Learn to Read: Evidence From the National Reading Panel’s Meta-analysis. Reading Research Quarterly. 36 (30), 250-

Field, A. (2000). Discovering Statistics Using SPSS for Windows. London. Sage Publications Ltd.

Firth, D. (2002a). SPSS #8. Analysis of Covariance and the General Linear Model. One of a series of lecture handouts accompanying a programme of lectures at the University of Oxford, Michaelmas Term 2002.

Fitzpatrick, H. and Hardman, M. (2000). Mediated Activity in the Primary Classroom: Girls, Boys and Computers. Learning and Instruction. 10, 431-446.

Frederickson, N., Frith, U. and Reason, R. (1997). Phonological Assessment Battery. Manual and Test Materials. Berkshire. NFER.

Furlong, J., Furlong, R., Facer, K. and Sutherland, R. (2000). The National Grid for Learning: A Curriculum Without Walls? Cambridge Journal of Education. 30 (1), 91-110.

Goswami, U. and Kirtley, C. (1996). Rhyme and Analogy. Teacher’s Guide. Oxford. Oxford University Press.

Hammersley, M. (1997). Educational Research and Teaching: a response to David Hargreaves’ TTA Lecture. British Educational Research Journal. 23, (2), 141-161.

Hannon, P. (2000). Reflecting on Literacy in Education. London and New York. Routledge Falmer.

Harrison, C., Comber, C., Fisher, T., Haw, K., Lewin, C., Lunzer, E., McFarlane, A., Mavers, D., Scrimshaw, P., Somekh, B. and Watling, R. (2002). ImpaCT2. The Impact of Information and Communication Technologies on Pupil Learning and Attainment. Full Report. Downloaded from; http://www.becta.org.uk/research/inmpact2/index.cfm

Higgins, S. (2003). Does ICT Improve Learning and Teaching in Schools? A Professional User Review for BERA. Notts. BERA.

Higgins, S. and Moseley, D. (2002). Raising Achievement in Literacy Through ICT in Monteith, M. (ed.). Teaching Primary Literacy with ICT. Buckingham. Open University Press.

Johnson, D.C., Cox, M.J. and Watson, D.M. (1994). Evaluating the Impact of IT on Pupils’ Achievements. Journal of Computer Assisted Learning. 10, 138-156.

Joiner, R.W. (1998). The Effect of Gender on Children’s Software Preferences. Journal of Computer Assisted Learning. 14, 195-198.

Lankshear, C. and Knobel, M. (2003). New Technologies in Early Childhood Literacy Research: A Review of Research. Journal of Early Childhood Literacy. 3 (1), 59-82.

Littleton, K., Light, P., Joiner, R., Messer, D. and Barnes, P. (1992). Pairing and Gender Effects on Children’s Computer-based Learning. European Journal of Psychology of Education. 7, 309-322.

Millard, E. (1996). Some Thoughts on Why Boys Don’t Choose In School. Literacy Today. 8, 15-16.

Millard, E. (1997). Differently Literate. London. Routledge Falmer.

Mioduser, D., Tur-Kaspa, H. and Leitner, I. (2000). The Learning Value of Computer-based Instruction of Early Reading Skills. Journal of Computer Assisted Learning. 16, 54-63.

Moseley, D., Higgins, S., Bramald, R., Hardman, F., Miller, J., Mroz, M., Tse, H., Newton, D., Thompson, I., Williamson, J., Halligan, J., Bramald, S., Newton, L., Tymms, P., Henderson, B. and Stout, J. (1999). Ways Forward With ICT: Effective Pedagogy Using Information and Communications Technology for Literacy and Numeracy in Primary Schools. Newcastle. University of Newcastle.

Murphy, C. and Beggs, J. (2003). Colloquium. Primary Pupils’ and Teachers’ Use of Computers at Home and School. British Journal of Educational Technology. 34 (1), 79-83.

NCET (1996). Integrated Learning Systems. A Summary of Phase II of the Pilot Evaluation of ILS in the UK. Coventry. NCET.

Nicolson, R., Fawcett, A. and Nicolson, M. (2000). Evaluation of a Computer-based Reading Intervention in Infant and Junior Schools. Journal of Research in Reading. 23 (2), 194-209.

NLS (2003). Teaching Phonics in the National Literacy Strategy. Paper presented at National Literacy Strategy Day. London, Feb. 2003.

Noble, C. and Bradford, W. (2000). Getting it Right for Boys ... and Girls. London. Routledge.

OCC (2000). Oxfordshire Schools. Admissions and Transfers 2001/2002. Information for Parents. Oxford. Information Press.

Passig, D. and Levin, H. (2000). Gender Preferences for Multimedia Interfaces. Journal of Computer Assisted Learning. 16, 64-71.

Pirrie, A. (2001). Evidence-based Practice in Education: The Best Medicine? British Journal of Educational Studies. 49 (2), 124-136.

Plewis, I. and Hurry, J. (1998). A Multilevel Perspective on the Design and Analysis of Intervention Studies. Educational Research and Evaluation. 4 (1), 13-26.

Pring, R. (2000). The “False Dualism” of Educational Research. Journal of Philosophy of Education. 34, 247-260.

Reynolds, D. (2001b). Keynote Presentation-ICT in Education: The Future Research and Policy Agenda. Building an ICT Research Network Conference. London. June 2001.

Robson, C. (1993). Real World Research. A Resource for Social Scientists and Practitioner-Researchers. Oxford. Blackwell.

Selwyn, N. (1997). Colloquium. The Continuing Weaknesses of Educational Computing Research. British Journal of Educational Technology. 28 (4), 305-307.

Smeyers, P. (2001). Qualitative Versus Quantitative Research Design: A Plea for Paradigmatic Tolerance in Educational Research. Journal of Philosophy of Education. 35 (3), 477-495.

Snow, C. E., Burns, M.S. and Griffin, P. (1998). Preventing Reading Difficulties in Young Children. Washington. National Academy Press.

Sylva, K. (2000). Editorial. Oxford Review of Education. 26 (3/4), 293-297.

Tsitridou-Evangelou, M. (2001). Evaluation of the Effects of a Pre-school Intervention on Literacy Development in Children. DPhil Thesis. University of Oxford.

Underwood, G. (1994). Collaboration and Problem Solving: Gender Differences and the Quality of Discussion in Underwood, J. (ed.). Computer Based Learning: Potential into Practice. London. David Fulton Publishers.

Underwood, J. (1994). Introduction: Where Are We Now and Where Are We Going? in Underwood, J. (ed.). Computer Based Learning: Potential into Practice. London. David Fulton Publishers.

Underwood, J. (2000). A Comparison of Two Types of Computer Support for Reading Development. Journal of Research in Reading. 23 (2), 136-148.

Underwood, J. (2002). Computer Support for Reading Development in Monteith, M. (ed.). Teaching Primary Literacy with ICT. Buckingham. Open University Press.

Underwood, G., McCaffrey, M. and Underwood, J. (1990). Gender Differences in a Co-operative Computer-based Language Task. Educational Research. 32 (1), 16-21.

Underwood, G. and Underwood, J. (1998). Children’s Interactions and Learning Outcomes with Interactive Talking Books. Computers and Education. 30, 95-102.

Underwood, J., Underwood, G. and Wood, D. (2000). When Does Gender Matter? Interactions During Computer-based Problem Solving. Learning and Instruction. 10, 447-462.

Van Daal, V. and Reitsma, P. (2000). Computer-assisted Learning to Read and Spell: Results From Two Pilot Studies. Journal of Research in Reading. 23 (2), 181-193.

Watson, D., Cox, M. and Johnson, D. (1993). The ImpaCT Summary. An Evaluation of the Impact of Information Technology on Children’s Achievements in Primary and Secondary Schools. London. DfEE/King’s College.

Yelland, N. (1995). Colloquium. Young Children’s Attitudes to Computers and Computing. British Journal of Educational Technology. 26 (2), 149-151.

Appendix A

Graphs depicting the means and 95% confidence intervals at pre- and post-test

Figure 1: PhAB (Total) Mean Scores at Pre-test and Post-test

Figure 2: PhAB Combined Alliteration Mean Scores at Pre-test and Post-test

Figure 3: PhAB Combined Rhyming Mean Scores at Pre-test and Post-test

Figure 4: PhAB Combined Fluency Mean Scores at Pre-test and Post-test

Figure 5: Marie Clay Dictation Mean Scores at Pre-test and Post-test

Appendix B

Figure 6: Box-plot of PhAB (Total) Scores at Post-test, Adjusted for Covariates

The box-plot generated for this outcome supports the interpretation that it is the improving performance of the computer group between the two measurement time points that is the cause of the interaction detected in the original ANCOVA.

Figure 7: Box-plot of PhAB Combined Alliteration Scores at Post-test, Adjusted for Covariates

Again, the box-plot generated for this outcome supports the interpretation that it is the improving performance of the computer group between the two measurement time points that is the cause of the interaction detected in the original ANCOVA.

Figure 8: Box-plot of PhAB Combined Fluency Scores at Post-test, Adjusted for Covariates

Once again, the box-plot generated for this outcome supports the interpretation that it is the improving performance of the computer group between the two measurement time points that is the cause of the interaction detected in the original ANCOVA.
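For readers unfamiliar with covariate adjustment, the following is a minimal sketch of one common way post-test scores can be adjusted for a single pre-test covariate before plotting. It is not the procedure used to produce Figures 6 to 8, which is not specified in this appendix. It assumes one covariate and a pooled within-group regression slope; the function names and the example data are hypothetical and chosen only for illustration.

```python
from statistics import mean


def pooled_within_slope(groups):
    """Pooled within-group slope of post-test on pre-test.

    `groups` maps a group name to a list of (pre, post) score pairs.
    Assumption: a single covariate and a common slope across groups.
    """
    sxy = 0.0
    sxx = 0.0
    for pairs in groups.values():
        pre_mean = mean(pre for pre, _ in pairs)
        post_mean = mean(post for _, post in pairs)
        sxy += sum((pre - pre_mean) * (post - post_mean) for pre, post in pairs)
        sxx += sum((pre - pre_mean) ** 2 for pre, _ in pairs)
    return sxy / sxx


def adjusted_post_scores(groups):
    """Shift each post-test score to what it would be at the grand pre-test mean."""
    slope = pooled_within_slope(groups)
    grand_pre = mean(pre for pairs in groups.values() for pre, _ in pairs)
    return {
        name: [post - slope * (pre - grand_pre) for pre, post in pairs]
        for name, pairs in groups.items()
    }


if __name__ == "__main__":
    # Hypothetical scores, not data from the study.
    scores = {
        "computer": [(10, 18), (12, 21), (14, 22)],
        "paper": [(11, 14), (13, 17), (15, 18)],
        "control": [(9, 12), (12, 14), (15, 18)],
    }
    for name, values in adjusted_post_scores(scores).items():
        print(name, round(mean(values), 2))
```

Adjusted scores of this kind remove the part of each child's post-test score that is predictable from the pre-test, so that remaining differences between groups are easier to see in a box-plot.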

This document was added to the Education-line database on 28 February 2005