Schooling Resources, Educational Institutions, and Student Performance: The International Evidence
Ludger Woessmann
Kiel Institute of World Economics, 24100 Kiel, Germany
Phone: (+49) 431 8814-497
E-mail: woessmann@ifw.uni-kiel.de
Paper presented at the Royal Economic Society Conference, 2001
The formation of human capital is essential for the economic success both of individuals and of society at large in a modern economy. The human capital stock comprises cognitive and non-cognitive skills and is mainly produced in families, schools, universities, and firms. This paper focuses on students' cognitive skills in mathematics and science, which are mainly formed in schools. Since "[e]arly learning begets later learning" (Heckman 1999, p. 2), basic knowledge formed early in school has a substantial impact on potential future prosperity of individuals and nations.
The empirical evidence on the determinants of educational performance overwhelmingly shows that, at currently observed levels of expenditure, an increase in the amount of resources used does not generally lead to an increase in educational performance. This has been shown within the United States (Hanushek 1986, 1996), within developing countries (Hanushek 1995), across countries (Hanushek and Kimko 2000), and across time within most OECD countries (Gundlach et al. 2001) and within some East Asian countries (Gundlach and Wößmann 1999). Figure 1 presents equivalent evidence from the latest and most coherent cross-country achievement study, the Third International Mathematics and Science Study (TIMSS). Again, cross-country resource differences do not help in understanding cross-country differences in educational performance. The correlation coefficient between expenditure per student and average TIMSS test scores is 0.13 in primary education and 0.16 in secondary education. By implication, the level of schooling productivity - the ratio of educational performance to resources used - differs widely across schooling systems. While a controversial debate surrounds the within-country evidence on resource effects on student performance(1), it is obvious and generally accepted that the large international differences in student performance levels in mathematics and science cannot be sufficiently attributed to differences in the amount of inputs used.
In other sectors of the economy, competition creates incentives to use resources efficiently. By contrast, schooling systems tend to be publicly managed all over the world, thereby often lacking incentives for improving students' performance or containing costs. As the Economist (1999, p. 21) put it, "[i]n most countries the business of running schools is as firmly in the grip of the state as was the economy of Brezhnev's Russia." While public provision of schooling may generally be associated with inefficiencies, the public schooling systems still differ substantially across countries in their institutional structure of educational decision-making processes. These institutional features include settings such as the centralization of examinations and of other decision-making powers, the distribution of responsibilities and influence between the administration, school principals, teachers, parents, and students, and the size of the private schooling sector. Institutional differences create different incentives for the agents involved in educational production, which should lead to different resource-allocation decisions and thereby ultimately to differences in the educational performance of students.
This paper examines whether and, if so, how differences in institutional incentive mechanisms can add to an explanation of the large international differences in students' cognitive skills. Based on a student-level data base derived from TIMSS, microeconometric education production functions are estimated which include data on institutions as explanatory variables. The link between institutions and student performance could hardly be tested using country-specific evidence, since there is no significant variation in most institutional features within a single country on which such an analysis could be based. Only the international evidence, which encompasses many education systems with widely differing institutional structures, has the potential to show whether institutions have important consequences for student performance.
1. The International Micro Data Base Based on TIMSS
TIMSS is the latest international student achievement test for which data is currently available, and it is the most extensive one ever conducted both in its coverage of countries and in the scope of its contents. In addition to testing students' cognitive skills in both mathematics and science, TIMSS gathered a wealth of contextual information about instruction and learning through student, teacher, and school-principal questionnaires. This TIMSS data base offers an unprecedented opportunity to examine the determinants of student performance. Drawing on this pool of data, I have constructed a student-level data base for more than 260,000 individual students in the middle school years. This new data set forms a representative sample in each of the 39 participating countries, representing a population of more than 30 million students(2). Appendix A gives a complete exposition of the variables used in this study and includes their descriptive statistics in Table A1.
TIMSS was conducted in 1994/95 under the auspices of the International Association for the Evaluation of Educational Achievement (IEA), which has gathered 40 years of experience with international comparative studies on educational achievement and learning contexts(3). In the test of students in the middle school years, where the broadest sample of countries participated, students enrolled in the two adjacent grades containing the largest proportion of 13-year-olds were tested, which are seventh- and eighth-graders in most countries. The students' achievement levels in mathematics and science were tested by a combination of multiple-choice and open-ended-response questions which covered a wide range of topics and capabilities in the two subjects. Combining the performance in the different questions of a subject, proficiency was mapped onto an international scale with a mean of 500 and a standard deviation of 100 to yield the international achievement scores.
In addition to the TIMSS mathematics and science achievement scores of the quarter of a million students, the micro data base gathers data from different TIMSS background questionnaires which enables one to control for individual family background and combines it with student-specific information on available resources and institutions as well as country-level data on the education systems. TIMSS collected contextual information about instruction and learning through student, teacher, and school questionnaires. The students who participated in TIMSS completed questionnaires about their demographics, home and family background, and classroom and out-of-school activities. The mathematics and science teachers of sampled students responded to questions about their professional training and education, their instructional practices, their responsibilities for decision-making in several areas, and about class sizes and the availability of materials. School questionnaires, answered by the principals of the tested schools, provide information on school characteristics and resources, the degree of centralization of decision-making, the allocation of responsibility for different tasks, and topics like the extent of parents' participation. From this qualitative survey data, dummy variables were created for the institutional features (see Appendix A).
For the TIMSS questionnaire variables, missing data was a problem. While some students, teachers, and school principals failed to answer individual questions, other questions were not administered at all in some countries. Since dropping students with missing data on some explanatory variables from the analysis would severely reduce the sample size, delete the information available on the other explanatory variables, and introduce sample selection bias, imputation of missing values was chosen(4). The method used to impute data for missing values in the TIMSS questionnaires is presented in Appendix B. Results of robustness tests against dropping observations with imputed data for each individual variable are reported in Section 6.
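The general approach can be sketched in code. The following is a minimal, hypothetical illustration only (the paper's actual procedure is described in its Appendix B): each missing value is replaced by the within-country mean of the variable, and a companion dummy flags the imputed cells so that regressions can later drop imputed observations for robustness checks, as in Section 6. The variable and country names are invented for the example.

```python
import numpy as np
import pandas as pd

def impute_with_indicators(df, cols):
    """Fill missing questionnaire responses and flag imputed cells.

    A minimal sketch: each missing value is replaced by the
    within-country mean of the variable, and a companion dummy
    records that the value was imputed, so later regressions can
    exclude imputed observations as a robustness test.
    """
    out = df.copy()
    for col in cols:
        out[col + "_imputed"] = out[col].isna().astype(int)
        out[col] = out.groupby("country")[col].transform(
            lambda s: s.fillna(s.mean())
        )
    return out

# Hypothetical toy data: one questionnaire variable, two countries.
data = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "class_size": [24.0, np.nan, 30.0, 28.0],
})
filled = impute_with_indicators(data, ["class_size"])
```

The indicator dummies are what make the Section 6 robustness tests possible: restricting the sample to rows where the indicator is zero reproduces the complete-case analysis for that variable.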
Some country-level data on decision-making centralization was provided by the TIMSS national research coordinators of each participating country. Furthermore, the data base combines the TIMSS data with additional system-level data on the institutional structure of the schooling systems from the educational indicators collected by the OECD, concerning the level of decision-making and the extent of private school management. This data base allows an extensive student-level analysis of the determinants of educational quality through the estimation of microeconometric education production functions.
2. Estimating Microeconometric Education Production Functions
To determine the influence of student background, resources, and institutions on students' educational performance, education production functions can be estimated of the form
t_i = B_i α + R_i β + I_i γ + ε_i
where t_i is the test score of student i, B_i are the measures of the student's background, R_i are the measures of resource use, I_i are the measures of institutional features surrounding the student's learning (R and I are measured at the classroom, school, and country level), ε_i is an error term, and α, β, and γ are the parameters to be estimated.
Studies such as Lee and Barro (1997) and Hanushek and Kimko (2000) have used country-level data to analyze the determination of students' performance. These macro education production functions cannot control for individual influences on a student's performance. They are also restricted to the analysis of system-level institutional determinants like central examinations (as performed by Bishop 1997, 1999). The relevant level at which to perform the analysis is the individual student (not the class, school, district, or country), because this directly links a student's performance to her teaching environment. The estimation of such a microeconometric education production function provides the opportunity to control for individual background influences on student performance when looking at the influence of resources and institutions, to assess the influence of the relevant resource and teacher characteristics with which a student is faced, and to look at the institutional features relevant to the individual student.
In using student-level data, attention has to be given to the complex data structure created by the survey design and the multi-level nature of the explanatory variables. As is common in educational survey data, the TIMSS sampling design includes varying sampling probabilities for different students as well as stratified and clustered data (see Martin and Kelly 1998). The TIMSS procedure was designed to achieve nationally representative student samples by stratified sampling within each country. To avoid bias in the estimated equation and to obtain nationally representative coefficient estimates from stratified survey data, weighted least squares (WLS) estimation using sampling weights has to be employed, so that the proportional contribution of each stratum in the sample to the parameter estimates is the same as would have been obtained in a complete census enumeration. DuMouchel and Duncan (1983) show that the use of a WLS estimator is especially relevant in an omitted-predictor model, which certainly applies to the estimation of an education production function, where the innate ability of each student remains unmeasured. Therefore, the following WLS regressions use weights which ensure that each student is weighted according to her probability of selection, so as to yield representative samples within each country and to give each country the same weight in the international estimation.
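The WLS estimator described above can be written down compactly: it solves the weighted normal equations (X'WX)b = X'Wy, where W is the diagonal matrix of sampling weights. The following sketch (with made-up toy data, not the TIMSS sample) shows the mechanics:

```python
import numpy as np

def weighted_least_squares(X, y, w):
    """Solve the WLS normal equations (X' W X) b = X' W y.

    w holds sampling weights in the TIMSS spirit: each student counts
    in proportion to the inverse of her probability of selection, so
    the sample reproduces the population shares of each stratum.
    """
    Xw = X * w[:, None]                  # scale each row of X by its weight
    return np.linalg.solve(Xw.T @ X, Xw.T @ y)

# Hypothetical example: a constant and one regressor with an exact
# linear relationship, so any weighting recovers intercept 1, slope 1.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 2.0, 3.0])
b = weighted_least_squares(X, y, np.array([1.0, 2.0, 3.0]))
```

With real survey data the weights change the estimates whenever the (omitted-predictor) model fits strata differently, which is exactly the DuMouchel-Duncan rationale for using WLS here.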
Each country was sampled separately, so that sampling was done independently across countries, fixing the division into countries in advance. In consequence, the TIMSS data is stratified by country. Furthermore, the TIMSS sampling procedure had a two-stage clustered sample design within each country, with the first stage yielding a sample of schools and the second stage yielding a sample of classrooms (Gonzalez and Smith 1997). Thus, the primary sampling unit (PSU) in TIMSS was the school. Individual students who go to the same school may share some characteristics which are not perfectly captured by the included observable variables. Additionally, the data set is characterized by a hierarchical data structure with data collected at different levels. As the resource and institutional variables are not measured at the student level but at the classroom or school level (see above), the observations on these variables for students who share the same class or school depend on one another. As a result, observations in the same PSU are not independent, so that the structure of the error term in the equation given above may be more complicated than conventional least-squares methods assume. One method to correct estimated standard errors, usually referred to as robust linear regression (RLS), combines the WLS regression with robust estimates of standard errors which recognize the stratification and clustering of the survey design(5). RLS relaxes the independence assumption and requires only that the observations be independent across the PSUs, allowing any amount of correlation within the PSUs. Thus, RLS estimates appropriate standard errors when many cases share the same value on some but not all independent variables(6).
In the following analysis, robust standard errors based on RLS are presented in addition to conventional (raw) standard errors. The robust standard errors use countries as strata and schools as PSUs. Schools were chosen as PSUs because they constitute the highest level of clustering, thereby allowing any degree of dependence within schools. Therefore, the reported robust standard errors are actually upper bounds for the standard errors of the coefficients of those explanatory variables which are measured at the student or classroom level. The marks signaling significance levels in the results tables are based on these robust variance estimates.
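The cluster-robust variance estimator behind these standard errors can be sketched as follows. This is an illustrative implementation of the standard sandwich formula with clustering at the PSU (school) level, not the paper's actual RLS code; the toy data are invented:

```python
import numpy as np

def cluster_robust_se(X, y, b, clusters):
    """Cluster-robust standard errors with schools as PSUs (a sketch).

    Residuals may be arbitrarily correlated within a cluster (a school)
    but are assumed independent across clusters, mirroring the RLS
    variance estimator described in the text: V = (X'X)^-1 M (X'X)^-1,
    where M sums the outer products of the per-cluster score vectors.
    """
    resid = y - X @ b
    bread = np.linalg.inv(X.T @ X)
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(clusters):
        score_g = X[clusters == g].T @ resid[clusters == g]
        meat += np.outer(score_g, score_g)
    return np.sqrt(np.diag(bread @ meat @ bread))

# Hypothetical toy data: four students in two schools (clusters 0, 1).
X = np.array([[1.0, 20.0], [1.0, 25.0], [1.0, 30.0], [1.0, 35.0]])
y = np.array([500.0, 480.0, 470.0, 440.0])
b = np.linalg.solve(X.T @ X, X.T @ y)            # OLS point estimates
se = cluster_robust_se(X, y, b, np.array([0, 0, 1, 1]))
```

Because correlation within a school is left unrestricted, these standard errors are conservative for regressors that vary below the school level, which is why the text describes them as upper bounds.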
3. Results on Family Background and Schooling Resources
Table 1 shows OLS and WLS regression results for the mathematics achievement score of 266,545 secondary-school students from 39 countries. While the results do not differ considerably between the OLS and the WLS estimation, the following discussion refers to the WLS estimates. Furthermore, significance statements are based on the robust variance estimation which accounts for the clustered data structure.
Student and Family Characteristics
Before institutional effects on student performance can be tested, the effects of differences in student characteristics and school resources have to be controlled for. Students in higher grades perform considerably better than students in lower grades, with the 8th-grade mathematics score being 40.3 points above the 7th-grade score (holding all other influences constant). After controlling for these grade-level differences, the age of students is negatively related to performance, probably reflecting a grade-repetition effect. On average, girls performed 7.6 points lower than boys. Students born in the country in which they attend school, students living with both parents, and students with at least one parent born in the country where they attend school performed better than students without these characteristics.
The educational level achieved by the students' parents was strongly positively related to the students' educational performance. The effect captured by the variable "books at home," which proxies for the educational and social background of the family, was even stronger. Students of schools located in geographically isolated communities performed worse than students from more urban areas. As a control for the overall level of development of the country in which the student lives, GDP per capita is positively related to mathematics achievement. All these effects of student and family characteristics are statistically highly significant.
Student and family background effects on science achievement, reported in Table 2, are very similar to the case of mathematics achievement. While being qualitatively identical, the quantitative effect differs to some extent for some variables. For example, the lead of boys' performance over girls' performance was 8.5 points larger in science than in mathematics.
Resources and Teacher Characteristics
The estimated effects of the amount of resources used on student performance are consistent with most of the literature in that no strong positive relationship exists between spending and student performance (see, e.g., Hanushek 1996). In fact, instead of resulting in higher student performance, higher educational expenditure per student (measured at the country level) and smaller class sizes (measured at the classroom level) are statistically significantly related to inferior mathematics and science results(7). The statistically insignificant effect of ratios of students to total professional staff at the school also points in the "wrong" direction.
In contrast to the measured effects of teaching staff, the endowment of schools with material facilities has the expected effect when measured by the subjective assessment of the school principals. Students in schools whose principals reported that the capacity to provide instruction was not affected by shortages or inadequacy of instructional materials scored 7.2 points higher in mathematics than students in schools with a little or some limitation (6.5 in science), while students in schools with a great shortage of materials scored 5.9 (11.6) points worse. Instruction time at the relevant grade level of the school is statistically significantly positively related to student performance in mathematics and science. While the relative importance of the explanatory dummy variables can be evaluated directly on the basis of their regression coefficients (the coefficient of a dummy reports the conditional test-score difference between students with and without the characteristic of interest), standardized coefficients (also reported in Tables 1 and 2 for the WLS estimation) can be used to compare the relative importance of the continuous explanatory variables. For example, a change of 1 standard deviation in instruction time is related to a change of only 0.025 standard deviations in the mathematics test score, while a 1-standard-deviation change in mathematics class size is related to a change of 0.122 standard deviations in mathematics performance.
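The rescaling into standardized coefficients is a simple transformation of the raw estimate. The sketch below illustrates the usual formula, beta_std = b * sd(x)/sd(y); the numbers are hypothetical, chosen only to make the arithmetic transparent:

```python
import numpy as np

def standardized_coefficient(b, x, y):
    """Rescale a raw regression coefficient b into SD units.

    beta_std = b * sd(x) / sd(y): the change in the test score, in
    standard deviations, associated with a one-standard-deviation
    change in the continuous regressor x, which makes coefficients
    on differently scaled variables comparable.
    """
    return b * np.std(x, ddof=1) / np.std(y, ddof=1)

# Hypothetical numbers: a raw coefficient of 5 score points per unit
# of x, a regressor with standard deviation 2, and test scores with
# standard deviation 100.
x = np.array([1.0, 3.0, 5.0])          # sd = 2
y = np.array([400.0, 500.0, 600.0])    # sd = 100
beta_std = standardized_coefficient(5.0, x, y)   # 5 * 2 / 100 = 0.1
```

Read this way, the 0.122 standardized class-size coefficient in the text says that a one-SD increase in class size is associated with roughly an eighth of a standard deviation in mathematics performance.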
Students of female teachers score statistically significantly higher than students of male teachers in both mathematics and science. Conditional on each other, teacher's experience is positively and teacher's age is negatively related to students' performance. This may reflect positive effects of teaching experience in combination with negative effects of age differences between teachers and students, presumably due to increasing difficulties of intergenerational understanding and declining motivation of aging teachers. Teachers' level of education is positively related to students' performance, with the effects in science being larger than in mathematics.
4. Results on Educational Institutions
I analyze the effects of seven categories of institutional features: central examinations; the distribution of responsibilities between schools and administration; teachers' influence; students' incentives; parents' influence; the distribution of responsibilities between administrative levels; and the extent of competition from private schools.
Central Examinations
Of the 39 education systems analyzed in this study, 15 have some kind of centralized examinations in the sense that a central decision-making authority has exclusive responsibility for or gives final approval of the content of examinations. Centrally and thus externally set examinations make schools and teachers accountable and increase students' rewards for learning (cf. Bishop 1997, 1999), which should positively impact on student performance. Accordingly, students in countries with centralized examination systems scored 16.1 points higher in mathematics and 10.7 points higher in science(8).
Furthermore, students in schools where external examinations or standardized tests had a lot of influence in determining the curriculum had test scores 4.3 points higher in mathematics. In science, the effect is negative when imputed observations are included, while it is positive but insignificant when the observations with imputed data on this variable are dropped from the sample (see Section 6). The weaker effect of standardized tests in science than in mathematics may reflect that science tests lend themselves less readily to standardization.
Distribution of Responsibilities between Schools and Administration
The responsibility for decisions in several areas of the education system is distributed differently between administration and schools across countries, with Dutch schools being relatively autonomous and schools in Greece, Norway, or Portugal having little autonomy. Since school autonomy allows better use of decentralized knowledge but increases the scope for local opportunistic behavior, centralization may be expected to be beneficial for decisions on standard setting, performance control, and the size of the budget, while school autonomy should be beneficial for decisions on teaching techniques, personnel management, and the purchase of supplies.
Students in countries with centralized decision-making responsibility both for the syllabi taught ("central curriculum") and for the list of approved textbooks ("central textbook approval") score higher in mathematics and science than students in countries where these decisions are not centralized(9). However, the size of these effects is smaller than the effect of centralized examinations. Students in schools which had primary responsibility for formulating the school budget had lower scores in mathematics (5.9 points) and science (3.5 points) than students in schools which did not primarily determine their own budget(10). That is, taking responsibility for setting standards and for the amount of available resources away from the school level is conducive to student performance.
By contrast, school autonomy in process decisions on purchasing supplies goes hand in hand with superior achievement of students(11). Likewise, students in schools which had freedom to decide on the hiring of teachers performed statistically significantly better in mathematics (12.7 points) and science (5.2 points), as did students in schools which could determine teacher salaries themselves (10.6 points in mathematics and 15.2 points in science). Thus, school autonomy in personnel management seems highly conducive to student performance.
In sum, the evidence supports the hypothesis that the distribution of responsibilities between schools and administration matters for the educational performance of students. On the one hand, centralized decisions on standard setting, performance control, and the size of the school budget help to ensure that the producers of education focus on the performance of students. On the other hand, school autonomy in process and personnel-management decisions seems to be the best way to guarantee high student performance. Thus, the most conducive combination seems to be a mechanism of control from above which limits school-level opportunistic behavior, combined with a high degree of freedom to decide at the school level on subjects where school-level knowledge is important.
Teachers' Influence
The degree of freedom of teachers to decide independently on several educational topics should impact on student performance by affecting the decision-making outcome in the education system. Since teachers cannot be easily monitored, the institutional setting will tilt their behavior either in the direction of furthering their own interests or of advancing students' performance. While it is the very aim of teacher unions to promote the special interests of teachers, a high influence of teacher unions may result in increased pay and decreased work-load, to the detriment of student performance. A high degree of influence of individual teachers with specific local knowledge on process decisions should be conducive to student performance, while it should be detrimental in decisions determining teacher salary levels or work-loads.
Correspondingly, students in schools whose principals reported that teachers had primary responsibility for the school budget scored 13.3 points worse in mathematics (4.6 points in science)(12). Conversely, students scored 14.1 points better in mathematics (6.8 in science) if teachers had primary responsibility for purchasing supplies. The findings imply that decisions on the amount of money to be spent should be taken away from teachers and schools, while decisions on which specific supplies to be purchased should be decentralized to teachers and schools(13).
With regard to teachers' influence on the curriculum that is taught in the school, a clear difference arises between teachers acting individually and teachers acting collectively. On the one hand, students in schools where each teacher individually had a lot of influence on the curriculum performed considerably better (12.0 points in mathematics and 10.8 points in science). On the other hand, students in schools where school teachers collectively or teacher unions had a lot of influence on the curriculum performed statistically significantly worse. This detrimental effect of teachers exercising a collective influence on the curriculum is strongest in the case of teacher unions (-32.3 points in mathematics and -18.4 points in science)(14).
Concerning specific areas of influence of individual teachers, statistically significant results are confined to science. Students of teachers who reported that they had a lot of influence on the money for supplies and on the kind of supplies purchased showed statistically significantly better science performance. By contrast, students of teachers who reported that they had a lot of influence on the subject matter to be taught, which determines the teachers' work-load, performed worse in science(15). Whether the class teacher is allowed to decide on the specific textbook to be used does not seem to have a significant effect on students' performance.
Overall, the findings on teachers' influence give a clear picture. If individual teachers can make use of their decentralized knowledge on which teaching method may be best for their students, this will help students to learn more. This conclusion is corroborated by the positive effects of individual teachers influencing the curriculum that is taught in the school and of teachers having responsibility for the purchase of supplies. However, if teachers can use their decision-making powers primarily to reduce their work-load, this will hurt students' learning opportunities. This conclusion is corroborated by the negative effects of teachers' responsibility for the school budget and for the teaching load and of teachers exerting collective power over the curriculum(16).
Students' Incentives
The incentives of students to learn should be influenced by institutional features of the education system which determine the time a student spends studying and the relative benefits of studying. As reported before, instruction time in school was positively related to students' performance. Likewise, centralized examinations, which should make students' learning efforts more visible to external observers, were shown to have a positive impact on students' educational achievement. As another factor influencing the extent to which studying is rewarded and laziness penalized, the scrutiny with which teachers observe and mark students' achievement (measured by the time which the class teacher spends outside the formal school day on preparing or grading exams) has a statistically significantly positive effect on student performance in mathematics, and a statistically insignificant positive effect in science.
The amount of homework is another measure determining the amount of time which students spend studying. However, the minutes of homework per week assigned by the class teacher are statistically insignificantly related to students' performance in mathematics and negatively related to students' performance in science. This may reflect that the minutes of homework assigned can differ greatly from the minutes of homework actually done by each student. Alternatively, it may reflect a non-linear, more complex relationship between the minutes of homework assigned and student performance. Both in mathematics and in science, the frequency of homework assignments per week is negatively related to student performance, while the length of a single homework assignment is positively related to student performance. It seems that assigning homework less often but on a more ambitious scale each time is particularly conducive to students' learning.
Parents' Influence
Evidence was previously reported that parents' education and the number of books in a student's home were strongly positively related to the student's educational performance. Apart from the learning environment at home, the influence which parents exert on curricular matters and on teaching in the formal education system should also impact on students' learning opportunities. Accordingly, students in schools where parents had a lot of influence in determining the curriculum scored higher both in mathematics and science; however, these effects are not statistically significantly different from zero.
With regard to parents' influence on teaching, the class teacher reported whether parents uninterested in their children's learning and progress strongly limited how she teaches her class, e.g. because she then could not rely on parents in scrutinizing homework. Students in classes where uninterested parents strongly limited class teaching performed 10.1 points worse in mathematics and 11.0 points worse in science. The class teacher also reported whether interested parents limited class teaching, presumably by preventing her from teaching in the way she judged most suitable. When interested parents were deemed a cause of limitation, students scored 10.9 points worse in mathematics. However, this effect is very small and statistically insignificant in science. That is, even though science teachers maintained that their teaching was greatly limited by parents being excessively interested in their children's learning, this interference did not cause inferior performance of the students.
These positive effects of parents' involvement were not replicated in a positive effect of the time parents spent on meetings with teachers. In fact, the number of hours outside the formal school day which the class teacher reported spending on meetings with parents each week ("parent-teacher meetings") was negatively related to student performance. However, this may reflect the fact that teachers have more to discuss with the parents of poor students than with the parents of good students, so that the time spent on parent-teacher meetings is not exogenous to students' performance. Furthermore, the hours spent in parent-teacher meetings may prevent teachers from doing other useful work, such as the preparation and evaluation of classes and exams.
5. Institutional Evidence from OECD Indicators
Additional evidence on the effects of institutional features of the education system was obtained from the institutional measures of the OECD educational indicators. Table 3 reports the WLS mathematics results for these indicators, which are all measured in percentages within a country. Since the OECD indicators are country-level variables, the number of countries equals the number of independent observations for these effects. Consequently, the standard errors reported are robust standard errors based on countries as PSUs. To save on degrees of freedom, given that the OECD variables are available only for a limited number of the countries participating in TIMSS, and for a different sample of countries in each case, each row in Table 3 reports the results of a separate regression. The regressions again control for all the student background, resource, and other institutional variables reported in Table 1. Table 4 reports equivalent results for science performance.
Distribution of Responsibilities between Schools and Administration
Evidence based on the TIMSS questionnaire measures presented above showed that the distribution of responsibilities between schools and administration in different educational decision-making areas has a significant impact on student performance. An OECD indicator of school autonomy reports the percentage of educational decisions in a country taken at the school level in full autonomy, without consultations or preset frameworks from the administration. This general indicator of full school autonomy ("school autonomy" in Tables 3 and 4) - which comprises the decision-making domains of organization of instruction, personnel management, planning, and resources - is statistically significantly positively related to student performance in science, and positively, though statistically insignificantly, in mathematics. The standardized coefficients show that if the percentage of decisions taken at the school level in full autonomy increased by 1 standard deviation (equivalent to 11.3 percentage points), students scored 0.062 standard deviations (6.1 test score points) higher in science. Equivalent results are obtained when school-level decisions taken within frameworks from, or after consultation with, other levels of administration are included ("school level decisions").
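The standardized coefficients used here convert a raw coefficient into standard-deviation units via beta_std = beta_raw * sd(x) / sd(y). A small worked check using only the numbers quoted in the text (the implied test-score standard deviation of roughly 98 points is a back-calculation from those numbers, not a figure taken from the paper's tables):

```python
# A 1-sd (11.3 pp) rise in fully autonomous school decisions raises
# science scores by 6.1 points, quoted as 0.062 sd of the test score.
sd_x = 11.3           # sd of % of decisions taken autonomously at schools
points_effect = 6.1   # science score gain for a 1-sd increase
beta_raw = points_effect / sd_x       # points per percentage point
sd_y = points_effect / 0.062          # implied sd of test scores (~98)
beta_std = beta_raw * sd_x / sd_y     # back to standard-deviation units
print(round(beta_std, 3))             # 0.062 by construction
```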
The variables on school responsibility are also given for the four sub-groups of decisions separately. For science performance, the coefficients on school level decisions are statistically significantly positive in the decision-making domains of organization of instruction, personnel management, and planning and structures. For mathematics performance, only the positive effect in organization of instruction is statistically significant.
Distribution of Responsibilities between Administrative Levels
When educational decision-making authority lies with the administration (as opposed to the school level), the remoteness of this authority from the school level constitutes another feature of the institutional system of education. The dominant levels of administrative decision-making and control over funding differ widely across countries, with the local level of government taking center stage in the United States, the intermediate level in Germany, and the central level in Greece. While efficient monitoring of schools' actions and resource use by a central administration seems elusive because of information problems, and a self-interested central administration may tend toward excessive bureaucracy, closer ties between local administration and school personnel may increase the potential for lobbying by school-based interest groups and for collusion.
Tables 3 and 4 report the effect of the extent of decision-making at the central level of government ("central government decisions"), where the residual category (the percentage of decisions not taken at the central level) encompasses the decisions taken at the school level and at the local and intermediate (sub-regional, provincial, and state) levels of government. Students in countries with a higher percentage of decisions taken at the central level of government scored lower in both the mathematics and the science tests, with only the science effect being statistically significant. The effects in each of the four sub-groups of decision-making in both mathematics and science are also negative, with the effects of instructional and resource decisions in both mathematics and science and the personnel management effect in science being statistically significant. By contrast, the percentage of decisions taken at an intermediate level of government (part of the residual category in the regressions presented in the tables) were positively related to student performance.
The distribution of responsibility for and control over funding between the different government levels is related to student performance in a similar way to the distribution of decision-making authority. The larger the share of funds provided at the local or the central level of government, the lower was students' performance in mathematics. Consequently, students performed considerably better the more funding was decided on at an intermediate level of government (the residual category). Once responsibility lies with the administration, an administrative level close enough to individual schools to be familiar with local needs, yet distant enough to limit opportunistic and collusive behavior, seems to be most conducive to focusing attention on student performance.
Private Schools
In general, basic education is provided publicly all over the world. However, countries differ considerably in the extent of competition from private institutions in the education system. Three quarters of Dutch students attend privately managed schools, and Japan has the largest share of privately managed schools that are also financially independent of public funding, with one quarter of Japanese schools receiving less than half of their core funding from government agencies. At the other extreme, many countries, such as Australia, France, Germany, and Sweden, have virtually no financially independent private schools. Private schools face monetary incentives to use resources efficiently, and they introduce competition into the public education system by offering choice to parents, thereby potentially advancing student performance.
As the results in Tables 3 and 4 show, students in countries with larger shares of enrollment in privately managed educational institutions scored statistically significantly higher in both mathematics and science. This effect was even larger when only those private institutions were considered which were also financially independent of funding from government sources. Similarly, countries with a higher share of (public) educational expenditure going to private institutions performed better both in mathematics and in science (with only the mathematics effect being statistically significant). Again, this effect was even stronger when focusing only on independent private institutions. Thus, student performance is higher in education systems where private schools take over resource allocation from public decision-makers.
These effects of private school management are measured at the country level. This does not allow for an assessment of the relative performance of public and private schools, for which the relevant data are not available in the TIMSS case. However, measuring the system-level effect of private school management is the appropriate way to estimate the general effects of the competitive environment prevailing in the different education systems, because increased competition from private schools may also improve the effectiveness of resource use in nearby public schools. Hoxby (1994) finds such positive effects of competition from private schools on the performance of public school students in US metropolitan areas(17). Furthermore, Hoxby (1996) shows that the negative effect of teacher unionization is statistically significantly reduced in the United States when a school faces competition from private schools.
6. Robustness
Since some of the variables in the TIMSS data set contained a substantial number of missing values that had to be imputed, it remains to be tested whether the reported results are sensitive to the imputation. Robustness can be tested by dropping the observations with imputed data for each variable in turn and re-running the regressions. The only changes in either significance or direction of the relationships occur in the regressions for the following institutional variables(18). The effect of external exams' influence on the curriculum turns positive (albeit statistically insignificant) in science, replicating the mathematics result. The negative effect of teachers' responsibility for the school budget turns strongly statistically significant in science, while it is statistically significant only at the 15 percent level in mathematics. The coefficient on subject teachers' influence on the curriculum turns statistically insignificant and positive in both mathematics and science, as does the coefficient on school teachers' influence on the curriculum in science. The coefficient on the class teacher having strong influence on the subject matter taught turns statistically significant in science, while the insignificant coefficient on the choice of textbooks turns positive in science (as it is in mathematics). The effect of homework in mathematics and the effect of parents' influence on the curriculum in both mathematics and science turn statistically insignificantly negative.
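The robustness exercise just described - for each variable, drop every observation whose value for that variable was imputed, then re-estimate - can be sketched as a simple loop; the interface below (mask dictionary, regression callback) is illustrative, not the study's actual code:

```python
import numpy as np

def robustness_to_imputation(imputed_mask, variables, run_regression):
    """For each variable, drop the observations whose value for that
    variable was imputed, and re-run the regression on the remainder.

    imputed_mask: dict mapping variable name -> (n,) bool array
                  (True where the value was imputed)
    run_regression: callback taking a boolean keep-mask and returning
                    the regression results (illustrative interface)
    """
    results = {}
    for v in variables:
        keep = ~imputed_mask[v]   # drop imputed rows for this variable only
        results[v] = run_regression(keep)
    return results
```

Comparing each re-estimated coefficient's sign and significance against the baseline then yields exactly the list of changes reported in the text.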
Since the negative impact of teachers exercising collective influence over the curriculum is in any case best captured by the strong negative effect of teacher unions, and since a statistically significant impact of parents' involvement in teaching is shown by the strong negative impact of uninterested parents, it can in sum be stated that none of the findings relevant to the argumentation in this study depends on the data imputation. Furthermore, increasing the threshold of non-imputed variables for a student to be included in the sample (see Section 1) by another 10 variables - reducing the total sample size to 255,018 students in mathematics and to 251,292 students in science - does not lead to any change in significance or sign of the coefficients.
A comparison between performance in mathematics and in science shows that all of these results are very robust across the two subjects. Family and resource effects as well as institutional effects are qualitatively the same for mathematics and science learning. The only difference is that standardization effects seem to be more positive in mathematics than in science. This shows up in the facts that the effects of centralized examinations, curricula, and textbook approval are larger for mathematics than for science, that a strong influence of external examinations on the school curriculum has a positive effect on mathematics scores but an ambiguous one on science scores, and that school authority in the four decision-making domains reported by the OECD impacts positively on science performance but is unrelated to mathematics performance (with the exception of the organization of instruction). This difference may indicate that the propensity for standardization is higher in mathematics than in science.
7. Conclusions
Student-level estimates of education production functions reveal that differences in the incentive structures determined by the institutional features of the education systems strongly matter for student performance. The combined effect of performance-conducive educational institutions measured by the dummy variables in Tables 1 and 2 amounts to a test score difference of more than 210 points in mathematics (150 points in science), which equals about 2 standard deviations in test scores and compares to an average test score difference between seventh and eighth grade of 40 points. That is, a student who faced institutions that were all conducive to student performance would have scored more than 200 points higher in mathematics than a student who faced institutions that were all detrimental to student performance. In addition, there are the effects of the discrete variables and the system-level results reported in Tables 3 and 4.
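The arithmetic behind this combined-effect comparison can be checked directly, using only the headline numbers quoted above (the test-score standard deviation of roughly 100 points is an approximation implied by the text, not a tabulated value):

```python
# Back-of-envelope check of the combined institutional effect: the gap
# between a student facing all performance-conducive institutions and
# one facing all detrimental ones, expressed in sd and grade units.
combined_math = 210   # summed score effect of the Table 1/2 dummies
test_sd = 100         # approximate sd of TIMSS test scores (assumed)
grade_gap = 40        # average 7th-to-8th-grade score difference
print(combined_math / test_sd)    # 2.1 -> "about 2 standard deviations"
print(combined_math / grade_gap)  # 5.25 -> over five grade-equivalents
```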
The following institutional features of a schooling system favorably affect student performance:
central examinations,
centralized control mechanisms in curricular and budgetary affairs,
school autonomy in process and personnel decisions,
individual teachers having both incentives and powers to select appropriate teaching methods,
limited influence of teacher unions,
scrutiny of students' educational performance,
encouragement of parents to take interest in teaching matters,
an intermediate level of administration performing administrative tasks and educational funding, and
competition from private educational institutions.
While strong and unambiguous effects on students' educational performance can also be found for family background factors, the impact of resource factors appears dubious and weak at best. The effects of expenditure per student and class size point in the "wrong" direction, while the availability of instructional materials and teachers' experience and education show positive effects. A strong and systematic relationship between resource use and student performance clearly does not exist.
International differences in the institutions of the education systems, rather than in available resources, help in understanding cross-country differences in students' educational performance. While the previous country-level studies by Hanushek and Kimko (2000) and Lee and Barro (1997), which restricted themselves to family and resource effects, explained at most one quarter of the cross-country variation in students' achievement test scores, adding three aggregated measures of the institutional features analyzed in this paper helps to explain three quarters of the cross-country variation in TIMSS mathematics scores and 60 percent in science scores(19). Hence differences in institutions of the education systems can explain a major part of the international differences in average student performance levels.
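The variance-decomposition exercise referred to here amounts to comparing the R² of country-level regressions of mean test scores on different sets of aggregated explanatory measures. A generic sketch (the inputs are placeholders; in the actual exercise the regressors are the family, resource, and institutional aggregates named in the text):

```python
import numpy as np

def country_level_r2(scores, X):
    """R^2 from a country-level OLS regression of mean TIMSS scores on
    aggregated explanatory measures.

    scores: (c,) country mean test scores
    X: (c, k) country-level regressors, including a constant column
    """
    beta, *_ = np.linalg.lstsq(X, scores, rcond=None)  # OLS fit
    fitted = X @ beta
    ss_res = ((scores - fitted) ** 2).sum()            # residual variation
    ss_tot = ((scores - scores.mean()) ** 2).sum()     # total variation
    return 1 - ss_res / ss_tot
```

Running this once with only family and resource aggregates and once adding the institutional aggregates reproduces the kind of R² comparison (one quarter versus three quarters) made in the text.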
Cross-country differences in student performance are not a mystery. They are related to policy measures. However, the policy measures which matter for schooling output are not simple resource inputs. Spending more money within an institutional system which does not set suitable incentives will not improve student performance. The results of this paper imply that the crucial question for education policy is not one of more resources but one of creating an institutional system where all the people involved face incentives to use resources efficiently and to improve student performance. Success in educational production does not primarily depend on the amount of resources spent, but on the institutional features governing the education process.
References
Angrist, Joshua D., Victor Lavy (1999). Using Maimonides' Rule to Estimate the Effect of Class Size on Scholastic Achievement. Quarterly Journal of Economics 114(2): 533-575.
Bishop, John H. (1997). The Effect of National Standards and Curriculum-Based Exams on Achievement. American Economic Review, Papers and Proceedings 87(2): 260-264.
Bishop, John H. (1999). Are National Exit Examinations Important for Educational Efficiency? Swedish Economic Policy Review 6(2): 349-398.
Bryk, Anthony S., Stephen W. Raudenbush (1992). Hierarchical Linear Models: Applications and Data Analysis Methods. Advanced Quantitative Techniques in the Social Sciences 1. Sage Publications, Newbury Park et al.
Case, Anne, Angus Deaton (1999). School Inputs and Educational Outcomes in South Africa. Quarterly Journal of Economics 114(3): 1047-1084.
Cohen, Jon, Stéphane Baldi (1998). An Evaluation of the Relative Merits of HLM vs. Robust Linear Regression in Estimating Models with Multi-Level Data. Mimeo, American Institutes for Research, February.
Deaton, Angus (1997). The Analysis of Household Surveys. The Johns Hopkins University Press, Baltimore and London.
DuMouchel, William H., Greg J. Duncan (1983). Using Sample Survey Weights in Multiple Regression Analyses of Stratified Samples. Journal of the American Statistical Association 78(383): 535-543.
Economist, The (1999). A Contract on Schools - Why Handing Education Over to Companies Can Make Sense. January 16: 21.
Goldstein, Harvey (1999). Multilevel Statistical Models. Kendall's Library of Statistics 3, Revised internet edition of the 2nd print edition, available at http://www.arnoldpublishers.com/support/goldstein.htm. Edward Arnold, London.
Gonzalez, Eugenio J., Teresa A. Smith (eds.) (1997). User Guide for the TIMSS International Database - Primary and Middle School Years. International Association for the Evaluation of Educational Achievement, TIMSS International Study Center, Boston College, Chestnut Hill, MA.
Gundlach, Erich, Ludger Wößmann (1999). The Fading Productivity of Schooling in East Asia. Kiel Working Paper 945, September.
Gundlach, Erich, Ludger Wößmann, Jens Gmelin (2001). The Decline of Schooling Productivity in OECD Countries. Economic Journal: forthcoming.
Hanushek, Eric A. (1986). The Economics of Schooling: Production and Efficiency in Public Schools. Journal of Economic Literature 24: 1141-1177.
Hanushek, Eric A. (1995). Interpreting Recent Research on Schooling in Developing Countries. World Bank Research Observer 10(2): 227-246.
Hanushek, Eric A. (1996). School Resources and Student Performance. In: Gary Burtless (ed.), Does Money Matter? The Effect of School Resources on Student Achievement and Adult Success. The Brookings Institution, Washington, D.C.: 43-73.
Hanushek, Eric A., Dennis D. Kimko (2000). Schooling, Labor-Force Quality, and the Growth of Nations. American Economic Review 90(5): 1184-1208.
Heckman, James J. (1999). Policies to Foster Human Capital. NBER Working Paper 7288, August.
Hoxby, Caroline M. (1994). Do Private Schools Provide Competition for Public Schools? NBER Working Paper 4978, December.
Hoxby, Caroline M. (1996). How Teachers' Unions Affect Education Production. Quarterly Journal of Economics 111: 671-718.
Hoxby, Caroline M. (2000a). Does Competition among Public Schools Benefit Students and Taxpayers? American Economic Review 90(5): 1209-1238.
Hoxby, Caroline M. (2000b). The Effects of Class Size on Student Achievement: New Evidence from Population Variation. Quarterly Journal of Economics 115(4): 1239-1285.
Krueger, Alan B. (1999a). Experimental Estimates of Education Production Functions. Quarterly Journal of Economics 114(2): 497-532.
Krueger, Alan B. (1999b). An Economist's View of Class Size Research. Princeton University, Mimeo, December.
Krueger, Alan B., Diane M. Whitmore (2001). The Effect of Attending a Small Class in the Early Grades on College-Test Taking and Middle School Test Results: Evidence from Project STAR. Economic Journal: forthcoming.
Lee, Jong-Wha, Robert J. Barro (1997). Schooling Quality in a Cross Section of Countries. NBER Working Paper 6198, September.
Martin, Michael O., Dana L. Kelly (eds.) (1998). TIMSS Technical Report Volume II: Implementation and Analysis, Primary and Middle School Years. Boston College, Chestnut Hill, MA.
Pritchett, Lant, Deon Filmer (1999). What Education Production Functions Really Show: A Positive Theory of Education Expenditure. Economics of Education Review 18: 223-239.
Figure 1: Expenditure per student and educational performance: the cross-country evidence
[Two scatter plots - "Primary School Years" and "Middle School Years" - plotting expenditure per student (b) against average test scores (a); graphs not reproduced.]
(a) Average of mathematics and science results in 3rd/4th grade and in 7th/8th grade, respectively.
(b) At primary and secondary level, respectively, in international dollars, 1994.
Sources: See Appendix A.
Table 1: Effects on mathematics performance
Dependent variable: TIMSS international mathematics test score. Standard errors in parentheses.

Variable | OLS Coeff. | OLS Raw S.E. | WLS Coeff. | WLS Raw S.E. | WLS Robust S.E. | Std. Coeff.
---------|-----------|--------------|------------|--------------|-----------------|------------
Constant | 426.985 | (4.360) | 482.793* | (4.211) | (13.916) |

Student and family characteristics
Upper grade | 38.773 | (0.425) | 40.342* | (0.424) | (1.086) | 0.202
Above upper grade | 99.486 | (1.464) | 100.313* | (1.513) | (3.906) | 0.127
Age | -9.884 | (0.244) | -14.183* | (0.231) | (0.779) | -0.135
Sex | -7.229 | (0.343) | -7.634* | (0.346) | (0.878) | -0.038
Born in country | 8.372 | (0.813) | 9.199* | (0.816) | (1.338) | 0.021
Living with both parents | 15.276 | (0.514) | 12.099* | (0.519) | (0.814) | 0.040
Parent born in country | 5.132 | (0.715) | 3.983† | (0.722) | (1.602) | 0.011
Parents' education:
  Some secondary | 0.069 | (0.707) | -3.989* | (0.702) | (1.553) | -0.014
  Finished secondary | 25.755 | (0.654) | 26.475* | (0.660) | (1.454) | 0.123
  Some after secondary | 12.046 | (0.695) | 15.130* | (0.700) | (1.515) | 0.066
  Finished university | 36.600 | (0.734) | 39.724* | (0.746) | (1.619) | 0.152
Books at home:
  11-25 | 10.999 | (0.755) | 10.326* | (0.749) | (1.360) | 0.037
  26-100 | 37.317 | (0.705) | 35.846* | (0.701) | (1.444) | 0.168
  101-200 | 47.570 | (0.761) | 46.713* | (0.756) | (1.543) | 0.186
  More than 200 | 55.145 | (0.753) | 54.269* | (0.750) | (1.562) | 0.235
Community location:
  Geographically isolated area | -14.707 | (1.040) | -18.502* | (1.085) | (3.385) | -0.030
  Close to the center of a town | 2.451 | (0.361) | 1.598 | (0.363) | (1.479) | 0.008
GDP per capita | 0.004 | (5.9e-5) | 0.004* | (5.8e-5) | (2.1e-4) | 0.240

Resources and teacher characteristics
Expenditure per student | -0.009 | (2.1e-4) | -0.006* | (2.1e-4) | (6.9e-4) | -0.106
Class size | 0.912 | (0.018) | 1.176* | (0.019) | (0.090) | 0.122
Student-teacher ratio | 0.011 | (0.003) | 0.006 | (0.003) | (0.007) | 0.004
No shortage of materials | 8.525 | (0.387) | 7.230* | (0.394) | (1.585) | 0.036
Great shortage of materials | -1.480 | (0.563) | -5.925† | (0.554) | (2.393) | -0.020
Instruction time | 3.7e-4 | (2.3e-5) | 3.1e-4* | (2.3e-5) | (8.4e-5) | 0.025
Teacher characteristics:
  Teacher's sex | 5.634 | (0.372) | 5.727* | (0.374) | (1.345) | 0.029
  Teacher's age | -0.712 | (0.033) | -0.667* | (0.033) | (0.124) | -0.062
  Teacher's experience | 1.075 | (0.032) | 1.038* | (0.033) | (0.121) | 0.097
Teacher's education:
  Secondary | 11.151 | (1.674) | 15.682* | (1.569) | (5.206) | 0.062
  BA or equivalent | 10.919 | (1.648) | 10.571† | (1.542) | (5.105) | 0.050
  MA/PhD | 20.860 | (1.694) | 25.576* | (1.596) | (5.411) | 0.090

Institutional settings
Central examinations:
  Central examinations | 17.842 | (0.434) | 16.062* | (0.402) | (1.435) | 0.045
  External exams influence curriculum | 10.740 | (0.539) | 4.271‡ | (0.524) | (2.199) | 0.016
Distribution of responsibilities between schools and administration:
  Central curriculum | 15.585 | (0.539) | 10.776* | (0.519) | (1.783) | 0.048
  Central textbook approval | 10.053 | (0.474) | 9.559* | (0.460) | (1.563) | 0.078
  School responsibility:
    School budget | -5.362 | (0.663) | -5.852† | (0.683) | (2.450) | -0.017
    Purchasing supplies | -2.288 | (0.976) | 0.538 | (0.997) | (3.488) | 0.001
    Hiring teachers | 13.959 | (0.454) | 12.723* | (0.471) | (1.772) | 0.055
    Determining teacher salaries | 6.539 | (0.455) | 10.588* | (0.464) | (2.112) | 0.046
Teachers' influence:
  Teachers' responsibility:
    School budget | -15.478 | (1.032) | -13.318* | (1.100) | (3.805) | -0.022
    Purchasing supplies | 11.361 | (0.602) | 14.148* | (0.642) | (2.576) | 0.040
    Hiring teachers | -4.317 | (5.413) | -10.294 | (6.197) | (21.456) | -0.003
    Determining teacher salaries | -16.874 | (5.153) | -11.069 | (5.492) | (20.995) | -0.003
  Strong influence on curriculum:
    Teacher individually | 9.709 | (0.442) | 11.952* | (0.446) | (1.730) | 0.051
    Subject teachers | -2.980 | (0.473) | -6.855* | (0.476) | (1.897) | -0.034
    School teachers collectively | -9.333 | (0.459) | -12.659* | (0.459) | (1.836) | -0.063
    Teacher unions | -27.532 | (1.367) | -32.329* | (1.370) | (5.979) | -0.042
  Class teacher has strong influence on:
    Money for supplies | 2.800 | (0.905) | -0.815 | (0.909) | (3.734) | -0.002
    Kind of supplies | -2.701 | (0.593) | -0.627 | (0.606) | (1.997) | -0.002
    Subject matter | -0.613 | (0.414) | -0.830 | (0.420) | (1.585) | -0.004
    Textbook | -0.322 | (0.480) | 2.687 | (0.478) | (1.913) | 0.011
Students' incentives:
  Scrutiny of exams | 4.410 | (0.109) | 4.749* | (0.110) | (0.429) | 0.078
  Homework | -0.006 | (0.002) | 0.001 | (0.002) | (0.010) | 0.001
Parents' influence:
  Parents influence curriculum | -0.949 | (1.314) | 3.714 | (1.390) | (5.516) | 0.005
  Uninterested parents limit teaching | -12.546 | (0.672) | -10.107* | (0.656) | (2.756) | -0.029
  Interested parents limit teaching | -8.879 | (0.871) | -10.860* | (0.825) | (4.090) | -0.025
  Parent-teacher meetings | -5.966 | (0.277) | -6.152* | (0.283) | (1.021) | -0.039

Observations | 266545 | | 266545 | | |
Schools (PSUs) | 6107 | | 6107 | | |
Countries | 39 | | 39 | | |
R² (adj.) | 0.22 | | 0.22 | | |