Part 2: Early years assessment is not reliable or valid and thus not helpful

This is the second post on early years assessment. The first is here.

Imagine the government decided they wanted children to be taught to be more loving. Perhaps the powers that be could decide to make teaching how to love statutory and tell teachers they should measure each child’s growing capacity to love.

Typical scene in the EYFS classroom – a teacher recording observational assessment. 

There would be serious problems with trying to teach and assess this behaviour:

Definition: What is love? Does the word actually mean the same thing in different contexts? When I talk about ‘loving history’ am I describing the same thing (or ‘construct’) as when I ‘love my child’?

Transfer: Is ‘love’ something that universalises between contexts? For example, if you get better at loving your sibling, will that transfer to a love of friends, of school, or of learning geography?

Teaching: Do we know how to teach people to love in schools? Are we even certain it’s possible to teach it?

Progress: How does one get better at loving? Is progress linear? Might it just develop naturally?

Assessment: If ‘loving skills’ actually exist can they be effectively measured?

Loving – a universalising trait that can be taught?

The assumption that we can teach children to ‘love’ in one context and they’ll exercise ‘love’ in another might seem outlandish but, as I will explain, the writers of early years assessment fell into just such an error in the Early Years Foundation Stage framework and assessment profile.

In my last post I explained how the priority on assessment in authentic environments has been at the cost of reliability and has meant valid conclusions cannot be drawn from Early Years Foundation Stage Profile assessment data. There are, however, other problems with assessment in the early years…

Problems of ‘validity’ and ‘construct validity’

Construct validity: the degree to which a test measures what it claims, or purports, to be measuring.

Validity: the degree to which inferences can be drawn from an assessment about what students can do in other situations, at other times and in other contexts.

If we think we are measuring ‘love’ but it doesn’t really exist as a single skill that can be developed then our assessment is not valid. The inferences we draw from that assessment about student behaviour would also be invalid.

Let’s relate this to the EYFS assessment profile.

Problems with the EYFS Profile ‘characteristics of effective learning’

The EYFS Profile Guide requires practitioners to comment on a child’s skills and abilities in relation to three ‘constructs’ labelled as ‘characteristics of effective learning’:

  • playing and learning
  • active learning (motivation)
  • creating and thinking critically

We can take one of these characteristics of effective learning to illustrate a serious problem with the validity of the assessment. While a child might well demonstrate creativity and critical thinking (the third characteristic listed), it is now well established that such behaviours are NOT skills or abilities that can be learnt in one context and transferred to another entirely different context – they don’t universalise any more than ‘loving’. In fact, the capacity to be creative or think critically depends on specific knowledge of the issue in question. Many children can think very critically about football, but that apparent ability evaporates when they are faced with some maths. You think critically in maths because you know a lot about solving similar maths problems, and this capacity won’t make you think any more critically when solving something different like a word puzzle or a detective mystery.

Creating and thinking critically are NOT skills or abilities that can be learnt in one context and then applied to another

Creating and thinking critically are not ‘constructs’ which can be taught and assessed in isolation. Therefore no valid general inference can be drawn from observing and reporting these behaviours as a ‘characteristic of learning’. If you wish a child to display critical thinking you should teach them lots of relevant knowledge about the specific material you would like them to think critically about.

In fact, what is known about traits such as critical thinking suggests that they are ‘biologically primary’ and don’t even need to be learned [see an accessible explanation here].

Moving on to another characteristic of effective learning: active learning, or motivation. This presupposes both that ‘motivation’ is a universalising trait and that we are confident we know how to inculcate it. In fact, as with critical thinking, it is perfectly possible to be involved and willing to concentrate in some activities (computer games) but not others (writing).

There has been high-profile research on motivation, particularly Dweck’s work on growth mindset and Angela Duckworth’s on grit. Duckworth has created a test that she argues demonstrates that adult subjects possess a universalising trait she calls ‘Grit’. But even this world expert concedes that we do not know how to teach Grit, and she rejects the use of her Grit scale in high-stakes tests. Regarding growth mindset, serious doubts have been raised about failures to replicate Dweck’s research findings and about studies with statistically insignificant results that have been used to support it.

Despite serious questions around the teaching of motivation, the EYFS Profile ‘characteristics of learning’ presume motivation is a trait that can be inculcated in pre-schoolers and, without solid research evidence, that it can be reliably assessed.

That leaves the final characteristic of effective learning: playing and learning. Of course children learn when playing. This does not mean the behaviours to be assessed under this heading (‘finding out and exploring’, ‘using what they know in play’ or ‘being willing to have a go’) are any more universalising as traits, or any less dependent on context, than the other characteristics discussed. It cannot just be presumed that they are.

Problems with the ‘Early Learning Goals’

At the end of reception each child’s level of development is assessed against the 17 EYFS Profile ‘Early Learning Goals’. In my previous post I discussed the problems with the reliability of this assessment. We also see the problem of construct validity in many of the assumptions within the Early Learning Goals. Some goals are clearly not constructs in their own right, and others may well not be; serious questions need to be asked about whether they are universalising traits or actually context-dependent behaviours.

For example, ELG 2 is ‘understanding’. Understanding is not a generic skill; it is dependent on domain-specific knowledge. True, a child does need to know the meaning of the words ‘how’ and ‘why’, which are highlighted in the assessment, but while understanding is a goal of education it can’t be assessed generically: you have to understand something, and this does not mean you will understand something else. The same is true for ‘being imaginative’ (ELG 17).

An example of evidence of ELG 2, understanding, in the EYFS profile exemplification materials.

Are ELG 1 ‘listening and attention’ or ELG 16 ‘exploring and using media and materials’ actually universalising constructs? I rarely see qualitative and observational early years research that even questions whether these early learning goals are universalising traits, let alone looks seriously at whether they can be assessed. This is despite decades of research in cognitive psychology leading to a settled consensus which challenges many of the unquestioned constructs that underpin EYFS assessment.

It is well known that traits such as understanding, creativity, critical thinking don’t universalise. Why, in early years education, are these bogus forms of assessment not only used uncritically but allowed to dominate the precious time when vulnerable children could be benefiting from valuable teacher attention?

n.b. I have deliberately limited my discussion to a critique using general principles of assessment rather than arguments that would need to be based on experience or practice.


Early years assessment is not reliable or valid and thus not helpful

The academic year my daughter was three she attended two different nursery settings. She took away two quite different EYFS assessments, one from each setting, at the end of the year. The disagreement between these was not a one-off mistake, nor due to incompetence, but inevitable, because EYFS assessment does not meet the basic requirements of effective assessment: that it should be reliable and valid*.

We have well-researched principles to guide educational assessment, and these principles can and should be applied to the ‘Early Years Foundation Stage Profile’. This is the statutory assessment used nationally to assess the learning of children up to the age of 5. The purpose of the EYFS assessment profile is summative:

‘To provide an accurate national data set relating to levels of child development at the end of EYFS’

It is also used to ‘accurately inform parents about their child’s development’. The EYFS profile is not fit for these purposes and its weaknesses are exposed when it is judged using standard principles of assessment design.

EYFS profiles are created by teachers when children are 5 to report on their progress against 17 early learning goals and describe the ‘characteristics of their learning’. The assessment is through teacher observation. The profile guidance stresses that,

‘…to accurately assess these characteristics, practitioners need to observe learning which children have initiated rather than focusing on what children do when prompted.’

Illustration is taken from EYFS assessment exemplification materials for reading

Thus the EYFS Profile exemplification materials for literacy and maths only give examples of assessment through teacher observation while children are engaged in activities they have chosen (child initiated activities). This is a very different approach from the subsequent assessment of children throughout their later schooling, which is based on tests created by adults. The EYFS profile writers no doubt wanted to avoid what Wiliam and Black (1996) call the ‘distortions and undesirable consequences’ created by formal testing.

Reaching valid conclusions in formal testing requires:

  1. Standard conditions – so there is reassurance that all children receive the same level of help
  2. A range of difficulty in the test items – carefully chosen items will discriminate between the proficiency of different children
  3. Careful selection of content from the domain to be covered – to ensure the items are representative enough to allow an inference about the domain (Koretz, pp. 23–28)

The EYFS profile is specifically designed to avoid the distortions created by such restrictions that lead to an artificial test environment very different from the real life situations in which learning will need to be ultimately used. However, as I explain below, in so doing the profile loses necessary reliability to the extent that teacher observations cannot support valid inferences.

This is because when assessing summatively the priority is to create a shared meaning about how pupils will perform beyond school and in comparison with their peers nationally (Koretz 2008). As Wiliam and Black (1996) explain, ‘the considerable distortions and undesirable consequences [of formal testing] are often justified by the need to create consistency of interpretation.’ This is why GCSE exams are not currently sat in authentic contexts with teachers with clipboards (as in EYFS) observing children in attempted simulations of real life contexts. Using teacher observation can be very useful for an individual teacher when assessing formatively (deciding what a child needs to learn next) but the challenges of obtaining a reliable shared meaning nationally that stop observational forms of assessment being used for GCSEs do not just disappear because the children involved are very young.

Problems of reliability

Reliability: Little inconsistency between one measurement and the next (Koretz, 2008)

Assessing child initiated activities and the problem of reliability:

The variation in my daughter’s two assessments was unsurprising given that…

  • Valid summative conclusions require ‘standardised conditions of assessment’ between settings and this is not possible when observing child initiated play.
  • Nor is it possible to even create comparative tasks ranging in difficulty that all the children in one setting will attempt.
  • The teacher cannot be sure their observations effectively identify progress in each separate area as they have to make do with whatever children choose to do.
  • These limitations make it hard to standardise between children even within one setting, and make it unsurprising that the two nurseries built different profiles of my daughter.
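To make ‘inconsistency between one measurement and the next’ concrete, here is a minimal sketch in Python of how agreement between two settings could be quantified. The scores are invented and purely illustrative – this is not how the EYFS Profile is actually scored:

```python
# Hypothetical illustration: two settings each rate the same five children
# on a 1-3 scale (say 'emerging', 'expected', 'exceeding').
# A reliable assessment would produce near-identical ratings; a low
# correlation between the two sets of ratings signals the kind of
# inconsistency described above.

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

setting_a = [1, 2, 2, 3, 1]   # invented scores, for illustration only
setting_b = [2, 1, 3, 2, 2]

print(round(pearson_r(setting_a, setting_b), 2))  # prints 0.0
```

A correlation near 1 would mean the two settings rank children consistently; these invented scores, like my daughter’s two profiles, do not agree at all.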

The EYFS Profile Guide does instruct that practitioners ‘make sure the child has the opportunity to demonstrate what they know, understand and can do’ and does not preclude all adult initiated activities from assessment. However, the exemplification materials only reference child initiated activity and, of course, the guide instructs practitioners that

‘…to accurately assess these characteristics, practitioners need to observe learning which children have initiated rather than focusing on what children do when prompted.’

Illustration from EYFS assessment exemplification materials for writing. Note these do not have examples of assessment from written tasks a teacher has asked children to undertake – ONLY writing voluntarily undertaken by the child during play.

Assessing adult initiated activities and the problem of reliability

Even when some children are engaged in an activity initiated or prompted by an adult

  • The setting cannot ensure the conditions of the activity have been standardised; for example, it isn’t possible to predict how a child will choose to approach a number game set up for them to play.
  • It’s not practically possible to ensure the same task has been given to all children in the same conditions to discriminate meaningfully between them.

Assessment using ‘a range of perspectives’ and the problem of reliability

The EYFS profile handbook suggests that:

‘Accurate assessment will depend on contributions from a range of perspectives…Practitioners should involve children fully in their own assessment by encouraging them to communicate about and review their own learning…. Assessments which don’t include the parents’ contribution give an incomplete picture of the child’s learning and development.’

A parent’s contribution taken from EYFS assessment exemplification materials for number

Given the difficulty one teacher will have observing all aspects of 30 children’s development, it is unsurprising that the profile guide stresses the importance of contributions from others to increase the validity of inferences. However, it is incorrect to claim that the input of the child or of parents will make the assessment more accurate for summative purposes. With this feedback, the conditions, difficulty and specifics of the content will not have been considered, creating unavoidable inconsistency.

Using child-led activities to assess literacy and numeracy and the problem of reliability

The reading assessment for one of my daughters seemed oddly low. The reception teacher explained that while she knew my daughter could read at a higher level, the local authority guidance on the EYFS profile said her judgement must be based on ‘naturalistic’ behaviour. She had to observe my daughter (one of 30) voluntarily going to the book corner, choosing to read aloud to herself at the requisite level and volunteering sensible comments on her reading.

 

Illustration taken from EYFS assessment exemplification materials for reading. Note these do not include examples of assessment from reading a teacher has asked children to undertake – ONLY reading voluntarily undertaken by the child during play.

The determination to prioritise assessment of naturalistic behaviour is understandable when assessing how well a child can interact with their peers. However, the reliability sacrificed in the process can’t be justified when assessing literacy or maths. The success of explicit testing in these areas suggests they do not need the same naturalistic criteria to ensure a valid inference can be made from the assessment.

Are teachers meant to interpret the profile guidance in this way? The profile is unclear, but while the exemplification materials include only examples of naturalistic observational assessment, we are unlikely to acquire accurate assessments of reading, writing and mathematical ability from EYFS profiles.

Five year olds should not sit test papers in formal exam conditions, but this does not mean that observation in naturalistic settings (whether adult or child initiated) is the only reasonable or the most reliable option. The inherent unreliability of observational assessment means results can’t support the inferences required for such summative assessment to be a meaningful exercise. It cannot, as intended, ‘provide an accurate national data set relating to levels of child development at the end of EYFS’ or ‘accurately inform parents about their child’s development’.

In my next post I explore the problems with the validity of our national early years assessment.

 

*n.b. I have deliberately limited my discussion to a critique using assessment theory rather than arguments that would need to be based on experience or practice.

References

Koretz, D. (2008). Measuring Up. Cambridge, Massachusetts: Harvard University Press.

Standards and Testing Agency. (2016). Early Years Foundation Stage Profile. Retrieved from https://www.gov.uk/government/publications/early-years-foundation-stage-profile-handbook

Standards and Testing Agency. (2016). Early Years Foundation Stage Profile: exemplification materials. Retrieved from https://www.gov.uk/government/publications/eyfs-profile-exemplication-materials

Wiliam, D., & Black, P. (1996). Meanings and consequences: a basis for distinguishing formative and summative functions of assessment? British Educational Research Journal, 22(5), 537–548.

The Secret Of My (Maths) Success

I had an interesting chat with a friend recently who’s just done a PGCE as a maths teacher. He’s been trained to build understanding through plenty of problem solving tasks.

The discussion made me reflect on the stark difference between the way I’ve taught maths to my own children at home, with the lion’s share of time spent learning to fluency, and the focus in schools on exercises to build understanding. After all, I reflected, the progress of my children has stunned even me. How is it they missed out on SO much work on understanding while accelerating far ahead of their peers?

It isn’t that I don’t appreciate that children need some degree of understanding of what they are doing. I remember discovering that the reason my friend’s daughter was struggling with maths at the end of Year 1 was that she had failed to grasp the crucial notion of ‘one more’. Her teacher had advised that she needed to learn her number bonds (and indeed she did), but while she did not grasp this basic notion the bonds were gibberish to her. What we call ‘understanding’ does matter (more thoughts here).

I’ve realised the reason I’ve never had to invest significant time in exercises to build understanding: when my children are given a new sort of problem they can already calculate the separate parts of that problem automatically. All their working memory is focused on the only novel element of a procedure, so it is very quickly understood. Understanding is just not a biggy. Identify the knowledge necessary to calculate the component parts of a problem, secure fluency in it, and generally activities for understanding become a (crucial but) small part of the maths diet.

The degree of focus on fluency that my children were given is highly unusual. I have huge piles of exercise books full of years of repeated calculations continued a year, two years, after they were first learned. My children learnt all possible addition and subtraction facts between one and twenty until they were known so well that recall was like remembering your own name. I did the same with multiplication and division facts. There were hours and hours and hours and hours of quite low level recall work.

Generally the focus in schools is the opposite, and this creates a vicious cycle. Children are taught more complex problems when they are not fluent in the constituent parts of the problem. They therefore struggle to complete calculations because their working memory breaks down. The diagnosis is made that children don’t ‘understand’ the problem posed. The cure is yet more work focused on allowing children to understand how the problem should be solved and why. The children may remember this explanation (briefly) but it is too complex to be remembered long term because too many of the constituent elements of the problem are themselves not secure. When the children inevitably forget the explanation, what is the diagnosis? A failure of understanding. Gradually, building ‘understanding’ eats more and more lesson time. Gurus of the maths world deride learning to fluency as ‘rote’ but, perversely, the more time is spent on understanding instead of fluency, the harder it is for children to understand new learning. By comparison my children seem to have a ‘gift that keeps on giving’. Their acceleration isn’t just in the level of maths proficiency they have reached; it is in the capacity they have to learn new maths so much more easily.

Fluency… the gift that keeps on giving.

I’ve not got everything right but I’ve learned so much from teaching my own children including that the same general principle is true of understanding maths and understanding history. If understanding is a struggle it is because necessary prior knowledge is not in place or secure.

Go back – as far as you can get away with.

Diagnose those knowledge gaps.

Teach and secure fluency.

You’ll find understanding is no longer the same challenge.

Research and primary education

I took part in a panel discussion at the national ResearchEd conference yesterday. The subject of the discussion was primary education and I thought I would post the thoughts I shared:

At all levels of schooling classroom research is undoubtedly useful but the process of generalising from this research is fraught with difficulty. As E D Hirsch explains, each classroom context is different.

I think ideally we would like to base educational decisions on

  • Converging evidence from many years of research in numerous fields
  • That integrates both classroom research and lab based work so…
  • We can construct theoretical accounts of underlying causal processes.

These theoretical insights allow us to interpret sometimes contradictory classroom research. We actually have this ideal in the case of research into early reading and the superiority of systematic synthetic phonics. Despite this evidence the vast majority of primary schools ignore or are unaware of the research and continue to teach the ‘multi-cueing’ approach to reading.

While research on phonics is ignored, some lamentably poor research has been enduringly influential in early primary education and treated with a breathtaking lack of criticality. In the 1950s a comparison study of 32 children found that children taught at nursery using teacher-centred methods showed evidence of delinquent behaviour in later life. However, this was a tiny sample and a tiny effect, and no account was taken of the fact that the teacher-led research group had many more boys than the child-centred comparison group – among other fatal flaws. Despite this, that piece of research is STILL continually and uncritically cited. For example, the OECD used this study to support character education. It is also central to the National Audit Office definition of ‘high quality’ early years provision as ‘developmentally appropriate’.

That flawed research features in the literature review for the EPPSE longitudinal study, which has become one of the highest-impact educational research programmes in Europe and whose findings underpin billions of pounds of government spending. EPPSE claims to have demonstrated that high quality pre-school provision is child centred, and to have shown that such provision has an incredible impact on outcomes at age 16. However, merely scratch the surface and you find obvious flaws with EPPSE. The scales used by classroom observers to identify the nature of quality provision lacked validity and actually predefined high quality provision as child centred. The researchers admitted that problems with the control group meant causal connections couldn’t be drawn from the findings, but then ignored this problem, despite the control group issue undermining their key conclusions.

It seems the key principles influencing early years education are too frequently drawn from obviously flawed research. These principles are also the product of misuse of the research we do have. For example, it is statutory to devise activities to build resilience in early years education. However, Angela Duckworth, an international authority, admits that although resilience is a desirable trait we don’t really know for sure how to create it.

What explains the astonishing situation where theoretical research from cognitive psychology is ignored, obviously flawed huge government funded research projects become influential and new pedagogical approaches, based on faulty understanding of the evidence, are made statutory?

A glance along the bookshelves at any primary teacher training institution gives us a clue. There is a rigid child centred and developmentalist orthodoxy among primary educationalists. This explains the lack of rigorous scrutiny of supportive research. In fact, except on social media, sceptical voices are barely heard.

The pseudo-expert

A week or so after our first child’s birth we met our health visitor, Penny. She was possibly in her early sixties and had worked with babies all her life. She was rather forthright in her advice but with the wisdom of 40 years behind her I was always open to her suggestions. Our baby refused to sleep in her first weeks. This meant I was getting one or two hours sleep a night myself and Penny’s reassuring advice kept me going. I can never forget one afternoon when our daughter was about 15 days old and Penny walked into our living room, taking in the situation almost immediately. “Now Heather,” she said, “I’m just going to pop baby in her Moses basket on her front. Don’t worry that she is on her front as you can keep an eye on her and if I roll up this cot blanket (deftly twisted in seconds) and put it under her tummy the pressure will make baby feel more comfortable…” Our daughter fell asleep immediately and Penny left soon after but SIX WHOLE HOURS later our baby was STILL sleeping soundly. She knew the specific risk to our baby from sleeping on her front was negligible and that it might just pull the parents back from the brink. I’m grateful to her for using her professional judgement that day.

Penny’s practical but sometimes controversial wisdom contrasted with the general quality of advice available at weekly baby clinic. Mums who were unable to think of an excuse to queue for Penny were told that ‘each baby was different’ and ‘mum and baby need to find their own way’. The other health visitors did dispense some forms of advice. If your baby wasn’t sleeping you could “try cutting out food types. Some mums swear it’s broccoli that does it” or “you could try homeopathy.” The other health visitors had no time for Penny’s old fashioned belief that mothers could be told how to care for their babies. Instead of sharing acquired wisdom they uncritically passed on to mothers the latest diktats from on high (that seemed to originate from the pressure groups that held most sway over government) and a garbled mish-mash of pseudo-science.

A Twitter conversation today brought back those memories. The early years teacher I was in discussion with bemoaned the lack of proper training for early years practitioners. Fair enough, but what was striking was the examples she gave of the consequential poor practice. Apparently without proper training teachers wouldn’t understand about ‘developmental readiness’, ‘retained reflexes‘ or the mental health problems caused by a ‘too much too soon’ curriculum. The problem is that these examples of expertise to be gained from ‘proper’ training are actually just unproven theory or pseudo-science. The wisdom of the lady in her fifties who has worked for donkey’s years at the little local day nursery is suspect if she is not ‘properly trained’. But trained in what? The modern reluctance to tell others how they should conduct themselves has created a vacuum that must be filled with pseudo-expertise masquerading as wisdom.

How often do teachers feel that they can’t point to their successful track record to prove their worth and instead must advocate shiny ‘initiatives’ based on the latest pastoral or pedagogical fads dressed up as science? The expert is far from always right but I value their wisdom. I also value the considered use of scientific research in education. Too often though these are sidelined and replaced with something far worse.

Reading failure? What reading failure?

“Yes, A level history is all about READING!”

I say it brightly as I dole out extracts from a towering pile of photocopying taken from different texts that will help the class get going with their coursework. I try and ooze reassurance. I cheerily talk about the sense of achievement my students will feel when they have worked their way through these carefully selected texts, chosen to transfer the maximum knowledge in the minimum reading time. I explain this sort of reading is what university study will be all about, while dropping in comforting anecdotes to illustrate it is much more manageable than they think. I make this effort because I NEED them to read lots. The quality of their historical thinking and thus their coursework is utterly dependent upon it.

Who am I kidding? This wad of material is the north face of the Eiger to most of my students. Some have just never read much and haven’t built up the stamina. The vocabulary in those texts (chosen by their teacher for their readability) is challenging and the process will be effortful. For a significant minority in EVERY class the challenge is greater. They don’t read well. Unfamiliar words can’t be guessed and their ability to decode is weak. To read even one of my short texts will take an inordinate time. Such students are bright enough; most students in my class will get an A after all, with some Bs and the odd C. They all read well enough to get through GCSE with good results, and not one of them would have been counted in government measures for weak literacy. According to the statistics, the biggest problem I face day in, day out as I teach A level history simply doesn’t exist. Believe me, it exists, and there is a real human cost to this hidden reading failure.

Take Hannah. She loves history, watches documentaries and beams with pleasure as we discuss Elizabeth I. She even reads historical novels. However, she really struggles to read at any pace, and unfamiliar words are a brick wall. She briefly considered studying history at university, but the reading demands make it impracticable. Her favourite subject can never be her degree choice because her reading is just not good enough. She is not unusual; her story is everywhere.

At this point I am going to hand over my explanation to Kerry Hempenstall, senior lecturer in psychology at RMIT. I include just a few edited highlights from his survey of the VAST research literature on older students’ literacy problems that you can consider for yourself by following the link. He says:

These struggling adolescent readers generally belong to one of two categories: those provided with little or poor early reading instruction, or those possibly provided with good early reading instruction, yet for unknown reasons were unable to acquire reading skills (Roberts, Torgesen, Boardman, & Sammacca, 2008)…

Hempenstall outlines the problems with the ways reading is currently taught:

…Under the meaning centred approach to reading development, there is no systematic attention to ensuring children develop the alphabetic principle. Decoding is viewed as only one of several means of ascertaining the identity of a word – and it is denigrated as being the least effective identification method (behind contextual cues). In the early school years, books usually employ highly predictable language and usually offer pictures to aid word identification. This combination can provide an appearance of early literacy progress. The hope in this approach is that this form of multi-cue reading will beget skilled reading.

However, the problem of decoding unfamiliar words is merely postponed by such attractive crutches. It is anticipated in the meaning centred approach that a self-directed attention to word similarities will provide a generative strategy for these students. However, such expectations are all too frequently dashed – for many at-risk children progress comes to an abrupt halt around Year 3 or 4 when an overwhelming number of unfamiliar (in written form) words are rapidly introduced…

  a) New content-area vocabulary words do not pre-exist in their listening vocabularies. They can guess ‘wagon’. But they can’t guess ‘circumnavigation’ or ‘chlorophyll’ based on context (semantics, syntax, or schema); these words are not in their listening vocabularies.
  b) When all of the words readers never learned to decode in grades one to four are added to all the textbook vocabulary words that don’t pre-exist in readers’ listening vocabularies, the percentage of unknown words teeters over the brink; the text now contains so many unknown words that there’s no way to get the sense of the sentence.
  c) Text becomes more syntactically embedded, and comprehension disintegrates. Simple English sentences can be stuffed full of prepositional phrases, dependent clauses, and compoundings. Eventually, there’s so much language woven into a sentence that readers lose meaning. When syntactically embedded sentences crop up in science and social studies texts, many can’t comprehend. (Greene, J.F. 1998)

…In a study of 3000 Australian students, 30% of 9 year olds still hadn’t mastered letter sounds, arguably the most basic phonic skill. A similar proportion of children entering high school continue to display confusion between names and sounds. Over 72% of children entering high school were unable to read phonetically regular 3 and 4 syllabic words. Contrast with official figures: In 2001 the Australian public was assured that ‘only’ about 19% of grade 3 (age 9) children failed to meet the national standards. (Harrison, B. 2002) [Follow the link if you want to read all the research listed.]

Hempenstall outlines the research showing that the effects of weak reading become magnified with time:

“Stanovich (1986) uses the label Matthew Effects (after the Gospel according to St. Matthew) to describe how, in reading, the rich get richer and the poor get poorer. Children with a good understanding of how words are composed of sounds (phonemic awareness) are well placed to make sense of our alphabetic system. Their rapid development of spelling-to-sound correspondences allows the development of independent reading, high levels of practice, and the subsequent fluency which is critical for comprehension and enjoyment of reading. There is evidence (Stanovich, 1988) that vocabulary development from about Year 3 is largely a function of volume of reading. Nagy and Anderson (1984) estimate that, in school, struggling readers may read around 100,000 words per year while for keen mid-primary students the figure may be closer to 10,000,000, that is, a 100 fold difference. For out of school reading, Fielding, Wilson and Anderson (1986) suggested a similar ratio in indicating that children at the 10th percentile of reading ability in their Year 5 sample read about 50,000 words per year out of school, while those at the 90th percentile read about 4,500,000 words per year”…
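The scale of the gap Hempenstall describes is easy to verify from the figures quoted. A quick sketch (the numbers are simply those cited by Nagy and Anderson and by Fielding, Wilson and Anderson in the quotation above):

```python
# Annual reading-volume figures quoted above (Nagy & Anderson 1984;
# Fielding, Wilson & Anderson 1986).
in_school = {"struggling reader": 100_000, "keen reader": 10_000_000}
out_of_school = {"10th percentile": 50_000, "90th percentile": 4_500_000}

# Ratio of words read per year by the strongest vs weakest readers.
in_school_ratio = in_school["keen reader"] / in_school["struggling reader"]
out_of_school_ratio = out_of_school["90th percentile"] / out_of_school["10th percentile"]

print(f"In school: {in_school_ratio:.0f}-fold difference")        # 100-fold
print(f"Out of school: {out_of_school_ratio:.0f}-fold difference")  # 90-fold
```

The 90-fold out-of-school gap is indeed ‘a similar ratio’ to the 100-fold in-school gap, just as the quotation says: the practice gap compounds in both settings.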

Hempenstall explains just why it is crucial to spot problems with phonics in year 1:

The probability that a child who was initially a poor reader in first grade would be classified as a poor reader in the fourth grade was a depressingly high +0.88 (Juel, C., 1988).

If children have not grasped the basics of reading and writing, listening and speaking by Year Three, they will probably be disadvantaged for the rest of their lives. Australian Government House of Representatives Enquiry (1993). The Literacy Challenge. Canberra: Australian Printing Office.

“Unless these children receive the appropriate instruction, over 70 percent of the children entering first grade who are at risk for reading failure will continue to have reading problems into adulthood”. Lyon, G.R. (2001).

[The research literature for this finding is enormous – do follow link if interested]

A study by Schiffman provides support for monitoring programs for reading disabilities in the first and second grades. In a large-scale study of reading disabilities (n = 10,000):

82% of those diagnosed in Grades 1 or 2 were brought up to grade level.

46% of those diagnosed in Grade 3 were brought up to grade level.

42% of those diagnosed in Grade 4 were brought up to grade level.

10-15% of those diagnosed in Grades 5-7 were brought up to grade level.

Berninger, V.W, Thalberg, S.P., DeBruyn, I., & Smith, R. (1987). Preventing reading disabilities by assessing and remediating phonemic skills. School Psychology Review, 16, 554-565.

Hempenstall lists research on what it is that causes such problems for struggling readers:

“The vast majority of school-age struggling readers experience word-level reading difficulties (Fletcher et al., 2002; Torgesen, 2002). This “bottleneck” at the word level is thought to be particularly disruptive because it not only impacts word identification but also other aspects of reading, including fluency and comprehension (LaBerge & Samuels, 1974). According to Torgesen (2002), one of the most important discoveries about reading difficulties over the past 20 years is the relationship found between phonological processing and word-level reading. Most students with reading problems, both those who are diagnosed with dyslexia and those who are characterized as “garden variety” poor readers, have phonological processing difficulties that underlie their word reading problems (Stanovich, 1988)” (p.179). [Do follow link for more]

To debate just how many children are functionally illiterate, and condemn Nicky Morgan for apparent exaggeration, entirely misses the point. Reading failure is endemic. I would estimate that about a third of my A level students have noticeable issues with word-level reading that significantly impact upon their progress in history at A level. Reading failure is one of the biggest obstacles I face in my teaching and I have every reason to comment on the issue. And that is without counting all those students who chose not to attempt A level history at all because they knew it meant lots of reading. At secondary school we should be giving students more complex texts to build their vocabularies and reading stamina. However, the research is pretty clear about when difficulties need to be identified if children are to overcome them: way back in year 1. The research is also pretty clear about what it is that struggling readers lack: a grasp of the alphabetic principle that they are able to apply fluently when reading. Given this, the opposition to the year 1 phonics check is hard to justify. We know so much now about effective reading instruction but it can only be used to help children if teachers are willing to adjust their practices. While around 90% of primary schools continue to focus on ‘mixed methods’ (guessing from cues rather than sounding out) that limit children’s chances of acquiring the alphabetic principle essential for successful reading, nothing will change.

No one questions what they want to believe: the problem with EPPSE

The EPPSE is a very large and enormously influential study commissioned by the DfE to find out what types of preschool provision and early experiences are most effective. It followed 3,000+ children from the age of 3 to 16 and reached some very significant conclusions, which I would question.

The research team was from the Institute of Education, Birkbeck and Oxford and as the Institute of Education blog explains:

The EPPSE project… has become one of the highest impact educational research programmes in Europe… EPPSE’s findings underpin billions in Government spending on nursery expansion, including the Sure Start programme, the extension of free pre-school to all three and four-year-olds in 2010 and, this year, to the poorest 40% of two-year-olds… EPPSE’s evidence documenting excellent pre-school education and its ongoing benefits, especially for the most deprived children, has fed heavily into England’s early childhood curriculum and informed curricula in countries as diverse as Australia, China and Brazil. Nursery World editor Liz Roberts has noted “how highly regarded the Early Years Foundation Stage is around the world”.

The EPPSE project findings are stunning:

Attending any pre-school, compared to none, predicted higher total GCSE scores, higher grades in GCSE English and maths, and the likelihood of achieving 5 or more GCSEs at grade A*-C. The more months students had spent in pre-school, the greater the impact on total GCSE scores and grades in English and maths… the equivalent of getting seven B grades at GCSE, rather than seven C grades.

The EPPSE project also found that:

There was some evidence of statistically significant continuing pre-school effects on social behavioural outcomes at age 16 but these were weaker than at younger ages. Having attended a high quality pre-school predicted better social-behavioural outcomes in the longer term, though the effects were small.

The IOE blog gushes that EPPSE:

…brought together a rare combination: research funded by Government with a genuinely open mind, carried out by excellent and dedicated academics savvy enough to work with and influence politicians of all stripes…And thanks to the detailed work that began 17 years ago at the IOE, we also know what excellent nursery provision looks like.

Hold on a moment! It seems these researchers did not have an open mind. I have previously blogged about the fact that the EPPE (the acronym before the study moved on to secondary school outcomes) studied quality using a measure, ECERS-R, which had a predefined scale based on prejudged measures of quality. Having read through much of the voluminous literature, there is so much I could discuss about the EPPE findings, but in this post I will focus on another claim in the IOE blog: that this study has rigour. I am really not sure it does.

That is a big accusation to make against such a large and influential study conducted by highly regarded academics but it seems to have a fundamental problem that strikes at the heart of the validity of its findings.

The problem of the EPPE/EPPSE control group.

In their report at the end of KS1 (when the study children were 7 years old) the researchers acknowledge that there were problems with the control group. Because in England the vast majority of children attend a preschool it was not possible to find a representative sample of those children who didn’t:

The ‘home’ control group are from significantly disadvantaged backgrounds when compared with the sample as a whole, with most mothers having less than a GCSE qualification (p11).

On p28 the report explains that:

…comparison of the ‘home’ sample (the control who did not attend pre-school) with children who attended a pre-school centre showed that both the characteristics and attainments of home children vary significantly from those who had been in pre-school. It is not possible to conclude with any certainty that the much lower attainments of the ‘home’ group are directly due to lack of pre-school experience.’

The writers go on to talk positively about how they have used ‘contextualised multilevel analysis’ to try and compensate for the unrepresentative nature of the control sample and they feel this means their results are worth considering but, for example, they admit that when making judgements about the impact of longer duration of pre-schooling on higher cognitive outcomes by comparing with the ‘home’ group:

“causal connections cannot be drawn”

The problems with the control are not mentioned in the overall findings, but in 2004 they are acknowledged in the body of the report. The 2004 report also attempts to show that variation in pre-school quality and in duration of attendance can have an impact; these findings are on firmer ground because they don’t have to use the problematic control.

It seems obvious that children coming from very disadvantaged homes, as the ‘home’ control group largely do, may well benefit from pre-school in ways most children wouldn’t. Therefore, despite contextualised multilevel analysis to take account of all other variables such as SES, there will always be problems using this control to reach firm conclusions on the impact of pre-schooling on the whole population.

Fast forward to 2014 and the final reports on the children at age 16. I have looked through all the reports. It is clear from the tables included that all the startlingly good educational findings rest on comparisons with this control group. However, I can find NO MENTION AT ALL of the sorts of problems with the control that the researchers were willing to acknowledge in 2004. It is as if the control group issue just wafted away. It seems that such a large and important study, ‘one of the highest impact educational research programmes in Europe’, had no need to concern itself with pesky issues like that annoyingly poor control upon which the whole vast edifice that is EPPSE rests.

How can it be that in 2004 the problem of the unrepresentative control meant many findings on the impact of preschool were tentative but in 2014 the issue of the control has totally disappeared? It is as if it never existed. Given the startlingly strong impact ANY form of preschool is claimed to have the findings need to be robust if they are to be used to make policy.

The EPPSE claims to be ‘proper’ research, not the sort of stuff that gives education research a bad name. It is also enormously influential and directly used in government policy making. I can understand the researchers, invested as they are, claiming their conclusions have validity but where is the scrutiny? What is happening at peer review? If anything highlights the unhealthiness of the rigid orthodoxy in education departments, especially in early years research, it is this EPPSE study. Once again no one seems to question what they want to believe.

If you found this interesting you may also want to read these posts:

https://heatherfblog.wordpress.com/2015/07/09/a-truism-that-needs-questioning/

https://heatherfblog.wordpress.com/2015/06/01/the-hydra/


A truism that needs questioning: The importance of ‘high quality’ preschool education.

It is a truth universally acknowledged that young children, especially those not in possession of a good middle class upbringing, must be in need of ‘high quality preschool provision’. The phrase is on every politician’s lips. David Cameron is clear about this. Nicky Morgan, Tristram Hunt and Liz Kendall are sure it will create a skilled workforce of the future and Barack Obama has pumped countless dollars into ‘high quality’ preschool programmes in the belief that research shows that ‘high quality’ provision is the key to better life outcomes.

You might be surprised to learn that ‘high quality’ has a very specific meaning that goes well beyond the common sense idea that some preschools must be better run than others. The National Audit Office commissioned a summary of the evidence on the impact of early years’ provision in which they explained that “In pre-school education (3+ years), quality is most often associated with the concept of developmentally appropriate practice”.

The English Early Years Foundation Stage statutory framework explains what is meant by ‘developmentally appropriate’ (i.e. high quality) practice for 0-5 year olds:

“Each area of learning and development must be implemented through planned, purposeful play and through a mix of adult-led and child-initiated activity.”

Everybody believes young children should play lots and can learn while playing. In England, however, high quality provision does not just mean giving young children time to play; it makes it statutory that the bulk of any learning must be through child-initiated play. As the statutory framework explains:

“Children learn by leading their own play, and by taking part in play which is guided by adults.”

To be clear, if I want my four year old to learn to wash himself I could:

  1. Instruct him directly but that would be bad practice under the EYFS framework for the majority of learning goals (not really a high quality approach).
  2. I could play a game with him that involves washing. That would be ‘adult led’ play and only acceptable some of the time.
  3. Finally I could try and engineer a situation where my child is likely to want to play at washing himself (an ‘enabling environment’) and I should offer gentle nudges to ‘enrich’ his play in the right direction. That is ‘child initiated’ learning and is at the heart of what is meant by ‘high quality’ preschool practice.

Child-initiated play is prioritised because it is believed it will facilitate the central goal of ‘high quality’ pre-schooling: character development. For example, the ‘guiding principles’ of the English EYFS statutory framework are a series of dispositions. Children should become resilient, capable, confident and self-assured, strong and independent. This is what is meant by the phrase ‘educating the whole child’.

What is the basis for this widely held view of ‘high quality’ pre-schooling?

For me this statutory definition of high quality pre-schooling was problematic for a number of reasons.

1. I’ve looked into character education and there seems to be a limited basis for the belief that the dispositions and skills which are the goals of this form of pre-schooling can be taught, and, even if they can be inculcated, no real basis for the idea that child-initiated learning is the way to do so. For example, it is statutory in English preschools to devise activities to build resilience. Angela Duckworth is viewed as an international authority on ‘grit’ but she admits that although it is a desirable trait we don’t really know for sure how to create it!

2. I taught my own young children to read, do maths, swim, wash, dress. They learnt maths to a high level without my engineering ‘enabling environments’ for child-initiated learning.

3. This belief that high quality preschools are child-centred and ‘developmentally appropriate’ flies in the face of the enormous American state-sponsored Project Follow Through, which found direct instruction pre-schooling delivered far greater cognitive gains than child-centred approaches.

4. The research by cognitive psychologists is pretty damning of the idea that developmentally appropriate practice is desirable.

A report by the National Audit Office on the evidence for the impact of pre-schooling suggests the evidence base for the widely cited definition of ‘high quality’ is a small handful of very old and tiny studies, particularly one I have already written about, the highly flawed High/Scope Perry study which didn’t even find any long term cognitive benefits.

I thought there had to be a firmer basis for what amounts to an international education policy. I investigated further and did find lots of studies looking at the effect of pre-schooling on outcomes, but it is hard to find any of the sort policy makers would be interested in that provide a basis for how ‘high quality’ has been defined.

There is one very well-known and significant study that purports to do so. It is the ‘Effective Provision of Pre-School Education Project’ (EPPE), a large longitudinal study involving 3000 children and sponsored by the DfES. One of its aims was to identify the characteristics of an effective pre-school setting. The study involved careful classroom observation particularly with the most widely used measure of preschool classroom quality, the Early Childhood Environment Rating Scale (ECERS-R). The EPPE report explains the use of this ECERS-R measure:

“Matters of pedagogy are very much to the fore in ECERS-R. For example, the sub scale Organisation and Routine has an item ‘Schedule’ that gives high ratings to a balance between adult initiated and child initiated activities. In order to score a 5 the centre must have a balance between adult initiated and child initiated activities.”

Hold on, surely not? This very large government funded longitudinal study is aiming to identify high quality practice using a rating system which predefines what is meant by high quality! The ECERS-R rating system was developed in the late 1970s and is used extensively around the world to judge preschool quality. I spent some time looking for the evidence base for its assumptions. I found that the quality ratings were compiled by one of the creators, using her teaching experience. There has been some criticism of the ECERS-R. Gordon et al write:

“The ECERS and ECERS-R reflect the early childhood education field’s concept of developmentally appropriate practice, which includes a predominance of child initiated activities…a ‘whole child’ approach that integrates physical, emotional, social and cognitive development…there is surprisingly little empirical evidence of the validity of the ECERS-R instrument using item response models.”

Gordon et al explain that there is a fundamental problem with the ECERS-R scoring system because statements that allow higher scores (indicators) are only counted if indicators of lower scores are met. However, the scales ‘mix dimensions’. I’ll explain. One of the scales a preschool is judged along, the ECERS10, includes indicators of nutrition (food served is of unacceptable nutritional value), caregiver-child interactions (non-punitive atmosphere during meals), language (meals and snacks are times for conversation) and sanitation, among others! If the food is of unacceptable nutritional value the scorer cannot even judge items higher up the scale, even though they are really unrelated! Unsurprisingly, researchers found ‘the category ordering assumed by the scale’s developers is not consistently evident.’ Interestingly, they also found few associations between ECERS-R and child outcomes and they suggest ‘small correlations may be attributable, in part, to the low validity of the measure itself’.

So the best recent research in England on what is meant by ‘high quality’ preschool education, the EPPE longitudinal study, uses a measure which predefines quality. This measure has been widely used to define high quality but is based on a teacher’s observations and has questionable correlation with outcomes, unsurprising when you consider the scales mix dimensions. Finally the very best evidence the National Audit Office could find to justify the ‘developmentally appropriate’ definition of high quality was a tiny, highly flawed study from 50 years ago.

I don’t suppose politicians have any idea that they are endorsing a very particular ‘child centred/developmentally appropriate’ form of early education when they herald ‘high quality early education’ as the panacea for society’s ills, or that there is little justification for actively endorsing this particular approach, let alone making it statutory. In fact, whatever might be written about the findings of the EPPE study, the actual statistics endorse something much more like direct instruction. I talk more about the problems with EPPE/EPPSE here.

Other relevant posts on:

What a child initiated education looks like

The view of early years’ educationalists on direct teaching

The Hydra Part 2


or ‘Weikart and Schweinhart’s [Perry] High/Scope Preschool Curriculum Comparison Study Through Age 23’ and Lifetime Effects: The High/Scope Perry Preschool Study Through Age 40 (2005). For details of these studies see Part 1.

This post begins with the story of two preschool approaches and their fates. One approach was ‘teacher led’ Direct Instruction and the other ‘child led’ High/Scope Perry preschool, already discussed at length in my previous post. As the NIFDI website explains:

Beginning in 1968, an enormous educational experiment, Project Follow Through, compared these two approaches. It was the most extensive educational experiment ever conducted. Under the sponsorship of the American federal government, it was charged with determining the best way of teaching at-risk children from kindergarten through grade 3. Over 200,000 children in 178 communities were included in the study, and 22 different models of instruction were compared.

[Chart: Project Follow Through results by instructional model]

Evaluation of the project occurred in 1977, nine years after the project began. The results were strong and clear. Students who received Direct Instruction had significantly higher academic achievement than students in any of the other programs. They also had higher self esteem and self-confidence. No other program had results that approached the positive impact of Direct Instruction. Subsequent research found that the DI students continued to outperform their peers and were more likely to finish high school and pursue higher education. The Perry High/Scope approach is on the graph above. It is the ‘Cognitive Curriculum’ approach (second from the end). You can see it wasn’t quite as successful…

So which preschool method was the winner? The answer might seem obvious from the table above but you would be mistaken. The approach that now dominates is the High/Scope Perry preschool approach – and this is to some extent on the basis of two very small and problematic studies.

In my last blog I outlined the problems with these two small studies from 40-50 years ago by Schweinhart and Weikart that examined the impact of their Perry High/Scope preschool methods on the participants into adulthood. I began to explain the staggering influence over education policy these two studies have had. To find out the details of these two studies click back to my last blog, but to give you the gist, here is Kozloff’s summary of some of the problems with the way these studies have been interpreted:

What is “…just plain bizarre” is that [Schweinhart and Weikart] barely entertain the possibility that: (1) a dozen years of school experience; (2) area of residence; (3) family background; (4) the influence of gangs; and (5) differential economic opportunity, had anything to do with adolescent development and adult behavior.

In this post I will look at the impact of these studies on early years education.

First: These studies have been crucial in building a case for the importance of preschool education in the early years of childhood.

The National Audit Office commissioned a summary of the evidence on the impact of early years’ provision on young children with emphasis given to children from disadvantaged backgrounds in 2004. Such a paper offers a good review of the key research literature that has been influencing public policy.

The evidence of the biggest ever educational experiment “Project Follow Through” does not feature but both tiny Perry preschool studies feature heavily in the report. The second study of 123 subjects is one of six randomised controlled trials cited to provide evidence of the effectiveness of preschool programmes for disadvantaged children. These trials were all small scale and to an extent have contradictory findings. Some found reductions in antisocial behaviour but not academic gains and others had the opposite findings. None of the trials seemed to offer better evidence than the problematic second Perry study of 123 subjects.

The report then goes on to look at the evidence of preschool having an impact on the general population, rather than studies that only focus on those children that are highly disadvantaged. Perhaps it would be reasonable to argue that the majority of this research had positive findings, either for social development, academic development or both. However, there were still many contradictory findings and what seemed to be quite low effect sizes. Often, the preschool methods examined provide limited academic advantage but show positive social effects later in life. Having looked at the literature, I do begin to wonder if the commenter on this American website has a point:

“As the authors note, it is indeed quite a puzzle how pre-school education could possibly not show positive effects during schooling, yet have dramatically positive effects in adulthood. But if you look at the studies that find no effect during early schooling, you’ll find them very dense with objective facts such as testing results, etc. But if you look at those handful of studies purporting to show dramatic adulthood outcomes, you don’t find a lot of such data. In fact these latter studies aren’t scientific – they’re advocacy. They didn’t come up with rigorously selected criteria prior to pre-school to evaluate the outcomes, but instead retrospectively identified metrics to compare the control groups, leaving much room for post-hoc cherry-picking. The most parsimonious explanation of the paradox is that the adulthood-effects studies are flawed, and aren’t actually showing any real positive outcomes from pre-school.”

I’d need to do much more research to comment further. What is very interesting to me is that there is no doubt the much publicised benefits of preschool education are built on shakier foundations than advocates would like policy makers to think. If you are interested in forming an opinion, this post, this and this post and this riposte are a great starting point.

There is a very concerning reliance on the Schweinhart and Weikart Perry preschool studies in the National Audit Office report.

1. The shockingly ‘dodgy’ first study (see my previous post) is relied upon to define ‘high quality’ child care.

I’ll explain further. One noticeable feature of the research on preschool effectiveness is the reliance on the idea that the reason some studies showed no effect was because the preschool programme was not ‘high quality’. On one level that is sensible as there must be huge variation in the quality of preschool provision but it is also a way of arguing that we should dismiss all studies with weak or no effects, presuming they are not ‘high quality’. I had noticed that the term ‘high quality’ is used frequently in the literature and repeatedly by policy makers and so I was interested to see how researchers had reached a decision on what constituted ‘high quality’. This is what the National Audit Office report had to say (I’ll highlight the key passage but thought I should include the full extract):

“In pre-school education (3+ years), quality is most often associated with the concept of developmentally appropriate practice. Bryant et al. (1994) report on several studies that illustrate the relationship between developmentally appropriate practice and child outcomes. The High/Scope study [Schweinhart’s and Weikart’s] shows that children who attend a developmentally appropriate, child-centred programme are better adjusted socially than similar children who attend a teacher-directed programme implementing a direct-instruction curriculum (Schweinhart, Weikart, and Larner 1986). In North Carolina, Bryant, Peisner-Feinberg, and Clifford (1993) found that children’s communication and language development were positively associated with appropriate care giving. Burts et al. (1992) and Hart & Todd (1995) show that children’s attendance in developmentally appropriate kindergartens is associated with fewer stress behaviours. Educational content is also important for this age group. Jowett & Sylva (1986) found nursery education graduates did better in primary school than playgroup graduates, suggesting the value of an educationally orientated pre-school. The research demonstrates that the following aspects of pre-school quality are most important for enhancing children’s development: well-trained staff who are committed to their work with children, facilities that are safe and sanitary and accessible to parents, ratios and group sizes that allow staff to interact appropriately with children, supervision that maintains consistency, staff development that ensures continuity, stability and improving quality, and a developmentally appropriate curriculum with educational content.”

  • Oh my goodness! How can the study referred to possibly support the weight being placed on it (see first half of my previous post)? The National Audit Office report writer considers it a central plank in research used to define ‘high quality’ child care when it is hopelessly flawed.

 

  • Not only this, there is good research demonstrating the academic advantage of preschool approaches that directly contradict the Perry preschool methods. Why does the definition of ‘high quality’ exclude these successful alternatives? The Perry Preschool method endorsed ‘developmentally appropriate practice’, meaning a ‘child led’ experiential curriculum rather than teacher-led instruction. The National Audit Office report actually mentions the success of a strongly teacher-led approach, used widely in France:

“Studies of children in the French Ecoles Maternelle programme (Bergmann 1996) show that this programme enhances performance in the school system for children from all social classes and that the earlier the children entered the pre-school program, the better their outcome.”

The early results of project Follow Through (with 9000 children assigned to nine early childhood curricula) showed that disadvantaged children who received Direct Instruction (anathema to those advocating child led approaches) went from the 20th to about the 50th percentile on the Metropolitan Achievement Test. Children who received the Perry High/Scope curriculum did not do as well. They fell from the 20th percentile to the 11th percentile.
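Those percentile movements can be translated into effect sizes, which makes the contrast starker. A minimal sketch, under my own assumption (not the studies’) that Metropolitan Achievement Test percentiles map onto a normal distribution:

```python
from statistics import NormalDist

nd = NormalDist()  # standard normal

# z-scores implied by the percentiles quoted above
z_start = nd.inv_cdf(0.20)  # starting point, 20th percentile: about -0.84
z_di = nd.inv_cdf(0.50)     # Direct Instruction endpoint, 50th percentile: 0.0
z_hs = nd.inv_cdf(0.11)     # High/Scope endpoint, 11th percentile: about -1.23

effect_di = z_di - z_start  # roughly +0.84 standard deviations
effect_hs = z_hs - z_start  # roughly -0.39 standard deviations
```

On this reading, the Direct Instruction children gained the best part of a standard deviation while the High/Scope children lost ground.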

2. It is a concern that the second Schweinhart and Weikart study is used to demonstrate the cost effectiveness of preschool education in the National Audit Office report.

“The Perry Pre-school Project is the most cited study in this area and its benefits are well reported (e.g. Schweinhart et al. 1993). The cost-benefit analysis for this project (Barnett 1996) is worthy of some consideration as its findings are extensively used for justifying expenditure on pre-school education and care.”

This is troubling given the small size and context of the original study. It can’t possibly support these inferences.

Second: This research on impact of education in children’s early years has been used to justify our statutory Early Years Foundation Stage.

The EPPE study was an enormously significant longitudinal study funded by the DfES from 1997 – 2004, with findings in support of our ‘child led’ EYFS curriculum. It actually cites the first, highly flawed Schweinhart and Weikart study where, among other problems, findings were the result of changes in the outcomes of two or three people. This is what it says of a study that should never have been taken seriously:

“Previous Research on the Effectiveness of Pre-School Education and Care: The vast majority of longitudinal research on early education has been carried out in the U.S. Two of the studies cited most often are the Abecedarian Project and the Perry Pre-school Programme (Ramey & Ramey, 1998; Schweinhart and Weikart, 1997). Both used randomised control trial methods to demonstrate the lasting effects of high quality early intervention. These landmark studies, begun in the 1970s, have been followed by further small scale ‘experiments’ (see the Early Headstart, Love et al., 2001) and larger cohort studies (see Brooks-Gunn, 2003; Melhuish, 2004a, for reviews). This huge body of literature points to the many positive effects of centre-based care and education.”

The EPPE refers to the first Perry study as having been ‘admired for decades for its internal validity.’ I am not sure the writer has actually looked at the study. Amusingly, the EPPE report writer seems bemused when their own findings contradict those of the Perry study:

“ It appears therefore that the beneficial impact of pre-school on cognitive attainment is more long lasting than that on social behaviour. Social/behavioural outcomes may be more influenced than cognitive outcomes by the primary school peer group. Still this finding is at odds with the Perry Pre-school study, which indicated that the social outcomes of pre-school were more salient than the cognitive ones by adolescence. Data from the EPPE continuation study, will follow children into adolescence, to shed light on this.”

The influence the Perry study has had on understanding of what might be meant by ‘high quality’ provision is quite concerning given these contradictory findings.  Our national EYFS curriculum presumes ‘high quality’ provision means:

“…all areas of learning [are] to be delivered through planned, purposeful play, with a balance of adult-led and child-initiated activities… Professionals should therefore adopt a flexible, fluid approach to teaching, based on the level of development of each child. Research also confirms that the quality of teaching is a key factor in a child’s learning in the early years. High quality teaching entails high levels of interaction, warmth, trust, creativity and sensitivity. Practitioners and children work together to clarify an idea, solve a problem, express opinions and develop narratives.”

However, this is problematic:

1. We know the first, flawed Schweinhart and Weikart Perry study contributed to this idea of ‘high quality’ provision, but it did not actually improve academic outcomes for its young participants. Evidence such as Follow Through (from the same era), which did improve academic outcomes, is not even mentioned. The Perry programme was viewed as worthwhile because it was believed the intervention limited adult anti-social behaviour.

“The Perry program initially boosted IQs. However, this effect faded within a few years after the end of the two-year program, with no statistically significant effect remaining for males, and only a borderline significant effect remaining for females.”

2. I believe good parenting makes a difference to children and so, although the studies seem unconvincing, it is not outside the realms of possibility that the committed teachers involved in the second Schweinhart and Weikart study had some positive impact on their pupils. They were a highly committed team of extremely well qualified teachers. They must have involved themselves deeply in the lives of their very disadvantaged, very low IQ pupils, given that they made 90-minute visits to their homes every week as well as teaching them. Such a scheme is hardly replicable on any wider scale, though.

Also, given the range of possible benefits the children experienced, it is quite a leap to pinpoint the child-led learning as a crucial factor and to suggest it provides a model of ‘high quality’ pre-schooling for all children today.

However, some sort of model is exactly what the Perry preschool at Ypsilanti, Michigan, seems to have become.

3. The use of these studies to justify ‘developmentally appropriate’, ‘child led’ practices is concerning. If children are not highly disadvantaged or at any great risk of engaging in felonies (most children), it is hard to see how these studies can justify the use of such approaches, particularly given the success of other methods. [The results of project Follow Through (with 9000 children assigned to nine early childhood curricula) showed that all groups made greater academic gains than those using the Perry High/Scope approach.]

Some of the many subsequent studies on the effectiveness of preschool education record academic gains and others don’t. Some record social gains and others don’t. However, partly thanks to the Schweinhart and Weikart studies, the importance of ‘high quality’ preschool education is a mantra repeated by all politicians. The Perry approach has become a model for ‘high quality’ preschool education around the world and currently about 30 percent of all Head Start centres in America offer a version of the Perry curriculum (ICPSR 2010). For those, like me, concerned that ‘child-led’ approaches are not the most efficacious, the impact of these studies is an enormous cause for concern.

So why are two small, context-dependent, flawed studies still widely cited? Why are the results of the largest ever educational experiment from the same era ignored in early years research literature? There is only one possible explanation. The small studies said what educationalists wanted to hear: that a child led curriculum could be proved to affect life outcomes. The enormous Project Follow Through had more uncomfortable findings, so it was ignored.

I examine the evidence base for what is deemed ‘high quality’ pre-school education here: https://heatherfblog.wordpress.com/2015/07/09/a-truism-that-needs-questioning/

 

The Hydra

or ‘Weikart and Schweinhart’s [Perry] High/Scope Preschool Curriculum Comparison Study Through Age 23’ and ‘Lifetime Effects: The HighScope Perry Preschool Study Through Age 40’ (2005)

A few years ago I read a serious book on early reading instruction that stated, without question, that there was strong scientific evidence that direct instruction methods, used even at nursery level, lead to increased criminality among adults. It seemed implausible. How could the teaching style of your nursery school, in some cases over only one year, have such a significant impact that it could be measurable 20 years later? However, I then came across the claim in numerous respectable publications. Greg Ashman drew my attention to another in Psychology Today recently. Initially I felt I had to accept the claims, but decided to investigate properly first. I soon discovered that all these claims pretty much originate

‘… from about half a dozen articles in the series written by Schweinhart and Weikart, in which they claim to compare (and apparently believe that they demonstrate the superiority of) their High/Scope pre-school curriculum with (1) a preschool that used Direct Instruction in reading and language for about an hour a day for one year, and (2) a traditional nursery school.’ [Kozloff, DI creates felons, but literate ones. Contribution to the DI Listserve, University of Oregon, 31 December 2011.]

In this post I will explain how I discovered that it is no exaggeration to state that the influence of this and another related study on education policy around the world has been enormous. You will find this study as a central plank of evidence in swathes of papers on a diverse range of education related issues.

What was the approach of the apparently markedly superior Perry High/Scope programme?

The curriculum was based on the principle of active participatory learning, in which children and adults are treated as equal partners in the learning process, and children engage with objects, people, events, and ideas. Abilities to plan, execute, and evaluate tasks were fostered, as were social skills, including cooperation with others and resolution of interpersonal conflicts. The Perry curriculum has been interpreted as implementing the theories of Vygotsky (1986) in teaching self-control and sociability. [Heckman et al 2013]

What was the quality of the research in this study?

I soon discovered that to say this Schweinhart and Weikart study is problematic is an almighty understatement – as outlined by Bereiter, Kozloff (thanks to Greg Ashman for this) and Engelmann.

    • In this study 18 subjects had been taught using Direct Instruction at their preschool for about an hour a day, 14 had been taught using High/Scope child-led methods, and 16 had attended a ‘traditional’ preschool. All were low IQ.
    • These were about a third of the original subjects, the ones who could be traced 20 years later.
    • Far more of the adults who had been in the Direct Instruction preschool had been reared by single, working mothers whose income was about half that of households in the High/Scope group.
    • Eight of the 18 original Direct Instruction group had only one year of preschool while all the High/Scope subjects had two years of preschool.
    • Their gender balance was greatly different, with nearly two thirds of the High/Scope group female. This was not accounted for in the study.
    • Attendance at the local high school was 84% for the DI group and 64% for the High/Scope group, and again this was not considered.
    • Differences between groups actually amount to differences in the activities of only one or two persons.
    • The early results of project Follow Through (with 9000 children assigned to nine early childhood curricula) showed that disadvantaged children who received Direct Instruction went from the 20th to about the 50th percentile on the Metropolitan Achievement Test. Children who received the Perry High/Scope curriculum did not do as well. They fell from the 20th percentile to the 11th.
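The ‘one or two persons’ point is worth making concrete. With groups of 18 and 14 (the sizes reported above), reclassifying a couple of individuals swings the headline percentages dramatically. The outcome counts below are hypothetical, chosen only to show the arithmetic:

```python
# Group sizes come from the study as described above; the outcome counts are made up.
def rate(count: int, group_size: int) -> float:
    """An outcome rate expressed as a percentage of the group."""
    return 100 * count / group_size

DI_SIZE, HS_SIZE = 18, 14

# Suppose 7 of 18 DI subjects vs 5 of 14 High/Scope subjects show some outcome.
before = rate(7, DI_SIZE) - rate(5, HS_SIZE)  # gap of about 3 percentage points

# Move just two individuals into the outcome category and the gap balloons.
after = rate(9, DI_SIZE) - rate(5, HS_SIZE)   # gap of about 14 percentage points
```

Two individuals turn a negligible gap into an apparently striking one, which is precisely the fragility Bereiter, Kozloff and Engelmann describe.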

As Kozloff explains, what is “…just plain bizarre, is that these two writers barely entertain the possibility that: (1) a dozen years of school experience; (2) area of residence; (3) family background; (4) the influence of gangs; and (5) differential economic opportunity, had anything to do with adolescent development and adult behavior.”

In fact a much larger study then found no correlation between preschool methods and criminality. It seems ridiculous that the outcomes of 18 twenty-three-year-olds should ever have been taken very seriously… However, this and another highly problematic study by Schweinhart and Weikart (outlined below) have been cited 2,444 times.

It sounds mad, and I can’t quite understand it myself, but these studies serve as a central plank of evidence in swathes of papers on a diverse range of education-related issues:

Other areas where the Schweinhart and Weikart papers have had influence:

Character Education

I was preparing for a panel debate on character education recently and started to look into the evidence in favour of teaching character attributes as a skill. Character can mean so many things that there is actually a vast body of relevant research but I began with a very glossy publication by the OECD. On p40 I found a Schweinhart and Weikart study was being used as evidence. Apparently their Perry preschool programme:

“significantly enhanced adult outcomes including education, employment, earnings, marriage, health, and participation in healthy behaviors, and reduced participation in crime” [Heckman et al 2013].

The OECD report claimed that the evidence of this study provides:

“some of the most compelling evidence that non-cognitive skills can be boosted in ways that produce adult success… Arthur Jensen’s (1969) discussion of IQ fadeout in Head Start and other compensatory programmes promoted the widespread embrace of the notion that intervention efforts are ineffective and that intelligence is genetically determined. His uncritical reliance on intelligence test scores illustrates the fallacy of relying on mono-dimensional measurements of human skills. The Perry intervention provides an effective rebuttal to these arguments. The programme greatly improved outcomes for both participating boys and girls, resulting in a statistically significant rate of return around 7%-10% per annum for both genders (see Heckman et al., 2010a).”

What was the approach of this second Schweinhart and Weikart study?

This study had 123 participants, all low IQ, divided between a treatment group receiving ‘Active Participatory Learning’ (as outlined above) in classes of 13 students with two highly qualified teachers per group, and a control group that did not receive any pre-schooling. “Sessions lasted 2.5 hours and were held five days a week during the school year. Teachers in the program, all of whom had bachelor’s degrees (or higher) in education, made weekly 1.5-hour home visits to treatment group mothers with the aim of involving them in the socio-emotional development of their children. The control group had no contact with the Perry program other than through annual testing and assessment (Weikart, Bond, and McNeil 1978).”

Concern 1: The research findings regarding delinquency have been disconfirmed

Crime reduction in adult life is argued by Heckman (one of the OECD report authors) to be one of the major benefits of the Perry programme. However, there have been other studies suggesting the Perry programme was not especially effective. For example, this from Mills, Cole, Jenkins and Dale:

“In a previous study of the differential effects of contrasting early intervention programs on later social behavior (Mills, Cole, Jenkins, & Dale, 2002), we found no differences in self-report of juvenile delinquency at age 15 for children enrolled in direct instruction and child-directed models. These results disconfirmed the conclusion of Schweinhart, Weikart, and Larner (1986b) that direct instruction was linked to higher rates of juvenile delinquency and other social differences. Our previous study was limited to self-report of juvenile delinquency, a very coarse measure of social development, in an attempt to replicate the key finding of Schweinhart et al. (1986b). In the present study, we examine additional measures of social development, which might be more sensitive to subtle program differences, including school satisfaction, loneliness, and depression. We administered a battery of social development measures to 174 children at age 15 who had been randomly assigned at preschool age to the two early childhood models. We found no differences on any social outcome for program type. Across a wide range of social behaviors at age 15, there is no evidence that type of early intervention program differentially influences subsequent adolescent social behavior.”

Concern 2: Small sample size

Heckman, an author of the OECD report, has become a leading player in character education research. Fascinatingly, this is on the back of his examination of this study. In his paper on the work he has done with Schweinhart’s and Weikart’s data, he acknowledges:

“The small sample size of the Perry experiment (123 participants) has led some researchers to question the validity and relevance of its findings (e.g., Herrnstein and Murray 1994; Hanushek and Lindseth 2009). Heckman et al. (2010a) use a method of exact inference that is valid in small samples. They find that Perry treatment effects remain statistically significant even after accounting for multiple hypothesis testing and compromised randomization.”

However, for the biggest effects claimed, differences between groups amount to the differences between the activities of about 20 people. I can’t fathom how this sort of data can possibly provide some of “the most compelling evidence that non-cognitive skills can be boosted in ways that produce adult success.” Heckman is a Nobel prize-winning economist. It would certainly take a kind of brilliance beyond any I can imagine to be able to justify the enormous influence this study has had over education and public policy.
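For what it is worth, the ‘exact inference’ Heckman et al. appeal to is, at heart, permutation-based: reassign the treatment labels at random and ask how often a gap as large as the observed one appears by chance. A minimal sketch of that idea with made-up 0/1 outcome data (their actual procedure also handles multiple hypotheses and compromised randomisation, which this does not):

```python
import random

def permutation_test(treated, control, n_perm=10_000, seed=0):
    """Approximate two-sided permutation test on the difference in group means.

    With samples this small, the permutation distribution is close to the
    exact null distribution that 'exact inference' methods work with.
    """
    rng = random.Random(seed)
    n_t = len(treated)
    observed = sum(treated) / n_t - sum(control) / len(control)
    pooled = list(treated) + list(control)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign 'treatment' labels
        diff = sum(pooled[:n_t]) / n_t - sum(pooled[n_t:]) / len(control)
        if abs(diff) >= abs(observed):
            extreme += 1
    return extreme / n_perm  # share of label reshuffles at least as extreme

# Hypothetical binary outcomes; the group sizes echo the follow-up samples above.
p = permutation_test([1] * 9 + [0] * 9, [1] * 4 + [0] * 10)
```

Even so, a valid p-value computed on 30-odd people says nothing about whether those people resemble the children entering pre-schools today, which is the objection that follows.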

Concern 3: Scaling up

A critic of the continued reliance on this study by education policy makers is Russ Whitehurst, who argues that this research was:

“From a time when very little of today’s safety net for the poor was in place… Further, [Perry was a] small single-site program run by [its] developers. Concluding that findings from these studies demonstrate that current and contemplated state pre-k programs will have similar effects is akin to believing that an expansion of the number of U.S. post offices today will spur economic development because there is some evidence that constructing post offices 50 years ago had that effect.”

There are many studies that examine the impact of preschool education, but the likes of Heckman continue to rely upon the outcomes of a small, single-site ‘boutique’ programme from 40 to 50 years ago. Heckman himself notes that positive effects of the Perry program have become a cornerstone of the argument for preschool programs (e.g., Obama 2013). “Currently, about 30 percent of all Head Start centers nationwide [across America] offer a version of the Perry curriculum (ICPSR 2010).”

My next post looks at the second study in more detail and, among other things, I will look at the influence of this study on public policy regarding preschool education.

n.b. This post has been updated to make it clear that there are two separate studies being referenced.