Part 2: Early years assessment is not reliable or valid and thus not helpful

This is the second post on early years assessment. The first is here.

Imagine the government decided they wanted children to be taught to be more loving. Perhaps the powers that be could decide to make teaching how to love statutory and tell teachers they should measure each child’s growing capacity to love.

Typical scene in the EYFS classroom – a teacher recording observational assessment. 

There would be serious problems with trying to teach and assess this behaviour:

Definition: What is love? Does the word actually mean the same thing in different contexts? When I talk about ‘loving history’ am I describing the same thing (or ‘construct’) as when I ‘love my child’?

Transfer: Is ‘love’ something that universalises between contexts? For example, if you get better at loving your sibling, will that transfer to a love of friends, of school, or of learning geography?

Teaching: Do we know how to teach people to love in schools? Are we even certain it’s possible to teach it?

Progress: How does one get better at loving? Is progress linear? Might it just develop naturally?

Assessment: If ‘loving skills’ actually exist can they be effectively measured?

Loving – a universalising trait that can be taught?

The assumption that we can teach children to ‘love’ in one context and they’ll exercise ‘love’ in another might seem outlandish but, as I will explain, the writers of early years assessment fell into just such an error in the Early Years Foundation Stage framework and assessment profile.

In my last post I explained how the priority on assessment in authentic environments has been at the cost of reliability and has meant valid conclusions cannot be drawn from Early Years Foundation Stage Profile assessment data. There are, however, other problems with assessment in the early years…

Problems of ‘validity’ and ‘construct validity’

Construct validity: the degree to which a test measures what it claims, or purports, to be measuring.

Validity: the extent to which inferences can be drawn from an assessment about what students can do in other situations, at other times and in other contexts.

If we think we are measuring ‘love’ but it doesn’t really exist as a single skill that can be developed then our assessment is not valid. The inferences we draw from that assessment about student behaviour would also be invalid.

Let’s relate this to the EYFS assessment profile.

Problems with the EYFS Profile ‘characteristics of effective learning’

The EYFS Profile Guide requires practitioners to comment on a child’s skills and abilities in relation to three ‘constructs’ labelled as ‘characteristics of effective learning’: playing and exploring, active learning, and creating and thinking critically.

We can take one of these characteristics of effective learning to illustrate a serious problem with the validity of the assessment. While a child might well demonstrate creativity and critical thinking (the third characteristic listed), it is now well established that such behaviours are NOT skills or abilities that can be learnt in one context and transferred to another entirely different context – they don’t universalise any more than ‘loving’. In fact the capacity to be creative or think critically depends on specific knowledge of the issue in question. Many children can think very critically about football but that apparent behaviour evaporates when they are faced with some maths. You think critically in maths because you know a lot about solving similar maths problems, and this capacity won’t make you think any more critically when solving something different like a word puzzle or a detective mystery.

Creating and thinking critically are NOT skills or abilities that can be learnt in one context and then applied to another

Creating and thinking critically are not ‘constructs’ that can be taught and assessed in isolation, so no valid general inference can be drawn from observing and reporting these behaviours as a ‘characteristic of learning’. If you wish a child to display critical thinking you should teach them lots of relevant knowledge about the specific material you would like them to think critically about.

In fact, what is known about traits such as critical thinking suggests that they are ‘biologically primary’ and don’t even need to be learned [see an accessible explanation here].

Moving on to another characteristic of effective learning: active learning or motivation. This presupposes both that ‘motivation’ is a universalising trait and that we are confident we know how to inculcate it. In fact, as with critical thinking, it is perfectly possible to be involved and willing to concentrate in some activities (computer games) but not others (writing).

There has been high profile research on motivation, particularly Dweck’s work on growth mindset and Angela Duckworth’s on Grit. Duckworth has created a test that she argues demonstrates that adult subjects possess a universalising trait which she calls ‘Grit’. But even this world expert concedes that we do not know how to teach Grit, and she rejects her Grit scale being used for high stakes tests. Regarding Growth Mindset, serious doubts have been raised about failures to replicate Dweck’s research findings and about studies with statistically insignificant results being used to support Growth Mindset.

Despite serious questions around the teaching of motivation, the EYFS Profile ‘characteristics of learning’ presume motivation is a trait that can be inculcated in pre-schoolers and, without solid research evidence, that it can be reliably assessed.

The final characteristic of effective learning is playing and exploring. Of course children learn when playing. This does not mean the behaviours to be assessed under this heading (‘finding out and exploring’, ‘using what they know in play’ or ‘being willing to have a go’) are any more universalising as traits, or any less dependent on context, than the other characteristics discussed. It cannot just be presumed that they are.

Problems with the ‘Early Learning Goals’

At the end of reception each child’s level of development is assessed against the 17 EYFS Profile ‘Early Learning Goals’. In my previous post I discussed the problems with the reliability of this assessment. We also see the problem of construct validity in many of the assumptions within the Early Learning Goals. Some goals are clearly not constructs in their own right, and others may well not be; serious questions need to be asked about whether they are universalising traits or actually context-dependent behaviours.

For example, ELG 2 is ‘understanding’. Understanding is not a generic skill; it is dependent on domain-specific knowledge. True, a child does need to know the meaning of the words ‘how’ and ‘why’, which are highlighted in the assessment, but while understanding is a goal of education it can’t be assessed generically: you have to understand something, and understanding one thing does not mean you will understand something else. The same is true of ‘being imaginative’ (ELG 17).

An example of evidence of ELG 2, understanding, in the EYFS profile exemplification materials.

Are ELG 1 ‘listening and attention’ or ELG 16 ‘exploring and using media and materials’ actually universalising constructs? I rarely see qualitative and observational early years research that even questions whether these early learning goals are universalising traits, let alone looks seriously at whether they can be assessed. This is despite decades of research in cognitive psychology leading to a settled consensus which challenges many of the unquestioned constructs that underpin EYFS assessment.

It is well known that traits such as understanding, creativity and critical thinking don’t universalise. Why, in early years education, are these bogus forms of assessment not only used uncritically but allowed to dominate the precious time when vulnerable children could be benefiting from valuable teacher attention?

n.b. I have deliberately limited my discussion to a critique using general principles of assessment rather than arguments that would need to be based on experience or practice.

Early years assessment is not reliable or valid and thus not helpful

The academic year my daughter was three she attended two different nursery settings. She took away two quite different EYFS assessments, one from each setting, at the end of the year. The disagreement between these was not a one-off mistake or due to incompetence but inevitable, because EYFS assessment does not meet the basic requirements of effective assessment – that it should be reliable and valid*.

We have well-researched principles to guide educational assessment and these principles can and should be applied to the ‘Early Years Foundation Stage Profile’. This is the statutory assessment used nationally to assess the learning of children up to the age of 5. The purpose of the EYFS assessment profile is summative:

‘To provide an accurate national data set relating to levels of child development at the end of EYFS’

It is also used to ‘accurately inform parents about their child’s development’. The EYFS profile is not fit for these purposes and its weaknesses are exposed when it is judged using standard principles of assessment design.

EYFS profiles are created by teachers when children are 5 to report on their progress against 17 early learning goals and describe the ‘characteristics of their learning’. The assessment is through teacher observation. The profile guidance stresses that,

‘…to accurately assess these characteristics, practitioners need to observe learning which children have initiated rather than focusing on what children do when prompted.’

Illustration is taken from EYFS assessment exemplification materials for reading

Thus the EYFS Profile exemplification materials for literacy and maths only give examples of assessment through teacher observations made while children are engaged in activities they have chosen themselves (child-initiated activities). This is a very different approach from the subsequent assessment of children throughout their later schooling, which is based on tests created by adults. The EYFS profile writers no doubt wanted to avoid what Wiliam and Black (1996) call the ‘distortions and undesirable consequences’ created by formal testing.

Reaching valid conclusions in formal testing requires:

  1.    Standard conditions – so there is reassurance that all children receive the same level of help
  2.    A range of difficulty in the items used for testing – carefully chosen test items will discriminate between the proficiency of different children
  3.    Careful selection of content from the domain to be covered – to ensure the items are representative enough to allow an inference about the domain (Koretz, pp. 23-28)

The EYFS profile is specifically designed to avoid the distortions created by such restrictions, which produce an artificial test environment very different from the real-life situations in which learning will ultimately need to be used. However, as I explain below, in so doing the profile loses so much reliability that teacher observations cannot support valid inferences.

This is because when assessing summatively the priority is to create a shared meaning about how pupils will perform beyond school and in comparison with their peers nationally (Koretz, 2008). As Wiliam and Black (1996) explain, ‘the considerable distortions and undesirable consequences [of formal testing] are often justified by the need to create consistency of interpretation.’ This is why GCSE exams are not currently sat in authentic contexts, with teachers with clipboards (as in EYFS) observing children in attempted simulations of real life. Using teacher observation can be very useful for an individual teacher when assessing formatively (deciding what a child needs to learn next), but the challenges of obtaining a reliable shared meaning nationally that stop observational forms of assessment being used for GCSEs do not just disappear because the children involved are very young.

Problems of reliability

Reliability: Little inconsistency between one measurement and the next (Koretz, 2008)
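One way to make that definition concrete: if two settings judge the same children against the same early learning goal, we can ask how often they agree beyond what chance alone would produce. The sketch below (Python, using invented illustrative ratings rather than anything from the EYFS materials) computes Cohen’s kappa, a standard chance-corrected agreement statistic, and shows how modest that agreement can look even when raw agreement seems respectable.

    from collections import Counter

    # Invented illustrative ratings (not real EYFS data): the same six children
    # judged against one early learning goal by two different settings, using
    # the EYFS Profile outcome bands.
    setting_a = ["expected", "emerging", "expected", "exceeding", "expected", "emerging"]
    setting_b = ["emerging", "emerging", "expected", "expected", "exceeding", "emerging"]

    def cohens_kappa(rater1, rater2):
        """Agreement between two raters, corrected for the agreement expected by chance."""
        n = len(rater1)
        observed = sum(a == b for a, b in zip(rater1, rater2)) / n
        counts1, counts2 = Counter(rater1), Counter(rater2)
        # Chance agreement: probability both raters independently pick each band.
        chance = sum((counts1[c] / n) * (counts2[c] / n) for c in set(rater1) | set(rater2))
        return (observed - chance) / (1 - chance)

    raw = sum(a == b for a, b in zip(setting_a, setting_b)) / len(setting_a)
    print(f"Raw agreement: {raw:.2f}")                                 # 0.50
    print(f"Cohen's kappa: {cohens_kappa(setting_a, setting_b):.2f}")  # about 0.22

A kappa near zero means the two settings agree little more than they would by chance; the point of the illustration is simply that two apparently plausible sets of judgements about the same children can still tell quite different stories.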

Assessing child initiated activities and the problem of reliability

The variation in my daughter’s two assessments was unsurprising given that…

  • Valid summative conclusions require ‘standardised conditions of assessment’ between settings and this is not possible when observing child initiated play.
  • Nor is it possible to even create comparative tasks ranging in difficulty that all the children in one setting will attempt.
  • The teacher cannot be sure their observations effectively identify progress in each separate area as they have to make do with whatever children choose to do.
  • These limitations make it hard to standardise between children even within one setting, and make it unsurprising that the two nurseries had built different profiles of my daughter.

The EYFS Profile Guide does instruct that practitioners ‘make sure the child has the opportunity to demonstrate what they know, understand and can do’ and does not preclude all adult initiated activities from assessment. However, the exemplification materials only reference child initiated activity and, of course, the guide instructs practitioners that

‘…to accurately assess these characteristics, practitioners need to observe learning which children have initiated rather than focusing on what children do when prompted.’

Illustration from EYFS assessment exemplification materials for writing. Note these do not have examples of assessment from written tasks a teacher has asked children to undertake – ONLY writing voluntarily undertaken by the child during play.

Assessing adult initiated activities and the problem of reliability

Even when some children are engaged in an activity initiated or prompted by an adult:

  • The setting cannot ensure the conditions of the activity have been standardised; for example, it isn’t possible to predict how a child will choose to approach a number game set up for them to play.
  • It’s not practically possible to ensure the same task has been given to all children in the same conditions to discriminate meaningfully between them.

Assessment using ‘a range of perspectives’ and the problem of reliability

The EYFS profile handbook suggests that:

‘Accurate assessment will depend on contributions from a range of perspectives…Practitioners should involve children fully in their own assessment by encouraging them to communicate about and review their own learning…. Assessments which don’t include the parents’ contribution give an incomplete picture of the child’s learning and development.’

A parent’s contribution taken from EYFS assessment exemplification materials for number

Given the difficulty one teacher will have observing all aspects of 30 children’s development, it is unsurprising that the profile guide stresses the importance of contributions from others to increase the validity of inferences. However, it is incorrect to claim the input of the child or of parents will make the assessment more accurate for summative purposes. In this feedback the conditions, the difficulty and the specifics of the content will not have been controlled, creating unavoidable inconsistency.

Using child-led activities to assess literacy and numeracy and the problem of reliability

The reading assessment for one of my daughters seemed oddly low. The reception teacher explained that, while she knew my daughter could read at a higher level, the local authority guidance on the EYFS profile said her judgement must be based on ‘naturalistic’ behaviour. She had to observe my daughter (one of 30) voluntarily going to the book corner, choosing to read out loud to herself at the requisite level and volunteering sensible comments on her reading.


Illustration is taken from EYFS assessment exemplification materials for reading. Note these do not have examples of assessment from reading a teacher has asked children to undertake – ONLY reading voluntarily undertaken by the child during play.

The determination to prioritise assessment of naturalistic behaviour is understandable when assessing how well a child can interact with their peers. However, the reliability sacrificed in the process can’t be justified when assessing literacy or maths. The success of explicit testing in these areas suggests they do not need the same naturalistic criteria to ensure a valid inference can be made from the assessment.

Are teachers meant to interpret the profile guidance in this way? The profile is unclear, but while the exemplification materials only include examples of naturalistic observational assessment, we are unlikely to acquire accurate assessments of reading, writing and mathematical ability from EYFS profiles.

Five year olds should not sit test papers in formal exam conditions, but this does not mean only observation in naturalistic settings (whether adult or child initiated) is reasonable or the most reliable option. The inherent unreliability of observational assessment means results can’t support the inferences required for such summative assessment to be a meaningful exercise. It cannot, as intended, ‘provide an accurate national data set relating to levels of child development at the end of EYFS’ or ‘accurately inform parents about their child’s development’.

In my next post I explore the problems with the validity of our national early years assessment.


*n.b. I have deliberately limited my discussion to a critique using assessment theory rather than arguments that would need to be based on experience or practice.

References

Koretz, D. (2008). Measuring Up. Cambridge, Massachusetts: Harvard University Press.

Standards and Testing Agency. (2016). Early Years Foundation Stage Profile. Retrieved from https://www.gov.uk/government/publications/early-years-foundation-stage-profile-handbook

Standards and Testing Agency. (2016). Early Years Foundation Stage Profile: exemplification materials. Retrieved from https://www.gov.uk/government/publications/eyfs-profile-exemplication-materials

Wiliam, D., & Black, P. (1996). Meanings and consequences: a basis for distinguishing formative and summative functions of assessment? British Educational Research Journal, 537-548.

The Secret Of My (Maths) Success

I had an interesting chat with a friend recently who’s just done a PGCE as a maths teacher. He’s been trained to build understanding through plenty of problem solving tasks.

The discussion made me reflect on the stark difference between the way I’ve taught maths to my own children at home, with the lion’s share of time spent learning to fluency, and the focus in schools on exercises to build understanding. After all, I reflected, the progress of my children has stunned even me. How is it they missed out on SO much work on understanding while accelerating far ahead of their peers?

It isn’t that I don’t appreciate that children need some degree of understanding of what they are doing. I remember when I discovered that the reason my friend’s daughter was struggling with maths at the end of Year 1 was that she had failed to grasp the crucial notion of ‘one more’. Her teacher had advised that she needed to learn her number bonds (and indeed she did) but while she did not grasp this basic notion the bonds were gibberish to her. What we call ‘understanding’ does matter (more thoughts here).

I’ve realised the reason I’ve never had to invest significant time in exercises to build understanding. It is because when my children are given a new sort of problem they can already calculate the separate parts of that problem automatically. All their working memory is focused on the only novel element of a procedure and so it is very quickly understood. Understanding is just not a biggy. Identify the knowledge necessary to calculate the component parts of a problem, secure fluency in those, and generally activities for understanding become a (crucial but) small part of the maths diet.

The degree of focus on fluency that my children were given is highly unusual. I have huge piles of exercise books full of years of repeated calculations continued a year, two years, after they were first learned. My children learnt all possible addition and subtraction facts between one and twenty until they were known so well that recall was like remembering your own name. I did the same with multiplication and division facts. There were hours and hours and hours and hours of quite low level recall work.

Generally the focus in schools is the opposite and this creates a vicious cycle. Children are taught more complex problems when they are not fluent in the constituent parts of the problem. Therefore they struggle to complete calculations because their working memory breaks down. The diagnosis is made that children don’t ‘understand’ the problem posed. The cure is yet more work focused on allowing children to understand how the problem should be solved and why. The children may remember this explanation (briefly) but it is too complex to be remembered long term as too many of the constituent elements of the problem are themselves not secure. When the children inevitably forget the explanation what is the diagnosis? – a failure of understanding. Gradually building ‘understanding’ eats more and more lesson time. Gurus of the maths world deride learning to fluency as ‘rote’ but perversely the more time is spent on understanding instead of fluency, the harder it is for children to understand new learning. By comparison my children seem to have a ‘gift that keeps on giving’. Their acceleration isn’t just in the level of maths proficiency they have reached; it is in the capacity they have to learn new maths so much more easily.

Fluency… the gift that keeps on giving.

I’ve not got everything right but I’ve learned so much from teaching my own children including that the same general principle is true of understanding maths and understanding history. If understanding is a struggle it is because necessary prior knowledge is not in place or secure.

Go back – as far as you can get away with.

Diagnose those knowledge gaps.

Teach and secure fluency.

You’ll find understanding is no longer the same challenge.

Research and primary education

I took part in a panel discussion at the national ResearchEd conference yesterday. The subject of the discussion was primary education and I thought I would post the thoughts I shared:

At all levels of schooling classroom research is undoubtedly useful but the process of generalising from this research is fraught with difficulty. As E D Hirsch explains, each classroom context is different.

I think ideally we would like to base educational decisions on

  • Converging evidence from many years of research in numerous fields
  • That integrates both classroom research and lab based work so…
  • We can construct theoretical accounts of underlying causal processes.

These theoretical insights allow us to interpret sometimes contradictory classroom research. We actually have this ideal in the case of research into early reading and the superiority of systematic synthetic phonics. Despite this evidence the vast majority of primary schools ignore or are unaware of the research and continue to teach the ‘multi-cueing’ approach to reading.

While research on phonics is ignored, some lamentably poor research has been enduringly influential in early primary education and treated with a breathtaking lack of criticality. In the 1950s a comparison study of 32 children found that children taught at nursery using teacher-centred methods showed evidence of delinquent behaviour in later life. However, this was a tiny sample and there was a tiny effect. No account was taken of the fact that the teacher-led research group had many more boys than the comparison child-centred group – among other fatal flaws. Despite this, that piece of research is STILL continually and uncritically cited. For example, the OECD used this study to support character education. It is also central to the National Audit Office definition of ‘high quality early years provision’ as ‘developmentally appropriate’.

That flawed research features in the literature review for the EPPSE longitudinal study, which has become one of the highest impact educational research programmes in Europe and whose findings underpin billions of pounds of government spending. EPPSE claims to have demonstrated that high quality pre-school provision is child centred and to have shown that such provision has an incredible impact on outcomes at age 16. However, merely scratch the surface and you find there were obvious flaws with EPPSE. The scales used by classroom observers to discover the nature of quality provision lacked validity and actually predefined what constituted high quality provision as child centred. The researchers admitted problems with the control group meant causal connections couldn’t be drawn from the findings but then ignored this problem, despite the control group issue undermining their key conclusions.

It seems the key principles influencing early years education are too frequently drawn from obviously flawed research. These principles are also the product of misuse of the research we have. For example, it is statutory to devise activities to build resilience in early years education. However, Angela Duckworth, an international authority, admits that although it is a desirable trait we don’t really know for sure how to create it.

What explains the astonishing situation where theoretical research from cognitive psychology is ignored, obviously flawed huge government funded research projects become influential and new pedagogical approaches, based on faulty understanding of the evidence, are made statutory?

A glance along the bookshelves at any primary teacher training institution gives us a clue. There is a rigid child centred and developmentalist  orthodoxy among primary educationalists. This explains the lack of rigorous scrutiny of supportive research. In fact, except on social media, sceptical voices are barely heard.

The pseudo-expert

A week or so after our first child’s birth we met our health visitor, Penny. She was possibly in her early sixties and had worked with babies all her life. She was rather forthright in her advice but with the wisdom of 40 years behind her I was always open to her suggestions. Our baby refused to sleep in her first weeks. This meant I was getting one or two hours sleep a night myself and Penny’s reassuring advice kept me going. I can never forget one afternoon when our daughter was about 15 days old and Penny walked into our living room, taking in the situation almost immediately. “Now Heather,” she said, “I’m just going to pop baby in her Moses basket on her front. Don’t worry that she is on her front as you can keep an eye on her and if I roll up this cot blanket (deftly twisted in seconds) and put it under her tummy the pressure will make baby feel more comfortable…” Our daughter fell asleep immediately and Penny left soon after but SIX WHOLE HOURS later our baby was STILL sleeping soundly. She knew the specific risk to our baby from sleeping on her front was negligible and that it might just pull the parents back from the brink. I’m grateful to her for using her professional judgement that day.

Penny’s practical but sometimes controversial wisdom contrasted with the general quality of advice available at the weekly baby clinic. Mums who were unable to think of an excuse to queue for Penny were told that ‘each baby was different’ and ‘mum and baby need to find their own way’. The other health visitors did dispense some forms of advice. If your baby wasn’t sleeping you could “try cutting out food types. Some mums swear it’s broccoli that does it” or “you could try homeopathy.” The other health visitors had no time for Penny’s old fashioned belief that mothers could be told how to care for their babies. Instead of sharing acquired wisdom they uncritically passed on to mothers the latest diktats from on high (that seemed to originate from the pressure groups that held most sway over government) and a garbled mish-mash of pseudo-science.

A twitter conversation today brought back those memories. The early years teacher I was in discussion with bemoaned the lack of proper training for early years practitioners. Fair enough, but what was striking was the examples she gave of the consequential poor practice. Apparently without proper training teachers wouldn’t understand about ‘developmental readiness’, ‘retained reflexes’ or the mental health problems caused by a ‘too much too soon’ curriculum. The problem is that these examples of expertise to be gained from ‘proper’ training are actually just unproven theory or pseudo-science. The wisdom of the lady in her fifties who has worked for donkey’s years at the little local day nursery is suspect if she is not ‘properly trained’. But trained in what? The modern reluctance to tell others how they should conduct themselves has created a vacuum that must be filled with pseudo-expertise masquerading as wisdom.

How often do teachers feel that they can’t point to their successful track record to prove their worth and instead must advocate shiny ‘initiatives’ based on the latest pastoral or pedagogical fads dressed up as science? The expert is far from always right but I value their wisdom. I also value the considered use of scientific research in education. Too often though these are sidelined and replaced with something far worse.

Reading failure? What reading failure?

“Yes, A level history is all about READING!”

I say it brightly as I dole out extracts from a towering pile of photocopying taken from different texts that will help the class get going with their coursework. I try and ooze reassurance. I cheerily talk about the sense of achievement my students will feel when they have worked their way through these carefully selected texts, chosen to transfer the maximum knowledge in the minimum reading time. I explain this sort of reading is what university study will be all about, while dropping in comforting anecdotes to illustrate it is much more manageable than they think. I make this effort because I NEED them to read lots. The quality of their historical thinking and thus their coursework is utterly dependent upon it.

Who am I kidding? This wad of material is the north face of the Eiger to most of my students. Some have just never read much and haven’t built up the stamina. The vocabulary in those texts (chosen by their teacher for their readability) is challenging and the process will be effortful. For a significant minority in EVERY class the challenge is greater. They don’t read well. Unfamiliar words can’t be guessed and their ability to decode is weak. To read even one of my short texts will take an inordinate time. Such students are bright enough – most students in my class will get an A after all, with some Bs and the odd C. They all read well enough to get through GCSE with good results and not one of them would have been counted in government measures for weak literacy. According to the statistics the biggest problem I face day in, day out as I teach A level history simply doesn’t exist. Believe me it exists, and there is a real human cost to this hidden reading failure.

Take Hannah. She loves history, watches documentaries and beams with pleasure as we discuss Elizabeth I. She even reads historical novels. However, she really struggles to read at any pace and unfamiliar words are a brick wall. She briefly considered studying history at university but the reading demands make it impracticable. Her favourite subject can never be her degree choice because her reading is just not good enough. She is not unusual, her story is everywhere.

At this point I am going to hand over my explanation to Kerry Hempenstall, senior lecturer in psychology at RMIT. I include just a few edited highlights from his survey of the VAST research literature on older students’ literacy problems that you can consider for yourself by following the link. He says:

These struggling adolescent readers generally belong to one of two categories, those provided with little or poor early reading instruction or those possibly provided with good early reading instruction, yet for unknown reasons were unable to acquire reading skills (Roberts, Torgesen, Boardman, & Sammacca, 2008)…

Hempenstall outlines the problems with the ways reading is currently taught:

…Under the meaning centred approach to reading development, there is no systematic attention to ensuring children develop the alphabetic principle. Decoding is viewed as only one of several means of ascertaining the identity of a word – and it is denigrated as being the least effective identification method (behind contextual cues). In the early school years, books usually employ highly predictable language and usually offer pictures to aid word identification. This combination can provide an appearance of early literacy progress. The hope in this approach is that this form of multi-cue reading will beget skilled reading.

However, the problem of decoding unfamiliar words is merely postponed by such attractive crutches. It is anticipated in the meaning centred approach that a self-directed attention to word similarities will provide a generative strategy for these students. However, such expectations are all too frequently dashed – for many at-risk children progress comes to an abrupt halt around Year 3 or 4 when an overwhelming number of unfamiliar (in written form) words are rapidly introduced…

  a) New content-area vocabulary words do not pre-exist in their listening vocabularies. They can guess ‘wagon’. But they can’t guess ‘circumnavigation’ or ‘chlorophyll’ based on context (semantics, syntax, or schema); these words are not in their listening vocabularies.
  b) When all of the words readers never learned to decode in grades one to four are added to all the textbook vocabulary words that don’t pre-exist in readers’ listening vocabularies, the percentage of unknown words teeters over the brink; the text now contains so many unknown words that there’s no way to get the sense of the sentence.
  c) Text becomes more syntactically embedded, and comprehension disintegrates. Simple English sentences can be stuffed full of prepositional phrases, dependent clauses, and compoundings. Eventually, there’s so much language woven into a sentence that readers lose meaning. When syntactically embedded sentences crop up in science and social studies texts, many can’t comprehend.” (Greene, J.F. 1998)

…In a study of 3000 Australian students, 30% of 9 year olds still hadn’t mastered letter sounds, arguably the most basic phonic skill. A similar proportion of children entering high school continue to display confusion between names and sounds. Over 72% of children entering high school were unable to read phonetically regular 3 and 4 syllabic words. Contrast with official figures: In 2001 the Australian public was assured that ‘only’ about 19% of grade 3 (age 9) children failed to meet the national standards. (Harrison, B. 2002) [Follow the link if you want to read all the research listed.]

Hempenstall outlines the research showing that the effects of weak reading become magnified with time:

“Stanovich (1986) uses the label Matthew Effects (after the Gospel according to St. Matthew) to describe how, in reading, the rich get richer and the poor get poorer. Children with a good understanding of how words are composed of sounds (phonemic awareness) are well placed to make sense of our alphabetic system. Their rapid development of spelling-to-sound correspondences allows the development of independent reading, high levels of practice, and the subsequent fluency which is critical for comprehension and enjoyment of reading. There is evidence (Stanovich, 1988) that vocabulary development from about Year 3 is largely a function of volume of reading. Nagy and Anderson (1984) estimate that, in school, struggling readers may read around 100,000 words per year while for keen mid-primary students the figure may be closer to 10,000,000, that is, a 100 fold difference. For out of school reading, Fielding, Wilson and Anderson (1986) suggested a similar ratio in indicating that children at the 10th percentile of reading ability in their Year 5 sample read about 50,000 words per year out of school, while those at the 90th percentile read about 4,500,000 words per year”…

Hempenstall explains just why it is crucial to spot problems with phonics in year 1:

The probability that a child who was initially a poor reader in first grade would be classified as a poor reader in the fourth grade was a depressingly high +0.88. Juel, C. (1988).

If children have not grasped the basics of reading and writing, listening and speaking by Year Three, they will probably be disadvantaged for the rest of their lives. Australian Government House of Representatives Enquiry. (1993). The Literacy Challenge. Canberra: Australian Printing Office.

“Unless these children receive the appropriate instruction, over 70 percent of the children entering first grade who are at risk for reading failure will continue to have reading problems into adulthood”. Lyon, G.R. (2001).

[The research literature for this finding is enormous – do follow link if interested]

A study by Schiffman provides support for monitoring programs for reading disabilities in the first and second grades. In a large scale study of reading disabilities (n = 10,000),

  • 82% of those diagnosed in Grades 1 or 2 were brought up to grade level.
  • 46% of those diagnosed in Grade 3 were brought up to grade level.
  • 42% of those diagnosed in Grade 4 were brought up to grade level.
  • 10-15% of those diagnosed in Grades 5-7 were brought up to grade level.

Berninger, V.W, Thalberg, S.P., DeBruyn, I., & Smith, R. (1987). Preventing reading disabilities by assessing and remediating phonemic skills. School Psychology Review, 16, 554-565.

Hempenstall lists research on what it is that causes such problems for struggling readers:

“The vast majority of school-age struggling readers experience word-level reading difficulties (Fletcher et al., 2002; Torgesen, 2002). This “bottleneck” at the word level is thought to be particularly disruptive because it not only impacts word identification but also other aspects of reading, including fluency and comprehension (LaBerge & Samuels, 1974). According to Torgesen (2002), one of the most important discoveries about reading difficulties over the past 20 years is the relationship found between phonological processing and word-level reading. Most students with reading problems, both those who are diagnosed with dyslexia and those who are characterized as “garden variety” poor readers, have phonological processing difficulties that underlie their word reading problems (Stanovich, 1988)” (p.179). [Do follow link for more]

To debate just how many children are functionally illiterate and condemn Nicky Morgan for apparent exaggeration entirely misses the point. Reading failure is endemic. I would estimate that about a third of my A level students have noticeable issues with word level reading that significantly impact upon their progress in history at A level. Reading failure is one of the biggest obstacles I face in my teaching and I have every reason to comment on the issue. I don’t even deal with all those students who chose not to even attempt A level history because they knew it meant lots of reading. At secondary school we should be giving students more complex texts to build their vocabularies and reading stamina. However, the research is pretty clear about when difficulties need to be identified if children are to overcome them – way back in year 1. The research is also pretty clear about what it is that struggling readers lack – a grasp of the alphabetic principle that they are able to apply fluently when reading. Given this, the opposition to the year 1 phonics check is hard to justify. We know so much now about effective reading instruction but it can only be used to help children if teachers are willing to adjust their practices. While around 90% of primary schools continue to focus on ‘mixed methods’ (guessing from cues rather than sounding out) that limit children’s chances of acquiring the alphabetic principle essential for successful reading, nothing will change.

No one questions what they want to believe. The problem with EPPSE

The EPPSE is a very large and enormously influential study commissioned by the DfE to find out what types of preschool provision and early experiences are most effective. It followed 3000+ children from the age of 3 to 16 years and reaches some very significant conclusions which I would question.

The research team was from the Institute of Education, Birkbeck and Oxford and as the Institute of Education blog explains:

The EPPSE project… has become one of the highest impact educational research programmes in Europe… EPPSE’s findings underpin billions in Government spending on nursery expansion, including the Sure Start programme, the extension of free pre-school to all three and four-year-olds in 2010 and, this year, to the poorest 40% of two-year-olds… EPPSE’s evidence documenting excellent pre-school education and its ongoing benefits, especially for the most deprived children, has fed heavily into England’s early childhood curriculum and informed curricula in countries as diverse as Australia, China and Brazil. Nursery World editor Liz Roberts has noted “how highly regarded the Early Years Foundation Stage is around the world”.

The EPPSE project findings are stunning:

Attending any pre-school, compared to none, predicted higher total GCSE scores, higher grades in GCSE English and maths, and the likelihood of achieving 5 or more GCSEs at grade A*-C. The more months students had spent in pre-school, the greater the impact on total GCSE scores and grades in English and maths… the equivalent of getting seven B grades at GCSE, rather than seven C grades.

The EPPSE project also found that:

There was some evidence of statistically significant continuing pre-school effects on social behavioural outcomes at age 16 but these were weaker than at younger ages. Having attended a high quality pre-school predicted better social-behavioural outcomes in the longer term, though the effects were small.

The IOE blog gushes that EPPSE:

…brought together a rare combination: research funded by Government with a genuinely open mind, carried out by excellent and dedicated academics savvy enough to work with and influence politicians of all stripes…And thanks to the detailed work that began 17 years ago at the IOE, we also know what excellent nursery provision looks like.

Hold on a moment! It seems these researchers did not have an open mind. I have previously blogged about the fact that the EPPE (the acronym before the study moved on to secondary school outcomes) studied quality using a measure, ECERS-R, which had a predefined scale based on prejudged measures of quality. Having read through much of the voluminous literature there is so much I could discuss about the EPPE findings, but in this post I will focus on another claim in the IOE blog: that this study has rigour. I am really not sure it does.

That is a big accusation to make against such a large and influential study conducted by highly regarded academics but it seems to have a fundamental problem that strikes at the heart of the validity of its findings.

The problem of the EPPE/EPPSE control group.

In their report at the end of KS1 (when the study children were 7 years old) the researchers acknowledge that there were problems with the control group. Because in England the vast majority of children attend a preschool it was not possible to find a representative sample of those children who didn’t:

The ‘home’ control group are from significantly disadvantaged backgrounds when compared with the sample as a whole, with most mothers having less than a GCSE qualification (p. 11).

On p28 the report explains that:

…comparison of the ‘home’ sample (the control who did not attend pre-school) with children who attended a pre-school centre showed that both the characteristics and attainments of home children vary significantly from those who had been in pre-school. It is not possible to conclude with any certainty that the much lower attainments of the ‘home’ group are directly due to lack of pre-school experience.

The writers go on to talk positively about how they have used ‘contextualised multilevel analysis’ to try and compensate for the unrepresentative nature of the control sample, and they feel this means their results are worth considering. But they admit, for example, that when making judgements about the impact of a longer duration of pre-schooling on higher cognitive outcomes by comparing with the ‘home’ group:

“causal connections cannot be drawn”

The problems with the control are not mentioned in the overall findings, but in 2004 they are acknowledged in the body of the report. There is an attempt in 2004 to show that variation in pre-school quality and in duration of attendance can have an impact, and this is possible because these findings don’t have to use the problematic control.

It seems obvious that children coming from very disadvantaged homes, as the ‘home’ control group largely do, may well benefit from pre-school in ways most children wouldn’t. Therefore, despite contextualised multilevel analysis to take account of all other variables such as SES, there will always be problems using this control to reach firm conclusions on the impact of pre-schooling on the whole population.

Fast forward to 2014 and the final reports on the children at age 16. I have looked through all the reports. It is clear from the tables included that all the startlingly good educational findings rest on comparisons with this control group. However, I can find NO MENTION AT ALL of the sorts of problems with the control that the researchers were willing to acknowledge in 2004. It is as if the control group issue just wafted away. It seems that such a large and important study, ‘one of the highest impact educational research programmes in Europe’, had no need to concern itself with pesky issues like that annoyingly poor control on which the whole vast edifice that is EPPSE rests.

How can it be that in 2004 the problem of the unrepresentative control meant many findings on the impact of preschool were tentative, but in 2014 the issue of the control has totally disappeared? It is as if it never existed. Given the startlingly strong impact ANY form of preschool is claimed to have, the findings need to be robust if they are to be used to make policy.

The EPPSE claims to be ‘proper’ research, not the sort of stuff that gives education research a bad name. It is also enormously influential and directly used in government policy making. I can understand the researchers, invested as they are, claiming their conclusions have validity but where is the scrutiny? What is happening at peer review? If anything highlights the unhealthiness of the rigid orthodoxy in education departments, especially in early years research, it is this EPPSE study. Once again no one seems to question what they want to believe.

If you found this interesting you may also want to read these posts:

https://heatherfblog.wordpress.com/2015/07/09/a-truism-that-needs-questioning/

https://heatherfblog.wordpress.com/2015/06/01/the-hydra/