Another dot in the blogosphere?

Posts Tagged ‘assessment’

The news that caused ripples in Singapore schooling last week was the official announcement from the Ministry of Education (MOE) of the new scoring system that will be implemented in the Primary School Leaving Examination (PSLE) in 2021.

There was a slew of news following the announcement. Some people made tsunamis out of the ripples, some rode the waves as they were [small sample of both].

Beneath the surface was an undercurrent that did not get much attention, but was the most significant change in terms of education. According to STonline, one of the changes was the switch from norm-referenced testing (NRT) to standards or criterion-referenced testing (CRT).

PSLE2021: From NRT to CRT

What are NRT and CRT in layman’s terms? Why is the switch an important driver of change?

In NRT, the results of a cohort of students are reduced to scores — T-scores in the case of PSLE — and lined up from the highest to the lowest (or vice versa). The result is a bell-shaped curve of scores: There will be a few very low and very high scores, and many somewhere-in-the-middle ones.
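
For the statistically curious, a T-score is simply a raw score rescaled so that the cohort has a mean of 50 and a standard deviation of 10. Here is a minimal Python sketch of that rescaling; the actual PSLE aggregation across subjects was more involved, so treat this as illustrative only.

```python
import statistics

def t_scores(raw_scores):
    """Rescale raw scores to T-scores: cohort mean 50, standard deviation 10."""
    mean = statistics.mean(raw_scores)
    sd = statistics.stdev(raw_scores)
    return [50 + 10 * (raw - mean) / sd for raw in raw_scores]

cohort = [45, 60, 72, 72, 75, 81, 90]
print([round(t, 1) for t in t_scores(cohort)])  # middling scores land near 50
```
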
Reviewers of these scores typically use this distribution to create an even curve (a normal distribution, ND), and to rank and sort. In the adult world of work, this method might help determine who gets promotions or bonuses, what appraisal grade you get (if you are in the civil service), or who gets fired.

For example, a large organisation can first rank the performances of all its employees. If an ideal ND does not result, it can statistically massage it into an ideal bell curve. So if there are too many A-graders, some will be pushed into Bs, and as a result Bs become Cs and so forth. Once there is an ideal bell curve, someone can decide cut-offs and consequences, say, the top 5% get promotions and the bottom 15% are let go.
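
To make the mechanics concrete, here is a hypothetical sketch of such a forced ranking in Python. The 5% and 15% cut-offs come from the example above; the names and scores are invented for illustration.

```python
def apply_cutoffs(scores, promote_pct=0.05, fire_pct=0.15):
    """Rank staff by score, then apply percentile-based consequences."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_promote = max(1, round(len(ranked) * promote_pct))
    n_fire = max(1, round(len(ranked) * fire_pct))
    return ranked[:n_promote], ranked[-n_fire:]

staff = {"Ann": 88, "Ben": 75, "Cho": 91, "Dev": 60, "Eng": 70}
promoted, fired = apply_cutoffs(staff)
print(promoted, fired)  # consequences depend on rank alone, not absolute merit
```

Note that nothing in the function asks whether anyone actually performed well or badly; position on the curve is all that matters.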

If this seems unfair to working adults, then what more for the 12-year-old children who take the PSLE but have no idea what is going on?

The core problem is that people are compared against one another, with or without their knowledge. If they know, the result can be unhealthy competition because everyone wants to be on the right side of the ND curve. If they do not, they become victims of processes that are not transparent to them and of circumstances beyond their control.

Is there a better way? Yes, it is called CRT (standards-based assessment and/or evaluation).

Modern corporations like Accenture are abandoning the outdated practice of norm-referencing [1] [2] and embracing comparisons of one. The fundamental principle is this: How one improves and contributes individually over time is more important than how one is measured against others.

For example, a worker might show evidence of specific skills that indicate that he or she is a novice, intermediate, or advanced worker. There is no comparison of all workers across skill groups, or even within each skill group.

To make this work, there must be standards or criteria that identify each skill group, e.g., skills A to J for novices to master; K to R for intermediates; S to Z for advanced plus five potential managerial markers.
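
Purely as an illustration, such criteria could be encoded as a checklist per skill group, with a learner’s level determined by which checklists he or she satisfies. The skill bands (A to J, K to R, S to Z) come from the example above; everything else in this Python sketch is hypothetical.

```python
import string

# Letters stand in for skills, as in the example above
CRITERIA = {
    "novice": set(string.ascii_uppercase[:10]),          # skills A to J
    "intermediate": set(string.ascii_uppercase[10:18]),  # skills K to R
    "advanced": set(string.ascii_uppercase[18:]),        # skills S to Z
}

def level_of(mastered):
    """Return the highest level whose criteria are fully met."""
    level = "none"
    for name in ("novice", "intermediate", "advanced"):
        if CRITERIA[name] <= set(mastered):  # all criteria for this band met?
            level = name
        else:
            break
    return level

print(level_of("ABCDEFGHIJ"))          # novice
print(level_of("ABCDEFGHIJKLMNOPQR"))  # intermediate
```

The only comparison here is between a learner’s evidence and the stated criteria; no learner is ever ranked against another.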

Back to PSLE 2021. The switch is from NRT to CRT. It is more about the standards or specific criteria that indicate the test-based achievements of the child, and less about the comparison of one child with another.

This is a fundamental shift in mindset from sifting and sorting to measuring performance. The former is about what is good for the system and how to feed it; the latter is about where the learner is at and what is good for the learner.

However, this piecemeal change to a CRT system of achievement levels (ALs) still falls short. I will share thoughts on this in more reflections on PSLE2021 over the next few days.

Read Part 2: The Dark Side.

 
In 1982, the late Prince may have partied like it was 1999.

But in 2016, why are we assessing like it is still 1999? Or maybe even 1899?


Video source

In this video, Noam Chomsky explains the problems with assessments: the way they are misused, misaligned, and misappropriated.

It is no surprise then that a Secret Teacher wrote the following article in The Guardian about how tests seemed to be dumbing down her students.

The teacher bemoans:

My students are bright, engaged and well-behaved, but there is something missing: they cannot think.

The Secret Teacher goes on to blame a focus on exams and I agree with the teacher for the most part. But tests are not the only thing to blame for students who do not know how to think independently.

Teachers who spoon feed, stifle thought, or fail to stay relevant are just as culpable.

For instance, the teacher said:

Last week I caught another of my A-grade students using his phone in the lesson. As a starter exercise, I told them to think of as many advantages as they could of being on the UN security council. “What are you doing?” I asked. “I’m googling the list of advantages,” came his wary reply. I was flabbergasted. I tried to explain that there is no list of advantages, but that I wanted his own views.

I am confident that the Secret Teacher is also a Good Teacher. But she also sounds like a traditional one in that she is averse to students searching for Googleable answers. Perhaps she did not know how to take advantage of a now-natural behaviour to show her students how to think, act, and write critically after Googling.

Most people eventually realize that the most important factor in a schooling or educational system is the quality of its teachers. Those who join the profession are self-selecting by choice and pre-selected by institutes of teacher education.

But only the exceptional step up to deal with the problems with assessment or learn how to skilfully promote critical and creative thinking in a conservative system. The rest need professional development and the mindset of lead learners to do this.

This reflection is a response to a slow chat question on #asiaED about the role of assessment in systemic change.

The question was:

My response was:

The layperson’s likely view of assessment is summative tests and exams, typically of the high-stakes variety, because that is what they have experienced. As its name implies, summative assessment is perceived and practiced as a terminal or downstream activity.

Informed educators might point out that formative assessment (ongoing feedback) is more important for learning. Educated instructional designers will tell you that assessment or evaluation should be developed before content. Wise educational consultants and leaders will tell you that assessment is a key leverage point in systemic change.

Assessment is actually an upstream component. Change that and you affect processes downstream like teaching, learning support, learning environment design, and policy making.

Imagine for a moment that exams were removed and replaced with learner portfolios. Now imagine how teaching, teacher expectations, teaching philosophies, teacher professional development, and teacher evaluation might change.

I would like to answer a question directed at me:

I cannot say for sure how assessment should change and I do not think that data collected from such assessment only serve as leverage.

Consider an example of a change-in-progress and my suggestions on how to implement change and avoid pitfalls in the process.

There are at least two significant assessment-related changes in Singapore now. One is an emphasis on values-based education (instead of focusing on just grades) and the other is a re-evaluation of the importance of a degree.

Added after initial posting, a timely tweet from a local rag:

These changes were a result of:

  • parental feedback on the unnecessary stress of high-stakes testing (particularly the Primary School Leaving Examination, PSLE)
  • the recognition of grade inflation (particularly at the GCE A Levels)
  • the mismatch between what employers need and what universities produce
  • new and visionary leadership at the Ministry of Education (MOE), Singapore

All these placed pressures on what we understand and value as traditional, summative assessment.

That said, MOE is not going to sacrifice the sacred cows of tests and exams. But it has started emphasizing other processes and measures.

Values-based lessons are being integrated into previously content-only lessons [news article after its announcement in 2011]. Primary school students can get into Secondary schools of their choice based on non-academic talents with the Direct School Admissions (DSA) scheme.

Experts in systemic change might label these efforts as piecemeal change. They do not profoundly disrupt existing processes; instead, they are implemented periodically and strategically in an attempt to create overall change.

However, critical observers might also note that significant and sustained change tends to happen with disruptive interventions. Examples might include:

  • the impact of antibiotics and anaesthesia on medical practice
  • the effect of the printing press on schooling and the spread of information
  • the influence of smartphones on banking, commerce, education, entertainment and gaming, information consumption, content creation, and socialization.

I predict that e-portfolios will rise in importance as a means of recording and evaluating (not just assessing) both the processes and products of learning.

e-Portfolios are a systemic and disruptive change in that they:

  • start and end with the learner
  • belong to the learner
  • emphasize processes and not just products of learning
  • showcase holistic or other attributes (not just academic ability)
  • promote lifelong, career-wide learning

The battle to create acceptance, buy-in, and hopefully ownership of what we now label alternative assessment will probably last a decade or more. During this time, it might be tempting to collect evidence of the effectiveness of e-portfolios during a trial or a full-blown implementation in order to convince stakeholders that the change is making a difference.

However, this is not a wise move. Efforts to do this would repeat the mistakes of the slew of early educational and action research comparing the effects of intervention A (for example, traditional instruction) and intervention B (technology-assisted instruction). There are far too many factors that influence learning outcomes, attitudes, values, etc.

If data on newer forms of assessment need to be collected, analyzed, and presented, I suggest that they be part of a much larger plan. Such a plan could include:

  • having regular conversations with stakeholders
  • creating a shared vision among stakeholders
  • relating success stories to create buy-in
  • developing informed, forward-thinking, and informal leadership
  • providing financial and implementation leeway for unforeseen obstacles

In summary, assessment is an important leverage point and an upstream component for changing educational systems. Data on disruptive changes, like the adoption of e-portfolios for assessment and evaluation, can be used to convince stakeholders. However, such data should only be part of a larger and sustainable plan.

272/365: Student by Rrrodrigo, on Flickr. Creative Commons Attribution-NonCommercial 2.0 Generic License.

 
Recently I read an article on The Atlantic, The End of Paper-and-Pencil Exams?

The headline asked a speculative question, but did not deliver a clear answer. It hinted at mammoth change, but revealed that dinosaurs still rule.

Here is the short version.

This is what 13,000 4th grade students in the USA had to do in an online test that was part of the National Assessment of Educational Progress. They had to respond to test prompts to:

  • Persuade: Write a letter to your principal, giving reasons and examples why a particular school mascot should be chosen.
  • Explain: Write in a way that will help the reader understand what lunchtime is like during the school day.
  • Convey: While you were asleep, you were somehow transported to a sidewalk underneath the Eiffel Tower. Write what happens when you wake up there.

This pilot online assessment was scored by human beings. The results were that 40% of students struggled to respond to the question prompts: they were rated a 2 (marginal) or a 1 (little or no skill) on a 6-point scale.

This was one critique of the online test:

One downside to the NCES pilot study: It doesn’t compare student answers with similar questions answered in a traditional written exam setting.

I disagree that this is necessary. Why should the benchmark be the paper test? Why is a comparison even necessary?

While the intention is to compare responses to the same questions, a paper versus computer-based comparison would actually compare media. After all, the questions are essentially the same, or by some measure very similar.

Cornelia Orr, executive director of the National Assessment Governing Board, stated at a webinar on the results that:

When students are interested in what they’re writing about, they’re better able to sustain their level of effort, and they perform better.

So the quality and type of questions are the greater issues. The medium and strategy of choice (going online and using what is afforded there) also influence the design of questions.

Look at it another way: Imagine that the task was to create a YouTube video that could persuade, explain, or convey. It would not make sense to ask students to write about the video. They would have to design and create it.

If the argument is that the technical, literacy, and thinking skills behind a YouTube video are not in the curriculum, I would ask why that curriculum has excluded such relevant and important skills.

The news article mentioned some desired outcomes:

The central goal of the Common Core is deeper knowledge, where students are able to draw conclusions and craft analysis, rather than simply memorize rote fact.

An online test should not be a copy of the paper version. It should have unGoogleable questions so that students can still Google, but they must be tested on their ability to “draw conclusions and craft analysis, rather than simply memorize rote fact”.

An online test should be about collaborating in real-time, responding to real-world issues, and creating what is real to the learners now and in their future.

An online test should not be mired in the past. It might save on paper-related costs and perhaps make some grading more efficient. But that focuses on what administrators and teachers want. It fails to provide what learners need.

If this tweet were a statement in a sermon, I would say amen to that.

Teachers, examiners, and administrators disallow and fear technology because doing what has always been done is simply more comfortable and easier.

Students are forced to travel back in time and set aside today’s technologies in order to take tests that measure a small aspect of their worth. They bear this burden because their parents and teachers tell them they must get good grades. To some extent that is true, as they attempt to move from one level or institution to another.

But employers and even universities are not just looking for grades. When students interact with their peers and the world around them, they learn that character, reputation, and other fuzzy traits not measured in exams are just as important, if not more so.

Tests are losing relevance in more ways than one. They are not in sync with the times and they do not measure what we really need.

In an assessment and evaluation Ice Age, there is cold comfort in the slowness of change. There is also money to be made from everything that leads up to testing, the testing itself, and the certification that follows.

 
Like a glacier, assessment systems change so slowly that most of us cannot perceive any movement. But move they do. Some glaciers might even be melting in the heat of performance evaluations, e-portfolios, and exams where students are allowed to Google.

We can either wait the Ice Age out or warm up to the process of change.

By reading what thought leaders share every day and by blogging, I bring my magnifying glass to examine issues and create hotspots. By facilitating courses in teacher education I hope to bring fuel, heat, and oxygen to light little fires where I can.

What are you going to do in 2014?

 
I finally read a tab I had open for about a week: A teacher’s troubling account of giving a 106-question standardized test to 11 year olds.

This Washington Post blog entry provided a blow-by-blow account of some terrible test questions and an editorial on the effects of such testing. Here are the questions the article raised:

  • What is the purpose of these tests?
  • Are they culturally biased?
  • Are they useful for teaching and learning?
  • How has the frequency and quantity of testing increased?
  • Does testing reduce learning opportunities?
  • How can testing harm students?
  • How can testing harm teachers?
  • Do we have to?

The article was a thought-provoking piece that asked several good questions. Whether or not you agree with the answers is beside the point. The point is to question questionable testing practices.

I thought this might be a perfect case study of what a poorly designed test looks like and what its short-term impact on learning, learners, and educators might be.

The long-term impact of bad testing (and even just testing) is clear in a society like Singapore. We have teach-to-the-test teachers, test-smart students, and grade-oriented parents. We have tuition not for those who need it but for those who are chasing perfect grades. And meaningful learning takes a back seat or is pushed out of the speeding car of academic achievement.

We live in testing times indeed!


http://edublogawards.com/files/2012/11/finalistlifetime-1lds82x.png
http://edublogawards.com/2010awards/best-elearning-corporate-education-edublog-2010/

Click to see all the nominees!

QR code


Get a mobile QR code app to figure out what this means!

My tweets

Archives

Usage policy

%d bloggers like this: