Another dot in the blogosphere?

Posts Tagged ‘assessment’

Anyone aiming to be assessment literate needs to unpack the principle represented above. 

Here is my deconstruction. Not everything that can be assessed is worthwhile. Not everything that is worthwhile can be assessed. The overlap of what is assessable and worthwhile is small — this should be our focus.

If I were still a full-time professor, I would probably be a member of the ungrading movement. This is the pushback against letter and number grades because these often obstruct learning.

As the visual in the tweet above illustrates, grading and current forms of assessment ignore what lies beneath. They are not designed for the long tail of learning or the less tangible aspects of learning.

This is why I work with organisations that have a more progressive stance on what counts as success. For example, with one institute the focus is on formative feedback and the course is pass/fail. In another group I work with, my modules have no required assessment — I can focus on what my students need in the short and long term. Both allow me to facilitate learning by diving into what is hidden from hurried and mechanical assessment procedures.

The age of COVID-19 has pushed us to rely on technologies for remote teaching and learning. But how far have we pushed ourselves pedagogically? How have we actually changed the way we assess learning?

This Times Higher Education (THE) article started with the premise that the assessment of learning in higher education is often an afterthought that still takes the form of pen and paper examinations.

Traditional mainstays of assessment have failed in the age of COVID-19. This was evidenced by remote proctoring debacles and the abandonment of IB and GCE/GCSE exams.

According to the article, such dated assessment design is down to bureaucracy, i.e., administrative needs prioritised over student and learning needs. Students and faculty have little power (if any) to question the status quo.

Dr Jesse Stommel, a professor interviewed for the article, declared:

He and other interviewees were effectively suggesting what I like to call the pedagogy of trust (PoT). PoT is built on the premise that students have varied life experiences, diverse needs, and a broad spectrum of goals.

Part of the PoT in assessment design might include more authentic assessments that are based on real-world issues, perhaps shaped by students themselves, and require meaningful opportunities for cooperation.

The article did not suggest how we might implement PoT in detail. To do so, faculty need to answer this question: Is trust mostly earned or created?

If educators think that students need to show that they are trustworthy first, nothing will change. There will always be some students who will cheat and take shortcuts. Ironically, they might do so because of university rules and procedures that assume that they are not trustworthy in the first place.

For example, students typically need to take an anti-plagiarism/cheating module and quiz that are both online because the university prefers an efficient and hands-off mode. Students soon discover that they can use more than one device and/or cooperate with one another to clear this administrative hurdle.

PoT starts with the educator: Opportunities for trust need to be created. This could mean taking the time and effort to be assessment literate, explaining the design and purpose of assessments to students, and counselling students who make mistakes.

This is my reflection on how a boy gamed an assessment system that was driven by artificial intelligence (AI). It is not about how AI drives games.

If you read the entirety of this Verge article, you will learn that a boy was disappointed with the automatic and near-instant grading that an assessment tool provided. He got quick but poor grades because his text-based answers were assessed by a vendor’s AI.

The boy soon got over his disappointment when he found out that he could add keywords to the end of his answers. These keywords were seemingly disjointed or disconnected words that represented the key ideas of a paragraph or article. When he included them, he got full marks.

My conclusion: Maybe the boy learnt some content, but he definitely learnt how to game the system.

A traditionalist (or a magazine writer in this case) might say that the boy cheated. A progressive might point out that this is how every student responds to any testing regime, i.e., they figure out the rules and how best to take advantage of them. This is why test-taking tends to reliably measure just one thing: the ability to take the test.

If the boy had really wanted to apply what he learnt, he would have persisted with answering questions the normal way. But if he had done that, he would have been penalised for doing the right thing. I give him props for figuring out how to game a system that was flawed from the start.

This is not an attack on AI. It is a critique of human decision-making. What was poor about the decisions? For one thing, it seemed like the vendor assumed that the use of keywords indicated understanding or application. If a student did not use the exact keywords, the system would not detect and reward them.

It sounds like the AI was a relatively low-level keyword-matching system, not a more nuanced semantic one. If it had been the latter, it would have been more like a teacher, able to give each student credit where it was due when the same meanings were expressed in different words.
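To make this failure mode concrete, here is a minimal, hypothetical sketch of how such a grader might work. This is not the vendor’s actual code; the keyword list, function name, and scoring rule are my own assumptions for illustration.

```python
# Hypothetical keyword-matching grader. It rewards the presence of
# expected terms, not the meaning of the answer.

EXPECTED_KEYWORDS = {"photosynthesis", "chlorophyll", "sunlight", "glucose"}

def keyword_score(answer: str) -> float:
    """Return the fraction of expected keywords found in the answer."""
    words = set(answer.lower().split())
    return len(EXPECTED_KEYWORDS & words) / len(EXPECTED_KEYWORDS)

# A thoughtful answer in the student's own words scores zero...
thoughtful = "Plants use light energy to turn water and carbon dioxide into sugar"
print(keyword_score(thoughtful))  # 0.0

# ...while the same answer with keywords bolted on gets full marks.
gamed = thoughtful + " photosynthesis chlorophyll sunlight glucose"
print(keyword_score(gamed))  # 1.0
```

A semantic grader would compare meanings rather than surface tokens (for example, via sentence embeddings), so the thoughtful answer would earn credit and the bolted-on keywords would add nothing.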

The article did not dive into the vendor’s reasons for using that AI. I do not think the company would want to share that in any case. For me, this exhibited all the signs of a quick fix for quick returns. This is not what education stands for, so that vendor gets an F for implementation.

Depending on how you design and implement it, assessment is when students learn the least or the most.

Students might learn little to nothing if there is no assessment or if the assessment is not constructively aligned to learning outcomes, content, teaching strategies, and learning experiences.

On the other hand, learning is tested and measured if well-designed assessment challenges students to apply, analyse, evaluate, create, and/or cooperate.

Whether formative or summative, assessment puts the responsibility of showing evidence of learning on the student.

Prior to this, the teacher might have delivered information or provided other learning experiences. But we still do not know if the student has learnt anything.

Prior to assessment, students might have the opportunity to negotiate meaning with their peers, but we still do not know if the students have learnt anything.

The evidence of learning is often in the assessment. And yet this is one of the most poorly conceived and dreaded aspects of the teaching and learning process. One need only review poorly-written multiple-choice questions, critique vague rubrics, or rail against grading on a curve to see this.

Parts of assessment are also poorly understood by students, parents, and other stakeholders. They see the obvious classroom performances of teaching, but they are not privy to the heart-wrenching and mind-numbing process of, say, grading essays.

In this respect, assessment is like the engine under the hood of a sexy sports car. Everyone sees and appreciates the outside. Very few know how to design and maintain what is inside so that the car actually performs like one.

Just as with a car, when the engine breaks down, practically everything else stops working. You have a car that isn’t. Likewise, you have teaching that is empty of evidence of learning.

The title of this blog entry was the name of a sit-com in the mid-1990s. It is also how I would title this assessment mistake.

I share the sentiment of the tweet. Who is Susan? Why did she suddenly appear? What does she have to do with the crayon drama?

This was probably a mistake on the teacher’s part, but it was a needless one. It could have been avoided with some thorough proofreading.

If we consider the SAT, the prime test for entrance to US universities, what does that test actually measure?

The video below provides insights into the history and design of the SAT.


Video source

It concludes with this sobering thought:

The SAT was created in the pursuit of precision. An effort to measure what we’re capable of — to predict what we can do. What we might do. What we’ve forgotten is that, often, that can’t be untangled from where we’ve been, what we’ve been through, and what we’ve been given.

The same could be said about practically any other academic test taken on paper.

The issue of “grading on the curve” reared its ugly head in the news. This time the headline was a bold declaration:

But there was more to the headline. The article highlighted a variety of curve-equivalent and curve-adjacent schemes. Then there was a university don’s claim:

The unsubstantiated claim seemed to be that grading on a curve was part of assessment and that this was useful feedback. Specifically:

  • How is grading on a curve part of assessment when the other entities in the same article claim to have done away with such moderation?
  • How exactly does sorting students on a curve provide feedback on meeting course objectives?

I do not know if he did not elaborate, or if the journalist or her editor left this out. Either way, we have a claim without explanation or backing. None of us should take unsubstantiated claims seriously. Thankfully, none of us will be graded on a curve to be critical thinkers.

Assessment in the form of summative tests and exams is the tail that wags the dog.

Why the tail? Summative assessments tend to happen at the end of curricular units. How do such tails wag the dog? They shape what gets taught and even how it gets taught.

So one might be happy to read this:

But to what effect?

It might be too early to tell given that this movement has just started. There was this report that parents and tuition centres were not buying into the new policy. That report was a follow-up to a previous one last year on how “tuition centres rush in to fill (the) gap” left by a lack of mid-year exams.

So is this a case of wait and see? Perhaps.

While some hair on the tail of the dog might have been snipped, the tail is still there. Like academic streaming, having one’s worth dictated by exams is baked into our psyche.

The MOE and schools can apply invisible pressure on stakeholders like parents and tuition centres by reducing the number of exams. These stakeholders might feel the change and pressure, but not see the point. It will take time and constant reinforcement that exams are not the be-all and end-all.

We live in testing times. Not just politically or environmentally, but also in terms of actual tests.

So here is a basic tip for multiple-choice questions like the one above: use LETTERS as option labels instead of numbers. When the options are themselves numbers, numbered labels are ambiguous; a student who answers “3” might mean option 3 or the value 3.

