Another dot in the blogosphere?

Posts Tagged ‘sft’

The simple message in the tweet below hides a profound principle of evaluation.

Why is the message a timely one? On many university campuses all over the world, an academic semester is nearing its end or has already ended. It is time for end-of-course evaluations.

Instructors who do not have teaching backgrounds, those who resent teaching, or those who cannot teach well are dreading these evaluations. If only they would collectively point out that such exercises are based on a flawed approach.

Many end-of-course evaluations (otherwise known as student feedback on teaching, SFTs) read like customer satisfaction surveys because they are often designed by administrators, not assessment and evaluation experts.

Even a well-prepared instructional designer should be able to point out that SFTs often operate only at Level 1 of the Kirkpatrick Evaluation Model. They are a simple measure of each student’s snapshot reaction to weeks or months of coursework.

SFTs should be about the effectiveness of teaching and the quality of learning. But if you unpack the questions in most evaluation forms, you will rate such “evaluations” as satisfaction surveys instead. A researcher with rudimentary knowledge of data collection will tell you that such information is not valid — it does not measure what it is supposed to measure.

I have reflected before on how I do not place much stock in SFTs if they are not well designed and implemented. I ignore the results even though I do well in them. How can I when I know that they are not valid measures? Why should I be satisfied with unsatisfactory practices?


I have been consistent about my stance against end-of-course student feedback on teaching (SFTs). Today my reflection was prompted by this tweet.

I am confident that, like me, this professor and others like him do not get bad reviews. We are against a data collection method that is flawed.

I caution administrators against using SFTs as the only measure of faculty teaching because SFTs are:

  • not valid if they do not measure whether effective learning took place
  • used for purposes other than to improve instruction
  • summative in that they do not allow teaching faculty to make changes that semester
  • reliant on student self-reports as a single data source

The tweet highlighted how invalid SFTs can be. No matter the questions asked, students might bias their answers because of non-teaching or superficial traits of their instructor/facilitator. The questions in an SFT are also likely to focus on teaching-related aspects of a course (e.g., the LMS) instead of how much or how well they learnt.

SFTs designed to measure traditional and face-to-face teaching methods also might not align with online methods or facilitative approaches. For example, SFTs rarely (if ever) focus on the design of effective asynchronous learning resources or personalised online coaching.

Administrators use SFTs to rank faculty during promotion and retention exercises. This is clear to any full-time university faculty with a significant teaching load. I know of ex-colleagues who would game the system by currying favour with their students so that they would get good SFTs. 

These folk needed the most help improving their instruction, but since they got good enough SFTs, they did not reflect and improve on their practice. They just got better at gaming the system.

If SFTs are primarily for improving the quality of courses and instruction, they cannot be implemented at the end of a course. Good teachers collect feedback constantly so they can make adjustments on the run.

Insisting that data from the end of one course should inform the design and implementation for the next one misses the point — teaching is dynamic and complex. You can take the same instructor, design, and content, but different batches of students will react differently.

SFTs also rely on self-reports by students. These are equivalent to the Kirkpatrick Level 1 “smiley sheets” that seek opinion rather than fact. If students like you, they will rate you higher than you deserve. The opposite is also true.

So what else can we do in addition to or as alternatives to SFTs? In my reflection earlier this year, I suggested “multiple methods, e.g., observations, artefact analysis, informal polling of students, critical reflection”. 

Today I would add that faculty portfolios capture these methods. Remarks from casual observations by fellow faculty, marked up video recordings, key takeaways from brief but regular student polls, and faculty reflections can be collated on online platforms like a blog or Google Site.

Portfolios have another plus: They put the ownership of the design, implementation, and evaluation of courses in the hands of teaching faculty. If these instructors carefully maintain their portfolios outside the university’s systems, they can take them wherever they go.

That said, portfolios do not resolve the biggest problem with SFTs. They might still be about teaching. What matters is whether students learnt, what they learnt, how much and how well they learnt it, etc. 

That problem is not an easy one to solve. Students might view courses merely as stepping stones to paper qualifications. There is the long tail of learning, i.e., their ah-ha moments might occur outside the course and these are not captured. Their in-course learning might not be intentional but still desirable, e.g., they learnt how to manage their time, but these too are not measured.

The biggest problem is that both administrators and faculty might be content with measuring the low-hanging fruit. After all, it is easy to hide behind the rock called It Has Always Been Done This Way.

Even though this tweet is from a parody account, there is truth in its humour. If you focus on the wrong thing, you get invalid measurements.

If you want to find out whether students have learnt anything, satisfaction tells you little: it sits at Level 1 of the Kirkpatrick scale, the lowest level of evaluation.

Worse than that, satisfaction with any experience does not reflect whether or what a student has learnt. A single factor, positive or negative, can sway student satisfaction, and any number of different factors could be that one.

Measuring satisfaction is easy and lazy, and it does not come close to getting a measure of learning. Good learning is difficult, challenging, and long-term. 

Educators and administrators should not be satisfied with satisfaction surveys. They target the wrong ends and shift teaching behaviours towards pleasing and pandering. 

SFT is short for student feedback on teaching. There is some variant of this initialism in practically every higher education course.

The intent of SFTs is the same everywhere: they are supposed to let the instructor know what they did well and what areas need improvement. However, they end up as administrative tools for ranking instructors and are often tied to annual appraisals.

The teaching staff might get the summary results so late, e.g., the following semester, that they cannot remediate. As a result, some teaching faculty game the process to raise their scores while doing the bare minimum to stay employed.

Using SFTs alone to gauge the quality of a course is like relying on just one witness to a traffic accident. It is not reliable. It might not even be valid if the questions are not aligned with the design and conduct of the course.

Instead, teaching quality needs to be triangulated with multiple methods, e.g., observations, artefact analysis, informal polling of students, critical reflection.

The tweet above provides examples of the latter two from my list. It also indicates why SFTs might not even be necessary — passionate educators are constantly sensing and changing in order to maximise learning.

The next tweet highlights a principle that administrators need to adopt when implementing multi-pronged methods. Trying to gauge good teaching is complicated because it is multi-faceted and layered.

You cannot rely only on SFTs, which are essentially self-reported exit surveys. This is like relying on one frame of a video. How do you know that the snapshot is a representative thumbnail of the whole video? At best, SFTs offer a shaky snapshot. Multiple methods are complicated, but they provide a more representative view of the video.

The number of likes this tweet received probably reflects the number of higher education faculty who can relate to it. 

By generalising the phenomenon we might conclude that we tend to focus on the negative. This is why newspapers and broadcasters tend to report bad news — it gets eyeballs and attention.

The underlying psychological cause is a survival instinct. We are primed to spot danger. Something negative is a possible threat, and we pay a disproportionate amount of attention to it.

But giving sensationalised news and one bad review too much attention is not good either. These might demoralise us and shift our energy away from what is important. 

What is important is making improvements. I do not place much weight on end-of-course evaluations because they are rarely valid or designed properly. 

Instead I focus on what happens at every lesson. I self-evaluate, I pick up cues as the lesson progresses, and I get feedback from my students. I do not wait for the end of a course because it is too late to do anything then. I prefer to prevent a ship from running aground.

 
I have had the privilege and misfortune of experiencing how student feedback on teaching (SFT) is done in different universities.

When I was a full-time professor, the institute I worked at specialised in teacher education and had experts in survey metrics. So no surprises — the SFTs were better designed and constantly improved upon.

One of the best improvements was the recognition that different instructors had different approaches. Each instructor had a set of fixed questions, but could also choose and suggest another set of questions.

Now, as an adjunct instructor and roving workshop facilitator, I have been subject to feedback processes that would not have passed the face validity test at my previous workplace.

One poor practice is administrators using only positive feedback to market their courses. Feedback, if validly measured, should be used to improve the next semester’s offering, not serve as a shiny star in a pamphlet.

Another bad practice is sampling a fraction of a class. If there is a sampling strategy, it must be clear and representative. Feedback is not valid if only some participants provide it.

Yet another SFT foible is not sharing the feedback with the facilitator or instructor. One institute that operated this way had multiple sections of a course taught by different instructors. However, the feedback form did not record each student’s primary instructor because classes were shared.

All the examples I described attempted to conduct SFT. None did it perfectly. But some were better informed than others. Might they not share their practices with one another? If they do, will institutional pride or the status quo stand in the way?

Today I offer another reason why one-size-fits-all end-of-course evaluations are not valid.
 

 
I have reflected on how I design and implement my classes and workshops to facilitate learning. I do not try to deliver content. The difference is like showing others how to prepare meals vs serving meals to them.

You would not evaluate a chef and a Grab delivery person the same way. Each has their role and worth, so each should be judged for that. Likewise student feedback on teaching (SFT) must cater to the design and implementation of a course.

 
I have never placed much weight on end-of-course feedback, even when the results were favourable. Why? My knowledge of research on such feedback and my experiences with the design of questions hold me back.

In my Diigo library is a small sample of studies that highlight how there is gender, racial, and other bias in end of course feedback tools. These make the data invalid. The feedback forms do not measure what they purport to measure, i.e., the effectiveness of instruction, because students are influenced by distractors.

Another way that feedback forms are not valid is in their design. They are typically created by administrators who have different concerns from instructors. The latter are rarely, if ever, consulted on the questions in the forms. As a result, students might be asked questions that are not relevant.

For example, take one such question I spotted recently: “The components of the module, such as class activities, assessments, and assignments, were consistent with the course objectives.” This seems like a reasonable question and it is an important one to both administrator and instructor.

An administrator wants alignment, particularly if a course is to be audited externally or to be benchmarked against other similar offerings elsewhere. An instructor needs to justify that the components are relevant to the course. However, there are at least three problems with such a question.

First, the objectives are not as important as outcomes. Objectives are theoretical and focus on planning and teaching, while outcomes are practical and emerge from implementation and learning. Improvement: Focus on outcomes.

The second problem is that it will only take one component — an activity, an assessment, or an assignment — to throw the question off. The student also has the choice to focus on one, two, or three components. Improvement: Each component needs to be its own question.

Third, not all components might be valid. Getting personal, one of the modules I facilitate has no traditional or formal assessment or assignments. The student cannot gauge a non-existent component, so the question is not valid. Improvement: Customise end of course forms to suit the modules.

Another broad problem with feedback forms is that they are not reliable. The same questions can be asked of different batches of students, and assuming that nothing else changes, the average ratings can vary wildly. This is a function of the inability to control for learner expectations and a lack of reliability testing for each question.
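
To make the reliability point concrete, here is a minimal sketch in Python of two checks an evaluation office could run before trusting per-question scores: internal consistency (Cronbach’s alpha) and the stability of item means across batches. The ratings below are made up purely for illustration and are not drawn from any real SFT.

```python
# Minimal sketch: two basic reliability checks on SFT ratings.
# All numbers below are invented for illustration only.
from statistics import mean, pvariance

# Hypothetical 1-5 ratings: rows are students, columns are questionnaire items.
batch_a = [
    [5, 4, 5, 4],
    [3, 3, 4, 2],
    [4, 5, 5, 5],
    [2, 3, 2, 3],
]
batch_b = [
    [4, 2, 3, 3],
    [5, 5, 4, 5],
    [3, 4, 3, 2],
]

def cronbach_alpha(rows):
    """Internal consistency of the items (values closer to 1 = more consistent)."""
    k = len(rows[0])                                        # number of items
    item_vars = [pvariance([r[i] for r in rows]) for i in range(k)]
    total_var = pvariance([sum(r) for r in rows])           # variance of total scores
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

def item_means(rows):
    """Mean rating per item, to compare stability across batches."""
    return [round(mean(r[i] for r in rows), 2) for i in range(len(rows[0]))]

print("Cronbach's alpha, batch A:", round(cronbach_alpha(batch_a), 2))
print("Cronbach's alpha, batch B:", round(cronbach_alpha(batch_b), 2))
print("Item means, batch A:", item_means(batch_a))
print("Item means, batch B:", item_means(batch_b))  # large drifts hint at unstable items
```

Neither check proves that a question measures learning, but if the numbers swing wildly between batches with nothing else changed, that is a hint the instrument is doing something other than measuring teaching.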

End of course evaluations are convenient to organisers of courses and modules, but they are pedagogically unsound and lazy. I would rely more on critical reflection of instructors and facilitators, as well as their ability to collect formative feedback during a course to make changes.

I object to end-of-course student evaluations, particularly if the course is, say, only two sessions deep. Heck, they can happen at the end of a half semester (after about six sessions) or a full semester (about double the number of sessions) and I would still object.

This is not because I got poor results when I was a teaching faculty member. Quite the opposite. I had flattering scores that were often just shy of perfect tens in a variety of courses I used to facilitate.

No, I object to such evaluations because they are rarely valid instruments. While they might seem to be about the effectiveness of the course, they are not. These evaluations are administrative and ranking tools for deciding which courses and faculty to keep.

Course evaluations are also not free from bias. Even if the questions are objective, the participants of the questionnaire are not. One of the biggest problems with end-of-course evaluations is that they can be biased against women instructors [1] [2] [3].

I would rather focus on student learning processes and evidence of learning. Such insights are not clearly and completely observable from what are essentially perception surveys.

If administrators took a leaf from research methodology, they might also include classroom observations, interviews, discourse analysis (e.g., of interactions), and artefact analysis (e.g., of lesson plans, resources, assignments, and projects).

But these are too much trouble to conduct, so administrators settle for shortcuts. Make no mistake, such questionnaires can be reliable when repeated over time, but they are not valid for what they purport to measure.

Some might say that end-of-course evaluations are a necessary evil. If so, they could be improved to focus on processes and products of learning. This article by Faculty Focus has these suggestions.


Are there any takers?

Yesterday I reviewed how we use Google Forms and Spreadsheets at the CeL. I also mentioned how I use the same in my courses. Today I envision how we might use a GF-like system in NIE.

Near the end of every course, we conduct SFT, student feedback on teaching. This is great except that this SFT:

  • utilizes unnecessary human labour
  • is paper-and-pencil based
  • gets done with only one class out of the X number that you teach/facilitate
  • is based largely on an old model of teaching

The SFT requires at least one administrator to collate and coordinate requests, to bring the forms and pencils to the venue, brief the participants and collect the materials. The forms then need to be scanned and the data processed confidentially. There is nothing wrong with that unless you realize that this can (and should) be done electronically.

I am told that students at the National University of Singapore provide feedback on teaching this way. If they don’t complete SFTs, they do not get to sit for their exams or get their grades. So it is not as if the idea of electronic SFT is impossible.

One objection to this form of SFT is that our student teachers need computers to do this. They already have access to mobile computing devices. If they don’t already own one, they are provided with one when they step into our hallowed halls. They can complete the form on an iPad or smartphone while travelling home on the train or in the comfort of their dorm rooms.

Another objection is that SFT ratings will drop if the forms are electronic. Really? Ratings drop if your teaching sucks. If your learners are going to provide very positive or negative feedback, they are going to do so whether they do it in person with pencil and paper or on screen. Furthermore, as electronic SFTs do not require one or more admin folks to run around with the forms, all classes that you teach can be surveyed. This reduces bias and is a better reflection of your abilities as an educator. Or, with more than one SFT available, you can choose the best result from a pool.
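
As a rough sketch of what the electronic version might look like behind the scenes, the snippet below assumes that responses from a Google Form have been exported to a CSV file. The file name, the class_code column, and the q1 to q7 rating columns are all hypothetical; the point is simply that every class an instructor teaches can be pooled and summarised with almost no admin labour.

```python
# Minimal sketch: pool exported e-SFT responses and summarise them per class.
# File name and column names (class_code, q1..q7) are assumptions for illustration.
import csv
from collections import defaultdict
from statistics import mean

def summarise(path="sft_responses.csv"):
    scores = defaultdict(list)                 # class_code -> list of per-response means
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            items = [int(row[key]) for key in row if key.startswith("q")]
            scores[row["class_code"]].append(mean(items))
    # For each class: number of responses and overall mean rating.
    return {code: (len(vals), round(mean(vals), 2)) for code, vals in scores.items()}

if __name__ == "__main__":
    for class_code, (n, avg) in summarise().items():
        print(f"{class_code}: {n} responses, mean rating {avg}")
```

A summary like this is still only Level 1 data, of course, but at least it covers every class rather than the one section an administrator happened to visit with pencils and paper.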

But getting the SFT process from paper to electronic form means messing with entire systems, i.e., exams, staff appraisal, human resources, LMS, etc. The easiest and most important thing to change now is the type of questions. They are still oriented to the teacher as content expert. The categories of the seven questions in our current SFT are:

  1. preparation and organization*
  2. knowledge*
  3. enthusiasm for the subject*
  4. learning and thinking
  5. delivery*
  6. effectiveness*
  7. overall rating*

I view the asterisked* items as traditional measures. Some might argue that enthusiasm is not content-specific, but I wonder how you might be passionate about something you have little knowledge of. And in light of how heavily biased the questionnaire is on traditional teaching, how else are the participants supposed to evaluate “effectiveness” and “overall rating”?

The only measure that comes close to a more progressive and TE21-relevant question is whether the instructor promoted thinking.

Don’t get me wrong. All the measures are still important at the moment. They need some more relevant inclusions. Why? Just as teachers tend to teach to the test, teacher educators will want to meet the criteria in the SFT. If these don’t change, most people won’t change.

