Björn Rohles rohles.net

Evaluation in Human-Centered Training Management: Measuring What Matters in Education

Last update: Reading time: 12 minutes Tags: education, evaluation, human-centered design, training management

Education impacts how we think, work, and contribute to the world. Yet as education providers, we mainly measure what is easy rather than what matters. Our definition of educational quality determines what we need to pay attention to and what we risk overlooking. This is why we need clarity about what educational quality is and how to measure it.

Measurement matters for management and strategy, but it is not an end in itself: It needs to align with its strategic purpose. In educational institutions, the central strategic purpose is to ensure educational quality. Building on Human-Centered Training Management, this article examines the measurement of educational quality.

The human-centered training management cycle. At the beginning, there is a strategy outlining the vision, objectives, and principles of training management. It is connected to the cycle itself, which can start at any point. One phase is exploration and analysis, where the audience and their needs are understood. The next is defining requirements, which includes content, activities, and learning objectives. Based on this, course materials are created. Finally, the course is evaluated against the requirements, in both experience and outcomes. This is the focus of this article, visualised with a magnifying glass. From this evaluation, arrows lead back to earlier phases, or alternatively towards repeating the course, always including feedback in a loop. Finally, it is possible to sunset the course, meaning it is stopped.
Evaluation within the Human-Centered Training Management Cycle

Measuring what is easy

When a measure becomes a target, it ceases to be a good measure.
Goodhart’s Law, as stated by Marilyn Strathern (1997)

What I observe in educational organisations is a range of metrics to define “quality”, such as:

  • sign-up number: how many participants enrolled
  • room occupancy: the ratio of booked rooms to total number of rooms
  • show-up rate: the ratio of participants who actually participated to total number of participants
  • number of courses or training hours: how many courses or training hours we have programmed
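Since these output metrics are simple ratios, they are trivial to compute, which is part of their appeal. The sketch below shows just how little is involved; all function and field names are illustrative, not taken from any real training-management system.

```python
# Sketch of the output metrics above as simple ratios.
# All names (fields, functions, example values) are illustrative.

def show_up_rate(attended: int, enrolled: int) -> float:
    """Ratio of participants who actually participated to total enrolled."""
    return attended / enrolled if enrolled else 0.0

def room_occupancy(booked_rooms: int, total_rooms: int) -> float:
    """Ratio of booked rooms to the total number of rooms."""
    return booked_rooms / total_rooms if total_rooms else 0.0

course = {"enrolled": 25, "attended": 20}
print(f"show-up rate: {show_up_rate(course['attended'], course['enrolled']):.0%}")
print(f"occupancy: {room_occupancy(3, 4):.0%}")
```

The ease of computing such numbers is exactly why they dominate dashboards, even though they say nothing about outcomes.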

These are what I would consider output variables. They are easy to measure. They are straightforward to understand. And of course, they are at least somewhat related to quality: We can hardly be successful in training management if nobody signs up. The problem is that they are incomplete. We can have many people sign up and still fail miserably. Output is simply not the same as outcome. But if we want to measure outcomes, we need to be aware of what actually counts in education.

What counts in education: Quality dimensions

However, not everything that can be counted counts, and not everything that counts can be counted.
William Bruce Cameron (1966)

Education is not about sign-up numbers, room occupancy, or show-up rates. Instead, it is about educational quality, which ultimately means fulfilling the requirements of a course. Educational quality comes in different flavours:

  • Functional quality is about what learners need to achieve (the jobs-to-be-done), such as developing a particular skill, performing specific tasks, or working on a defined project.
  • Hedonic quality is psychological and emotional. It creates the necessary conditions for learning and ultimately turns learning into positive experiences.
  • Eudaimonic quality is about meaning – a feeling that a learning experience deeply matters for what is important to a learner. It covers making an impact in life, feeling aligned with your values, and becoming the best version of yourself.
  • The final form of quality extends beyond a single person, which is why I want to call it beyond-self quality. It addresses impacts for society, humanity, or even all living beings, such as sustainability, equality, or democracy.
Educational quality has four dimensions: functional (the jobs-to-be-done, doing), hedonic (psychological, feeling), eudaimonic (meaning, values, transformation, the self) and beyond-self quality (about others, like society, humanity, or all living beings).
Dimensions of educational quality

Knowing about these dimensions of quality is an important first step in Human-Centered Training Management, as it allows defining learner expectations for a particular course or other learning experience. However, defining them is not enough: At some point, we will need to assess whether a course actually fulfils these expectations. Therefore, it is necessary to dive deeper into the evaluation of these dimensions of educational quality.

Evaluation of educational quality

In my article defining quality in Human-Centered Training Management, I identified some candidate criteria for the different dimensions of quality, but I did not go deeply into how to measure them. Three considerations are particularly relevant for measuring educational quality.

Perceived and observed metrics

First, metrics can be perceived or observed:

  • Perceived metrics measure how a person feels or thinks about something. We collect them by asking the person about their impressions.
  • Observed metrics address what you can see or test from the outside.

Subjective criteria like satisfaction necessarily require perceived metrics. Other criteria allow different options, such as perceived learning or actual test scores. Perceived and observed metrics also do not necessarily align: You could have the subjective impression that you understood everything, but still make errors in a test. On the other hand, evidence is stronger when perceived and observed metrics align, like when we observe high collaboration in a course and learners indeed report strong feelings of connectedness.
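The alignment check described above can be made concrete: compare a learner's self-reported understanding with an observed test result and flag where they diverge. The scales, thresholds, and names below are assumptions for illustration, not established standards.

```python
# Illustrative check of whether perceived and observed evidence align
# for one learner. Scales and thresholds are assumptions, not standards.

def evidence_alignment(perceived_understanding: float, test_score: float,
                       scale_max: float = 5.0, tolerance: float = 0.2) -> str:
    """Compare a self-reported understanding rating (1..scale_max) with
    an observed test score (0..1); flag when the two diverge."""
    perceived_norm = perceived_understanding / scale_max
    gap = perceived_norm - test_score
    if abs(gap) <= tolerance:
        return "aligned"        # stronger evidence: both sources agree
    return "overestimates" if gap > 0 else "underestimates"

# A learner who feels confident but performs poorly on the test:
print(evidence_alignment(4.5, 0.55))
```

A flag like "overestimates" is not a verdict; it is a prompt to collect more evidence from other perspectives before drawing conclusions.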

There are also different perceptions about what counts as valid evidence. For example, some might consider quantitative metrics more “objective”, and thus prefer a numerical satisfaction score like 4.2 out of 5 to anecdotal statements like “this course helped me make a project come true I always wanted to do”. But the latter is much more meaningful for eudaimonic educational quality. This does not make one metric better than the other: Every dimension of educational quality builds on a different idea of what learning is:

  • Functional quality ultimately defines learning as acquisition. Learners have a skill or they do not. And a test or project can objectively reveal this.
  • Hedonic quality defines learning as an experience. But by definition, an experience is subjective. We might be able to observe some evidence, such as a person looking nervous. But experiences ultimately need to be reported.
  • Eudaimonic quality is much more transformational and reflective. It unfolds over time. We might only see it in retrospect or after a strong reflection.
  • Functional, hedonic, and eudaimonic quality all relate to the individual. By definition, beyond-self quality does not. As a consequence, it is much harder to evaluate. Rather than measuring in the strict sense, we need to ask questions about ethics and the role of education as a whole, and of our training in particular, for our values and the greater good. For example, how do we address the massive energy consumption of the fancy AI tools that we discuss in our courses? Or how do we speak about social media, which undoubtedly connects people and gives them a voice, but has also fuelled fake news and ghost work (Gray & Suri, 2019)? These questions are not easy, but they are required.

So each of the quality dimensions implies a different epistemology of evidence. And in practice, we need a good balance of metrics for these quality dimensions. However, knowing about the different metrics is just one step towards solid measurement. We also need to consider when this measurement happens and who is doing it.

Timing of measurement

Second, the timing and frequency of measurement makes a difference. In my experience, most educational institutions measure only once immediately at the end of a learning experience, for example in a survey after the course took place. This is useful for collecting feedback while it is still fresh in the minds of learners. However, measuring at the right moment matters as much as measuring the right thing.

Measurement can happen at various times: before a learning experience, at its start, while it is running (during the learning experience), at its end, or after it
Different timings of data collection about learning experiences

General insights about learners and their needs are typically collected before a learning experience, for example to identify learning objectives during the definition of training requirements.

When learner expectations or insights into their prior knowledge need to be collected, this typically happens directly at the start of a learning experience. A trainer can use these insights to adapt the course content or pace.

Some metrics are most useful during a training. For example, asking about mental load during a course can help a trainer adapt the course to learners’ needs: They can slow down and re-explain complicated content when necessary, or go faster when content is easy. These in-the-moment measurements are important for training managers as well. For example, when we know that hedonic quality tends to be low at the beginning of courses, it would be useful to observe the first hour of a course to identify what is going on. Is it that people arrive in a room full of strangers and feel uncomfortable? Is it that they have to log in to a system with training exercises and struggle to find their passwords? Data and observations like this help training managers work on systematic enhancements of these key moments.

Other metrics only make sense when measured some time after a learning experience or repeatedly over a longer period of time, such as transfer of knowledge into one’s personal context or work environment. Eudaimonic quality is a good example: We might have a vague feeling during a course that it helps us realise our potential. But ultimately, eudaimonic quality is about transformation. So whether the course really achieved that becomes only visible in hindsight, by observing how we evolved afterwards.

Who is measuring

Third, who is measuring is an important consideration. We often have the tendency to go for the easy, readily available option. But every decision we make impacts which data we collect and how to interpret it:

  • Learners can assess themselves, but are subject to various biases. An example is the Dunning-Kruger effect, a cognitive tendency of people with relatively low competence to overestimate their skills.
  • A trainer is mainly able to observe participants in the moment, but has limited capacity for adapting to specific individual needs.
  • A mentor might be able to analyse statements and observe evidence of competence, such as a portfolio. This allows the mentor to provide specific guidance, but requires substantial investment of time and energy.
  • An external person, such as the manager of an employee who participated in a course, might not know much about the training itself, but could notice a change in behaviour of a learner.
  • A training manager systematically collects data across training needs, learners, trainers, courses, and time. Especially in Human-Centered Training Management, data triangulation is key. Training managers can achieve a valid evaluation only by considering a large variety of these perspectives.

Deciding who is measuring is also related to questions of power and trust. Measuring gives authority to the person doing it. For example, if we organise an on-demand course and let an HR manager assess the course quality, we de-emphasise the perspective of the actual learners, who might have very different learning objectives and priorities (unless we counter-balance this again by also speaking to learners). Questions of power are tricky and there is hardly one correct answer. But it is important to understand that the choice is not neutral and must, therefore, be deliberate.

Trust is particularly important for perceived metrics. For example, learners need to feel the psychological safety to report their honest impressions. Anonymous or pseudonymous data collection can encourage them to do this.

Towards a framework for measuring educational quality

In my earlier article on educational quality, I already identified some candidate criteria for functional, hedonic, eudaimonic, and beyond-self quality. Mapping them against the three evaluation dimensions explored here is a first step towards a framework for measuring educational quality.

  • As part of functional quality, effectiveness refers to whether people learn the required skill in a course. We could measure it by observing learners perform tasks or answer questions (observed metrics by the trainer), but we could also ask whether they feel comfortable in the new skill (perceived metric, reported by the learner). Measurement could take place during a learning experience or at its end.
  • However, effectiveness does not tell the whole story. You might be effective in the short-term, directly after the training; but two weeks later, you are not able to apply anything in your real-life context. This is where transferability comes into play. It is also part of functional quality, but needs to be assessed repeatedly over a longer period of time, either by the learners themselves or by an external observer such as a manager or mentor. Both perceived and observed metrics could be used.
  • Stimulation is part of hedonic quality and refers to how engaged a learner feels. It is typically measured as a perceived metric by the learner during or after a learning experience.
  • Eudaimonic quality includes self-actualising, that is, whether the learning experience helps learners realise their full potential. Evaluating it requires repeated, long-term reflections by a learner (perceived metric) and potentially a mentor.
  • Societal impact is part of beyond-self quality, for example whether a learning experience supports democracy. This is beyond classical measurement, but ethical reflections by every party involved can still help to assess it on an ongoing basis.

The table below provides a summary of these examples.

Quality dimension | Criterion        | Perceived / Observed | Timing          | Who
functional        | effectiveness    | both                 | during or after | trainer, learner
functional        | transferability  | both                 | long-term       | learner, external
hedonic           | stimulation      | perceived            | during or after | learner
eudaimonic        | self-actualising | perceived            | long-term       | learner, mentor
beyond-self       | societal impact  | ethical reflection   | ongoing         | everyone
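For training managers who triangulate data across many courses, the example criteria could be encoded as a small, explicit evaluation plan. The sketch below is one possible shape for such a plan; the structure and all names are illustrative, not part of the framework itself.

```python
# The summary of example criteria, encoded as a small evaluation plan
# a training manager could iterate over. Structure and names are illustrative.

from dataclasses import dataclass

@dataclass
class Criterion:
    dimension: str    # functional, hedonic, eudaimonic, beyond-self
    name: str
    metric_type: str  # perceived, observed, both, ethical reflection
    timing: str
    who: list[str]

PLAN = [
    Criterion("functional", "effectiveness", "both", "during or after",
              ["trainer", "learner"]),
    Criterion("functional", "transferability", "both", "long-term",
              ["learner", "external"]),
    Criterion("hedonic", "stimulation", "perceived", "during or after",
              ["learner"]),
    Criterion("eudaimonic", "self-actualising", "perceived", "long-term",
              ["learner", "mentor"]),
    Criterion("beyond-self", "societal impact", "ethical reflection", "ongoing",
              ["everyone"]),
]

# For example, list every criterion that needs long-term follow-up:
long_term = [c.name for c in PLAN if c.timing == "long-term"]
print(long_term)
```

Making the plan explicit like this forces the deliberate choices discussed above: every criterion has to state when it is measured and by whom, so nothing defaults silently to an end-of-course survey.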

Conclusion

Because measurement matters for organisations, education providers too often reach for easy metrics like sign-up numbers and room occupancy as indicators of their “success”. These easy metrics can even become targets to optimise for: Goodhart’s law in action. And it means ignoring that educational quality is ultimately about something else: It has functional (“jobs-to-be-done”), hedonic (how education feels), eudaimonic (how education makes a person a better self) and beyond-self dimensions (how education contributes to society, humanity, and living beings as a whole). Education providers need a good balance of metrics to assess these dimensions. Defining them requires deliberate and informed choices about what to measure, when, and by whom. At the same time, we need to be aware that not everything can be measured – but just because we cannot measure something does not mean it does not count.

References

Cameron, W. B. (1966). Informal sociology: A casual introduction to sociological thinking (Bd. 21). Random House.
Gray, M. L., & Suri, S. (2019). Ghost work: how to stop Silicon Valley from building a new global underclass. Houghton Mifflin Harcourt.
Strathern, M. (1997). ‘Improving ratings’: Audit in the British University system. European Review, 5(3), 305–321. https://doi.org/10.1002/(SICI)1234-981X(199707)5:3%3C305::AID-EURO184%3E3.0.CO;2-4
