Goal attainment scaling.

On Friday, 2nd July 2021 (at 05.00 hrs), I participated in a debate organised by Professor Barbara Wilson and hosted virtually in Melbourne, Australia. The discussion concerned the use of goal attainment scaling (GAS). Two speakers supported its use clinically, in audit, and in research; two opposed it. At the outset, only 4% of listeners opposed its use; this rose to 19% by the end. Support dropped from 73% to 52%, with 23% undecided at the start and 29% at the end. This post explores the ideas and arguments put forward. It concludes that clinical goal attainment scaling adds little to the success of setting goals but carries a significant and unacknowledged risk of causing harm to a patient. Much more importantly, its misuse by commissioners and non-clinical managers poses a major risk to the process and provision of rehabilitation. This misuse continues despite explanations about the inappropriateness of its use as an audit tool or in determining service outcomes. By using goal attainment scaling at all, rehabilitation professionals and services threaten their own services.

What is Goal Attainment Scaling (GAS)?

The principle is simple. As part of rehabilitation, goals are set by the team. Many goals relate to change in a patient’s clinical state, such as experiencing less pain, walking further with less effort, shopping in a local shop successfully, or not forgetting that something is cooking.

The patient’s wishes and preferences should guide all goals, though this is sometimes impossible (e.g. if the patient is unconscious, or if moderating anti-social behaviour is a goal). With patients who can participate, negotiation between the therapist (or the team) and the patient will usually identify more or less feasible goals. Usually, a time scale for achievement or review is set. Setting goals is integral to rehabilitation, even if its beneficial effect is challenging to detect. (here)

The first component added to setting goals is the attainment of goals. If a goal can be specified sufficiently to allow an observer to determine whether it has been achieved, one can record whether it was achieved by the agreed time or at a specific time. Even recording a simple judgement of attainment, yes or no, is fraught with risks and difficulties. Some are:

an excessive focus on outcomes that can be measured to determine achievement may lead to a failure to attend to what is important. Goals of physical activities get preferred to goals concerned with adjustment, emotional state, interpersonal relationships etc.
a lack of attention to the quality of performance, such as how the patient feels about their performance; they might walk 100 metres but be so embarrassed by how they walk or by using a stick (cane) that they do not leave the house
the influence of variability, with the patient being able to perform at the requisite standard but only on a good day, or only after 20 minutes of preparation; they cannot perform when they want or need to.

The second component added is scaling, measuring the degree of attainment. A ‘pass/fail’ approach seems inappropriate for many reasons, particularly if the target is specified precisely to a date. The usual practice is to quantify the extent of attainment using descriptors such as: no change; some progress but target not achieved; achieved target; exceeded the target somewhat; greatly exceeded the target. As a descriptive approach, this seems reasonable. It allows discussion about the likelihood of further progress, why progress is less (or more) than expected, and how the rehabilitation should be altered.

Humans like numbers; they appear more scientific and can be manipulated arithmetically. It was inevitable that these verbal descriptors should be turned into numbers. It was equally inevitable that the numbers would be used as a measure. The numbers usually range between -2 (no change) and +2 (greatly exceeded), with 0 being the target.
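
The coding of descriptors as numbers can be made concrete with a short sketch. The Python below is illustrative only: the names `GasLevel`, `DESCRIPTORS` and `describe` are hypothetical, and the descriptor wording paraphrases the five levels listed above rather than any published scale.

```python
from enum import IntEnum


class GasLevel(IntEnum):
    """Five-point coding described in the text; the member names are hypothetical."""
    NO_CHANGE = -2
    SOME_PROGRESS = -1
    ACHIEVED_TARGET = 0
    EXCEEDED_SOMEWHAT = 1
    GREATLY_EXCEEDED = 2


# Verbal descriptor attached to each numeric code (wording is an assumption).
DESCRIPTORS = {
    GasLevel.NO_CHANGE: "no change",
    GasLevel.SOME_PROGRESS: "some progress but target not achieved",
    GasLevel.ACHIEVED_TARGET: "achieved target",
    GasLevel.EXCEEDED_SOMEWHAT: "exceeded the target somewhat",
    GasLevel.GREATLY_EXCEEDED: "greatly exceeded the target",
}


def describe(score: int) -> str:
    """Return the verbal descriptor for a numeric code between -2 and +2."""
    # GasLevel(score) raises ValueError for any value outside -2..+2.
    return DESCRIPTORS[GasLevel(score)]


print(describe(0))
```

In this sketch the numbers are only labels for ordered categories; nothing in the coding itself makes the step from -2 to -1 equal in size to the step from +1 to +2.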

Many articles have been written about goal attainment scaling in rehabilitation. Two commonly referenced and used articles describe a method of identifying descriptors for the various levels (here) and a method of recording and using the scores (here). Suffice it to say, one needs to discuss with the patient goals that they want to achieve and how they might be measured, and then agree on a target and two lower and two higher levels. This can take a lot of time. Shai Betteridge, in the debate, reported that negotiations might take three weeks in the inpatient rehabilitation service at the Wolfson Rehabilitation Centre, London. I suspect most people are somewhat quicker, but setting three or four goals may still take several hours.

Arguments in favour

Many people support goal attainment scaling for many stated reasons. I will outline some of the commonly given reasons.

Patient-engagement.
One oft-repeated reason for using goal attainment scaling is that it increases patient understanding and engagement. It is said, without evidence, that discussing specified, measurable levels, with a target outcome and levels of achievement both above and below it, helps motivate the patient.

Engaging the patient in the rehabilitation process is essential, and discussing goals and identifying what is important to them may help increase their commitment. However, identifying and specifying measurable levels below and above the general target may be more likely to disengage them. There is no evidence that it increases the patient’s commitment to the goal.

Increased patient engagement in the rehabilitation process is not a benefit of goal attainment scaling; it is a consequence of setting goals.

Patient-centred outcome.
A second, closely related reason often given is that it makes rehabilitation focus on a patient’s wishes and priorities and that outcomes measured are personalised to the patient.

As with the engagement argument, all goal-setting should be centred on the patient’s priorities. There is no added benefit to being patient-centred through using goal attainment scaling. Indeed the increased focus on measurement increases the risk of avoiding some important areas because measurable levels are too difficult to define.

Personalised goals set using goal attainment scaling are measurable if it is accepted that the measurement technique is valid. The method may need to be validated.

Quantification of the outcome.
The third, completely separate argument concerns measurement. The argument goes as follows:

a significant feature of rehabilitation is the setting of specific goals for each patient;
standardised outcome measures are the antithesis of patient-centred, context and situation-specific outcomes;
a means to quantify the achievement of patient-specific goals will allow personalised measurement of change;
one method quantifies the extent to which a patient achieves a goal – goal attainment scaling.

The first point is accepted, with the caveat that only some goals relate to the patient’s ability to perform an activity. For example, a goal might be to identify a care agency able to meet the person’s needs, a group for a patient to join, or a person willing to employ someone with the person’s strengths and limitations.

The second point is not accepted as absolute. Many standardised measures are developed in response to patient goals in general, and they are increasingly designed with patient input. Many do cover more activities than a specific goal will usually focus on, but when the patient’s overall outcome is considered, standardised measures are often relevant to a patient and concordant with their general ambition. They may not identify a change in one goal, but they may detect change in areas critical to the patient.

The third statement is self-evidently true. But it does not follow that goal attainment scaling is the only or the best way to measure change. For example, suppose the goal was to walk to a shop, buy a few items, and return. In that case, a host of aspects of this activity can be quantified by timing or counting: the overall time taken; the number of items remembered; returning with the correct amount of money; the number of falls; the number of prompts from someone else; and so on. One could measure time, or quantify whichever aspect was expected to be difficult.

The fourth statement is true, but it fails to acknowledge that there are other ways of quantifying change in relation to a goal, or that goal attainment scaling may be less than ideal.

Arguments against goal attainment scaling

As the poll before the debate showed, only a few people oppose its use. However, many people probably have no opinion. Many use alternative methods because they find them more accessible or relevant, without necessarily considering why they do not use goal attainment scaling.

An educational reason.
Rehabilitation can be considered analogous to education (here), and in this analogy, goals are the educational objectives or outcomes aimed for. Assessment of rehabilitation progress and effect can be seen as analogous to work-based educational reviews. Thus, assessing a patient’s current state within the rehabilitation process is similar to determining a learner’s educational progress.

Within healthcare training, two types of assessment are recognised, though it is likely that all evaluations are a mixture:

summative; this categorises a student or trainee as achieving or not achieving some standard of performance.
formative; this identifies where the student or trainee is doing well, where progress is less satisfactory and needs more attention, and suggests improvements.

Using this categorisation – summative or formative – goal attainment scaling is firmly summative, with no real formative component. Any patient being assessed on a goal attainment scale will feel that their progress is being classified in a judgemental way – worse than expected, as expected, or better than expected. Given the name – goal attainment scaling – it is hardly surprising if a patient considers it a judgemental, ‘high stakes’ assessment.

Of course, the assessor will usually interpret and use the result to plan a way forward. Nevertheless, the patient’s perspective is likely to differ; more importantly, others, such as family members, other team members, managers and commissioners (who are paying), may see the result as a judgement. (For further discussion of summative and formative assessments, their perception and use, see here.)

Psychometrics.
The second area of concern relates to its lack of sound psychometric properties when used as a measure. This applies whether it is used for a single goal in an individual patient, as one of a set of goal attainment scale measurements for a patient, or, more probably, as a group measure in audit or research. The problem with its use as a numerical scale is well explained by Professor William Levack in writing here (Chapter 5, pp 91-110) and audio-visually here (in 6 minutes 12 seconds).

One obvious flaw is that the various levels (-2, -1, +1, +2) are unlikely to form an equal interval scale. Indeed, the ‘expected’ point (= 0) is likely to be located near one end and is unlikely to be central. In a group of goal attainment scale scores, it is equally unlikely that all the expected points will be at the same relative point between -2 and +2. One patient (or one outcome of several for one patient) may have a 0 close to +2 and another a 0 close to -2.
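
A small invented example may make the point concrete. Suppose, purely for illustration, that two walking goals have their five levels defined as distances in metres. The thresholds, and the helper names `level_gaps` and `target_position`, are assumptions made for this sketch and are not data from any study.

```python
def level_gaps(thresholds):
    """Distance between each pair of adjacent levels (-2 to -1, -1 to 0, and so on)."""
    return [upper - lower for lower, upper in zip(thresholds, thresholds[1:])]


def target_position(thresholds):
    """Where the 0 level sits between -2 (0.0) and +2 (1.0) on the underlying measure."""
    lowest, target, highest = thresholds[0], thresholds[2], thresholds[4]
    return (target - lowest) / (highest - lowest)


# Hypothetical thresholds in metres for levels -2, -1, 0, +1, +2.
goal_a = [10, 20, 100, 110, 120]  # target sits close to the +2 level
goal_b = [10, 20, 30, 200, 400]   # target sits close to the -2 level

print(level_gaps(goal_a))
print(round(target_position(goal_a), 2), round(target_position(goal_b), 2))
```

Under these assumed thresholds, the steps between adjacent levels of the first goal are 10, 80, 10 and 10 metres, and the target lies about 82% of the way along the range for one goal and about 5% for the other. Both targets would nonetheless be coded as 0.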

The mathematical transformations are equally subject to uncertainty and considerable debate about their validity.
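
The transformation most often cited in the goal attainment scaling literature is the T-score of Kiresuk and Sherman. As a minimal sketch, assuming that formula and its conventional inter-goal correlation of 0.3 (the function name `gas_t_score` and the example scores are hypothetical):

```python
import math


def gas_t_score(scores, weights=None, rho=0.3):
    """Kiresuk-Sherman style T-score for one patient's set of goals.

    scores  : attainment codes, each between -2 and +2
    weights : one weight per goal (assumed equal to 1 if omitted)
    rho     : assumed correlation between goal scores (0.3 by convention)
    """
    if not scores:
        raise ValueError("at least one goal score is required")
    if weights is None:
        weights = [1.0] * len(scores)
    if len(weights) != len(scores):
        raise ValueError("scores and weights must have the same length")

    weighted_sum = sum(w * x for w, x in zip(weights, scores))
    sum_sq_weights = sum(w * w for w in weights)
    sq_sum_weights = sum(weights) ** 2
    denominator = math.sqrt((1 - rho) * sum_sq_weights + rho * sq_sum_weights)
    return 50 + 10 * weighted_sum / denominator


# Three equally weighted goals, each coded +1 (invented example).
print(round(gas_t_score([1, 1, 1]), 1))
```

The arithmetic treats the -2 to +2 codes as interval data and depends on an assumed correlation and on the chosen weights; these are among the assumptions that the debate about validity concerns.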

Use of goal attainment scaling.
The most significant concern, and the strongest argument against its use, is the frequent and severe misuse of the scores.

Within a strictly confidential patient-therapist relationship, and using the verbal descriptors alone, it will usually be used appropriately, provided the professional does not use the numbers and takes care when interpreting and acting on the information. It is difficult to see much benefit, but within these parameters, the only residual risks are (a) patient perception of failure if they do not reach their goal and (b), more importantly, failure to attend to important but less easily quantified goals.

However, using aggregate numerical data derived through goal attainment scaling carries significant risks. As William Levack shows, it is psychometrically flawed. It is not only comparing or adding watermelons, mangoes and kiwi fruit but also ignoring the variations in size between and within the different fruits.

Resources used and opportunities missed.
The last consideration is practical. It takes time and effort to identify goals amenable to scaling and then more time to negotiate four additional levels. This risks not giving attention to other essential actions or interventions for the patient.

Further, in a research context, knowing who should set the goals is difficult. If it is the treating therapist, the risks of bias are significant: selection of more straightforward (or more complex) goals, setting easier or more difficult levels, and so on. If it is an external, neutral person, the goals may not be agreed upon by the treating therapist or team. It is also challenging to know who should measure the outcomes: the person who set them or another person. Inter-rater reliability may be low. (here, and see comments here)

Broader considerations

My biggest concern arises from the misuse and misinterpretation of data collected, primarily by managers and those paying, but probably also by service therapists and teams, patients, and the public.

The positive selling points are powerful: person-centred outcomes of importance to the patient are quantified in ‘objective scientific data’ (numbers). What could be better?

What is not acknowledged is that the process:

may not measure the goals of most importance to the patient because the goal cannot readily be measured and broken up into different levels;
produces invalid numerical data that cannot be analysed using parametric methods, if at all;
is subject to other factors that influence both the selection of the goal and the setting of expected and different levels;
can take a long time, a time that could be better used in other ways;
emphasises goal attainment over a more critical evaluation of whether the goal was correct and whether different approaches would be better.

The reply to my concerns about misuse given in the debate was that “we need to educate them [commissioners etc.]”. At one level, this is correct, but it shows a touching and, in my view, naive faith in the power of education. Commissioners want outcome data, and goal attainment scaling fits their needs very well because patients cannot complain. Commissioners will ignore references to published literature; unwelcome and inconvenient facts are easily overlooked. Commissioners already pay too little attention to, or fail to follow, strong NICE guidance, for example, concerning fertility services. Most commissioners lack training in science and clinical practice, and they change with depressing regularity as re-organisation occurs again and again.

Moreover, they will point to its use and say, “Look, you are using it, so why shouldn’t we?”. They will find someone who supports their use, verbally or in writing.

While goal attainment scaling is used in any way in rehabilitation, we (as a community) validate their misuse of the process. Given that the process of attainment scaling itself has virtually no person-centred benefits that cannot be achieved equally or more easily without scaling, we should stop using it and focus on goal setting and review entirely within a learning framework, not an evaluative framework.

Some evidence

What about evidence? This is not a systematic review, but a few relevant papers will be discussed briefly.

An actual systematic review studied the validity of Goal Attainment Scaling. (here) It is an interesting but challenging read that points out many severe weaknesses and inconsistencies. For example, the discussion says, “The variability noted among reviewed articles highlights that the GAS has many different interpretations. There is a lack of clarity regarding how the GAS is best interpreted, what specific construct the GAS measures, and whether the GAS measures a goal construct or whether the GAS is best regarded as its own measurement technique.” It highlights the need for a sound theoretical basis and notes considerable variability in what is considered a goal, with only one paper defining it.

One crucial component of the paper is a discussion on the nature of validity and the importance of interpretation and consequences. For example, they conclude, “Therefore, the GAS score has an applied purpose and a social consequence. Whether the GAS score is used in educational or clinical settings, its score has meaning, and a judgment or interpretation is formed based on its value. To evaluate how plausible an interpretation is, “it is necessary to be clear about what the interpretation claims””

Having read this article after the debate, I became more concerned about goal attainment scaling.

For anyone interested in statistical aspects and able to follow the complex statistical discussion, a second article (here) will be of interest. Although it discusses statistics, it makes some crucial points that make clear that goal attainment scaling cannot be used in research trials or audits.

The authors state that to avoid bias, all goals, all criteria for achievement, and all weights must be chosen before random allocation. This is far from clinical reality and excludes goal attainment scaling as a practical or credible outcome measure in research trials. They say shortly after that: “practical challenges, such as the training and time required for goal setting as well as a lack of scientific literature on study design and analysis methods may be an obstacle in the application of GAS as an endpoint in clinical trials.”

The authors suggest it may be most suitable for rare disorders with very heterogeneous patient populations and offer a variety of statistical analytic methods.

A third study compared rehabilitation based on goal attainment scaling with “usual care physiotherapy rehabilitation”, described in the protocol papers thus: “therapists generally work on pain reduction and the improvement of range-of-motion, muscle strength, endurance and gait pattern”. (here) In other words, no goal-setting was undertaken.

The results showed no difference between the two groups in measures of physical activity (here) or in other measures of function (here). The only difference was in satisfaction with work. With 120 participants, this may be a valid observation, but its inconsistency with all other measures suggests it may be due to chance rather than real.

There are many other papers, but these papers covering a range of issues certainly raise considerable doubts about goal attainment scaling as a measure in research. There is no reason to believe it is any better as a measure in clinical practice.

Conclusion

This post has put forward several arguments that suggest goal attainment scaling is flawed as a measure, and has indicated that the process of goal attainment scaling might neutralise the advantages of setting goals (which themselves are not proven beyond all reasonable doubt in rehabilitation practice). It further suggests that the continued use of goal attainment scaling will be cited to justify the misuse of the process by managers and commissioners. Therefore, its use poses a non-trivial risk to the rehabilitation community. The continued use of goal attainment scaling may also harm some patients. It should only be used clinically once good evidence shows an unequivocal clinical benefit over and above personalised goal setting without goal attainment scaling. And until a good case is made for its proper use in research and audit, it should not be used in research and audit.

