On Friday 2nd July 2021 (at 05.00 hrs) I took part in a debate organised by Professor Barbara Wilson and hosted, virtually, in Melbourne, Australia. The debate concerned the use of goal attainment scaling (GAS). Two speakers supported its use clinically, in audit, and in research and two speakers opposed its use. At the outset only 4% of listeners opposed its use; this rose to 19% at the end. Support dropped from 73% to 52%, with 23% undecided at the outset and 29% at the end. This post explores the ideas and arguments put forward. It concludes that clinically goal attainment scaling adds little or nothing to the success of setting goals, but that it carries significant and unacknowledged risk of causing harm to a patient. Much more importantly, its misuse by commissioners and non-clinical managers poses a major risk to the process and provision of rehabilitation. This misuse continues despite explanations about the inappropriateness of its use as an audit tool or in determining service outcome. By using goal attainment scaling at all, rehabilitation professionals and services threaten their own services.
What is goal attainment scaling?
The principle is simple. As part of rehabilitation, goals are set by the team. Many of the goals relate to change in a patient’s clinical state such as experiencing less pain, walking further with less effort, shopping in a local shop successfully, or not forgetting that something is cooking.
All goals should be guided by the patient’s preferences and wishes, though at times this is not possible (e.g. if the patient is unconscious, or if moderating anti-social behaviour is a goal). In patients who are able to participate, negotiation between the therapist (or the team) and the patient will usually identify goals that are more-or-less feasible. Usually a time scale for achievement or review is set. Setting goals is an important part of the rehabilitation process, even if its actual beneficial effect is difficult to detect. (here)
The first component added to the setting of goals is the attainment of goals. If a goal is set that can be specified sufficiently to allow an observer to determine whether it has been achieved, then one can say that, by the agreed time, or indeed, at a certain time, the goal was achieved. Even recording a simple judgement of attainment, yes or no, is fraught with risks and difficulties. Some are:
- an excessive focus on outcomes that can be measured to determine attainment may lead to a failure to attend to what is important. Goals concerning physical activities get preferred to goals concerned with adjustment, emotional state, inter-personal relationships etc.
- a lack of attention to the quality of performance, such as how the patient feels about their performance; they might walk 100 metres, but be so embarrassed by how they walk or by using a stick (cane) that they do not leave the house
- the influence of variability, with the patient being able to perform at the requisite standard but only on a good day, or only after 20 minutes of preparation, so that they are unable to perform when they want or need to.
The second component added is that of scaling, measuring the degree of attainment. Just having a ‘pass/fail’ approach seems inappropriate for many reasons, particularly if the target is specified precisely to a date. The usual approach is to quantify the extent of attainment such as: no change; some progress but not achieved; achieved target; exceeded target by some; greatly exceeded target. As a descriptive approach, this seems reasonable, and it allows discussion about the likelihood of further progress, why progress is less (or more) than expected, and how the rehabilitation should be altered, if at all.
Humans like numbers; they appear more ‘scientific’ and they can be manipulated arithmetically. It was inevitable that these verbal descriptors should be turned into numbers. It was equally inevitable that the numbers would be used as a measure. The numbers usually range between -2 (no change) and +2 (greatly exceeded), with 0 being the target.
Many articles have been written about goal attainment scaling in rehabilitation. Two commonly referenced and used articles describe a method of identifying descriptors for the various levels (here) and a method of recording and using the scores (here). Suffice it to say, one needs to discuss with the patient goals that they want to achieve and how they might be measured, and then agree a target and two lower and two higher levels. This can take much time. Shai Betteridge, in the debate, reported that negotiations might take three weeks in the inpatient rehabilitation service at the Wolfson Rehabilitation Centre, London. I suspect most people are somewhat quicker, but it may still take several hours to set three or four goals.
Arguments in favour.
Goal attainment scaling is supported by many people, for many stated reasons. I will outline some of the commonly given reasons.
One oft-repeated reason given for using goal attainment scaling is that it increases patient understanding and engagement. It is said, without evidence, that discussing specified, measurable levels with a target and levels both above and below helps motivate the patient.
Obviously it is important to engage the patient in the rehabilitation process, and discussing goals and identifying what is important to them may help to increase their commitment. However, the process of identifying and specifying measurable levels below and above the general target may be more likely to disengage them. There is no evidence that it increases the patient’s commitment to the goal.
Increased engagement of the patient in the rehabilitation process is not a benefit of goal attainment scaling; it is a consequence of the process of setting goals.
A second, closely related reason often given is that it makes rehabilitation focus on a patient’s wishes and priorities, and that outcomes measured are personalised to the patient.
As with the argument about engagement, all goal setting is or should be centred on the patient’s priorities. There is no added benefit to being patient-centred through using goal attainment scaling. Indeed the increased focus on measurement increases the risk of avoiding some important areas because measurable levels are simply too difficult to define.
Personalised goals set using goal attainment scaling are measurable, if it is accepted that the measurement technique is valid. The technique may not be valid.
Quantification of outcome.
The third, completely separate argument concerns measurement. The argument goes as follows:
- a major feature of rehabilitation is the setting of specific goals for each patient;
- standardised outcome measures are the antithesis of patient-centred, context and situation specific outcomes;
- a means to quantify achievement of patient-specific goals will allow personalised measurement of change;
- one method is to quantify the extent to which a patient achieves a goal – goal attainment scaling.
The first point will be accepted, while also emphasising that not all the goals relate to the patient’s ability to perform an activity. For example a goal might be to identify a care agency able to meet the person’s needs, or a group for a patient to join, or a person willing to employ someone with the person’s strengths and limitations.
The second point is not accepted as absolute. Many standardised measures were developed in response to patient goals in general, and they are increasingly developed with input from patients. It is true that many cover more activities than a specific goal will usually focus on, but taking the patient’s overall outcome, standardised measures are often relevant to a patient, and concordant with their general ambition. They may not identify change on one goal; they will probably detect change in areas important to the patient.
The third statement is self-evidently true. But it does not follow that goal attainment scaling is the only, or the best way to measure change. For example, if the goal was to walk to a shop, buy a few items, and to return, there are a host of aspects of this activity that can be quantified by timing or counting: overall time taken; number of items remembered; returning with the correct amount of money; number of falls; number of prompts from someone else etc. One could count, time, or in other ways quantify whatever was expected to be difficult.
The fourth statement is true, but it does not acknowledge either the many possible other ways of quantifying change in relation to a goal or that it may be less than ideal.
As the poll before the debate showed, not many people advance arguments against. However there are probably many people who have no opinion, and many who use alternative methods because they find them easier or more relevant, without necessarily considering why they do not use goal attainment scaling.
An educational reason.
Rehabilitation can be considered as analogous to education, (here) and in this analogy, goals are the educational objectives or outcomes aimed for. Assessment of rehabilitation progress and outcome can be considered analogous to work-based assessments in education. Thus, assessing a patient’s current state within the process of rehabilitation is similar to assessing a learner’s progress in the process of education.
Within healthcare training, two types of assessment are recognised, though it is likely that all assessments are actually a mixture:
- summative; this categorises a student or trainee as achieving or not achieving some standard of performance.
- formative; this identifies where the student or trainee is doing well, and where progress is less satisfactory and needs more attention, and suggests improvements.
Using this categorisation – summative or formative – goal attainment scaling is strongly summative, with no real formative component. Any patient being assessed on a goal attainment scale will feel that their progress is being classified in a judgemental way – worse than expected, as expected, or better than expected. Given the name – goal attainment scaling – it is hardly surprising if a patient considers it a judgemental, ‘high stakes’ assessment.
Of course, the assessor will, usually, interpret and use the result to plan a way forward. Nevertheless the patient’s perspective is likely to be different; more importantly others such as family members, other team members, managers and commissioners (the people who are paying) may well see the result as a judgement. (For further discussion of summative and formative assessments, their perception and use, see here.)
The second area of concern relates to its (lack of sound) psychometric properties if used as a measure, whether simply for the individual patient, on its own, or as part of a group of similar goal attainment scale measurements, for a patient, or more probably as a group measure in audit or research. The problems with its use as a numerical scale are well explained by Professor William Levack, in writing here (Chapter 5, pp 91-110) and audio-visually here (in 6 minutes 12 seconds).
One, obvious, flaw is that the various levels (-2, -1, +1, +2) are unlikely to form an equal interval scale. Indeed, it may well be that the ‘expected’ point (= 0) is located very near one end and it is unlikely to be central. In a group of goal attainment scale scores, it is equally unlikely that all the expected points will be at the same relative point between -2 and +2. One patient (or one outcome of several for one patient) may have a 0 close to +2, another a 0 close to -2.
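The point about unequal intervals can be made concrete with a small sketch. The figures below are invented purely for illustration (two hypothetical patients, each with a walking-distance goal in metres), and the helper `gas_score` is my own naming, not part of any published method. It simply assigns the highest level whose cut-off has been reached.

```python
from bisect import bisect_right

def gas_score(value, thresholds):
    """Return a level from -2 to +2 for a raw measurement.

    thresholds: four ascending cut-offs at which levels -1, 0, +1 and +2
    are reached. Anything below the first cut-off scores -2.
    """
    return bisect_right(thresholds, value) - 2

# Hypothetical cut-offs in metres (invented for illustration only).
# Patient A: the expected level (0) sits close to the top of the range.
patient_a = [20, 90, 95, 100]
# Patient B: the expected level (0) sits close to the bottom of the range.
patient_b = [20, 25, 60, 100]

for distance in (30, 92):
    print(distance, gas_score(distance, patient_a), gas_score(distance, patient_b))
```

With these invented cut-offs, walking 30 metres scores -1 for patient A but 0 for patient B, and walking 92 metres scores 0 for A but +1 for B. For patient A the step from -1 to 0 spans 70 metres while the step from 0 to +1 spans only 5 metres, so a one-point change does not mean the same thing within a patient, let alone between patients.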
The mathematical transformations are equally subject to uncertainty, and considerable debate about their validity.
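I have not specified here which transformations are meant. The one most often cited in the goal attainment scaling literature is the T-score proposed by Kiresuk and Sherman, so taking it as the example is my assumption. The sketch below implements that formula, with the conventional assumed inter-goal correlation of 0.3; the function and parameter names are mine.

```python
from math import sqrt

def gas_t_score(scores, weights=None, rho=0.3):
    """Kiresuk-Sherman style T-score for a set of goal scores (-2..+2).

    T = 50 + 10 * sum(w*x) / sqrt((1 - rho) * sum(w^2) + rho * (sum(w))^2)

    rho is the assumed correlation between goal scores; 0.3 is the
    conventional value, not something estimated from the data.
    """
    if not scores:
        raise ValueError("at least one goal score is required")
    if weights is None:
        weights = [1.0] * len(scores)  # assumption: equal weighting
    if len(weights) != len(scores):
        raise ValueError("scores and weights must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, scores))
    denominator = sqrt((1 - rho) * sum(w * w for w in weights)
                       + rho * sum(weights) ** 2)
    return 50 + 10 * weighted_sum / denominator

# Three equally weighted goals, all achieved exactly as expected.
print(gas_t_score([0, 0, 0]))
# Three equally weighted goals, all exceeded by one level.
print(round(gas_t_score([1, 1, 1]), 2))
```

Note what the formula takes for granted: the scores are multiplied, summed and divided as if they were interval data, the weights are chosen by whoever sets the goals, and the correlation is fixed by convention. Each of those is an assumption of exactly the kind questioned above.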
Use of goal attainment scaling.
The greatest concern, and the strongest argument against its use, relates to the frequent and serious misuse of the scores.
Strictly within a confidential patient-therapist relationship, using the verbal descriptors alone, it will usually be used appropriately, provided the professional does not use the numbers and takes care when interpreting and acting on the information. It is difficult to see much benefit, but within these parameters the only residual risks are (a) patient perception of failure if they do not reach their goal and (b), more importantly, failure to attend to important but less easily quantified goals.
However any use of aggregate numerical data derived through goal attainment scaling carries great risks. As William Levack shows, it is psychometrically flawed. It is not only comparing or adding watermelons, mangoes and kiwi fruit, it is also ignoring the variations in size between and within different fruit.
Resources used, and opportunities missed.
The last consideration is practical. It takes time and effort to identify goals that are amenable to scaling, and then more time to negotiate four additional levels. This risks not giving attention to other important goals, or indeed to the interventions for the patient.
Further, in a research context, it is difficult to know who should set the goals. If it is the treating therapist, the risks of bias are large: selection of easier (or more difficult) goals, setting of easier or more difficult levels etc. If it is an external, neutral person, then the goals may not be agreed by the treating therapist or team. It is also difficult to know who should measure the outcomes; the person who set them, or another person. Inter-rater reliability may be low. (here, and see comments here)
My biggest concern arises from the misuse and misinterpretation of data collected, primarily by managers and those paying, but probably also by service therapists and teams, patients, and the public.
The positive selling points are powerful: person-centred outcomes of importance to the patient are quantified in ‘objective scientific data’ (numbers). What could be better?!
What is not acknowledged is that the process:
- may not measure the goals of most importance to the patient, because the goal cannot readily be measured and/or broken up into different levels;
- produces invalid numerical data that cannot be analysed using parametric methods, if at all;
- is subject to other factors that influence both the selection of the goal, and the setting of expected and other levels;
- can take a long time, time that could be better used in other ways;
- emphasises goal attainment over more critical evaluation of whether the goal was correct and whether other approaches would be better.
The reply to my concerns about misuse given in the debate was that “we need to educate them [commissioners etc]”. At one level this is of course correct, but it shows a touching, and in my view naive faith in the power of education. Commissioners want outcome data, and goal attainment scaling fits their need very well, because patients cannot complain about it. Commissioners will simply ignore references to published literature; unwelcome and inconvenient facts are easily ignored. Commissioners already ignore or do not follow strong NICE guidance, for example in relation to fertility services. Most commissioners are not trained in science and clinical practice, and they change with depressing regularity as re-organisation occurs again, and again.
Moreover, they will point to its use and say “Look, you are using it so why shouldn’t we?”. They will find someone who supports their use, verbally or in writing.
I feel that while goal attainment scaling is used in any way in rehabilitation, we (as a community) validate their misuse of the process. Given that the process of attainment scaling itself has virtually no person-centred benefits that cannot be achieved equally or more easily without scaling, I feel that we should stop using it and focus on goal setting and review entirely within a learning framework, not an evaluative framework.
What about evidence? This is not a systematic review, but a few relevant papers will be discussed briefly.
An actual systematic review studied the validity of Goal Attainment Scaling. (here) It is an interesting though quite hard read, and it points out many serious weaknesses and inconsistencies. For example, in the discussion it says “The variability noted among reviewed articles highlights that the GAS has many different interpretations. There is a lack of clarity regarding how the GAS is best interpreted, what specific construct the GAS measures, and whether the GAS measures a goal construct or whether the GAS is best regarded as its own measurement technique.” It highlights the lack of any sound theoretical basis, and a huge variability in what is considered a goal, with only one paper giving a definition.
One very important component of the paper is a discussion on the nature of validity, and the importance of interpretation and consequences. For example they conclude “Therefore, the GAS score has an applied purpose and a social consequence. Whether the GAS score is used in educational or clinical settings, its score has meaning, and a judgment or interpretation is formed based on its value. To evaluate how plausible an interpretation is, “it is necessary to be clear about what the interpretation claims””
After reading this article, which was after the debate, I became yet more concerned about goal attainment scaling.
For anyone interested in statistical aspects, and able to follow complex statistical discussion, a second article (here) will be of interest. Although it discusses statistics, it makes some important points that, in my view, make it clear that goal attainment scaling cannot be used in research trials or audit.
The authors state that, to avoid bias, all goals, all criteria for achievement, and all weights must be chosen before random allocation. This is far from clinical reality, and probably excludes goal attainment scaling as a practical or credible outcome measure in research trials. Indeed they say shortly thereafter that: “practical challenges, such as the training and time required for goal setting as well as a lack of scientific literature on study design and analysis methods may be an obstacle in the application of GAS as an endpoint in clinical trials.”
The authors suggest it may be most suitable for very rare disorders with very heterogeneous patient populations, and offer a variety of statistical analytic methods.
A third study compared rehabilitation based on goal attainment scaling with “usual care physiotherapy rehabilitation” which was described in the protocol papers thus: “therapists generally work on pain reduction and the improvement of range-of-motion, muscle strength, endurance and gait pattern”. (here) In other words, no goal setting was undertaken.
The results showed no difference between the two groups on measures of physical activity (here), nor in other measures of function (here). The only difference was in satisfaction with work. With 120 participants, it is possible this is a valid observation but its inconsistency with all other measures suggests it may be chance, and not true.
There are many, many other papers, but these papers covering a range of issues certainly raise considerable doubts about goal attainment scaling as a measure in research. There is no reason to believe it is any better as a measure in clinical practice.
This post has put forward several arguments that suggest goal attainment scaling is flawed as a measure, and has also suggested that the process of goal attainment scaling may neutralise the advantages of setting goals (which themselves are not proven beyond all reasonable doubt in rehabilitation practice). It further suggests that continued use of goal attainment scaling will be used to justify misuse of the process by managers and commissioners, and that its use therefore actually poses a non-trivial risk to the rehabilitation community. The continued use of goal attainment scaling may also harm some patients. Until good evidence is provided to show unequivocal clinical benefit of goal attainment scaling over and above personalised goal setting without goal attainment scaling, it should not be used clinically. And until a good case is made for its valid use in research and audit, it should not be used in research and audit.