How do you go about setting up an evaluation or audit of your therapy service—whether it’s a large volunteer organisation or your own private practice?

Clarifying your Aims

There’s lots of reasons for setting up a service evaluation or audit, and being clear about what your’s are is a vital first step forward. Some possible aims might be:

Showing the external world (e.g., commissioners, policy makers, potential clients) that your therapy is effective.
Knowing for yourself, at the practitioner or service level, what’s working well and what isn’t.
Enhancing outcomes by providing therapists, and clients, with ‘systematic feedback’.
Developing evidence for particular forms of therapy (e.g., person-centred therapy) or therapeutic processes (e.g., the alliance).

And, of course, there’s also:

Because you have to!

Choosing an Evaluation Design

There’s lots of different designs you can adopt for your evaluation and audit study, and these can be combined in a range of ways.

Audit only

This is the most basic type of design, where you’re just focusing on who’s coming in to use your service and the type of service you are providing.

Pre-/Post-

This is probably the most common type of evaluation design, particularly if your main concern is to show outcomes. Here, clients’ levels of psychological problems are assessed at the beginning and end of therapy, so that you can assess the amount of change associated with what you’re doing.

Qualitative

You could also choose to do interviews with clients at the end of therapy about how they experienced the service. A simpler form of this would be to use a questionnaire at the end of treatment. John McLeod has produced a very useful review of qualitative tools for evaluation and routine outcome monitoring (see here).

Experimental

If you’ve got a lot of time and resources to hand—and/or if you need to provide the very highest level of evidence for your therapy—you could also choose to adopt an experimental design. Here, you’re comparing changes in people who have your therapy with those who don’t (a ‘control group’). These kinds of studies are much, much more complex and expensive than the other types, but they are the only one that can really show that the therapy, itself, is causing the changes you’ve identified (pre-/post- evaluations can only ever show that your therapy is associated with change).

Choosing Instruments

There’s thousands of tools and measures out there that can be used for evaluation purposes, so where do you start?

Tools for use in counselling and psychotherapy evaluation and audit studies can be divided into three types. These are described below and, for each type, I have suggested some tools for a ‘typical’ service evaluation in the UK. Unless otherwise stated, all these measures are free to use, well-validated (which means that they show what they’re meant to show), and fairly well-respected by people in the field. All the measures described below are also ‘self-rated’. This means that clients, themselves, fill them in. There are also many therapist- and observer-rated measures out there, but the trend is towards using self-rated measures and trusting that clients, themselves, know their own states of mind best.

Just to add: however tempting it might be, I’d almost always you not to develop your own instruments and measures. You’d be amazed how long it takes to create a validated measure (we once took about six years to develop one with six items!) and, if you create your own, you can never compare your findings with those of other services. Also, for the same reason, it is almost always unhelpful to modify measures that are out in the public domain—even minimally. Just changing the wording on an item from ‘often’ to ‘frequently’, for instance, may make a large difference in how people respond to it.

Outcome Tools

Outcome tools are instruments that can be used to assess how well clients are getting on in their lives, in terms of symptoms, problems, and/or wellbeing. These are the kinds of tools that can then be used in pre-/post-, or experimental, designs to see how clients change over the course of therapy. These tools primarily consist of forms with around 10 ‘items’ or so, like, ‘I’ve been worrying’ or ‘'I’ve been finding it hard to sleep’. The client indicates how frequently or how much they have been experiencing this, and then their responses can be totalled up to give an overall indication of their mental and emotional state.

Its generally good practice to integrate clients’ responses to the outcome tools into the session, rather than divorcing them from the therapeutic process. For instance, a therapist might say, ‘I can see on the form that this has been a difficult week for you,’ or, ‘Your levels of anxiety seem to be going down again.’ This is particularly important if the aim of the evaluation is to enhance outcomes through systematic feedback.

General

A popular measure of general psychological distress (both with therapists and clients), particularly in the UK, is:

CORE-OM

This can be used in a wide range of services to look at how overall levels of distress, wellbeing, and functioning change over time. A shortened, and more easily usable version of this (particularly for weekly outcome monitoring, see below), is:

CORE-10

Another very popular, and particularly brief, general measure of how clients are doing is:

Outcome Rating Scale (available via license)

Two other very widely used measures of distress in the UK are:

The PHQ-9 is a depression-specific measure, and the GAD-7 is a generalised-anxiety specific measure, but because these problems are so common they are often used as general measures for assessing how clients are doing, irrespective of their specific diagnosis. They do also have the dual function of being able to show whether or not clients are in the ‘clinical range’ for these problems, and at what level of severity.

Problem-specific

There are also many measures that are specific to particular problems. For instance, for clients who have experienced trauma there is:

Impact of Events Scale (revised)

And for eating problems there is:

Eating Attitudes Test

If you are working in a clinic with a particular population, it may well be appropriate to use both a general measure, and one that is more specific to that client group.

Wellbeing

For those of us from a more humanistic, or positive psychology, background, there may be a desire to assess ‘wellness’ and positive functioning instead of (or as well as) distress. Aside from the ORS, probably the most commonly used wellbeing measure is:

There’s both a 14-item version, and shortened 7-item version for more regular measurement.

Personalised measures

All the measures above are nomothetic, meaning that they have the same items for each individual. This is very helpful if you want to compare outcomes across individuals, or across services, and to use standardised benchmarks. However, some people feel that it is more appropriate to use measures that are tailored to the specific individual, with items that reflect their unique goals or problems. In the UK, probably the best known measure here is:

PSYCHLOPS

This can be used with children and young people as well as adults, and invites them to state their specific problem(s) and how intense they are. Another personalised, problem-based tool is:

Personal Questionnaire

If you are more interested in focusing on clients’ goals, rather than their problems, then you can use:

Goals Form

Service Satisfaction

At the end of therapy, clients can be asked about how satisfied they were with the service. There isn’t any one generic standard measure here, but the one that seems to be used throughout IAPT is:

Patient Experience Questionnaire

Children and young people

The range of measures for young people is almost as good as it is for adults, although once you get below 11 years old or so the tools are primarily parent/carer- or teacher-report. Some of the most commonly used ones are:

YP-CORE: Generic, brief distress outcome measure
SDQ: Generic distress outcome measure, very well validated and in lots of languages
CORS: Generic, ultra-brief measure of wellbeing (available via license)
RCADS: Diagnosis-based outcome measure
GBO Tool: Personalised goal-based outcome measure
ESQ: Service satisfaction measure.

A brilliant resource for all things related to evaluating therapy with children and young people is corc.uk.net/

Process Tools

Process measures are tools that can help assess how clients are experiencing the therapeutic work, itself: so whether they like/don’t like it, how they feel about their therapist, and what they might want differently in the therapeutic work. These are less widely used than outcome measures, and are more suited to evaluations where the focus is on improving outcomes through systematic feedback, rather than on demonstrating what the outcomes are.

Probably the most widely used process measure in everyday counselling and psychotherapy is:

SRS (available via license)

This form, the Session Rating Scale, is part of the PCOMS family of measures (along with the ORS), and is an ultrabrief tool that clients can complete at the end of each session to rate such in-session experiences as whether they feel heard and understood.

For a more in-depth assessment of particular sessions, there is:

Helpful Aspects of Therapy Form

This has been widely used in a research context, and includes qualitative (word-based) as well as quantitative (number-based) items.

Several well-validated research measures also exist to assess various elements of the therapeutic relationship. These aren’t so widely used in everyday service evaluations, but may be helpful if there is a research component to the evaluation, or if there is an interest in a particular therapeutic process. The most common of these is:

Working Alliance Inventory

This comes in various version, and assesses the clients’ (or therapists’) view of the level of collaboration between members of the therapeutic dyad. Another relational measure, specific to the amount of relational depth, is:

Relational Depth Frequency Scale

A process tool that we have been developing to help elicit, and stimulate dialogue on, clients’ preferences for therapy is:

Cooper-Norcross Inventory of Preferences

This invites clients to indicate how they would like therapy to be on a range of dimensions, such that the practitioner can identify any strong preferences that the client has. This can either be used at assessment, or in the ongoing therapeutic work. An online tool for this measure can be accessed here.

Interviews

If you really want to find out how clients have experienced your service, there’s nothing better you can do than actually talk to them. Of course, you shouldn’t interview your own clients (there would be far too much pressure on them to present a positive appraisal) but an independent colleague or researcher can ask some key questions (for instance, ‘What did you find helpful? What did you find unhelpful? What would you have liked more/less of?) which can be shared with the therapist or the service more widely (with the client’s permission). There’s also an excellent, standardised protocol that can be used for this purposes:

Change Interview

Note, as an interviewing approach has the potential to feel quite invasive to clients (though also, potentially, very rewarding), it’s important to have appropriate ethical scrutiny here of procedures before carrying these out.

Children and young people

Process tools for children and young people are even more infrequent, but there is the child version of the Session Rating Scale:

CSRS

Demographic/Service Audit Tools

As well as knowing how well clients are doing, in and out of therapy, it can also be important to know who they are—particularly for auditing purposes. Demographic forms gather data about basic characteristics, such as age and gender, and also the kinds of problems or complexity factors that clients are presenting with. These tools do tend to be less standardised than outcome or process measures, and it’s not so problematic here to develop your own forms.

For adults, a good basic assessment form is:

CORE Assessment Form

For children and young people, one of the most common, and thorough, forms is:

Current View

Choosing Measurement Points

So when are you actually going to ask clients, and/or therapist, to complete these measures? The demographic/audit measures can generally be done just once at the beginning of therapy, although you may want to update them as you go along. Service satisfaction measures and interviews tend to be done just at the end of the treatment.

For the other outcome and process measures, the current trend is to do them every session. Yup, every session. Therapists often worry about that—indeed, they often worry about using measures altogether—but generally the research shows that clients are OK with it, provided that they don’t take up too much of the session (say not more than 5-10 minutes in total). So, for session-by-session outcome monitoring, make sure you use just one or two of the briefer forms, like the CORE-10 or SRS, rather than longer and more complex measures.

Why every session? The reason is that clients, unfortunately, do sometimes drop out, and if you try and do measures just at the beginning and end you miss out on those clients who have terminated therapy prior to a planned ending. In fact, that can give you better results (because you’re only looking at the outcomes of those who finished properly, who tend to do better) but it’s biased and inaccurate. Session by session monitoring means that you’ve always got a last score for every client, and now most funders or commissioners would expect to see data gathered in that way. If you’ve only got results from 30% of your sample, it really can’t tell you much about the overall picture.

Generally, outcome measures are completed at the start of a session—or before the start of a session—so that clients’ responses are not too affected by the session content. Process measures are generally completed towards the end of a session as they are a reflection on the session itself (but with a bit of time to discuss any issues that might come up).

Analysing the Data

Before you start a service evaluation, you have to know what you are going to do with the data. After all, what you don’t want is to a big pile of CORE-OM forms in one corner of your storage room!

That means making sure you price in to any evaluation the costs, or resources, of inputting the data, analysing it, and writing it up. It simply not fair to ask clients, and therapists, to use hundreds of evaluation forms if nothing is ever going to happen to them.

The good news is that most of the forms, or the sites that the forms come from, tell you how to analyse the data from that form.

The simplest form of analysis, for pre-/post- evaluations, is to look at the average score of clients at the beginning of therapy on the measure, and then their average score at the end. Remember to only use clients who have completed both pre- and post- forms. That will show you whether clients are improving (hopefully) or getting worse.

With a bit more sophisticated statistics you can calculate what the ‘effect size’ is. This is a standardised measure of the magnitude of change (after all, different measures will change by different amounts). The effect size can be understood as the difference between pre- and post- scores divided by the ‘standard deviation’ of the pre- scores (this is the amount of variation in scores, which you can work out via Excel using the function ‘stdev’). Typically in counselling and psychotherapy services, the effect size is around 1, and you can compare your statistics with other services in your field, or with IAPT, to see how your service is doing (although, of course, any such comparisons are ultimately very approximate).

What you can also do is to find out the percentage of your clients that have shown ‘reliable change’ (which is change more than a particular amount, to compensate for the fact that measures will always be imprecise), and ‘clinical change’ (the amount of clients who have gone from clinical to non-clinical bands and vice versa). If you look around on the internet, you can normally find the clinical and reliable change ‘indexes’ for the measures that you are using (though some don’t have them). For the PHQ-9 and GAD-7, you can look here to see both calculations for reliable and clinical change, and the percentages for each of these statistics that were found in IAPT.

Online Services

One way around having to input and analyse masses of data yourselves is to use an online evaluation service. This can simplify the process massively, and is particularly appropriate if you want to combine service evaluation with regular systematic feedback for clinicians and clients. Most of these (though not all) can host a wide range of measures, so they can support the particular evaluation that you choose to develop. However, these services come at a price: a license, even for an individual practitioner, can be in the hundreds or thousands of pounds. Normally, you’d also need to cost in the price of digital tablets for clients to enter the data on.

My personal recommendation for one of these services is:

Pragmatic Tracker

At the CREST Research Clinic we’ve been using this system for a few years now, and we’ve been consistently impressed with the support and help we’ve received from the site developers. Bill and Tony are themselves psychotherapists with an interest in—and understanding of—how to deliver the best therapy.

Other sites that I would recommend for consideration, but that I haven’t personally used, are:

Core Systems Trust: Particularly if you’re using the CORE family of tools, but they support other ones too.
Better Outcomes Now: for accessing the PCOMS tools (but not other measures). Also see My Outcomes for these measures.
OQ Measures: A US company, based around the Outcome Questionnaire.

Challenges

In terms of setting up and running a service evaluation, one of the biggest challenges is getting counsellors and psychotherapists ‘on board’. Therapists are often sceptical about evaluation, and feel that using measures goes against their basic values and ways of doing therapy. Here, it can be helpful for them to hear that clients, in fact, often find evaluation tools quite useful, and are often (though not always) much more positive about it than therapists may assume. It’s perhaps also important for therapists to see the value that these evaluations can have in securing future funding and support for services.

Another challenge, as suggested above, is simply finding the time and person-power to analyse the forms. So, just to repeat, do plan and cost that in at the beginning. And if it doesn’t feel like that is going to be possible, do consider using an online service that can process the data for you.

For the evaluation to be meaningful, it needs to be consistent and it needs to be comprehensive. That means it’s not enough to have a few forms from a few clients across a few sessions, or just forms from assessment but none at endpoint. Rather, whatever you choose to do, all therapists need to do it, all of the time. In that respect, it’s better just to do a few things well, rather than trying to overstretch yourself and ending up with a range of methods done patchily.

Some ‘Template’ Evaluations

Finally, I wanted to suggest some examples of what an evaluation design might look like for particular aims, populations, and budgets:

Aim: Showing evidence of effectiveness to the external world. Population: adults with range of difficulties. Budget: minimal

CORE-10: Assessment, and every session
CORE Assessment Form
Analysis: Service usage statistics; pre- to post- change, effect size, % reliable and clinical change

Aim: Showing evidence of effectiveness to the external world, enhancing outcomes. Population: young people with range of difficulties. Budget: minimal

YP-CORE: Assessment, and every session
Current View: Assessment
ESQ: End of therapy
Analysis: Service usage statistics; pre- to post- change, effect size, % reliable and clinical change; satisfaction (quantitative and qualitative analysis)

Aims: Showing evidence of effectiveness to the external world, enhancing outcomes. Population: adults with depression. Budget: medium

PHQ-9: Assessment and every session
CORE Assessment Form
Helpful Aspects of Therapy Questionnaire
Patient Experience Questionnaire: End of Therapy
Analysis: Service usage statistics; pre- to post- change, effect size, % reliable and clinical change; helpful and unhelpful aspects of therapy (qualitative analysis); satisfaction (quantitative and qualitative analysis)

And finally…

Please note, the information, materials, opinions or other content (collectively Content) contained in this blog have been prepared for general information purposes. Whilst I’ve endeavoured to ensure the Content is current and accurate, the Content in this blog is not intended to constitute professional advice and should not be relied on or treated as a substitute for specific advice relevant to particular circumstances. That means that I am not responsible for, nor will be liable for any losses incurred as a result of anyone relying on the Content contained in this blog, on this website or any external internet sites referenced in or linked in this blog. I also can’t offer advice on individual evaluations. Sorry… but hope the information here is useful.