The very first step in any evaluation project is developing a plan for getting the work done. This plan, or structure, for an evaluation project is what we refer to as the ‘design’. The process of designing an evaluation starts out general and gradually becomes very specific: clarifying the purpose of the evaluation leads to evaluation questions, which in turn require data and information obtained through particular methods. A primary purpose of most evaluations is to determine whether the program had its intended effect.
What Is an Evaluation Design?
An evaluation design is the general plan or structure of an evaluation project. It is the approach taken to answer the evaluation questions, and choosing it is a crucial step toward an appropriate assessment. A good design enhances the quality of the evaluation while keeping the cost and time needed to do the work proportionate and justifiable. Different evaluation designs answer different types of evaluation questions. Since each program is unique, the evaluation design to be used should align with:
- The goals of the program
- Evaluation research questions
- The purpose of the evaluation
- Available resources
The first step to a successful evaluation is identifying the right evaluation design. However, evaluation designs that focus on the effectiveness of a program differ in the strength of the evidence they produce. Evaluators often use a ‘hierarchy of evidence’ to rank designs by how strong that evidence is; the designs thought to produce the most powerful evidence of a program’s effectiveness sit at the top of this hierarchy. The sections below give a basic overview of the distinct types of evaluation designs.
- Experimental Designs
Experimental designs are used to determine whether a program is more effective than current practice. This is typically the most rigorous evaluation approach: participants are randomly assigned either to a control group or to the program. Those in the control group receive the existing program, or sometimes no program at all, while those in the program group receive the intervention. Both groups complete some kind of pre/post assessment, and the results are then compared. Random assignment helps rule out plausible alternative explanations, allowing evaluators to argue confidently that observed differences are due to the program; this is the source of the design’s rigour. The main experimental design is the randomised controlled trial (RCT), which systematically tests for differences between two or more groups of participants. This type of evaluation design is the gold standard against which other designs are judged, as it provides a powerful technique for evaluating cause and effect.
Randomization ensures that each participant has an equal chance of receiving or not receiving the intervention, which makes it more likely that the intervention and control groups contain a similar mix of participant attributes. Without randomization there can be systematic bias: one group differs from the other in ways that affect the results. Because RCTs are usually conducted under conditions that offer a high level of control over factors that might provide alternative explanations for findings, they offer a relatively high level of assurance that the outcomes for participants directly result from the program. Although RCTs effectively answer questions about whether an intervention works, they are less suited to answering questions about why or how it works. Where an RCT is impractical, evaluators often turn to the next best thing – comparison groups in quasi-experimental designs.
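The mechanics of an RCT described above can be sketched in a few lines. This is an illustrative toy, not a real trial protocol: the participant IDs and outcome scores are invented, and the effect estimate is a simple difference in group means.

```python
import random

def randomize(ids, seed=0):
    """Give every participant an equal chance of either arm by shuffling
    the full list and splitting it in half (treatment, control)."""
    rng = random.Random(seed)
    pool = list(ids)
    rng.shuffle(pool)
    mid = len(pool) // 2
    return pool[:mid], pool[mid:]

def estimated_effect(outcomes, treatment, control):
    """Difference-in-means estimate of the program effect."""
    mean = lambda group: sum(outcomes[i] for i in group) / len(group)
    return mean(treatment) - mean(control)

# Hypothetical post-assessment scores for six participants
outcomes = {1: 72, 2: 65, 3: 80, 4: 58, 5: 77, 6: 61}
treat, ctrl = randomize(outcomes.keys(), seed=1)
effect = estimated_effect(outcomes, treat, ctrl)
```

Because assignment is random, large pre-existing differences between the two groups become unlikely as the sample grows, which is what lets the mean difference be read as the program’s effect.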
- Quasi-experimental Designs
Quasi-experimental designs involve matching participants beforehand, or after the fact using statistical methods. A quasi-experimental design is often deployed when there are too few participants to randomly allocate to a control group and a program group, or when random allocation is otherwise not feasible; the main challenge then becomes identifying suitable comparison groups to collect data from. Designs of this type appear on many federal lists of evidence-based programs because they can demonstrate change over time. As opposed to an experimental design, a quasi-experimental design does not randomly allocate participants to a control group or an intervention. Instead, it identifies a comparison group that is as similar as possible to the treatment group in terms of pre-intervention characteristics. To build a credible comparison group, statistical techniques such as propensity score matching and regression discontinuity design are used to reduce the risk of bias. Quasi-experimental methods estimate the effect of an intervention or policy when controlled experiments prove impractical.
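To make the matching idea concrete, here is a deliberately simplified sketch: each treated unit is paired with the comparison-pool unit whose baseline score is closest. All IDs and scores are made up. Real propensity score matching would first collapse many covariates into a single estimated probability of treatment (typically via logistic regression) and match on that score instead.

```python
def nearest_neighbor_match(treated, pool):
    """Match each treated unit to the unmatched comparison unit with the
    closest pre-intervention score (matching without replacement)."""
    matches = {}
    available = dict(pool)  # id -> baseline score
    for tid, score in treated.items():
        best = min(available, key=lambda pid: abs(available[pid] - score))
        matches[tid] = best
        del available[best]
    return matches

# Hypothetical baseline scores
treated = {"t1": 0.62, "t2": 0.35}
pool = {"c1": 0.60, "c2": 0.90, "c3": 0.33}
matches = nearest_neighbor_match(treated, pool)  # → {"t1": "c1", "t2": "c3"}
```

The closer the matched pairs are at baseline, the more plausible it is that post-intervention differences reflect the program rather than pre-existing differences between the groups.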
When it is practically impossible to randomly allocate participants to intervention or control groups, comparison groups are often used. These can involve participants on a waiting list for an intervention, or participants of other programs, where neither set can be randomly assigned. The two groups are likely to match well in terms of demographic characteristics, provided participants in the program group have not been prioritized over those on the waiting list. Evidence of considerable benefit to intervention participants relative to a comparison group can suggest that the program is effective, but it is harder to be certain that the change resulted from the program. Because allocation is not random, we can’t always be certain that any benefits or differences observed in the evaluation result from the intervention rather than from pre-existing differences between the groups of participants. Nonetheless, if repeated studies of a program using various quasi-experimental and other non-experimental methods produce consistent results, we can have greater confidence in the program’s effectiveness.
- Non-experimental Designs
Non-experimental evaluation designs focus more on the ‘how’ and ‘why’ of a program. They are often used in cases where comparison or control groups are not feasible, so they do not include them. Non-experimental designs include:
- Pre- and post-intervention studies – These assess the effect of a program without using a comparison or control group.
- Case studies – These are often used to gain an in-depth understanding of a single instance or activity within a program setting.
- Most significant change (MSC) – This is most applicable when the focus is on identifying the results of an intervention.
- Developmental Evaluation – This is a structured way of monitoring, assessing, and giving feedback regarding the development of a program during designing or modification.
- Realist Evaluation – This uses qualitative methods such as focus groups or interviews to understand the underlying mechanisms of an intervention or program.
- Empowerment Evaluation – This is a set of principles used to guide the evaluation at every stage.
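The first design in the list above, a pre- and post-intervention study, reduces to comparing each participant’s scores before and after the program. A minimal sketch with invented scores:

```python
def mean_change(pre, post):
    """Average change from pre-test to post-test, paired by participant.
    With no comparison or control group, this cannot rule out other
    explanations for the change, such as maturation or outside events."""
    assert len(pre) == len(post), "scores must be paired by participant"
    diffs = [after - before for before, after in zip(pre, post)]
    return sum(diffs) / len(diffs)

# Hypothetical scores for five participants
pre_scores = [52, 61, 48, 70, 55]
post_scores = [60, 66, 55, 72, 63]
change = mean_change(pre_scores, post_scores)  # → 6.0
```

A positive average change is consistent with the program working, but as the section notes, without a comparison group it is only suggestive evidence.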
While non-experimental evaluation studies can produce actionable findings on program outcomes and performance improvement, they cannot control for factors that could affect outcomes, such as selection bias.
To gain in-depth knowledge of evaluation designs and approaches, enroll in a Post Graduate Diploma Programme in Monitoring, Evaluation, Accountability And Learning (MEAL) with us today and get a 10% discount! Subscribe to our newsletter for this and more blogs on matters pertaining to humanitarianism.