Evidence-based practice has been the mantra within health settings for the last two decades. It is a phrase found in the majority of academic texts and in the assignments of most midwifery students. Several definitions have been proposed to describe evidence-based practice; the most often cited is that by Sackett et al (1996: 71), who stated that evidence-based health care is:
‘…the conscientious, explicit, and judicious use of current best evidence in making decisions about the care of individual patients [women]. The practice of evidence-based medicine means integrating individual clinical expertise with the best available external clinical evidence from systematic research.’
This definition places the woman at the centre of care and integrates clinical expertise with the best available research evidence, making it appropriate for midwives. However, given the volume of evidence being produced through research, and the limited resources and/or ability of some midwives to assess that research thoroughly, determining what the best evidence is remains a challenge. Moreover, the overuse of the term ‘evidence-based practice’ as a blanket justification for the care provided arguably undermines the importance of determining what constitutes strong or poor evidence.
Evidence-based practice is about using research, as opposed to doing research, and Lydon-Rochelle et al (2003) suggested that midwives must be able to read and interpret research in order to differentiate between good and poor studies. Without this understanding, it is unlikely that the strongest evidence will be implemented effectively in practice.
Within health care there are two main categories of research (Muir Gray, 1997). The first aims to increase understanding of health, ill health and the process of health care; this type of research is hypothesis-generating, provides baseline knowledge and often takes the form of descriptive or qualitative research. The second enables assessment of the interventions used to promote health, prevent ill health or improve the process of health care; this is hypothesis-testing research. The randomised controlled trial sits within this category and is often heralded as the ‘gold standard’ of evidence for comparing alternative types of care (Enkin et al, 1989).
The randomised controlled trial
Although there are many forms of evidence, which all contribute to the health services agenda, the randomised controlled trial is the most rigorous way of determining the effectiveness of one treatment compared with another. Examples within midwifery include comparisons of positions for pushing in the second stage of labour (Downe, 2004) and the use of one partograph versus another (Lavender et al, 1998) for supporting the first stage of labour.
There are three different types of randomised controlled trial, determined by the question being posed:

- Superiority trials, which ask whether a new treatment is better than an existing treatment (or placebo)
- Equivalence trials, which ask whether a new treatment is as effective as an existing treatment
- Non-inferiority trials, which ask whether a new treatment is no worse than an existing treatment.
To illustrate the three types of trial, one could take a question which compares a new partograph with the modified (standard) partograph:

- A superiority trial would ask whether the new partograph produces better outcomes (for example, a lower caesarean section rate) than the modified partograph
- An equivalence trial would ask whether the new partograph is as effective as the modified partograph
- A non-inferiority trial would ask whether the new partograph is no worse than the modified partograph, while perhaps being simpler or cheaper to use.
Having a clear understanding of the trial type, and of the design and methods that follow from it, helps the reader of a study to determine whether the processes followed were appropriate and, ultimately, whether the results are valid. It must be remembered that while randomised controlled trials are regarded as the gold standard, they still require careful appraisal, and Chalmers (1995) identified three important factors to consider when appraising a randomised controlled trial. These, and other, important considerations are discussed in the rest of this paper.
Randomisation
Randomisation is the best way of removing selection bias between groups of trial participants: all participants have an equal chance of being allocated to either the control or the intervention arm. When reading a trial paper, one way of assessing the success of randomisation is to check whether the groups have similar characteristics at baseline. In a trial of pregnant women, for instance, the groups should be similar in terms of age, ethnicity and gestation. The process of randomisation should also be clearly described.
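To make the allocation process concrete, the sketch below shows one simple way an allocation sequence might be generated using permuted blocks, which keep the two arms balanced as recruitment proceeds. It is purely illustrative: the function name, block size and seed are assumptions for this example, not the method used in any of the trials cited.

```python
import random

def block_randomise(n_participants, block_size=4, seed=42):
    """Allocate participants to 'control' or 'intervention' using
    permuted blocks, keeping the two arms balanced throughout."""
    rng = random.Random(seed)  # seeded only so the sketch is reproducible
    allocations = []
    while len(allocations) < n_participants:
        # Each block contains equal numbers of both arms, shuffled
        block = (["control"] * (block_size // 2)
                 + ["intervention"] * (block_size // 2))
        rng.shuffle(block)
        allocations.extend(block)
    return allocations[:n_participants]

print(block_randomise(10))
```

In practice, allocation sequences are generated and concealed by someone independent of recruitment, so that clinicians cannot foresee the next allocation.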
Sample choice and size
Choosing the sample for any trial needs consideration and should be justified in the write-up of the findings. Researchers have to balance an approach that makes the results generalisable to the population of interest against inclusion criteria that allow meaningful comparisons to be made. If exercise regimens in pregnant women were being compared, for example, the researcher would have to consider whether all women would be included, regardless of body mass index, level of fitness and disease status. The justification for such decisions should be meaningful and clearly stated.
There is a common misperception that, when it comes to trials, the larger the sample size the better; this is not the case. Trials should be ‘powered’ to detect a predetermined difference in a pre-specified, clinically important outcome. To determine the sample size, a sample size calculation must be performed; this can be done manually or using computer software. The researcher must estimate the likely effect of the intervention (the effect size), usually on one important primary outcome, and determine the type of data (e.g. continuous or categorical) and its likely variability; all of this should be clearly stated when the trial is reported. While huge sample sizes may appear impressive, continuing to recruit participants beyond the number determined by the power calculation will not provide any additional evidence on the main outcome. Furthermore, one can argue that it is in fact unethical to recruit more participants to a study than are actually required.
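As an illustration of how such a calculation works for a continuous primary outcome, the sketch below implements the standard two-group formula. The effect size and standard deviation are invented for the example and are not taken from any trial discussed here.

```python
import math
from scipy.stats import norm

def per_arm_sample_size(delta, sigma, alpha=0.05, power=0.80):
    """Participants needed per arm to detect a mean difference `delta`
    in a continuous outcome with standard deviation `sigma`, using a
    two-sided test at significance level `alpha`."""
    z_alpha = norm.ppf(1 - alpha / 2)  # 1.96 when alpha = 0.05
    z_beta = norm.ppf(power)           # 0.84 for 80% power
    n = 2 * (sigma ** 2) * (z_alpha + z_beta) ** 2 / delta ** 2
    return math.ceil(n)

# Illustrative numbers only: detecting a 0.5-unit difference in a
# continuous outcome with standard deviation 1.5 needs 142 per arm
print(per_arm_sample_size(delta=0.5, sigma=1.5))
```

Note how the required number rises sharply as the expected difference shrinks, which is why rare or small effects demand very large trials.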
Choice of outcomes
The choice of outcome is important when designing a trial and should be determined by its clinical importance, rather than by the ease of collecting the data or as a means of reducing the sample size. Nevertheless, the practicalities of collecting data on rare outcomes need to be considered. For example, a trial conducted across the whole of the UK with maternal mortality as the primary outcome would require many thousands of women and many years to detect a small difference; such a trial would not be feasible. However, if maternal mortality were combined with other serious outcomes, such as near-miss mortality and admission to high-dependency care, the trial would become achievable while remaining meaningful.
Sometimes an outcome is chosen because it is a precursor to the disease of interest. In the skin care trial comparing newborn bathing with a wash product versus cleansing with cotton wool and water (Lavender et al, 2013), for example, the disease of interest was atopic eczema; however, assessing this in a robust randomised controlled trial would have been challenging in terms of maintaining compliance, reducing confounding variables and securing appropriate resources. A compromise was therefore to assess the babies' skin for transepidermal water loss (TEWL), a biophysical measure of water evaporation from the skin that gives a good indication of skin barrier integrity; poor barrier function is a precursor to atopic eczema.
Controlling for bias
In order for research to be robust, adequate bias control measures must be in place, including participant randomisation and blinding. For a trial to be credible, blinding is preferable: ideally, neither the researcher carrying out the measurements nor the participant receiving the allocated treatment is aware of whether the participant is in the control or the intervention arm of the study. Blinding can help to prevent what is known as the ‘Hawthorne effect’, where participants behave in a way that they believe is expected of them. For example, if a participant were aware that they were receiving an active ingredient to reduce pain, they might report less pain simply because that is what they believed was expected of them. Periodic blinding checks can assist in determining whether participants are beginning to detect their group allocation. In some cases, however, blinding is not possible: it would be impossible to blind the parents in the trial comparing baby wipes with cotton wool and water (Lavender et al, 2012), as they needed to know which care to provide. Papers reporting trials should, however, provide a clear rationale for why blinding was not possible. Blinding the statistician who conducts the analysis is a further way of adding rigour to the research process.
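One way to operationalise the periodic blinding checks mentioned above is to compare participants' guesses about their allocation against chance, as in the brief sketch below; the counts are invented for illustration and do not come from any trial cited here.

```python
from scipy.stats import binomtest

# Hypothetical blinding check: 60 participants are asked to guess
# which arm they are in, and 38 guess correctly. If blinding is
# holding, roughly 50% would guess correctly by chance alone.
result = binomtest(k=38, n=60, p=0.5)
print(f"Correct guesses: 38/60, P = {result.pvalue:.3f}")
# A small P-value suggests participants may be detecting their allocation
```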
In some trials (e.g. Lavender et al, 2012; 2013), the research teams interpret the results while still blinded to group allocation, prior to making decisions regarding the study's conclusions and recommendations. Having an independent data monitoring committee is pivotal to such processes.
Reporting of results
When examining the results of a trial, it is important to consider whether all participants are accounted for. If they are not, the results can be skewed, making them less convincing. It is particularly problematic if those lost to follow-up differ in some way between the randomised groups.
A common error when reporting trials is incomplete information. For example, some researchers report only the P-value (the probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true). In other cases, mean values are reported without the standard deviation, which is needed to gauge the amount of variation within a set of data values.
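To see why a P-value alone is incomplete, the sketch below contrasts the two styles of reporting using invented data; none of these numbers come from the trials cited.

```python
import statistics
from scipy import stats

# Invented pain scores for two hypothetical trial arms
control = [6.1, 5.8, 7.0, 6.5, 5.9, 6.8, 6.2, 6.6]
intervention = [5.2, 5.6, 4.9, 5.8, 5.1, 5.5, 5.9, 5.0]

# Report the mean AND standard deviation of each group, plus the
# mean difference, rather than the P-value alone
for name, group in (("control", control), ("intervention", intervention)):
    print(f"{name}: mean = {statistics.mean(group):.2f}, "
          f"SD = {statistics.stdev(group):.2f}")

t_stat, p_value = stats.ttest_ind(control, intervention)
diff = statistics.mean(control) - statistics.mean(intervention)
print(f"mean difference = {diff:.2f}, P = {p_value:.4f}")
```

Reported this way, the reader can judge both the size of the effect and the variability around it, not merely whether a threshold was crossed.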
Outcome reporting bias
Publication of complete trial results is important to allow clinicians, consumers and policy makers to make better informed decisions about health care. Not reporting a study on the basis of the strength and ‘direction’ of the trial results has been termed ‘publication bias’ (Dickersin et al, 1987). An additional, and potentially more serious, threat to the validity of evidence-based medicine is the selection for publication of a subset of the originally recorded outcomes on the basis of the results (Hutton and Williamson, 2000), referred to as ‘outcome reporting bias’. The results the reader sees in a publication may appear unselected, but a larger data set from which those results were selected may be hidden. This kind of bias affects not just the interpretation of the individual trial but also any subsequent systematic review of the evidence base that includes it (Kirkham et al, 2010). Compulsory registration of trials at the outset, together with publication of all completed trial results, attempts to eradicate this poor practice (Begg et al, 1996).
The recent scientific literature has given some attention to the problems associated with incomplete outcome reporting, and there is little doubt that non-reporting of pre-specified outcomes has the potential to cause bias (Chan et al, 2004; Williamson et al, 2005; Dwan et al, 2008; Smyth et al, 2011).
Detecting outcome reporting bias can be labour intensive. If the trial protocol is available or can be obtained from the study team, the outcomes in the protocol and the published report can be compared. If not, the outcomes listed in the methods section of an article can be compared with those whose results are reported. Further information can also be sought from the authors of the study reports, although such information may be unreliable (Chan et al, 2004).
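At its core, this comparison is a set difference between the pre-specified and the reported outcome lists, as the sketch below illustrates; the outcome names are invented and, in practice, the two lists must be compiled by hand from the protocol and the published report.

```python
# Hypothetical outcome lists transcribed from a trial protocol and
# from the published report
protocol_outcomes = {
    "caesarean section rate",
    "duration of labour",
    "maternal satisfaction",
    "neonatal unit admission",
}
reported_outcomes = {
    "caesarean section rate",
    "duration of labour",
}

# Pre-specified outcomes that never appear in the report are
# candidates for outcome reporting bias
for outcome in sorted(protocol_outcomes - reported_outcomes):
    print(f"not reported: {outcome}")
```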
Bias
Awareness and recognition of bias is important when reading any evidence, especially if this knowledge is to be transferred into clinical practice. Jadad and Enkin (2007) outline a number of biases that can occur during the course of a randomised controlled trial. One bias that receives less attention, however, is that introduced by the reader of a trial report (Owen, 1982). Focus and attention on bias in clinical trials generally occur at the outset and planning of the trial and during the trial itself; less attention is paid to how the results are interpreted and translated into clinical practice. If the best evidence is to be applied, it is essential that readers suspend any possible biases of their own and read publications carefully. Reader biases were first described by Owen (1982); Jadad and Enkin (2007) outline a number that they believe to be common but rarely reported on (Table 1).
Table 1. Reader biases (Owen, 1982; Jadad and Enkin, 2007)

| Bias | Description |
|---|---|
| Relationship to the author bias | Includes rivalry bias (underrating the strengths or exaggerating the weaknesses of studies published by a rival) and ‘I owe him one’ bias (favouring flawed results from a study by someone who did the same for the reader) |
| Personal habit bias | Overrating or underrating a study according to the reader's own personal habits |
| Moral bias | Overrating or underrating a study according to how far it agrees or disagrees with the reader's moral views |
| Clinical practice bias | Overrating or underrating a study according to whether it supports or challenges current or past clinical practice |
| Institution bias | Judgement influenced by the way things are done, or not done, locally |
| Territory bias | Overrating studies that support the reader's own speciality or profession |
| Tradition bias | Overrating or underrating a study according to whether it supports or challenges traditional procedures |
| ‘Do something’ bias | Favouring a study that suggests an intervention is effective when no alternative is available |
| Technology bias | An attraction or an aversion to technology |
| Resource allocation bias | Judgement influenced by the reader's preferred approach to resource allocation |
| Printed word bias | Overrating a study because of undue confidence in published data |
| Prestigious journal bias | Overrating the results of studies published in prestigious journals |
| Prominent author bias | Overrating the results of studies published by prominent authors |
| Non-prominent author bias | Underrating the results of studies published by non-prominent authors (‘who is s/he?’ bias) |
| Professional background bias | Judgement influenced by the author's professional background; includes esteemed author bias, esteemed professor bias and friendship bias |
| Geography bias | Judging a study according to the country or region where it was conducted |
| Language bias | Considering studies published in languages other than English to be of inferior quality |
| Complementary medicine bias | Systematic overrating or underrating of studies that describe complementary medicine interventions |
| Flashy title bias | Overrating the results of studies with attractive titles (particularly by patients or journalists) or underrating them (particularly by academics, if they regard them as sensationalist) |
| Substituted question bias | Substituting a question for the one the study was designed to answer and regarding the results as invalid if they do not answer the substituted question |
| Vested interest bias | Includes bankbook bias and cherished belief bias |
| Reader attitude bias | Includes belligerence bias, empiricism bias and careless reading bias |
Funding/industry
Industry funding of clinical trials is controversial. In midwifery, few trials are funded by commercial entities, probably owing to concerns over impartiality and the potential for bias. In a study of baby skin care funded by Johnson & Johnson (Lavender et al, 2009), one health visitor stated at the start of the programme of research:
‘It doesn't quite sit easily with me, but I'm not entirely sure the reasons why. I think I just wonder whether they would be very influential if they funded it … I think they would possibly cherry-pick information … they wouldn't want information that would destroy their product.’ (Health visitor 45)
Midwives are right to remain cautious when they read a trial paper that declares that funding has been received from industry. However, there are a number of things that should be considered before dismissing a trial on the basis of industry funding alone. The most important question is ‘Will this study answer a clinically important question?’ In considering this, one must determine whether it is only industry that will benefit or whether the study will produce important information for women, their families and/or health professionals.
Next, how the trial was managed should be considered. There are two types of industry-funded study: those that are industry-led and those that are investigator-led. In the latter, the research team has full autonomy over the research design, execution and analysis of the data; moreover, in most cases, the commercial company does not have access to the raw data. Contractual arrangements are another important issue. For investigator-led trials, it is good practice for the research team to have a contractual agreement with the commercial company allowing the data to be published regardless of the findings, as was the case with the skin care trials (Lavender et al, 2012; 2013). This can make a commercial company vulnerable; however, in the study mentioned earlier (Lavender et al, 2009), one midwife stated:
‘This study has renewed my faith in industry. If J & J were prepared to subject their product to a well-designed trial, well good for them. Other industries should do the same.’ (Midwife 7)
Furthermore, it is important to determine who actually wrote the paper. Commercial companies often employ medical writers to report study findings. While this is not necessarily a bad thing, it creates more potential for the introduction of bias, particularly in the interpretation of the findings.
Randomised trial assessment tool
A number of assessment tools are available to assist midwives in critically evaluating randomised controlled trials. The Critical Appraisal Skills Programme (CASP) tool (http://www.casp-uk.net/), for example, provides a useful checklist to aid this process. A tool such as this can help midwives familiarise themselves with trial methodology and is highly recommended.
Conclusion
This paper has provided a basic overview of some of the important considerations in determining whether a trial is robust enough to provide good evidence to inform practice. Midwives should not take trial results at face value; instead, they should develop their skills in critiquing trial methodology so that they can determine the applicability of a trial to their practice. Trials offer the most robust evidence when comparing two or more interventions, although not all trials are well conducted. Furthermore, trials may only provide some of the answers: qualitative research alongside clinical trials can provide additional evidence on what works, for whom and in what setting.