How to conduct meta-analysis: A Basic Tutorial


individuals or population of interest to us. For example, if we are interested in the effectiveness of a drug such as nedocromil on bronchoconstriction (narrowing of the air passages) among adult asthma patients, then we include only adult asthmatics in our study, not children or older adults (if those groups are not of interest to us). Likewise, if we want to study the effectiveness of mindfulness meditation for anxiety in adults, then the adult age group is again our population of interest, and we could narrow the age band further if needed.
Intervention needs to be defined as broadly or as narrowly as the question requires, keeping only the interventions of interest. Usually, meta-analyses pool RCTs or quasi-experimental studies in which pairs of conditions are compared (intervention versus placebo, intervention versus conventional treatment, or intervention versus no treatment) (Normand, 1999). Note that meta-analyses are not restricted to randomised controlled trials; they are increasingly applied to observational study designs as well, for example cohort and case control studies. In these situations, we refer to the specific exposure variables of interest (Stroup et al., 2000). Meta-analyses are also conducted for diagnostic and screening studies (Hasselblad and Hedges, 1995).
Let's say we want to test the hypothesis that consumption of plant-based diets is associated with a reduced risk of cardiovascular illness. For ethical reasons, it is not possible to conduct a randomised controlled trial in which one group is forced to consume a plant-based diet and the other group a non-plant-based diet. It is possible, however, to obtain information about heart disease from two groups of people who have and have not consumed certain levels of vegetarian items in their diets. Such studies are observational epidemiological studies, such as cohort and case control studies, and in such situations it is useful to summarise their findings. The term "Intervention" is then not appropriate; instead, we use the term "Exposure".
The comparison group is important as well. It can be "no intervention", "placebo", or "usual treatment".
The outcomes that we are interested in can be narrowly or broadly defined, depending on the objective of the meta analysis. If the outcome is narrowly defined, then the meta analysis is restricted to that outcome. For instance, if we want to study the effectiveness of mindfulness meditation on anxiety, then anxiety is our outcome; we are not interested in finding out whether mindfulness is effective for depression. On the other hand, if the objective of the study is to test whether mindfulness meditation is useful for "any health outcome", then the scope of the search is much wider.
So, after you have set up your theory and your question, it is time to reframe the question in PICO format. Say we want to find out if mindfulness meditation is effective for anxiety; then we may state the question in PICO as follows:
• P: Adults (age 18 years and above), both sexes, all ethnicities, all nationalities
Then, on the basis of PICO, we reframe the question as follows: "Among Adults, compared with all other approaches, what is the effectiveness of Mindfulness Meditation for the relief of Anxiety?"
Step II: Conduct a Search of the Literature Databases
After you have decided on the PICO, you will conduct a search of the literature databases. The PICO will help you to identify the appropriate search terms. These search terms are arranged using Boolean logic, fuzzy logic, controlled vocabulary specific to the database, symbols of truncation or expansion, and placement of the terms in different sections of a reported study (Tuttle et al., 2009). In Boolean logic, you use the connectors "AND", "OR", and "NOT" in various combinations to expand or narrow down the search results. For example:
• "Adults" AND "Mindfulness Meditation" will find only those articles that have BOTH adults AND mindfulness meditation as their subject topics.
• "Adults" OR "Mindfulness Meditation" will find all articles that have EITHER "Adults" OR "Mindfulness Meditation" in their subject topics, so the number of results returned will be larger.
• "Adults" NOT "Mindfulness Meditation" will find only those articles that contain "Adults" and will exclude all articles that have "Mindfulness Meditation" as their topic area.
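The effect of the three connectors can be pictured as set operations on the lists of articles each term retrieves. The sketch below is only an analogy in base R: the article IDs are made up, and a real database applies these operators to its own index rather than to vectors.

```r
# Hypothetical article IDs returned by two single-term searches
# (made-up values, purely for illustration).
adults      <- c(101, 102, 103, 104, 105)
mindfulness <- c(104, 105, 106, 107)

# AND: articles indexed under both terms (narrows the search)
and_hits <- intersect(adults, mindfulness)

# OR: articles indexed under either term (widens the search)
or_hits <- union(adults, mindfulness)

# NOT: articles on adults that are not indexed under mindfulness
not_hits <- setdiff(adults, mindfulness)

print(and_hits)
print(or_hits)
print(not_hits)
```

The OR result is never smaller than either single-term result, which is why it returns the largest number of articles.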
In addition to Boolean logic, you can also use "fuzzy logic" (proximity searching) to find specific articles. Here you use search expressions such as "Adults" NEAR "Mindfulness" or "Adults" WITHIN 5 Words of "Mindfulness", which retrieve articles where the two terms occur close to each other and so make the search very specific. These expressions can be combined in many different ways.
Many databases, such as Pubmed/Medline, use MeSH (Medical Subject Headings) as a controlled vocabulary: the curators of these databases index different articles under specific search terms (Robinson and Dickersin, 2002). When you search Medline or Pubmed, you can use MeSH terms to search for your studies. You can also combine MeSH terms with other terms to search more widely or more comprehensively.
Besides these, you will use specific symbols such as the asterisk (*) or the dollar sign ($) to indicate truncation and find related terms. For example, if you use something like "Meditat$" in a search term, then you can find articles that use the terms "meditating", "meditation", "meditative", or "Meditational". You will find the list of such symbols in the documentation of the database that you intend to search (Robinson and Dickersin, 2002).
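To see what a truncated term such as "Meditat$" would pick up, the same idea can be mimicked with a regular expression in R. This is only an analogy: the word list is made up, and each database has its own truncation syntax that differs from R's regular expressions.

```r
# Candidate words as they might appear in titles (illustrative list).
words <- c("meditating", "meditation", "meditative", "Meditational",
           "medication", "mediation")

# Truncation means: the stem "meditat" followed by any ending.
# The pattern below is an R analogy, not the database's own syntax.
matches <- words[grepl("^meditat", words, ignore.case = TRUE)]

print(matches)
```

Words that merely look similar, such as "medication" and "mediation", do not share the stem and are not matched.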
Finally, search terms can occur in many different sections of a study report. One way to search is to look only in the title and abstract of each study. Another is to search within the entire body of the article. By combining these various strategies, you can run a comprehensive search for the publications or research that contain data you can use for your meta-analysis.
Step III: Select the articles for meta analysis by reading Titles and Abstracts and full texts
First, read the titles and abstracts of all the papers your search retrieved. But before you do so, set up a scheme by which you will select and reject articles for your meta analysis. For example, you can set up a scheme with the following rejection criteria:
• The article is irrelevant for the study question
• The article does not have the relevant population
• The article does not have the relevant intervention (or exposure)
• The article does not have a relevant comparison group
• The article does not discuss the outcome that is of interest to this research
• The article is published in a non-standard format and not suitable for review
• The article is published in a foreign language and cannot be translated
• The article is published outside of the date ranges
• The article is a duplicate of another article (same publication published twice)
Use this scheme to go through each and every article you retrieved initially, on the basis of reading their titles and abstracts. Usually one criterion is enough to reject a study. Note the criterion on which the study was rejected; even if a study could be rejected on two criteria, the first one that rejects it is recorded as the main reason for rejection. You will need to put together a process diagram to indicate which articles were rejected and why. Such a process diagram is referred to as a PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) chart (Moher et al., 2009).
After you have run through this step and have identified the studies that remain candidates for the meta-analysis, obtain their full texts. Then read the full texts, repeat the rejection exercise, and note the numbers. As may be expected, you will reject fewer articles in this round. Then hand search the reference lists of these articles to widen your search. This step is critical. Often, in this step, you will find sources that you must search, or identify authors whose work you must read to get a full list of the research that has been conducted on this topic. Do not skip this step. You will notice that some authors feature prominently and some research groups surface repeatedly; take note of them. You may have to write to a few authors to find out whether they have published more research. All this is needed to run a thorough search, so that you do not miss any study that may be relevant for this meta analysis.
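The rule of recording only the first criterion that rejects an article can be sketched in R as follows. The screening sheet, its column names and its values are hypothetical; in practice you would fill in one row per retrieved article as you read titles and abstracts.

```r
# Hypothetical screening sheet: one row per retrieved article, one logical
# column per exclusion criterion (TRUE means the article fails that criterion).
screen <- data.frame(
  id                 = 1:6,
  irrelevant         = c(FALSE, TRUE,  FALSE, FALSE, FALSE, FALSE),
  wrong_population   = c(FALSE, TRUE,  TRUE,  FALSE, FALSE, FALSE),
  wrong_intervention = c(FALSE, FALSE, TRUE,  FALSE, TRUE,  FALSE),
  duplicate          = c(FALSE, FALSE, FALSE, FALSE, FALSE, TRUE)
)

criteria <- setdiff(names(screen), "id")

# For each article, keep the first criterion (in column order) that rejects it;
# NA means the article passed every check and is retained.
first_failed <- function(flags) {
  hit <- which(flags)
  if (length(hit) == 0) NA_character_ else criteria[hit[1]]
}
screen$reason <- apply(as.matrix(screen[, criteria]), 1, first_failed)

# Counts that would feed the boxes of the flow diagram
excluded_by_reason <- table(screen$reason)
n_retained <- sum(is.na(screen$reason))

print(excluded_by_reason)
print(n_retained)
```

Article 2 fails two criteria but is counted once, under the first one, so the exclusion counts and the retained count always add up to the number of articles screened.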
Step IV: Abstract information from these articles
Once you have the set of studies that you will work with, you need to abstract data from them for your study. At the minimum, you must obtain the following information for each study included in your analysis:
1. The name of the first author
This is just a suggestion; I do not recommend a fixed set of variables, and you will determine what variables you need for each meta analysis. If you use software such as RevMan, it will guide you through the process of abstracting data from each article, and you should follow the steps there. Note that here we are only considering tabulation of this information per article, and that we work with one intervention and one outcome in each table. You may have more than one outcome in a paper; in that case, you will need to set up different tables. Enter this information in a spreadsheet, and export the spreadsheet as a csv file that you can read into R. In this exercise we will use R for statistical computing (R Core Team, 2013).

Step V: Determine the quality of information of these articles
For each study, you will need to critically appraise the information contained within it and decide whether the study meets the internal validity criteria. At the minimum you will need to identify the following:
• What are the theory and the hypotheses this research is about?
• Is the sample size adequate for the research question? Is this study underpowered?
• To what extent did the authors eliminate biases in the study? Even if it is an RCT, was there blinding? How confident are you that the authors conducted an appropriate randomisation procedure? What is the likelihood that the groups that were compared were very different with respect to prognosis? If this is an RCT, did the authors conduct an intention to treat analysis?
• If this is an observational study, how did the authors eliminate the risk of selection bias? To what extent was the risk of information bias from the participants eliminated?
• What confounding variables were controlled for? Are these confounding variables sufficient? (This requires that you know something about the biology of the relationship; if you are not confident, ask someone.)
A great way to ascertain the quality of each article (or rather, of each outcome within an article) is to use the GRADE (Grading of Recommendations Assessment, Development and Evaluation) criteria and the GRADEpro software to judge the quality of the outcome-intervention pairing.
Step VI: Determine the extent to which the articles are heterogeneous
Think about the distinction between a systematic review and a meta analysis. A systematic review is one where the analysts follow the same steps as above (frame a question, conduct a search, identify the right type of research, extract information from the articles). Then, in a systematic review but not in a meta analysis, all studies that are fit to be included get summarised, and patterns of information are tabulated and itemised. This means that all study findings for a set of outcomes and interventions are identified, tabulated and discussed. In a meta analysis, on the other hand, there is an implicit assumption that the studies have come from a population that is fairly uniform across the intervention and outcomes. This assumption can take one of two forms. In the first, the body of studies that you have identified is taken to be exhaustive, so the estimates you obtain for the association between the exposure or intervention and the outcome rest on the evidence you have identified and define, or estimate, the true association. This is the concept of fixed effects meta analysis (Hunter and Schmidt, 2000). Alternatively, you can conceptualise the studies you have identified as a sample from a larger population of studies, where this subset is interchangeable with any other study in that wider population. The set of studies is then just a random sample of all possible studies. This is the notion of random effects meta analysis (Hunter and Schmidt, 2000).
So, are the studies similar, or homogeneous, in the scope of the intervention, the population, and the outcomes? It is important to ask this question when we conduct a meta-analysis, because if the studies are so different from each other that it is impossible to pool the results, then we will have to abandon any notion of pooling the study findings to arrive at a summary estimate. If the findings are close enough, then the studies are homogeneous, and we would conclude that it is acceptable to pool the study results using what is referred to as fixed effects meta analysis. If, on the other hand, the studies differ in their results but are sufficiently uniform in other respects (selection of the population, the intervention, and the outcomes), then we can still combine the results. In that case we conclude that the apparent lack of homogeneity arises because these studies are part of a larger population of all possible studies, and we would rather report a random effects meta analysis.
We will discuss two ways to measure heterogeneity of the studies. One way to test for heterogeneity is to use a statistic referred to as Cochran's Q statistic. The Q statistic is a chi-square statistic.
The assumption here is that the studies are all from the same "population" and are therefore homogeneous, so a fixed-effects meta-analysis would be an appropriate way to express the summary findings. Accordingly, the software first calculates a fixed-effects summary estimate, which is a weighted average of the individual effect sizes. The weight of each study is determined by the variance of its effect estimate: the lower the variance, the higher the weight (the weight is the inverse of the variance).
Then, the weighted sum of squared differences between the summary estimate and each individual estimate has a chi-squared distribution with K − 1 degrees of freedom, where K is the number of studies. A low chi-square value indicates that the studies are indeed homogeneous; a high value indicates that they are heterogeneous. If the studies are found to be statistically heterogeneous, the next step is to check whether there are real reasons for the heterogeneity, i.e., whether the populations, the interventions, and the outcomes are very different from each other. If this is indeed the case, then you would summarise the study findings as you would in a systematic review. On the other hand, if you find that the studies are otherwise similar, but one or more studies drag the summary estimate in one direction rather than another, you would assume that while the studies are not homogeneous, they may be drawn from a larger pool of studies. In that case you may conduct a random effects meta analysis.
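The calculation described above can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions: the effect estimates and variances are made up, and the function name fixed_effect_q is hypothetical rather than taken from any meta-analysis package.

```r
# Hypothetical effect estimates (for example, mean differences) and their
# variances for K = 5 studies; the numbers are made up for illustration.
yi <- c(-1.2, -0.4, -2.1, -0.8, -1.5)
vi <- c(0.30, 0.10, 0.90, 0.20, 0.50)

fixed_effect_q <- function(yi, vi) {
  wi    <- 1 / vi                     # inverse-variance weights
  theta <- sum(wi * yi) / sum(wi)     # fixed-effects summary estimate
  se    <- sqrt(1 / sum(wi))          # standard error of the summary estimate
  q     <- sum(wi * (yi - theta)^2)   # Cochran's Q: weighted squared deviations
  df    <- length(yi) - 1             # K - 1 degrees of freedom
  p     <- pchisq(q, df, lower.tail = FALSE)
  list(theta = theta, se = se, q = q, df = df, p = p)
}

res <- fixed_effect_q(yi, vi)
print(res)
```

A small p value here would suggest more variation between the study estimates than sampling error alone would explain.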
Another measure of statistical heterogeneity for meta analyses is the $I^2$ estimate. It is derived from a related estimate referred to as $H^2$, which is given by $H^2 = Q/(K - 1)$, where $K$ is the number of studies. If $Q > K - 1$, then $I^2$ is defined as $(H^2 - 1)/H^2$; otherwise $I^2$ is given a value of 0. For example, let's say we are working with 10 studies and the Q statistic is 36 (meaning that the weighted sum of squared differences between the estimated fixed effect size and the individual effect size estimates is 36). As $Q > 9$ for 10 studies ($K = 10$), $H^2 = 36/9 = 4$ and $I^2 = (4 - 1)/4 = 3/4$, or 75%. A high I-squared value means gross heterogeneity, while a low value implies homogeneity of the studies (the cut-off is conventionally set at about 30%).

Step VII: Estimate summary effect estimate
First, we determine the summary effect estimate under both the fixed and the random effects models. Second, we construct a forest plot to inspect visually how the effect estimates of the individual studies are distributed around the null value and around the overall effect estimates.
Fixed and random effects models refer to two different assumptions. The fixed effects model assumes that the populations on which the studies are based are uniform enough that this set of studies is sufficient to draw conclusions about the relationship between the exposure and the outcome. The random effects model relaxes the assumption that the studies arose from the same population and are sufficient in themselves; instead, it treats the studies as part of an interchangeable body of evidence.
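The $I^2$ definition from Step VI and the two summary estimates can be computed side by side in base R. The sketch below uses made-up data and hypothetical function names. For the between-study variance it uses the DerSimonian-Laird moment estimator, which is an assumption on my part: the text above does not say which estimator a given software package applies.

```r
# I-squared from Q and the number of studies K, as defined in Step VI.
i_squared <- function(q, k) {
  h2 <- q / (k - 1)
  if (q > k - 1) (h2 - 1) / h2 else 0
}

i_squared(36, 10)  # 0.75, the worked example from Step VI

# Fixed and random effects summaries; tau2 is the DerSimonian-Laird
# estimate of the between-study variance (an assumed choice of estimator).
random_effects_dl <- function(yi, vi) {
  k     <- length(yi)
  wi    <- 1 / vi
  fixed <- sum(wi * yi) / sum(wi)
  q     <- sum(wi * (yi - fixed)^2)
  c_dl  <- sum(wi) - sum(wi^2) / sum(wi)
  tau2  <- max(0, (q - (k - 1)) / c_dl)
  w_re  <- 1 / (vi + tau2)            # random effects weights
  list(fixed = fixed, se_fixed = sqrt(1 / sum(wi)),
       random = sum(w_re * yi) / sum(w_re), se_random = sqrt(1 / sum(w_re)),
       tau2 = tau2, i2 = i_squared(q, k))
}

# Hypothetical, deliberately heterogeneous study results.
yi <- c(-3.0, -0.2, -2.5, 0.4, -1.5)
vi <- c(0.30, 0.10, 0.90, 0.20, 0.50)

res <- random_effects_dl(yi, vi)
print(res)
```

When the estimated between-study variance is zero the two models give the same answer; when it is positive, the random effects weights are more nearly equal and the summary estimate has a larger standard error.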

Forest Plot of the results
In addition to the heterogeneity statistics and the summary effect estimates from the fixed effects and random effects meta-analyses, we will also inspect a plot of all studies included in the meta analysis; this is referred to as a "forest plot". In the forest plot, the effect estimate of each study is presented as a square box. The area of the box is proportional to the weight assigned to that study, and the weight in turn depends on the variance: the higher the variance, the smaller the area (the area is inversely proportional to the variance of each study). Across each study estimate runs a horizontal line, whose length equals the width of the 95% confidence interval for that study's effect estimate. The studies themselves are organised along the y-axis of the plot; the order in which they are arranged can be varied, or left as in the data set you created. The effect sizes are presented on the x-axis. A neutral point is marked on the x-axis: this is 1.0 when binary outcomes are studied, so that the effect of each study is measured as a relative risk or odds ratio, or 0 when you used continuous outcome measures, so that the effect is the difference between those with the intervention or exposure and those in the control arm. A vertical broken line passes through the neutral point to separate the two sides of the plot. When you are testing an intervention, the plot will state that one side of the neutral line "favours intervention" and the other side "favours control". In addition to these elements (the x-axis and the boxes that show the effect measure of each study), we also see two diamonds. These diamonds represent the summary effect estimates from the fixed effects and the random effects meta analyses. The diamonds do not have a line for their 95% confidence interval; instead, the width of each diamond represents the 95% confidence interval around the summary estimate.
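The elements described above can be assembled with base R graphics. The following is a rough sketch with made-up data and hypothetical function names (forest_data, draw_forest); it draws a single fixed effects diamond only, and dedicated meta-analysis packages produce far more polished forest plots.

```r
# Hypothetical study-level mean differences and standard errors.
studies <- data.frame(
  study = c("Study A", "Study B", "Study C", "Study D"),
  yi    = c(-1.2, -0.4, -2.1, -0.8),
  sei   = c(0.55, 0.32, 0.95, 0.45),
  stringsAsFactors = FALSE
)

# Confidence limits, relative weights and the fixed effects summary.
forest_data <- function(d, null_value = 0) {
  z  <- qnorm(0.975)
  wi <- 1 / d$sei^2                       # inverse-variance weights
  d$lower  <- d$yi - z * d$sei
  d$upper  <- d$yi + z * d$sei
  d$weight <- wi / sum(wi)
  pooled    <- sum(wi * d$yi) / sum(wi)
  pooled_se <- sqrt(1 / sum(wi))
  list(studies = d, null_value = null_value, pooled = pooled,
       pooled_ci = pooled + c(-1, 1) * z * pooled_se)
}

draw_forest <- function(fd) {
  d    <- fd$studies
  k    <- nrow(d)
  rows <- rev(seq_len(k)) + 1             # row 1 is reserved for the diamond
  plot(NA, xlim = range(c(d$lower, d$upper, fd$null_value)),
       ylim = c(0.5, k + 1.5), yaxt = "n",
       xlab = "Effect estimate", ylab = "")
  axis(2, at = c(rows, 1), labels = c(as.character(d$study), "Summary"), las = 1)
  abline(v = fd$null_value, lty = 2)      # broken line through the neutral point
  segments(d$lower, rows, d$upper, rows)  # 95% confidence intervals
  # cex scales the side of the square, so sqrt(weight) makes area track weight
  points(d$yi, rows, pch = 15, cex = 3 * sqrt(d$weight))
  polygon(x = c(fd$pooled_ci[1], fd$pooled, fd$pooled_ci[2], fd$pooled),
          y = c(1, 1.25, 1, 0.75), col = "grey40")  # diamond spans the summary CI
}

fd <- forest_data(studies)
print(fd$studies)
```

Calling draw_forest(fd) in an interactive R session draws the plot; the study with the smallest standard error gets the largest box.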
Step VIII: Assess Publication Bias
Now that you have identified:
• The heterogeneity of the studies
• The summary effect estimate and a forest plot
it's time to test whether there are biases that can affect the study conclusions. This means you will test whether your meta analysis has omitted studies that should have been included (Dickersin, 1990). A study that is based on a large sample and reports positive findings is more likely to be published, and to be identified through searches, than a study that is small and reports equivocal or negative findings (Thornton and Lee, 2000). In a meta-analysis or systematic review, this phenomenon means that our results are based on a selection of studies, with systematic exclusion of studies that are nonetheless important, and this leaves room for bias. This bias is referred to as "publication bias". There may be several reasons for publication bias, including:
• Journal editors prefer to select studies that have interesting findings
• Funders are more likely to support studies that are large and have positive findings
• Investigators are less likely to publish smaller studies with ambiguous or uninteresting findings
• Smaller studies are delayed in publication and are therefore not captured
If we accept that studies with equivocal findings (that is, findings that either do not support the preferred intervention or do not reach statistical significance) tend to be small and that their findings differ from the summary estimate, then we may expect the following to be true:
1. Large influential studies will have effect estimates close to the summary estimate
2. Large influential studies will be few (one or two)
3. Smaller studies may have widely variable effect estimates, equally distributed around the summary estimate
4. The smaller a study, the more variable its results will be
So, if we consider small studies, they will be widely distributed around the neutral line, or around a line representing the summary estimate, when plotted in a graph.
These expectations can be tested by plotting the effect estimates of the studies on the x-axis and either the sample size of the studies or the variability of the effect measure (variance, standard deviation or a similar measure) on the y-axis. If there is no serious publication bias, the plot resembles a funnel, with one or two dots representing studies with large sample size or low variance and an effect estimate close to or identical to the summary estimate. The base of the funnel is populated by small studies (or studies with large variances) with effect estimates scattered evenly around the summary estimate (Duval and Tweedie, 2000). If, on the other hand, there is publication bias, then we would expect one of the lower quadrants of the "funnel" to be absent or blank. This is a visual assessment, and most meta analysis packages and software can produce this plot.
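The plot described above can be sketched with base R graphics. The data and the function names (funnel_coords, draw_funnel) are hypothetical, and the count of small studies on each side of the summary estimate is only a crude visual aid, not a formal test of publication bias.

```r
# Hypothetical effect estimates and standard errors; the small studies
# (large standard errors) were chosen to sit mostly on one side.
yi  <- c(-0.5, -0.6, -0.4, -1.4, -1.8, -1.1, -0.2)
sei <- c(0.10, 0.15, 0.20, 0.45, 0.60, 0.50, 0.40)

funnel_coords <- function(yi, sei) {
  wi     <- 1 / sei^2
  pooled <- sum(wi * yi) / sum(wi)        # fixed effects summary estimate
  side   <- factor(ifelse(yi < pooled, "left", "right"),
                   levels = c("left", "right"))
  data.frame(yi = yi, sei = sei, side = side, pooled = pooled)
}

draw_funnel <- function(fc) {
  se_grid <- seq(0, max(fc$sei), length.out = 50)
  # Reversed y-axis: the most precise studies appear at the top.
  plot(fc$yi, fc$sei, ylim = rev(range(se_grid)), pch = 19,
       xlab = "Effect estimate", ylab = "Standard error")
  abline(v = fc$pooled[1], lty = 2)
  # Pseudo 95% limits that form the sides of the funnel
  lines(fc$pooled[1] - qnorm(0.975) * se_grid, se_grid, lty = 3)
  lines(fc$pooled[1] + qnorm(0.975) * se_grid, se_grid, lty = 3)
}

fc <- funnel_coords(yi, sei)

# Crude check: where do the less precise studies fall?
small <- fc$sei > median(fc$sei)
print(table(fc$side[small]))
```

Calling draw_funnel(fc) in an interactive session draws the funnel; a lopsided table for the small studies corresponds to an empty lower quadrant in the plot.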
Step IX: Run subgroup analyses and meta regression
After you have examined the heterogeneity of the studies, estimated the summary effect size, drawn the forest plot, and tested for publication bias with the funnel plot, you can comment on the association between the exposure or the intervention and the outcome. But there may still be aspects of the studies, or characteristics of the participants, that need to be examined in separate analyses. For example, you could examine the relationship between the intervention and the outcome when only studies of longer duration, or studies with predominantly sicker participants, are included. Or you could regress the effect estimate on the average age of the participants in the studies to test whether the summary estimate varies with age. Such analyses are referred to as subgroup analyses or meta-regression and are part of any meta analysis.
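Regressing the effect estimates on a study-level characteristic such as average age amounts to weighted least squares with inverse-variance weights. The sketch below is a fixed effects version in base R with made-up data and a hypothetical function name; it ignores any between-study variance, which dedicated packages can include.

```r
# Hypothetical study-level data: effect estimate, its variance, and the
# mean age of the participants in each study.
yi       <- c(-1.0, -1.4, -0.6, -2.0, -1.7)
vi       <- c(0.20, 0.35, 0.15, 0.50, 0.40)
mean_age <- c(35, 48, 30, 62, 55)

meta_regression <- function(yi, vi, x) {
  X <- cbind(intercept = 1, slope = x)    # design matrix
  W <- diag(1 / vi)                       # inverse-variance weights
  xtwx_inv <- solve(t(X) %*% W %*% X)
  beta <- drop(xtwx_inv %*% t(X) %*% W %*% yi)
  # The within-study variances are treated as known, so the standard errors
  # come directly from (X'WX)^-1 with no residual variance multiplier.
  se <- sqrt(diag(xtwx_inv))
  data.frame(term = colnames(X), estimate = beta, se = se,
             p = 2 * pnorm(abs(beta / se), lower.tail = FALSE))
}

mr <- meta_regression(yi, vi, mean_age)
print(mr)
```

The coefficients agree with lm(yi ~ mean_age, weights = 1 / vi), but lm() rescales the standard errors by an estimated residual variance, so its standard errors and p values should not be read as meta-regression results.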
In summary, a meta-analysis is a method of analysis in which data from diverse studies are synthesised to arrive at a summary estimate. The steps of a meta analysis are similar to those of a systematic review and include framing a question, searching the literature, abstracting data from individual studies, estimating summary effects, and examining publication bias. It is very important to conduct subgroup analyses and meta regression to test how the summary effects change with different types of studies or different characteristics of the participants. We now move to a real life example of a meta-analysis to illustrate a few of these points.

Meta Analysis: Reanalysis of DASH diet on hypertension
The dietary approaches to stop hypertension (DASH) is a diet and lifestyle based intervention to prevent hypertension and related illnesses (Sacks et al., 2001). In this meta analysis, we want to find out whether longer term salt restriction is beneficial for people with normal blood pressure as well. We are going to rerun a meta analysis from the paper by FJ He and GA MacGregor (He and MacGregor, 2002). We are only going to look at the subset of studies in the paper that deal with normotensives. Here is a simplified sequence (seven steps rather than the nine we outlined earlier):
• Step I: PICO question and framing of search terms
• Step II: Listing of the studies on which they based their meta analysis

PICO
"Does moderate restriction of salt in the diet, when compard with no salt restriction lead to reduction in blood pressure for normotensive individuals (that is, those with normal levels of blood pressure)?"Following this, here is the screen shot of the search they conducted:

Identification of studies
Note the search terms and follow the search process in the following diagram. In this example, we will use the PRISMA chart (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) to understand the process, and we will use a total of 11 studies to illustrate meta analysis.

Abstraction of data and setting up our own table
We reconstruct the spreadsheet table and reanalyse part of the data. We have saved the data for normotensive individuals in the file hypertension.csv. We first read the data and save it to a data frame in R:
htn_meta <- read.csv("hypertension.csv", header = TRUE)

Examination of Heterogeneity
If we summarise the effect estimates without weighting the studies in any way, we see that the average drop in diastolic blood pressure among normotensive individuals who were on a low salt diet over the long term was about one point. Systolic blood pressure in these studies dropped by three points. Let's now run a formal meta analysis to see if the weighted averages are any different.
The forest plot suggests that although a few small studies show a strong effect, most studies are within the two point drop mark.
Let's check the summary estimates for diastolic blood pressure,

Examination of Publication bias
Now let us examine evidence of publication bias in this meta analysis. We will do this with the help of a funnel plot. We call the funnel() function in R and examine the resulting plot. Note that the x-axis of this plot gives the effect size of each study and the y-axis gives the standard error. As the standard error is essentially a function of the sample size, the studies with the smallest standard errors (that is, the largest samples) are placed at the top of the y-axis, and those with the largest standard errors (that is, the smallest samples) at the bottom. An examination of this plot reveals that the lower right quadrant of the funnel is "empty". This indicates that the data in this meta analysis are mostly derived from large studies (that is, studies with relatively low standard error) and from studies with large effect estimates in the direction that favours the intervention (the left side of the funnel base).
Figure 5. Funnel plot, where we see that there is a relative absence of studies in the right lower quadrant.

Subgroup analysis and meta regression
We are not done yet. We have only reviewed some aspects of this analysis. We still need to examine whether there were important differences between the studies, as our original tests for heterogeneity found that the studies were heterogeneous. We also had different types of studies: some had a longer duration of treatment and others a shorter one, and some were crossover trials while others were parallel arm trials. Is it possible that these groups of studies would yield different summary estimates? To examine this possibility, we run what is referred to as subgroup analysis or meta regression. If you can divide the set of studies into groups based on a categorical variable (for which you have collected data and which you have included in the data set used for the analyses), then you can conduct a subgroup analysis. If instead you have a continuous variable (say age, or the concentration of a specific biomarker), then you can conduct a meta regression, where you regress the effect estimates on the factors that can influence or explain the relationship. Remember that you will need at least 10 studies to run subgroup analyses. In the analysis presented below, we ran the subgroup analysis based on whether the studies were crossover trials or parallel arm trials. You can see that the parallel arm trials were more homogeneous and had a smaller effect size, while the crossover trials were more heterogeneous and had a larger effect size. Even then, there was no statistically significant difference in overall effect size between the two types of trial.
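A subgroup comparison of this kind can be sketched in base R by pooling each group separately and then testing whether the group summaries differ. The data below are made up and do not reproduce the He and MacGregor studies; the function name is hypothetical, and the sketch uses fixed effects pooling within each group.

```r
# Hypothetical effect estimates, variances and trial design for six studies.
yi     <- c(-1.8, -2.4, -0.9, -0.6, -0.8, -0.5)
vi     <- c(0.40, 0.60, 0.30, 0.10, 0.15, 0.20)
design <- c("crossover", "crossover", "crossover",
            "parallel", "parallel", "parallel")

subgroup_analysis <- function(yi, vi, group) {
  wi      <- 1 / vi
  overall <- sum(wi * yi) / sum(wi)
  by_group <- lapply(split(data.frame(yi, wi), group), function(g) {
    est <- sum(g$wi * g$yi) / sum(g$wi)             # pooled estimate in the group
    data.frame(k = nrow(g), estimate = est,
               se = sqrt(1 / sum(g$wi)),
               q_within = sum(g$wi * (g$yi - est)^2))  # heterogeneity in the group
  })
  tab <- do.call(rbind, by_group)
  # Between-group Q: weighted squared distance of each group summary
  # from the overall summary, with (number of groups - 1) degrees of freedom.
  q_between <- sum((1 / tab$se^2) * (tab$estimate - overall)^2)
  df <- nrow(tab) - 1
  list(groups = tab, q_between = q_between, df = df,
       p = pchisq(q_between, df, lower.tail = FALSE))
}

sg <- subgroup_analysis(yi, vi, design)
print(sg)
```

The q_within column shows how heterogeneous each group is on its own, and the p value for q_between indicates whether the difference between the group summaries is larger than chance would explain.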

2. The year the article was published
3. The population on whom the study was conducted
4. The type of research (was it an RCT? Or if observational, what type of study was it?)
5. What was the intervention exactly? (A brief description of the intervention)
6. The comparison condition (what was it compared with?)
7. What was the outcome and how was it measured?
8. How many individuals were in the intervention arm (Ne)?
9. How many people were included in the control arm (Nc)?
10. If the outcome was measured on a continuous scale, what was the mean value of the outcome among those in the intervention arm?
11. If the outcome was measured on a continuous scale, the mean of the outcome among those in the comparison condition
12. If the outcome was measured on a continuous scale, what was the standard deviation of the measure for the exposure or intervention?
13. If the outcome was measured on a continuous scale, what was the standard deviation of the measure for the comparison arm?
14. If the outcome was measured on a binary scale (more on this later), the number of people with the outcome in the intervention arm
15. If the outcome was measured on a binary scale, the number of people with the outcome in the comparison arm
16. A quality score or a note on the quality or critical appraisal of each study
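A table holding these variables for a continuous outcome could be set up in R as sketched below. The column names and the two example rows are hypothetical; the file is written to a temporary directory here only so that the example does not touch your working folder.

```r
# Hypothetical abstraction table for a continuous outcome: one row per study.
abstraction <- data.frame(
  author       = c("Author A", "Author B"),
  year         = c(2001, 2005),
  population   = c("Adults with anxiety", "Adults with anxiety"),
  design       = c("RCT", "RCT"),
  intervention = c("Mindfulness meditation", "Mindfulness meditation"),
  comparison   = c("Usual care", "Waiting list"),
  outcome      = c("Anxiety score", "Anxiety score"),
  n_e          = c(40, 55),        # number in the intervention arm (Ne)
  n_c          = c(42, 50),        # number in the control arm (Nc)
  mean_e       = c(12.1, 14.3),    # outcome mean, intervention arm
  mean_c       = c(15.4, 15.0),    # outcome mean, comparison arm
  sd_e         = c(4.2, 5.1),      # outcome SD, intervention arm
  sd_c         = c(4.5, 4.8),      # outcome SD, comparison arm
  quality      = c("high", "moderate"),
  stringsAsFactors = FALSE
)

# Export to csv and read it back, as you would after leaving the spreadsheet.
csv_path <- file.path(tempdir(), "abstraction_example.csv")
write.csv(abstraction, csv_path, row.names = FALSE)
reloaded <- read.csv(csv_path, header = TRUE, stringsAsFactors = FALSE)

str(reloaded)
```

For a binary outcome, the mean and standard deviation columns would be replaced by the number of people with the outcome in each arm.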

Figure 1. The search terms they included in the paper.

Figure 2. The PRISMA chart to select the studies for this review.
Figure 4. Forest plot to study the distribution of the effect estimates of the diastolic blood pressure for the DASH study.