Abstract
In conventional (aggregate data) meta-analysis, the results of many similar studies are statistically combined to yield a single pooled result. Conventional meta-analyses have many limitations. They cannot examine research questions that were not examined in the source studies, interactions between variables cannot be studied, granular analyses cannot be performed, and systematic biases in the source study data will be retained in the pooled results. Individual participant data meta-analysis (IPD-MA) differs from conventional meta-analyses in that, instead of pooling the results of already completed analyses from source studies, the statistical team obtains and processes individual participant data from the source studies. This allows the specification of a new study protocol that can be uniformly applied, across source studies, to the individual participant data. Matters that can thus be harmonized across the source studies include participant eligibility criteria, choice of exposures and outcomes, operational definitions of exposures and outcomes, time points for data examination, and the method of data analysis. IPD-MA can be performed as a 1-stage or 2-stage procedure; the latter is simpler. Whereas IPD-MA overcomes some of the limitations of conventional meta-analysis, it has its own limitations. Obtaining individual participant data can be difficult and time-consuming, reprocessing and reanalyzing source study data requires time and effort, and new biases may be introduced. The new biases arise from lack of availability of individual participant data from all source studies, limitation of the generalizability of findings when harmonization of the study protocol excludes subjects from analysis, loss of randomization structure when participant eligibility restrictions are applied in IPD-MAs of randomized controlled trials, and failure to adequately adjust for necessary covariates. Readers need to be aware of these biases, and authors of IPD-MAs need to report on the potential impact of these biases on their results.
J Clin Psychiatry 2025;86(3):25f16001
Author affiliations are listed at the end of this article.
When many studies examine the same research question, the results of these “source studies” can be statistically combined using different methods to yield a single result. Methods that a statistical team may use for combining source study results include pooled analysis, conventional (aggregate data) meta-analysis, individual participant data meta-analysis (IPD-MA), and network meta-analysis. Which of these methods is appropriate depends on matters such as the availability of source studies, source study design, expertise of the statistical team, and the research questions sought to be answered with the data from the source studies.
An earlier article in this column1 explained conventional meta-analysis and provided brief notes about the other methods of combining data. This article explains IPD-MA. This article will be easy to understand if the reader already understands concepts in conventional meta-analysis. Readers who wish to refresh their knowledge may refer to the earlier article.1
Meta-Analysis
Results in research are biased by matters such as sample characteristics, study methods, and statistical procedures. Different studies contain different biases and therefore present different results even though the research question is the same; the differences are usually in magnitude (eg, A > B vs A >> B), but sometimes in direction, instead (eg, A > B vs A < B). Averaging results across studies, using meta-analysis, can then yield a better representation of what might be true in the real world.1 An assumption here is that the biases in the different studies reasonably represent situations or subpopulations in the real world; so, pooling studies with different biases will yield an estimate that reasonably represents what is true for the population as a whole.
Bias in Conventional Meta-Analyses
Objections are immediately apparent. There is no assurance that, in studies pooled in meta-analysis, the biases will counterbalance each other, or cancel out, or provide an adequate representation of groups and subgroups in the population. So, if studies pooled in meta-analysis are systematically biased in one direction, the results of the meta-analysis will also be biased in that direction and therefore not comprise a valid answer to a research question (internal validity) or a valid representation of what is true in the real world (external validity).
As an obvious example, if randomized controlled trials (RCTs) of a new antidepressant are biased, by design, to recruit patients with at least moderately severe depression, pooling such RCTs in meta-analysis will not provide results about the efficacy of that antidepressant in mild or mild to moderate depression. The same argument applies to other restrictions in sample selection criteria in the source RCTs. In meta-analysis, when studies with limited external validity are pooled, the results of the meta-analysis will also have limited external validity.
As a less obvious example, if nonrandomized cohort studies of psychotropic drug exposure in pregnancy systematically suffer from confounding by indication, then pooling such studies will yield an estimate that also suffers from confounding by indication. This is “less obvious” because when source studies yield consistent results, it is easy to assume that “so many studies can’t be wrong” and to wrongly conclude that an adverse outcome is truly caused by the exposure. In meta-analysis, when studies with limited internal validity are pooled, the results of the meta-analysis will also have limited internal validity.
Limitations of Conventional Meta-Analyses
We now understand that conventional meta-analysis may merely aggregate systematic biases in the source studies and that the results of the meta-analysis, therefore, may lack internal and/or external validity. At best, conventional meta-analysis may detect high heterogeneity in the pooled studies, and this heterogeneity can be explored, post hoc, through subgroup and sensitivity analyses, or through meta-regression (readers may note that subgroup and sensitivity analyses, and meta-regression, can also be specified a priori). However, meta-regression can only examine study-level variables, carries its own limitations,2 and, anyway, cannot be performed without a sufficient number of source studies.
That’s about as much as conventional meta-analysis can do. Pooled results can be obtained only for research questions answered in source study analyses; conventional meta-analysis cannot examine research questions that were not examined in the original studies. Interactions between variables cannot be studied. Granular analyses cannot be performed.
The reason for these limitations is that conventional meta-analysis pools the results of analyses (from source studies) without accessing the raw data from which those analyses were obtained. This is where IPD-MA comes in.
Individual Participant Data Meta-Analysis
Conventional meta-analysis is a statistical procedure that uses source study results to average information across studies. That is, the statistical team uses as its raw data information extracted from text, tables, and figures in the source studies. In contrast, IPD-MA is a statistical procedure that uses source study raw data to average information across studies. That is, the statistical team obtains individual participant data from the source study investigators and uses these individual participant data for meta-analysis.
Advantages of Individual Participant Data Meta-Analysis
According to a systematic review, the first IPD-MA may have been published in 19913; however, a study of adjuvant tamoxifen and cytotoxic therapy in early breast cancer, published in 1988, may claim that privilege.4 Having individual participant data from the source studies gives the meta-analysis team many advantages.2,3 Specifically, the team can harmonize the data across studies (Table 1) and choose to study different outcomes and the drivers thereof (Table 2), producing a new protocol that is uniformly applied to the data in each source study.
IPD-MA has other advantages, too. When there is overlap in samples across different source studies, such as when the data were drawn for overlapping years from the same database, all nonoverlapping subjects can be included (in conventional meta-analyses, only 1 study is chosen from overlapping studies). Next, results presented by source study authors can be verified. Finally, because the IPD-MA team applies its own protocol to the individual participant data, selective reporting and file drawer biases are reduced.
Approaches to Individual Participant Data Meta-Analysis
An IPD-MA can be performed as a 1-stage analysis or as a 2-stage analysis. In a 1-stage analysis, after harmonization of data (Table 1), the data of all the participants in all the source studies are examined in a single analysis in which the clustering of participants in each study is retained; this allows for both within-study and between-studies modeling. The 1-stage approach is useful for examination of interactions between variables but involves more assumptions, is computationally intensive, and requires greater statistical skills. Most IPD-MAs are therefore conducted as 2-stage analyses.
Two-stage IPD-MA is easier to understand. The difference between conventional meta-analysis and 2-stage IPD-MA lies in what is done in the first stage. As a reminder to readers, in conventional meta-analysis the first stage comprises a mostly mechanical extraction of results from the source studies. In contrast, in the first stage of a 2-stage IPD-MA, the data are harmonized. The harmonized data in each source study are then reanalyzed based on the harmonized study protocol. This produces new results for each source study. In the second stage, the new results are pooled using conventional meta-analysis.
Disadvantages of Individual Participant Data Meta-Analysis
When IPD-MA has so many advantages, why are most published meta-analyses still conventional meta-analyses rather than IPD-MAs? The answer is that IPD-MAs are associated with challenges and even disadvantages (Table 3). The attention of readers is particularly drawn to the disadvantage that, when an IPD-MA of RCTs reframes participant eligibility criteria, removal of participants from the source RCTs results in a disturbance of the randomization structure. Thus, the RCTs become nonrandomized cohorts; they are no longer truly RCTs. This disadvantage of IPD-MAs is seldom recognized, let alone acknowledged.
The loss of randomization can be easily recognized in some IPD-MAs. As an example, in a study by Correll et al,5 the PRISMA flow diagram showed that 273 (20.4%) of 1,336 patients in 3 RCTs were excluded because they did not meet the IPD-MA criteria. With so many patients removed, the randomization structure of the RCTs could have been meaningfully disturbed.
Sometimes, the loss of randomization may not be easily noticed in IPD-MAs of RCTs. As an example, in 1 section of a 2-stage IPD-MA described by d’Angremont et al,6 outcomes were assessed in a subgroup of Alzheimer’s disease patients who had delusions (1,515 out of 6,649; 22.8%) or hallucinations (742 out of 6,649; 11.1%) at baseline. Very clearly, when 77% to 89% of the randomized sample was excluded from the analyses, the randomization structure would have been substantially disturbed.
When the groups being compared are no longer randomized at baseline, they could differ substantially on measured, unmeasured, and unknown covariates, many of which could be of importance to the outcome being assessed. The statistical team would then need to adjust their analyses using a full range of the relevant (measured) covariates; the unmeasured and unknown covariates can never be adjusted for. Adjustment for measured covariates can be done in a 1-stage IPD-MA or in the first stage of a 2-stage IPD-MA. However, such adjustment is hardly ever done. One possible reason is that the need for the adjustment is not recognized. Another is that all relevant covariates may not be available in all source studies. A third is that is that to adjust for many variables in small sample studies may result in overfitting and a violation of model assumptions.
A general caveat is that embarking on an IPD-MA may be justified only when there is good reason to expect that individual participant data will be available for most of the studies in the field. As a side note, this may explain why many IPD-MAs have been conducted in cardiology.
Individual Participant Data Meta-Analysis and Bias
As discussed in an early section in this article, the internal and external validity of conventional meta-analyses may be limited by biases. It is tempting for readers to assume that, when individual participant data are available, bias can be reduced. But this does not happen. Unavailability of individual participant data from all eligible studies is a potentially large source of bias. Harmonization of sample selection criteria and removal of ineligible participants results in continued or increased compromise of external validity. Removal of ineligible participants disturbs the structure of randomization (in IPD-MAs of RCTs) and introduces bias that was absent in the source studies. Inadequate adjustment for covariates compromises internal validity. Readers need to consider the applicability of these limitations when reading an IPD-MA.
Table 4 presents criteria for evaluation of IPD-MAs, and unique sources of bias that authors should consider and report in IPD-MAs.
Example of an Individual Participant Data Meta-Analysis
Readers are now encouraged to examine the relatively simple, 2-stage IPD-MA described by d’Angremont et al6; the free full text of this article is available in PubMed Central. The authors of this IPD-MA identified 34 relevant RCTs of cholinesterase inhibitor (ChEI) treatments in patients with Alzheimer’s disease, Lewy body disease, or Parkinson’s disease. They examined whether ChEIs attenuated neuropsychiatric symptoms, especially delusions and hallucinations, in the patients in the RCTs. They were able to obtain individual participant data for only 17 (50%) RCTs; so, there is an immediate concern about potential bias in data availability.
The 17 RCTs provided individual participant data for 7,167 patients, and 6,649 patients from among these were included in the IPD-MA; so, 7.2% of the patients were excluded, disturbing the structure of the randomization. Having individual participant data allowed the authors to examine treatment effect sizes for each of 12 neuropsychiatric symptoms, and for delusions and hallucinations, in particular. Furthermore, having individual participant data allowed the authors to examine treatment effect sizes for delusions and hallucinations only among those patients who actually had these symptoms at baseline. However, the sample size was attenuated by 77% for the delusions analysis and by 89% for the hallucinations analysis, and no covariate adjustments were made to address the substantial disturbance of the randomization structure of the RCTs.
The IPD-MA6 identified statistically significant findings favoring ChEIs, but the effect sizes were very small, and many did not remain significant after correcting for multiple hypothesis testing. Furthermore, the IPD-MA did not examine dose-response relationships, time to onset of benefit, and remission rates.7
Most IPD-MA studies examine many outcomes, and it is not uncommon for a 1-stage analysis to be used for some outcomes and a 2-stage analysis for others. As a caveat, unless authors carefully explain what they did, some IPD-MAs can be complex and hard to understand.
Take-Home Message
IPD-MA has its advantages but is not necessarily the best way to combine data from source studies. Readers of IPD-MAs need to examine how an IPD-MA improves our understanding of answers to a research question but also need to consider whether the IPD-MA introduces new biases. In this context, the criteria for assessment of bias, outlined in Table 4, will be helpful.
Parting Notes
All through this article, “conventional meta-analysis” was the term used to refer to the form of meta-analysis with which most readers are familiar. When distinguishing it from IPD-MA, conventional meta-analysis is usually referred to as aggregate meta-analysis or aggregate data meta-analysis. In similar vein, data extracted for use in conventional meta-analyses are referred to as aggregate data, as distinct from individual participant data.
Article Information
Published Online: July 7, 2025. https://doi.org/10.4088/JCP.25f16001
© 2025 Physicians Postgraduate Press, Inc.
To Cite: Andrade C. A primer on individual participant data meta-analysis and its strengths and limitations. J Clin Psychiatry 2025;86(3):25f16001.
Author Affiliations: Department of Psychiatry, Kasturba Medical College, Manipal Academy of Higher Education, Manipal, India; Department of Clinical Psychopharmacology and Neurotoxicology, National Institute of Mental Health and Neurosciences, Bangalore, India.
Corresponding Author: Chittaranjan Andrade, MD, Department of Clinical Psychopharmacology and Neurotoxicology, National Institute of Mental Health and Neurosciences, Bangalore 560029, India ([email protected]).
Relevant Financial Relationships: None.
Funding/Support: None.
References (7)
- Andrade C. Understanding the basics of meta-analysis and how to read a forest plot: as simple as it gets. J Clin Psychiatry. 2020;81(5):20f13698. CrossRef
- Veroniki AA, Seitidis G, Tsivgoulis G, et al. An introduction to individual participant data meta-analysis. Neurology. 2023;100(23):1102–1110. CrossRef
- Riley RD, Lambert PC, Abo-Zaid G. Meta-analysis of individual participant data: rationale, conduct, and reporting. BMJ. 2010;340:c221. PubMed CrossRef
- Early Breast Cancer Trialists’ Collaborative Group. Effects of adjuvant tamoxifen and of cytotoxic therapy on mortality in early breast cancer: an overview of 61 randomized trials among 28,896 women. N Engl J Med. 1988;319(26):1681–1692. CrossRef
- Correll CU, Doane MJ, McDonnell D, et al. Olanzapine/Samidorphan effects on weight gain: an individual patient data meta-analysis of phase 2 and 3 randomized double-blind studies. J Clin Psychiatry. 2025;86(1):24m15526. CrossRef
- d’Angremont E, Begemann MJH, van Laar T, et al. Cholinesterase inhibitors for treatment of psychotic symptoms in Alzheimer disease and Parkinson disease: a meta-analysis. JAMA Neurol. 2023;80(8):813–823. CrossRef
- Andrade C. Cholinesterase inhibitors for delusions and hallucinations in Alzheimer disease and Parkinson disease: questionably significant benefits. J Clin Psychiatry. 2023;84(4):23f15009. CrossRef
This PDF is free for all visitors!