Introduction
Determining the right sample size is a critical step in designing any clinical trial, and medical device studies are no exception. “How many patients do we need?” is not just a statistical question; it is a regulatory and ethical one as well. Regulators expect sponsors to justify their sample size with a solid scientific rationale.
For instance, the EU Medical Device Regulation explicitly requires that clinical investigations include “an adequate number of observations to guarantee the scientific validity of the conclusions” and mandates a statistical justification and power calculation for the sample size in the clinical investigation plan.
Similarly, standards like ISO 14155:2020 insist that sample size calculations consider the expected effect size, data variability, and planned analysis methods. In practice, an underpowered study (too few subjects) might miss important device benefits or risks, while an overly large study can waste resources and expose more patients than necessary. Here’s how sponsors can approach sample size justification in device trials.
Define the Objectives and Endpoints
The foundation of any sample size justification is a clear definition of the trial’s primary objective and endpoint. Are you trying to show that your device is superior to the current standard of care? Non-inferior? Or perhaps you’re estimating a rate (like diagnostic accuracy) with a certain precision? The answer drives the statistical hypothesis and formula for sample size.
For example, a superiority trial will have a hypothesis about detecting a minimum clinically important difference between device and control outcomes. You should specify what difference (or effect) the study is powered to detect – say, a 10% higher success rate with the new device, or a 5 mmHg drop in blood pressure beyond standard treatment. This clinically meaningful difference is a key input to the calculation.
Equally important is identifying the primary endpoint’s nature (e.g., binary outcome, continuous measurement, time-to-event) because that dictates the statistical model. Focusing on one or two primary endpoints is advisable; multiple primary endpoints complicate sample size calculation and typically require adjustments (or a larger sample) to maintain statistical validity. In short, a well-justified sample size begins with what you’re measuring and what hypothesis you’re testing.
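For an estimation objective (for example, diagnostic accuracy reported with a confidence interval), the sample size is driven by the desired precision rather than by power. The following is a minimal sketch using the normal-approximation (Wald) interval for a single proportion. The function name and the input values are illustrative assumptions, not figures from any guidance, and exact or score-based interval methods would give somewhat different numbers.

```python
from math import ceil
from statistics import NormalDist


def n_for_proportion_precision(expected_p: float, half_width: float,
                               confidence: float = 0.95) -> int:
    """Subjects needed so that a Wald confidence interval for one proportion
    has approximately the requested half-width (normal approximation)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * expected_p * (1 - expected_p) / half_width ** 2)


# Hypothetical inputs: expected sensitivity of 90%, 95% CI within +/- 5 points.
print(n_for_proportion_precision(0.90, 0.05))

# Most conservative case (p = 0.5 maximises the variance), same precision.
print(n_for_proportion_precision(0.50, 0.05))
```

Under these assumptions the sketch gives 139 and 385 subjects respectively, which shows how strongly the assumed rate influences the result.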
Statistical Assumptions: Alpha, Power, and Variability
Once the endpoint and hypothesis are set, the classic statistical parameters come into play:
- Significance Level (α): This is the probability of a Type I error (false positive) that you’re willing to accept, commonly 0.05 (5%) two-sided. Regulatory guidance (FDA, ICH) generally treats α = 0.05 as the convention, though a more stringent alpha (e.g., 0.01) might be used if multiple comparisons are involved. Your sample size justification should state the alpha level used and whether the test is one-sided or two-sided.
- Power (1–β): Power is the probability that your study will detect the specified effect size if it truly exists (i.e., avoiding a Type II error or false negative). Typically, trials aim for at least 80% power and often 90% for critical endpoints. A higher power means a larger sample size, all else equal. You should justify the chosen power level; for example, a pivotal trial might use 90% to give high confidence in results, whereas a smaller exploratory study might accept 80%. It’s important to show regulators that you’ve considered the risk of missing an effect – underpowered studies are a common critique in device submissions.
- Variability and Event Rates: Estimating the variability of your endpoint (for continuous data, the standard deviation; for binary outcomes, the expected proportion responding in each group) is crucial. These assumptions often come from prior studies, pilot data or literature. For instance, if historical data suggest about 70% of patients respond to a similar device, you might power your trial to detect an increase to 85%. If no prior data exist (as can happen with very novel devices), you may need a conservative guess or conduct a small pilot to inform these numbers. In your justification, clearly state the assumed rates or standard deviations used in the calculation, and provide rationale (e.g., “Based on a preliminary study or analogous device, we expect a standard deviation of 1.5 mmHg in blood pressure reduction”).
With these inputs – effect size, alpha, power, and variability – a biostatistician can perform the power calculation to arrive at a sample size per group (or overall, depending on the design). It’s good practice to mention the formula or method (e.g., “two-sample t-test calculation” or “Chi-square test for proportions”) and any software used. Remember, regulators or ethics committees might scrutinise this section, so transparency is key.
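To make the calculation concrete, here is a minimal sketch of the standard normal-approximation formulas for comparing two proportions and two means with equal allocation. The 70% versus 85% response rates come from the example above. The 12 mmHg standard deviation and the function names are assumptions chosen purely for illustration. Validated software may use slightly different methods (pooled variance, continuity correction, t-distribution) and return slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def _z_sum(alpha: float, power: float) -> float:
    """z-value for a two-sided alpha plus z-value for the target power."""
    norm = NormalDist()
    return norm.inv_cdf(1 - alpha / 2) + norm.inv_cdf(power)


def n_per_group_proportions(p_control: float, p_device: float,
                            alpha: float = 0.05, power: float = 0.80) -> int:
    """Per-group n to detect p_device vs p_control (unpooled variance)."""
    variance = p_control * (1 - p_control) + p_device * (1 - p_device)
    effect = p_device - p_control
    return ceil(_z_sum(alpha, power) ** 2 * variance / effect ** 2)


def n_per_group_means(delta: float, sd: float,
                      alpha: float = 0.05, power: float = 0.80) -> int:
    """Per-group n to detect a mean difference delta with common SD sd."""
    return ceil(2 * sd ** 2 * _z_sum(alpha, power) ** 2 / delta ** 2)


# Binary endpoint: 70% response on control vs 85% on the device.
print(n_per_group_proportions(0.70, 0.85, power=0.80))
print(n_per_group_proportions(0.70, 0.85, power=0.90))

# Continuous endpoint: 5 mmHg difference, assumed (hypothetical) SD of 12 mmHg.
print(n_per_group_means(delta=5.0, sd=12.0, power=0.80))
```

With these inputs the sketch returns 118 per group at 80% power and 158 per group at 90% power for the binary endpoint, and 91 per group for the continuous endpoint. The jump from 118 to 158 illustrates the cost of higher power.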
Adjustments and Special Cases
Real-world trial planning often requires adjustments to the initial sample size calculation:
- Drop-outs and Loss to Follow-up: Device trials may have patients withdraw or be lost, especially if follow-up is long or the device requires compliance. It’s common to inflate the sample size by an estimated percentage to account for drop-outs (e.g., add 10-20% more subjects). The justification should note this, e.g., “We added 15% to the sample size to compensate for potential drop-outs, based on experience in similar trials.”
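- Multiple Endpoints or Subgroups: If the trial has more than one primary endpoint or plans to look at key subgroups, ensure the sample is sufficient for each consideration. When success on any one of several primary endpoints would support a claim, a Bonferroni or similar correction is typically applied to alpha, increasing the required N. When the trial must succeed on all co-primary endpoints, alpha usually does not need to be split, but the chance of winning on every endpoint is lower than for any single one, so N may still need to increase to preserve overall power. If powering for subgroups, you will generally need a larger overall N to maintain power within each subgroup. Clearly explain any such strategy in your justification.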
- Non-Inferiority Margins: For non-inferiority trials, a critical part of sample size justification is the chosen margin (how much worse can the device be and still be considered non-inferior?). This margin should be clinically justified – regulators will reject arbitrary or too-large margins. The sample size then ensures enough power to distinguish between the device being within this margin versus falling outside it.
- Bayesian or Adaptive Designs: If using a Bayesian trial design or an adaptive design (common in device research to improve efficiency), traditional power calculations may not directly apply. Nevertheless, you must justify that the planned sample size (or the sample size rules in an adaptive design) will yield robust evidence. For adaptive trials, mention any planned interim analyses and how they affect sample size (some designs include sample size re-estimation). Regulators expect assurance that even with adaptive features, the final sample is adequate and the Type I error is controlled.
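The arithmetic behind three of these adjustments can be sketched as follows. This is an illustration under stated assumptions, not a prescribed method: the function names are hypothetical, the formulas are normal approximations, and the non-inferiority inputs (80% success in both arms, 10-point margin, one-sided α of 0.025) are invented for the example.

```python
from math import ceil
from statistics import NormalDist

NORM = NormalDist()


def inflate_for_dropout(n_evaluable: int, dropout_rate: float) -> int:
    """Enrol enough subjects that n_evaluable remain after the expected
    drop-out (divide by the retention rate, then round up)."""
    return ceil(n_evaluable / (1 - dropout_rate))


def n_two_proportions(p_control: float, p_device: float,
                      alpha: float = 0.05, power: float = 0.80) -> int:
    """Per-group n for a two-sided superiority test of two proportions."""
    z = NORM.inv_cdf(1 - alpha / 2) + NORM.inv_cdf(power)
    variance = p_control * (1 - p_control) + p_device * (1 - p_device)
    return ceil(z ** 2 * variance / (p_device - p_control) ** 2)


def n_noninferiority(p_control: float, p_device: float, margin: float,
                     alpha_one_sided: float = 0.025,
                     power: float = 0.80) -> int:
    """Per-group n for non-inferiority on a 'higher is better' proportion,
    where margin > 0 is the largest acceptable deficit of the device."""
    z = NORM.inv_cdf(1 - alpha_one_sided) + NORM.inv_cdf(power)
    variance = p_control * (1 - p_control) + p_device * (1 - p_device)
    return ceil(z ** 2 * variance / (p_device - p_control + margin) ** 2)


# Drop-out: 118 evaluable per group with 15% expected attrition.
print(inflate_for_dropout(118, 0.15))

# Bonferroni split of alpha across two primary endpoints (0.05 / 2 each).
print(n_two_proportions(0.70, 0.85, alpha=0.05 / 2))

# Non-inferiority with hypothetical inputs.
print(n_noninferiority(0.80, 0.80, margin=0.10))
```

Under these assumptions the results are 139, 143, and 252 per group. Note that dividing by the retention rate (118 / 0.85, giving 139) is slightly more conservative than multiplying by 1.15 (giving 136), because the latter would leave fewer than 118 evaluable subjects if exactly 15% drop out. Whichever convention is used should be stated in the justification.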
Communicating the Justification
In writing the sample size justification (typically in your protocol’s statistical section or a separate justification document), be concise but thorough:
- Start by stating the primary endpoint and hypothesis.
- List the assumptions: expected effect size, control group rate (if applicable), alpha, desired power, and any variance estimates.
- Mention the formula or software used to compute the number.
- State the resulting sample size, then discuss any inflations or adjustments (drop-outs, etc.), arriving at the final number of subjects per group or total.
- Cite sources for your assumptions (literature, pilot studies, regulatory guidance). For example, “A response rate of 75% in the control group was assumed based on Smith et al. (2019)…”
By addressing these points, you demonstrate a rational, data-driven approach to sample size. This instils confidence that your trial is appropriately sized to achieve its objectives. In fact, under EU regulations, providing such justification isn’t optional – it’s required to show the investigation’s scientific rigour. ISO 14155:2020 also underscores that the sample size must be adequately justified to ensure ethical and scientific soundness.
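As an illustration of how the pieces in the checklist above fit together, the sketch below computes the numbers and assembles them into a draft justification paragraph. Everything here is an assumption made for the example: the function name, the wording of the template, the endpoint label, and the bracketed source placeholder, which must be replaced with a real citation. The calculation is the same normal-approximation formula for two proportions, so the output is a starting point for a biostatistician to verify, not a finished protocol section.

```python
from math import ceil
from statistics import NormalDist


def justification_text(endpoint: str, p_control: float, p_device: float,
                       alpha: float, power: float, dropout: float,
                       source: str) -> str:
    """Draft a sample size justification for a two-arm superiority trial
    with a binary primary endpoint (normal approximation, 1:1 allocation)."""
    norm = NormalDist()
    z = norm.inv_cdf(1 - alpha / 2) + norm.inv_cdf(power)
    variance = p_control * (1 - p_control) + p_device * (1 - p_device)
    n_evaluable = ceil(z ** 2 * variance / (p_device - p_control) ** 2)
    n_enrolled = ceil(n_evaluable / (1 - dropout))
    return (
        f"Primary endpoint: {endpoint}. "
        f"Assuming a rate of {p_control:.0%} in the control group ({source}) "
        f"and {p_device:.0%} with the device, a two-sided alpha of {alpha} "
        f"and {power:.0%} power require {n_evaluable} evaluable subjects per "
        f"group (normal approximation for two proportions). "
        f"Allowing for {dropout:.0%} drop-out, {n_enrolled} per group "
        f"({2 * n_enrolled} in total) will be enrolled."
    )


print(justification_text(
    endpoint="treatment response at follow-up",   # hypothetical endpoint
    p_control=0.70, p_device=0.85,
    alpha=0.05, power=0.80, dropout=0.15,
    source="[cite pilot study or literature]",    # placeholder, not a real reference
))
```

Generating the paragraph from the same inputs used in the calculation keeps the stated assumptions and the reported numbers consistent when an assumption changes.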

Conclusion
Justifying sample size in medical device trials is a balancing act between scientific necessity and practical feasibility. A well-justified sample size reassures regulators that your study can produce meaningful results (if the device truly works as intended) without exposing more subjects than necessary. Conversely, a flimsy justification such as “We chose 30 patients per group because that’s what we used before” is likely to raise red flags and can delay approval or trigger an FDA query.
By carefully defining your hypotheses, selecting appropriate statistical parameters, and transparently explaining your assumptions, you make a strong case for your sample size. This not only satisfies regulatory requirements but ultimately contributes to the credibility of your trial outcomes.
Remember, sample size planning should involve both clinical insight and statistical expertise; engage your medical experts and biostatisticians early to align on what constitutes a clinically important effect. With a solid justification in hand, you can proceed with confidence that your medical device trial is poised to generate reliable, actionable evidence.