Sample Size Calculation Guide - Part 2: How to Calculate the Sample Size for an Independent Cohort Study

Nadien Khaled Fahim

1 Faculty of Clinical Pharmacy, Zagazig University, Zagazig, El-Sharkia, Egypt.

Ahmed Negida

2 Faculty of Medicine, Zagazig University, Zagazig, El-Sharkia, Egypt.


* Corresponding author: Ahmed Negida; Email: ahmed01251@medicine.zu.edu.eg, ahmed.said.negida@gmail.com

This is an open-access article distributed under the terms of the Creative Commons Attribution Non-Commercial 4.0 License (CC BY-NC 4.0). (https://creativecommons.org/licenses/by-nc/4.0/)

INTRODUCTION

In the previous article, we explained how to calculate the sample size for a cross-sectional study based on a rate or a single proportion (1). In this article, we explain how to calculate the sample size for an independent cohort study based on a comparison of two proportions, which represent the event rates in the exposed and the non-exposed groups.

WHEN TO USE THE SAMPLE SIZE CALCULATION PROCEDURE OF TWO PROPORTIONS

The methods explained hereafter should be used when the primary outcome of your research study is expressed as a risk ratio or as a comparison of two proportions. Although risk ratios and two-proportion comparisons are mainly obtained from cohort studies, other research designs follow the same procedure if their primary outcome is a comparison of two proportions.

For example, consider a prospective cohort study that assesses the risk of dementia among patients with cerebral microbleeds (exposed group) in comparison with those without cerebral microbleeds (non-exposed group); in this study, the incidence of dementia in the two groups is compared using the relative risk (RR).

As another example, consider a randomized controlled trial that compares the sustained virologic response (SVR) rates between daclatasvir and ledipasvir treatments in patients with hepatitis C virus infection; in this study, the SVR rates in the two arms are compared using the relative risk (RR).

Requirements for sample size calculation based on two proportions

(1) Expected RR between the exposed and non-exposed groups*

(2) Probability of event in exposed group*

(3) Probability of event in non-exposed group

(4) Statistical power: 0.8, 0.85, or 0.9

(5) Alpha: usually 0.05

(6) Ratio of unexposed to exposed group (1 in case of equal groups)

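* Either the RR or the probability of the event in the exposed group is needed; given the probability of the event in the non-exposed group, each can be derived from the other.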
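The footnote rests on the definition of the relative risk: RR equals the event probability in the exposed group divided by the event probability in the non-exposed group. The following minimal Python sketch illustrates the conversion in both directions. The function names and the numeric values are hypothetical and chosen only for illustration; they do not come from the article.

```python
def exposed_probability(rr: float, p_unexposed: float) -> float:
    """Event probability in the exposed group implied by RR and the
    event probability in the non-exposed group (p_exposed = RR * p_unexposed)."""
    if not 0.0 < p_unexposed < 1.0:
        raise ValueError("p_unexposed must lie strictly between 0 and 1")
    if rr <= 0.0:
        raise ValueError("rr must be positive")
    p_exposed = rr * p_unexposed
    if p_exposed >= 1.0:
        raise ValueError("rr * p_unexposed must stay below 1")
    return p_exposed


def relative_risk(p_exposed: float, p_unexposed: float) -> float:
    """RR implied by the two event probabilities."""
    if p_unexposed <= 0.0:
        raise ValueError("p_unexposed must be positive")
    return p_exposed / p_unexposed


# Illustrative values only (not taken from the article).
p1 = exposed_probability(rr=2.0, p_unexposed=0.10)
print(f"exposed-group probability: {p1:.2f}")
print(f"recovered RR: {relative_risk(p1, 0.10):.2f}")
```

Either function can supply the missing input before the sample size itself is calculated.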

EXAMPLE: CASE STUDY OF EARLY MORTALITY IN CKD PATIENTS WITH HIGH GFR

Assume that we will conduct a cohort study to investigate the impact of a high glomerular filtration rate (GFR) on early mortality in patients with chronic kidney disease (CKD) who started hemodialysis. In this study, we will follow two groups of CKD patients: the exposed group is defined as CKD patients with GFR > 10 ml/min/1.73 m², while the non-exposed group is defined as those with GFR ≤ 10 ml/min/1.73 m².

The literature showed that the RR of early mortality between patients with high vs. low GFR was 2.72, as reported by Gómez de la Torre-Del Carpio (2); in that study, the proportion of early mortality in the non-exposed group (low GFR group) was 7.5%, which implies an expected proportion of about 20.4% (2.72 × 7.5%) in the exposed group. The following steps show how to calculate the sample size needed to detect an RR of 2.72 with 90% statistical power and a 5% significance level (alpha), assuming two equal groups.
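As a cross-check on the inputs above, the calculation can be sketched in Python with the standard library alone. This sketch assumes the common normal-approximation formula for comparing two independent proportions with a two-sided test and no continuity correction; the function name `cohort_sample_size` and its parameters are hypothetical. It is not necessarily the formula or tool used in the steps of this article, and calculators that apply a continuity correction will return somewhat larger numbers.

```python
from math import ceil, sqrt
from statistics import NormalDist


def cohort_sample_size(p_exposed, p_unexposed, alpha=0.05, power=0.90, ratio=1.0):
    """Return (n_exposed, n_unexposed) for a two-sided comparison of two
    independent proportions (normal approximation, no continuity correction).

    ratio is the number of non-exposed subjects per exposed subject.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for the power
    # Pooled event probability, weighted by the group-size ratio.
    p_bar = (p_exposed + ratio * p_unexposed) / (1 + ratio)
    term_null = z_alpha * sqrt((1 + 1 / ratio) * p_bar * (1 - p_bar))
    term_alt = z_beta * sqrt(
        p_exposed * (1 - p_exposed) + p_unexposed * (1 - p_unexposed) / ratio
    )
    n_exposed = ceil((term_null + term_alt) ** 2 / (p_exposed - p_unexposed) ** 2)
    return n_exposed, ceil(ratio * n_exposed)


# Inputs from the case study: RR = 2.72, non-exposed mortality = 7.5%.
p0 = 0.075
p1 = 2.72 * p0  # implied exposed-group mortality, about 0.204
n_exposed, n_unexposed = cohort_sample_size(p1, p0, alpha=0.05, power=0.90, ratio=1.0)
print(f"exposed: {n_exposed}, non-exposed: {n_unexposed}")
```

Changing `power`, `alpha`, or `ratio` shows how each requirement listed earlier affects the required number of participants.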