Author
Kamau, E
Kelly, S
Darji, D
Baidjoe, A
Brownstein, J
Campbell, F
Dasgupta, A
Degail, M
Demidova, A
Ferretti, L
Han, A
Koyie, S
Ngamala, P
Polain, O
Rojek, A
Sauer, J
Scarpino, S
Sewalk, K
Sopko, J
Zakrzewski, S
Merson, L
Kraemer, M
Journal title
Wellcome Open Research
DOI
10.12688/wellcomeopenres.24776.2
Volume
10
Last updated
2026-05-13T22:05:04.48+01:00
Page
524-524
Abstract
<ns3:p>Background: Early-phase data during an epidemic are often heterogeneous and difficult to integrate across systems, so standard tools and reporting guidelines are needed to facilitate timely and reliable data collection. The Global.health team have developed a data schema for the ingestion of epidemic data that supports interoperability, so that data curated to this schema are readily ingested into existing systems for analysis. This paper describes the definition of ‘core data’ within the Global.health schema, which focuses data collection on the most relevant and available data to inform epidemic response during the first 100 days of an outbreak. Methods: We used expert consultation and a structured literature review to identify key epidemiological questions and parameters that must be addressed during the first 100 days of an outbreak. Relevant digital toolkits and reporting frameworks were reviewed, and the minimum data variables required for parameter estimation were identified. These variables were mapped to the existing Global.health schema and assessed for availability in early outbreak data from four recent epidemics. Variables were categorized by availability, and those with sufficient early availability were retained in a proposed core schema. Data formats were harmonized with the WHO Epi Core, T0 and T1 toolkits to enhance interoperability. A complementary modular schema was defined to capture pathogen-specific variables. Results: The literature review yielded 78 key epidemiological parameters relevant to early outbreak assessment, organized into eleven categories. Analysis of variable availability in early outbreak datasets showed that 42 of 140 variables in the existing Global.health schema were consistently available and suitable for inclusion in a core early-epidemic schema. Variables related to demographics, case status, symptom reporting, confirmation dates, outcomes, and exposure history were frequently available, while vaccination history, detailed treatment data, and certain clinical variables were less consistently reported. The resulting core schema comprises 42 interoperable variables across seven domains and aligns with WHO data standards and controlled terminologies. Conclusions: Standardized, interoperable data capture during the early phase of epidemics is essential to enable timely estimation of key epidemiological parameters and to inform response strategies. The Global.health core schema provides a minimum, evidence-informed dataset for early outbreak investigation while maintaining compatibility with WHO reporting standards. By prioritizing variables that are both epidemiologically critical and realistically available in early data streams, this framework supports improved data harmonization, analysis, and decision-making during the first 100 days of an epidemic.</ns3:p>
Symplectic ID
2419558
Publication date
08 May 2026