Felix Emeka Anyiam (Initial Co-Lead CODATA Connect 2019-2024)
In this post, Felix Emeka Anyiam, who was Initial Co-Lead of CODATA Connect, our Early Career Researcher initiative, from 2019 to 2024, reflects on his experiences over eight years of participating in CODATA activities. In particular, he emphasizes the benefits of sustained collaborations and connections: “long-term, networked training matters more than one-off workshops”. He also praises the CODATA Connect and CODATA Data Schools model, which allowed students to return in more responsible, leadership roles. Felix’s story shows that CODATA Connect provided an environment and collaborations that benefited him on this journey. But it also shows how Felix’s open and generous character and his enthusiasm to participate brought rewards. Please enjoy this uplifting story! Simon HODSON, Executive Director, CODATA.
Welcome to Trieste: how it started, 2017
In August 2017, at the International Centre for Theoretical Physics (ICTP) in Trieste, Italy, I encountered research data science not merely as a set of analytical tools, but as a global public good. I arrived as a public health researcher from Nigeria, trained in epidemiology and biostatistics, seeking stronger quantitative approaches to interrogate health systems data. I left with something more enduring: an entry point into a global ecosystem shaped by CODATA’s commitment to open science, equity, and long-term capacity building.

The CODATA-RDA Research Data Science Summer School in Trieste offered more than technical instruction. It introduced a way of thinking about data: FAIR by design, ethically governed, and shared across disciplines and borders. Participants from low- and middle-income countries (LMICs) were not positioned as beneficiaries, but as peers and future contributors. CODATA functioned not as a sponsor, but as a convenor of people, ideas, and responsibility. That distinction would shape my professional trajectory in the years that followed.
Continuity as capacity: returning, deepening, expanding (2017–2018)
One year later, in August 2018, I returned to ICTP for the Climate Data Science Advanced Workshop, again under the CODATA-ICTP collaboration, with Clement Onime and Simon Hodson among the local organisers. This second invitation proved pivotal. It reinforced the idea that capacity building is most effective when it is iterative and cumulative, allowing participants to deepen expertise, cross disciplinary boundaries, and apply learning to new problem domains.

The 2018 programme expanded my analytical perspective beyond health to climate systems, environmental data, and computational modelling, skills that later proved essential for interdisciplinary work at the intersection of climate, urban systems, and public health. More importantly, it signalled something fundamental about the CODATA model: participation was not episodic. There was an intentional pathway for return, growth, and contribution.
From Participant to Contributor: Teaching, Networks, and Leadership (2018–2025)
Following my initial training through the CODATA-RDA Research Data Science programmes in Trieste, the relationships established during those early years began to translate into sustained international collaboration. It was through these engagements that the foundations were laid for the first Urban Data Science Summer School in 2018, hosted by the Summer–Winter School at CEPT University, Ahmedabad, India, in collaboration with Dr Shaily Gandhi, then a member of the CEPT faculty. The school marked an important expansion of CODATA-enabled capacity building beyond the initial training context. My role as a co-instructor extended this work to undergraduate and postgraduate cohorts.
Building on this momentum, the programme evolved into a more structured and geographically diverse initiative. The second edition of the Urban Data Science Summer School took place from 13 to 23 May 2019 (https://shailygandhi.github.io/UrbanDataScience2019/). These successive schools reflected not only the maturation of an academic programme, but also the strength of the collaborative networks that had emerged from CODATA’s training ecosystem, networks sustained through shared curriculum development, co-teaching, and long-term professional exchange since those early connections in Trieste.

That same year, I was appointed inaugural co-lead of CODATA Connect, the organisation’s Early Career and Alumni Network, a role I held from 2019 to 2024 (https://codata.org/initiatives/data-skills/codata-connect/members/). CODATA Connect was established to address a persistent gap in global training initiatives: what happens after the workshop concludes. Rather than allowing capacity gains to dissipate, the network was designed as a continuity mechanism, enabling early-career researchers to remain engaged, visible, and supported within the wider CODATA ecosystem.
Working collaboratively with co-leads and core members from India, Costa Rica, Europe, Africa, Asia, Australia, and Latin America, CODATA Connect evolved into a distributed, peer-led platform for sustained skills development and exchange. Together, we coordinated a series of research skills webinars, thematic workshops, and podcast series that translated FAIR data principles, reproducibility, and ethical data stewardship into applied, domain-specific contexts. These activities included structured webinar series on research skills and reproducibility, smart and resilient cities, and open data practices, as well as hands-on technical workshops, such as training on distributed computing using Spark with R, explicitly targeted at early-career researchers in resource-constrained settings.
In parallel, CODATA Connect supported the development of cross-institutional podcast series, including Data for Resilient Cities, Data–Knowledge–Action for Urban Systems, Data for Disaster Risk Reduction, and Open GeoAI, which brought together researchers, practitioners, and policy actors to explore how open data, geospatial analytics, and AI can inform urban resilience, disaster risk reduction, health, and sustainable development. These initiatives not only expanded the reach of CODATA’s data-skills agenda but also created durable knowledge artefacts that continue to serve as learning resources beyond the immediate training context.
Throughout this period, my own contributions were embedded within this collective effort alongside colleagues such as Shaily R. Gandhi (Initial Lead-India), Mariana Cubero-Corella (Costa Rica), Anup Kumar Das (India), Neema Sumari (Tanzania), Kishore Sivakumar (Netherlands), Adenike Shonowo (Nigeria), Jacqueline Stephens (Australia), Jaime Rugeles (Colombia), Zhifang Tu (China), and others. We worked to ensure that CODATA Connect remained inclusive, interdisciplinary, and globally representative. The emphasis was consistently on peer mentorship, leadership development, and translation of open science principles into local research practice, particularly within low- and middle-income country contexts.
This trajectory reached a moment of continuity in August 2025, when I returned once again to ICTP, Trieste, this time not as a participant, but as a tutor and co-lead for the CODATA-RDA Advanced Workshop on Urban Data Science (https://indico.ictp.it/event/10990).

Having first attended the CODATA-RDA programmes as a student in 2017 and 2018, returning as a facilitator underscored the iterative nature of CODATA’s capacity-building model. Alongside colleagues Dr Shaily Gandhi (ITU Linz, Austria) and Dr Neema Sumari (Sokoine University of Agriculture, Tanzania), I contributed to hands-on sessions on geospatial analytics for urban planning and policy, predictive modelling for population dynamics, infrastructure, and health-risk assessment, and decision-support systems for resilient and sustainable cities.
The 2025 workshop brought together researchers from multiple regions to deepen expertise in big-data analytics, computational infrastructure, urban and environmental data science, and ocean-science data, all grounded in FAIR principles and ethical data stewardship. Contributing to the same platform that had shaped my own formation in research data science reinforced a central lesson of this journey: effective capacity building is not a single intervention, but a networked process sustained through collaboration, continuity, and shared responsibility, where today’s participants become tomorrow’s instructors, mentors, and stewards of the global data ecosystem.
Broadening horizons: global exposure through CODATA-enabled opportunities
Alongside teaching and network leadership, CODATA-enabled pathways opened doors to broader global engagement. I was selected to participate in the International Training Workshop on Open Science and the SDGs hosted by the Chinese Academy of Sciences in Beijing in 2023, contributing to discussions on ethical data reuse and sustainable development. These collaborations produced the peer-reviewed article “Statements on Open Science for Sustainable Development Goals” in the Data Science Journal, of which I was a co-author (https://doi.org/10.5334/dsj-2024-049). Earlier, I had been selected for Topics in Digital and Computational Demography at the Max Planck Institute for Demographic Research (Germany) and for the ALPSP Virtual Conference and Awards in the United Kingdom, one of only 20 global recipients.
Travel grants from CODATA supported participation in the ICTP Trieste programme (2018) and the Science for Development Workshop in South Africa (2020), underscoring CODATA’s practical commitment to inclusion. These experiences reinforced a consistent message: global capacity building is strongest when financial, intellectual, and institutional barriers are addressed together.
This period of sustained engagement and international collaboration was also marked by formal recognition from the wider scientific community. In 2025, I was inducted into Sigma Xi, The Scientific Research Honor Society, in recognition of my research contributions and commitment to advancing science in the public interest. While this honour is conferred independently, it reflects the cumulative impact of long-term investment in research training, open science practice, and global collaboration. The skills, networks, and values cultivated through CODATA’s capacity-building ecosystem were central to developing the kind of research profile and scholarly orientation that such recognition acknowledges.
SAIL 2025 as a milestone, not the destination
In 2025, I was invited to present at the Symposium on Artificial Intelligence for Learning Health Systems (SAIL 2025), co-hosted by Harvard Medical School and convened around a shared commitment to equity-driven, ethically grounded applications of artificial intelligence in healthcare. My presentation drew on doctoral research that applied machine-learning methods to examine inequities in HIV self-testing uptake across sub-Saharan Africa, using large-scale demographic health survey data from 24 countries (https://sail.health/event/sail-2025/program/).

The study employed Classification and Regression Tree (CART) and Random Forest models to identify socio-demographic predictors of willingness to self-test for HIV. Beyond methodological performance, the analysis foregrounded a persistent equity concern: rural populations, individuals with lower levels of education, and those in lower-income groups remain systematically underserved. The work demonstrated how predictive analytics, when designed transparently and interpreted responsibly, can inform targeted, community-embedded public health interventions rather than reinforce existing disparities.
What made participation in SAIL 2025 particularly significant, however, was not the event itself but the lineage that made meaningful engagement possible. The ability to work confidently across disciplinary boundaries, to interrogate data quality and representativeness, to foreground ethics and FAIR principles, and to communicate complex analytical approaches to diverse audiences was not acquired in isolation. These capacities were cultivated incrementally through long-term engagement with CODATA-led training programmes, teaching roles, and international peer networks.
Across plenary sessions, panels, and technical discussions at SAIL, a consistent message emerged: AI should not be framed as a luxury innovation for high-resource health systems, but as a practical, scalable tool for strengthening learning health systems where access, quality, and data infrastructure remain uneven. Conversations around AI-enabled clinical decision support in low- and middle-income countries, data governance for learning health systems, and patient-centred innovation resonated strongly with principles long emphasised within CODATA’s capacity-building ecosystem.
Several themes from the symposium were especially aligned with this trajectory. The first was the centrality of context: AI systems must be designed to work within real-world constraints rather than idealised data environments. Second, the discussions highlighted that data quality and equity cannot be treated separately: AI systems trained on incomplete, biased, or poorly governed datasets are likely to reinforce existing health disparities rather than mitigate them. The third was the importance of trust, transparency, and explainability, particularly when deploying models in sensitive or high-stakes health domains. Finally, there was a strong emphasis on collaboration over competition, underscoring the need for interdisciplinary and cross-sector partnerships to advance AI for health responsibly.
Seen through this lens, SAIL 2025 was not a destination, but a convergence point, where years of sustained capacity building translated into frontier research engagement. It affirmed that long-term investment in data skills, ethical reasoning, and global research networks enables researchers, particularly those working in LMIC contexts, to contribute meaningfully to shaping emerging conversations at the intersection of AI and health.
Rather than standing apart from earlier stages of training and collaboration, SAIL 2025 illustrated the cumulative effect of CODATA’s model: a pathway in which early exposure evolves into leadership, stewardship, and the application of advanced methods to questions of equity and public value.
From skills to stewardship: Governance and Responsibility
More recently, my engagement with CODATA has extended beyond training and programme delivery into data governance, interoperability, and infrastructure stewardship. I currently serve as a member of the Cross-Domain Interoperability Framework (CDIF) Working Group and Advisory Group, where I contribute to the development and review of interoperability standards, emerging CDIF profiles, and strategic oversight for globally connected data ecosystems. This work involves close collaboration with an international body of senior experts, as well as ongoing technical discussions focused on enabling responsible data reuse across domains.
In parallel, I serve as a reviewer for the Data Science Journal and have contributed to CODATA’s Smart Cities Task Group and the Resilient and Healthy Cities Working Group, with a particular focus on data-driven approaches to urban health, climate resilience, and risk reduction. These roles reflect an increasing emphasis on stewardship, helping to shape not only how data are analysed, but how they are governed, shared, and translated into public value within complex socio-technical systems.
This evolution from skills acquisition to systems-level responsibility has been further strengthened through formal engagement with public-sector digital governance. In December 2025, I completed the AI and Digital Transformation in Government programme delivered by Saïd Business School, University of Oxford, in collaboration with UNESCO. The programme offered a rigorous, practice-oriented exploration of how governments can responsibly harness artificial intelligence and data-driven technologies to deliver inclusive, ethical, and effective public services.
Key areas of focus included AI ethics and governance, human-centred service design, digital leadership, cyber resilience, and the management of systemic change within public institutions. Importantly, the programme foregrounded the role of evidence, accountability, and institutional capacity in ensuring that digital transformation serves citizens rather than exacerbates existing inequalities.
Taken together, these governance, editorial, and policy-oriented engagements reflect a central lesson of sustained capacity building: technical competence must ultimately be matched by institutional responsibility. The transition from learning how to use data to helping shape the frameworks that govern its use represents a critical step in ensuring that data science and AI contribute to equitable, trustworthy, and socially grounded outcomes at scale.
What this journey tells us about sustaining capacity
Several lessons emerge from this journey. First, long-term, networked training matters more than one-off workshops. Skills persist when they are reinforced through return, teaching, and community. Second, effective capacity building produces leaders and stewards, not just analysts. Third, continuity, supported by mentorship, alumni networks, and governance roles, is essential for translating training into durable impact, particularly in LMIC contexts.
Looking ahead
As data science and artificial intelligence increasingly shape global responses to health, climate, and development challenges, CODATA’s model offers a compelling blueprint. Capacity building is not an event; it is a commitment sustained over time. For early-career researchers, particularly those working in resource-constrained settings, CODATA continues to demonstrate what is possible when openness, equity, and continuity are placed at the centre of scientific practice.
Short Biography of the Author
Felix Emeka Anyiam is a public health researcher and data scientist based at the University of Port Harcourt, Nigeria. His work focuses on the ethical and equitable application of data science and artificial intelligence to health systems, urban resilience, and development challenges in low- and middle-income countries. An alumnus of, and long-term contributor to, CODATA-led Research Data Science programmes, he has served as co-instructor in CODATA-RDA Advanced Workshops, inaugural co-lead of CODATA Connect (the Early Career and Alumni Network), and a member of multiple CODATA task and working groups. His research and teaching emphasise FAIR data principles, reproducibility, and responsible data governance within global and local research ecosystems.