JMIR AI
A new peer reviewed journal focused on research and applications for the health artificial intelligence (AI) community.
Editors-in-Chief:
Khaled El Emam, PhD, Canada Research Chair in Medical AI, University of Ottawa; Senior Scientist, Children’s Hospital of Eastern Ontario Research Institute; Professor, School of Epidemiology and Public Health, University of Ottawa, Canada
Bradley Malin, PhD, Accenture Professor of Biomedical Informatics, Biostatistics, and Computer Science; Vice Chair for Research Affairs, Department of Biomedical Informatics; Affiliated Faculty, Center for Biomedical Ethics & Society, Vanderbilt University Medical Center, Nashville, Tennessee, USA
Impact Factor: 2.0 | CiteScore: 2.5
Recent Articles

Artificial intelligence (AI) tools are being developed within a rapidly evolving technological landscape. The convergence of ethical, technical, and methodological considerations is crucial for multidisciplinary teams aiming to produce effective AI tools. The success of these tools postdeployment hinges on the intricate interplay between the AI system’s output, produced through rigorous decision-making processes, and stakeholders’ capacity to act on the AI’s recommendations.

With the rapid development of artificial intelligence (AI), particularly large language models, there is growing interest in adopting AI approaches within academic medical centers (AMCs). However, the vast amounts of data required for AI and the sensitive nature of medical information pose significant challenges to developing high-performing models at individual institutions. Furthermore, recent changes in government funding priorities may result in the decentralization of biomedical data repositories, risking significant barriers to effective data sharing and robust model development. This has generated significant interest in federated learning (FL), which enables collaborative model training without transferring data between institutions, thereby enhancing the protection of proprietary and sensitive information. While FL offers a crucial pathway to multi-institutional AI development that preserves data privacy, it also exposes AMCs to novel governance, security, and operational risks that are not fully addressed by existing procedures. In response, this manuscript provides a perspective grounded both in leading international standards (the NIST [National Institute of Standards and Technology] AI Risk Management Framework and ISO/IEC [International Organization for Standardization/International Electrotechnical Commission] 42001) and in the real-world governance experience of AMC leadership. We present a risk differentiation framework, an FL risk matrix, and a set of essential governance artifacts, each mapped to key institutional challenges and reviewed for alignment with core standards, offered as pragmatic, illustrative guides rather than prescriptive checklists. Together, these resources provide AMC security, privacy, and governance leaders with standards-informed, context-sensitive tools for addressing the evolving risks of FL in biomedical research and clinical environments.
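The core FL mechanism described above — each institution trains on its own data and only model parameters, never patient records, leave the site — can be illustrated with federated averaging (FedAvg). The sketch below is a toy, not the governance framework or any specific FL platform from the article; the logistic regression learner, function names, and synthetic "institutions" are all assumptions chosen for brevity.

```python
# Minimal FedAvg sketch: local training at each site, parameter-only
# aggregation at a central server. Illustrative only; real FL deployments
# add secure aggregation, authentication, and audit logging.
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One institution's local step: logistic regression via gradient
    descent on its own (private) data. Only weights are returned."""
    w = weights.copy()
    for _ in range(epochs):
        preds = 1.0 / (1.0 + np.exp(-X @ w))   # sigmoid predictions
        grad = X.T @ (preds - y) / len(y)      # mean log-loss gradient
        w -= lr * grad
    return w

def federated_average(local_weights, sample_counts):
    """Server step: average local models, weighted by dataset size."""
    total = sum(sample_counts)
    return sum(w * (n / total) for w, n in zip(local_weights, sample_counts))

# Toy demo: two "institutions" with private synthetic data.
rng = np.random.default_rng(0)
X1, y1 = rng.normal(size=(100, 3)), rng.integers(0, 2, 100)
X2, y2 = rng.normal(size=(200, 3)), rng.integers(0, 2, 200)

global_w = np.zeros(3)
for _ in range(3):  # a few communication rounds
    w1 = local_update(global_w, X1, y1)
    w2 = local_update(global_w, X2, y2)
    global_w = federated_average([w1, w2], [len(y1), len(y2)])
```

Note that the raw arrays `X1` and `X2` never appear in the aggregation step — only the fitted weight vectors do, which is precisely the privacy property that motivates FL in the abstract above.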

The modified Rankin scale (mRS) is an important metric in stroke research, often used as a primary outcome in clinical trials and observational studies. The mRS can be assessed retrospectively from electronic health records (EHRs), but this process is labor-intensive and prone to interrater variability. Large language models (LLMs) have demonstrated potential in automating text classification.

Large language models (LLMs) are increasingly integrated into health care, where they contribute to patient care, administrative efficiency, and clinical decision-making. Despite their growing role, the ability of LLMs to handle imperfect inputs remains underexplored. These imperfections, which are common in clinical documentation and patient-generated data, may affect model reliability.

Images created with generative artificial intelligence (AI) tools are increasingly used for health communication due to their ease of use, speed, accessibility, and low cost. However, AI-generated images may bring practical and ethical risks to health practitioners and the public, including through the perpetuation of stigma against vulnerable and historically marginalized groups.

The impact of surgical complications is substantial and multifaceted, affecting patients, families, surgeons, and health care systems. Despite the remarkable progress in artificial intelligence (AI), there remains a notable gap in the prospective implementation of AI models in surgery that use real-time data to support decision-making and enable proactive intervention to reduce the risk of surgical complications.

Large language models (LLMs) have fundamentally transformed approaches to natural language processing tasks across diverse domains. In health care, accurate and cost-efficient text classification is crucial—whether for clinical note analysis, diagnosis coding, or other related tasks—and LLMs present promising potential. Text classification has long faced multiple challenges, including the need for manual annotation during training, the handling of imbalanced data, and the development of scalable approaches. In health care, additional challenges arise, particularly the critical need to preserve patient data privacy and the complexity of medical terminology. Numerous studies have leveraged LLMs for automated health care text classification and compared their performance with traditional machine learning–based methods, which typically require embedding, annotation, and training. However, existing systematic reviews of LLMs either do not specialize in text classification or do not focus specifically on the health care domain.

Large language models (LLMs) are increasingly used by patients and families to interpret complex medical documentation, yet most evaluations focus only on clinician-judged accuracy. In this study, 50 pediatric cardiac intensive care unit notes were summarized using GPT-4o mini and reviewed by both physicians and parents, who rated readability, clinical fidelity, and helpfulness. Parents and clinicians diverged notably on helpfulness, and each group contributed distinct insights: clinicians on clinical accuracy and parents on readability. This study highlights the need for dual-perspective frameworks that balance clinical precision with patient understanding.