| Abstract | BACKGROUND: Biomedical Large Language Models (LLMs) combined with prompt engineering offer domain-specific reasoning, yet their application to individual-level causality assessment remains unexplored. This study evaluated five combinations of biomedical LLMs, prompting strategies, and causality algorithms by comparing their agreement with two human expert evaluators. RESEARCH DESIGN AND METHODS: A total of 150 Individual Case Safety Reports (ICSRs) were analyzed: 140 reports from the Food and Drug Administration Adverse Event Reporting System (FAERS) and 10 myocarditis/pericarditis ICSRs from the Vaccine Adverse Event Reporting System (VAERS). Assessments were conducted using the Naranjo and WHO-UMC algorithms. The biomedical LLMs tested were TinyLlama 1.1B, Medicine LLaMA-3 8B, and MedLLaMA v20, each combined with Chain-of-Thought (CoT) or Decomposition prompting. Agreement was measured using Gwet's Agreement Coefficient 1 (AC1) and percentage agreement, alongside performance metrics and qualitative error analysis. RESULTS: The Medicine LLaMA-3 8B-Naranjo-CoT combination achieved the highest agreement with human assessors for the final classification of causality (64%). Biomedical LLMs demonstrated low inter-rater agreement on critical items of causality assessment, such as identification of listed adverse events (AEs), temporal plausibility, alternative causes, and objective evidence of AEs. Frequent model failures included irrelevant responses. CONCLUSIONS: Biomedical LLMs showed improved performance over the general-purpose models previously tested but remain suboptimal for reliable causality assessment of ICSRs. |
| Journal | Pharmaceutical research |
| Date added to PubMed | 2026/5/23 |
| Authors | Heckmann, Nicole Sonne; Papoutsi, Despoina Georgia; Barbieri, Maria Antonietta; Battini, Vera; Molgaard, Soren Norlin; Schmidt, Simon Orum; Melskens, Lars; Sessa, Maurizio |
| Affiliations | Department of Drug Design and Pharmacology, University of Copenhagen, Copenhagen, Denmark; Department of Clinical and Experimental Medicine, University of Messina, Messina, Italy; Department of Biomedical and Clinical Sciences (DIBIC), Universita Degli Studi Di Milano, Milan, Italy; Safety Operations, Novo Nordisk, Soborg, Denmark; Denmark. maurizio.sessa@sund.ku.dk. |
| PubMed link | https://www.ncbi.nlm.nih.gov/pubmed/42174348/ |