Abstract | INTRODUCTION: Coding medicinal products described on adverse event (AE) reports to specific entries in standardised drug dictionaries, such as WHODrug Global, is a time-consuming step in case processing despite its potential for automation. Many organisations already partially automate drug coding using text-processing methods and synonym lists; however, addressing challenges such as misspellings, abbreviations or ambiguous trade names requires more advanced methods. WHODrug Koda is a drug coding engine that uses text-processing algorithms, built-in coding rules and machine learning to code drug verbatims to WHODrug Global. OBJECTIVE: Our aim was to evaluate the drug coding performance of WHODrug Koda on AE reports from VigiBase, the World Health Organization's global database of individual case safety reports, in terms of level of automation and coding quality. METHODS: Koda was evaluated on 4.8 million drug entries from VigiBase. Automation level was computed as the proportion of drug entries automatically coded by Koda and was compared with that of a simple case-insensitive text-matching algorithm. Coding quality was evaluated in terms of coding accuracy, by comparing Koda's predictions with the WHODrug entries found on the AE reports in VigiBase. To better understand the cases in which Koda's coding results did not match the WHODrug entries in VigiBase, a manual assessment of 600 samples of disagreeing encodings was performed by two teams of expert drug coders. RESULTS: Compared with a simple direct-match baseline, Koda can increase the automation level from 61% to 89%, while providing high coding quality with an accuracy of 97%. CONCLUSIONS: Even though Koda was designed for use in clinical trials, it achieves an automation level and coding quality for drug coding of AE reports comparable with those observed in a previous evaluation of Koda on clinical trial data.
Koda can thus help organisations to automate their drug coding of AE reports to a large degree. |
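The two evaluation quantities named in METHODS can be illustrated with a minimal Python sketch. It is not Koda's implementation: the function names, the toy dictionary and the placeholder codes are hypothetical, and the choice to compute accuracy only over automatically coded entries is an assumption, since the abstract does not state the denominator.

```python
from typing import Optional


def direct_match(verbatim: str, dictionary: dict[str, str]) -> Optional[str]:
    """Case-insensitive exact match of a verbatim against dictionary names.

    The dictionary keys are assumed to be casefolded already.
    Returns None when no entry matches, i.e. the verbatim is left uncoded.
    """
    return dictionary.get(verbatim.strip().casefold())


def automation_level(predictions: list[Optional[str]]) -> float:
    """Proportion of drug entries that received an automatic code."""
    if not predictions:
        return 0.0
    return sum(p is not None for p in predictions) / len(predictions)


def coding_accuracy(predictions: list[Optional[str]], references: list[str]) -> float:
    """Share of automatically coded entries that agree with the reference code.

    Assumption: uncoded entries are excluded from the denominator.
    """
    coded = [(p, r) for p, r in zip(predictions, references) if p is not None]
    if not coded:
        return 0.0
    return sum(p == r for p, r in coded) / len(coded)


# Hypothetical toy data; the codes are placeholders, not WHODrug identifiers.
toy_dictionary = {"paracetamol": "CODE-A", "ibuprofen": "CODE-B"}
verbatims = ["Paracetamol", "IBUPROFEN ", "paracetmol", "ibuprofen"]
references = ["CODE-A", "CODE-B", "CODE-A", "CODE-B"]

predictions = [direct_match(v, toy_dictionary) for v in verbatims]
baseline_automation = automation_level(predictions)
baseline_accuracy = coding_accuracy(predictions, references)
```

In this toy run the misspelled verbatim "paracetmol" stays uncoded, which is the kind of entry a direct-match baseline cannot automate.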