Abstract | BACKGROUND: Vaccines have been among the most successful public health interventions to date. However, vaccines are pharmaceutical products that carry risks, and many adverse events (AEs) are reported after vaccination. Traditional adverse event reporting systems face several critical challenges, including poor timeliness. This has motivated a growing number of social media-based detection systems, which have proven able to capture timely and prevalent disease information. Despite these advantages, social media-based AE detection faces serious challenges of its own, such as labor-intensive labeling and class imbalance in the training data. RESULTS: To address the challenges of both traditional reporting systems and social media, we exploit their complementary strengths and develop a combinatorial classification approach that integrates Twitter data with Vaccine Adverse Event Reporting System (VAERS) information to identify potential AEs after influenza vaccination. Specifically, we combine formal reports, which carry accurate predefined labels, with social media data to reduce the cost of manual labeling; to combat class imbalance, we propose a max-rule based multi-instance learning method that biases the model toward positive users. We conducted various experiments to validate our model against other baselines. We observed that (1) multi-instance learning methods outperformed the baselines when only Twitter data were used; (2) formal reports consistently improved the performance metrics of our multi-instance learning methods while negatively affecting the performance of the other baselines; and (3) the effect of formal reports was more pronounced when the training set was smaller. Case studies show that our model labeled users and tweets accurately. CONCLUSIONS: We have developed a framework that detects vaccine AEs by combining formal reports with social media data. We demonstrate that formal reports improve AE detection performance when the amount of social media data is small. Various experiments and case studies show the effectiveness of our model. |
Affiliations | Department of Information Science and Technology, George Mason University, Fairfax, VA, USA; Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV, USA; Benjamin M. Statler College of Engineering and Mineral Resources, West Virginia University; Department of Epidemiology & Public Health, University of Maryland School of Medicine, Baltimore, MD, USA, Yuzhang@som.umaryland.edu; Division of Biostatistics and Bioinformatics, University of Maryland Marlene and Stewart Greenebaum Comprehensive Cancer Center, Baltimore, MD, USA, Yuzhang@som.umaryland.edu |