Authors: Bogie, Kath | Xu, Yifan | Ma, Junheng | Zhang, Adah | Wang, Yuanyuan | Zanotti, Kristine | Sun, Jiayang
Article Type: Research Article
Abstract:
Ovarian cancer (OvCa) is the fifth leading cause of cancer deaths in women and remains the deadliest gynecological cancer. Our goal is to examine associations between diagnostic patterns and OvCa stages. We used data from a web-based survey in which more than 500 women diagnosed with OvCa provided both free-text responses and staging information. We employed text mining and natural language processing (NLP) to extract information on clinical diagnostic characteristics, together with 21 dichotomous symptomatic variables, patient-centered advocacy, and polytomous disease severity, with internal validation. We conducted multivariate analyses and developed tree-based classification models with the confirmation of Random Forest to determine important factors in the relationships of the clinical diagnostic characteristics with OvCa stages. Models including the symptoms, patient advocacy tendency, disease severity and doctors’ responses as predictors had much better predictive power than those limited to doctors’ responses alone, indicating that OvCa stage at diagnosis depends on more than just doctors’ responses. Although effective early-stage diagnosis and treatment remain a challenge, our analysis of patient-centered clinical diagnostic characteristics and symptoms shows that self-advocacy is essential for all women. The frontline physician is critically important in ensuring effective follow-up and timely treatment before diagnosis.
Keywords: Ovarian cancer, diagnosis, survey, follow-up, multivariate analysis, text mining, data mining, tree-based classification, random forest, structural and non-structural missing data, patient advocacy, symptoms, doctor’s responses
DOI: 10.3233/MAS-170402
Citation: Model Assisted Statistics and Applications, vol. 12, no. 3, pp. 275-285, 2017