Committee Chair
Sartipi, Mina
Committee Member
Fell, Nancy; Gao, Lani; Cho, Jin
College
College of Engineering and Computer Science
Publisher
University of Tennessee at Chattanooga
Place of Publication
Chattanooga (Tenn.)
Abstract
Stroke is one of the leading causes of long-term disability and death in the United States. Stroke patients often face severe health consequences that significantly affect their lives and place a substantial financial burden on their families and the wider healthcare system. Reliable predictions of patient outcomes, such as early hospital readmission, length of stay (LOS) in the hospital, and risk of mortality, can therefore help both patients and healthcare providers. Furthermore, successful modeling of these outcomes can help identify the factors that influence them and thereby improve the quality of care. In this research, we combined statistical analysis and machine learning (ML) algorithms to enhance the prediction of three patient outcomes (30-day readmission, LOS, and mortality) for stroke patients in Tennessee. Because such datasets are typically imbalanced, with the events of interest making up only a small fraction of records, ML algorithms suitable for imbalanced data, such as XGBoost, LightGBM, and CatBoost, were employed. To further improve model performance, several data-level approaches were used to address the class imbalance, including Cluster Centroids, NearMiss, and Instance Hardness Threshold. The combination of data modification, especially under-sampling, with suitable ML algorithms was shown to yield high model performance, measured in terms of Recall and other metrics. Furthermore, based on the features available in our data, the SHAP explainable ML method was used to identify the factors that most influence these outcomes. For LOS, higher age and, mainly, the vital signs at the time of admission play an important role. For 30-day readmission, peripheral artery disease, sleep disorders, and prescribed medications such as anticoagulant and antibiotic agents were among the most influential features. For mortality, static patient health conditions were the most influential factors. A simple Graphical User Interface (GUI) was also developed for one of the LOS outcomes, and can be extended to other outcomes, to demonstrate the potential of this work for practical applications.
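The under-sampling idea mentioned in the abstract can be illustrated with a minimal sketch. The code below is not taken from the dissertation: it is a pure-Python, NearMiss-1 style under-sampler run on synthetic two-dimensional data, and the names `nearmiss1` and `recall` are hypothetical helpers introduced only for illustration. It keeps the majority-class samples whose mean distance to their k nearest minority samples is smallest, so that both classes end up the same size, and it defines Recall as TP / (TP + FN).

```python
import math
import random


def nearmiss1(X, y, k=3, minority_label=1, majority_label=0):
    """NearMiss-1 style under-sampling (illustrative, not the author's code).

    Keeps as many majority samples as there are minority samples, choosing
    those with the smallest mean distance to their k nearest minority samples.
    """
    minority = [x for x, label in zip(X, y) if label == minority_label]
    majority = [x for x, label in zip(X, y) if label == majority_label]

    def mean_dist_to_nearest_minority(x):
        nearest = sorted(math.dist(x, m) for m in minority)[:k]
        return sum(nearest) / len(nearest)

    kept = sorted(majority, key=mean_dist_to_nearest_minority)[:len(minority)]
    X_res = minority + kept
    y_res = [minority_label] * len(minority) + [majority_label] * len(kept)
    return X_res, y_res


def recall(y_true, y_pred, positive=1):
    """Recall = TP / (TP + FN) for the positive (minority) class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if tp + fn else 0.0


# Synthetic, assumed data: 95 majority points and 5 minority points.
random.seed(0)
X = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(95)]
X += [(random.gauss(2, 1), random.gauss(2, 1)) for _ in range(5)]
y = [0] * 95 + [1] * 5

X_res, y_res = nearmiss1(X, y)
print(len(X), "->", len(X_res), "samples; positives:", sum(y_res))
```

In practice, library implementations such as `NearMiss`, `ClusterCentroids`, and `InstanceHardnessThreshold` in `imblearn.under_sampling` would normally be preferred over a hand-written version; the sketch only shows the selection rule and why Recall, which ignores true negatives, is a natural metric when the event of interest is rare.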
Acknowledgments
I would like to express my gratitude to my advisor, Dr. Sartipi, for her invaluable mentorship, encouragement, and financial support throughout my doctoral studies. Her insightful guidance, thoughtful feedback, and commitment to excellence have been instrumental at every stage of this research. Dr. Sartipi has been not only an exceptional mentor but also someone who is always supportive and genuinely invested in the success of the entire research group. Her openness to discussing concerns and her dedication to fostering a collaborative environment have meant a great deal to me. I am also thankful to the members of my dissertation committee, Dr. Nancy Fell, Dr. Lani Gao, and Dr. Jin Cho, for their time, expertise, and constructive feedback, all of which greatly enhanced the quality and depth of this dissertation. I am especially grateful to Dr. Cho, whose patience and guidance helped me get started on this project and navigate the essential steps of conducting academic research. I would also like to acknowledge the financial support provided through UTC’s employee benefits program, which made it possible for me to continue my research. This support played a vital role in making my doctoral journey possible.
Degree
Ph. D.; A dissertation submitted to the faculty of the University of Tennessee at Chattanooga in partial fulfillment of the requirements of the degree of Doctor of Philosophy.
Date
8-2025
Subject
Cerebrovascular disease--Patients--Tennessee--Data processing; Medical informatics; Medical statistics
Document Type
Doctoral dissertations
DCMI Type
Text
Extent
xvii, 119 leaves
Language
English
Rights
http://rightsstatements.org/vocab/InC/1.0/
License
http://creativecommons.org/licenses/by/4.0/
Date Available
8-31-2026
Recommended Citation
Rahmati, Monireh, "Toward explainable machine learning methods for stroke patient outcomes in Tennessee" (2025). Masters Theses and Doctoral Dissertations.
https://scholar.utc.edu/theses/1014
Department
Dept. of Computer Science and Engineering