Digital Health Trends. 2025;1(1): 63-69.
doi: 10.34172/dhtj.07

Original Article

Comparative Study of Classification-Based Data Mining Algorithms for Predicting Cardiovascular Diseases

Sepideh Seyedi-Sahebari 1, Ali Farzaneh 2*

1 Student Research Committee, Tabriz University of Medical Sciences, Tabriz, Iran
2 Department of Epidemiology, Erasmus University Medical Center, Rotterdam, the Netherlands
*Corresponding Author: Ali Farzaneh, Email: farzanehali78@gmail.com

Abstract

Background: Cardiovascular diseases (CVDs) remain a leading cause of mortality, demanding timely and accurate diagnosis. Traditional clinical assessments are often prone to errors, highlighting the need for predictive models that leverage large-scale clinical data, such as data mining techniques that can extract patterns from complex medical datasets. This study compared classification-based data mining algorithms for predicting CVDs and evaluated their performance across multiple metrics to identify the most effective predictive model for clinical applications.

Methods: The UCI Heart Disease dataset (270 records with 14 clinical attributes) was used. Data preprocessing involved cleaning, normalization, discretization, and partitioning into training (70%) and testing (30%) sets. Naive Bayes (NB), artificial neural network (ANN), k-nearest neighbors (kNN), support vector machine (SVM), and classification and regression tree (CART) algorithms were implemented using Orange. Model performance was evaluated by accuracy, sensitivity, specificity, precision, recall, F-measure, and area under the ROC curve (AUC) using hold-out validation and 5-fold cross-validation. Feature importance and decision rules were extracted from tree-based models for interpretability.
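The two validation schemes named in the Methods can be illustrated with a minimal sketch. It is not the authors' Orange workflow: the function names `holdout_split` and `k_fold_indices`, the fixed seed, and the unstratified shuffling are assumptions made here for illustration only.

```python
import random


def holdout_split(n_records, train_fraction=0.7, seed=0):
    """Shuffle record indices and split them into training and testing sets.

    Assumption: a plain (unstratified) random split; the source does not
    say how the 70/30 partition was drawn.
    """
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    cut = round(n_records * train_fraction)
    return indices[:cut], indices[cut:]


def k_fold_indices(n_records, k=5, seed=0):
    """Yield (train, test) index lists for k folds of near-equal size."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    # Every k-th shuffled index goes to the same fold.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 270 records, as stated in the Methods.
train_idx, test_idx = holdout_split(270)
print(len(train_idx), len(test_idx))  # 189 81

fold_sizes = [len(test) for _, test in k_fold_indices(270)]
print(fold_sizes)  # [54, 54, 54, 54, 54]
```

With 270 records, a 70/30 hold-out yields 189 training and 81 testing records, and 5-fold cross-validation tests each record exactly once in folds of 54.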

Results: SVM (accuracy: 84.44%, sensitivity: 86.00%, specificity: 82.50%, AUC: 0.9136) and NB (accuracy: 84.07%, AUC: 0.9133) achieved the highest overall predictive performance. ANN and kNN demonstrated moderate predictive ability, while CART (accuracy: 78.52%) provided interpretable decision rules. Decision tree (DT) analysis identified thalassemia status, chest pain type, and number of major vessels colored by fluoroscopy as the most influential attributes. Several clinically interpretable rules were extracted, offering potential guidance for risk assessment. Statistical comparisons indicated no significant difference between SVM and NB performance, suggesting both models provide reliable predictions.
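For readers less familiar with the reported metrics, the sketch below shows how accuracy, sensitivity, specificity, precision, and F-measure follow from a binary confusion matrix. The counts are hypothetical: the abstract does not report a confusion matrix, and the values used here were chosen only because they are arithmetically consistent with the 84.44% accuracy, 86.00% sensitivity, and 82.50% specificity quoted above for 270 records. The function name `classification_metrics` is likewise an illustrative assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute standard binary classification metrics from confusion-matrix counts."""
    total = tp + fn + tn + fp
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
    }


# Hypothetical counts (not from the paper), summing to 270 records.
metrics = classification_metrics(tp=129, fn=21, tn=99, fp=21)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Sensitivity and recall are the same quantity, so the two entries in the metric list of the Methods coincide numerically.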

Conclusion: SVM and NB offer robust predictive capabilities for CVD, outperforming traditional statistical approaches. DT models provide additional interpretability, facilitating clinical understanding and application. These findings underscore the importance of evaluating multiple predictive models in context-specific datasets to identify optimal approaches for risk assessment, resource allocation, and quality of care improvement, thereby enhancing early detection and supporting evidence-based CVD management.

Submitted: 04 Jul 2025
Revision: 20 Aug 2025
Accepted: 15 Sep 2025
ePublished: 27 Sep 2025