{"id":169413,"date":"2024-05-25T02:44:11","date_gmt":"2024-05-25T06:44:11","guid":{"rendered":"https:\/\/www.immortalitymedicine.tv\/development-and-validation-of-machine-learning-algorithms-based-on-electrocardiograms-for-cardiovascular-nature-com\/"},"modified":"2024-08-18T11:40:04","modified_gmt":"2024-08-18T15:40:04","slug":"development-and-validation-of-machine-learning-algorithms-based-on-electrocardiograms-for-cardiovascular-nature-com","status":"publish","type":"post","link":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/machine-learning\/development-and-validation-of-machine-learning-algorithms-based-on-electrocardiograms-for-cardiovascular-nature-com.php","title":{"rendered":"Development and validation of machine learning algorithms based on electrocardiograms for cardiovascular &#8230; &#8211; Nature.com"},"content":{"rendered":"<p><p>Data sources    <\/p>\n<p>    This study was performed in Alberta, Canada, where there is a single-payer healthcare system with universal access and 100% capture of all interactions with the healthcare system.  <\/p>\n<p>    ECG data were linked with the following administrative health databases using a unique patient health number: (1) Discharge Abstract Database (DAD) containing data on inpatient hospitalizations; (2) National Ambulatory Care Reporting System (NACRS) database of all hospital-based outpatient clinic and emergency department (ED) visits; and (3) Alberta Health Care Insurance Plan Registry (AHCIP), which provides demographic information.  <\/p>\n<p>    We used standard 12-lead ECG traces (voltage-time series, sampled at 500 Hz for a duration of 10 seconds for each of the 12 leads) and ECG measurements (automatically generated by the Philips IntelliSpace ECG system's built-in algorithm).
The ECG measurements included atrial rate, heart rate, RR interval, P wave duration, frontal P axis, horizontal P axis, PR interval, QRS duration, frontal QRS axis in the initial 40 ms, frontal QRS axis in the terminal 40 ms, frontal QRS axis, horizontal QRS axis in the initial 40 ms, horizontal QRS axis in the terminal 40 ms, horizontal QRS axis, frontal ST wave axis (equivalent to ST deviation), frontal T axis, horizontal ST wave axis, horizontal T axis, Q wave onset, Fridericia rate-corrected QT interval, QT interval, and Bazett's rate-corrected QT interval.  <\/p>\n<p>    The study cohort has been described previously<sup>25<\/sup>. In brief, the cohort comprises patients who were hospitalized at 14 sites between February 2007 and April 2020 in Alberta, Canada, and includes 2,015,808 ECGs from 3,336,091 ED visits and 1,071,576 hospitalizations of 260,065 patients. Concurrent healthcare encounters (ED visits and\/or hospitalizations) that occurred for a patient within a 48-hour period of each other were considered to be transfers and part of the same healthcare episode. An ECG record was linked to a healthcare episode if the acquisition date was within the timeframe between the admission date and discharge date of an episode. After excluding the ECGs that could not be linked to any episode, ECGs of patients &lt;18 years of age, as well as ECGs with poor signal quality (identified via warning flags generated by the ECG machine manufacturer's built-in quality algorithm), our analysis cohort contained 1,605,268 ECGs from 748,773 episodes in 244,077 patients (Fig. 1).  <\/p>\n<p>    We developed and evaluated ECG-based models to predict the probability of a patient being diagnosed with any of 15 specific common CV conditions: AF, SVT, VT, CA, AVB, UA, NSTEMI, STEMI, PTE, HCM, AS, MVP, MS, PHTN, and HF.
The conditions were identified based on the record of corresponding International Classification of Diseases, 10th revision (ICD-10) codes in the primary or in any one of 24 secondary diagnosis fields of a healthcare episode linked to a particular ECG (Supplementary Table 5). The validity of ICD coding in administrative health databases has been established previously<sup>36,37<\/sup>. If an ECG was performed during an ED or inpatient episode, it was considered positive for all diagnoses of interest that were recorded in the episode. Some diagnoses, such as AF, SVT, VT, STEMI, and AVB, which are typically identified through ECGs, were included in the study as positive controls to showcase the effectiveness of our models in detecting ECG-diagnosable conditions.  <\/p>\n<p>    The goal of the prediction model was to output calibrated probabilities for each of the 15 selected conditions. These learned models could use ECGs that were acquired at any time point during a healthcare episode. Note that a single patient visit may involve multiple ECGs. When training the model, we used all ECGs (multiple ECGs belonging to the same episode were included) in the training\/development set to maximize learning. However, to evaluate our models, we used only the earliest ECG in a given episode in the test\/holdout set, with the goal of producing a prediction system that could be employed at the point of care, when the patient's first ECG is acquired during an ED visit or hospitalization (see the Evaluation section below for more details).  <\/p>\n<p>    We used ResNet-based DL for the information-rich voltage-time series and gradient boosting-based XGB for the ECG measurements<sup>25<\/sup>.
To determine whether demographic features (age and sex) add incremental predictive value to the performance of models trained on ECGs only, we developed and reported the models in the following manner: (a) ECG only (DL: ECG trace); (b) ECG + age, sex (DL: ECG trace, age, sex [which is the primary model presented in this study]); and (c) XGB: ECG measurement, age, sex.  <\/p>\n<p>    We employed a multi-label classification methodology with binary labels, i.e., presence (yes) or absence (no) for each one of the 15 diagnoses, to estimate the probability of a new patient having each of these conditions. Since the input for the models that used ECG measurements was structured tabular data, we trained gradient-boosted tree ensemble (XGB)<sup>38<\/sup> models, whereas we used deep convolutional neural networks for the models with ECG voltage-time series traces. For both XGB and DL models, we used 90% of the training data to train the model, and used the remaining 10% as a tuning set to track the performance loss and to stop the training process early, reducing the chance of overfitting<sup>39<\/sup>. For DL, we learned a single ResNet model for a multi-class multi-label task<sup>10<\/sup>, which mapped each ECG signal into 15 values, corresponding to the probability of presence of each of the 15 diagnoses. On the other hand, for gradient boosting, we learned 15 distinct binary XGB models, each mapping the ECG signal to the probability for one of the individual labels. The methodological details of our XGB and DL model implementations have been described previously<sup>25<\/sup>.  <\/p>\n<p>    Evaluation design: We used a 60\/40 split on the data for training and evaluation. We divided the overall ECG dataset into random splits of 60% for model development (which used fivefold internal cross-validation for training and fine-tuning the final models) and the remaining 40% as the holdout set for final external validation.
We ensured that ECGs from the same patient were not shared between development and evaluation data or between the train\/test folds of internal cross-validation. As mentioned earlier, since we expect the deployment scenario of our prediction system to be at the point of care, we evaluated our models using only the patient's first ECG in a given episode, which was captured during an ED visit or hospitalization. The numbers of ECGs, episodes, and patients used in the overall data and in the experimental splits are presented in Fig. 1 and Supplementary Table 5. In addition to the primary evaluation, we extended our testing to include all ECGs from the holdout set, to demonstrate the versatility of the DL model in handling ECGs captured at any point during an episode.  <\/p>\n<p>    Furthermore, we performed leave-one-hospital-out validation using two large tertiary care hospitals to assess the robustness of our model with respect to distributional differences between the hospital sites. To guarantee complete separation between our training and testing sets, we omitted ECGs of patients admitted to both the training and testing hospitals during the study period, as illustrated in Supplementary Figure 1. Finally, to underscore the applicability of the DL model in screening scenarios, we present additional evaluations by consolidating the 15 disease labels into a composite prediction, thereby enhancing diagnostic yield<sup>20<\/sup>.  <\/p>\n<p>    We reported area under the receiver operating characteristic curve (AUROC, equivalent to C-index) and area under the precision-recall curve (AUPRC). Also, we generated F1 Score, Specificity, Recall, Precision (equivalent to PPV) and Accuracy after binarizing the prediction probabilities into diagnosis\/non-diagnosis classes using optimal cut-points derived from Youden's index<sup>40<\/sup> on the training set.
We also used the calibration metric Brier Score<sup>41<\/sup> (where a smaller score indicates better calibration) to evaluate whether predicted probabilities agree with observed proportions.  <\/p>\n<p>    Sex and Pacemaker Subgroups: We investigated our models' performance in specific patient subgroups, based on the patient's sex. We also investigated any potential bias with ECGs captured in the presence of cardiac pacing (including pacemakers or implantable cardioverter-defibrillators [ICD]) or ventricular assist devices (VAD), since ECG interpretation can be difficult in these situations, by comparing the model performances in ECGs without pacemakers in the holdout set versus the overall holdout set (including ECGs both with and without pacemakers) (Fig. 1). The diagnosis and procedure codes used for identifying the presence of pacemakers are provided in Supplementary Table 7.  <\/p>\n<p>    Model comparisons: For each evaluation, we report the performances from the fivefold internal cross-validation as well as the final performances in the holdout set, using the same training and testing splits for the various modeling scenarios. The performances were compared between models by sampling holdout instances with replacement in a pairwise manner, to generate a total of 10,000 bootstrap replicates of pairwise differences in AUROC, i.e., each comparing without pacemakers versus the original. The difference in the model performances was said to be statistically significant if the 95% confidence intervals of the mean pairwise differences in AUROCs did not include the zero value for the compared models.  <\/p>\n<p>    Visualizations: We used feature importance values based on information gain to identify the ECG measurements that were key contributors to the diagnosis prediction in the XGB models.
Further, we visualized the gradient activation maps that contributed to the model's prediction of diagnosis in our DL models using Gradient-weighted Class Activation Mapping (GradCAM)<sup>42<\/sup> on the last convolutional layer.  <\/p>\n<p><!-- Auto Generated --><\/p>\n<p>Read the original post:<br \/>\n<a target=\"_blank\" href=\"https:\/\/www.nature.com\/articles\/s41746-024-01130-8\" title=\"Development and validation of machine learning algorithms based on electrocardiograms for cardiovascular ... - Nature.com\" rel=\"noopener\">Development and validation of machine learning algorithms based on electrocardiograms for cardiovascular ... - Nature.com<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p> Data sources This study was performed in Alberta, Canada, where there is a single-payer healthcare system with universal access and 100% capture of all interactions with the healthcare system.
ECG data was linked with the following administrative health databases using a unique patient health number: (1) Discharge Abstract Database (DAD) containing data on inpatient hospitalizations; (2) National Ambulatory Care Reporting System (NACRS) database of all hospital-based outpatient clinic, and emergency department (ED) visits; and (3) Alberta Health Care Insurance Plan Registry (AHCIP), which provides demographic information <a href=\"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/machine-learning\/development-and-validation-of-machine-learning-algorithms-based-on-electrocardiograms-for-cardiovascular-nature-com.php\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"limit_modified_date":"","last_modified_date":"","_lmt_disableupdate":"","_lmt_disable":"","footnotes":""},"categories":[1231415],"tags":[],"class_list":["post-169413","post","type-post","status-publish","format-standard","hentry","category-machine-learning"],"modified_by":null,"_links":{"self":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts\/169413"}],"collection":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/comments?post=169413"}],"version-history":[{"count":0,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts\/169413\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\
/wp\/v2\/media?parent=169413"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/categories?post=169413"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/tags?post=169413"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}