Machine learning for the development of diagnostic models of … – Nature.com

Design

This was a prospective multicenter observational study. Unlike studies on prognostic models, in the present study, diagnostic models were developed, that is, models designed to determine whether a patient was in the compensated or decompensated phase of their disease (exacerbation of COPD and/or HF decompensation).

The criteria for admission to this study and the recruitment process have been previously reported [19]. Patients older than 55 years who were able to walk at least 30 m, with a main diagnosis of decompensated HF and/or exacerbation of COPD, and who were hospitalized in the Department of Internal Medicine, Cardiology or Pneumology were included. Participants with a pacemaker or intracardiac device, users of domiciliary oxygen therapy prior to admission, and patients with HF in functional class IV of the New York Heart Association (NYHA) classification [29] were excluded.

Four hospitals participated: two tertiary university hospitals (600–900 hospital beds) and two regional secondary care hospitals (150–400 hospital beds) in the provinces of Barcelona and Madrid.

Each center had a trained interviewer, and each department had a referring physician who was accessible to the interviewer. Each day, the interviewer contacted the referring physician to review the hospitalization census and identify patients with the diagnosis of interest. Next, the interviewer confirmed the main diagnosis (decompensated HF and/or exacerbation of COPD) with the physician responsible for the patient and then contacted the participant (the same day or the next day) to obtain informed consent and verify compliance with all admission criteria of this study. The sample was obtained through convenience sampling, and all patients were enrolled consecutively as they were identified.

The recruitment and follow-up periods lasted 18 months, starting in November 2010.

Each patient underwent three identical evaluations: the first in the hospitalization unit (V1) and the other two, performed consecutively and at least 24 h apart, in the participant's home 30 days after hospital discharge (V2 and V3). Thus, each participant underwent one evaluation in the decompensated phase (V1) and two in the compensated phase (V2, V3) of their disease.

The evaluation protocol [19] included documentation of symptoms (dyspnea according to the NYHA [29] and Modified Medical Research Council (mMRC) [30] scales) and physiological parameters (HR and Ox) in two consecutive periods: effort (walking at a normal pace on flat terrain for a maximum of 6 min) and recovery (seated for 4 min after the end of the effort period).

HR and Ox were treated as time series with a sampling frequency of 1 Hz and were collected throughout the evaluation with a pulse oximeter (Model 3100, Nonin Medical, Inc., Plymouth, MN, USA) placed on the left index finger.

Given the absence of a single standard diagnostic test to verify whether a patient was in the compensated or decompensated phase of their disease, the clinical judgment of the participant's responsible physician was considered the standard diagnostic test. Thus, in the decompensated phase, the diagnosis of decompensated HF and/or COPD exacerbation corresponded to the confirmed diagnosis from the participant's attending physician (in cases of diagnostic doubt, the patient was excluded). For the compensated phase, a standard diagnosis of compensated HF and/or stable COPD was confirmed by a study physician through telephone contact with the participant 30 days after hospital discharge. During this telephone interaction, the patient was considered to be in the compensated phase if none of the following events had occurred since hospital discharge: increased cough, sputum or dyspnea; initiation of or an increase in corticosteroid use; and initiation of antibiotic treatment or medical consultation for worsening of the clinical situation from any cause. In cases of doubt or if the compensated phase could not be confirmed, successive telephone contacts were made until the phase could be confirmed. The interviewer scheduled home visits for the respective evaluations (V2, V3) only after confirmation and within 24–48 h of receiving confirmation.

Given the objective of this study (development of an online algorithm capable of detecting the onset of an exacerbation from HR and Ox data), various characteristics were extracted from each of the evaluations (V1, V2, V3). For this purpose, the effort phase (walking) and recovery phase of each evaluation were separated using the start and end times of each phase recorded manually in the data collection records, and the signals were reviewed visually to confirm those manual records. Once the signals were separated according to the evaluation phase, the corresponding characteristics of the available measures were extracted.

Numerous characteristics were extracted from the signals. During each of the tests, two different phases were considered, effort and recovery, which were treated separately. From each of the phases, three signals were considered: HR, Ox and the normalized difference between these variables. From each of these three temporal signals, the characteristics of the temporal domain (the mean, standard deviation, and range) and of the frequency domain (the characteristics of the first and second harmonics, the distribution of the harmonics [kurtosis and skewness], the sum of all harmonics and the first six indices of the principal component analysis [PCA] for the normalized fast Fourier transform [FFT] of the signal) were extracted. Accordingly, 16 characteristics were obtained from each phase (effort and recovery) of each signal (HR, Ox, and the normalized difference between these), resulting in a total of 96 characteristics for each evaluation. The normalized difference between Ox and HR was defined using the sklearn StandardScaler function (the mathematical formula is available at https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html), and PCA was applied to the HR and Ox time series using the sklearn.decomposition.PCA function (formula available at https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html). The first 6 components of the PCA were selected at the researchers' discretion, considering that the first 3 to 6 components are typically retained in this type of analysis.
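As a rough illustration of the kind of extraction described above, the following NumPy sketch computes time-domain and some frequency-domain characteristics for one signal in one phase. Several points are assumptions rather than the study's method: the normalized difference is taken as the z-scored Ox minus the z-scored HR, "first and second harmonics" is read as the two largest spectral peaks, and the function names are hypothetical. The PCA indices and the FFT normalization are left out.

```python
import numpy as np

def _zscore(x):
    # Same formula StandardScaler applies to a column: (x - mean) / population std.
    x = np.asarray(x, dtype=float)
    sd = x.std()
    return (x - x.mean()) / sd if sd > 0 else np.zeros_like(x)

def normalized_difference(hr, ox):
    # ASSUMPTION: standardize each signal, then subtract (Ox minus HR).
    return _zscore(ox) - _zscore(hr)

def time_features(x):
    x = np.asarray(x, dtype=float)
    return {"mean": x.mean(), "std": x.std(), "range": x.max() - x.min()}

def _moment_ratio(v, order):
    # Standardized central moment: order 3 gives skewness, order 4 gives kurtosis.
    c = v - v.mean()
    m2 = np.mean(c ** 2)
    return np.mean(c ** order) / m2 ** (order / 2) if m2 > 0 else 0.0

def frequency_features(x, fs=1.0):
    # fs is the sampling frequency in Hz (1 Hz in the source).
    x = np.asarray(x, dtype=float)
    amp = np.abs(np.fft.rfft(x - x.mean()))[1:]      # magnitudes, zero-frequency bin dropped
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)[1:]
    first, second = np.argsort(amp)[::-1][:2]        # ASSUMPTION: two largest peaks
    return {
        "peak1_freq": freqs[first], "peak1_amp": amp[first],
        "peak2_freq": freqs[second], "peak2_amp": amp[second],
        "skewness": _moment_ratio(amp, 3),
        "kurtosis": _moment_ratio(amp, 4),
        "harmonic_sum": amp.sum(),
    }
```

Applying `time_features` and `frequency_features` to HR, Ox and their normalized difference, separately for effort and recovery, would give one feature vector per evaluation.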

Given that the main objective of this study was the detection of a transition from a state considered normal or stable (HF or COPD in the compensated phase [V2, V3]) to a state of decompensation or exacerbation (decompensated phase [V1]), a methodological scheme was applied based on calculation of the differences between the evaluations of each available characteristic. Thus, if a patient had three evaluations (V1, V2 and V3), six differences or useful comparative signals were obtained from these evaluations (V1–V2, V1–V3, V2–V1, V2–V3, V3–V1, V3–V2). The label of each of these comparative signals is illustrated in Fig. 1.

Figure 1. Labeling and interpretation of comparative signals.

Although the differences V1–V2 and V1–V3 might be more appropriately considered "decompensation recovery" rather than "no decompensation", we decided to discard a third label category ("decompensation recovery") due to the small sample size and because the main objective of the study was the detection of decompensation.
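The pairing and labeling scheme can be sketched in a few lines of plain Python. This is an illustration under assumptions: the direction of subtraction (second evaluation minus first) is not stated in the text, and `comparative_signals` and its input layout are hypothetical. The labels follow the two-category scheme described above, in which only a transition into V1 counts as a change to decompensation.

```python
from itertools import permutations

DECOMPENSATED = {"V1"}  # V1 is the in-hospital (decompensated) evaluation

def comparative_signals(features_by_visit):
    """features_by_visit maps a visit name to a dict of feature name -> value.

    Returns one (pair name, feature differences, label) tuple per ordered pair.
    """
    signals = []
    for src, dst in permutations(sorted(features_by_visit), 2):
        a, b = features_by_visit[src], features_by_visit[dst]
        # ASSUMPTION: difference is taken as destination minus source.
        diff = {name: b[name] - a[name] for name in a if name in b}
        # Label 1 only when moving from a compensated visit into V1.
        label = int(dst in DECOMPENSATED and src not in DECOMPENSATED)
        signals.append((f"{src}-{dst}", diff, label))
    return signals
```

A patient with all three evaluations yields six comparative signals, two of which (V2 to V1 and V3 to V1) carry label 1; a patient with only two valid evaluations yields two.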

As a first approximation, potential predictive characteristics were selected using the random forest [31], gradient boosting classifier [31] and light gradient-boosting machine (LGBM) [32] classification algorithms, which rank characteristics by importance as part of model fitting. We selected the top 10 features based on their importance ranking within the structure of each classifier model.
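A minimal sketch of importance-based selection with one of the named classifiers, assuming scikit-learn's `RandomForestClassifier` and its `feature_importances_` attribute. The function name, the number of trees and the random seed are illustrative choices, not values from the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def top_k_features(X, y, names, k=10, seed=0):
    # Fit the classifier, then rank features by impurity-based importance.
    model = RandomForestClassifier(n_estimators=100, random_state=seed).fit(X, y)
    order = np.argsort(model.feature_importances_)[::-1][:k]
    return [names[i] for i in order]
```

The same ranking step applies to gradient boosting and LGBM models, which expose an attribute of the same name in their scikit-learn interfaces.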

Figure 2 shows an outline of the process for preparation and selection of the characteristics of the signals.

Figure 2. Process for preparation and selection of the characteristics of the evaluations.

During the process of selecting characteristics, all those that were redundant or had very low variability were discarded. In this study, by definition, we did not have variables with perfect separation that could cause overestimation of the diagnostic capacity of the models (overfitting) [26].

In addition to the characteristics selected from the HR and Ox signals, the age, sex and baseline disease (HF or COPD) of the patients were considered potential predictors.

For the development of the algorithms, the ML techniques most used in the studies of classification models were considered: (i) decision trees, (ii) random forest, (iii) k-nearest neighbor (KNN), (iv) support vector machine (SVM), (v) logistic regression, (vi) naive Bayes classifier, (vii) gradient-boosting classifier and (viii) LGBM.

For each of these techniques, hyperparameters were selected by exhaustive (brute-force) search using all available data with K-fold cross-validation (k = 5). A normalization process based on the medians and interquartile ranges (IQRs) was applied to all characteristics [31].
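In scikit-learn terms, this step resembles a grid search over a pipeline whose first stage is `RobustScaler`, which centers on the median and scales by the IQR. The sketch below uses an SVM as one example technique. The parameter grid, the shuffling and the seed are assumptions, since the grids actually searched are not given in the text.

```python
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import RobustScaler
from sklearn.svm import SVC

def tune_svm(X, y, seed=0):
    # RobustScaler subtracts the median and divides by the IQR of each feature.
    pipe = make_pipeline(RobustScaler(), SVC())
    # ASSUMPTION: illustrative grid, not the one used in the study.
    grid = {"svc__C": [0.1, 1, 10], "svc__kernel": ["linear", "rbf"]}
    cv = KFold(n_splits=5, shuffle=True, random_state=seed)
    search = GridSearchCV(pipe, grid, cv=cv, scoring="accuracy")
    return search.fit(X, y)
```

Keeping the scaler inside the pipeline means its medians and IQRs are recomputed on the training portion of each fold. After fitting, `best_params_` holds the selected combination.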

Once the best parameters of each technique were identified, internal validation was performed with a leave-one-patient-out method. Thus, a new model was fitted for each patient, with that patient's data held out as the validation set and the data of the remaining patients used as the training set. Figure 3 shows an outline of the training and validation process.

Figure 3. Scheme of the training and validation of the study algorithms.
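Leave-one-patient-out validation corresponds to scikit-learn's `LeaveOneGroupOut` splitter with the patient identifier as the group, so that all comparative signals from one patient are held out together. The sketch below is illustrative: the function name is hypothetical and logistic regression is only a placeholder for any of the tuned models.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut

def leave_one_patient_out(X, y, patient_ids, model=None):
    # ASSUMPTION: placeholder model; pass any tuned scikit-learn estimator instead.
    model = LogisticRegression(max_iter=1000) if model is None else model
    X, y, patient_ids = map(np.asarray, (X, y, patient_ids))
    preds = np.empty_like(y)
    for train, test in LeaveOneGroupOut().split(X, y, groups=patient_ids):
        # A fresh copy is fitted without the held-out patient's rows.
        fitted = clone(model).fit(X[train], y[train])
        preds[test] = fitted.predict(X[test])
    return preds
```

The returned array holds one out-of-sample prediction per comparative signal, which can then be grouped by patient to compute per-patient metrics.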

The observation units (inputs) on which the algorithms were applied were the differences between two different evaluations, as illustrated in Fig. 1. Thus, the algorithms classified the evaluated difference as a state of no decompensation (label = 0) or a change to decompensation (label = 1). Therefore, the following parameters were defined:

True positive (TP): "change to decompensation" as the classification result for a V3–V1 or V2–V1 comparison.

True negative (TN): "no decompensation" as the classification result for a V1–V2, V1–V3, V2–V3 or V3–V2 comparison.

False positive (FP): "change to decompensation" as the classification result for a V1–V2, V1–V3, V2–V3 or V3–V2 comparison.

False negative (FN): "no decompensation" as the classification result for a V3–V1 or V2–V1 comparison.

The parameters used to evaluate the diagnostic performance of the algorithms were sensitivity (S), specificity (E) and accuracy (A). Each patient could have up to six observation units or inputs; therefore, up to six classification results were obtained, each of which was then defined as TP, TN, FP or FN. Then, S, E and A were obtained for each patient. The final S, E and A of the entire sample were calculated as the mean of the values obtained for each patient.
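The per-patient calculation followed by averaging can be sketched as below, using the usual definitions S = TP / (TP + FN), E = TN / (TN + FP) and A = (TP + TN) / total. How patients with no positive or no negative inputs were handled is not stated; here such a patient is simply left out of the mean for the undefined metric, which is an assumption. Function names are hypothetical.

```python
def patient_metrics(y_true, y_pred):
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    s = tp / (tp + fn) if tp + fn else None   # undefined without positive inputs
    e = tn / (tn + fp) if tn + fp else None   # undefined without negative inputs
    a = (tp + tn) / len(pairs)
    return s, e, a

def sample_metrics(per_patient):
    """per_patient maps a patient id to (y_true, y_pred) for that patient's inputs."""
    results = [patient_metrics(t, p) for t, p in per_patient.values()]
    means = []
    for i in range(3):
        # ASSUMPTION: undefined per-patient values are skipped in the mean.
        values = [r[i] for r in results if r[i] is not None]
        means.append(sum(values) / len(values) if values else None)
    return tuple(means)  # (S, E, A) averaged over patients
```

Averaging per patient gives every patient equal weight regardless of how many of the six comparisons were available, which differs from pooling all inputs into a single confusion matrix.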

The predictive values were not considered because the proportions of evaluations in the decompensated phase (33% [V1]) and compensated phase (66% [V2, V3]) did not correspond to the usual proportion found in clinical practice (the vast majority of patients in the community are usually in the compensated phase).

Missing data were not included in the analysis, but patients with missing data were not excluded (all available patient data were included in the analysis). No imputation of the missing data was performed.

During the process of signal review and verification of the start and end times of each evaluation from the manual records, missing sections of HR and/or Ox data due to poor contact between the skin and the sensor were observed. To handle this, filters were introduced to exclude these missing sections from the analysis. Thus, an evaluation was excluded if it had a loss rate (missing measures divided by the total number of measures) greater than 10% in any phase. In addition, evaluations performed at home (V2, V3) that did not reveal an improvement in the sensation of dyspnea for the patient (of at least one point according to the mMRC scale [30]) with respect to the decompensated phase evaluation (V1) were also excluded to ensure that home assessments were performed in the compensated phase.
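The two exclusion filters can be written as simple predicates. This sketch assumes missing samples are stored as `None` or NaN and that a lower mMRC grade means less dyspnea; the function names and data layout are hypothetical.

```python
import math

def loss_rate(samples):
    # Missing measures divided by the total number of measures.
    missing = sum(
        1 for v in samples
        if v is None or (isinstance(v, float) and math.isnan(v))
    )
    return missing / len(samples)

def passes_loss_filter(phases, max_loss=0.10):
    """phases maps a phase name ("effort", "recovery") to its list of samples.

    The evaluation is kept only if no phase has a loss rate above max_loss.
    """
    return all(loss_rate(s) <= max_loss for s in phases.values())

def dyspnea_improved(mmrc_v1, mmrc_home, min_points=1):
    # Home evaluation kept only if the mMRC grade fell by at least one point vs. V1.
    return mmrc_v1 - mmrc_home >= min_points
```

An evaluation with exactly 10% missing samples in a phase is kept under this reading, because the stated threshold is "greater than 10%".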

No indeterminate results were noted in the index test (algorithms); in all cases, the model produced either a "no decompensation" or a "change to decompensation" result. In addition, all evaluations were performed after a definitive result of the reference standard: clinical diagnosis of the decompensated phase by the doctor responsible for the patient for the hospital evaluation (V1) and clinical diagnosis of the compensated phase by the doctor who contacted the patients by phone before the home evaluations (V2, V3). Thus, the algorithms were developed and applied on evaluations clearly labeled as the compensated or decompensated phase by the reference diagnostic test.

All methods and procedures were performed in accordance with the relevant guidelines and regulations. The study followed the principles contained in the Declaration of Helsinki and was approved by the Ethics and Research Committee (ERC) of the center promoting the study (ERC of the Mataró Hospital, approval number 1851806). Informed consent was obtained from all participants and/or their legal guardians.
