{"id":1027402,"date":"2023-08-06T16:56:48","date_gmt":"2023-08-06T20:56:48","guid":{"rendered":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/uncategorized\/machine-learning-for-the-development-of-diagnostic-models-of-nature-com.php"},"modified":"2023-08-06T16:56:48","modified_gmt":"2023-08-06T20:56:48","slug":"machine-learning-for-the-development-of-diagnostic-models-of-nature-com","status":"publish","type":"post","link":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/machine-learning\/machine-learning-for-the-development-of-diagnostic-models-of-nature-com.php","title":{"rendered":"Machine learning for the development of diagnostic models of &#8230; &#8211; Nature.com"},"content":{"rendered":"<p><p>Design<\/p>\n<p>This was a prospective multicenter observational study. Unlike studies on prognostic models, this study developed diagnostic models, that is, models designed to determine whether a patient was in the compensated or decompensated phase of their disease (exacerbation of chronic obstructive pulmonary disease [COPD] and\/or decompensation of heart failure [HF]).<\/p>\n<p>The criteria for admission to this study and the recruitment process have been reported previously<sup>19<\/sup>. Patients older than 55 years who were able to walk at least 30 m, had a main diagnosis of decompensated HF and\/or exacerbation of COPD, and were hospitalized in the Department of Internal Medicine, Cardiology or Pneumology were included. Participants with a pacemaker or intracardiac device, users of domiciliary oxygen therapy prior to admission, and patients with HF in New York Heart Association (NYHA) functional class IV were excluded<sup>29<\/sup>.<\/p>\n<p>Four hospitals participated: two tertiary university hospitals (600&#8211;900 hospital beds) and two regional secondary care hospitals (150&#8211;400 hospital beds) in the provinces of Barcelona and Madrid.
<\/p>\n<p>Each center had a trained interviewer, and each department had a referring physician who was accessible to the interviewer. Each day, the interviewer contacted the referring physician to review the hospitalization census and identify patients with the diagnosis of interest. Next, the interviewer confirmed the main diagnosis (decompensated HF and\/or exacerbation of COPD) with the physician responsible for the patient and then contacted the participant (the same day or the next day) to obtain informed consent and verify compliance with all admission criteria of this study. The sample was obtained through convenience sampling, and all patients were enrolled consecutively as they were identified.<\/p>\n<p>The recruitment and follow-up periods lasted 18 months, starting in November 2010.<\/p>\n<p>Each patient underwent three identical evaluations: the first in the hospitalization unit (V1) and the other two consecutively, at least 24 h apart, in the participant's home 30 days after hospital discharge (V2 and V3). Thus, each participant underwent one evaluation in the decompensated phase (V1) and two in the compensated phase (V2, V3) of their disease.<\/p>\n<p>The evaluation protocol<sup>19<\/sup> included documentation of symptoms (dyspnea according to the NYHA<sup>29<\/sup> and Modified Medical Research Council [mMRC]<sup>30<\/sup> scales) and physiological parameters (heart rate [HR] and oxygen saturation [Ox]) in two consecutive periods: effort (walking at a normal pace on flat terrain for a maximum of 6 min) and recovery (seated for 4 min after the end of the effort period).<\/p>\n<p>HR and Ox were treated as time series with a sampling frequency of 1 Hz and were collected throughout the evaluation with a pulse oximeter (Model 3100, Nonin Medical, Inc., Plymouth, MN, USA) placed on the left index finger.
<\/p>\n<p>Given the absence of a single standard diagnostic test to verify whether a patient was in the compensated or decompensated phase of their disease, the clinical judgment of the participant's responsible physician was taken as the reference standard. Thus, in the decompensated phase, the diagnosis of decompensated HF and\/or COPD exacerbation corresponded to the diagnosis confirmed by the participant's attending physician (in cases of diagnostic doubt, the patient was excluded). For the compensated phase, a diagnosis of compensated HF and\/or stable COPD was confirmed by a study physician through telephone contact with the participant 30 days after hospital discharge. During this call, the patient was considered to be in the compensated phase if none of the following events had occurred since hospital discharge: increased cough, sputum or dyspnea; initiation of or an increase in corticosteroid use; and initiation of antibiotic treatment or medical consultation for worsening of the clinical situation from any cause. In cases of doubt, or if the compensated phase could not be confirmed, successive telephone contacts were made until the phase could be confirmed. The interviewer scheduled the home visits for the respective evaluations (V2, V3) only after confirmation and within 24&#8211;48 h of receiving it.<\/p>\n<p>Given the objective of this study (development of an online algorithm capable of detecting the onset of an exacerbation from HR and Ox data), various characteristics were extracted from each of the evaluations (V1, V2, V3). For this purpose, the effort phase (walking) and recovery phase of each evaluation were separated using the start and end times of each phase recorded manually in the data collection records, and the signals were reviewed visually to confirm the manual records.
Once the signals were separated according to the evaluation phase, the corresponding characteristics of the available measures were extracted.<\/p>\n<p>Numerous characteristics were extracted from the signals. Each test comprised two phases, effort and recovery, which were treated separately. For each phase, three signals were considered: HR, Ox and the normalized difference between these variables. From each of these three time series, characteristics were extracted in the time domain (the mean, standard deviation and range) and the frequency domain (the characteristics of the first and second harmonics, the distribution of the harmonics [kurtosis and skewness], the sum of all harmonics and the first six components of the principal component analysis [PCA] of the normalized fast Fourier transform [FFT] of the signal). Accordingly, 16 characteristics were obtained from each phase (effort and recovery) of each signal (HR, Ox and the normalized difference between these), resulting in a total of 96 characteristics for each evaluation (16 characteristics &times; 2 phases &times; 3 signals). The normalized difference between Ox and HR was computed using the scikit-learn StandardScaler class (the mathematical formula is available at <a href=\"https:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.preprocessing.StandardScaler.html\" rel=\"nofollow\">https:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.preprocessing.StandardScaler.html<\/a>), and PCA was applied to the HR and Ox time series using the sklearn.decomposition.PCA class (formula available at <a href=\"https:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.decomposition.PCA.html\" rel=\"nofollow\">https:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.decomposition.PCA.html<\/a>).
The first six PCA components were retained at the researchers' discretion, given that analyses of this type typically consider the first three to six components.<\/p>\n<p>Given that the main objective of this study was to detect a transition from a state considered normal or stable (HF or COPD in the compensated phase [V2, V3]) to a state of decompensation or exacerbation (decompensated phase [V1]), a methodological scheme based on the differences between evaluations was applied to each available characteristic. Thus, if a patient had three evaluations (V1, V2 and V3), six differences, or useful comparative signals, were obtained from these evaluations (V1&#8211;V2, V1&#8211;V3, V2&#8211;V1, V2&#8211;V3, V3&#8211;V1, V3&#8211;V2). The label of each of these comparative signals is illustrated in Fig. 1.<\/p>\n<p>Figure 1. Labeling and interpretation of comparative signals.<\/p>\n<p>Although the differences V1&#8211;V2 and V1&#8211;V3 might be more appropriately considered &#8220;decompensation recovery&#8221; rather than &#8220;no decompensation&#8221;, we decided to discard a third label category (&#8220;decompensation recovery&#8221;) due to the small sample size and because the main objective of the trial was the detection of decompensation.<\/p>\n<p>As a first approximation, potential predictive characteristics were selected using the random forest<sup>31<\/sup>, gradient boosting classifier<sup>31<\/sup> and light gradient-boosting machine (LGBM)<sup>32<\/sup> classification algorithms, which rank characteristics by importance as part of model fitting. We selected the top 10 features based on their importance ranking within the structure of each classifier model.<\/p>\n<p>Figure 2 shows an outline of the process for preparation and selection of the characteristics of the signals.
<\/p>\n<p>Figure 2. Process for preparation and selection of the characteristics of the evaluations.<\/p>\n<p>During the process of selecting characteristics, all those that were redundant or had very low variability were discarded. In this study, by definition, we did not have variables with perfect separation that could cause overestimation of the diagnostic capacity of the models (overfitting)<sup>26<\/sup>.<\/p>\n<p>In addition to the characteristics selected from the HR and Ox signals, the age, sex and baseline disease (HF or COPD) of the patients were considered potential predictors.<\/p>\n<p>For the development of the algorithms, the machine learning (ML) techniques most commonly used in studies of classification models were considered: (i) decision trees, (ii) random forest, (iii) k-nearest neighbor (KNN), (iv) support vector machine (SVM), (v) logistic regression, (vi) naive Bayes classifier, (vii) gradient-boosting classifier and (viii) LGBM.<\/p>\n<p>For each of these techniques, hyperparameters were selected by a brute-force (exhaustive) search over all available data with K-fold cross-validation (k = 5). A normalization process based on the medians and interquartile ranges (IQRs) was applied to all characteristics<sup>31<\/sup>.<\/p>\n<p>Once the best parameters of each technique were identified, internal validation was performed with a leave-one-patient-out method. Thus, a new model was fitted for each patient, with that patient's data held out as the validation set and the data from the remaining patients used as the training set. Figure 3 shows an outline of the training and validation process.<\/p>\n<p>Figure 3. Scheme of the training and validation of the study algorithms.<\/p>\n<p>The observation units (inputs) on which the algorithms were applied were the differences between two different evaluations, as illustrated in Fig. 1.
Thus, the algorithms classified the evaluated difference as a state of no decompensation (label = 0) or a change to decompensation (label = 1). Therefore, the following parameters were defined:<\/p>\n<p>True positive (TP): a change to decompensation as the classification result for a V3&#8211;V1 or V2&#8211;V1 comparison.<\/p>\n<p>True negative (TN): no decompensation as the classification result for a V1&#8211;V2, V1&#8211;V3, V2&#8211;V3 or V3&#8211;V2 comparison.<\/p>\n<p>False positive (FP): a change to decompensation as the classification result for a V1&#8211;V2, V1&#8211;V3, V2&#8211;V3 or V3&#8211;V2 comparison.<\/p>\n<p>False negative (FN): no decompensation as the classification result for a V3&#8211;V1 or V2&#8211;V1 comparison.<\/p>\n<p>The parameters used to evaluate the diagnostic performance of the algorithms were sensitivity (S), specificity (E) and accuracy (A). Each patient could have up to six observation units or inputs; therefore, up to six classification results were obtained per patient, each of which was classified as TP, TN, FP or FN. The S, E and A were then calculated for each patient, and the final S, E and A of the entire sample were calculated as the means of the per-patient values.<\/p>\n<p>The predictive values were not considered because the proportions of evaluations in the decompensated phase (33% [V1]) and compensated phase (66% [V2, V3]) did not correspond to the proportions usually found in clinical practice (the vast majority of patients in the community are usually in the compensated phase).<\/p>\n<p>Missing data were not included in the analysis, but patients with missing data were not excluded (all available patient data were included in the analysis). No imputation of the missing data was performed.
<\/p>\n<p>During the process of signal review and verification of the start and end times of each evaluation from the manual records, missing sections of HR and\/or Ox data due to poor contact between the skin and the sensor were observed. Filters were therefore introduced to exclude these missing sections from the analysis. Thus, an evaluation was excluded if it had a loss rate (missing measures divided by the total number of measures) greater than 10% in any phase. In addition, evaluations performed at home (V2, V3) that did not show an improvement in the patient's sensation of dyspnea (of at least one point on the mMRC scale<sup>30<\/sup>) with respect to the decompensated phase evaluation (V1) were also excluded to ensure that home assessments were performed in the compensated phase.<\/p>\n<p>No indeterminate results were noted in the index test (algorithms); in all cases, the model produced a &#8220;no decompensation&#8221; or a &#8220;change to decompensation&#8221; result. Furthermore, all evaluations were performed after a definitive result of the reference diagnostic test: clinical diagnosis of the decompensated phase by the doctor responsible for the patient in the hospital evaluation (V1) and clinical diagnosis of the compensated phase by the doctor who contacted the patients by phone before the home evaluations (V2, V3). Thus, the algorithms were developed and applied on evaluations clearly labeled as the compensated or decompensated phase by the reference diagnostic test.<\/p>\n<p>All methods and procedures were performed in accordance with the relevant guidelines and regulations. The study followed the principles contained in the Declaration of Helsinki and was approved by the Ethics and Research Committee (ERC) of the center promoting the study (ERC of the Mataró Hospital, approval number 1851806).
Informed consent was obtained from    all participants and\/or their legal guardians.  <\/p>\n<p><!-- Auto Generated --><\/p>\n<p>See the rest here: <\/p>\n<p><a target=\"_blank\" rel=\"nofollow noopener\" href=\"https:\/\/www.nature.com\/articles\/s41598-023-39329-6\" title=\"Machine learning for the development of diagnostic models of ... - Nature.com\">Machine learning for the development of diagnostic models of ... - Nature.com<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p> Design This was a prospective multicenter observational study. Unlike studies on prognostic models, in the present study, diagnostic models were developed, that is, models designed to determine whether a patient was in the compensated or decompensated phase of their disease (exacerbation of COPD and\/or HF decompensation).  <a href=\"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/machine-learning\/machine-learning-for-the-development-of-diagnostic-models-of-nature-com.php\">Continue reading <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"limit_modified_date":"","last_modified_date":"","_lmt_disableupdate":"","_lmt_disable":"","footnotes":""},"categories":[1231415],"tags":[],"class_list":["post-1027402","post","type-post","status-publish","format-standard","hentry","category-machine-learning"],"modified_by":null,"_links":{"self":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts\/1027402"}],"collection":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/comments?post=1027402"}],"version-history":[{"count":0,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts\/1027402\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/media?parent=1027402"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/categories?post=1027402"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/tags?post=1027402"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}