Evaluation of Bayesian classifiers in asthma exacerbation prediction after medication discontinuation

Spyroglou, Ioannis I.; Spöck, Gunter; Rigas, Alexandros G.; Paraskakis, E. N.

doi:10.1186/s13104-018-3621-1

Research note
Open access
Published: 31 July 2018

Ioannis I. Spyroglou ORCID: orcid.org/0000-0001-6680-3656¹,
Gunter Spöck²,
Alexandros G. Rigas¹ &
…
E. N. Paraskakis³

BMC Research Notes volume 11, Article number: 522 (2018)

Abstract

Objective

Achieving optimal disease control is of cardinal importance in asthma treatment. Once control is sustained, the medication should be gradually reduced and then stopped. Nevertheless, discontinuing asthma medication may lead to loss of disease control and eventually to an exacerbation of the disease. The goal of this paper is to examine the performance of Bayesian network classifiers in predicting asthma exacerbation from several patient parameters, such as objective measurements and medical history data.

Results

In this study several Bayesian network classifiers are presented and evaluated. The proposed semi-naive network classifier, built with the Backward Sequential Elimination and Joining algorithm, predicts whether a patient will have an exacerbation of the disease after the last assessment with 93.84% accuracy and 90.9% sensitivity. In addition, the resulting structure and the conditional probability tables give a clear view of the probabilistic relationships between the factors used. This network may help clinicians to identify the patients who are at high risk of an exacerbation after stopping the medication and to confirm which factors are the most important.

Introduction

Longitudinal studies are becoming increasingly popular in the field of medicine. Several artificial intelligence techniques have been developed for analysing this kind of data in several diseases [1, 2].

In addition, numerous studies using exhaled volatile organic compounds, innovative exhaled inflammatory markers, telemonitoring data and other inputs have applied machine learning approaches to predict asthma exacerbation in children [3,4,5,6]. Bayesian network classifiers (BNCs) constitute a very important artificial intelligence technique [7]. Their main advantage over other classifiers, such as support vector machines (SVMs) and logistic regression, is that they are graphical models that display the relationships between the predicting factors clearly. For that reason, BNCs seem to be a more appropriate classifier for studies of complex and multifactorial diseases such as asthma. In addition, the graphical structure of BNCs can show cause–effect relationships and can therefore represent both direct and indirect causal relationships among the predicting factors of a disease [8].

Asthma is a complex chronic disease and its exacerbations usually occur after the discontinuation of medication [9]. Exacerbations manifest as a progressive increase in asthma symptoms such as dyspnea, coughing and wheezing, and as a decrease in spirometry measures such as forced expiratory volume in 1 s (FEV1) and peak expiratory flow (PEF).

The aim of this study is to predict and identify the patients who are at risk of an asthma exacerbation after medication cessation. The course of a patient after discontinuation of the medication is a very important issue. In some extreme cases an asthma exacerbation could even lead to the patient’s death [10,11,12].

The identification of risk factors for asthma exacerbations remains an open problem, and BNCs can be an efficient method for detecting some of them.

Main text

Methods

A dataset of repeated measurements from 65 patients (195 observations, 2–4 measurements for each patient) aged from 1 to 14.5 years was gathered by the Paediatric Department of the University Hospital of Alexandroupolis, Greece, during the period from 2008 to 2016. All of the patients had achieved good control of the disease and had discontinued their medication.

Additionally, it was necessary to include in the BNC a time variable [an ordinal categorical variable, i.e. the possible values ($t=1,2,\ldots$) are ordered ($1< 2 < \ldots$)] and a patient identity (id) variable (65 categories, one for each patient). A category change in a predictor variable over time may have a different impact on different patients. Including id and time as variables addresses this, because both enter the conditional probability estimation of the class variable described in the next subsection. The prognostic factors used in the network are described in Table 1. The interval between the measurements is the medical surveillance interval of 6 months [13]. The first assessment ($t=1$) is the one after discontinuation of the medication.

More information about the variables is given in the complete dataset provided in Additional file 1 [14,15,16].

Table 1 The encoding of the variables (nodes)

Bayesian network classifiers

BNCs are used for classifying instances into classes. Nodes represent the variables and arcs describe the probabilistic dependencies between them [17]. The combination of graph theory and probability theory allows us to model complex relationships between a large number of factors. In BNCs the predictor variables are usually called attributes and the dependent variable is called the class variable. The goal of a BNC is to estimate the probability of each class of the class variable given the attributes, based on the Bayes rule [18]:

$$P(C|\mathbf {A})=\frac{P(C)P(\mathbf {A}|C)}{P(\mathbf {A})}, \tag{1}$$

where $\mathbf {A}=(A_1,A_2,\ldots ,A_n)$ and $n$ is the number of attributes. $P(C)$ denotes the prior probabilities of the class variable $C$, estimated by $P(c_i)=N_{i}/N$, where $N_{i}$ is the number of times category $c_i$ occurs in $N$ samples. $P(\mathbf {A}|C)$ is the likelihood and $P(C|\mathbf {A})$ is the posterior probability. The algorithms used in this work are described next.
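As a small illustration of Eq. (1), the following Python sketch estimates the class priors as $N_i/N$ and normalises prior times likelihood into a posterior. It is not the authors' code (their analysis was done in R); the function names, the toy labels and the likelihood values are all hypothetical and chosen only to make the arithmetic visible.

```python
from collections import Counter

def class_priors(labels):
    """Estimate P(c_i) = N_i / N from a list of class labels."""
    counts = Counter(labels)
    n = len(labels)
    return {c: counts[c] / n for c in counts}

def posterior(priors, likelihoods):
    """Bayes rule: P(c | a) = P(c) P(a | c) / sum over c' of P(c') P(a | c')."""
    joint = {c: priors[c] * likelihoods[c] for c in priors}
    evidence = sum(joint.values())  # P(a), the normalising constant
    return {c: joint[c] / evidence for c in joint}

# Hypothetical toy labels (not the study data): 3 of 20 cases are high alert.
labels = ["high"] * 3 + ["low"] * 17
priors = class_priors(labels)

# Hypothetical likelihoods P(a | c) for one observed attribute configuration.
post = posterior(priors, {"high": 0.6, "low": 0.1})
print(priors, post)
```

Even with a strongly informative attribute configuration, the rare class ends up with a posterior only slightly above one half in this toy case, which is the kind of class imbalance effect that motivates a cutoff other than 0.5.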

Naive Bayes classifier (NB)

NB is the simplest structure. It assumes that the attributes are conditionally independent given the class variable, so only the prior probability of the class and the conditional probability of each attribute given the class are required. Thus $P(C|\mathbf {A})$ is proportional to $P(C)\prod _iP(A_i|C)$, and taking logarithms gives a log-linear model somewhat similar to a logistic regression model [18].
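The product form above can be sketched in a few lines of Python. This is a minimal categorical Naive Bayes, assuming Laplace smoothing (the `alpha` parameter), which the paper does not mention; the attribute values and labels are hypothetical toy data, and the sketch is not the "bnclassify" implementation.

```python
import math
from collections import Counter, defaultdict

def fit_naive_bayes(rows, labels, alpha=1.0):
    """Count class and per-attribute value frequencies (alpha = Laplace smoothing)."""
    n_attrs = len(rows[0])
    class_counts = Counter(labels)
    # value_counts[i][c][v] = number of rows of class c whose attribute i equals v
    value_counts = [defaultdict(Counter) for _ in range(n_attrs)]
    levels = [set() for _ in range(n_attrs)]
    for row, c in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[i][c][v] += 1
            levels[i].add(v)
    return {"class_counts": class_counts, "value_counts": value_counts,
            "levels": levels, "alpha": alpha, "n": len(labels)}

def predict_proba(model, row):
    """Return P(c | row), computed as log P(c) + sum_i log P(a_i | c), then normalised."""
    log_scores = {}
    for c, n_c in model["class_counts"].items():
        s = math.log(n_c / model["n"])
        for i, v in enumerate(row):
            num = model["value_counts"][i][c][v] + model["alpha"]
            den = n_c + model["alpha"] * len(model["levels"][i])
            s += math.log(num / den)
        log_scores[c] = s
    top = max(log_scores.values())  # subtract the maximum for numerical stability
    unnorm = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(unnorm.values())
    return {c: u / z for c, u in unnorm.items()}

# Hypothetical toy data: (night symptoms, FEV1 category) -> alert level.
rows = [("yes", "low"), ("yes", "low"), ("no", "normal"),
        ("no", "normal"), ("no", "low"), ("yes", "normal")]
labels = ["high", "high", "low", "low", "low", "low"]
model = fit_naive_bayes(rows, labels)
proba = predict_proba(model, ("yes", "low"))
print(proba)
```

Working in log space mirrors the log-linear form mentioned in the text: each attribute contributes one additive term per class.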

Tree-augmented Naive Bayes classifier (TAN)

TAN begins with the NB structure. A Hill-Climbing (HC) algorithm is then used to find connections among the attribute nodes, adding arcs until the performance of the classifier no longer improves. An alternative is to learn a one-dependence BNC with Chow–Liu’s algorithm by maximizing a chosen score (AIC, BIC or log-likelihood). In TAN the class variable has no parents and each attribute has at most two parents: the class variable and one other attribute [19, 20].
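The Chow–Liu route can be illustrated with a short Python sketch. It follows the commonly described construction in which attribute pairs are weighted by the empirical conditional mutual information $I(A_i;A_j|C)$ and a maximum-weight spanning tree is grown over the attributes; this is an assumption about the variant, since the paper mentions AIC, BIC and log-likelihood scores without giving details. The function names and toy columns are hypothetical.

```python
import math
from collections import Counter

def cond_mutual_info(x, y, c):
    """Empirical conditional mutual information I(X; Y | C), in nats."""
    n = len(c)
    n_xyc = Counter(zip(x, y, c))
    n_xc = Counter(zip(x, c))
    n_yc = Counter(zip(y, c))
    n_c = Counter(c)
    total = 0.0
    for (xv, yv, cv), k in n_xyc.items():
        # p(x,y,c) * log[ p(x,y|c) / (p(x|c) p(y|c)) ], written with raw counts
        total += (k / n) * math.log(k * n_c[cv] / (n_xc[(xv, cv)] * n_yc[(yv, cv)]))
    return total

def tan_attribute_tree(columns, labels, root=0):
    """Grow a maximum-weight spanning tree (Prim) and return {child: parent}."""
    n_attrs = len(columns)
    weight = {(i, j): cond_mutual_info(columns[i], columns[j], labels)
              for i in range(n_attrs) for j in range(i + 1, n_attrs)}
    in_tree = {root}
    parent = {}
    while len(in_tree) < n_attrs:
        # Heaviest edge joining a node inside the tree to a node outside it.
        edges = ((i, j) for i in in_tree for j in range(n_attrs) if j not in in_tree)
        i, j = max(edges, key=lambda e: weight[(min(e), max(e))])
        parent[j] = i  # arc i -> j, directed away from the root
        in_tree.add(j)
    return parent

# Hypothetical toy columns: a1 copies a0, a2 is independent of both given the class.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
a0 = [0, 0, 1, 1, 0, 0, 1, 1]
a1 = [0, 0, 1, 1, 0, 0, 1, 1]
a2 = [0, 1, 0, 1, 0, 1, 0, 1]
parent = tan_attribute_tree([a0, a1, a2], labels)
print(parent)
```

In the resulting classifier every attribute additionally receives the class variable as a parent, so each attribute has at most two parents, as stated above.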

Semi-Naive Bayes classifiers (SNBC)

Another alternative is to transform the basic structure of an NB classifier into a structure that takes into account dependencies between the attributes while maintaining the tree structure. The basic idea of SNBC is to eliminate attributes in a way that increases the performance of the classifier. Two algorithms are used: the filter forward sequential selection and joining (FSSJ) algorithm, which starts from a null BNC and adds attributes, and the backward sequential elimination and joining (BSEJ) algorithm, which starts with a full BNC and eliminates attributes so as to increase performance [18].
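The elimination half of BSEJ can be sketched as a greedy wrapper around any performance score. The Python sketch below is deliberately simplified: it only removes attributes and omits the joining step (merging two attributes into one combined attribute), so it is not the full BSEJ algorithm. The `score` callable, the attribute names and the toy scoring rule are hypothetical stand-ins for a cross-validated accuracy estimate.

```python
def backward_elimination(attributes, score):
    """Greedy backward elimination (no joining step).

    attributes : list of attribute names
    score      : callable mapping a tuple of attribute names to a number,
                 e.g. a cross-validated accuracy; higher is better
    """
    selected = list(attributes)
    best = score(tuple(selected))
    improved = True
    while improved and len(selected) > 1:
        improved = False
        # Score every subset obtained by dropping exactly one attribute.
        candidates = [(score(tuple(a for a in selected if a != drop)), drop)
                      for drop in selected]
        cand_score, drop = max(candidates)
        if cand_score > best:  # keep the removal only if it strictly helps
            best = cand_score
            selected.remove(drop)
            improved = True
    return selected, best

# Hypothetical score: two attributes are informative, each attribute costs 0.1.
useful = {"FEV1", "ACT"}

def toy_score(attrs):
    return len(set(attrs) & useful) - 0.1 * len(attrs)

selected, best = backward_elimination(["FEV1", "ACT", "noise1", "noise2"], toy_score)
print(selected, best)
```

Starting from the full set and stopping when no single removal improves the score matches the description of starting with a full BNC and eliminating attributes; a forward variant would start from the empty set and add attributes instead.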

Results

The calculations were performed in R GUI 3.3.3 with the “bnclassify” and “bnlearn” packages [21, 22]. The last assessment of each patient is used as the test set. One major problem is that only 14.9% (29 out of 195) of the cases are high alert cases for an exacerbation, so there is a high risk that the classifiers will be biased towards the majority class. For this reason we looked for an optimal cutoff, different from the classic 0.5, above which a case is considered high alert. A validation set obtained from repeated hold-out cross-validation in the training set is used to create a Receiver Operating Characteristics (ROC) curve, and the optimal threshold is chosen by the criterion of minimum distance from the point (0,1) [23]. A validation set must be used so that the results are unbiased. The ROC curves are presented in Additional file 2.

The classifiers are evaluated using the numbers of true positive (TP), true negative (TN), false positive (FP) and false negative (FN) cases, which give the following measures:
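The minimum-distance criterion can be made concrete with a short Python sketch: for each candidate cutoff, compute the false positive rate and true positive rate on the validation set and keep the cutoff whose ROC point lies closest to (0,1). The function name, the rule that a case is positive when its score is at least the cutoff, and the validation scores are all hypothetical assumptions for illustration; they are not the study's data.

```python
import math

def closest_to_corner_cutoff(scores, labels):
    """Return (cutoff, fpr, tpr) minimising the distance to (0, 1) on the ROC curve.

    scores : predicted probabilities of the positive (high alert) class
    labels : 1 for positive, 0 for negative (both classes must be present)
    A case is called positive when score >= cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        tpr, fpr = tp / n_pos, fp / n_neg
        dist = math.hypot(fpr, 1.0 - tpr)  # Euclidean distance to (0, 1)
        if best is None or dist < best[0]:
            best = (dist, cutoff, fpr, tpr)
    return best[1:]

# Hypothetical validation scores for an imbalanced problem (3 positives, 5 negatives).
scores = [0.02, 0.03, 0.04, 0.05, 0.07, 0.10, 0.30, 0.60]
labels = [0, 0, 0, 0, 1, 0, 1, 1]
cutoff, fpr, tpr = closest_to_corner_cutoff(scores, labels)
print(cutoff, fpr, tpr)
```

With rare positives the predicted probabilities tend to be small, so the selected cutoff can fall far below 0.5, as in this toy case.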

$$\text{Sensitivity}=\frac{N_{TP}}{N_{TP}+N_{FN}}, \tag{2}$$

$$\text{Specificity}=\frac{N_{TN}}{N_{TN}+N_{FP}}, \tag{3}$$

$$\text{Accuracy}=\frac{N_{TP}+N_{TN}}{N_{TP}+N_{TN}+N_{FP}+N_{FN}}. \tag{4}$$
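Equations (2)–(4) translate directly into code. The Python sketch below uses hypothetical counts chosen only for illustration; they are not taken from Table 2.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),                 # Eq. (2)
        "specificity": tn / (tn + fp),                 # Eq. (3)
        "accuracy": (tp + tn) / (tp + tn + fp + fn),   # Eq. (4)
    }

# Hypothetical counts (not the study's results).
m = confusion_metrics(tp=8, tn=45, fp=5, fn=2)
print(m)
```

Because accuracy is dominated by the majority class when positives are rare, sensitivity is reported alongside it.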

The accuracy results are summarized in Table 2. The values inside the parentheses are the accuracy measures with the initial cutoff (0.5).

Table 2 Accuracy measures for BNCs

The problem with the initial cutoff (0.5) is that the sensitivity values are low, which is problematic in asthma exacerbation prediction. Therefore, the cutoff has to be lowered, here to 0.06. As Table 2 shows, the BSEJ algorithm results in a classifier that identifies high-alert cases better than the others. At the same time, the classifier has high specificity, which leads to a more accurate model. The structure of the BSEJ classifier is presented in Fig. 1, showing how asthma exacerbation is affected by the attributes and the probabilistic relationships between them. These relationships are described by the Conditional Probability Tables (CPTs).

Discussion

Our study showed that BNCs seem to be quite efficient in the early prediction of high-alert asthma exacerbation cases. It should be noted that multiple time points from the same patient may introduce bias in the final model, owing to within-subject correlations. These correlations can be estimated through a GEE (Generalized Estimating Equations) logistic regression model [24]. In our case an independence correlation structure seems to work well. However, on a larger scale (with more patients and time points) the classifier should be modified to deal with a potentially more complex correlation. In addition, other classification techniques (SVMs, logistic regression) did not perform as well. Moreover, we have confirmed that gender, spirometric parameters, food allergies, age, day and night symptoms, and ATAQ and ACT scores are the most important factors for a future exacerbation following treatment cessation.

Having compared several algorithms, we concluded that the BSEJ algorithm has the best performance. The classifier derived by this algorithm contains 14 attributes. The advantage of this approach is that it takes into account the dependence that may exist between the attributes. Instead of using BSEJ we could have tried every possible combination of NB classifiers. We chose BSEJ because NB classifiers assume that the attributes are independent, which is not valid in the case of asthma, since the combination of some symptoms or patient characteristics could lead to an exacerbation. The importance of the factors can be examined through the CPTs, which are provided in Additional file 3. For example, regarding BMI, as has been shown in previous studies [25,26,27,28], the majority of the patients with low FVC% predicted who presented an asthma exacerbation were obese. This shows the importance of those two factors combined, even though the effect of obesity on asthma exacerbations is still not very clear [29].

The presence of asthma symptoms during the day, at night or during physical activities seems to favour an exacerbation as well. It is known that poor asthma control could lead to an exacerbation of the disease, and all of these can have significant effects on the quality of sleep [30]. Moreover, nocturnal asthma is associated with an increase in symptoms [31] and the need for additional medication. Additionally, the ACT score seems to play an important role in predicting future exacerbations [32], but we cannot rely on it alone because, as the CPT of ACT shows, there is also a high percentage of Good Asthma Control among high-alert cases. In conclusion, CPTs seem to provide valuable information about important predicting factors whose role in asthma prediction has been shown in numerous previous studies [27, 28].

In summary, the CPTs of the classifier indicate that all of the remaining factors seem to play an important role in asthma exacerbation prediction. This in turn indicates that asthma exacerbation prediction cannot depend on only a few factors; it is a multi-factorial problem. Most of the factors included are significantly associated with asthma exacerbations [10, 33]. In addition, a comparison with other studies using similar factors showed that the BSEJ BNC offered improvement in prediction accuracy. Some of the factors included in [6] are the same as ours. Our BSEJ BNC seems to identify high alert cases better and at the same time exhibits higher overall accuracy in testing each patient’s last assessment. However, it would be very interesting to test how the BSEJ BNC behaves if environmental and socio-economic factors are also included [6].

Conclusion

The goal of this study was to create a BNC using several factors for the prediction of high alert cases for an asthma exacerbation. The best performance was obtained with a classifier created with the BSEJ algorithm. A prediction accuracy above 90% (93.84%) with a sensitivity of 90.9% shows that this classifier can be a useful tool for clinicians. The basic advantage of using BNCs in asthma exacerbation prediction, compared with traditional clinical prediction methods that use simple parameters with low prognostic accuracy, is that a BNC simultaneously utilizes a number of factors associated with exacerbation. Thus, a high accuracy in exacerbation prediction is achieved.

Limitations

The main limitation of this study is that the dataset is not large enough, so the statistical findings of this work should be studied on a larger scale in the future.

Abbreviations

ACT: asthma control test
AIC: Akaike information criterion
ATAQ: Asthma Therapy Assessment Questionnaire
BIC: Bayesian information criterion
BMI: body mass index
BNC: Bayesian network classifier
BSEJ: backward sequential elimination and joining
FEV1: forced expiratory volume in 1 s
FSSJ: filter forward sequential selection and joining
FVC: forced vital capacity
GEE: generalized estimating equations
HC: hill climbing
LOGLIK: log-likelihood
NB: Naive Bayes
PEF: peak expiratory flow
ROC: receiver operating characteristics
SNBC: semi-Naive Bayes classifier
SVM: support vector machine
TAN: tree-augmented Naive Bayes

References

1. Tandon R, Adak S, Kaye JA. Neural networks for longitudinal studies in Alzheimer’s disease. Artif Intell Med. 2006;36(3):245–55.
2. Maity TK, Pal AK. Subject specific treatment to neural networks for repeated measures analysis. Proc Int MultiConf Eng Comput Sci. 2013;1:60–5.
3. van Vliet D, Alonso A, Rijkers G, Heynens J, Rosias P, Muris J. Prediction of asthma exacerbations in children by innovative exhaled inflammatory markers: results of a longitudinal study. PLoS ONE. 2015;10(3):e0119434.
4. van Vliet D, Smolinska A, Jöbsis Q, Rosias P, Muris J, Dallinga J. Can exhaled volatile organic compounds predict asthma exacerbations in children? J Breath Res. 2017;11(1):016016.
5. Finkelstein J, Jeong IC. Machine learning approaches to personalize early prediction of asthma exacerbations. Ann NY Acad Sci. 2017;1387(1):153–65.
6. Luo G, Stone BL, Fassl B, Maloney CG, Gesteland PH, Yerram SR, et al. Predicting asthma control deterioration in children. BMC Med Inform Decis Making. 2015;15:84.
7. Jensen FV. An introduction to Bayesian networks, vol. 210. London: UCL Press; 1996.
8. Margaritis D. Learning Bayesian network model structure from data. Ph.D. thesis. School of Computer Science, Carnegie-Mellon University, Pittsburgh. Technical Report CMU-CS-03-153; 2003.
9. Beasley R, Semprini A, Mitchell EA. Risk factors for asthma: is prevention possible? Lancet. 2015;386(9998):1075–85.
10. Camargo CA Jr, Rachelefsky G, Schatz M. Managing asthma exacerbations in the emergency department: summary of the National Asthma Education and Prevention Program Expert Panel Report 3 guidelines for the management of asthma exacerbations. Proc Am Thorac Soc. 2009;6(4):357–66.
11. Kupryś-Lipińska I, Kuna P. Loss of asthma control after cessation of omalizumab treatment: real life data. Postep Derm Alergol. 2014;31:1–5.
12. Bush A. Diagnosis of asthma in children under five. Prim Care Respir J. 2007;16(1):7–15.
13. Tarlo SM, Liss GM, Yeung KS. Changes in rates and severity of compensation claims for asthma due to diisocyanates: a possible effect of medical surveillance measures. Occup Environ Med. 2002;59(1):58–62.
14. Centers for Disease Control and Prevention. Body mass index: BMI for children and teens. http://www.cdc.gov/nccdphp/dnpa/bmi/bmi-for-age.htm. Accessed 1 Dec 2017.
15. Jat KR. Spirometry in children. Prim Care Respir J. 2013;22:221–9.
16. Liu AH, Zeiger R, Sorkness C, Mahr T, Ostrom N, Burgess S. Development and cross-sectional validation of the Childhood Asthma Control Test. J Allergy Clin Immunol. 2007;119(4):817–25.
17. Korb KB, Nicholson AE. Bayesian artificial intelligence. London: CRC Press; 2010.
18. Sucar LE. Probabilistic graphical models: principles and applications. Berlin: Springer; 2015.
19. Friedman N, Geiger D, Goldszmidt M. Bayesian network classifiers. Mach Learn. 1997;29(2–3):131–63.
20. Keogh EJ, Pazzani MJ. Learning the structure of augmented Bayesian classifiers. Int J Artif Intell Tools. 2002;11(04):587–601.
21. Mihaljevic B, Bielza C, Larrañaga P. bayesclass: an R package for learning Bayesian network classifiers. In: Proceedings of useR!–the R user conference; 2013. p. 53.
22. Scutari M. Learning Bayesian networks with the bnlearn R package. J Stat Softw. 2009;35(3):1–22.
23. Kumar R, Indrayan A. Receiver operating characteristic (ROC) curve for medical researchers. Indian Pediatrics. 2011;48(4):277–87.
24. Liang KY, Zeger SL. Longitudinal data analysis using generalized linear models. Biometrika. 1986;73(1):13–22. http://0-biomet-oxfordjournals-org.brum.beds.ac.uk/content/73/1/13.abstract.
25. Kasteleyn MJ, Bonten TN, de Mutsert R, Thijs W, Hiemstra PS, le Cessie S. Pulmonary function, exhaled nitric oxide and symptoms in asthma patients with obesity: a cross-sectional study. Respir Res. 2017;18(1):205.
26. Spathopoulos D, Paraskakis E, Trypsianis G, Tsalkidis A, Arvanitidou V, Emporiadou M. The effect of obesity on pulmonary lung function of school aged children in Greece. Pediatric Pulmonol. 2009;44(3):273–80.
27. Covar RA, Szefler SJ, Zeiger RS, Sorkness CA, Moss M, Mauger DT, et al. Factors associated with asthma exacerbations during a long-term clinical trial of controller medications in children. J Allergy Clin Immunol. 2008;122(4):741–7.
28. Fleming L. Asthma exacerbation prediction: recent insights. Curr Opin Allergy Clin Immunol. 2018;18(2):117–23.
29. De Vera MJB, Gomez MC, Yao CE. Association of obesity and severity of acute asthma exacerbations in Filipino children. Ann Allergy Asthma Immunol. 2016;117(1):38–42.
30. Sundbom F, Malinovschi A, Lindberg E, Alving K, Janson C. Effects of poor asthma control, insomnia, anxiety and depression on quality of life in young asthmatics. J Asthma. 2016;53(4):398–403.
31. Skloot GS. Nocturnal asthma: mechanisms and management. Mount Sinai J Med NY. 2002;69(3):140–7.
32. Ko FW, Hui DS, Leung TF, Chu HY, Wong GW, Tung AH, et al. Evaluation of the asthma control test: a reliable determinant of disease stability and a predictor of future exacerbations. Respirology. 2012;17(2):370–8.
33. Wan KS, Wu WF, Liu YC, Huang CS, Wu CS, Hung CW. Effects of food allergens on asthma exacerbations in schoolchildren with atopic asthma. Food Agric Immunol. 2017;28(2):310–4.

Authors’ contributions

IIS and GS contributed to the conception and design of the study and performed the statistical analysis. IIS, GS, AGR and ENP participated in the interpretation of the results and helped to draft the manuscript. All authors read and approved the final manuscript.

Acknowledgements

We thank the Department of Paediatrics of the University Hospital of Alexandroupolis for providing the dataset used in this work.

Competing interests

The authors declare that they have no competing interests.

Availability of data and materials

The data supporting this study are publicly available as additional files.

Consent for publications

Not applicable.

Ethics approval and consent to participate

Ethics approval was granted by the Research Ethics Committee of Democritus University of Thrace. The data used in this work are anonymous.

Funding

No funding was received.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, Democritus University of Thrace, 67100, Xanthi, Greece
Ioannis I. Spyroglou & Alexandros G. Rigas
Department of Statistics, Alpen-Adria Universität, 9020, Klagenfurt, Austria
Gunter Spöck
Paediatric Respiratory Unit, Department of Paediatrics, Medical School, Democritus University of Thrace, 68100, Alexandroupolis, Greece
E. N. Paraskakis

Corresponding author

Correspondence to Ioannis I. Spyroglou.

Additional files

Additional file 1.

The complete dataset used for the evaluation of BNCs in asthma exacerbation prediction.

Additional file 2.

ROC curves of the BNCs with the use of validation dataset.

Additional file 3.

The CPTs of the BSEJ Bayesian Classifier.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Spyroglou, I.I., Spöck, G., Rigas, A.G. et al. Evaluation of Bayesian classifiers in asthma exacerbation prediction after medication discontinuation. BMC Res Notes 11, 522 (2018). https://0-doi-org.brum.beds.ac.uk/10.1186/s13104-018-3621-1

Received: 03 May 2018
Accepted: 20 July 2018
Published: 31 July 2018
DOI: https://0-doi-org.brum.beds.ac.uk/10.1186/s13104-018-3621-1
