Risk assessment using machine learning in Duchenne muscular dystrophy

Authors

  • Boonsit Yimwadsana, Faculty of Information and Communication Technology, Mahidol University

Keywords:

Sentiment analysis, Health, Natural Language Processing

Abstract

Risk assessment is an essential component of prognosis and treatment for genetic diseases. In the past, Mendelian inheritance analysis played a key role in genetic risk assessment. Recently, machine learning has been widely used in data analysis with notable success. However, most machine learning techniques do not allow physicians to understand how data features are related to each other, because the data collected by doctors are often unbalanced or biased towards patient data only. People who do not have a specific disease (healthy people, or people with different diseases who may have similar symptoms) were often not followed up by physicians. Because of this data imbalance, the results of risk assessment analysis performed by machine learning techniques are often not effective in real-world situations. This work aims to introduce additional pedigree data to improve the accuracy of genetic risk assessment. We tested our concept on Duchenne muscular dystrophy (DMD) and show that our proposed use of pedigree information helps improve accuracy even when the data are unbalanced.
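As an illustration of the kind of pedigree-based Mendelian reasoning mentioned above, the following is a minimal sketch, not the method used in this work. It assumes a standard X-linked recessive model (DMD is X-linked recessive), ignores new mutations, and uses hypothetical function names (carrier_posterior, son_risk). It applies Bayes' rule to update a woman's carrier probability given the number of unaffected sons recorded in her pedigree; a value like this could, in principle, serve as an extra input feature for a classifier.

```python
from fractions import Fraction


def carrier_posterior(prior, unaffected_sons):
    """Posterior probability that a woman is a carrier of an X-linked
    recessive variant, given that she has `unaffected_sons` healthy sons.

    Assumptions (illustrative only): each son of a carrier is unaffected
    with probability 1/2; every son of a non-carrier is unaffected
    (new mutations are ignored).
    """
    prior = Fraction(prior)
    # Likelihood of the observed healthy sons if she is a carrier.
    likelihood_carrier = Fraction(1, 2) ** unaffected_sons
    joint_carrier = prior * likelihood_carrier
    # A non-carrier has only unaffected sons under these assumptions.
    joint_noncarrier = (1 - prior) * 1
    return joint_carrier / (joint_carrier + joint_noncarrier)


def son_risk(carrier_probability):
    """Risk that the next son is affected: a carrier mother passes the
    variant to a son with probability 1/2."""
    return Fraction(carrier_probability) / 2


# Example: the daughter of a known carrier (prior 1/2) with two healthy sons.
posterior = carrier_posterior(Fraction(1, 2), unaffected_sons=2)
print(posterior, son_risk(posterior))  # 1/5 1/10
```

Exact fractions are used here so that the result can be checked by hand against a conventional Bayesian risk table.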

 

Published

2023-12-10

How to Cite

Yimwadsana, B. (2023). Risk assessment using machine learning in Duchenne muscular dystrophy. Journal of the Thai Medical Informatics Association, 9(2), 87–91. Retrieved from https://he03.tci-thaijo.org/index.php/jtmi/article/view/1850