Publication details

Využití metod strojového učení a hmotnostní spektrometrie pro klinické aplikace v nádorové biologii

Title in English Using machine learning and mass spectrometry for clinical applications in cancer biology


Year of publication 2023
Type Appeared in Conference without Proceedings
MU Faculty or unit Faculty of Medicine

Description As demand grows for the analysis of biological samples in complex matrices, so does interest in developing and optimizing mass spectrometric (MS) methods. MS analysis of intact cells, plasma samples, and other biological materials helps to monitor and elucidate biological processes in the organism and provides information about its phenotype/genotype. Techniques for studying these biological samples are presented in two topics.
MALDI MS of intact cells is already used in clinical microbiology and diagnostics and, in recent years, has also been introduced into cell biology, immunology, and tumor studies. The first topic focuses on the classification of ovarian cancer cell samples containing different percentages of cells with suppressed expression of the TUSC3 gene. Optimized MS of intact cells was combined with multivariate statistical algorithms and machine learning (ML) methods, such as partial least squares discriminant analysis (PLS-DA), artificial neural networks (ANN), and random forests (RF), to monitor changes in the TUSC3 gene. The mass spectra were analyzed with a script developed in the R programming language, in which all computational models were also built. A data preprocessing methodology that reduced the technical variability of the dataset was described using a dataset of 175 mass spectra. A total of 5 classifiers based on different algorithms were created, optimized, and compared. PLS-DA was identified as the model with the best classification ability, with 100% accuracy (95% confidence interval, CI = 94.7-100%) on the validation data. The same method was also applied in other studies, such as monitoring the differentiation of human embryonic stem cells (hESC) into ELEP. There, the differentiation trajectory was visualized based solely on spectral data, and some phenotypic abnormalities related to the number of passages and the aneuploid state of hESC were also revealed.
The second topic is the development of a method for analyzing human plasma samples using MALDI MS. The aim is to develop a method for distinguishing patients with multiple myeloma (MM) from patients with plasma cell leukemia (PCL) and extramedullary disease (EMD). A two-step protein extraction protocol was developed for sample analysis. With this protocol, the signal intensity across the entire m/z range used increased approximately 50-fold compared to unmodified plasma samples. Classification using ML algorithms (RF, PLS-DA, and ANN) achieved an accuracy of 80-90% on the training dataset and 79-87% on the testing dataset. These findings can help accelerate the integration of MALDI MS into clinical use and improve the diagnosis of these diseases.
Supported by Masaryk University projects no. MUNI/A/1298/2022, MUNI/A/1301/2022, and MUNI/11/ACC/3/2022, the Ministry of Health of the Czech Republic project no. NU21-03-00076, and the Grant Agency of the Czech Republic project no. GA23-06675S.
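The 95% confidence interval reported for the 100% validation accuracy can be illustrated with a short calculation. The sketch below assumes an exact (Clopper-Pearson) binomial interval, which the abstract does not name, and it is written in Python for illustration only, whereas the authors worked in R. The validation set size is not stated in the abstract; `n_validation = 68` is a hypothetical value chosen because it yields the reported lower bound under this assumption. The function name `exact_ci_all_correct` is also illustrative.

```python
def exact_ci_all_correct(n, confidence=0.95):
    """Exact (Clopper-Pearson) interval for accuracy when all n predictions are correct.

    With n successes out of n trials, the lower bound has the closed form
    (alpha / 2) ** (1 / n) and the upper bound is 1.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    alpha = 1.0 - confidence
    lower = (alpha / 2.0) ** (1.0 / n)
    return lower, 1.0


# Hypothetical validation set size (an assumption, not stated in the abstract).
n_validation = 68
lower, upper = exact_ci_all_correct(n_validation)
print(f"95% CI: {lower * 100:.1f}-{upper * 100:.0f}%")
```

The point of the sketch is that perfect accuracy on a finite validation set still leaves a lower bound noticeably below 100%, and that this bound tightens only slowly as the number of validation spectra grows.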
