Integrative Stacking Machine Learning Model for Small Cell Lung Cancer Prediction Using Metabolomics Profiling
Simple Summary: This study investigates lung cancer detection by combining metabolomics with machine learning to identify small cell lung cancer (SCLC). We analyzed 461 serum samples from publicly available data and built a stacking-based ensemble model that distinguishes among SCLC, non-small cell lung cancer (NSCLC), and healthy controls. The model achieves 85.03% accuracy in multi-class classification and 88.19% accuracy in binary classification (SCLC vs. NSCLC). The approach relies on feature selection to identify significant metabolites, particularly positive ions. SHAP analysis identifies key predictors such as benzoic acid, DL-lactate, and L-arginine, offering insight into cancer metabolism. This non-invasive approach is a promising alternative to traditional diagnostic methods and could support earlier lung cancer detection. By combining metabolomics and machine learning, the study points toward faster, more accurate, and more patient-friendly cancer diagnostics, which may in turn improve treatment outcomes and survival rates.
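To illustrate the general idea of a stacking ensemble, the following is a minimal sketch and not the study's actual pipeline. It uses only the Python standard library and rests on several assumptions: the data are synthetic two-feature points standing in for metabolite intensities, the class labels 0, 1, and 2 are hypothetical stand-ins for control, NSCLC, and SCLC, and the base learners (`NearestCentroid`, `KNearest`) and the lookup-table meta-learner are illustrative choices that the summary does not specify.

```python
import random
from collections import Counter, defaultdict


def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class NearestCentroid:
    """Base learner 1: assign each sample to the class with the closest mean."""

    def fit(self, X, y):
        groups = defaultdict(list)
        for row, label in zip(X, y):
            groups[label].append(row)
        self.centroids = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def predict(self, X):
        return [min(self.centroids, key=lambda c: dist2(row, self.centroids[c]))
                for row in X]


class KNearest:
    """Base learner 2: majority vote among the k nearest training samples."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, X):
        out = []
        for row in X:
            order = sorted(range(len(self.X)), key=lambda i: dist2(row, self.X[i]))
            out.append(Counter(self.y[i] for i in order[: self.k]).most_common(1)[0][0])
        return out


def out_of_fold(factory, X, y, n_folds):
    """Predict every training sample with a model that never saw it."""
    oof = [None] * len(X)
    for fold in range(n_folds):
        held = [i for i in range(len(X)) if i % n_folds == fold]
        rest = [i for i in range(len(X)) if i % n_folds != fold]
        model = factory().fit([X[i] for i in rest], [y[i] for i in rest])
        for i, p in zip(held, model.predict([X[i] for i in held])):
            oof[i] = p
    return oof


class StackedClassifier:
    """Meta-learner: maps each tuple of base predictions to its most common true label."""

    def __init__(self, factories, n_folds=5):
        self.factories, self.n_folds = factories, n_folds

    def fit(self, X, y):
        columns = [out_of_fold(f, X, y, self.n_folds) for f in self.factories]
        votes = defaultdict(Counter)
        for base_preds, label in zip(zip(*columns), y):
            votes[base_preds][label] += 1
        self.meta = {k: c.most_common(1)[0][0] for k, c in votes.items()}
        # Refit the base learners on all training data for use at prediction time.
        self.models = [f().fit(X, y) for f in self.factories]
        return self

    def predict(self, X):
        columns = [m.predict(X) for m in self.models]
        # Unseen combinations fall back to the first base learner's prediction.
        return [self.meta.get(bp, bp[0]) for bp in zip(*columns)]


def make_data(n_per_class, rng):
    """Synthetic, well-separated classes (assumption: not real metabolomics data)."""
    centers = {0: (0.0, 0.0), 1: (5.0, 5.0), 2: (10.0, 0.0)}
    data = [([rng.gauss(m, 1.0) for m in centers[c]], c)
            for c in centers for _ in range(n_per_class)]
    rng.shuffle(data)
    return [row for row, _ in data], [label for _, label in data]


rng = random.Random(0)
X_train, y_train = make_data(40, rng)
X_test, y_test = make_data(20, rng)

model = StackedClassifier([NearestCentroid, lambda: KNearest(k=3)]).fit(X_train, y_train)
pred = model.predict(X_test)
accuracy = sum(p == t for p, t in zip(pred, y_test)) / len(y_test)
print(f"test accuracy: {accuracy:.2f}")
```

The meta-learner is trained on out-of-fold predictions rather than on predictions the base learners make for their own training samples. This keeps the second level from learning from labels that have leaked through the first level.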