Bayesian-Optimized BiLSTM with Tensor-Reshaped Structured Data for Robust Breast Cancer Risk Prediction
Deep learning has become increasingly important in clinical medicine, including breast cancer detection, and offers significant potential to improve patient outcomes. However, recurrent architectures such as LSTM (Long Short-Term Memory) and BiLSTM (Bidirectional Long Short-Term Memory) remain underutilized for breast cancer prediction from structured tabular data, primarily because such data lack explicit temporal dependencies and are therefore considered unsuitable for sequence-based modeling. This work presents a novel approach that redefines how LSTM architectures can be applied to the publicly available, non-sequential Wisconsin Diagnostic Breast Cancer (WDBC) dataset, which consists of 569 samples and 30 features. To adapt the data for sequence modeling, the flat tabular input is reshaped into a fixed-length 3D tensor using a sliding-window approach. This transformation lets the model apply LSTM's sequential processing to capture implicit feature interactions across structured attributes without any temporal context. Furthermore, Bayesian hyperparameter optimization is applied to enhance the model's performance. The proposed model is evaluated against a standard LSTM baseline and state-of-the-art tabular Transformer architectures (FT-Transformer and SAINT). Results show that BiLSTM achieves the best overall performance (AUC 0.9985, accuracy 0.9824, RMSE 0.0964), while the LSTM baseline also surpasses both Transformer-based tabular models (AUC 0.9958, accuracy 0.9719). Performance gains are consistent across seven evaluation metrics, with statistical significance confirmed via paired t-tests ($p < 0.05$). These findings demonstrate that, when appropriately adapted, recurrent architectures can outperform even advanced self-attention models in structured clinical prediction tasks.
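The sliding-window reshaping described above can be illustrated with a minimal NumPy sketch. The abstract does not state the window length or stride, so `window=5` and `stride=1` are assumptions chosen purely for illustration, and the function name `sliding_window_reshape` is hypothetical. Random values stand in for the 569 x 30 WDBC feature matrix.

```python
import numpy as np

def sliding_window_reshape(X, window=5, stride=1):
    """Reshape (n_samples, n_features) into (n_samples, n_windows, window).

    Each "time step" is a contiguous block of `window` features, so a
    recurrent layer can scan across overlapping feature groups.
    `window` and `stride` are illustrative assumptions, not values
    taken from the paper.
    """
    X = np.asarray(X, dtype=np.float32)
    n_samples, n_features = X.shape
    if window > n_features:
        raise ValueError("window must not exceed the number of features")
    # Start index of every window along the feature axis.
    starts = np.arange(0, n_features - window + 1, stride)
    # Index matrix of shape (n_windows, window): row i covers features
    # starts[i] .. starts[i] + window - 1.
    idx = starts[:, None] + np.arange(window)[None, :]
    # Fancy indexing on the feature axis yields the 3D tensor.
    return X[:, idx]

# Placeholder data with the WDBC dimensions (569 samples, 30 features).
X = np.random.default_rng(0).normal(size=(569, 30))
X_seq = sliding_window_reshape(X, window=5, stride=1)
print(X_seq.shape)  # (569, 26, 5)
```

With these assumed settings, each sample becomes a sequence of 26 steps with 5 features per step, which is the `(batch, timesteps, features)` layout that recurrent layers in common deep learning frameworks expect. A larger stride would reduce the overlap between neighbouring windows and shorten the sequence.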