The Information Dilution Paradox in Low-Dimensional Thermodynamic Manifolds

Zulman Arif

doi:10.31224/7499

Authors

Zulman Arif, Independent Researcher

DOI: https://doi.org/10.31224/7499

Abstract

Predicting the net hourly electrical energy output (PE) of Combined Cycle Power Plants (CCPP) is fundamental to grid dispatch optimization, a task typically addressed with tree-based ensembles. Recently, hybrid architectures incorporating Transformer-based self-attention have been proposed to enhance tabular regression by capturing latent feature interactions. This study empirically evaluates a Residual Hybrid Transformer-GBDT architecture against a standalone LightGBM baseline on the UCI CCPP dataset. Contrary to the prevailing hypothesis that latent embedding expansion improves predictive accuracy, our results reveal a performance degradation: Root Mean Squared Error (RMSE) increases from 3.2777 (Baseline) to 3.2853 (Hybrid). Detailed error segmentation identifies a significant performance collapse in low-load operational regimes (< 430 MW), where the hybrid's Mean Absolute Error (MAE) is 36.5% higher than the baseline's. We characterize this phenomenon as the 'Information Dilution Paradox', wherein mapping a low-dimensional physical manifold (d=4) into a high-dimensional latent space (d=16) introduces stochastic noise and feature redundancy rather than discriminative signal. These findings provide a critical counter-narrative to the adoption of high-capacity deep learning for low-cardinality tabular data, suggesting that raw physical feature representations remain superior for thermodynamic manifolds governed by strong intrinsic correlations.
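The error segmentation described in the abstract can be illustrated with a minimal sketch. It is not the author's evaluation code: the helper names (`rmse`, `mae_by_regime`, `relative_change_pct`) are hypothetical, the toy arrays are made-up illustrative values rather than CCPP data, and splitting on the actual (not predicted) output at 430 MW is an assumption about how the low-load regime is defined. Only the two RMSE figures are taken from the abstract.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Squared Error over all samples."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae_by_regime(y_true, y_pred, threshold=430.0):
    """Mean Absolute Error split into low-load and normal-load samples.

    Assumption: a sample is 'low load' when its actual output is below
    `threshold` (in MW); the abstract gives only the 430 MW cut-off.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    abs_err = np.abs(y_true - y_pred)
    low = y_true < threshold
    return {
        "low_load": float(abs_err[low].mean()) if low.any() else float("nan"),
        "normal_load": float(abs_err[~low].mean()) if (~low).any() else float("nan"),
    }

def relative_change_pct(baseline, candidate):
    """Percentage change of `candidate` relative to `baseline`."""
    return 100.0 * (candidate - baseline) / baseline

# Overall RMSE change implied by the two figures reported in the abstract.
rmse_change = relative_change_pct(3.2777, 3.2853)

# Toy values for illustration only (not drawn from the UCI CCPP dataset).
toy_true = [420.0, 425.0, 450.0, 480.0]
toy_pred = [422.0, 424.0, 451.0, 478.0]
toy_rmse = rmse(toy_true, toy_pred)
toy_segments = mae_by_regime(toy_true, toy_pred)
```

Applied to the reported figures, `relative_change_pct` shows that the overall RMSE gap is a fraction of a percent, which is why a per-regime breakdown such as `mae_by_regime` is needed to expose the much larger low-load gap the abstract reports.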
