A hybrid feature selection framework: Balancing information preservation and multicollinearity in biological datasets
Date
2025-11-30
Type
Conference Contribution - published
Collections
Fields of Research
Abstract
This study introduces a hybrid feature selection framework that combines mutual information (MI) analysis with iterative variance inflation factor (VIF) filtering to balance two competing demands in biological datasets: predictive performance and multicollinearity management. Tested on Solanum callus induction data (1,081 observations, 16 features), the hybrid approach achieved R² = 0.86, outperforming VIF filtering alone by 16% and mutual information methods alone by 8%. The framework's key advantage is that it preserves biologically relevant features with strong predictive value (such as relative humidity, MI = 0.66) that VIF filtering alone would eliminate because of multicollinearity, while still maintaining statistical independence among the retained features. This may help resolve a recurring tension in biological machine learning, where both predictive accuracy and interpretability are essential.
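The abstract does not spell out how the MI and VIF steps are combined, so the following is only a minimal sketch of one plausible reading: iterative VIF filtering in which features with high mutual information with the target are protected from removal. The function names, the VIF limit of 10, the MI cutoff of 0.5, and the histogram-based MI estimator are all assumptions made for illustration, and the data are synthetic rather than the Solanum dataset.

```python
import numpy as np

def binned_mi(x, y, bins=10):
    """Histogram estimate of mutual information (in nats) between two 1-D arrays.
    Assumption: the paper's MI estimator is not stated; this is a simple stand-in."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def vif_scores(X):
    """VIF of each column, read off the diagonal of the inverse correlation matrix.
    Requires at least two columns that are not exactly collinear."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

def hybrid_select(X, y, names, vif_max=10.0, mi_keep=0.5):
    """Hypothetical hybrid rule: repeatedly drop the highest-VIF feature above
    vif_max, but never drop a feature whose MI with y is at least mi_keep."""
    mi = {n: binned_mi(X[:, j], y) for j, n in enumerate(names)}
    kept = list(range(X.shape[1]))
    while len(kept) > 1:
        vif = vif_scores(X[:, kept])
        candidates = [(v, j) for v, j in zip(vif, kept)
                      if v > vif_max and mi[names[j]] < mi_keep]
        if not candidates:
            break
        kept.remove(max(candidates)[1])   # remove the worst unprotected feature
    return [names[j] for j in kept], mi

# Synthetic demo: x3 is nearly a linear combination of x1 and x2, and it is
# also the feature that drives the target.
rng = np.random.default_rng(0)
n = 1000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + x2 + 0.05 * rng.normal(size=n)
y = x3 + 0.1 * rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
names = ["x1", "x2", "x3"]

selected, mi = hybrid_select(X, y, names)
# Setting mi_keep to infinity protects nothing, i.e. plain iterative VIF filtering.
plain, _ = hybrid_select(X, y, names, mi_keep=float("inf"))
print("hybrid:", selected)
print("VIF only:", plain)
```

In this toy setup, plain VIF filtering removes the most predictive feature because it has the largest VIF, whereas the MI-protected variant keeps it and removes one of its collinear partners instead. This mirrors the behaviour the abstract describes for relative humidity, though the actual framework may combine the two criteria differently.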
Permalink
Source DOI
Rights
© The Authors
Creative Commons Rights
Attribution-NoDerivatives