Publication

Computational biology in plant tissue culture: Systematic evaluation of machine learning approaches for callus induction in Solanum L.: A thesis submitted in partial fulfilment of the requirements for the Degree of Doctor of Philosophy at Lincoln University

Date
2026
Type
Thesis
Abstract
Optimization of plant tissue culture systems remains a major biotechnological challenge, yet successful callus induction, the formation of unorganized, proliferating cell masses from differentiated tissues, enables transformative applications in agriculture, medicine, and fundamental research. Callus serves as a foundation for genetic transformation and genome editing, mass propagation of elite varieties, production of pharmaceutically valuable secondary metabolites, and breeding programs. However, achieving consistent callus induction is hindered by the multifactorial nature of the process, which is governed by intricate dependencies among environmental conditions, genetic background, hormonal composition, and explant source. Conventional trial-and-error methodologies are further constrained by the difficulty of systematically capturing and optimizing these complex interactions, resulting in protocols that transfer poorly and require extensive empirical refinement for each new species or genotype. This study established comprehensive machine learning (ML) frameworks to model and predict callus induction rates in the Solanum genus, one of the most diverse angiosperm genera, containing approximately 1,500–2,000 species. This genus includes economically significant crop species such as potato, tomato, and eggplant that sustain global food security, as well as wild relatives possessing valuable traits for disease resistance and abiotic stress tolerance. The genus also includes species of pharmaceutical interest, owing to their diverse alkaloids and secondary metabolites, which exhibit antimicrobial, antioxidant, and anticancer activities. Despite their agricultural, nutritional, and medicinal importance, many Solanum species remain recalcitrant in tissue culture, as standard protocols developed for model cultivars often perform poorly when applied to wild species and new genotypes.
The ML methods developed here systematically evaluate modeling strategies ranging from global to local approaches, with the goal of addressing the protocol optimization bottleneck that constrains biotechnological applications across this economically and scientifically important genus. Five machine learning algorithms were evaluated: Random Forest and XGBoost as tree-based ensemble models, Support Vector Regression and Gaussian Process Regression as kernel-based methods, and an artificial neural network (multilayer perceptron) with genetic algorithm optimization for hyperparameter tuning and protocol development. Model interpretability was achieved through SHAP analysis and permutation importance assessment. Genetic algorithm optimization of neural networks identified optimal protocols achieving high callus induction rates, representing substantial improvements over conventional approaches. Feature optimization revealed specific strategies ranging from minimal hormone supplementation to comprehensive growth regulator networks. Sensitivity analysis indicated that precise control of environmental conditions is the primary determinant of callus induction, with hormonal supplementation acting as a fine-tuning parameter rather than a primary driver. The thesis advances both methodology and biological understanding. It establishes a rigorous methodological framework integrating systematic preprocessing, cross-validation with independent holdout data, and comprehensive interpretability analysis. The approach delivers predictive models with quantified uncertainty and clear performance boundaries. Biologically, the work demonstrates that machine learning can successfully capture complex tissue culture interactions. These findings represent a paradigm shift from empirical protocol development to data-driven optimization, with applications across plant biotechnology, crop improvement, pharmaceutical compound production, and conservation.
The work may also motivate future steps that include transfer learning across species, imaging-based phenotyping with deep learning, classification of callus quality tiers, and time-resolved modeling of induction dynamics.
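
As an illustration of the genetic algorithm step mentioned in the abstract, the following is a minimal sketch of how a protocol search against a predictive model might look. It is not the thesis's implementation: the variable names (`auxin_mg_l`, `cytokinin_mg_l`, `temperature_c`), their bounds, the GA settings, and the function `surrogate_induction_rate` are all hypothetical, and the surrogate is a made-up smooth function standing in for a trained model's prediction.

```python
import random

# Hypothetical protocol variables and bounds (illustrative only, not from the thesis).
BOUNDS = {
    "auxin_mg_l": (0.0, 5.0),
    "cytokinin_mg_l": (0.0, 5.0),
    "temperature_c": (18.0, 30.0),
}
KEYS = list(BOUNDS)


def surrogate_induction_rate(protocol):
    """Stand-in for a trained model's predicted induction rate (percent).

    This is an invented function with an arbitrary peak; in practice it
    would be replaced by a call to the fitted model.
    """
    a = protocol["auxin_mg_l"]
    c = protocol["cytokinin_mg_l"]
    t = protocol["temperature_c"]
    score = 100.0 - 4.0 * (a - 2.0) ** 2 - 6.0 * (c - 0.5) ** 2 - 1.5 * (t - 25.0) ** 2
    return max(0.0, score)


def random_protocol(rng):
    # Sample each variable uniformly within its bounds.
    return {k: rng.uniform(*BOUNDS[k]) for k in KEYS}


def crossover(p1, p2, rng):
    # Uniform crossover: each variable is inherited from one parent at random.
    return {k: (p1[k] if rng.random() < 0.5 else p2[k]) for k in KEYS}


def mutate(protocol, rng, rate=0.2, scale=0.1):
    # Gaussian mutation, clipped to the allowed range of each variable.
    child = dict(protocol)
    for k in KEYS:
        if rng.random() < rate:
            lo, hi = BOUNDS[k]
            child[k] = min(hi, max(lo, child[k] + rng.gauss(0.0, scale * (hi - lo))))
    return child


def tournament(population, rng, size=3):
    # Pick the best of a small random subset of the population.
    return max(rng.sample(population, size), key=surrogate_induction_rate)


def optimize(generations=40, pop_size=30, seed=0):
    rng = random.Random(seed)
    population = [random_protocol(rng) for _ in range(pop_size)]
    history = []  # best predicted rate per generation
    for _ in range(generations):
        best = max(population, key=surrogate_induction_rate)
        history.append(surrogate_induction_rate(best))
        # Elitism: carry the current best protocol over unchanged.
        next_gen = [best]
        while len(next_gen) < pop_size:
            child = crossover(tournament(population, rng), tournament(population, rng), rng)
            next_gen.append(mutate(child, rng))
        population = next_gen
    best = max(population, key=surrogate_induction_rate)
    history.append(surrogate_induction_rate(best))
    return best, history
```

Calling `best, history = optimize()` returns the best protocol found and the best predicted rate per generation. Because the best individual is carried over unchanged, `history` never decreases. A protocol found this way is only as reliable as the model behind it, so it would still need confirmation in culture.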
Rights
https://researcharchive.lincoln.ac.nz/pages/rights