Microarray gene expression: towards integration and between-platform association of Affymetrix and cDNA arrays
Abstract
Microarray technology offers an unprecedented view into the biology of DNA. Information science is shaping this revolution in gene expression profiling, turning it into a technologically advanced and continually renewing branch of science while also helping to streamline the processes involved.
As the technology has advanced and grown in popularity, gene expression can now be measured on any of several microarray platforms, including arrays from commercial vendors such as Affymetrix (Santa Clara, CA, USA) and Agilent (Palo Alto, CA, USA), as well as proprietary arrays produced by individual laboratories. The technology is expanding rapidly, providing an extensive and promising source of data for addressing complex questions about biological processes. The ever-increasing number of publicly available gene expression studies of humans and other organisms provides strong motivation to carry out cross-study analyses. Integrating multiple studies based on the same technological platform, or combining data from different array platforms, has the potential to yield higher accuracy, greater consistency and more robust information mining. The integrated result often allows a broader and more complete picture to be constructed.
Various comparison studies have been published over the years, and their overall verdict on the accuracy, reliability and reproducibility of microarray investigations can be summarized as cautious optimism. Amid the ongoing search for remedies to the problems of microarray data integration, this project attempts cross-platform integration of data from childhood leukaemia patients assayed on two microarray platforms, Affymetrix and cDNA. Taking into account the nature of the data produced by the two platforms, a new ratio-transformation method is proposed and applied to the cancer data. The results indicate that the method can address the incomparability of the expression measures of the Affymetrix and cDNA platforms. The method is then tested against two established approaches and is found to produce comparable results.
The encouraging cross-platform outcome motivates a closer examination of the association between the two platforms. To this end, a wide range of statistical and machine learning approaches is applied to the microarray data. Specifically, the data are modelled using regression models (linear, cubic-polynomial, loess, bootstrap aggregating) and artificial neural networks (self-organizing maps and feedforward networks). The relationship between the data from the two platforms is found to be nonlinear, and it is captured more precisely by a feedforward network than by the other methods tested....
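The idea of comparing a linear model with a nonlinear one on paired measurements can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic rather than the leukaemia data, the variable names and the helper `fit_rmse` are hypothetical, and only the linear and cubic-polynomial fits are shown, not the loess, bagging or neural network models.

```python
import numpy as np


def fit_rmse(x, y, degree):
    """Fit a least-squares polynomial of the given degree; return its in-sample RMSE."""
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    return float(np.sqrt(np.mean(residuals ** 2)))


# Synthetic stand-ins for paired expression measures of the same genes
# on two platforms (NOT real Affymetrix or cDNA data).
rng = np.random.default_rng(0)
platform_a = rng.uniform(-2.0, 2.0, size=200)
platform_b = 0.5 * platform_a + 0.3 * platform_a ** 3 + rng.normal(0.0, 0.1, size=200)

rmse_linear = fit_rmse(platform_a, platform_b, degree=1)
rmse_cubic = fit_rmse(platform_a, platform_b, degree=3)

print(f"linear RMSE: {rmse_linear:.3f}")
print(f"cubic  RMSE: {rmse_cubic:.3f}")
```

Because the linear model is nested inside the cubic one, the in-sample error of the cubic fit can never exceed that of the linear fit. A fair comparison between model families would therefore use held-out data or cross-validation rather than the training error shown here.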