Biol Direct. 2007 Nov 7;2:28.
doi: 10.1186/1745-6150-2-28.

Revisiting adverse effects of cross-hybridization in Affymetrix gene expression data: do they matter for correlation analysis?

Lev Klebanov et al. Biol Direct. 2007.

Abstract

Background: This work was undertaken in response to a recently published paper by Okoniewski and Miller (BMC Bioinformatics 2006, 7: Article 276). The authors of that paper concluded that multiple targeting in short oligonucleotide microarrays induces spurious correlations and that this effect may degrade inference on correlation coefficients. The design of their study and supporting simulations cast serious doubt upon the validity of this conclusion. The work by Okoniewski and Miller drove us to revisit the issue by means of experimentation with biological data and probabilistic modeling of cross-hybridization effects.

Results: We have identified two serious flaws in the study by Okoniewski and Miller: (1) the data used in their paper are not amenable to correlation analysis; (2) the proposed simulation model is inadequate for studying the effects of cross-hybridization. Using two other data sets, we have shown that removing multiply targeted probe sets does not shift the histogram of sample correlation coefficients towards smaller values. A more realistic approach to mathematical modeling of cross-hybridization demonstrates that this process is far more complex than the simplistic model considered by the authors. A diversity of correlation effects (such as the induction of positive or negative correlations) caused by cross-hybridization can be expected in theory, but the ability to provide quantitative insights into such effects is naturally limited because they are not directly observable.

Conclusion: The proposed stochastic model is instrumental in studying general regularities in hybridization interaction between probe sets in microarray data. As the problem stands now, there is no compelling reason to believe that multiple targeting causes a large-scale effect on the correlation structure of Affymetrix gene expression data. Our analysis suggests that the observed long-range correlations in microarray data are of a biological nature rather than a technological flaw.
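
The histogram comparison described in the Results can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the expression matrix is synthetic random noise, the `is_multiply_targeted` mask is a hypothetical annotation, and the helper names `pairwise_correlations` and `correlation_histogram` are invented for this example.

```python
import numpy as np


def pairwise_correlations(expr):
    """Pearson correlations for all distinct pairs of rows.

    expr: 2-D array, rows = probe sets, columns = arrays (samples).
    """
    corr = np.corrcoef(expr)               # rows are treated as variables
    iu = np.triu_indices_from(corr, k=1)   # strictly upper triangle: each pair once
    return corr[iu]


def correlation_histogram(expr, bins=20):
    """Histogram of pairwise correlation coefficients over [-1, 1]."""
    r = pairwise_correlations(expr)
    counts, edges = np.histogram(r, bins=bins, range=(-1.0, 1.0))
    return counts, edges


# Synthetic placeholder data (assumption): 40 probe sets measured on 50 arrays.
rng = np.random.default_rng(0)
expr = rng.normal(size=(40, 50))

# Hypothetical annotation: the first 10 probe sets are flagged as multiply targeted.
is_multiply_targeted = np.zeros(40, dtype=bool)
is_multiply_targeted[:10] = True

good = pairwise_correlations(expr[~is_multiply_targeted])
bad = pairwise_correlations(expr[is_multiply_targeted])
good_counts, edges = correlation_histogram(expr[~is_multiply_targeted])
bad_counts, _ = correlation_histogram(expr[is_multiply_targeted])

# Compare the location of the two distributions of correlation coefficients.
print("median r (singly targeted):  ", np.median(good))
print("median r (multiply targeted):", np.median(bad))
```

With real data, `expr` would hold normalized expression levels and the mask would come from a probe-set annotation. A shift between the two histograms would then be assessed on the full distributions rather than on the medians alone.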

Figures

Figure 1. Histogram of correlation coefficients for pairs of good (A) and bad (B) probe sets. Data Set 1.
Figure 2. Histogram of correlation coefficients for pairs of good (A) and bad (B) probe sets. Data Set 2.
Figure 3. The behavior of Corr(Z1, Z2) as a function of p for different values of the parameter k. This figure was provided by Dr. Gaile in his review.
Figure 4. Variation coefficients for gene expression levels in the TELL data set.
Figure 5. Variation coefficients for expression levels of miRNAs in SKBr3 breast cancer cells.

References

    1. Okoniewski MJ, Miller CJ. Hybridization interactions between probesets in short oligo microarrays lead to spurious correlations. BMC Bioinformatics. 2006;7:276.
    2. Klebanov L, Yakovlev A. How high is the level of technical noise in microarray data? Biol Direct. 2007;2:9. doi: 10.1186/1745-6150-2-9.
    3. Shi L, Reid LH, Jones WD, Shippy R, Warrington JA, Baker SC, Collins PJ, de Longueville F, Kawasaki ES, Lee KY, Lou Y, Sun YA, Willey JC, Setterquist RA, Fischer GM, Tong W, Dragan YP, Dix DJ, Frueh FW, Goodsaid FM, Herman D, Jensen RV, Johnson CD, Lobenhofer EK, Puri RK, Sherf U, Thierry-Mieg J, Wang C, Wilson M, Wolber PK. The microarray quality control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements. Nat Biotechnol. 2006;24:1151–1161. doi: 10.1038/nbt1239.
    4. Gaile DP, Hutson A, Java JJ, McQuaid D, Conroy JR, Nowak NJ. Errors in centering of array data can induce biases in correlation estimates. Journal of Statistical Planning and Inference S N Roy Centennial Volume. 2006;137:3208–3212.
    5. Qiu X, Brooks AI, Klebanov L, Yakovlev A. The effects of normalization on the correlation structure of microarray data. BMC Bioinformatics. 2005;6:120.
