ORIGINAL RESEARCH

The statistical distribution of the intensity of pixels within spots of DNA microarrays: what is the appropriate single-value representative?

Javier Nuñez-Garcia,1 Vassilios Mersinias,2 Kwang-Hyun Cho,3 Colin P Smith,2 Olaf Wolkenhauer4

1 Veterinary Laboratories Agency, Addlestone, Surrey, UK; 2 School of Biomedical and Life Sciences, University of Surrey, Guildford, Surrey, UK; 3 School of Electrical Engineering, University of Ulsan, Ulsan, Korea; 4 Department of Computer Science, University of Rostock, Rostock, Germany

Abstract: This paper opens a discussion about an important issue in the analysis of data from spotted DNA microarrays: how to summarise into a single value the distribution for the intensity values of the pixels within a spot. Although the most popular statistic used is the median, there is no clear study demonstrating why it is more appropriate than other measures of central tendency such as the mean or the mode. Here, we argue that the median intensity is not the most appropriate measure for many common cases and discuss a frequently encountered case of a ‘doughnut’-shaped spot for which the mode is closest to the ‘expected’ spot intensity. For an ‘ideal’ spot with a clear boundary and uniformly hybridised, the intensity of its pixels should approximately be normally distributed. In practical situations, these two requirements are often not met due to the physical properties of pins and the particularities of the printing and hybridisation processes. As a consequence, the distribution of the intensity of the pixels is usually negatively skewed. This asymmetry results in a larger displacement for the mean and median than for the mode from the ideal situation mentioned above.

Keywords: microarrays, spot finding, intensity distribution, mean, median, mode

Introduction

An ideal spot on a DNA microarray would have the same amount of hybridised genetic material in each of the pixels within a well defined spot boundary.
Under these conditions the intensity of any pixels inside the spot would be highly similar. If a normally distributed measurement error is assumed (eg by the scanner), the distribution of the intensities of the pixels within a spot would approximately follow a normal distribution. In this ideal case, the mean, median and mode would have almost identical values and they would provide a close approximation of the ‘real’ intensity of the spot.

Due to technical impediments, the above conditions are far from being a reality in many DNA microarrays. In addition to the error introduced by the scanner, there are other significant factors that lower the intensity of the pixels within a spot from the ‘real’ or ‘expected’ spot intensity. Examples are the ‘doughnut’ effect, detailed in Tran et al (2002), and a nonuniform spread of genetic material within a spot during the printing process. This leads to different levels of hybridisation and, thus, regions with different intensities within a spot. These factors result in a negatively skewed distribution of the intensities of pixels within the spot.

Applied Bioinformatics 2003:2(4) 229–239 © 2003 Open Mind Journals Limited. All rights reserved.

In Figure 1, the histograms of the differences between the median (me), mean (ma) and mode (mo) in both channels (signal, s; reference, r) are shown (see Figures 1a, 1c and 1e and Figures 1b, 1d and 1f, respectively) for a microarray example (see Appendix 2). Note that the mode is larger than the median and this is larger than the mean for most of the spots. This shows the negative asymmetry of the distributions of the intensities of pixels within the spots. It is well known that for asymmetric probability distributions there is not a clear single-value candidate to summarise the central tendency in such a distribution. The choice of such a measure depends on the context of the problem.
For example, in Hyndman (1995) and Polonik (1995) the authors use nonsymmetric intervals around the mode to summarise density functions. We found that the asymmetry in the distribution of the intensities of the pixels within a spot is significant enough to consider the choice of this representative value as a crucial step in the process to interpret array data.

Figure 1 Distribution of the difference between the median (me), mean (ma) and mode (mo) in the signal (s) and reference (r) channels; histograms (a), (c) and (e) and histograms (b), (d) and (f), respectively.

Correspondence: Olaf Wolkenhauer, Department of Computer Science, University of Rostock, Albert Einstein Str 21, 18059 Rostock, Germany; tel +49 381 498 3335; fax +49 381 498 3336; email wolkenhauer@informatik.uni-rostock.de

Since the ‘real’ value of the intensity of a spot is unknown, it is difficult to prove which approach is best. In principle, all values in the scale of intensities are candidates to represent the spot intensity (see Note 1). However, the set of adequate candidates reduces to a few statistics. In this set are included the mean value, the median and the mode, which are most commonly provided by spot-finding software programs such as ImaGene™ (BioDiscovery, http://www.biodiscovery.com/imagene.asp) or GenePix® (Axon Instruments, http://www.axon.com/gn_GenePixSoftware.html) (see Note 2).

The mean value is sensitive to outliers, especially when the sample size is small and the sample values are high. Outliers are produced, for example, in the spotting or scanning stages. Within spots, pixel outliers frequently occur. For very bright spots, outliers lower the mean value considerably. The occurrence of outliers provides a good reason to discard the use of the mean value.
Although the median has become the standard statistic to represent the intensity of a set of pixels that form a spot, to the best of our knowledge there is no detailed study showing that it is the most appropriate choice. Some supervised steps in the process of scanning and quantifying an image are based on ‘eye’ examination; for example, adjusting the gain of the scanner, fitting a grid to the spots or flagging anomalous spots. In what follows, we show a simple example where the mode is favoured by ‘eye’ examination.

In Figure 2, we show the original image of a subgrid (Figure 2a) and the reconstructed images using the mean (Figure 2b), the median (Figure 2c) and the mode (Figure 2d). The image reconstructed from the mode is the most similar to the original one. Note that the human eye cannot distinguish small variations of intensities (ie about ±5000 in the scale of 0–65 535). This means that Figure 2 reveals significant differences between the choice of statistics and the evaluation of image properties by eye examination. Since there does not exist any methodology to prove which statistic is the most adequate as representative of the possible values that a variable can take according to a probability distribution, we considered this example a reasonable motivation to investigate the distribution of the intensity within spots.

Figure 2 Original image of our example microarray (a) and the reconstructed images using the mean (b), the median (c) and the mode (d).

In the following section we investigate the variability of the distribution of the intensities of the pixels within a spot depending on different artifacts, such as the doughnut effect or the choice of the boundaries of the spots.
This is followed by a discussion of whether the asymmetries of the distributions of the intensities of the pixels within the spots for both channels influence the corresponding gene expression profile, expressed as the log2 of the ratio of both channels’ intensities. Throughout this paper, we use examples of spots from a microarray that is described in Appendix 2.

Distribution of the intensities of pixels within a spot

As mentioned above, the ideal spot would have the same amount of hybridised genetic material for each of the pixels within a well defined spot boundary. This is difficult to achieve with the technology used in most laboratories. In Tran et al (2002), the authors detail this point and also the doughnut effect. They point out that the cause of this effect is the result of some type of crystallisation complex that prevents penetration of the labelled target into the centre of the spot, leading to an unhybridised area.

The intensity of the pixels of an ideal spot would follow a normal distribution with a standard deviation depending on the accuracy of the measurement apparatus. In this case, the mean, median and mode will lead to the same value. However, this is only a theoretical situation. Depending on where the spot-finding software places the boundaries of the spot, we find different distributions for the intensity of its pixels and for the intensity of pixels in the background. Figure 3 and Figure 4 illustrate this point. The regions considered as a spot are the pixels inside the circle. In Figure 3, we show how ImaGene detected, in two different runs with different positions of the initial grid (squares), two different boundaries (circles) for the same spot. In Table 1, the values for some statistics corresponding to both images are provided.

Figure 3 Two different discerned spots (circles) from two different initial positions of the grid (squares).

Table 1 Different statistics of the distribution of the spots in Figure 3

Mean      Median    Mode
10 746    10 816    12 557
11 109    11 197    12 557

In Figures 4a and 4b, the distribution of the intensities within the spot is almost normal (see Note 3). The less steep slope in the left tail of Figure 4b (thick curve) is due to the low-intensity pixels in the centre of the spot, caused by the doughnut effect. We can also see the background intensity distribution (Figure 4b, thin curve). In Figures 4c and 4d, some pixels have been added to the spot and the asymmetry becomes more obvious due to the lower intensity of the pixels included in the circle. Consequently, the distribution of the background is becoming ‘more’ normal. In Figures 4e and 4f, this is even more obvious; two local modes appear in the density function (the 2 highest peaks in the thick curve in Figure 4f). The mode on the right represents the lower intensity pixels due to the doughnut effect plus some pixels with the same intensity at the boundary of the spot. The mode on the left represents a part of the background pixels that are also included in the circle of the spot diameter.

Figure 4 Different distributions ((b), (d) and (f)) of a spot intensity (thick curve) and the background intensity (thin curve), depending on the diameter of the corresponding discerned spot (the circle in (a), (c) and (e)). In the histograms, the vertical lines correspond to the mean (dashed line), median (solid line) and mode (dotted line) for each curve.

Table 2 Parameters for the distributions of the spots in Figure 4

Spot diameter (pixels)         10          12          14
Area of signal (pixels)        81          113         149
Signal mean                    18 716.3    16 764.4    13 846.3
Signal median                  19 969      17 269      14 686
Signal mode                    20 970.5    20 047.4    19 953.8
Area of background (pixels)    360         328         292
Background mean                2246.07     1311.69     895.582
Background median              844.5       757         701
Background mode                891.694     882.046     819.154

Table 4 Background subtraction for the parameters given in Table 2

Spot diameter (pixels)                10          12          14
Signal mean − background mean         16 470.3    15 452.7    12 950.7
Signal median − background median     19 124.5    16 512      13 985
Signal mode − background mode         20 078.8    19 165.4    19 134.6

Figure 5 For a low-intensity spot, the different distributions ((b), (d) and (f)) of spot intensity (thick curve) and the background intensity (thin curve), depending on the diameter of the corresponding discerned spot (the circle in (a), (c) and (e)). In the histograms, the vertical lines correspond to the mean (dashed line), median (solid line) and mode (dotted line) for each curve.

Table 3 Parameters for the distributions of the spots in Figure 5

Spot diameter (pixels)         10         12         14
Area of signal (pixels)        81         113        149
Signal mean                    4211.49    4024.66    3544.72
Signal median                  4221       4167       3654
Signal mode                    4252.45    4245.58    4181.92
Area of background (pixels)    360        328        292
Background mean                1163.06    930.015    793.387
Background median              761        698        657.5
Background mode                641.003    615.323    602.351

Table 5 Background subtraction for the parameters given in Table 3

Spot diameter (pixels)                10         12         14
Signal mean − background mean         3048.44    3094.65    2751.33
Signal median − background median     3460       3469       2996.5
Signal mode − background mode         3611.44    3630.26    3579.57

Figure 6 Plot of the mode (mo) against median (me) intensities for all the spots; signal channel (a) and reference channel (b).

In Table 2, the values of the mean, median and mode for the different spot diameters are shown. This behaviour is enhanced when the intensities of the pixels of a spot are high, since the pixels placed on the hole (if it exists) and at the fuzzy boundary of the spot differ even further from the rest of the more intense pixels of the spot. Thus, the mean value, the median and the mode tend to separate from each other.
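As a quick arithmetic check, the background-subtracted values of Table 4 can be recomputed from Table 2, and the change of each statistic across the three spot diameters can be compared. The following is only an illustrative sketch in Python, not the procedure used for the paper (which relied on ImaGene and Mathematica); the helper names `subtract_background` and `spread` are hypothetical, and the numbers are transcribed from Table 2.

```python
# Values transcribed from Table 2, ordered by spot diameter (10, 12, 14 pixels).
signal = {
    "mean": [18716.3, 16764.4, 13846.3],
    "median": [19969.0, 17269.0, 14686.0],
    "mode": [20970.5, 20047.4, 19953.8],
}
background = {
    "mean": [2246.07, 1311.69, 895.582],
    "median": [844.5, 757.0, 701.0],
    "mode": [891.694, 882.046, 819.154],
}


def subtract_background(sig, bg):
    """Hypothetical helper: statistic-wise signal minus background (as in Table 4)."""
    return {stat: [s - b for s, b in zip(sig[stat], bg[stat])] for stat in sig}


def spread(values):
    """Hypothetical helper: largest minus smallest value across the diameters."""
    return max(values) - min(values)


corrected = subtract_background(signal, background)
spreads = {stat: spread(values) for stat, values in corrected.items()}

for stat in ("mean", "median", "mode"):
    rounded = [round(v, 1) for v in corrected[stat]]
    print(f"{stat:>6}: {rounded}  spread = {spreads[stat]:.1f}")
```

With these inputs the background-subtracted mode varies by less than 1000 intensity units across the three diameters, whereas the mean and the median vary by several thousand, which is the sense in which the mode appears least sensitive to the placement of the spot boundary.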
When the intensities of a spot are low, the asymmetry tends to disappear due to the lower difference in intensity between the spot and the background. Consequently, the distance between the three statistics decreases. Figure 5 shows the equivalent to Figure 4 for a spot with low intensity. In Figure 6, the mode is plotted against the median intensities for both channels for the spots of the example microarray. Note that the higher the values of the mode and/or median are, the larger the deviation is from the line y = x.

By looking at the corresponding Tables 2 and 3, it is apparent that the mode is more robust to variations in the diameter of the spot, even when the background is subtracted (see Tables 4 and 5). Following this reasoning, suppose that we want to correct the two effects treated here (the doughnut effect and the fuzziness at the boundary of the spots). Kim et al (2001) and Brown et al (2001) suggest discarding an upper and lower percentage of pixels before the histogram is calculated; 15% on both sides and a 2σ length interval about the mean, respectively. For the case shown in this paper, discarding the same percentage at both sides of the sample of pixels does not correct the asymmetry in the histogram. However, the lower percentage threshold can be applied. The recalculated histogram will keep the same mode but the mean and median will increase, getting closer to the mode. With this solution, the mean and median become dependent on the value of the threshold. This is a drawback since there is not an infallible method to decide its value. What would be the intensity of the pixels that form the hole of the ‘doughnut’ if the hybridisation was perfectly performed? The intensity would ‘probably’ be around the mode, where there are the most repeated intensities, ie the most likely. From this point of view, the mode seems a good representative since it is the most independent of both effects.

Because of the small sample size of pixels within a spot in comparison with the range of possible intensities, it is necessary to estimate the probability density function of the distribution of the intensities before the mode can be calculated. This seems a disadvantage of the mode with respect to the mean and median, since the mode depends on the methodology used to estimate the density function or histogram. In this work, kernel density estimation with Gaussian functions was used. The only parameter to be adjusted, on which the mode depends, is the bandwidth of the Gaussians. We tried a range from 400 to 1500 for some random spots, and the maximum variability found for the mode was about 400 intensity units. This was verified for a sample of spots covering the whole range of intensities. This value is intensity-dependent. For spots with low intensity, the intensities of the pixels within the spots are closer to each other than for spots with higher intensity, where the intensities of the pixels are spread over a larger range. Thus, in the first case the bandwidth parameter is not as crucial as for the second case, in terms of the variability mentioned above. However, high-intensity signals can support a higher absolute error (not relative), since the final gene expression is usually provided in fold changes.

Effect of the asymmetry on gene expression

In this section we show that the distance between the mode and the median can result in a statistically significant difference when calculating the expression level of genes. The standard gene expression level assumed here is the log2 transformation of the ratio of both channels’ intensities. For a gene $i \in \{1, \ldots, n\}$, where $n$ is the number of spots on the array, this is:

$$E_i = \log_2\left(\frac{s_i}{r_i}\right) \qquad (1)$$

where $s_i$ and $r_i$ are the intensities of spot $i$ for the signal and reference channel respectively. When the representative value of the intensity distribution of spot $i$ is given by the median of the distribution, we write:

$$E_i^{me} = \log_2\left(\frac{s_i^{me}}{r_i^{me}}\right) \qquad (2)$$

as the expression of gene $i$. When the mode is used we write:

$$E_i^{mo} = \log_2\left(\frac{s_i^{mo}}{r_i^{mo}}\right) \qquad (3)$$

Two genes are equally expressed if the proportion of the signal measurement with respect to the reference value is the same for both genes. Thus, our goal is to investigate if the following difference is significant:

$$E_i^{mo} - E_i^{me} = \log_2\left(\frac{s_i^{mo}}{r_i^{mo}}\right) - \log_2\left(\frac{s_i^{me}}{r_i^{me}}\right) \qquad (4)$$

The properties of the log transformation allow us to rewrite equation (4) as:

$$E_i^{mo} - E_i^{me} = \log_2\left(\frac{s_i^{mo}}{s_i^{me}}\right) - \log_2\left(\frac{r_i^{mo}}{r_i^{me}}\right) \qquad (5)$$

For a distribution of the intensity of the pixels within spot $i$, the following element (where $k = r, s$) defines a measure of proximity between the mode and median:

$$S_i^k = \log_2\left(\frac{k_i^{mo}}{k_i^{me}}\right) \qquad (6)$$

Note that by taking the ratio we obtain a value that not only depends on the Euclidean distance between the median and the mode but also on their value. For example, consider two distributions ($i$, $j$) with modes equal to 60 000 and 30 000, and medians equal to 40 000 and 20 000, respectively. Then, we have that $S_i$ is equal to $S_j$ even though the Euclidean distances between the modes and the medians, ie 60 000 – 40 000 and 30 000 – 20 000, are not equal.

The difference of gene expression levels given by the median and the mode is equal to the difference $S^s - S^r$ between both channels. In Figure 7, the distribution of this difference (see equation 4) is shown for the case study. The use of the mode or the median to calculate the gene expression level would not cause any significant difference if $E_i^{mo} - E_i^{me}$ is equal to zero for all $i$, or likewise if $S_i^s - S_i^r$ is equal to zero for all $i$. This only happens in the following two cases:

1. When $s_i^{mo} = s_i^{me}$ and $r_i^{mo} = r_i^{me}$, which implies that the distributions for both channels are symmetric and unimodal. This is the ideal case with no fuzzy boundaries or doughnut effects.

2. When $S_i^s = S_i^r$ but $s_i^{mo} \neq s_i^{me}$ and/or $r_i^{mo} \neq r_i^{me}$. This is the case when:

$$\frac{s_i^{mo}}{s_i^{me}} = \frac{r_i^{mo}}{r_i^{me}} \qquad (7)$$

This only occurs if a real number $t$ exists such that $s_i^{mo} = t \times r_i^{mo}$ and $s_i^{me} = t \times r_i^{me}$. This case, although feasible, is unlikely.

In Figure 6, the mode is plotted against the median for the signal and reference channels for the example array. We observe that most of the spots have an asymmetry in the left tail of the distribution with respect to the mode (the mode is usually larger than the median and mean). This is due to the factors mentioned in the introduction. Figure 8 shows the histograms of $S^s$ and $S^r$. Note that in the histogram for $S^s$ the mass is less concentrated near zero than in the histogram for $S^r$. The mean values of both distributions are 0.08 and 1.28, respectively. We also observe that $S^s$ and $S^r$ are positive for most of the spots. This is consistent with Figure 6. Examples of this asymmetry are shown in Figures 4 and 5.

A different method of representing gene expression was introduced by Brody et al (2001): the log transformation of the median of the ratios of both channels, pixel-by-pixel within a spot. In Figure 9, the asymmetric intensity histograms of both channels are shown for a spot of our array, as well as the histogram of the ratios of the signal and reference channels, pixel-by-pixel, as explained in Brody et al (2001). We also observe that for this approach the asymmetry found in the distributions of the intensities within spots introduces significant differences when either the median or the mode is used. For this particular spot, the expression levels are 0.46, 0.52 and 0.55 using the mean, median and mode, respectively.
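The relation between equations (4), (5) and (6) can be checked numerically. The sketch below is only an illustration in Python: the function names `proximity` and `expression` are hypothetical, the first pair of distributions uses the modes and medians quoted in the text (60 000/40 000 and 30 000/20 000), and the intensities of the second example spot are invented for the purpose of the check.

```python
import math


def proximity(mode, median):
    """S = log2(mode / median), the proximity measure of equation (6)."""
    return math.log2(mode / median)


def expression(signal, reference):
    """E = log2(signal / reference), the expression level of equation (1)."""
    return math.log2(signal / reference)


# Example from the text: different mode-median distances, same ratio.
s_i = proximity(60000, 40000)
s_j = proximity(30000, 20000)
print(f"S_i = {s_i:.4f}, S_j = {s_j:.4f}")

# Hypothetical spot (invented intensities) to check that
# E^mo - E^me (equation 4) equals S^s - S^r (equations 5 and 6).
sig_mo, sig_me = 20000.0, 17000.0
ref_mo, ref_me = 9000.0, 8500.0

expression_diff = expression(sig_mo, ref_mo) - expression(sig_me, ref_me)
proximity_diff = proximity(sig_mo, sig_me) - proximity(ref_mo, ref_me)
print(f"E_mo - E_me = {expression_diff:.4f}")
print(f"S_s  - S_r  = {proximity_diff:.4f}")
```

Both differences agree up to floating-point rounding, and they vanish only when the mode-to-median ratio is the same in the signal and reference channels, as stated in case 2 above.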
Since in most practical experiments ‘ideal’ spots are not consistently obtained, we conclude that there will be a statistically significant difference when calculating the expression level of genes using different statistics. A study is needed to decide on the best value that summarises the whole spot.

Figure 7 Distribution of the difference of the gene expression (see equation 4) using the median and mode of the pixels for every spot in the example microarray.

Figure 8 Histograms of indicators $S^s$ (a) and $S^r$ (b).

Figure 9 (a) Distribution of the intensities of a spot in both channels (r, reference channel; s, signal channel). (b) The gene expression given by the log transformation of the median of the pixels’ ratios’ intensities.

Conclusions

We have opened a discussion about the shortcomings of a crucial part of generating gene expression data from spotted microarrays. The use of the median as the representative intensity of a spot is considered within the microarray community as standard, although there is no convincing basis for this. We have shown that it can make a significant difference whether the median or the mode is used to calculate gene expression levels as a log ratio of the intensities in both channels. Thus, it is important to study which is the appropriate representative statistic for the kind of distributions that we obtain from the spots of microarrays. The mean or the median are not always the best choice, as this paper shows. Several examples were provided of typical spots where the mode is more appropriate due to the asymmetry of the distribution of the intensity of the pixels within the spot. We have investigated the robustness of the median, mean and mode in relation to the diameter of the spot. In our case study, the mode was the best choice.
An important conclusion from this study is that for any subsequent analysis of spot intensities (as representations of expression levels), the original image rather than the output of the spot-finding software should be considered as the ‘raw data’. While image parameters continue to be inspected by eye and the choice of spot statistics is unclear, it will be important to have access to the raw image through databases.

Acknowledgements

The authors would like to thank the Wellcome Trust funded Bacterial Microarray Group at St George’s Hospital Medical School in London and the TB group of the Department for Environment, Food, and Rural Affairs at the Veterinary Laboratories Agency, Weybridge. Olaf Wolkenhauer’s work has been supported by a post-genomics grant of the UK Department for the Environment, Food, and Rural Affairs (DEFRA), conducted in collaboration with the Veterinary Laboratories Agency (VLA), Weybridge.

Notes

1 Most commonly, each pixel of the image is stored with a 16-bit resolution, which provides a set of 65 536 possible intensity values (0 to 65 535).

2 Definitions of mean, median and mode are provided in Appendix 1.

3 The density functions are calculated by Gaussian kernel estimation. The mode is the intensity value where the density function achieves its maximum.

NOTE: All websites accessed 22 December 2003.

Appendix 1 Mean, median and mode definitions

For any set of pixels, a density function, say f(·), that explains the distribution of the intensity of the pixels can be calculated. Histograms are the most popular density functions and the easiest to construct. Some other techniques, such as kernel density estimation, provide better featured density functions. For example, they could provide infinitely differentiable density functions. A density function is fully informative of the distribution of the intensities of a group of pixels, although always more difficult to treat than a single number.
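To make the kernel density estimate of Note 3 concrete, the following sketch computes the mean, the median and a kernel-based mode for a synthetic ‘doughnut’ spot. It is a minimal illustration in plain Python under stated assumptions, not the implementation used in this study (which was carried out in Mathematica): the pixel values are invented (a bright ring around 20 000 plus dimmer hole and boundary pixels), the functions `gaussian_kde` and `kde_mode` are hypothetical helpers, the mode is located by a simple grid search, and the bandwidths are taken from the 400 to 1500 range mentioned in the text.

```python
import math
import statistics


def gaussian_kde(sample, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in sample)

    return density


def kde_mode(sample, bandwidth, step=10):
    """Grid search for the intensity at which the estimated density is highest."""
    density = gaussian_kde(sample, bandwidth)
    grid = range(int(min(sample)), int(max(sample)) + 1, step)
    return max(grid, key=density)


# Synthetic spot (invented values): 80 ring pixels placed at the quantiles of a
# normal distribution N(20000, 800), plus 30 dimmer pixels spread evenly
# between 8000 and 18000 to mimic the hole and the fuzzy boundary.
normal = statistics.NormalDist(mu=20000, sigma=800)
ring = [normal.inv_cdf((k + 0.5) / 80) for k in range(80)]
hole = [8000 + 10000 * (k + 0.5) / 30 for k in range(30)]
pixels = ring + hole

mean = statistics.mean(pixels)
median = statistics.median(pixels)
mode = kde_mode(pixels, bandwidth=800)
print(f"mean = {mean:.0f}, median = {median:.0f}, mode = {mode}")

# Sensitivity of the mode to the bandwidth of the Gaussians.
for bandwidth in (400, 800, 1500):
    print(bandwidth, kde_mode(pixels, bandwidth))
```

For this synthetic sample the low-intensity pixels pull the mean and, to a lesser extent, the median below the ring intensity, while the kernel-based mode stays close to it.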
Hence, it is very convenient to summarise the density function into a single number. The following question arises: which is the best intensity value that represents the intensities of a set of pixels or a spot? There does not exist a unique and infallible answer to this question, as we argue in the introduction.

There are some parameters or statistics that characterise the distribution of a random variable. Three of the most used are the:

• Mode, which is defined as the intensity for which the density function achieves the maximum value. Note that the maximum could be achieved at more than one intensity. In this case the distribution is called bimodal if there are two modes or multimodal in general.

• Median, which is defined as the intensity that divides the density function in two halves, each with probability of 0.5. It is also called the 50th percentile.

• Mean, denoted by µ, which is the ‘balance point’ of the distribution, or as defined in physics, the centre of gravity. It is calculated by the following formula:

$$\mu = \int_{\mathbb{R}} x f(x)\,dx$$

where $f: \mathbb{R} \to \mathbb{R}$ is the density function, with $\mathbb{R}$ the set of real numbers and $x \in \mathbb{R}$.

These statistics can also be calculated directly from the set of pixels without the need of a density function. Thus, for a given set of intensities, say $x_1, \ldots, x_n$, the mean (also called average in this case) is calculated by the formula:

$$\mu = \frac{1}{n}\sum_{i=1}^{n} x_i$$

If we order the intensities from the smallest to the largest, say $x_{(1)}, \ldots, x_{(n)}$, the median intensity is equal to $x_{((n+1)/2)}$ or $(x_{(n/2)} + x_{(n/2+1)})/2$ for an odd or even sample size $n$, respectively. The mode, on the other hand, is the intensity most repeated among the set of pixels. Note that a spot has relatively few pixels compared with the size of the discrete set of possible intensities. For example, for a tiff image with 16 bits per pixel, the possible intensities are 0, 1, 2 ... 65 535. As a consequence, the most common case would be that there is not any repeated intensity among the set of pixels. Thus, the mode will need to be calculated using a density function.

The only case for which these three statistics take the same value is when the intensity of a group of pixels has a symmetric and unimodal density function, such as a Gaussian function, which corresponds to a normally distributed random variable.

Appendix 2 Microarray example: materials and methods

DNA microarrays, target labelling and hybridisation

While the design and production of Streptomyces DNA microarrays will be reported elsewhere (Hotchkiss et al unpub), information can be found in the Streptomyces coelicolor Microarray Resource at the University of Surrey, Guildford, UK (http://www.surrey.ac.uk/SBMS/Fgenomics/Microarrays/index.html). In brief, 150–500-bp PCR (polymerase chain reaction) products representing ~7300 of the predicted ORFs (open reading frames) of the fully sequenced S. coelicolor A3(2) genome (Bentley et al 2002) were robotically synthesised, purified and spotted on Corning CMT-GAPS II glass slides (Corning, Acton, MA, USA). Post-processed slides were used for hybridisation of the probes with labelled cDNA or genomic DNA. Total RNA was isolated from ‘mid-logarithmic’ plate cultures of S. coelicolor MT1110 (SCP1-, SCP2-) grown on Oxoid nutrient agar at 30 °C. For genomic DNA isolation, S. coelicolor M145 was grown to stationary phase in shaken flasks of yeast extract–malt extract liquid medium at 30 °C. DNA was extracted and purified by the Kirby mix method. Cy3-labelled cDNA and Cy5-labelled genomic DNA were synthesised and hybridised on the array. Protocols on RNA isolation, target labelling and array hybridisation can be found at http://www.surrey.ac.uk/SBMS/Fgenomics/Microarrays/index.html. Scanning and image acquisition of the hybridised array was performed with an Affymetrix® 428 scanner at 10-µm resolution.
The generated tiff images were quantified with ImaGene v5.1 software (BioDiscovery, http://www.biodiscovery.com/imagene.asp). Mathematica® v4 (Wolfram Research, http://www.wolfram.com) was used for any further analysis. A typical image of a segment of the Streptomyces microarray is shown in Figure 2a.

References

Bentley S, Chater K, Cerdeno-Tárraga A, Challis G, Thomson N, James K, Harris D, Quail M, Kieser H, Harper D et al. 2002. Complete genome sequence of the model actinomycete Streptomyces coelicolor A3(2). Nature, 417:141–7.

Brody J, Williams B, Wold B, Quake S. 2001. Significance and statistical errors in the analysis of DNA microarray data. Proc Natl Acad Sci USA, 99:12975–8.

Brown C, Goodwin P, Sorger P. 2001. Image metrics in the statistical analysis of DNA microarray data. Proc Natl Acad Sci USA, 98:8944–9.

Hotchkiss G, Mersinias V, Bucca G, Hinds J, Butcher P, Smith C. Manuscript in preparation.

Hyndman R. 1995. Highest-density forecast regions for nonlinear nonnormal time series models. J Forecasting, 14:431–41.

Kim J, Kim H, Lee Y. 2001. A novel method using edge detection for signal extraction from cDNA microarray image analysis. Exp Mol Med, 33:83–8.

Polonik W. 1995. Measuring mass concentrations and estimating density contour clusters – an excess mass approach. Ann Stat, 23:855–81.

Tran P, Peiffer D, Shin Y, Meek L, Brody J, Cho K. 2002. Microarray optimizations: increasing spot accuracy and automated identification of true microarray signals. Nucleic Acids Res, 30:E54.