Microarray Home
The Microarray Resource provides microarray analysis services and technical expertise to all researchers at Boston University and to interested groups from outside the university. We are located on the 6th floor of the Evans (E) building at the BU Medical Campus.
Affymetrix GeneChip Custom Oligonucleotide Microarray
We offer two different microarray platforms at the Microarray Resource: Affymetrix GeneChips and custom oligonucleotide microarrays. For more information go to the Services [link] section of the webpage. For both platforms, users of the facility provide us with high quality samples and the Microarray Resource does the rest.
Introduction
Services
GeneChip Expression Analysis [Prices]
GeneChip Genotyping Analysis [Prices]
Custom Oligonucleotide Microarrays [Prices]
Data Analysis
Microarray Resource People
Other information
Starting Material for Microarray Analysis
Protocols
Grant Materials
Links
All interested researchers are encouraged to contact the Microarray Resource to discuss a potential project or to ask any question about the use of microarrays.
Microarray Resource 617-414-1377
We look forward to talking with you!
Introduction
A biological array is an ordered set of compounds affixed to a solid surface. By applying a sample solution to the array, it is possible to assay the interaction of the sample with each of the compounds on the array. Microarrays are a powerful research tool because they enable massively parallel assays of biological samples.
The most common application of microarrays is gene expression analysis. In this case the interaction between labeled mRNA in a biological sample and complementary DNA probes, which are affixed to the array, allows rapid quantification of the entire transcriptome. Many other types of arrays are possible. Nucleic acid arrays can also be used for genotyping and antibody arrays can be used for proteomics.
We offer two different microarray platforms at the Microarray Resource. The Affymetrix GeneChip system is a commercial platform that enables rapid, reproducible, and accurate microarray analysis on a genome-wide scale. The Affymetrix GeneChip platform can be used for both gene expression and genotyping analysis.
We also offer made-to-order microarrays that are manufactured by synthesizing oligonucleotide probes and spotting them onto glass microarray slides here at the Microarray Resource. These “custom” microarrays are able to detect hundreds to thousands of genes of specific interest to individual investigators. The custom microarrays are an extremely flexible platform. They can be used for gene expression analysis of almost any organism in addition to many other types of projects. Additionally, by making our own microarrays, we are able to significantly reduce the cost of performing microarray experiments.
At the Boston University Microarray Resource we strive to allow you to easily incorporate microarrays into your research and help you get the most out of your data. You provide us with high quality samples and the Microarray Resource does the rest.
Services Offered by the Boston University Microarray Resource
Affymetrix GeneChips
The Affymetrix GeneChip™ system is a commercial microarray platform that allows whole genome gene expression analysis for common experimental organisms and high-throughput genotyping for human samples.
Custom Oligonucleotide Microarrays
The Microarray Resource will make custom arrays in-house by spotting oligonucleotide probes onto glass slides. This is an extremely flexible platform allowing focused microarray analysis for any organism.
What types of projects can I use Microarrays for?
Affymetrix GeneChip Expression [link]
- Genome wide expression analysis for common experimental organisms
o Expression analysis to investigate cellular biology
o Expression profiling to categorize biological samples
Custom Oligonucleotide Microarray [link]
- Expression analysis for all organisms
o Expression analysis to investigate cellular biology
o Expression profiling to categorize biological samples
- ChIP-on-chip
- Gene copy number
- Spotting user-provided cDNA, protein, antibody, or small-molecule libraries
Affymetrix GeneChip Genotyping [link]
- High Throughput Human Genotyping
Affymetrix GeneChip for Expression Analysis
The Affymetrix GeneChip system is a commercial microarray platform that allows whole genome gene expression analysis for common experimental organisms. This system has three major advantages over other array systems: it yields results quickly, it can monitor the expression of every gene in the genome, and it is the most widely used commercial microarray platform.
However, the Affymetrix system also has a few disadvantages when compared with the Microarray Resource’s custom array system. The GeneChip platform is significantly more expensive than custom microarrays, and Affymetrix only makes GeneChip arrays for common experimental organisms.
- Set up an appointment with members of the Microarray Resource to discuss your experiment. This is optional, but highly recommended.
- Give us 10 µg total RNA [more information about starting RNA link]
- Wait 1 week for us to process your samples
- Work with the Microarray Resource to analyze your data
Available GeneChips for expression profiling and BUSM prices [Link] Contact Us [Link]
Affymetrix GeneChip for Genotyping Analysis
Genotyping with Affymetrix GeneChips
The Affymetrix GeneChip™ system is a commercial microarray platform that allows high-throughput genotyping for human samples. The GeneChip® Mapping 10K Array offers the ability to generate over 10,000 SNP genotypes from a single genomic DNA sample. The 100K array, which will be released next year, will probe over 100,000 SNPs.
- Set up an appointment with members of the Microarray Resource to discuss your experiment. This is optional, but highly recommended.
- Give us 250 ng genomic DNA [more information about starting DNA link]
- Wait 1 week for us to process your samples
- Work with the Microarray Resource to analyze your data
Please contact the Microarray Resource for more information about genotyping using the Affymetrix GeneChip platform.
GeneChips for SNP genotyping and BUSM prices [Link] Contact Us [Link]
Custom Oligonucleotide Microarrays
Our primary goal is to make microarray analysis more accessible to all researchers at BU. We hope that microarray analysis of gene expression will become part of researchers' regular repertoire of experimental approaches, the way a Northern blot is now. One of the most important ways we can do this is by making the technology inexpensive. Manufacturing custom oligonucleotide microarrays in-house enables an enormous cost savings. The custom microarray system allows researchers to choose a collection of genes of interest for their own research-specific microarray. Another advantage of custom microarrays is that, while Affymetrix arrays are targeted at expression analysis and genotyping, any sequence can be spotted on a custom array. This opens up a wide range of additional applications, such as analysis of sequences enriched in chromatin immunoprecipitations, detection of region-specific differences in copy number, pathogen detection, and many more. Even in the area of gene expression analysis, custom microarrays have the advantage that they can be designed for any organism.

There are a number of ways to select genes for a custom microarray. Known genes of interest can be the primary source but this approach can be supplemented with literature and database-mining or with preliminary experiments using Affymetrix whole-genome microarrays.
The Microarray Resource makes custom arrays in-house by spotting and covalently cross-linking oligonucleotide probes onto glass slides. Probes for genes of interest are designed using sophisticated software that determines the best 50-70 nucleotide probe sequence for each gene. These probes are then synthesized on an ABI 3900 high-throughput DNA synthesizer and spotted onto derivatized glass slides using a Genetix QArray-Mini™ custom array spotter. Once the arrays are made, RNA samples from investigators are labeled, hybridized to the arrays, and scanned with a Packard ScanArray Express™ multi-channel microarray scanner. We will also spot user-provided libraries.
In order to make sure that you get the most out of your microarray experiment, the Microarray Resource will help you analyze your data. This includes guidance on experimental design and statistical analysis, as well as access to software that will allow sophisticated data mining and visualization.
- Set up an appointment with members of the Microarray Resource to discuss your experiment. This is optional, but highly recommended.
- Determine a list of genes for your custom microarray
- Wait a few weeks while we design and manufacture your microarrays
- Give us 5 µg total RNA [more information about starting RNA link]
- Wait 1 week for us to process your samples
- Work with the Microarray Resource to analyze your data
Custom Oligonucleotide Microarray Prices [Link] Contact Us [Link]
Affymetrix GeneChip Prices for Expression Analysis
Affymetrix currently makes GeneChip expression arrays for 9 organisms. The cost of the arrays varies from $300-$350 and is detailed below.
Available Organisms
Yeast (cerevisiae)
Drosophila
P. aeruginosa
Arabidopsis
E. coli
C. elegans
B. subtilis
Barley
For human, mouse, and rat GeneChips, the entire transcriptome is split between two Affymetrix GeneChips. In each case, the A chips contain the best annotated genes from the organism, while B chips contain mostly ESTs, splice variants, and poorly annotated transcripts. The cost of running both A and B chips for human, mouse and rat samples is less than double the cost of running just the A chip because the processed RNA from a single sample can be hybridized to multiple arrays.
Organism | Genes | Annotated Genes | Chip | Reagents | Labor | Total
Human U133A | ~22,500 | 19,993 w/ Gene Symbol | $350 | $200 | $300 | $850
Human U133B | ~22,500 | 10,043 w/ Gene Symbol | $350 | $25 | $150 | $525
Mouse MOE430A | ~22,500 | ? | $350 | $200 | $300 | $850
Mouse MOE430B | ~22,500 | ? | $350 | $25 | $150 | $525
Rat ROE430A | ~16,000 | ? | $350 | $200 | $300 | $850
Rat ROE430AB | ~16,000 | ? | $350 | $25 | $150 | $525
Arabidopsis | ~16,000 | ? | $300 | $200 | $300 | $800
C. elegans | ~22,500 | ? | $300 | $200 | $300 | $800
Drosophila | ~13,500 | ? | $300 | $200 | $300 | $800
Yeast SG-98 | ~7,000 | 4,181 w/ Gene Symbol | $300 | $200 | $300 | $800
E. coli | ~5,500 | ? | $300 | $200 | $300 | $800
P. aeruginosa | ~6,000 | ? | $300 | $200 | $300 | $800
Chip
Affymetrix GeneChips are available to us at the Boston Academic Consortium prices. Due to the nature of our agreements with Affymetrix, we can only offer chips at these prices to academic investigators. Other groups are encouraged to purchase their GeneChips from Affymetrix, and bring them to the microarray resource for hybridization.
Reagents and Labor
For each sample to be prepared for GeneChip analysis there is a cost of $500 for reagents and labor. One sample is hybridized to each GeneChip microarray. If the sample is to be hybridized to a set of A and B chips then the reagent and labor cost is $675. Our charge for labor is very competitive with other facilities. Remember, we provide comprehensive assistance with data analysis, a service unique to the Boston University Microarray Resource.
Experimental Design and Data Analysis
In order to make sure that you get the most out of your microarray experiment, the Microarray Resource will help you analyze your data. This includes guidance on experimental design and statistical analysis, as well as access to software that will allow sophisticated data mining and visualization.
Affymetrix GeneChip Prices for Genotyping
Affymetrix currently makes one GeneChip for SNP genotyping.
Affymetrix anticipates releasing a similar 100K Mapping Array in 2004.
Array | SNPs | Chip | Reagents & Labor | Total
Human 10K | >10,000 | $400 | ? | ?
Chip
Affymetrix GeneChips are available to us at the Boston Academic Consortium prices. Due to the nature of our agreements with Affymetrix, we can only offer chips at these prices to academic investigators. Other groups are encouraged to purchase their GeneChips from Affymetrix, and bring them to the microarray resource for hybridization.
Reagents and Labor
More information to come
Experimental Design and Data Analysis
In order to make sure that you get the most out of your microarray experiment, the Microarray Resource will help you analyze your data. This includes guidance on experimental design and statistical analysis, as well as access to software that will allow sophisticated data mining and visualization.
Custom Microarray Prices
The cost of design and fabrication of custom microarrays is $150. This cost includes probe selection, synthesis, and spotting.
A minimum order is required on custom microarray projects, though you don't need to use -- or even necessarily print -- all of the arrays at one time. The minimum order for a new custom array varies depending on the number of oligos in the array and the kind of array you want to make.
Minimum Order
Number of Oligos | Human, Mouse, Rat, and Yeast Expression Arrays | All Other Arrays
1-100 | 10 arrays | 10 arrays
101-500 | 20 arrays | 30 arrays
501-1000 | 40 arrays | 60 arrays
Custom microarrays containing more than a thousand unique oligos are certainly possible, and discounts will be considered for projects larger than 100 arrays. Investigators interested in projects of either kind should contact us to discuss their project.
Array | Samples | Genes | Chip | Reagents | Labor | Total
Custom Microarray | 2 per array | 1-1,000 or more | $150 | $100 | $200 | $450
Reagents and Labor
For each custom microarray there is a cost of $300 for reagents and labor. Two samples are hybridized to each custom microarray.
Experimental Design and Data Analysis
In order to make sure that you get the most out of your microarray experiment, the Microarray Resource will help you analyze your data. This includes guidance on experimental design and statistical analysis, as well as access to software that will allow sophisticated data mining and visualization.
Compared with the Affymetrix system, custom-array experiments can be done at much lower cost. The custom array itself costs only $150, which is half the cost of a GeneChip array or less ($300-$350). Additionally, the cost of reagents and labor for hybridizing a custom array is $300 versus $500 for a GeneChip array. Another important cost savings with custom arrays is that two samples, such as control and experimental, are hybridized to a single array, while just one sample can be hybridized to a GeneChip array. Consequently, the simplest custom array experiment costs $450 vs. $1600 for the simplest Affymetrix experiment.
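The comparison above can be reproduced with a few lines of arithmetic. The sketch below is only an illustration using the prices quoted on this page; the function names are hypothetical, the $300 GeneChip price is the lowest one listed, and the minimum-order requirement for custom arrays is ignored.

```python
# Per-array prices quoted on this page (US dollars).
CUSTOM_ARRAY = 150             # design and fabrication of one custom array
CUSTOM_REAGENTS_LABOR = 300    # reagents and labor per custom array
GENECHIP_ARRAY = 300           # lowest-priced GeneChip listed (assumption)
GENECHIP_REAGENTS_LABOR = 500  # reagents and labor per GeneChip sample


def custom_experiment_cost(n_samples):
    """Two samples share one custom array (hypothetical helper)."""
    n_arrays = (n_samples + 1) // 2  # round up to whole arrays
    return n_arrays * (CUSTOM_ARRAY + CUSTOM_REAGENTS_LABOR)


def genechip_experiment_cost(n_samples):
    """One sample is hybridized to each GeneChip (hypothetical helper)."""
    return n_samples * (GENECHIP_ARRAY + GENECHIP_REAGENTS_LABOR)


# Simplest experiment: one control and one experimental sample.
print(custom_experiment_cost(2))    # 450
print(genechip_experiment_cost(2))  # 1600
```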
Starting RNA for Affymetrix GeneChip Expression Analysis
- 10 µg high quality total RNA preferred.
- Small-sample amplification protocols are available that facilitate GeneChip expression analysis from samples as small as 100 ng.
- In less than 10 µl of water (we will dry down dilute samples)
- DNase treatment is not necessary. Small amounts of genomic DNA contamination will not affect the results of microarray analysis.
- Poly-A selected RNA can be used for microarray analysis, but this is not recommended unless previous studies were conducted using poly-A selected RNA.
- RNA extraction protocols [link]
- The most common problem with RNA that we encounter in the microarray core facility is carryover organic contamination from the extraction. This organic contamination will cause sample preparation reactions to fail.
Starting DNA for Affymetrix GeneChip SNP Genotyping
- 250 ng high quality genomic DNA
- DNA extraction protocols [link]
Starting RNA for Custom Microarray Analysis
- 5 µg high quality total RNA preferred.
- In water
- DNase treatment is not necessary. Small amounts of genomic DNA contamination will not affect the results of microarray analysis.
- Poly-A selected RNA can be used for microarray analysis, but this is not recommended unless previous studies were conducted using poly-A selected RNA.
- RNA extraction protocols [link]
- The most common problem with RNA that we encounter in the microarray core facility is carryover organic contamination from the extraction. This organic contamination will cause sample preparation reactions to fail.
Microarray Protocols
GeneChip Expression Analysis Protocol [Link]
Sample Preparation
Hybridization, Staining, and Scanning
GeneChip Genotyping Protocol [Link]
Sample Preparation
Hybridization, Staining, and Scanning
Custom Oligonucleotide Microarray Protocol [Link]
Array Production Protocols
Sample Preparation and Hybridization Protocols
RNA extraction protocols
DNA extraction protocols
Data Analysis Methods
Data Analysis
Following microarray hybridization and scanning, a few steps are needed to create a data set that is ready for analysis: image quantification, normalization, and annotation.
Image Quantification
For Affymetrix GeneChips, image quantification is performed using GeneChip Operating System 1.0 software (GCOS 1.0). Starting with a scanned image, GCOS determines the intensity of each 25-mer probe on the GeneChip. A gene-specific intensity is then calculated from the intensities of the set of probes for each gene. This procedure is described in more detail in the Affymetrix Statistical Algorithms Reference Guide [https://www.affymetrix.com/support/technical/technotes/statistical_reference_guide.pdf].
Following within-chip image quantification, it is necessary to normalize the data across chips in order to make measurements as comparable as possible across chips. There are a number of different normalization methods, but in general more complex methods will do a better job of normalization at the risk of overfitting. Furthermore, as more samples are added to a microarray data-set, chip to chip differences become less important. This makes complex normalization less important. In general the Microarray Resource uses the simplest normalization method, linear scaling.
In linear scaling, the intensity of each gene on a chip is multiplied by a constant such that the average intensity of all the genes on that chip is scaled to a predetermined target.
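As a minimal sketch of linear scaling as described above: each intensity on a chip is multiplied by one constant so that the chip average lands on a target. The function name and the target value of 500 are assumptions made for illustration, not settings taken from GCOS.

```python
def linear_scale(intensities, target=500.0):
    """Scale one chip so that its mean intensity equals `target`.

    `target` is an arbitrary illustrative value (assumption).
    """
    chip_mean = sum(intensities) / len(intensities)
    factor = target / chip_mean  # one constant for the whole chip
    return [x * factor for x in intensities]


chip = [100.0, 200.0, 300.0, 400.0]  # toy intensities, mean 250
scaled = linear_scale(chip, target=500.0)
print(scaled)  # [200.0, 400.0, 600.0, 800.0]
```

Because every gene on the chip is multiplied by the same factor, ratios between genes on the same chip are unchanged.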
In quantile normalization, the intensity of each gene is ranked within each chip. The average intensity across all chips of each rank is then calculated. Finally, on each chip, the intensity of each gene is replaced by the average intensity of the gene of that rank across all chips.
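The three steps of quantile normalization (rank within each chip, average each rank across chips, substitute the rank average) can be sketched as follows. This is an illustrative implementation with toy data; ties are broken by position, which is a simplification and not necessarily what a production package does.

```python
def quantile_normalize(chips):
    """chips: list of equal-length intensity lists, one list per chip."""
    n_genes = len(chips[0])
    # Sort each chip so that position r holds the intensity of rank r.
    sorted_chips = [sorted(chip) for chip in chips]
    # Average intensity at each rank across all chips.
    rank_means = [
        sum(s[r] for s in sorted_chips) / len(chips) for r in range(n_genes)
    ]
    normalized = []
    for chip in chips:
        # Gene indices ordered from lowest to highest intensity on this chip.
        order = sorted(range(n_genes), key=lambda g: chip[g])
        out = [0.0] * n_genes
        for rank, gene in enumerate(order):
            out[gene] = rank_means[rank]  # replace with the rank average
        normalized.append(out)
    return normalized


chips = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]  # two toy chips, three genes
print(quantile_normalize(chips))
# [[5.5, 1.5, 3.5], [3.5, 1.5, 5.5]]
```

After normalization every chip contains exactly the same set of values, only assigned to different genes.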
Loess Normalization
In loess normalization, the intensities of the genes on a chip are normalized based on the local mean of signal intensities.
Gene to Gene Normalization
In addition to these normalization methods, which make chips comparable, there are other normalization techniques that put genes on the same scale so that they are comparable with each other. These methods are generally used prior to clustering or principal components analysis.
In Log Ratio Normalization, the expression of each gene on each chip is calculated as:
log ( intensity of gene on this chip / mean intensity of gene across all chips)
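A short sketch of the log ratio formula above, applied to one gene measured on four chips. The base of the logarithm is not stated in the text; base 2 is assumed here, and the function name is hypothetical.

```python
import math


def log_ratio_normalize(gene_intensities):
    """Log2 of (intensity on each chip / mean intensity across chips).

    Base 2 is an assumption; the formula in the text does not name a base.
    """
    gene_mean = sum(gene_intensities) / len(gene_intensities)
    return [math.log2(x / gene_mean) for x in gene_intensities]


gene = [100.0, 200.0, 400.0, 100.0]  # one gene on four chips, mean 200
print(log_ratio_normalize(gene))     # [-1.0, 0.0, 1.0, -1.0]
```

A value of 0 means the gene is at its own average on that chip, so genes with very different absolute intensities become directly comparable.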
Annotation
Introduction
In order to effectively analyze microarray data, it is critical for investigators to have access to complete and up-to-date annotation of the genes on the array. At the Microarray Resource we get our annotation information from two primary sources, though there are a few others that are worth mentioning.
Affymetrix maintains the NetAffx [Link] database containing information about the genes that are contained on their GeneChip microarrays. This is the best first source of information about Affymetrix probe sets because each probe set has a unique page in the NetAffx database containing a broad range of information including gene and probe sequences, links to other databases, and functional descriptions of the genes.
The Incyte Proteome BioKnowledge Library [Link] is now available for access by all current Boston University and Boston University Medical Center faculty, staff, and students. This is an excellent database for finding information about genes from microarray experiments. It is well curated and provides Pubmed links for all references. This database is indexed by gene symbol.
There are a number of other databases that can provide valuable information about genes from microarray experiments:
- GenBank
- SGD (yeast)
- Gene Ontology
With microarray data, biology researchers want to identify genes differentially expressed under different growth conditions or different treatments, to cluster genes according to their expression pattern, and to differentiate samples in pharmaceutical or clinical studies.
The most straightforward method of identifying differentially regulated genes in a microarray experiment is by fold change. Fold change is the multiple by which the expression of a gene changed between two experimental groups.
Fold change can be reported using various scales that each convey the same information. For example, a 4-fold decrease and a 4-fold increase are written as:
Ratio: ¼, 4
Linear: -4, 4
Log base 2: -2, 2
Log base 10: -0.6, 0.6
Fold Change is usually calculated using the mean of a set of measurements within an experimental group, but it can also be calculated using the geometric mean, particularly if the original measurements were not converted to logarithmic scale.
While Fold Change is an important descriptor of the behavior of a gene's expression between two experimental groups, it does not tell the whole story. For example, take the expression of one gene measured 4 times in each of two experimental groups.
Group A: 100, 200, 200, 300 Mean = 200
Group B: 100, 100, 200, 2800 Mean = 800
Fold Change = 4
According to Fold Change this is a differentially regulated gene, yet we can see that Group B is not reproducibly upregulated 4-fold: the difference is driven by a single measurement. Consequently, Fold Change alone should not be used as a first-pass method for identifying differentially expressed genes.
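The Group A / Group B example can be checked numerically. The sketch below computes the fold change from the arithmetic means, as in the example, and from the geometric means for comparison; the helper names are hypothetical.

```python
import math

group_a = [100, 200, 200, 300]
group_b = [100, 100, 200, 2800]


def mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    # Exponential of the mean of the logs.
    return math.exp(sum(math.log(v) for v in values) / len(values))


fold_change = mean(group_b) / mean(group_a)
geo_fold_change = geometric_mean(group_b) / geometric_mean(group_a)

print(fold_change)                # 4.0
print(round(geo_fold_change, 2))  # 1.47
```

The geometric mean is far less affected by the single value of 2800, which is one way to see that the 4-fold change is not reproducible across the measurements in Group B.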
A better method for identifying differentially regulated genes is provided by statistics. Analysis of Variance (ANOVA) is a technique that assesses whether a set of measurements from two or more experimental groups indicates, given observed variance, that the groups are different. For microarrays the measurements are the expression levels of one gene and the groups correspond to the experimental sample groups. ANOVA is used to identify genes that are differentially expressed in a manner that is reproducible across multiple measurements within each experimental group.
An ANOVA score is calculated by comparing the variance observed between the sample group means to the variance observed within the groups. If the between group variance is high relative to the within group variance this indicates differential expression. The result of an ANOVA is a probability, p, that an observed difference between groups could have been produced by chance if the groups were in fact the same.
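A minimal sketch of the between-group versus within-group comparison for a one-way ANOVA is given below. It computes only the F statistic; converting F to a p-value requires the F distribution, which a statistics library such as SciPy (`scipy.stats.f_oneway`) provides. The function name and the reuse of the fold-change example data are choices made for illustration.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of each measurement around its own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


group_a = [100, 200, 200, 300]
group_b = [100, 100, 200, 2800]
f_stat = one_way_anova_f([group_a, group_b])
print(round(f_stat, 3))  # 0.806
```

An F statistic below 1 means the between-group variance is smaller than the within-group variance, so this gene would not be called differentially expressed despite its 4-fold change in means.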
Following the use of ANOVA to calculate a p-value for each gene, it is useful to choose a p-value cut-off, below which genes will be considered differentially expressed, and above which genes will not be considered differentially expressed. This cut-off will be arbitrary, but its choice should be made with an understanding of the trade-offs between sensitivity and selectivity that are inherent to choosing a significance cut-off. In general, choosing a lower significance cut-off will result in fewer genes being identified as differentially expressed, but a smaller portion of those that are selected will be false-positives. Choosing a higher significance cut-off will result in more genes being identified as differentially expressed, but a greater portion of those will be false-positives. At any significance cut-off it is possible to estimate the associated false-positive and false-negative rates. This allows an informed choice of the significance cut-off.
ANOVA can take a few different forms depending on the experimental design. The most basic type of ANOVA is a one-way ANOVA. In a one-way ANOVA, the sample groups are stratified along a single experimental variable. The simplest one-way ANOVA, with two sample groups, is equivalent to a t-test. For an ANOVA comparing more than two groups, a low p-value indicates that at least one of the groups differs from the others, without identifying which one. At the Microarray Resource we perform one-way as well as multiple-factor ANOVA. Multiple-factor ANOVA differs from one-way ANOVA in that it generates p-values for each of the primary experimental axes as well as for each interaction between factors.
Correction of significance results for multiple hypothesis testing is an important concern in microarray data analysis. It is common to use a p-value cut-off of 0.05. In a microarray experiment in which 20,000 genes are measured, even if no genes are truly differentially expressed, 1,000 genes can be expected to meet the p < 0.05 significance cut-off by chance alone. Furthermore, in the same 20,000 gene experiment with no changed genes, one unchanged gene would be expected to have a p-value as low as 0.00005.
A statistical test, like ANOVA, applied to microarray data tells you the probability that the observations made about a single gene could have been made if the null hypothesis, that the gene is not significantly changed, were true. When applied to normally distributed random data, p-values will be evenly distributed between 0 and 1. Thus, when looking at a single gene, a very low p-value is a significant finding, but as you increase the number of genes observed, the chance of finding a single very low p-value increases.
Take a fictitious microarray data set with 20,000 genes, none of which are differentially expressed between the experimental groups. We will use a p-value cut-off of 0.05 to identify differentially regulated genes. If we look at any one gene from our fictitious data set, which we know is not differentially expressed, there is a 1 in 20 chance of it having a p-value less than 0.05. Our gene-wise false-positive rate, at this level of sensitivity, is 5%. So, if we were to use a microarray to observe the expression of a single gene, we could use a p-value cut-off of 0.05 and control false positives at a rate of 5%.
If we use a statistical test and a p-value cut-off of p < 0.05 to identify differentially expressed genes from our fictitious microarray experiment, our gene-wise false positive rate is still 5%. Five percent of 20,000 genes is 1,000 genes that were not actually differentially expressed but would be identified as significant at this level of sensitivity. Testing as many hypotheses as there are genes on a microarray gives plenty of chances to make a mistake.
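The fictitious 20,000-gene experiment is easy to simulate. The sketch below draws one uniformly distributed p-value per unchanged gene and counts how many fall under the cut-off; the seed is fixed only so the run is repeatable, and the exact count will differ from seed to seed.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable (arbitrary choice)

N_GENES = 20000
CUTOFF = 0.05

# Under the null hypothesis, p-values are uniform between 0 and 1.
null_p_values = [random.random() for _ in range(N_GENES)]

false_positives = sum(p < CUTOFF for p in null_p_values)
expected = N_GENES * CUTOFF

print(expected)         # 1000.0
print(false_positives)  # close to 1000; the exact count depends on the seed
```

Every gene counted here is a false positive by construction, since none of the simulated genes is differentially expressed.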
There are a few different methods for dealing with multiple hypothesis testing in significance analysis of microarray data. The Bonferroni correction multiplies the p-value observed for each hypothesis by the number of hypotheses being tested. The Bonferroni correction is usually overly stringent for microarray data analysis. If we use a Bonferroni-corrected p-value cut-off of 0.05 on a real microarray data set, no matter how many genes meet the significance cut-off, there will be at most a 5% chance that even a single false-positive is among them. If we identify 100 genes that are differentially expressed in an experiment, we would likely be willing to accept a few false-positives among the 100. The Bonferroni criterion, that there is only a 5% chance that a single false-positive is among the 100, is more control of false-positives than is usually necessary. Increasing selectivity using the Bonferroni correction reduces sensitivity, so fewer differentially regulated genes will be identified.
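A minimal sketch of the Bonferroni correction follows: each p-value is multiplied by the number of tests and capped at 1. The toy p-values and the function name are made up for illustration.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1.0."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


p_values = [0.000001, 0.0004, 0.02, 0.2]  # toy data, four hypotheses
corrected = bonferroni(p_values)
significant = [p for p in corrected if p < 0.05]
print(len(significant))  # 2

# Equivalent view: the per-gene cut-off shrinks with the number of genes.
per_gene_cutoff = 0.05 / 20000
print(per_gene_cutoff)   # 2.5e-06
```

With 20,000 genes, a gene must reach roughly p < 0.0000025 to survive the correction, which illustrates why the method is considered stringent for microarray data.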
Another method for treating the multiple hypothesis problem makes more sense for microarray experiments. The False Discovery Rate (FDR) correction of Benjamini and Hochberg estimates the gene-wise false-positive rate among the genes at a significance cut-off. The FDR is the quotient of the number of unchanged genes expected at a given significance cutoff over the number of genes detected at that significance cutoff.
The assumption that unchanged genes would have p-values evenly distributed between 0 and 1 can be used to estimate the number of false-positives expected at a given significance cut-off. The number of false-positives expected at a given significance cut-off will be equal to the number of unchanged genes (or the number of genes on the microarray) times the p-value of the significance cut-off.
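The quotient described above (expected false-positives divided by genes detected) can be sketched as follows. This is the simple estimate stated in the text, not a full implementation of the Benjamini and Hochberg step-up procedure; the function name and the synthetic p-values are assumptions made for illustration.

```python
def estimated_fdr(p_values, cutoff, n_unchanged=None):
    """Expected false-positives at `cutoff` divided by genes detected."""
    if n_unchanged is None:
        # Conservative default: treat every gene on the array as unchanged.
        n_unchanged = len(p_values)
    detected = sum(p <= cutoff for p in p_values)
    if detected == 0:
        return 0.0
    expected_false = n_unchanged * cutoff
    return min(1.0, expected_false / detected)


# Synthetic data: 50 changed genes with tiny p-values plus 950 unchanged
# genes whose p-values are spread evenly between 0 and 1.
p_values = [0.0001] * 50 + [(i + 0.5) / 950 for i in range(950)]

fdr = estimated_fdr(p_values, cutoff=0.02)
print(round(fdr, 2))  # 0.29
```

Here 69 genes pass the cut-off and 20 false-positives are expected, so roughly 29% of the selected genes are estimated to be false discoveries.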
Based on two assumptions, it is possible to estimate the number of changed and unchanged genes in a microarray data set. The first assumption is that unchanged genes will have p-values evenly distributed between 0 and 1. The second assumption is that changed genes will not have p-values greater than a certain p-value threshold.
If there are no changed genes with p greater than the threshold then all of the genes with p greater than the threshold are unchanged. If the unchanged genes have evenly distributed p-values, then the density of unchanged genes above the threshold will be the same as the density of unchanged genes below the threshold. So, we calculate the density of unchanged genes above the threshold, and integrate this constant density from p equals 0 to 1.
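Under the two assumptions above, the number of unchanged genes can be estimated from the genes with p-values above the threshold. The sketch below uses a threshold of 0.5, which is an arbitrary illustrative choice, and synthetic p-values; the function name is hypothetical.

```python
def estimate_unchanged(p_values, threshold=0.5):
    """Estimate the number of unchanged genes from p-values above threshold.

    Assumes unchanged genes have evenly distributed p-values and that no
    changed gene has p > threshold (threshold value is an assumption).
    """
    above = sum(p > threshold for p in p_values)
    # Density of unchanged genes per unit of p, measured above the threshold.
    density = above / (1.0 - threshold)
    # Integrating a constant density from p = 0 to p = 1 gives density * 1.
    return min(float(len(p_values)), density)


# Synthetic data: 50 changed genes and 950 evenly spread unchanged genes.
p_values = [0.0001] * 50 + [(i + 0.5) / 950 for i in range(950)]

n_unchanged = estimate_unchanged(p_values)
print(n_unchanged)                  # 950.0
print(len(p_values) - n_unchanged)  # 50.0 changed genes
```

The estimate can then be used in place of the total number of genes when estimating the expected number of false-positives at a cut-off.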
Technique
Principal Components Analysis (PCA) is a mathematical transformation that can be applied to microarray data sets allowing data compression and dimensionality reduction. The primary objective is to transform the data into a new space where data analysis is easier. Principal components analysis transforms a number of (possibly) correlated variables into a (smaller) number of uncorrelated variables called principal components. The first principal component accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variability as possible.
The mathematical technique used in PCA requires solving for the eigenvalues and eigenvectors of the covariance matrix of the microarray data-set. The eigenvector associated with the largest eigenvalue has the same direction as the first principal component. The eigenvector associated with the second largest eigenvalue determines the direction of the second principal component, etc. The maximum number of eigenvectors equals the number of columns (samples) of the microarray data-set.
At the Microarray Resource we use principal components to view the distribution of variability within the various samples that make up an experiment.
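To make the eigenvector relationship concrete, the sketch below finds the first principal component of a tiny data set by power iteration on the sample covariance matrix. Power iteration is swapped in here to keep the example dependency-free; it is not necessarily how any particular package computes PCA, and a numerical library's symmetric eigensolver would normally be used instead. The function name, starting vector, and toy data are assumptions.

```python
def first_principal_component(rows, iterations=200):
    """rows: one list per gene, one value per sample (column).

    Returns (unit eigenvector, eigenvalue) for the largest eigenvalue of
    the sample covariance matrix between columns.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeated multiplication by cov converges to the
    # eigenvector of the largest eigenvalue, provided the start vector is
    # not orthogonal to it.
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    cov_v = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
    eigenvalue = sum(v[i] * cov_v[i] for i in range(d))
    return v, eigenvalue


# Four genes measured in two perfectly correlated samples.
data = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
direction, variance = first_principal_component(data)
print([round(x, 4) for x in direction])  # [0.7071, 0.7071]
print(round(variance, 4))                # 3.3333
```

In this toy case all of the variability lies along the diagonal, so the first principal component captures the full variance of both samples.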
Figure Here
Looking at samples
Looking at genes
Gene Clustering example (data before clustering / data after clustering)
Sample Clustering example (data before clustering / data after clustering)
At the Microarray Resource we use hierarchical clustering to visualize the expression profiles of a group of genes that have been selected using other statistical methods.
We perform hierarchical clustering using Spotfire software. If you would like us to perform hierarchical clustering on your data-set, just give us a list of genes to cluster and we’ll do the rest.
K-Means clustering is a technique that is used to divide genes into discrete groups.
Visualizations are often associated with the presentation of microarray data.
The most common of these visualizations is the heat map.
In a Volcano Plot, the fold change and significance for each gene are displayed as a scatter plot. Both fold change and significance are generally plotted in log scale. The spots take a characteristic volcano form because absolute fold change is correlated with significance.
Volcano plots can be used to demonstrate fold change and significance cut-offs.
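The coordinates of a volcano plot can be computed as sketched below. Plotting fold change as log base 2 and significance as negative log base 10 is a common convention assumed here, and the gene names, values, and cut-offs (2-fold, p of 0.01) are hypothetical.

```python
import math


def volcano_point(fold_change_ratio, p_value):
    """Return (x, y) = (log2 fold change, -log10 p-value)."""
    return math.log2(fold_change_ratio), -math.log10(p_value)


# Hypothetical genes: (fold change as a ratio, p-value).
genes = {
    "geneA": (4.0, 0.001),   # strongly up, significant
    "geneB": (0.25, 0.005),  # strongly down, significant
    "geneC": (1.1, 0.6),     # essentially unchanged
}

points = {name: volcano_point(fc, p) for name, (fc, p) in genes.items()}

# Cut-offs drawn as lines on the plot: at least 2-fold and p <= 0.01.
selected = [
    name for name, (x, y) in points.items() if abs(x) >= 1.0 and y >= 2.0
]
print(selected)  # ['geneA', 'geneB']
```

Genes that pass both cut-offs fall in the upper-left and upper-right corners of the plot, which is why the cut-offs are easy to show as vertical and horizontal lines.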
Picture here
Volcano plots are also an excellent way to visualize the changes that occur in a group of genes.
Picture here
Talk about making volcano plots comparing more than two groups?
GenMapp
Oligo Design & Synthesis
The Microarray Resource will design oligonucleotide probes for detecting expression of specific genes of interest. This is not trivial as one must consider melting temperature, secondary structure, and sequence specificity, in addition to potential splice variants for each gene. We have automated many steps of this process.
The Microarray Resource will synthesize 50-70mers using the ABI 3900 DNA synthesizer at a rate of 100 oligos per day or more. In contrast to cDNAs, which are commonly used as microarray probes, oligos provide flexibility to analyze the abundance of all mRNAs produced from a given gene. One lesson from the large genome projects is that complexity may be generated, in part, by the surprisingly large number of mRNA splice variants derived from a single gene. cDNA would not allow one to easily distinguish among different splice variants.
Some pre-designed gene sets (e.g. 100 tumor suppressor genes) will soon be listed on the website to provide a starting point for figuring out what genes an investigator might want to analyze with their custom arrays.
Data Analysis
Data Warehousing & Data Analysis Consulting
Effectively managing and analyzing the large volume of data generated in each microarray experiment is a key factor to the successful use of the experimental approach. One of our principal objectives as a microarray core facility is to provide software that will allow users of the Microarray Resource to get the most out of their data.
C++
Netaffx
Excel
Spotfire
For Microarray Resource customers, ANOVA is implemented within Microsoft Excel.
Assumptions/Limitations of ANOVA
GeneChip probe design