- Start lcDNA (Figure 1).
Figure 1. Load Data Tab. This is the tab where users load data for further analysis. Correctly loading data is essential to reducing user frustration.
We have set up a drag-and-drop environment for loading data files. To select a file, left-click to highlight it, then right-click and hold to pick it up, move the filename to the drop destination, and release the mouse button. Drag-and-drop targets include the text "Drag Gene ID File Here" | "Gene ID File: XXXX" and the box under the header "Loaded Intensity Files".
Entries in the "Loaded/Imported Files" box can be deleted by left-clicking to highlight the file and then right-clicking to remove the entry.
- Click on "File" in the menubar and select "Load w/o Gene ID File".
- Load your intensity data.
- Navigate to the intensity files (Table 1). Drag each one to the box under the label "Loaded Intensity Files".
Table 1. Tutorial data files.
| Filename | Explanation |
| --- | --- |
| R1S1.txt | Real Experiment 1, Slide 1 |
| R1S2.txt | Real Experiment 1, Slide 2 |
| R2S1.txt | Real Experiment 2, Slide 1 |
| R2S2.txt | Real Experiment 2, Slide 2 |
| C1S1.txt | Calibration 1, Slide 1 |
| C1S2.txt | Calibration 1, Slide 2 |
| C2S1.txt | Calibration 2, Slide 1 |
| C2S2.txt | Calibration 2, Slide 2 |
- Specify Analysis Options (Figure 2).
Figure 2. Analysis Options menu. Once you have loaded/imported your files, you may want to select how the data are to be analyzed.
Click on "Analysis Options" in the menubar (if you click on the perforation that appears, you can tear off the menu). To turn on a function, click its "On" radio button; a filled button means the function is selected. If the function has additional options, they can be accessed by clicking on them. For a description of what the functions do, see the instruction manual.
- Dye Switching: If you used the dye-swapping technique in your experiment, you need to specify which files had the dyes swapped. If you already corrected for this in your input files, you can skip this option.
Figure 3. For each real experiment we switched dyes on the second slide.
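To make the dye-switching step concrete, here is a minimal sketch of what the correction amounts to: on a dye-swapped slide the two channels are exchanged so that every slide has the same orientation. This is not lcDNA's own code; the function name `orient_channels` and the list-based data layout are assumptions made for illustration. The set of swapped files follows Figure 3 (the second slide of each real experiment).

```python
# Hypothetical illustration only; lcDNA applies this internally via the GUI.
# Per Figure 3, the second slide of each real experiment was dye-swapped.
DYE_SWAPPED = frozenset({"R1S2.txt", "R2S2.txt"})


def orient_channels(filename, channel_1, channel_2, dye_swapped=DYE_SWAPPED):
    """Return (channel_1, channel_2) in a common orientation.

    For files listed in `dye_swapped`, the two channels are exchanged;
    all other files are returned unchanged.
    """
    if filename in dye_swapped:
        return channel_2, channel_1
    return channel_1, channel_2
```

If your input files were already corrected for dye swapping, no exchange is needed, which is why the option can be skipped in that case.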
- Eliminate Extremes: Currently, we only eliminate spots based on the intensity values.
Figure 4. We have decided to eliminate spots with intensity values outside of the range 200:63000.
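The effect of this option can be sketched in a few lines. The sketch below assumes each spot is a `(gene_id, channel_1, channel_2)` tuple, that the bounds are inclusive, and that a spot is dropped when either channel falls outside the range; all three are assumptions for illustration, not a description of lcDNA's internals.

```python
# Range used in this tutorial (Figure 4).
LOW, HIGH = 200, 63000


def eliminate_extremes(spots, low=LOW, high=HIGH):
    """Keep only spots whose two channel intensities both lie in [low, high].

    `spots` is a list of (gene_id, channel_1, channel_2) tuples
    (a hypothetical layout chosen for this example).
    """
    return [
        spot for spot in spots
        if low <= spot[1] <= high and low <= spot[2] <= high
    ]
```

For example, a spot with intensities (150, 500) is dropped because 150 is below 200, and a spot with (5000, 64000) is dropped because 64000 is above 63000.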
- Quality Filtering: This can be run only if you have multiple spots for each gene.
Figure 5. We will compare each spot to 50 others with similar mean intensities and reject those whose coefficient of variation is above the 90th percentile.
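The windowed comparison described in the caption can be sketched as follows. This is a rough illustration under stated assumptions, not lcDNA's implementation: each gene is summarized as `(gene_id, mean_intensity, cv)`, the 50 neighbours are the genes closest in mean intensity, and the percentile uses a simple nearest-rank rule. The function names are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (needs >= 2 values)."""
    return statistics.stdev(values) / statistics.mean(values)


def quality_filter(genes, window=50, percentile=90):
    """Return the ids of genes that pass the filter.

    `genes` is a list of (gene_id, mean_intensity, cv) tuples. A gene is
    rejected when its cv exceeds the `percentile`-th percentile of the cv
    values of the `window` genes closest to it in mean intensity.
    """
    ranked = sorted(genes, key=lambda g: g[1])
    kept = []
    for i, (gene_id, mean_intensity, cv) in enumerate(ranked):
        # The nearest `window` genes must lie within `window` ranks of i.
        lo, hi = max(0, i - window), min(len(ranked), i + window + 1)
        candidates = ranked[lo:i] + ranked[i + 1:hi]
        neighbours = sorted(
            candidates, key=lambda g: abs(g[1] - mean_intensity)
        )[:window]
        if not neighbours:
            kept.append(gene_id)  # nothing to compare against
            continue
        cvs = sorted(g[2] for g in neighbours)
        # Nearest-rank percentile, using integer ceiling division.
        rank = max(1, (percentile * len(cvs) + 99) // 100)
        if cv <= cvs[rank - 1]:
            kept.append(gene_id)
    return kept
```

Because the coefficient of variation needs at least two values per gene, a sketch like this also shows why the option can be run only when each gene has multiple spots.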
- Normalization: For the calibration slides we select calibration, which performs LOWESS normalization. For the comparative slides we normalize using the rank-invariant method; for our E. coli experiments (~4300 genes) we usually use Iteration=T, % Threshold = 0.2, and Ext Threshold = 25. If the slides are low quality, or have few genes, then the % Threshold should be increased, or Iteration=F should be selected. Note: If a normalization technique does not use some of the options in the row, lcDNA disregards the unused options.
Figure 6. The calibration slides (C) are normalized as calibration and the experimental slides (R) are normalized as comparative.
- Assess Expression: This function runs an MCMC simulation to determine which of our data points we should have confidence in. If the score is >0.975, we are relatively confident that channel_2 is downregulated relative to channel_1; if the score is <0.025, we are relatively confident that channel_2 is upregulated relative to channel_1. After running this simulation, select the data that you are confident in (score >0.975 or score <0.025) and use it for further analysis. Note: Calibration hybridizations will, in most cases, increase the number of genes that are flagged as differentially regulated by lcDNA. Only two biologically independent calibration hybridizations are required, although more may help. As long as the cell type is the same, the calibration hybridization data can, in most cases, be used with different experiments.
Figure 7. All of the slides are from the same time point (Data Set). The first 4 are calibration slides (they can be used for multiple data sets as long as you use the same RNA and microarray protocol). The first two (C1S1 and C1S2) are replicates within one experiment and the second two (C2S1 and C2S2) are replicates within another experiment. The second 4 are comparative experiment slides. The first two (R1S1 and R1S2) are replicates within one experiment and the second two (R2S1 and R2S2) are replicates within another experiment. We have set the number of genes to a value (5000) greater than the number of genes (~4300) on our slides. Since we have multiple slides for each experiment, we select the Technical Replicates option.
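Once Assess Expression has produced scores, selecting the confident genes is a simple threshold test. The sketch below assumes the scores are already available as a dictionary mapping gene id to score; the output file format is not described here, so the in-memory layout and the function names are hypothetical.

```python
def classify_score(score, lower=0.025, upper=0.975):
    """Label a score using the thresholds given in this tutorial."""
    if score > upper:
        return "channel_2 down"  # channel_2 downregulated vs. channel_1
    if score < lower:
        return "channel_2 up"    # channel_2 upregulated vs. channel_1
    return "not confident"


def confident_genes(scores, lower=0.025, upper=0.975):
    """Return {gene_id: label} for genes outside the [lower, upper] band.

    `scores` maps gene id to score (an assumed layout for illustration).
    """
    labelled = {
        gene: classify_score(score, lower, upper)
        for gene, score in scores.items()
    }
    return {g: lab for g, lab in labelled.items() if lab != "not confident"}
```

Note that a gene qualifies if its score is above 0.975 or below 0.025; no single score can satisfy both conditions.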
- Click on the Analyze Loaded Files button. Analysis can take from a few seconds to 10 or more minutes, depending on your CPU speed, the analysis options selected (Assess Expression is the most intensive option), and how many files you are analyzing. While the program is analyzing your data files it may appear to have locked up or crashed; this is usually not the case, as some of the procedures simply take time. However, if you have a reasonably fast computer, are not analyzing many files, and the analysis takes more than an hour, then it has probably crashed.
Figure 8. After the analysis has completed, a window will pop up; click on Create New Directory and enter a name for a directory where your data will be saved when you are prompted for a file name. A directory will be created and subdirectories will hold the analyzed data. Currently, a subdirectory will be created for each analysis option that you selected. The assess expression data files will be named with the rootname of the first data file for a data set.
- Data Visualization
To open the plot options dialog, double left-click anywhere on the plot. The plots also have a zooming function. To zoom, left-click and hold, expand the box over the area of interest, and release. To unzoom, right-click (currently we only support a complete unzoom). The zooming functions may seem of limited use at the moment, but we are adding a feature that will allow you to click on a spot to get information about it; this feature will be in the frame on the left-hand side that currently holds the filenames.

Figure 9. The Plot Options pop-up window is activated by a double left-click on the plot. To plot a given data type, the file must have been analyzed with the corresponding function. Once your files are loaded, you can plot the "Raw Data" data type without running any analysis. Warning: You should run your data through the eliminate extremes function before using MA_Plots; this will remove negative values.


Figure 10. MA-plots are useful in visualizing the quality of a slide (see Tseng et al.). The top plot is raw calibration data (C1S1). The bottom plot is raw comparative data (R1S1).
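For readers unfamiliar with MA-plots, the sketch below computes the two plotted quantities under the conventional definitions, M = log2(channel_2 / channel_1) and A = (log2(channel_1) + log2(channel_2)) / 2. Which channel lcDNA places in the numerator is an assumption here, and the function name `ma_values` is hypothetical. The sketch also shows why negative values must be removed first: the logarithm is undefined for non-positive intensities.

```python
import math


def ma_values(channel_1, channel_2):
    """Return (A, M) lists for spots where both intensities are positive.

    M is the log ratio and A the mean log intensity. Spots with a zero or
    negative intensity in either channel are skipped, because log2 is
    undefined for them.
    """
    a_values, m_values = [], []
    for c1, c2 in zip(channel_1, channel_2):
        if c1 <= 0 or c2 <= 0:
            continue
        log_1, log_2 = math.log2(c1), math.log2(c2)
        m_values.append(log_2 - log_1)
        a_values.append((log_1 + log_2) / 2)
    return a_values, m_values
```

In a calibration slide, where both channels carry the same sample, M would be expected to scatter around zero across the range of A.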


Figure 11. Intensity plots. MA-plots are better for data visualization; nonetheless, this type of plot is also included. The top plot is C1S1. The bottom plot is R1S1.

Figure 12. Normalization / Slide_To_Slide plots. This function plots the normalized log ratios for one slide versus another, allowing the user to graphically assess the slide-to-slide variability (of repeats within or across experiments). If the slides have multiple spots for each gene, this function averages the values for each slide and then plots the data. The above figure is a plot of R1S1 vs R1S2 (within experiment). If your experiment is relatively clean, you can expect the data to fall on the 45 degree line.
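The averaging and pairing behind a slide-to-slide plot can be sketched as follows. The sketch assumes each slide is a list of `(gene_id, normalized_log_ratio)` pairs, one per spot; that layout and the function names are hypothetical and are not taken from lcDNA.

```python
from collections import defaultdict
from statistics import mean


def gene_means(spots):
    """Average the log ratios of replicate spots for each gene."""
    grouped = defaultdict(list)
    for gene_id, log_ratio in spots:
        grouped[gene_id].append(log_ratio)
    return {gene_id: mean(values) for gene_id, values in grouped.items()}


def slide_to_slide(slide_a, slide_b):
    """Return (x, y) points for genes present on both slides.

    x is the per-gene mean log ratio on slide_a, y the same on slide_b.
    Points close to the line y = x indicate good agreement.
    """
    means_a, means_b = gene_means(slide_a), gene_means(slide_b)
    shared = sorted(means_a.keys() & means_b.keys())
    return [(means_a[g], means_b[g]) for g in shared]


def max_deviation(points):
    """Largest vertical distance of any point from the 45 degree line."""
    return max(abs(x - y) for x, y in points)
```

A small `max_deviation` for a within-experiment pair such as R1S1 vs R1S2 corresponds to the points hugging the 45 degree line in the plot.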
Recommended Reading:
- Tseng, G.C., Oh, M.-K., Rohlin, L., Liao, J.C. and Wong, W.H. (2001) Issues In cDNA Microarray Analysis: Quality Filtering, Channel Normalization, Models of Variations and Assessment of Gene Effects. Nucleic Acids Res, 29, 2549-2557.
- Hyduke, D.R., Rohlin, L., Kao, K.C. and Liao, J.C. (2003) A Software Package for cDNA Microarray Data Normalization and Assessing Confidence Intervals. Submitted.