Steve's Electric Dreams

A BizTalk and .NET Blog




Monday, May 03, 2004 - Posts

AT Richness Viewer

I am starting a new side project. My wife is a biology graduate student at UNO. She is studying the genetics of plants. The software they use is either (1) WAY too expensive or (2) really lousy. What she needs is a way to look at nucleotide sequences visually. These sequences are strings of the characters A, T, G, C and U. For this purpose, G and C count as one group and A, T and U as the other.


She needs to be able to view a sequence in a way that lets her see the density of A/T/U in a region. Apparently, A/T/U-rich regions are usually exons (the parts of a gene that end up in the finished RNA) and G/C-rich regions are introns (the parts that get spliced out).

I am writing a C# Windows application that will allow her to take a set of sequences and look at them side-by-side. It will assign a color to each nucleotide based on the density of A/T/U in the region around it.

The first interesting part is how to assign colors. I am using an algorithm that assigns descending weights to nucleotides as they get farther from the target. I then scale the weighted total to 0-255, since I am displaying it in 8-bit color.

For example,

     AATATCGGCTATAGCATTCGATCAG
               ^
               Target (weight = 1.0)

The nucleotides immediately to the left and right of the target get a weight of 0.9, the next ones out get 0.8, and so on, dropping by 0.1 per position until the weight reaches zero ten positions away.
I am assigning a 1 to an A/T/U and a 0 to G/C.  So in this example, the total density for the target is:

  Target-9 A = 0.1
  Target-8 T = 0.2
  Target-7 A = 0.3
  Target-6 T = 0.4
  Target-5 C = 0
  Target-4 G = 0
  Target-3 G = 0
  Target-2 C = 0
  Target-1 T = 0.9
  Target A = 1.0
  Target+1 T = 0.9
  Target+2 A = 0.8
  Target+3 G = 0
  Target+4 C = 0
  Target+5 A = 0.5
  Target+6 T = 0.4
  Target+7 T = 0.3
  Target+8 C = 0
  Target+9 G = 0
  ---------------------
  TOTAL  5.8 out of a possible 10.0
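
The table above can be reproduced with a few lines of code. This is only a minimal sketch in Python, not the C# code from the application. The names `weight`, `weighted_density` and `REACH` are made up for illustration, the target is taken to be the A at 0-based index 10, and skipping positions that fall off either end of the sequence is an assumption, since the example never gets near an edge.

```python
AT_RICH = set("ATU")   # bases that score 1; G and C score 0
REACH = 10             # assumed: the weight reaches zero 10 positions from the target


def weight(distance):
    """Linear falloff: 1.0 at the target, 0.9 next to it, 0.0 at REACH."""
    return max(0.0, 1.0 - abs(distance) / REACH)


def weighted_density(seq, target):
    """Return (total, max_possible) for the window centered on `target`."""
    total = 0.0
    possible = 0.0
    for offset in range(-(REACH - 1), REACH):
        i = target + offset
        if 0 <= i < len(seq):            # assumed: off-the-end positions are skipped
            w = weight(offset)
            possible += w                # what an all-A/T/U window would score
            if seq[i].upper() in AT_RICH:
                total += w
    return total, possible


seq = "AATATCGGCTATAGCATTCGATCAG"
total, possible = weighted_density(seq, 10)
print(f"{total:.1f} out of a possible {possible:.1f}")   # 5.8 out of a possible 10.0
```

Returning the maximum possible score alongside the total means a target near the start or end of a sequence can still be scaled against what its shortened window could have scored.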
 
Scaling this to 0-255 gives us:

  Density = 255 * (5.8 / 10.0) = 147.9 -> 148

Therefore, if I am rendering in grayscale, this nucleotide's RGB color is (148, 148, 148).
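
The scaling step is small enough to sketch in code. Again this is Python rather than the application's C#, and `density_to_gray` is a hypothetical name. Rounding to the nearest integer and clamping to the 0-255 range are assumptions; the description above only says the value is scaled.

```python
def density_to_gray(total, possible=10.0):
    """Scale a weighted density to an 8-bit gray level, returned as an RGB triple."""
    level = round(255 * total / possible)   # assumed: round to nearest
    level = max(0, min(255, level))         # keep the result inside 8 bits
    return (level, level, level)


print(density_to_gray(5.8))    # the example target
print(density_to_gray(10.0))   # a fully A/T/U window
print(density_to_gray(0.0))    # a fully G/C window
```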
 
Now that I have the density colors for each nucleotide, I am going to render a ribbon graph with these colors. This will allow someone to look at it and say "here is an ATU-rich region".
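
To get a feel for what the ribbon might look like before any GDI+ code exists, the gray level for every nucleotide can be computed in one pass and dumped to a plain-text PGM image, which most image viewers can open. This is a rough Python sketch, not the planned C# renderer. `ribbon_levels` and `to_pgm` are hypothetical names, the PGM file is just a stand-in for real drawing code, and scaling targets near the ends against their shortened windows is an assumption.

```python
def ribbon_levels(seq, reach=10):
    """Return one 0-255 gray level per nucleotide in `seq`."""
    levels = []
    for target in range(len(seq)):
        total = possible = 0.0
        lo = max(0, target - reach + 1)
        hi = min(len(seq), target + reach)
        for i in range(lo, hi):
            w = 1.0 - abs(i - target) / reach   # 1.0 at the target, falling by 0.1
            possible += w
            if seq[i] in "ATUatu":
                total += w
        # possible is never zero: the target itself always contributes 1.0
        levels.append(round(255 * (total / possible)))
    return levels


def to_pgm(levels, height=20):
    """Render the levels as a plain-text (P2) PGM ribbon, one column per nucleotide."""
    header = f"P2\n{len(levels)} {height}\n255\n"
    row = " ".join(str(v) for v in levels)
    return header + "\n".join([row] * height) + "\n"


levels = ribbon_levels("AATATCGGCTATAGCATTCGATCAG")
print(levels)
# To look at the ribbon: open("ribbon.pgm", "w").write(to_pgm(levels))
```

Lighter columns are A/T/U-rich and darker ones are G/C-rich, so a bright stretch is the "here is an ATU-rich region" cue.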

I am going to have a lot of fun figuring out how to let them zoom and pan around this graph for analysis.  I have a new book on GDI+ development that should help.  If anyone has any helpful ideas, please let me know.

 

posted Monday, May 03, 2004 10:36 AM by swright with 0 Comments



