perl programme

ojefua

2:58 pm on Nov 13, 2008 (gmt 0)

Hello,

I need to get subsequences from a long stretch of DNA sequence. The idea is to print out all possible subsequences beginning with (TTA|CTA|CTG|TTG|CTC) and ending with (TAG|TGA|TAA).
E.g., in a DNA sequence like the one below:

$dna = "GGGCTACCCCGCCTCAAAGGGGGGTTACCCGGCCCGTTGAAACCCGGTCCGGGCTTAAAAGGGTAA";

only these subsequences can be obtained:

CTACCCCGCCTCAAAGGGGGGTTACCCGGCCCGTTGAAACCCGGTCCGGGCTTAAAAGGGTAA

CTCAAAGGGGGGTTACCCGGCCCGTTGAAACCCGGTCCGGGCTTAAAAGGGTAA

TTACCCGGCCCGTTGAAACCCGGTCCGGGCTTAAAAGGGTAA

TTAAAAGGGTAA

So the point is that each subsequence begins at a position in the DNA where a start codon occurs and must end at the next stop codon (in this case there is only one stop codon, "TAA").
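A minimal sketch of one way to do this in Perl, not a definitive solution. It assumes the stop codon must be in the same reading frame as the start codon, which is what the expected output above suggests. The name find_orfs is hypothetical, made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return every subsequence that begins at a start
# codon and runs, codon by codon, to the first in-frame stop codon.
# Assumption: the stop codon must share the start codon's reading frame.
sub find_orfs {
    my ($dna) = @_;
    my %is_start = map { $_ => 1 } qw(TTA CTA CTG TTG CTC);
    my %is_stop  = map { $_ => 1 } qw(TAG TGA TAA);
    my @found;

    # Try every offset as a possible start codon.
    for my $i (0 .. length($dna) - 3) {
        next unless $is_start{ substr($dna, $i, 3) };

        # Step forward three bases at a time so we stay in frame.
        for (my $j = $i + 3; $j <= length($dna) - 3; $j += 3) {
            if ($is_stop{ substr($dna, $j, 3) }) {
                push @found, substr($dna, $i, $j + 3 - $i);
                last;    # stop at the first in-frame stop codon
            }
        }
    }
    return @found;
}

my $dna = "GGGCTACCCCGCCTCAAAGGGGGGTTACCCGGCCCGTTGAAACCCGGTCCGGGCTTAAAAGGGTAA";
print "$_\n" for find_orfs($dna);
```

On the example DNA this prints the four subsequences listed above plus a fifth, TTGAAACCCGGTCCGGGCTTAAAAGGGTAA, because TTG is in the start list and also occurs in frame. Stepping by three is what keeps out-of-frame hits such as the TGA inside GTTGAAA from ending a subsequence early.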

phranque

4:09 am on Nov 17, 2008 (gmt 0)

welcome to WebmasterWorld [webmasterworld.com], ojefua!

Please post your best effort at that code snippet so we can discuss the specific problem you are having with the code.

ojefua

8:12 am on Nov 19, 2008 (gmt 0)

Hi,

Please check this for me.

my $seq = "AAAAATGAAAATAAGGGAAATGAAAAAAAAAAGGGGGGGACGGG";

my $gene = "AAATGAAAAAAA";

If I match the gene against my sequence like so:

if ($seq =~ /$gene/g) {
    # pos($seq) gives the first position after the match
    # $` holds the upstream sequence, in this case: AAAAATGAAAATAAGGG
}

I am trying to find the position of the last stop codon in $`, assuming that any of the three possible stop codons may be in the sequence.

Notice the first two adenines in the sequence. The sequence must be read in the same frame as the match (in this case the third frame, but assume the frame is not known in advance for other sequences, because this is just an example).

The correct result should be the position of TAA in the sequence above.
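A rough sketch of one way to get that position, offered as an illustration rather than the only approach. It uses index() instead of $` and walks backwards from the match three bases at a time, so the frame never has to be known in advance. The name last_upstream_stop is hypothetical, and offsets are 0-based.

```perl
use strict;
use warnings;

# Hypothetical helper: 0-based offset of the last stop codon upstream of
# the first occurrence of $gene, read in the same frame as the match.
# Returns -1 if the gene is not found or no in-frame stop codon exists.
sub last_upstream_stop {
    my ($seq, $gene) = @_;
    my $start = index($seq, $gene);    # offset where the match begins
    return -1 if $start < 0;

    my %is_stop = map { $_ => 1 } qw(TAG TGA TAA);

    # Walk backwards one codon at a time; the first stop codon we meet
    # is the last one upstream of the match.
    for (my $p = $start - 3; $p >= 0; $p -= 3) {
        return $p if $is_stop{ substr($seq, $p, 3) };
    }
    return -1;
}

my $seq  = "AAAAATGAAAATAAGGGAAATGAAAAAAAAAAGGGGGGGACGGG";
my $gene = "AAATGAAAAAAA";
my $offset = last_upstream_stop($seq, $gene);
print "last in-frame stop codon at offset $offset\n";    # offset 11, the TAA
```

Here the match begins at offset 17, so the walk visits offsets 14 (GGG) and 11 (TAA) and stops there. The two leading adenines are never read as part of a codon, because stepping back by three from offset 17 ends at offset 2.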
