
Forum Moderators: coopster & jatar k & phranque

subsequences!

   
11:45 am on Dec 4, 2008 (gmt 0)

5+ Year Member



I have a long string of letters, in this case DNA. My intention is to find particular start triplets that begin, and stop triplets that end, the subsequences. The substrings between these start and stop triplets (start and stop triplets inclusive) are then kept in an array.

For example my $string = "ATGAAAGTGAAAGGGAAAGGGGTGAGTGGGGGCGGGTTGGGTATTGGTTGGAAATAA"
should produce the substrings below, stored in an array:

@whatever =("ATGAAAGTGAAAGGGAAAGGGGTGAGTGGGGGCGGGTTGGGTATTGGTTGGAAATAA",
"GTGAAAGGGAAAGGGGTGAGTGGGGGCGGGTTGGGTGGTTGGAAATAA",
"ATTGGTTGGAAATAA");

I have this as part of my entire code as my best effort:

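# Collect start-codon positions that fall in reading frame 1.
# The lookahead lets overlapping, out-of-frame hits be skipped
# without hiding an in-frame codon.
while ($seq =~ m/(?=ATG|TTG|CTG|ATT|CTA|GTG)/gi) {
    my $matchPosition = pos($seq);        # index of the codon's first base
    if (($matchPosition % 3) == 0) {
        push(@startsRF1, $matchPosition);
    }
}

# Collect stop-codon end positions that fall in reading frame 1.
while ($seq =~ m/(?=TAG|TAA|TGA)/gi) {
    my $matchPosition = pos($seq) + 3;    # index just past the codon
    if (($matchPosition % 3) == 0) {
        push(@stopsRF1, $matchPosition);
    }
}

my $codonRange    = "";
my $startPosition = 0;
my $stopPosition  = 0;

# Reverse so that pop() returns the positions in ascending order.
@startsRF1 = reverse(@startsRF1);
@stopsRF1  = reverse(@stopsRF1);

while (scalar(@startsRF1) > 0) {
    $codonRange    = "";
    $startPosition = pop(@startsRF1);

    # Skip starts that lie inside the ORF already reported.
    if ($startPosition < $stopPosition) {
        next;
    }

    my $ORFseq = "";

    while (scalar(@stopsRF1) > 0) {
        $stopPosition = pop(@stopsRF1);
        if ($stopPosition > $startPosition) {
            my $difF = $stopPosition - $startPosition;
            $ORFseq = substr($seq, $startPosition, $difF);
            push(@arrayOfORFs, $ORFseq);
            last;    # the first stop after this start closes the ORF
        }
    }
}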
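
The reading-frame idea in the code above (only positions divisible by 3 count) can also be sketched by walking the sequence one codon at a time. This is a minimal sketch, not the poster's method: the sub name orfs_in_frame is made up for illustration, and it assumes that every in-frame start codon should be paired with the next in-frame stop codon. Under that assumption the example string gives five substrings rather than the three listed in the post, so the pairing rule may need adjusting.

```perl
use strict;
use warnings;

my $seq = "ATGAAAGTGAAAGGGAAAGGGGTGAGTGGGGGCGGGTTGGGTATTGGTTGGAAATAA";

# Lookup tables for the start and stop triplets named in the post.
my %is_start = map { $_ => 1 } qw(ATG TTG CTG ATT CTA GTG);
my %is_stop  = map { $_ => 1 } qw(TAG TAA TGA);

# Hypothetical helper: returns every substring that runs from an
# in-frame start codon to the next in-frame stop codon, inclusive.
sub orfs_in_frame {
    my ($dna, $frame) = @_;
    my (@open_starts, @orfs);

    for (my $i = $frame; $i + 3 <= length $dna; $i += 3) {
        my $codon = uc substr($dna, $i, 3);
        if ($is_stop{$codon}) {
            # Close every start seen since the previous stop.
            push @orfs, map { substr($dna, $_, $i + 3 - $_) } @open_starts;
            @open_starts = ();
        }
        elsif ($is_start{$codon}) {
            push @open_starts, $i;
        }
    }
    return @orfs;
}

my @orfs = orfs_in_frame($seq, 0);
print "$_\n" for @orfs;
```

Calling orfs_in_frame($seq, 1) or orfs_in_frame($seq, 2) would scan the other two forward reading frames in the same way.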

5:02 pm on Dec 4, 2008 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



my question is what does this have to do with web analytics or tracking/logging?
2:33 pm on Dec 11, 2008 (gmt 0)



So I moved this thread... but if it isn't Perl, I apologize! I figured the Perl guys in here would read this like I read the morning paper.
7:53 pm on Dec 11, 2008 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



if it looks hard to read, it's usually perl ;)

I'm not sure if I got your idea completely, because my attempt produces different results than the ones you give in your post.

use strict;
my $string = "ATGAAAGTGAAAGGGAAAGGGGTGAGTGGGGGCGGGTTGGGTATTGGTTGGAAATAA";
my @array = ();
my @starts = qw(ATG TTG CTG ATT CTA GTG);
my @stops = qw(TAG TAA TGA);
for my $start (@starts)
{
    for my $stop (@stops)
    {
        while($string =~ m/$start(.*)$stop/g)
        {
            push @array, $start . $1 . $stop;
        }
    }
}

print join("\n", @array);

results in

ATGAAAGTGAAAGGGAAAGGGGTGAGTGGGGGCGGGTTGGGTATTGGTTGGAAATAA
ATGAAAGTGAAAGGGAAAGGGGTGA
TTGGGTATTGGTTGGAAATAA
ATTGGTTGGAAATAA
GTGAAAGGGAAAGGGGTGAGTGGGGGCGGGTTGGGTATTGGTTGGAAATAA
GTGAAAGGGAAAGGGGTGA

but maybe I got something wrong, I've never been into the Bio-Stuff.

If that's not what you needed, please elaborate for a guy who knows he should have DNA somewhere in his body but not much more than that.

Also, did you check the modules available on CPAN? I hear there are quite a few for dealing with DNA. Maybe one of these can do the job much more cleanly: [search.cpan.org...]
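
One thing worth knowing about the regex in the script above: .* is greedy, so for each start/stop pair it runs to the last occurrence of the stop triplet, and it pays no attention to the reading frame. Adding a ? makes it stop at the first occurrence instead. A small sketch of the difference, using a made-up sequence ($demo is hypothetical, chosen only because it contains two TAA triplets):

```perl
use strict;
use warnings;

# Hypothetical short sequence with two TAA stops, for illustration only.
my $demo = "ATGAAATAAGGGTAA";

my ($greedy) = $demo =~ /(ATG.*TAA)/;     # .* runs to the last TAA
my ($lazy)   = $demo =~ /(ATG.*?TAA)/;    # .*? stops at the first TAA

print "greedy: $greedy\n";    # ATGAAATAAGGGTAA
print "lazy:   $lazy\n";      # ATGAAATAA
```

Which of the two is wanted depends on whether a subsequence should end at the nearest stop triplet or the farthest one.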

9:15 am on Dec 18, 2008 (gmt 0)

5+ Year Member



OK, thanks. Can you offer any hints on this one:

I have a string of letters and would like to use a regex to check whether these letters occur in a text.

bbbb, followed by either cg or gc or cc or gg, then followed by a t. So the regex should match any of the 4 possibilities:

bbbbcgt or bbbbgct or bbbbcct or bbbbggt.

Regards,
Emmanuel

11:42 am on Dec 18, 2008 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



you can use parentheses in regexps for two reasons: to catch parts of the match and work with it and to group things.

in your case, /bbbb(cg|gc|cc|gg)t/ would work and, if the string is bbbbcgt, $1 would contain cg. the | in the parentheses tells the regexp-machine that any one of those strings can match at this position.
if you don't need to know which of the four possibilities matched, you could also say /bbbb(?:cg|gc|cc|gg)t/ to indicate that you just want to group them, not save them.
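
A short sketch of both forms in use, written with the ASCII pipe character |. The sample strings in @samples are made up for illustration. Since the four alternatives happen to be every two-letter combination of c and g, a character class such as [cg]{2} should match the same set; that shortcut is an observation added here, not something from the question.

```perl
use strict;
use warnings;

# Hypothetical test strings: the four wanted forms plus one that should fail.
my @samples = qw(bbbbcgt bbbbgct bbbbcct bbbbggt bbbbcat);

for my $text (@samples) {
    # Capturing group: $1 holds whichever alternative matched.
    if ($text =~ /bbbb(cg|gc|cc|gg)t/) {
        print "$text matches, middle pair is $1\n";
    }
    else {
        print "$text does not match\n";
    }
}

# Non-capturing group: same matches, nothing is saved in $1.
my @hits = grep { /bbbb(?:cg|gc|cc|gg)t/ } @samples;

# Character-class variant: any two letters drawn from c and g.
my @hits_class = grep { /bbbb[cg]{2}t/ } @samples;

print scalar(@hits), " of ", scalar(@samples), " samples matched\n";
```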

5:30 am on Dec 19, 2008 (gmt 0)

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



welcome to WebmasterWorld [webmasterworld.com], Emmanuel !

you should learn the basics of regular expressions:
[perldoc.perl.org...]

the knowledge is essential to perl and can be transferred to many other disciplines.