problem with alignments and sequence locations


by Steffen Heyne

Hi,

I'm using Bioperl for my research and it is very useful! Thank you!

Currently I have a problem with the location tags of sequences. I read in
seed alignments from Rfam (in Stockholm format, but I think the behavior is
similar for other formats).

If the location is like:

AB194432.1/908-846

the start/end values are changed to

$seq->start = 846
$seq->end = 908

and therefore the new location (e.g. from $seq->get_nse) is:

AB194432.1/846-908

The $seq->strand tag is correctly set to -1 in this case, but if the
alignment is written out again (clustal, stockholm, ...) this strand info
is lost and the sequences get this "wrong" location. This information is
important with respect to the sequence accession number.

Is there a way to set the location back to the original one, or is this
behavior intended? Every attempt to set it manually with $seq->start($val)
failed because of the automatic checking.

I'm using BioPerl 1.6.1.

Thanks!

steffen


--
---
Steffen Heyne, Dipl.-Bioinf.
Lehrstuhl für Bioinformatik
Institut für Informatik
Albert-Ludwigs-Universität Freiburg
Georges-Köhler-Allee 106
79110 Freiburg, Germany

Tel: (+49) 761 203 8239
Fax: (+49) 761 203 7462
Mail: heyne@...
_______________________________________________
Bioperl-l mailing list
Bioperl-l@...
http://lists.open-bio.org/mailman/listinfo/bioperl-l

Re: problem with alignments and sequence locations

by Chris Fields-5

On Nov 10, 2009, at 6:55 AM, Steffen Heyne wrote:

> Hi,
>
> I'm using Bioperl for my research and it is very useful! Thank you!
>
> Currently I have a problem with locations tags of sequences. I read  
> in seed alignments of Rfam (in stockholm format, but I think it is  
> similar to other formats).
>
> If the location is like:
>
> AB194432.1/908-846
>
> the start/end values are changed to
>
> $seq->start = 846
> $seq->end = 908
>
> and therefore the new location (e.g.$seq->get_nse) is:
>
> AB194432.1/846-908
>
> The $seq->strand tag is correctly set to -1 in this case, but if the  
> alignment is written out again (clustal, stockholm,...) this strand  
> info is lost and the sequences have this "wrong" location. But this  
> information is important in respect to the sequence accession number.
>
> Is there a way to set the location back to the original one or is  
> this behavior desired? Any manually setting with $seq->start($val)  
> failed due to automatic checking.
>
> I'm using bioperl 1.6.1
>
> Thanks!
>
> steffen

This is a definite bug. We discussed amending the NSE format because of
this (the subject came up over the last few months or so), but it has
fallen through the cracks. Fortunately, it is very easy to fix (the
relevant method is in LocatableSeq).

Does anyone have a problem with me adding this in? It will change the
output only for those instances where the strand is -1, so

AB194432.1/908-846

would be start = 846, end = 908, strand = -1

AB194432.1/846-908

would be start = 846, end = 908, strand = 1
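
The mapping above can be sketched in plain Perl as follows. This is only an illustration of the proposed behaviour, not the actual LocatableSeq code; parse_nse is a hypothetical name, and treating a single-position range (from equal to to) as strand 1 is an assumption.

```perl
use strict;
use warnings;

# Hypothetical parser (not the LocatableSeq implementation): split a
# "name/from-to" label and infer the strand from the coordinate order.
sub parse_nse {
    my ($label) = @_;

    my ($id, $from, $to) = $label =~ m{^(.+)/(\d+)-(\d+)$}
        or die "Not a name/start-end label: $label\n";

    # Descending coordinates mean minus strand; start is always the smaller value.
    my $strand = $from > $to ? -1 : 1;
    my ($start, $end) = $strand < 0 ? ($to, $from) : ($from, $to);

    return { id => $id, start => $start, end => $end, strand => $strand };
}

for my $label ('AB194432.1/908-846', 'AB194432.1/846-908') {
    my $loc = parse_nse($label);
    printf "%s => start=%d end=%d strand=%d\n",
        $label, @{$loc}{qw(start end strand)};
}
```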

chris