How to parse DBLINK Project linetype in Genbank file

by Chris Stubben

I have some GenBank files (from genome sequences) with a DBLINK line type listing the Entrez Genome Project ID. Is there a way to parse this line? I can't find it among the Annotation objects using BioPerl 1.6.0, whether I print all annotations or just the dblinks, as below:

my @annotations = $so->annotation->get_Annotations('dblink');  # nothing

---
LOCUS       NC_001664             159322 bp    DNA     linear   VRL 16-OCT-2009
DEFINITION  Human herpesvirus 6A, complete genome.
ACCESSION   NC_001664
VERSION     NC_001664.2  GI:224020395
DBLINK      Project:14462
KEYWORDS    .
SOURCE      Human herpesvirus 6 (HHV-6A)


Thanks,

Chris Stubben

Re: How to parse DBLINK Project linetype in Genbank file

by Mark A. Jensen

Chris,
This might be a bug; the HOWTO (at http://www.bioperl.org/wiki/HOWTO:Feature-Annotation#Getting_the_Annotations) states that the GenBank text is 'DBSOURCE'. Maybe we're not parsing the 'DBLINK' GenBank tag?
A guru will surely chime in here.
MAJ
----- Original Message -----
From: "Chris Stubben" <stubben@...>
To: <Bioperl-l@...>
Sent: Wednesday, October 21, 2009 11:29 AM
Subject: [Bioperl-l] How to parse DBLINK Project linetype in Genbank file


_______________________________________________
Bioperl-l mailing list
Bioperl-l@...
http://lists.open-bio.org/mailman/listinfo/bioperl-l

Re: How to parse DBLINK Project linetype in Genbank file

by Scott Markel

Chris,

Looking at the code for Bio::SeqIO::genbank, dblink annotations are created
for DBSOURCE lines, not for DBLINK lines.  If you run your script on
NP_042883, for example, you'll see a dblink annotation.  A case-sensitive
grep for DBLINK shows no matches in genbank.pm, so those lines don't
currently produce any annotations.
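
Until the parser handles DBLINK, one possible workaround is to read the line straight from the flat file. The sketch below is plain Perl and does not use BioPerl; extract_dblinks is a hypothetical helper name, and the code assumes each DBLINK entry has the form "database:id", with any further entries on indented continuation lines.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): collect the "database:id"
# entries found in the DBLINK block(s) of a GenBank flat file.
sub extract_dblinks {
    my ($fh) = @_;
    my (@entries, $in_dblink);
    while (my $line = <$fh>) {
        if ($line =~ /^DBLINK\s+(.+?)\s*$/) {
            # First line of the block, e.g. "DBLINK      Project:14462"
            $in_dblink = 1;
            push @entries, $1;
        }
        elsif ($in_dblink && $line =~ /^\s+(\S.*?)\s*$/) {
            # Indented continuation line belonging to the same block
            push @entries, $1;
        }
        else {
            # Any other line (a new keyword in column 1) ends the block
            $in_dblink = 0;
        }
    }
    # Split each entry into a database name and an identifier
    return map {
        my ($db, $id) = split /:\s*/, $_, 2;
        +{ database => $db, id => $id };
    } @entries;
}

# Usage: the header lines from the record in the original question
my $record = <<'END';
LOCUS       NC_001664             159322 bp    DNA     linear   VRL 16-OCT-2009
DEFINITION  Human herpesvirus 6A, complete genome.
ACCESSION   NC_001664
VERSION     NC_001664.2  GI:224020395
DBLINK      Project:14462
KEYWORDS    .
END

open my $fh, '<', \$record or die "Cannot open in-memory record: $!";
for my $link (extract_dblinks($fh)) {
    print "$link->{database}\t$link->{id}\n";
}
close $fh;
```

For a file holding several records, the helper returns the entries from all of them in one flat list, so it would need to be called per record to keep them apart.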

Scott

Re: How to parse DBLINK Project linetype in Genbank file

by Chris Fields-5

This should be parsed and available as Bio::Annotation::DBLink.  I'll
give it a check, just in case.

chris

Re: How to parse DBLINK Project linetype in Genbank file

by Chris Fields-5

Chris, Mark,

Looks like it isn't parsed.  I'll file a bug report to track it; it should
be easy enough to add.

Notably, gbdriver catches it but chokes because the type mapper is
expecting a Bio::Annotation::DBLink (any non-specific information is
pushed into a Bio::Annotation::SimpleValue by default, hence the
painful death).

chris
