Bio::DB::GenBank batch mode usage

by John Tyree

I'm trying to use Bio::DB::GenBank to download a large number of files
by accession number. The docs say not to do this in normal mode to
reduce server load. There is some kind of helper function associated
with this.

    %params = Bio::DB::GenBank->get_params('batch');

But I don't understand how to use it. If you pass the hash using:

     Bio::DB::GenBank->new(%params);

it raises the following and dies:

--------------------- WARNING ---------------------
MSG: invalid retrieval type tool must be one of (pipeline,io_string,tempfile
---------------------------------------------------

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: seq_start() must be integer value if set
STACK: Error::throw
STACK: Bio::Root::Root::throw
/usr/lib64/perl5/site_perl/5.10.0/Bio/Root/Root.pm:357
STACK: Bio::DB::NCBIHelper::seq_start
/usr/lib64/perl5/site_perl/5.10.0/Bio/DB/NCBIHelper.pm:416
STACK: Bio::DB::NCBIHelper::new
/usr/lib64/perl5/site_perl/5.10.0/Bio/DB/NCBIHelper.pm:117
STACK: Find_Patient_By_AccNo.pl:93

There is a deprecated method called get_Stream_by_batch() but how does
one achieve batch mode using the proper get_Stream_by_id() ?

Thanks,
John
_______________________________________________
Bioperl-l mailing list
Bioperl-l@...
http://lists.open-bio.org/mailman/listinfo/bioperl-l

Re: Bio::DB::GenBank batch mode usage

by Chris Fields

If you are just downloading the records to a file, it might be better
to retrieve the raw records using EUtilities, provided you have
either the accession number or the GI. Downloading via
Bio::DB::GenBank parses each record first, so you then have to write
it back out to a file via Bio::SeqIO.

---------------------------

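use strict;
use warnings;
use Bio::DB::EUtilities;
use Bio::SeqIO; # only needed if you parse records.gb afterwards

my @ids = (); # your GI/acc here

# efetch returns the raw records, so nothing is parsed on the client side
my $factory = Bio::DB::EUtilities->new(
    -eutil   => 'efetch',
    -db      => 'nucleotide',
    -rettype => 'genbank',
    -id      => \@ids);

# write the response straight to a file
$factory->get_Response(-file => "records.gb");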

---------------------------
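
For comparison, the Bio::DB::GenBank route mentioned above (parse each
record, then write it back out via Bio::SeqIO) might look roughly like
the sketch below. The sub names copy_records and fetch_to_file, the
output file name and the command-line usage are assumptions made for
illustration and are not from this thread. The BioPerl modules are
loaded lazily, and nothing touches the network unless accessions are
given on the command line.

```perl
use strict;
use warnings;

# Copy every record from a sequence stream to a writer.
# Works with any objects that provide next_seq() and write_seq().
# Returns the number of records copied.
sub copy_records {
    my ($in, $out) = @_;
    my $n = 0;
    while (my $seq = $in->next_seq) {
        $out->write_seq($seq);
        $n++;
    }
    return $n;
}

# Hypothetical wrapper: fetch a list of accessions in one request
# and write them to $file in GenBank format.
sub fetch_to_file {
    my ($file, @accs) = @_;
    require Bio::DB::GenBank;
    require Bio::SeqIO;

    my $gb  = Bio::DB::GenBank->new(-retrievaltype => 'tempfile');
    my $in  = $gb->get_Stream_by_acc(\@accs);   # array ref => one stream
    my $out = Bio::SeqIO->new(-file => ">$file", -format => 'genbank');
    return copy_records($in, $out);
}

# Usage (assumed): perl fetch_gb.pl ACC1 ACC2 ...
if (@ARGV) {
    my $n = fetch_to_file('records.gb', @ARGV);
    print "wrote $n records to records.gb\n";
}
```

Every record goes through the parser and the writer here, which is
the overhead the EUtilities version avoids.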

If you have a long list of IDs you can use epost first, then efetch
using the search history.  This page has a few recipe scripts:

http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook
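
A rough sketch of the epost-then-efetch pattern, along the lines of
the cookbook recipes. The sub names unique_ids and fetch_via_history,
the file name and the command-line usage are assumptions made for
illustration. The -email parameter is also an assumption: recent
BioPerl releases accept it, so check the version you have installed.
Nothing touches the network unless arguments are given.

```perl
use strict;
use warnings;

# Hypothetical helper: trim whitespace, drop blanks and duplicates,
# keep the original order.
sub unique_ids {
    my %seen;
    my @out;
    for my $id (@_) {
        (my $clean = $id) =~ s/^\s+|\s+$//g;
        next if $clean eq '' or $seen{$clean}++;
        push @out, $clean;
    }
    return @out;
}

# Post the IDs to the NCBI history server, then fetch them all
# through the returned history and write the raw records to $file.
sub fetch_via_history {
    my ($file, $email, @ids) = @_;
    require Bio::DB::EUtilities;

    my $factory = Bio::DB::EUtilities->new(
        -eutil          => 'epost',
        -db             => 'nucleotide',
        -id             => [ unique_ids(@ids) ],
        -email          => $email,
        -keep_histories => 1);

    my $history = $factory->next_History
        or die "epost did not return a history\n";

    # reuse the same factory for efetch, pointing it at the history
    $factory->set_parameters(
        -eutil   => 'efetch',
        -rettype => 'genbank',
        -history => $history);

    $factory->get_Response(-file => $file);
    return $file;
}

# Usage (assumed): perl fetch_history.pl you@example.org ID1 ID2 ...
if (@ARGV > 1) {
    my ($email, @ids) = @ARGV;
    fetch_via_history('records.gb', $email, @ids);
}
```

The IDs are uploaded once with epost, and efetch then refers to them
through the history instead of carrying the whole list in the request.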

chris

On Jul 2, 2009, at 1:50 PM, John Tyree wrote:

> I'm trying to use Bio::DB::GenBank to download a large number of files
> by accession number. The docs say not to do this in normal mode to
> reduce server load. There is some kind of helper function associated
> with this.
>
>   %params = Bio::DB::GenBank->get_params('batch');
>
> But I don't understand how to use it. If you pass the hash using:
>
>    Bio::DB::GenBank->new(%params);
>
> it raises the following and dies:
>
> --------------------- WARNING ---------------------
> MSG: invalid retrieval type tool must be one of (pipeline,io_string,tempfile
> ---------------------------------------------------
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: seq_start() must be integer value if set
> STACK: Error::throw
> STACK: Bio::Root::Root::throw
> /usr/lib64/perl5/site_perl/5.10.0/Bio/Root/Root.pm:357
> STACK: Bio::DB::NCBIHelper::seq_start
> /usr/lib64/perl5/site_perl/5.10.0/Bio/DB/NCBIHelper.pm:416
> STACK: Bio::DB::NCBIHelper::new
> /usr/lib64/perl5/site_perl/5.10.0/Bio/DB/NCBIHelper.pm:117
> STACK: Find_Patient_By_AccNo.pl:93
>
> There is a deprecated method called get_Stream_by_batch() but how does
> one achieve batch mode using the proper get_Stream_by_id() ?
>
> Thanks,
> John
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@...
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
