Bio::Index::GenBank - by organism?

Many thanks to Ewan Birney et al. for Bio::Index::*

I can throw away my awful grep based index-by-accession stuff. :)

Any chance someone has also written an organism based index mechanism? Something like...

    while (my $seq = $inx->get_Seq_by_organism('*Xanthomonas*')) {
        print $seq->display_id . "\n";
    }

Thanks,

j

_______________________________________________
Bioperl-l mailing list
Bioperl-l@...
http://lists.open-bio.org/mailman/listinfo/bioperl-l

Re: Bio::Index::GenBank - by organism?

On Nov 9, 2009, at 6:05 PM, Jay Hannah wrote:

> Many thanks to Ewan Birney et al. for Bio::Index::*
>
> I can throw away my awful grep based index-by-accession stuff. :)
>
> Any chance someone has also written an organism based index
> mechanism? Something like...
>
>     while (my $seq = $inx->get_Seq_by_organism('*Xanthomonas*')) {
>         print $seq->display_id . "\n";
>     }
>
> Thanks,
>
> j

It should work via id_parser(); from Bio::Index::GenBank:

    $inx->id_parser(\&get_id);
    # make the index
    $inx->make_index($file_name);

    # here is where the retrieval key is specified
    sub get_id {
        my $line = shift;
        $line =~ /clone="(\S+)"/;
        $1;
    }

Change the code ref to deal with the line you want and parse the name out. Caveat: this may not be absolutely perfect (it only passes in a line at a time, and some species lines will wrap). I'm also not sure how this would work in cases where multiple sequences from the same species are present.

The other option is to preparse everything and tie a hash to store a species->UID map, then use that along with your Bio::Index index to grab what you need.

chris

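The "preparse everything" option could look roughly like the sketch below. It is only an illustration: the sub names build_organism_map and accessions_for, the sample records, and the XX0000nn accessions are all made up, a plain in-memory hash stands in for the tied hash, and only '*' is handled as a wildcard. It reads just the first ORGANISM line of each record, so a wrapped species name would be truncated, which is the caveat mentioned above.

```perl
use strict;
use warnings;

# Made-up GenBank-style records; the accessions are placeholders.
my $genbank_text = <<'END_GENBANK';
LOCUS       XX000001
ACCESSION   XX000001
  ORGANISM  Xanthomonas campestris
//
LOCUS       XX000002
ACCESSION   XX000002
  ORGANISM  Escherichia coli
//
LOCUS       XX000003
ACCESSION   XX000003
  ORGANISM  Xanthomonas oryzae
//
END_GENBANK

# Build a map of organism name -> list of accessions.
sub build_organism_map {
    my ($fh) = @_;
    my (%by_organism, $accession, $organism);
    while (my $line = <$fh>) {
        if ($line =~ /^ACCESSION\s+(\S+)/) {
            $accession = $1;
        }
        elsif ($line =~ /^\s+ORGANISM\s+(.+?)\s*$/) {
            $organism = $1;
        }
        elsif ($line =~ m{^//}) {
            # End of record: store it only if both fields were seen.
            push @{ $by_organism{$organism} }, $accession
                if defined $organism && defined $accession;
            ($accession, $organism) = (undef, undef);
        }
    }
    return \%by_organism;
}

# Return the accessions whose organism matches a pattern like '*Xanthomonas*'.
sub accessions_for {
    my ($map, $pattern) = @_;
    # Quote everything literally and turn each '*' into '.*'.
    my $regex = join '.*', map { quotemeta($_) } split /\*/, $pattern, -1;
    return map  { @{ $map->{$_} } }
           grep { /\A$regex\z/ } sort keys %$map;
}

open my $fh, '<', \$genbank_text or die "Cannot read sample text: $!";
my $map = build_organism_map($fh);
close $fh;

print "$_\n" for accessions_for($map, '*Xanthomonas*');
```

Each accession returned this way could then be handed to the existing Bio::Index::GenBank index to pull the full record, which sidesteps the question of several sequences sharing one organism key.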
Re: Bio::Index::GenBank - by organism?

You might also look at what mygenbank does:
http://homepage.mac.com/iankorf/mygenbank.html

On Nov 9, 2009, at 7:55 PM, Chris Fields wrote:

> It should work via id_parser(); from Bio::Index::GenBank:
>
>     $inx->id_parser(\&get_id);
>     # make the index
>     $inx->make_index($file_name);
>
>     # here is where the retrieval key is specified
>     sub get_id {
>         my $line = shift;
>         $line =~ /clone="(\S+)"/;
>         $1;
>     }
>
> Change the code ref to deal with the line you want and parse the name
> out. Caveat: this may not be absolutely perfect (it only passes in
> a line at a time, and some species lines will wrap). Also not sure
> how this would work in cases where multiple sequences from the same
> species are present.
>
> The other option is to preparse everything and tie a hash to store a
> species->UID map, then use that along with your Bio::Index index to
> grab what you need.
>
> chris

--
Jason Stajich
jason.stajich@...
jason@...

Re: Bio::Index::GenBank - by organism?

On Nov 9, 2009, at 9:55 PM, Chris Fields wrote:

> It should work via id_parser(); from Bio::Index::GenBank:
>
>     $inx->id_parser(\&get_id);
>     # make the index
>     $inx->make_index($file_name);
>
>     # here is where the retrieval key is specified
>     sub get_id {
>         my $line = shift;
>         $line =~ /clone="(\S+)"/;
>         $1;
>     }

This worked great for me today (tackling a different problem than the original). Thanks!!

j

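For anyone adapting the quoted callback to organism names, a slightly more defensive variant might look like the sketch below. The sub name get_organism and the sample lines are hypothetical, and the sketch assumes (as described earlier in the thread) that the callback is handed one line at a time. It returns the capture only when the match succeeds, because $1 is not meaningful after a failed match. It does not address wrapped species lines or several records sharing one organism key.

```perl
use strict;
use warnings;

# Hypothetical callback: return the organism name from an ORGANISM line,
# or an empty list for every other line.
sub get_organism {
    my $line = shift;
    return $line =~ /^\s+ORGANISM\s+(.+?)\s*$/ ? $1 : ();
}

# Hooking it up would follow the snippet quoted above (requires BioPerl):
#   $inx->id_parser(\&get_organism);
#   $inx->make_index($file_name);

# Stand-alone check on two made-up lines.
for my $line ("LOCUS       XX000001\n", "  ORGANISM  Xanthomonas campestris\n") {
    my @keys = get_organism($line);
    print @keys ? "key: $keys[0]\n" : "no key\n";
}
```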
Re: Bio::Index::GenBank - by organism?

On Nov 10, 2009, at 12:50 PM, Jason Stajich wrote:

> You might also look at what mygenbank does:
> http://homepage.mac.com/iankorf/mygenbank.html

It appears, perhaps, that BioSQL can provide *foo* searching like so:

http://www.biosql.org/wiki/Schema_Overview#TAXON.2C_TAXON_NAME

    SELECT DISTINCT include.ncbi_taxon_id
    FROM taxon
    INNER JOIN taxon AS include
        ON (include.left_value BETWEEN taxon.left_value AND taxon.right_value)
    WHERE taxon.taxon_id IN
        (SELECT taxon_id FROM taxon_name WHERE name LIKE '%fungi%')

So I think we're going to chase that for a while. I didn't see a *foo* search in MyGenBank?

Thanks,

j
http://clab.ist.unomaha.edu/CLAB/index.php/User:Jhannah
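To see what the query above is doing, here is a small pure-Perl sketch of the same nested-set logic, with no database involved. Everything in it is an assumption for illustration: the rows are a hand-made toy table, the left_value/right_value numbers are invented, each taxon has a single name (the real taxon_name table can hold several), and the sub name ncbi_ids_under is hypothetical. The case-insensitive regex stands in for LIKE '%fungi%', whose case handling depends on the database.

```perl
use strict;
use warnings;

# Toy stand-in for the taxon table; left/right values are invented.
my @taxon = (
    { taxon_id => 1, ncbi_taxon_id => 4751,  left_value => 1, right_value => 6 },
    { taxon_id => 2, ncbi_taxon_id => 4890,  left_value => 2, right_value => 5 },
    { taxon_id => 3, ncbi_taxon_id => 4932,  left_value => 3, right_value => 4 },
    { taxon_id => 4, ncbi_taxon_id => 33208, left_value => 7, right_value => 8 },
);

# Toy stand-in for taxon_name, simplified to one name per taxon_id.
my %taxon_name = (
    1 => 'Fungi',
    2 => 'Ascomycota',
    3 => 'Saccharomyces cerevisiae',
    4 => 'Metazoa',
);

# Return the distinct ncbi_taxon_id values of every taxon whose left_value
# lies between the left_value and right_value of a taxon matching the name.
sub ncbi_ids_under {
    my ($name_regex) = @_;
    my %seen;
    for my $root (grep { $taxon_name{ $_->{taxon_id} } =~ $name_regex } @taxon) {
        for my $t (@taxon) {
            $seen{ $t->{ncbi_taxon_id} } = 1
                if $t->{left_value} >= $root->{left_value}
                && $t->{left_value} <= $root->{right_value};
        }
    }
    return sort { $a <=> $b } keys %seen;
}

print join(', ', ncbi_ids_under(qr/fungi/i)), "\n";
```

The point of the nested-set columns is that a whole subtree falls inside one numeric range, so a name match on a high-level taxon pulls in everything beneath it without walking parent links.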