hacking BlastTable.pm to support blast -m 8

Hello,

As far as I understand, Bio::Index::BlastTable only supports the -m 9 blast format. Another popular and more compact format is -m 8, the main difference being that the blast program, the query and database, and the field names are not reported between each search, i.e. you get a much cleaner table (which looks much easier to parse).

Looking at BlastTable.pm, it seems the main hack would be in the sub _index_file. Right now it is:

    sub _index_file {
        my( $self,
            $file, # File name
            $i,    # Index-number of file being indexed
        ) = @_;

        my( $begin, # Offset from start of file of the start
                    # of the last found record.
        );

        open(my $BLAST, '<', $file) or $self->throw("cannot open file $file\n");
        my $indexpoint = 0;
        my $lastline = 0;
        while( <$BLAST> ) {
            if(m{^#\s+T?BLAST[PNX]} ) {
                my $len = length $_;
                $indexpoint = tell($BLAST)-$len;
            }
            if(m{^#\s+Query:\s+([^\n]+)}) {
                foreach my $id ($self->id_parser()->($1)) {
                    $self->debug("id is $id, begin is $indexpoint\n");
                    $self->add_record($id, $i, $indexpoint);
                }
            }
        }
    }

With the -m 8 format, couldn't this be done by taking the query name from the first column of the blast table, finding where the hits for this query start and stop, and passing that to add_record()?

I'm not sure I understand all the details regarding $i and $indexpoint, so if an expert eye could give me some advice or hack the code, that would be nice ;)

--Tristan
_______________________________________________
Bioperl-l mailing list
Bioperl-l@...
http://lists.open-bio.org/mailman/listinfo/bioperl-l

Re: hacking BlastTable.pm to support blast -m 8

On Oct 27, 2009, at 2:31 PM, Tristan Lefebure wrote:

> [...]
> Using the -m 8 format, is it me or this could be
> done by getting the query name from the first row
> of the blast table, find when the hits for this query
> starts and stop, and give this to add_record()?
>
> I'm kind of not sure to get all the details
> regarding the $i and $indexpoint... so well, if an
> expert eye could give me some advice or hack the code
> that would be nice ;)
>
> --Tristan

That should be feasible, yes, and you are correct. The main thing to make sure of is to retain the '#' for -m9, so the parser catches the BLAST executable and other info.

I'll go ahead and do this based on your suggestion, unless you have a patch ready. Also, it looks like the module is missing tests, so I can work on adding those for both -m8 and -m9 output.

chris
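
For readers following the thread, the approach discussed above can be sketched without BioPerl. This is only an illustrative sketch under stated assumptions, not the actual BlastTable.pm code: `index_blast_table` is a hypothetical helper name, and the returned pairs stand in for the `$id` and `$indexpoint` arguments that the quoted `_index_file` passes to `add_record()` (there, `$i` is the index number of the file being indexed and `$indexpoint` is a byte offset obtained with `tell`). For -m 9 input the sketch keeps the '#' header lines and records the offset of the program line; for -m 8 input it records the offset of the first row whose first column shows a new query.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: scan tabular BLAST output and
# return a list of [query_id, byte_offset] pairs, one per block of hits.
sub index_blast_table {
    my ($fh) = @_;
    my @records;
    my $last_query = '';   # query id of the block currently being read
    my $header_start;      # offset of a pending -m 9 "# BLASTN ..." line

    while (1) {
        my $line_start = tell($fh);   # offset of the line about to be read
        my $line = <$fh>;
        last unless defined $line;

        # -m 9: the program line opens a new block (same pattern as the
        # quoted _index_file).
        if ($line =~ m{^#\s+T?BLAST[PNX]}) {
            $header_start = $line_start;
            next;
        }
        # -m 9: the query line names the block; assume the id is the
        # first word after "Query:".
        if ($line =~ m{^#\s+Query:\s+(\S+)}) {
            my $start = defined $header_start ? $header_start : $line_start;
            push @records, [$1, $start];
            $last_query = $1;
            undef $header_start;
            next;
        }
        # Skip other comment lines and blank lines.
        next if $line =~ /^#/ || $line !~ /\S/;

        # -m 8: no headers, so a change in column 1 marks a new block.
        my ($query) = split /\t/, $line, 2;
        if ($query ne $last_query) {
            push @records, [$query, $line_start];
            $last_query = $query;
        }
    }
    return @records;
}

# Usage on a tiny in-memory -m 8 style table (made-up rows, three columns only).
my $m8 = join '', map { "$_\n" } "q1\ts1\t99.0", "q1\ts2\t95.0", "q2\ts3\t90.0";
open(my $fh, '<', \$m8) or die "cannot open in-memory table: $!";
printf "%s starts at byte %d\n", @$_ for index_blast_table($fh);
close $fh;
```

Recording only the start offset is enough if a reader seeks to that offset and stops at the next block, which is an assumption about how the index would be consumed.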

Re: hacking BlastTable.pm to support blast -m 8

On Tuesday 27 October 2009 15:50:24 Chris Fields wrote:

> I'll go ahead and do this based on your suggestion,
> unless you have a patch ready. Also, it looks like
> the module is missing tests, so I can work on adding
> those for both -m8 and -m9 output.

Great! I did nothing, you can go ahead. I bet you can do this 100x faster than me...

Re: hacking BlastTable.pm to support blast -m 8

On Oct 27, 2009, at 2:59 PM, Tristan Lefebure wrote:

> On Tuesday 27 October 2009 15:50:24 Chris Fields wrote:
>> I'll go ahead and do this based on your suggestion,
>> unless you have a patch ready. Also, it looks like
>> the module is missing tests, so I can work on adding
>> those for both -m8 and -m9 output.
>
> Great! I did nothing, you can go ahead. I bet you can do
> this 100x faster than me...

Committed to svn in r16301, along with some tests. Let me know if this doesn't work.

chris

Re: hacking BlastTable.pm to support blast -m 8

Works perfectly, thanks Chris.

On Tuesday 27 October 2009 19:39:26 Chris Fields wrote:

> Committed to svn in r16301, along with some tests. Let
> me know if this doesn't work.
>
> chris