automated stand alone blast with repeat masker


by nisa_dar

Hi,

I'm running a standalone BLAST search against my local databases using the following code:

use strict;
use warnings;

use Bio::Seq;
use Bio::Tools::Run::StandAloneBlast;

# BLAST program and local database to search
my @params = (program => 'blastn', database => 'db.fa');
my $blast_obj = Bio::Tools::Run::StandAloneBlast->new(@params);

# Query sequence
my $seq_obj = Bio::Seq->new(-id => "test query", -seq => "TTTAAATATATTTTGAAGTATAGATTATATGTT");

# Run blastall and print the number of hits in the first result
my $report_obj = $blast_obj->blastall($seq_obj);
my $result_obj = $report_obj->next_result;
print $result_obj->num_hits, "\n";

How can I add a RepeatMasker step to this script?

Thanks
Nisa

Re: automated stand alone blast with repeat masker

by Dave Messina-3

I haven't done this myself, but from a quick search on the BioPerl website,
it looks like you'll want to use the Bio::Tools::Run::RepeatMasker module
<http://doc.bioperl.org/releases/bioperl-current/bioperl-run/Bio/Tools/Run/RepeatMasker.html>
to create a repeat-masked fasta file.

If you RepeatMask your query sequence(s), then you need to specify that
sequence when you create your Bio::Seq object.
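
The query-masking route can be sketched roughly as follows. This is a minimal sketch, not tested against a real installation: masked_blast_hits is a hypothetical helper name, and it assumes the wrapper's run and masked_seq methods behave as described in the Bio::Tools::Run::RepeatMasker documentation. Falling back to the unmasked query when no masked sequence comes back is also an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): mask a query, BLAST the masked
# sequence, and return the number of hits in the first result.
#   $masker : a Bio::Tools::Run::RepeatMasker factory
#   $blast  : a Bio::Tools::Run::StandAloneBlast factory
#   $seq    : a Bio::Seq object holding the unmasked query
sub masked_blast_hits {
    my ($masker, $blast, $seq) = @_;

    # Run the external RepeatMasker program on the query.
    $masker->run($seq);

    # Assumption: if no masked sequence is available (for example, no
    # repeats were found), search with the original query instead.
    my $masked = $masker->masked_seq;
    my $query  = defined $masked ? $masked : $seq;

    # Same BLAST call as in the original script.
    my $report = $blast->blastall($query);
    my $result = $report->next_result;
    return defined $result ? $result->num_hits : 0;
}

# Usage with real BioPerl objects (needs bioperl-run plus the RepeatMasker
# and BLAST programs installed):
#   my $masker = Bio::Tools::Run::RepeatMasker->new(mam => 1, noint => 1);
#   my $blast  = Bio::Tools::Run::StandAloneBlast->new(program => 'blastn',
#                                                      database => 'db.fa');
#   print masked_blast_hits($masker, $blast, $seq_obj), "\n";
```

Passing the two factories in as arguments keeps the helper independent of how they were configured.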

If you instead RepeatMask your database, you'll need to create a blast
database from the repeat-masked sequences and specify that db in your
@params. I don't think there's a module for running formatdb, but you can do
it through a system call.
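
The system call mentioned above might look something like this. It is only a sketch for legacy NCBI BLAST: the file name db.masked.fa is a placeholder, and formatdb_command and build_blast_db are hypothetical helper names, not BioPerl functions.

```perl
use strict;
use warnings;

# Build the argument list for NCBI formatdb (legacy BLAST).
#   -i  input FASTA file
#   -p  F means nucleotide input (T would mean protein)
sub formatdb_command {
    my ($fasta) = @_;
    return ('formatdb', '-i', $fasta, '-p', 'F');
}

# Run formatdb on the repeat-masked FASTA file. The list form of system()
# bypasses the shell, so odd characters in file names are not interpreted.
sub build_blast_db {
    my ($fasta) = @_;
    my @cmd = formatdb_command($fasta);
    my $status = system(@cmd);
    die "could not run formatdb: $!\n" if $status == -1;
    die "formatdb exited with status " . ($status >> 8) . "\n" if $status != 0;
    return $fasta;    # use this as the 'database' value in @params
}

# Show the command that would be run for a placeholder file name.
print join(' ', formatdb_command('db.masked.fa')), "\n";
# prints: formatdb -i db.masked.fa -p F
```

After build_blast_db('db.masked.fa') succeeds, the same file name would go into @params as the database.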



Dave
_______________________________________________
Bioperl-l mailing list
Bioperl-l@...
http://lists.open-bio.org/mailman/listinfo/bioperl-l

Re: automated stand alone blast with repeat masker

by nisa_dar

Do we have to install it separately? It doesn't seem to be on my system,
even though I have BioPerl installed.




Re: automated stand alone blast with repeat masker

by nisa_dar

This is the path to RepeatMasker.pm on my system:

/opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/RepeatMasker.pm

but when I run my program, I get this error message:

RepeatMasker program not found as  or not executable

Here is the code that produces this error:

#!/usr/bin/perl

use strict;
use warnings;

use Bio::Seq;
use Bio::SeqIO;    # needed for Bio::SeqIO->new below
use Bio::Tools::Run::StandAloneBlast;
use Bio::Search::Hit::HitI;
use Bio::Search::Hit::BlastHit;
use Bio::Search::HSP::BlastHSP;
use Bio::Search::HSP::HSPI;
use Bio::SearchIO;
use Bio::Tools::Run::RepeatMasker;

BEGIN {
    $ENV{REPEATMASKERDIR} = '/opt/rocks/lib/perl5/site_perl/5.8.8/Bio/Tools/';
}

my @params = ("mam" => 1, "noint" => 1);
my $factory = Bio::Tools::Run::RepeatMasker->new(@params);
my $in = Bio::SeqIO->new(-file => "boechera.fasta", -format => 'fasta');

I tried to locate the RepeatMasker executable by typing

which RepeatMasker

but the output was

/usr/bin/which: no RepeatMasker in
(/opt/openmpi/1.1.4/bin:/opt/lsfhpc/ego/1.2/linux2.6-glibc2.3-x86_64/etc:/opt/lsfhpc/ego/1.2/linux2.6-glibc2.3-x86_64/bin:/opt/lsfhpc/7.0/linux2.6-glibc2.3-x86_64/etc:/opt/lsfhpc/7.0/linux2.6-glibc2.3-x86_64/bin:/usr/kerberos/bin:/usr/java/jdk1.5.0_07/bin:/share/iNquiry/biotools/bin:/share/iNquiry/bin/lx24-x86:/share/iNquiry/bin/lx24-amd64:/opt/Bio/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/opt/modules/current/bin/:/opt/modules/bin/:/opt/Bio/glimmer/scripts:/opt/Bio/gromacs/bin:/opt/eclipse:/opt/ganglia/bin:/opt/maven/bin:/opt/rocks/bin:/opt/rocks/sbin:/home/vdar/bin)
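
The same check can be done from inside a Perl script before the wrapper is used. This is a minimal sketch using only core modules: find_program is a hypothetical helper name, and it assumes the executable is called RepeatMasker and that the wrapper consults REPEATMASKERDIR in the way the script above expects.

```perl
use strict;
use warnings;
use File::Spec;

# Return the full path of the first executable named $name found in @dirs,
# or undef if none of the directories contains one.
sub find_program {
    my ($name, @dirs) = @_;
    for my $dir (@dirs) {
        next unless defined $dir && length $dir;
        my $candidate = File::Spec->catfile($dir, $name);
        return $candidate if -f $candidate && -x _;
    }
    return;
}

# Look in REPEATMASKERDIR first (if it is set), then in every PATH directory.
my @dirs  = grep { defined } ($ENV{REPEATMASKERDIR}, File::Spec->path);
my $found = find_program('RepeatMasker', @dirs);

print defined $found
    ? "RepeatMasker found at $found\n"
    : "RepeatMasker not found; install it or point REPEATMASKERDIR at its directory\n";
```

A directory that holds only Perl modules, such as the Bio/Tools/ path above, will not pass this check, because it contains RepeatMasker.pm rather than an executable named RepeatMasker.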


Re: automated stand alone blast with repeat masker

by Dave Messina-3

> Do we have to install it separately? It doesn't seem to be on my system,
> even though I have BioPerl installed.
>

Yes.

BioPerl doesn't include all of the many programs it can interact with. It'd
be great if you could get everything in one shebang, but in practice this
isn't possible: there are so many bioinformatics programs, they are written
and maintained by their authors rather than by the BioPerl group, and they
are constantly being updated, so any version bundled with BioPerl would
quickly fall out of sync.


From the Bio::Tools::Run::RepeatMasker documentation:

"To use this module, the RepeatMasker program (and probably database) must
be installed. RepeatMasker is a program that screens DNA sequences for
interspersed repeats known to exist in mammalian genomes as well as for low
complexity DNA sequences. For more information, on the program and its
usage, please refer to http://www.repeatmasker.org/."



Dave