Question about sequence and blast interface


Question about sequence and blast interface

by Shane Brubaker


Hi, I have a couple of general questions about setting up a GBrowse site for the first time.  I am setting up a site for a genome which contains several hundred scaffolds.  One thing we would like to do is be able to BLAST a gene and then have a link to the appropriate location on the correct scaffold, and then go into GBrowse from there.

 

1. Is there a package out there that makes a BLAST web interface that then links into GBrowse?  I have seen quite a few places that have set this up themselves, but I’m wondering if there is something out there pre-packaged that would help me do this?  Keyword searching on gene annotations is another method of getting in that is desired.

2. Can you load the nucleotide sequence of the entire scaffold into GBrowse (I am using Chado as the back-end by the way) such that people could then select/highlight a region on the scaffold and then copy the sequence to their clipboard?

3. In general what are good ways to handle implementations where you have a lot of scaffolds?



Thanks very much for your help,

 

Shane Brubaker

 



This email and any attachments thereto may contain private, confidential, and privileged material for the sole use of the intended recipient. Any review, copying, or distribution of this email (or any attachments thereto) by others is strictly prohibited. If you are not the intended recipient, please contact the sender immediately and permanently delete the original and any copies of this email and any attachments thereto.

------------------------------------------------------------------------------
Come build with us! The BlackBerry® Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay
ahead of the curve. Join us from November 9-12, 2009. Register now!
http://p.sf.net/sfu/devconf
_______________________________________________
Gmod-gbrowse mailing list
Gmod-gbrowse@...
https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse

Re: Question about sequence and blast interface

by Scott Cain


Hi Shane,

See my notes below.

Scott


On Oct 2, 2009, at 7:32 PM, Shane Brubaker wrote:

> Hi, I have a couple of general questions about setting up a GBrowse  
> site for the first time.  I am setting up a site for a genome which  
> contains several hundred scaffolds.  One thing we would like to do  
> is be able to BLAST a gene and then have a link to the appropriate  
> location on the correct scaffold, and then go into GBrowse from there.
>
> 1.        Is there a package out there that makes a BLAST web  
> interface that then links into GBrowse?  I have seen quite a few  
> places that have set this up themselves, but I’m wondering if there  
> is something out there pre-packaged that would help me do this?  
> Keyword searching on gene annotations is another method of getting  
> in that is desired.

No, there isn't, but using BioPerl it shouldn't be too hard to parse
the BLAST output and insert links into the result.  You could
even write the link so that the BLAST hit gets temporarily placed in
the GBrowse display (it is just fun with HTTP GET URLs; it doesn't go
into the database).
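
A minimal sketch of the kind of link described above, assuming the GBrowse 1.x URL conventions used by Russell Smithies' script later in this thread (q= selects the region, add= places a temporary feature in the display). The base URL, the scaffold name, and the helper gbrowse_hit_link are hypothetical; the hit coordinates would normally come from a parsed BLAST report.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(min max);

# Hypothetical helper: build a GBrowse link for one BLAST hit.
#   $base   - GBrowse URL ending in '?' (assumed, site-specific)
#   $ref    - scaffold name as GBrowse knows it
#   $label  - name shown for the temporary feature
#   @ranges - [start, end] pairs, one per HSP on that scaffold
sub gbrowse_hit_link {
    my ($base, $ref, $label, @ranges) = @_;

    # the view spans from the lowest to the highest HSP coordinate
    my $min = min(map { @$_ } @ranges);
    my $max = max(map { @$_ } @ranges);

    # each HSP becomes one "start..end" segment of the added feature
    my $segments = join ',', map { "$_->[0]..$_->[1]" } @ranges;

    # %22 is a URL-encoded double quote, %20 an encoded space
    return "${base}q=$ref:$min..$max;"
         . "add=$ref+%22BLAST%20Hit%22+$label+$segments";
}

print gbrowse_hit_link(
    'http://example.org/cgi-bin/gbrowse/mygenome/?',
    'scaffold_12', 'query1', [1500, 1800], [2100, 2400],
), "\n";
```

Nothing is written to the database here; the feature exists only in the URL, so the link can be generated per hit while rendering the BLAST result page.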


> 2.       Can you load the nucleotide sequence of the entire scaffold  
> into GBrowse (I am using Chado as the back-end by the way) such that  
> people could then select/highlight a region on the scaffold and then  
> copy the sequence to their clipboard?

Yes, using the FastaDumper plugin that comes with GBrowse (look in
conf/plugins).  It will dump a selected region as FASTA.  You can also make
it go directly to BLAST.  Take a look at the configuration file for
the yeast data set that comes with GBrowse.

> 3.       In general what are good ways to handle implementations  
> where you have a lot of scaffolds?

Nothing in particular that I can think of.


-----------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research






Re: Question about sequence and blast interface

by Don Gilbert


Shane,

I use a Perl script to run NCBI Web BLAST and then turn its output into hyperlinks to GBrowse
for the insects BLAST server here:
http://insects.eugenes.org/species/blast/
The Perl script, sppblast.cgi, is linked at the bottom of that page. Heed the warning, though:
it is a messy script that only works here for my data sets, but it may give you ideas.

There is no real problem with the 100s of scaffolds you say you have, other than the user
interface needed to find them. It is when you have 100,000s (as some do) that computers and
databases start to choke.

- Don Gilbert
-- d.gilbert--bioinformatics--indiana-u--bloomington-in-47405
-- gilbertd@...--http://marmot.bio.indiana.edu/


Re: Question about sequence and blast interface

by Jason Stajich


You can also try my fgblast script, but it requires a configuration
file.  It uses BioPerl's HTMLResultWriter module
(Bio::SearchIO::Writer::HTMLResultWriter) to rewrite the hit table
with URLs and draws a summary graphic at the top.  It may be useful as an
example.
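
A rough sketch of how such link rewriting can work, assuming the writer meant here is Bio::SearchIO::Writer::HTMLResultWriter, whose link callbacks receive the writer, the hit, and the result and return an HTML fragment. This is not taken from fgblast.cgi. The GBrowse base URL and the MockHit stand-in are hypothetical and exist only so the callback can be run without BioPerl installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed, site-specific GBrowse base URL (hypothetical).
my $GBROWSE = 'http://example.org/cgi-bin/gbrowse/mygenome/';

# Link callback: turns a hit into an HTML link to the matching
# region in GBrowse, using the hit's name and overall extent.
sub gbrowse_hit_link {
    my ($writer, $hit, $result) = @_;
    my $region = sprintf '%s:%d..%d',
        $hit->name, $hit->start('hit'), $hit->end('hit');
    return sprintf '<a href="%s?q=%s">%s</a>', $GBROWSE, $region, $hit->name;
}

# With BioPerl installed, the callback would be registered roughly like this:
#   my $writer = Bio::SearchIO::Writer::HTMLResultWriter->new;
#   $writer->hit_link_desc(\&gbrowse_hit_link);    # links in the hit table
#   $writer->hit_link_align(\&gbrowse_hit_link);   # links in the alignments

# Stand-in for a BioPerl hit object (hypothetical, for demonstration only).
package MockHit;
sub new   { my ($class, %args) = @_; return bless {%args}, $class }
sub name  { $_[0]{name} }
sub start { $_[0]{start} }
sub end   { $_[0]{end} }

package main;
my $hit = MockHit->new(name => 'scaffold_12', start => 1500, end => 2400);
print gbrowse_hit_link(undef, $hit, undef), "\n";
```

Keeping the link logic in a callback means the rest of the HTML report is left to the writer, and only the site-specific URL scheme needs to be maintained.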

I uploaded the basic code and you can browse it here or check it out  
with git or mercurial:
http://bitbucket.org/hyphaltip/genome-web-tools/src/tip/cgi-bin/fgblast.cgi

-jason

--
Jason Stajich
jason.stajich@...
jason@...



Re: Question about sequence and blast interface

by Smithies, Russell


Great script Don, I may borrow bits of it :-)

Here's a simple one I wrote.
It uses our internal BLAST to search a query against the FASTA used in the GBrowse (I have BLAST databases set up for each GBrowse) and outputs a page with hyperlinks to the GBrowse showing the locations of the HSPs.
It works OK but could do with some refinement (e.g. if you get too many hits the URL can become too long), but it gives you a place to start.


Russell Smithies

Bioinformatics Applications Developer
T +64 3 489 9085
E  russell.smithies@...

Invermay  Research Centre
Puddle Alley,
Mosgiel,
New Zealand
T  +64 3 489 3809  
F  +64 3 489 9174  
www.agresearch.co.nz

=======================

#!/usr/bin/perl

$| = 1;

use CGI::Pretty qw(:standard);
use CGI::Carp qw(fatalsToBrowser);

use IO::String;
use List::Util qw(min max);
use Switch;

use Bio::SearchIO;



my $example_seq = <<EXAMPLE;
>example_1
CGCGCCCCAGGCCGTCCCACCTCCCTCCTCCCGCCGCGGATCCGCCAGAC
AGCGAGGCCCCCGGCCGGGGGCAGGGGGGACGCCCCCTCCGGGGCACCCC
CCGGGCTCGGAGCCGCCCGCGGGGCCGAGGAGCACCCCGAGGCCCCACAG
TCTGAGACGAGCCGCCGCCGCCGCCCCCGCCGCCGCCGCCGCCACTGCGG
GGAGGAGGGGGAGGAGGAGCGGGAGGAGGGACGAGCTGGTTGGGACAACA
GGAAAAAAAGTTTTGAGACTTTTCCGCTGCCTCTGGGAGCCGAAAGCGCG
GGGACCGCAACGCGCAGCGCTGCTCCGCGAGGCAGGAACTTGAGGACCCC
AGACAGCAGCAGCCCCCGCCACCGCCTCGGACGCTTGCTCCCTCCCTGCC
GCCTACACGGCATTCCCCAGGCGCCCCCATTCCGGACCAGCCATCAGGAA
CCGCAAACCCGATTCCCGCGAAGACTTGACCCCAGACTTCGGGCGCACCC
GCCCT
>example_2
TAAAAGTGGAGCAGCACGTGGAGCTGTACCAGAAATATAGCAACAATTCC
TGGCGCTACCTCAGCAACCGGCTGCTCGCCCCCAGCGACTCACCGGAGTG
GCTGTCCTTTGACGTCACTGGAGTTGTGCGGCAGTGGCTGACCCACAGAG
AGGAAATAGAGGGCTTTCGCCTCAGTGCCCACTGTTCCTGTGACAGTAAG
GATAACACGCTTCAAGTGGACATCAACGGGTTCAGTTCCGGCCGCCGGGG
TGACCTCGCCACCATTCACGGCATGAACCGGCCCTTCCTGCTCCTCATGG
CAGAAAAGAACTGCTGTGT
TCGTCAGCTCTACATTGACTTCCGGAAGGACCTGGGCTGGAAGTGGATTC
ACGAACCCAAGGGGTACCACGCCAATTTCTGCCTGGGGCCCTGCCCTTAC
ATCTGGAGCCTGGACACGCAGTACAGCAAGGTCCTGGCCCTGTACAACCA
GCACAACCCGGGCGCATCGGCGGCGCCGTGCTGCGTGCCTCAGGCGCTGG
AACCCCTGCCCATCGTGTACTACGTGGGCCGCAAGCCCAAGGTGGAGCAG
TTGTCCAACATGATCGTGCGCTCCTGCAAGTGCAGCTGAGGCCCCGCCCC
ACCCCAACAGCCCCCGCCCCATAGGC
EXAMPLE




print header;
print
        start_html(-title=> 'GBlast',-BGCOLOR=>'lightgoldenrodyellow'),
        h1('GBlast'),
        start_form(-name=>'gb_form', -action=>"#output"),
        "Sequence:",br,textarea(-name=>'seq', -rows=>10, -columns=>55, -value=>$example_seq),
        p,
        "Which Gbrowse? ",
        popup_menu(-name=>'gb', -values=>[qw/cow bta3 bta4 oar/], -default=> 'bta4',),
        p,
        "Additional BLAST parameters? ",
        textfield(-name=>'blast_params' ,-default=>'-e 1e-10', -size => 40),
        p,
        submit,
        defaults("Reset"),
        end_form,
        p,
        div({-align=>'center'},'Email ',a({href=>"mailto:russell.smithies\@agresearch.co.nz"},"Russell Smithies"),' if you have any questions.'),
        a({-name=>"output"}),
        hr;


if (param()) {

        my %gbrowse = (
                cow => {
                        db => '/data/databases/flatfile/illuminati_blastdata/btChr',
                        gb => 'http://gbrowse.agresearch.co.nz/cgi-bin/gbrowse/cow/?',
                },
                bta3 => {
                        db => '/data/databases/flatfile/illuminati_blastdata/btChr_Btau3',
                        gb => 'http://gbrowse.agresearch.co.nz/cgi-bin/gbrowse/bta3/?',
                },
                bta4 =>{
                        db => '/data/databases/flatfile/illuminati_blastdata/btau4Chr_annotated',
                        gb => 'http://gbrowse.agresearch.co.nz/cgi-bin/gbrowse/bta4/?',
                },
                oar =>{
                        db => '/data/databases/flatfile/illuminati_blastdata/OAR_chromosomes_ver.1.0',
                        gb => 'http://gbrowse.agresearch.co.nz/cgi-bin/gbrowse/oar/?',
                }
        );


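        # only accept a Gbrowse name that appears in the table above
        my $gb = param('gb') || '';
        die "Unknown Gbrowse: $gb\n" unless exists $gbrowse{$gb};

        # the extra parameters are interpolated into a shell command, so
        # refuse anything other than simple option characters
        my $blast_params = param('blast_params');
        $blast_params = '' unless defined $blast_params;
        die "Invalid BLAST parameters\n" unless $blast_params =~ /^[\w\s.+-]*$/;

        # write the query to an unpredictable temp file; tempfile() dies on failure
        require File::Temp;
        my ($temp_fh, $temp_file) = File::Temp::tempfile('gblastXXXXXX', DIR => '/tmp', SUFFIX => '.tmp');
        print $temp_fh param('seq');
        close $temp_fh or die "Cannot write $temp_file: $!";

        my $db = $gbrowse{$gb}{db};
        my $result = join "", `/usr/local/blast/blastall -p blastn -i $temp_file -d $db $blast_params 2>&1`;

        #cleanup
        unlink $temp_file;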

        my $io = IO::String->new($result);

        my $in = Bio::SearchIO->new(-format => "blast", -fh => $io) or die "Cannot open BLAST report for parsing";


        my %coords;
                           
        while( my $blast_result = $in->next_result ) {
                %coords = ();
                while( my $hit = $blast_result->next_hit ) {
                        my $hit_name = $hit->name;

                        # need to do some name translations for bta3 and cow
                        # as the chromosome names in the blast databases
                        # differ from those used in Gbrowse
                        my $gb_name = param('gb');
                        if ($gb_name eq 'cow') {
                                $hit_name =~ s/CHR/BTA/;
                        }
                        elsif ($gb_name eq 'bta3' and $hit_name =~ /CM000(\d+)/) {
                                # only rename when the accession pattern matched
                                $hit_name = "BTA" . ($1 - 176);
                        }

                        $coords{$hit_name}{query} = $blast_result->query_name;
                        my $c = $coords{$hit_name};

                        while( my $hsp = $hit->next_hsp ) {
                                my @range = $hsp->range('hit');
                                $c->{list} .= join("..", @range) . ",";

                                # track the overall extent of the HSPs on this hit
                                $c->{min} = defined $c->{min} ? min($c->{min}, @range) : min(@range);
                                $c->{max} = defined $c->{max} ? max($c->{max}, @range) : max(@range);
                        }
                }

                # print the links
                my $format = ";style=%22GBlast%20Hit%22+glyph=segments+stranded=1+bgcolor=red;abs=1;";

                foreach my $k (sort keys %coords){
                        my $region = "$k:$coords{$k}{min}..$coords{$k}{max}";
                        my $label  = "$coords{$k}{query} at $region";
                        # spaces must be encoded inside the URL
                        (my $url_label = $label) =~ s/ /%20/g;
                        # drop the trailing comma left by the HSP loop
                        (my $list = $coords{$k}{list}) =~ s/,$//;
                        my $href = $gbrowse{param('gb')}{gb} . "q=$region;"
                                . "add=$k+%22GBlast%20Hit%22+%22GBlastQuery_$url_label%22+$list" . $format;
                        print a({href=>$href, target=>"_blank"}, $label), p;
                }

        }

        print hr, pre($result),hr;

}
print end_html;
==============================================
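
One possible refinement for the long-URL problem mentioned above is to cap the number of HSP segments placed in the link. The sketch below is only an illustration under that assumption: the helper cap_segments and the length budget are hypothetical, and a sensible limit depends on the web server and browsers in use.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: keep as many "start..end" segments as fit within
# $max_len characters once joined with commas, and report how many
# segments had to be left out.
sub cap_segments {
    my ($max_len, @segments) = @_;
    my @kept;
    my $len = 0;
    for my $seg (@segments) {
        my $extra = length($seg) + (@kept ? 1 : 0);   # +1 for the comma
        last if $len + $extra > $max_len;
        push @kept, $seg;
        $len += $extra;
    }
    return (join(',', @kept), @segments - @kept);
}

my ($list, $dropped) = cap_segments(20, qw(100..200 300..400 500..600));
print "$list ($dropped dropped)\n";
```

The count of dropped segments can be shown next to the link so users know the GBrowse view is not displaying every HSP.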






=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================
