[bioperl newbie] Retrieving link to protein from PubChem

[bioperl newbie] Retrieving link to protein from PubChem

by saikari keitele

Hi,

I'm using BioPerl to retrieve records from PubChem.
I have been trying, so far without success, to retrieve from a compound
record the references to the protein(s) that can synthesize the compound.
Thanks very much.

saikari
_______________________________________________
Bioperl-l mailing list
Bioperl-l@...
http://lists.open-bio.org/mailman/listinfo/bioperl-l

Re: [bioperl newbie] Retrieving link to protein from PubChem

by Chris Fields-5

On Nov 9, 2009, at 10:05 AM, saikari keitele wrote:

> Hi,
>
> I'm using BioPerl to retrieve records from PubChem.
> I have been trying, so far without success, to retrieve from a compound
> record the references to the protein(s) that can synthesize the compound.
> Thanks very much.
>
> saikari

The BioPerl script below returns the GI numbers of the proteins linked to
the substance passed on the command line; invoke it with 'perl
pc_substance.pl substance_requested'.  It probably needs more fiddling
to catch everything, but it should get you started.

For other bits and pieces (such as how to retrieve the raw sequence  
files), please see the EUtilities HOWTO:

http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook

chris

----------------------------------------

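#!/usr/bin/perl

use 5.010;
use strict;
use warnings;
use Bio::DB::EUtilities;

# Substance name (or any Entrez query term) taken from the command line.
my $substance = shift @ARGV;
die "Usage: perl pc_substance.pl substance_requested\n"
    unless defined $substance;

# Step 1: search pcsubstance and keep the hits on the NCBI history server.
my $eutil = Bio::DB::EUtilities->new(-eutil      => 'esearch',
                                     -db         => 'pcsubstance',
                                     -term       => $substance,
                                     -usehistory => 'y');

my $hist = $eutil->next_History
    || die "No history data returned for '$substance'\n";

# Step 2: follow the links from the stored substances to the protein database.
$eutil->reset_parameters(-eutil   => 'elink',
                         -history => $hist,
                         -db      => 'protein',
                         -dbfrom  => 'pcsubstance',
                         -retmax  => 1000);

# Print the linked protein GIs as a comma-separated list.
say join(',', $eutil->get_ids);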
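
As a possible follow-up to the script above, here is a minimal sketch of fetching the sequences for the GIs it prints (the "raw sequence files" mentioned earlier). The helper names batch_ids and fetch_fasta, the batch size of 200, the output file name proteins.fasta, and the e-mail placeholder are illustrative assumptions, not part of the original script; the efetch parameters are the ones I understand the EUtilities HOWTO to use, so check them against the HOWTO before relying on this.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a list of IDs into batches of at most $size IDs each
# (hypothetical helper, so that no single request carries a huge ID list).
sub batch_ids {
    my ($size, @ids) = @_;
    die "batch size must be positive\n" if $size < 1;
    my @batches;
    push @batches, [ splice @ids, 0, $size ] while @ids;
    return @batches;
}

# Fetch FASTA records for the given protein GIs and write them to $file.
# Bio::DB::EUtilities is loaded lazily, so batch_ids works without BioPerl.
sub fetch_fasta {
    my ($file, @gis) = @_;
    require Bio::DB::EUtilities;
    open my $out, '>', $file or die "Cannot write $file: $!\n";
    for my $batch (batch_ids(200, @gis)) {
        my $factory = Bio::DB::EUtilities->new(
            -eutil   => 'efetch',
            -db      => 'protein',
            -rettype => 'fasta',
            -email   => 'you@example.org',   # placeholder, use your own address
            -id      => $batch,
        );
        print {$out} $factory->get_Response->content;
    }
    close $out or die "Cannot close $file: $!\n";
    return;
}

# Usage: perl fetch_gis.pl GI1,GI2,GI3
# (the comma-separated list is the output format of pc_substance.pl)
if (@ARGV) {
    fetch_fasta('proteins.fasta', split /,/, $ARGV[0]);
}
```

Run without arguments the script does nothing, so the network is only contacted when a GI list is actually supplied.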

Re: [bioperl newbie] Retrieving link to protein from PubChem

by saikari keitele

Fabulous! Huge help.
saikari

On Mon, Nov 9, 2009 at 4:27 PM, Chris Fields <cjfields@...> wrote:

> The BioPerl script below returns the GI numbers of the proteins linked to
> the substance passed on the command line [...]

Re: [bioperl newbie] Retrieving link to protein from PubChem

by saikari keitele

Thanks again very much for your help and the script.
I've been trying it, but I have not found any protein record linked to a
record in the pcsubstance database.
Do you think that is because no links have been defined between the two
databases, or am I just unlucky and no link exists for the particular
records I'm testing?
Thanks again

saikari

On Mon, Nov 9, 2009 at 4:41 PM, saikari keitele <saikari78@...> wrote:

> Fabulous! Huge help.
> saikari

Re: [bioperl newbie] Retrieving link to protein from PubChem

by Chris Fields-5

On Nov 10, 2009, at 5:41 AM, saikari keitele wrote:

> Thanks again very much for your help and the script.
> I've been trying it, but I have not found any protein record linked to a
> record in the pcsubstance database.
> Do you think that is because no links have been defined between the two
> databases, or am I just unlucky and no link exists for the particular
> records I'm testing?
> Thanks again
>
> saikari

It's probably that no links have been defined.  I have run into similar
problems with PubChem in the past: not all substances have proteins
associated with them.  Most of the proteins that are linked are those with
a deposited structure.

There are a few other databases to check out; KEGG and the BioCyc databases
(such as EcoCyc) come to mind.  Unfortunately, I don't think we have a
generic remote query engine set up for any of those (unless there is one
I'm unaware of), but I know BioCyc comes with its own set of tools
(including Perl- and Java-based query tools) and can be set up locally,
which is likely much faster and more in line with what you need.

chris
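
One way to check whether any link type is defined at all between two Entrez databases is to ask the einfo utility for the link list of the source database. The following is a minimal sketch under stated assumptions: it talks to the NCBI E-utilities endpoint directly over HTTPS with HTTP::Tiny (which needs IO::Socket::SSL installed for https) instead of going through BioPerl, the name parse_links and the script name are hypothetical, and the regex parsing assumes the flat Link/Name/DbTo layout of the einfo XML. It only lists database-level link types; whether a particular record has links is still what elink reports.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Tiny;

# Extract [link name, target database] pairs from an einfo XML document.
# A regex is enough for this flat structure; a real XML parser is more robust.
sub parse_links {
    my ($xml) = @_;
    my @links;
    while ($xml =~ m{<Link>(.*?)</Link>}gs) {
        my $block = $1;
        my ($name) = $block =~ m{<Name>(.*?)</Name>}s;
        my ($dbto) = $block =~ m{<DbTo>(.*?)</DbTo>}s;
        push @links, [ $name, $dbto ] if defined $name && defined $dbto;
    }
    return @links;
}

# Usage: perl list_links.pl pcsubstance
if (@ARGV) {
    my $url = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/einfo.fcgi?db='
            . $ARGV[0];
    my $res = HTTP::Tiny->new->get($url);
    die "einfo request failed: $res->{status} $res->{reason}\n"
        unless $res->{success};
    # One line per link type: "link name -> target database".
    printf "%s -> %s\n", @$_ for parse_links($res->{content});
}
```

If the output contains no line whose target is protein, no link type exists between the two databases; if it does, the empty elink result more likely means that the specific records tested have no protein links.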
