extracting CDS location from Genbank

by Captainrave

Help. I'm very new to Perl and BioPerl. Basically I need to extract the location of each CDS in a GenBank entry (e.g. 103..120) and export them to an output file as a list. How would I do this?

Your help would be much appreciated!

Re: extracting CDS location from Genbank

by michael watson (IAH-C)

From the SeqIO HOWTO:

#!/bin/perl

use strict;
use Bio::SeqIO;

my $file = shift; # get the file name, somehow
my $seqio_object = Bio::SeqIO->new(-file => $file);
my $seq_object = $seqio_object->next_seq;

From the Feature HOWTO:

for my $feat_object ($seq_object->get_SeqFeatures) {
   print "primary tag: ", $feat_object->primary_tag, "\n";
   for my $tag ($feat_object->get_all_tags) {
      print "  tag: ", $tag, "\n";
      for my $value ($feat_object->get_tag_values($tag)) {
         print "    value: ", $value, "\n";
      }
   }
}

Surely you could have found that yourself? ;0

Re: extracting CDS location from Genbank

by Captainrave

Yes but actually implementing it is another story.

I get an error:

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: file argument provided, but with an undefined value
STACK: Error::throw
STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm:359
STACK: Bio::SeqIO::new C:/Perl/site/lib/Bio/SeqIO.pm:359
STACK: test3.pl:7
-----------------------------------------------------------

Basically because I don't understand the code well enough. For example, how do I tell it which input file to read? I know this might sound stupid, but I don't understand the Biowiki very well!

Re: extracting CDS location from Genbank

by michael watson (IAH-C)

Post the script that produces that error, and your file's location

Re: extracting CDS location from Genbank

by Sendu Bala-2

Captainrave wrote:

> Yes but actually implementing it is another story.
>
> I get an error:
>
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: file argument provided, but with an undefined value
> STACK: Error::throw
> STACK: Bio::Root::Root::throw C:/Perl/site/lib/Bio/Root/Root.pm:359
> STACK: Bio::SeqIO::new C:/Perl/site/lib/Bio/SeqIO.pm:359
> STACK: test3.pl:7
> -----------------------------------------------------------

The best way to get help is to give us your script and the error
message, and the command you used to run your script. The less you know,
the more you should give us (i.e. don't edit anything out).

Re: extracting CDS location from Genbank

by Captainrave

#!/bin/perl

use strict;
use Bio::SeqIO;
my $file = shift; # get the file name, somehow
my $seqio_object = Bio::SeqIO->new(-file => $file);
my $seq_object = $seqio_object->next_seq;

for my $feat_object ($seq_object->get_SeqFeatures) {
   print "primary tag: ", $feat_object->primary_tag, "\n";
   for my $tag ($feat_object->get_all_tags) {
      print "  tag: ", $tag, "\n";
      for my $value ($feat_object->get_tag_values($tag)) {
         print "    value: ", $value, "\n";
      }
   }
}

exit;

The file is in the same folder. But how do I tell it to use this file?


Re: extracting CDS location from Genbank

by michael watson (IAH-C)

Same script as yours, but try:

my $file = 'C:\path\to\my\filename.gbk';

Re: extracting CDS location from Genbank

by Chris Fields

The 'my $file = shift;' is a Perl idiom. The built-in 'shift' used
implicitly in this way takes its value from @ARGV (the command-line
arguments); the file would then be passed as the first argument when
running the script:

get_features.pl myfile.gb

This should work for any OS.  Personally, I use something like the  
following to indicate how the script is used in case a file is never  
entered:

my $USAGE = <<END_USE;
USAGE: get_features.pl <file>
Perl script to grab features from a GenBank file and print to a table
END_USE

my $file = shift || die $USAGE;
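
One caveat with 'shift || die': it tests for a true value rather than a defined one, so a file literally named 0 would trigger the usage message. Below is a minimal sketch of a stricter check. The helper name first_arg_or_usage is made up for illustration and is not part of Perl or BioPerl.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Perl or BioPerl): return the first
# argument, or die with the usage text when none was supplied.
sub first_arg_or_usage {
    my ($usage, @args) = @_;
    my $file = shift @args;    # explicit form of the implicit 'shift'
    die $usage unless defined $file && length $file;
    return $file;
}

# In a script, the call would look like this:
#   my $file = first_arg_or_usage($USAGE, @ARGV);
```

Running the script with no arguments then prints the usage text and exits with a non-zero status, while any non-empty file name is accepted.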

chris

Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign




Re: extracting CDS location from Genbank

by Sendu Bala-2

Captainrave wrote:
> #!/bin/perl
> my $file = shift; # get the file name, somehow
>
> The file is on the same folder.  But how do I tell it to use this file?

http://stein.cshl.org/genome_informatics/perl_intro/command_line.html

Basically, when you run your script add the name of the file to your
command line.

me% perl myscript.pl myfile

By saying 'my $file = shift' inside myscript.pl, the variable $file now
contains the filename 'myfile'.

You could also have hardcoded the filename:
my $file = 'myfile';


Anyway, you're going to run into lots of these issues, and they're
beyond the scope of this mailing list. For basic perl problems seek help
via www.perl.org. When you have a BioPerl-specific question, don't
hesitate to post here.

Re: extracting CDS location from Genbank

by Captainrave

Thanks, it works great now.

Do any of you know if there is a tag to pull out the CDS location, i.e. values such as 132..145? Those are all I need. Also, is there any way to stop it reporting tag and value and literally JUST output the value?

Thanks!!!
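
For reference: a feature's coordinates are not stored as a tag. They come from the feature's location object, so the tag loop is not needed at all. Here is a minimal sketch, assuming BioPerl is installed and the input is in GenBank format. The sub name cds_locations and the default output name cds_locations.txt are made up for illustration.

```perl
use strict;
use warnings;
use Bio::SeqIO;

# Return one feature-table location string (e.g. "103..120") for each
# CDS feature of a sequence object; all other feature types are skipped.
sub cds_locations {
    my ($seq) = @_;
    return map  { $_->location->to_FTstring }
           grep { $_->primary_tag eq 'CDS' }
           $seq->get_SeqFeatures;
}

# Runs only when an input file name is given on the command line.
if (defined(my $file = shift @ARGV)) {
    # Second argument is optional; the default name is an assumption.
    my $out_file = shift(@ARGV) || 'cds_locations.txt';

    my $in = Bio::SeqIO->new(-file => $file, -format => 'genbank');
    open my $out, '>', $out_file or die "Cannot write $out_file: $!";

    # A GenBank file may hold several records, so loop over all of them.
    while (my $seq = $in->next_seq) {
        print {$out} "$_\n" for cds_locations($seq);
    }
    close $out or die "Cannot close $out_file: $!";
}
```

Saved under a name of your choice, it would be run as 'perl yourscript.pl myfile.gb'. to_FTstring keeps GenBank notation, so a reverse-strand or spliced CDS comes out as complement(...) or join(...). If only plain numbers are wanted, $feat->start and $feat->end give the outer coordinates.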