blast parsing question

blast parsing question

by Bernd Jagla

Hi,

 

I am new to BioJava. I want to test what is going on here in order to
potentially integrate it with KNIME.

My first project is parsing BLAST output for large files. The example in the
cookbook is very good, and I had no problems integrating everything in
Eclipse and getting it to work.

 

Now here is my problem:

I am interested in parsing the summary table at the beginning of the
BLAST output, and I haven't found a way to get at this information.
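
For reference, the summary table can also be read straight from the plain-text
report without going through an alignment parser. The following is only a
minimal sketch in plain Java, not BioJava's API. It assumes the classic
text layout, where the table starts after a line beginning with "Sequences
producing significant alignments" and each row ends with a bit score and an
E-value. The class and method names (BlastSummarySketch, SummaryHit,
parseNextSummary) are made up for illustration, and some BLAST versions print
extra columns that this does not handle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class BlastSummarySketch {

    /** One row of the summary table. */
    public static final class SummaryHit {
        public final String description; // hit header as printed (BLAST may truncate it)
        public final double bitScore;
        public final String eValue;      // kept as text: some reports print forms like "e-100"

        SummaryHit(String description, double bitScore, String eValue) {
            this.description = description;
            this.bitScore = bitScore;
            this.eValue = eValue;
        }
    }

    // Assumed marker line of the summary table in the plain-text report.
    private static final String TABLE_HEADER = "Sequences producing significant alignments";

    /**
     * Consumes lines up to the end of the next summary table and returns its rows.
     * Returns an empty list if no table header is found.
     */
    public static List<SummaryHit> parseNextSummary(Iterator<String> lines) {
        List<SummaryHit> hits = new ArrayList<>();
        boolean headerSeen = false;
        while (lines.hasNext()) {
            String line = lines.next().trim();
            if (!headerSeen) {
                headerSeen = line.startsWith(TABLE_HEADER);
                continue;
            }
            if (line.isEmpty()) {
                if (hits.isEmpty()) continue; // blank line between header and first row
                break;                        // blank line after the rows ends the table
            }
            String[] tokens = line.split("\\s+");
            if (line.startsWith(">") || tokens.length < 3) break; // alignment section or unexpected line
            try {
                // Last two columns are assumed to be bit score and E-value.
                double score = Double.parseDouble(tokens[tokens.length - 2]);
                String description = String.join(" ", Arrays.copyOf(tokens, tokens.length - 2));
                hits.add(new SummaryHit(description, score, tokens[tokens.length - 1]));
            } catch (NumberFormatException e) {
                break; // not a table row after all
            }
        }
        return hits;
    }
}
```

Because the method takes an Iterator<String>, a large file can be streamed
line by line, for example by passing
Files.newBufferedReader(path).lines().iterator(), rather than loading the
whole report into memory.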

 

I am blasting short sequences (20nt - 300nt) against genomic databases
(mouse/human/refseq/miRBase). I want to know if a given sequence (out of a
set of sequences) aligns to a specific genome with high identity. I want to
then separate the input source fasta file into a set that aligns to the
genome and one that doesn't (potentially another list of dubious sequences
where there is no clear answer). For this I only need the length of the
query sequence and score and the first few characters of the header line.

At least that's the way I am currently doing it. I have set the BLAST
parameters to report only the first alignment, but the first 50 or so hits
in the summary table.
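
The split into aligned, not aligned and dubious sequences described above
could be sketched as a threshold on the top bit score relative to the query
length. This is only an illustration in plain Java: the class name, the
QueryResult record and both cut-off values are assumptions made up for the
example, and the cut-offs would have to be tuned to the scoring scheme and
databases actually used.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class HitClassifierSketch {

    public enum Category { ALIGNED, DUBIOUS, UNALIGNED }

    /** Minimal per-query record: header prefix, query length in nt, top bit score. */
    public static final class QueryResult {
        public final String headerPrefix;
        public final int queryLength;
        public final Double topBitScore; // null when the query had no hits

        public QueryResult(String headerPrefix, int queryLength, Double topBitScore) {
            this.headerPrefix = headerPrefix;
            this.queryLength = queryLength;
            this.topBitScore = topBitScore;
        }
    }

    // Placeholder cut-offs in bits per nucleotide; illustrative values only.
    static final double ALIGNED_MIN_BITS_PER_NT = 1.8;
    static final double DUBIOUS_MIN_BITS_PER_NT = 1.0;

    /** Assigns a category from the top bit score normalised by query length. */
    public static Category classify(QueryResult r) {
        if (r.topBitScore == null || r.queryLength <= 0) {
            return Category.UNALIGNED;
        }
        double bitsPerNt = r.topBitScore / r.queryLength;
        if (bitsPerNt >= ALIGNED_MIN_BITS_PER_NT) return Category.ALIGNED;
        if (bitsPerNt >= DUBIOUS_MIN_BITS_PER_NT) return Category.DUBIOUS;
        return Category.UNALIGNED;
    }

    /** Groups results by category; every category is present, possibly with an empty list. */
    public static Map<Category, List<QueryResult>> partition(List<QueryResult> results) {
        Map<Category, List<QueryResult>> out = new EnumMap<>(Category.class);
        for (Category c : Category.values()) {
            out.put(c, new ArrayList<>());
        }
        for (QueryResult r : results) {
            out.get(classify(r)).add(r);
        }
        return out;
    }
}
```

The header prefixes collected per category can then be used to write the
matching FASTA records to separate output files.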

 

Any help or comments are appreciated.

 

Thanks,

 

Bernd

 

 

 





Bernd Jagla
Bioinformatician

Institut Pasteur
Plate-forme puces a ADN
Genopole / Institut Pasteur
28 rue du Docteur Roux
75724 Paris Cedex 15
France


bernd.jagla@...
tel: +33 (0) 140 61 35 13


_______________________________________________
Biojava-l mailing list  -  Biojava-l@...
http://lists.open-bio.org/mailman/listinfo/biojava-l

Re: blast parsing question

by Andreas Prlic

Hi Bernd,

Not sure if you got a reply to your mail off-list. Did you manage to solve
your problem in the meantime?

Andreas

On Wed, Jul 29, 2009 at 6:16 AM, Bernd Jagla <bernd.jagla@...> wrote:

> I am interested in parsing the summary table in the beginning of the
> blast-output, and I haven't found a way to get at this information.