blast parsing question

Hi,

I am new to BioJava. I want to test what is going on here in order to potentially integrate it with KNIME.

My first project is parsing BLAST output for large files. The example in the cookbook is very good, and I had no problems integrating everything in Eclipse and getting it to work.

Now here is my problem: I am interested in parsing the summary table at the beginning of the BLAST output, and I haven't found a way to get at this information.

I am blasting short sequences (20 nt - 300 nt) against genomic databases (mouse/human/refseq/miRBase). I want to know if a given sequence (out of a set of sequences) aligns to a specific genome with high identity. I then want to separate the input source FASTA file into a set that aligns to the genome and one that doesn't (potentially with another list of dubious sequences where there is no clear answer). For this I only need the length of the query sequence, the score, and the first few characters of the header line.

At least that's the way I am currently doing it. I have set the BLAST parameters to give me only the first alignment, but the first 50 or so entries in the summary.

Any help or comments are appreciated.

Thanks,

Bernd

Bernd Jagla
Bioinformatician
Institut Pasteur
Plate-forme puces a ADN
Genopole / Institut Pasteur
28 rue du Docteur Roux
75724 Paris Cedex 15
France
bernd.jagla@...
tel: +33 (0) 140 61 35 13

_______________________________________________
Biojava-l mailing list - Biojava-l@...
http://lists.open-bio.org/mailman/listinfo/biojava-l
Re: blast parsing question

Hi Bernd,

I'm not sure whether you got a reply to your mail off-list. Did you manage to solve your problem in the meantime?

Andreas

On Wed, Jul 29, 2009 at 6:16 AM, Bernd Jagla <bernd.jagla@...> wrote:
> Hi,
>
> I am new to BioJava. I want to test what is going on here in order to
> potentially integrate it with KNIME.
>
> My first project is parsing BLAST output for large files. The example in
> the cookbook is very good, and I had no problems integrating everything in
> Eclipse and getting it to work.
>
> Now here is my problem:
>
> I am interested in parsing the summary table at the beginning of the
> BLAST output, and I haven't found a way to get at this information.
>
> I am blasting short sequences (20 nt - 300 nt) against genomic databases
> (mouse/human/refseq/miRBase). I want to know if a given sequence (out of a
> set of sequences) aligns to a specific genome with high identity. I then
> want to separate the input source FASTA file into a set that aligns to the
> genome and one that doesn't (potentially with another list of dubious
> sequences where there is no clear answer). For this I only need the length
> of the query sequence, the score, and the first few characters of the
> header line.
>
> At least that's the way I am currently doing it. I have set the BLAST
> parameters to give me only the first alignment, but the first 50 or so
> entries in the summary.
>
> Any help or comments are appreciated.
>
> Thanks,
>
> Bernd
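
For readers with the same question, here is a minimal sketch of one way to read the summary table Bernd describes. It deliberately does not use BioJava and is not the cookbook's method; it is plain Java string handling. It assumes classic NCBI text output, where each query starts with a `Query=` line, the query length appears as `(N letters)` or `Length=N`, the table follows a line beginning with `Sequences producing significant alignments`, and every table row ends with a bit score and an E-value. The names `BlastSummarySketch`, `Hit`, `parse`, and `queriesWithHitAtLeast` are hypothetical, chosen only for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: reads only the BLAST summary table, not the alignments. */
public class BlastSummarySketch {

    /** One row of the summary table, tagged with the query it belongs to. */
    public static final class Hit {
        public final String queryId;
        public final int queryLength;      // -1 if no length line was seen
        public final String subjectHeader;
        public final double bitScore;
        public final String eValue;        // kept as text; older BLAST prints forms like "e-105"

        Hit(String queryId, int queryLength, String subjectHeader, double bitScore, String eValue) {
            this.queryId = queryId;
            this.queryLength = queryLength;
            this.subjectHeader = subjectHeader;
            this.bitScore = bitScore;
            this.eValue = eValue;
        }
    }

    // Assumed layouts of the query length line (older and newer text output).
    private static final Pattern OLD_LENGTH = Pattern.compile("^\\s*\\(([\\d,]+) letters\\)");
    private static final Pattern NEW_LENGTH = Pattern.compile("^Length=(\\d+)");
    // A table row: description, then bit score, then E-value as the last two tokens.
    private static final Pattern ROW =
            Pattern.compile("^(\\S.*?)\\s+(\\d+(?:\\.\\d+)?(?:[eE][+-]?\\d+)?)\\s+(\\S+)\\s*$");

    public static List<Hit> parse(Iterable<String> lines) {
        List<Hit> hits = new ArrayList<>();
        String queryId = null;
        int queryLength = -1;
        boolean inTable = false;
        boolean sawRow = false;
        for (String line : lines) {
            if (line.startsWith("Query=")) {
                // The first token after "Query=" is taken as the query identifier.
                queryId = line.substring(6).trim().split("\\s+")[0];
                queryLength = -1;
                inTable = false;
            } else if (line.startsWith("Sequences producing significant alignments")) {
                inTable = true;
                sawRow = false;
            } else if (inTable) {
                if (line.trim().isEmpty()) {
                    if (sawRow) inTable = false;   // blank line after the rows ends the table
                } else if (line.startsWith(">")) {
                    inTable = false;               // alignment section has started
                } else {
                    Matcher row = ROW.matcher(line);
                    if (row.matches()) {
                        sawRow = true;
                        hits.add(new Hit(queryId, queryLength, row.group(1),
                                Double.parseDouble(row.group(2)), row.group(3)));
                    }
                }
            } else if (queryId != null && queryLength < 0) {
                // Only the first length line after "Query=" belongs to the query.
                Matcher m = OLD_LENGTH.matcher(line);
                if (!m.find()) m = NEW_LENGTH.matcher(line);
                if (m.lookingAt()) queryLength = Integer.parseInt(m.group(1).replace(",", ""));
            }
        }
        return hits;
    }

    /** Query IDs with at least one summary row scoring minBits or more. */
    public static Set<String> queriesWithHitAtLeast(List<Hit> hits, double minBits) {
        Set<String> ids = new LinkedHashSet<>();
        for (Hit h : hits) {
            if (h.bitScore >= minBits) ids.add(h.queryId);
        }
        return ids;
    }
}
```

For a large file, the lines can be streamed instead of loaded at once, for example with `Files.lines(path)` inside a try-with-resources block and `parse(stream::iterator)`. The set returned by `queriesWithHitAtLeast` could then be used to split the input FASTA file. Note that the summary table reports bit score and E-value, not percent identity, so the threshold for "aligns" versus "dubious" would have to be chosen by the user.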