issue with reference name

Hi,

I have a dataset that I am trying to set up GBrowse for, but the problem is that the contig names have periods and braces, e.g. "SC020.(contig_18.1)".

I think it's the braces that are causing trouble, because they are not allowed in col 1 according to the GFF3 spec (http://gmod.org/wiki/GFF3):

Column 1: "seqid"
The ID of the landmark used to establish the coordinate system for the current feature. IDs may contain any characters, but must escape any characters not in the set [a-zA-Z0-9.:^*$@!+_?-|]. In particular, IDs may not contain unescaped whitespace and must not begin with an unescaped ">".

I escaped the braces with URL encoding, e.g. "SC020.%28contig_18.1%29". Do the braces in the ID and the Name attributes also need to be URL encoded? And how about the sequence header in the FASTA section?

Thanks,
Prachi

Here are some sample GFF lines:

SC020.%28contig_18.1%29 A_oryzae_RIB40_INSERTASSEMBLYFROMGENBANK_1 contig 1 1824958 . . . ID=SC020.(contig_18.1);Name=SC020.(contig_18.1)
SC020.%28contig_18.1%29 A_oryzae_RIB40_INSERTASSEMBLYFROMGENBANK_1 mRNA 978712 981210 . - . ID=7000000516956357;Parent=7000000516956351
SC020.%28contig_18.1%29 A_oryzae_RIB40_INSERTASSEMBLYFROMGENBANK_1 gene 978712 981210 . - . ID=7000000516956351;Name=Unknown
SC020.%28contig_18.1%29 A_oryzae_RIB40_INSERTASSEMBLYFROMGENBANK_1 CDS 978712 981210 . - 0 ID=AO090020000391;Parent=7000000516956357
SC020.%28contig_18.1%29 A_oryzae_RIB40_INSERTASSEMBLYFROMGENBANK_1 exon 978712 981210 . - . ID=7000000516956361;Parent=7000000516956357
##FASTA
>SC020.(contig_18.1)
aattttttaatttattaaattagatattttaaatatatttttataatatttaaatattat
aaactattataatctattattattataataataatattatttttaatatagtatttttat
atttgaattatttttttaattataaataattttcttttatattaaataattttcttttat
............

_______________________________________________
Gmod-gbrowse mailing list
Gmod-gbrowse@...
https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse
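
As a rough illustration of the escaping rule quoted from the spec, the Python sketch below percent-encodes every character of a seqid that falls outside the allowed set. The names `escape_seqid` and `unsafe_chars` are made up for this example, and the hyphen in the quoted character class is treated as a literal hyphen, which is an assumption about what the spec intends.

```python
import re

# Characters outside the set quoted from the GFF3 spec for column 1.
# Assumption: the "-" in the quoted set is a literal hyphen, so it is
# placed last in the class to keep the regex from reading it as a range.
SEQID_UNSAFE = re.compile(r"[^a-zA-Z0-9.:^*$@!+_?|-]")


def unsafe_chars(name: str) -> list:
    """Return the distinct characters in `name` that would need escaping."""
    return sorted(set(SEQID_UNSAFE.findall(name)))


def escape_seqid(name: str) -> str:
    """Percent-encode (URL-style) every character outside the allowed set."""
    def encode(match):
        # Encode each UTF-8 byte of the character as %XX.
        return "".join(f"%{byte:02X}" for byte in match.group().encode("utf-8"))
    return SEQID_UNSAFE.sub(encode, name)


if __name__ == "__main__":
    contig = "SC020.(contig_18.1)"
    print(unsafe_chars(contig))
    print(escape_seqid(contig))
```

For the contig name in this post, the only characters flagged are the two parentheses, and the escaped form matches the "SC020.%28contig_18.1%29" used in column 1 of the sample lines.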

Re: issue with reference name

Have you considered renaming the contigs with a simpler nomenclature? A Perl script could do that fairly quickly.

Maureen

On Wed, Nov 4, 2009 at 2:06 PM, Prachi Shah <prachi@...> wrote:
> Hi,
>
> I have a dataset that I am trying to setup GBrowse for but the problem
> is that the contig name have periods and braces, eg.
> "SC020.(contig_18.1)".
> [...]

--
Maureen J. Donlin, Ph.D.
Research Associate Professor
Dept. of Biochemistry & Molecular Biology
Dept. of Molecular Microbiology & Immunology
Saint Louis University School of Medicine
507 Doisy Research Center
314-977-8858
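
The renaming idea could be sketched roughly as follows, in Python rather than Perl. This is only an illustration: the `simplify` rule (collapse anything other than letters, digits, underscore, and period into a single underscore) and the `build_rename_map` helper are hypothetical choices, not something proposed in the thread.

```python
import re


def simplify(name: str) -> str:
    """Hypothetical rule: replace each run of characters outside
    [A-Za-z0-9_.] with one underscore, then trim underscores at the ends."""
    return re.sub(r"[^A-Za-z0-9_.]+", "_", name).strip("_")


def build_rename_map(names):
    """Map each original contig name to a simplified one.

    Raises ValueError if two different originals collapse to the same
    new name, since that would merge distinct contigs.
    """
    mapping = {}
    taken = {}
    for old in names:
        new = simplify(old)
        if new in taken and taken[new] != old:
            raise ValueError(f"{old!r} and {taken[new]!r} both map to {new!r}")
        taken[new] = old
        mapping[old] = new
    return mapping


if __name__ == "__main__":
    for old, new in build_rename_map(["SC020.(contig_18.1)"]).items():
        # Tab-separated table so the original names can be looked up later.
        print(f"{old}\t{new}")
```

Writing the old-to-new table to a file would let you apply the same names to column 1 and to the FASTA headers, and translate results back to the provider's original names afterwards.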

Re: issue with reference name

Hi Maureen,

Yes, I have considered that. But I am on the receiving end of this data and would prefer not to modify it if I can avoid it.

Thanks,
Prachi

On Wed, Nov 4, 2009 at 12:17 PM, Maureen Donlin <donlinmj@...> wrote:
> Have you considered renaming the contigs with a simpler nomenclature?
> A perl script could do that fairly quickly.
>
> Maureen

Re: issue with reference name

Hi Prachi,

I can't say for sure at the moment, but I bet the parentheses are going to cause a problem. First try URI escaping in column 9, but failing that, you'll probably need to remove the parens. You might also want to ask the organization creating the data not to use special characters.

Scott

On Nov 4, 2009, at 3:06 PM, Prachi Shah wrote:
> [...]
> I escaped the braces with URL encoding, eg. "SC020.%28contig_18.1%29".
> Do the braces in the ID and the Name attributes also need to be URL
> encoded? And how about the sequence header in the FASTA section?
> [...]

-----------------------------------------------------------------------
Scott Cain, Ph. D.                          scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)         216-392-3087
Ontario Institute for Cancer Research
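
For anyone wanting to try the column 9 escaping suggested above, here is a minimal Python sketch. It is an assumption-laden illustration, not a confirmed fix: the thread does not establish whether GBrowse needs this, the helper name `escape_attributes` is invented, the columns are assumed to be tab-separated, and the values are assumed not to be percent-encoded already (an existing "%" would be encoded again as "%25").

```python
from urllib.parse import quote

# Characters left unescaped, taken from the set quoted earlier in the thread.
# urllib.parse.quote additionally never escapes letters, digits, "_", ".",
# "-" and "~".
SAFE = ".:^*$@!+_?-|"


def escape_attributes(gff_line: str, keys=("ID", "Name")) -> str:
    """Percent-encode the values of the chosen column 9 attributes."""
    line = gff_line.rstrip("\n")
    cols = line.split("\t")
    if len(cols) != 9:
        return line  # directive, comment, or FASTA line: leave untouched
    pairs = []
    for pair in cols[8].split(";"):
        key, sep, value = pair.partition("=")
        if sep and key in keys:
            value = quote(value, safe=SAFE)
        pairs.append(key + sep + value)
    cols[8] = ";".join(pairs)
    return "\t".join(cols)


if __name__ == "__main__":
    sample = "\t".join([
        "SC020.%28contig_18.1%29", "A_oryzae_RIB40_INSERTASSEMBLYFROMGENBANK_1",
        "contig", "1", "1824958", ".", ".", ".",
        "ID=SC020.(contig_18.1);Name=SC020.(contig_18.1)",
    ])
    print(escape_attributes(sample))
```

Only ID and Name are rewritten by default. Parent is left alone because it may hold a comma-separated list, and escaping the commas would change its meaning.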