Upper case symbols

View: New views
3 Messages — Rating Filter:   Alert me  

Upper case symbols

by Andreas Dräger-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Dear all,

Thank you for this very quick answer regarding Sequence logos! The code
works fine. However, would it also be possible to tell the DNATools
class that upper case letters should be used instead of lower case "a",
"c", "g", and "t"? I think it is more common to apply upper case symbols
in sequence logos.

Cheers
Andreas

--
Dipl.-Bioinform. Andreas Dräger
Eberhard Karls University Tübingen
Center for Bioinformatics (ZBIT)
Sand 1
72076 Tübingen
Germany

Phone: +49-7071-29-70436
Fax:   +49-7071-29-5091
_______________________________________________
Biojava-l mailing list  -  Biojava-l@...
http://lists.open-bio.org/mailman/listinfo/biojava-l

Re: Upper case symbols

by Eric Haugen :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Andreas Draeger wrote:

> Dear all,
>
> Thank you for this very quick answer regarding Sequence logos! The code
> works fine. However, would it also be possible to tell the DNATools
> class that upper case letters should be used instead of lower case "a",
> "c", "g", and "t"? I think it is more common to apply upper case symbols
> in sequence logos.
>
> Cheers
> Andreas
>
Hi Andreas,

The way I did it was to uppercase the letters in the AlphabetManager.xml file,
replacing the AlphabetManager.xml inside my biojava.jar.  See the attached version.
Maybe there's an easier way I don't know about...

Best regards,

Eric

--
Eric Haugen
Software Engineer
Stam lab, UW Genome Sciences
University of Washington
2211 Elliott Avenue, 6th Floor
Seattle, WA 98121

<?xml version="1.0"?>

<!-- Core alphabet definitions for BioJava
   -
   - @author Matthew Pocock
   - @author Thomas Down
  -->

<!DOCTYPE alphabetManager [
  <!ELEMENT alphabetManager (symbol|alphabet)*>
  <!ELEMENT symbol (description?)>
  <!ATTLIST symbol
    name CDATA #REQUIRED>

  <!ELEMENT description (#PCDATA)>

  <!ELEMENT alphabet (description?, (symbol|symbolref)*, characterTokenization*)>

  <!ATTLIST alphabet
    name ID #IMPLIED
    parent IDREF #IMPLIED>

  <!ELEMENT symbolref EMPTY>
  <!ATTLIST symbolref
    name CDATA #REQUIRED>
   
  <!ELEMENT characterTokenization ((atomicMapping|ambiguityMapping|gapSymbolMapping)*)>
  <!ATTLIST characterTokenization
   name CDATA #REQUIRED
   caseSensitive CDATA #REQUIRED>
   
  <!ELEMENT atomicMapping (symbolref)>
  <!ATTLIST atomicMapping
   token CDATA #REQUIRED>
  <!ELEMENT ambiguityMapping (symbolref*)>
  <!ATTLIST ambiguityMapping
   token CDATA #REQUIRED>
  <!ELEMENT gapSymbolMapping EMPTY>
  <!ATTLIST gapSymbolMapping
   token CDATA #REQUIRED>

]>

<alphabetManager>
  <symbol name="urn:lsid:biojava.org:symbol:adenine">
    <description>pairs with T in DNA, and U in RNA</description>
  </symbol>

  <symbol name="urn:lsid:biojava.org:symbol:guanine">
    <description>pairs with C</description>
  </symbol>

  <symbol name="urn:lsid:biojava.org:symbol:cytosine">
    <description>pairs with G</description>
  </symbol>

  <symbol name="urn:lsid:biojava.org:symbol:thymine">
    <description>pairs with A</description>
  </symbol>

  <symbol name="urn:lsid:biojava.org:symbol:uracil">
    <description>pairs with A, replacing T in RNA</description>
  </symbol>

  <alphabet name="DNA">
    <description>Deoxy-Ribose Nucleic Acid. The stuff of life.</description>
    <symbolref name="urn:lsid:biojava.org:symbol:adenine"/>
    <symbolref name="urn:lsid:biojava.org:symbol:guanine"/>
    <symbolref name="urn:lsid:biojava.org:symbol:cytosine"/>
    <symbolref name="urn:lsid:biojava.org:symbol:thymine"/>
   
    <characterTokenization name="token" caseSensitive="false">
      <atomicMapping token="A">
        <symbolref name="urn:lsid:biojava.org:symbol:adenine"/>
      </atomicMapping>

      <atomicMapping token="G">
        <symbolref name="urn:lsid:biojava.org:symbol:guanine"/>
      </atomicMapping>

      <atomicMapping token="C">
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine"/>
      </atomicMapping>

      <atomicMapping token="T">
        <symbolref name="urn:lsid:biojava.org:symbol:thymine"/>
      </atomicMapping>

      <ambiguityMapping token="R">
        <symbolref name="urn:lsid:biojava.org:symbol:adenine" />
        <symbolref name="urn:lsid:biojava.org:symbol:guanine" />
      </ambiguityMapping>

      <ambiguityMapping token="Y">
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine" />
        <symbolref name="urn:lsid:biojava.org:symbol:thymine" />
      </ambiguityMapping>

      <ambiguityMapping token="M">
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine" />
        <symbolref name="urn:lsid:biojava.org:symbol:adenine" />
      </ambiguityMapping>

      <ambiguityMapping token="K">
        <symbolref name="urn:lsid:biojava.org:symbol:guanine" />
        <symbolref name="urn:lsid:biojava.org:symbol:thymine" />
      </ambiguityMapping>

      <ambiguityMapping token="S">
        <symbolref name="urn:lsid:biojava.org:symbol:guanine" />
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine" />
      </ambiguityMapping>

      <ambiguityMapping token="W">
        <symbolref name="urn:lsid:biojava.org:symbol:adenine" />
        <symbolref name="urn:lsid:biojava.org:symbol:thymine" />
      </ambiguityMapping>

      <ambiguityMapping token="H">
        <symbolref name="urn:lsid:biojava.org:symbol:adenine" />
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine" />
        <symbolref name="urn:lsid:biojava.org:symbol:thymine" />
      </ambiguityMapping>

      <ambiguityMapping token="B">
        <symbolref name="urn:lsid:biojava.org:symbol:guanine" />
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine" />
        <symbolref name="urn:lsid:biojava.org:symbol:thymine" />
      </ambiguityMapping>

      <ambiguityMapping token="V">
        <symbolref name="urn:lsid:biojava.org:symbol:guanine" />
        <symbolref name="urn:lsid:biojava.org:symbol:adenine" />
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine" />
      </ambiguityMapping>

      <ambiguityMapping token="D">
        <symbolref name="urn:lsid:biojava.org:symbol:guanine" />
        <symbolref name="urn:lsid:biojava.org:symbol:adenine" />
        <symbolref name="urn:lsid:biojava.org:symbol:thymine" />
      </ambiguityMapping>
   
      <ambiguityMapping token="N">
        <symbolref name="urn:lsid:biojava.org:symbol:guanine" />
        <symbolref name="urn:lsid:biojava.org:symbol:adenine" />
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine" />
        <symbolref name="urn:lsid:biojava.org:symbol:thymine" />
      </ambiguityMapping>

      <ambiguityMapping token="x">
        <symbolref name="urn:lsid:biojava.org:symbol:guanine" />
        <symbolref name="urn:lsid:biojava.org:symbol:adenine" />
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine" />
        <symbolref name="urn:lsid:biojava.org:symbol:thymine" />
      </ambiguityMapping>

      <ambiguityMapping token="-">
        <!-- Gap is the empty set -->
      </ambiguityMapping>

      <!-- A gap by any other name would still smell so... gappy? -->
     
      <ambiguityMapping token=".">
      </ambiguityMapping>
     
      <ambiguityMapping token=" ">
      </ambiguityMapping>

      <gapSymbolMapping token="~" />>
    </characterTokenization>
  </alphabet>

  <alphabet name="RNA">
    <description>Ribose Nucleic Acid. Intermediate between genomic DNA and protein.</description>
    <symbolref name="urn:lsid:biojava.org:symbol:adenine"/>
    <symbolref name="urn:lsid:biojava.org:symbol:guanine"/>
    <symbolref name="urn:lsid:biojava.org:symbol:cytosine"/>
    <symbolref name="urn:lsid:biojava.org:symbol:uracil"/>

    <characterTokenization name="token" caseSensitive="false">
      <atomicMapping token="A">
        <symbolref name="urn:lsid:biojava.org:symbol:adenine"/>
      </atomicMapping>

      <atomicMapping token="G">
        <symbolref name="urn:lsid:biojava.org:symbol:guanine"/>
      </atomicMapping>

      <atomicMapping token="C">
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine"/>
      </atomicMapping>

      <atomicMapping token="U">
        <symbolref name="urn:lsid:biojava.org:symbol:uracil"/>
      </atomicMapping>

     <ambiguityMapping token="R">
        <symbolref name="urn:lsid:biojava.org:symbol:adenine" />
        <symbolref name="urn:lsid:biojava.org:symbol:guanine" />
      </ambiguityMapping>

      <ambiguityMapping token="Y">
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine" />
        <symbolref name="urn:lsid:biojava.org:symbol:uracil" />
      </ambiguityMapping>

      <ambiguityMapping token="M">
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine" />
        <symbolref name="urn:lsid:biojava.org:symbol:adenine" />
      </ambiguityMapping>

      <ambiguityMapping token="K">
        <symbolref name="urn:lsid:biojava.org:symbol:guanine" />
        <symbolref name="urn:lsid:biojava.org:symbol:uracil" />
      </ambiguityMapping>

      <ambiguityMapping token="S">
        <symbolref name="urn:lsid:biojava.org:symbol:guanine" />
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine" />
      </ambiguityMapping>

      <ambiguityMapping token="W">
        <symbolref name="urn:lsid:biojava.org:symbol:adenine" />
        <symbolref name="urn:lsid:biojava.org:symbol:uracil" />
      </ambiguityMapping>

      <ambiguityMapping token="H">
        <symbolref name="urn:lsid:biojava.org:symbol:adenine" />
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine" />
        <symbolref name="urn:lsid:biojava.org:symbol:uracil" />
      </ambiguityMapping>

      <ambiguityMapping token="B">
        <symbolref name="urn:lsid:biojava.org:symbol:guanine" />
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine" />
        <symbolref name="urn:lsid:biojava.org:symbol:uracil" />
      </ambiguityMapping>

      <ambiguityMapping token="V">
        <symbolref name="urn:lsid:biojava.org:symbol:guanine" />
        <symbolref name="urn:lsid:biojava.org:symbol:adenine" />
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine" />
      </ambiguityMapping>

      <ambiguityMapping token="D">
        <symbolref name="urn:lsid:biojava.org:symbol:guanine" />
        <symbolref name="urn:lsid:biojava.org:symbol:adenine" />
        <symbolref name="urn:lsid:biojava.org:symbol:uracil" />
      </ambiguityMapping>

      <ambiguityMapping token="N">
        <symbolref name="urn:lsid:biojava.org:symbol:guanine" />
        <symbolref name="urn:lsid:biojava.org:symbol:adenine" />
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine" />
        <symbolref name="urn:lsid:biojava.org:symbol:uracil" />
      </ambiguityMapping>
     
      <ambiguityMapping token="X">
        <symbolref name="urn:lsid:biojava.org:symbol:guanine" />
        <symbolref name="urn:lsid:biojava.org:symbol:adenine" />
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine" />
        <symbolref name="urn:lsid:biojava.org:symbol:uracil" />
      </ambiguityMapping>
     
      <ambiguityMapping token="-">
        <!-- Gap is the empty set -->
      </ambiguityMapping>

      <ambiguityMapping token=".">
      </ambiguityMapping>

      <gapSymbolMapping token="~" />>
    </characterTokenization>
  </alphabet>

  <alphabet name="NUCLEOTIDE">
    <description>All nucleic acids.</description>
    <symbolref name="urn:lsid:biojava.org:symbol:adenine"/>
    <symbolref name="urn:lsid:biojava.org:symbol:guanine"/>
    <symbolref name="urn:lsid:biojava.org:symbol:cytosine"/>
    <symbolref name="urn:lsid:biojava.org:symbol:thymine"/>
    <symbolref name="urn:lsid:biojava.org:symbol:uracil"/>
   
    <characterTokenization name="token" caseSensitive="false">
      <atomicMapping token="A">
        <symbolref name="urn:lsid:biojava.org:symbol:adenine"/>
      </atomicMapping>

      <atomicMapping token="G">
        <symbolref name="urn:lsid:biojava.org:symbol:guanine"/>
      </atomicMapping>

      <atomicMapping token="C">
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine"/>
      </atomicMapping>

      <atomicMapping token="T">
        <symbolref name="urn:lsid:biojava.org:symbol:thymine"/>
      </atomicMapping>

      <atomicMapping token="U">
        <symbolref name="urn:lsid:biojava.org:symbol:uracil"/>
      </atomicMapping>
         
      <ambiguityMapping token="R">
        <symbolref name="urn:lsid:biojava.org:symbol:adenine" />
        <symbolref name="urn:lsid:biojava.org:symbol:guanine" />
      </ambiguityMapping>

      <ambiguityMapping token="Y">
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine" />
        <symbolref name="urn:lsid:biojava.org:symbol:thymine" />
      </ambiguityMapping>

      <ambiguityMapping token="M">
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine" />
        <symbolref name="urn:lsid:biojava.org:symbol:adenine" />
      </ambiguityMapping>

      <ambiguityMapping token="K">
        <symbolref name="urn:lsid:biojava.org:symbol:guanine" />
        <symbolref name="urn:lsid:biojava.org:symbol:thymine" />
      </ambiguityMapping>

      <ambiguityMapping token="S">
        <symbolref name="urn:lsid:biojava.org:symbol:guanine" />
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine" />
      </ambiguityMapping>

      <ambiguityMapping token="W">
        <symbolref name="urn:lsid:biojava.org:symbol:adenine" />
        <symbolref name="urn:lsid:biojava.org:symbol:thymine" />
      </ambiguityMapping>

      <ambiguityMapping token="H">
        <symbolref name="urn:lsid:biojava.org:symbol:adenine" />
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine" />
        <symbolref name="urn:lsid:biojava.org:symbol:thymine" />
      </ambiguityMapping>

      <ambiguityMapping token="B">
        <symbolref name="urn:lsid:biojava.org:symbol:guanine" />
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine" />
        <symbolref name="urn:lsid:biojava.org:symbol:thymine" />
      </ambiguityMapping>

      <ambiguityMapping token="V">
        <symbolref name="urn:lsid:biojava.org:symbol:guanine" />
        <symbolref name="urn:lsid:biojava.org:symbol:adenine" />
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine" />
      </ambiguityMapping>

      <ambiguityMapping token="D">
        <symbolref name="urn:lsid:biojava.org:symbol:guanine" />
        <symbolref name="urn:lsid:biojava.org:symbol:adenine" />
        <symbolref name="urn:lsid:biojava.org:symbol:thymine" />
      </ambiguityMapping>
   
      <ambiguityMapping token="N">
        <symbolref name="urn:lsid:biojava.org:symbol:guanine" />
        <symbolref name="urn:lsid:biojava.org:symbol:adenine" />
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine" />
        <symbolref name="urn:lsid:biojava.org:symbol:thymine" />
        <symbolref name="urn:lsid:biojava.org:symbol:uracil" />
      </ambiguityMapping>

      <ambiguityMapping token="X">
        <symbolref name="urn:lsid:biojava.org:symbol:guanine" />
        <symbolref name="urn:lsid:biojava.org:symbol:adenine" />
        <symbolref name="urn:lsid:biojava.org:symbol:cytosine" />
        <symbolref name="urn:lsid:biojava.org:symbol:thymine" />
        <symbolref name="urn:lsid:biojava.org:symbol:uracil" />
      </ambiguityMapping>

      <ambiguityMapping token="-">
        <!-- Gap is the empty set -->
      </ambiguityMapping>

      <ambiguityMapping token=".">
      </ambiguityMapping>

      <gapSymbolMapping token="~" />>
    </characterTokenization>
  </alphabet>

  <alphabet name="PROTEIN">
    <description>Unmodified Amino-acids commonly found in proteins</description>
    <symbol name="urn:lsid:biojava.org:symbol:ALA">
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:ARG">
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:ASN">
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:ASP">
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:CYS">
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:GLN">
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:GLU">
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:GLY">
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:HIS">
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:ILE">
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:LEU">
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:LYS">
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:MET">
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:PHE">
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:PRO">
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:SER">
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:THR">
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:SEC">
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:TRP">
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:TYR">
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:VAL">
    </symbol>

    <characterTokenization name="token" caseSensitive="false">
      <atomicMapping token="A">
        <symbolref name="urn:lsid:biojava.org:symbol:ALA" />
      </atomicMapping>

      <atomicMapping token="R">
        <symbolref name="urn:lsid:biojava.org:symbol:ARG" />
      </atomicMapping>

      <atomicMapping token="N">
        <symbolref name="urn:lsid:biojava.org:symbol:ASN" />
      </atomicMapping>

      <atomicMapping token="D">
        <symbolref name="urn:lsid:biojava.org:symbol:ASP" />
      </atomicMapping>

      <atomicMapping token="C">
        <symbolref name="urn:lsid:biojava.org:symbol:CYS" />
      </atomicMapping>

      <atomicMapping token="Q">
        <symbolref name="urn:lsid:biojava.org:symbol:GLN" />
      </atomicMapping>

      <atomicMapping token="E">
        <symbolref name="urn:lsid:biojava.org:symbol:GLU" />
      </atomicMapping>

      <atomicMapping token="G">
        <symbolref name="urn:lsid:biojava.org:symbol:GLY" />
      </atomicMapping>
     
      <atomicMapping token="H">
        <symbolref name="urn:lsid:biojava.org:symbol:HIS" />
      </atomicMapping>

      <atomicMapping token="I">
        <symbolref name="urn:lsid:biojava.org:symbol:ILE" />
      </atomicMapping>

      <atomicMapping token="L">
        <symbolref name="urn:lsid:biojava.org:symbol:LEU" />
      </atomicMapping>

      <atomicMapping token="K">
        <symbolref name="urn:lsid:biojava.org:symbol:LYS" />
      </atomicMapping>

      <atomicMapping token="M">
        <symbolref name="urn:lsid:biojava.org:symbol:MET" />
      </atomicMapping>

      <atomicMapping token="F">
        <symbolref name="urn:lsid:biojava.org:symbol:PHE" />
      </atomicMapping>

      <atomicMapping token="P">
        <symbolref name="urn:lsid:biojava.org:symbol:PRO" />
      </atomicMapping>

      <atomicMapping token="S">
        <symbolref name="urn:lsid:biojava.org:symbol:SER" />
      </atomicMapping>

      <atomicMapping token="T">
        <symbolref name="urn:lsid:biojava.org:symbol:THR" />
      </atomicMapping>

      <atomicMapping token="U">
        <symbolref name="urn:lsid:biojava.org:symbol:SEC" />
      </atomicMapping>

      <atomicMapping token="W">
        <symbolref name="urn:lsid:biojava.org:symbol:TRP" />
      </atomicMapping>

      <atomicMapping token="Y">
        <symbolref name="urn:lsid:biojava.org:symbol:TYR" />
      </atomicMapping>

      <atomicMapping token="V">
        <symbolref name="urn:lsid:biojava.org:symbol:VAL" />
      </atomicMapping>

      <ambiguityMapping token="X">
        <symbolref name="urn:lsid:biojava.org:symbol:ALA"/>
        <symbolref name="urn:lsid:biojava.org:symbol:ARG"/>
        <symbolref name="urn:lsid:biojava.org:symbol:ASN"/>
        <symbolref name="urn:lsid:biojava.org:symbol:ASP"/>
        <symbolref name="urn:lsid:biojava.org:symbol:CYS"/>
        <symbolref name="urn:lsid:biojava.org:symbol:GLN"/>
        <symbolref name="urn:lsid:biojava.org:symbol:GLU"/>
        <symbolref name="urn:lsid:biojava.org:symbol:GLY"/>
        <symbolref name="urn:lsid:biojava.org:symbol:HIS"/>
        <symbolref name="urn:lsid:biojava.org:symbol:ILE"/>
        <symbolref name="urn:lsid:biojava.org:symbol:LEU"/>
        <symbolref name="urn:lsid:biojava.org:symbol:LYS"/>
        <symbolref name="urn:lsid:biojava.org:symbol:MET"/>
        <symbolref name="urn:lsid:biojava.org:symbol:PHE"/>
        <symbolref name="urn:lsid:biojava.org:symbol:PRO"/>
        <symbolref name="urn:lsid:biojava.org:symbol:SER"/>
        <symbolref name="urn:lsid:biojava.org:symbol:THR"/>
        <symbolref name="urn:lsid:biojava.org:symbol:SEC"/>
        <symbolref name="urn:lsid:biojava.org:symbol:TRP"/>
        <symbolref name="urn:lsid:biojava.org:symbol:TYR"/>
        <symbolref name="urn:lsid:biojava.org:symbol:VAL"/>
      </ambiguityMapping>

      <ambiguityMapping token="B">
        <symbolref name="urn:lsid:biojava.org:symbol:ASP" />
        <symbolref name="urn:lsid:biojava.org:symbol:ASN" />
      </ambiguityMapping>

      <ambiguityMapping token="Z">
        <symbolref name="urn:lsid:biojava.org:symbol:GLU" />
        <symbolref name="urn:lsid:biojava.org:symbol:GLN" />
      </ambiguityMapping>

      <ambiguityMapping token="-">
      </ambiguityMapping>

      <ambiguityMapping token=".">
      </ambiguityMapping>

      <gapSymbolMapping token="~" />>
    </characterTokenization>
  </alphabet>

  <alphabet name="PROTEIN-TERM" parent="PROTEIN">
    <symbol name="urn:lsid:biojava.org:symbol:TER">
      <description>Termination signal during translation</description>
    </symbol>

    <!-- THOMASD we surely shouldn't need to replicated this -->

    <characterTokenization name="token" caseSensitive="false">
      <atomicMapping token="*">
        <symbolref name="urn:lsid:biojava.org:symbol:TER" />
      </atomicMapping>

      <atomicMapping token="A">
        <symbolref name="urn:lsid:biojava.org:symbol:ALA" />
      </atomicMapping>

      <atomicMapping token="R">
        <symbolref name="urn:lsid:biojava.org:symbol:ARG" />
      </atomicMapping>

      <atomicMapping token="N">
        <symbolref name="urn:lsid:biojava.org:symbol:ASN" />
      </atomicMapping>

      <atomicMapping token="D">
        <symbolref name="urn:lsid:biojava.org:symbol:ASP" />
      </atomicMapping>

      <atomicMapping token="C">
        <symbolref name="urn:lsid:biojava.org:symbol:CYS" />
      </atomicMapping>

      <atomicMapping token="Q">
        <symbolref name="urn:lsid:biojava.org:symbol:GLN" />
      </atomicMapping>

      <atomicMapping token="E">
        <symbolref name="urn:lsid:biojava.org:symbol:GLU" />
      </atomicMapping>

      <atomicMapping token="G">
        <symbolref name="urn:lsid:biojava.org:symbol:GLY" />
      </atomicMapping>
     
      <atomicMapping token="H">
        <symbolref name="urn:lsid:biojava.org:symbol:HIS" />
      </atomicMapping>

      <atomicMapping token="I">
        <symbolref name="urn:lsid:biojava.org:symbol:ILE" />
      </atomicMapping>

      <atomicMapping token="L">
        <symbolref name="urn:lsid:biojava.org:symbol:LEU" />
      </atomicMapping>

      <atomicMapping token="K">
        <symbolref name="urn:lsid:biojava.org:symbol:LYS" />
      </atomicMapping>

      <atomicMapping token="M">
        <symbolref name="urn:lsid:biojava.org:symbol:MET" />
      </atomicMapping>

      <atomicMapping token="F">
        <symbolref name="urn:lsid:biojava.org:symbol:PHE" />
      </atomicMapping>

      <atomicMapping token="P">
        <symbolref name="urn:lsid:biojava.org:symbol:PRO" />
      </atomicMapping>

      <atomicMapping token="S">
        <symbolref name="urn:lsid:biojava.org:symbol:SER" />
      </atomicMapping>

      <atomicMapping token="T">
        <symbolref name="urn:lsid:biojava.org:symbol:THR" />
      </atomicMapping>

      <atomicMapping token="U">
        <symbolref name="urn:lsid:biojava.org:symbol:SEC" />
      </atomicMapping>

      <atomicMapping token="W">
        <symbolref name="urn:lsid:biojava.org:symbol:TRP" />
      </atomicMapping>

      <atomicMapping token="Y">
        <symbolref name="urn:lsid:biojava.org:symbol:TYR" />
      </atomicMapping>

      <atomicMapping token="V">
        <symbolref name="urn:lsid:biojava.org:symbol:VAL" />
      </atomicMapping>

      <ambiguityMapping token="X">
        <symbolref name="urn:lsid:biojava.org:symbol:ALA"/>
        <symbolref name="urn:lsid:biojava.org:symbol:ARG"/>
        <symbolref name="urn:lsid:biojava.org:symbol:ASN"/>
        <symbolref name="urn:lsid:biojava.org:symbol:ASP"/>
        <symbolref name="urn:lsid:biojava.org:symbol:CYS"/>
        <symbolref name="urn:lsid:biojava.org:symbol:GLN"/>
        <symbolref name="urn:lsid:biojava.org:symbol:GLU"/>
        <symbolref name="urn:lsid:biojava.org:symbol:GLY"/>
        <symbolref name="urn:lsid:biojava.org:symbol:HIS"/>
        <symbolref name="urn:lsid:biojava.org:symbol:ILE"/>
        <symbolref name="urn:lsid:biojava.org:symbol:LEU"/>
        <symbolref name="urn:lsid:biojava.org:symbol:LYS"/>
        <symbolref name="urn:lsid:biojava.org:symbol:MET"/>
        <symbolref name="urn:lsid:biojava.org:symbol:PHE"/>
        <symbolref name="urn:lsid:biojava.org:symbol:PRO"/>
        <symbolref name="urn:lsid:biojava.org:symbol:SER"/>
        <symbolref name="urn:lsid:biojava.org:symbol:THR"/>
        <symbolref name="urn:lsid:biojava.org:symbol:SEC"/>
        <symbolref name="urn:lsid:biojava.org:symbol:TRP"/>
        <symbolref name="urn:lsid:biojava.org:symbol:TYR"/>
        <symbolref name="urn:lsid:biojava.org:symbol:VAL"/>
      </ambiguityMapping>

      <ambiguityMapping token="X">
        <symbolref name="urn:lsid:biojava.org:symbol:ALA"/>
        <symbolref name="urn:lsid:biojava.org:symbol:ARG"/>
        <symbolref name="urn:lsid:biojava.org:symbol:ASN"/>
        <symbolref name="urn:lsid:biojava.org:symbol:ASP"/>
        <symbolref name="urn:lsid:biojava.org:symbol:CYS"/>
        <symbolref name="urn:lsid:biojava.org:symbol:GLN"/>
        <symbolref name="urn:lsid:biojava.org:symbol:GLU"/>
        <symbolref name="urn:lsid:biojava.org:symbol:GLY"/>
        <symbolref name="urn:lsid:biojava.org:symbol:HIS"/>
        <symbolref name="urn:lsid:biojava.org:symbol:ILE"/>
        <symbolref name="urn:lsid:biojava.org:symbol:LEU"/>
        <symbolref name="urn:lsid:biojava.org:symbol:LYS"/>
        <symbolref name="urn:lsid:biojava.org:symbol:MET"/>
        <symbolref name="urn:lsid:biojava.org:symbol:PHE"/>
        <symbolref name="urn:lsid:biojava.org:symbol:PRO"/>
        <symbolref name="urn:lsid:biojava.org:symbol:SER"/>
        <symbolref name="urn:lsid:biojava.org:symbol:THR"/>
        <symbolref name="urn:lsid:biojava.org:symbol:SEC"/>
        <symbolref name="urn:lsid:biojava.org:symbol:TRP"/>
        <symbolref name="urn:lsid:biojava.org:symbol:TYR"/>
        <symbolref name="urn:lsid:biojava.org:symbol:VAL"/>
        <symbolref name="urn:lsid:biojava.org:symbol:TER" />
      </ambiguityMapping>

      <ambiguityMapping token="B">
        <symbolref name="urn:lsid:biojava.org:symbol:ASP" />
        <symbolref name="urn:lsid:biojava.org:symbol:ASN" />
      </ambiguityMapping>

      <ambiguityMapping token="Z">
        <symbolref name="urn:lsid:biojava.org:symbol:GLU" />
        <symbolref name="urn:lsid:biojava.org:symbol:GLN" />
      </ambiguityMapping>

      <ambiguityMapping token="-">
      </ambiguityMapping>

      <ambiguityMapping token=".">
      </ambiguityMapping>

      <gapSymbolMapping token="~" />>
    </characterTokenization>

  </alphabet>

  <alphabet name="STRUCTURE">
    <symbol name="urn:lsid:biojava.org:symbol:nothing">
      <description>The structure here is undetermined</description>
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:coil">
      <description>Random coil</description>
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:alpha-helix">
      <description>part of an alpha-helix</description>
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:3(10)-helix">
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:Pi-helix">
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:sheet">
      <description>part of an extended hydrogen bonded beta-strand (extended strand, logs of Bs)</description>
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:beta-bridge">
      <description>like a lone E</description>
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:turn">
    </symbol>

    <symbol name="urn:lsid:biojava.org:symbol:bend">
      <description>five-residue bend centred at residue i</description>
    </symbol>

    <characterTokenization name="token" caseSensitive="false">
      <atomicMapping token=" ">
        <symbolref name="urn:lsid:biojava.org:symbol:nothing" />
      </atomicMapping>

      <atomicMapping token="c">
        <symbolref name="urn:lsid:biojava.org:symbol:coil" />
      </atomicMapping>

      <atomicMapping token="h">
        <symbolref name="urn:lsid:biojava.org:symbol:alpha-helix" />
      </atomicMapping>

      <atomicMapping token="g">
        <symbolref name="urn:lsid:biojava.org:symbol:3(10)-helix" />
      </atomicMapping>

      <atomicMapping token="i">
        <symbolref name="urn:lsid:biojava.org:symbol:Pi-helix" />
      </atomicMapping>

      <atomicMapping token="e">
        <symbolref name="urn:lsid:biojava.org:symbol:sheet" />
      </atomicMapping>

      <atomicMapping token="b">
        <symbolref name="urn:lsid:biojava.org:symbol:beta-bridge" />
      </atomicMapping>

      <atomicMapping token="t">
        <symbolref name="urn:lsid:biojava.org:symbol:turn" />
      </atomicMapping>

      <atomicMapping token="s">
        <symbolref name="urn:lsid:biojava.org:symbol:bend" />
      </atomicMapping>

      <ambiguityMapping token="-">
      </ambiguityMapping>

      <gapSymbolMapping token="~" />>
    </characterTokenization>
  </alphabet>

</alphabetManager>


_______________________________________________
Biojava-l mailing list  -  Biojava-l@...
http://lists.open-bio.org/mailman/listinfo/biojava-l

Re: Upper case symbols

by Andreas Dräger-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Thank you, Eric!

This XML file works greatly!

Cheers
Andreas
_______________________________________________
Biojava-l mailing list  -  Biojava-l@...
http://lists.open-bio.org/mailman/listinfo/biojava-l