Bio::Search::Tiling::MapTiling and BlastX report.

by Frederic.SAPET

Hello,

I'm trying to use the new Bio::Search::Tiling::MapTiling module (I have tried
both BioPerl 1.6.1 and the latest nightly build) on a WuBlastX report, after
reading the documentation (http://www.bioperl.org/wiki/HOWTO:Tiling).
The sample file I used for my test is attached.

Here is the code I have tried:

use strict;
use warnings;
use Bio::SearchIO;
use Bio::Search::Tiling::MapTiling;

my $blio = Bio::SearchIO->new( -format => 'blast',
                               -file   => 'WuBlastX.txt' );

my $result = $blio->next_result;

while ( my $hit = $result->next_hit ) {
    my $tiling = Bio::Search::Tiling::MapTiling->new($hit);

    # Query side: one context per strand/frame (here m0, m1, m2)
    for my $contextQ ( $tiling->contexts('query') ) {
        my ($min, $max) = $tiling->range('query', $contextQ);
        my $tLengthQ    = $tiling->length('query', 'exact', $contextQ);
        print "QUERY range in *$contextQ* => $min, $max (length=$tLengthQ)\n";
    }

    # Subject side: protein sequence, reported as a single 'all' context
    for my $contextS ( $tiling->contexts('subject') ) {
        my ($min2, $max2) = $tiling->range('subject', $contextS);
        print "SUBJECT range in *$contextS* => $min2, $max2\n";
    }
}

exit;

The results printed on my terminal are:

QUERY range in *m1* => 4065, 10571 (length=7038)
QUERY range in *m0* => 7, 11037 (length=2577)
QUERY range in *m2* => 2462, 14599 (length=327)
SUBJECT range in *all* => 435, 270

I think that the correct query range in the m1 context should be 1170 to 10571
(and the length 6828?), and that the subject range should be 231 to 3563, no?

The m1 context seems to be the best one, with 2271 amino acids aligned (in
4 HSPs) between MySampleSeq and the protein B9GCX0.

Could you please help me find what is wrong?

Thank you.

Fred
BLASTX 2.0MP-WashU [04-May-2006] [linux26-x64-I32LPF64 2006-05-10T17:22:28]

Copyright (C) 1996-2006 Washington University, Saint Louis, Missouri USA.
All Rights Reserved.

Reference:  Gish, W. (1996-2006) http://blast.wustl.edu
Gish, Warren and David J. States (1993).  Identification of protein coding
regions by database similarity search.  Nat. Genet. 3:266-72.

Query=  MySampleSeq
        (14,601 letters)

  Translating both strands of query sequence in all 6 reading frames

Database:  UNIPROT_B9GCX0
           1 sequences; 3829 total letters.
Searching done

                                                                     Smallest
                                                                       Sum
                                                              High  Probability
Sequences producing High-scoring Segment Pairs:              Score  P(N)      N

UNIPROT:B9GCX0 |B9GCX0|B9GCX0|SubName: Full=Putative unch...  7596  0.       11


>UNIPROT:B9GCX0 |B9GCX0|B9GCX0|SubName: Full=Putative uncharacterized protein;
            |Oryza sativa subsp japonica (Rice) |AA|3829
        Length = 3829

  Minus Strand HSPs:

 Score = 7596 (2679.0 bits), Expect = 0., Sum P(11) = 0., Group = 1
 Identities = 1533/2058 (74%), Positives = 1675/2058 (81%), Frame = -2

Query: 10205 PGQTSDKGKSSNLCVIHIPDMHLQKEDDLSILKQCVDKFNVPPEHRFALLTRIRYARAFN 10026
             P Q+SDK K SNLCVIHIPD+HLQKEDDLSILKQCVDKFNVP EHRF+L TRIRYA AFN
Sbjct:   435 PDQSSDKAKPSNLCVIHIPDLHLQKEDDLSILKQCVDKFNVPSEHRFSLFTRIRYAHAFN 494

Query: 10025 SARTCRIYSRISLLSFIVLVQSSDAHDELTYFFTNEPEYINELIRLVRSEDSVPGSIRXX 9846
             S RTCR+YSRISLL+FIVLVQSSDAHDELT FFTNEPEYINELIRLVRSE+ VPG IR  
Sbjct:   495 SPRTCRLYSRISLLAFIVLVQSSDAHDELTSFFTNEPEYINELIRLVRSEEFVPGPIRAL 554

Query:  9845 XXXXXXXXXXXXXSSHERARXXXXXXXXXXXXNRMVLLSVLQKAISSLNSLNDTSSPLIV 9666
                          SSHERAR            NRMVLLSVLQKAISSL+S NDTSSPLIV
Sbjct:   555 AMLALGAQLAAYASSHERARILSGSSIISAGGNRMVLLSVLQKAISSLSSPNDTSSPLIV 614

Query:  9665 DAXXXXXXXXXXXXXXXGTTVRGSGMVXXXXXXXRDNDPSHMHLVCLAVKTLQKLMEYSS 9486
             DA               GTTVRGSGMV       +DNDPSHMHLVCLAVKTLQKLMEYSS
Sbjct:   615 DALLQFFLLHVLSSSSSGTTVRGSGMVPPLLPLLQDNDPSHMHLVCLAVKTLQKLMEYSS 674

Query:  9485 PAVSLFKDLGGVELLSQRLHVEVQRVIGTADGHNSMVT-DAVKSDDNHMYSQKRLIKALL 9309
             PAVSLFKDLGGVELLSQRLHVEVQRVIG  D HNSMVT DA+KS+++H+YSQKRLIKALL
Sbjct:   675 PAVSLFKDLGGVELLSQRLHVEVQRVIGV-DSHNSMVTSDALKSEEDHLYSQKRLIKALL 733

Query:  9308 KALGSATYSPGNPARSQSSQDNSLPVSLSLIFQNVDKFGGDIYFSAVTVMSEIIHKDPTC 9129
             KALGSATYSP NPARSQSS DNSLP+SLSLIFQNVDKFGGDIYFSAVTVMSEIIHKDPTC
Sbjct:   734 KALGSATYSPANPARSQSSNDNSLPISLSLIFQNVDKFGGDIYFSAVTVMSEIIHKDPTC 793

Query:  9128 FITLKELGVPDAFISSVTAGVIPSCKALICVPNGLGAICLNNQGLEAVRETSALRFLVDT 8949
             F +LKELG+PDAF+SSV+AGVIPSCKALICVPNGLGAICLNNQGLEAVRETSALRFLVDT
Sbjct:   794 FPSLKELGLPDAFLSSVSAGVIPSCKALICVPNGLGAICLNNQGLEAVRETSALRFLVDT 853

Query:  8948 FTSRKYLIPMNEGXXXXXXXXXXXXRHVQSLRSIGVDIIIEIINKLSSSQEYKNNE-TAT 8772
             FTSRKYLIPMNEG            RHVQSLRS GVDIIIEIINKLSS +E K+NE  A+
Sbjct:   854 FTSRKYLIPMNEGVVLLANAVEELLRHVQSLRSTGVDIIIEIINKLSSPREDKSNEPAAS 913

Query:  8771 LQEKTDMETDVEGRDLVSAMDSSVDGSNDEQFSHLSIFHVMVLVHRTMENSETCRLFVEK 8592
               E+T+METD EGRDLVSAMDSS DG+NDEQFSHLSIFHVMVLVHRTMENSETCRLFVEK
Sbjct:   914 SDERTEMETDAEGRDLVSAMDSSEDGTNDEQFSHLSIFHVMVLVHRTMENSETCRLFVEK 973

Query:  8591 GGXXXXXXXXXRPSITQSSGGMPIALHSTMVFKGFTQHHSTPLARAFCSSLKEHLKSALK 8412
             GG         RPSITQSSGGMPIALHSTMVFKGFTQHHSTPLARAFCSSLKEHLK+AL+
Sbjct:   974 GGLQALLTLLLRPSITQSSGGMPIALHSTMVFKGFTQHHSTPLARAFCSSLKEHLKNALQ 1033

Query:  8411 ELDKVSNSFDMTKIEKGAIPSXXXXXXXXXXAASKDNRWMNALLSEFGDASREVLEDVGQ 8232
             ELD V++S ++ K+EKGAIPS          AASKDNRWMNALLSEFGD+SR+VLED+G+
Sbjct:  1034 ELDTVASSGEVAKLEKGAIPSLFVVEFLLFLAASKDNRWMNALLSEFGDSSRDVLEDIGR 1093

Query:  8231 VHREVLWKISLFEKNKIVXXXXXXXXXXXXXXPDMSASDIGDSRYTSFRQYLDPILRRRG 8052
             VHREVLW+ISLFE+ K+                D +  D+ DSRYTSFRQYLDP+LRRRG
Sbjct:  1094 VHREVLWQISLFEEKKV--EPETSSPLANDSQQDAAVGDVDDSRYTSFRQYLDPLLRRRG 1151

Query:  8051 SGWNIESQVSDLINMYRDIGRAASDSQRVGSDRYSSLGLPXXXXXXXX-XXXXXXXXTRX 7875
             SGWNIESQVSDLIN+YRDIGRAA DSQR     Y S GLP                 T+
Sbjct:  1152 SGWNIESQVSDLINIYRDIGRAAGDSQR-----YPSAGLPSSSSQDQPPSSSDASASTKS 1206

Query:  7874 XXXXXXXXXXXCFDMMRSLSYHINHLFLELGKAMLFASRRENSPVNLSPAVISVANNIAS 7695
                        C DMMRSLSYHINHLF+ELGKAML  SRRENSPVNLS +++SVA+NIAS
Sbjct:  1207 EEDKKRSEHSSCCDMMRSLSYHINHLFMELGKAMLLTSRRENSPVNLSASIVSVASNIAS 1266

Query:  7694 IVLEHLNFEGHSVSFERDMTVTTKCRYLGKVVEFVDGMLLDRPESCNSIMVNSFYCRGVI 7515
             IVLEHLNFEGH++S ER+ TV+TKCRYLGKVVEF+DG+LLDRPESCN IM+NSFYCRGVI
Sbjct:  1267 IVLEHLNFEGHTISSERETTVSTKCRYLGKVVEFIDGILLDRPESCNPIMLNSFYCRGVI 1326

Query:  7514 QAILTTFQATSELLFTMSRPPSSPMETDSKTGKDGKEMDSSWIYGPLTSYGAIMDHLVTS 7335
             QAILTTF+ATSELLF+M+R PSSPMETDSK+ K+ +E DSSWIYGPL+SYGAI+DHLVTS
Sbjct:  1327 QAILTTFEATSELLFSMNRLPSSPMETDSKSVKEDRETDSSWIYGPLSSYGAILDHLVTS 1386

Query:  7334 SFILSSSTRQLLEQPIFNGSVRFPQDAETFMKLLQSKVLKTVLPIWAHPQFPECNIELIS 7155
             SFILSSSTRQLLEQPIF+G++RFPQDAE FMKLLQS+VLKTVLPIW HPQFPECN+ELIS
Sbjct:  1387 SFILSSSTRQLLEQPIFSGNIRFPQDAEKFMKLLQSRVLKTVLPIWTHPQFPECNVELIS 1446

Query:  7154 SVMSIMRHVCSGVEVKDTVGNGGARLAGPPPDESAISLIVEMGFSRARAEEALRQVGTNS 6975
             SV SIMRHV SGVEVK+T  N GARLAGPPPDE+AISLIVEMGFSRARAEEALRQVGTNS
Sbjct:  1447 SVTSIMRHVYSGVEVKNTAINTGARLAGPPPDENAISLIVEMGFSRARAEEALRQVGTNS 1506

Query:  6974 VEIATDWLFAHPEEPQEEDDELARALAMSLGNSVTPAQEGDSRSNDLELEEATVQPPPID 6795
             VEIATDWLF+HPEEPQE DDELARALAMSLGNS T AQE D +SNDLELEE TVQ PPID
Sbjct:  1507 VEIATDWLFSHPEEPQE-DDELARALAMSLGNSDTSAQEEDGKSNDLELEEETVQLPPID 1565

Query:  6794 EMLRSCLQLLQRKEALAFSVRDMLVTISSQNDGQNRVKVLTYLIDNLKQCVVASEPSNDT 6615
             E+L SCL+LLQ KE+LAF VRDML+T+SSQNDGQNRVKVLTYLID+LK C+++S+P   T
Sbjct:  1566 EVLSSCLRLLQTKESLAFPVRDMLLTMSSQNDGQNRVKVLTYLIDHLKNCLMSSDPLKST 1625

Query:  6614 XXXXXXXXXXXXXHGDTAAREVASKAGLVKVALDLLCSWEVQIRESSMIEVPNWVISCFL 6435
                          HGDTAAREVASKAGLVKVAL+LLCSWE++ R+  + +VPNWV SCFL
Sbjct:  1626 ALSALFHVLALILHGDTAAREVASKAGLVKVALNLLCSWELEPRQGEISDVPNWVPSCFL 1685

Query:  6434 SVDQMLQLEPKLPDVTELHVLKRDNSNIKTSLVIDDSKRKDSESLPNVGLLDMEDQFQLL 6255
             S+D+MLQL+PKLPDVTEL VLK+DNSN +TS+VIDDSK+KDSE+  + GLLD+EDQ QLL
Sbjct:  1686 SIDRMLQLDPKLPDVTELDVLKKDNSNTQTSVVIDDSKKKDSEASSSTGLLDLEDQKQLL 1745

Query:  6254 KICCKCIGKQLPSASMHAILQLSATLTKVHAAAICFLESGGLNALLSLPTSSLFSGFNNM 6075
             KICCKCI KQLPSA+MHAILQL ATLTK+HAAAICFLESGGL+ALLSLPTSSLFSGFN++
Sbjct:  1746 KICCKCIQKQLPSATMHAILQLCATLTKLHAAAICFLESGGLHALLSLPTSSLFSGFNSV 1805

Query:  6074 ASTIIRHILEDPHTLQQAMELEIRHSLVTAANRHANPRVTPRNFIQNLAFVVYRDPVIFM 5895
             ASTIIRHILEDPHTLQQAMELEIRHSLVTAANRHANPRVTPRNF+QNLAFVVYRDPVIFM
Sbjct:  1806 ASTIIRHILEDPHTLQQAMELEIRHSLVTAANRHANPRVTPRNFVQNLAFVVYRDPVIFM 1865

Query:  5894 KAAQSVCQIEMVGDRPYVVLLXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXSGDTAAGSP 5715
             KAAQ+VCQIEMVGDRPYVVLL                              SGD A GSP
Sbjct:  1866 KAAQAVCQIEMVGDRPYVVLLKDREKEKNKEKEKDKPADKDKTSGAATKMTSGDMALGSP 1925

Query:  5714 ANSHGKQSDLNSRNVKSHRKPPQSFVTVIEHLLDLLMSFVPPPRPEDQVD-VSGTALSSD 5538
              +S GKQ+DLN+RNVKS+RKPPQSFVTVIE+LLDL+MSF+PPPR ED+ D  S TA S+D
Sbjct:  1926 ESSQGKQTDLNTRNVKSNRKPPQSFVTVIEYLLDLVMSFIPPPRAEDRPDGESSTASSTD 1985

Query:  5537 MDIDCSSAKGKGKAVSVPPEESKHAIQESTASLAKTAFFLKLLTDVLLTYASSIHVVLRH 5358
             MDID SSAKGKGKAV+V PEESKHAIQE+TASLAK+AF LKLLTDVLLTYASSI VVLRH
Sbjct:  1986 MDID-SSAKGKGKAVAVTPEESKHAIQEATASLAKSAFVLKLLTDVLLTYASSIQVVLRH 2044

Query:  5357 DAELSNMHGPNRTSARLTSGGIFNHILQHFLPHATRQKKERKNDGDWMYKLATRANQFLV 5178
             DA+LSN  GPNR    ++SGG+F+HILQHFLPH+T+QKKERK DGDW YKLATRANQFLV
Sbjct:  2045 DADLSNARGPNRIG--ISSGGVFSHILQHFLPHSTKQKKERKADGDWRYKLATRANQFLV 2102

Query:  5177 ASSIRSAEARKRIFSEICSIFLDFTDSSAGYNAPVPRMNVYVDLLNDILSARSPTGSSLS 4998
             ASSIRSAE RKRIFSEICSIF+DFTDS AG   P+ RMN YVDLLNDILSARSPTGSSLS
Sbjct:  2103 ASSIRSAEGRKRIFSEICSIFVDFTDSPAGCKPPILRMNAYVDLLNDILSARSPTGSSLS 2162

Query:  4997 AESAVIFVEAGLVHSLSTMLQVLDLDHPDSAKIVTAVVKALELVSKEHIHSAD-NAKGVN 4821
             AESAV FVE GLV  LS  LQV+DLDHPDSAKIVTA+VKALE+V+KEH+HSAD NAKG N
Sbjct:  2163 AESAVTFVEVGLVQYLSKTLQVIDLDHPDSAKIVTAIVKALEVVTKEHVHSADLNAKGEN 2222

Query:  4820 SSKIAXXXXXXXXXXXRFQALDMTSQPTEMVTDHRETFNAVRTSQISDSVADEMDHDRDM 4641
             SSK+            RFQALD T+QPTEMVTDHRE FNAV+TSQ SDSVADEMDHDRD+
Sbjct:  2223 SSKVVSDQSNLDPSSNRFQALD-TTQPTEMVTDHREAFNAVQTSQSSDSVADEMDHDRDL 2281

Query:  4640 DGGFARDGEDDFMHEMAEDGTGDGSTMEIRIEIPRNREDDMAPAADDTXXXXXXXXXXXX 4461
             DGGFARDGEDDFMHE+AEDGT + STMEIR EIPRNREDDMA   +D+            
Sbjct:  2282 DGGFARDGEDDFMHEIAEDGTPNESTMEIRFEIPRNREDDMADDDEDSDEDMSADDGEEV 2341

Query:  4460 XXXXXXXXXXXXXX-----AHRMSHPXXXXXXXXXXXXXXXXXXXXXXXXXXXXX-GVIL 4299
                                AH+MSHP                              GVIL
Sbjct:  2342 DEDEDEDEDEENNNLEEDDAHQMSHPDTDQEDREMDEEEFDEDLLEEDDDEDEDEEGVIL 2401

Query:  4298 RLEEGINGINVLDHVEVFGGSNNLSGDTLRVMPLDIFGTRRQGRSTSIYNLLGRASDHGV 4119
             RLEEGINGINV DH+EVFGGSNNLSGDTLRVMPLDIFGTRRQGRSTSIYNLLGRA DHGV
Sbjct:  2402 RLEEGINGINVFDHIEVFGGSNNLSGDTLRVMPLDIFGTRRQGRSTSIYNLLGRAGDHGV 2461

Query:  4118 LDHPLLEEPSSTTNFSDQ 4065
              DHPLLEEPSS  +   Q
Sbjct:  2462 FDHPLLEEPSSVLHLPQQ 2479

 Score = 1665 (591.2 bits), Expect = 0., Sum P(11) = 0., Group = 1
 Identities = 326/371 (87%), Positives = 342/371 (92%), Frame = -1

Query:  1119 LSDSAYLLVGEVLKKIVALAPFFCCHFINELARSMQNLTLRAMKELHLYENSEKALLSSS 940
             LSD+AYLLV EVLKKIVALAPFFCCHFINELA SMQNLTL AMKELHLYE+SEKALLS+S
Sbjct:  3194 LSDNAYLLVAEVLKKIVALAPFFCCHFINELAHSMQNLTLCAMKELHLYEDSEKALLSTS 3253

Query:   939 SANGTAVLRVVQALSSLVNTLQERKDPEQPAEKDHSDAVSQISEINTALDSLWLELSNCI 760
             SANGTA+LRVVQALSSLV TLQE+KDP+ PAEKDHSDA+SQISEINTALD+LWLELSNCI
Sbjct:  3254 SANGTAILRVVQALSSLVTTLQEKKDPDHPAEKDHSDALSQISEINTALDALWLELSNCI 3313

Query:   759 SKIESSSEYXXXXXXXXXXXXMLTTGVAPPLPAGTQNLLPYIESFFVTCEKLRPGQPDAV 580
             SKIESSSEY             LTTGVAPPLPAGTQN+LPYIESFFVTCEKLRPGQPDA+
Sbjct:  3314 SKIESSSEYASNLSPASANAATLTTGVAPPLPAGTQNILPYIESFFVTCEKLRPGQPDAI 3373

Query:   579 QDASTSDMEDASTSSGGQRSSACQASLDEKQNAFVKFSEKHRRLLNAFIRQNSGLLEKSF 400
             Q+ASTSDMEDASTSSGGQ+SS   A+LDEK NAFVKFSEKHRRLLNAFIRQN GLLEKSF
Sbjct:  3374 QEASTSDMEDASTSSGGQKSSGSHANLDEKHNAFVKFSEKHRRLLNAFIRQNPGLLEKSF 3433

Query:   399 SLMLKIPRLIDFDNKRAYFRSKIKHQYDHHHHSPVRISVRRPYILEDSYNQLRMRSPQDL 220
             SLMLKIPRLI+FDNKRAYFRSKIKHQ+DHHH SPVRISVRR YILEDSYNQLRMRSPQDL
Sbjct:  3434 SLMLKIPRLIEFDNKRAYFRSKIKHQHDHHH-SPVRISVRRAYILEDSYNQLRMRSPQDL 3492

Query:   219 KGRLTVQFQGEEGIDAGGLTREWYQSISRVIVDKSALLFTTVGNDLTFQPNPNSVYQTEH 40
             KGRLTV FQGEEGIDAGGLTREWYQ +SRVI DK ALLFTTVGNDLTFQPNPNSVYQTEH
Sbjct:  3493 KGRLTVHFQGEEGIDAGGLTREWYQLLSRVIFDKGALLFTTVGNDLTFQPNPNSVYQTEH 3552

Query:    39 LSYFKFVGRVV 7
             LSYFKFVGRVV
Sbjct:  3553 LSYFKFVGRVV 3563

 Score = 1356 (482.4 bits), Expect = 0., Sum P(11) = 0., Group = 1
 Identities = 284/370 (76%), Positives = 307/370 (82%), Frame = -1

Query:  3882 ENLVEMAFSDRNHESSSSRLDAIFRSLRSGRNGHRFNMWLDDGPQRNGSAAPAVPEGIEE 3703
             ENLVEMAFSDRNH++SSSRLDAIFRSLRSGR+GHRFNMWLDD PQR GSAAPAVPEGIEE
Sbjct:  2483 ENLVEMAFSDRNHDNSSSRLDAIFRSLRSGRSGHRFNMWLDDSPQRTGSAAPAVPEGIEE 2542

Query:  3702 LLISHLRRPTP-QPDGQRTPVGGAQENDQPN----HGSDAEAREVAPAQQNENSESTLNP 3538
             LL+S LRRPTP QPD Q TP GGA+ENDQ N    H S+ EA   AP +QNEN+++ + P
Sbjct:  2543 LLVSQLRRPTPEQPDEQSTPAGGAEENDQSNQQHLHQSETEAGGDAPTEQNENNDNAVTP 2602

Query:  3537 -----LDLSECAGPAPPDSDALQRDVSNASELATEMQYERSDAITRDVEAVSQASSGSGA 3373
                  LD SE A PAPP S+ALQR+VS ASE ATEMQYERSDA+ RDVEAVSQASSGSGA
Sbjct:  2603 AARSELDGSESADPAPP-SNALQREVSGASEHATEMQYERSDAVVRDVEAVSQASSGSGA 2661

Query:  3372 TLGESLRSLEVEIGSVEGHDDGDRHGTSGTSERLPLGDIQAAARSRRPSGNAVPVSSRDM 3193
             TLGESLRSLEVEIGSVEGHDDGDRHG S   +RLPLGD+QAA+RSRRP G+ V  SSRD+
Sbjct:  2662 TLGESLRSLEVEIGSVEGHDDGDRHGAS---DRLPLGDLQAASRSRRPPGSVVLGSSRDI 2718

Query:  3192 SLESVSEVPQNPDQEPDQNASEGNQEPTRAAGADSIDPTFLEALPEDLRAEVLSSRQNQV 3013
             SLESVSEVPQN +QE DQNA EG+QEP RAA  DSIDPTFLEALPEDLRAEVLSSRQNQV
Sbjct:  2719 SLESVSEVPQNQNQESDQNADEGDQEPNRAADTDSIDPTFLEALPEDLRAEVLSSRQNQV 2778

Query:  3012 TQTSNDQPQDDGDIDPEFLAALPPDIREEVLAXXXXXXXXXXXXXXXXXPVEMDAVSIIA 2833
             TQTSN+QPQ+DGDIDPEFLAALPPDIREEVLA                 PVEMDAVSIIA
Sbjct:  2779 TQTSNEQPQNDGDIDPEFLAALPPDIREEVLAQQRAQRLQQSQELEGQ-PVEMDAVSIIA 2837

Query:  2832 TFPSEIREEV 2803
             TFPSEIREEV
Sbjct:  2838 TFPSEIREEV 2847

 Score = 323 (118.8 bits), Expect = 0., Sum P(11) = 0., Group = 1
 Identities = 61/72 (84%), Positives = 67/72 (93%), Frame = -2

Query:  2351 LQPLYKGQLQKLLVNLCTHRGSRQALVQILVDMLMLDLQGFSKKSIDAPEPPFRLYGCHA 2172
             +QPLYKGQLQ+LL+NLC HR SR++LVQILVDMLMLDLQG SKKSIDA EPPFRLYGCHA
Sbjct:  2945 VQPLYKGQLQRLLLNLCAHRESRKSLVQILVDMLMLDLQGSSKKSIDATEPPFRLYGCHA 3004

Query:  2171 NIAYSRPQSSDG 2136
             NI YSRPQS+DG
Sbjct:  3005 NITYSRPQSTDG 3016

 Score = 314 (115.6 bits), Expect = 0., Sum P(11) = 0., Group = 1
 Identities = 55/70 (78%), Positives = 64/70 (91%), Frame = -2

Query: 10865 WKQGNFHHWRPLFIHFDTYFKTYISSRKDLLLSDDMTEADPMPKNAILKILRVMQIILEN 10686
             + +GNFHHW+PLF+HFDTYFKT ISSRKDLLLSDDM E DP+PKN IL+ILRVMQI+LEN
Sbjct:   303 FNKGNFHHWKPLFMHFDTYFKTQISSRKDLLLSDDMAEGDPLPKNTILQILRVMQIVLEN 362

Query: 10685 CQNRSSFTGL 10656
             CQN++SF GL
Sbjct:   363 CQNKTSFAGL 372

 Score = 310 (114.2 bits), Expect = 0., Sum P(11) = 0., Group = 1
 Identities = 64/93 (68%), Positives = 74/93 (79%), Frame = -1

Query:  1698 QLLNLLDVVMHNAENEIKQAKLEASSEKPSAPDNAVQDGKNNSDISVSYGSELNPEDGSK 1519
             QLLNLL+VVM NAENEI QAKLEA+SEKPS P+NA QD +  ++ + S GS+ N ED SK
Sbjct:  3101 QLLNLLEVVMLNAENEITQAKLEAASEKPSGPENATQDAQEGANAAGSSGSKSNAEDSSK 3160

Query:  1518 APAVDNRSNLQAVLRSLPQPELRLLCSLLAHDG 1420
              P VD  S+LQ VL+SLPQ ELRLLCSLLAHDG
Sbjct:  3161 LPPVDGESSLQKVLQSLPQAELRLLCSLLAHDG 3193

 Score = 274 (101.5 bits), Expect = 0., Sum P(11) = 0., Group = 1
 Identities = 55/87 (63%), Positives = 61/87 (70%), Frame = -2

Query:  2030 GLPPLVSRRVLETLTNLARSHPNVAKLLLFLEFPCPSRCFPEAHDHRHGKAVLLDDGEEQ 1851
             G+PPLVSRRVLETLT LAR+HPNVAKLLLFLEFPCP  C  E  D R GKAVL++   EQ
Sbjct:  3016 GVPPLVSRRVLETLTYLARNHPNVAKLLLFLEFPCPPTCHAETSDQRRGKAVLMEGDSEQ 3075

Query:  1850 KTFAXXXXXXXXXXXXYMRSVAHLEQV 1770
               +A            YMRSVAHLEQ+
Sbjct:  3076 NAYALVLLLTLLNQPLYMRSVAHLEQL 3102

 Score = 231 (86.4 bits), Expect = 0., Sum P(11) = 0., Group = 1
 Identities = 46/71 (64%), Positives = 50/71 (70%), Frame = -2

Query: 10571 CVNSLPFL-CQHLKLLLASSDPEIXXXXXXXXXXXXKINPSKLHMNGKLISCGPINTHLL 10395
             C N   F   +H +LLLASSDPEI            KINPSKLHMNGKLI+CG IN+HLL
Sbjct:   363 CQNKTSFAGLEHFRLLLASSDPEIVVAALETLAALVKINPSKLHMNGKLINCGAINSHLL 422

Query: 10394 SLAQGWGSKEE 10362
             SLAQGWGSKEE
Sbjct:   423 SLAQGWGSKEE 433

 Score = 217 (81.4 bits), Expect = 0., Sum P(11) = 0., Group = 1
 Identities = 46/70 (65%), Positives = 50/70 (71%), Frame = -3

Query:  2665 NMLRERFAHRYHSSSLFGMXXXXXXXXXXXX-DIMAAGLDRNTGDPSRS-TSKPIETEGA 2492
             NMLRERFAHRYHS SLFGM             DI+ +GLDRN GD SR  TSKPIETEG+
Sbjct:  2868 NMLRERFAHRYHSGSLFGMNSRGRRGESSRRGDIIGSGLDRNAGDSSRQPTSKPIETEGS 2927

Query:  2491 PLVDEDGLKA 2462
             PLVD+D LKA
Sbjct:  2928 PLVDKDALKA 2937

 Score = 159 (61.0 bits), Expect = 0., Sum P(11) = 0., Group = 1
 Identities = 29/35 (82%), Positives = 32/35 (91%), Frame = -1

Query: 11037 PANIKAFIDRVVNIPLHDIAIPLSGFCWEFNKVNF 10933
             PA +KAFIDRV++IPLHDIAIPLSGF WEFNK NF
Sbjct:   274 PAKVKAFIDRVISIPLHDIAIPLSGFRWEFNKGNF 308

 Score = 125 (49.1 bits), Expect = 0., Sum P(11) = 0., Group = 1
 Identities = 27/41 (65%), Positives = 30/41 (73%), Frame = -3

Query: 14599 GFRSEXXXXXXXXXXHRASFPLRLQQILAGSRAVSPAIKIE 14477
             G RSE          HRASFPLRLQQIL+GSRAVSP+IK+E
Sbjct:   231 GSRSEMAAAAAMAA-HRASFPLRLQQILSGSRAVSPSIKVE 270


Parameters:
  B=1000000
  V=1000000
  W=5
  S2=65
  X=10
  cpus=8
  filter=seg
  hspsepqmax=10000
  hspmax=0
  topcomboN=1

  ctxfactor=5.77
  E=10

  Query                        -----  As Used  -----    -----  Computed  ----
  Frame  MatID Matrix name     Lambda    K       H      Lambda    K       H
   +3      0   BLOSUM62        0.318   0.134   0.401    0.350   0.152   0.535  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +2      0   BLOSUM62        0.318   0.134   0.401    0.346   0.150   0.530  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   +1      0   BLOSUM62        0.318   0.134   0.401    0.345   0.148   0.496  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -1      0   BLOSUM62        0.318   0.134   0.401    0.351   0.155   0.580  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -2      0   BLOSUM62        0.318   0.134   0.401    0.337   0.145   0.465  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a
   -3      0   BLOSUM62        0.318   0.134   0.401    0.358   0.157   0.604  
               Q=9,R=2         0.244   0.0300  0.180     n/a     n/a     n/a

  Query
  Frame  MatID  Length  Eff.Length     E    S W   T  X   E2     S2
   +3      0     4866      4532       8.3  48 5 n/a 10  0.0044  48
                                                    44  0.21    40
   +2      0     4866      4399       8.0  48 5 n/a 10  0.0044  48
                                                    44  0.21    40
   +1      0     4867      4373       8.0  48 5 n/a 10  0.0044  48
                                                    44  0.21    40
   -1      0     4867      4371       8.0  48 5 n/a 10  0.0044  48
                                                    44  0.21    40
   -2      0     4866      4254       9.9  47 5 n/a 10  0.0061  47
                                                    44  0.21    40
   -3      0     4866      4240       9.9  47 5 n/a 10  0.0061  47
                                                    44  0.21    40


Statistics:

  Database:  UNIPROT_B9GCX0
   Title:  UNIPROT_B9GCX0
   Posted:  3:30:56 PM CET Oct 29, 2009
   Created:  3:30:56 PM CET Oct 29, 2009
   Format:  XDF-1
   # of letters in database:  3829
   # of sequences in database:  1
   # of database sequences satisfying E:  1
  No. of states in DFA:  32,067 (6765 KB)
  Total size of DFA:  7905 KB (8709 KB)
  Time to generate neighborhood:  0.03u 0.01s 0.04t   Elapsed:  00:00:00
  No. of threads or processors used:  1
  Search cpu time:  0.00u 0.01s 0.01t   Elapsed:  00:00:00
  Total cpu time:  0.03u 0.03s 0.06t   Elapsed:  00:00:00
  Start:  Fri Oct 30 09:12:00 2009   End:  Fri Oct 30 09:12:00 2009
_______________________________________________
Bioperl-l mailing list
Bioperl-l@...
http://lists.open-bio.org/mailman/listinfo/bioperl-l

Re: Bio::Search::Tiling::MapTiling and BlastX report.

by Mark A. Jensen

Hi Fred-
I've been looking hard at this; my guess is it's a bug in relating
strand/frame contexts to the 'all' AA-only context.
Would you be kind enough to submit a bug report to
http://bugzilla.bioperl.org
and attach your script and the blast report?
I will work on this ASAP.
Thanks
Mark
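
The subject range Fred expected for the 'all' context can be checked by hand
against the report. What follows is a minimal sketch, not BioPerl code: it
assumes the subject coordinates are transcribed correctly from the "Sbjct:"
lines of the report above, and that the range of the 'all' context is simply
the smallest start and largest end over every HSP. The variable names are
illustrative only.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Subject (protein) coordinates of the 11 HSPs, copied by hand from the
# "Sbjct:" lines of the WU-BLASTX report in this thread: [start, end].
my @sbjct = (
    [  435, 2479 ], [ 3194, 3563 ], [ 2483, 2847 ], [ 2945, 3016 ],
    [  303,  372 ], [ 3101, 3193 ], [ 3016, 3102 ], [  363,  433 ],
    [ 2868, 2937 ], [  231,  270 ], [  274,  308 ],
);

# Assumption: the subject is a protein, so every HSP falls in the single
# 'all' context, whose range is the min start and max end over all HSPs.
my $sbjct_min = min( map { $_->[0] } @sbjct );
my $sbjct_max = max( map { $_->[1] } @sbjct );

print "SUBJECT range in *all* => $sbjct_min, $sbjct_max\n";
# prints: SUBJECT range in *all* => 231, 3563
```

Under these assumptions the result agrees with the 231 to 3563 that Fred
expected, rather than the 435, 270 printed by his script.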
----- Original Message -----
From: <Frederic.SAPET@...>
To: <bioperl-l@...>
Sent: Friday, October 30, 2009 6:10 AM
Subject: [Bioperl-l] Bio::Search::Tiling::MapTiling and BlastX report.



Re: Bio::Search::Tiling::MapTiling and BlastX report.

by Mark A. Jensen

Patch committed (r16305); details under Bug#2942. Will add new test case.

Users be advised that this was a range problem only; lengths have been correctly
calculated.
thanks Fred--
MAJ
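
The statement that only ranges were affected, not lengths, can be checked
against the report itself. The following is a rough sketch in plain Perl, not
the patched module. It assumes the query coordinates and frames are transcribed
correctly from the report above, and that BLAST frames -1, -2 and -3 correspond
to the contexts m0, m1 and m2 (an inference from the reported lengths, not
taken from the module source). The helper context_name and the %summary hash
are hypothetical names used only for illustration.

```perl
use strict;
use warnings;
use List::Util qw(min max sum);

# Query coordinates of the 11 HSPs, copied by hand from the report:
# [BLAST frame, query start, query end]; start > end on the minus strand.
my @hsps = (
    [ -2, 10205,  4065 ], [ -1,  1119,     7 ], [ -1,  3882,  2803 ],
    [ -2,  2351,  2136 ], [ -2, 10865, 10656 ], [ -1,  1698,  1420 ],
    [ -2,  2030,  1770 ], [ -2, 10571, 10362 ], [ -3,  2665,  2462 ],
    [ -3, 14599, 14477 ], [ -1, 11037, 10933 ],
);

# Assumed mapping from BLAST frame to context name:
# -1 => m0, -2 => m1, -3 => m2 (and +1 => p0, and so on).
sub context_name {
    my ($frame) = @_;
    return ( $frame < 0 ? 'm' : 'p' ) . ( abs($frame) - 1 );
}

# Group the HSP spans by context, normalised so that low <= high.
my %by_context;
for my $hsp (@hsps) {
    my ( $frame, $start, $end ) = @$hsp;
    push @{ $by_context{ context_name($frame) } },
        [ min( $start, $end ), max( $start, $end ) ];
}

my %summary;
for my $ctx ( sort keys %by_context ) {
    my @spans = @{ $by_context{$ctx} };
    $summary{$ctx} = {
        min => min( map { $_->[0] } @spans ),
        max => max( map { $_->[1] } @spans ),
        # Plain sum of span lengths; adequate here only because the query
        # spans within each context do not overlap.
        length => sum( map { $_->[1] - $_->[0] + 1 } @spans ),
    };
    printf "QUERY range in *%s* => %d, %d (length=%d)\n",
        $ctx, @{ $summary{$ctx} }{qw(min max length)};
}
# prints:
# QUERY range in *m0* => 7, 11037 (length=2577)
# QUERY range in *m1* => 1770, 10865 (length=7038)
# QUERY range in *m2* => 2462, 14599 (length=327)
```

Under these assumptions the three lengths (2577, 7038, 327) match the values
Fred's script printed, and the m0 and m2 ranges match as well. Only the m1
range differs: the hand calculation gives 1770 to 10865, which is neither the
4065, 10571 printed by the script nor the 1170 to 10571 suggested in the
original question.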