UniProtKB/Swiss-Prot entry P33478


Entry information
Entry name POLG_DEN1S
Primary accession number P33478
Secondary accession numbers None
Integrated into Swiss-Prot on February 1, 1994
Sequence was last modified on February 1, 1994 (Sequence version 1)
Annotations were last modified on September 2, 2008 (Entry version 82)
Name and origin of the protein
Protein name Genome polyprotein
Synonyms None
Contains Protein C
     (Core protein)
     (Capsid protein)
prM
Peptide pr
Small envelope protein M
     (Matrix protein)
Envelope protein E
Non-structural protein 1
     (NS1)
Non-structural protein 2A
     (NS2A)
Non-structural protein 2A-alpha
     (NS2A-alpha)
Serine protease subunit NS2B
     (Non-structural protein 2B)
Serine protease subunit NS3
     (EC 3.4.21.91)
     (Non-structural protein 3)
Non-structural protein 4A
     (NS4A)
Peptide 2k
Non-structural protein 4B
     (NS4B)
RNA-directed RNA polymerase NS5
     (EC 2.7.7.48)
     (EC 2.1.1.56)
     (Non-structural protein 5)
Gene name None
From
Dengue virus type 1 (strain Singapore/S275/1990) (DENV-1) [TaxID: 33741] 
Taxonomy Viruses; ssRNA positive-strand viruses, no DNA stage; Flaviviridae; Flavivirus; Dengue virus group.
Virus hosts Aedes aegypti (Yellowfever mosquito) [TaxID: 7159]
Homo sapiens (Human) [TaxID: 9606]
Protein existence 3: Inferred from homology;
References
[1]
NUCLEOTIDE SEQUENCE [GENOMIC RNA].
DOI=10.1016/0042-6822(92)90560-C; PubMed=1585663
Fu J., Tan B.H., Yap E.H., Chan Y.C., Tan Y.H.;
"Full-length cDNA sequence of dengue type 1 virus (Singapore strain S275/90).";
Virology 188:953-958(1992).
Comments
  • FUNCTION: Protein C packages viral RNA to form a viral nucleocapsid, and promotes virion budding (By similarity).
  • FUNCTION: prM acts as a chaperone for envelope protein E during intracellular virion assembly by masking and inactivating envelope protein E fusion peptide. prM is matured in the last step of virion assembly, presumably to avoid catastrophic activation of the viral fusion peptide induced by the acidic pH of the trans-Golgi network. After cleavage by host furin, the pr peptide is released in the extracellular medium and small envelope protein M and envelope protein E homodimers are dissociated (By similarity).
  • FUNCTION: Envelope protein E binds cell surface receptor and is involved in membrane fusion between virion and target cell. Synthesized as a heterodimer with prM, which acts as a chaperone for envelope protein E. After cleavage of prM, envelope protein E dissociates from small envelope protein M and homodimerizes (By similarity).
  • FUNCTION: Non-structural protein 1 is slowly secreted from mammalian cells, but not from mosquito cells. Secreted form elicits protective immune response and plays an essential role in RNA replication. Soluble and membrane-associated NS1 may activate human complement and induce host vascular leakage. This effect might explain the clinical manifestations of dengue hemorrhagic fever and dengue shock syndrome (By similarity).
  • FUNCTION: Non-structural protein 2B is a required cofactor for the serine protease function of NS3 (By similarity).
  • FUNCTION: Serine protease NS3 displays three enzymatic activities: serine protease, NTPase and RNA helicase. NS3 serine protease, in association with NS2B, cleaves the polyprotein at dibasic sites in the cytoplasm: C-prM, NS2A-NS2B, NS2B-NS3, NS3-NS4A, NS4A-2K and NS4B-NS5. NS3 RNA helicase binds RNA and unwinds dsRNA in the 3' to 5' direction (By similarity).
  • FUNCTION: Non-structural protein 4A plays a role in RNA replication. Enhances inhibition of cell antiviral response by non-structural protein 4B (By similarity).
  • FUNCTION: Non-structural protein 4B prevents the establishment of the cellular antiviral state by blocking the interferon-alpha/beta (IFN-alpha/beta) and IFN-gamma signaling pathways (By similarity).
  • FUNCTION: RNA-directed RNA polymerase NS5 replicates the viral (+) and (-) genome, and ensures the capping of genomes in the cytoplasm. May be involved in methylation of the 5' RNA cap structure (By similarity).
  • CATALYTIC ACTIVITY: Selective hydrolysis of -Xaa-Xaa-|-Yaa- bonds in which each of the Xaa can be either Arg or Lys and Yaa can be either Ser or Ala.
  • CATALYTIC ACTIVITY: Nucleoside triphosphate + RNA(n) = diphosphate + RNA(n+1).
  • CATALYTIC ACTIVITY: S-adenosyl-L-methionine + G(5')pppR-RNA = S-adenosyl-L-homocysteine + m7G(5')pppR-RNA.
  • SUBUNIT: prM and envelope protein E form heterodimers in the endoplasmic reticulum and Golgi. Envelope protein E forms homodimers. NS1 forms homodimers as well as homohexamers when secreted. NS1 may interact with NS4A. NS3 and NS2B form a heterodimer. NS3 interacts with unphosphorylated NS5 (By similarity).
  • SUBCELLULAR LOCATION: Note=The virion is assembled in the endoplasmic reticulum lumen, transported by vesicles to the Golgi, then transported again to the cell membrane where it is released outside the cell.
  • SUBCELLULAR LOCATION: Protein C: Virion (By similarity).
  • SUBCELLULAR LOCATION: Peptide pr: Secreted (By similarity).
  • SUBCELLULAR LOCATION: Small envelope protein M: Virion membrane; Single-pass type I membrane protein (By similarity).
  • SUBCELLULAR LOCATION: Envelope protein E: Virion membrane; Single-pass type I membrane protein (By similarity).
  • SUBCELLULAR LOCATION: Non-structural protein 1: Secreted. Endoplasmic reticulum membrane; Peripheral membrane protein; Lumenal side (By similarity).
  • SUBCELLULAR LOCATION: Non-structural protein 2A-alpha: Endoplasmic reticulum membrane (By similarity).
  • SUBCELLULAR LOCATION: Non-structural protein 2A: Endoplasmic reticulum membrane (By similarity).
  • SUBCELLULAR LOCATION: Serine protease subunit NS2B: Endoplasmic reticulum membrane; Peripheral membrane protein; Cytoplasmic side (By similarity).
  • SUBCELLULAR LOCATION: Serine protease subunit NS3: Endoplasmic reticulum membrane; Peripheral membrane protein; Cytoplasmic side (By similarity).
  • SUBCELLULAR LOCATION: Non-structural protein 4A: Endoplasmic reticulum membrane; Peripheral membrane protein; Cytoplasmic side (By similarity).
  • SUBCELLULAR LOCATION: Non-structural protein 4B: Endoplasmic reticulum membrane; Multi-pass membrane protein (By similarity). Note=The C-terminal transmembrane domain of non-structural protein 4B is presumably reoriented after cleavage on the lumenal side (By similarity).
  • SUBCELLULAR LOCATION: RNA-directed RNA polymerase NS5: Endoplasmic reticulum membrane; Peripheral membrane protein; Cytoplasmic side. Nucleus (By similarity).
  • DOMAIN: Transmembrane domains of the small envelope protein M and envelope protein E contain endoplasmic reticulum retention signals (By similarity).
  • PTM: Specific enzymatic cleavages in vivo yield mature proteins. The nascent protein C contains a C-terminal hydrophobic domain that acts as a signal sequence for translocation of prM into the lumen of the ER. Mature protein C is cleaved at a site upstream of this hydrophobic domain by NS3. prM is cleaved in post-Golgi vesicles by a host furin, releasing the mature small envelope protein M and peptide pr. Non-structural protein 2A-alpha, a C-terminally truncated form of non-structural protein 2A, results from partial cleavage by NS3 (By similarity).
  • PTM: RNA-directed RNA polymerase NS5 is phosphorylated on serine residues. This phosphorylation may trigger NS5 nuclear localization (By similarity).
  • PTM: Envelope protein E and non-structural protein 1 are N-glycosylated (By similarity).
  • SIMILARITY: Contains 1 helicase ATP-binding domain.
  • SIMILARITY: Contains 1 helicase C-terminal domain.
  • SIMILARITY: Contains 1 peptidase S7 domain.
  • SIMILARITY: Contains 1 RdRp catalytic domain.
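As a rough illustration of the dibasic cleavage rule given under CATALYTIC ACTIVITY above (two residues that are each Arg or Lys, followed by Ser or Ala), the Python sketch below scans the first 120 residues of this entry's sequence for that motif. The name `dibasic_sites` is hypothetical and not part of any UniProt tool. The motif alone is not a site predictor: on this fragment it recovers the annotated NS3 site at 100/101, but it also reports position 18/19, which is not annotated as a cleavage site in this entry.

```python
import re

# First 120 residues of P33478, copied from the sequence section of this entry.
N_TERM = (
    "MNNQRKKTARPSFNMLKRARNRVSTGSQLAKRFSKGLLSGQGPMKLVMAFIAFLRFLAIP"
    "PTAGILARWGSFKKNGAIKVLRGFKKEISNMLNIMNRRKRSVTMLLMLLPTALAFHLTTR"
)

# Two basic residues (Arg or Lys) followed by Ser or Ala.
# A zero-width lookahead is used so that overlapping candidates are all reported.
DIBASIC = re.compile(r"(?=[RK][RK][SA])")


def dibasic_sites(seq):
    """Return 1-based position pairs flanking each candidate scissile bond."""
    # m.start() is the 0-based index of the first basic residue, so the bond
    # lies between 1-based positions start + 2 and start + 3.
    return [(m.start() + 2, m.start() + 3) for m in DIBASIC.finditer(seq)]


print(dibasic_sites(N_TERM))  # [(18, 19), (100, 101)]
```

The extra hit at 18/19 shows why the annotated sites are recorded explicitly rather than derived from the motif.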
Copyright
Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms. Distributed under the Creative Commons Attribution-NoDerivs License.
Cross-references
Sequence databases
EMBL M87512; -; NOT_ANNOTATED_CDS; Genomic_RNA.
PIR A42551; A42551.
3D structure databases
HSSP Q88653; 1L9K.
SMR P33478; 21-100, 281-673, 1651-2093, 2499-2759.
ModBase P33478.
Protein family/group databases
MEROPS S07.001; -.
Ontologies
GO GO:0005789; Cellular component: endoplasmic reticulum membrane (inferred from electronic annotation from UniProtKB-SubCell).
Family and domain databases
InterPro IPR014001; DEAD-like_N.
IPR011492; DEAD_Flavivir.
IPR001650; DNA/RNA_helicase_C.
IPR002464; DNA/RNA_helicase_DEAH_CS.
IPR011999; Flav_glyE_cen_dm.
IPR013754; Flav_glyE_dim.
IPR001122; Flavi_capsidC.
IPR000069; Flavi_M.
IPR001157; Flavi_NS1.
IPR000752; Flavi_NS2A.
IPR000487; Flavi_NS2B.
IPR000404; Flavi_NS4A.
IPR001528; Flavi_NS4B.
IPR002535; Flavi_propep.
IPR000336; Flv_glyE_Ig-like.
IPR014412; Gen_Poly_FLV.
IPR014021; Helicase_SF1/SF2_ATP-bd.
IPR001850; Peptidase_S7.
IPR000208; RNA_pol_flaviviral.
IPR007094; RNA_pol_PSvir.
IPR002877; RrmJFtsJ_MeTrfase.
Gene3D G3DSA:2.60.98.10; Flav_glyE_dim; 1.
G3DSA:2.60.40.350; Flv_glyE_Ig-like; 1.
Pfam PF01003; Flavi_capsid; 1.
PF07652; Flavi_DEAD; 1.
PF02832; Flavi_glycop_C; 1.
PF00869; Flavi_glycoprot; 1.
PF01004; Flavi_M; 1.
PF00948; Flavi_NS1; 1.
PF01005; Flavi_NS2A; 1.
PF01002; Flavi_NS2B; 1.
PF01350; Flavi_NS4A; 1.
PF01349; Flavi_NS4B; 1.
PF00972; Flavi_NS5; 1.
PF01570; Flavi_propep; 1.
PF01728; FtsJ; 1.
PF00271; Helicase_C; 1.
PF00949; Peptidase_S7; 1.
PIRSF PIRSF003817; Gen_Poly_FLV; 1.
ProDom PD001496; Flavi_NS1; 1.
SMART SM00487; DEXDc; 1.
SM00490; HELICc; 1.
PROSITE PS00690; DEAH_ATP_HELICASE; FALSE_NEG.
PS51192; HELICASE_ATP_BIND_1; 1.
PS51194; HELICASE_CTER; 1.
PS50507; RDRP_SSRNA_POS; 1.
BLOCKS P33478.
Other
ProtoNet P33478.
UniRef View cluster of proteins with at least 50% / 90% / 100% identity.
Keywords
ATP-binding; Capsid protein; Cleavage on pair of basic residues; Complete proteome; Endoplasmic reticulum; Envelope protein; Glycoprotein; Helicase; Hydrolase; Membrane; Metal-binding; Multifunctional enzyme; Nucleotide-binding; Nucleotidyltransferase; Nucleus; Phosphoprotein; Protease; Ribonucleoprotein; RNA replication; RNA-binding; RNA-directed RNA polymerase; Secreted; Serine protease; Transcription; Transcription regulation; Transferase; Transmembrane; Viral nucleoprotein; Virion.
Features
Key   From    To  Length     Description FTId
CHAIN   1    100  100     Protein C. PRO_0000037894
PROPEP   101    114  14     ER anchor for the protein C, removed in mature form by serine protease NS3. PRO_0000037895
CHAIN   115    280  166     prM. PRO_0000264654
CHAIN   115    205  91     Peptide pr. PRO_0000264655
CHAIN   206    280  75     Small envelope protein M. PRO_0000037896
CHAIN   281    774  494     Envelope protein E. PRO_0000037897
CHAIN   775   1126  352     Non-structural protein 1. PRO_0000037898
CHAIN   1127   1344  218     Non-structural protein 2A. PRO_0000037899
CHAIN   1127   1315  189     Non-structural protein 2A-alpha. PRO_0000264656
CHAIN   1345   1474  130     Serine protease subunit NS2B. PRO_0000037900
CHAIN   1475   2093  619     Serine protease subunit NS3. PRO_0000037901
CHAIN   2094   2220  127     Non-structural protein 4A. PRO_0000037902
PEPTIDE   2221   2243  23     Peptide 2k. PRO_0000264657
CHAIN   2244   2492  249     Non-structural protein 4B. PRO_0000037903
CHAIN   2493   3396  904     RNA-directed RNA polymerase NS5. PRO_0000037904
TOPO_DOM   1    101  101     Cytoplasmic (Potential). 
TRANSMEM   102    122  21     Potential. 
TOPO_DOM   123    238  116     Extracellular (Potential). 
TRANSMEM   239    259  21     Potential. 
TOPO_DOM   260    265  6     Cytoplasmic (Potential). 
TRANSMEM   266    286  21     Potential. 
TOPO_DOM   287    724  438     Extracellular (Potential). 
TRANSMEM   725    745  21     Potential. 
TOPO_DOM   746    751  6     Cytoplasmic (Potential). 
TRANSMEM   752    772  21     Potential. 
TOPO_DOM   773   1155  383     Extracellular (Potential). 
TRANSMEM   1156   1176  21     Potential. 
TOPO_DOM   1177   1446  270     Cytoplasmic (Potential). 
TRANSMEM   1447   1467  21     Potential. 
TOPO_DOM   1468   2192  725     Lumenal (Potential). 
TRANSMEM   2193   2213  21     Potential. 
TOPO_DOM   2214   2220  7     Cytoplasmic (Potential). 
TRANSMEM   2221   2240  20     Potential. 
TOPO_DOM   2241   2348  108     Lumenal (Potential). 
TRANSMEM   2349   2369  21     Potential. 
TOPO_DOM   2370   2414  45     Cytoplasmic (Potential). 
TRANSMEM   2415   2435  21     Potential. 
TOPO_DOM   2436   2460  25     Lumenal (Potential). 
TRANSMEM   2461   2481  21     Potential. 
TOPO_DOM   2482   3391  910     Cytoplasmic (Potential). 
DOMAIN   1655   1811  157     Helicase ATP-binding. 
DOMAIN   1821   1988  168     Helicase C-terminal. 
DOMAIN   3019   3168  150     RdRp catalytic. 
NP_BIND   1668   1675  8     ATP (Potential). 
MOTIF   1759   1762  4     DEAH box (By similarity). 
ACT_SITE   1525   1525        Charge relay system; for serine protease NS3 activity (By similarity). 
ACT_SITE   1549   1549        Charge relay system; for serine protease NS3 activity (By similarity). 
ACT_SITE   1609   1609        Charge relay system; for serine protease NS3 activity (By similarity). 
SITE   100    101  2     Cleavage; by serine protease NS3 (By similarity). 
SITE   114    115  2     Cleavage; by host signal peptidase (By similarity). 
SITE   205    206  2     Cleavage; by host furin (By similarity). 
SITE   280    281  2     Cleavage; by host signal peptidase (By similarity). 
SITE   774    775  2     Cleavage; by host signal peptidase (By similarity). 
SITE   1126   1127  2     Cleavage; by host (By similarity). 
SITE   1314   1315  2     Cleavage; by serine protease NS3 (By similarity). 
SITE   1474   1475  2     Cleavage; by serine protease NS3 (By similarity). 
SITE   2220   2221  2     Cleavage; by host signal peptidase (By similarity). 
SITE   2243   2244  2     Cleavage; by serine protease NS3 (By similarity). 
SITE   2492   2493  2     Cleavage; by serine protease NS3 (By similarity). 
CARBOHYD   183    183        N-linked (GlcNAc...) (Potential). 
CARBOHYD   347    347        N-linked (GlcNAc...) (Potential). 
CARBOHYD   433    433        N-linked (GlcNAc...) (Potential). 
CARBOHYD   981    981        N-linked (GlcNAc...) (Potential). 
CARBOHYD   2302   2302        N-linked (GlcNAc...) (Potential). 
CARBOHYD   2306   2306        N-linked (GlcNAc...) (Potential). 
CARBOHYD   2458   2458        N-linked (GlcNAc...) (Potential). 
DISULFID   283    310        By similarity. 
DISULFID   340    401        By similarity. 
DISULFID   354    385        By similarity. 
DISULFID   372    396        By similarity. 
DISULFID   465    565        By similarity. 
DISULFID   582    613        By similarity. 
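In the feature table above, the Length column equals To - From + 1, and the top-level cleavage products should cover the precursor without gaps or overlaps. The Python sketch below checks both, using only coordinates copied from the table. The names `PRODUCTS`, `feature_length` and `check_tiling` are illustrative, and the short product labels are abbreviations chosen for this example. Nested products (Peptide pr, Small envelope protein M, Non-structural protein 2A-alpha) are left out because they lie inside prM and Non-structural protein 2A.

```python
# (label, from, to) for the non-overlapping products, copied from the feature table.
PRODUCTS = [
    ("Protein C", 1, 100),
    ("ER anchor propeptide", 101, 114),
    ("prM", 115, 280),
    ("Envelope protein E", 281, 774),
    ("NS1", 775, 1126),
    ("NS2A", 1127, 1344),
    ("NS2B", 1345, 1474),
    ("NS3", 1475, 2093),
    ("NS4A", 2094, 2220),
    ("Peptide 2k", 2221, 2243),
    ("NS4B", 2244, 2492),
    ("NS5", 2493, 3396),
]
PRECURSOR_LENGTH = 3396  # from the "Sequence information" section


def feature_length(start, end):
    """Length of a feature given inclusive 1-based coordinates."""
    return end - start + 1


def check_tiling(products, total_length):
    """True if the products cover 1..total_length with no gap and no overlap."""
    expected_start = 1
    for _label, start, end in products:
        if start != expected_start:
            return False
        expected_start = end + 1
    return expected_start == total_length + 1


lengths = {label: feature_length(s, e) for label, s, e in PRODUCTS}
print(lengths["NS3"])                            # 619
print(sum(lengths.values()))                     # 3396
print(check_tiling(PRODUCTS, PRECURSOR_LENGTH))  # True
```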
Sequence information
Length: 3396 AA [This is the length of the unprocessed precursor]
Molecular weight: 379564 Da [This is the MW of the unprocessed precursor]
CRC64: C53E75F3E424367D [This is a checksum on the sequence]
        10         20         30         40         50         60 
MNNQRKKTAR PSFNMLKRAR NRVSTGSQLA KRFSKGLLSG QGPMKLVMAF IAFLRFLAIP 

        70         80         90        100        110        120 
PTAGILARWG SFKKNGAIKV LRGFKKEISN MLNIMNRRKR SVTMLLMLLP TALAFHLTTR 

       130        140        150        160        170        180 
GGEPHMIVSK QEREKSLLFK TSVGVNMCTL IAMDLGELCE DTMTYKCPRI TEAEPDDVDC 

       190        200        210        220        230        240 
WCNATDTWVT YGTCSQTGEH RRDKRSVALA PHVGLGLETR TETWMSSEGA WKQIQRVETW 

       250        260        270        280        290        300 
ALRHPGFTVI ALFLAHAIGT SITQKGIIFI LLMLVTPSMA MRCVGIGSRD FVEGLSGATW 

       310        320        330        340        350        360 
VDVVLEHGSC VTTMAKDKPT LDIELLKTEV TNPAVLRKLC IEAKISNTTT DSRCPTQGEA 

       370        380        390        400        410        420 
TLVEEQDANF VCRRTFVDRG WGNGCGLFGK GSLLTCAKFK CVTKLEGKIV QYENLKYSVI 

       430        440        450        460        470        480 
VTVHTGDQHQ VGNETTEHGT IATITPQAPT SEIQLTDYGA LTLDCSPRTG LDFNEMVLLT 

       490        500        510        520        530        540 
MKEKSWLVHK QWFLDLPLPW TSGASTSQET WNRQDLLVTF KTAHAKKQEV VVLGSQEGAM 

       550        560        570        580        590        600 
HTALTGATEI QTSGTTTIFA GHLKCRLKMD KLTLKGMSYV MCTGSFKLEK EVAETQHGTV 

       610        620        630        640        650        660 
LVQVKYEGTD APCKIPFSTQ DEKGVTQNRL ITANPIVTDK EKPVNIETEP PFGESYIVVG 

       670        680        690        700        710        720 
AGEKALKQCW FKKGSSIGKM FEATARGARR MAILGDTAWD FGSIGGVFTS VGKLVHQVFG 

       730        740        750        760        770        780 
TAYGVLFSGV SWTMKIGIGI LLTWLGLNSR STSLSMTCIA VGMVTLYLGV MVQADSGCVI 

       790        800        810        820        830        840 
NWKGRELKCG SGIFVTNEVH TWTEQYKFQA DSPKRLSAAI GKAWEEGVCG IRSATRLENI 

       850        860        870        880        890        900 
MWKQISNELN HILLENDMKF TVVVGDVVGI LAQGKKMIRP QPMEHKYSWK SWGKAKIIGA 

       910        920        930        940        950        960 
DIQNTTFIID GPDTPECPDD QRAWNIWEVE DYGFGIFTTN IWLKLRDSYT QMCDHRLMSA 

       970        980        990       1000       1010       1020 
AIKDSKAVHA DMGYWIESEK NETWKLARAS FIEVKTCVWP KSHTLWSNGV LESEMIIPKI 

      1030       1040       1050       1060       1070       1080 
YGGPISQHNY RPGYFTQTAG PWHLGKLELD FDLCEGTTVV VDEHCGNRGP SLRTTTVTGK 

      1090       1100       1110       1120       1130       1140 
IIHEWCCRSC TLPPLRFKGE DGCWYGMEIR PVKEKEENLV KSMVSAGSGE VDSFSLGLLC 

      1150       1160       1170       1180       1190       1200 
ISIMIEEVMR SRWSRKMLMT GTLAVFLLLI MGQLTWNDLI RLCIMVGANA SDRMGMGTTY 

      1210       1220       1230       1240       1250       1260 
LALMATFKMR PMFAVGLLFR RLTSREVLLL TIGLSLVASV ELPNSLEELG DGLAMGIMIL 

      1270       1280       1290       1300       1310       1320 
KLLTDFQSHQ LWATLLSLTF VKTTFSLHYA WKTMAMVLSI VSLFPLCLST TSQKTTWLPV 

      1330       1340       1350       1360       1370       1380 
LLGSLGCKPL TMFLIAENKI WGRKSWPLNE GIMAVGIVSI LLSSLLKNDV PLAGPLIAGG 

      1390       1400       1410       1420       1430       1440 
MLIACYVISG SSADLSLEKA AEVSWEEEAE HSGASHNILV EVQDDGTMKI KDEERDDTLT 

      1450       1460       1470       1480       1490       1500 
ILLKATLLAV SGVYPLSIPA TLFVWYFWQK KKQRSGVLWD TPSPPEVERA VLDDGIYRIM 

      1510       1520       1530       1540       1550       1560 
QRGLLGRSQV GVGVFQDGVF HTMWHVTRGA VLMYQGKRLE PSWASVKKDL ISYGGGWRFQ 

      1570       1580       1590       1600       1610       1620 
GSWNTGEEVQ VIAVEPGKNP KNVQTAPGTF KTPEGEVGAI ALDFKPGTSG SPIVNREGKI 

      1630       1640       1650       1660       1670       1680 
VGLYGNGVVT TSGTYVSAIA QAKASQEGPL PEIEDEVFRK RNLTIMDLHP GSGKTRRYLP 

      1690       1700       1710       1720       1730       1740 
AIVREAIRRN VRTLILAPTR VVASEMAEAL KGMPIRYQTT AVKSEHTGKE IVDLMCHATF 

      1750       1760       1770       1780       1790       1800 
TMRLLSPVRV PNYNMIIMDE AHFTDPASIA RRGYISTRVG MGEAAAIFMT ATPPGSVEAF 

      1810       1820       1830       1840       1850       1860 
PQSNAVIQDE ERDIPERSWN SGYEWITDFP GKTVWFVPSI KSGNDIANCL RKNGKRVIQL 

      1870       1880       1890       1900       1910       1920 
SRKTFDTEYQ KTKNNDWDYV VTTDISEMGA NFRADRVIDP RRCLKPVILK DGPERVILAG 

      1930       1940       1950       1960       1970       1980 
PMPVTVASAA QRRGRIGRNQ NKEGDQYVYM GQPLNNDEDH AHWTEAKMLL DNINTPEGII 

      1990       2000       2010       2020       2030       2040 
PALFEPEREK SAAIDGEYRL RGEARKTFVE LMRRGDLPVW LSYKVASEGF QYSDRRWCFD 

      2050       2060       2070       2080       2090       2100 
GERNNQVLEE NMDVEMWTKE GERKKLRPRW LDARTYSDPL ALREFKEFAA GRRSVSGDLI 

      2110       2120       2130       2140       2150       2160 
LEIGKLPQHL TQRAQNALDN LVMLHNSEQG GRAYRHAMEE LPDTIETLML LALIAVLTGG 

      2170       2180       2190       2200       2210       2220 
VTLFFLSGKG LGKTSIGLLC VMASSVLLWM ASVEPHWIAA SIILEFFLMV LLIPEPDRQR 

      2230       2240       2250       2260       2270       2280 
TPQDNQLAYV VIGLLFMILT VAANEMGLLE TTKKDLGIGH VAAENHHHAT MLDVDLRPAS 

      2290       2300       2310       2320       2330       2340 
AWTLYAVATT VITPMMRHTI ENTTANISLT AIANQAAILM GLDKGWPISK MDIGVPLLAL 

      2350       2360       2370       2380       2390       2400 
GCYSQVNPLT LTAAVLMLVA HYAIIGPGLQ AKATREAQKR TAAGIMKNPT VDGIVAIDLD 

      2410       2420       2430       2440       2450       2460 
PVVYDAKFEK QLGQIMLLIL CTSQILLMRT TWALCESITL ATGPLTTLWE GSPGKFWNTT 

      2470       2480       2490       2500       2510       2520 
IAVSMANIFR GSYLAGAGLA FSLMKSLGGG RRGTGAKGKH WERNGKDRLN QLSKSEFNTY 

      2530       2540       2550       2560       2570       2580 
KRSGIMEVDR SEAKEGLKRG ETTKHAVSRG TAKLRWFVER NLVKPEGKVI DLGCGRGGWS 

      2590       2600       2610       2620       2630       2640 
YYCAGLKKVT EVKGYTKGGP GHEEPIPMAT YGWNLVKLYS GKDVFFTPPE KCDTLLCDIG 

      2650       2660       2670       2680       2690       2700 
ESSPNPTIEE GRTLRVLKMV EPWLRGNQFC IKILNPYMPS VVETLEQMQR KHGGMLVRNP 

      2710       2720       2730       2740       2750       2760 
LSRNSTHEMY WVSCGTGNIV SAVNMTSRML LNRFTMAHRK PTYERDVDLG AGTRHVAVEP 

      2770       2780       2790       2800       2810       2820 
EVANLDIIGQ RIENIKHEHK STWHYDEDNP YKTWAYHGSY EVKPSGSASS MVNGVVKLLT 

      2830       2840       2850       2860       2870       2880 
KPWDAIPMVT QIAMTDTTPF GQQRVFKEKV DTRTPKAKRG TAQIMEVTAR WLWGFLSRNK 

      2890       2900       2910       2920       2930       2940 
KPRICTREEF TRKVRSNAAI GAVFVDENQW NSAKEAVEDE RFWDLVHRER ELHKQGKCAT 

      2950       2960       2970       2980       2990       3000 
CVYNMMGKRE KKLGEFGKAK GSRAIWYMWL GARFLEFEAL GFMNEDHWFS RENSLSGVEG 

      3010       3020       3030       3040       3050       3060 
EGLHKLGYIL RDISKIPGGN MYADDTAGWD TRITEDDLQN EAKITDIMEP EHALLATSIF 

      307