UniProtKB/Swiss-Prot protein knowledgebase release 56.1 statistics
1. INTRODUCTION
Release 56.1 of 02-Sep-08 of UniProtKB/Swiss-Prot contains 397539 sequence entries,
comprising 143289088 amino acids abstracted from 172934 references.
4887 sequences have been added since release 56.0, the sequence data of
137 existing entries have been updated, and the annotations of
103040 entries have been revised.
Number of fragments: 8196
Number of additional sequences produced by alternative splicing, initiation or promoter usage, or ribosomal frameshifting: 26604
Protein existence (PE):              entries       %
1: Evidence at protein level            6042   15.2%
2: Evidence at transcript level         6347     16%
3: Inferred from homology               2587   65.1%
4: Predicted                            1351    3.4%
5: Uncertain                            1338    0.3%
The growth of the database is summarized below.
2. AMINO ACID COMPOSITION
2.1 Composition in percent for the complete database
Ala (A) 8.14 Gln (Q) 3.96 Leu (L) 9.67 Ser (S) 6.66
Arg (R) 5.50 Glu (E) 6.73 Lys (K) 5.88 Thr (T) 5.36
Asn (N) 4.06 Gly (G) 7.04 Met (M) 2.40 Trp (W) 1.09
Asp (D) 5.41 His (H) 2.28 Phe (F) 3.87 Tyr (Y) 2.93
Cys (C) 1.42 Ile (I) 5.92 Pro (P) 4.77 Val (V) 6.81
Asx (B) 0.000 Glx (Z) 0.000 Xaa (X) 0.00
Legend: gray = aliphatic, red = acidic, green = small hydroxy,
blue = basic, black = aromatic, white = amide, yellow = sulfur
2.2 Classification of the amino acids by their frequency
Leu, Ala, Gly, Val, Glu, Ser, Ile, Lys, Arg, Asp, Thr, Pro, Asn, Gln,
Phe, Tyr, Met, His, Cys, Trp
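The ordering above can be reproduced from the percentages in section 2.1. The following is a minimal sketch, not part of the release tooling; the variable names are illustrative assumptions, and the values are copied from the composition table (the ambiguity codes B, Z and X are left out).

```python
# Composition in percent, copied from the table in section 2.1.
composition = {
    "Ala": 8.14, "Gln": 3.96, "Leu": 9.67, "Ser": 6.66,
    "Arg": 5.50, "Glu": 6.73, "Lys": 5.88, "Thr": 5.36,
    "Asn": 4.06, "Gly": 7.04, "Met": 2.40, "Trp": 1.09,
    "Asp": 5.41, "His": 2.28, "Phe": 3.87, "Tyr": 2.93,
    "Cys": 1.42, "Ile": 5.92, "Pro": 4.77, "Val": 6.81,
}

# Sort the residue names by decreasing percentage.
ranking = sorted(composition, key=composition.get, reverse=True)
print(", ".join(ranking))
```

No two residues share the same percentage in this release, so the ordering is unambiguous.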
3. TAXONOMIC ORIGIN
Total number of species represented in this release of UniProtKB/Swiss-Prot: 11521
The twenty most represented species account for 99490 sequences, or 25% of
the total number of entries.
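As a worked check of this figure, the sketch below sums the twenty largest counts from the table in section 3.2 and divides by the total number of entries given in the introduction. The variable names are illustrative assumptions.

```python
# Entry counts of the twenty most represented species (section 3.2, ranks 1-20).
top_twenty = [
    20325, 15915, 7170, 6970, 6553, 5421, 4438, 4342, 3201, 2889,
    2864, 2839, 2836, 2226, 2140, 2065, 1955, 1786, 1782, 1773,
]
total_entries = 397539  # sequence entries in release 56.1

top_twenty_total = sum(top_twenty)
share_percent = 100 * top_twenty_total / total_entries
print(top_twenty_total, round(share_percent))
```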
3.1 Table of the frequency of occurrence of species
Species represented 1x: 5235
2x: 1705
3x: 839
4x: 543
5x: 421
6x: 315
7x: 241
8x: 197
9x: 171
10x: 102
11- 20x: 542
21- 50x: 358
51-100x: 139
>100x: 713
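The frequency classes above partition the set of species, so their counts should add up to the total given at the start of this section. A minimal sketch of that check follows; the dictionary keys are illustrative labels, and the share of species with a single entry is derived here rather than quoted from the release.

```python
# Number of species per frequency class, copied from the table in section 3.1.
species_by_frequency = {
    "1x": 5235, "2x": 1705, "3x": 839, "4x": 543, "5x": 421,
    "6x": 315, "7x": 241, "8x": 197, "9x": 171, "10x": 102,
    "11-20x": 542, "21-50x": 358, "51-100x": 139, ">100x": 713,
}

total_species = sum(species_by_frequency.values())
# Share of species that are represented by exactly one entry.
single_entry_share = 100 * species_by_frequency["1x"] / total_species
print(total_species, round(single_entry_share, 1))
```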
3.2 Table of the most represented species
------ --------- --------------------------------------------
Number Frequency Species
------ --------- --------------------------------------------
1 20325 Homo sapiens (Human)
2 15915 Mus musculus (Mouse)
3 7170 Rattus norvegicus (Rat)
4 6970 Arabidopsis thaliana (Mouse-ear cress)
5 6553 Saccharomyces cerevisiae (Baker's yeast)
6 5421 Bos taurus (Bovine)
7 4438 Schizosaccharomyces pombe (Fission yeast)
8 4342 Escherichia coli (strain K12)
9 3201 Caenorhabditis elegans
10 2889 Bacillus subtilis
11 2864 Dictyostelium discoideum (Slime mold)
12 2839 Xenopus laevis (African clawed frog)
13 2836 Drosophila melanogaster (Fruit fly)
14 2226 Danio rerio (Zebrafish) (Brachydanio rerio)
15 2140 Pongo abelii (Sumatran orangutan)
16 2065 Gallus gallus (Chicken)
17 1955 Escherichia coli O157:H7
18 1786 Oryza sativa subsp. japonica (Rice)
19 1782 Methanocaldococcus jannaschii (Methanococcus jannaschii)
20 1773 Haemophilus influenzae
21 1708 Salmonella typhimurium
22 1634 Escherichia coli O6
23 1631 Shigella flexneri
24 1446 Mycobacterium tuberculosis
25 1325 Sus scrofa (Pig)
26 1299 Salmonella typhi
27 1245 Pseudomonas aeruginosa
28 1207 Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)
29 1184 Mycobacterium bovis
30 1126 Macaca fascicularis (Crab eating macaque) (Cynomolgus monkey)
31 990 Synechocystis sp. (strain PCC 6803)
32 983 Archaeoglobus fulgidus
33 959 Yersinia pestis
34 916 Vibrio cholerae
35 909 Acanthamoeba polyphaga mimivirus (APMV)
36 893 Rhizobium meliloti (Sinorhizobium meliloti)
37 876 Staphylococcus aureus (strain Mu50 / ATCC 700699)
38 875 Salmonella paratyphi A
39 875 Staphylococcus aureus (strain N315)
40 875 Oryctolagus cuniculus (Rabbit)
41 847 Staphylococcus aureus (strain COL)
42 846 Staphylococcus aureus (strain MW2)
43 842 Staphylococcus aureus (strain MSSA476)
44 840 Staphylococcus aureus (strain MRSA252)
45 823 Salmonella choleraesuis
46 821 Shigella sonnei (strain Ss046)
47 818 Escherichia coli O6:K15:H31 (strain 536 / UPEC)
48 815 Yersinia pseudotuberculosis
49 776 Shigella boydii serotype 4 (strain Sb227)
50 769 Vibrio parahaemolyticus
51 765 Ashbya gossypii (Yeast) (Eremothecium gossypii)
52 760 Aquifex aeolicus
53 759 Shigella dysenteriae serotype 1 (strain Sd197)
54 759 Escherichia coli O9:H4 (strain HS)
55 758 Pasteurella multocida
56 756 Escherichia coli (strain UTI89 / UPEC)
57 754 Escherichia coli O139:H28 (strain E24377A / ETEC)
58 749 Canis familiaris (Dog)
59 738 Kluyveromyces lactis (Yeast) (Candida sphaerica)
60 731 Erwinia carotovora subsp. atroseptica (Pectobacterium atrosepticum)
61 728 Candida albicans (Yeast)
62 723 Escherichia coli (strain ATCC 8739 / DSM 1576 / Crooks)
63 719 Neurospora crassa
64 712 Staphylococcus epidermidis (strain ATCC 35984 / RP62A)
65 711 Streptomyces coelicolor
66 711 Staphylococcus epidermidis (strain ATCC 12228)
67 709 Vibrio vulnificus
68 699 Photorhabdus luminescens subsp. laumondii
69 696 Candida glabrata (Yeast) (Torulopsis glabrata)
70 696 Bacillus halodurans
71 693 Shigella flexneri serotype 5b (strain 8401)
72 692 Vibrio vulnificus (strain YJ016)
73 687 Mycoplasma pneumoniae
74 671 Pan troglodytes (Chimpanzee)
75 671 Bacillus anthracis
76 663 Yersinia pestis bv. Antiqua (strain Nepal516)
77 661 Yersinia enterocolitica serotype O:8 / biotype 1B (strain 8081)
78 657 Yersinia pestis bv. Antiqua (strain Antiqua)
79 655 Anabaena sp. (strain PCC 7120)
80 648 Mycobacterium leprae
81 645 Yersinia pseudotuberculosis serotype O:1b (strain IP 31758)
82 644 Pseudomonas syringae pv. tomato
83 642 Pseudomonas putida (strain KT2440)
84 641 Escherichia coli O1:K1 / APEC
85 641 Staphylococcus aureus (strain NCTC 8325)
86 631 Salmonella paratyphi B (strain ATCC BAA-1250 / SPB7)
87 627 Enterobacter sp. (strain 638)
88 621 Escherichia coli
89 621 Bradyrhizobium japonicum
90 614 Treponema pallidum
91 613 Klebsiella pneumoniae subsp. pneumoniae (strain ATCC 700721 / MGH 78578)
92 609 Zea mays (Maize)
93 605 Citrobacter koseri (strain ATCC BAA-895 / CDC 4225-83 / SGSC4696)
94 605 Yersinia pestis (strain Pestoides F)
95 598 Bacillus cereus (strain ATCC 14579 / DSM 31)
96 596 Agrobacterium tumefaciens (strain C58 / ATCC 33970)
97 595 Methanobacterium thermoautotrophicum
98 594 Staphylococcus aureus (strain USA300)
99 589 Ralstonia solanacearum (Pseudomonas solanacearum)
100 586 Shewanella oneidensis
101 585 Serratia proteamaculans (strain 568)
102 583 Rhizobium loti (Mesorhizobium loti)
103 582 Rickettsia prowazekii
104 579 Staphylococcus aureus (strain bovine RF122 / ET3-1)
105 579 Helicobacter pylori (Campylobacter pylori)
106 574 Listeria monocytogenes
107 572 Buchnera aphidicola subsp. Acyrthosiphon pisum
108 569 Lactococcus lactis subsp. lactis (Streptococcus lactis)
109 567 Salmonella arizonae (strain ATCC BAA-731 / CDC346-86 / RSK2980)
110 566 Photobacterium profundum (Photobacterium sp. (strain SS9))
111 566 Listeria innocua
112 565 Neisseria meningitidis serogroup B
113 562 Buchnera aphidicola subsp. Schizaphis graminum
114 560 Helicobacter pylori J99 (Campylobacter pylori J99)
115 558 Xanthomonas campestris pv. campestris
116 556 Staphylococcus haemolyticus (strain JCSC1435)
117 548 Staphylococcus saprophyticus subsp. saprophyticus
118 545 Brucella melitensis
119 542 Neisseria meningitidis serogroup A
120 542 Brucella suis
121 541 Bacillus cereus (strain ATCC 10987)
122 541 Enterobacter sakazakii (strain ATCC BAA-894)
123 537 Emericella nidulans (Aspergillus nidulans)
124 535 Yarrowia lipolytica (Candida lipolytica)
125 532 Clostridium acetobutylicum
126 531 Caulobacter crescentus (Caulobacter vibrioides)
127 523 Debaryomyces hansenii (Yeast) (Torulaspora hansenii)
128 523 Xanthomonas axonopodis pv. citri
129 521 Oceanobacillus iheyensis
130 520 Bacillus thuringiensis subsp. konkukian
131 515 Pseudomonas syringae pv. syringae (strain B728a)
132 510 Vibrio fischeri (strain ATCC 700601 / ES114)
133 508 Streptococcus pneumoniae
134 508 Bacillus cereus (strain ZK / E33L)
135 508 Pseudomonas fluorescens (strain PfO-1)
136 507 Buchnera aphidicola subsp. Baizongia pistaciae
137 507 Pseudomonas aeruginosa (strain UCBPP-PA14)
138 507 Listeria monocytogenes serotype 4b (strain F2365)
139 503 Pseudomonas fluorescens (strain Pf-5 / ATCC BAA-477)
140 501 Bacillus licheniformis (strain DSM 13 / ATCC 14580)
141 501 Xylella fastidiosa
142 499 Thermotoga maritima
143 499 Bordetella bronchiseptica (Alcaligenes bronchisepticus)
144 493 Pseudomonas syringae pv. phaseolicola (strain 1448A / Race 6)
145 492 Rickettsia conorii
146 492 Xylella fastidiosa (strain Temecula1 / ATCC 700964)
147 486 Chromobacterium violaceum
148 485 Bordetella parapertussis
149 484 Haemophilus ducreyi
150 484 Bordetella pertussis
151 483 Mycoplasma genitalium
152 481 Deinococcus radiodurans
153 480 Sodalis glossinidius (strain morsitans)
154 475 Clostridium perfringens
155 474 Vibrio cholerae serotype O1 (strain ATCC 39541 / Ogawa 395 / O395)
156 473 Corynebacterium glutamicum (Brevibacterium flavum)
157 468 Brucella abortus
158 465 Methanosarcina acetivorans
159 463 Haemophilus influenzae (strain 86-028NP)
160 461 Pseudomonas entomophila (strain L48)
161 460 Mannheimia succiniciproducens (strain MBEL55E)
162 457 Pyrococcus horikoshii
163 457 Pseudomonas aeruginosa (strain PA7)
164 456 Burkholderia pseudomallei (Pseudomonas pseudomallei)
165 456 Streptomyces avermitilis
166 454 Bacillus clausii (strain KSM-K16)
167 453 Pyrococcus abyssi
168 453 Xanthomonas campestris pv. campestris (strain 8004)
169 451 Enterococcus faecalis (Streptococcus faecalis)
170 449 Geobacillus kaustophilus
171 449 Shewanella sp. (strain MR-7)
172 448 Halobacterium salinarium (Halobacterium halobium)
173 447 Rickettsia felis (Rickettsia azadi)
174 446 Shewanella sp. (strain MR-4)
175 446 Methanosarcina mazei (Methanosarcina frisia)
176 445 Streptococcus pneumoniae (strain ATCC BAA-255 / R6)
177 445 Vibrio harveyi (strain ATCC BAA-1116 / BB120)
178 443 Synechococcus elongatus (Thermosynechococcus elongatus)
179 442 Lactobacillus plantarum
180 439 Synechococcus elongatus (strain PCC 7942) (Anacystis nidulans R2)
181 438 Brucella abortus (strain 2308)
182 438 Oryza sativa subsp. indica (Rice)
183 437 Streptococcus mutans
184 436 Thermoanaerobacter tengcongensis
185 436 Chlamydia trachomatis
186 434 Rickettsia bellii (strain RML369-C)
187 434 Pyrococcus furiosus
188 433 Burkholderia mallei (Pseudomonas mallei)
189 433 Ovis aries (Sheep)
190 432 Burkholderia sp. (strain 383) (Burkholderia cepacia
191 432 Acinetobacter sp. (strain ADP1)
192 431 Rhodopseudomonas palustris
193 430 Streptococcus pyogenes serotype M6
194 429 Staphylococcus aureus (strain Newman)
195 428 Pseudomonas putida (strain F1 / ATCC 700007)
196 428 Nicotiana tabacum (Common tobacco)
197 428 Anabaena variabilis (strain ATCC 29413 / PCC 7937)
198 427 Borrelia burgdorferi (Lyme disease spirochete)
199 425 Xanthomonas campestris pv. vesicatoria (strain 85-10)
200 424 Ralstonia eutropha (strain JMP134) (Alcaligenes eutrophus)
201 423 Campylobacter jejuni
202 420 Shewanella frigidimarina (strain NCIMB 400)
203 420 Pseudomonas putida (strain GB-1)
204 420 Shewanella sp. (strain ANA-3)
205 419 Chlamydia pneumoniae (Chlamydophila pneumoniae)
206 416 Aspergillus fumigatus (Sartorya fumigata)
207 415 Xanthomonas oryzae pv. oryzae (strain MAFF 311018)
208 413 Methylococcus capsulatus
209 413 Ralstonia eutropha (Cupriavidus necator
210 412 Bacillus amyloliquefaciens (strain FZB42)
211 411 Rhodobacter sphaeroides (strain ATCC 17023 / 2.4.1 / NCIB 8253 / DSM 158)
212 411 Staphylococcus aureus (strain Mu3 / ATCC 700698)
213 410 Streptococcus pyogenes serotype M1
214 409 Chlamydia muridarum
215 408 Rhizobium sp. (strain NGR234)
216 408 Shewanella baltica (strain OS185)
217 408 Sulfolobus solfataricus
218 406 Streptococcus pyogenes serotype M18
219 404 Nitrosomonas europaea
220 404 Streptococcus pyogenes serotype M3
221 403 Rickettsia typhi
222 402 Hahella chejuensis (strain KCTC 2396)
223 401 Aeromonas hydrophila subsp. hydrophila (strain ATCC 7966 / NCIB 9240)
224 401 Pseudoalteromonas haloplanktis (strain TAC 125)
225 400 Gloeobacter violaceus
226 400 Solanum lycopersicum (Tomato) (Lycopersicon esculentum)
227 399 Dechloromonas aromatica (strain RCB)
228 396 Colwellia psychrerythraea (strain 34H / ATCC BAA-681) (Vibrio psychroerythus)
229 396 Burkholderia xenovorans (strain LB400)
230 395 Shewanella sp. (strain W3-18-1)
231 395 Pseudomonas mendocina (strain ymp)
232 394 Corynebacterium efficiens
233 394 Shewanella putrefaciens (strain CN-32 / ATCC BAA-453)
234 392 Neisseria gonorrhoeae (strain ATCC 700825 / FA 1090)
235 392 Shewanella baltica (strain OS195)
236 391 Chlorobium tepidum
237 389 Idiomarina loihiensis
238 389 Shewanella denitrificans (strain OS217 / ATCC BAA-1090 / DSM 15013)
239 389 Synechococcus sp. (strain ATCC 27144 / PCC 6301 / SAUG 1402/1)
240 387 Burkholderia thailandensis (strain E264 / ATCC 700388 / DSM 13276 / CIP 106301)
241 387 Aeromonas salmonicida (strain A449)
242 387 Shewanella baltica (strain OS155 / ATCC BAA-1091)
243 386 Mycobacterium paratuberculosis
244 386 Haemophilus influenzae (strain PittEE)
245 384 Actinobacillus pleuropneumoniae serotype 5b (strain L20)
246 384 Synechococcus sp. (strain WH8102)
247 383 Staphylococcus aureus (strain JH1)
248 383 Pyrococcus kodakaraensis (Thermococcus kodakaraensis)
249 381 Shewanella amazonensis (strain ATCC BAA-1098 / SB2B)
250 381 Burkholderia cenocepacia (strain AU 1054)
3.3 Taxonomic distribution of the sequences
Kingdom sequences (% of the database)
Archaea 14805 ( 4%)
Bacteria 227377 ( 57%)
Eukaryota 142969 ( 36%)
Viruses 12388 ( 3%)
Within Eukaryota:
Category sequences (% of Eukaryota) (% of the complete database)
Human 20326 ( 14%) ( 5%)
Other Mammalia 43208 ( 30%) ( 11%)
Other Vertebrata 14138 ( 10%) ( 4%)
Viridiplantae 23668 ( 17%) ( 6%)
Fungi 22034 ( 15%) ( 6%)
Insecta 5563 ( 4%) ( 1%)
Nematoda 3783 ( 3%) ( 1%)
Other 10249 ( 7%) ( 3%)
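The percentages in both tables follow from the sequence counts. The sketch below recomputes the kingdom percentages and checks that the categories within Eukaryota add up to the Eukaryota total; the names are illustrative assumptions, and rounding to whole percent is assumed from the way the tables are printed.

```python
# Sequence counts per kingdom, copied from the first table in section 3.3.
kingdoms = {
    "Archaea": 14805, "Bacteria": 227377,
    "Eukaryota": 142969, "Viruses": 12388,
}
# Sequence counts per category within Eukaryota (second table).
eukaryota = {
    "Human": 20326, "Other Mammalia": 43208, "Other Vertebrata": 14138,
    "Viridiplantae": 23668, "Fungi": 22034, "Insecta": 5563,
    "Nematoda": 3783, "Other": 10249,
}

total = sum(kingdoms.values())
kingdom_percent = {name: round(100 * n / total) for name, n in kingdoms.items()}
eukaryota_total = sum(eukaryota.values())

print(total, kingdom_percent)
print(eukaryota_total == kingdoms["Eukaryota"])
```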
3.4 Annotation of high-priority organisms
4. SEQUENCE SIZE
Distribution of the sequences by size (excluding fragments)
From To Number From To Number
1- 50 6709 1001-1100 2942
51- 100 29686 1101-1200 2011
101- 150 42279 1201-1300 1593
151- 200 41413 1301-1400 1435
201- 250 41520 1401-1500 1138
251- 300 36361 1501-1600 564
301- 350 35752 1601-1700 439
351- 400 31730 1701-1800 382
401- 450 25630 1801-1900 353
451- 500 21346 1901-2000 285
501- 550 15023 2001-2100 180
551- 600 10955 2101-2200 246
601- 650 9408 2201-2300 244
651- 700 6621 2301-2400 164
701- 750 5357 2401-2500 113
751- 800 3944 >2500 885
801- 850 3446
851- 900 4043
901- 950 3019
951-1000 2127
The average sequence length in UniProtKB/Swiss-Prot is 360 amino acids.
The shortest sequence is GWA_SEPOF (P83570): 2 amino acids.
The longest sequence is TITIN_MOUSE (A2ASS6): 35213 amino acids.
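The average length can be recomputed from the totals in the introduction, and the size table can be checked against the number of non-fragment entries. This is a minimal sketch under the assumption that the average is taken over all entries, fragments included; the variable names are illustrative.

```python
total_entries = 397539      # sequence entries in release 56.1
total_residues = 143289088  # amino acids in release 56.1
fragments = 8196            # entries flagged as fragments

# Average length over all entries, rounded to whole residues.
average_length = total_residues / total_entries
print(round(average_length))

# Counts per size bin, read down the left column and then the right column.
size_bins = [
    6709, 29686, 42279, 41413, 41520, 36361, 35752, 31730, 25630, 21346,
    15023, 10955, 9408, 6621, 5357, 3944, 3446, 4043, 3019, 2127,
    2942, 2011, 1593, 1435, 1138, 564, 439, 382, 353, 285,
    180, 246, 244, 164, 113, 885,
]
non_fragments = sum(size_bins)
print(non_fragments, total_entries - fragments)
```

The two numbers on the last line agree, which confirms that the size table covers every entry that is not a fragment.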
5. JOURNAL CITATIONS
Note: the following citation statistics reflect the number of distinct
journal citations.
Total number of journals cited in this release of UniProtKB/Swiss-Prot: 1937
5.1 Table of the frequency of journal citations
Journals cited 1x: 632
2x: 267
3x: 135
4x: 100
5x: 73
6x: 56
7x: 44
8x: 36
9x: 33
10x: 22
11- 20x: 154
21- 50x: 151
51-100x: 91
>100x: 143
5.2 List of the most cited journals in UniProtKB/Swiss-Prot
Nb Citations Journal name
-- --------- -------------------------------------------------------------
1 16438 Journal of Biological Chemistry
2 7706 Proceedings of the National Academy of Sciences of the U.S.A.
3 4713 Journal of Bacteriology
4 4413 Gene
5 4213 Biochemical and Biophysical Research Communications
6 4186 Nucleic Acids Research
7 3755 FEBS Letters
8 3513 Biochemistry
9 3494 The EMBO Journal
10 3138 Molecular and Cellular Biology
11 3016 European Journal of Biochemistry
12 2986 Nature
13 2836 Biochimica et Biophysica Acta
14 2734 Journal of Molecular Biology
15 2440 Genomics
16 2431 Cell
17 2029 Biochemical Journal
18 1902 Science
19 1630 Journal of Virology
20 1595 Molecular Microbiology
21 1441 Journal of Cell Biology
22 1429 Plant Molecular Biology
23 1291 Molecular and General Genetics
24 1232 Virology
25 1222 Nature Genetics
26 1211 Genes and Development
27 1200 Human Molecular Genetics
28 1124 Journal of Biochemistry
29 1123 Plant Physiology
30 1112 Oncogene
31 1107 The American Journal of Human Genetics
32 994 Development
33 925 Journal of Immunology
34 914 Human Mutation
35 876 Genetics
36 868 Molecular Biology of the Cell
37 818 Infection and Immunity
38 808 Structure
39 772 Journal of General Virology
40 759 Archives of Biochemistry and Biophysics
41 734 The Plant Cell
42 725 Yeast
43 704 Blood
44 676 Microbiology
45 655 Molecular Cell
46 628 Developmental Biology
47 620 Journal of Cell Science
48 606 The Plant Journal
49 602 FEMS Microbiology Letters
50 601 Cancer Research
51 567 Human Genetics
52 566 Nature Structural Biology
53 539 Current Biology
54 539 Mechanisms of Development
55 511 Current Genetics
56 481 Journal of Neuroscience
57 478 Applied and Environmental Microbiology
58 470 Acta Crystallographica, Section D
59 469 Journal of Clinical Investigation
60 466 Protein Science
61 464 Neuron
62 460 Mammalian Genome
63 424 Toxicon
64 423 Immunogenetics
65 421 The Journal of Experimental Medicine
66 415 Molecular Endocrinology
67 410 Molecular and Biochemical Parasitology
68 409 American Journal of Physiology
69 380 Journal of Neurochemistry
70 366 Endocrinology
71 362 Journal of Molecular Evolution
72 356 DNA and Cell Biology
73 354 The Journal of Clinical Endocrinology and Metabolism
74 348 DNA Sequence
75 333 Molecular Biology and Evolution
76 315 Bioscience, Biotechnology, and Biochemistry
77 310 Journal of Medical Genetics
78 306 Brain Research. Molecular Brain Research
79 286 Biological Chemistry Hoppe-Seyler
80 285 Proteins
81 272 Cytogenetics and Cell Genetics
82 261 Comparative Biochemistry and Physiology
83 261 Antimicrobial Agents and Chemotherapy
84 260 Journal of Investigative Dermatology
85 260 Peptides
86 245 Journal of General Microbiology
87 245 Molecular Pharmacology
88 244 Biology of Reproduction
89 240 Plant and Cell Physiology
90 240 Nature Cell Biology
91 236 Experimental Cell Research
92 226 Genome Research
93 215 Hoppe-Seyler's Zeitschrift fur Physiologische Chemie
94 213 Virus Research
95 211 Neurology
96 198 Developmental Dynamics
97 195 Molecular Plant-Microbe Interactions
98 195 RNA
99 192 DNA Research
100 190 European Journal of Immunology
101 186 Biochimie
102 181 Tissue Antigens
103 176 Annals of Neurology
104 174 Planta
105 174 European Journal of Human Genetics
106 168 Journal of Human Genetics
107 166 Genes to Cells
108 164 Molecular and Cellular Endocrinology
109 163 Immunity
110 163 Developmental Cell
111 159 DNA
112 156 Molecular Phylogenetics and Evolution
113 155 American Journal of Medical Genetics
114 152 The New England Journal of Medicine
115 152 Hemoglobin
116 151 Archives of Microbiology
117 151 Eukaryotic Cell
118 148 Insect Biochemistry and Molecular Biology
119 147 Bioorganicheskaia Khimiia
120 141 Molecular Reproduction and Development
121 139 Investigative Ophthalmology and Visual Science
122 137 Diabetes
123 135 Glycobiology
124 134 Animal Genetics
125 133 Molecular Immunology
126 129 General and Comparative Endocrinology
127 128 Molecular and Cellular Neuroscience
128 126 International Journal of Cancer
129 121 Archives of Virology
130 119 Agricultural and Biological Chemistry
131 117 The FASEB Journal
132 115 EMBO Reports
133 114 British Journal of Haematology
134 114 Clinical Genetics
135 113 Molecular Genetics and Metabolism
136 110 Journal of Protein Chemistry
137 108 Journal of Cellular Biochemistry
138 108 Biological Chemistry
139 107 Molecular Genetics and Genomics
140 105 Journal of Neuroscience Research
141 104 Neuroscience Letters
142 103 Journal of Molecular Endocrinology
143 103 Journal of Lipid Research
144 100 American Journal of Medical Genetics. Part A
145 100 Biochemistry and Molecular Biology International
6. STATISTICS FOR SOME LINE TYPES
The following table summarizes the total number of some UniProtKB/Swiss-Prot lines,
as well as the number of entries with at least one such line, and the
frequency of the lines.
Total Number of Average
Line type / subtype number entries per entry
------------------------------------ -------- --------- ---------
References (RL) 724052 1.82
1 Journal 590503 313225 1.49
2 Submitted to EMBL/GenBank/DDBJ 126465 116210 0.32
3 Submitted to other databases 5123 4728 0.01
4 Book citation 594 584 <0.01
5 Plant Gene Register 543 531 <0.01
6 Thesis 389 387 <0.01
7 Unpublished observations 288 284 <0.01
8 Patent 141 139 <0.01
9 Worm Breeder's Gazette 6 6 <0.01
Total number of distinct authors cited in UniProtKB/Swiss-Prot: 264791
Total Number of Average
Line type / subtype number entries per entry
------------------------------------ -------- --------- ---------
Comments (CC) 1645144 4.14
1 SIMILARITY 460524 373596 1.16
2 FUNCTION 285518 274998 0.72
3 SUBCELLULAR LOCATION 229385 225198 0.58
4 CATALYTIC ACTIVITY 158233 144565 0.40
5 SUBUNIT 156485 156485 0.39
6 PATHWAY 92259 80477 0.23
7 COFACTOR 67168 61674 0.17
8 TISSUE SPECIFICITY 29741 29741 0.07
9 PTM 29304 23989 0.07
10 MISCELLANEOUS 27161 24798 0.07
11 DOMAIN 24393 21499 0.06
12 ALTERNATIVE PRODUCTS 17103 17103 0.04
13 SEQUENCE CAUTION 10560 10560 0.03
14 INTERACTION 9546 9546 0.02
15 INDUCTION 9295 9295 0.02
16 DEVELOPMENTAL STAGE 7646 7646 0.02
17 WEB RESOURCE 6327 5147 0.02
18 ENZYME REGULATION 6291 6291 0.02
19 CAUTION 5536 5426 0.01
20 DISEASE 4411 3040 0.01
21 MASS SPECTROMETRY 3581 2722 0.01
22 BIOPHYSICOCHEMICAL PROPERTIES 2268 2268 0.01
23 POLYMORPHISM 719 690 <0.01
24 RNA EDITING 544 544 <0.01
25 ALLERGEN 449 449 <0.01
26 TOXIC DOSE 380 372 <0.01
27 BIOTECHNOLOGY 238 236 <0.01
28 PHARMACEUTICAL 79 79 <0.01
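Total Number of Average
Line type / subtype number entries per entry
------------------------------------ -------- --------- ---------
Features (FT) 2508288 6.31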
1 CHAIN 403765 393562 1.02
2 TRANSMEM 272822 55726 0.69
3 METAL 184776 46208 0.46
4 BINDING 132745 42257 0.33
5 DOMAIN 120590 69089 0.30
6 CONFLICT 108968 37856 0.27
7 STRAND 106813 10123 0.27
8 MOD_RES 106406 37783 0.27
9 TOPO_DOM 104535 21329 0.26
10 HELIX 103676 10649 0.26
11 ACT_SITE 96275 56871 0.24
12 CARBOHYD 87437 22611 0.22
13 DISULFID 86354 22380 0.22
14 NP_BIND 73402 50379 0.18
15 REPEAT 73357 11187 0.18
16 REGION 63186 35011 0.16
17 VARIANT 61445 12897 0.15
18 COMPBIAS 38940 22033 0.10
19 VAR_SEQ 36001 15265 0.09
20 SIGNAL 29659 29649 0.07
21 SITE 26454 15145 0.07
22 MOTIF 26436 17084 0.07
23 TURN 25694 8573 0.06
24 ZN_FING 25165 10201 0.06
25 MUTAGEN 24563 5950 0.06
26 COILED 15346 10180 0.04
27 INIT_MET 12386 12386 0.03
28 NON_TER 11119 8457 0.03
29 LIPID 9502 6092 0.02
30 PROPEP 9461 7886 0.02
31 DNA_BIND 8868 8200 0.02
32 PEPTIDE 7607 4672 0.02
33 TRANSIT 5648 5565 0.01
34 CA_BIND 3354 1392 0.01
35 CROSSLNK 3036 2099 0.01
36 NON_CONS 1477 605 <0.01
37 UNSURE 680 231 <0.01
38 NON_STD 340 266 <0.01
Total Number of Average
Line type / subtype number entries per entry
------------------------------------ -------- --------- ---------
Cross-references (DR) 7016186 17.65
1 InterPro 964037 369710 2.43
2 EMBL 686161 388611 1.73
3 GO 627204 238304 1.58
4 Pfam 511053 357306 1.29
5 RefSeq 360180 329837 0.91
6 PROSITE 358740 224846 0.90
7 GeneID 346163 329616 0.87
8 KEGG 330646 310450 0.83
9 GenomeReviews 259741 242109 0.65
10 HAMAP 208255 208155 0.52
11 HOGENOM 200089 200088 0.50
12 TIGRFAMs 188268 175947 0.47
13 Gene3D 181316 150101 0.46
14 BioCyc 153387 145365 0.39
15 PANTHER 144037 133061 0.36
16 PRINTS 124757 102095 0.31
17 NMPDR 118307 118303 0.30
18 PIR 111178 101472 0.28
19 ProDom 109668 106777 0.28
20 SMART 105908 80543 0.27
21 HSSP 84084 84084 0.21
22 UniGene 78995 73315 0.20
23 HOVERGEN 75465 75465 0.19
24 Ensembl 67729 66286 0.17
25 PIRSF 59382 59382 0.15
26 ArrayExpress 53323 53323 0.13
27 PDBsum 52234 13156 0.13
28 PDB 52234 13156 0.13
29 SMR 49928 49928 0.13
30 GermOnline 41970 41360 0.11
31 TIGR 31920 31212 0.08
32 CleanEx 30267 29615 0.08
33 HGNC 19031 18877 0.05
34 LinkHub 18149 18149 0.05
35 IntAct 16545 16545 0.04
36 PhosphoSite 16009 16009 0.04
37 PharmGKB 15842 15832 0.04
38 MGI 15782 15730 0.04
39 MIM 15245 12117 0.04
40 H-InvDB 11260 9566 0.03
41 DIP 9002 8952 0.02
42 MEROPS 7802 7540 0.02
43 TAIR 7055 6940 0.02
44 RGD 7050 7045 0.02
45 SGD 6640 6538 0.02
46 CYGD 6628 6523 0.02
47 HPA 5783 4698 0.01
48 DrugBank 5316 1625 0.01
49 PeptideAtlas 5170 5170 0.01
50 GeneDB_Spombe 4477 4436 0.01
51 EcoGene 4331 4328 0.01
52 EchoBASE 4159 4124 0.01
53 WormPep 3902 3193 0.01
54 Gramene 3737 3737 0.01
55 FlyBase 3715 3587 0.01
56 WormBase 3596 3512 0.01
57 Reactome 3417 2070 0.01
58 dictyBase 2957 2863 0.01
59 SubtiList 2830 2829 0.01
60 Orphanet 2631 1672 0.01
61 GeneFarm 2260 2239 0.01
62 ZFIN 2132 2116 0.01
63 StyGene 1661 1657 <0.01
64 TubercuList 1474 1438 <0.01
65 PseudoCAP 1184 1175 <0.01
66 SWISS-2DPAGE 1182 1182 <0.01
67 ListiList 1141 1133 <0.01
68 REPRODUCTION-2DPAGE 1029 941 <0.01
69 AGD 771 765 <0.01
70 LegioList 709 707 <0.01
71 PhotoList 699 699 <0.01
72 Leproma 651 648 <0.01
73 PeroxiBase 511 498 <0.01
74 World-2DPAGE 497 497 <0.01
75 CGD 473 473 <0.01
76 MaizeGDB 468 463 <0.01
77 ProMEX 425 425 <0.01
78 DisProt 397 394 <0.01
79 OGP 378 378 <0.01
80 SagaList 375 374 <0.01
81 REBASE 351 343 <0.01
82 ECO2DBASE 351 299 <0.01
83 GlycoSuiteDB 280 280 <0.01
84 BuruList 268 268 <0.01
85 VectorBase 244 237 <0.01
86 PHCI-2DPAGE 244 244 <0.01
87 BindingDB 210 210 <0.01
88 MypuList 198 198 <0.01
89 DOSAC-COBS-2DPAGE 150 150 <0.01
90 Aarhus/Ghent-2DPAGE 126 96 <0.01
91 Siena-2DPAGE 102 102 <0.01
92 HSC-2DPAGE 85 85 <0.01
93 2DBase-Ecoli 84 84 <0.01
94 PhosSite 74 74 <0.01
95 Cornea-2DPAGE 67 67 <0.01
96 COMPLUYEAST-2DPAGE 59 59 <0.01
97 euHCVdb 55 44 <0.01
98 PMMA-2DPAGE 52 52 <0.01
99 PptaseDB 31 31 <0.01
100 Rat-heart-2DPAGE 28 28 <0.01
101 ANU-2DPAGE 23 23 <0.01
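The "Average per entry" column in the tables of this section appears to be the total number of lines divided by the total number of entries in the release, not by the number of entries that carry such a line. The sketch below reproduces the four top-level averages on that assumption and checks that the reference subtypes add up to the References total; the variable names are illustrative.

```python
total_entries = 397539  # sequence entries in release 56.1

# Total number of lines per line type, copied from the tables in section 6.
line_totals = {
    "References (RL)": 724052,
    "Comments (CC)": 1645144,
    "Features (FT)": 2508288,
    "Cross-references (DR)": 7016186,
}
averages = {
    name: round(count / total_entries, 2)
    for name, count in line_totals.items()
}
for name, value in averages.items():
    print(f"{name}: {value:.2f}")

# Reference subtypes 1-9, in table order.
reference_subtypes = [590503, 126465, 5123, 594, 543, 389, 288, 141, 6]
print(sum(reference_subtypes) == line_totals["References (RL)"])
```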
7. MISCELLANEOUS STATISTICS
4381 entries are encoded on a mitochondrion, and 3444 are encoded on a plasmid.
9864 entries are encoded on a plastid,
of which 16 are encoded on apicoplasts,
9454 on chloroplasts,
4 on chromatophores,
145 on cyanelles,
128 on non-photosynthetic plastids and
121 on unspecified types of plastid.