Welcome to SmProt v2.0!

"Small protein" is the general term for a protein shorter than 100 amino acids. The SmProt database contains records of small proteins encoded by genes, especially those from UTRs and non-coding RNAs. The small proteins were collected from the literature, mass spectrometry (MS) data, and ribosome profiling data from eight species: human, mouse, rat, zebrafish, yeast, fruit fly, E. coli, and C. elegans. For each small protein, SmProt also records features such as sequence, data source, genomic location, tissue localization, source cell line, start codon, multiple scores reflecting coding potential, and verified or predicted function, interactions, and related diseases.
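As an illustration of the length criterion above, here is a minimal Python sketch that keeps only sequences shorter than 100 amino acids. The function name, the constant, and the example sequences are hypothetical and are not part of SmProt or its pipeline; treating a trailing "*" as a stop symbol is also an assumption about the input format.

```python
# Hypothetical helper for illustration only; not SmProt code.
MAX_SMALL_PROTEIN_LENGTH = 100  # amino acids, from the definition above


def is_small_protein(sequence: str) -> bool:
    """Return True if the sequence is non-empty and shorter than 100 residues."""
    # Assumption: surrounding whitespace and a trailing '*' stop symbol are not residues.
    residues = sequence.strip().rstrip("*")
    return 0 < len(residues) < MAX_SMALL_PROTEIN_LENGTH


# Made-up sequences: one just under the threshold, one exactly at it.
candidates = {
    "orf_a": "M" + "A" * 98,  # 99 residues
    "orf_b": "M" + "A" * 99,  # 100 residues
}

small = [name for name, seq in candidates.items() if is_small_protein(seq)]
print(small)  # ['orf_a']
```

Only orf_a passes, because the definition is a strict "shorter than 100" and a sequence of exactly 100 residues is excluded.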
UPDATE: This release pays extra attention to high confidence, the relationship between small proteins and diseases, function, a vast increase in tissues/cell lines/datasets, translation initiation, PhyloCSF score, more ORF types and gene types, and other detailed information. Over 4000 conserved small protein families identified from human microbiomes in a published paper were collected for display.



Search hints: gene symbol or gene ID from related databases (e.g. NONCODE, RefSeq, ENSEMBL), cell line or tissue, PubMed ID (PMID), ORF type, and gene type.

What's new!

1. High confidence, achieved through a more accurate algorithm, a specially designed pipeline, score evaluation, and filtering by multiple omics evidence. All small proteins derived from ribosome profiling datasets are completely new!
2. Special attention to the functional features of small proteins: function, diseases, and interactions.
3. 419 Ribo-seq datasets derived from 79 cell lines/tissues were used in the update.
4. Small proteins with non-AUG translation initiation were added to the database.
5. Small proteins translated from circRNAs were added to the database.
6. The number of small proteins increased to 564063 IDs (after merging) and 1808051 records (before merging).
7. Over 4000 conserved small protein families identified from human microbiomes were collected and are displayed in humanMicroBio.


How to cite?     Hao Y., et al. 2017. SmProt: a database of small proteins encoded by annotated coding and non-coding RNA loci. Brief Bioinform bbx005.

Keep assessing, integrating and uploading new data.
June 2019
Keep optimizing the website.
March 2019
Started to collect small proteins from new literature.
March 2019
Started running the analysis pipeline.
September 2018
Started to collect new Ribo-seq/TI-seq datasets.
June 2018
Constructed new pipeline for SmProt v2.0.
May 2018
Conceived and investigated new points for update.
March 2018
SmProt article published in Briefings in Bioinformatics.
January 2017
1st public release.
June 2016
Started to collect small proteins from the literature.
December 2015
Started to collect MS data sets and ribosome profiling data sets.
June 2015

