Welcome to SmProt v2.0!

Small proteins are proteins shorter than 100 amino acids. The SmProt database contains records of small proteins encoded by genes, especially those from UTRs and non-coding RNAs. The small proteins were collected from the literature and from mass spectrometry (MS) and ribosome profiling data for eight species: human, mouse, rat, zebrafish, yeast, fruit fly, E. coli, and C. elegans. SmProt also annotates each small protein with its sequence, data source, genomic location, tissue localization, source cell line, start codon, multiple scores reflecting coding potential, and function, interaction, and related diseases that have been verified or predicted.
UPDATE: Extra Attention to High Confidence, Relationship Between Small Proteins and Diseases, Function, Vast Increase in Tissues/Cell Lines/Datasets, Translation Initiation, and More Detailed Information.
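
The length criterion in the definition above (shorter than 100 amino acids) can be illustrated with a minimal Python sketch. The function names, the toy sequences, and the handling of a trailing stop symbol are assumptions made for illustration only; this is not the SmProt pipeline.

```python
# Threshold taken from the definition above: shorter than 100 amino acids.
MAX_SMALL_PROTEIN_LENGTH = 100


def is_small_protein(sequence: str) -> bool:
    """Return True if the amino-acid sequence has fewer than 100 residues.

    Assumption: a trailing '*' (stop symbol) is not counted as a residue.
    """
    residues = sequence.strip().rstrip("*")
    return 0 < len(residues) < MAX_SMALL_PROTEIN_LENGTH


def filter_small_proteins(records: dict) -> dict:
    """Keep only the name -> sequence entries that pass the length criterion."""
    return {name: seq for name, seq in records.items() if is_small_protein(seq)}


# Toy, made-up sequences (hypothetical, not database records).
example = {
    "toy_peptide_a": "MKTLLLTLVVVTIVCLDLGYT",  # 21 residues, kept
    "toy_protein_b": "M" + "A" * 149,          # 150 residues, dropped
}

small = filter_small_proteins(example)
print(sorted(small))
```

A strict "less than" comparison is used because the definition says "shorter than 100", so a 100-residue sequence is excluded.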



Search hints: Gene Symbol or Gene IDs from related databases (e.g. NONCODE, RefSeq, ENSEMBL), cell line or tissue, PubMed ID (PMID), ORF type and gene type.

What's new!

1. High confidence generated through a more accurate algorithm, a specially designed pipeline, score evaluation and filtration by multiple omics evidence.
2. Special attention to the functional level of small proteins, covering function, interaction and related diseases.
3. 264 additional Ribo-seq datasets derived from 79 cell lines/tissues were added in the update.
4. Small proteins with non-AUG translation initiation were added to the database.
5. Small proteins translated from circRNAs were added to the database.
6. The number of small proteins increased from 250,000 to 5,400,000.


How to cite?     Hao Y., et al. 2017. SmProt: a database of small proteins encoded by annotated coding and non-coding RNA loci. Brief Bioinform bbx005.


Keep assessing, integrating and uploading new data.
June 2019
Keep optimizing the website.
March 2019
Started to collect small proteins from new literature.
March 2019
Started running the analysis pipeline.
September 2018
Started to collect new Ribo-seq/TI-seq datasets.
June 2018
Constructed new pipeline for SmProt v2.0.
May 2018
Conceived and investigated new points for update.
March 2018
SmProt article published in Briefings in Bioinformatics.
January 2017
1st public release.
June 2016
Started to collect small proteins from the literature.
December 2015
Started to collect MS data sets and ribosome profiling data sets.
June 2015
