Search | ID Search: search by SmProt ID, NONCODE ID, or ENSEMBL ID. Location Search: search a chromosomal region of interest in a specific species. Small proteins are reported as hits if their locations overlap the input location. |
Browse | On the Browse page, users can filter by species (human, mouse, etc.), start codon (ATG, non-ATG), data source (ribosome profiling, mass spectrometry, etc.), and predicted function (yes/no, indicating whether a functional domain prediction is available). Click the Browse button to list the filtered results with brief information below. Click a SmProt_ID to open the page with detailed information.
|
Variants | The Variants page provides variants related to small ORFs in the 5'UTR, called from WGS data of multiple projects, together with their effects on downstream gene expression and on translated uORFs in SmProt. Users can choose the data source (WGS project) and the variant type (uAUG_gained, uSTOP_lost, etc., describing the effect of the variant). Click a variant to open the page with detailed information.
|
Diseases | The Diseases page provides disease-specific translation events and variants in small proteins predicted from ribosome profiling data (confidence: predicted specific), as well as disease-related small proteins reported in the literature (confidence: reported related). Users first choose a species; the disease list is then populated for that species. Users can further filter by confidence and by the start codon of the small proteins.
|
Human Microbio | On the HumanMicroBio page, users can choose a body site (skin, gut, etc.) to see small proteins identified from microbial samples from that site. The brief results show the total number, length, and representative sequence of each family. Click a Family ID to open the page with the corresponding detailed information.
|
Inner BLAST | On the Blast page, users can assess the sequence similarity of small proteins across multiple species. All small proteins in SmProt v2.0 are included in the BLAST database. The blastp program compares a protein query against protein sequences; blastx compares a translated nucleotide query against protein sequences. Users can enter FASTA-format sequences directly or load FASTA files from disk. Results can be generated with default or user-specified parameters.
|
Genome Browser | To view small proteins in a genomic region, users can open the Genome Browser page by clicking the Genome button on the navigation bar, the location link in the General Information table of any small protein page, or the genome browser link in the Dataset table of any small protein page. Users can manually choose which tracks are shown or hidden.
|
Terminology Explanation | PhyloCSF: conservation of a genomic region, reflecting its coding potential. RiboPvalue: one-tailed rank-sum test p-value for regular Ribo-seq frame bias inside the ORF (frame test). TISCount: number of reads with the P-site at the translation initiation site (TIS). TISPvalue: one-tailed negative binomial test p-value for TISCount (TIS test). MS evidence: translation evidence from mass spectrometry experiments. Kozak sequence: (GCC)GCC(A/G)CCATGG, the consensus sequence for translation initiation in vertebrates. Kozak Strength: the likelihood of an AUG initiating translation. oORF: overlapping open reading frame (overlapping the downstream gene). methyl_bin: for C>T changes at CpG sites, the mutability-adjusted proportion of singletons is calculated separately for three distinct methylation bins. AC: allele count. AF: allele frequency. |
|
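The Location Search described in the Search row reports a small protein as a hit when its location overlaps the input location. The sketch below illustrates one common way to express such an overlap test. It is not SmProt's actual implementation: the `Region` type, the function names, the record IDs, and the 1-based inclusive coordinate convention are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A genomic interval. Coordinates are assumed 1-based and inclusive."""
    chrom: str
    start: int
    end: int


def overlaps(a: Region, b: Region) -> bool:
    """Return True if the two regions share at least one base."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def location_search(query: Region, records: dict) -> list:
    """Return the IDs of all records whose region overlaps the query."""
    return [rid for rid, region in records.items() if overlaps(query, region)]


# Hypothetical records, not real SmProt entries.
records = {
    "example_1": Region("chr1", 100, 250),
    "example_2": Region("chr1", 300, 420),
    "example_3": Region("chr2", 100, 250),
}

hits = location_search(Region("chr1", 200, 310), records)
print(hits)  # ['example_1', 'example_2']
```

Under this convention, regions that merely touch at a single base (for example 1-10 and 10-20) count as overlapping, while adjacent regions (1-10 and 11-20) do not. A half-open coordinate system would need strict inequalities instead.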