
Annotation guidelines

Last modified June 8, 2018

Standard operating procedure (SOP) for UniProt manual curation

This document describes the manual curation procedure used by the UniProt Consortium members. The UniProt manual curation process comprises manual review of results from a range of sequence analysis programs, literature curation of experimental data, and attribution of all information to its original source.

UniProtKB/Swiss-Prot document: Standard operating procedure (SOP) for UniProt manual curation
UniProt web page: UniProt manual annotation program

How do we manually annotate a UniProtKB entry?
Why is UniProtKB composed of 2 sections, UniProtKB/Swiss-Prot and UniProtKB/TrEMBL?

Protein naming guidelines

Ambiguities in gene and protein names are a major problem in the literature and in sequence databases, which tend to propagate the confusion. As administrators of UniProt, we feel that we can play a major role in the standardization of protein nomenclature.

The European Bioinformatics Institute (EMBL-EBI), the National Center for Biotechnology Information (NCBI), the Protein Information Resource (PIR) and the Swiss Institute of Bioinformatics (SIB) have worked together to produce a shared set of protein naming guidelines. These guidelines are intended for anyone who wants to name a protein. They aim to promote consistent nomenclature, which is indispensable for communication, literature searching and data retrieval:

UniProtKB/Swiss-Prot document: International protein nomenclature guidelines

User manual: Protein names

Why does the UniProtKB use so many different names for the same protein?

The document 'Protein nomenclature publication list' lists references that are important in defining the nomenclature or terminology for proteins in general and for specific families or groups of proteins in particular.

Criteria description for protein existence

Some protein sequences exhibit strong similarity to known proteins in closely related species. For other proteins there is experimental evidence, such as Edman sequencing, clear identification by mass spectrometry (MS), X-ray or NMR structure, or detection by antibodies. For still other proteins, there is no evidence at all. To indicate these different levels of evidence for the existence of a protein, we have introduced a PE (Protein Existence) level. The following document lists the criteria used to assign a PE level to entries.

UniProtKB/Swiss-Prot document: Criteria used to assign the PE level of entries
User manual: Protein existence

Why do we keep dubious sequences in UniProtKB? How to discard them from a protein set?
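
As an illustration of how PE levels can be used to discard dubious sequences from a protein set, here is a minimal Python sketch. It assumes the five-level scale described in the UniProtKB user manual (1: evidence at protein level, 2: evidence at transcript level, 3: inferred from homology, 4: predicted, 5: uncertain). The names ProteinExistence and drop_uncertain, the dictionary layout and the accessions are hypothetical and chosen only for this example; they are not part of any UniProt tool or API.

```python
from enum import IntEnum


class ProteinExistence(IntEnum):
    """PE levels; a lower number means stronger evidence (assumed scale)."""
    PROTEIN_LEVEL = 1      # evidence at protein level
    TRANSCRIPT_LEVEL = 2   # evidence at transcript level
    HOMOLOGY = 3           # inferred from homology
    PREDICTED = 4          # predicted
    UNCERTAIN = 5          # uncertain (dubious sequences)


def drop_uncertain(entries, max_level=ProteinExistence.PREDICTED):
    """Keep only entries whose PE level is at most max_level."""
    return [entry for entry in entries if entry["pe"] <= max_level]


# Hypothetical entries; the accessions are placeholders, not real records.
entries = [
    {"accession": "EXAMPLE1", "pe": ProteinExistence.PROTEIN_LEVEL},
    {"accession": "EXAMPLE2", "pe": ProteinExistence.HOMOLOGY},
    {"accession": "EXAMPLE3", "pe": ProteinExistence.UNCERTAIN},
]

kept = drop_uncertain(entries)
kept_accessions = [entry["accession"] for entry in kept]
print(kept_accessions)
```

Passing a stricter threshold, for example max_level=ProteinExistence.PROTEIN_LEVEL, would keep only entries with experimental evidence at the protein level.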

UniProt is an ELIXIR core data resource
Main funding by: National Institutes of Health
