Chordata protein annotation project

The Chordata protein annotation project focuses on the manual annotation of Chordata-specific proteins as well as those that are widely conserved. The aim of this project is twofold: (1) to keep the existing human entries up to date and (2) to broaden manual annotation to other vertebrate species, especially model organisms, including great apes, cow, mouse, rat, chicken and zebrafish, as well as Xenopus laevis and Xenopus tropicalis.

See: How do we manually annotate a UniProtKB entry?

Update of the human proteome

A draft of the human proteome has been available in UniProtKB/Swiss-Prot since 2008, and one of the current priorities of the Chordata protein annotation project is to improve the quality of the human sequences provided.

See: What is UniProt's human proteome?

To this end, we are updating sequences that show discrepancies with those predicted from the genome sequence. We also revisit dubious isoforms, sequences based on experimental artefacts and protein products derived from erroneous gene model predictions. This work is done in part in collaboration with the Hinxton Sequence Forum (HSF), which allows active exchange between the UniProt, HAVANA, Ensembl and HGNC groups, as well as with the RefSeq database. UniProt is a member of the Consensus CDS project, and we are reviewing our records to support convergence towards a standard set of protein annotations.

We also continuously update human entries with functional annotation, including novel structural, post-translational modification, interaction and enzymatic activity data. To identify candidates for re-annotation, we use, among other resources, information extraction tools such as the STRING database. In addition, we regularly add new sequence variants and maintain disease information. This annotation project includes the Variation Annotation project, whose goal is to annotate all known human genetic diseases and disease-linked protein variants, as well as neutral polymorphisms.

Annotation of other mammalian and chordata proteins

In addition to the review of the human proteome, other mammalian and non-mammalian Chordata proteins are increasingly being manually annotated, with special emphasis on species such as Xenopus laevis, Xenopus tropicalis and Danio rerio (zebrafish), which are important model organisms for studying embryonic development and cell biology. We work in close collaboration with species-specific resources and model organism databases, such as HGNC, MGI, RGD, ZFIN and Xenbase, to ensure consistency between UniProt and these resources.

  • All manually reviewed mouse entries can be found here (statistics)
  • All manually reviewed rat entries can be found here (statistics)
  • All manually reviewed Xenopus laevis entries can be found here
  • All manually reviewed Xenopus tropicalis entries can be found here
  • All manually reviewed Danio rerio (Zebrafish) entries can be found here
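
The per-species lists above can also be expressed as search queries. The following is a minimal sketch, not taken from this page: it assumes the public UniProt REST search endpoint at rest.uniprot.org and its `reviewed` and `organism_id` query fields, and uses NCBI taxonomy identifiers for the species named in the list. The function name `reviewed_entries_url` is hypothetical, and the code only builds the URL without sending a request.

```python
from urllib.parse import urlencode

# NCBI taxonomy identifiers for the species listed above.
TAXON_IDS = {
    "mouse": 10090,
    "rat": 10116,
    "Xenopus laevis": 8355,
    "Xenopus tropicalis": 8364,
    "Danio rerio": 7955,
}

# Assumption: the public UniProt REST search endpoint (not stated on this page).
SEARCH_URL = "https://rest.uniprot.org/uniprotkb/search"


def reviewed_entries_url(species: str, size: int = 25) -> str:
    """Build a search URL for manually reviewed entries of one species.

    Hypothetical helper for illustration; it only constructs the URL.
    """
    taxon_id = TAXON_IDS[species]
    params = {
        # "reviewed:true" restricts results to UniProtKB/Swiss-Prot entries.
        "query": f"reviewed:true AND organism_id:{taxon_id}",
        "format": "tsv",
        "size": size,
    }
    return f"{SEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    for name in TAXON_IDS:
        print(name, reviewed_entries_url(name))
```

The resulting URL can be opened in a browser or passed to any HTTP client; the result format and page size are illustrative choices.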