towards a richer set of information to describe our complete genome collection

Draft Policy on GCat identifiers

From Genomic Standards Consortium

Scope

A GCAT identifier is unique to a specific genome report.

An identifier will only be issued once for a given genome report and cannot be re-used.

There exists a one-to-one mapping between a GCAT identifier and an NCBI genome project [1] number.

If an identifier is requested for a genome report for a given NCBI genome project and the genome report is subsequently deleted, a new GCAT identifier will be required for a new genome report for the same NCBI genome project.

Syntax

A GCAT identifier will take the form of a serial number followed by _GCAT, e.g. 000001_GCAT (a format specifically selected to make GCat identifiers "look" different from INSDC accession numbers).

An identifier will have GCAT in uppercase; however, the identifier is case-insensitive, i.e. 000001_GCAT is the same as 000001_gcat.

The leading zeros in the identifier are not essential, e.g. 1_GCAT, 000001_GCAT, and 001_GCAT are the same.
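
The syntax rules above (serial number, _GCAT suffix, case-insensitivity, optional leading zeros) can be sketched as a small normalisation routine. This is a minimal illustration only, not an official GSC tool: the function names parse_gcat and canonical_gcat and the default padding width of six digits (taken from the 000001_GCAT example) are assumptions.

```python
import re

# Assumed pattern: ASCII digits, an underscore, then "GCAT" in any letter case.
_GCAT_PATTERN = re.compile(r"([0-9]+)_GCAT", re.IGNORECASE)


def parse_gcat(identifier: str) -> int:
    """Return the serial number encoded in a GCAT identifier (hypothetical helper)."""
    match = _GCAT_PATTERN.fullmatch(identifier.strip())
    if match is None:
        raise ValueError(f"not a GCAT identifier: {identifier!r}")
    # int() discards leading zeros, so 1, 001 and 000001 compare equal.
    return int(match.group(1))


def canonical_gcat(identifier: str, width: int = 6) -> str:
    """Return the zero-padded, uppercase form of an identifier (width is an assumption)."""
    return f"{parse_gcat(identifier):0{width}d}_GCAT"
```

Two identifiers refer to the same genome report exactly when parse_gcat returns the same serial number for both, so comparing parsed serials is safer than comparing the raw strings.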

Issue of GCAT identifiers

It will not be possible to request specific GCAT identifiers; identifiers will be issued in sequence starting with 000001_GCAT.

Identifiers are now being issued.

The GCAT identifier will be recorded in the MIGS genome record (e.g. entered by the user).

If a report is deleted, the GSC will still retain information regarding the report, and that information will remain associated with the GCAT identifier for the sake of 'housekeeping'.
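
The issuing rules in this section and in Scope (sequential serial numbers, no re-use, one live identifier per NCBI genome project, and retention of records for deleted reports) can be sketched as an in-memory registry. This is a hedged illustration under stated assumptions, not a description of the GSC's actual system: the class names GcatRegistry and ReportRecord, the field names, and the use of an integer project number are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ReportRecord:
    """Housekeeping record kept for every identifier ever issued (hypothetical)."""
    project_id: int          # NCBI genome project number
    deleted: bool = False    # set when the genome report is deleted


@dataclass
class GcatRegistry:
    """Issues serials in sequence and never re-uses one (illustrative sketch)."""
    records: dict = field(default_factory=dict)   # serial -> ReportRecord
    active: dict = field(default_factory=dict)    # project_id -> live serial
    next_serial: int = 1                          # sequence starts at 000001_GCAT

    def issue(self, project_id: int) -> str:
        # One-to-one mapping: a project may have only one live identifier.
        if project_id in self.active:
            raise ValueError("project already has a live GCAT identifier")
        serial = self.next_serial
        self.next_serial += 1
        self.records[serial] = ReportRecord(project_id)
        self.active[project_id] = serial
        return f"{serial:06d}_GCAT"

    def delete(self, project_id: int) -> None:
        # The record is flagged, not removed, so the serial can never be re-issued.
        serial = self.active.pop(project_id)
        self.records[serial].deleted = True
```

In this sketch, deleting the report for a project and then requesting an identifier for the same project yields the next serial in sequence, while the earlier record stays in records with deleted set to True.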

Genome Report Filenames

Genome reports will optionally have filenames that include the GCAT identifier, and will consist of the identifier followed by a file type suffix, e.g.

000001_GCAT.xml
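
As a minimal sketch of the naming convention above, the following helpers build a report filename from a serial number and recover the serial from a filename. The function names, the six-digit padding, and the default xml suffix are assumptions drawn from the single example given; filenames remain optional under the policy.

```python
import re
from pathlib import PurePath
from typing import Optional


def report_filename(serial: int, suffix: str = "xml") -> str:
    """Build a filename such as 000001_GCAT.xml (hypothetical helper)."""
    return f"{serial:06d}_GCAT.{suffix}"


def serial_from_filename(name: str) -> Optional[int]:
    """Return the serial encoded in a report filename, or None if it does not match."""
    stem = PurePath(name).stem  # drops the final file type suffix
    match = re.fullmatch(r"([0-9]+)_GCAT", stem, re.IGNORECASE)
    return int(match.group(1)) if match else None
```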


Revisions

Following the NCBI policy on accession numbers, a GCAT identifier will remain the same even if the content in the genome report changes.

Additional users, including the GSC, will be able to edit any report, but the original contents of the report as submitted by the original provider will always be retrievable, and the source of all information in the genome report will be transparent to the reader.
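
One way to meet these revision requirements is an append-only history in which every edit records who made it. The sketch below is only an illustration of that idea, assuming hypothetical class and field names (GenomeReport, Revision, editor, content); it does not describe how the GSC stores reports.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Revision:
    """One immutable version of a report, attributed to its editor (hypothetical)."""
    editor: str
    content: str


@dataclass
class GenomeReport:
    """A report whose GCAT identifier stays fixed while its content changes."""
    gcat_id: str
    revisions: list = field(default_factory=list)

    def edit(self, editor: str, content: str) -> None:
        # Edits are appended, never overwritten, so earlier versions survive.
        self.revisions.append(Revision(editor, content))

    @property
    def original(self) -> Revision:
        """The report as first submitted by the original provider."""
        return self.revisions[0]

    @property
    def current(self) -> Revision:
        """The most recent version of the report."""
        return self.revisions[-1]
```

Because each Revision carries its editor, a reader can trace the source of any version, and the first entry always preserves the original submission.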
