Strain sequence data should be described using the following metadata fields in tab-separated format (see metadata_table.template.tsv for an example file):
genome_ID N50 source technology NCBI_ID NCBI_name contact other
If the value of an optional field is unknown, you may leave it empty. The mandatory fields are genome_ID, source, technology and contact. The fields have the following meaning:
'genome_ID' specifies an identifier for a sequence sample from a particular strain; a sample may include multiple sequences.
'N50' is defined here: http://en.wikipedia.org/wiki/N50_statistic.
'source' specifies your sample (e.g. Arabidopsis thaliana root sample).
'total_assembly_length' specifies the total length of your assembly in bp.
'mean_contig_length' specifies the mean contig length.
'n_contigs' specifies the number of contigs.
'technology' specifies the sequencing technology used (e.g. Illumina paired end).
'NCBI_ID' specifies an identifier for a strain in a reference taxonomy. This can also be an identifier at a higher taxonomic rank, if the strain is not represented in the taxonomy yet.
'NCBI_name' specifies the respective name of the taxon in the reference taxonomy.
'contact' specifies the email address of the owner of a contributed assembled isolate strain sequence sample.
'other' is a field for comments.
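The rules above can be checked before submission with a short script. The following is only a minimal sketch in Python, assuming the table starts with a header row containing the column names listed above. The function name validate_metadata and the example rows (including the address owner@example.org) are hypothetical and used for illustration only.

```python
import csv
import io

# Column names copied from the header line of the metadata table.
EXPECTED_FIELDS = ["genome_ID", "N50", "source", "technology",
                   "NCBI_ID", "NCBI_name", "contact", "other"]
# Fields that must not be left empty.
MANDATORY_FIELDS = ["genome_ID", "source", "technology", "contact"]


def validate_metadata(handle):
    """Return a list of human-readable problems found in a metadata table.

    Hypothetical helper: assumes the first line of the file is a header row.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    problems = []
    header = reader.fieldnames or []
    missing_columns = [f for f in EXPECTED_FIELDS if f not in header]
    if missing_columns:
        problems.append("missing columns: " + ", ".join(missing_columns))
    # Data rows start on line 2 because line 1 holds the header.
    for line_number, row in enumerate(reader, start=2):
        for field in MANDATORY_FIELDS:
            if not (row.get(field) or "").strip():
                problems.append(
                    f"line {line_number}: mandatory field '{field}' is empty"
                )
    return problems


# Made-up example table: the second data row lacks 'source' and 'contact'.
example = (
    "genome_ID\tN50\tsource\ttechnology\tNCBI_ID\tNCBI_name\tcontact\tother\n"
    "strain_A\t\tArabidopsis thaliana root sample\tIllumina paired end"
    "\t\t\towner@example.org\t\n"
    "strain_B\t\t\tIllumina paired end\t\t\t\t\n"
)

for problem in validate_metadata(io.StringIO(example)):
    print(problem)
```

To check a real file, open it with open("metadata_table.template.tsv", newline="") and pass the file handle to the function instead of the in-memory example.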