Written by RobE, May, 2006.
The command seed2gff takes a single genome and outputs a GFF3 file for you. Its options let you select which parts of the genome to include in the GFF3 file (for example, only the proteins) or limit the sequence to a particular region of the genome.
Use the program nmpdr2gff to create the GFF3 files for uploading to the BRC site. It takes a single argument, the name of the directory to put the files into. The program goes through each organism and looks for the flag file NMPDR in the organism directory. If the flag file is present, it creates the GFF3 file and writes it into a subdirectory named after the genus. A couple of flags must be set to switch the GFF3 output from SEED to NMPDR for the BRC; mainly, the database is called NMPDR, not SEED.
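The selection logic described above can be sketched in a few lines of Python. This is only an illustration, not the nmpdr2gff source: the flag file name NMPDR comes from the text, but the GENOME file used to look up the genus, the ".gff" output name, and the function names are all assumptions made for the example.

```python
import os

FLAG_FILE = "NMPDR"  # flag file name taken from the description above


def read_genus(org_dir):
    """Return the first word of the organism name.

    ASSUMPTION: the organism name is stored on the first line of a file
    called GENOME inside the organism directory.
    """
    with open(os.path.join(org_dir, "GENOME")) as handle:
        return handle.readline().split()[0]


def plan_outputs(organisms_dir, output_dir):
    """List (organism id, output path) pairs for every flagged organism.

    Organisms without the flag file are skipped. The output path layout
    <output_dir>/<genus>/<organism id>.gff is hypothetical.
    """
    plan = []
    for org_id in sorted(os.listdir(organisms_dir)):
        org_dir = os.path.join(organisms_dir, org_id)
        if not os.path.isfile(os.path.join(org_dir, FLAG_FILE)):
            continue  # no flag file, so this organism is not exported
        genus = read_genus(org_dir)
        plan.append((org_id, os.path.join(output_dir, genus, org_id + ".gff")))
    return plan
```

Calling plan_outputs on the organism directory only reports what would be written; creating the GFF3 content itself is left to nmpdr2gff.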
Once these files are created you can gzip them and transfer them to the BRC site via ftp. You'll need the username and password from Tom Creasey at TIGR, or me.
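If you would rather script the compress-and-transfer step, something like the following Python sketch should work. It uses only the standard gzip and ftplib modules; the function names are made up for this example, and the host, username, password and remote directory are left as parameters because they are not given here.

```python
import gzip
import os
import shutil
from ftplib import FTP


def gzip_file(path):
    """Write a gzip-compressed copy of path next to it and return its name.

    Unlike the gzip command line tool, this keeps the original file.
    """
    gz_path = path + ".gz"
    with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return gz_path


def upload_files(host, user, password, paths):
    """Upload each file to the login directory of the FTP server.

    ASSUMPTION: files go into whatever directory the account lands in;
    add an ftp.cwd(...) call if the site expects a specific directory.
    """
    with FTP(host) as ftp:
        ftp.login(user, password)
        for path in paths:
            with open(path, "rb") as handle:
                ftp.storbinary("STOR " + os.path.basename(path), handle)
```

A typical run would call gzip_file on each GFF3 file and pass the resulting list of .gz names to upload_files.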
FTP to the BRC Central site (ftp://ftp.brc-central.org) and download the files that are from the other sites. If you don't want to do that manually, the easiest way to get the data is to use this command:
wget -r ftp://ftp.brc-central.org/
This will recursively download the entire directory structure on the ftp site. I have been doing this in /home/seed/IOWG/.
Once you have the data downloaded, use the command extract_fasta_idmap.pl to convert those files into the three files that we need for the mapping: fasta, assigned_functions, and org.table. This command just takes the name of the directory with all the subdirectories and goes through them, looking for gff files and extracting data.
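To give an idea of what walking the directories and extracting data involves, here is a rough Python sketch. It is not extract_fasta_idmap.pl: the function names and the file extensions it matches are assumptions, and it only shows how to find the files and split a GFF3 feature line into its nine standard tab-separated columns. It does not write the fasta, assigned_functions or org.table files.

```python
import os
from urllib.parse import unquote


def find_gff_files(root):
    """Return every file under root whose name ends in .gff or .gff3.

    ASSUMPTION: the downloaded files are uncompressed and use one of
    these two extensions.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith((".gff", ".gff3")):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


def parse_gff3_line(line):
    """Parse one GFF3 feature line into a dict.

    Returns None for blank lines, comment/directive lines, and lines
    that do not have exactly nine tab-separated columns.
    """
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    cols = line.split("\t")
    if len(cols) != 9:
        return None
    attributes = {}
    for pair in cols[8].split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            # GFF3 attribute values are URL-escaped
            attributes[key.strip()] = unquote(value.strip())
    return {
        "seqid": cols[0],
        "source": cols[1],
        "type": cols[2],
        "start": int(cols[3]),
        "end": int(cols[4]),
        "strand": cols[6],
        "attributes": attributes,
    }
```

Looping over find_gff_files and feeding each line to parse_gff3_line gives you the feature records; the sequence section after a ##FASTA directive would need separate handling.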
-g Number of the genome to extract (required). -o