3. Registration for location of programs and resources

Author: TGFam-admin, posted 18-07-18 11:30


To run TGFam-Finder, users must prepare two files: RESOURCE.config, which lists the full paths of the genomic resources, and PROGRAM_PATH.config, which lists the absolute paths of the pre-installed programs and the names of the output directories. The full auto-installation generates PROGRAM_PATH.config and a RESOURCE.config for the sample data automatically. Users who do not use our installation script must enter the program locations in PROGRAM_PATH.config and the resource locations in RESOURCE.config manually. Users who run a partial installation with our script, which installs everything except InterProScan, must add the location of InterProScan to PROGRAM_PATH.config.
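Before a run, it can help to confirm which entries in a config file are still blank. The following is a minimal sketch, not part of TGFam-Finder: it assumes every setting is a single-line assignment of the form `$NAME = "value";` as listed in this section, and the function names and the paths in `SAMPLE` are hypothetical.

```python
import re

# Matches lines such as:  $THREADS = "4";  ## Number of threads
# Assumption: each setting is a single-line, double-quoted assignment.
ASSIGNMENT = re.compile(r'^\s*\$(\w+)\s*=\s*"([^"]*)"\s*;')

# Hypothetical excerpt of a PROGRAM_PATH.config; the paths are made up.
SAMPLE = '''
## DEFAULT PATH ##
$TGFAM_SCRIPTS_PATH="/opt/TGFam-Finder/scripts";  ## hypothetical path
$RUNNING_PATH="/home/user/tgfam_run";  ## hypothetical path
## PROGRAM PATH ##
$BLAST_BIN_PATH = "";  ## not filled in yet
'''


def parse_config(text):
    """Return {variable name: value} for each assignment found in text."""
    settings = {}
    for line in text.splitlines():
        match = ASSIGNMENT.match(line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


def empty_settings(settings):
    """Return the sorted names of settings whose value is an empty string."""
    return sorted(name for name, value in settings.items() if value == "")


if __name__ == "__main__":
    parsed = parse_config(SAMPLE)
    print("Still empty:", empty_settings(parsed))
```

Lines that start with `##` are section headers and are skipped because they do not match the assignment pattern.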

 

 

3.1. PROGRAM_PATH.config

## DEFAULT PATH ##

1.     $TGFAM_SCRIPTS_PATH="";  ## Directory path containing annotation scripts

2.     $RUNNING_PATH="";  ## Location of directory running TGFam-Finder

## PROGRAM PATH ##

1.     $CLUSTALW_PATH = "";  ## Directory containing the ClustalW executable

2.     $HMMER_BIN_PATH = "";  ## Directory containing the HMMER executables

3.     $BLAST_BIN_PATH = "";  ## Directory containing the BLAST executables

4.     $IPRSCAN_PATH = "";  ## Directory containing the InterProScan executable

5.     $BOWTIE_PATH = "";  ## Directory containing the Bowtie2 executable

6.     $TOPHAT_PATH = "";  ## Directory containing the TopHat executable

7.     $EXONERATE_BIN_PATH = "";  ## Directory containing the exonerate executable

8.     $AUGUSTUS_PATH = "";  ## Augustus directory that contains the ‘bin’ directory

9.     $AUGUSTUS_BIN_PATH = "";  ## Directory containing the augustus executable

10.  $AUGUSTUS_CONFIG_PATH = "";  ## Location of the augustus config directory

11.  $SCIPIO_PATH = "";  ## Directory containing the scipio executable

12.  $BLAT_PATH = "";  ## Directory containing the blat executable

13.  $CUFFLINKS_PATH = "";  ## Directory containing the cufflinks executable

 

## OUTPUT FOLDER NAME FOR ANALYSIS ##

1.     $AUGUSTUS_ANALYSIS_PATH="";  ## Name of the Augustus annotation output directory that will be created in RUNNING_PATH (e.g., Augustus)

2.     $ISGAP_ANALYSIS_PATH="";  ## ISGAP output directory name

3.     $MERGING_ANALYSIS_PATH="";  ## Output directory name for merging initial gene models

4.     $PROTEIN_MAPPING_ANALYSIS_PATH="";  ## Protein mapping output directory name

5.     $RNASEQ_AUGUSTUS_ANALYSIS_PATH="";  ## Output directory name of augustus annotation for assembled transcripts

6.     $PM_AUGUSTUS_ANALYSIS_PATH="";  ## Output directory name for comparison between protein mapping and augustus

7.     $ISGAP_AUGUSTUS_ANALYSIS_PATH="";  ## Output directory name for comparison of results between ISGAP and augustus annotations

8.     $FINAL_ANALYSIS_PATH="";  ## Final result output directory name
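The distinction between program paths (absolute) and output entries (directory names) can be checked with a short script. This is a sketch under stated assumptions, not TGFam-Finder code: it takes the settings as a plain dictionary, treats every name ending in `_ANALYSIS_PATH` as an output directory name, and assumes all output directories are created inside RUNNING_PATH, which the listing states explicitly only for the Augustus entry. The function names are hypothetical.

```python
import posixpath

# Output folder entries in the listing above all end with this suffix.
OUTPUT_SUFFIX = "_ANALYSIS_PATH"


def check_program_paths(settings):
    """Return a list of problems found in PROGRAM_PATH.config-style settings."""
    problems = []
    for name, value in sorted(settings.items()):
        if value == "":
            problems.append(f"{name} is empty")
        elif name.endswith(OUTPUT_SUFFIX):
            # Output entries are directory names, so no absolute-path check.
            continue
        elif not posixpath.isabs(value):
            problems.append(f"{name} is not an absolute path")
    return problems


def output_directories(settings):
    """Join each output directory name onto RUNNING_PATH.

    Assumption: every output directory is created inside RUNNING_PATH.
    """
    running = settings["RUNNING_PATH"]
    return {
        name: posixpath.join(running, value)
        for name, value in settings.items()
        if name.endswith(OUTPUT_SUFFIX) and value
    }
```

For example, with RUNNING_PATH set to a hypothetical `/home/user/run` and AUGUSTUS_ANALYSIS_PATH set to `Augustus`, `output_directories` maps the Augustus entry to `/home/user/run/Augustus`.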




 


3.2. RESOURCE.config

## REQUIRED INFORMATION ##

1.     $TARGET_GENOME = "";  ## Full path of the assembled genome (e.g., path/filename), fasta format

2.     $PROTEINS_FOR_DOMAIN_IDENTIFICATION = "";  ## Full path of peptide sequences of target or allied species, fasta format

3.     $TSV_FOR_DOMAIN_IDENTIFICATION = "";  ## Full path of InterPro result of the peptide sequences, tsv format

4.     $RESOURCE_PROTEIN = "";  ## Full path of target peptide sequences from multiple species, prepared by the user as resources for each annotation step, fasta format

5.     $BLAST_DB_NAME = "";  ## Full path of BLAST DB that will be automatically created

6.     $OUTPUT_PREFIX = "";  ## Output prefix for generated results

7.     $TARGET_DOMAIN_ID = "";  ## Target domain ID as saved in the fifth column of the tsv file, such as PF00000, SSF00000, or SM00000 (not the IPR number)

8.     $TARGET_DOMAIN_NAME = "";  ## Target domain name(s)

9.     $REPRESENTATIVE_DOMAIN_NAME = "";  ## Target gene-family name

10.  $EXTENSION_LENGTH = "";  ## Extension length for flanking regions of target domain

11.  $MAX_INTRON_LENGTH = "";  ## Max intron length

12.  $THREADS = "";  ## Number of threads

## OPTIONAL INFORMATION ##

1.    $CDS_OF_TARGET_GENOME = "";  ## Full path of coding DNA sequences of the target genome, fasta format

2.    $GFF3_OF_TARGET_GENOME = "";  ## Full path of GFF3 of the target genome, gff3

3.    $RNASEQ_FORWARD_PATH = "";  ## Full path of RNA-seq (forward), fastq format

4.    $RNASEQ_REVERSE_PATH = "";  ## Full path of RNA-seq (reverse), fastq format

5.    $EXCLUDED_DOMAIN_ID = "";  ## Domain ID to be excluded

6.    $HMM_MATRIX_NAME = "";  ## Full path of the HMM matrix file, HMM format

7.    $HMM_CUTOFF = "";  ## HMM cutoff value (default is 1e-3)
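The rules stated in the RESOURCE.config comments can be turned into a quick pre-run check. This is an illustrative sketch only: the function names are hypothetical, the settings are assumed to be available as a dictionary of strings, and treating EXTENSION_LENGTH, MAX_INTRON_LENGTH, and THREADS as whole numbers is an assumption rather than something the listing states.

```python
import re

# Names taken from the REQUIRED INFORMATION list above.
REQUIRED = [
    "TARGET_GENOME",
    "PROTEINS_FOR_DOMAIN_IDENTIFICATION",
    "TSV_FOR_DOMAIN_IDENTIFICATION",
    "RESOURCE_PROTEIN",
    "BLAST_DB_NAME",
    "OUTPUT_PREFIX",
    "TARGET_DOMAIN_ID",
    "TARGET_DOMAIN_NAME",
    "REPRESENTATIVE_DOMAIN_NAME",
    "EXTENSION_LENGTH",
    "MAX_INTRON_LENGTH",
    "THREADS",
]
# Assumption: these three settings are whole numbers.
INTEGER_FIELDS = ["EXTENSION_LENGTH", "MAX_INTRON_LENGTH", "THREADS"]
DEFAULT_HMM_CUTOFF = "1e-3"  # default stated in the OPTIONAL INFORMATION list


def check_resources(settings):
    """Return a list of problems found in RESOURCE.config-style settings."""
    problems = []
    for name in REQUIRED:
        if not settings.get(name):
            problems.append(f"{name} is required but empty")
    # The target domain ID must come from the tsv file, not be an IPR number.
    if re.search(r"\bIPR\d+", settings.get("TARGET_DOMAIN_ID", "")):
        problems.append("TARGET_DOMAIN_ID must not be an IPR number")
    for name in INTEGER_FIELDS:
        value = settings.get(name, "")
        if value and not value.isdigit():
            problems.append(f"{name} should be a whole number")
    return problems


def effective_hmm_cutoff(settings):
    """Return HMM_CUTOFF as a float, using the default when it is empty."""
    return float(settings.get("HMM_CUTOFF") or DEFAULT_HMM_CUTOFF)
```

An empty result from `check_resources` means only that these basic checks passed; it does not confirm that the referenced files exist or are in the expected format.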




Note. If you have problems preparing the config files, see sections 6.2 and 6.3.

Copyright © TGFam-Finder. All rights reserved.