TiCo [help page]

Input and output formats

A description of the input and output formats with examples can be found here.

Configuration parameters

The following parameters control which alternative start sites are considered for a putative gene. The parameter sigma controls the smoothing of the positional probabilities of the Markov models.


Search

Range to be searched around putative gene starts for alternative start sites. The search range defines the maximum distance from a predicted translation initiation site (TIS), as derived from the input file. Within this range, all potential start sites are considered as candidate TIS. A potential start site is a start codon that shares the reading frame of the respective gene, with no in-frame stop codon between the start codon and the annotated stop.
Initially, the predicted TIS is labeled as strong TIS and the alternative start sites are labeled as weak TIS. During the iterative classification, the label strong is assigned to the candidate start with the highest PWM score.

up
Specifies the maximal distance to a given start position for upstream (5') alternative starts.
Default: 250 nucleotides
Minimum: 50 nucleotides
Maximum: 250 nucleotides
down
Specifies the maximal distance to a given start position for downstream (3') alternative starts.
Default: 250 nucleotides
Minimum: 50 nucleotides
Maximum: 250 nucleotides

View an Illustration of the search range.
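
The definition of a candidate TIS given above can be sketched in a few lines of Python. This is a minimal illustration, not the TiCo implementation: the function name candidate_starts, the 0-based coordinates, and the restriction to the forward strand are assumptions made for the example. The default start and stop codons are those listed further down this page.

```python
STARTS = {"ATG", "GTG", "TTG"}   # default start codons from this page
STOPS = {"TAA", "TAG", "TGA"}    # default stop codons from this page


def candidate_starts(seq, tis, stop, up=250, down=250):
    """Return 0-based positions of candidate TIS on the forward strand.

    seq  : genomic sequence (upper-case string)
    tis  : position of the first nucleotide of the predicted start codon
    stop : position of the first nucleotide of the annotated stop codon
    Hypothetical helper; coordinates and strand handling are assumptions.
    """
    if (stop - tis) % 3 != 0:
        raise ValueError("predicted TIS and stop codon are not in frame")
    lowest = max(0, tis - up)    # upstream limit of the search range
    highest = tis + down         # downstream limit of the search range
    found = []
    pos = stop - 3               # last codon before the annotated stop
    # Step upstream codon by codon, so every position is in frame.
    while pos >= lowest:
        codon = seq[pos:pos + 3]
        if codon in STOPS:
            # Any start further upstream would have an in-frame stop
            # between itself and the annotated stop.
            break
        if pos <= highest and codon in STARTS:
            found.append(pos)
        pos -= 3
    return sorted(found)


if __name__ == "__main__":
    # Codons: TAA ATG AAA GTG CCC ATG AAA TTG CCC TAA
    demo = "TAAATGAAAGTGCCCATGAAATTGCCCTAA"
    print(candidate_starts(demo, tis=15, stop=27))
```

In the demo sequence the in-frame stop codon at position 0 ends the upstream scan, so only the starts at positions 3, 9, 15 and 21 are reported. Smaller values for up and down narrow this list accordingly.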


Table of contents.


Extract

Range to be extracted around each candidate start site. The resulting sequence window is used for the unsupervised learning. It is assumed to contain the characteristics of the respective start site, e.g. the ribosome binding site.

up
Specifies the number of nucleotides to be extracted upstream (5') of a given start position.
Default: 30 nucleotides
Minimum: 10 nucleotides
Maximum: 100 nucleotides
down
Specifies the number of nucleotides to be extracted downstream (3') of a given start position.
Default: 30 nucleotides
Minimum: 10 nucleotides
Maximum: 100 nucleotides

View an Illustration of the extract range.
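
The extraction of a sequence window can be illustrated as follows. This is a sketch under stated assumptions: the function name extract_window is hypothetical, positions are 0-based, and the downstream count is assumed to begin at the first nucleotide of the start codon. This page does not specify whether the start codon itself is included, nor how windows that run past the sequence ends are treated; here they are simply skipped.

```python
def extract_window(seq, start, up=30, down=30):
    """Return the sequence window around a candidate start position.

    start : 0-based position of the first nucleotide of the start codon
    up    : nucleotides taken upstream (5') of the start position
    down  : nucleotides taken from the start position onwards (3');
            assumption: the start codon is part of the downstream count
    Returns None if the window would extend beyond the sequence.
    """
    for name, value in (("up", up), ("down", down)):
        # Limits as documented on this page: 10 to 100 nucleotides.
        if not 10 <= value <= 100:
            raise ValueError(f"{name} must be between 10 and 100")
    begin = start - up
    end = start + down
    if begin < 0 or end > len(seq):
        return None
    return seq[begin:end]


if __name__ == "__main__":
    demo = "A" * 12 + "ATG" + "C" * 12
    print(extract_window(demo, start=12, up=10, down=10))
```

With these conventions every extracted window has exactly up + down nucleotides, which keeps the positions of all windows aligned relative to the candidate start.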


Sigma

The standard deviation sigma of the Gaussian density specifies how strongly the positional probabilities of the second-order Markov models are smoothed. A high value for sigma means the positional probabilities are highly smoothed.
The parameter does not imply any assumptions about trinucleotide positions in the sequence; it adapts the estimation to the number of genes under consideration. The default value of 0.5 works well with approximately 4000 genes. For a smaller set of genes, it may be useful to choose a higher value for sigma to prevent vanishing probabilities.

Range: 0.1 - 2.0
Default: 0.5
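
The effect of sigma can be illustrated with a generic Gaussian kernel smoother applied along the position axis. This is only a sketch: it is not the estimator used by TiCo, whose details are not given on this page. The function name gaussian_smooth and the per-position renormalization of the kernel weights are assumptions chosen for the example.

```python
import math


def gaussian_smooth(counts, sigma=0.5):
    """Smooth a list of positional counts with a Gaussian kernel.

    Each output value is a weighted average of all input values, with
    weights given by a Gaussian density centered at that position.
    Hypothetical illustration, not the TiCo estimator.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    n = len(counts)
    smoothed = []
    for i in range(n):
        weights = [math.exp(-0.5 * ((i - j) / sigma) ** 2) for j in range(n)]
        total = sum(weights)  # renormalize, which also handles the edges
        smoothed.append(sum(w * c for w, c in zip(weights, counts)) / total)
    return smoothed


if __name__ == "__main__":
    observed = [0, 0, 10, 0, 0]  # one trinucleotide seen at one position only
    for sigma in (0.1, 0.5, 2.0):
        print(sigma, [round(v, 3) for v in gaussian_smooth(observed, sigma)])
```

With a small sigma the single observed peak is left almost untouched and the neighboring positions stay close to zero. With a larger sigma, mass is spread to neighboring positions, which is how a higher value can counteract vanishing probabilities when few genes are available.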


Automated Sigma Optimization

Since TiCo release 2.0, the smoothing parameter sigma can be optimized by an automated routine using the ROC (Receiver Operating Characteristic) score. A detailed description will be given in a forthcoming publication.

Value: checked or unchecked
Default: checked


Minimum gene length

The minimum length of a gene after reannotation of the TIS (in bp). If selecting a candidate TIS would make the gene shorter than the minimum length, the candidate is omitted from the list of candidates.

Default: 60 nucleotides
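
The length filter can be sketched as follows. The function name filter_by_min_length is hypothetical, and the gene length is assumed here to be measured from the first nucleotide of the candidate start codon up to, but not including, the stop codon. This page does not state whether the stop codon is counted.

```python
def filter_by_min_length(candidates, stop, min_length=60):
    """Keep only candidate starts that yield a gene of sufficient length.

    candidates : 0-based positions of candidate start codons
    stop       : 0-based position of the first nucleotide of the stop codon
    Assumption: gene length = stop - start (stop codon not counted).
    """
    return [pos for pos in candidates if stop - pos >= min_length]


if __name__ == "__main__":
    # Resulting gene lengths: 120, 90 and 30 nucleotides.
    print(filter_by_min_length([0, 30, 90], stop=120))
```

With the default of 60 nucleotides, the candidate at position 90 is dropped because it would leave a gene of only 30 nucleotides.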


Starts

The start codons to be considered as alternative start sites within the search window.

Default: ATG GTG TTG
(Cannot be altered in the current version of the web interface)


Stops

The codons to be treated as stop codons.

Default: TAA TAG TGA
(Cannot be altered in the current version of the web interface)

