Gibbs sampling continued

The likelihood of sequence ${\bf S}$, at indentation $d_S$, conditioned on the rest of the alignment $\{d_{S' \neq S}\}$, is given by the product of the column likelihoods:

$\begin{displaymath}\mbox{Pr}[{\bf S} \vert {\bf d}] = \prod_c p_{{\bf S}_{c-d_S}}^{(c)} \end{displaymath}$

where ${\bf S}_i$ is the residue at position $i$ of sequence ${\bf S}$ and $p_r^{(c)}$ is the likelihood of residue $r$ in alignment column $c$.
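
As a rough illustration of the product above, the following Python sketch evaluates the likelihood in log space and then draws a new indentation for one sequence. The names `log_likelihood`, `sample_indentation` and `toy_profiles` are hypothetical, the column likelihoods are assumed to be stored as one dictionary per alignment column, and the draw assumes a flat prior over indentations; none of these choices come from the slides.

```python
import math
import random

def log_likelihood(seq, d, profiles):
    """Return log Pr[S | d]: the sum over columns c of log p^(c)[S_{c-d}]."""
    if d < 0 or d + len(seq) > len(profiles):
        raise ValueError("sequence does not fit inside the alignment at this indentation")
    total = 0.0
    for i, residue in enumerate(seq):
        c = i + d  # alignment column holding residue S_i, i.e. i = c - d
        total += math.log(profiles[c][residue])
    return total

def sample_indentation(seq, profiles, rng=random):
    """Draw d with probability proportional to Pr[S | d].

    Assumption: a flat prior over all indentations at which the
    sequence fits entirely inside the alignment.
    """
    offsets = list(range(len(profiles) - len(seq) + 1))
    logs = [log_likelihood(seq, d, profiles) for d in offsets]
    peak = max(logs)  # subtract the maximum before exponentiating, for stability
    weights = [math.exp(v - peak) for v in logs]
    return rng.choices(offsets, weights=weights, k=1)[0]

# Toy column likelihoods (hypothetical values, strictly positive so log is defined).
toy_profiles = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]

if __name__ == "__main__":
    for d in range(3):
        print(d, math.exp(log_likelihood("CG", d, toy_profiles)))
    print("sampled indentation:", sample_indentation("CG", toy_profiles))
```

Working in log space avoids underflow when the alignment has many columns; in a full sampler the column likelihoods would be re-estimated from the other sequences before each draw.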

Extensions:

Competitive models
More sophisticated architectures (Telegraph)
Other kinds of data...

The traditional procedure is to (i) cluster microarray data, then (ii) sample upstream sequences to find promoter motifs (see e.g. Tavazoie et al., Nature Genetics 22, 1999).

However, we now have all the ammo necessary to make joint, competitive models of sequence and array data:

$\begin{displaymath}\mbox{Pr}[{\bf S}; {\bf x} \vert {\bf d}; {\bf C}; {\bf c}] = \mbox{Pr}[{\bf S} \vert {\bf d}] \cdot \mbox{Pr}[{\bf x} \vert {\bf C}; {\bf c}] \end{displaymath}$
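
Taking logarithms and substituting the column-likelihood product for $\mbox{Pr}[{\bf S} \vert {\bf d}]$ turns the joint likelihood into a sum. This only restates the two formulas of this section and assumes nothing about the form of the array term $\mbox{Pr}[{\bf x} \vert {\bf C}; {\bf c}]$:

$\begin{displaymath}\log \mbox{Pr}[{\bf S}; {\bf x} \vert {\bf d}; {\bf C}; {\bf c}] = \sum_c \log p_{{\bf S}_{c-d_S}}^{(c)} + \log \mbox{Pr}[{\bf x} \vert {\bf C}; {\bf c}] \end{displaymath}$

In this form the sequence and array terms contribute additively, so either one can be evaluated and updated separately during sampling.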

2000-04-26