Maximum-likelihood predictions


Overview

Given the class-membership probabilities for each residue of the protein, a straightforward greedy algorithm for prediction would simply output the highest-probability structure class, independently at each residue. A shortcoming of this natural greedy approach is that it does not necessarily output a physically valid secondary-structure prediction: runs of consecutive residues that share the same structure class are known to have physical minimum lengths, and a greedy prediction may violate these minimum run lengths.
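The greedy rule and its failure mode can be illustrated with a minimal sketch. The three-class alphabet (H, E, C), the probability values, and the minimum run lengths below are assumptions chosen for illustration and are not taken from the text.

```python
from itertools import groupby

CLASSES = "HEC"  # assumed alphabet: helix, strand, coil


def greedy_predict(probs):
    """Pick the highest-probability class independently at each residue."""
    return "".join(
        CLASSES[max(range(len(p)), key=p.__getitem__)] for p in probs
    )


def run_lengths(pred):
    """Collapse a prediction string into (class, run length) pairs."""
    return [(c, len(list(g))) for c, g in groupby(pred)]


def violations(pred, min_len):
    """Return the runs that are shorter than their class minimum."""
    return [(c, n) for c, n in run_lengths(pred) if n < min_len[c]]


# Hypothetical per-residue probabilities in the order H, E, C.
probs = [
    [0.1, 0.2, 0.7],
    [0.6, 0.1, 0.3],
    [0.2, 0.1, 0.7],
    [0.1, 0.1, 0.8],
    [0.1, 0.7, 0.2],
    [0.1, 0.6, 0.3],
]
min_len = {"H": 3, "E": 2, "C": 1}  # illustrative values only

pred = greedy_predict(probs)       # "CHCCEE"
bad = violations(pred, min_len)    # [("H", 1)]: a one-residue helix
```

In this example the second residue is labeled H on its own, which produces a helix run of length one and therefore an invalid prediction under the assumed minimum of three.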


Dynamic programming

We developed an efficient dynamic programming algorithm that is guaranteed to output a physically valid prediction satisfying the minimum run-length constraints. Subject to these constraints, it optimizes the structure-class probabilities at the residues, together with other features of the prediction such as the rates of transitions between runs and the lengths of runs.
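To make the idea concrete, the following is a simplified sketch of a dynamic program of this kind, not the algorithm described above. It assumes the objective is only the sum of per-residue log-probabilities and omits the transition-rate and run-length terms. The function name `constrained_predict`, its parameters, and the probability floor are hypothetical.

```python
import math


def constrained_predict(probs, classes, min_len):
    """Maximize the summed log-probability subject to minimum run lengths.

    probs[i][c] is the probability of class c at residue i;
    min_len maps each class symbol to its minimum run length (>= 1).
    """
    n, k = len(probs), len(classes)
    NEG = float("-inf")
    # prefix[c][i] = sum of log-probabilities of class c over residues 0..i-1.
    # Probabilities are floored (an assumption) to keep the sums finite.
    prefix = [[0.0] * (n + 1) for _ in range(k)]
    for c in range(k):
        for i in range(n):
            prefix[c][i + 1] = prefix[c][i] + math.log(max(probs[i][c], 1e-12))

    # best[i][c]: best score for residues 0..i-1 ending in a valid run of c.
    best = [[NEG] * k for _ in range(n + 1)]
    back = [[None] * k for _ in range(n + 1)]
    for i in range(1, n + 1):
        for c in range(k):
            m = min_len[classes[c]]
            # Option 1: extend an already valid run of class c by one residue.
            if best[i - 1][c] > NEG:
                best[i][c] = best[i - 1][c] + prefix[c][i] - prefix[c][i - 1]
                back[i][c] = (i - 1, c)
            # Option 2: start a new run of exactly m residues ending here.
            j = i - m
            if j < 0:
                continue
            run = prefix[c][i] - prefix[c][j]
            if j == 0:
                cand, prev = run, (0, None)
            else:
                cand, prev = NEG, None
                for d in range(k):
                    if d != c and best[j][d] > NEG and best[j][d] + run > cand:
                        cand, prev = best[j][d] + run, (j, d)
            if cand > best[i][c]:
                best[i][c], back[i][c] = cand, prev

    end = max(range(k), key=lambda c: best[n][c])
    if best[n][end] == NEG:
        raise ValueError("no labeling satisfies the minimum run lengths")

    # Trace back pointers to recover the labeling.
    labels = [None] * n
    i, c = n, end
    while i > 0:
        prev_i, prev_c = back[i][c]
        for t in range(prev_i, i):
            labels[t] = classes[c]
        i, c = prev_i, prev_c
    return "".join(labels)
```

With prefix sums, each state is filled in time proportional to the number of classes, so the sketch runs in O(n k^2) time for n residues and k classes. Splitting the transitions into "extend a valid run" and "open a run of exactly the minimum length" is what lets every run length at or above the minimum be reached without enumerating run lengths explicitly.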