i-Wsearch (Random Forests)-based AGO-binding sites classifier

Example: TNRC6A, SPT5, TAS3, SPT6

The method was trained on 6,779 W-motifs from 195 eukaryotic AGO-binding proteins, together with non-AGO-binding motifs drawn from ten series of 6,000 randomly selected eukaryotic proteins.

The classifier uses the 10 amino-acid residues flanking each side of the Trp (W), 20 flanking positions in total. Each flanking residue in the Trp neighbor profile is characterized by three descriptors: flexibility [Vihinen et al., 1994], hydrophilicity [Hopp & Woods, 1981] and volume [Zamyatnin, 1972]. In addition, the classifier encodes two distances, counted in amino acids, from the motif to the nearest Trp residues on its N- and C-terminal sides.
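The encoding described above can be sketched in Python as follows. This is a minimal illustration, not the i-Wsearch implementation: the function name `encode_w_motif`, the zero padding at sequence ends, the value -1 for "no other Trp on this side", and measuring distances from the central Trp are all assumptions. The descriptor table `toy_table` holds placeholder numbers only; real use would substitute the published flexibility, hydrophilicity and volume scales.

```python
from typing import Dict, List, Sequence

FLANK = 10  # residues taken on each side of the central Trp


def encode_w_motif(seq: str, w_index: int,
                   descriptors: Sequence[Dict[str, float]],
                   flank: int = FLANK, pad: float = 0.0) -> List[float]:
    """Encode one W-motif as flank*2*len(descriptors) + 2 numbers."""
    if seq[w_index] != "W":
        raise ValueError("w_index must point at a Trp (W) residue")

    features: List[float] = []
    # Flanking offsets -flank..-1 and +1..+flank; the central W is skipped.
    offsets = list(range(-flank, 0)) + list(range(1, flank + 1))
    for offset in offsets:
        i = w_index + offset
        inside = 0 <= i < len(seq)
        for table in descriptors:
            # Assumption: positions past the sequence ends, and residues
            # missing from a table, are encoded with the pad value.
            features.append(table.get(seq[i], pad) if inside else pad)

    # Distances (in residues) to the nearest other Trp on each side.
    # Assumption: measured from the central W; -1 means "none found".
    left = seq.rfind("W", 0, w_index)
    right = seq.find("W", w_index + 1)
    n_dist = w_index - left if left != -1 else -1
    c_dist = right - w_index if right != -1 else -1
    features.extend([float(n_dist), float(c_dist)])
    return features


# Placeholder descriptor table (NOT published values): 1.0 for A, 2.0 for C, ...
toy_table = {aa: float(k + 1) for k, aa in enumerate("ACDEFGHIKLMNPQRSTVWY")}
toy_descriptors = [toy_table, toy_table, toy_table]

example = encode_w_motif("AAWAAAWAA", 2, toy_descriptors)
print(len(example))
```

With 10 residues per side and three descriptors per residue, each motif yields 60 descriptor values plus the two distance values.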

Because each split in a random forest tree tests a single variable, the model provides an automatic measure of feature importance. The table shows the importance of each feature, measured on an artificial classification task, across all 20 flanking positions of the W-motif.
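One way to condense per-feature importances into a per-descriptor summary is sketched below. It assumes a hypothetical feature layout (three descriptors per flanking position, interleaved, followed by the two Trp-distance features) and sums importances across positions. Both the layout and the use of a sum rather than a mean are assumptions, and the numbers in `toy_importances` are made up for illustration.

```python
from typing import Dict, Sequence

DESCRIPTORS = ("flexibility", "hydrophilicity", "volume")


def summarize_importances(importances: Sequence[float],
                          n_positions: int = 20,
                          descriptors: Sequence[str] = DESCRIPTORS
                          ) -> Dict[str, float]:
    """Sum importances per descriptor over all flanking positions."""
    n_desc = len(descriptors)
    expected = n_positions * n_desc + 2  # +2 for the two distance features
    if len(importances) != expected:
        raise ValueError(f"expected {expected} values, got {len(importances)}")

    totals = {name: 0.0 for name in descriptors}
    for pos in range(n_positions):
        for d, name in enumerate(descriptors):
            totals[name] += importances[pos * n_desc + d]

    # The last two entries are the distances to the nearest Trp residues.
    totals["distance_to_N_side_Trp"] = importances[-2]
    totals["distance_to_C_side_Trp"] = importances[-1]
    return totals


# Made-up importances, only to show the shape of the input and output.
toy_importances = [1.0, 2.0, 3.0] * 20 + [4.0, 5.0]
summary = summarize_importances(toy_importances)
for name, value in summary.items():
    print(name, value)
```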

Feature importance