Use of webpage interface for C5.0
-
Upload a positive file and a negative file, then choose the flags
for training.
Both files contain amino acid sequences of equal length
(e.g. sequences like AILYR, of length 5).
Sequences appear in the positive and negative files, one per line.
In addition to the 20 single-letter amino acid codes, you may
use X (uncertain code). The scripts do not yet accept ? in place
of X, though they could be modified to do so.
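
Before uploading, it can help to check a file against the format described above. The following is only a minimal sketch: the helper name read_sequences is hypothetical and not part of the web interface, and it assumes the sequences are written in upper case and that blank lines can be ignored.

```python
# Hypothetical pre-upload check; not part of the C5.0 web interface.
# Assumption: sequences are upper case and blank lines may be skipped.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWY") | {"X"}  # 20 standard codes plus X


def read_sequences(text):
    """Return the sequences in `text`, one per line, after basic checks."""
    seqs = [line.strip() for line in text.splitlines() if line.strip()]
    if not seqs:
        raise ValueError("no sequences found")
    length = len(seqs[0])
    for number, seq in enumerate(seqs, start=1):
        bad = set(seq) - ALLOWED
        if bad:
            raise ValueError(f"sequence {number}: unexpected characters {sorted(bad)}")
        if len(seq) != length:
            raise ValueError(f"sequence {number}: length {len(seq)}, expected {length}")
    return seqs


positive = read_sequences("AILYR\nGXKLM\n")
negative = read_sequences("WWPQT\nCDEFG\n")
# The two files must also agree with each other on sequence length.
assert len(positive[0]) == len(negative[0])
```

A file containing ? or a sequence of a different length makes read_sequences raise ValueError, which is the kind of problem worth catching before training.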
-
When you train on the positive and negative files, you must
NOT select the rules option (-r), though you can boost.
Additionally, you should not select cross validation, since
the decision trees are then not saved.
-
On the training webpage, you can perform cross validation, apply
boosting, and output confusion matrices to see the algorithm's
performance. Read the tutorial for more examples of C5.0 use.
-
To use C5.0 for prediction, use the test webpage. Here you upload
a test file, which is in the same format as the training files;
i.e. a single file consisting of sequences, all of the same length
as the sequences in the training files.
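
The length requirement on the test file can be checked locally before uploading. This is a rough sketch under stated assumptions: the function name mismatched_lengths is hypothetical, and the training sequence length is supplied by hand rather than read from the interface.

```python
# Hypothetical helper; not part of the C5.0 web interface.
def mismatched_lengths(training_length, test_text):
    """Return (line_number, sequence) pairs whose length differs from training."""
    problems = []
    for number, line in enumerate(test_text.splitlines(), start=1):
        seq = line.strip()
        if seq and len(seq) != training_length:
            problems.append((number, seq))
    return problems


# Training sequences such as AILYR have length 5.
problems = mismatched_lengths(5, "AILYR\nGXKL\nWWPQT\n")
for number, seq in problems:
    print(f"line {number}: {seq!r} has length {len(seq)}, expected 5")
```

An empty result means every sequence in the test file has the same length as the training sequences.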