NemaGene Clustering Method

ESTs are inherently redundant, so datasets must be clustered by transcript and gene of origin. Most EST-clustering programs cannot 1) use raw sequence trace data with its associated probability data, 2) let users view and hand-edit clusters to eliminate mis-assemblies, or 3) represent splice isoforms. We therefore developed NemaGene, which is built upon the existing Phred/Phrap/Consed programs. An initial version of this method yielded 1,860 clusters from 5,713 M. incognita ESTs.

Refinements in NemaGene v2.0 eliminate chimeric ESTs, accommodate multiple splice isoforms, and maintain cluster names when new ESTs are added. For each cluster with multiple members, the consensus sequence is longer and of higher quality than any individual EST read.
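The text above does not describe how cluster names are kept stable, so the following is only an illustrative sketch of one possible approach, not NemaGene's actual implementation. It assumes each cluster is a set of EST identifiers, that names have the hypothetical form of a prefix plus a number (for example "CL00001"), and that a re-clustered set inherits the old name with which it shares the most ESTs. The function name assign_stable_names and all identifiers are invented for this example.

```python
def assign_stable_names(old_clusters, new_clusters, prefix="CL"):
    """Name re-clustered EST sets, reusing old names where members overlap.

    old_clusters: dict mapping name -> set of EST ids (previous build)
    new_clusters: list of sets of EST ids (current build)
    Assumes every old name is `prefix` followed by digits (hypothetical scheme).
    """
    # Collect every (overlap size, new index, old name) candidate pairing.
    candidates = []
    for idx, members in enumerate(new_clusters):
        for name, old_members in old_clusters.items():
            shared = len(members & old_members)
            if shared:
                candidates.append((shared, idx, name))

    # Hand out old names greedily, largest overlap first. Ties are broken
    # by new-cluster index and then by name, so the result is deterministic.
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))
    named = {}    # new index -> assigned name
    used = set()  # old names already claimed
    for _shared, idx, name in candidates:
        if idx not in named and name not in used:
            named[idx] = name
            used.add(name)

    # Clusters without a reusable name get fresh numbers that continue
    # after the highest number present in the previous build.
    next_num = 1 + max((int(n[len(prefix):]) for n in old_clusters), default=0)
    result = {}
    for idx, members in enumerate(new_clusters):
        if idx not in named:
            named[idx] = f"{prefix}{next_num:05d}"
            next_num += 1
        result[named[idx]] = set(members)
    return result


if __name__ == "__main__":
    old = {"CL00001": {"est1", "est2"}, "CL00002": {"est3"}}
    new = [{"est1", "est2", "est4"}, {"est3"}, {"est5"}]
    for name, members in sorted(assign_stable_names(old, new).items()):
        print(name, sorted(members))
```

In this sketch a cluster that gains an EST keeps its name, a cluster made only of new ESTs receives the next unused number, and when an old cluster splits, the piece sharing the most ESTs with it keeps the old name.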




