ESTs are inherently redundant, so datasets must be clustered by transcript and gene of origin. Most EST-clustering programs cannot 1) use raw sequence trace data with its associated probability data, 2) let users view and hand-edit clusters to eliminate mis-assemblies, or 3) represent splice isoforms. We therefore developed NemaGene, which is built upon the existing Phred/Phrap/Consed software. An initial application of this method yielded 1,860 clusters from 5,713 M. incognita ESTs.
Refinements in NemaGene v2.0 eliminate chimeric ESTs, accommodate multiple splice isoforms, and maintain cluster names when new ESTs are added. For each cluster with multiple members, the consensus sequence is longer and of higher quality than any of its stand-alone EST reads.
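The "probability data" attached to trace-derived base calls is conventionally expressed as Phred quality scores, where Q = -10 log10(P) and P is the estimated probability that a base call is wrong. The following is a minimal sketch of that conversion, together with the average redundancy implied by the initial clustering figures quoted above. The function names are illustrative only and are not part of NemaGene, Phred, Phrap, or Consed.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score Q to an error probability P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an error probability P to a Phred quality score Q = -10*log10(P)."""
    return -10 * math.log10(p)


# Figures quoted in the text for the initial M. incognita clustering.
TOTAL_ESTS = 5713
TOTAL_CLUSTERS = 1860

# Average number of ESTs per cluster, a rough measure of redundancy.
mean_ests_per_cluster = TOTAL_ESTS / TOTAL_CLUSTERS

if __name__ == "__main__":
    for q in (10, 20, 30, 40):
        print(f"Q{q}: error probability {phred_to_error_prob(q):.4f}")
    print(f"Mean ESTs per cluster: {mean_ests_per_cluster:.2f}")
```

A base with Q20 has a 1 in 100 chance of being miscalled, and Q30 a 1 in 1,000 chance, which is why clustering tools that discard this information lose a useful signal when resolving disagreements between overlapping reads.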