In this paper, we describe the context sensitivity problem encountered in partitioning a heterogeneous biological sequence into statistically homogeneous segments. We first show signatures of the problem in the bacterial genomes of Escherichia coli K-12 MG1655 and Pseudomonas syringae DC3000 when these are segmented using two entropic segmentation schemes, and then clarify the contextual origins of these signatures through mean-field analyses of the segmentation schemes. Finally, we explain why we believe all sequence segmentation schemes are plagued by the context sensitivity problem.
Cite
@article{arxiv.0904.2668,
title = {The Context Sensitivity Problem in Biological Sequence Segmentation},
author = {Siew-Ann Cheong and Paul Stodghill and David J. Schneider and Samuel W. Cartinhour and Christopher R. Myers},
journal = {arXiv preprint arXiv:0904.2668},
year = {2009}
}