Abstract:
The operon structure of a prokaryotic genome is a critical input for reconstructing regulatory networks at the whole-genome level. Because experimental detection of operons is difficult and time-consuming, there is growing effort to develop computational methods that predict operons from available biological information. Here, a genetic algorithm is developed that evolves a starting population of putative operon maps of the genome into progressively better predictions. Fuzzy scoring functions based on multiple criteria assess the 'fitness' of the newly evolved operon maps and guide their evolution. The algorithm organizes the whole genome into operons. The fuzzy guided genetic algorithm-based approach makes it possible to use diverse biological information, such as genome sequence data, functional annotations and conservation across multiple genomes, to guide the organization process. The approach does not require any prior training on experimentally determined operons. Predictions for Escherichia coli K12 and Bacillus subtilis are evaluated against experimentally discovered operons for these organisms using ROC (receiver operating characteristic) analysis. The area under the ROC curve is around 0.9, which indicates excellent accuracy.
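
The general idea of evolving operon maps under a fuzzy fitness function can be illustrated with a minimal sketch. This is not the authors' implementation: the encoding (one boolean "link" per adjacent gene pair), the trapezoid-style membership function, the use of min as a fuzzy AND, the toy genome in GENES, and all parameter values are assumptions made purely for illustration.

```python
import random

# Hypothetical toy genome: (strand, distance in bp to the next gene,
# functional similarity to the next gene in [0, 1]). Values are invented.
GENES = [
    ("+", 12, 0.9), ("+", 5, 0.8), ("+", 310, 0.1), ("+", 20, 0.7),
    ("-", 150, 0.2), ("-", 8, 0.9), ("-", 40, 0.6), ("-", 0, 0.0),
]

def mu_close(distance, lo=50.0, hi=250.0):
    """Fuzzy membership in 'short intergenic distance' (assumed shape)."""
    if distance <= lo:
        return 1.0
    if distance >= hi:
        return 0.0
    return (hi - distance) / (hi - lo)

def pair_score(genes, i):
    """Fuzzy degree to which genes i and i+1 share an operon."""
    strand, dist, sim = genes[i]
    if strand != genes[i + 1][0]:
        return 0.0  # genes in one operon lie on the same strand
    return min(mu_close(dist), sim)  # fuzzy AND of the two criteria

def fitness(genes, links):
    """Mean agreement between a map's links and the fuzzy pair scores."""
    total = 0.0
    for i, linked in enumerate(links):
        s = pair_score(genes, i)
        total += s if linked else 1.0 - s
    return total / len(links)

def to_operons(links):
    """Turn link flags into operons, each a list of gene indices."""
    operons, current = [], [0]
    for i, linked in enumerate(links):
        if linked:
            current.append(i + 1)
        else:
            operons.append(current)
            current = [i + 1]
    operons.append(current)
    return operons

def evolve(genes, pop_size=30, generations=40, p_mut=0.05, seed=0):
    """Evolve random operon maps; returns (best initial map, best final map)."""
    rng = random.Random(seed)
    n = len(genes) - 1

    def score(m):
        return fitness(genes, m)

    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    initial_best = max(pop, key=score)
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        nxt = [pop[0][:]]  # elitism: carry the best map over unchanged
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 3), key=score)  # tournament selection
            b = max(rng.sample(pop, 3), key=score)
            cut = rng.randrange(1, n)               # one-point crossover
            child = a[:cut] + b[cut:]
            # bit-flip mutation: toggle each link with probability p_mut
            child = [(not g) if rng.random() < p_mut else g for g in child]
            nxt.append(child)
        pop = nxt
    return initial_best, max(pop, key=score)

start, best = evolve(GENES)
print(to_operons(best), round(fitness(GENES, best), 3))
```

Because the best map is carried over unchanged each generation, the best fitness in the population never decreases. A realistic scoring scheme would combine more criteria (for example, conservation across genomes) and would need membership functions chosen for the organism at hand.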