The first feature revealed by an analysis of Figure
2 is the coherent grouping of phages. This grouping agrees with a six-group K-means classification based on phage signatures. The number of groups is greater than the number shown in Figure
2 to take into account the isolated phages. The groups based on signature distance correspond to the known groups of coliphages. For example, all the phages belonging to the lytic T7 super-group (group III) have a relatively homogeneous distance signature. For temperate phages, two groups can be observed. The first group (group I), containing the lambda-like phages, is characterized by a short distance signature, perhaps reflecting a more ancient prophage lifestyle. The second lambdoid group (group II) is very homogeneous and contains phages characterized by their ability to carry genes encoding Shiga-like toxins. Our representation appears to be compatible with the "classification scheme" suggested by Casjens [
39]. The last group (group IV) corresponds to the T4 super-group, which contains phages with genomes ranging from 164 to 180 kb in length. These genomes have an unusually low GC content, which necessitates normalizing the genomic signatures of the host and phages (see Methods). Although these genomes are larger and therefore likely less host dependent, the overall observed distance is smaller than that of the T7 super-group. In the
E. coli phage landscape, several phages remain isolated. Phage ΦEcoM-Gj1 was recently described, and its genome reveals a unique pattern of different origins. It is the first phage with a Myoviridae morphotype but with a T7-like RNA polymerase and a terminase large subunit related to that of phage T1 [
40]. Phage EPS7 has been isolated and its genome recently analyzed [
41]. This phage belongs to the T5 family, so its close genomic signature distance is not surprising. It is tempting to add phage rv5 to this group, although rv5 belongs to the Myoviridae. Moreover, the proximity of the T4 super-group and the putative T5 group is coherent. Analysis of the T5 sequence by Wang et al. (2005) [
42] revealed that RB49, RB69 and T4 are first on the list of the "top 10" homologous phages and genes. Like ΦEcoM-Gj1, phage ΦEco32 has been described as having a genome with a high degree of mosaicism [
43]. The genomic signature distance of phage N4 seems to allow it to be grouped with ΦEco32, but we could find no report of a genetic relationship between them in the literature. Finally, phages Mu and P2 show very close distances, even though Mu integrates as a prophage by a transposition mechanism whereas P2 uses a site-specific mechanism of genome integration. Notably, significant homology between phages Mu and P2 has been observed for the tail fiber-encoding genes [
44]. Phage P4, the satellite phage of P2, is a defective phage that exists as a plasmid and shows a more divergent distance signature. Figure
2 confirms that there is no correlation between morphotypes and groups or subtypes of phages, although several groups appear to be more homogeneous than others. For example, the temperate phage group represented by phage 933W (group II) appears more prone to exchanging modules encoding tail fibers. There is also no significant correlation between genome length and the distance between the host and phage signatures. However, our representation, which combines the distance signatures, genome length and phage characteristics (lifestyle and morphotype), allows us, independently of sequence comparison, to obtain a coherent picture of the "relationship landscape" of the bacteriophages of
E. coli.
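
To make the idea of grouping phages by signature concrete, the following is a minimal sketch rather than the procedure used here. It assumes a dinucleotide frequency signature, a Euclidean distance to the host signature, and a plain Lloyd-style K-means; the actual signature definition and the GC normalization are those given in Methods and are not reproduced. The function names (`signature`, `distance`, `kmeans`) and the toy sequences are hypothetical and purely illustrative; they are not real phage or host genomes.

```python
import random
from itertools import product
from math import sqrt


def signature(seq, k=2):
    """Relative frequency of every DNA k-mer in seq (overlapping windows).

    Assumption: k=2 (dinucleotides); the signature used in the paper may differ.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing N or other symbols
            counts[word] += 1
    total = sum(counts.values()) or 1
    return [counts[w] / total for w in kmers]


def distance(a, b):
    """Euclidean distance between two signatures (an assumed metric)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd-style K-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the input points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: distance(p, centroids[c]))
                  for p in points]
        # Move each non-empty centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Synthetic toy sequences (hypothetical, NOT real genomes).
toy = {
    "host": "ATGCGTACGTTAGCCGATCGATCGGCTAAGCT" * 4,
    "phage_a": "ATGCGTACGTTAGCCGATCGTTCGGCTAAGCT" * 4,
    "phage_b": "ATATATTTAAATATTATAAATTTATATAATAT" * 4,
    "phage_c": "ATATATTTAAATATAATAAATTTATATTATAT" * 4,
}
host_sig = signature(toy["host"])
names = [n for n in toy if n != "host"]
sigs = [signature(toy[n]) for n in names]
dists = {n: distance(s, host_sig) for n, s in zip(names, sigs)}
labels = kmeans(sigs, k=2)  # the text uses six groups on the real data
```

In this toy setting, `dists` gives each phage's distance to the host signature and `labels` gives its K-means group. With real genomes one would pass the full set of phage signatures and `k=6`, and would likely prefer an established implementation over this hand-written loop.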