Introduction

The ends of human chromosomes – telomeres – consist of long repetitive sequences of (TTAGGG)n that are protected and maintained by telomerase and networks of protein complexes 1, 2. Telomere biology is intimately linked to genome stability, aging, and cancer 3, 4. The dynamic interactions between telomerase and its associated proteins, core telomere binding factors, and various factors and modification enzymes are key to telomere homeostasis 5, 6, 7. Six core telomeric proteins (RAP1, TRF1, TRF2, TIN2, TPP1, and POT1) have been shown to be essential for maintaining telomere length and protecting telomere integrity 8, 9, 10. Of these six proteins, TRF1 and TRF2 were originally cloned as the major telomere double-stranded DNA (dsDNA) binding factors 11, 12, 13, 14, and are crucial to maintaining telomere length and end protection 15, 16, 17. In fact, both proteins can homodimerize for telomere DNA binding through their respective myb domains 11. Mammalian POT1, on the other hand, was cloned based on its sequence homology to yeast cdc13 and shown to specifically bind the 3′ single-stranded telomere overhangs 18, 19. Through direct interactions with these telomere-binding proteins, RAP1, TPP1, and TIN2 are also targeted to the telomeres for telomere maintenance and regulation 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33. Unlike its yeast orthologue, mammalian RAP1 lacks the ability to bind telomere DNA despite the presence of a myb domain 20. Instead, the telomeric recruitment and stability of RAP1 are dependent on TRF2 20. RAP1 is also involved in regulation of telomere length 20, 27, 28. Interestingly, TRF2, but not RAP1, was required for the inhibition of non-homologous end joining at the telomeres, as depletion of TRF2 leads to telomere fusions 15, 34. However, RAP1 is essential for the inhibition of homology-directed repair at telomeres 25, 26.

The primary focus for mammalian telomeric proteins has been on their function in telomere maintenance. Increasing evidence, however, suggests a broader role for these proteins outside of the telomeres. For example, human TIN2, TPP1, and POT1 have been shown to localize and interact in the cytoplasm 35. Furthermore, TRF2 has been implicated in regulating the proliferation and differentiation of neural tumor and stem cells, where TRF2 interacts with the repressor element 1-silencing transcription factor (REST) in PML-nuclear bodies and protects REST from proteasomal degradation 36. Notably, while the myb domain of TRF2 recognizes telomeric repeat sequences, its basic domain can bind DNA junctions in a telomere sequence-independent manner 37. In fact, TRF2 appears to play an important role in homologous recombination repair of double-strand breaks at non-telomeric regions 38.

Studies of RAP1 in different organisms have underlined the role of RAP1 in the regulation of gene expression. Budding yeast Rap1 (scRap1) binds directly to telomeric DNA and controls subtelomeric silencing 39, 40. scRap1 also binds to many gene loci and regulates the transcription of genes that encode proteins from diverse pathways, including ribosomal proteins, glycolytic enzymes, and mating-type factors 41, 42, 43. Trypanosoma brucei Rap1 (tpRap1) is a critical transcription suppressor for silencing variant surface glycoprotein genes located within subtelomeric regions 44. Recently, human RAP1 was found to associate with IκB kinases in the cytoplasm and act as a crucial regulator of NF-κB-modulated gene expression 45. Furthermore, by comparing the data from RAP1-deficient and wild-type mouse embryonic fibroblasts, chromatin immunoprecipitation (ChIP)-seq analysis revealed that extra-telomeric RAP1 binding sites were enriched in subtelomeric regions and in genes deregulated as a result of RAP1 deletion 26. However, little was known about the extra-telomeric binding activity of human RAP1.

Here we report our systematic investigation of the binding sites of the telomeric proteins RAP1 and TRF2 along human chromosomes. Using anti-RAP1 and anti-TRF2 antibodies, we performed whole-genome ChIP coupled with high-throughput sequencing in human cells. Our analysis found that both TRF2 and RAP1 occupy a limited number of interstitial regions throughout the human genome and regulate gene expression.

Results

Identification of genome-wide RAP1 and TRF2 binding sites by ChIP-seq

To understand the extra-telomeric function of telomeric proteins, we performed ChIP analysis to identify chromosomal binding sites of telomeric proteins in human cells. First, anti-RAP1 and anti-TRF2 antibodies were tested for their ability to pull down telomere protein-DNA complexes in ChIP experiments using HTC75 cells. As shown in Figure 1A, both antibodies were able to specifically and efficiently immunoprecipitate telomere DNA. We then used these antibodies for whole-genome ChIP-seq experiments, where the immunoprecipitated DNA was recovered and sequenced by Solexa technology (Figure 1B).

Figure 1

Identification of RAP1 and TRF2 binding sites by ChIP-seq. (A) HTC75 cells were crosslinked and immunoprecipitated using anti-RAP1 and TRF2 antibodies. The co-precipitated DNA was analyzed by dot-blotting with the indicated telomere probe. Rabbit IgG was used as negative control. (B) Flowchart for whole-genome ChIP-seq experiments. (C) Enrichment of telomere sequences. The relative abundance of TTAGGG repeats of various lengths in RAP1 and TRF2 ChIP-seq data was compared to the IgG control. (D) A ChIP-seq histogram showing RAP1 and TRF2 interstitial binding sites that were found near gene AC013473.1 on chromosome 2.

Our ChIP-seq experiments yielded > 2.0 × 10⁷ uniquely mapped short reads for RAP1, and > 1.4 × 10⁷ reads for TRF2. Short reads from IgG (> 3.1 × 10⁷) were used as controls for RAP1 and TRF2 peak detection (Table 1). Consistent with telomere targeting of RAP1 and TRF2, both RAP1 and TRF2 data sets are highly enriched for telomeric sequences. For example, DNA fragments that contain three or more TTAGGG repeats could be specifically brought down by anti-RAP1 and TRF2 antibodies, and sequences that contain more than six TTAGGG repeats were > 20-fold more abundant compared to IgG controls (Figure 1C). Although the majority of the precipitated DNA could be mapped to telomeric and subtelomeric regions, a small number of interstitial sites were also discovered (Table 2). For example, the gene locus of AC013473.1 appeared to be occupied by both RAP1 and TRF2, but not IgG control (Figure 1D). There were a total of 78 and 77 interstitial sites for RAP1 and TRF2, respectively, indicating that telomeric proteins can indeed bind to chromosomal regions other than the telomeres in human cells.
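The repeat-enrichment comparison described above can be illustrated with a minimal Python sketch. It assumes that enrichment is the fraction of reads carrying at least n consecutive TTAGGG units (on either strand) in the ChIP library divided by the same fraction in the IgG library; the text does not state the exact normalization, and the function names are hypothetical.

```python
def has_repeats(read, n, unit="TTAGGG", rc_unit="CCCTAA"):
    """True if the read carries >= n consecutive telomeric units on either strand."""
    read = read.upper()
    return unit * n in read or rc_unit * n in read


def fold_enrichment(chip_reads, control_reads, n):
    """Ratio of repeat-containing read fractions (ChIP over control).

    ASSUMPTION: libraries are normalized by their total read counts.
    """
    chip = sum(has_repeats(r, n) for r in chip_reads) / len(chip_reads)
    ctrl = sum(has_repeats(r, n) for r in control_reads) / len(control_reads)
    return chip / ctrl if ctrl else float("inf")


# Toy reads, for illustration only (not data from this study).
chip = ["TTAGGG" * 3, "ACGT" * 5, "CCCTAA" * 4, "ACGTACGT"]
igg = ["TTAGGG" * 3] + ["ACGT"] * 7
ratio = fold_enrichment(chip, igg, n=3)
```

With real data, the two read lists would be the uniquely mapped and unmapped reads of each library, and n would be varied to reproduce a curve such as the one in Figure 1C.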

Table 1 Summary of ChIP-seq short reads
Table 2 Distribution of RAP1 and TRF2 binding sites

The majority of the interstitial sites appear to be located within 5 kb of gene loci (73% for RAP1 and 58% for TRF2) (Figure 2A and 2B). The RAP1 sites are situated in the exon (8%), intron (31%), and UTR (34%) regions of the genes (Figure 2A). In total, there are 63 and 50 genes, respectively, in the vicinity of RAP1 (Table 3) and TRF2 binding sites (Table 4). An example is the CLIC6 gene, where one RAP1 binding site was mapped to its intronic region (∼3 kb downstream of the sixth exon) (Figure 2C). Genes from diverse pathways appear to be targets of RAP1 and TRF2, including membrane proteins such as CLIC6 and PLXNB2, and signaling molecules such as PAK2 and VAV3 (Tables 3 and 4). The enrichment of RAP1 and TRF2 binding sites to these gene loci suggests that RAP1 and TRF2 may be targeted to these genes for regulation.

Figure 2

Human RAP1 and TRF2 are able to bind interstitial sites. Gene location and distribution of RAP1 (A) and TRF2 (B) binding sites. Interstitial binding sites were categorized based on their relative locations to gene loci. UTR is defined as sequences within 5 kb upstream of the first exon or 5 kb downstream of the last exon. Not defined, sites > 5 kb away from annotated genes. (C) RAP1 binding site on CLIC6. Left, RAP1 and IgG ChIP-seq reads were mapped to CLIC6 on chromosome 21. Right, RAP1 binding sites in the intron region of CLIC6. (D) Differential binding of RAP1 in human vs. mouse cells. RAP1 ChIP-seq data from human and mouse were compared.

Table 3 List of RAP1 binding sites (a total of 63 genes, including subtelomeric regions)
Table 4 List of TRF2 binding sites (a total of 50 genes including subtelomeric regions)

A recent RAP1 ChIP-seq analysis using mouse cells reported 30 398 RAP1 binding sites and 8 687 RAP1 target genes 26. Surprisingly, we found no overlap in RAP1 sites between the two data sets (Figure 2D). Of the potential mouse RAP1 targets, 7 521 are estimated to have human orthologues. However, we found only 16 genes to be human RAP1 targets as well. This number is too small to be of statistical significance, and further indicates the lack of overlap between human and mouse RAP1 sites. Taken together, these findings highlight the differences in RAP1 interstitial binding sites between human and mouse.
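To make the overlap argument concrete, the following Python sketch computes the number of shared genes expected by chance and a hypergeometric tail probability. The counts 7 521, 63, and 16 come from the text; the size of the gene universe (20 000) is an assumption introduced only for illustration, and this is not necessarily the test used in the study.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K are marked."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / total


N_GENES = 20_000        # ASSUMPTION: approximate gene universe, not from the paper
K_ORTHOLOGUES = 7_521   # mouse RAP1 targets with human orthologues (from the text)
N_HUMAN = 63            # genes near human RAP1 sites (from the text)
OBSERVED = 16           # genes shared by both sets (from the text)

expected = N_HUMAN * K_ORTHOLOGUES / N_GENES
p_at_least_observed = hypergeom_sf(OBSERVED, N_GENES, K_ORTHOLOGUES, N_HUMAN)
```

Under this assumed universe, the expected chance overlap (about 24 genes) exceeds the 16 observed, which is in line with the conclusion that the shared targets are not significantly enriched.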

Confirmation of the extra-telomeric binding sites for RAP1 and TRF2

Next, we carried out secondary screens to confirm our ChIP-seq results. To rule out the possibility of antibody cross-reactivity, human HTC75 cells expressing FLAG-tagged RAP1 or TRF2 were generated. The ectopically expressed TRF2 and RAP1 proteins retained their ability to target to the telomeres (Supplementary information, Figure S1). Anti-FLAG ChIP experiments were then carried out using extracts from these cells. The precipitated DNA was then purified for quantitative PCR (qPCR) analysis using primer pairs that were derived from the target sites identified in ChIP-seq experiments (Supplementary information, Table S1). Consistent with our whole-genome ChIP-seq data, FLAG-RAP1 and TRF2 were indeed enriched on the extra-telomeric target sites examined here (Figure 3A and 3B). In comparison, we observed no enrichment at the region that was negative in ChIP-seq analysis (negative chr2, Figure 3A and 3B). Similar experiments were carried out for endogenous RAP1 and the results were consistent as well (data not shown). These findings indicate that RAP1 and TRF2 can associate with specific interstitial sites on human chromosomes.

Figure 3

ChIP-qPCR analysis to confirm RAP1 and TRF2 binding sites. Anti-FLAG ChIP was carried out using HTC75 cells expressing (A) RAP1-FLAG or (B) TRF2-FLAG, and the precipitated DNA was analyzed by qPCR using primer pairs derived from the indicated RAP1 or TRF2 binding sites identified by ChIP-seq (Supplementary information, Table S1). Rabbit IgG served as control. Error bars indicate standard errors (n = 3). P values were calculated by Student's t test and * indicates P < 0.05.

RAP1 and TRF2 can occupy sites with or without telomere repeat sequences

RAP1 and TRF2 can bind to each other and form a heterodimer that is anchored on telomeres through the direct binding of TRF2 to telomere dsDNA 13, 20. We therefore analyzed whether interstitial RAP1 and TRF2 sites contain similar sequence motifs. Indeed, one class of the interstitial site motifs contains the TTAGGG repeat (Figure 4A). Here, 12 of the 78 RAP1 sites match the telomere repeats, suggesting direct binding of the RAP1-TRF2 heterodimer to these sites. When the TTAGGG repeat-containing sites were excluded from the analysis, novel motifs for RAP1 and TRF2 binding emerged. For instance, 15 RAP1-binding sites share the motif CC[AC]T[TG][CT]C[AT]T[TC]CC (Figure 4B), while a closely related motif CCATTCC[AT]TTCC is shared by 14 of the 77 TRF2 sites (Figure 4C), suggesting possible co-occupancy of both RAP1 and TRF2 on these sites. Interestingly, a fraction of TRF2 sites do not overlap with RAP1 sites, raising the possibility that RAP1 and TRF2 can be recruited to distinct regions of chromosomes. The appearance of non-TTAGGG-containing motifs for RAP1 and TRF2 further suggests that TRF2 and RAP1 either possess binding capacity for non-telomeric DNA sequences, or can localize to interstitial sites through additional proteins.
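The relationship between the two non-telomeric motifs can be checked mechanically. The Python sketch below expands each bracket-style motif into the concrete sequences it describes and tests whether every sequence allowed by the TRF2 motif is also allowed by the RAP1 motif. The helper name expand is hypothetical; the motif strings are taken from the text.

```python
import itertools
import re

RAP1_MOTIF = "CC[AC]T[TG][CT]C[AT]T[TC]CC"
TRF2_MOTIF = "CCATTCC[AT]TTCC"


def expand(motif):
    """List every concrete sequence described by a bracket-style motif."""
    parts = re.findall(r"\[([ACGT]+)\]|([ACGT])", motif)
    # Each position is either a bracketed set of bases or a single base.
    choices = [group or single for group, single in parts]
    return ["".join(p) for p in itertools.product(*choices)]


rap1_seqs = set(expand(RAP1_MOTIF))
trf2_seqs = set(expand(TRF2_MOTIF))
trf2_within_rap1 = trf2_seqs <= rap1_seqs
```

Because the motifs are already valid regular expressions, re.search(RAP1_MOTIF, sequence) can also be used directly to locate candidate sites in a peak sequence.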

Figure 4

RAP1 and TRF2 occupy selective telomere-repeat-containing sites. (A) Motif search using the MEME and MAST software identified a highly conserved motif (TTAGGG)2. (B) MEME and MAST were used to identify additional motifs for RAP1 binding sites after sequences containing the (TTAGGG)2 motif were removed (E-value less than 0.1). (C) A similar analysis was done to obtain additional motifs for TRF2 binding sites (E-value less than 0.1). (D) RAP1 and TRF2 occupy a subset of the predicted binding sites on chromosome 2. Potential interstitial binding sites containing telomeric repeats were identified by scanning the human genome using T2N100 (TTAGGGTTAGN{0,100}TTAGGGTTAG). The predicted sites (red) were compared to RAP1 (blue) and TRF2 (green) sites found in ChIP-seq analysis.

Selective occupancy of interstitial telomere repeat-containing sites by TRF2

Our results thus far indicate that RAP1 and TRF2 are capable of occupying interstitial sites that contain TTAGGG repeat sequences. The small number of binding sites identified here prompted us to examine the abundance of such binding sites in the human genome. Based on the target site sequences for RAP1 and TRF2, we deduced a minimal telomere-containing pattern that was named T2N100. T2N100 represents two TTAGGGTTAG sequences that are separated by 0-100 nucleotides. This motif is consistent with the structure of the TRF2 myb domain in complex with telomere DNA 46, in which a single TRF2 myb domain recognizes TTAGGGTTA. It also took into consideration findings that TRF1 and TRF2 can bind telomere DNA as homodimers where they recognize two TTAGGGTTA sites that are separated by a spacer of varying length 11, 47, 48. Scanning of the human genome using the T2N100 motif revealed ∼300 potential binding sites for TRF2, an example of which is shown in Figure 4D. This number is much higher than the number of interstitial telomeric-repeat sites we observed in the RAP1 and TRF2 ChIP-seq experiments, suggesting a selective targeting mechanism for RAP1 and TRF2 extra-telomeric binding in human cells.
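A genome scan for the T2N100 pattern can be sketched in Python with a regular expression. The pattern follows the definition given above (two TTAGGGTTAG units separated by 0-100 nucleotides); scanning both strands, the shortest-spacer matching, and the function names are assumptions made for illustration and may differ from the procedure actually used.

```python
import re

UNIT = "TTAGGGTTAG"
# Lookahead reports a candidate at every start position; the lazy spacer
# (0-100 bases) picks the nearest second unit.
T2N100 = re.compile(rf"(?=({UNIT}[ACGTN]{{0,100}}?{UNIT}))")
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def scan_t2n100(seq):
    """Return sorted (start, end, strand) tuples, 0-based half-open, forward coordinates."""
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in T2N100.finditer(s):
            start, end = m.start(1), m.end(1)
            if strand == "-":
                # Convert reverse-strand coordinates back to the forward strand.
                start, end = len(seq) - end, len(seq) - start
            hits.append((start, end, strand))
    return sorted(hits)


example = "ACGT" + UNIT + "A" * 10 + UNIT + "CC"
sites = scan_t2n100(example)
```

In practice the function would be applied chromosome by chromosome, and the resulting intervals compared with the ChIP-seq peaks, as in Figure 4D.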

Selective association of RAP1 and TRF2 to interstitial sites may be achieved through a number of mechanisms. Interaction with additional proteins may determine whether RAP1 and TRF2 are targeted to certain sites. Alternatively, the amount of RAP1 and TRF2 proteins may be a limiting factor. If the latter is true, increasing RAP1 or TRF2 expression should increase the number of interstitial binding sites targeted by these proteins. To test this possibility, we utilized the cells that overexpressed FLAG-tagged TRF2 and determined its occupancy by ChIP coupled with qPCR on a subset of binding sites that were predicted to contain the T2N100 motif (Figure 5). When TRF2 was overexpressed (Figure 5C), association of TRF2 could be seen in 22 out of 26 predicted T2N100 sites (Figure 5A), but the majority of these sites were not occupied by endogenous TRF2 in control cells (Figure 5B). In comparison, overexpression of FLAG-RAP1 (Figure 5C) did not increase binding of RAP1 to the T2N100 sites tested (Figure 5D and 5E). Interestingly, overexpression of TRF2 also failed to bring RAP1 to the T2N100 sites examined (Supplementary information, Figure S2), suggesting that TRF2 may occupy T2N100 sites independently of RAP1. Our results point to the importance of cellular TRF2 concentration in determining TRF2 occupancy of interstitial sites that contain telomere repeats.

Figure 5

Selective occupancy of interstitial sites. (A) ChIP-qPCR analysis with anti-FLAG antibodies of HTC75 cells expressing TRF2-FLAG or vector alone (control). (B) ChIP-qPCR analysis of HTC75 cells using anti-TRF2 antibodies or IgG. Primer pairs were derived from the predicted T2N100 sites and are listed in Supplementary information, Table S1. Sequences from chromosome 2 were used as negative control for qPCR. Error bars indicate standard error (n = 3). P values were calculated by Student's t test and * indicates P < 0.05. (C) HTC75 cells expressing FLAG-tagged RAP1 (RAP1-FLAG) or TRF2 (TRF2-FLAG) were analyzed by western blotting using the indicated antibodies. Anti-tubulin or anti-actin antibodies were used as loading control. (D) Vector control or RAP1-FLAG-expressing HTC75 cells were analyzed by ChIP-qPCR using anti-FLAG antibodies. (E) HTC75 cells were analyzed by ChIP-qPCR using IgG or anti-RAP1 antibodies. Primer pairs were derived from the predicted T2N100 sites and are listed in Supplementary information, Table S1. Sequences from chromosome 2 were used as negative control for qPCR. Error bars indicate standard error (n = 3). P values were calculated by Student's t test.

Regulation of gene transcription by telomeric proteins

Our findings of preferential binding of RAP1 and TRF2 to interstitial sites proximal to gene loci suggest that these telomeric proteins may regulate gene transcription. To understand whether RAP1 and TRF2 can affect the expression of their target genes, we generated HTC75 cells whose endogenous RAP1 or TRF2 was knocked down by RNA interference (RNAi). As shown in Figure 6A, RAP1 mRNA levels were reduced by ∼80% using two different siRNA sequences. This reduction in RAP1 message level was accompanied by significant decreases in RAP1 protein levels as well (Figure 6B). We then compared the expression of RAP1 target genes in these cells by qRT-PCR. While the expression of a subset of the genes examined exhibited no change upon RAP1 knockdown, altered expression was observed in a number of RAP1 target genes (Figure 6D). For example, while CLIC6 expression increased in RAP1 knockdown cells, the level of RPH3AL expression decreased. For TRF2, knocking down TRF2 by two different shRNAs in HTC75 cells (Figure 6C) resulted in decreased binding of TRF2 to its target genes (Supplementary information, Figure S3). Consequently, TRF2 knockdown reduced TRF2-target gene PDE3A expression while increasing the expression of RPA2 (Figure 6E). These findings support the hypothesis that telomeric proteins can regulate the transcription of their target genes located in the interstitial regions.

Figure 6

Telomeric proteins regulate gene transcription. HTC75 cells transfected with two siRNA oligos were harvested for (A) qRT-PCR analysis for mRNA levels and (B) western blotting for protein levels. siControl, control siRNA oligos. Anti-actin antibodies were used as loading control. (C) HTC75 cells stably expressing shRNA sequences against TRF2 were analyzed by western blotting. shRNA sequences against GFP were used as negative controls (shGFP). Anti-actin antibodies were used as loading control. (D) The expression of RAP1 target genes was examined in RAP1 knockdown cells by qRT-PCR. Error bars indicate standard error (n = 3). P values were calculated by Student's t test and * indicates P < 0.05. (E) The expression of TRF2 target genes was examined in TRF2 knockdown cells by qRT-PCR. Error bars indicate standard error (n = 3). P values were calculated by Student's t test and * indicates P < 0.05.

Discussion

In this study, we analyzed genome-wide chromatin-binding patterns of two telomeric proteins RAP1 and TRF2, and found these proteins to associate with interstitial sites. In budding yeast, the scRAP1 protein has long been implicated in global gene regulation 39, 41, 42, 43, 49, 50. However, mammalian RAP1 proteins exhibit many differences from yeast RAP1, including the inability to bind directly to telomere DNA. Our findings provide evidence for an extra-telomeric function of human telomeric proteins, and suggest that such function is conserved through evolution. Recent findings of mouse RAP1 also support this notion 26. Given the association of multiple core telomeric proteins that can form high-molecular-weight complexes on the telomeres 10, other telomeric proteins such as TIN2 and POT1 may also co-occupy RAP1 and TRF2 target sites outside of telomeres. Such possibilities warrant further investigation.

Work on RAP1 from mice and yeast found numerous extra-telomeric sites for RAP1 26, 41, 42, 43, in direct contrast to our findings. In this study, we identified a limited number of interstitial binding sites for RAP1 and TRF2 in human cells. Our results appear to be consistent with another study, where ChIP-seq using anti-TRF1 and anti-TRF2 antibodies revealed restricted abilities of TRF1 and TRF2 in binding extra-telomeric sites in the human genome 51. It is possible that the number of RAP1 and TRF2 binding sites was underestimated owing to potential antibody-epitope access problems in our ChIP experiments. Alternatively, binding to interstitial sites in human cells may be more tightly controlled than in other species. Here, we provide evidence that the cellular concentration of TRF2 may play a role in selective binding of TRF2 to its target sites. Using stringent criteria, we could predict ∼300 potential TRF2 binding sites that contain telomeric repeats. TRF2 overexpression was sufficient to target TRF2 to these predicted interstitial sites, indicating that these sites can indeed be targeted by TRF2. Telomeres in Mus musculus are at least 10 times longer than in humans, and the concentration of mouse TRF2 and RAP1 proteins is likely higher as well. This may explain why a large number of interstitial sites were bound by RAP1 in mouse cells. We also observed that overexpression of TRF2 did not lead to an increase in the binding of endogenous RAP1 at T2N100 sites, possibly due to only marginal enhancement of RAP1 levels in TRF2-overexpressing cells (Figure 5C). It is equally possible that TRF2 may not associate with RAP1 at the T2N100 sites tested. We speculate that the difference in telomeric protein concentration may at least in part account for the difference in binding site numbers observed between human and mouse. Taken together, our observations raise the intriguing possibility that the concentration of telomeric proteins such as TRF2 may vary in different cell types, resulting in distinct chromatin-binding profiles and biological outcomes.

TRF2 and RAP1 have overlapping but clearly distinct interstitial binding sites. The overlap is expected given that these two proteins form heterodimers on telomere repeats. Indeed, the major DNA motif common to both RAP1 and TRF2 binding sites is the TTAGGG repeat. Surprisingly, we found additional non-telomere-repeat motifs that are also shared by RAP1 and TRF2. Whether these sequences can be recognized by RAP1 and TRF2 in vitro and in vivo would be interesting to pursue. In addition, we identified binding sites unique to RAP1 and sites unique to TRF2, indicating a TRF2-independent function of RAP1. While the binding of RAP1 to TTAGGG repeats most likely occurs through its interaction with TRF2, the mechanism by which RAP1 associates with non-telomere-repeat sequences remains unclear and warrants further investigation. One possibility is that RAP1 may interact with other chromatin-associated factors (Songyang, unpublished data).

Finally, the enrichment of TRF2 and RAP1 binding in gene loci suggests a role of these telomeric proteins in gene regulation. Our data indicate that the transcription of a subset of the identified target genes was modulated by RAP1 and TRF2. Reducing RAP1 or TRF2 level led to altered target gene expression. This finding is in line with the repressor/activator role of scRAP1 41, 42, 43. Recently, RAP1 and TRF2 have been reported to associate with RNA splicing factors as well as proteins that regulate DNA recombination and replication 52, 53. It will be important to sort out the various regulatory pathways impacted by telomeric proteins at the extra-telomeric sites, and to uncover the differences in the underlying mechanisms utilized by telomeric proteins in different species. Our work underscores the importance of further investigation into the non-canonical activities of telomeric proteins.

Materials and methods

Vectors, cell line, and antibodies

cDNAs encoding human TRF2 and RAP1 were cloned into a pCL-based retroviral vector that allows for FLAG epitope tagging and stable expression in HTC75 cells. For western blotting and ChIP experiments, the following antibodies were used: mouse monoclonal anti-TRF2 antibody for western blotting (Calbiochem), rabbit polyclonal anti-TRF2 antibody for ChIP analysis (Bethyl Laboratories), rabbit polyclonal anti-RAP1 antibody (Bethyl Laboratories), rabbit polyclonal anti-FLAG antibody (Sigma), goat polyclonal anti-actin antibody (Santa Cruz), HRP-conjugated goat anti-mouse and goat anti-rabbit antibodies (Bio-Rad), and HRP-conjugated anti-FLAG M2 antibody (Sigma).

RNAi knockdown

Control Stealth RNAi siRNA duplexes (siControl, cat# 12935-300) and those against RAP1 were obtained from Invitrogen. siRAP1-1: AUUAACUGCCGAAUGAUCUUAAUGG, antisense; and siRAP1-2: ACGAACAGAGUCGAGGAAUGGGUGG, antisense. HTC75 cells were transfected with the siRNA duplexes (60 pmol each) using RNAiMAX (5 μl) (Invitrogen) in six-well plates, and analyzed 48 h later. For knockdown of TRF2, two shRNA sequences against TRF2, shTRF2-1 (5′-AGGAAATGGTGAAGTCTAT-3′) and shTRF2-2 (5′-GAGCATGGTTCCTAATAAT-3′), were cloned into a retroviral vector as previously described 54. An shRNA sequence against GFP (shGFP) (5′-CACAAGCTGGAGTACAACT-3′) was used as the negative control. HTC75 cells stably expressing these sequences were selected in puromycin for 3 days before further analysis.

ChIP assay

ChIP assays were performed essentially as described previously 55, with slight modifications. Sonicated lysates (from ∼5 × 10⁶ cells) were pre-cleared with 50 μl protein A beads, 2 μg rabbit IgG (Sigma), 10 μl of 5% BSA, and 5 μg of sheared E. coli DNA at 4 °C for 2 h. For immunoprecipitation, pre-cleared lysates were incubated with 3 μg of antibodies (rabbit IgG, polyclonal anti-TRF2, anti-RAP1, or anti-FLAG), 1 μl of 5% BSA, 25 μg of sheared E. coli DNA, and 50 μl of Protein A agarose beads. The precipitated DNA was then eluted and analyzed by sequencing, dot-blotting, or qPCR. A radiolabeled oligonucleotide (TTAGGG)3 was used to detect telomeric DNA in dot-blotting.

Real-time qPCR

Isolated total RNA (RNeasy Mini Kit, Qiagen) was reverse transcribed using the iScript cDNA Synthesis Kit (Bio-Rad). Real-time qPCR was carried out using an ABI StepOnePlus real-time PCR system and the SYBR green master mix (Applied Biosystems). qPCR was also done to validate candidates and T2N100 sites found in ChIP and ChIP-seq experiments. For specific primers used in this study, please see Supplementary information, Table S1.

ChIP sequencing

The ChIP DNA libraries were prepared using a DNA sample kit (Illumina) following the manufacturer's protocols. Briefly, 10 ng of ChIP DNA was end-repaired and ligated to Illumina adaptors. DNA samples were then amplified using adaptor-specific primers for 21 cycles and fragments of ∼150 bp were isolated from an agarose gel. The quantity and size distribution of sequencing libraries were determined using the PicoGreen fluorescence assay and the Agilent 2100 Bioanalyzer, respectively. The quality of the library was assessed by checking the enrichment for known targeted regions with real-time PCR. Sequencing of the library was carried out on the Illumina Genome Analyzer system according to the manufacturer's specifications.

Data analysis

Alignment of short reads to the human genome (hg19 release, UCSC) was carried out using SOAP 56, allowing for a maximum of two mismatches. Mitochondrial sequences were included. For reads that were longer than 35 bp and could not be mapped to the genome, only the first 35 bp were used for alignment. All short reads that were uniquely aligned were used for further analysis. The programs SISSRs v.1.4 57 and CisGenome v.1.2 58 were used for peak detection under default settings (0.001 FDR for SISSRs and 0.1 FDR for CisGenome). RAP1 or TRF2 ChIP libraries were individually pooled as the sample set and the IgG library was pooled as the negative control set for the two-sample ChIP-seq analysis. All of the peaks detected by the two algorithms were combined and merged (overlap or within 400 bp).
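The final merging step (peaks that overlap or lie within 400 bp are combined) can be sketched as a simple interval merge in Python. The tuple layout (chromosome, start, end) and the function name are assumptions for illustration; the peak callers' actual output formats are not reproduced here.

```python
def merge_peaks(peaks, max_gap=400):
    """Merge (chrom, start, end) peaks that overlap or lie within max_gap bp."""
    merged = []
    for chrom, start, end in sorted(peaks):
        if merged and merged[-1][0] == chrom and start - merged[-1][2] <= max_gap:
            # Extend the previous peak instead of starting a new one.
            last = merged[-1]
            merged[-1] = (chrom, last[1], max(last[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


# Toy peaks pooled from two callers, for illustration only.
pooled = [
    ("chr2", 100, 200),
    ("chr2", 550, 700),    # 350 bp from the previous peak: merged
    ("chr2", 1200, 1300),  # 500 bp from the merged peak: kept separate
    ("chr1", 100, 200),
]
combined = merge_peaks(pooled)
```

Sorting first means each peak only needs to be compared with the most recently merged interval, so the pooled output of both algorithms can be processed in a single pass.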

Annotation of the identified peaks was carried out using human genomic sequences (hg19) and gene loci information based on UCSC exon locus annotation (http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/) and BioMart (www.biomart.org/ version 57). For exon annotation, the longest transcript was considered for each gene. Regions within 5 kb up- or downstream of annotated transcripts were referenced as 5′UTR or 3′UTR, respectively. Subtelomeric regions are defined as 500-kb regions adjacent to the terminal fragment of each chromosome 59.
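The annotation rules above can be expressed as a small classifier. This Python sketch uses the 5-kb flank and 500-kb subtelomeric window stated in the text; the coordinate convention (0-based, half-open), the treatment of strand, and the function names are assumptions for illustration.

```python
FLANK = 5_000      # 5-kb flank around annotated transcripts (from the text)
SUBTEL = 500_000   # 500-kb subtelomeric window (from the text)


def classify_site(pos, tx_start, tx_end, strand, exons):
    """Classify a peak position relative to one transcript (0-based, half-open)."""
    if any(s <= pos < e for s, e in exons):
        return "exon"
    if tx_start <= pos < tx_end:
        return "intron"
    upstream = tx_start - FLANK <= pos < tx_start
    downstream = tx_end <= pos < tx_end + FLANK
    if strand == "-":
        # On the minus strand, genomic right is transcript upstream.
        upstream, downstream = downstream, upstream
    if upstream:
        return "5'UTR"
    if downstream:
        return "3'UTR"
    return "not defined"


def is_subtelomeric(pos, chrom_length):
    """True if the position lies within SUBTEL bp of either chromosome end."""
    return pos < SUBTEL or pos >= chrom_length - SUBTEL


# Hypothetical transcript with two exons, for illustration only.
exons = [(10_000, 10_500), (12_000, 12_800)]
label = classify_site(11_000, 10_000, 12_800, "+", exons)
```

A full annotation would apply this to the longest transcript of each nearby gene, as described above, and report the closest match.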

For comparative analysis of RAP1-binding sites between human and mouse, the 30 398 mouse RAP1 binding sites identified by CisGenome (10% FDR) were obtained 26. Pair-wise alignment files of human (hg19) and mouse (mm9) genome sequences were obtained from UCSC, and human and mouse orthologue information was retrieved from Ensembl v.59 (www.ensembl.org). Genomic sequences at the corresponding peaks from ChIP-seq experiments were used for motif search with the Multiple Expectation Maximization for Motif Elicitation (MEME) and Motif Alignment and Search Tool (MAST) software 60.