Background
In a search for new molecular markers capable of conferring more specificity and sensitivity to prostate cancer (PC) diagnosis, many laboratories in the past few years have applied transcriptomic profiling analyses, which have unveiled new differentially expressed genes in non-tumoral and tumoral prostate tissues. Expression microarrays have also been used to identify profiles characteristic of metastatic disease [
1], prostate intraepithelial neoplasia (PIN) [
2], and subgroups of tumors with distinct outcomes [
3], including a 5-gene model predictive of recurrence [
4]. Several of these studies compared tumor tissue with benign hyperplastic tissue, or with non-tumoral prostate tissues that were not precisely characterized in terms of location or epithelial representation. Therefore, the outcomes of these analyses, although identifying genes whose expression patterns were most strongly altered in PC, were possibly biased because the comparisons included tissues of diverse histological or embryological origins, or with undefined epithelial and stromal contents. The high degree of tissue heterogeneity of the prostate represents a challenge for molecular studies of PC and must be taken into account when performing high-throughput analyses. The representation of each cell type within a given sample determines the overall expression profile, which makes it difficult to compare prostate samples that have very different epithelial and stromal contents. One study addressed this issue by applying
in silico corrections to compensate for variable epithelial representations in different samples [
5], whereas other studies resorted to laser microdissection and
in vitro linear amplification [
6].
Great efforts have also been devoted to elucidating the molecular bases of prostate carcinoma, which are beginning to provide important mechanistic insights into some of the key events in PC initiation and progression. For example, the inactivation of the PTEN and p53 tumor suppressor genes has been shown to play a major role in the initiation of PC [
7]. Also, fusions of TMPRSS2 (transmembrane protease, serine 2) with different members of the ETS transcription factor family are likely relevant initiating factors in this neoplasia [
8]. Moreover, genomewide analyses of single nucleotide polymorphisms have allowed the identification of polymorphisms and copy number variations associated either with predisposition to prostate cancer in familial clusters [
9,
10], or with the occurrence and aggressiveness of prostate cancer [
11].
In this work, we performed a transcriptomic analysis of carefully selected samples with the aim of finding expression profiles characteristic of the different cell type compartments of the prostate, to better understand the molecular events responsible for this malignancy. Our analysis has allowed the identification of transcriptional profiles and new markers characteristic of different prostate compartments, as well as a new highly recurrent gain on chromosome 17q25.3 associated with prostate cancer.
Methods
Tissue samples and cell lines
Tissues were procured from untreated patients undergoing radical prostatectomy for clinically localized prostate adenocarcinoma at the Hospital Clínic of Barcelona. Samples were obtained after informed consent by the patients and approval by the Institutional Ethics Committee. Tissue fragments from tumoral and non-tumoral areas were embedded in OCT, snap-frozen, and stored at -80°C at the tumor bank of this institution (Table
1). Non-tumoral samples were completely devoid of cells or glands with neoplastic appearance upon histologic examination of serial sections corresponding to the processed tissues, and are hereafter called normal samples. The rest of the specimen was routinely formalin-fixed and paraffin-embedded. HeLa and RWPE1 cells (ATCC, Manassas, VA) were grown in DMEM (PAA, Ontario, Canada) supplemented with 10% FBS or keratinocyte serum-free medium (KSFM; Gibco, Carlsbad, CA), respectively. Primary cultures (PC1 and PC2) were derived from prostatectomies in which the adenocarcinoma was macroscopically detected. Fibroblast-depleted explants were grown for 4–5 weeks in KSFM supplemented with 10⁻¹¹ M 5-α-dihydrotestosterone.
Table 1
Clinico-pathological characteristics of prostate samples
1 | 6 (3+3) | T2 | T1d, e | 70 | 40 | N1d, e | 40 | 0 | - | N1-μDe | T1-μDe |
2 | 9 (4+5) | T3a | T2d | 90 | 100 | N2d | 50 | 0 | - | - | - |
3 | 5 (2+3) | T3a | T3d, e | 80 | 85 | N3d, e | 40 | 0 | - | N3-μDe | T3-μDe |
4 | 7 (3+4) | T2 | T4d | 60 | 80 | N4d | 40 | 0 | - | N4-μDe | T4-μDe |
5 | 7 (4+3) | T2 | T5d | 65 | 90 | N5d | 45 | 0 | - | - | - |
6 | 9 (5+4) | T2 | T6d, e | 80 | 100 | N6d, e | 40 | 0 | - | N6-μDe | T6-μDe |
7 | 7 (3+4) | T3a | T7d, e | 80 | 100 | N7d, e | 55 | 0 | - | N7-μDe | T7-μDe |
8 | 7 (3+4) | T2c | T8d | 80 | 100 | - | - | - | S1 | - | - |
9 | 8 (3+5) | T2 | T9d | 80 | 90 | - | - | - | - | - | - |
10 | 7 (3+4) | T3a | T10d | 80 | 100 | - | - | - | - | - | - |
11 | 7 (3+4) | T3a | T11d | 70 | 80 | - | - | - | - | - | - |
12 | 7 (3+4) | T3a | T12d | 50 | 90 | - | - | - | - | - | - |
13 | 9 (4+5) | T3a | T13d | 80 | 100 | - | - | - | - | - | - |
14 | 7 (3+4) | T3a | T14d | 65 | 90 | - | - | - | - | N14-μDe | T14-μDe |
15 | 7 (3+4) | T3a | T15d | 70 | 100 | - | - | - | - | - | - |
16 | 5 (2+3) | T2 | T16d | 80 | 90 | - | - | - | - | N16-μDe | T16-μDe |
17 | 7 (3+4) | T3a | T17d | 80 | 95 | - | - | - | - | - | - |
18 | 8 (3+5) | T2 | T18d | 65 | 95 | - | - | - | - | - | - |
19 | 6 (3+3) | T2 | T19d | 70 | 90 | - | - | - | - | - | - |
20 | 7 (4+3) | T3a | T20d | 60 | 80 | - | - | - | - | - | - |
21 | - | - | - | - | - | - | - | - | S2 | - | - |
22 | - | - | - | - | - | - | - | - | S3 | - | - |
| | - | - | - | - | - | POOL Nd | 45 | 0 | - | - | - |
Total RNA isolation
Microscopic examination of hematoxylin-eosin (H&E)-stained sections from frozen tissues was used to assess the percentage of each histological compartment (stroma, total epithelium, normal and neoplastic epithelium) prior to RNA extraction. Normal samples contained an average of 45% epithelium; stromal samples, obtained from non-tumoral areas of the specimens, contained less than 1% epithelium; and tumoral samples contained an average of 70% epithelium, 90% of which consisted of neoplastic glands. At least twenty 20-μm cryosections were used for RNA isolation, and the first and the last sections were H&E-stained to monitor for tumoral and normal gland status. RNA from tissues and cell lines was isolated using RNeasy kits with a DNase I digestion step (Qiagen, Valencia, CA), and analyzed on a 2100 Bioanalyzer instrument (Agilent Technologies, Palo Alto, CA) to assess quality and quantity.
Microarray hybridization and data analysis
The samples used for microarray analyses were assigned to four biological groups: adenocarcinomas (n = 20), normal samples (7 normal samples from 7 prostate cancer patients and a pool of normal prostate tissues, n = 8), and normal-associated stromal samples (n = 3), all of them from the peripheral zone of the prostate, and the cell lines described above (n = 4). Biotinylated cRNA (10 μg) from the 35 samples enumerated above was processed and hybridized to Affymetrix Human Genome Focus Arrays (Affymetrix, Santa Clara, CA). Arrays were scanned at 3 μm resolution in an Agilent HP G2500A GeneArray scanner (Agilent Technologies). Data were RMA-normalized [
12], followed by quantile-quantile probe-level between-array normalization [
13]. Normalized expression data were analyzed by FADA [
14], which applies Q-mode Factor Analysis. Genes were considered differentially expressed between the tumoral and normal groups when their associated
q-value was below 2.5 × 10⁻⁴.
Q-values were computed from t-test
P-values using the Benjamini-Hochberg step-down false-discovery rate (FDR) algorithm [
15]. Normalized expression data were standardized and submitted to UPGMA hierarchical clustering [
16]. Microarray data sets from this study are deposited in the Array Express repository under accession number E-MEXP-1331.
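The q-value selection described above can be illustrated with a minimal sketch. It implements the Benjamini-Hochberg adjustment as it is usually formulated (a step-up procedure); whether the authors' implementation differed in detail is not stated in the text. The function names and the example P-values are hypothetical, and only the 2.5 × 10⁻⁴ cut-off comes from the text.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values, in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        q_values[idx] = running_min
    return q_values


def select_genes(p_by_gene, threshold=2.5e-4):
    """Keep genes whose q-value is below the threshold (value from the text)."""
    genes = list(p_by_gene)
    q_values = benjamini_hochberg([p_by_gene[g] for g in genes])
    return [g for g, q in zip(genes, q_values) if q < threshold]


if __name__ == "__main__":
    # Hypothetical t-test P-values for two illustrative genes.
    print(select_genes({"GENE_A": 1e-6, "GENE_B": 0.5}))
```

The adjustment multiplies each P-value by the number of tests divided by its rank, then takes a running minimum from the largest P-value downwards so that q-values never decrease with rank.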
Laser Microdissection
Tissue samples from seven of the patients whose samples had been used in microarray analysis were selected for laser microdissection (Table
1). Eight-μm cryosections were mounted onto plastic membrane slides (PALM, Bernried, Germany), fixed in cold 70% ethanol, stained with hematoxylin, dehydrated, air-dried and stored at -80°C until use. Laser microdissection was performed using the PALM MicroBeam System, with the Laser Microdissection and Pressure Catapulting technology. Approximately 1.2 mm² of both normal and tumoral epithelium were collected separately from each sample and RNA isolation was performed as above.
Real-Time RT-PCR (Q-PCR)
Microdissected tissues and 4 paired normal-tumoral samples included in the microarray analysis were used for Q-PCR analyses. Custom-designed TaqMan Low Density Arrays (TLDA; Applied Biosystems, Foster City, CA) contained primers and probes for 45 genes, and the RPS18 gene as the calibrator. One ng of total RNA was used as template for reverse transcription for each replicate of non-microdissected (triplicates) and microdissected (quadruplicates) samples. Q-PCR was performed on an ABI PRISM 7900 HT instrument (Applied Biosystems), and relative quantitation determined by the ΔΔCt method.
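As an illustration of the ΔΔCt calculation mentioned above, the following sketch computes a tumoral-versus-normal fold change from replicate Ct values. It assumes perfect amplification efficiency (a doubling per cycle), uses the reference gene (RPS18 in the text) for normalization, and takes the paired normal sample as the baseline; the function name and the Ct values are hypothetical.

```python
from statistics import mean


def fold_change(tumor_target_cts, tumor_ref_cts, normal_target_cts, normal_ref_cts):
    """Relative quantity of a target gene in tumor versus normal (2^-ddCt).

    Each argument is a list of replicate Ct values. The *_ref_cts lists hold
    the reference-gene Ct values measured in the same sample.
    """
    # dCt: target Ct normalized to the reference gene within each sample.
    delta_tumor = mean(tumor_target_cts) - mean(tumor_ref_cts)
    delta_normal = mean(normal_target_cts) - mean(normal_ref_cts)
    # ddCt: tumor dCt relative to the normal (baseline) dCt.
    delta_delta = delta_tumor - delta_normal
    # Assumes 100% PCR efficiency, i.e. a factor of 2 per cycle.
    return 2.0 ** (-delta_delta)


if __name__ == "__main__":
    # Hypothetical triplicate Ct values.
    print(fold_change([24.0, 24.0, 24.0], [18.0, 18.0, 18.0],
                      [26.0, 26.0, 26.0], [18.0, 18.0, 18.0]))
```

A target that crosses the threshold two cycles earlier in the tumoral sample, with unchanged reference-gene Ct, corresponds to a four-fold higher relative quantity under these assumptions.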
Immunohistochemistry
Tissue microarrays (TMA) were built using a Manual Tissue Arrayer 1 (Beecher Instruments, Sun Prairie, WI), and contained a total of 52 paraffin-embedded tumors, 21 PIN and 40 normal samples, each in duplicate or triplicate cores. Two-μm-thick TMA sections were mounted on xylaned glass slides (DAKO, Glostrup, Denmark) and used for immunohistochemistry. Mouse monoclonal antibodies specific for myosin VI (MYO6, clone MUD19; 1/100 dilution) and ephrin type-A receptor 2 precursor (EPHA2, clone D7; 1/50 dilution) were purchased from Sigma (Madrid, Spain) and a rat monoclonal antibody specific for multidrug resistance-associated protein 4 (ABCC4, clone M4I-10; 1/50 dilution) was purchased from Abcam (Cambridge, MA). Immunohistochemistry was performed with the Envision system (DAKO) and developed with diaminobenzidine, after antigen retrieval in a pressure cooker with citrate buffer pH 6 (MYO6) and EDTA pH 9 (ABCC4) or no retrieval (EPHA2). The staining was scored as a percentage of positive cells and its intensity as null (0), weak (1), moderate (2) or strong (3). A difference of more than 20% in the percentage of positive cells and/or of one or more degrees of intensity was considered a significant change in expression. Images were obtained with an Olympus BHT microscope (Olympus, Germany) and an Olympus Camedia C-3030 camera. Immunohistochemistry information is MISFISHIE compliant [
17].
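The scoring rule for immunohistochemistry stated above can be written as a short sketch. It reads "difference over 20%" as a difference of more than 20 percentage points, which is an interpretation; the function name and the example scores are hypothetical.

```python
def is_significant_change(pct_a, intensity_a, pct_b, intensity_b):
    """Compare two staining scores (percentage of positive cells, intensity 0-3).

    A change is called significant when the percentage of positive cells
    differs by more than 20 points and/or the intensity differs by one or
    more degrees, following the rule described in the text.
    """
    for intensity in (intensity_a, intensity_b):
        if intensity not in (0, 1, 2, 3):
            raise ValueError("intensity must be 0, 1, 2 or 3")
    percentage_changed = abs(pct_a - pct_b) > 20
    intensity_changed = abs(intensity_a - intensity_b) >= 1
    return percentage_changed or intensity_changed


if __name__ == "__main__":
    # Hypothetical normal versus tumoral scores for one marker.
    print(is_significant_change(10, 1, 35, 1))
```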
In silico analysis of chromosomal localizations of coexpressed genes
To determine putative non-random colocalizations of coexpressed genes, we determined the precise genomic localizations of all the genes present on the array (ENSEMBL-NCBI 36 assembly of the consensus human genome sequence). We then determined the groups of four or more consecutive FADA-selected genes that were all either over- or underexpressed in tumoral samples and simultaneously colocalized within a distance of 4 Mb or less. We modelled the random distribution of over- or underexpressed genes as a Poisson distribution, taking as the lambda parameter the number of genes present on the array within the tested region, multiplied by the total number of over- or underexpressed genes and divided by the total number of genes on the array.
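The Poisson model described above can be sketched as follows. The lambda parameter follows the definition in the text; using the upper-tail probability P(X ≥ k) as the significance measure is an assumption, since the text does not specify how the distribution was evaluated. Function names and the example counts are hypothetical.

```python
import math


def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    below = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return 1.0 - below


def cluster_significance(genes_in_region, selected_total, genes_on_array, selected_in_region):
    """Return (lambda, tail probability) for a candidate cluster.

    lambda = (array genes in the tested region) * (total over- or
    underexpressed genes) / (genes on the array), as defined in the text.
    """
    lam = genes_in_region * selected_total / genes_on_array
    return lam, poisson_upper_tail(selected_in_region, lam)


if __name__ == "__main__":
    # Hypothetical counts: 13 array genes in the region, 400 selected genes
    # out of 8000 on the array, 4 of them falling inside the region.
    lam, p = cluster_significance(13, 400, 8000, 4)
    print(f"lambda = {lam:.2f}, P(X >= 4) = {p:.4f}")
```

With these illustrative numbers the expected count is below one gene, so observing four selected genes in the region would be unlikely under the random model.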
Fluorescent in situ hybridization (FISH)
Tumoral samples from the same 20 adenocarcinoma cases used in the microarray analysis and 8 lymph node metastases were used. From each paraffin-embedded sample, a 2-μm section was obtained for FISH analysis and a consecutive tissue section was H&E-stained in order to identify the tumoral and normal regions to be analyzed. Sections were deparaffinized, washed in 2 × SSC, dehydrated, denatured and hybridized with a BAC probe corresponding to the segment of interest on 17q25.3 (RP11-165M24) and a chromosome 17 centromeric probe (CEP17, Vysis, Des Plaines, IL). Slides were then washed in 0.4 × SSC and in 2 × SSC, counterstained with DAPI II (Vysis) and imaged using an Olympus BX60 fluorescence microscope (Olympus) with the MetaSystems software (MetaSystems, Germany). Signals corresponding to both the RP11-165M24 and the chromosome 17 centromeric probes were scored in 200 non-overlapping nuclei of the tumoral zone, and gains were defined in samples where ≥ 10% of the analyzed nuclei presented a ratio between the RP11-165M24 and the reference probe ≥ 1.5. In all cases, a non-tumoral area was analyzed to assess its normal chromosomal status.
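The gain criterion above (a probe-to-centromere ratio ≥ 1.5 in ≥ 10% of scored nuclei) can be expressed as a small sketch. The data layout, the function name and the handling of nuclei without a centromeric signal (they are skipped) are assumptions made for illustration.

```python
def has_gain(nuclei_signals, ratio_cutoff=1.5, fraction_cutoff=0.10):
    """Decide whether a sample shows a gain of the tested segment.

    nuclei_signals: list of (probe_signals, centromere_signals) per nucleus,
    e.g. counts for RP11-165M24 and CEP17. Cut-offs are those in the text.
    """
    # Nuclei with no centromeric signal cannot yield a ratio; skipped here
    # (an assumption, not stated in the text).
    scored = [(probe, cen) for probe, cen in nuclei_signals if cen > 0]
    if not scored:
        raise ValueError("no scorable nuclei")
    gained = sum(1 for probe, cen in scored if probe / cen >= ratio_cutoff)
    return gained / len(scored) >= fraction_cutoff


if __name__ == "__main__":
    # Hypothetical sample: 200 nuclei, 20 of them with 3 probe signals
    # against 2 centromeric signals (ratio 1.5).
    sample = [(3, 2)] * 20 + [(2, 2)] * 180
    print(has_gain(sample))
```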
Discussion
Our microarray analysis of highly selected prostate samples, which included normal, tumoral and stromal samples with defined epithelial and stromal representations, and basal epithelial cell lines, has allowed us to identify sets of genes specific to the major cellular compartments in normal and tumoral prostate tissue. Each gene set includes known markers of these compartments, which lends support to our classification, and therefore permits us to infer that other genes in these sets will be relevant in each of these cell types. Amongst the genes included as overexpressed in our prostate tumoral set, there are several well-established PC markers, such as AMACR [
20] and HPN [
21,
22], or previously associated with PC, like those for ectonucleoside triphosphate diphosphohydrolase 5 (ENTPD5), tumor-associated calcium signal transducer 1 (TACSTD1), single-minded homolog 2 (SIM2) or myosin VI (MYO6) [
25,
26]. Likewise, the tumoral and proliferative gene set includes the gene for cyclin-dependent kinase 5 (CDK5), which has been shown to regulate cell proliferation in thyroid carcinoma [
27]. Similarly, several of the genes in the non-tumoral set have been described as selectively suppressed in PC, like those for caveolin-1 (CAV1) [
28], caveolin-2 (CAV2), annexin A2 (ANXA2) [
29], or glutathione-S-transferase π1 (GSTP1) [
30]; some genes such as desmin (DES), fibroblast growth factor 2 (FGF2) or transforming growth factor beta receptor III (TGFBR3) have been described to be predominantly expressed by the stromal compartment [
5]; finally, several genes in the basal set are also known markers of basal cells of prostate glands and other stratified epithelia, such as those for tumor protein p63 (TP73L), keratin 5 (KRT5), keratin 7 (KRT7), keratin 14 (KRT14), or laminin beta 3 (LAMB3) [
18,
19]; the underexpression of these genes in tumoral samples could reflect the loss of this cell layer in prostate tumors. Similarly, our immunohistochemical analyses show that the ephrin type-A receptor 2 precursor (EPHA2), which is classified in this basal gene set, is expressed in a restricted compartment of the basal cell layer. It is also worth noting that all these basal markers are expressed at high levels by our primary cultures, suggesting that they have basal cell features.
Further computational analysis of our FADA-selected genes, in which we applied profiling-based
in silico predictions of genomic imbalances, has led us to the prediction of a recurrent copy number gain in 17q25.3, a prediction that was subsequently demonstrated experimentally by FISH. This was the most significant genomic imbalance predicted by our analysis, and, given the strong functional significance of the gene sets selected by FADA, we suggest that the 17q25.3 segmental gain may play a relevant role in prostate carcinogenesis. Although comparative genomic hybridization studies have revealed gains in distal segments of 17q in some tumors, including prostate cancer [
31,
32], they have not been associated with the precise region in 17q25.3 described in our study. Our analysis suggests that this recurrent gain may involve a region as small as 0.4 Mb in size, which may fall below the resolution of genomewide BAC array analysis [
33].
Considering only the most significantly biased four-gene cluster on 17q25.3, there are 21 genes in the region between SLC25A10, the most centromeric gene of the cluster, and DUS1L, the most telomeric one, of which 13 are represented in the microarrays used in this study (Figure
4A). Of these genes, those of immediate interest in cancer are ARHGDIA, which codes for an inhibitor of GDP dissociation from the ras-like cytoskeleton regulator Rho [
34]; ANAPC11, coding for an essential subunit of the anaphase-promoting complex [
35]; SIRT7, coding for a homologue of the NAD⁺-dependent histone deacetylase SIRT1 that regulates RNA polymerase II [
3]; MAFG, whose product is a basic region leucine-zipper transcription factor that heterodimerizes with NRF-2 to regulate the transcriptional response to oxidative stress, and also with Fos and JunB [
36,
37]; and ASPSCR1 (also known as ASPL), a gene that is frequently translocated in alveolar soft part sarcomas and papillary renal cell carcinomas to the chromosome X gene TFE3, causing the increased expression of this transcriptional regulator [
38]. Also located in this region, immediately telomeric to DUS1L, is the fatty acid synthase gene (FASN), a well-known marker of malignancy and progression in PC [
39].
The discovery of gene copy number variations acquires added significance if it is simultaneously correlated with transcriptional profiling, a combined approach that few studies have followed [
23,
40,
41]. Thus, we believe that the relevance of our finding of a novel recurrent gain in 17q25.3 in prostate cancer rests mainly on three facts: it is correlated with coordinate overexpression of genes that are tightly associated with that region, it is the most significant copy number abnormality predicted by our approach, and it is a recurrent gain, observed in 65% of primary prostate cancer cases. This frequency is higher than that of most of the recurrent gene copy number variations reported in association with prostate cancer [
11,
41‐
44]. Although gains in 17q25.3 have been reported by other laboratories as part of high-throughput screenings aimed at identifying genomic alterations in association with prostate cancer [
41,
43], they were not analyzed in detail. Therefore, our identification is the first to directly correlate this tumor-associated abnormality with co-regulation of transcripts encoded by specific genes in this region, applying a reverse-genetics approach that combines transcriptomic analyses,
in silico predictions and experimental verification by FISH. Finally, the fact that this genetic aberration is detected in PIN and also in metastasis samples is suggestive of its involvement in all the stages of malignant transformation and progression in PC.
Conclusion
Careful selection of samples with known epithelial and non-epithelial composition, and the inclusion of pure prostate stromal tissues, combined with the application of an inclusive, unsupervised analysis method based on Factor Analysis (FADA), has permitted the extraction of biologically significant sets of genes characteristic of prostate cancer and of the major prostate tissue compartments. Further analysis of transcript levels from these FADA-selected genes, treated as coexpression of closely linked genes, has allowed the identification of a highly recurrent chromosomal gain on chromosome 17q25.3, encompassing 21 genes. This chromosomal aberration is present in 65% of primary prostate cancers and metastases, and also in PIN lesions, making it one of the most frequent genetic abnormalities associated with all stages of malignancy in prostate cancer.
Acknowledgements
We gratefully acknowledge the expert assistance by A. Carrió and A. Mozos with FISH analysis, M. Soler with Q-PCR, E. Fernández with immunohistochemical analysis, D. Benítez and R. Vilella with the establishment of primary cultures, and M. Pacold with implementing some of the programs used in this work. This study was supported by grants from the Ministerio de Educación (GEN2001-4856-C13, GEN2001-4865-C13-10 and SAF2005-05109), the Ministerio de Sanidad (PI020231), the Fundació Marató TV3, the Red Temática de Cáncer of the Instituto Carlos III (RET2039) and the Fundación Ramón Areces.
The authors wish to dedicate this report to the memory of the late Ángel R. Ortiz, a key contributor to this report.
Competing interests
RB, DA, BF, AB, AZ, EC, CM-A, ARO, PLF and TMT are co-inventors on a patent application related to part of the results described in this manuscript.
Authors' contributions
RB, PLF and TMT have participated in the design, execution, analysis and writing of all the sections in this report. BF, IN and EC have participated in sample procurement and processing, and in immunohistochemical analyses. DA and ARO have participated in biocomputational analyses. AB, AZ and CM-A have participated in microarray data generation. JdR and RM have contributed to FISH analysis.