The Human Protein Atlas project has generated a comprehensive map of global gene expression patterns in normal tissues. Through integration of antibody-based, spatial proteomics and quantitative transcriptomics, expression and localization of more than 90% of all human protein-coding genes have been analyzed. Whereas the majority of proteins show a widespread expression profile, subsets of tissue-enriched proteins have been defined, including proteins with enriched expression in the kidney. To facilitate screening and discovery efforts for cancer-relevant proteins, the Human Protein Atlas also contains immunohistochemistry-based protein expression profiles for the 20 most common forms of cancer.
Renal cell carcinoma (RCC) is the most common type of cancer affecting the kidney. Several histological subtypes of RCC have been defined, the most frequent being clear cell RCC (ccRCC). Diagnosis and subtyping of RCC are achieved through the morphological analysis of tumor sections. The application of immunohistochemistry (IHC) can reveal important additional clues during the diagnostic work-up. A variety of antibodies have been described to guide pathologists during the diagnosis of distant metastases from the kidney, to distinguish primary RCCs from benign mimics, and to differentiate RCC from malignancies derived from other retroperitoneal structures. Most recently, PAX8 and PAX2 have shown improved RCC-specificity over the traditionally used RCC markers CD10 and RCC monoclonal antibody, although several female genital tract and thyroid tumors stain positive for both markers.
The clinical risk stratification of RCC patients relies heavily on the assessment of histopathological parameters. Clear cell histology is significantly associated with a more aggressive disease progression and reduced overall survival. For the prediction of recurrence in patients with localized ccRCC, algorithms were developed by teams at Memorial Sloan-Kettering Cancer Center (based on tumor stage, nuclear grade, tumor size, necrosis, vascular invasion and clinical presentation) or the Mayo Clinic (based on tumor stage, tumor size, nuclear grade and histological tumor necrosis). More recently, gene expression signatures have been proposed to add prognostic value to conventional algorithms.
The aim of this study was to utilize the vast data resources generated by the Human Protein Atlas project to identify novel biomarkers of clinical relevance for patients with RCC. Cubilin (CUBN) was identified and validated as a marker with the potential to classify RCC patients into low- and high-risk groups, as loss of CUBN expression was significantly and independently associated with less favorable patient outcome. In addition, CUBN expression appears highly specific for RCC compared to other types of cancer, suggesting a possible clinical role for CUBN in cancer differential diagnostics.
Human Protein Atlas database searches
Global mRNA expression data for 27 normal human tissues were searched for genes specifically expressed in normal kidney and a maximum of six additional tissues. Genes with >5-fold higher fragments per kilobase of transcript per million mapped reads (FPKM) levels in normal human kidney compared to all other tissues and genes with 5-fold higher average FPKM level within a group of 2–7 tissues, including normal human kidney, were investigated further. Corresponding IHC-based expression data within the Human Protein Atlas database (and unpublished data) were evaluated manually.
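The two FPKM filters described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the tissue labels and the FPKM values are hypothetical, and comparing the group average against the highest-expressing tissue outside the group is our reading of the criterion, not a documented detail of the original search.

```python
from statistics import mean

FOLD = 5.0  # fold threshold quoted in the text


def kidney_enriched(fpkm, target="kidney", fold=FOLD):
    """True if `target` FPKM is more than `fold` times that of every other tissue."""
    return all(fpkm[target] > fold * value
               for tissue, value in fpkm.items() if tissue != target)


def group_enriched(fpkm, group, target="kidney", fold=FOLD):
    """True if the mean FPKM of a 2-7 tissue group containing `target`
    is at least `fold` times the highest FPKM outside the group.

    Comparing against the highest outside tissue is an assumption.
    """
    if target not in group or not 2 <= len(group) <= 7:
        return False
    outside = [value for tissue, value in fpkm.items() if tissue not in group]
    if not outside:
        return False
    return mean(fpkm[t] for t in group) >= fold * max(outside)


# Hypothetical FPKM values, for illustration only.
example = {"kidney": 40.0, "small intestine": 60.0, "liver": 5.0, "lung": 1.0}
print(kidney_enriched(example))                                # False
print(group_enriched(example, {"kidney", "small intestine"}))  # True
```

In this toy profile the gene fails the kidney-only filter but passes the group filter, which is the pattern one would expect for a gene shared between kidney and small intestine.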
Similarly, proteome-wide IHC-based expression data for 83 normal human cell types, corresponding to 44 normal tissues, were searched for proteins expressed in renal tubules or glomeruli and a maximum of nine additional cell types. Retention of protein expression in RCC was evaluated manually. IHC-based expression data for 216 cancer tissues, including up to 12 cases of RCC, were systematically queried for antibodies yielding positive IHC-staining primarily in RCC. Database searches were conducted using varying positive/negative definitions (e.g. negative or weak staining as cut-off) and various levels of specificity (e.g. staining in 50% or 75% of RCC cases and less than 10% or 25% of any other cancer type, respectively).
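The cancer-tissue query with adjustable cut-offs could look roughly like the following Python sketch. The four-level staining scale, the data layout (one list of per-case staining levels per cancer type) and all names are assumptions made for illustration; "staining in 50% or 75% of RCC cases" is interpreted here as "at least" that fraction.

```python
# Assumed ordinal staining scale; the actual database encoding may differ.
LEVELS = {"negative": 0, "weak": 1, "moderate": 2, "strong": 3}


def positive_fraction(cases, cutoff="negative"):
    """Fraction of cases staining above `cutoff` (the highest level still counted as negative)."""
    if not cases:
        return 0.0
    threshold = LEVELS[cutoff]
    return sum(LEVELS[level] > threshold for level in cases) / len(cases)


def rcc_specific(staining, cutoff="negative", min_rcc=0.50, max_other=0.10, target="RCC"):
    """True if an antibody is positive in at least `min_rcc` of RCC cases
    and in less than `max_other` of the cases of every other cancer type."""
    if positive_fraction(staining.get(target, []), cutoff) < min_rcc:
        return False
    return all(positive_fraction(cases, cutoff) < max_other
               for cancer, cases in staining.items() if cancer != target)


# Hypothetical staining results for one antibody, for illustration only.
antibody = {
    "RCC": ["strong"] * 9 + ["negative"] * 3,
    "breast": ["negative"] * 11 + ["weak"],
    "colorectal": ["negative"] * 12,
}
print(rcc_specific(antibody, cutoff="negative", min_rcc=0.75, max_other=0.10))  # True
print(rcc_specific(antibody, cutoff="weak", min_rcc=0.75, max_other=0.10))      # True
```

Exposing the cut-off and both fractions as parameters mirrors the idea of running the same search at several stringency levels rather than committing to a single definition of positivity.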
Initially, a tissue microarray (TMA) containing tumor material from 39 patients with available, corresponding transcriptomics data and protein lysates was used (Additional file: Table S1). In addition, three independent TMA cohorts were used. Cohort 1 was a multi-cancer cohort including 940 tumor samples, representing 22 different tumor sites (Additional file: Table S2). Formalin-fixed, paraffin-embedded (FFPE) tumor specimens were identified from the archives of Uppsala University Hospital, Falun Hospital and Lund University Hospital, where all cases were originally diagnosed between 1984 and 2011. A large fraction of samples (502 tumors) represented material from metastatic sites. For RCC, 20 primary tumors and 20 metastases were included. Cohort 2 included 167 primary, 103 venous tumor thrombi and 96 metastatic tumors from 183 RCC patients following radical nephrectomy at the Department of Urology, Edinburgh, between 1983 and 2010 (Additional file: Table S3). Written consent was obtained from study participants from cohort 2. Cohort 3 was assembled from 114 primary ccRCC samples (Additional file: Table S3) from patients diagnosed with metastatic RCC between 2006 and 2010 at one of seven Swedish medical centers (Uppsala, Göteborg, Örebro, Västerås, Gävle, Falun, Karlstad). All patients within this cohort had undergone a radical nephrectomy. Written consent was obtained from study participants from cohort 3.
Tissue microarray construction, immunohistochemistry and annotation
TMAs were constructed as described previously. Two antibodies targeting CUBN were tested (HPA043854 and HPA004133, Atlas Antibodies AB, Stockholm, Sweden). Automated IHC was performed as described previously. IHC staining intensities and fractions of stained tumor cells were manually evaluated and each core was annotated by two independent observers. Due to the large number of annotations, this task was shared within a group of three observers (TP, NK, GG). Cases with divergent scores were reviewed by a third observer (DD) and consensus was reached. Total cellular staining (including cytoplasm and cell membrane) was annotated. Cases were considered positive for CUBN if the fraction of stained cells was greater than 10% and the staining intensity was at least moderate.
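The positivity rule stated above can be written as a small Python predicate. The intensity labels and the function name are assumptions for illustration; only the two thresholds (more than 10% stained cells, at least moderate intensity) come from the text.

```python
# Assumed ordinal intensity scale used by the annotators.
INTENSITY = {"negative": 0, "weak": 1, "moderate": 2, "strong": 3}


def is_cubn_positive(fraction_stained, intensity):
    """Apply the stated rule: >10% stained tumor cells AND at least moderate intensity.

    `fraction_stained` is a proportion between 0 and 1.
    """
    return fraction_stained > 0.10 and INTENSITY[intensity] >= INTENSITY["moderate"]


print(is_cubn_positive(0.50, "moderate"))  # True
print(is_cubn_positive(0.10, "strong"))    # False: exactly 10% is not "greater than 10%"
print(is_cubn_positive(0.80, "weak"))      # False: intensity below moderate
```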
RNA expression and Western blot analysis
RNA expression analyses were performed as described previously. Western blot analysis was performed according to standard protocols.
For the calculation of sensitivity, specificity and positive predictive value (PPV), standard formulas were applied. Kaplan–Meier survival curves were generated to evaluate the correlation between CUBN expression and patient survival. The log-rank test was used to compare patient survival in groups stratified according to CUBN expression. Cox proportional-hazards regression was applied to estimate hazard ratios in univariate and multivariate models. The χ² test and Fisher’s exact test were used to calculate the significance of associations between CUBN expression and clinicopathological parameters. Calculations were carried out using SPSS Statistics Version 22 (IBM, Armonk, NY).
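For readers who want the standard formulas for sensitivity, specificity and PPV spelled out, a minimal Python sketch follows. The function name and the counts are hypothetical and are not data from this study; here "positive" would mean CUBN-positive staining and "disease" would mean RCC.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard 2x2-table metrics.

    tp: diseased cases that test positive   fn: diseased cases that test negative
    tn: non-diseased cases testing negative fp: non-diseased cases testing positive
    All three denominators must be non-zero.
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of diseased cases detected
        "specificity": tn / (tn + fp),  # share of non-diseased cases correctly negative
        "ppv": tp / (tp + fp),          # share of positive tests that are true positives
    }


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=8, fp=10, tn=90, fn=2)
print(metrics["sensitivity"], metrics["specificity"])
print(round(metrics["ppv"], 3))
```

Note that PPV, unlike sensitivity and specificity, depends on how common the disease is in the tested cohort, so a value obtained in a multi-cancer TMA does not transfer directly to a clinical population.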
We utilized the Human Protein Atlas resources to identify, in an unbiased fashion, novel targets to improve and supplement currently used tools for the prognostication and differential diagnosis of RCC. Following state-of-the-art validation of antibodies targeting CUBN, we analyzed the expression of CUBN in normal human tissues, a large variety of cancers and two RCC-specific cohorts. We found that loss of CUBN expression in ccRCC patients was significantly associated with poor prognosis. Importantly, this observation was independent of T-stage, Fuhrman grade and nodal status, implying added clinical value of routine CUBN testing. In addition, we found the expression of CUBN to be highly specific to RCC, suggesting a potential use of CUBN in clinical cancer differential diagnostics as a complement to other diagnostic antibodies in cases where RCC needs to be confirmed.
CUBN is an endocytic receptor that is specifically expressed on epithelial cells in the proximal tubules of the kidney and in glandular cells of the small intestine. In the kidney, CUBN mediates the reabsorption of filtered proteins such as albumin and transferrin, whereas in the small intestine, CUBN is primarily involved in the uptake of the intrinsic factor-vitamin B12 complex. Even though the role of CUBN in normal kidney and small intestine has been well characterized and CUBN has been used as a marker for renal cell differentiation, the role of CUBN during RCC development and progression is largely unknown.
Although IHC is not quantitative, results from validated antibodies provide protein expression data at cellular resolution and can readily be translated to a clinical setting. The applied TMA methodology also appears well suited to simulate small tissue biopsies, which are highly relevant in clinical practice. The specificity and sensitivity of IHC staining for CUBN in cohorts of tumor tissue have provided an example of a novel diagnostic biomarker for RCC. Extended studies of the expression pattern in additional tumors relevant for differential diagnostics, e.g. adrenal gland tumors and other forms of clear cell cancer, are required to establish the usefulness of CUBN staining in clinical routine. Nevertheless, the presented results indicate that this marker could be used for difficult cases where a diagnosis of RCC needs to be confirmed.
There is an unmet need for better tools for risk stratification of ccRCC patients. Several prognostic algorithms based on clinicopathological parameters have been proposed. For example, algorithms developed at Memorial Sloan-Kettering Cancer Center or the Mayo Clinic are used for the prediction of recurrence in patients with localized ccRCC. More recently, molecular phenotyping of RCC has shown promise in adding prognostic value to standard clinicopathological parameters. With ClearCode34, a 34-gene expression signature for the prognostic stratification of localized ccRCC patients was introduced, and a combination of molecular and clinical parameters was shown to provide better risk prediction than clinical variables alone. Unlike mRNA-based assays, the immunohistochemical detection of CUBN can easily be implemented in routine pathology laboratories. The application of CUBN as a marker for early disease spread and its added prognostic value over clinical stage, grade and nodal status are promising, and additional validation is highly desirable.
Functional studies are needed to understand the mechanism linking the expression of a protein involved in the reabsorption of proteins in proximal tubules to the aggressiveness of RCC. Previous studies showing that TGF beta reduces CUBN expression and contributes to RCC aggressiveness could provide one starting point to explore the biological background for the correlation between CUBN expression in RCC and prognosis. Extended functional studies regarding malignancy grade and also larger studies on well-defined cohorts with high quality clinical data from RCC patients will be needed to further explore the role of CUBN in RCC and to establish the clinical utility of this promising RCC biomarker.
In a quest to identify novel biomarkers for RCC, we have applied a systematic search strategy to exploit the extensive data resources of the Human Protein Atlas. We identified CUBN as a marker for risk stratification of patients with RCC. Lack of CUBN expression was significantly associated with early disease progression and poor patient outcome, independent of T-stage, Fuhrman grade and nodal status. Owing to a highly RCC-specific expression profile, CUBN expression also has a potential role in clinical cancer differential diagnostics.
The authors warmly acknowledge the staff of the Human Protein Atlas project in both Sweden and India for their efforts in generating the Human Protein Atlas. In particular, the authors would like to thank Sofie Gustafsson and IngMarie Olsson for constructing TMA cohorts 1 and 3, Dijana Cerjan and Urban Rydberg for performing the IHC stainings and Ann-Sofi Strand and Cane Yaka for slide scanning. We are also grateful to Frances Rae and Craig Marshall (Health Sciences Scotland) for assistance with cohort 2 TMA construction.
This work was supported by the Swedish Cancer Society and the Knut and Alice Wallenberg Foundation. The work of DJH and GDS was funded by the Chief Scientist Office (grant number ETM37), Renal Cancer Research Fund and Kidney Cancer Scotland. The funding bodies provided basic financial support regarding salaries and materials and did not participate in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript.
Availability of data and materials
All primary data supporting our findings are contained within the manuscript, either as given data or in provided references. All IHC-based expression data will be made available on the Human Protein Atlas database.
FP conceived and designed the study and provided study supervision, technical and material support. GG contributed to study development and methodology, participated in acquisition, analysis and interpretation of data. DD participated in acquisition, analysis and interpretation of data. MN was partly responsible for design and acquisition of clinical data and biological material for cohort 3. AL was partly responsible for design and acquisition of clinical data and biological material for cohort 2. OL performed antibody validation and WB. HJ performed antibody validation and WB. JB was partly responsible for design and acquisition of clinical data and biological material for cohort 1. PHE was partly responsible for design and acquisition of clinical data and biological material for cohort 1. SN was responsible for evaluation and scoring of immunohistochemically stained tissue microarrays. NK was responsible for primary annotation of immunohistochemistry. TP was responsible for primary annotation of immunohistochemistry. ÅS has taken part in analysis of RNAseq data. MU supervised antibody validation and RNAseq analyses. DJH was partly responsible for design and acquisition of clinical data and biological material for cohort 2. GJU was partly responsible for design and acquisition of clinical data and biological material for cohort 3. GDS has taken part in study supervision and data analyses. All authors have read and approved the submitted manuscript; the primary authors of the manuscript text were GG, FP and GDS.
Two of the co-authors were employed at Atlas Antibodies AB and their contribution was technical and material support, essentially aiming to perform an extended validation of the cubilin antibodies. Neither of these co-authors has ownership in Atlas Antibodies AB. Three of the co-authors were pathologists and as such employed by Lab Surgpath. Their contribution to this study was to evaluate and annotate all the immunohistochemical staining patterns in TMAs representing cohorts 1, 2 and 3. In performing this task they received salary from Lab Surgpath.
Consent for publication
Ethics approval and consent to participate
This study was approved by the Research Ethics Committee at Uppsala University (2002–577, 2009/139 and 2011/473) and the Lothian Regional Ethics Committee (08/S1101/41 and 10/S1402/33). Written consent was required from study participants in TMA cohorts 2 and 3. All human tissue samples used in cohort 1 were anonymized in accordance with approval and advisory report from the Uppsala Ethical Review Board (2007–159), and consequently the need for informed consent was waived by the ethics committee. The use of and analyses based on tissues in cohort 1 have previously been described.
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.