Single cell analysis
To track those dominant and dormant subclones within a tumor, single-cell technology has been proposed as a way to understand and predict intratumoral heterogeneity. It is necessary not only to determine the range of genetic markers of disease and its progression within a tumor, but also to study the combinations and interactions among gene mutations of subclonal populations. Understanding the genetic composition at the single-cell level will give greater insight into the interplay between genetic mutations within subclonal tumor populations. For example, new single-cell genome technology has revealed the host-tumor immune interface as a key part of the glioblastoma ecosystem, which is composed of cancer and immune cells, leading to a novel discovery suitable for targeted therapy [50]. Ideally, single-cell genome technology will ultimately be applied to other tumor ecosystems as well.
TCGA-integrated single cell analysis technology may help to uncover the role of each subclone in intratumoral heterogeneity and to understand the advantages this arrangement affords to a tumor. There are different theories of intratumoral heterogeneity, including variant subclones interacting competitively for resources versus cancer growth stemming from clonal relationships of commensalism and mutualism [
36]. It appears that heterogeneity contributes to therapeutic resistance. If the potential mutability of each cancer cell can be determined and the mutation predicted, novel strategies and therapies may arise [
51]. For instance, directing drugs to subclones with angiogenic properties may eradicate non-angiogenic, “free rider” subclones [
36].
Current technologies for detecting single cancer cells in peripheral blood or cerebrospinal fluid, termed circulating tumor cells, represent a novel way to determine the efficacy of therapeutic drugs on cancer and to identify tumor progression. Research has shown that identifying circulating tumor cells may predict overall survival in metastatic breast, prostate, and colorectal cancer and may provide prognoses in additional cancers, such as small cell lung cancer and hepatocellular carcinoma [52]-[56].
While there is significant research in the field of single cell cancer detection, CellSearch™ was the first to obtain FDA clearance for a method of detecting circulating tumor cells in patients. It uses antibody-coated magnetic beads to bind to and sort cells derived from blood or CSF samples as a means to identify cancer cells [
57]. Newer mechanisms are emerging due to the rising interest in single cell classification, such as capturing cancer cells in blood utilizing a microfluidic approach, where a blood sample flows through microchannels and cells are sorted by size to identify cancer cells [
58]. Most commonly, though, single cell detection uses biomarkers, such as epithelial cell adhesion molecule (EpCAM), prostate-specific membrane antigen, and cytokeratin [
59]. Complete sequencing of the cancer genome will provide insight on new biomarkers that may enhance specificity in discriminating tumor cells from normal circulating cells. Furthermore, sequencing of copy number variations, SNPs, DNA methylation, and microRNA profiling from TCGA will provide information beyond recognizing gene expression for enhancing single cell detection. While immense progress has been made at a single-cell detection level, the significance of circulating tumor cells and their potential for contributing to recurrence and metastasis has yet to be fully determined [
35]. Therefore, technology that amplifies the whole genome from a single cell may prove useful to advance the specificity of detecting circulating tumor cells and in further defining the role of single cells in predicting resistance to treatment or recurrence of disease.
The next step in the progression of single-cell analysis is massively parallel sequencing, or next-generation sequencing (NGS), which is becoming more feasible as biotech companies compete to improve the technology. Current NGS platforms, such as the Illumina HiSeq2000, allow profiling of 200 single cells in one run [60]. Implications of current platforms include the ability to sample multiple fluid sites to identify circulating cancer cells, to identify tumor cell lineage relationships, and to classify different subclones within a tumor sample at once. Barriers to massively parallel sequencing currently include high costs, which are predicted to decrease soon with rapidly advancing technology, and the need for high-fidelity whole genome amplification of single-cell DNA without incorrect SNPs, uneven sequencing coverage, or allele dropout [61].
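The amplification artifacts named above can be quantified once a single cell's genotype calls are compared with a reference. The following is a minimal sketch, assuming a bulk-tissue genotype is available as the reference; the function name, site labels, and genotype values are all hypothetical and serve only to illustrate how an allele dropout rate could be estimated.

```python
def allele_dropout_rate(bulk_genotypes, cell_genotypes):
    """Estimate allele dropout for one amplified single cell.

    Both arguments map a site label to a tuple of two alleles.
    Only sites that are heterozygous in the bulk reference and that
    received a call in the single cell are considered.
    """
    informative = 0  # heterozygous reference sites with a single-cell call
    dropped = 0      # of those, sites where the cell shows only one allele
    for site, bulk in bulk_genotypes.items():
        if len(set(bulk)) != 2:
            continue  # homozygous in bulk: dropout cannot be observed here
        call = cell_genotypes.get(site)
        if call is None:
            continue  # no coverage at all; a separate metric (locus dropout)
        informative += 1
        if len(set(call)) == 1 and call[0] in bulk:
            dropped += 1
    return dropped / informative if informative else float("nan")


# Illustrative (made-up) genotypes, not data from any study.
bulk = {"s1": ("A", "G"), "s2": ("C", "T"), "s3": ("G", "G"),
        "s4": ("T", "C"), "s5": ("A", "C")}
cell = {"s1": ("A", "A"), "s2": ("C", "T"), "s3": ("G", "G"),
        "s4": ("T", "C")}

rate = allele_dropout_rate(bulk, cell)
print(f"allele dropout rate: {rate:.2f}")
```

Sites without any single-cell call are skipped rather than counted as dropout, because complete loss of a locus reflects uneven coverage rather than loss of one allele.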
Another diagnostic tool becoming increasingly common in cancer centers is the gene-sequencing panel. In March 2013, the first multi-gene DNA-sequencing tests were administered to patients through the National Health Service (NHS) in the United Kingdom to classify oncologic genetic mutations. These were designed to help physicians choose the most effective therapeutic targets for each patient’s tumor [62]. While a single genetic screen on a tumor previously cost £150 through the NHS, this 46-gene panel costs £300 [63]. This relatively inexpensive multi-gene panel aims to eliminate guesswork in selecting chemotherapy strategies, which improves efficacy by minimizing the negative consequences of ineffective treatment. Similarly, in the U.S., a genetic panel predicting prognosis, the Oncotype DX Colon Cancer test, provides prognostic information that other diagnostic tools have not yielded; the test separates low- and high-risk patients by an absolute difference of 10% in recurrence risk at three years [64],[65].
Standardization
Tools for the diagnosis of cancer should be sensitive, minimally invasive, reproducible, standardized, and, ideally, able to prognosticate outcomes at early stages of disease. Researchers can collaborate with health-care groups to establish regulations for sharing genetic information from large research endeavors, like TCGA, without compromising medical ethics and patient privacy. By creating a framework for institutions to aggregate and exchange genomic data, researchers and medical providers can advance the diagnosis and treatment of certain malignancies. With regard to cancer, mega-databases of thousands of tumor samples may enable faster development of TCGA catalogues of tumor progression and drug-response profiles, so that such testing becomes as routine as a blood biochemical profile.
The TCGA project inspires the development of TCGA-integrated instrumentation to bring down costs and facilitate broad clinical access [66]. For example, Cancer Research UK has embedded de-identified breast cancer genetic data into a new smartphone application so that ordinary participants can help identify copy number variations within chromosomes that are difficult to visualize. In the same vein, emerging smartphone applications that encrypt digitized genome data will provide patients with their risk factors for cancer. A third avenue for using TCGA with advancing technology is to create diagnostic kits for everyday clinical diagnosis. Although currently in the research phase, kits such as the RightOn Cancer Sequencing Kit may be used every day in the hospital. Its technology enables identification of 1,000 cancer genes with a single test. An ideal diagnostic kit would include comprehensive profiling of all cancer types, with vast coverage and high specificity and sensitivity to detect common and rare genetic variants, in one cost-efficient test [67]. The Ion Proton, a desktop-sized, semiconductor-based gene sequencer, can help sequence the entire human genome for less than $1,000 in two hours [66]. With the advent of multi-gene sequencing panels and portable DNA sequencers, a patient’s genetic profile will become part of the routine regimen for cancer diagnosis and treatment protocol decisions.
Can all of these diverse instruments, analytical tools, and institutions deliver accurate DNA sequencing? This question is particularly critical for some of the personally held instrumentation mentioned above. A general guideline for the quality control of DNA sequencing should be implemented to produce medically meaningful genetic information, and this regulation should come from a government agency such as the US Food and Drug Administration (FDA).
Tumorigenesis, cancer progression, and metastasis
The TCGA has shown that the mutational landscape of cancer is complex and multifaceted. In order to diagnose and treat an individual with potentially 500 oncogenic mutations [38], we must understand cancer initiation and progression. New algorithms (Dendrix™) have been scaled up to whole-genome analysis of thousands of patients, searching larger TCGA data sets for specific “driver” mutations [68]. While the number of genetic mutations in a tumor may range from 30 to 200 depending on the type of tumor, research has shown that approximately 2-8 of these are mutations in “driver genes” [69]. A “driver gene” mutation provides the cancer cell with a small but selective growth advantage over the surrounding cells, potentially enabling that cell to become a clone [14]. Multiple insults in these “driver genes” occur over years within a cell before the cell takes on a cancer phenotype. Passenger gene mutations, on the other hand, have neither a positive nor a negative effect on cancer cell growth.
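One way to search mutation data for candidate driver gene sets is to reward sets that cover many patients while penalizing genes that are mutated in the same patients. The sketch below follows the coverage-versus-mutual-exclusivity idea that, to our understanding, underlies Dendrix-style methods; it is an exhaustive toy search rather than the tool's actual implementation, and the gene labels, patient labels, and function names are hypothetical.

```python
from itertools import combinations


def weight(gene_set, mutations):
    """Coverage-minus-overlap score for a set of genes.

    mutations maps a gene label to the set of patients carrying a
    mutation in that gene. The score is
        2 * (patients covered by the set) - (sum of per-gene patient counts),
    so it is highest when the genes cover many patients with
    little overlap between them.
    """
    covered = set()
    total = 0
    for gene in gene_set:
        patients = mutations.get(gene, set())
        covered |= patients
        total += len(patients)
    return 2 * len(covered) - total


def best_gene_set(mutations, k):
    """Exhaustively score every k-gene set (feasible only for small inputs)."""
    candidates = combinations(sorted(mutations), k)
    return max(candidates, key=lambda gs: weight(gs, mutations))


# Illustrative (made-up) mutation matrix.
mutations = {
    "GENE_A": {"p1", "p2", "p3"},
    "GENE_B": {"p4", "p5"},
    "GENE_C": {"p1", "p2", "p4"},
    "GENE_D": {"p6"},
}

top = best_gene_set(mutations, 2)
print(top, weight(top, mutations))
```

Exhaustive enumeration grows combinatorially with the number of genes, so genome-scale analyses need a heuristic or sampling-based search rather than this brute-force loop.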
Cancer mutations follow the principles of natural selection. Thus, when a cancer cell divides, it may acquire new mutations under selection pressure, in addition to or altering its “driver gene mutations” [70]. These new mutations cause the new cell’s genetic composition to be slightly different from that of its progenitor cell. Therefore, it is not surprising that heterogeneity exists within a tumor; cells at different ends of the tumor may be genetically different. The same rules of evolution apply in metastatic cancer. Research performed by Gerlinger et al. showed that tumor samples from multiple primary tumor sites, perinephric fat metastasis, chest-wall metastases, and germline DNA could be organized into a phylogenetic tree, much in the same way that trees are constructed in the evolution of species [5]. This means that metastatic tumor samples and the primary tumor itself exhibit different genetic compositions; their mutations diverged from the common mutations of the original primary tumor. The diverging genetics of metastatic tumors also stresses the importance of early diagnosis.
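The branching relationship between primary and metastatic samples can be illustrated by sorting mutations into those shared by every sample (the trunk of the tree) and those found in only some samples (the branches). This is a minimal sketch with hypothetical region and mutation labels; it is not the tree-building method used in the cited study, only a simple way to see which mutations precede the divergence.

```python
def classify_mutations(regions):
    """Split mutations into trunk, shared, and private groups.

    regions maps a sample label to the set of mutations found in it.
    trunk   : present in every sample (before any branching)
    shared  : present in more than one sample but not all
    private : present in exactly one sample
    """
    groups = {"trunk": set(), "shared": set(), "private": set()}
    all_mutations = set().union(*regions.values())
    for mutation in all_mutations:
        count = sum(mutation in found for found in regions.values())
        if count == len(regions):
            groups["trunk"].add(mutation)
        elif count == 1:
            groups["private"].add(mutation)
        else:
            groups["shared"].add(mutation)
    return groups


def jaccard_distance(a, b):
    """1 - |intersection| / |union|; 0 means identical mutation sets."""
    union = a | b
    return 1 - len(a & b) / len(union) if union else 0.0


# Illustrative (made-up) multi-region sampling of one patient.
regions = {
    "primary_R1": {"m1", "m2", "m3"},
    "primary_R2": {"m1", "m2", "m4"},
    "metastasis_M1": {"m1", "m5", "m6"},
}

groups = classify_mutations(regions)
print(groups)
print(jaccard_distance(regions["primary_R1"], regions["metastasis_M1"]))
```

Pairwise distances of this kind could feed a standard clustering or tree-building routine; the trunk set corresponds to the common mutations that arise before the diverging points.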
The evolutionary growth of cancer sounds impossible to tackle, but Vogelstein et al. have organized all of the known driver genes into 12 cancer cell-signaling pathways: RAS, PI3K, STAT, MAPK, TGF-β, DNA damage control genes, transcriptional regulation, chromatin modification, APC, HH, NOTCH, and cell cycle/apoptosis [14]. Rather than focusing on the differences among tumor cells, we must target the common mutations that occur before the branching, or diverging, points. Indeed, a new TCGA-guided approach to therapy has surfaced, based on a comprehensive molecular analysis of tumor samples from 825 patients with breast cancer [71]. Previously, breast cancers were classified into four main molecular subtypes: basal-like; luminal A and luminal B, which are both estrogen receptor (ER) positive; and HER2 enriched. The TCGA analysis uncovered new mutated genes, expanding these four subtypes. For example, the investigators found at least two subtypes of clinically HER2-positive tumors. One type is ER negative, falls within the HER2-enriched subtype, and has high levels of EGF receptor and HER2 protein phosphorylation. The other, which is ER positive, shows lower DNA amplification and protein-based signaling, resembling the luminal subtypes. This may explain why current HER2-targeted (trastuzumab-based) treatment has failed in half of patients with HER2-positive tumors.
In certain breast cancers, mutations of genetic regulatory sequences promote cancer. Mutations such as duplications of the densely estrogen receptor-α-bound distant estrogen response elements in the chromosomal sequences 17q23 and 20q13 predict poorer outcomes and anti-estrogen resistance in patients [
72].
Other researchers think that TCGA provides only part of the picture of tumor heterogeneity under pressure from drug therapy. Joan Brugge has pointed out that cells that are not intrinsically resistant to a drug can rewire their gene circuitry during treatment to become resistant without any genetic changes [38]. Mina Bissell and Jacqueline Lees have shown that tumors cannot thrive without certain signaling patterns from their neighboring cells, a microenvironment that traditional drug screening has missed [38]. Dormant tumor cells that “wake up” switch back on by taking advantage of interactions with normal surrounding cells [38],[39]. Thus, drugs that suppress this crosstalk could prevent such cells from restarting a tumor after therapy [38].
While TCGA genomic data have been collected, other comprehensive “omic” profiles have also begun to build extensive libraries in order to provide better indicators for identifying and characterizing disease holistically. The notion of naming diseases by body part is rooted in mid-1800s France, whereas disease is more likely characterized by pathways and signals at the molecular level (David Agus) [66]. Copy number analysis, for example, suggests that tumors can be classified into those driven by either mutations (M class) or copy number aberrations (C class) (Paul C. Boutros, 11JUN2014, webinar.sciencemag.org). C class tumors include breast, ovarian, squamous cell lung, and prostate cancer. However, next-generation sequencing technologies have limited ability to detect clinically relevant lower-level amplifications, copy-neutral loss of heterozygosity, and homozygous deletions, even at significant depth of coverage.
TCGA-integrated biochemical assays would enable monitoring of tumor progression using soluble biochemical markers. Cytokine profiling in blood or cerebrospinal fluid may also help with diagnosis and with evaluating prognosis in cancer patients. While previous research has characterized cytokine profiles for breast cancer, the TCGA project is revealing how cytokines affect other tumors. For example, TCGA data showed that expression of high levels of miR-18 and low levels of TGF-β genes in the proneural glioblastoma subtype correlates with prolonged patient survival [73]. For the same subtype, proneural glioblastoma, increased levels of interferon/STAT1 and genes related to interferon predicted poor survival outcome [74]. By incorporating key markers yielded from TCGA, cytokine profiling of tumors would provide an additional layer of diagnostic and prognostic information in conjunction with gene expression. Another form of soluble biomarker is methylated DNA, because methylation can be detected in blood serum samples of cancer patients [75]. Markers such as O(6)-methylguanine-DNA methyltransferase (MGMT), which is known to confer resistance to temozolomide in glioblastomas, would help to predict a cancer patient’s personalized response to chemotherapy [76]. In a study by Majchrzak-Celińska et al., methylation profiles of biomarkers (MGMT, RASSF1A, p15INK4B, and p14ARF) in cancer patients’ serum were found to match the methylation profiles in paired tumor samples in most cases [77]. In clinical practice, sequenced genomes of tumor profiles would direct which markers to test in biochemical assays. These assays can aid in the development of a personalized therapeutic plan and predict the effectiveness of various cancer drugs in a given individual.
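Agreement between serum and paired tumor methylation profiles can be summarized as a simple concordance fraction. The following is a minimal sketch, assuming each marker is recorded as methylated (True) or unmethylated (False) per patient; the patient labels and values are made up for illustration and do not reproduce the cited study's data or statistics.

```python
def methylation_concordance(tumor, serum):
    """Fraction of (patient, marker) pairs with matching status.

    tumor and serum map a patient label to a dict of
    marker name -> bool (True = methylated). Pairs missing from
    the serum profile are skipped rather than counted as mismatches.
    """
    matches = 0
    compared = 0
    for patient, markers in tumor.items():
        for marker, tumor_status in markers.items():
            serum_status = serum.get(patient, {}).get(marker)
            if serum_status is None:
                continue  # marker not assayed in this patient's serum
            compared += 1
            if serum_status == tumor_status:
                matches += 1
    return matches / compared if compared else float("nan")


# Illustrative (made-up) paired profiles.
tumor = {
    "patient_1": {"MGMT": True, "RASSF1A": True},
    "patient_2": {"MGMT": False, "RASSF1A": True},
}
serum = {
    "patient_1": {"MGMT": True, "RASSF1A": False},
    "patient_2": {"MGMT": False, "RASSF1A": True},
}

score = methylation_concordance(tumor, serum)
print(f"serum/tumor concordance: {score:.2f}")
```

A per-marker breakdown of the same comparison would show which markers are reliably detectable in serum and are therefore candidates for a biochemical assay.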
The human metabolome may be another adjunct to bridge the gap between genotype and phenotype [78]. Integrating genetic data with metabolomics will add power to the analysis, yielding improved accuracy and sensitivity. TCGA mutations can serve as a guide to focus metabolomic research, and metabolomics can validate TCGA research by cross-checking protein synthesis with DNA expression. Altered metabolism is a distinct feature of tumor cells, and it is known that specific genetic alterations, such as those in KRAS and BRAF, increase the expression of glucose transporter 1 [79],[80]. Research in metabolomics has defined altered levels of metabolic enzymes and metabolites in various tumors, including oncogenic mutations that cause the malignancy [81],[82]. It has recently been determined that the master transcriptional regulators of prostate cancer progression, AR and ETS gene fusions, control the regulatory enzymes of sarcosine; therefore, high levels of sarcosine in the urine are a promising clinical biomarker of metastatic prostate disease [83]. TCGA-driven discovery may help elucidate the mechanisms of tumor-altered metabolism. This in turn may help develop quantitative, high-throughput metabolomics for systems biology to define metabolites as biomarkers for tumor progression. Key metabolites can be identified non-invasively and rapidly in the blood, cerebrospinal fluid, urine, saliva, and prostatic fluids. Another example of altered tumor metabolism is the Warburg effect, in which, even under oxygen-consuming (aerobic) conditions, tumor tissues take up glucose and convert it to lactate at about ten times the rate of typical tissues in a given time [84]. TCGA may help us narrow down the mutations tied to cancer metabolism, opening up new targets for detection and treatment [85].
The TCGA approach may shed new light on some poorly characterized forms of cancer metastasis. For example, little was known about the metastasis of medulloblastoma, the most widely recognized childhood and adolescent tumor of the central nervous system (CNS). New data provide a mechanism by which medulloblastoma metastasis yields a highly invasive spread of tumor cells into the leptomeningeal space along the neuroaxis over the course of disease. Extraneural metastasis is uncommon yet often deadly, occurring in 1 to 5% of patients and presenting near a ventriculoperitoneal shunt [86]. In another example, the genomic characterization of 128 instances of metastasis from primary GBM documented how uncommon such metastasis is, based on histological and immunogenetic data [87]. Cancer genome information may therefore offer patients hope of finding treatment for metastases that improves survival time.