Introduction
WNT7B in Mammary Gland Development and Breast Cancer
Exploring Spatiotemporal Patterns of Wnt7b Expression Using scRNAseq Data
Tool | Description | Reference |
---|---|---|
dbSuper is an interactive database containing more than 80,000 putative super-enhancers for 25 mouse and >100 human tissues and cell lines. The database has migrated from its originally reported location (http://bioinfo.au.tsinghua.edu.cn/dbsuper/). Although it is functional and highly intuitive, it is not clear whether it has been updated since 2017. | [19] | |
SEA version 3.0 was updated in 2019 and promises to be a comprehensive resource that stores predicted super-enhancers and enhancers from 11 different species and more than 200 types of cells, tissues and diseases. | [20] | |
A large compendium of single-cell transcriptome data from the model organism Mus musculus that contains scRNAseq datasets of 23 organs and tissues, including the mammary gland at 6 different timepoints (1 month, 3 months, 18 months, 21 months, 24 months, 30 months). This online dataset explicitly includes stromal cells and other cell types from the supportive tissue (e.g. endothelial and immune cells). Of note, all tissues have been processed and analysed using two different protocols: cells were either FACS sorted or isolated as single cells using microfluidic droplet-capture techniques, and were thus sequenced using two different methodologies. This provides a built-in technical validation of the data when using this tool. | [21] | |
Also part of the Tabula Muris Senis effort. Offers extensive statistical analysis and visualization of bulk RNA-seq datasets from 17 organs of Mus musculus at 10 different timepoints. | [22] | |
3DIV collects human Hi-C data from 80 cell lines or tissues (including HMEC, MCF7, MCF10A) and promoter capture Hi-C data from 27 tissues. Chromatin conformation data from the locus of a gene or location of interest can be displayed either as a Hi-C heatmap or as a virtual 4C (with the location of interest as viewpoint). If applicable, it also predicts the boundaries of local TADs based on the provided datasets. 3DIV offers considerable flexibility: users can select the algorithm used to predict TADs and define the cut-off for positive interactions in the virtual 4C, and it is straightforward to extract the coordinates of positive hits. | ||
Single Cell Expression Atlas & Gene Expression Atlas: A database that compiles and visualizes published RNA & scRNA-seq datasets from Human, Mouse & a wide variety of model organisms. Selected datasets are plotted as a tSNE, and a heatmap highlighting marker genes for each annotated cluster is displayed. The database can be searched by gene across species, experiments, tissues and cell lines to reveal where this gene is expressed. | [25] | |
HACER is an atlas of Human ACtive Enhancers to interpret Regulatory variants, which includes active, transcribed enhancers derived from GRO-seq, PRO-seq and CAGE data. HACER not only compiles cell type specific enhancers but also integrates transcription factor-enhancer binding predictions and validated chromatin interactions, and links GWAS SNPs and eQTL variants to enhancer regions. The database includes the MCF10A and MCF7 cell lines. | [26] | |
An online database that compiles published spatial transcriptomic datasets and offers a web interface for spatially resolved transcriptomic data visualisation and comparison. Includes a human breast cancer dataset. | [27] | |
tSNE visualisation of gene expression during mammary gland development: from E16 to Adult. | [17] | |
PanglaoDB is a database that collects and integrates scRNAseq data from human and mouse and presents them through a unified framework. | [28] | |
The database provides enhancer annotation in nine species, including human (hg19), mouse (mm9), fly (dm3), worm (ce10), zebrafish (danRer10), rat (rn5), yeast (sacCer3), chicken (galGal4), and boar (susScr3). The consensus enhancers were predicted based on multiple high throughput experimental datasets (e.g. histone modification, CAGE, GRO-seq, transcription factor binding and DHS). This database includes the HMEC cell line. | ||
A database visualized through an intuitive Shiny app that allows for an interactive exploration of gene expression profiles across tissues, developmental stages and species. It includes not only protein-coding genes but also putative lncRNAs. The mammary gland is not included in this dataset. | ||
ARCHS4 is a web resource that compiles the majority of published human and mouse RNA-seq data and makes them available at the gene and transcript levels. It provides a web interface that allows exploration of the processed data. Moreover, individual genes can be searched for their average expression across cell lines and tissues, top co-expressed genes, and predicted biological functions and protein-protein interactions. | [33] | |
Cistrome DB is a comprehensive database (~47,000 sets) of curated ChIP-seq and DNase-seq data. It provides a uniform platform that contains manually curated information for each ChIP-seq and DNase-seq dataset (including species, factors, biological source, publication, etc.), the analysis results of each dataset from human and mouse, and comprehensive quality control checks across the complete database. By using the Cistrome DB toolkit, epigenetic features or transcription factors that regulate a gene of interest can be predicted based on the datasets present in Cistrome DB. | ||
The Human Cell Landscape offers a large compendium of human scRNA-seq data. Mammary gland tissue is not included in the original dataset, but scRNA-seq data from Nguyen et al. 2018 has been integrated in the online visualisation tool. Gene expression can be visualised superimposed on a tSNE plot. | [36] | |
A ‘gene atlas for structural immunity’. This multi-omics dataset profiles the immunological potential of epithelial cells, endothelial cells and fibroblasts from 12 different mouse tissues. The mammary gland is not included in this dataset. Aggregated ATAC-seq, ChIP-seq and RNA-seq data can be visualised in the UCSC genome browser. | [37] | |
The cBioPortal for cancer genomics is an open-access resource for exploring and visualizing multidimensional cancer genomics datasets. cBioPortal compiles a wide variety of datasets, including TCGA, that can contain non-synonymous mutations, DNA copy-number variation, mRNA and microRNA expression data, protein-level and phosphoprotein level data, DNA methylation, and de-identified clinical data. | ||
The Kaplan-Meier plotter is a tool that can be used to assess the effect of 54k genes (mRNA and protein levels) on survival across 21 cancer types, including breast, ovarian, lung and gastric cancer. Sources for these datasets include GEO, TCGA, and EGA. This is a valuable and easy-to-use tool for discovering and validating cancer survival biomarkers. | [81] | |
Enrichr is a comprehensive online tool for gene set enrichment analysis that includes over 30 gene-set libraries. It offers interactive and intuitive visualisation of the results via clustergrams. Note that, while highly useful, this tool requires predefined gene sets (e.g. from RNA-seq) as input and is less useful for looking up the KEGG or GO terms associated with a single gene of interest. For these kinds of queries ARCHS4 is better suited. |
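
The last entry in the table notes that enrichment tools need a predefined gene list as input rather than a single gene. To illustrate why, the following is a minimal sketch of a generic over-representation test based on the hypergeometric distribution. It is not Enrichr's actual implementation or API. The function names (`hypergeom_enrichment_p`, `rank_gene_sets`), the placeholder gene identifiers and the universe size of 10 genes are all assumptions chosen for illustration.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, set_size, query_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    X is the number of query genes that fall in a gene set of `set_size`
    genes when `query_size` genes are drawn without replacement from a
    universe of `universe_size` genes.
    """
    max_overlap = min(set_size, query_size)
    total = comb(universe_size, query_size)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, query_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def rank_gene_sets(query_genes, gene_sets, universe_size):
    """Return (set name, overlap size, p-value) tuples, smallest p-value first."""
    query = set(query_genes)
    results = []
    for name, members in gene_sets.items():
        members = set(members)
        overlap = len(query & members)
        p = hypergeom_enrichment_p(universe_size, len(members), len(query), overlap)
        results.append((name, overlap, p))
    return sorted(results, key=lambda r: r[2])


# Hypothetical toy input: placeholder gene names, not real annotations.
toy_sets = {
    "set_1": ["GENE_A", "GENE_B", "GENE_C", "GENE_D"],
    "set_2": ["GENE_X", "GENE_Y"],
}
ranked = rank_gene_sets(["GENE_A", "GENE_B", "GENE_C"], toy_sets, universe_size=10)
for name, overlap, p in ranked:
    print(name, overlap, round(p, 4))
```

With a single query gene the overlap with any set can only be 0 or 1, so such a test carries almost no information. This is consistent with the table's remark that a gene-centric resource is the better choice for single-gene queries.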