Introduction
Methods
Gene expression data
GEO Accession Number | Reference | Data Format | Sample Number | Platform Type (probe number) |
---|---|---|---|---|
GSE7849 | Anders et al., 2008 [51] | Processed only | 78 | Affymetrix Human Genome U95 Version 2 Array (12,625 probes) |
GSE3143 | Bild et al., 2006 [52] | Raw .CEL files | 158 | Affymetrix Human Genome U95 Version 2 Array (12,625 probes) |
GSE12276 | Bos et al., 2009 [53] | Raw .CEL files | 204 | Affymetrix U133 Plus 2.0 (54,675 probes) |
GSE22219 | Buffa et al., 2011 [44] | Raw Data files | 216 | Illumina humanRef-8 v1.0 expression beadchip |
GSE10510 | Calabro et al., 2009 [54] | Raw .gpr files | 152 | DKFZ Division of Molecular Genome Analysis Human Operon 4.0 oligo Array 35 k (36,486 probes) |
NA | Chang et al., 2005 [31] | Processed only | 295 | Agilent 21 K oligo array (22,575 probes) |
NA | Chin et al., 2006 [55] | Processed only | 118 | Affymetrix U133AAofAv2 (22,944 probes) |
GSE9893 | | Raw data available | 155 | MLRG Human 21 K V12.0 (22,656 probes) |
GSE7390 | | Raw .CEL files | 198 | Affymetrix U133A (22,283 probes) |
GSE16391 | Desmedt et al., 2009 [58] | Raw .CEL files | 48 | Affymetrix U133 Plus 2.0 (54,675 probes) |
GSE25055 | Hatzis et al., 2011 [59] | Raw .CEL files | 508 | Affymetrix U133A (22,283 probes) |
GSE24450 | | Raw Data files | 183 | Illumina HumanHT-12 V3.0 expression beadchip |
GSE1992 | Hu et al., 2006 [27] | Processed only | 99 | Agilent 21 K oligo array (22,575 probes) |
GSE20685 | Kao et al., 2011 [61] | Raw .CEL files | 327 | Affymetrix U133 Plus 2.0 (54,675 probes) |
NA | Kok et al., 2009 [62] | Processed only | 109 | Agilent 44 K oligo array (54,675 probes) |
GSE9195 | Loi et al., 2008 [63] | Raw .CEL files | 77 | Affymetrix U133 Plus 2.0 (54,675 probes) |
GSE6532 | Loi et al., 2008 [63] | Raw .CEL files | 265 | Affymetrix U133A/B (22,283/22,645 probes) and U133 Plus 2.0 |
GSE1378, GSE1379 | Ma et al., 2004 [64] | Processed only | 60 | Custom 22 K oligo array (22,575 probes) |
GSE3494 | Miller et al., 2005 [65] | Raw .CEL files | 251 | Affymetrix U133A/B (22,283/22,645 probes) |
GSE45255 | | Raw .CEL files | 139 | Affymetrix U133A (22,283 probes) |
GSE1456 | | Raw .CEL files | 159 | Affymetrix U133A/B (22,283/22,645 probes) |
GSE21653 | | Raw .CEL files | 266 | Affymetrix U133 Plus 2.0 (54,675 probes) |
GSE11121 | | Raw .CEL files | 200 | Affymetrix U133A (22,283 probes) |
GSE17907 | Sircoulomb et al., 2010 [70] | Raw .CEL files | 51 | Affymetrix U133 Plus 2.0 (54,675 probes) |
GSE2034 | Wang et al., 2006 [71] | Raw .CEL files | 286 | Affymetrix U133A (22,283 probes) |
GSE12093 | Zhang et al., 2008 [72] | Raw .CEL files | 136 | Affymetrix U133A (22,283 probes) |
Total | | | 4738 | |
GEO ID | Median age | Median size (cm) | Lymph node status | Chemotherapy info. | Hormone treatment info. | ER status | HER2 status | PR status | Tumour grade (1/2/3) | DFS (months) | DDFS (months) | OS (months) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
GSE7849 | 55 ± 12 | 2.3 ± 1.1 | A | A | A | A | NA | A | 2/30/34 | 81 ± 40 | NA | NA |
GSE3143 | NA | NA | NA | NA | NA | NA | NA | NA | NA | 51 ± 31 | NA | A |
GSE12276 | NA | NA | NA | NA | NA | NA | NA | NA | NA | 26 ± 22 | NA | NA |
GSE22219 | 55 ± 11 | 2.6 ± 1.4 | A | NA | NA | A | NA | NA | 41/87/63 | 94 ± 38 | NA | NA |
GSE10510 | 59 ± 12 | NA | A | NA | NA | A | NA | A | NA | 57 ± 53 | NA | 87 ± 60 |
NKI295 (Chang et al., 2005) | 44 ± 5 | 2.25 ± 0.9 | A | A | NA | A | NA | NA | NA | 84 ± 50 | NA | 94 ± 47 |
Chin et al., 2006 | 55 ± 15 | 2.7 ± 1.4 | A | A | A | A | A | A | 10/42/61 | NA | 69 ± 48 | NA |
GSE9893 | 67 ± 10 | 2.3 ± 0.9 | A | NA | A | A | NA | NA | 21/94/33 | 65 ± 32 | 66 ± 31 | 72 ± 29 |
GSE7390 | 46 ± 7 | 2.2 ± 0.8 | NA | NA | NA | A | NA | NA | 30/83/83 | 113 ± 68 | 114 ± 65 | 138 ± 61 |
GSE16391 | 62 ± 8 | NA | A | A | A | A | A | A | NA | 35 ± 15 | NA | NA |
GSE25055 | 49 ± 10 | NA | A | A | A | A | A | A | 32/180/259 | NA | 36 ± 20 | NA |
GSE24450 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 72 ± 27 |
GSE1992 | 55 ± 15 | NA | A | NA | NA | A | NA | NA | 8/34/57 | 25 ± 23 | NA | 29 ± 25 |
GSE20685 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 88 ± 43 | 94 ± 38 |
Kok et al., 2009 | NA | NA | NA | NA | NA | NA | NA | NA | NA | 15 ± 17 | NA | NA |
GSE9195 | 64 ± 9 | 2.4 ± 0.96 | A | NA | A | A | NA | A | 14/20/24 | 95 ± 30 | 97 ± 28 | NA |
GSE6532 | 59 ± 13 | 2.2 ± 0.9 | A | NA | A | A | NA | A | 38/71/24 | 71 ± 42 | 71 ± 42 | NA |
GSE1378, GSE1379 | 67 ± 9 | 2.3 ± 1.1 | A | NA | NA | A | A | A | 3/39/18 | 87 ± 46 | NA | NA |
GSE3494 | 62 ± 13 | 2.3 ± 1.25 | A | NA | NA | A | NA | A | 67/128/54 | NA | NA | 98 ± 46 |
GSE45255 | 55 ± 12 | 2.9 ± 1.3 | A | A | A | A | A | A | 17/52/67 | 48 ± 22 | 51 ± 25 | 54 ± 21 |
GSE1456 | NA | NA | NA | NA | NA | NA | NA | NA | 28/58/61 | 72 ± 29 | NA | 77 ± 23 |
GSE21653 | 54 ± 14 | NA | A | NA | NA | A | A | A | 45/89/125 | 60 ± 41 | NA | NA |
GSE17907 | 50 ± 14 | NA | A | NA | NA | A | A | A | 3/10/34 | 39 ± 29 | NA | NA |
GSE11121 | NA | 2 ± 0.99 | A | NA | NA | NA | NA | NA | 29/136/35 | NA | 94 ± 51 | NA |
GSE2034 | NA | NA | A | NA | NA | A | NA | NA | NA | 78 ± 42 | NA | NA |
GSE12093 | NA | NA | A | A | A | A | NA | NA | NA | 92 ± 38 | NA | NA |
microRNA expression data
Breast cancer subtypes
Survival analysis
Software parameters
Web server
Validation of BreastMark using the OncotypeDX gene signature
Validation of BreastMark using the MammaPrint gene signature
Receptor tyrosine kinases
Results
The robustness of BreastMark is tested using the 21 genes from OncotypeDX
OncotypeDX category | Gene symbol | BreastMark hazard ratio | BreastMark HR P-value | Sample number | RS weighting |
---|---|---|---|---|---|
Proliferation | KI67 | 1.68 | 4.40e-05 | 902 | +1.04 |
Proliferation | STK15 | 2.32 | 3.93e-11 | 902 | |
Proliferation | Survivin | 1.96 | 8.56e-08 | 902 | |
Proliferation | CCNB1 | 1.89 | 3.63e-06 | 793 | |
Proliferation | MYBL2 | 1.76 | 8.01e-06 | 902 | |
Invasion | MMP11 | 1.55 | 1.00e-03 | 875 | +0.1 |
Invasion | CTSL2 | 1.42 | 7.12e-03 | 875 | |
HER2 | GRB7 | 1.26 | 0.07 | 902 | +0.47 |
HER2 | HER2 | 1.03 | 0.83 | 875 | |
ER | ER | 1.32 | 0.05 | 875 | -0.34 |
ER | PGR | 0.80 | 0.08 | 902 | |
ER | BCL2 | 0.75 | 0.03 | 875 | |
ER | SCUBE2 | 0.71 | 0.03 | 628 | |
Other | GSTM1 | 0.92 | 0.56 | 651 | -0.08 |
Other | CD68 | 0.96 | 0.74 | 902 | +0.05 |
Other | BAG1 | 1.01 | 0.91 | 902 | -0.07 |
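For orientation, the RS weighting column can be read as the coefficients of the unscaled OncotypeDX recurrence score, with one weight per gene group and one per individual gene in the "Other" category. The expression below is a sketch that uses only the weights listed in the table. The symbols $G_{\mathrm{HER2}}$, $G_{\mathrm{ER}}$, $G_{\mathrm{prolif}}$ and $G_{\mathrm{inv}}$ are notation introduced here for the four group scores, and how each group score is derived from its member genes is not reproduced.

```latex
\mathrm{RS}_u = 0.47\,G_{\mathrm{HER2}} - 0.34\,G_{\mathrm{ER}} + 1.04\,G_{\mathrm{prolif}} + 0.1\,G_{\mathrm{inv}}
              + 0.05\,\mathrm{CD68} - 0.08\,\mathrm{GSTM1} - 0.07\,\mathrm{BAG1}
```

Under this reading, a positive weight means that higher expression raises the score, which matches the direction of the hazard ratios above 1 for the proliferation and invasion genes.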
BreastMark is consistent with the MammaPrint gene signature
Entrez Gene ID | Gene symbol | Hazard ratio | P-value | Sample number | MammaPrint correlation with prognosis |
---|---|---|---|---|---|
Good prognosis | |||||
8659 | ALDH4 | 0.92 | 0.42 | 1105 | 0.421 |
8817 | FGF18 | 0.86 | 0.16 | 1183 | 0.411 |
27113 | BBC3 | 0.76 | 0.03 | 1004 | 0.407 |
57593 | KIAA1442 | NA | NA | NA | 0.402 |
57758 | CEGP1 | 0.69 | 5.37e-03 | 819 | 0.400 |
146923 | RUNDC1 | 0.53 | 2.23e-03 | 387 | 0.390 |
8840 | WISP1 | 0.85 | 0.13 | 1183 | 0.384 |
2947 | GSTM3 | 0.79 | 0.02 | 1183 | 0.380 |
151126 | ZNF533 | 0.84 | 0.39 | 382 | 0.375 |
146760 | RTN4RL1 | 0.84 | 0.45 | 281 | 0.374 |
10455 | PECI | 0.81 | 0.05 | 1059 | 0.373 |
7043 | TGFB3 | 0.83 | 0.09 | 1155 | 0.372 |
55351 | HSA250839 | 0.71 | 2.48e-03 | 1109 | 0.368 |
10455 | PECI | 0.88 | 0.05 | 1059 | 0.366 |
58475 | CFFM4 | 0.67 | 0.01 | 510 | 0.364 |
163 | AP2B1 | 0.84 | 0.10 | 1155 | 0.363 |
79132 | LGP2 | 0.67 | 1.70e-03 | 986 | 0.363 |
Poor prognosis | |||||
55321 | C20orf46 | 1.09 | 0.41 | 1137 | -0.356 |
11082 | ESM1 | 1.41 | 1.71e-03 | 1139 | -0.357 |
9134 | CCNE2 | 1.74 | 2.74e-06 | 1032 | -0.357 |
54583 | EGLN1 | 1.44 | 2.13e-03 | 981 | -0.357 |
1058 | CENPA | 1.94 | 1.26e-09 | 1183 | -0.358 |
9055 | PRC1 | 1.87 | 1.03e-08 | 1137 | -0.358 |
445815 | AKAP2 | 1.01 | 0.95 | 928 | -0.360 |
10874 | NMU | 1.51 | 1.12e-04 | 1183 | -0.360 |
3488 | IGFBP5 | 1.18 | 0.12 | 1155 | -0.360 |
10531 | MP1 | 1.08 | 0.52 | 893 | -0.361 |
57110 | LOC57110 | 1.50 | 2.16e-04 | 1109 | -0.361 |
3488 | IGFBP5 | 1.19 | 0.12 | 1155 | -0.361 |
8577 | TMEFF1 | 1.30 | 0.02 | 1077 | -0.362 |
4175 | MCM6 | 1.84 | 1.56e-08 | 1183 | -0.364 |
643008 | LOC643008 | NA | NA | NA | -0.365 |
83879 | CDCA7 | 1.02 | 0.93 | 387 | -0.365 |
5984 | RFC4 | 1.62 | 6.38e-06 | 1183 | -0.366 |
23594 | ORC6L | 1.80 | 7.32e-08 | 1137 | -0.366 |
6515 | SLC2A3 | 1.12 | 0.29 | 1155 | -0.366 |
57211 | DKFZP564D0462 | 0.96 | 0.72 | 1004 | -0.367 |
79791 | FBXO31 | 0.85 | 0.13 | 1137 | -0.367 |
1633 | DCK | 1.36 | 4.67e-03 | 1155 | -0.368 |
51514 | L2DTL | 1.62 | 1.19e-05 | 1109 | -0.369 |
1284 | COL4A2 | 1.22 | 0.10 | 1004 | -0.371 |
9833 | KIAA0175 | 1.82 | 2.21e-08 | 1183 | -0.371 |
92140 | MTDH | 1.32 | 0.01 | 1155 | -0.373 |
51377 | UCH37 | 1.19 | 0.11 | 1137 | -0.374 |
51560 | RAB6B | 0.98 | 0.84 | 1109 | -0.376 |
160897 | GPR180 | 1.24 | 0.31 | 337 | -0.379 |
79888 | FLJ12443 | 1.31 | 0.02 | 1004 | -0.381 |
8293 | SERF1A | 1.54 | 0.44 | 28 | -0.383 |
8476 | PK428 | 1.19 | 0.10 | 1183 | -0.384 |
10403 | HEC | 1.34 | 7.04e-03 | 1183 | -0.386 |
8833 | GMPS | 1.37 | 3.12e-03 | 1183 | -0.386 |
1894 | ECT2 | 1.59 | 1.70e-05 | 1137 | -0.390 |
4318 | MMP9 | 1.25 | 0.04 | 1183 | -0.392 |
5019 | OXCT | 1.00 | 0.99 | 1183 | -0.392 |
2781 | GNAZ | 1.08 | 0.49 | 1155 | -0.396 |
2321 | FLT1 | 1.05 | 0.71 | 857 | -0.398 |
2131 | EXT1 | 1.25 | 0.04 | 1183 | -0.400 |
56942 | DC13 | 1.80 | 4.69e-08 | 1137 | -0.400 |
81624 | DIAPH3 | 1.08 | 0.52 | 998 | -0.405 |
81624 | DIAPH3 | 1.08 | 0.52 | 998 | -0.409 |
169714 | QSOX2 | 1.57 | 0.04 | 343 | -0.415 |
286052 | LOC286052 | NA | NA | NA | -0.424 |
51203 | LOC51203 | 1.83 | 2.44e-08 | 1137 | -0.425 |
81624 | DIAPH3 | 1.08 | 0.52 | 998 | -0.433 |
85453 | TSPYL5 | 0.96 | 0.72 | 999 | -0.527 |
miRNAs associated with prognosis in breast cancer
Receptor tyrosine kinases associated with poor survival in the basal molecular subtype
Gene name | Gene description | Survival end point | Molecular classifier | Expression cut-off | Hazard ratio | P-value | Sample number |
---|---|---|---|---|---|---|---|
EPHA5 | EPH receptor A5 | OS | SSP2003 | Median | 2.03 | 3.36e-03 | 233 |
EPHA5 | | DFS | SSP2006 | Median | 1.37 | 0.05 | 422 |
EPHA5 | | OS | SSP2006 | Median | 1.59 | 0.05 | 271 |
FGFR1 | fibroblast growth factor receptor 1 | DFS | SSP2006 | High | 1.43 | 0.02 | 465 |
FGFR1 | | DFS | PAM50 | High | 1.36 | 0.05 | 408 |
FGFR3 | fibroblast growth factor receptor 3 | OS | SSP2003 | High | 1.63 | 0.04 | 273 |
FGFR3 | | OS | SSP2003 | Median | 1.53 | 0.04 | 273 |
FGFR3 | | OS | SSP2006 | Median | 1.62 | 0.01 | 323 |
FGFR3 | | OS | PAM50 | Median | 1.54 | 0.03 | 293 |
VEGFR1 | vascular endothelial growth factor receptor 1 | DDFS | SSP2003 | Low | 1.84 | 0.05 | 320 |
VEGFR1 | | OS | SSP2003 | Median | 1.53 | 0.05 | 249 |
VEGFR1 | | OS | SSP2006 | High | 1.76 | 7.40e-03 | 284 |
VEGFR1 | | OS | SSP2006 | Median | 1.69 | 9.50e-03 | 284 |
VEGFR1 | | DDFS | SSP2006 | Low | 1.85 | 0.03 | 378 |
VEGFR1 | | DDFS | PAM50 | Low | 2.07 | 0.02 | 365 |
VEGFR1 | | OS | PAM50 | High | 1.61 | 0.04 | 261 |
VEGFR1 | | OS | PAM50 | Median | 1.61 | 0.03 | 261 |
PDGFRβ | platelet-derived growth factor receptor, beta polypeptide | DDFS | SSP2003 | Median | 1.88 | 1.64e-03 | 341 |
PDGFRβ | | DDFS | SSP2003 | High | 2.26 | 9.34e-04 | 341 |
PDGFRβ | | OS | SSP2003 | Median | 1.55 | 0.05 | 273 |
PDGFRβ | | DFS | SSP2006 | Median | 1.37 | 0.02 | 474 |
PDGFRβ | | OS | SSP2006 | Median | 1.72 | 5.84e-03 | 323 |
PDGFRβ | | OS | SSP2006 | High | 2.12 | 1.26e-03 | 323 |
PDGFRβ | | DDFS | SSP2006 | High | 1.76 | 0.01 | 423 |
PDGFRβ | | DFS | SSP2006 | High | 1.50 | 0.01 | 474 |
PDGFRβ | | DDFS | PAM50 | Median | 1.81 | 8.58e-04 | 393 |
PDGFRβ | | DDFS | PAM50 | High | 1.86 | 6.33e-03 | 393 |
PDGFRβ | | OS | PAM50 | High | 1.94 | 7.27e-03 | 293 |
PDGFRβ | | DFS | PAM50 | High | 1.58 | 7.56e-03 | 419 |
PDGFRβ | | DFS | PAM50 | Median | 1.38 | 0.02 | 419 |
PDGFRβ | | DDFS | PAM50 | Low | 1.45 | 0.04 | 393 |
TIE1 | tyrosine kinase with immunoglobulin-like and EGF-like domains 1 | OS | SSP2003 | Median | 1.63 | 0.02 | 273 |
TIE1 | | OS | SSP2006 | Median | 1.70 | 4.82e-03 | 323 |
TIE1 | | OS | PAM50 | Median | 1.56 | 0.03 | 293 |
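The expression cut-off column records how samples were split into two groups before each hazard ratio was estimated. The following is a minimal sketch of such a dichotomisation, not the BreastMark implementation. It assumes that "Low", "Median" and "High" correspond to the 25th, 50th and 75th percentiles of a gene's expression values, which the table itself does not define, and the helper name `split_by_cutoff` is hypothetical.

```python
from statistics import quantiles


def split_by_cutoff(expression, cutoff="median"):
    """Split sample indices into (low_group, high_group) for one gene.

    Assumption: "low", "median" and "high" mean the 25th, 50th and 75th
    percentile of the expression values; this mapping is illustrative only.
    """
    # quantiles() sorts internally and returns the three quartile cut points.
    q1, q2, q3 = quantiles(expression, n=4, method="inclusive")
    threshold = {"low": q1, "median": q2, "high": q3}[cutoff.lower()]

    # Samples strictly above the threshold form the high-expression group.
    high_group = [i for i, value in enumerate(expression) if value > threshold]
    low_group = [i for i, value in enumerate(expression) if value <= threshold]
    return low_group, high_group


if __name__ == "__main__":
    toy_expression = [1, 2, 3, 4, 5, 6, 7, 8]  # made-up values, not study data
    for name in ("Low", "Median", "High"):
        low, high = split_by_cutoff(toy_expression, name)
        print(name, "low group:", low, "high group:", high)
```

The two index lists would then be passed to whatever survival model is used to compare the groups; that step is left out because it depends on the follow-up data and software, neither of which is specified here.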