Results
Sequence conservation analysis for Mtb32A, Mtb39A, Mtb72F and M72
Lineage^a | Strain | NCBI accession number | Mtb39A: no. of aa changes | Mtb39A insertions (no. of aa/nt, position) | Mtb39A deletions (no. of aa/nt, position) | Mtb39A frameshifts (no., position) | % identity, Mtb39A | % identity, Mtb32A | % identity, Mtb72F | % identity, M72 |
---|---|---|---|---|---|---|---|---|---|---|
1 | T17 | PRJNA55273 | 3 | 99.23 | 94.37 | 96.79 | 96.65 | |||
T46 | PRJNA55875 | 3 | 99.23^b | - | - | - | ||||
T92 | PRJNA55099 | 3 | 99.23 | 100 | 99.58 | 99.44 | ||||
EAS054 | PRJNA55133 | 117 | 1 nt, pos 811 | 1, pos 270-391 | 68.92 | 100 | 84.37 | 84.23 | ||
2 | 94_M4241A | PRJNA55095 | 0 | 100 | 100 | 100 | 99.86 | |||
210 (Mtb Beijing) | PRJNA42617 | 30 | 1 aa, pos 274 | 91.3 | 100 | 95.4 | 95.26 | |||
02_1987 | PRJNA55097 | 1 | 99.74 | 100 | 99.86 | 99.72 | ||||
T85 | PRJNA55131 | 1 | 99.74 | 100 | 99.86 | 99.72 | ||||
W-148 (Mtb Beijing; MDR) | PRJNA182020 | 1 | 99.74 | 100 | 99.86 | 99.72 | ||||
3 | - | - | - | - | - | |||||
4 | CDC1551 | PRJNA57775 | 0 | 100 | 100 | 100 | 99.86 | |||
C | PRJNA54359 | 0 | 100^b | - | - | - | ||||
F11 | PRJNA58417 | 1 | 99.74 | 100 | 99.86 | 99.72 | ||||
GM1503 | PRJNA55271 | - | - | - | - | - | 100^b | - | - | |
H37Ra^c | PRJNA58853 | 62 | 1 nt, pos 922 | 1, pos 309–391 | 82.28 | 100 | 90.56 | 90.42 | ||
Haarlem | PRJNA54453 | 0 | 100 | 100 | 100 | 99.86 | ||||
KZN1435 (MDR) | PRJNA59069 | 0 | 100 | 100 | 100 | 99.86 | ||||
98R604INHRIFEM | PRJNA55399 | 0 | 100 | 100 | 100 | 99.86 | ||||
5 | CPHL_A (M. africanum) | PRJNA55877 | 4 | 1 aa, pos 274 | 98.72 | 100 | 99.3 | 99.16 | ||
6 | K85 (M. africanum) | PRJNA55879 | 3 | 2 aa, pos 162 | 1 aa, pos 274 | 98.48 | 100 | 99.17 | 99.03 | |
n.d. | SUMu001 | PRJNA51927 | 8 | 1 aa, pos 274 | 97.70 | 100 | 98.75 | 98.61 | ||
SUMu002 | PRJNA51925 | - | - | - | - | - | 100^b | - | - | |
SUMu003 | PRJNA51931 | 2 | 99.49 | 100 | 99.72 | 99.58 | ||||
SUMu004 | PRJNA51933 | 2 | 1 aa, pos 274 | 99.23 | 100 | 99.58 | 99.44 | |||
SUMu005 | PRJNA51935 | 2 | 1 aa, pos 274 | 99.23 | 100 | 99.58 | 99.44 | |||
SUMu006 | PRJNA91937 | 2 | 99.49 | 100 | 99.72 | 99.58 | ||||
SUMu008 | PRJNA51941 | - | - | - | - | - | 100^b | - | - | |
SUMu010 | PRJNA51945 | - | - | - | - | - | 100^b | - | - | |
KZN605 (XDR) | PRJNA54947 | 0 | 100 | 100 | 100 | 99.86 | ||||
KZN4207 | PRJNA83619 | 0 | 100 | 100 | 100 | 99.86 | ||||
KZNR506 (XDR) | PRJNA47489 | 0 | 100 | 100 | 100 | 99.86 | ||||
KZNV2475 (MDR) | PRJNA47491 | 1 | 99.74 | 100 | 99.86 | 99.72 | ||||
UT205 | PRJNA162183 | 0 | 100 | 100 | 100 | 99.86 | ||||
BTB05-552 | PRJNA51871 | 0 | 100 | 100 | 100 | 99.86 | ||||
BTB05-559 | PRJNA51873 | 0 | 100 | 100 | 100 | 99.86 | ||||
S96-129 | PRJNA51869 | 0 | 100 | 100 | 100 | 99.86 | ||||
CCDC5079 | PRJNA161943 | 1 | 99.74 | 100 | 99.86 | 99.72 | ||||
CCDC5180 | PRJNA161941 | 1 | 99.74 | 100 | 99.86 | 99.72 | ||||
CTRI-2 | PRJNA161997 | 0 | 100 | 100 | 100 | 99.86 | ||||
CTRI-4 (XDR) | PRJNA43175 | 3 | 99.23 | 100 | 99.58 | 99.44 | ||||
R1207 (MDR) | PRJNA46669 | 1 | 99.74 | 100 | 99.86 | 99.72 | ||||
X122 (pre-XDR) | PRJNA46667 | 1 | 99.74 | 100 | 99.86 | 99.72 | ||||
NA-A0008 | PRJNA168604 | 6 | 98.47 | 100 | 99.58 | 99.44 | ||||
NA-A0009 | PRJNA168605 | 3 | 99.23 | 100 | 99.16 | 99.02 | ||||
HN878 | PRJNA46665 | 3 | 99.23 | 100 | 99.58 | 99.44 | ||||
RGTB423 | PRJNA162179 | 2 | 99.23 | 85.63 | 92.48 | 92.34 | ||||
RGTB327 | PRJNA157907 | 24 | 1 nt, pos 480; 2 nt, pos 951; 1 nt, pos 959 | 1 nt, pos 464; 1 nt, pos 468; 1 nt, pos 469; 1 nt, pos 470; 1 nt, pos 475 | 1, pos 139–155; 1, pos 314–316 | 93.91 | 100 | 96.67 | 96.53 | |
Affected Strains; no (%) | 28 (67) | 3 (7) | 8 (19) | 3 (7) | ||||||
Average % identity | 98.08 | 99.55 | 98.71 | 98.57 |
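
For rows that list only amino acid changes (no insertions, deletions or frameshifts), the Mtb39A % identity values appear consistent with a simple count of unchanged residues over the protein length. The following is a minimal sketch of that arithmetic, not the authors' stated method. The helper name `percent_identity` is hypothetical, and the reference length of 391 residues is an assumption inferred from the frameshift ranges ending at position 391.

```python
def percent_identity(differences: int, reference_length: int) -> float:
    """Return the percentage of reference positions that are unchanged."""
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    if not 0 <= differences <= reference_length:
        raise ValueError("differences must lie between 0 and reference_length")
    return 100.0 * (reference_length - differences) / reference_length


# Assumption: Mtb39A reference length of 391 amino acids (inferred, not stated).
MTB39A_LENGTH = 391

# Substitution-only rows: 0, 1 or 3 amino acid changes.
for changes in (0, 1, 3):
    print(changes, round(percent_identity(changes, MTB39A_LENGTH), 2))
```

Rows that also carry insertions, deletions or frameshifts may have been scored against an alignment of different length, so this sketch covers only the substitution-only case.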
MHC-II-binding peptides in Mtb32A, Mtb39A and Mtb72F
Region (alleles assessed, N) | Protein | Predicted epitopes, total (N) | Predicted epitopes, average/allele | DRB1 alleles without predicted epitope (N) | DRB1 alleles without predicted epitope (allele) |
---|---|---|---|---|---|
Overall (N = 158) | Mtb32A | 8792 | 56 | 5 | *03:02, *04:07, *14:10, *14:14, *14:44 |
 | Mtb39A | 14675 | 93 | 2 | *03:02, *16:04 |
 | Mtb72F | 22065 | 140 | 2 | *03:02, *16:04^a |
China (N = 146) | Mtb32A | 5873 | 40 | 5 | *03:02, *04:07, *14:10, *14:14, *14:44 |
 | Mtb39A | 12794 | 88 | 2 | *03:02, *16:04 |
 | Mtb72F | 19109 | 131 | 2 | *03:02, *16:04^a |
N. India (N = 47) | Mtb32A | 2595 | 55 | 2 | *03:02, *04:07 |
 | Mtb39A | 4516 | 96 | 1 | *03:02 |
 | Mtb72F | 6733 | 143 | 1 | *03:02^a |
S. India (N = 25) | Mtb32A | 1087 | 43 | 0 | |
 | Mtb39A | 1993 | 80 | 1 | *16:04 |
 | Mtb72F | 2868 | 115 | 1 | *16:04^a |
N.E. India (N = 42) | Mtb32A | 2155 | 51 | 2 | *03:02, *04:07 |
 | Mtb39A | 3517 | 84 | 1 | *03:02 |
 | Mtb72F | 5298 | 126 | 1 | *03:02^a |
India (Total) (N = 76) | Mtb32A | 2807 | 37 | 2 | *03:02, *04:07 |
 | Mtb39A | 4634 | 61 | 2 | *03:02, *16:04 |
 | Mtb72F | 6876 | 90 | 2 | *03:02, *16:04^a |
S.S. Africa^b (N = 66) | Mtb32A | 4447 | 67 | 1 | *03:02 |
 | Mtb39A | 6977 | 106 | 1 | *03:02 |
 | Mtb72F | 10737 | 163 | 1 | *03:02^a |

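
The "average/allele" column is consistent with dividing the total number of predicted epitopes by the number of alleles assessed. A minimal sketch using the "Overall" rows (N = 158) follows. The function name `average_per_allele` is hypothetical, and rounding to the nearest integer is an assumption about how the averages were reported.

```python
def average_per_allele(total_epitopes: int, alleles_assessed: int) -> int:
    """Average number of predicted epitopes per allele, rounded to an integer."""
    if alleles_assessed <= 0:
        raise ValueError("alleles_assessed must be positive")
    return round(total_epitopes / alleles_assessed)


# Totals taken from the "Overall" rows of the table above.
overall_totals = {"Mtb32A": 8792, "Mtb39A": 14675, "Mtb72F": 22065}
N_OVERALL = 158

averages = {
    protein: average_per_allele(total, N_OVERALL)
    for protein, total in overall_totals.items()
}
print(averages)
```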
HLA molecule | Mtb32A | Mtb39A | Mtb72F | M72 |
---|---|---|---|---|
DRB3*01:01 | 18 | 7 | 31 | 31 |
DRB3*02:02 | 19 | 56 | 71 | 71 |
DRB3*03:01 | 51 | 82 | 125 | 125 |
DRB4*01:01 | 27 | 38 | 58 | 58 |
DRB5*01:01 | 32 | 85 | 109 | 109 |
DRB5*01:02 | 45 | 107 | 145 | 145 |
DQA1*05:01-DQB1*02:01^a | 14 | 35 | 51 | 51 |
DQA1*02:01-DQB1*02:01^b | 0 | 0 | 0 | 0 |
DQA1*05:01-DQB1*03:01^a | 291 | 325 | 614 | 610 |
DQA1*03:01-DQB1*03:02^a | 17 | 75 | 88 | 88 |
DQA1*04:01-DQB1*04:02^a | 15 | 75 | 87 | 88 |
DQA1*01:01-DQB1*05:01^a | 8 | 0 | 1 | 1 |
DQA1*01:02-DQB1*05:02 | 26 | 37 | 61 | 61 |
DQA1*01:02-DQB1*06:02^a | 161 | 262 | 392 | 392 |
DPA1*02:01-DPB1*01:01^c | 27 | 33 | 56 | 56 |
DPA1*01:03-DPB1*02:01^c | 20 | 23 | 39 | 39 |
DPA1*01:03-DPB1*04:01^c | 14 | 11 | 16 | 16 |
DPA1*01:03-DPB1*04:02^c | 18 | 17 | 28 | 28 |
DPA1*02:02-DPB1*05:01^b,c | 8 | 0 | 0 | 0 |
DPA1*02:01-DPB1*14:01 | 50 | 128 | 170 | 170 |
Impact of the alterations made to construct Mtb72F and M72, on MHC-II binding predictions
Alteration introduced in Mtb72F | Change in the number of covered^a alleles for Mtb72F | Explanation |
---|---|---|
Deletion of the Mtb32A signal sequence. | Loss of 1 covered allele containing 28 predicted epitopes | Epitopes in the Mtb32A signal sequence were predicted for 149 of the 158 alleles assessed. For 148 of the 149 alleles, epitopes were also predicted for the other parts of the protein. Only for 1 allele (DRB1*16:04), all 28 predicted epitopes were located in the Mtb32A signal sequence, and were thus not predicted for Mtb72F. |
Splitting the Mtb32A sequence upstream and downstream of ‘TAAS’ sequence. | No changes in the number of covered alleles. | For each allele with an epitope predicted in this part of the protein there was also an epitope predicted in other parts of the protein. There was an overall loss of 14 predicted epitopes. |
Addition of a poly-His tag (MHHHHHH) at the Mtb32A C-terminal end. | No changes in the number of covered alleles. | One epitope (MHHHHHHTAASDNFQ, binding to DRB1*08:18) was predicted for the Met-His tag in Mtb72F. There were also other epitopes predicted for this allele. |
Addition of 2-amino acid hinge sequences at the junction sites between Mtb32C and Mtb39A (EF), and between Mtb39A and Mtb32N (DI). | No changes in the number of covered alleles. | Adding the EF and DI sequences resulted in 13 and 5 additional predicted epitopes, binding to 43 and 31 alleles, respectively. However, the number of alleles with at least one predicted epitope did not change. |
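
The reasoning in this table rests on one rule: an allele stays "covered" after a sequence alteration as long as at least one of its predicted epitopes lies outside the altered region. The sketch below illustrates that rule on toy data only. The allele names, epitope positions and removed region are all hypothetical and are not taken from the study, and the helper names `overlaps` and `covered_alleles` are illustrative.

```python
from typing import Dict, List, Set, Tuple

Span = Tuple[int, int]  # 1-based inclusive (start, end) positions


def overlaps(a: Span, b: Span) -> bool:
    """True if two inclusive spans share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def covered_alleles(predictions: Dict[str, List[Span]], removed: Span) -> Set[str]:
    """Alleles that keep at least one predicted epitope outside the removed region."""
    return {
        allele
        for allele, epitopes in predictions.items()
        if any(not overlaps(epitope, removed) for epitope in epitopes)
    }


# Hypothetical toy data: positions and allele names are illustrative only.
toy_predictions = {
    "allele_A": [(5, 19), (120, 134)],  # one epitope inside, one outside
    "allele_B": [(10, 24)],             # only epitope lies inside the region
    "allele_C": [],                     # no predicted epitope at all
}
toy_removed_region = (1, 32)

before = {allele for allele, epitopes in toy_predictions.items() if epitopes}
after = covered_alleles(toy_predictions, toy_removed_region)
print(sorted(before - after))  # alleles that lose coverage
```

In this toy case only `allele_B` loses coverage, which mirrors the situation described for the single allele whose predicted epitopes were all located in the Mtb32A signal sequence.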
Comparison of MHC-II binding predictions for 3 DRB1 alleles with experimental data
Performance algorithm | DRB1*0101 (DR1) | | | DRB1*1501 (DR2) | | | DRB1*0401 (DR4) | | |
---|---|---|---|---|---|---|---|---|---|
 | Mtb32A | Mtb39A | Mtb72F | Mtb32A | Mtb39A | Mtb72F | Mtb32A | Mtb39A | Mtb72F |
 | **INAFSVGSGQTYGVD** | VNEAEYGEMWAQDAA | | LSQDRFADFPALPLD | **MLKGFAPAAAAQAVQ** | | **LNGLIQFDAAIQPGD** | **TAYGLTVPPPVIAEN** | |
 | GSGQTYGVDVVGYDR | **AYETAYGLTVPPPVI** | | DRFADFPALPLDPSA | KTVSPHRSPISNMVS | | NFQLSQGGQGFAIPI | VVWGLTVGSWIGSSA | |
 | **ATDINAFSVGSGQTY** | **AEYGEMWAQDAAAMF** | | APAQAAPPALSQDRF^a | MSSLGSSLGSSGLGG | | LNGHHPGDVISVTWQ | GLTVGSWIGSSAGLM | |
 | **YDRTQDVAVLQLRGA** | **GEMWAQDAAAMFGYA** | | | **SSAGLMVAAASPYVA** | | LTNNHVIAGATDINA | GVAANLGRAASVGSL | |
 | **VAVLQLRGAGGLPSA** | **AAAAYETAYGLTVPP** | | | | | QTYGVDVVGYDRTQD | ANLGRAASVGSLSVP | |
 | GGQGGTPRAVPGRVV | **ASVGSLSVPQAWAAA** | | | | | VPGRVVALGQTVQAS | AAAAYETAYGLTVPP | |
 | QTYGVDVVGYDRTQD | **VVWGLTVGSWIGSSA** | | | | | IPIGQAMAIAGQIRS | ASAFQSVVWGLTVGS | |
 | FSVGSGQTYGVDVVG | **VRVAAAAYETAYGLT** | | | | | NGARVQRVVGSAPAA | VTPAARALPLTSLTS | |
 | **IPIGQAMAIAGQIRS** | AIAVNEAEYGEMWAQ | | | | | INAFSVGSGQTYGVD | **LMILIATNLLGQNTP** | |
 | | **MLKGFAPAAAAQAVQ** | | | | | **RVQRVVGSAPAASLG** | VAAASPYVAWMSVTA | |
 | | **AENRAELMILIATNL** | | | | | AGQIRSGGGSPTVHI | **ASPYVAWMSVTAGQA** | |
 | | **ASAFQSVVWGLTVGS** | | | | | GSGQTYGVDVVGYDR | **VGSWIGSSAGLMVAA** | |
 | | **FSAASAFQSVVWGLT** | | | | | **LQLRGAGGLPSAAIG** | AENRAELMILIATNL | |
 | | **SSAGLMVAAASPYVA** | | | | | IAGATDINAFSVGSG | SASLVAAAQMWDSVA | |
 | | | | | | | RVVALGQTVQASDSL | **SSAGLMVAAASPYVA** | |
 | | | | | | | YDRTQDVAVLQLRGA | LPPEINSARMYAGPG | |
 | | | | | | | FSVGSGQTYGVDVVG | GQAELTAAQVRVAAA | |
 | | | | | | | RVVGSAPAASLGIST | **MLKGFAPAAAAQAVQ** | |
 | | | | | | | **VAVLQLRGAGGLPSA** | **AAQVRVAAAAYETAY** | |
 | | | | | | | GFAIPIGQAMAIAGQ | **YVAWMSVTAGQAELT** | |
 | | | | | | | GVDVVGYDRTQDVAV | **DAAAMFGYAAATATA** | |
 | | | | | | | ATDINAFSVGSGQTY | PSSKLGGLWKTVSPH | |
 | | | | | | | PLDPSAMVAQVGPQV | VTAGQAELTAAQVRV | |
 | | | | | | | TQDVAVLQLRGAGGL | AVQTAAQNGVRAMSS | |
 | | | | | | | GGTPRAVPGRVVALG | **LIATNLLGQNTPAIA** | |
 | | | | | | | NHVIAGATDINAFSV | **GEMWAQDAAAMFGYA** | |
 | | | | | | | GGGSPTVHIGPTAFL | MYAGPGSASLVAAAQ | |
 | | | | | | | GTGIVIDPNGVVLTN | VRVAAAAYETAYGLT | |
 | | | | | | | RWSWLLSVLAAVGLG^a | **FSAASAFQSVVWGLT** | |
 | | | | | | | | AIAVNEAEYGEMWAQ | |
 | | | | | | | | **AYETAYGLTVPPPVI** | |
 | | | | | | | | ASVGSLSVPQAWAAA | |
Total binders determined in vitro | 9 | 14 | 23 | 3 | 4 | 6 | 29 | 32 | 60 |
Total non-binders^b | 106 | 113 | 202 | 112 | 123 | 219 | 86 | 95 | 165 |
True predicted binders (TP) | 5 | 12 | 17 | 0 | 2 | 2 | 4 | 13 | 17 |
True predicted non-binders (TN) | 45 | 28 | 68 | 102 | 108 | 197 | 77 | 70 | 134 |
Sensitivity prediction algorithm | 55 % | 86 % | 74 % | 0 % | 50 % | 33 % | 14 % | 41 % | 28 % |
Specificity prediction algorithm | 42 % | 25 % | 34 % | 91 % | 88 % | 90 % | 90 % | 74 % | 81 % |
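
The sensitivity and specificity rows follow the standard definitions: sensitivity is the share of in vitro binders that were correctly predicted (TP / total binders), and specificity is the share of non-binders that were correctly predicted (TN / total non-binders). A minimal sketch using the DRB1*0101 (DR1) Mtb39A column as input is shown below. The function names are illustrative, and reporting as whole percentages is an assumption.

```python
def sensitivity(true_positives: int, total_binders: int) -> float:
    """Percentage of experimentally determined binders that were predicted."""
    if total_binders <= 0:
        raise ValueError("total_binders must be positive")
    return 100.0 * true_positives / total_binders


def specificity(true_negatives: int, total_non_binders: int) -> float:
    """Percentage of experimentally determined non-binders that were predicted."""
    if total_non_binders <= 0:
        raise ValueError("total_non_binders must be positive")
    return 100.0 * true_negatives / total_non_binders


# Counts from the DRB1*0101 (DR1) / Mtb39A column of the table above.
dr1_mtb39a = {"binders": 14, "non_binders": 113, "tp": 12, "tn": 28}

sens = sensitivity(dr1_mtb39a["tp"], dr1_mtb39a["binders"])
spec = specificity(dr1_mtb39a["tn"], dr1_mtb39a["non_binders"])
print(f"sensitivity: {round(sens)} %, specificity: {round(spec)} %")
```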