Background
Constitution | Voice characteristics |
---|---|
TY | talkative, impatient, clear, influential, loud, resonant |
SY | vigorous, clear, fruity, talkative, fast, hasty, illogical, impatient, high-pitched |
TE | regular, taciturn, thick, loud, resonant, grave, dignified |
SE | unstrained, artless, easy, sharp, not clear, not hoarse, still, calm, gentle, slow, low |
Methods
Overview
Voice data acquisition
Vocal feature extraction
Features | Descriptions |
---|---|
sF0, sFSTD | Average pitch and standard deviation |
sI0, sISTD | Average intensity and standard deviation |
sMFCC0 ~ 12 | 13 Mel-frequency cepstral coefficients |
sSPD | Reading speed for a sentence |
sLPR1 | Log power ratio (60 ~ 240 Hz/240 ~ 960 Hz) |
sLPR2 | Log power ratio (240 ~ 960 Hz/960 ~ 3840 Hz) |
sLPR3 | Log power ratio (60 ~ 240 Hz/960 ~ 3840 Hz) |
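The log power ratio features compare the spectral energy in two frequency bands. The following is a minimal sketch of one way such a ratio could be computed. It assumes that band power is the sum of squared DFT magnitudes over a half-open band and that the logarithm is base 10, neither of which is specified in the table. The names `band_power` and `log_power_ratio` and the synthetic test signal are hypothetical and used only for illustration.

```python
import cmath
import math

def band_power(samples, fs, f_lo, f_hi):
    """Sum of squared DFT magnitudes over bins with f_lo <= f < f_hi (Hz)."""
    n = len(samples)
    k_lo = math.ceil(f_lo * n / fs)
    k_hi = math.ceil(f_hi * n / fs)  # exclusive upper bin
    power = 0.0
    for k in range(k_lo, k_hi):
        # Naive DFT coefficient for bin k (fine for a short illustrative signal).
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(samples))
        power += abs(coeff) ** 2
    return power

def log_power_ratio(samples, fs, band_a, band_b):
    """log10 of power in band_a over power in band_b (base 10 is an assumption)."""
    return math.log10(band_power(samples, fs, *band_a)
                      / band_power(samples, fs, *band_b))

# Band edges taken from the feature table (Hz).
LOW, MID, HIGH = (60, 240), (240, 960), (960, 3840)

# Hypothetical signal: one sinusoid per band, amplitudes 1, 0.1 and 0.01.
fs, n = 8000, 400
signal = [math.sin(2 * math.pi * 100 * t / fs)
          + 0.1 * math.sin(2 * math.pi * 500 * t / fs)
          + 0.01 * math.sin(2 * math.pi * 2000 * t / fs)
          for t in range(n)]

slpr1 = log_power_ratio(signal, fs, LOW, MID)   # 60-240 Hz vs 240-960 Hz
slpr2 = log_power_ratio(signal, fs, MID, HIGH)  # 240-960 Hz vs 960-3840 Hz
slpr3 = log_power_ratio(signal, fs, LOW, HIGH)  # 60-240 Hz vs 960-3840 Hz
```

With these amplitudes the power ratios between adjacent bands are 100, so `slpr1` and `slpr2` come out close to 2 and `slpr3` close to 4. A real pipeline would more likely use a windowed FFT over voiced frames than a naive DFT.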
Data preprocessing
Age (years) | Male TE | Male SE | Male SY | Female TE | Female SE | Female SY |
---|---|---|---|---|---|---|
Train | ||||||
Age 15-19 | †6 (27.3) | 9 (40.9) | 7 (31.8) | 7 (30.4) | 8 (34.8) | 8 (34.8) |
20-29 | 26 (37.7) | 25 (36.2) | 18 (26.1) | 47 (29.0) | 54 (33.3) | 61 (37.7) |
30-39 | 45 (37.8) | 36 (30.3) | 38 (31.9) | 63 (26.9) | 77 (32.9) | 94 (40.2) |
40-49 | 59 (42.1) | 34 (24.3) | 47 (33.6) | 98 (34.9) | 70 (24.9) | 113 (40.2) |
50-59 | 72 (47.7) | 31 (20.5) | 48 (31.8) | 90 (36.7) | 66 (26.9) | 89 (36.3) |
60-69 | 50 (46.3) | 17 (15.7) | 41 (38.0) | 74 (42.5) | 39 (22.4) | 61 (35.1) |
>70 | 26 (53.1) | 8 (16.3) | 15 (30.6) | 45 (48.9) | 26 (28.3) | 21 (22.8) |
Total | 284 (43.2) | 160 (24.3) | 214 (32.5) | 424 (35.0) | 340 (28.1) | 447 (36.9) |
Test | ||||||
Age 15-19 | 2 (25.0) | 6 (75.0) | - | 3 (33.3) | 2 (22.2) | 4 (44.4) |
20-29 | 2 (16.7) | 5 (41.7) | 5 (41.7) | 9 (47.4) | 7 (36.8) | 3 (15.8) |
30-39 | 8 (40.0) | 5 (25.0) | 7 (35.0) | 18 (34.0) | 23 (43.4) | 12 (22.6) |
40-49 | 16 (42.1) | 9 (23.7) | 13 (34.2) | 27 (42.2) | 20 (31.3) | 17 (26.6) |
50-59 | 19 (43.2) | 6 (13.6) | 19 (43.2) | 30 (37.5) | 18 (22.5) | 32 (40.0) |
60-69 | 12 (44.4) | 3 (11.1) | 12 (44.4) | 27 (46.6) | 14 (24.1) | 17 (29.3) |
>70 | 8 (50.0) | - | 8 (50.0) | 12 (50.0) | 6 (25.0) | 6 (25.0) |
Total | 67 (40.6) | 34 (20.6) | 64 (38.8) | 126 (41.0) | 90 (29.3) | 91 (29.6) |
Classification method
Results
Classification model using MLR via LASSO
Male | Female | |||||
---|---|---|---|---|---|---|
Features | ||||||
0.310 | −0.309 | 0.000 | 0.042 | −0.157 | 0.116 | |
sF0 | 0.081 | −0.112 | 0.015 | . | ||
sFSTD | −0.209 | 0.168 | −0.048 | . | ||
sI0 | 0.159 | −0.212 | −0.071 | 0.062 | . | |
sISTD | 0.107 | −0.019 | −0.040 | |||
sSPD | −0.004 | 0.081 | 0.048 | −0.015 | ||
sLPR1 | 0.060 | 0.017 | −0.004 | |||
sLPR2 | 0.197 | −0.126 | 0.049 | −0.067 | ||
sLPR3 | −0.031 | −0.012 | ||||
sMFCC0 | −0.007 | 0.207 | ||||
sMFCC1 | −0.213 | 0.065 | −0.027 | 0.081 | ||
sMFCC2 | −0.068 | 0.218 | −0.071 | 0.001 | ||
sMFCC3 | 0.238 | −0.127 | −0.083 | 0.032 | ||
sMFCC4 | 0.124 | 0.020 | ||||
sMFCC5 | 0.096 | −0.105 | 0.012 | −0.010 | ||
sMFCC6 | 0.170 | −0.028 | −0.035 | |||
sMFCC7 | 0.126 | −0.067 | −0.126 | 0.016 | ||
sMFCC8 | 0.032 | 0.263 | ||||
sMFCC9 | −0.119 | 0.215 | −0.038 | |||
sMFCC10 | −0.099 | 0.146 | ||||
sMFCC11 | 0.109 | −0.074 | −0.032 | 0.027 | ||
sMFCC12 | 0.046 | −0.099 | 0.126 |
Constitution | Parameter | Male γ_l | Male S.E. | Male Wald's Z | Female γ_l | Female S.E. | Female Wald's Z |
---|---|---|---|---|---|---|---|
TE (l = 1) | γ0 | −0.284 | 0.341 | −0.834 | −0.498 | 0.275 | −1.810 |
TE (l = 1) | Age | 0.004 | 0.006 | 0.691 | 0.011 | 0.005 | 2.374 |
TE (l = 1) | η1 | 1.254 | 0.442 | 2.838 | 1.116 | 0.234 | 4.776 |
TE (l = 1) | η2 | 0.001 | 0.312 | 0.004 | 0.105 | 0.823 | 0.127 |
TE (l = 1) | η3 | −1.222 | 0.339 | −3.603 | −1.297 | 0.521 | −2.492 |
SE (l = 2) | γ0 | 0.754 | 0.375 | 2.014 | 0.251 | 0.280 | 0.897 |
SE (l = 2) | Age | −0.017 | 0.007 | −2.331 | −0.003 | 0.005 | −0.561 |
SE (l = 2) | η1 | 0.241 | 0.519 | 0.464 | −0.087 | 0.240 | −0.361 |
SE (l = 2) | η2 | 1.078 | 0.372 | 2.897 | 1.453 | 0.863 | 1.683 |
SE (l = 2) | η3 | −1.176 | 0.380 | −3.094 | −1.464 | 0.544 | −2.689 |
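The Wald statistics in this table are the ratio of each estimate to its standard error, and the coefficients for TE (l = 1) and SE (l = 2) can be turned into class probabilities once a reference class is fixed. The sketch below illustrates both steps for the male model. It assumes that SY is the reference category with a linear predictor of zero and that p-values are two-sided under a standard normal; the table does not state either. The function names and the example inputs for age and η1 to η3 are hypothetical.

```python
import math

# Male coefficients copied from the table as (estimate, standard error).
MALE = {
    "TE": {"gamma0": (-0.284, 0.341), "age": (0.004, 0.006),
           "eta1": (1.254, 0.442), "eta2": (0.001, 0.312),
           "eta3": (-1.222, 0.339)},
    "SE": {"gamma0": (0.754, 0.375), "age": (-0.017, 0.007),
           "eta1": (0.241, 0.519), "eta2": (1.078, 0.372),
           "eta3": (-1.176, 0.380)},
}

def wald_z(estimate, std_err):
    """Wald statistic: estimate divided by its standard error."""
    return estimate / std_err

def two_sided_p(z):
    """Two-sided p-value under a standard normal (an assumption here)."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def class_probabilities(age, eta, coeffs=MALE):
    """Baseline-category logit probabilities; SY is ASSUMED to be the reference."""
    scores = {"SY": 0.0}
    for label, c in coeffs.items():
        scores[label] = (c["gamma0"][0] + c["age"][0] * age
                         + c["eta1"][0] * eta[0]
                         + c["eta2"][0] * eta[1]
                         + c["eta3"][0] * eta[2])
    peak = max(scores.values())  # subtract the maximum for numerical stability
    weights = {k: math.exp(v - peak) for k, v in scores.items()}
    norm = sum(weights.values())
    return {k: w / norm for k, w in weights.items()}

z_eta1 = wald_z(*MALE["TE"]["eta1"])   # close to the tabulated 2.838
p_eta1 = two_sided_p(z_eta1)
z_eta2 = wald_z(*MALE["TE"]["eta2"])   # close to the tabulated 0.004
p_eta2 = two_sided_p(z_eta2)

# Hypothetical subject: age 40 with made-up first-stage scores.
probs = class_probabilities(age=40, eta=(0.5, -0.2, -0.3))
```

Recomputed Z values differ slightly from the tabulated ones because the table reports estimates and standard errors rounded to three decimals.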
Classification results
Method | Dataset | True SC | Male predicted TE | Male predicted SE | Male predicted SY | Male total | Male sensitivity | Female predicted TE | Female predicted SE | Female predicted SY | Female total | Female sensitivity |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Do et al. | Train | TE | 185 | 31 | 43 | 259 | 71.4% | 204 | 34 | 150 | 388 | 52.6% |
Do et al. | Train | SE | 52 | 62 | 33 | 147 | 42.2% | 95 | 63 | 144 | 302 | 20.9% |
Do et al. | Train | SY | 84 | 26 | 77 | 187 | 41.2% | 137 | 43 | 229 | 409 | 56.0% |
Do et al. | Train | Total | 321 | 119 | 153 | 593 | | 436 | 140 | 523 | 1099 | |
Do et al. | Train | Accuracy | | | | | 54.6% | | | | | 45.1% |
Do et al. | Test | TE | 46 | 6 | 15 | 67 | 68.7% | 51 | 15 | 60 | 126 | 40.5% |
Do et al. | Test | SE | 14 | 4 | 16 | 34 | 11.8% | 32 | 20 | 38 | 90 | 22.2% |
Do et al. | Test | SY | 37 | 12 | 15 | 64 | 23.4% | 37 | 11 | 43 | 91 | 47.3% |
Do et al. | Test | Total | 97 | 22 | 46 | 165 | | 120 | 46 | 141 | 307 | |
Do et al. | Test | Accuracy | | | | | 39.4% | | | | | 37.1% |
Proposed | Train | TE | 184 | 22 | 53 | 259 | 71.0% | 212 | 20 | 156 | 388 | 54.6% |
Proposed | Train | SE | 86 | 30 | 31 | 147 | 20.4% | 119 | 26 | 157 | 302 | 8.6% |
Proposed | Train | SY | 110 | 15 | 62 | 187 | 33.2% | 137 | 26 | 246 | 409 | 60.1% |
Proposed | Train | Total | 380 | 67 | 146 | 593 | | 468 | 72 | 559 | 1099 | |
Proposed | Train | Accuracy | | | | | 46.5% | | | | | 44.0% |
Proposed | Test | TE | 49 | 2 | 16 | 67 | 73.1% | 70 | 7 | 49 | 126 | 55.6% |
Proposed | Test | SE | 16 | 10 | 8 | 34 | 29.4% | 30 | 7 | 53 | 90 | 7.8% |
Proposed | Test | SY | 41 | 3 | 20 | 64 | 31.3% | 43 | 1 | 47 | 91 | 51.6% |
Proposed | Test | Total | 106 | 15 | 44 | 165 | | 143 | 15 | 149 | 307 | |
Proposed | Test | Accuracy | | | | | 47.9% | | | | | 40.4% |
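The sensitivity and accuracy values in the classification table follow directly from the confusion counts: sensitivity is the diagonal count divided by the row total for each true constitution, and accuracy is the sum of the diagonal divided by the grand total. As a minimal sketch, the snippet below recomputes these for the proposed method on the male test set, using counts copied from the table. The function names are hypothetical.

```python
LABELS = ("TE", "SE", "SY")

# Rows are true SC, columns are predicted SC (proposed method, male test set).
MALE_TEST_PROPOSED = [
    [49, 2, 16],   # true TE
    [16, 10, 8],   # true SE
    [41, 3, 20],   # true SY
]

def sensitivities(matrix, labels=LABELS):
    """Per-class sensitivity: correct predictions over all true cases of the class."""
    return {label: row[i] / sum(row)
            for i, (label, row) in enumerate(zip(labels, matrix))}

def accuracy(matrix):
    """Overall accuracy: diagonal sum over the total number of cases."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

sens = sensitivities(MALE_TEST_PROPOSED)
acc = accuracy(MALE_TEST_PROPOSED)

for label in LABELS:
    print(f"{label}: sensitivity {100 * sens[label]:.1f}%")
print(f"accuracy {100 * acc:.1f}%")
```

The same two functions apply unchanged to any of the other eight confusion matrices in the table, which makes it easy to check the reported percentages against the counts.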