Introduction

Physical fitness underlies the performance of daily physical activity and physical exercise1. It has multiple components, including cardiorespiratory fitness, musculoskeletal strength, endurance, flexibility, agility, and balance2. Physical fitness is considered a predictor of all-cause morbidity and mortality3. A higher physical fitness level in children has been associated with more positive health-related outcomes4, whereas poor physical fitness is a risk factor for cardiovascular diseases5, diabetes mellitus6, and poor mental health7. Moreover, aspects of physical fitness in childhood are predictive of health outcomes later in life8.

Normative values of physical fitness, which place individuals and groups in percentiles and categories, help interpret an individual’s fitness test results by showing how they compare with the general population, and can assist in the diagnosis and prevention of diseases as well as in detecting sporting talent9. The developmental patterns of physical fitness in children and adolescents have been well studied and extensively reviewed, for example in the USA10, Europe11,12, Spain13, Canada14, and Australia15. We previously developed 20-m shuttle run test norms for Chinese children and adolescents16, but reference values for a comprehensive set of physical fitness tests remain scarce.

Given the decline in physical fitness among Chinese children and adolescents17, the present study aimed (1) to develop sex- and age-specific physical fitness reference standards and (2) to express sex- and age-related differences using standardized effect sizes, thus providing a reference for improving the physical fitness of Chinese children and adolescents.

Method

Participants

Data for this study were drawn from the project “Preparation of New Evaluation Methods and Criteria for Physical Health of Children and Adolescents in China” (No. 11001-412221-15017). The project was approved by the Human Experimental Ethics Committee of the East China Normal University (Approval No.: HR2016/12055) and was conducted in 2015–2016 by the Key Laboratory of Adolescent Health Assessment and Exercise Intervention of the Ministry of Education. A stratified randomized cluster sampling method was used to select participants from 27 provinces in six geographical divisions of China: East China (Shanghai, Shandong, Jiangsu, Zhejiang, Anhui, Jiangxi, Fujian), North China (Beijing, Neimenggu, Hebei, Shanxi), Central-South China (Henan, Hubei, Hunan, Guangdong, Guangxi, Hainan), Northwest China (Shaanxi, Xinjiang, Gansu), Southwest China (Sichuan, Guizhou, Xizang, Yunnan), and Northeast China (Heilongjiang, Jilin, Liaoning). Public schools in urban and rural areas, as defined by China's administrative divisions, were selected in each province. Classes were then randomly selected from these schools, and all students in the selected classes without physical or mental disabilities were recruited. The sampling methods have been described in detail elsewhere18.

Finally, a total of 85,535 children and adolescents (48.7% girls) aged 7–18 years were included in the present study. The participants included for each physical fitness test are presented in Table 1. Of these, 2.1% came from Lhasa, Tibet, which lies 3500 m above sea level. Regarding nutritional status, BMI (kg/m2), calculated as body weight (kg) divided by height squared (m2), was used to define thinness, overweight, and obesity according to the WHO standards and classifications19: thinness (BMI Z score < −2), normal (BMI Z score ≥ −2 and ≤ 1), overweight (BMI Z score > 1 and ≤ 2), and obesity (BMI Z score > 2). The prevalence of thinness, normal weight, overweight, and obesity was 1.9%, 68.9%, 12.8%, and 16.4% for boys and 1.3%, 83.3%, 9.5%, and 6.0% for girls, respectively. Before the investigation, verbal and written informed consent was obtained from both the students and their parents. All students’ names were replaced with digital codes to protect their personal information.
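The BMI calculation and the Z-score cut-offs described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure actually used in the study: the function names `bmi` and `classify_bmi_z` are hypothetical, and the BMI-for-age Z score is assumed to have been derived beforehand from the WHO reference tables, which are not reproduced here.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: body weight (kg) divided by height squared (m^2)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def classify_bmi_z(z: float) -> str:
    """Map a BMI-for-age Z score to the four categories named in the text.

    Cut-offs follow the text: thinness (< -2), normal (>= -2 and <= 1),
    overweight (> 1 and <= 2), obesity (> 2).
    """
    if z < -2:
        return "thinness"
    if z <= 1:
        return "normal"
    if z <= 2:
        return "overweight"
    return "obesity"
```

Note that the boundary values themselves (Z = −2, 1, and 2) fall into the normal, normal, and overweight categories, respectively, because the cut-offs are inclusive on those sides.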

Table 1 The averages and deviation of weight, height, and physical fitness tests by age and sex.

Physical fitness measurement

All measurements were carried out in accordance with relevant guidelines20,21 and regulations and were conducted by trained staff. In each school, 1–2 professionals who had majored in human sport science and 4–5 trained and qualified physical education teachers were in charge of the physical fitness tests. To reduce measurement error, the instruments were calibrated before use, and each test was completed at a fixed time of day to limit variation caused by different test times. Physical fitness items included grip strength (reflecting upper-body strength), standing long jump (reflecting lower limb strength), 30-s sit-ups (reflecting abdominal strength), sit and reach (reflecting flexibility), 50-m dash (reflecting speed), 20-s repeated straddling (reflecting agility), and the 20-m shuttle run test (20-m SRT, reflecting cardiorespiratory fitness).

Grip strength

Participants were asked to stand upright with feet shoulder-width apart and the elbow fully extended during the assessment. They were then instructed to squeeze the grip continuously with full force for at least two seconds. The test was performed twice, and the larger value was recorded.

Standing long jump

Participants were instructed to stand behind the starting line, as close to it as possible, and then to push off vigorously and jump horizontally as far as possible, taking off and landing with the feet together and staying upright. The distance from the starting line to the heel of the foot closest to the starting line was recorded. The test was performed twice, and the best score, in centimeters, was retained.

30-s sit-ups

Participants were asked to lie relaxed on the cushion, with the feet held down by an assistant and the hands crossed over the chest. On hearing the starting signal, the participant repeatedly sat up, touched the knee with the forehead, and then lay down quickly. The number of times the forehead touched the knee within 30 s was recorded as the result.

Sit and reach

The participant sat on a mat with shoes removed, with both legs shoulder-width apart and fully extended, heels on the pad of the instrument. The height of the guide rail was adjusted to keep the participant’s toes even with the lower edge of the marker. The participant was then instructed to slowly reach forward and push the marker forward with the middle fingertips of both hands as far as possible on the scale. Two trials were completed, and the greater distance was recorded as the result of the sit and reach test.

50-m dash

The result of the 50-m dash was the time taken to run 50 m from the starting line. The participants were instructed to run toward the finish line as fast as they could immediately on hearing the starting signal. The result was recorded to the nearest 0.1 s.

20-s repeated straddling

Three parallel lines were marked on the ground 100 cm apart. Participants stood astride the central line and, on hearing the starting signal, moved sideways to the right line, back to the central line, to the left line, back to the central line, and so on. Jumping was prohibited. The number of straddles completed in 20 s was recorded as the result.

20-m SRT

The 20-m SRT involves continuous running back and forth between two parallel lines 20 m apart in time to audio signals. It comprises several stages (also called levels), each lasting about one minute and comprising several 20-m laps (also called shuttles). The required running speed increases at each stage until the child can no longer run the 20-m distance in time with the audio signal (on 2 consecutive occasions) or stops due to volitional fatigue. The last lap completed was recorded as the result.
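The stopping rule and scoring described above can be illustrated with a short Python sketch. It rests on assumptions that the text does not spell out: each lap attempt is represented as a boolean (whether the line was reached in time with the signal), the end of the sequence stands for the child stopping voluntarily, and the score is taken to be the number of the last lap finished in time. The function name `srt_score` is hypothetical.

```python
from typing import Sequence


def srt_score(in_time: Sequence[bool]) -> int:
    """Return the number of the last lap completed in time.

    in_time[i] is True when lap i + 1 was finished before the audio signal.
    The test ends after two consecutive misses, or when the sequence ends
    (assumed here to represent the child stopping due to fatigue).
    """
    last_completed = 0
    consecutive_misses = 0
    for lap_number, ok in enumerate(in_time, start=1):
        if ok:
            last_completed = lap_number
            consecutive_misses = 0
        else:
            consecutive_misses += 1
            if consecutive_misses == 2:
                break  # second consecutive miss ends the test
    return last_completed
```

Under these assumptions, a single missed lap followed by a lap completed in time does not end the test, and any attempts listed after the second consecutive miss are ignored.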

Statistical analysis

All statistical analyses were performed using LMS Chartmaker Pro version 2.43 (Institute of Child Health, London) and SPSS version 25.0 (IBM, Armonk, NY, USA). The level of statistical significance was set at 0.05. Percentile curves for each physical fitness test were calculated using the LMS method, which summarizes the changing distribution of a measurement across age with three smooth curves representing skewness (L, expressed as a Box-Cox power transformation), median (M), and coefficient of variation (S). Smooth centile curves were fitted to obtain the sex- and age-specific norms for Chinese children and youth; the effective degrees of freedom were 2 (L curve), 4 (M curve), and 2 (S curve) for both boys and girls. Finally, the age- and sex-specific percentile values were calculated for each physical fitness test.
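For readers unfamiliar with the LMS method, the sketch below shows how a percentile value is recovered from the three fitted parameters at a given age, using the standard LMS relationship C = M(1 + L·S·z)^(1/L), with C = M·exp(S·z) when L = 0, where z is the standard normal deviate for the percentile. It does not reproduce the curve fitting performed in LMS Chartmaker; the function names and all numeric values in the usage example are hypothetical.

```python
import math
from statistics import NormalDist


def lms_centile(l: float, m: float, s: float, percentile: float) -> float:
    """Measurement value at a percentile (between 0 and 100) given L, M, S."""
    z = NormalDist().inv_cdf(percentile / 100.0)
    if abs(l) < 1e-12:
        # Limiting case of the Box-Cox transformation as L approaches 0.
        return m * math.exp(s * z)
    base = 1.0 + l * s * z
    if base <= 0:
        raise ValueError("percentile lies outside the range supported by L, M, S")
    return m * base ** (1.0 / l)


def lms_z_score(l: float, m: float, s: float, value: float) -> float:
    """Inverse operation: Z score of an observed value given L, M, S."""
    if abs(l) < 1e-12:
        return math.log(value / m) / s
    return ((value / m) ** l - 1.0) / (l * s)


# Hypothetical parameters, for illustration only (not taken from the study).
p85 = lms_centile(0.5, 20.0, 0.2, 85)
z_back = lms_z_score(0.5, 20.0, 0.2, p85)  # recovers the z value for P85
```

At the 50th percentile z is 0, so the centile equals M regardless of L and S, which is why the M curve is the median curve.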

Age- and sex-related differences in means were expressed as standardized effect sizes for each fitness test. In the age-related analysis, the mean of each test for 7-year-old boys and girls, respectively, was taken as the reference, and standardized effect sizes were calculated for children and adolescents aged 8–18 years. Similarly, in the sex-related analysis, the mean of each test for girls at each age from 7 to 18 years was taken as the reference, and standardized effect sizes were obtained for boys of the same age. Positive effect sizes indicated that mean fitness test performances for older children (in the age-related analysis) or boys (in the sex-related analysis) were higher than those for 7-year-old children or girls, respectively. Effect sizes of 0.2, 0.5, and 0.8 were used as thresholds for small, moderate, and large differences22.
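A standardized effect size of this kind can be sketched as follows. The text does not state which standard deviation was used as the denominator, so the pooled standard deviation of the two groups (as in Cohen's d) is an assumption here, as are the function names and the "negligible" label for values below 0.2.

```python
import math


def standardized_effect_size(
    mean_group: float, sd_group: float, n_group: int,
    mean_ref: float, sd_ref: float, n_ref: int,
) -> float:
    """Difference in means divided by the pooled SD (assumed denominator)."""
    pooled_var = (
        (n_group - 1) * sd_group ** 2 + (n_ref - 1) * sd_ref ** 2
    ) / (n_group + n_ref - 2)
    return (mean_group - mean_ref) / math.sqrt(pooled_var)


def magnitude(effect_size: float) -> str:
    """Label an effect size using the 0.2 / 0.5 / 0.8 thresholds in the text."""
    size = abs(effect_size)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "moderate"
    if size >= 0.2:
        return "small"
    return "negligible"  # label assumed; the text names only three categories
```

For a timed test such as the 50-m dash, a lower value is the better performance, so the sign of the raw difference would presumably need to be reversed to match the convention that positive values indicate better performance; the text does not say how this was handled.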

Result

Table 1 shows the averages and deviation of weight, height, and physical fitness tests by age and sex. Table 2 shows the sex- and age-specific percentile values (5th, 15th, 25th, 35th, 45th, 50th, 55th, 65th, 75th, 85th, and 95th percentiles) for each physical fitness test. Figure 1 shows the percentile curves for the 5th, 25th, 50th, 75th, and 95th percentiles for all the physical fitness measures across age and sex groups. In general, performance improved with age across the analyzed percentiles for most tests. For example, from 7 to 18 years of age, the standing long jump score at P50 increased by 91.8% for boys and 47.0% for girls (Table 2).

Table 2 Reference standards of the seven physical fitness tests by age and sex for Chinese children and adolescents.
Figure 1

Smoothed centile curves of seven physical fitness tests for Chinese children and adolescents.

Age-related differences for each test are shown in Fig. 2. Large differences were observed in the older age groups, for example above 8 years of age for grip strength and standing long jump. The largest rate of increase occurred in the teenage years, especially for the muscular fitness tests. Boys performed considerably better than girls in grip strength, standing long jump, and 50-m dash at all ages and across the analyzed percentiles. Taking grip strength as an example, boys outperformed girls at the 50th percentile by 6.6% at 9 years of age and by 61.7% at 18 years of age. However, the advantage of boys was smaller at 7 years of age, since girls at this age performed better in 30-s sit-ups below P60, in the 20-m SRT at P10, and in 20-s repeated straddling at most percentiles. For sit and reach, girls had better values than boys, with the largest difference at P50 (8.7%) at 10 years of age.

Figure 2

Age-related difference in each physical fitness test ((A) grip strength, (B) 20-m SRT, (C) 30-s sit-ups, (D) 50-m dash, (E) standing long jump, (F) 20-s repeated straddling, (G) sit and reach) expressed as standardized effect sizes (anchored to age 7 years = 0). The limits of the grey zone represent the threshold for a large standardized difference (i.e., 0.8 or − 0.8). Positive effect sizes indicated that mean fitness test performances for older children and adolescents were higher than those for 7 years old children.

Sex-related differences for each test are shown in Fig. 3. Large differences were observed for grip strength, standing long jump, 50-m dash, and 20-m SRT above 13 or 14 years of age, which also shows that sex differences increased with age.

Figure 3

Sex-related difference in each physical fitness test ((A) grip strength, (B) 20-m SRT, (C) 30-s sit-ups, (D) 50-m dash, (E) standing long jump, (F) 20-s repeated straddling, (G) sit and reach) expressed as standardized effect sizes (anchored to girls = 0). The limits of the grey zone represent the threshold for a large standardized difference (i.e., 0.8 or − 0.8). Positive effect sizes indicated that mean fitness test performances for boys were higher than those for girls.

Discussion

The present study used nationally representative data on physical fitness to develop sex- and age-specific norms for Chinese children and adolescents, which can be used as benchmark values for health and fitness screening and surveillance. We observed that performance improved with age across the analyzed percentiles in all tests. Boys had higher values than girls in all physical fitness items except the sit and reach test, in which girls performed better at all analyzed percentiles. In addition, sex differences increased with age, except for sit and reach.

Comparing our results with those of international studies, and taking boys aged 11 years at P50 as an example, cardiorespiratory fitness was similar for Chinese boys (6.0 stages/minutes) and Spanish boys (5.8 stages/minutes, reported for boys aged 16–17 years)23, but lower than for Australian boys (8 stages/minutes)15. Regarding lower limb muscle strength, Chinese girls aged 11 years at P50 performed better (151.7 cm) than their French (127 cm)24, Macedonian (127.9 cm)25, and Australian (140 cm)15 counterparts. Finally, Chinese children and youth underperformed their Australian counterparts in speed (9.0 s vs 8.6 s)15.

The results of this study generally align with findings from previous research, such as studies of European children11 and Australian children15. Our finding that physical fitness increases with age supports previous Canadian and French studies9,26. For both boys and girls, performance in the physical fitness tests increased with age, especially for grip strength, for which P50 increased on average by 3.1 kg per year of age for boys and by 1.64 kg for girls. Factors contributing to this age difference may include motivation, concentration, the level of motor skills, physical activity, and body composition27.

Another finding of our study is that physical fitness levels were better in boys than in girls, except for flexibility (sit and reach test), in which girls achieved better results. This finding agrees with results previously reported in children and adolescents11,28. Moreover, sex differences in physical fitness (i.e., cardiorespiratory fitness, muscular strength, and speed-agility) have been reported to be detectable as early as preschool age29. Distinct development, growth, and maturation of boys and girls undoubtedly contribute to these differences, while the sex differences in physical fitness performance in our study might also be related to the effects of genetics, anatomy, physiology, behavior, and social and physical environments30,31. Carlos et al. investigated the magnitude of sex differences in physical fitness and reported greater sex differences in the explosive strength of the upper and lower limbs, and smaller differences in abdominal and upper limb muscular endurance, trunk extensor strength and flexibility, balance, and speed32. Recent studies have found that boys outperform girls in cardiorespiratory fitness and muscular strength because they are more physically active and have a higher fat-free mass33. Regarding flexibility, factors proposed to explain the better performance of girls include their greater passive dorsiflexion angle, whereas boys have a higher muscle volume and dynamic property of tendon tissues34.

We also observed that sex differences increased with age. The P50 difference in cardiorespiratory fitness between boys and girls increased from 1 lap at 9 years of age to 21 laps at 18 years of age. Consistent with these findings, other studies in children and adolescents showed a similar trend in sex differences at P50, with a difference of +38 laps for boys among 18-year-old adolescents11. The larger sex differences in adolescents than in children might be explained by more pronounced physiological changes caused by pubertal development30,35. Sex- and age-related differences reflect the complex and interconnected effects of genetics, anatomy, physiology, behavior, and social and physical environments14,36.

This study has several strengths, including the large sample of children and adolescents from across China with sex-specific information, and the harmonized and standardized assessment of physical fitness. Despite these strengths, this study is not without limitations. The main limitation is the cross-sectional design, which prevents the examination of inter- and intra-individual differences; a longitudinal study with repeated measurements is therefore needed. In addition, differences arising from maturation cannot be excluded, since physical growth and biological maturity were not taken into account.

Conclusion

The present study produced nationally representative norm-referenced percentile values for seven physical fitness tests. These norms indicated sex-based differences in physical fitness and showed that older children performed better than younger children. Thus, a differentiated approach is needed in physical education classes, with physical activity adjusted according to sex, age, and fitness level.