Background
By mid-July 2020, more than 108,000 COVID-19 cases had been diagnosed in Canada, more than half of them in the province of Quebec [1]. Quebec confirmed its first case of COVID-19 on February 27. On March 13, 2020, with only seventeen confirmed cases, the Quebec government declared a state of emergency and shut down daycare facilities, primary and secondary schools, CEGEPs (general and vocational colleges) and universities; a more extensive shutdown of restaurants and bars, indoor sports facilities and non-essential economic activities followed. In early March, most cases could be traced to infected travelers returning to Canada or to close contact with those travelers. By March 23, according to the Public Health Agency of Canada, nearly half of Canadian COVID-19 cases had been acquired through community spread [2]. By mid-July, more than 56,000 positive COVID-19 cases had been confirmed in the province of Quebec, nearly half of them in the Montreal metropolitan area, and around one-third of the confirmed cases had been diagnosed among middle-aged and older adults. As in many countries around the world, the COVID-19 testing strategy in Canada evolved over the first wave, taking into account each province’s specific situation.
To prepare for a potential second wave of COVID-19 in the fall, it seems of utmost importance to analyze the epidemiological characteristics and the socio-economic impact of the spring outbreak in the population, together with the results of the public health policies that have been implemented. Such information can provide useful knowledge for planning public health strategies for the coming months.
Although an impressive number of publications about COVID-19 infection are available, most are hospital-based series ([3–6], to name a few), while population-based studies are scarce [7–9]. However, population-based cohorts that were established before the pandemic may yield unbiased estimates of the characteristics and consequences of the pandemic in the general population. They may also provide useful information about the effectiveness of public health interventions such as testing strategies. In this context, the existing population-based cohort CARTaGENE (CaG), composed of middle-aged and older adults and established before the COVID-19 outbreak, offers a unique opportunity to analyze the characteristics and consequences of the outbreak among a population that seems to be at greatest risk.
A large survey (hereinafter referred to as CaG COVID-19) was launched in early June. An online questionnaire was sent to the participants of the CaG cohort to collect information about COVID-19-related symptoms, diagnosis, comorbidities and social impacts of the spring outbreak. The aim of this survey was to analyze the demographic and clinical characteristics, and the socio-economic impact, associated with the COVID-19 pandemic in Quebec in spring 2020. We also investigated the statistical association of risk exposure, pre-existing medical conditions and clinical features with the frequency of being tested and of having a positive result among those who were tested.
Methods
CARTaGENE population-based cohort
CARTaGENE is a population-based cohort composed of more than 43,000 Quebec residents aged between 40 and 69 years at recruitment [10]. The original survey design was defined by gender, age group and forward sortation area (FSA, the first three characters of the postal code). Participants were randomly selected to be broadly representative of the population recorded in the Quebec administrative health insurance registries (about 98% of Quebec residents [11]). Participants were recruited in two phases (Phase A: 2009–2010 (n = 23,000) and Phase B: 2013–2014 (n = 20,000)) in metropolitan areas, where nearly 70% of Quebecers live, and prospective follow-ups are conducted on a regular basis. CARTaGENE is part of the Canadian Partnership for Tomorrow’s Health (CanPath, formerly called CPTP), which is Canada’s largest population health research platform [12]. More information can be found in Supplementary file S1.
CaG COVID-19 online questionnaire
For this study, we developed a specific online questionnaire [13] to collect updated information on basic demographic variables (age, gender, household composition and postal code), COVID-19 testing, test dates and results, and whether participants suspected they had had an undiagnosed COVID-19 infection based on symptoms (cough, shortness of breath, fever, etc.). We also collected information on exposure status: health care or essential work, contact with someone diagnosed with COVID-19, and international travel. Those with confirmed COVID-19 infection were asked whether they had been hospitalized, along with details of their care, treatment and the outcome of the hospitalization. In addition, participants were asked to report existing chronic health conditions and medication use. Finally, they were asked about the psychosocial and socio-economic impacts of the pandemic on their lives. The questionnaire was developed in close collaboration with the CanPath consortium to allow future inter-provincial comparisons as well as other national and international research initiatives. A web link to the consent form and questionnaire was sent by email to all 33,019 CARTaGENE participants with a valid email address. The initial invitation was followed by up to three email reminders. The digital invitations were sent in early June 2020 and were valid for 4 weeks; the survey closed in early July 2020. Among the 33,019 invited participants, 8,137 responded to the questionnaire. The informed consent form and questionnaire can be found in the Supplementary files.
Investigated variables and outcomes
A positive exposure status was defined as being a medical or essential worker, having been in contact with a COVID-19-positive patient, or having returned from international travel. A medical worker was defined as a physician, nurse, hospital or aged-care facility employee, first responder, or pharmacist with patient exposure. An essential worker was defined as a grocery store attendant, public transit worker, police officer, or firefighter. Contact with a COVID-19-positive individual was defined as having been in the same room as a person who had been told by a health professional that he or she had COVID-19. International travel was defined as travel outside Canada with a return after February 1.
Pre-existing conditions were defined as currently treated medical conditions from the following list: cardiovascular diseases (high blood pressure, coronary artery disease, ...), autoimmune diseases, diabetes, infectious diseases, gastro-intestinal and liver diseases, cancers, and renal diseases.
The following symptoms were considered in the questionnaire: dry cough, wet cough, shortness of breath (dyspnea), fever (≥38 °C), shivering, fatigue, runny nose, sinus pain (sinusitis), ear pain (otitis), sore throat, hoarseness, loss of taste (ageusia), loss of smell (anosmia), diarrhea, nausea, vomiting, loss of appetite, headache, and general muscle aches and pains. The possible responses were: none, mild, moderate or severe.
For the association analyses, the following symptoms were considered as present only when being severe: wet and dry cough, fatigue, loss of appetite, runny nose, sinus pain, ear pain, sore throat, hoarseness, headache, muscle pain, diarrhea.
Hereafter, we refer to the symptom-based case definition chosen by the Quebec Public Health authorities, which requires at least one of the four major symptoms: fever (≥38 °C), a new or worsening cough, difficulty breathing, and anosmia with or without ageusia [14].
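The case definition above is a simple "at least one of four" rule, which can be sketched as follows. This is an illustrative Python sketch only, not the authors' code (their analyses were run in R); the symptom keys and the function name are hypothetical labels chosen for this example.

```python
# Keys are hypothetical labels for the four major symptoms listed above.
MAJOR_SYMPTOMS = (
    "fever_38c_or_more",
    "new_or_worsening_cough",
    "difficulty_breathing",
    "anosmia",  # with or without ageusia
)


def meets_case_definition(reported_symptoms):
    """Return True if at least one of the four major symptoms is reported.

    reported_symptoms: dict mapping a symptom key to True/False.
    Missing keys are treated as "not reported" (an assumption of this sketch).
    """
    return any(reported_symptoms.get(key, False) for key in MAJOR_SYMPTOMS)


# Fatigue and shivering alone do not satisfy the definition.
print(meets_case_definition({"fatigue": True, "shivering": True}))  # False
# Anosmia alone is enough.
print(meets_case_definition({"anosmia": True}))  # True
```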
Two outcomes were studied. The first was the status of having been tested for SARS-CoV-2 (RT-PCR). The second was the status of having received a positive result among those tested. Confirmed COVID-19 infection was defined as at least one positive test result. If multiple tests were performed, we considered the date of the first positive one.
Statistical methods
Descriptive and univariate analyses
The participants’ baseline characteristics have been reported and tested: demographic (age, gender, geographical location, level of education, financial resources, household income, dwelling type), health history (high blood pressure, cardio-vascular disease, pulmonary disease (chronic obstructive pulmonary disease (COPD), asthma,...), cancer, diabetes, autoimmune diseases, mental health, body mass index (BMI)), lifestyle behaviors (smoking status, alcohol intake,...). To assess the selection bias, we compared the baseline characteristics between the CARTaGENE cohort and the contacted individuals, and between the non-respondents and respondents.
Mean (or median) with standard deviations (or interquartile ranges) were reported for continuous outcomes. Frequencies with 95% confidence intervals were reported for qualitative variables. For comparing means between groups, we used an ANOVA test. For categorical variables, we used chi-squared test or exact test if needed.
For the univariate association analyses (between factors and studied outcomes), we used a chi-square test (qualitative factors) or a logistic regression model (quantitative factors). For each hypothesis test, we reported the p-value. Moreover, to address the multiple testing problem, we also indicated the p-values that remain significant when the false discovery rate (the expected proportion of false discoveries among all discoveries) is controlled at a level of 1% using the Benjamini-Hochberg procedure [15].
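For readers unfamiliar with the Benjamini-Hochberg step-up procedure, a minimal sketch follows. It is written in Python for illustration and is not the authors' code (their analyses were run in R); the function name and the example p-values are invented for this sketch. The procedure sorts the m p-values, finds the largest rank k such that the k-th smallest p-value is at most (k/m) times the target level, and rejects the k hypotheses with the smallest p-values.

```python
def benjamini_hochberg(p_values, fdr_level=0.01):
    """Return a list of booleans, True where the hypothesis is rejected.

    Implements the Benjamini-Hochberg step-up rule at the given FDR level.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k with p_(k) <= (k / m) * fdr_level (0 if none qualifies).
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr_level:
            largest_k = rank
    # Reject every hypothesis ranked at or below largest_k.
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected


# Illustrative p-values only (not taken from the study).
example = [0.0001, 0.0055, 0.0058, 0.5, 0.9]
print(benjamini_hochberg(example, fdr_level=0.01))
```

Note the step-up behaviour in this example: the second p-value exceeds its own threshold (0.004) but is still rejected, because the third p-value falls below its threshold (0.006).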
We compared the early impacts of the COVID-19 outbreak between men and women, and between individuals under and over 65 years. We analyzed the changes before and after the pandemic for income, alcohol consumption, physical activity, quality of sleep, food intake, mental health services access, seeking support, and mental/emotional health.
The univariate analyses were performed without replacing missing data.
Multivariate analyses
For the multivariate analyses of exposure and risk factors, we used a multiple logistic regression model.
For the multivariate analyses of symptoms associated with the outcomes, we had to cope with a complex interplay between clinical symptoms. Thus, we considered a Generalized Partially Linear Tree-based Regression (GPLTR) model. GPLTR models represent a class of semi-parametric regression models that integrate the advantages of generalized linear regression and tree-structure models [16]. The linear part is used to model the main effects of confounding variables (e.g. exposure), while the nonparametric tree part is used to address potential collinearity and interactions between explanatory variables (e.g. clinical symptoms). This tree-based model classifies individuals into groups that are homogeneous in terms of risk for the event of interest and identifies relevant combinations of explanatory variables. The optimal GPLTR tree is selected using a penalized maximum likelihood method with the Bayesian information criterion (BIC) [16].
Regression trees are prone to instability, especially when dealing with a low number of outcomes, making variable selection somewhat precarious. Thus, for the analysis of the positive status outcome, we constructed multiple trees using a bagging approach [17], which provides a way to assess the relevance of each variable across the set of trees using variable importance measures. These latter results provide some evidence regarding the reliability of the selected optimal GPLTR tree. We reported the depth deviance importance score (DDIS) of each symptom, which is computed for each GPLTR model as the sum of the values of the deviance at each split based on this variable, weighted by the location of the split in the tree. These scores are summed across the set of trees and normalized to take values between 0 and 1, with the sum of all scores equal to 1. A set of 300 GPLTR models was fitted, and we reported the symptoms ranked by DDIS.
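The aggregation and normalization steps described above can be sketched as follows. This Python sketch is illustrative only and is not the GPLTR package's implementation: the text does not state the exact weight applied to a split according to its location, so the weight function (here 1/depth, with the root at depth 1) is a placeholder assumption, and the function name and toy split data are hypothetical.

```python
from collections import defaultdict


def aggregate_importance(trees, depth_weight=lambda depth: 1.0 / depth):
    """Sum depth-weighted split deviances per variable and normalize to 1.

    trees: list of trees, each a list of (variable, depth, deviance) splits.
    depth_weight: placeholder weighting of a split by its depth
                  (ASSUMPTION: the actual weighting is not given in the text).
    """
    totals = defaultdict(float)
    for splits in trees:
        for variable, depth, deviance in splits:
            totals[variable] += depth_weight(depth) * deviance
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    # Normalize so that the scores lie in [0, 1] and sum to 1.
    scores = {var: value / grand_total for var, value in totals.items()}
    # Rank variables from most to least important.
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))


# Toy input: two bagged trees with invented splits and deviances.
toy_trees = [
    [("anosmia", 1, 40.0), ("fever", 2, 20.0)],
    [("anosmia", 1, 30.0), ("headache", 2, 20.0)],
]
for variable, score in aggregate_importance(toy_trees).items():
    print(variable, round(score, 3))
```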
The logistic regression and the GPLTR models were performed without replacing missing data.
Statistical analyses were performed using R software [18]. Regression tree analyses were performed with the ‘GPLTR’ R package [16].
Estimation of the probability of being positive when experiencing a symptom
Because participants are selected for testing based on symptoms hypothesized to be associated with a positive test, the tested participants constitute a non-random set (ascertainment bias). Thus, it is not possible to estimate directly the probability of being infected given that the person has experienced a particular symptom, since the infection status of non-selected (untested) individuals is unobserved. We can only estimate the probability of being positive given that the person has experienced a particular symptom and has been tested.
Instead, we proposed to report the minimum, mean and maximum values that these probabilities can take using the law of total probability. More precisely, we know that the probability of being positive given that the person has experienced a particular symptom (P(+|S)) can be expressed as:
\(P(+|S)=P(+ | S \cap T) \times P(T|S) + P(+ | S \cap \bar {T}) \times (1-P(T|S))\) where P(+|S∩T) (resp. \(P(+ | S \cap \bar {T})\)) is the probability of being positive when the symptom is present and the person has been tested (resp. not tested), and P(T|S) is the probability of being tested when having the symptom.
From our data, P(+|S∩T) and P(T|S) could be directly estimated while \(P(+ | S \cap \bar {T})\) could not. However, we know that this probability ranges from zero (complete dependence) to P(+|S∩T) (independence between positivity and testing).
Thus, we calculated the minimum, mean and maximum values that P(+|S) can attain for each symptom.
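The bounds follow directly from the formula above: setting \(P(+ | S \cap \bar {T})\) to zero gives the minimum, P(+|S∩T) × P(T|S), and setting it to P(+|S∩T) gives the maximum, P(+|S∩T). A minimal Python sketch is shown below; it is not the authors' code, the function name and input values are invented for illustration, and the "mean" is taken here as the midpoint of the two bounds, which is an assumption because the text does not spell out how it was computed.

```python
def positivity_bounds(p_pos_given_s_and_t, p_t_given_s):
    """Bounds on P(+|S) from the law of total probability.

    p_pos_given_s_and_t: estimate of P(+ | S and tested).
    p_t_given_s: estimate of P(tested | S).
    Returns (minimum, midpoint, maximum).
    """
    for name, value in (("P(+|S,T)", p_pos_given_s_and_t), ("P(T|S)", p_t_given_s)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    # Minimum: no untested symptomatic individual is positive.
    lower = p_pos_given_s_and_t * p_t_given_s
    # Maximum: untested individuals are positive as often as tested ones.
    upper = p_pos_given_s_and_t
    # ASSUMPTION: the "mean" is taken as the midpoint of the two bounds.
    midpoint = (lower + upper) / 2.0
    return lower, midpoint, upper


# Illustrative inputs only (not estimates from the study).
low, mid, high = positivity_bounds(p_pos_given_s_and_t=0.25, p_t_given_s=0.4)
print(f"min={low:.3f} mid={mid:.3f} max={high:.3f}")
```

The width of the interval shrinks as P(T|S) approaches 1: when everyone with the symptom is tested, the minimum and maximum coincide.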
Discussion
We report the results of an online survey of participants from a population-based cohort, whose main objective was to analyze the characteristics and consequences of the first wave of the COVID-19 pandemic (the spring outbreak) in Quebec, Canada. We also report the exposure risk factors and COVID-19-related symptoms associated with having been tested for SARS-CoV-2 and with having been declared positive.
As seen from our analysis, the demographic characteristics of the respondents are broadly representative of the middle-aged and older population of Quebec, and both the percentage of participants tested for COVID-19 and the percentage of positive individuals among those tested are consistent with those reported in Quebec at the closure of our survey. Moreover, the trend observed in our study for the number of tests per day reflected the shortage faced by the province in early April, followed by a rebound by the end of May. Montrealers, individuals living in an apartment or condominium, individuals with a pre-existing medical condition and those with a risk exposure (medical worker, contact with a COVID-19 patient, international travel) were more frequently tested. As expected, medical workers and individuals with known or suspected contact with a COVID-19-positive individual were the most tested. These latter findings are in accordance with public health notices giving priority to health care workers and to individuals in contact with people who tested positive for SARS-CoV-2, whether or not they had symptoms.
Results from the extended tree-based model analysis, adjusted for exposure factors, show that the combination of dyspnea, dry cough and fever is strongly associated with being tested. These results are consistent with the case definition chosen by the Public Health authorities in early spring. The fact that anosmia was not selected reflects its rarity in the general population (4.2%, compared with 11.7% to 25% for the other three symptoms) and the fact that it was added later to the official list of main symptoms.
Among the COVID-19-related symptoms associated with testing, ear, nose and throat (ENT) symptoms (runny nose, sore throat, hoarseness, sinus pain) and wet cough were not related to a positive test in univariate analysis. Results from the extended tree-based model analysis, adjusted for exposure factors, show that anosmia, fever and headache, in combination, are the most discriminant factors for a positive test in our series. Individuals with both anosmia and headache had the highest chance (almost two-thirds) of being positive, while those with anosmia alone had an almost one-in-four chance of being positive. Individuals without anosmia and fever had a less than two percent chance of being positive. These results underline the importance of neurological symptoms such as anosmia and headache, as compared with ENT and gastro-intestinal symptoms. While a literature review found only five studies that had mentioned anosmia and ageusia after SARS-CoV-2 infection in August 2020 [20], recent studies, including an American statewide seroprevalence study, found that anosmia was among the most important symptoms associated with positivity [21, 22]. Moreover, the primary symptom cluster most associated with SARS-CoV-2 infection was ageusia, anosmia, and fever [22]. Results obtained from the bagging procedure confirm the importance of the selected symptoms and suggest that the final tree-based model is sufficiently reliable. They also show the importance of fatigue and loss of appetite, which are, with anosmia, the main factors found in the predictive model of Menni et al. [9].
It is worth noting that among the 41 positive individuals, five met none of the four criteria chosen by the Public Health authorities: two reported a few symptoms such as fatigue, shivering without fever and/or loss of appetite, and three did not experience any COVID-19-related symptoms. Interestingly, all five were tested because they had been in contact with people who tested positive for SARS-CoV-2. This highlights that a non-negligible fraction of infected people could be asymptomatic or pauci-symptomatic, even in this age group. It underlines the importance of contact tracing as an essential component of the toolbox for containing the COVID-19 outbreak.
Interestingly, we observe some discrepancies between being tested and being positive among the studied factors. Some factors are linked to both outcomes, such as being a medical worker, having had contact with a COVID-19 patient, and fever. Anosmia has a higher discriminative power for being positive than for being tested, reflecting its value for predicting positivity. In contrast, dyspnea, which was the main factor for being tested, has a lower discriminative power for positivity.
Some factors associated with testing, such as international travel, pre-existing conditions and ENT symptoms, were not related to a positive test. While the first COVID-19 cases were linked to international travelers, public health measures (e.g. quarantine, border closure) mitigated this risk. The lack of relevance of the ENT symptoms argues for withdrawing them from the list of COVID-19-related symptoms. For pre-existing conditions, this discrepancy reflects testing policies that focused on groups of patients who might be at high risk of severe disease if exposed to the virus. It is worth noting that this over-representation of people with a pre-existing condition led to an inverse relationship with a positive test, which is, however, no longer significant after adjustment for multiple testing. Such a spurious association could be related to an ascertainment bias caused by primarily testing individuals who reported specific conditions and were thought to be positive.
As reported by the participants, the economic consequences at this early stage of the pandemic still seem moderate but are greater for younger participants. However, the real economic impact of the pandemic may appear in the coming months. Alcohol and food consumption increased slightly during the lockdown, especially among women and younger participants. Current emotional health was rated good or excellent by the majority of participants, but was rated lower by women. In line with our study, a systematic review also found more depressive symptoms among women during the pandemic [23]. Changes in sleep duration and quality seemed less frequent than in another web-based study [24]. As feared, one-third of the participants experienced decreased access to health services, with surgery or medical procedures canceled or deferred for more than ten percent of them. However, it is interesting to note that virtual consultations were widely used.
One of the main strengths of this study is its embedding within the CARTaGENE cohort. The cohort provides a rich body of information, collected before the pandemic, that is representative of the middle-aged and older population, which seems to be the most commonly affected group. Thus, it offers a unique opportunity to assess the impact of COVID-19 infection and public health measures in the Quebec population. In particular, we highlight the selection process for testing at the population level. We show that the most exposed people (medical workers and people who had contact with a COVID-19-positive patient) were more widely tested than the rest of the population. In the latter group, only individuals meeting all the COVID-19-related symptom criteria had a high probability of being tested. This emphasizes the need to increase the accessibility of testing for members of the general population who meet the testing criteria, through easy-to-access point-of-care or drive-thru testing centers.
There are, however, some limitations to this study. Firstly, we experienced a low response rate, which might be explained by the use of an online survey in a population previously accustomed to paper surveys and by the short time allowed to respond. Nevertheless, our series is broadly representative of the entire cohort. Secondly, the study relies on self-reported data, which could be subject to recall bias and to potential effects of mainstream media coverage. Thirdly, when analyzing the factors related to a positive test, there is an ascertainment bias caused by primarily testing individuals reporting specific exposures or symptoms. This may blur some relationships between factors and a positive outcome, and highlights the value of seroepidemiological studies at the population level. Fourthly, our population is limited to middle-aged and older adults and includes less than one percent of individuals living in seniors’ residences, which were severely affected by the COVID-19 outbreak in the province. Fifthly, due to the low number of cases, our study is under-powered to detect some weak associations. However, the low power of our study reinforces the value of the reported significant associations.
Finally, this study is a first step toward a follow-up study intended to understand the course of the pandemic and its consequences in the population. It will also allow us to compare the public health policies between provinces and countries. Moreover, it will be enriched with a serological study in order to analyze more thoroughly the non-tested symptomatic and asymptomatic cases.