The Colombian mandatory system of quality assurance (SOGCS, Sistema Obligatorio de Garantía de Calidad en Salud) includes a series of methodological elements to assess the quality of health services. The current legislation states that the quality of services must be evaluated by comparing observed quality with expected quality [4]. Such quality is validated against pre-established rules, which are known to all participants in the health care system. These rules can come in different formats: manuals, practice guides, technical standards, and indicators, among others.
To evaluate the quality of gastric cancer treatments, we propose a three-step strategy. The first step prepares the RIPS data, retaining only the records that are related to gastric cancer and contain reliable information. The second step applies DEA and the Malmquist Index to the available information from the different EPSs, using a set of previously defined quality indicators related to the treatment of the disease.
Finally, the third step selects a set of EPSs with distinct efficiency characteristics and analyzes their treatment processes using the sequential clustering algorithm to construct typical care pathways. The aim is to identify tendencies and patterns in the treatments applied to patients and to relate them to the quantitative efficiency measures previously obtained for efficient and inefficient EPSs.
Data preparation
Data preparation consists of three basic steps: 1) Selection of patients diagnosed with gastric cancer; 2) Analysis of the quality of the available data; and 3) Grouping of the procedures performed on the patients starting from the time of the diagnosis.
Patient identification. Table 1 shows the codes for stomach cancer-related diseases in the ICD-10 standard (International Statistical Classification of Diseases and Related Health Problems, 10th Revision). This is the standard used in the RIPS, and it allows all patients suffering from stomach cancer in Colombia to be identified.
Table 1
ICD-10 codes and number of patients
ICD-10 code | Description | Number of patients |
D131 | Benign neoplasm of stomach | 176 |
Z120 | Special screening examination for neoplasm of stomach | 129 |
C160 | Malignant neoplasm of cardia | 80 |
C161 | Malignant neoplasm of fundus of stomach | 559 |
C162 | Malignant neoplasm of body of stomach | 363 |
C163 | Malignant neoplasm of pyloric antrum | 108 |
C164 | Malignant neoplasm of pylorus | 18 |
C165 | Malignant neoplasm of lesser curvature of stomach | 101 |
C166 | Malignant neoplasm of greater curvature of stomach | 75 |
C168 | Malignant neoplasm of overlapping lesion of stomach | 35 |
C169 | Malignant neoplasm of stomach, unspecified | 1618 |
D000 | Carcinoma in situ of lip, oral cavity and pharynx | 54 |
D001 | Carcinoma in situ of esophagus | 104 |
D002 | Carcinoma in situ of stomach | 357 |
Analysis of the quality of the data. The patient data and their related procedures were analyzed in order to validate data quality. The two most common quality problems for the patients are data replication and inconsistencies between the procedures performed and the diagnosis: 9.9% of patients diagnosed with cancer are reported more than once between 2010 and 2011. Additionally, 1.2% of these patients did not have surgical procedures between 2010 and 2011, and only 39% of procedures had a patient associated with them in the analyzed data. All records with one of the aforementioned problems were excluded from the analysis.
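The two exclusion rules that can be checked mechanically (patients reported more than once, and procedures with no associated patient) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the field names (`patient_id`, `diagnosis`, `cups`) and the sample records are hypothetical, and it assumes that every copy of a replicated patient is dropped.

```python
from collections import Counter

# Hypothetical records; field names are illustrative, not the RIPS schema.
patients = [
    {"patient_id": "P1", "diagnosis": "C169"},
    {"patient_id": "P2", "diagnosis": "C161"},
    {"patient_id": "P2", "diagnosis": "C161"},  # reported more than once
    {"patient_id": "P3", "diagnosis": "C162"},
]
procedures = [
    {"patient_id": "P1", "cups": "A"},
    {"patient_id": "P9", "cups": "B"},  # no associated patient
    {"patient_id": "P3", "cups": "C"},
]


def clean(patients, procedures):
    """Drop replicated patients and procedures without a retained patient."""
    counts = Counter(p["patient_id"] for p in patients)
    # Assumption: all copies of a replicated patient are excluded.
    kept_patients = [p for p in patients if counts[p["patient_id"]] == 1]
    kept_ids = {p["patient_id"] for p in kept_patients}
    kept_procedures = [q for q in procedures if q["patient_id"] in kept_ids]
    return kept_patients, kept_procedures


kept_patients, kept_procedures = clean(patients, procedures)
```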
Grouping of procedures. To reduce the number of different procedures to be analyzed, and as suggested by an oncologist, a filter was applied based on the CUPS classification (Clasificación Única de Procedimientos en Salud). The CUPS, according to Resolution 1896 of 2001 of Colombia, is a logical, hierarchical and detailed classification of health procedures and interventions performed in Colombia, each identified by a code and described by a unique nomenclature. The procedures analyzed in this work were selected according to the first three characters of their CUPS code, as follows:
Selected first-level group filter
- Digestive System
- Imaging
- Consultation, Monitoring and Diagnostic Procedures
- Clinical Laboratory
- Transfusiology and Blood Bank
- Nuclear Medicine and Radiation Therapy
- Other nonsurgical procedures
- Miscellaneous Procedures
Selected second-level subgroup filters
- Transfusiology and Blood Bank
- Radiological Imaging
- Clinical Laboratory
- Nuclear Medicine and Radiation Therapy
- Stomach Related Procedures
- Intestine Related Procedures
- Procedures in the Abdominal Wall
- Esophagus procedures
- Prophylactic, therapeutic and other miscellaneous procedures
As in the previous step, data quality problems were detected: (1) difficulty in validating the relationship between the procedure and the analyzed disease, (2) data replication, and (3) unreliable attribute content.
In 13.31% of records, the relationship between the procedure and the analyzed disease could not be established. Approximately 65% of procedures can be used for different health problems, which reduces the precision of the analysis. Additionally, some procedures recorded for the same patient on the same date, or on nearby dates, form combinations that are not possible, for example a complete stomach surgery followed by a partial stomach surgery. Finally, the most important reliability problems in attribute content are related to time: the majority of records have 00:00:00.000 as their time value. All records with the aforementioned problems were identified and excluded.
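The check for impossible procedure combinations could look like the sketch below. It is only an illustration under stated assumptions: the procedure labels stand in for real CUPS codes, and the 30-day window used to define a "nearby date" is a hypothetical parameter that the text does not specify.

```python
from datetime import date

# Hypothetical labels; real records would carry CUPS codes instead.
TOTAL, PARTIAL = "total_gastrectomy", "partial_gastrectomy"


def implausible_patients(records, window_days=30):
    """Return ids of patients with a partial stomach surgery recorded on,
    or shortly after, the date of a complete stomach surgery.

    records: iterable of (patient_id, date, procedure_label) tuples.
    window_days: assumed meaning of "nearby date" (not given in the text).
    """
    by_patient = {}
    for patient_id, day, procedure in records:
        by_patient.setdefault(patient_id, []).append((day, procedure))

    flagged = set()
    for patient_id, events in by_patient.items():
        totals = [d for d, p in events if p == TOTAL]
        partials = [d for d, p in events if p == PARTIAL]
        if any(0 <= (p - t).days <= window_days for t in totals for p in partials):
            flagged.add(patient_id)
    return flagged


records = [
    ("P1", date(2010, 3, 1), TOTAL),
    ("P1", date(2010, 3, 5), PARTIAL),   # partial after total: flagged
    ("P2", date(2010, 4, 1), PARTIAL),
    ("P2", date(2011, 4, 1), TOTAL),     # partial before total: plausible
]
flagged = implausible_patients(records)
```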
Data envelopment analysis
Data Envelopment Analysis (DEA) is a non-parametric efficiency approach. It was developed by [6] and later extended by [7] (BCC model). DEA computes the relative efficiency of decision-making units (DMUs) with many inputs and outputs [8]. DEA thus allows the quality assessment to include not only a set of performance indicators (outputs) but also the amount of consumed resources (inputs). As a result, DEA is now considered a mainstream technique for appraising the efficiency of health institutions [2]. Furthermore, Emrouznejad et al. [9] concluded that DEA applications will continue to be a primary arena of research in the future.
In operations management, DEA is used for benchmarking: a set of measures (indicators) is selected to estimate the performance of production and/or service operations, comparing multiple DMUs with a structure of multiple inputs and outputs. As a result, a set of DMUs that belong to a “best-practice frontier” [10] is identified. This frontier allows us to calculate an efficient solution for every level of input or output. Any DMU not on the frontier is considered inefficient. A numerical coefficient is assigned to each firm, defining its relative efficiency. Where there is no actual corresponding firm, virtual efficient producers are identified to make comparisons [11].
Classical DEA models rely on the assumption that inputs have to be minimized while outputs have to be maximized [12]. However, in health care, one or more outputs, called undesirable outputs, have to be minimized [13]. Moreover, according to [14], considering such variables in efficiency analysis has paved the way for more thorough assessments. Nevertheless, modeling undesirable outputs has been the object of considerable discussion in the efficiency literature because of the lack of consensus about the most appropriate approach [3]. Although variable transformations can be used to avoid this problem, [15] conclude that such transformations could cause a loss of linearity. The authors also compare methods for dealing with this situation.
Liu et al. [16] presented a survey of DEA applications from 1978 to 2010. According to the authors, health care is the second largest application area, and most of the reviewed papers studied hospital performance. More recent papers have studied the integration of DEA with complementary techniques to measure health care efficiency. For example, Al-Refaie et al. [17] applied simulation and DEA to improve the emergency department of a Jordanian hospital. In this research, DEA was used to identify the best possible scenario regarding nurses’ workload. They concluded that using DEA to develop quality frontiers in health services is a promising new direction.
In recent years, several studies evaluating health policies and health services have been carried out. In Uganda, DEA was used to evaluate the efficiency of referral hospitals [18], and similar studies were conducted in Angola [19], Zambia [20] and South Africa [21], among others. In Asia, [22] recently proposed evaluating the performance of maternal and child services in Chinese hospitals using DEA, comparing poverty and non-poverty country hospitals. In Latin America, few studies use DEA to evaluate the performance of health institutions; [23] propose DEA as a new strategy to evaluate efficiency in Chilean hospitals. To the best of our knowledge, there are no studies that evaluate the quality of a procedure in Colombia.
To evaluate the quality of gastric cancer treatments in different Colombian EPSs, we considered a DEA model with undesirable output variables that preserves linearity. After analyzing different models, we use the model proposed by [15], an output-oriented model with variable returns to scale. The mathematical model is the following:
Subject to:
$$ \sum\limits_{j=1}^{n}{\eta_{j} x_{ij} + s_{i}^{-}}=x_{i0} \ \ \ i=1 \ldots m $$
(2)
$$ \sum\limits_{j=1}^{n}{\eta_{j} y_{rj} - s_{r}^{+}}=\rho_{0} y_{r0} \ \ \ r=1 \ldots s $$
(3)
$$ \sum\limits_{j=1}^{n}{\eta_{j} b_{tj} - s_{t}^{+}}=2b_{t0}-\rho_{0}b_{t0} \ \ \ t=1 \ldots T $$
(4)
Let x, y and b be the sets of desirable input variables, desirable output variables and undesirable output variables, respectively. Equation 1 is the objective function, which seeks to maximize efficiency subject to the constraints on each set of variables, represented by Eqs. 2 to 4. These constraints are used to assign weights to the input and output variables and to estimate the distance of each EPS from the efficient frontier.
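As a reading aid, the right-hand sides of Eqs. 3 and 4 can be worked through with numbers. The values below ($\rho_{0}=1.2$, $y_{r0}=50$, $b_{t0}=10$) are purely hypothetical and are not taken from the study; they only illustrate that the same factor $\rho_{0}$ scales a desirable output up and an undesirable output down:

$$ \rho_{0}\, y_{r0} = 1.2 \times 50 = 60, \qquad 2b_{t0}-\rho_{0}b_{t0} = (2-\rho_{0})\, b_{t0} = 0.8 \times 10 = 8 $$

With $\rho_{0}=1$, both right-hand sides reduce to the observed values $y_{r0}$ and $b_{t0}$.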
The Malmquist Productivity Index, initially proposed by [24], was developed to measure changes in technological productivity over a period of time. It has been used in several studies with multiple variants, such as [25, 26]. To measure the changes in the EPSs’ efficiency over the observed period, we used the index used by [27], defined as follows:
$$ MI=\left[\frac{\delta^{1}\left((x_{0},y_{0})^{2}\right)}{\delta^{1}\left((x_{0},y_{0})^{1}\right)} \times \frac{\delta^{2}\left((x_{0},y_{0})^{2}\right)}{\delta^{2}\left((x_{0},y_{0})^{1}\right)}\right]^{\frac{1}{2}} $$
(5)
Let $\delta^{t_{2}}((x_{0},y_{0})^{t_{1}})$ be the efficiency of a DMU $(x_{0},y_{0})^{t_{1}}$, observed in period $t_{1}$, measured with respect to the technological frontier of period $t_{2}$ and obtained from the results of the DEA model.
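A small numeric sketch of Eq. 5 follows, taking the bracket to the power 1/2 (a geometric mean), which is the usual form of the index. The function name, argument names and efficiency scores are hypothetical; in practice the four scores would come from the DEA model.

```python
import math


def malmquist_index(d1_p1, d1_p2, d2_p1, d2_p2):
    """Geometric mean of the efficiency ratios under both frontiers.

    dK_pJ is the efficiency of the unit observed in period J,
    measured against the frontier of period K.
    """
    return math.sqrt((d1_p2 / d1_p1) * (d2_p2 / d2_p1))


# Hypothetical efficiency scores for one EPS.
mi = malmquist_index(d1_p1=0.8, d1_p2=0.9, d2_p1=0.7, d2_p2=0.8)
```

When the four scores are equal, the index is 1, meaning no change between the two periods.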
From the RIPS and based on health professionals’ opinions, the following efficiency indicators were calculated for each EPS and used for the DEA analysis:
- Input variables:
  - Number of patients: The total number of newly diagnosed patients with gastric cancer and, at least, one associated treatment procedure as classified by the EPS, for the period of study. A treatment procedure may be a surgical procedure associated with cancer, chemotherapy or radiotherapy. This indicator is taken as a desirable input variable for the model.
- Output variables:
  - Treatment opportunity: The average number of days per patient between the time of diagnosis and the first procedure for the treatment of the disease, as classified by the EPS. This indicator is taken as a desirable input variable for the model.
  - Number of readmissions: The average number of readmissions per patient, as classified by the EPS, for patients who had at least one emergency readmission after undergoing surgery in the study period. This indicator is taken as an undesirable output variable.
  - Number of previous studies: The average number of diagnostic studies prior to the first diagnosis of cancer per patient as classified by the EPS. This indicator is taken as a desirable output variable.
  - Number of histological studies: The average number of histological studies per patient, as classified by the EPS. This indicator is taken as a desirable output variable.
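The indicator definitions above can be sketched as a per-EPS aggregation. This is an illustration only: the per-patient summary records and their field names are hypothetical, and it assumes that each record already belongs to a newly diagnosed patient with at least one treatment procedure.

```python
from datetime import date
from statistics import mean

# Hypothetical per-patient summaries; field names are illustrative.
patient_rows = [
    {"eps": "EPS_A", "diagnosed": date(2010, 1, 10), "first_treatment": date(2010, 1, 30),
     "readmissions": 1, "previous_studies": 3, "histological_studies": 2},
    {"eps": "EPS_A", "diagnosed": date(2010, 2, 1), "first_treatment": date(2010, 2, 11),
     "readmissions": 0, "previous_studies": 1, "histological_studies": 2},
    {"eps": "EPS_B", "diagnosed": date(2010, 3, 1), "first_treatment": date(2010, 3, 31),
     "readmissions": 2, "previous_studies": 4, "histological_studies": 1},
]


def eps_indicators(rows):
    """Compute the five indicators for each EPS."""
    by_eps = {}
    for row in rows:
        by_eps.setdefault(row["eps"], []).append(row)

    result = {}
    for eps, group in by_eps.items():
        # Readmissions are averaged only over patients with at least one.
        readmitted = [r["readmissions"] for r in group if r["readmissions"] > 0]
        result[eps] = {
            "number_of_patients": len(group),
            "treatment_opportunity": mean(
                (r["first_treatment"] - r["diagnosed"]).days for r in group),
            "readmissions": mean(readmitted) if readmitted else 0,
            "previous_studies": mean(r["previous_studies"] for r in group),
            "histological_studies": mean(r["histological_studies"] for r in group),
        }
    return result


indicators = eps_indicators(patient_rows)
```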
Once the input and output variables were defined, the most (and least) efficient EPSs in the Colombian Health System were identified. Subsequently, the efficient frontier was computed using the calculated efficiencies of each institution. Finally, the Malmquist indicator for each institution was calculated to evaluate the change in efficiency in the study period.
Process mining
Process mining is a field of data mining that allows the discovery of business processes in a given domain using different types of algorithms. Examples of process mining algorithms include genetic algorithms, heuristics mining and sequential clustering, among others [28]. This work uses a ProM plugin that provides the Sequence Clustering Algorithm, a combination of sequence analysis (first-order Markov chains) and clustering. We chose sequential clustering because of its popularity in the process mining community and the information it provides for interpreting the results. Sequential clustering algorithms group sequences of discrete states into clusters that are internally very similar [29]. Each cluster is represented by a Markov chain consisting of states and the transition probabilities between them. These probabilities depend only on the current state.
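The Markov-chain part of this description can be illustrated with a short sketch that estimates first-order transition probabilities from a set of care pathways. It is not the ProM plugin and does not perform the clustering step; the function names and the example pathways are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Estimate first-order transition probabilities from state sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {
        state: {nxt: n / sum(following.values()) for nxt, n in following.items()}
        for state, following in counts.items()
    }


def sequence_probability(seq, probs):
    """Probability of the transitions in seq, taking its first state as given."""
    p = 1.0
    for current, nxt in zip(seq, seq[1:]):
        p *= probs.get(current, {}).get(nxt, 0.0)
    return p


# Hypothetical care pathways for one cluster.
pathways = [
    ["diagnosis", "surgery", "chemotherapy"],
    ["diagnosis", "chemotherapy"],
    ["diagnosis", "surgery", "diagnosis"],
]
probs = transition_probabilities(pathways)
```

In a sequence clustering setting, one such chain would be kept per cluster, and each pathway would be assigned to the cluster whose chain gives it the highest probability.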
Process mining allowed us to identify the typical treatments given to patients by distinct EPSs, to analyze the various treatment patterns and, in turn, to help improve the quality of health services. To do this, we had to carefully select the EPSs to be analyzed and the model to be used, prepare the data to execute the model, and evaluate the results.
This project uses two EPSs: the most and the least efficient according to the DEA results. This selection allows the two EPSs to be compared in order to understand the patterns affecting treatment quality. The design of the model consists of creating the sequential clustering model and preparing the procedure data used in the analysis. It is important to note that a very detailed level of granularity can increase the complexity of the results because of the number of procedures and the relations between them. Finally, the evaluation, presented in the “Evaluation” section, incorporates the criteria of an oncologist to validate the quality of the identified treatments.
Data models
The data model used in the study includes two entities that consolidate information about patients with gastric cancer and about the procedures used in their treatment. The patient entity holds a unique patient identifier, the institutions related to the health service (EPS and IPS unique codes), the date of the first diagnosis, the main diagnosis of the consultation, and up to three other diagnoses related to the consultation. The procedure entity holds the date of the procedure, the institutions related to the health service (EPS and IPS unique codes), the patient identifier, the procedure code (CUPS identifier), and the main diagnosis of the procedure.
The procedure entity does not hold the time of day, given the previously discussed data quality problem in this field. As a consequence, the paths identified by sequential clustering may show patients as being operated on immediately after the diagnosis, although in reality that is not the case.
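The two entities can be sketched as plain data classes. The attribute names are hypothetical, chosen only to mirror the description above; the `trace` helper shows how a patient's procedures might be ordered by date (not time) before being passed to a sequence analysis.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Patient:
    patient_id: str
    eps_code: str
    ips_code: str
    first_diagnosis_date: date
    main_diagnosis: str                                        # ICD-10 code
    other_diagnoses: List[str] = field(default_factory=list)   # up to three


@dataclass
class Procedure:
    patient_id: str
    eps_code: str
    ips_code: str
    procedure_date: date      # date only; the time of day is not stored
    cups_code: str
    main_diagnosis: str


def trace(patient, procedures):
    """CUPS codes of one patient's procedures, ordered by date."""
    own = [q for q in procedures if q.patient_id == patient.patient_id]
    return [q.cups_code for q in sorted(own, key=lambda q: q.procedure_date)]


# Hypothetical sample data; the CUPS codes are placeholders.
p1 = Patient("P1", "EPS_A", "IPS_1", date(2010, 1, 10), "C169")
procs = [
    Procedure("P1", "EPS_A", "IPS_1", date(2010, 2, 1), "B", "C169"),
    Procedure("P1", "EPS_A", "IPS_1", date(2010, 1, 15), "A", "C169"),
    Procedure("P2", "EPS_A", "IPS_1", date(2010, 1, 20), "C", "C161"),
]
p1_trace = trace(p1, procs)
```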
Data aggregation
An aggregation strategy based on the CUPS was used to reduce the complexity of the models, which arises from the vast number of different treatments that a cancer patient can receive, and to facilitate analysis by the experts. Procedures were aggregated according to characteristics such as the type of procedure (diagnostic, radiotherapy, surgery) and the affected organ (intestine, stomach). Table 2 shows the aggregation scheme proposed to characterize non-surgical and surgical procedures, which are listed in the CUPS section column. This scheme defines the procedures at two levels, shown in the Level 1 and Level 2 columns of the table. For non-surgical procedures, Level 1 represents diagnosis, radiotherapy and chemotherapy procedures. For surgical procedures, Level 1 represents diagnosis, pre- and postoperative, and surgery procedures. Level 2 gives more specific information about a Level 1 procedure. For example, non-surgical diagnosis procedures are classified as radiography, tomography, clinical examination, derived examination and scintigraphy. Finally, the percentage in brackets in the table is the percentage of procedures in each category used in this work. For instance, 92.2% of the procedures used are non-surgical.
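Table 2
Two-level aggregation scheme
CUPS section | Level 1 | Level 2 |
Non-surgical procedures (92.2%) | Diagnosis (86.3%) | Radiography (7.72%) |
Non-surgical procedures (92.2%) | Diagnosis (86.3%) | Tomography (5.06%) |
Non-surgical procedures (92.2%) | Diagnosis (86.3%) | Clinical Examination (45.7%) |
Non-surgical procedures (92.2%) | Diagnosis (86.3%) | Derived Examination (27.2%) |
Non-surgical procedures (92.2%) | Diagnosis (86.3%) | Scintigraphy (0.55%) |
Non-surgical procedures (92.2%) | Radiotherapy (0.5%) | Teletherapy and Therapy with radioisotope (0.5%) |
Non-surgical procedures (92.2%) | Chemotherapy (5.3%) | Chemotherapy (5.3%) |
Surgical procedures (8.2%) | Diagnosis (5.2%) | Biopsy (0.08%) |
Surgical procedures (8.2%) | Diagnosis (5.2%) | Cavity Exploration (5.1%) |
Surgical procedures (8.2%) | Pre-Post Operative (1.5%) | Pre-Postoperative (1.5%) |
Surgical procedures (8.2%) | Surgery (1.6%) | Abdominal (0.08%) |
Surgical procedures (8.2%) | Surgery (1.6%) | Esophagus (0.17%) |
Surgical procedures (8.2%) | Surgery (1.6%) | Stomach (1.18%) |
Surgical procedures (8.2%) | Surgery (1.6%) | Intestine (0.13%) |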
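The bracketed percentages in Table 2 are shares of procedure records per category. A minimal sketch of that computation is shown below, assuming each record has already been labeled with its section, Level 1 and Level 2 category; the function name and the four sample records are hypothetical, and the mapping from CUPS codes to these labels is not reproduced here.

```python
from collections import Counter


def level_shares(labels):
    """Percentage of records per Level 1 and per Level 2 category.

    labels: list of (section, level1, level2) tuples, one per procedure record.
    """
    total = len(labels)

    def pct(n):
        return round(100 * n / total, 2)

    level1 = Counter((section, l1) for section, l1, _ in labels)
    level2 = Counter(labels)
    return (
        {key: pct(n) for key, n in level1.items()},
        {key: pct(n) for key, n in level2.items()},
    )


# Hypothetical labeled records.
labels = [
    ("Non-surgical", "Diagnosis", "Radiography"),
    ("Non-surgical", "Diagnosis", "Radiography"),
    ("Non-surgical", "Diagnosis", "Tomography"),
    ("Surgical", "Surgery", "Stomach"),
]
level1_pct, level2_pct = level_shares(labels)
```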
Evaluation
The evaluation phase is carried out by a gastric cancer oncologist, according to whom the quality of a given medical treatment depends on some of the features included in the patient’s care pathway. Thus, the treatment administered to a patient with gastric cancer should:
- Follow the treatment established in the clinical guide for the disease.
- Include diagnostic procedures before and after a surgical procedure.
- Include procedures such as chemotherapy or radiation treatments. Furthermore, these procedures should be applied sequentially.
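Two of these criteria can be checked mechanically on a care pathway, as in the sketch below. It is an illustration only and does not replace the oncologist's judgment: the pathway labels are hypothetical Level 1 names, and adherence to the clinical guide and the sequential application of therapies are not checked.

```python
def meets_criteria(pathway):
    """Check two of the criteria on an ordered list of Level 1 labels.

    1. Every surgery has a diagnostic procedure somewhere before and after it.
    2. The pathway includes chemotherapy or radiotherapy.
    """
    has_therapy = any(step in ("chemotherapy", "radiotherapy") for step in pathway)
    surgeries = [i for i, step in enumerate(pathway) if step == "surgery"]
    diagnosed_around = all(
        "diagnosis" in pathway[:i] and "diagnosis" in pathway[i + 1:]
        for i in surgeries
    )
    return has_therapy and diagnosed_around


ok = meets_criteria(["diagnosis", "surgery", "diagnosis", "chemotherapy"])
```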