First, we explain the matching phase, which analyzes the compatibility between donors and recipients using six biological factors presented in the "
Background" section. Then, we continue with the determination of the number of potential exchange cycles given a cycle length. The third phase computes the probability of a successful transplantation based on the matching results for all potential exchange cycles. In the final phase, we output a robust set of
disjoint exchange cycles, i.e., with a high probability for compatibility. The final result contains a combination of disjoint exchange cycles that maximizes the likelihood of as many transplantation as possible being successful. The weight of a cycle
c is denoted by
\(w_c\) where a higher value indicates a higher likelihood of a transplantation being successful. Thus, the weight of a set of disjoint cycles
\(\mathcal {C}\), i.e., the likelihood for as many transplantation being successful in the set, can be described as the sum of all cycles
\(w_{c_i}\) for
\(i \in \{1, \ldots , |\mathcal {C}|\}\) in the set. The weight of a cycle is determined by the sum over all edge weights
\(w_e\) in the cycle. Finally, the weight of an edge is determined by the sum over the results of all matching criterion which are multiplied by a weight which can be assigned by medical experts to highlight certain biomedical factors. Note that we write this computation as a dot product between a vector
\(\vec {p}(k, l)\) and
\(\vec {w}\) where
\(\vec {p}(k, l)\) contains the results of the matching between pair
k and
l in vector form and
\(\vec {w}\) the respective weights of each criterion. Equation (
1) describes the previous conditions. To achieve the described result, we greedily select disjoint cycles in decreasing priority according to the weight of each individual cycle.
$$\begin{aligned}&\max \sum _{i=1}^{|\mathcal {C}|} \overbrace{\left( \sum _{j=1}^{|cLen|} \underbrace{\left( \vec {p}(k, l) \cdot \vec {w} \right) _j}_{:=w_{e_j}} \right) _i}^{:=w_{c_i}}\qquad&\end{aligned}$$
(1)
Note that our solution is a local optimum which is computed with a greedy algorithm while the solutions by Breuer et al. [
9,
10] are globally optimal. We argue that a locally optimal solution is sufficient in our application scenario for two reasons: First, we assume that the locally optimal results are in close proximity of the global optimum, as real world data sets will likely show sparse compatibility and the additional medical compatibility factors considered by SPIKE will increase the solution quality. Second, the additional expert evaluation following the algorithmic matching will most likely introduce a much higher variance in the chosen solution. The empirical evaluation of those two claims are interesting points for further research requiring real-world kidney exchange data sets. The protocol presented by Breuer et al. [
10] enables usage in a dynamic setting, i.e., a setting in which donor-recipient pairs are put together in a pool where pairs come and go over time. They run their matching protocol on a subset of the pairs of the pool and, afterwards, evaluate the resulting compatibility graph. By design, SPIKE enables usage in a dynamic setting, too, since each part of the protocol can be executed independently of the others parts as long as they receive the output of the previous parts. Such a dynamic setting can also be adapted to an outsourcing scenario. Each input party has their own pool of donor-recipient pairs where they can select a random subset of pairs and send them to the computing parties.
Compatibility matching
The first phase of SPIKE is called
compatibility matching. In this phase, we compare the pair-wise general compatibility and match quality of all donors and recipients with respect to human leukocyte antibodies and antigens, ABO blood group compatibility, age, sex, and weight. The output of this phase is a weighted compatibility graph, where the edge weights indicate the probability of compatibility for each pair.
Table 4
\(\texttt {matchHLA}(\langle \mathsf {hla_d}\rangle ^{\mathcal {B}}\): vector, \(\langle \mathsf {ahla_r}\rangle ^{\mathcal {B}}\): vector) \(\rightarrow\) int
We present the main protocols for the compatibility assessment in the following. The subprotocols for assessing the individual matching criteria HLA mismatches, ABO blood type, age, sex, and weight are given as Additional file
1: Tables S1–S6 in the Appendix.
The HLA crossmatch subprotocol is shown in Table
4. It tests whether the human leukocyte antigens of the donor are unsuitable to the human leukocyte antibodies of the recipient rendering them incompatible.
The subprotocol takes a vector with the antigens of a donor
\(\mathsf {hla}_d\) and a vector with the antibodies of the recipient
\(\mathsf {ahla}_r\) as input. The number of observed HLA, denoted by
\(|\mathsf {HLA}|\), is publicly known. A vector
\(\mathsf {comp}\) stores whether the recipient possesses an antibody against any of the donor’s HLA (cf. Line 3). For enhanced efficiency, we parallelize this comparison as
Single Instruction, Multiple Data (SIMD) operation, such that all HLA matches of one patient are computed in just one step. Afterwards, the overall compatibility (i.e., no antigen-antibody mismatch was found) is computed with
\(\texttt {OR}\) gates in a tree structure, to reduce the (multiplicative) dephts of the circuit from
\(|\mathsf {HLA}|\) to
\(\log _2(|\mathsf {HLA}|)\). To prepare for further processing, we invert
\(\mathsf {combined}\) and return it as result of the HLA crossmatching in Line 6.
Table 5
\(\texttt {computeCompatibilityGraph}(\langle \mathsf {pairs}\rangle ^{\mathcal {B}}\): vector, \(\langle \mathsf {w}\rangle ^{\mathcal {A}}\): vector) \(\rightarrow\) weighted adjacency matrix
In Table
5, we present our MPC protocol that combines the results of the evaluated six medical criteria influencing the compatibility of a kidney donation into a weighted adjacency matrix indicating the donor-recipient compatibility, named
\(\mathsf {compG}\).
It takes a vector
\(\mathsf {pairs}\) containing all possible pairs of donors and recipients and a vector
w with a weight for each criteria (i.e., how much it influences the overall probability for good compatibility compared to the other factors) as input. Lines 4 to 6 additively combine the computed weighted probability of each compatibility criterion and assign it to the respective edge representing the donor of the
i-th pair and the patient of the
j-th pair, where
\(i\ne j\) and
\(i,j\in \{0,\ldots ,|\mathsf {pairs}|-1\}\). In Line 7, we additionally check whether the
i-th donor and the
j-th patient exhibit general immunological compatibility using the HLA crossmatch subprotocol (cf. Table
4). If this is the case, we store the result of the edge weight at the respective index, otherwise, we store the secret shared constant 0.
MPC Cost. The two sections in Table
4 evaluate
\(|\mathsf {HLA}|\) \(\texttt {AND}\) gates (as SIMD) and
\(\log _2(|\mathsf {HLA}|)\) \(\texttt {OR}\)7 gates, respectively. Finally, we invert
\(\mathsf {combined}\) once. This results in a circuit depth of
\(\log _2(|\mathsf {HLA}|)+1\) and a total number of
\(\texttt {AND}\) gates of
\(2\times |\mathsf {HLA}|\). Boolean sharing (
\(\mathcal {B}\)) is used in this protocol, as Boolean operations are performed and the circuit depths is low, thanks to the SIMD vectorization [
61].
To fully assess the matching quality (Table
5), all criteria have to be evaluated for each recipient, i.e., Table
4 and Additional file
1: Tables S1, S2, S4, and S6 are run
\(|\mathsf {pairs}|^2\) times. Then, in Table
5, we additionally evaluate five multiplications, five additions, one comparison, one
\(\texttt {AND}\) gate, and one
\(\texttt {MUX}\) gate. Due to the arithmetic operations in this protocol, the results of the compatibility evaluation protocols must be converted between
\(\mathcal {B}\) and
\(\mathcal {A}\).
Cycle computation
The second phase of SPIKE computes the number of possible kidney exchange cycles given a concrete input cycle length
8 from the compatible donors and recipients that were output by the compatibility matching. Our MPC protocol for this part is shown in Table
6.
Table 6
\(\texttt {determineNumberCycles}(\langle \mathsf {compG}\rangle ^{\mathcal {A}}\): matrix) \(\rightarrow\) number of cycles
Table
6 takes the secret shared weighted compatibility graph
\(\mathsf {compG}\) as input. The desired length of cycles
\(\mathsf {cLen}\) is public. We first compute the unweighted adjacency matrix in Line 2 (cf. Additional file
1: Table S7, in the Appendix). For the unweighted matrix, we compute the
\(\mathsf {cLen}\)-th power using a naïve implementation
9. The entries in this resulting matrix indicate how many paths of length
\(\mathsf {cLen}\) start at vertex
i and end at vertex
j. For cycles, the entries are on the diagonal, as start- and end-vertex are identical. Following this thought, the sum of the entries of the diagonal is the total number of cycles with the given cycle length
\(\mathsf {cLen}\). Note that this number contains duplicates, namely, “congruent” cycles that are the same, but were found via a different start/end vertex.
10 We remove the duplicates later in Additional file
1: Table S9 (described in the Appendix).
MPC Cost. Table
6 contains mostly arithmetic operations (
\(|\mathsf {pairs}|^3\) multiplications and
\((|\mathsf {pairs}|^3-|\mathsf {pairs}|^2\) additions), however, the computation of the unweighted adjacency matrix is most efficiently performed in
\(\mathcal {B}\) \(|\mathsf {pairs}|^2\) comparisons and
\(\texttt {MUX}\) gates). For that reason we convert
\(\mathsf {compG}\) from
\(\mathcal {A}\) to
\(\mathcal {B}\) (cf. Line 1) and back (in Additional file
1: S7).
Table 7
\(\texttt {findCycles}(\langle \mathsf {compG}\rangle ^{\mathcal {Y}}\): matrix, \(\mathsf {cCycle}\): vector, \(\langle \mathsf {allCycles}\rangle ^{\mathcal {Y}}\): vector, \(\langle \mathsf {weight}\rangle ^{\mathcal {Y}}\): int, \(\langle \mathsf {valid}\rangle ^{\mathcal {Y}}\): int) \(\rightarrow\) vector of tuples
Cycle evaluation
The third phase of SPIKE then identifies the most likely successful unique exchange cycles consisting of compatible pairs of donors and recipients and sorts them in descending order with respect to their weight.
Our first subprotocol for this phase, shown in Table
7, finds all exchange cycles of the desired length (including duplicates) and computes the weight of each cycle. This weight is the sum of all included weighted edges. As mentioned before, the weight associated with an exchange cycle indicates the probability of the transplantation being successfully carried out, i.e., its robustness.
The subprotocol takes the secret shared compatibility graph
\(\mathsf {compG}\) output by Table
5, the currently analyzed exchange cycle
\(\mathsf {cCycle}\), its secret shared weight
\(\mathsf {weight}\), a secret shared counter
\(\mathsf {valid}\), which tracks the number of edges in
\(\mathsf {cCycle}\), and a vector of secret shared tuples
\(\mathsf {allCycles}\), which will be consecutively filled with all possible exchange cycles and the corresponding sum of weights. In a recursive execution of Subprotocol
7, this vector is filled, as explained in detail in the following. The desired output cycle length
\(\mathsf {cLen}\) and the number of recipient-donor pairs
\(|\mathsf {pairs}|\) are public. Contrary to the protocols in [
9,
10], the output number of cycles
\(|\mathsf {cycles}|\) found in Table
6 is revealed for efficiency reasons. We consider this leakage as acceptable since it leaks only a very high-level aggregate property, generally not allowing the inference of the compatibility graph’s topology
11. In the legal sense, the revealed number is considered non-sensitive as well, as it is an aggregated, anonymized datum.
Table
7 first checks if the currently analyzed exchange cycle
\(\mathsf {cCycle}\) already has the desired length
\(\mathsf {cLen}\). If this is the case, the weight of the last edge is added to the respective sum of this cycle’s weights in Line 2. Next, each valid
\(\mathsf {cCycle}\) is added to
\(\mathsf {allCycles}\) with its respective sum of weights. A
\(\mathsf {cCycle}\) is valid, if it is closed (cf. Lines 3–4). An invalid cycle is associated with weight zero (cf. Line 5). Note that a weight of zero does not contribute to the solution, hence a cycle with weight zero is never considered for a solution. In Line 8, the operations done in Lines 2–3 are reverted to restore the state of
\(\mathsf {cCycle}\) before the last edge was added, i.e., the weight of the last edge is subtracted from
\(\mathsf {weight}\) and
\(\mathsf {valid}\) is decreased by 0 (no edge) or 1 (edge).
Cycles that do not have the desired length yet are handled in Lines 10–21. For these exchange cycles, the subprotocol checks whether they are already part of
\(\mathsf {cCycle}\), as each vertex may only appear at most once (cf. Line 11). If it is not included, the weight of the edge from the previous to the new vertex is added by increasing
\(\mathsf {cCycle}\)’s weight and counter
\(\langle valid \rangle ^{\mathcal {Y}}\), and the new vertex is added to
\(\mathsf {cCycle}\) (cf. Lines 14–16). Afterwards, Table
7 is recursively called again with the newly added vertex. Once the function returns, we revert the operations done before to be able to analyze the next cycle (cf. Lines 18–19).
The second subprotocol of the cycle evaluation (cf. Additional file
1: Table S9 in the Appendix) removes duplicates from the exchange cycles set. It extracts
\(\#{\mathsf {unique}}=\lfloor \frac{\#{\mathsf {cycles}}}{\mathsf {cLen}} \rfloor\) cycles and returns the
k cycles with the highest probability for a successful transplantation.
Table 8
\(\texttt {evaluateCycles}(\langle \mathsf {compG}\rangle ^{\mathcal {Y}}\): matrix) \(\rightarrow\) vector of tuples
Table
8 combines the previously discussed subprotocols. It first calculates the sum of weights for each cycle with Table
7 (findCycles) and sorts the result using Additional file
1: Table S8 (kNNSort), such that only the
k cycles with the largest weight are output. Those are all valid cycles, possibly including duplicates. Afterwards, the protocol removes all duplicates within the
k cycles.
MPC Cost. The complexity of Subprotocol
7 depends on the number of pairs
\(|\mathsf {pairs}|\),
\(\mathsf {cLen}\), and the number of possible cycles
\(|\mathsf {allCycles}|\). It is most efficient in
\(\mathcal {Y}\) , as the
\(\texttt {MUX}\) gates are not independent, thus, creating a deep circuit of depth
\(\mathcal {O}(|\mathsf {allCycles}| \times |\mathsf {cycles}| \times \mathsf {cLen})\). For removing duplicates and extracting the most robust
k exchange circuits, we evaluate
\(\#{\mathsf {cycles}}\times (\#{\mathsf {unique}}+\sum _{i = 0}^{\#{\mathsf {cycles}}}\left( \mathsf {cLen} \times (\mathsf {cLen}-1)\right)\)) comparisons,
\(\#{\mathsf {cycles}}\) \(\times\) \(\sum _{i = 0}^{\#{\mathsf {cycles}}}((\mathsf {cLen} \times (\mathsf {cLen}-1)))\) \(\texttt {AND}\) gates,
\(\#{\mathsf {cycles}}\) \(\times\) \(\sum _{i = 0}^{\#{\mathsf {cycles}}}\left( \mathsf {cLen}-1\right)\) \(\texttt {OR}\) gates,
\(\#{\mathsf {cycles}}\) \(\times \#{\mathsf {unique}} \times (1 + \mathsf {cLen}) + \#\mathsf {cycles}\) \(\texttt {MUX}\) gates. This step is most efficient with
\(\mathcal {Y}\) , as the circuit is very deep. Thus, the complete cycle evaluation routine is most efficient in
\(\mathcal {Y}\) , as each of our subroutines is most efficient in
\(\mathcal {Y}\).
Solution evaluation
The fourth phase of SPIKE determines the final output, a set of
disjoint exchange cycles exhibiting the highest probability for a successful transplantation. As a pair of donor and recipient can only be involved in one exchange cycle, the output sets must be vertex disjoint. Thus, the resulting set contains a combination of disjoint exchange cycles that greedily maximizes the number of exchanges with respect to the likelihood of the transplantation being successful. Note that we find a locally optimal solution, which might differ from the globally optimal solution
12. The locally optimal solution is computed using a greedy algorithm. This last part of SPIKE is shown in Table
9.
Table 9
\(\texttt {evalSolution}(\langle \mathsf {filteredCycles}\rangle ^{\mathcal {Y}}\): vector of tuples) \(\rightarrow\) tuple(int, vector of vectors)
Table
9 takes a secret shared vector of tuples
\(\mathsf {filteredCycles}\) with all valid unique cycles and their respective weights, the number of valid cycles
\(|\mathsf {unique}|\), and the cycle length
\(\mathsf {cLen}\) as input. The number of pairs
\(|\mathsf {pairs}|\) is a public variable as before.
It checks each valid cycle
\(\mathsf {cCycle}\) whether it is disjoint from all other previously analyzed cycles in
\(\mathsf {tempSet}\). The MPC subprotocol for testing the disjointness is given in Additional file
1: S11 in the Appendix. If it is disjoint,
\(\mathsf {cCycle}\) is added to the set of potential solutions (Lines 16–22). Finally, the set with the highest weight is returned. Details of the corresponding MPC protocol can be found in Additional file
1: Table S12 in the Appendix.
MPC Cost. In total, we evaluate \(|\mathsf {unique}|^2\) \(\texttt {ADD}\) gates, \(|\mathsf {unique}|^2 \times \mathsf {cLen}^2 + |\mathsf {unique}|\) comparisons, \(4 \times |\mathsf {unique}|^2 + |\mathsf {unique}|\) \(\texttt {MUX}\), and \(|\mathsf {unique}|^2 \times \mathsf {cLen^2}\) \(\texttt {OR}\) gates. The solution evaluation is most efficient in \(\mathcal {Y}\) , as there are only few arithmetic operations and mostly comparisons.