Introduction
Osteoclasts are multinucleated cells of haematopoietic lineage that resorb bone. They are typically cultured in vitro on a variety of physiological (e.g. cortical bone slices, dentine discs) or non-physiological (e.g. calcium phosphate-coated plates, tissue culture plastic or glass) substrates for analysis of cellular physiology, morphology, and biochemical endpoints. Typical osteoclast parameters measured include tartrate-resistant acid phosphatase (TRAP) positivity, number, and resorptive activity as well as multinuclearity (≥ 2 nuclei per cell) and actin ring/ruffled border formation [1–4]. Of these, number and resorption area provide valuable data about osteoclast formation and activity and have historically been quantified manually using image-processing software such as ImageJ [1]. Whilst this method enables user confirmation of individual osteoclasts and associated resorption events, it is time consuming, labour intensive and results in substantial intra- and inter-user variability. Thus, there is a clear need to develop an automated method that allows quick, easy, and accurate analysis of in vitro osteoclast cultures.
Attempts to automate in vitro endpoint analyses have been described but often rely on independent and sequential steps of (1) counting osteoclasts; (2) clearing cells from dentine/bone discs [1, 5–7]; and (3) separate measurement of the resorption area [8, 9]. These processes are time consuming and effectively destroy the experiment, preventing revisitation later (e.g. for imaging). Currently, the only attempt to simultaneously quantify osteoclasts and bone surface erosion has been performed on histological sections [10]. TrapHisto is an open-source software integrated into ImageJ that semi-automates histomorphometric analysis of static and dynamic bone turnover parameters, particularly resorption analysis [10]. Recent advances mean that new technologies such as machine learning (ML) can now be used to develop an automated workflow for in vitro osteoclast cultures. ML is an application of artificial intelligence (AI) where constructed mathematical models automatically learn from existing data to create an algorithm that produces accurate predictions from new observations without being explicitly programmed [11, 12]. Supervised ML, such as decision tree algorithms and random forest, requires labelled examples from training datasets. The algorithm learns from the labelled objects and generates a predictive model that accurately sorts new data objects into categories [11, 13, 14].
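To make the idea of supervised learning concrete, the following is a minimal sketch of a single-threshold decision tree (a "decision stump") trained on labelled examples. It is a toy illustration only, not the random forest used by ilastik; the feature (a normalised stain intensity per pixel), the data values, and the function names are all hypothetical.

```python
from typing import List


def train_stump(values: List[float], labels: List[int]) -> float:
    """Learn the intensity threshold that misclassifies the fewest labelled examples."""
    candidates = sorted(set(values))
    best_threshold, best_errors = candidates[0], len(values) + 1
    # Try the midpoint between each pair of neighbouring feature values.
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2
        errors = sum((v > threshold) != bool(l) for v, l in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def predict(threshold: float, value: float) -> int:
    """Sort a new, unlabelled example into class 1 (above threshold) or class 0."""
    return int(value > threshold)


# Hypothetical labelled training data: 0 = background pixel, 1 = stained cell pixel.
train_values = [0.10, 0.15, 0.20, 0.70, 0.80, 0.90]
train_labels = [0, 0, 0, 1, 1, 1]

threshold = train_stump(train_values, train_labels)
new_predictions = [predict(threshold, v) for v in (0.30, 0.85)]
```

A random forest combines many deeper trees, each trained on a random subset of the examples and features, and takes a majority vote; the principle of learning a decision rule from labelled data is the same.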
Application of ML methods has improved understanding and analysis efficiency of complex biological data and processes, especially in genomics, systems biology, and image analysis [11, 15, 16]. However, extensive computational and mathematical knowledge has historically been required to build such ML models, making their application to niche biological questions and processes difficult. The development of ilastik, a free, open-source supervised ML-based bio-image analysis software, has since enabled researchers without computational expertise to develop methodologies to rapidly execute complicated image analyses [17]. This user-friendly software contains pre-defined workflows that are adapted by the operator to create bespoke image analysis pipelines whilst completely shielding users from the mathematical and computational complexities required to build the random forest algorithm [14, 17, 18]. Applications of ilastik include measuring neuronal nuclei and cell bodies and osteoblast differentiation from mesenchymal stem cells [19, 20].
Historically, automatically quantifying osteoclasts in vitro has proven challenging due to the non-uniform cell shape, size, and considerable spacing between nuclei and the cytoplasm of single osteoclasts [8, 21]. Four recent reports have built complex AI-based models to quantify TRAP+ or fluorescently labelled osteoclasts cultured on plastic but not bone or dentine [22–25]. Resorption parameters were not quantified in any of these models [22–25]. To date, ML, specifically ilastik, has not been applied to simultaneously measure osteoclast culture endpoints such as osteoclast number and resorption area for cells grown on physiologically relevant substrates. Therefore, the aim of this study was to develop and validate an automated image segmentation workflow in ilastik to reliably and robustly quantify osteoclast number and resorption area in vitro.
Discussion
In vitro cultures are widely used to study osteoclast biology. The unique nature of these cells means that analysis of osteoclast culture endpoints is typically performed manually and/or involves clearance of osteoclasts from the resorptive surface [7, 9]. However, these manual analysis methods are time consuming, labour intensive, and subjective. This work has utilised freely available software to develop and validate an automatic image segmentation workflow that enables quick, accurate, and reproducible quantification of in vitro osteoclast culture endpoints. The significant experimental advantages of this new method compared to established manual techniques are shown in Table 2.
Table 2
Advantages of using ilastik-based, automated osteoclast endpoint quantitative methods
Parameter | Manual quantification | ilastik-based automated quantification |
Intra-user variability | High | Very low |
Inter-user variability | High | Very low |
Training time/experience needed | Training 1–2 h; data produced is influenced by user experience | 1 h to watch tutorials & install software; data produced not influenced by user experience |
Analysis time: osteoclast number | ~ 5 min/disc; 80 disc experiment = ~ 7 h researcher time | < 1 min/disc; 80 disc experiment = ~ 5 min researcher time to set up workflow then ~ 1 h automated analysis |
Sequential number and resorption quantification | Yes, but time consuming | No, osteoclast number only |
Suitable for use with cells grown on plastic and dentine | Yes | Yes |
Applicable to different species | Yes | Yes |
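The analysis-time figures in Table 2 follow from simple arithmetic on the per-disc timings. The short sketch below reproduces that calculation; the function names are illustrative, and the automated per-disc time is taken as 1 min, the upper bound of the "< 1 min/disc" entry.

```python
def researcher_hours(n_discs: int, minutes_per_disc: float) -> float:
    """Total hands-on analysis time in hours for an experiment of n_discs."""
    return n_discs * minutes_per_disc / 60


def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, expressed as a percentage."""
    return 100 * (before - after) / before


# Manual counting: 80 discs at ~5 min/disc, i.e. roughly 7 h of researcher time.
manual_hours = researcher_hours(80, 5)

# Per-disc time falls from ~5 min (manual) to ~1 min (automated).
time_reduction = percent_reduction(5, 1)

print(f"Manual: {manual_hours:.1f} h; per-disc time reduction: {time_reduction:.0f}%")
```

Note that the automated run time is largely unattended, so the saving in hands-on researcher time is larger than the per-disc figure alone suggests.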
Ilastik, an ML-based imaging software, was trained to identify pre-osteoclasts, osteoclasts, resorption pits, and the dentine disc. Extensive testing revealed that the algorithm could accurately identify osteoclasts and distinguish between pre-osteoclasts and mature cells; however, detection of resorption pits was less reliable. To determine if this approach was sensitive enough to detect increases or decreases in osteoclast number, the algorithm was validated using two pharmacological agents and co-culture with MCF7 cells. First, treatment with the bisphosphonate zoledronate (10 nM) reduced osteoclast number, irrespective of the quantification method used. This is consistent with previous reports that show an inhibitory effect of zoledronate on osteoclast number using manual quantification [33–35]. Second, osteoclasts were cultured with ticagrelor, a P2Y12 receptor antagonist typically used to inhibit platelet aggregation [36]. Dose-dependent decreases in osteoclast number were detected by both manual and automated methods. This is in line with an earlier study that also reported a ~ 60% reduction in osteoclasts at 10 μM ticagrelor [37]. Finally, an increase in osteoclast number was robustly detected by the ilastik model upon co-culture with MCF7 breast cancer cells. This is consistent with previous reports which show that MCF7 cells can promote osteoclastogenesis [31, 32]. Taken together, these findings suggest that the developed algorithm can be implemented to identify treatment effects (inhibitory or stimulatory), address biological questions and sensitively quantify subtle differences in osteoclast number.
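Once pixels have been assigned to a class, obtaining a cell count amounts to grouping connected pixels into objects and discarding objects below a size threshold. The sketch below illustrates that general idea with a breadth-first flood fill on a small binary mask. It is not ilastik's own implementation; the mask, the 4-neighbour connectivity, and the `min_pixels` threshold are hypothetical choices made for illustration.

```python
from collections import deque
from typing import List


def count_objects(mask: List[List[int]], min_pixels: int) -> int:
    """Count 4-connected regions of 1s that contain at least min_pixels pixels."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Flood-fill one object and measure its size in pixels.
            queue = deque([(r, c)])
            seen[r][c] = True
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if size >= min_pixels:
                count += 1
    return count


# Hypothetical segmentation output: 1 = pixel classified as osteoclast.
mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
]

n_cells = count_objects(mask, min_pixels=3)  # ignores the isolated single pixel
```

Changing `min_pixels` changes the count, which is why the size threshold needs to be held constant across users and experiments.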
Although accurate segmentation of bone marrow-derived mouse osteoclasts was achieved, absolute osteoclast number was usually lower than manually obtained values. The likely explanation for the absolute differences is the significant intra- and inter-operator variation in manually quantified values, which prevents the establishment of a ground truth. Ground truth is a set of measurements that are known to be accurate and is used to assess the accuracy of a developed ML model. Operator variability is rarely reported within the literature despite manual quantification being the gold standard for measuring osteoclast parameters in vitro. In histomorphometric analyses, Tong et al. reported manual variability of ≥ 50% when analysing the same histological sample on six different occasions even with strictly defined parameters [38]. In the current study, intra-user variation was assessed across 2 users by quantifying the same discs over 2–3 consecutive years. Significant differences in the osteoclast number obtained were observed in user 1 (a PhD student with no prior experience quantifying osteoclast culture endpoints), but not user 2 (an established researcher with > 20 years’ experience of manual osteoclast quantification). This suggests that user experience is likely a major factor influencing variability. Similarly, minor image modifications (e.g. brightness and contrast) to better visualise osteoclasts and resorption pits during manual analysis may also contribute to user variation. Despite differences in absolute osteoclast number, similar trends were reported between users. Consequently, the accuracy of the trained model was estimated by qualitative assessment of segmented images and comparing treatment responses, rather than absolute numbers, between both quantification methods.
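Comparing treatment responses rather than absolute numbers can be sketched as normalising each method's counts to its own control. In the example below all counts are hypothetical and chosen only to show how two methods can disagree on absolute number yet agree on the relative treatment effect.

```python
def percent_of_control(treated_mean: float, control_mean: float) -> float:
    """Express a treated-group count as a percentage of the matching control."""
    return 100 * treated_mean / control_mean


# Hypothetical mean osteoclast counts per disc (not data from this study).
manual = {"control": 200, "treated": 80}
automated = {"control": 150, "treated": 60}

manual_response = percent_of_control(manual["treated"], manual["control"])
automated_response = percent_of_control(automated["treated"], automated["control"])

print(f"Manual: {manual_response:.0f}% of control")
print(f"Automated: {automated_response:.0f}% of control")
```

Here the automated absolute counts are lower than the manual ones, but both methods report the treated group at the same percentage of control.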
The ilastik algorithm variance is 1.5% and represents a 93% reduction in user variability for osteoclast number compared to the manual method (Table 2). Furthermore, no differences in osteoclast number were recorded upon re-analysis of the same image sets, irrespective of image orientation. This highlights the robustness and reliability of this new automated osteoclast quantification method, which can also reduce the inherent analysis variability posed by inexperienced users. Similar reductions in user variability upon automation of histomorphometric analyses have been reported [10, 39–41]. In contrast, the recent AI-based models quantifying in vitro osteoclasts on plastic did not measure improvements in operator variability from manual counting methods [22–25]. The ilastik model presented in this study requires limited operator input of defined parameters (as defined in Supp. Fig. 1B) for image segmentation and no algorithm re-training prior to implementation, further limiting the introduction of user variation. It should, however, be noted that variability could be introduced should users alter the original training file, image scale, or osteoclast size threshold from what has been described and optimised. Furthermore, image quality (e.g. brightness, staining) can impact osteoclast quantification. For example, homogeneous TRAP staining is essential for accurate image segmentation, particularly when quantifying larger osteoclasts. Alterations to the pixel features (e.g. colour, brightness, texture, edge) modify the random forest decision surface in ilastik for classifier categorisation [17], which impacts the accuracy of the model. Consequently, image settings were optimised here to ensure appropriate segmentation of classifiers, including a defined exposure time range, saturation and gain that are applicable across all images and users.
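One common way to express this kind of repeat-measurement variability is the coefficient of variation (CV) of repeated counts of the same disc. The sketch below assumes CV as the metric, which may differ from the measure used in this study, and uses hypothetical counts purely to illustrate the calculation of a relative reduction in variability.

```python
from statistics import mean, stdev
from typing import Sequence


def coefficient_of_variation(counts: Sequence[float]) -> float:
    """Sample standard deviation as a percentage of the mean."""
    return 100 * stdev(counts) / mean(counts)


def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, expressed as a percentage."""
    return 100 * (before - after) / before


# Hypothetical repeat counts of one disc (not data from this study).
manual_repeats = [100, 120, 80]
automated_repeats = [100, 101, 99]

manual_cv = coefficient_of_variation(manual_repeats)
automated_cv = coefficient_of_variation(automated_repeats)
variability_reduction = percent_reduction(manual_cv, automated_cv)

print(f"Manual CV {manual_cv:.1f}%, automated CV {automated_cv:.1f}%, "
      f"reduction {variability_reduction:.0f}%")
```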
Overall, this user-friendly ilastik model shows that simple microscopy and staining can be used to robustly detect osteoclasts from different species (mouse and human), sample illumination (reflective light and brightfield) and seeding substrate (dentine disc and plastic) without additional re-training of the model. Furthermore, this pipeline reduces analysis time by 80%, whereby osteoclast number from 1 disc is obtained in ~ 1 min compared to ~ 5 min when counted manually. Recently, Cohen-Karlik et al. trained a deep ML algorithm by manually contouring each cell cultured on plastic to classify TRAP-stained pre-osteoclasts, mature osteoclasts (3–14 nuclei) and hyper-nucleated osteoclasts (≥ 15 nuclei) [22]. Alternatively, Maurin et al. fluorescently labelled nuclei, F-actin, and microtubules and used CellProfiler™ to automatically segment primary osteoclasts cultured on tissue culture plastic [23]. However, unlike ilastik, these pipelines are time consuming and reliant on extensive and complex mathematical and computational knowledge for their manual construction and subsequent re-training for individual operators’ pipelines. In contrast, our model is quick, easy-to-use, flexible and readily implementable (with associated tutorial resources) without any need for classifier re-training or mathematical and programming knowledge. This represents one of the main advantages of this algorithm over other previously reported automated models.
Whilst this model is very effective at measuring osteoclast number, further work is necessary to incorporate the unique features of osteoclasts (e.g. multinucleation, actin ring) into an ilastik workflow for in vitro endpoint analysis. For example, although TRAP staining is an excellent way of staining osteoclasts, using it to visualise nuclei is more problematic, primarily because it is very easy to overstain cells. Thus, an alternative staining approach similar to that of Maurin et al. [23] would be required to identify and quantify the number of nuclei per osteoclast. However, if a new staining method were used, an entirely new ilastik model would need to be generated, trained and validated.
It is important to emphasise that this ilastik-based model has been optimised for in vitro osteoclast cultures, particularly dentine-cultured osteoclasts. Therefore, the algorithm parameterisation and training required to develop this method are specific to these conditions. Although plastic-cultured osteoclasts can be detected by the model, we advise that segmented images are reviewed for erroneous classification as the model has not been specifically trained and optimised to identify plastic-cultured osteoclasts. Furthermore, this model is not readily transferable to other workflows where osteoclast quantification is needed (e.g. histology, histomorphometry). In principle, however, the ilastik software can be used to construct a new model for analysis of tissue sections.
Although the automated segmentation of osteoclasts was successful, accurately detecting resorption events proved challenging. Resorption pits were reliably identified in training but not during validation of image sets, suggesting that this classifier may be overfitted. Overfitting refers to over-specific training of the algorithm that minimises its generalised predictive power when exposed to new data. Whilst ilastik operates on minimal brushstroke annotations to train classifiers, it was necessary to add more brushstrokes to differentiate the pixel features at the resorption pit-dentine disc boundary. Similar difficulties assessing the resorption boundary have been previously reported [42]. Furthermore, the inherent variation between primary cultures, TRAP staining and the heterogeneity of the dentine disc surface hinders the determination of optimal pixel features that are generalisable. Thus, providing more example images to train the ilastik model would be unlikely to improve the sensitivity of resorption pit delimitation. Use of a grid overlay to manually quantify resorption area remains the gold standard, but grid size and area are seldom reported, leading to operator variability across research centres [43–45]. Semi-automatic methods are available to analyse resorption area but require the removal of cells from the discs, effectively destroying the experiment, and still introduce user variability [9, 10, 42]. It is, therefore, likely that more complex models, such as deep learning (DL), will be required to fully automate the simultaneous quantification of both osteoclast number and resorptive activity. DL has already successfully quantified osteoclast and nuclei numbers [22, 24, 25], but not resorption events. Due to its greater number of processing layers, DL could discover complicated feature patterns in large datasets that better delimit the resorption pit-dentine disc boundary for osteoclast activity analysis.
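A gap between training and validation performance, as described for the resorption pit classifier, can be quantified by scoring predicted masks against reference annotations on both image sets. The sketch below uses intersection over union (IoU) as the score; the metric, the flattened toy masks, and the 0.2 gap threshold are illustrative assumptions rather than the procedure used in this study.

```python
from typing import Sequence


def iou(predicted: Sequence[int], reference: Sequence[int]) -> float:
    """Intersection over union of two binary masks given as flat 0/1 sequences."""
    intersection = sum(1 for p, r in zip(predicted, reference) if p and r)
    union = sum(1 for p, r in zip(predicted, reference) if p or r)
    return intersection / union if union else 1.0


def looks_overfitted(train_score: float, validation_score: float,
                     max_gap: float = 0.2) -> bool:
    """Flag a classifier whose validation score falls well below its training score."""
    return (train_score - validation_score) > max_gap


# Hypothetical pit masks: 1 = pixel classified/annotated as resorption pit.
train_iou = iou(predicted=[1, 1, 0, 0], reference=[1, 1, 0, 0])
validation_iou = iou(predicted=[1, 0, 0, 1], reference=[1, 1, 0, 0])

overfitted = looks_overfitted(train_iou, validation_iou)
```

In this toy case the classifier reproduces its training annotations perfectly but overlaps poorly with the reference on unseen data, which is the pattern associated with overfitting.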
In conclusion, an ML-based image segmentation workflow successfully identified mature osteoclasts, but not resorption events, and significantly reduced user variability and analysis time of in vitro endpoint quantification by 93% and 80%, respectively. This protocol is flexible to deviations in experimental set-up and can be readily implemented for standardised osteoclast quantification across skeletal research centres. The model and associated tutorials are freely available and readily implementable without any additional training or coding knowledge through this hyperlink: ILASTIK. Please contact the corresponding author if there are any issues accessing the files or if there are further questions.