Health-e-Child - IST-2004-027749 - Deliverable D.11.3

Inflammatory Diseases

Summary

  1. Clinical Introduction
  2. Task Objectives
  3. Data Collection
  4. Multi-modal Image Registration: Status & Perspectives [NEW]
  5. Volume Estimation: Status & Perspectives [UPDATED]
  6. Erosion Detection: Status & Perspectives [UPDATED]

Clinical Introduction

Juvenile Idiopathic Arthritis (JIA) is a very heterogeneous condition combining different forms of chronic arthritis of unknown origin. The onset of JIA ranges from several months after birth to the age of 16, with a peak at approximately 6 years of age. JIA primarily affects the synovia and the bones of the joints; moreover, chronic synovial inflammation may cause progressive joint destruction and serious functional disability. The impact of JIA on our society is significant, since it affects approximately one in 1000 children and represents the major cause of acquired disability in the paediatric population.

The causes of JIA have not yet been identified, to the extent that, in most cases, JIA is still a diagnosis of exclusion. Even if arthritis is not directly inherited, it seems very likely that both genetic and environmental factors are important for the manifestation of the disease. Hence, considerable effort has been devoted in the last few years to better understanding the causes of the disease and, consequently, to introducing more effective (drug) treatments. More specifically, one major issue of interest is the possibility of distinguishing among different levels of the disease, and thus of developing more specific and appropriate treatments for each subtype of patients. However, no statistically significant results have been obtained so far, and the current classification, which is based on clinical criteria, is still unsatisfactory: many of the proposed subgroups do not appear homogeneous enough. It is still not possible, for instance, to distinguish effectively, early in the disease course, patients who are most likely to develop joint damage (and who therefore require more aggressive treatment at an early stage) from those who have a mild disease course. Finally, in clinical trials, drug efficacy can be judged only on clinical parameters, since physicians still lack quantitative measures that allow early identification of both the progression of joint damage and the effect of drugs on disease progression.

From the considerations above, it is clear how important it is to identify reliable indicators of JIA activity that can provide quantitative information about the disease. This goal is the main motivation of the research work outlined in this report, in which we exploit the information contained in radiological images and try to integrate it with clinical data.

Several imaging techniques have been tested for their ability to visualise and quantify both synovitis and erosions. Our hypothesis is that automatic analysis of the acquired images can lead to the desired quantitative indicators. However, not all imaging modalities can be conveniently used for this purpose. For example, X-rays can show bone erosion, cartilage loss (indirectly, through joint space narrowing), and joint misalignment, but cannot directly visualise synovia, joint effusion, articular cartilage, bone marrow, or ligaments and tendons. Thus, X-ray scoring methods allow joint damage evaluation in JIA, but are insensitive to synovitis.

Magnetic resonance imaging (MRI) is able to image synovitis and bone oedema/inflammation as well as damage to cartilage and bone, and it can detect erosive changes with greater sensitivity than radiography, particularly in early stages of the disease. MRI is also capable of detecting tendon pathology and evaluating ligament integrity.

Musculoskeletal ultrasound (MSUS) has been shown to be superior to clinical examination in the diagnosis and localisation of joint effusion, bursal fluid collection and synovitis. A number of studies have described improved sensitivity for detection of bone erosions in joints with the use of ultrasound as compared with conventional plain radiography. It is the imaging modality of choice for the diagnosis of tendon pathology. Ligament, muscle, peripheral nerve and cartilage pathology can also be readily investigated by MSUS. It allows a quick, safe and inexpensive access to otherwise undetectable anatomical information on the early targets of most rheumatic diseases.

MSUS and MRI are, therefore, potentially powerful imaging tools for assessing joint inflammation and the progression of joint damage. However, to date, little information exists on the use of these imaging techniques in JIA, and standardised, validated, and feasible assessment systems are lacking.

The value of MRI, as an advanced method to evaluate disease activity and disease damage in adults with rheumatoid arthritis, is under active investigation by a research consortium called Outcome Measures in Rheumatology Clinical Trials (OMERACT). However, the results that will be drawn from OMERACT studies are not directly applicable to children, because adult rheumatoid arthritis is different from JIA and because the growing skeleton of children needs a different approach. Indeed, in children ossification is incomplete and joint space width varies with age.


Task Objectives

Data analysis in the scope of JIA pathology has two main objectives: 1) the identification of meaningful features that can be extracted from medical data, and consequently 2) the modelling of joint damage for an early diagnosis and assessment of the disease. As already pointed out above, our two main sources of information are clinical data and images. More specifically, the hypothesis we want to investigate is that a first set of features can be extracted directly from radiological images of the joints, since we believe these are the most "characterising" features (i.e. they discriminate better among patients). In a second stage of the classification process, image-based features can be integrated with those extracted from clinical data, in order to obtain a statistically grounded model of the disease.

Even if both imaging modalities, MR and MSUS, have been shown to be very sensitive to early changes in the joints, our focus is only on the former, which appears better suited to quantitative measurements and to standardisation of the protocol. Henceforth, we assume that MR images are the input of our algorithms, and all the following discussions and assumptions apply specifically to this imaging modality.

From the wide literature on rheumatoid arthritis, we extracted the following set of features that have been reported to be effective in quantitative assessment of arthritis activity and damage.

Therefore, from an algorithmic point of view, our main objective is to design, implement and validate algorithms to 1) (semi-)automatically extract and 2) quantitatively measure the above features from MR images.

The measurement of the synovial membrane volume is of particular interest. Indeed, this volume has been shown to be an early predictor of joint damage in adult patients affected by rheumatoid arthritis (Østergaard et al., 1999) and osteoarthritis (Farrant et al., 2007), and the same is expected to hold for JIA. The detection of bone erosions is important as well, since in MR images, as opposed to conventional radiography, it can be performed successfully even in the early stages of the disease. Therefore both features, synovial volume and bone erosions, are important in modelling the disease course.

From a technical point of view, the measurement of the synovial membrane volume depends entirely on its segmentation. When working with 3D MR images with no gap or overlap between the slices, as is the case for our data, the segmentation of the synovia from the other tissues allows its volume to be estimated by counting the detected voxels. The actual volume is given by the number of voxels times the volume of a single voxel. The quality of the estimate depends on two factors: the accuracy of the segmentation, and the extent of the partial volume effect.
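The voxel-counting rule described above can be sketched in a few lines. This is only an illustration: the function name, the nested-list mask layout and the voxel dimensions are assumptions made for the example, not part of the project software.

```python
def estimate_volume_mm3(mask, voxel_size_mm):
    """Estimate the volume of a segmented region by voxel counting.

    mask: 3D nested list (slices x rows x columns) of 0/1 labels.
    voxel_size_mm: (dx, dy, dz) voxel edge lengths in millimetres.
    """
    dx, dy, dz = voxel_size_mm
    voxel_volume = dx * dy * dz  # volume of a single voxel in cubic mm
    n_voxels = sum(v for sl in mask for row in sl for v in row)
    return n_voxels * voxel_volume


# Toy mask with four labelled voxels; the voxel size is a made-up value.
mask = [
    [[0, 1, 1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0]],
]
volume = estimate_volume_mm3(mask, (0.5, 0.5, 2.0))
```

The sketch assumes contiguous slices, so that every voxel contributes exactly its own volume; it does not model the partial volume effect.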

In the case of bone erosions we are interested in their detection, rather than in a quantitative estimate of their volume. However, the problem is in many respects similar to the segmentation of the synovial membrane, since it involves the classification of voxels or patches as belonging to an erosion or not. The differences lie in the features used to describe the voxels/patches, and possibly in the use of methods to detect the bone contours (where the erosions are necessarily located).

Once the results of the two segmentation algorithms are available, it is possible to combine them in order to increase the confidence in the corresponding measurements. As will be discussed in detail in the section on data collection, for each enrolled patient there are MRI data of different modalities acquired at different times (baseline and follow-up). Therefore, in order to combine the results, we also need to register the input images of the same patient.

To summarise, the achievement of our (clinical) goals implies the solution of the following (algorithmic) Computer Vision problems:

In the following subsections we provide brief overviews of related works for each of the technical objectives.

Estimation of the Synovial Membrane Volume

The segmentation of the synovial membrane has often been performed manually by human operators (Østergaard et al., 1999; Farrant et al., 2007). However, this approach is extremely time-consuming, and the reliability of the results can suffer from intra- and inter-observer variability. Even if it is widely acknowledged that automatic segmentation can be advantageous in terms of both time and consistency of the results, relatively few works propose automatic methods for the segmentation of the synovial membrane, especially for the sites of interest to us, the hip and the wrist.

The automatic segmentation problem can be approached from two different directions: either by looking for the contours/surfaces of the regions in the 2D/3D image (i.e. contour-based segmentation), or by directly labelling the pixels/voxels (i.e. pixel-based segmentation). In the small body of existing work on the segmentation of the synovial membrane, the pixel-based approach is the dominant one. Surprisingly, the focus so far has been only on the simplest pixel-based segmentation method: thresholding in its various forms. In global thresholding the intensity range of the images is split into intervals and each pixel is labelled according to the interval its intensity belongs to. A comparison of the segmentation of the synovial membrane by thresholding with manual annotation is reported in (Østergaard, 1997). This technique is very basic and is necessarily highly sensitive to noise, image artifacts and inhomogeneities in the magnetic field. Moreover, since the value of each pixel is considered in isolation, the spatial coherence between neighbouring pixels cannot be taken into account. Another widespread segmentation method in medical image analysis is fuzzy c-means clustering, which has recently been applied to the segmentation of the synovia in the wrist (Tripoliti et al., 2007).
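To make the global thresholding scheme described above concrete, the following minimal sketch labels each pixel with the index of the intensity interval it falls in. The function name, the threshold values and the toy image are assumptions chosen for illustration; they are not taken from the cited works.

```python
from bisect import bisect_right


def global_threshold(image, thresholds):
    """Label each pixel with the index of its intensity interval.

    With thresholds [t1, t2] the intervals are (-inf, t1), [t1, t2), [t2, +inf).
    """
    cuts = sorted(thresholds)
    return [[bisect_right(cuts, pixel) for pixel in row] for row in image]


# Toy 2 x 3 image with 8-bit intensities and two hypothetical thresholds.
image = [[10, 120, 200], [90, 100, 255]]
labels = global_threshold(image, [100, 200])
```

Each label depends on a single intensity value only, which is exactly why the method ignores the spatial coherence between neighbouring pixels.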

For completeness, we now report some recent works on the segmentation of cartilage, because this problem shares some affinities with ours. (Folkesson et al., 2005a) and (Folkesson et al., 2005b) segment the cartilage in MR knee scans, with the aim of quantifying the progression of osteoarthritis. The proposed methods are both based on the combination of Approximate Nearest Neighbour (ANN) classifiers; an ANN classifier is similar to a plain k-Nearest-Neighbour classifier, but it looks for an approximate solution in order to reduce the computational cost. The method presented in (Grau et al., 2004) has also been applied to knee cartilage. It extends the popular watershed segmentation algorithm by introducing prior knowledge, obtained by estimating the posterior probabilities of the classes under study assuming a Potts model, a generalisation of the two-state Ising model to multiple classes. (Warfield et al., 2000) presented a semi-automatic segmentation method, applicable to a wide range of tissues (cartilage among them), which iterates between a classification step and a template registration step. In the classification step each voxel is assigned to the class most represented among its first k neighbours in the aligned template (which has been previously segmented). The intermediate classification result is then used to refine the non-rigid registration. (Tang et al., 2004) applied a modified version of gradient vector flow (GVF) snakes to the segmentation of ankle cartilage. In (Solloway et al., 1997) the active shape model is applied, unmodified, to the segmentation of femoral cartilage.

Detection of Bone Erosions

In order to detect bone erosions, two different approaches have been proposed so far. First, since it seems plausible to consider the erosions as “lesions” (or abnormalities) connected to the bone boundaries, a two-step procedure can be adopted in which the entire bone volume is pre-segmented, and the abnormalities are then searched for along the boundaries of the segmented volume. The alternative approach aims to detect directly the regions in the images where bone lesions are most likely to occur. The first approach has drawn more attention than the second one. Indeed, most of the previous works focused on finding the contours of the bones in 2D images (i.e. in each MR slice separately). However, these methods share the disadvantage that the performance of the detection depends on the accuracy of the delineation of the contours. In order to increase this accuracy, and thus to reduce the risk of misclassification, the algorithms usually rely on human initialisation and intervention.

On the other hand, a direct detection of erosions necessarily requires both a formal definition of what an erosion is (i.e. what it looks like in an MR image) and some assumptions about where it can occur; otherwise the number of false detections can severely degrade the performance of the system. This is because the local appearance of bone erosions is indistinguishable from that of many other artifacts that lie within the wrist area. As for the other approach, human contribution may significantly improve the performance of the system.

Therefore, if one aims at comparing and evaluating different methods for bone erosion detection, one important aspect that must be taken into account is the degree of autonomy, which refers to the need for human intervention either before or during the running of the algorithms. Ideally, physicians should be asked to assist the algorithm only rarely, and only if no other option is available. In fact, human intervention has been shown to be problematic when physicians are required to manually segment image regions, owing to the great variability of their responses. Indeed, it is still a matter of debate how to combine different manual segmentations performed either by the same physician several times or by two or more physicians. Bias can be introduced if this step is not performed carefully. Given all these caveats, we strongly believe that human intervention should be avoided almost completely. Therefore, later in this report we will propose a method to overcome this problem, by making the segmentation process more complicated but, at the same time, more robust.

The following is a more detailed discussion of some of the most popular algorithms that have been used so far to detect bone erosions.

Let's consider first the methods based on active contours [Kass et al. 1988]. In a nutshell, active contours look for a 2D curve minimising an energy made up of two terms: an internal energy which accounts for the elasticity and smoothness of the curve, and an external energy which accounts for the image features (for instance the edges extracted by Gaussian gradient filtering). Note that although the solution is highly dependent on the model initialisation, the method might be made more robust by adopting a coarse-to-fine strategy.
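For reference, the energy described above is commonly written as follows for a parametric curve v(s) = (x(s), y(s)), s in [0, 1]. The symbols (alpha and beta for the elasticity and smoothness weights, G_sigma for a Gaussian kernel, I for the image) are notation introduced here for illustration, and the gradient-based external term is only one common choice of edge feature.

```latex
E_{\mathrm{snake}}(v) = \int_0^1 \Big[ E_{\mathrm{int}}\big(v(s)\big) + E_{\mathrm{ext}}\big(v(s)\big) \Big] \, ds,
\qquad
E_{\mathrm{int}}\big(v(s)\big) = \tfrac{1}{2} \Big( \alpha \, \lvert v'(s) \rvert^2 + \beta \, \lvert v''(s) \rvert^2 \Big),
\qquad
E_{\mathrm{ext}}\big(v(s)\big) = - \big\lvert \nabla (G_\sigma * I)\big(v(s)\big) \big\rvert^2 .
```

The first-derivative term penalises stretching (elasticity) and the second-derivative term penalises bending (smoothness), while the external term is lowest on strong edges. Increasing sigma smooths the image more heavily, which is one way a coarse-to-fine strategy can be realised.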

This general approach has been applied, with minor modifications, to the segmentation of bones in both radiographs and MR images. The former is the case of the work described in [De Luis-Garcia et al. 2003], where the authors require an accurate segmentation of the finger bones in order to estimate the skeletal age of the patient. Their method modifies the general approach in two ways: first, they exploit domain knowledge to initially place the snakes on the phalanges and metacarpal bones; second, the external energy is the sum of the image gradient and of an energy which favours the expansion of the contour (introduced in [Cohen 1999] to make the method more robust to initialisation). Active contours have been applied to bone MR images in [Snel et al. 1998], who looked for the planar contours of the carpal bones in the MRI slices. Inspired by the work described in [Lobregt et al. 1995], they use only the normal component (with respect to the contour) of the external energy, which avoids the clustering of the points of the curve (henceforth called snaxels) that affected the original method. Moreover, in order to overcome the problems due to the varying image contrast along the contour, they also normalise the external energy by a local contrast level estimated in the neighbourhood of each snaxel. In both works, the snake can be refined, or snaxels fused together, depending on the distances between them.

Active contour methods do not use any prior knowledge on the shapes of the contours. Hence a more sophisticated class of methods has been introduced, called Active Shape Models (ASM), which try to exploit the prior information available for the specific problem by analysing the modes of variation of a set of representative examples (the so-called training set). In the medical imaging context, ASMs have already been applied to the segmentation of vertebrae in Dual Energy X-ray Absorptiometry (DXA) images [Smyth et al. 1997], to overcome the problem posed by the low signal-to-noise ratio and poor spatial resolution of DXA with respect to standard radiographs. The authors claim an accuracy comparable with that of human operators. In [Behiels et al. 2002], the performance of three different methods is assessed on 400 images of different bone structures: a standard ASM method, a probabilistic method based on the Maximum-a-Posteriori criterion [Wang et al. 1998], and an extension of ASM. This third method is essentially based on the introduction of a regularisation term in the fitting problem, which is then solved by minimal cost path (MCP) search.

Registration of multi-modality MR images

As explained above, the results obtained separately by the two segmentation algorithms (synovial membrane and bone erosions) could be improved if one were able to compare (spatially) the locations of the segmented regions. In fact, inflammation of the synovia seems to be caused by specific neurotransmitters released from inside the bones in the regions where an erosion has occurred. However, since our aim is to combine statistically meaningful (quantitative) features extracted from heterogeneous data sources (clinical data and images), it is possible that the advantages of working on registered images, rather than on the measures extracted from unregistered images, are quite limited. Indeed, other clinical factors could carry similar information, so that the information gained from registration would merely add redundancy to the dataset. Therefore, so far we have considered the registration of multi-modality MR images more as an exploratory tool than as an essential component of our statistical decision system.

The work we have done so far focuses on surveying the scientific literature on multi-modality registration of medical images. A complete report on the state of the art is still in preparation and will be made available in subsequent documents of the project.


Data Collection

Planned Data Acquisitions

The Health-e-Child study is designed as cross-sectional and prospective, with an expected patient sample of 300 JIA patients (100 in each of the 3 participating centres). Patients with JIA (according to ILAR classification) and with active arthritis in the wrist and/or hip are enrolled in the study. For enrolled patients, MRI and US of the wrist and/or hip are performed at study entry (baseline), and then at one and, when possible, two years of follow-up. MRI of the wrist is performed only in cooperating patients that do not require general anaesthesia.

The images are acquired with the protocol developed in the first year of the project, consisting of:

Fig.1 - Examples of the different types of acquisitions from the MR protocol (rows: T1 pre-contrast, T1 after 1.5 min, T2, T1 after 10 min; columns: slices at y=-70, y=-65, y=-60). The images in the same column are slices at the same position along the Y axis.

As explained in the section about the task objectives, the results of the image analysis will be integrated with other data in order to provide a model of the disease course. In this early stage of the research work, we are mostly interested in integration with clinical parameters (including standardised measures of disease activity and disease damage), routine laboratory data (including ESR, CRP, WBC and differential count, RBC, Hb, PTL, ANA, RF), and semi-quantitative scores for damage evaluation in XR, US and MR. At a second stage, we will also consider genetics and proteomics data.

Data Annotations

Our approach to image analysis is based on machine learning; more specifically, we design our algorithms in the setting of learning by example. Therefore, we need annotated examples, highlighting the structures of interest, in order to perform training, validation and testing of the methods we develop. Namely, we need a manual segmentation of the synovial membrane for the volume estimation and of the bone erosions for erosion detection. As we will see shortly, the former annotations must be particularly accurate in order to obtain an accurate estimation of the volume. For the annotation task we developed a software tool, described in the deliverable for WP12.

Fig.2 - Examples of manual annotations from two operators (rows: Operator 1, Operator 2; columns: three slice positions). The images in the same column are slices at the same position along the Y axis.

The analysis of the manual annotations performed on the same exams by two clinicians provides useful indications of the complexity of the problem. First, we can compare the annotations on a per-slice basis. If we compare the synovial membrane volumes resulting from the annotations, we can observe that: 1) they are highly correlated, with r=0.92 and p<0.001, but 2) there is an average absolute difference of 30.5%. The latter figure is mitigated when we look at the median absolute difference, which is 14.4%, but the fact remains that the volumes estimated by different observers show considerable variability. This variability does not disappear when whole exams are considered: some of the differences tend to balance out, and the average absolute difference drops to 15.2%, but the median absolute difference decreases only to 12.1%. The comparison between annotations from a classification point of view is also interesting. The voxels which have been classified by both observers as belonging to the synovial membrane make up 63% (on average) of the voxels marked as synovia by either of them. That is, more than one third of the voxels marked as synovia by at least one observer are NOT marked as synovia by the other.
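The agreement measures discussed above can be sketched as follows. The source does not state how the percentage differences were normalised, so dividing by the mean of the two volumes is an assumption, as are the function names and the toy data.

```python
from statistics import mean, median


def overlap_ratio(mask_a, mask_b):
    """Voxels marked by both observers, as a fraction of those marked by either."""
    both = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    either = sum(1 for a, b in zip(mask_a, mask_b) if a or b)
    return both / either if either else 1.0


def relative_abs_differences(volumes_a, volumes_b):
    """Absolute volume differences in percent of the mean of the two volumes.

    The normalisation by the mean is an assumption made for this sketch.
    """
    return [100.0 * abs(a - b) / ((a + b) / 2.0)
            for a, b in zip(volumes_a, volumes_b)]


# Toy flattened masks (1 = synovia) and per-exam volumes; not project data.
observer_1 = [1, 1, 1, 0, 0, 1]
observer_2 = [1, 1, 0, 1, 0, 1]
overlap = overlap_ratio(observer_1, observer_2)

diffs = relative_abs_differences([10.0, 20.0, 30.0], [12.0, 20.0, 24.0])
mean_diff, median_diff = mean(diffs), median(diffs)
```

Reporting both the mean and the median is useful here because a few slices with very small volumes can inflate the mean relative difference.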

These results, although partially due to operator errors, are also the product of an inherently difficult problem. The structures to be segmented are extremely thin (see Fig.2), and can often be discriminated only on the basis of anatomical knowledge rather than signal intensity. As a result of the thin shape of the structures, small pixel-wise errors on their boundaries can produce significant errors in the volume. This can be verified by subjecting the manual annotations to binary dilation and erosion, and comparing the resulting volumes with the original ones. In both cases there is an absolute change of over 30%.
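The sensitivity check described above can be reproduced on a toy example. The sketch below uses a 2D mask and a 4-connected structuring element, both of which are assumptions (the source does not specify the structuring element), and the thin band is a made-up stand-in for a real annotation.

```python
def _neighbourhood(r, c):
    # The pixel itself plus its 4-connected neighbours.
    return [(r, c), (r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]


def _morph(mask, combine):
    rows, cols = len(mask), len(mask[0])
    return [
        [
            int(combine(
                0 <= i < rows and 0 <= j < cols and bool(mask[i][j])
                for i, j in _neighbourhood(r, c)
            ))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def dilate(mask):
    """Binary dilation: a pixel is set if any pixel in its neighbourhood is set."""
    return _morph(mask, any)


def erode(mask):
    """Binary erosion: a pixel survives only if its whole neighbourhood is set."""
    return _morph(mask, all)


def area(mask):
    return sum(map(sum, mask))


# A two-pixel-wide band: a toy stand-in for a thin membrane.
band = [
    [0] * 8,
    [0] * 8,
    [0, 1, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 0],
    [0] * 8,
    [0] * 8,
]
original, dilated, eroded = area(band), area(dilate(band)), area(erode(band))
```

On this band a one-pixel boundary shift more than doubles the area under dilation and removes the structure entirely under erosion, which illustrates why thin shapes make volume estimates so sensitive to boundary errors.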

The analysis of the manual annotations, briefly summarised above, points out the complexity of the problem of accurately estimating the synovial membrane volume. It also sets out what might be a reasonable goal for the performance of an automatic method, which of course cannot beat the examples it is trained with.


Multi-modal Image Registration: Status & Perspectives

Both clinical objectives of synovial volume estimation and bone erosion detection would benefit from the preliminary registration of the multi-modal MR images included in each study. In particular, we face three types of registration problems:

The first problem is the simple registration of all modalities in a single MR study. This problem can be approached as a rigid registration problem, and its solution is needed in order to fully exploit all the information provided by the different modalities. Indeed, this type of registration makes it possible to locate the image voxels corresponding to the same physical point across modalities. The second problem occurs when considering pairs of baseline and follow-up studies. In this case it would be helpful to be able to find the voxels corresponding to the same anatomical point, in order to perform a visual or automatic comparison of image features indicating, e.g., bone erosions. This problem is considerably more difficult than the previous one, since natural growth, together with the pathological processes, produces deformations between the two studies which cannot be modeled as rigid. Finally, addressing the third problem would provide a standard frame of reference for all studies, which could potentially allow information about the position of the voxels to be exploited both in synovia segmentation and in erosion detection.

Currently we have addressed the first and third problems, that is, the multi-modal rigid registration either of all the MR series from a single patient's study, or of corresponding MR series from two studies of two different patients. For both problems we employ a registration algorithm based on the maximization of the mutual information (Mattes et al., 2003). The implementation is based on the open source library ITK (see http://www.itk.org). Note that although, technically, in the third problem the modalities are the same, in practice we found that maximizing the mutual information provides an efficient way to be robust with respect to the changes in image intensities and anatomy naturally occurring in this case. We show below some representative examples of the results obtained on our data. It can be readily seen that the three modalities are perfectly aligned. The inter-patient alignment also worked well, keeping in mind that pure rigid registration cannot compensate for the differences inherent in the different anatomies of the patients.
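To illustrate why mutual information suits multi-modal registration, the sketch below computes it from a plain joint histogram of two intensity lists. This is not the Mattes et al. formulation (which relies on Parzen-window density estimates) and it does not use ITK; the function name, the bin count and the toy signals are assumptions made for the example.

```python
from collections import Counter
from math import log


def mutual_information(fixed, moving, bins=8, max_value=256):
    """Mutual information (in nats) of two equally sized intensity lists."""
    width = max_value / bins
    pairs = [(int(f // width), int(m // width)) for f, m in zip(fixed, moving)]
    n = len(pairs)
    joint = Counter(pairs)                       # joint histogram
    hist_fixed = Counter(a for a, _ in pairs)    # marginal of the fixed image
    hist_moving = Counter(b for _, b in pairs)   # marginal of the moving image
    return sum(
        (count / n) * log(count * n / (hist_fixed[a] * hist_moving[b]))
        for (a, b), count in joint.items()
    )


# Toy 1D "images": same structure, different contrast (as across modalities).
fixed = [0, 0, 200, 200, 100, 100]
aligned = [250, 250, 30, 30, 120, 120]
shifted = [250, 30, 30, 120, 120, 250]  # same signal displaced by one sample

mi_aligned = mutual_information(fixed, aligned)
mi_shifted = mutual_information(fixed, shifted)
```

The aligned pair scores higher even though the intensities differ, because mutual information rewards a consistent intensity mapping rather than equal values. A registration algorithm searches for the transform parameters that maximise this score.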

Fig.3 - Examples of registered slices, at the same position along the z axis (rows: Exam #1, Exam #2; columns: modality #1, modality #2, modality #3).

The second problem mentioned above requires a non-rigid registration algorithm. A feasible approach would be the extension of the algorithm implemented for the rigid registration. However, unforeseen problems might force us to implement different approaches, such as derivatives of Thirion's algorithm (see Thirion, 1998).


Volume Estimation: Status & Perspectives

The accurate estimation of the volume of the inflamed synovial membrane is the first objective of our work. From an algorithmic point of view, this objective can be achieved by first segmenting the inflamed synovia from all other tissues, and then evaluating the volume of the segmented regions. The conversion between the volume in voxels of the segmented regions and their actual volume in cubic mm can be conveniently done by means of MR volumetric acquisitions. To this end, the clinical partners designed a protocol for MRI acquisitions, including series of T1-weighted, volumetric MR images of the hips or wrists of the enrolled patients. Two of the series are acquired 1.5 and 10 minutes after the injection of a Gd-DTPA bolus, which enhances the contrast of highly vascularized tissues, as is expected in the case of an active synovial pannus.

As we have seen in the section on data annotations, the manual segmentation of the enhanced parts is, at least in the wrist, tedious and time-consuming, and the results have a high inter-operator variability. These are serious hurdles to the use of the synovial volume as a predictive marker of future erosive damage. Therefore, any automatic or semi-automatic measurement method would be welcome, provided it could reduce both the variability and the time required for the measurement. However, one should not overlook the difficulties in designing such a method. These are due to the thin shapes under study (high sensitivity of the result to errors in the segmentation), and to the need to incorporate some prior anatomical knowledge (for discriminating between synovia and other enhanced structures).

We approach the problem as one of classification. Previously, we approached the problem in two stages. On a basic level, we wanted to classify the voxels as belonging to enhanced structures or not. Note that at this stage we were not yet interested in the discrimination between actual synovial membrane and other enhanced structures. At a second stage, however, once the enhanced voxels had been classified/segmented, we wanted to classify each group of connected voxels as belonging to the synovial membrane or not. This approach has been now superseded. Both levels of classification (enhanced vs. not enhanced, inflamed synovia vs. other enhanced structures) are addressed by a single classifier.

The classifier is obtained by training a discriminative function, represented by a linear combination of features selected from a large data-dependent dictionary. The training step is performed by solving a regularised least-squares problem, while the feature selection is currently performed via Orthogonal Matching Pursuit (OMP, see Davis et al., 1997).
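One way to write down the training scheme described above is given below. The notation is introduced here for illustration only: phi_j are the dictionary features, S is the set of selected features, (x_i, y_i) are the n training voxels with their labels, and lambda is the regularisation parameter. The squared-norm penalty is an assumption, since the source only states that the least-squares problem is regularised.

```latex
f(x) = \sum_{j \in S} w_j \, \phi_j(x),
\qquad
\hat{w} = \arg\min_{w} \; \frac{1}{n} \sum_{i=1}^{n} \big( y_i - f(x_i) \big)^2 + \lambda \, \lVert w \rVert_2^2 ,
\\[6pt]
\text{OMP step:} \quad
r = y - \Phi_S \hat{w}_S,
\qquad
j^\star = \arg\max_{j \notin S} \frac{\lvert \langle \phi_j , r \rangle \rvert}{\lVert \phi_j \rVert_2},
\qquad
S \leftarrow S \cup \{ j^\star \}.
```

At each iteration OMP adds the feature most correlated with the current residual r, refits the weights on the enlarged set S, and recomputes the residual. The number of iterations therefore controls how many dictionary features enter the classifier.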

The performance of the classifiers and of the volume estimation is tested by cross-validation. We iteratively set aside one or more annotated exams and train/validate the classifiers on the remaining data; the resulting classifiers are then tested on the held-out exams.
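The splitting scheme described above can be sketched as a small generator. The function name, the exam identifiers and the grouping of held-out exams in sorted order are assumptions made for the example.

```python
def leave_exams_out(exam_ids, n_held_out=1):
    """Yield (training_exams, test_exams) pairs, holding out whole exams."""
    exams = sorted(set(exam_ids))
    for start in range(0, len(exams), n_held_out):
        test = exams[start:start + n_held_out]
        train = [exam for exam in exams if exam not in test]
        yield train, test


# Hypothetical exam identifiers.
splits = list(leave_exams_out(["exam_a", "exam_b", "exam_c"]))
```

Splitting by exam rather than by voxel matters: voxels from the same exam are strongly correlated, so mixing them across training and test sets would give optimistic performance estimates.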

Fig.4 - Examples of segmentation (rows: Patient #1, Patient #2 (baseline), Patient #2 (follow-up); columns: slices at z=15, z=25, z=35). The images in the same column are slices at the same position along the Z axis.

The volume estimation system has an estimated precision (computed per slice, over the cross-validation loops) of 0.025 cc. For comparison, the precision of one human observer measured against a second one is 0.017 cc. In order to evaluate the usability of the system in a clinical setting, preliminary to validation, we computed its agreement with the volume computed by a clinician. The system achieves an intraclass correlation coefficient (ICC) of 0.85 (95% CI: 0.83-0.87), compared to an agreement between two human observers of 0.94 (95% CI: 0.92-0.95). These two results are excellent, especially since they have been obtained from a completely automatic system, without human intervention.


Erosion Detection: Status & Perspectives

As anticipated above, bone erosions play a fundamental role in the analysis and quantification of JIA progression. Specifically, an accurate quantification of the temporal changes of erosions between baseline and follow-up studies may be used to assess the actual outcome of specific drug treatments or therapies. The design and implementation of effective algorithms for the detection of erosions and the monitoring of their changes over time is a fundamental sub-goal of WP11, and our research work will focus mainly on this task in the upcoming months of the project.

Fig.5 - A T1 pre-contrast image of a wrist (left). The red square represents the region of interest for the bone erosion modules, and is magnified on the right.

Considerable preparatory work has been done in the past months to build most of the basic modules that make up our algorithmic pipeline for bone erosion. Some of these modules have already been completed and tested, while others (e.g. the detection of candidate ROIs and the baseline/follow-up association) still have to be thoroughly tested and validated. The remaining effort will be devoted to completing all the modules and integrating them into the general architecture. In parallel, we are going to investigate the statistical correlation between specific clinical parameters and the results of automatic image analysis, approaching the problem with the same methodologies used for the estimation of the inflamed synovial membrane.

Our work on the detection and estimation of bone erosions rests on two main pillars. First, since an accurate quantification of changes strongly depends on the correct registration of exams acquired at different times (baseline with respect to follow-up), a considerable amount of our work has gone into adapting and exploiting the results of task T11.3.3. Details about the registration part are discussed above. Second, based on preliminary results obtained with the bone segmentation algorithms that we tested at the beginning of WP11, we preferred to approach the detection of erosions directly as a supervised classification problem.

Our initial choice of the segmentation-based approach to erosion detection was motivated by the analysis of the state of the art, in which a prominent thread of research is the use of a two-stage scheme: one first segments the contours of the bone regions and then detects the groups of voxels on which the actual erosions lie. One of the main drawbacks of using such an approach on MR images is the poor contrast between the regions of interest and their surroundings, as opposed to the radiographic context in which these methods were originally proposed. Indeed, in radiographs bones look like well-defined structures with sharp contours, while in MR images bone regions and their contours are only slightly brighter than the background, and textural features seem to be more informative than the bare intensity. Likely viable alternatives are global, region-based approaches, in which most of the ambiguities could be resolved more effectively by looking not only at the local neighborhoods of the pixels but also at their global organization and relationships. The detection of bone contours and the subsequent analysis of the abnormalities would then be carried out starting from the segmentation of the regions. Despite the soundness of these ideas, in practice the high variability of the visual patterns of bones in the wrist, due to a complex and varied anatomy, prevented us from obtaining good results.

The analysis of medical images by means of machine learning algorithms requires both positive and negative annotated examples highlighting the structures of interest. In the context of JIA, annotated examples are simply voxels in the 3D T1 pre-contrast MR volumes of wrists. Positive voxels are those in the neighborhood of an erosion, while all the others in the volume are negatives. In order to perform training, validation and testing of the methods we want to develop, we asked physicians (who are the experts in the specific pathology) to manually place 3D boxes enclosing the erosions. This crucial stage of careful annotation of training examples required the development of an annotation tool that could reduce the time needed to fully annotate an exam. We therefore adapted the GUI already designed and developed for the analysis of the synovial membrane, trying to meet all the requirements of the clinicians.
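The conversion from box annotations to per-voxel labels can be sketched as follows. The box representation (a pair of corner indices in voxel coordinates, upper corner exclusive) and the function name are hypothetical; the annotation tool's actual output format is not described here.

```python
import numpy as np

def boxes_to_labels(volume_shape, boxes):
    """Return a label volume: +1 inside any annotated 3D box, -1 elsewhere."""
    labels = -np.ones(volume_shape, dtype=np.int8)
    for (z0, y0, x0), (z1, y1, x1) in boxes:
        # Every voxel enclosed by a box is treated as a positive example.
        labels[z0:z1, y0:y1, x0:x1] = 1
    return labels

# Example: one 2x2x2 box inside a small 4x5x6 volume.
labels = boxes_to_labels((4, 5, 6), [((1, 1, 1), (3, 3, 3))])
```

Since erosions are small compared with the whole volume, labels built this way are heavily unbalanced towards negatives, which any training procedure using them would have to take into account.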


Fig.6 - A screenshot of the GUI used for the annotation of MR images of wrists in order to collect positive and negative examples for the classification module.

After an initial stage of testing by the clinicians, the tool for the annotation of the erosions has been improved in order to reduce the time needed to fully annotate an exam. The tool now uses a multiplanar visualization with volumetric annotations reported on all planes. The annotations can be easily modified after their initial placement, in order to better adapt them to the 3D shape of the erosion, which might be rather complex and require double-checking on all planes.


Fig.7 - An example of an annotated slice.

From the algorithmic viewpoint, we decided to adopt the same classification method designed for the synovia. However, the type of information used to highlight regions of inflamed synovia can be very different from that required to detect erosions. Therefore, several feature extraction algorithms have been implemented, and we are investigating the set of cues and features that leads to the best detection performance. Moreover, at a local level, the appearance of a bone patch may be ambiguous, because it is extremely difficult to assess whether a single pixel lies within a region representing an erosion. To overcome these ambiguities, we are also implementing a region-based approach (as opposed to the voxel-based one used for the synovia) that combines textural information from the surroundings of each pixel with more geometrical information, such as the spatial configuration of pixels in a small neighborhood.
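To make the difference between a voxel-based and a neighborhood-based description concrete, the sketch below augments each voxel's intensity with simple statistics of its cubic neighborhood. The chosen statistics (local mean and standard deviation), the `radius` parameter and the function name are illustrative assumptions and are not the features used in the project.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def neighborhood_features(volume, radius=1):
    """Per-voxel features: intensity, local mean and local std (illustrative only)."""
    v = np.asarray(volume, dtype=float)
    size = 2 * radius + 1
    # Replicate border voxels so that every voxel has a full neighborhood.
    padded = np.pad(v, radius, mode="edge")
    windows = sliding_window_view(padded, (size, size, size))
    local_mean = windows.mean(axis=(-3, -2, -1))
    local_std = windows.std(axis=(-3, -2, -1))
    # Result has shape volume.shape + (3,): one feature vector per voxel.
    return np.stack([v, local_mean, local_std], axis=-1)
```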

To transform voxels to a higher level of abstraction, for each voxel of an MR image we rely on the following cues:

In summary, our contributions to the problem of detecting bone erosions will be the following. First, we are designing a supervised classification algorithm to detect voxels representing erosions (i.e. holes) of the bones. The feature representation and the classification strategy are fairly similar to those used to segment inflamed synovial tissue, and their main novelty with respect to the state of the art is an effective way of fusing the position, local appearance and texture of voxels in an MRI of the wrist.

References