
Datasets description

MinaRe edited this page May 15, 2017 · 1 revision

BraTS Dataset:

The Multimodal Brain Tumor Segmentation (BraTS) challenge has been held annually since 2012 in conjunction with the MICCAI conference. The BraTS 2016 dataset contains 220 brains with high-grade gliomas and 54 brains with low-grade gliomas for training, and 53 brains with mixed high- and low-grade gliomas for testing. Each training brain comes with a five-class segmentation ground truth. All BraTS datasets share four MRI modalities: T1, T1C, T2, and FLAIR. All images are skull-stripped. Model performance on the test set is evaluated quantitatively by uploading the segmentation results to the online BraTS evaluation system. The online system groups the tumor structures into three tumor regions, mainly because of their relevance to practical clinical applications. As described by Menze et al. (2014) [1], the tumor regions are defined as:

  1. The complete tumor region (including all four tumor structures).
  2. The core tumor region (including all tumor structures except edema).
  3. The enhancing tumor region (including the enhanced tumor structure).
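
The three regions above can be derived from a label map with a few boolean masks. The sketch below is only an illustration: the integer label values (0 for background, 1 to 4 for the four tumor structures) are an assumed convention that this page does not state, and `tumor_regions` is a hypothetical helper name.

```python
import numpy as np

# Assumed label convention (not stated on this page):
# 0 = background, 1 = necrotic/cystic core, 2 = edema,
# 3 = non-enhancing solid core, 4 = enhancing core.
NECROSIS, EDEMA, NON_ENHANCING, ENHANCING = 1, 2, 3, 4


def tumor_regions(labels):
    """Return boolean masks for the complete, core, and enhancing regions."""
    labels = np.asarray(labels)
    complete = labels > 0                 # all four tumor structures
    core = complete & (labels != EDEMA)   # every structure except edema
    enhancing = labels == ENHANCING       # enhancing structure only
    return {"complete": complete, "core": core, "enhancing": enhancing}


# Tiny made-up label map, for illustration only.
example = np.array([[0, 1, 2],
                    [3, 4, 2]])
regions = tumor_regions(example)
print({name: int(mask.sum()) for name, mask in regions.items()})
```

Each mask can then be compared against the corresponding mask of a predicted segmentation.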

Different evaluation metrics have been used depending on the year the challenge was held. For each tumor region, they include Dice, sensitivity, specificity, and kappa, as well as the Hausdorff distance.
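
For reference, the overlap-based metrics can be computed from the confusion counts of two binary masks. This is a minimal sketch using the standard definitions, not the evaluation system's own code; `overlap_metrics` is a hypothetical name, and kappa and the Hausdorff distance are left out.

```python
import numpy as np


def overlap_metrics(pred, truth):
    """Dice, sensitivity, and specificity for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.sum(pred & truth)     # true positives
    fp = np.sum(pred & ~truth)    # false positives
    fn = np.sum(~pred & truth)    # false negatives
    tn = np.sum(~pred & ~truth)   # true negatives

    def ratio(num, den):
        # Return NaN instead of dividing by zero on empty masks.
        return float(num) / float(den) if den > 0 else float("nan")

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


# Made-up masks: one voxel each of TP, FP, FN, and TN.
scores = overlap_metrics([1, 1, 0, 0], [1, 0, 1, 0])
print(scores)
```

The function would be called once per tumor region, on the region mask of the prediction and of the ground truth.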

FIGURE: Manual annotation through expert raters. Shown are image patches with the tumor structures that are annotated in the different modalities (top left) and the final labels for the whole dataset (right). The image patches show from left to right: the whole tumor visible in FLAIR (Fig. A), the tumor core visible in T2 (Fig. B), the enhancing tumor structures visible in T1c (blue), surrounding the cystic/necrotic components of the core (green) (Fig. C). The segmentations are combined to generate the final labels of the tumor structures (Fig. D): edema (yellow), non-enhancing solid core (red), necrotic/cystic core (green), enhancing core (blue). (Figure from the BRATS TMI reference paper.)

BraTS images are available in .mha format; we convert them to 2D images for deep learning processing.
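
A minimal sketch of the 3D-to-2D step, assuming the volume has already been loaded into a NumPy array with the slice axis first. The page does not say how the conversion is done, so the axis order, the option to skip all-zero slices, and the name `axial_slices` are all assumptions.

```python
import numpy as np


def axial_slices(volume, skip_empty=True):
    """Split a 3D volume (slices, height, width) into 2D images.

    Returns a list of (slice_index, image) pairs. Skipping all-zero
    slices is an assumption of this sketch, not a documented step.
    """
    volume = np.asarray(volume)
    if volume.ndim != 3:
        raise ValueError("expected a 3D volume")
    slices = []
    for index in range(volume.shape[0]):
        image = volume[index]
        if skip_empty and not np.any(image):
            continue
        slices.append((index, image))
    return slices


# Made-up volume with a single non-empty slice.
volume = np.zeros((3, 2, 2))
volume[1, 0, 0] = 5.0
pairs = axial_slices(volume)
print([index for index, _ in pairs])
```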

Preprocessing: The preprocessing follows three steps. First, the 2% highest and lowest intensities are removed. Then, we apply an ITK bias correction to all modalities. Finally, the data are normalized within each input channel by subtracting the channel's mean and dividing by the channel's standard deviation. We import the 2D images together with augmented data produced by multi-scaling, horizontal and vertical flipping, and contrast changes.
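
The intensity steps can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the exact pipeline: "removing" the extreme intensities is interpreted here as clipping to the 2nd and 98th percentiles, the ITK bias correction is omitted, and the function names are hypothetical. Only the flip augmentations are shown.

```python
import numpy as np


def clip_and_normalize(channel, low_pct=2.0, high_pct=98.0):
    """Clip extreme intensities, then z-score normalize one channel.

    Clipping to the 2nd/98th percentiles is an assumed reading of
    "the 2% highest and lowest intensities are removed".
    """
    channel = np.asarray(channel, dtype=np.float64)
    low, high = np.percentile(channel, [low_pct, high_pct])
    clipped = np.clip(channel, low, high)
    std = clipped.std()
    if std == 0:
        # Constant channel: nothing to scale.
        return np.zeros_like(clipped)
    return (clipped - clipped.mean()) / std


def augment_flips(image):
    """Return the image plus its horizontal and vertical flips."""
    image = np.asarray(image)
    return [image, np.fliplr(image), np.flipud(image)]


# Made-up data for illustration.
normalized = clip_and_normalize(np.arange(100).reshape(10, 10))
flipped = augment_flips(np.array([[1, 2], [3, 4]]))
print(normalized.mean(), normalized.std(), len(flipped))
```

Each modality would be passed through `clip_and_normalize` separately, so that every input channel ends up with zero mean and unit standard deviation.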

ISLES Dataset:

The Ischemic Stroke Lesion Segmentation (ISLES) challenge started in 2015 and is held in conjunction with the Brain Lesion workshop as part of MICCAI. ISLES has two categories with individual datasets: sub-acute ischemic stroke lesion segmentation (SISS) and acute stroke penumbra estimation (SPES) [1]. SISS contains 28 brains with four modalities: FLAIR, DWI, T2 TSE (Turbo Spin Echo), and T1 TFE (Turbo Field Echo). The challenge dataset consists of 36 subjects. The evaluation measures used for the ranking were the Dice coefficient, the average symmetric surface distance (ASSD), and the Hausdorff distance. The SPES dataset contains 30 brains with seven modalities: CBF (cerebral blood flow), CBV (cerebral blood volume), DWI, T1c, T2, Tmax, and TTP (time to peak). The challenge dataset contains 20 subjects. Both datasets provide pixel-accurate ground truth of the abnormal areas (two-class segmentation). The metrics used to judge performance are the Dice score, the Hausdorff distance, recall, and precision, as well as the ASSD. Online evaluation is available.
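
As an illustration of the distance- and detection-based metrics, the sketch below computes a symmetric Hausdorff distance together with precision and recall for two binary masks. It is a simplified version under stated assumptions: distances are measured in voxel units over all foreground voxels rather than over extracted surfaces, so values may differ from those of the online evaluation; the ASSD is not shown, and the function names are hypothetical.

```python
import numpy as np


def hausdorff_distance(mask_a, mask_b):
    """Symmetric Hausdorff distance between two masks, in voxel units.

    Simplification: uses every foreground voxel, not only the surface.
    """
    a = np.argwhere(np.asarray(mask_a, dtype=bool))
    b = np.argwhere(np.asarray(mask_b, dtype=bool))
    if len(a) == 0 or len(b) == 0:
        return float("nan")
    # Pairwise Euclidean distances, shape (len(a), len(b)).
    diffs = a[:, None, :] - b[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    a_to_b = dists.min(axis=1).max()  # farthest point of A from B
    b_to_a = dists.min(axis=0).max()  # farthest point of B from A
    return float(max(a_to_b, b_to_a))


def precision_recall(pred, truth):
    """Precision and recall of a binary prediction."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = float(np.sum(pred & truth))
    fp = float(np.sum(pred & ~truth))
    fn = float(np.sum(~pred & truth))
    precision = tp / (tp + fp) if tp + fp > 0 else float("nan")
    recall = tp / (tp + fn) if tp + fn > 0 else float("nan")
    return precision, recall


# Made-up masks: single voxels three columns apart.
mask_a = np.zeros((1, 4), dtype=bool)
mask_b = np.zeros((1, 4), dtype=bool)
mask_a[0, 0] = True
mask_b[0, 3] = True
print(hausdorff_distance(mask_a, mask_b))
print(precision_recall([1, 1, 0, 0], [1, 0, 1, 0]))
```

The pairwise distance matrix grows with the product of the two foreground sizes, so this brute-force form is only practical for small masks.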

ISLES images are available in NIfTI format (.nii); we apply preprocessing similar to that used for the BraTS data before deep learning processing.

MSGC Dataset:

The MSGC dataset, which was introduced at MICCAI 2008, provides 20 training MR cases with manual ground truth MS lesion segmentation and 23 testing cases from the Boston Children's Hospital (CHB) and the University of North Carolina (UNC). For each subject, co-registered T1, T2, and FLAIR images are provided. While lesion masks for the 23 testing cases are not available for download, an automated system is available to evaluate the output of a given segmentation algorithm. The MSGC benchmark provides different metric results normalized between 0 and 100, where 100 is a perfect score and 90 is the typical score of an independent rater.

MSGC data are available in NIfTI format (.nii); we apply preprocessing similar to that used for the BraTS data before deep learning processing.

FIGURE: Slices of the FLAIR modality in coronal, sagittal, and transversal views; the top-left panel shows the axial slice annotated by a medical expert.

Evaluation metrics

LiTS Dataset:

The Liver Tumor Segmentation (LiTS) challenge is organized in conjunction with ISBI 2017. The data and segmentations are provided by various clinical sites around the world. The training set contains 130 CT scans and the test set contains 70 CT scans. Online evaluation is provided by the organizers.

LiTS data are available in NIfTI format (.nii); annotation files are available for the training set.

LUNA:

The Lung Nodule Analysis (LUNA) challenge focuses on large-scale evaluation of automatic nodule detection on chest CT. The complete dataset is divided into 10 subsets. In total, 888 CT scans are included; scans with a slice thickness greater than 2.5 mm were excluded. The challenge has two tracks: nodule detection and false positive reduction. For each track, an annotation file in CSV format, prepared by radiologists, is available. As with the other challenges, online evaluation is available on the challenge webpage.

LUNA data are available in NIfTI format (.nii); ground truth provided by radiologists is available as a CSV file for the training set.
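
A minimal sketch of how such a CSV annotation file might be read and mapped onto the image grid. The column names (`seriesuid`, `coordX`, `coordY`, `coordZ`, `diameter_mm`) and the use of world coordinates in millimetres are assumptions not stated on this page, the sample rows are made up, and the conversion assumes an axis-aligned image with a known origin and voxel spacing.

```python
import csv
import io

# Made-up rows; the column names are an assumption of this sketch.
SAMPLE_CSV = """seriesuid,coordX,coordY,coordZ,diameter_mm
scan-a,10.0,-20.0,30.0,5.5
scan-a,12.5,-18.0,31.0,4.0
scan-b,0.0,0.0,0.0,7.2
"""


def load_annotations(text):
    """Group (center, diameter) annotations by scan identifier."""
    by_scan = {}
    for row in csv.DictReader(io.StringIO(text)):
        center = (float(row["coordX"]),
                  float(row["coordY"]),
                  float(row["coordZ"]))
        entry = (center, float(row["diameter_mm"]))
        by_scan.setdefault(row["seriesuid"], []).append(entry)
    return by_scan


def world_to_voxel(world_xyz, origin_xyz, spacing_xyz):
    """Convert world coordinates (mm) to voxel indices.

    Assumes an axis-aligned image (identity direction matrix).
    """
    return tuple(
        int(round((w - o) / s))
        for w, o, s in zip(world_xyz, origin_xyz, spacing_xyz)
    )


annotations = load_annotations(SAMPLE_CSV)
center, diameter = annotations["scan-a"][0]
# Origin and spacing are hypothetical values for this example.
voxel = world_to_voxel(center, (0.0, -30.0, 20.0), (0.5, 0.5, 2.0))
print(voxel, diameter)
```

In practice the origin and spacing would be read from the header of the corresponding scan.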

REFERENCES

[1] Menze, B., Reyes, M., Van Leemput, K., et al., 2014. The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS). IEEE Transactions on Medical Imaging.
