Generalized dice loss for multi-class segmentation #9395

Closed

Generalized dice loss for multi-class segmentation#9395

Hey guys, I just implemented the generalised dice loss (multi-class version of dice loss), as described in ref :
(my targets are defined as: (batch_size, image_dim1, image_dim2, image_dim3, nb_of_classes))

def generalized_dice_loss_w(y_true, y_pred): 
    # Compute weights: "the contribution of each label is corrected by the inverse of its volume"
    Ncl = y_pred.shape[-1]
    w = np.zeros((Ncl,))
    for l in range(0,Ncl): w[l] = np.sum( np.asarray(y_true[:,:,:,:,l]==1,np.int8) )
    w = 1/(w**2+0.00001)

    # Compute gen dice coef:
    numerator = y_true*y_pred
    numerator = w*K.sum(numerator,(0,1,2,3))
    numerator = K.sum(numerator)
    
    denominator = y_true+y_pred
    denominator = w*K.sum(denominator,(0,1,2,3))
    denominator = K.sum(denominator)
    
    gen_dice_coef = numerator/denominator
    
    return 1-2*gen_dice_coef

But something must be wrong. I'm working with 3D images that I have to segment for 4 classes (1 background class and 3 object classes, I have a imbalanced dataset). First odd thing: while my train loss and accuracy improve during training (and converge really fast), my validation loss/accuracy are constant trough epochs (see image). Second, when predicting on test data, only the background class is predicted: I get a constant volume.

I used the exact same data and script but with categorical cross-entropy loss and get plausible results (object classes are segmented). Which means something is wrong with my implementation. Any idea what it could be?

Plus I believe it would be usefull to the keras community to have a generalised dice loss implementation, as it seems to be used in most of recent semantic segmentation tasks (at least in the medical image community).

PS: it seems odd to me how the weights are defined; I get values around 10^-10. Anyone else has tried to implement this? I also tested my function without the weights but get same problems.

xychenunc

Hi, I came across a similar issue. The dataset is imbalanced, and the region which is small compared to the whole image cannot be well segmented. I think it has nothing to do with your loss function, maybe it is due to patch-based approach. How large you choose your patch size?

xychenunc

And I think the problem with your loss function is the weights are not normalized. I think a normalized weights should be what you want. And w = 1/(w**2+0.00001) maybe should be rewritten as something like w = w/(np.sum(w)+0.00001). Otherwise, the generalized loss is not 'balanced', the region which takes a larger portion of the image accounts for a relatively small part in the total loss.

emoebel

Author

Hey xynechunc, thanks for your answer! I tried normalizing the weights, but it didn't do any difference. While it is true that the weight values are better interpretable (instead of values around 10^-10 I have now values between 0 and 1), it seems that numerically it does not change the loss behaviour.

Why are you asking about the patch size? Aren't the weights supposed to cope with the unbalanced class problem? Anyway, my patch size is 56x56x56 voxel, and my objects have a diameter of 10 voxel. In my patches, I have in average 7% of voxels labeled as object, the rest is background.

jpcenteno80

I am trying something similar for a 2D semantic segmentation project with 10 categories (label 0 is background). Before trying dice, I was using sparse categorical crossentropy with very good results. However, because label 0 was being included in the loss calculation, both training and validation accuracy were artificially high (> 0.98). My implementation of dice is based on this: https://github.com/Lasagne/Recipes/issues/99.

y_true has shape (batch,m,n,1) and y_pred has shape (batch,m,n,10). Here is my version of dice:

def dice_coef_9cat(y_true, y_pred, smooth=1e-7):
    '''
    Dice coefficient for 10 categories. Ignores background pixel label 0
    Pass to model as metric during compile statement
    '''
    y_true_f = K.flatten(K.one_hot(K.cast(y_true, 'int32'), num_classes=10)[...,1:])
    y_pred_f = K.flatten(y_pred[...,1:])
    intersect = K.sum(y_true_f * y_pred_f, axis=-1)
    denom = K.sum(y_true_f + y_pred_f, axis=-1)
    return K.mean((2. * intersect / (denom + smooth)))

def dice_coef_9cat_loss(y_true, y_pred):
    '''
    Dice loss to minimize. Pass to model as loss during compile statement
    '''
    return 1 - dice_coef_9cat(y_true, y_pred)

A model trained with the above implementation of dice tends to predict 4 out of the 9 categories and the segmentation is less than ideal and much worse than I got with sparse categorical crossentropy.

However, when I convert the segmentation task into a binary decision (merge all categories into one), the segmentation is pretty good. Here is the loss function changed for a binary problem:

def dice_coef_binary(y_true, y_pred, smooth=1e-7):
    '''
    Dice coefficient for 2 categories. Ignores background pixel label 0
    Pass to model as metric during compile statement
    '''
    y_true_f = K.flatten(K.one_hot(K.cast(y_true, 'int32'), num_classes=2)[...,1:])
    y_pred_f = K.flatten(y_pred[...,1:])
    intersect = K.sum(y_true_f * y_pred_f, axis=-1)
    denom = K.sum(y_true_f + y_pred_f, axis=-1)
    return K.mean((2. * intersect / (denom + smooth)))


def dice_coef_binary_loss(y_true, y_pred):
    '''
    Dice loss to minimize. Pass to model as loss during compile statement
    '''
    return 1 - dice_coef_binary(y_true, y_pred)

Not sure if the binary results are better because it is an 'easier' task or because my dice loss function is wrong.

xychenunc

Sorry for my late reply. The email you sent to me went to junk email. In my understanding, you problem is the result of class imbalance. A common way to solve it is choosing a better sampling approach. Hope it helps! Xiaoyang Get Outlook for iOS<https://aka.ms/o0ukef>

…

________________________________ From: Juan Pablo Centeno <notifications@github.com> Sent: Tuesday, March 6, 2018 6:52:41 PM To: keras-team/keras Cc: Chen, Xiaoyang; Comment Subject: Re: [keras-team/keras] Generalized dice loss for multi-class segmentation (#9395) I am trying something similar for a 2D semantic segmentation project with 10 categories (label 0 is background). Before trying dice, I was using sparse categorical crossentropy with very good results. However, because label 0 was being included in the loss calculation, both training and validation accuracy were artificially high (> 0.98). My implementation of dice is based on this: Lasagne/Recipes#99<url>. y_true has shape (batch,m,n,1) and y_pred has shape (batch,m,n,10). Here is my version of dice: def dice_coef_9cat(y_true, y_pred, smooth=1e-7): ''' Dice coefficient for 10 categories. Ignores background pixel label 0 Pass to model as metric during compile statement ''' y_true_f = K.flatten(K.one_hot(K.cast(y_true, 'int32'), num_classes=10)[...,1:]) y_pred_f = K.flatten(y_pred[...,1:]) intersect = K.sum(y_true_f * y_pred_f, axis=-1) denom = K.sum(y_true_f + y_pred_f, axis=-1) return K.mean((2. * intersect / (denom + smooth))) def dice_coef_9cat_loss(y_true, y_pred): ''' Dice loss to minimize. Pass to model as loss during compile statement ''' return 1 - dice_coef_9cat(y_true, y_pred) A model trained with the above implementation of dice tends to predict 4 out of the 9 categories and the segmentation is less than ideal and much worse than I got with sparse categorical crossentropy. However, when I convert the segmentation task into a binary decision (merge all categories into one), the segmentation is pretty good. Here is the loss function changed for a binary problem: def dice_coef_binary(y_true, y_pred, smooth=1e-7): ''' Dice coefficient for 2 categories. Ignores background pixel label 0 Pass to model as metric during compile statement ''' y_true_f = K.flatten(K.one_hot(K.cast(y_true, 'int32'), num_classes=2)[...,1:]) y_pred_f = K.flatten(y_pred[...,1:]) intersect = K.sum(y_true_f * y_pred_f, axis=-1) denom = K.sum(y_true_f + y_pred_f, axis=-1) return K.mean((2. * intersect / (denom + smooth))) def dice_coef_binary_loss(y_true, y_pred): ''' Dice loss to minimize. Pass to model as loss during compile statement ''' return 1 - dice_coef_binary(y_true, y_pred) Not sure if the binary results are better because it is an 'easier' task or because my dice loss function is wrong. — You are receiving this because you commented. Reply to this email directly, view it on GitHub<#9395 (comment)>, or mute the thread<https://github.com/notifications/unsubscribe-auth/AeoH50BrTK6TulQ7vH7t1VZyG7yLCzWfks5tbyFIgaJpZM4SGok6>.

emoebel

Author

@xychenunc thanks for your answer, I realised also that the problem is class imbalance. I just dont understand why, because the weights in the loss function are supposed to cope for that.

@jpcenteno80 does sparse-categorical-crossentropy work better than normal categorical-crossentropy in the case of segmentation? I tried it out myself, but I'm getting an error concerning array shapes:
ValueError: Error when checking target: expected conv3d_15 to have shape (56, 56, 56, 1) but got array with shape (56, 56, 56, 4)

Concerning your loss function 'dice_coef_9cat_loss': I don't think it is a good idea to ignore background. Examples of "non-objects" are as important as examples of "objects", the problem is class imbalance. If you completely ignore the background, chances are that you'll get a lot of false positives.
Concerning your function 'dice_coef_binary_loss': maybe it works better because by merging all your object classes there is a better balance between "background" and "objects".

Check out the ref I cited in my original post, they describe how to implement dice loss for multiple imbalanced classes. Maybe your implementation will work, because mine doesn't and I don't understand why.

jpcenteno80

When using sparse categorical crossentropy, you don't need to one-hot encode your target. Here is a discussion comparing crossentropy vs sparse crossentropy in stackoverflow (google: TensorFlow: what's the difference between sparse_softmax_cross_entropy_with_logits and softmax_cross_entropy_with_logits?) <- Copying the link wasn't working for me.
So in my case (2D segmentation model) I feed in the raw mask as my target, which is an array 384x384 with pixels labeled from 0 to 9 (10 categories).

I ended up getting decent results with the 'dice_coef_9cat_loss' function after all (neglecting the background label). It just took longer training and starting with lower learning rates (Nadam, 1e-5 or 1e-4). I also tried cyclical learning rates, which I think helped: https://github.com/bckenstler/CLR. I decided to neglect background in the loss calculation because my class imbalance was pretty large and I could not figure out in Keras how to use 'sample_weight' in the fit method with 2D arrays.

I'm also transitioning into pytorch and I like that it seems more flexible in terms of setting up custom metrics or loss functions. I am using the tiramisu architecture for semantic segmentation which uses negative log likelihood as the loss (implementation here: https://github.com/bfortuner/pytorch_tiramisu). The results so far are great. I highly recommend using this architecture for semantic segmentation. Have not tried it in 3D though.

zdryan

I've taken the same approach as @jpcenteno80 as I am also unable to successfully implement the generalised dice loss. Would rather avoid using temporal sample weights.

emoebel

Author

Hey guys, I found a way to implement multi-class dice loss, I get satisfying segmentations now. I implemented the loss as explained in ref : this paper describes the Tversky loss, a generalised form of dice loss, which is identical to dice loss when alpha=beta=0.5

Here is my implementation, for 3D images:

# Ref: salehi17, "Twersky loss function for image segmentation using 3D FCDN"
# -> the score is computed for each class separately and then summed
# alpha=beta=0.5 : dice coefficient
# alpha=beta=1   : tanimoto coefficient (also known as jaccard)
# alpha+beta=1   : produces set of F*-scores
# implemented by E. Moebel, 06/04/18
def tversky_loss(y_true, y_pred):
    alpha = 0.5
    beta  = 0.5
    
    ones = K.ones(K.shape(y_true))
    p0 = y_pred      # proba that voxels are class i
    p1 = ones-y_pred # proba that voxels are not class i
    g0 = y_true
    g1 = ones-y_true
    
    num = K.sum(p0*g0, (0,1,2,3))
    den = num + alpha*K.sum(p0*g1,(0,1,2,3)) + beta*K.sum(p1*g0,(0,1,2,3))
    
    T = K.sum(num/den) # when summing over classes, T has dynamic range [0 Ncl]
    
    Ncl = K.cast(K.shape(y_true)[-1], 'float32')
    return Ncl-T

I would be curious to know if this works for your applications. To adapt from 3D images to 2D images, you should modify all "sum(...,(0,1,2,3))" to "sum(...,(0,1,2))".

gattia

@lazyleaf, I just stumbled upon this. I am doing 3D segmentation on multiclass. I will definitely try out the proposed method and see how it works. However, I also have another solution that has worked for me in the past:

def dice_coef(y_true, y_pred):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

def dice_coef_multilabel(y_true, y_pred, numLabels=5):
    dice=0
    for index in range(numLabels):
        dice -= dice_coef(y_true[:,index,:,:,:], y_pred[:,index,:,:,:])
    return dice

This simply calculates the dice score for each individual label, and then sums them together, and includes the background. The best dice score you will ever get is equal to numLables*-1.0. When monitoring I always keep in mind that the dice for the background is almost always near 1.0.

kroskal

@lazyleaf, I was also struggling to implement this loss function. But with som inspiration from your code, here is my take on it (for 2D images).
Ps! I haven't tried to train a network with it yet.

def generalized_dice_coeff(y_true, y_pred):
    Ncl = y_pred.shape[-1]
    w = K.zeros(shape=(Ncl,))
    w = K.sum(y_true, axis=(0,1,2))
    w = 1/(w**2+0.000001)
    # Compute gen dice coef:
    numerator = y_true*y_pred
    numerator = w*K.sum(numerator,(0,1,2,3))
    numerator = K.sum(numerator)

    denominator = y_true+y_pred
    denominator = w*K.sum(denominator,(0,1,2,3))
    denominator = K.sum(denominator)

    gen_dice_coef = 2*numerator/denominator

    return gen_dice_coef

def generalized_dice_loss(y_true, y_pred):
    return 1 - generalized_dice_coeff(y_true, y_pred)

lkshrsch

@lazyleaf thank you for pointing to the tversky loss. I implemented your code (had to change K.shape --> K.int_shape ) but it still complaints that "TypeError: long() argument must be a string or a number, not 'NoneType'".

Do you know why this is happening, and do you see it in your own code?

rakeshakkineni

@lkshrsch To remove the compilation error i have replaced "ones = K.ones(K.shape(y_true))" with "ones = K.ones_like(y_true)".

DaanKuppens

@kroskal Thanks for the implementation. Did you try training a network yet with this loss?
When I use your generalized dice loss I somehow get loss values larger than 1 and dice coefficients smaller than 0. Do you know what may cause this?

DaanKuppens

I have the generalized_dice_coef and generalized_dice_loss now working between [0 1] for 2D images. I normalized the weights to the presence of the class in the entire dataset instead of just the batch, using the following code:

def generalized_dice_coeff(y_true, y_pred):
n_el = 1
for dim in y_train.shape: 
    n_el *= int(dim)
n_cl = y_train.shape[-1]
w = K.zeros(shape=(n_cl,))
w = (K.sum(y_true, axis=(0,1,2)))/(n_el)
w = 1/(w**2+0.000001)
numerator = y_true*y_pred
numerator = w*K.sum(numerator,(0,1,2))
numerator = K.sum(numerator)
denominator = y_true+y_pred
denominator = w*K.sum(denominator,(0,1,2))
denominator = K.sum(denominator)
return 2*numerator/denominator

32 remaining items

xychenunc

I am trying something similar for a 2D semantic segmentation project with 10 categories (label 0 is background). Before trying dice, I was using sparse categorical crossentropy with very good results. However, because label 0 was being included in the loss calculation, both training and validation accuracy were artificially high (> 0.98). My implementation of dice is based on this: https://github.com/Lasagne/Recipes/issues/99.

y_true has shape (batch,m,n,1) and y_pred has shape (batch,m,n,10). Here is my version of dice:
def dice_coef_9cat(y_true, y_pred, smooth=1e-7):
    '''
    Dice coefficient for 10 categories. Ignores background pixel label 0
    Pass to model as metric during compile statement
    '''
    y_true_f = K.flatten(K.one_hot(K.cast(y_true, 'int32'), num_classes=10)[...,1:])
    y_pred_f = K.flatten(y_pred[...,1:])
    intersect = K.sum(y_true_f * y_pred_f, axis=-1)
    denom = K.sum(y_true_f + y_pred_f, axis=-1)
    return K.mean((2. * intersect / (denom + smooth)))

def dice_coef_9cat_loss(y_true, y_pred):
    '''
    Dice loss to minimize. Pass to model as loss during compile statement
    '''
    return 1 - dice_coef_9cat(y_true, y_pred)
A model trained with the above implementation of dice tends to predict 4 out of the 9 categories and the segmentation is less than ideal and much worse than I got with sparse categorical crossentropy.

However, when I convert the segmentation task into a binary decision (merge all categories into one), the segmentation is pretty good. Here is the loss function changed for a binary problem:
def dice_coef_binary(y_true, y_pred, smooth=1e-7):
    '''
    Dice coefficient for 2 categories. Ignores background pixel label 0
    Pass to model as metric during compile statement
    '''
    y_true_f = K.flatten(K.one_hot(K.cast(y_true, 'int32'), num_classes=2)[...,1:])
    y_pred_f = K.flatten(y_pred[...,1:])
    intersect = K.sum(y_true_f * y_pred_f, axis=-1)
    denom = K.sum(y_true_f + y_pred_f, axis=-1)
    return K.mean((2. * intersect / (denom + smooth)))


def dice_coef_binary_loss(y_true, y_pred):
    '''
    Dice loss to minimize. Pass to model as loss during compile statement
    '''
    return 1 - dice_coef_binary(y_true, y_pred)
Not sure if the binary results are better because it is an 'easier' task or because my dice loss function is wrong.

I noticed that this implementation of multi-class dice loss could lead to sub-optimal performance in some cases (probably depending on specific dataset and/or architecture design). That is, for some minority class(es), the prediction can be nothing. Have you encountered this problem in your work and how did you solve it? Thanks

gattia

This can sometimes be resolved by tuning other parameters (learning rate, optimizer, etc.)

If you have many classes that are imbalanced it can cause issues or not equally weight them. In this case, you can create weighting schemes that can sometimes help.

maxvfischer

I've implemented a bunch of binary and multi-class loss functions for image segmentation (one-hot encoded masks) that you might find useful: https://github.com/maxvfischer/keras-image-segmentation-loss-functions

E.g.:
Tanimoto loss: https://github.com/maxvfischer/keras-image-segmentation-loss-functions/blob/master/losses/multiclass_losses.py#L8
Dice's coefficient loss: https://github.com/maxvfischer/keras-image-segmentation-loss-functions/blob/master/losses/multiclass_losses.py#L42
Squared Dice's coefficient loss: https://github.com/maxvfischer/keras-image-segmentation-loss-functions/blob/master/losses/multiclass_losses.py#L74

rohan19250

Hello - I am working on 4 class segmentation problem, so I have 4 labels. I am able to get combined dice scores and losses using the functions below:

from keras import backend as K
def dice_coef(y_true, y_pred, epsilon=1e-6):
    intersection = K.sum(K.abs(y_true * y_pred), axis=-1)
    return (2. * intersection) / (K.sum(K.square(y_true),axis=-1) + K.sum(K.square(y_pred),axis=-1) + epsilon)

def dice_coef_loss(y_true, y_pred):
    return 1-dice_coef(y_true, y_pred)

How can I get dice coefficient and dice loss per label instead of a combined dice coefficient (see below)?

Epoch 1/40
100/100 [==============================] - 171s 2s/step - loss: 0.2989 - dice_coef: 0.7011 - val_loss: 0.2194 - val_dice_coef: 0.7806
Epoch 2/40
100/100 [==============================] - 74s 739ms/step - loss: 0.0272 - dice_coef: 0.9728 - val_loss: 0.0457 - val_dice_coef: 0.9543

maxvfischer

@rohan19250 I don't know what optimizer you're using during training, but presuming that you're using a gradient-based optimizer like SGD or ADAM, you want a single loss value to be able to optimize the network.

That being said, if you still want to compute dice_coef and dice_coef_loss for each label seperately, I would add them as Keras metrics.

You could probably try to add something like this (NOTE: I have not tested this code. Think of it as pseudo-code):

def dice_coef_single_label(class_idx: int, name: str, epsilon=1e-6) -> Callable[[tf.Tensor, tf.Tensor], tf.Tensor]:
    def dice_coef(y_true: tf.Tensor, y_pred: tf.Tensor) -> tf.Tensor:
        # Extract single class to compute dice coef
        y_true_single_class = y_true[..., class_idx]
        y_pred_single_class = y_pred[..., class_idx]

        intersection = K.sum(K.abs(y_true_single_class * y_pred_single_class))
        return (2. * intersection) / (K.sum(K.square(y_true_single_class)) + K.sum(K.square(y_pred_single_class)) + epsilon)

    dice_coef.__name__ = f"dice_coef_{name}"  # Set name used to log metric
    return dice_coef

Then compile your model in this fashion:

# The order needs to be the same as the order in your target/label tensor. 
# In this case, you need to have (None, <IMG_HEIGHT>, <IMG_WIDTH>, 4)
classes = ['dog', 'cat', 'horse', 'bird'] 
model.compile(optimizer=<YOUR_OPTIMIZER>,
    loss=<YOUR_LOSS>,
    metrics=[
        *[dice_coef_single_label(class_idx=idx, name=class_name)
        for idx, class_name in enumerate(classes)]
    ]
)

rohan19250

Thanks a lot @maxvfischer! This worked. So we are not summing over the last axis (excluded"axis=-1") in this function for individual labels. Could you explain this a bit?

maxvfischer

@rohan19250

What axis=-1 means is that you're referring to the last axis (without knowing the actual index of the last axis). In your case when computing the dice loss for all labels, the tensors y_true and y_pred are of shape (None, <IMAGE_HEIGHT>, <IMAGE_WIDTH>, 4), where None is the unknown batch size and 4 is the amount of classes you have. By running K.sum(..., axis=-1) you are summing over all the classes (last channel).

But in my code, we're extracting the true values and the predictions for a single class:

y_true_single_class = y_true[..., class_idx]
y_pred_single_class = y_pred[..., class_idx]

The shape will go from

y_true.shape = y_pred.shape = (None, <IMAGE_HEIGHT>, <IMAGE_WIDTH>, 4)

y_true_single_class.shape = y_pred_single_class.shape = (None, <IMAGE_HEIGHT>, <IMAGE_WIDTH>)

If you would keep axis=-1, that will refer to the last channel in the tensor, in our case <IMAGE_WIDTH>. By keeping it, you would've summed over the image width, something we don't want to do.

Hope it explains why I removed it.

EDIT:
I just saw that I didn't write that. When you don't supply an axis to K.sum(...), it will sum over all axises

rohan19250

@maxvfischer Got it! this is really helpful.

My Y_val is of shape (2880, 192, 192, 4). I am using the combined dice coefficient loss function for overall network, and just calculating individual dice coefficients.

The functions I defined based on your above code for individual classes:

from keras import backend as K
def dice_coef_necrotic(y_true, y_pred, epsilon=1e-6):
    intersection = K.sum(K.abs(y_true[:,:,:,1] * y_pred[:,:,:,1]))
    return (2. * intersection) / (K.sum(K.square(y_true[:,:,:,1])) + K.sum(K.square(y_pred[:,:,:,1])) + epsilon)

def dice_coef_edema(y_true, y_pred, epsilon=1e-6):
    intersection = K.sum(K.abs(y_true[:,:,:,2] * y_pred[:,:,:,2]))
    return (2. * intersection) / (K.sum(K.square(y_true[:,:,:,2])) + K.sum(K.square(y_pred[:,:,:,2])) + epsilon)

Epoch 1/40
408/408 [==============================] - 298s 729ms/step - loss: 0.0067 - dice_coef: 0.9933 - dice_coef_necrotic: 0.7299 - dice_coef_edema: 0.8579 - dice_coef_enhancingtumor: 0.8162 - val_loss: 0.0318 - val_dice_coef: 0.9682 - val_dice_coef_necrotic: 0.2566 - val_dice_coef_edema: 0.4313 - val_dice_coef_enhancingtumor: 0.3654
Epoch 2/40
408/408 [==============================] - 302s 740ms/step - loss: 0.0064 - dice_coef: 0.9936 - dice_coef_necrotic: 0.7478 - dice_coef_edema: 0.8655 - dice_coef_enhancingtumor: 0.8255 - val_loss: 0.0195 - val_dice_coef: 0.9805 - val_dice_coef_necrotic: 0.2540 - val_dice_coef_edema: 0.4496 - val_dice_coef_enhancingtumor: 0.3562
Epoch 3/40
408/408 [==============================] - 301s 739ms/step - loss: 0.0061 - dice_coef: 0.9939 - dice_coef_necrotic: 0.7633 - dice_coef_edema: 0.8716 - dice_coef_enhancingtumor: 0.8333 - val_loss: 0.0180 - val_dice_coef: 0.9820 - val_dice_coef_necrotic: 0.2647 - val_dice_coef_edema: 0.4883 - val_dice_coef_enhancingtumor: 0.3726

It seems the combined validation dice coefficient is good, but the individual class dice coefficients are not that good (shown first few epochs). Perhaps I should explore other loss functions and data augmentation (Y train ~13000 images)?

maxvfischer

@rohan19250 Impossible for me to answer by the information you've provided.

combined validation dice coefficient should be converging to something "good", because that's what you're optimizing. For how long did you train it? Have you plotted the individual dice coefficients over more epochs? Does it converges to something?

I would probably think more about your problem:

Are the number of pixels in each class skewed over the dataset (i.e. does it exist very few pixels of a certain class in your dataset)? If so, maybe you want to use weighted loss functions? Under 3.2.4. Generalization to multiclass imbalanced problems in this paper, they explain a way to weight your classes: https://arxiv.org/pdf/1904.00592.pdf
Sure, it's always good to explore more loss functions. I've implemented a couple here that you're more than welcome to try: https://github.com/maxvfischer/keras-image-segmentation-loss-functions
Image augmentation is also good to explore for better generalization. I would take a look at Keras ImageDataGenerator and its built in image augmentations (https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator) or write your own ImageDataGenerator (by inheriting keras.utils.Sequence) and use this augmentation package: https://github.com/albumentations-team/albumentations .

rohan19250

@maxvfischer - I have trained for 40 epochs(Adam optimizer,lr=1e-5). The individual dice coefficients converge to ~.29 for the first class, ~.46 for the second class, and ~.42 for the third class.
The combined validation dice coefficient converges to ~.98
Some of the details about the architecture of the U-net model.

Also, I am able to do predictions and see the ground truth and the predicted labels for some of the test images(which is a bit satisfying), but was concerned about the individual dice scores not good which definitely could be worked upon.

I will explore the other suggestions you provided.

maxvfischer

@rohan19250 For interpretability, you might want to use intersection over union/Jaccard Index (https://en.wikipedia.org/wiki/Jaccard_index) for each class as a metric instead of dice coefficient:

def iou_metric(class_idx: int, name: str = 'iou') -> Callable[[tf.Tensor, tf.Tensor], tf.Tensor]:
    def iou(y_true: tf.Tensor, y_pred: tf.Tensor) -> tf.Tensor:
        # Array with True where prob. is highest and False elsewhere
        y_pred_bool = (K.max(x=y_pred, axis=-1, keepdims=True) == y_pred)
        # Generate one-hot vector
        y_pred = tf.where(y_pred_bool, 1., 0.)

        # Extract single class to compute IoU over
        y_true_single_class = y_true[..., class_idx]
        y_pred_single_class = y_pred[..., class_idx]

        # Compute IoU
        intersection = K.sum(y_true_single_class * y_pred_single_class)
        union = K.sum(y_true_single_class) + K.sum(y_pred_single_class) - intersection

        # union = 0 means that we predicted nothing and the ground truth contained no labels.
        # This will be treated as a perfect IoU-score = 1.
        return K.switch(K.equal(union, 0.), 1., intersection / union)

    iou.__name__ = f"iou_{name}"  # Set name used to log metric
    return iou

Good luck!

innat

mentioned this

on Apr 12, 2022

DICE loss keras-team/keras-cv#296

Tarandeep97

@gattia

@lazyleaf, I just stumbled upon this. I am doing 3D segmentation on multiclass. I will definitely try out the proposed method and see how it works. However, I also have another solution that has worked for me in the past:
def dice_coef(y_true, y_pred):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

def dice_coef_multilabel(y_true, y_pred, numLabels=5):
    dice=0
    for index in range(numLabels):
        dice -= dice_coef(y_true[:,index,:,:,:], y_pred[:,index,:,:,:])
    return dice
This simply calculates the dice score for each individual label, and then sums them together, and includes the background. The best dice score you will ever get is equal to numLables*-1.0. When monitoring I always keep in mind that the dice for the background is almost always near 1.0.

Won't there been a weight for each label multiplied to dice_coef, weight that depicts the number of pixels for that label ?

gattia

What you are describing is one version of the DSC that has been called generalized-dsc.

I think they inversely weighted based on the number of pixels labeled that class.

weight = 1/ sum(y_true[:,index,:,:,:])

dice -= weight * dice_coef(y_true[:,index,:,:,:], y_pred[:,index,:,:,:])

the above would replace the code in the for loop to inversely weight it based on the number of pixels labeled that particular class (in the ground truth).

oke-aditya

mentioned this

on Nov 18, 2022

New Feature: Dice Loss pytorch/vision#6435

to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Assignees

No one assigned

Labels

No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Generalized dice loss for multi-class segmentation #9395

32 remaining items

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Participants

Generalized dice loss for multi-class segmentation #9395

Description

Activity

xychenunc commented on Mar 5, 2018

xychenunc commented on Mar 5, 2018

emoebel commented on Mar 6, 2018

jpcenteno80 commented on Mar 6, 2018

xychenunc commented on Mar 20, 2018

emoebel commented on Mar 27, 2018

jpcenteno80 commented on Mar 27, 2018

zdryan commented on Mar 28, 2018

emoebel commented on Apr 6, 2018

gattia commented on Apr 6, 2018

kroskal commented on Apr 17, 2018

lkshrsch commented on May 7, 2018

rakeshakkineni commented on May 19, 2018

DaanKuppens commented on May 31, 2018

DaanKuppens commented on May 31, 2018

32 remaining items

xychenunc commented on Jul 4, 2020

gattia commented on Jul 4, 2020

maxvfischer commented on Aug 4, 2020

rohan19250 commented on Dec 10, 2020

maxvfischer commented on Dec 10, 2020

rohan19250 commented on Dec 10, 2020

maxvfischer commented on Dec 10, 2020

rohan19250 commented on Dec 10, 2020

maxvfischer commented on Dec 10, 2020

rohan19250 commented on Dec 10, 2020

maxvfischer commented on Dec 10, 2020

Tarandeep97 commented on Apr 17, 2022

gattia commented on Apr 17, 2022

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Participants

Issue actions