5 ways to make histopathology image models more robust to domain shift w/ Heather Couture, Pixel Scientia Labs

In this episode, we talk with Heather Couture about how to make deep learning models for tissue image analysis more robust to domain shift.

Supervised deep learning has made a strong mark in the histopathology image analysis space. However, it is a data-centric approach. We train the image analysis solution on whole slide images and want it to perform on other whole slide images – images we did not train on.

The assumption is that the new images will be similar to the ones we train the image analysis solution on, but how similar do they need to be? And what is domain and domain shift?

Domain: a group of similar whole slide images (WSI). E.g., WSIs coming from the same scanner or coming from the same lab. We train our deep learning model on these WSIs, so we call it our source domain. We later want to use this model and target a different group of images, e.g. images from a different scanner or a different lab – our target domain.

When we apply a model trained on a source domain to a target domain, we shift the domain, and this domain shift can have consequences for model performance. Because of the differences between the images, the model usually performs worse…

How can we prevent it or minimize the damage?

Listen to Heather explain the following 5 ways to handle the domain shift:

  1. Standardize the appearance of your images with stain normalization techniques
  2. Color augmentation during training to take advantage of variations in staining
  3. Domain adversarial training to learn domain-invariant features
  4. Adapt the model at test time to handle the new image distribution
  5. Finetune the model on the target domain

Click here to read Heather’s full article on making histopathology image analysis models more robust to domain shift.

Visit Pixel Scientia Labs here.

And listen to our previous episode titled “Why machine learning expertise is needed for digital pathology projects” here to learn more about the subjects and learn how Heather and her company can help.

Transcript

[00:00:53] Aleksandra Zuraw: I have Heather Couture as my guest today, the founder of Pixel Scientia. And I came across this article of Heather’s about five ways to make histopathology image models more robust to domain shift. Supervised deep learning has basically taken over or has put itself everywhere in the image analysis approaches for pathology. But this is a very data-centric approach. And data-centric in this context means that we train our image analysis solutions on data with certain appearances, and then we want it to perform well on similar data. But how similar does this data have to be? How similar do the whole slide images we are working with have to be? Is it enough if they come from the same lab, or do they have to be stained in the same batch and scanned with the same scanners? And what about whole slide images from different labs, different countries, or from different times if we do archived slides? So I have Heather as my guest. Again, she’s the author of this article. Welcome, Heather, how are you today?

[00:02:05] Heather Couture: Good, thanks. Thanks for having me.

[00:02:09] Aleksandra: Let’s start with, what is a robust solution in supervised deep learning? Or basically, in deep learning for pathology. And let’s talk about, why did you write this article?

[00:02:25] Heather: Robust can mean different things in different contexts. And in this case, and specifically with pathology images, it means robust to changes in the images. So it could be, they were scanned on a different scanner. They’re from a different lab. The stain intensities are different because of something in the processing, or they’ve faded over time. Or maybe a different population of patients. So any of these changes can create a difference in appearance in your images. And if you’re trying to apply a model that was trained in a source domain, let’s say from one scanner, and you’re trying to apply it to a target domain, let’s say a different scanner, something about those images may be different that the model doesn’t know how to handle. So maybe your training set of images is from the Aperio scanner, and you have another dataset of images from Hamamatsu, and you want to apply your model there.

[00:03:23] But if those target images look different than the source ones, the model can fail. And in particular, it can fail in unexpected ways. So the challenge is that we want to create a model that will also perform well on the Hamamatsu scanner in this setup, or perhaps even other scanners. And similar challenges occur when the tissue is stained differently, it’s faded over time, or something else changes. Maybe you have a model for nuclei segmentation for one type of cancer, and you want to apply it to a different type of cancer, that’s also a domain shift.

[00:03:57] Aleksandra: So basically domain is a group of similar images, similar data points. Like you said, domain would be all images scanned with the Aperio scanner. And another domain would be scanned with Hamamatsu. Do I understand it correctly?

[00:04:14] Heather: Right.

[00:04:14] Aleksandra: Okay.

[00:04:15] Heather: Domain could be with respect to scanners, it could be with respect to the lab or the location in the world, or any other kind of grouping like that.

[00:04:25] Aleksandra: And domain shift would be, we want to apply a solution from one domain to a different domain that is similar, right? Similar meaning, we are still working on H&E stained [inaudible] like images.

[00:04:39] Heather: Right. But different in that different scanner, different lab, anything like that. So some sort of target domain that’s different from the source.

[00:04:48] Aleksandra: So if there was no option to account for this domain shift, to account for the dissimilarities in those similar images, how similar would the samples have to be if there was no option to account for anything?

[00:05:05] Heather: It’s hard to quantify that, in that for a model to perform well on test images, those test images need to be from the same distribution as the training images. And so if they look quite similar, and it’s hard to quantify the term look, whether they appear similar. If they’re similar enough, it’ll perform well. If they’re different enough, the model will fail in unexpected ways. And especially with deep learning models that have a lot of moving parts, a lot of parameters when you have to train these models. If something changes in the image, if it’s something that changes in your dataset over time, or you’ve gathered images from a different place, that can be enough for the model to fail. What needs to happen for the model to work is the images need to look similar enough to something that the model has seen when it was training. And so the features that the model learns, it needs to be able to characterize those same features in your target images.

[00:06:10] Aleksandra: And usually it has to be more similar than it would be for a human observer, right?

[00:06:18] Heather: Definitely for a human observer, but there can also be other subtleties that a machine learning model could pick up on that a human observer could not.

[00:06:27] Aleksandra: Yeah. For example, I don’t care if the slide is scanned on Aperio or Hamamatsu, or whatever, I’m going to approach it the same way. But this is, following with our example, this would be enough for a model not to perform well.

[00:06:42] Heather: It could be. And then I can’t say it will always fail in that scenario, but it could fail.

[00:06:47] Aleksandra: So before I read your article, and I will link to it in the show notes, I knew about two approaches. The first approach was to train on not so similar images, so across different domain to account for those variabilities. Or, do the stain normalization with GANs, generative adversarial networks. But from your article I learned that there are several more approaches that can help us. And I thought with those two I was already very on top of the matter, apparently not. So how many solutions do we have, and what are the ways to increase the robustness to domain shift of histopathology models?

[00:07:27] Heather: Well, you’re definitely ahead of the game by understanding that’s a problem and that there are some solutions, but there’s five different ones that I’m aware of that I wrote about in this article. One is stain normalization, like you said. So the goal is to make the target domain images look similar to the source. So in the case of H&E, make the hematoxylin and eosin stains look similar between the target and source domains. The next is color augmentation, and this is in particular used in deep learning models. The goal of that is to increase the diversity in the source domains to teach the model how to handle different situations. So to make it more robust.
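To make the stain normalization idea concrete, here is a minimal sketch. It is not the method from Heather's article: it simply matches the per-channel mean and standard deviation of a target image to a source image in optical density space, a deliberately simplified stand-in for the stain-specific methods used in practice. The function names and the assumption of 8-bit RGB tiles are ours.

```python
import numpy as np

def rgb_to_od(rgb):
    """Convert uint8 RGB to optical density; the +1 avoids log(0)."""
    return -np.log((rgb.astype(np.float64) + 1.0) / 256.0)

def od_to_rgb(od):
    """Invert rgb_to_od and clip back to the valid uint8 range."""
    rgb = 256.0 * np.exp(-od) - 1.0
    return np.clip(np.round(rgb), 0, 255).astype(np.uint8)

def match_stain_statistics(target_rgb, source_rgb, eps=1e-8):
    """Shift and scale each channel of the target image so that its
    optical-density mean and standard deviation match the source image.
    Simplified illustration, not a full stain-separation method."""
    t = rgb_to_od(target_rgb)
    s = rgb_to_od(source_rgb)
    t_mean, t_std = t.mean(axis=(0, 1)), t.std(axis=(0, 1))
    s_mean, s_std = s.mean(axis=(0, 1)), s.std(axis=(0, 1))
    normalized = (t - t_mean) / (t_std + eps) * s_std + s_mean
    return od_to_rgb(normalized)
```

Calling `match_stain_statistics(target_tile, source_tile)` returns a tile with the same shape as `target_tile` whose overall stain intensity statistics resemble those of `source_tile`. Because it works on whole-channel statistics, it will not separate hematoxylin from eosin the way dedicated methods do.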

[00:08:08] Aleksandra: The opposite of the stain normalization.
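Color augmentation goes the other way: instead of removing stain variation, it adds it during training. The sketch below is one simple way this might look, assuming 8-bit RGB tiles and a random per-channel scale and shift in optical density space. The function name and the default ranges are illustrative assumptions, and real pipelines often perturb a stain-specific color space instead.

```python
import numpy as np

def augment_color(rgb, rng, scale_range=0.2, shift_range=0.05):
    """Randomly perturb stain intensity per channel.
    scale_range and shift_range are illustrative values, not tuned ones."""
    # Work in optical density, where stain concentration is roughly additive
    od = -np.log((rgb.astype(np.float64) + 1.0) / 256.0)
    scale = rng.uniform(1.0 - scale_range, 1.0 + scale_range, size=3)
    shift = rng.uniform(-shift_range, shift_range, size=3)
    od_augmented = od * scale + shift
    # Convert back to RGB and clip to the valid uint8 range
    out = 256.0 * np.exp(-od_augmented) - 1.0
    return np.clip(np.round(out), 0, 255).astype(np.uint8)
```

During training, each tile would be passed through `augment_color(tile, rng)` with a shared `np.random.default_rng()` generator before it reaches the model. Widening the ranges corresponds to the "greater intensity" of augmentation mentioned later in the conversation.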

[00:08:11] Heather: Right. The third one is adversarial domain adaptation, and this is to encourage the model to learn features that are more domain-invariant. So the key here is the adversarial, so you’re going to train a model to do the task that you’re trying to do, but you’re also going to try and teach it not to be able to predict which domain the image came from. So you’re going to train it with your labeled training set from the source domain, and also some unlabeled images from the target domain. And you’re going to ask it to predict the domain, and adversarially train it so that ideally it will not be able to predict that domain.

[00:08:48] Aleksandra: So that it doesn’t focus on the things that are specific for the domain.
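One common way to implement this kind of adversarial training is gradient reversal: the domain classifier learns to tell source from target, while the shared feature extractor receives the negated domain gradient. The episode does not say which variant Heather has in mind, so the sketch below is only an assumption-laden toy: a linear feature extractor with two logistic heads and hand-written gradients. All names (`dann_step`, `lam`, the parameter dictionary) are ours.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dann_step(params, x_src, y_src, x_tgt, lam=1.0, lr=0.1):
    """One gradient step of a toy domain-adversarial model.
    params: dict with W (d_in, d_feat), w_task (d_feat,), w_dom (d_feat,).
    x_src, y_src: labeled source batch; x_tgt: unlabeled target batch."""
    W, w_task, w_dom = params["W"], params["w_task"], params["w_dom"]

    # Shared linear feature extractor
    f_src = x_src @ W
    x_all = np.vstack([x_src, x_tgt])
    f_all = x_all @ W

    # Task head uses labeled source data only (mean binary cross-entropy)
    g_task = (sigmoid(f_src @ w_task) - y_src) / len(y_src)

    # Domain head: source = 0, target = 1, no task labels required
    d_all = np.concatenate([np.zeros(len(x_src)), np.ones(len(x_tgt))])
    g_dom = (sigmoid(f_all @ w_dom) - d_all) / len(d_all)

    # Both heads are updated by ordinary gradient descent
    grad_w_task = f_src.T @ g_task
    grad_w_dom = f_all.T @ g_dom

    # Feature extractor: task gradient minus lam times the domain gradient,
    # so features improve the task while confusing the domain head
    grad_W_task = x_src.T @ np.outer(g_task, w_task)
    grad_W_dom = x_all.T @ np.outer(g_dom, w_dom)
    grad_W = grad_W_task - lam * grad_W_dom

    return {
        "W": W - lr * grad_W,
        "w_task": w_task - lr * grad_w_task,
        "w_dom": w_dom - lr * grad_w_dom,
    }
```

Setting `lam=0.0` recovers ordinary source-only training of the feature extractor, which makes the strength of the adversarial term one of the parameters that has to be tuned for a given dataset.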

[00:08:53] Heather: Right. The fourth one is to adapt the model for the target domain. And this isn’t possible with all models, but some models incorporate statistics like the mean and standard deviation of features within the model. If those statistics have changed from your source to your target, you could recompute them on the target domain. And from that, have your model perform better on the target domain. So for that you need unlabeled images from the target domain.
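The statistics Heather mentions are typically the stored means and variances of normalization layers, batch normalization being the usual example. As a rough sketch of the idea, assuming a hypothetical `FeatureNorm` layer that standardizes features with stored statistics, recomputing them on unlabeled target features might look like this:

```python
import numpy as np

class FeatureNorm:
    """Hypothetical batch-norm-style layer that standardizes features
    using stored per-feature statistics."""

    def __init__(self, num_features, eps=1e-5):
        self.mean = np.zeros(num_features)
        self.var = np.ones(num_features)
        self.eps = eps

    def fit_statistics(self, features):
        """Estimate mean and variance from an (n_samples, num_features)
        array. No labels are needed."""
        self.mean = features.mean(axis=0)
        self.var = features.var(axis=0)
        return self

    def __call__(self, features):
        return (features - self.mean) / np.sqrt(self.var + self.eps)
```

After training, `fit_statistics` would have been called on source features. Calling it again on features extracted from unlabeled target images replaces the stored statistics, so that downstream layers see target features on the scale they were trained for. The learned weights themselves are left unchanged.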

[00:09:21] And the last one, which tends to be the best solution but only if you have the data, because this does require labeled images from your target domain, is that you initially train your model on the source domain, but then you do what’s called fine tuning on the target domain. So essentially, you continue training it a little bit more on your target domain so that it can learn any unique characteristics of the target domain.

[00:09:45] Aleksandra: Mm-hmm. But then you have a constantly evolving model.

[00:09:49] Heather: Right. You will need to fine tune it or train it more for each target domain that you’re trying to apply to, yes.
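Fine tuning can be illustrated with a toy example: train a classifier on source data, then continue training from those weights on a small labeled target set. The sketch below uses logistic regression on a synthetic one-dimensional feature, where `offset` is a made-up stand-in for a domain shift. None of the names or numbers come from the episode.

```python
import numpy as np

def make_domain(rng, n, offset):
    """Synthetic 1-D feature plus a bias column; offset mimics a domain shift."""
    y = rng.integers(0, 2, size=n)
    feature = rng.normal(loc=2.0 * y - 1.0 + offset, scale=0.5)
    x = np.column_stack([feature, np.ones(n)])
    return x, y.astype(np.float64)

def train_logistic(x, y, weights=None, lr=0.5, steps=200):
    """Gradient descent on mean binary cross-entropy.
    Starts from `weights` if given, which is what makes it fine tuning."""
    w = np.zeros(x.shape[1]) if weights is None else weights.copy()
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w)))
        w -= lr * (x.T @ (p - y)) / len(y)
    return w

def accuracy(x, y, w):
    """Fraction of samples whose predicted class matches the label."""
    return float(np.mean(((x @ w) > 0) == (y == 1)))

rng = np.random.default_rng(0)
x_src, y_src = make_domain(rng, 400, offset=0.0)      # labeled source domain
x_few, y_few = make_domain(rng, 40, offset=1.5)       # small labeled target set
x_test, y_test = make_domain(rng, 400, offset=1.5)    # held-out target data

w_source = train_logistic(x_src, y_src)
w_finetuned = train_logistic(x_few, y_few, weights=w_source, steps=300)
```

Comparing `accuracy(x_test, y_test, w_source)` with `accuracy(x_test, y_test, w_finetuned)` shows how much the extra training on the target domain helps in this toy setting. As Heather notes, the same step has to be repeated for each new target domain.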

[00:09:58] Aleksandra: Can all those five methods be used simultaneously, or it does not work like that? And if it doesn’t, which would you recommend? Or, how does it work?

[00:10:08] Heather: Not necessarily all five together, but definitely combinations within that subset of five. It’s going to depend what you have available. If you have enough labeled images from your target dataset, fine tuning your model on that target dataset is going to give you the best results. But you don’t always have labeled imagery from your target dataset, sometimes you don’t have any images from your target dataset, or sometimes they’re only unlabeled because the goal of the model is of course to predict whatever you’re trying to predict on that target domain. So in that case, if you have unlabeled images from your target domain, you could take the model adaptation approach to recompute some of the statistics, and that’s only possible with some models. Other than that, it tends to be a combination of the other three. So adversarial domain adaptation, color augmentation, and stain normalization.

[00:10:58] And each of these has different goals, but when you bring them together into a single model, or sometimes it’s subsets of two of those, that tends to, from what I’ve seen, give the best results. But it’s not necessarily these two are going to always give you the best result within that. Machine learning is very experimental. So this is your toolbox, you have these techniques that you can try, and you’re going to have to tweak each of them because there’s different parameters and intensities related to each of them that you need to adjust. So the best solution tends to be, try some combination within this toolbox, tweak it as needed for your particular task and for your particular data.
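The guidance in this answer can be summarized as a small helper that lists candidate techniques given the data you have. This is only our reading of the conversation, written as a hypothetical function. The ordering reflects Heather's remark that fine tuning tends to work best when labeled target data exists, and the result is a starting point for experiments, not a rule.

```python
def suggest_strategies(has_labeled_target, has_unlabeled_target,
                       model_stores_feature_statistics):
    """Return candidate techniques to try, roughly in the order discussed.
    Hypothetical helper; the final choice still needs experimentation."""
    # Labeled target images can also be used without their labels
    has_target_images = has_labeled_target or has_unlabeled_target
    suggestions = []
    if has_labeled_target:
        suggestions.append("fine-tune on the target domain")
    if has_target_images and model_stores_feature_statistics:
        suggestions.append("recompute feature statistics on the target domain")
    if has_target_images:
        suggestions.append("adversarial domain adaptation")
    # These two are available even with no target images at training time
    suggestions.append("color augmentation")
    suggestions.append("stain normalization")
    return suggestions
```

For example, `suggest_strategies(False, True, False)` leaves out fine tuning and the statistics update and returns the remaining three techniques, which matches the "combination of the other three" case described above.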

[00:11:43] Aleksandra: So basically the model is giving you feedback on what’s working best. I understand if you are designing your model from scratch, or programming it in a programming language. And you’re familiar with those approaches, you can do it. Question here is, can these methods also be used when the models are being developed with commercially available software by non-computer scientists? Or is it something that has to be incorporated in the best software, and this is user independent?

[00:12:17] Heather: In most cases, it would need to be incorporated into the software, and I’m not familiar with what’s in each individual commercial software package. Some of these require modifications to the model itself, in particular, adversarial domain adaptation. Some of them, like color augmentation, are done during training. And it’s very common to use already, you just might apply it slightly differently or with a greater intensity if you need to improve your robustness to domain shift. So some of those might already be incorporated. Stain normalization, you could perhaps apply outside a software package, take the images that have been scanned, apply the normalization to make your target images look more like your source domain. And then put it through whatever toolkit you’re using to train and to apply your model. In the case of fine tuning, that requires first training a model on your source domain, and then tuning it some more on your target domain. Some software packages will allow that.

[00:13:19] Aleksandra: Yeah, I think this is something that you can just do with whatever you’re working, if you’re not a computer scientist, with each of the software packages that I’m aware of. You just take the new data and let it run on it, and adjust to the extent it’s needed.

[00:13:37] Heather: Yeah.

[00:13:38] Aleksandra: So you would recommend that the options, or at least some of them are incorporated into a software package.

[00:13:49] Heather: Yeah. Without any of these options, models definitely do fail on new domains, and maybe not all domains, but some domains that you might want to apply your model to. And so to improve the robustness to that, some of these tools, not always all of them, but some of them do need to be incorporated to improve the robustness of models.

[00:14:09] Aleksandra: Do you have an example of a model you worked on that failed miserably because you were not aware of the domain shift, or the domain shift was so drastic but not really visible for a human observer? Do you have any example?

[00:14:29] Heather: I don’t think I have an example of a drastic one. There’s one dataset I was working with during my PhD. I think the images were all scanned on the same scanner, but they were archival slides. So some of them, the stain had faded over time, and some of them had been in storage longer than others. And so for that one, it wasn’t a specific domain A and domain B. It was, these images had different amounts of fading. And so for that case, we did apply stain normalization to get the same intensities in a more similar realm before modeling.

[00:15:05] Aleksandra: Thank you very much, this is very informative. I think many people are aware that it doesn’t work across the board, but I don’t think to that extent to pay attention to those little things that a pathologist or a human observer just totally sees through, like faded stain, because you took slides from the archive. Thanks so much for explaining this to us.

[00:15:32] Heather: Oh, thanks for having me.

[00:15:34] Aleksandra: And you wrote more articles like that, so tell the listeners where they can find them.

[00:15:41] Heather: My website, which is pixelscientia.com, P-I-X-E-L-S-C-I-E-N-T-I-A .com.

[00:15:48] Aleksandra: Okay. I’m going to leave this link in the show notes, and I’m going to also link to our previous podcast episode. Thank you very much and have a great day.

[00:15:58] Heather: You too. Take care.