Aleksandra: [00:00:00] Using open-source tissue image analysis tools for life science and biomedical research is a no-brainer. There are already tens of thousands, or probably even hundreds of thousands, of scientists using those tools. Using those tools for commercial purposes is a different story. And building a platform for tissue image analysis based on those tools is something I have only recently heard about from today’s guest.
When I was investigating open-source software for analyzing pathology images, this premise of open source being available for commercial purposes sounded super attractive, but I didn’t really know how it could be used. And my guest is going to talk about it today. So let’s dive into it.
(Introduction)
Aleksandra: Welcome Digital Pathology Trailblazers. Today, my guest is Trevor McKee.
Trevor McKee is the CEO of Pathomics. Pathomics is an image analysis company, but it’s a different kind of image analysis company because they are using [00:01:00] open-source software to do image analysis. Welcome to the podcast, Trevor. How are you today?
Trevor: Thanks, Aleks. Really great to see you. I’m doing super well. Thanks.
Aleksandra: Pathomics is not your first image analysis rodeo, so to speak, because we have met at several companies before. I’m going to let you introduce yourself and let the listeners learn about your background.
Trevor: Okay. That sounds great. So I’m a renegade chemical engineer who didn’t want to go into petroleum, and so I got interested in biology.
Aleksandra: Like my husband actually.
Trevor: In undergrad. Yeah. Oh yeah.
Aleksandra: My husband started as a chemical engineer and worked at a glue company, and then he became an MD, so now he’s a doctor.
Trevor: An MD. Oh, very cool. Yeah. Chemical engineering, and engineering in general, is a very good sort of general-purpose problem-solving approach that you can apply to a lot of different things, right?
So I really got into biology in grad school, where I went to MIT for biological engineering, and there I worked in the lab of Rakesh [00:02:00] Jain, who is a pioneer in biomedical imaging and intravital microscopy. And so I did a lot of two-photon imaging in mouse models of cancer through my work there. And I really found that the easy part is actually taking the images; the hard part is how to analyze them and get out the information that you need.
So from then on, I’ve been working in that area. I did a postdoc up here in Toronto and then joined a core facility called the STAR facility, which had a micro-CT, a micro-MRI, and a correlative histopathology lab, so that we could do all of the sorts of imaging you do in patients, but on small animals, and then do the pathology to register with those imaging modalities.
So I was the director of the image analysis core within that core facility. And so for about 10 years, we worked on really any sort of image analysis that came through the door. We would analyze for people. And then we purchased the Definiens software in about 2016, I think, and so we were using that and [00:03:00] really got to developing a lot of tools there, using Definiens to analyze, again, those different imaging modalities, but focusing a lot on the histology and multiplex images.
Because I was part of a big preclinical trial with Pfizer where we were testing out a bunch of Pfizer’s drugs on patient-derived tumor xenografts that were growing in mice. And I ran a study where we were doing PET imaging, CT imaging, and then taking that tissue out and staining it for a couple of markers for proliferation and hypoxia, and then looking at all of those relationships across all of the different tissues.
And that was interesting. I worked there for about 10 years, then moved to HistoWiz, where I was director of image analysis for about a year over COVID, and then moved to Deciphex, where I worked on helping with commercialization of AI services. And then as of November, I’ve been running Pathomics and essentially trying to do what we were doing at STAR in terms of providing image analysis services [00:04:00] to people. Our goal here at Pathomics really is to build an online platform that can serve as a tool for data scientists, pathologists, or other people working in this field to be able to go all the way from images, to analyzing the images, to reports that can say something about the content of those images.
Aleksandra: And you are going to be using, or you are using open-source software for your services. Why and how are you doing this?
Trevor: That’s a great question. For one, there are great open-source tools out there, right? So QuPath, for example, is one we’ve used quite extensively, and QuPath, with the right scripting added in, can handle a lot of things. It can handle multiplex data, you can do single-cell segmentation, you can extract a bunch of features from those images and then use those for any of the downstream analytical processing you want. [00:05:00] And I think both myself and my co-founder, Mark, really feel passionate about the fact that open-source software is the future. It’s something for everyone, as opposed to, let’s say, a closed-source solution, which requires developers to go in there and figure out what they’re doing. QuPath has the forums, the great image analysis forums. And excellent…
Aleksandra: I remember.
Trevor: Yes. And my co-founder is Mark Zaidi, and I think he might’ve even corresponded with you on there.
Aleksandra: Yeah, we corresponded through the forum. Yes. Yes. Before I even knew that you guys were working together.
Trevor: Yeah. Yeah. No, he’s great. And he just goes on there and helps anyone that has questions.
Aleksandra: I’m going to link to the forum in the show notes as well.
Trevor: Oh, that would be amazing. Yeah. And it’s a great sort of cross-platform forum for CellProfiler, Fiji, QuPath, and others. And it’s a great resource for a lot of people. It’s an interesting concept: how do you run a commercial business with open-source tools?
Aleksandra: Yes, and this is [00:06:00] something that interests me very much, because you can do business with open source, but rarely does anyone do it.
So I don’t know any other company that is actually using your model. So tell me more about the model, the value proposition in your business model with this kind of software.
Trevor: With this software? Yes. Yeah. So I think really the value proposition that we have right now is this: let’s say you were a pharma company, right?
And you’re looking at a bunch of pathology data that’s flowing through your organization, right? You’re going to want to do something with that data, and oftentimes that requires, of course, a pathologist to be involved, but sometimes, let’s say it’s a biomarker study, you need to know how many markers are there and in what proportions, et cetera.
And so the options for you are either to hire data scientists, who are expensive, and maybe buy commercial software, which is expensive, and use that to do your work. [00:07:00] Or the option could also be to outsource to a company, let’s say like Pathomics, that has the expertise to know how to analyze the data, that can do that analysis, that can present you back with a report that you can use to make decisions about the next thing you want to do, and that does this all using open-source but standardized tools that we can write up in such a way that, from a regulatory perspective, it’s going to check all of the boxes.
So I think that’s one value proposition. And the other one is just, even if you do have commercial software and a data scientist, many commercial software packages stop at spitting out a bunch of numbers at you, right? So here’s every single cell with all of the markers present within that area. But that’s not the end of the study, right? What we’ve done at STAR over the past 10 years is move from having that export to having an actual publication, making figures for your manuscript, or a report that says this biomarker has [00:08:00] gone up or down in this clinical trial, right?
So I think that’s really where we come in: to build a platform that lets our users move from images, or from an exported data file with all of the cells, to actually having graphs and an understanding of what that data says at the end of the day. So the first product that we’re putting together is an online platform where, if you need the segmentation done, we can do the segmentation in QuPath.
But even if you’ve, say, used Visiopharm or HALO and you’ve got this export, you can upload that export to the platform, and we will provide you with a way to interactively generate those graphs that you’re looking for, or maybe have some preset templates that you can use that will generate, at the end of the day, a report, a spatial report that says we see this many cells in this region and this many cells in that region, and that’s indicative of, whatever, PD-L1 going up or down [00:09:00] or more CD3, CD8 cytotoxic cells in your tumor. There’s going to be a couple of different outputs that people might want depending on what they’re studying.
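As a rough illustration of the export-to-report step described above, here is a minimal sketch in Python. It is not the Pathomics platform; it assumes a per-cell export with hypothetical column names `region` and `class`, whereas real exports from QuPath, Visiopharm, or HALO use their own column names and would need mapping first.

```python
import csv
import io
from collections import Counter


def count_cells_by_region(csv_text, region_col="region", class_col="class"):
    """Count cells per (region, cell class) pair in a per-cell export table."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    for row in reader:
        counts[(row[region_col], row[class_col])] += 1
    return counts


def format_report(counts):
    """Turn the counts into one plain-text line per region and cell class."""
    lines = []
    for (region, cell_class), n in sorted(counts.items()):
        lines.append(f"{region}: {n} {cell_class} cells")
    return "\n".join(lines)


# Toy export with hypothetical values, one row per segmented cell.
EXAMPLE_EXPORT = """region,class
tumor,CD8
tumor,CD8
tumor,PD-L1
stroma,CD8
"""

counts = count_cells_by_region(EXAMPLE_EXPORT)
print(format_report(counts))
```

A real report would add graphs and statistics on top of such counts; the point here is only that the per-cell table is the starting material, not the end product.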
Aleksandra: So you say it’s a platform. Is it a cloud-based platform, I assume? Or it’s not like browser-based that you can…
Trevor: Browser-based.
Aleksandra: Yeah.
Trevor: Yes. Yeah…
Aleksandra: Whatever the behind-the-scenes of browser-based and cloud-based are, but something that you can access through Microsoft Edge or Chrome, right?
Trevor: Yes. Yes. And there are versions of this out there already. It’s called business analytics. So there are a lot of companies that are doing a dashboard where you can analyze a lot of complex data, and so we’re building a version of that, but specifically for digital pathology.
Aleksandra: Okay. So you’re building like an interactive dashboard for image analysis based on open-source image analysis tools.
Trevor: Tools. Oh yeah, it’s a combination of open-source tools and then probably some Python code in there that we’re going to be using as well.
Aleksandra: Tell me, if somebody [00:10:00] wanted to work with you, how would that work? What would the workflow be and what kind of projects would you work on? Give me an example.
Trevor: That’s a very good question. One of our potential clients that we’re speaking with is a CRO. They already have an image analysis team.
The problem they’re facing is that they need to double their output over the next six months without increasing headcount. And so they need tools to make their existing data scientists more efficient with what they’re doing. To be honest, a lot of their data scientists’ time is spent copying and pasting graphs into PowerPoint decks or into a Word doc, right?
And that’s not a very efficient use of their time. So even if we can just automate the reporting aspects of that, right? That’s something that has value. And then other clients are looking for more of a full-service solution: they’ve got a pharma client that needs staining and image analysis done, and they’re going to handle the [00:11:00] staining portion, but they need help with the image analysis.
And so we’ll subcontract the image analysis work from them. We’ll deliver results, and specify what the outputs are, back to them, and then that forms another service offering that we can provide.
And really we’re looking at it across the whole spectrum, right? Another big area that we serve is academics, right? And so academics maybe don’t have as much budget to be able to spend on things, because grant money is a little harder to come by nowadays, but what academics do have is access to grad students.
And so we’re also working on, let’s say, an academic rate where we would provide a virtual machine that would have access to QuPath and then train their grad students how to do the work themselves. And then we’ll just charge for the computational costs of hosting that data, and provide them with an ability to develop whatever scripts they need to develop, and really work with them on that aspect of it.
And/or potentially even [00:12:00] give them something in QuPath that they can learn themselves, right? So that’s, let’s say, the low-cost end, through to the full service, and we try to cover everything in between.
Aleksandra: So what would people pay for? Is it the services that you guys are providing or is it the access to the platform or a combination or something totally different?
Trevor: It’s certainly a combination. I think there’s a phrase for startups that you’re building the plane as you’re flying it. So right now, we’re working on a version of the platform. Right now we’re going to be providing a lot more services-based work, where you provide us your images and we’ll provide you the analysis back, but we want to move towards this online platform, both for our external clients and also for us internally. It’d be great to just be able to use this platform to do our own services-based work.
And as we’re building it for ourselves, we’ll see how it works for other people too. But I think [00:13:00] down the line, we’d like to be able to run all of the analysis in as automated a fashion as possible, but there are a lot of complexities that go into that, right?
Because if you ask 10 different people, you get 10 different answers as to what analysis looks like, right? So I think that’s going to be the real work over the next couple of months: starting to define what those image analysis pipelines look like, right?
And then building all of the various interconnected pieces to, let’s say, take a segmentation and bring it over here to do post-processing, or to do dimensionality reduction, clustering, or any of the various things that you want to do. I think what we’re looking at is to really map out all of those potential connections in sort of a diagram and then be able to flow from one through to the others.
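One way to picture the "diagram" of connected steps mentioned above is as a dependency graph. The sketch below is only an illustration using Python's standard-library `graphlib`; the step names are hypothetical and are not the actual Pathomics pipeline definition.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each step maps to the set of steps it depends on.
PIPELINE = {
    "segmentation": {"image"},
    "post_processing": {"segmentation"},
    "feature_export": {"segmentation"},
    "dimensionality_reduction": {"feature_export"},
    "clustering": {"dimensionality_reduction"},
    "report": {"post_processing", "clustering"},
}


def execution_order(pipeline):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(pipeline).static_order())


order = execution_order(PIPELINE)
print(" -> ".join(order))
```

Writing the pipeline down as data like this makes it possible to check that every step has its inputs available before it runs, and to swap one step for another without rewriting the rest.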
Aleksandra: And you mentioned that this can be built in a compliant way. Can it be used in a [00:14:00] regulated environment? What’s your plan for that? Because especially when people in a regulated environment hear open source, they’re already super cautious: oh, how can you be in compliance with open-source software?
But the only thing that is not there with open source is people that can support the software, and that works if you have the capability in-house to work with different versions, or at least that’s my understanding. But definitely when you say open source in a regulated environment, people are like, I don’t know if we can use it.
We probably cannot use it. Tell me more about this. Did you get this question as well?
Trevor: Yeah, it’s a very good question. And I would say we’re still early on the road there, but I have done analysis for clinical trials in the past, and that was done under GCLP kind of practices. And it really came down to documentation. In that case, I think we were using Definiens at the time for that analysis. And [00:15:00] we just had to…
Aleksandra: Which was not compliant software on its own.
Trevor: On its own, no. But what we did was we just recorded all of the steps that we were doing to do the analysis, and we made sure that we had manual checks in there, both for manually correcting any tissue classifiers that were done and also for manually checking the number of cells that were counted within a small ROI, and just providing some accuracy metrics back, saying that, okay, the automated counts were between 80 and 120 percent of the manual counts.
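The count check described above can be sketched in a few lines of Python. This is an assumption-laden illustration, not the procedure used in the trial: the ROI names and counts are invented, and the 80 to 120 percent window is taken from the figure mentioned in the conversation.

```python
def count_ratio(automated, manual):
    """Ratio of the automated cell count to the manual count for one ROI."""
    if manual <= 0:
        raise ValueError("manual count must be positive")
    return automated / manual


def within_tolerance(automated, manual, low=0.80, high=1.20):
    """True if the automated count is within low..high of the manual count."""
    return low <= count_ratio(automated, manual) <= high


# Hypothetical (ROI name, automated count, manual count) triples.
ROI_COUNTS = [("roi_1", 95, 100), ("roi_2", 130, 100), ("roi_3", 41, 50)]

results = {name: within_tolerance(a, m) for name, a, m in ROI_COUNTS}
for name, ok in results.items():
    print(name, "pass" if ok else "needs review")
```

Recording a table like this alongside the analysis steps is one simple form of the documentation Trevor refers to.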
And it was a research portion of the clinical trial, right? So there were no patient decisions being made off of this. It was more to understand whether these immune cells were going up or down. And so I think within that framework, we were able to make it work at that time. It’s a big shifting landscape, as I understand, with AI and digital pathology [00:16:00] and regulatory things.
So I guess we will see. I think for now we’re mostly hoping to target the preclinical end of the spectrum, and as we start to get more traction, we’ll see which of these pipelines can be made robust enough that they could fall into something that would go down a regulatory path.
But for now, I think we’re avoiding the question, let’s say, or we’re not focusing initially on solving everything, right? We’ll start on the research side, and then as we move, we can branch into the clinic.
Aleksandra: So you mentioned AI. Do you guys incorporate AI in any part of the platform?
I’m asking because by AI I mean deep learning. AI is a huge concept, but deep learning and example-based, annotation-based image analysis, training of models: does QuPath have that already? I know Pete Bankhead, the author of QuPath, was working on it. I don’t know where we stand.
Tell me as a super user, what do you [00:17:00] do with AI at Pathomics? And how do you incorporate it into the open source that you’re using?
Trevor: Yes. So we actually incorporate it at a number of different points, and I would say it’s machine learning broadly, right? So it’s both classical machine learning as well as deep learning.
They’re different for different reasons, right? And so actually one of the tools that Mark built, and that is up on his GitHub, is called Universal StarDist for QuPath. That is a plugin that you can run within QuPath that will run the StarDist deep learning tool to do cell segmentation within a brightfield or a fluorescence image, right?
Aleksandra: Just for explanation, StarDist is to detect nuclei in cells, right? This is a deep learning-based, pre-trained model to detect nuclei.
Trevor: Exactly. Yeah. So it’s a deep learning-based, pre-trained model, but it was trained on DAPI-stained fluorescence signals, right? I think they’ve got two: a brightfield one and a DAPI one.
But a problem [00:18:00] we had when we were trying to use StarDist within QuPath to analyze IMC data, so imaging mass cytometry, is that StarDist didn’t work very well out of the box on that data. And the reason it didn’t is just that mass spectrometry-based imaging data is very different from DAPI-based imaging: different resolution, and it’s a 32-bit intensity range.
And so that was causing challenges with being able to run it. And what Mark did was actually build in normalization and adjust the resolution of the IMC data to make it look like DAPI data, so that when you run Universal StarDist for QuPath on IMC data within QuPath itself, you’ll be able to get good cell segmentation on your datasets.
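To give a feel for what "normalization" of a wide intensity range can mean, here is a minimal Python sketch of percentile clipping followed by rescaling to an 8-bit range. This is an assumed, generic technique chosen for illustration; it is not taken from the Universal StarDist for QuPath code, whose actual normalization and resolution adjustment may differ, and it does not cover the resampling step at all.

```python
def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in 0..100) of a pre-sorted list."""
    if not sorted_values:
        raise ValueError("no values")
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def normalize_to_uint8(pixels, low_q=1.0, high_q=99.0):
    """Clip a 2D image to its low/high percentiles and rescale to 0..255."""
    flat = sorted(v for row in pixels for v in row)
    lo = percentile(flat, low_q)
    hi = percentile(flat, high_q)
    if hi <= lo:
        # Flat image: nothing to stretch, return all zeros.
        return [[0 for _ in row] for row in pixels]
    out = []
    for row in pixels:
        # Scale each pixel to 0..1, clamp outliers, then map to 0..255.
        out.append([round(255 * min(max((v - lo) / (hi - lo), 0.0), 1.0)) for v in row])
    return out


# Hypothetical tiny "IMC channel" with one very bright outlier pixel.
image = [[0.0, 10.0], [500.0, 100000.0]]
normalized = normalize_to_uint8(image)
print(normalized)
```

Clipping at percentiles rather than at the raw minimum and maximum keeps a few extreme pixels from compressing the rest of the image into a handful of gray levels.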
So that’s one example.
Aleksandra: So basically you are able to incorporate it because you can code into that software, right? So if there is any model available, and StarDist is like the most popular or the one I’ve heard of, I assume that [00:19:00] you can either develop one or plug in other models that are available as open source.
Trevor: So yes, you can. The challenge is it requires some knowledge of Groovy scripting, which is a little specialized, but Mark’s very good at that. And so we write those scripts and then run them within QuPath. And then we also do other things. So QuPath has a pixel classifier, a tissue classifier, built into it.
And so we’ve collected annotations of glomeruli, tubules, and interstitium, and trained a model to identify those regions within an IMC image. And then once we’ve done the segmentation, QuPath also lets you extract a bunch of spatial features from each cell in that image. And then what we do is we export all of those features.
And then we’ve run XGBoost, which is a type of machine learning classifier, to predict the cause of a certain disease type based on just the intensities and the types of cells that were present in the image. And we’re able to do a fairly [00:20:00] good job, about 84 percent accuracy, in predicting the cause of, in this case, a transplant rejection, purely based off of the type and the location of immune cells that were present within the biopsies.
And the reason we chose XGBoost, which is a machine learning classifier, as opposed to deep learning, is that it’s more explainable. So within XGBoost, you can actually get a list of the most important features that made it decide that this is one type versus the other. And so we could interrogate it a bit more than with deep learning, where it’s a little harder to make the model explainable.
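The idea of asking a classifier which features mattered can be illustrated without any external library. The sketch below is deliberately not XGBoost and not the model from the study: it swaps in a tiny nearest-centroid classifier and permutation importance (how much accuracy drops when one feature column is shuffled) as a stand-in, with invented feature values and class labels.

```python
import random


def centroids(X, y):
    """Mean feature vector per class label."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(model, row):
    """Assign the label of the nearest centroid (squared Euclidean distance)."""
    return min(model, key=lambda label: sum((a - b) ** 2 for a, b in zip(row, model[label])))


def accuracy(model, X, y):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(model, row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, seed=0):
    """Accuracy drop per feature when that feature's column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        rng.shuffle(column)
        shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
        importances.append(baseline - accuracy(model, shuffled, y))
    return importances


# Hypothetical per-biopsy features: [immune cell density, uninformative feature].
FEATURE_NAMES = ["immune_cell_density", "uninformative"]
X = [[float(10 + i), float(i % 3)] for i in range(10)]
X += [[float(50 + i), float(i % 3)] for i in range(10)]
y = ["type_a"] * 10 + ["type_b"] * 10

model = centroids(X, y)
importances = permutation_importance(model, X, y)
for name, score in zip(FEATURE_NAMES, importances):
    print(name, score)
```

A ranked list like this is what makes the result open to interrogation: a reviewer can check whether the top features correspond to phenotypes that are already known to matter.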
Aleksandra: Yeah, this is super important in both research and clinical work, in the life sciences and healthcare space in general. Explainability is part of the validity of your data, I would say, because you can explain it and show the logic based on phenotypes [00:21:00] that are already known, rather than using end-to-end deep learning models, even if they
work the same and have the same accuracy, basically the same outcomes, the same results. If you have it explained, it’s a lot easier for the scientific community to accept. And for me, that’s a stepping stone towards then accepting the end-to-end models, which can also be super useful.
But you have to know that it’s repeatable. With end-to-end models, you can rarely explain it, but if you have something that does the same but has the steps explained, that has great value.
Trevor: Yeah. And then another thing to mention is, as part of the commitment to open source,
I think one of the things that’s been on my mind forever, that I’m just working on implementing now, is, let’s say, a digital pathology wiki where anyone can come and contribute: hey, I’ve just published this paper this last month, [00:22:00] here’s the link to the GitHub code and here’s the link to the paper, right?
Because I think the challenge that I have, that I’m sure I’m not alone in, is that there are like thousands of digital pathology tools that are all buried somewhere on GitHub. And either as a new user or even as an experienced user coming to this field, you could make the mistake of developing something for your master’s project that already exists.
Aleksandra: And see that already five people said that it failed. Yeah. So this is a big deal. I guess it’s not digital pathology specific, but because we are working in this space, we notice it here, this concept of reinventing the wheel. Nobody wants to reinvent the wheel, but then once you’ve started a project, nobody wants to abandon it, and then you end up developing the same thing
twice, or I don’t know how many times, and then you look up PubMed and it’s: oh, they already did that two years ago, and it didn’t work. No wonder mine didn’t work. Or: oh, [00:23:00] mine is such a fantastic discovery.
Trevor: So yeah, like a repository. And so I think hopefully I will get it out by the time this podcast is out, but…
Aleksandra: Yes, how far are you? How far are you in the building? Let’s see.
Trevor: It’s, essentially, at a high level, I’m going to have major categories, right? So you’re going to have pre-analysis QC, that kind of stuff. There’s going to be a section on segmentation, and a section on classification,
which can cover supervised and unsupervised methods, and then post-processing, right? Those will be the broad categories, starting from the image and going through to analysis. And then within those, I’ve collected a number of, let’s say, classical papers, let’s say StarDist, right?
The first publication on StarDist would be a great one for the segmentation portion. And even stain separation, right? There are a number of different ways of doing stain separation to get from your H&E image to just H and E, or [00:24:00] H and DAB. And they all have their pros and cons, but whatever, let’s just list them all.
And then let people decide which one makes the most sense. And really, what’s required on my end is to actually put that wiki together, which just requires a bit of fiddling with finding the right domain host that can let me install all the right things. It’s not a big hurdle, and I keep talking about it, and I just need to do it.
But like I said, I’ll make it a goal that within the next month I’ll get it up there so that we can put it on here. It would be under pathomics.io somewhere, pathomics.io slash digital something, that we can share.
Aleksandra: I’m definitely going to link to your website in the show notes. And if the thing is ready, then it’s going to have a separate entry in the show notes. I would love that because it’s perfect for pathologists, or for non-computer scientists, domain experts working with computer scientists, to point to and say: hey, look what’s [00:25:00] already done, and let’s not reinvent the wheel. Because I keep repeating the slogan, stop reinventing the wheel, but I cannot point to what has already been done.
It’s on the computer scientists to research that. And obviously, when I do my literature research, okay, I find five papers, the most recent ones, and then I’m done. Let’s assume this is the state of the art. I’m working on deepening my literature research, but basically this is how it works.
Trevor: Yeah. No, for sure. You look at a couple of papers and you pick. Wouldn’t it be great? And then actually, the fun thing is that with ChatGPT, we could ask ChatGPT: hey, give me a list of all of these. We’d have to double-check, right?
’Cause half of the time it’s making stuff up.
Aleksandra: But yes, you have to check because it’s generative AI. It’s not a search engine. But yeah, let’s fantasize: you could have your stuff already there and have ChatGPT work with the data that’s already there and search based on this. This is so cool.
Trevor: And yeah, there’s a company out there that already [00:26:00] exists called Hugging Face. And they’re essentially a repository of machine learning models. And in that sense, I’m just building a domain-specific Hugging Face for digital pathology.
So yeah. Which, as I said, isn’t a big hurdle, but it’s something that…
Aleksandra: I’m going to be on the lookout for this, because I recently opened a membership site, a paid membership site for people interested in digital pathology, my Digital Pathology Trailblazers. By the time we publish this, it’s going to have been up and running for a long time, but I started yesterday: every day I do a quick paper review. And when this is out, then I’m definitely going to include it in our daily digital pathology digest. That’s going to be a must.
Trevor: That’s great. No, I think I’ve attended a few of your things leading up to the Trailblazers. It’s great.
Aleksandra: Yeah. So that’s the newest thing. Okay. Thank you [00:27:00] so much for joining us today and explaining Pathomics and the commitment to open source. This is so cool because it doesn’t mean that you cannot do business. You can do business and still be committed to open source and make it available to people at different levels of financial resources. There’s nothing free in the world. You either pay with your time or with money. If you want to have stuff fast, then you pay with money.
If you cannot pay with money, you do it yourself and you pay with time. And so this looks to me like a tool for the whole spectrum of image analysis researchers, which is super cool: something that’s that accessible, with the support of experts, and with a cool platform in the making that people can use.
Thanks so much for telling us and you have a great day, Trevor.
Trevor: Thank you so much. It’s been really great talking to you, Aleks.
Aleksandra: Thank you so much for staying till the [00:28:00] end. The fact that you’re still here means that you do resonate with the subject. So I’m going to leave a link to all the open-source
software tools that I know of available for tissue image analysis. And if you are a tissue image analysis scientist yourself and need to increase your pathology knowledge to be able to understand the images and the tissue in the images better, I have a course for you. It’s called Pathology 101 for Tissue Image Analysis.
And it is within the membership site that I recently launched, the Digital Pathology Club. So I would love to give you a free trial of the club so that you can take advantage of all the courses that are available at the moment and join a community of like-minded digital pathology trailblazers. So I’m going to leave the link below.
Go ahead, click it and give it a try. And I’ll talk to you in the next episode.