
Jim Collins: Discovery of the First New Structural Class of Antibiotics in Decades, Using A.I.

28m · Ground Truths · 13 Feb 16:24

Jim Collins is one of the leading biomedical engineers in the world. He’s been elected to all 3 National Academies (Engineering, Science, and Medicine) and is one of the founders of the field of synthetic biology. In this conversation, we reviewed the seminal discoveries that he and his colleagues are making at the Antibiotics-AI Project at MIT.

Recorded 5 February 2024, transcript below with audio links and external links to recent publications

Eric Topol (00:05):

Hello, it's Eric Topol with Ground Truths, and I have got an extraordinary guest with me today, Jim Collins, who's the Termeer Professor of Medical Engineering at MIT. He also holds appointments at the Wyss Institute and the Broad Institute. He is a biomedical engineer who's been making exceptional contributions and has been on a tear lately, especially in the work of discovery of very promising, exciting developments in antibiotics. So welcome, Jim.

Jim Collins (00:42):

Eric, thanks for having me on the podcast.

Eric Topol (00:44):

Well, this was a shock when I saw your paper in Nature in December about a new structural class of antibiotics. The one from 1962 to 2000 took 38 years, and then there was another one that took 24 years: yours, the structural class of antibiotics. Before I get to that though, I want to go back just a few years to the work you published in Cell with halicin, and can you tell us about this? Because when I started to realize what you've been doing, what you've been chipping away at here, this was a drug you found, halicin, and as best I can understand, it works against tuberculosis, C. difficile, Enterobacter that are resistant, Acinetobacter that are resistant. I mean, this is, and this is of course in mouse models. Can you tell us how you made that discovery before we get into, I guess, what's called the Audacious Project?

Jim Collins (01:48):

Yeah, sure. It's actually a fun story. Its origins go back to an institute-wide event at MIT. In 2018, MIT launched a major campus-wide effort focused on artificial intelligence. The institute, which had played a major role in the first wave of AI in the 1950s and 1960s, and a major role in the second wave in the 1980s, found itself kind of asleep at the wheel in this third wave involving big data and deep learning, and looked to correct that. To correct it, the institute had a symposium, and I had the opportunity to sit next to Regina Barzilay, one of our faculty here at MIT who specializes in AI and particularly AI applied to biomedicine. We really hit it off and realized we had a shared interest in applying AI to drug discovery. My lab had focused on antibiotics for, by then, close to 15 years, but primarily we were using machine learning and network biology to understand the mechanism of action of antibiotics and how resistance arises, with the goal of boosting what we already had. With Regina, we saw there was an opportunity to see if we could use deep learning to get after discovery.

(02:55):

And notably, as you kind of alluded to in your introduction, there's really been a discovery void. The golden age of antibiotic discovery was in the forties, fifties and sixties, before I was born and before you had the genomic revolution, the biotech revolution, the AI revolution. Anyway, we got together with our two groups, and it was an unfunded project, and we kind of cobbled together a very small training set of 2,500 compounds that included 1,700 FDA-approved drugs and 800 natural compounds. In 2018, 2019, when we started this, if you asked any AI expert whether you should initiate that study, they would say absolutely not, there's going to be too little data. The idea is that these models are very data hungry. You need a million pictures of a dog, a million pictures of a cat to train a model to differentiate between the cat and the dog. But we ignored the naysayers and said, okay, let's see what we can do.

(03:41):

And we applied these to E. coli, a model pathogen that's used in labs but also underlies urinary tract infections. So we looked to see which of the molecules inhibited growth of the bacteria as evidence for antibacterial activity. We could have measured and quantified each of their effects, but because we had so few compounds, we just discretized instead: if you inhibited at least 80% of the growth, you were antibacterial, and if you didn't achieve that, you weren't antibacterial. Zeros and ones. We then took the structure of each molecule and trained a deep learning model, specifically a graph neural net, that could look at those structures bond by bond, substructure by substructure, and associate them with whatever features you look to train with. In our case, making for a good antibiotic or not making for a good antibiotic. We then took the trained model and applied it to the Drug Repurposing Hub at the Broad Institute, which consists of 6,100 molecules in various stages of development as new drugs.
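The labeling step described here can be sketched in a few lines. This is only an illustration under stated assumptions, not the team's actual pipeline: the 80% cutoff comes from the conversation, while the `Compound` class, the compound names, and the inhibition values are hypothetical.

```python
from dataclasses import dataclass

# Cutoff taken from the conversation: "at least 80% of the growth".
INHIBITION_THRESHOLD = 0.80


@dataclass
class Compound:
    """Hypothetical record for one screened molecule."""
    name: str
    growth_inhibition: float  # fraction of E. coli growth inhibited, 0.0 to 1.0


def label_antibacterial(compound: Compound,
                        threshold: float = INHIBITION_THRESHOLD) -> int:
    """Return 1 (antibacterial) if inhibition meets the threshold, else 0."""
    return 1 if compound.growth_inhibition >= threshold else 0


# Made-up measurements, purely for illustration.
compounds = [
    Compound("compound_a", 0.95),
    Compound("compound_b", 0.40),
    Compound("compound_c", 0.80),
]

labels = {c.name: label_antibacterial(c) for c in compounds}
print(labels)
```

Collapsing a continuous measurement into zeros and ones discards information, but with a training set of only 2,500 compounds it gives the model a simpler classification target.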

(04:40):

And we asked the model to identify molecules that can make for a good antibiotic but didn't look like existing antibiotics. Part of the discovery void has been linked to this rediscovery issue we have, where we just keep discovering quinolones like Cipro or beta-lactams like penicillin. Well, anyway, from those criteria as well as a small tox model, only one molecule came out of that, and that was this molecule we called halicin, which was named after HAL, the killing AI computer system from 2001: A Space Odyssey. In this case, we don't want it to kill humans, we want it to kill bacteria, and as you alluded to, it turned out to be a remarkably potent novel antibiotic that killed off multidrug-resistant, extensively drug-resistant, and pan-resistant bacteria and went after tough infections. It was effective against TB, it was effective against C. diff and Acinetobacter baumannii, and acted through a completely new mechanism of action.

(05:33):

And so we were very excited to see how AI could open up possibilities and enable one to explore chemical spaces in new and different ways. We then took the model and applied it to a very large chemical library of 1.5 billion molecules and looked at a subset of about 110 million. That would be impossible for any grad student, any lab really, to look at experimentally, but we looked at it in a model computer system and in three days could screen those 110 million molecules. We identified several new additional candidates, one of which we call salicin, the cousin of halicin, which is similarly broad spectrum and acts through a novel mechanism of action.
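To put the scale of that screen in perspective, here is a small back-of-the-envelope calculation. It uses only the figures quoted above (1.5 billion molecules in the library, about 110 million screened, three days) and assumes, for simplicity, that the screen ran continuously; the variable names are illustrative.

```python
# Figures quoted in the conversation.
LIBRARY_SIZE = 1_500_000_000      # molecules in the full chemical library
SCREENED = 110_000_000            # molecules in the subset that was screened
SCREEN_DAYS = 3                   # wall-clock time reported for the screen

SECONDS_PER_DAY = 24 * 60 * 60

# Assumption: the screen ran around the clock for the full three days.
screen_seconds = SCREEN_DAYS * SECONDS_PER_DAY
molecules_per_second = SCREENED / screen_seconds
fraction_of_library = SCREENED / LIBRARY_SIZE

print(f"Throughput: about {molecules_per_second:.0f} molecules per second")
print(f"Share of the library screened: about {fraction_of_library:.1%}")
```

Under that assumption the model scored several hundred molecules every second, which is the sense in which no grad student or lab could match it experimentally.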

Eric Topol (06:07):

So before we go further with this initial burst of discovery, for those who are not used to deep neural networks, I think most now are used to the convolutional neural network for images, but what you used specifically here, as you alluded to, were graph neural networks, with which you could actually study the binding properties. Can you just elaborate a little bit more about these GNNs so that people know this is one of the tools that you used?

Jim Collins (06:40):

Yeah, so in this case, the underlying structure of the model can actually represent and capture the graph structure of a molecule, or it might be of a network, so that the underlying structure of the model itself will also look at things like a carbon atom connecting to an oxygen atom and the oxygen atom connecting to a nitrogen atom. So when you think back to the chemical structures we learned in high school, or maybe in college if we took a chemistry class, it is actually a model that can capture the chemical structure representation and begin to look at sub-aspects of it, associating different properties with it. In this case, again, ours was antibacterial activity, but it could be toxicity, whether it's toxic against a human cell. And the trained graph neural model can now look at new structures that you input to it and then make calculations on those bonds (a bond would be a connection between two atoms) or substructures (multiple bonds interconnecting multiple atoms) and assign a score. Does it make, for example, in our case, for a good antibiotic?
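The idea of a molecule as a graph can be made concrete with a toy example. The sketch below encodes the carbon-oxygen-nitrogen chain from the description above as atoms (nodes) and bonds (edges) and runs one neighbor-aggregation step, the basic operation that graph neural networks repeat and learn weights for. It is not the model from the paper: the single atomic-number feature, the unweighted sum, and the mean readout are all simplifying assumptions for illustration.

```python
# Toy molecular graph for the spoken example: C connects to O, O connects to N.
atoms = ["C", "O", "N"]           # nodes
bonds = [(0, 1), (1, 2)]          # edges, as pairs of atom indices

# Hypothetical one-number feature per atom (its atomic number).
ATOMIC_NUMBER = {"C": 6, "O": 8, "N": 7}


def build_adjacency(num_atoms, bonds):
    """Map each atom index to the indices of the atoms it is bonded to."""
    adjacency = {i: [] for i in range(num_atoms)}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)
    return adjacency


def message_passing_step(features, adjacency):
    """Update each atom by adding the features of its bonded neighbors.

    A real graph neural network applies learned weights here; this
    unweighted sum only shows how information flows along bonds.
    """
    return [
        features[i] + sum(features[j] for j in adjacency[i])
        for i in range(len(features))
    ]


adjacency = build_adjacency(len(atoms), bonds)
features = [float(ATOMIC_NUMBER[a]) for a in atoms]
updated = message_passing_step(features, adjacency)

# Readout: pool the atom-level values into one molecule-level number.
molecule_value = sum(updated) / len(updated)

print(adjacency)
print(updated)
```

After one step each atom's value reflects its immediate neighbors; repeating the step lets information travel across larger substructures, which is how such a model can score a molecule bond by bond and substructure by substructure.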

Eric Topol (07:48):

Right. Now, what's also striking as you set up this collaboration that's interdisciplinary with Regina, who I know of her work through breast cancer AI and not through drug discovery and so this was, I think that new effort and this discovery led to this, I love the name of it, Audacious Project, right?

Jim Collins (08:13):

Right. Yeah, so a few points on the collaboration, then I'll speak to the Audacious Project. In addition to Regina, we also brought in Tommi Jaakkola, another AI faculty member and marvelous colleague here at MIT, and really we've benefited from having outstanding young folks who were multilingual. We had very richly, deeply trained grad students from ML on Regina and Tommi's side who appreciated the biology, and we had very richly, deeply trained postdocs, Jon Stokes in particular, from the microbiology side on my side, who could appreciate the machine learning, and so they could speak across the divide. And so, as I look out over the next few decades in this exciting time of AI coming into biomedicine, I think the groups that will make a difference are those that have these multilingual young trainees and that are well set up to also inject human intelligence with machine intelligence.

(09:04):

Which brings us to the Audacious Project. Now, prior to our publication of halicin, I was invited by the Audacious Project to submit a proposal. The Audacious Project is a new philanthropic effort run by TED, the group that does the TED Talks, which is run by Chris Anderson. Chris had the idea that there was a need to bring together philanthropists around the world to go for a larger scale in a collective manner toward audacious projects. I pitched them on the idea that we could use AI to address the antibiotic resistance crisis. As

The episode Jim Collins: Discovery of the First New Structural Class of Antibiotics in Decades, Using A.I. from the podcast Ground Truths has a duration of 28:52. It was first published 13 Feb 16:24. The cover art and the content belong to their respective owners.

More episodes from Ground Truths

Venki Ramakrishnan: The New Science of Aging

Professor Venki Ramakrishnan, a Nobel laureate for his work on unraveling the structure and function of the ribosome, has written a new book, WHY WE DIE, which is outstanding. Among many posts and recognitions for his extraordinary work in molecular biology, Venki has been President of the Royal Society, was knighted in 2012, and was made a Member of the Order of Merit in 2022. He is a group leader at the MRC Laboratory of Molecular Biology research institute in Cambridge, UK.

A brief video snippet of our conversation below. Full videos of all Ground Truths podcasts can be seen on YouTube here. The audios are available on Apple and Spotify.

Transcript with links to audio and external links

Eric Topol (00:06):

Hello, this is Eric Topol with Ground Truths, and I have a really special guest today, Professor Venki Ramakrishnan from Cambridge who heads up the MRC Laboratory of Molecular Biology, and I think as you know a Nobel laureate for his seminal work on ribosomes. So thank you, welcome.

Venki Ramakrishnan (00:29):

Thank you. I just want to say that I'm not the head of the lab. I'm simply a staff member here.

Eric Topol (00:38):

Right. No, I don't want to give you more authority than you have, so that was certainly not implied. But today we're here to talk about this amazing book, Why We Die, which is a very provocative title and it mainly gets into the biology of aging, which Venki is especially well suited to be giving us a guided tour and his interpretations and views. And I read this book with fascination, Venki. I have three pages of typed notes from your book.

The Compression of Morbidity

Eric Topol (01:13):

And we could talk obviously for hours, but this is fascinating delving into this hot area, as you know, very hot area of aging. So I thought I'd start off more towards the end of the book where you kind of get philosophical into the ethics. And there's this famous concept by James Fries of compression of morbidity that's been circulating for well over two decades. That's really the big question about all this aging effort. So maybe you could give us, do you think there is evidence for compression of morbidity so that you can just extend healthy aging and then you just fall off the cliff?

Venki Ramakrishnan (02:00):

I think that's the goal of most of what I call the saner end of the aging research community: to improve our health span. That is the number of years we have healthy lives, not so much to extend lifespan, which is how long we live. And the idea is that you take those years that we now spend in poor health or decrepitude and compress them down to just a very short time, so you're healthy almost your entire life, and then suddenly go into a rapid decline and die. Now Fries, who actually coined that term compression of morbidity, compares this to the One-Hoss Shay, after a poem by Oliver Wendell Holmes from the 19th century, which is about this horse carriage that was designed so perfectly that all its parts wore out equally. And so, a farmer was riding along in this carriage one minute, and the next minute he found himself on the ground surrounded by a heap of dust, which was the entire carriage that had disintegrated.

Venki Ramakrishnan (03:09):

So the question I would ask is, if you are healthy and everything about you is healthy, why would you suddenly go into decline? And it's a fair question. And every advance we've made that has kept us healthier in one respect or another, for example, tackling diabetes or tackling heart disease, has also extended our lifespan. So people are not living a bigger fraction of their lives healthily now, even though we're living longer. So the result is we're spending the same or even a greater number of years with one or more health problems in our old age. And you can see that in the explosion of nursing homes and care homes in almost all western countries. And as you know, they were big factors in Covid deaths. So I'm not sure it can be accomplished. I think that if we push forward with health, we're also going to extend our lifespan.

Venki Ramakrishnan (04:17):

Now the argument against that comes from studies of these so-called supercentenarians and semi-supercentenarians. These are people who live to be over 105 or 110. And Tom Perls, who runs the New England study of centenarians, has published findings which show that these supercentenarians live extraordinarily healthy lives for most of their life and undergo rapid decline and then die. So that's almost exactly what we would want. So they have somehow accomplished compression of morbidity. Now, I would say there are two problems with that. One is, I don't know about the data sample size. The number of people who live over 110 is very, very small. The other is they may be benefiting from their own unique genetics. So they may have a particular combination of genetics against a broad genetic background that's unique to each person. So I'm not sure it's a generally translatable thing, and it also may have to do with their particular life history and lifestyle. So I don't know how much of what we learned from these centenarians is going to be applicable to the population as a whole. And otherwise, I don't even know how this would be accomplished. Although some people feel there's a natural limit to our biology, which restricts our lifespan to about 115 or 120 years. Nobody has lived more than 122. And so, as we improve our health, we may come up against that natural limit. And so, you might get a compression of morbidity. I'm skeptical. I think it's an unsolved problem.

Eric Topol (06:14):

I think I'm with you about this, but there's a lot of conflation of the two concepts. One is to suppress age-related diseases, and the other is to actually somehow modulate or control the biologic aging process. And we lump it all together, as you're getting at, which is one of the things I loved about your book: you really give a balanced view. You present the contrarians and the different perspectives, the perspective about people having age limits potentially much greater than 120, even though as you say, we haven't seen anyone live past 122 since 1997, so it's quite a long time. So this, I think, conflation of what we do today as far as things that will reduce heart disease or diabetes, that’s age-related diseases, that's very different than controlling the biologic aging process. Now getting into that, one of the things that's particularly alluring right now, my friend here in San Diego, Juan Carlos Belmonte, who went over from Salk, which surprised me, to the Altos Labs, as you pointed out in the book.

Venki Ramakrishnan (07:38):

I'm not surprised. I mean, you have a huge salary and all the resources you want to carry out the same kind of research. I wouldn't blame any of these guys.

Rejuvenating Animals With Yamanaka Factors

Eric Topol (07:50):

No, I understand. I understand. It's kind of like the LIV Golf tournament versus the PGA. It's pretty wild. At any rate, he's a good friend of mine, and I visited with him recently, and as you mentioned, he has over a hundred people working on this partial epigenetic reprogramming. And just so reviewing this for the uninitiated is giving the four Yamanaka transcription factors here to the whole animal or the mouse and rejuvenating old mice, essentially at least those with progeria. And then others have, as you point out in the book, done this with just old mice. So one of the things that strikes me about this, and in talking with him recently is it's going to be pretty hard to give these Yamanaka factors to a person, an intravenous infusion. So what are your thoughts about this rejuvenation of a whole person? What do you think?

Venki Ramakrishnan (08:52):

If I hadn't seen some of these papers, I would've been even more skeptical. But the data from, well, Belmonte's work was done initially on progeria mice. These are mice that age prematurely. And then people thought, well, they may not represent natural aging, and what you're doing is simply helping with some abnormal form of aging. But he and other groups have now done it with normal mice and observed similar effects. Now, I would say reprogramming is one way. It's a very exciting and powerful way to almost try to reverse aging because you're trying to take cells back developmentally. You're taking possibly fully differentiated cells back to stem cells and then helping regenerate tissue, because one of the problems as we age is we start losing stem cells. So we have stem cell depletion, so we can no longer replace our tissues as we do when we're younger. And I think anyone who's had a scrape or been hurt in a fall or something knows this, because if I fall and scrape my elbow and get a big bruise and my grandson falls, we repair our tissues at very, very different rates. It takes me days or weeks to recover, and my grandson's fine in two or three days. You can hardly see he had a scrape at all. So I think that's the thing that these guys want to do.

Venki Ramakrishnan (10:48):

And the problem with Yamanaka factors is cancer. Two of them are oncogenic factors, right? If you give Yamanaka factors to cells, you can take them all the way back to what are called pluripotent cells, which are the cells that are capable of forming any tissue in the body. So for example, a fertilized egg or cells from the early embryo are pluripotent. They could form anything in the body. Now, if you do that to cells with Yamanaka factors, they often form teratomas, which are these unusual forms of cancer tumors. And so, I think there's a real risk. And so, what these guys say is, well, we'll give these factors transiently, so we'll only take the cells back a little ways and not all the way back to pluripotency. And that way if you start with

Svetlana Blitshteyn: On the Front Line With Long Covid and POTS

After finishing her training in neurology at Mayo Clinic, Dr. Svetlana Blitshteyn started a Dysautonomia Clinic in 2009. Little did she know what was in store many years later when Covid hit!

Ground Truths podcasts are on Apple and Spotify. The video interviews are on YouTube

Transcript with audio and external links

Eric Topol (00:07):

Well, hello, it's Eric Topol from Ground Truths, and I have with me a really great authority on dysautonomia and POTS. We will get into what that is for those who aren't following this closely. And it's Svetlana Blitshteyn who is a faculty member at University of Buffalo and a neurologist who long before there was such a thing as Covid was already onto one of the most important pathways of the body, the autonomic nervous system and how it can go off track. So welcome, Svetlana.

Svetlana Blitshteyn (00:40):

Thank you so much, Eric for having me. And I want to say it's a great honor for me to be here and just to be on the list with your other guests. It's remarkable and I'm very grateful and congratulations on being on the TIME100 Health list for influential people in 2024. And I am grateful for everything that you've done. As I mentioned earlier, I'm a big fan of your work before the pandemic and of course with Covid I followed your podcast and posts because you became the best science communicator and I'm very happy to see you being a strong advocate and thank you for everything you've done.

Eric Topol (01:27):

Well, that's so kind of you. And I think talking about getting things going before the pandemic, back in 2011, you published a book with Jodi Epstein Rhum called POTS - Together We Stand: Riding the Waves of Dysautonomia. And you probably didn't have an idea that there would be an epidemic of that more than a decade later, I guess, right?

Svetlana Blitshteyn (01:54):

Yeah, absolutely. Of course, SARS-CoV-2 is a new virus and we can technically say that Long Covid and post Covid complications could be viewed as a new entity. But practically speaking, we know that post-infectious syndromes have been happening for many decades. And so, the most common trigger for POTS happened to be infection, whether it was influenza or mononucleosis or Lyme or enterovirus. We knew this was happening. So I think it didn't take long for me and my colleagues to realize that we're going to be seeing a lot of patients with autonomic dysfunction after Covid.

On the Front Line

Eric Topol (02:40):

Well, one of the things that's important for having you on is you're on the front lines taking care of lots of patients with Long Covid and this postural orthostatic tachycardia syndrome (POTS). And I wonder if you could tell us what it's like to care for these patients, because so many of them are incapacitated. As a cardiologist, I see of course some because of the cardiovascular aspects, but you are dealing with this on a day-to-day basis.

Svetlana Blitshteyn (03:14):

Yeah, absolutely. As early as April 2020 when everything was closed, I got a call from a young doctor in New York City saying that he had Covid and he couldn't recover, he couldn't return to the hospital. And his colleagues and cardiology attendings also had the same symptoms, and the symptoms were palpitations, orthostatic intolerance, tachycardia, fatigue. Now, how he knew to contact me is that his sister was my patient with POTS before the Covid pandemic. So he kind of figured, this looks like my sister, let me check this out. And it didn't take long for me to have a lot of patients from the early wave. And then fairly soon, I think within months, I was thinking, we have to write this up because this is important. And to some of us it was not news, but I was sure that to many physicians and public health officials, this would be something new.

Svetlana Blitshteyn (04:18):

So because I'm a busy clinician and don't have a lot of time for publications, I had to recruit a graduate student from McMaster, and together we had this paper out, which was the first and largest case series on post-Covid POTS and other autonomic disorders. And interestingly, even though it came out I think in 2021, by the time it was published, it became the most cited paper for me. And so I think from then on organizations and societies became interested in the work that I do, because prior to that, I must say, in this kind of niche specialty, I don't think it was very popular or of interest to many.

How Did You Get Interested in Dysautonomia?

Eric Topol (05:06):

Yeah, so that's why I wanted to just take a step back with you Svetlana, because you had the foresight to be the founder and director of the Dysautonomia Clinic when a lot of people weren't in touch with this as an important entity. What prompted you as a neurologist to really zoom in on dysautonomia when you started this clinic?

Svetlana Blitshteyn (05:28):

Sure. So how I ended up in this field is kind of a convoluted road and the reasons are many, but for one, I will say that I trained at Mayo Clinic, where we received very good training on autonomic disorders and EMG, and returning to Buffalo, I began working at the large multiple sclerosis clinic, because Western New York has a high incidence of MS. And so, what I quickly realized in that clinic is that there was a subset of women who did not qualify for the diagnostic criteria of multiple sclerosis, yet they had a lot of the same symptoms and they were certainly very disabled. Now I recognized that these women had autonomic disorders of all sorts and small fiber neuropathy, and I think this population sort of grew, and eventually I realized there is no one, not only in Buffalo but in the entire Western New York, who is doing this work.

Svetlana Blitshteyn (06:34):

So I kind of fell into that. But another reason is actually more personal, one that I haven’t talked about. So years ago I was traveling to Toronto, Canada for a neurology meeting to present my big study on meningioma and hormone replacement therapy using the Mayo Clinic database. And so, in that year, the study received the top 10 noteworthy studies of the year award from the Society of Neuro-Oncology, and it was profiled in Reuters Health. Now, on the way back from the conference, I had the flu, and when I returned I could no longer walk the same hallways of the hospital where I walked previously. And no matter how hard I tried to push my body, we all do this in medicine, we push through, I just couldn’t do it. No amount of wishing or positive thinking. And so, I think that’s how I came to know personally the post-infectious syndromes. And I think it almost became a duality of experiencing this and also practicing it.

Eric Topol (07:52):

No, that’s really striking and it wasn’t so common to hear about this post flu, but certainly it changed in 2020. So how does a person with POTS typically present to you?

Clinical Presentation

Svetlana Blitshteyn (08:08):

So these are very important questions, because what I want to stress is that though POTS is one of the most common autonomic disorders, even if you don’t have POTS by the diagnostic criteria, you may still have autonomic dysfunction and significant autonomic symptoms. How do they present? Well, they present like most Long Covid patients. The most common symptoms are orthostatic intolerance, fatigue, exercise intolerance, post-exertional malaise, dizziness, tachycardia, brain fog. And these are common themes across the board in Long Covid patients, but also in pre-Covid post-acute infection syndrome patients. And you have to recognize this, because what I tell my colleagues is that oftentimes patients are not going to present to you saying, I have orthostatic intolerance. Many times they will say, I’m very tired. I can no longer go to the gym, or when I go to the store, I have to be out of there in 15 minutes because the orthostatic intolerance symptoms come up.

Svetlana Blitshteyn (09:22):

So sometimes the patients themselves don’t recognize that, and it’s up to us physicians to ask the right questions to get the information down. History is very important, knowing the pattern. And then of course, as I always say in all of my papers and lectures, you have to do a 10-minute stand test by measuring supine and standing blood pressure and heart rate on every Long Covid patient. And that’s how you spot those that have excessive postural tachycardia or their blood pressure dropping and so forth. So we have the tools. We don’t need fancy autonomic labs. We don’t even need a tilt table test. The diagnostic criteria for POTS are that you need to have either a 10-minute stand test or a tilt table test to get the diagnosis for POTS, orthostatic hypotension or even neurocardiogenic syncope. Now I think it's important to stress that even if a patient doesn't qualify, and let's say many patients with Long Covid will not elevate their heart rate by at least 30 beats per minute, it could be 20, it could be 25. These criteria are of course essential when we do research studies. But I think practically speaking, in patient care where everything is gray and nothing is black or white, especially in autonomic disorders, you really have to make a diagnosis saying, this sounds like autonomic dysfunction. Let me treat the patient for this problem.

Eric Topol (11:07):

Well, you brought up something that’s really important, because doctors don’t have much time and they’re impatient. They don’t wait 10 minutes to do a test to check your blood pressure. They send the patients for a tilt table, which nobody likes to have done, and it’s an unnecessary added appointment and expense and whatnot. So that’s a good tip right there, that you can get the same information just by checking the blood pressure and heart rate on standi

Kate Crawford: A Leading Scholar and Conscience for A.I.

“We haven't invested this much money into an infrastructure like this really until you go back to the pyramids”—Kate Crawford

Transcript with links to audio and external links. Ground Truths podcasts are on Apple and Spotify. The video interviews are on YouTube

Eric Topol (00:06):

Well, hello, this is Eric Topol with Ground Truths, and I'm really delighted today to welcome Kate Crawford, who we're very lucky to have as an Australian here in the United States. And she's multidimensional, as I've learned, not just a scholar of AI, all the dimensions of AI, but also an artist, a musician. We're going to get into all this today, so welcome Kate.

Kate Crawford (00:31):

Thank you so much, Eric. It's a pleasure to be here.

Eric Topol (00:34):

Well, I knew of your work coming out of the University of Southern California (USC) as a professor there and at Microsoft Research, and I'm only now learning about all these other things that you've been up to, including being recognized in TIME 2023 as one of the 100 most influential people in AI, and it's really fascinating to see all the things that you've been doing. But I guess I'd start off with one of your recent publications in Nature. It was a World View, and it was about how generative AI is guzzling water and energy. And in that you wrote about how these large AI systems, which are getting larger seemingly every day, are needing as much energy as entire nations, and the water consumption is rampant. So maybe we can just start off with that. You wrote a really compelling piece expressing concerns, and obviously this is not just the beginning of all the different aspects you've been tackling with AI.

Exponential Growth, Exponential Concerns

Kate Crawford (01:39):

Well, we're in a really interesting moment. What I've done as a researcher in this space for a very long time now is really introduce a material analysis of artificial intelligence. So we are often told that AI is a very immaterial technology. It's algorithms in the cloud, it's objective mathematics, but in actual fact, it comes with an enormous material infrastructure. And this is something that I took five years to research for my last book, Atlas of AI. It meant going to the mines where lithium and cobalt are being extracted. It meant going into the Amazon fulfillment warehouses to see how humans collaborate with robotic and AI systems. And it also meant looking at the large-scale labs where training data is being gathered and then labeled by crowd workers. And for me, this really changed my thinking. It meant that going from being a professor for 15 years focusing on AI from a very traditional perspective where we write papers, we're sitting in our offices behind desks, that I really had to go and do these journeys, these field trips, to understand that full extractive infrastructure that is needed to run AI at a planetary scale.

(02:58):

So I've been keeping a very close eye on what would change with generative AI, and what we've seen, particularly in the last two years, has been an extraordinary expansion of the three core elements that I really write about in Atlas: the extraction of data, of non-renewable resources, and of course hidden labor. So what we've seen, particularly on the resources side, is a gigantic spike both in terms of energy and water, and that's often the story that we don't hear. We're not aware that when we're told about the fact that there are gigantic hundred billion computers that are now being developed for the next stage of generative AI, that has an enormous energy and water footprint. So I've been researching that along with many others who are now increasingly concerned about how we might think about AI more holistically.

Eric Topol (03:52):

Well, let's go back to your book, which is an extraordinary book, Atlas of AI, and how you dissected not just the, well, power, politics, and planetary costs, but that has won awards and it was a few years back, and I wonder, so much has changed since then. I mean, ChatGPT in late 2022 caught everybody off guard who wasn't into this, knowing that this has been incubating for a number of years, and as you said, these base models are just extraordinary in every parameter you can think about, particularly the computing resource and consumption. So your concerns were of course registered then; have they gone to exponential growth now?

Kate Crawford (04:45):

I love the way you put that. I think you're right. I think my concerns have grown exponentially with the models. But I was like everybody else. Even though I've been doing this for a long time and I had something of a heads up in terms of where we were moving with transformer models, I was also quite taken aback at the extraordinary uptake of ChatGPT back in November 2022. In fact, gosh, it still feels like yesterday; it's been such an extraordinary timescale. But looking at that shift to a hundred million users in two months and then the sort of rapid competition that was emerging from the major tech companies, that I think really took me by surprise: the degree to which everybody was jumping on the bandwagon, applying some form of large language model to everything and anything. Suddenly the hammer was being applied to every single nail.

(05:42):

And in all of that sound and fury and excitement, I think there will be some really useful applications of these tools. But I also think there's a risk that we apply it in spaces where it's really not well suited, that we are not looking at the societal and political risks that come along with these approaches, particularly next token prediction as a way of generating knowledge. And then finally this bigger set of questions around what is it really costing the planet to build these infrastructures that are really gargantuan? I mean, as a species, we haven't invested this much money into an infrastructure like this really until you go back to the pyramids. You really got to go very far back to see that type of just gargantuan spending in terms of capital, in terms of labor, in terms of all of the things that are required to really build these kinds of systems. So for me, that's the moment that we're in right now, and perhaps here together in 2024, we can take a breath from that extraordinary 18-month period and hopefully be a little more reflective on what we're building and why and where it will be best used.

Propagation of Biases

Eric Topol (06:57):

Yeah. Well, there's so many aspects of this that I'd like to get into with you. I mean, one of course, as a keen observer and activist in this whole space, you've made I think a very clear point about how our culture is mirrored in our AI, that is, our biases, and people are of course very quick to blame AI per se, but it seems like it's a bigger problem than just that. Maybe you could comment about it. Obviously biases are a profound concern, and the propagation of them, so where do you see the problem is and how can it be attacked?

Kate Crawford (07:43):

Well, it is an enormous problem, and it has been for many years. I was first really interested in this question in the era that was known as the big data era. So we can think about the mid-2000s, and I really started studying large scale uses of data in scientific applications, but also in what you call social scientific settings using things like social media to detect and predict opinion, movement, the way that people were assessing key issues. And time and time again, I saw the same problem, which is that we have this tendency to assume that with scale comes greater accuracy without looking at the skews from the data sources. Where is that data coming from? What are the potential skews there? Is there a population that's overrepresented compared to others? And so, I began very early on looking at those questions. And then when we had very large-scale data sets start to emerge, like ImageNet, which was really perhaps the most influential dataset behind computer vision that was released in 2009, it was used widely, it was freely available.

(09:00):

That version was available for over a decade and no one had really looked inside it. And so, working with Trevor Paglen and others, we analyzed how people were being represented in this data set. And it was really quite extraordinary because initially people are labeled with terms that might seem relatively unsurprising, like this is a picture of a nurse, or this is a picture of a doctor, or this is a picture of a CEO. But then you look to see who is the archetypical CEO, and it's all pictures of white men, or if it's a basketball player, it's all pictures of black men. And then the labeling became more and more extreme, and there are terms like, this is an alcoholic, this is a corrupt politician, this is a kleptomaniac, this is a bad person. And then a whole series of labels that are simply not repeatable on your podcast.

(09:54):

So in finding this, we were absolutely horrified. And again, to know that so many AI models had trained on this as a way of doing visual recognition was so concerning because of course, very few people had even traced who was using this model. So trying to do the reverse engineering of where these really problematic assumptions were being built in, hardcoded into how AI models see and interpret the world, that was a giant unknown and remains to this day quite problematic. We did a recent study that just came out a couple of months ago looking at one of the biggest data sets behind generative AI systems that are doing text to image generation. It's called LAION-5B, which stands for 5 billion. It has 5 billion images and text captions drawn from the internet. And you might think, as you said, this will just mirror societal biases, but it's actually far more weird than you

Akiko Iwasaki: The Immunology of Covid and the Future

If there’s one person you’d want to talk to about immunology, the immune system and Covid, holes in our knowledge base about the complex immune system, and where the field is headed, it would be Professor Iwasaki. And add to that the topic of Women in Science. Here’s our wide-ranging conversation.

Transcript with many external links and links to the audio, recorded 30 April 2024

Eric Topol (00:06):

Hello, it's Eric Topol and I'm really thrilled to have my friend Akiko Iwasaki from Yale, and before I start talking with Akiko, I just want to mention there aren't too many silver linings of the pandemic, but one for me was getting to know Professor Iwasaki. She is my go-to immunologist. I've learned so much from her over the last four years and she's amazing. As you may know, she was just recently named one of the most influential people in the world by TIME100 [and also recognized this week in TIME100 Health]. And besides that, she's been elected to the National Academy of Medicine and the National Academy of Sciences. She's the president of the American Association of Immunologists and she's a Howard Hughes principal investigator. So Akiko, it's wonderful to have you join an extended discussion of things that we have of mutual interest.

Akiko Iwasaki (01:04):

Thank you so much, Eric, for having me. I equally appreciate all of what you do, and I follow your blog and tweets and everything. So thank you Eric.

Eric Topol (01:14):

Well, you are a phenom. I mean, that's all I can say, because I think it was so appropriate that TIME recognized your contributions, not just over the pandemic, but of course throughout your career, a brilliant career in immunology. I thought we'd start out with our topic of great interest, Long Covid. You've done seminal work here and this is an evolving topic obviously. I wonder what your latest thoughts are on the pathogenesis and where things are headed.

Long Covid

Akiko Iwasaki (01:55):

Yeah, so as I have been saying throughout the pandemic, I think that Long Covid is not one disease. It's a collection of multiple diseases that are sort of ending up in similar sets of symptoms. Obviously, there are over 200 symptoms and not everyone has the same set of symptoms, but what we are going for is trying to understand the disease drivers, so persistent viral infection is one of them. There is overwhelming evidence for that theory now, all the way from autopsy and biopsy studies to looking at peripheral blood RNA signatures as well as circulating spike protein and nucleocapsid proteins that are detected in people with Long Covid. Now whether that persistent virus or remnants of virus is driving the disease itself is unclear still. And that's why trials like the one that we are engaging with Harlan Krumholz on Paxlovid should tell us what percentage of the people are suffering from that type of driver and whether antivirals like Paxlovid might be able to mitigate those. If I may, I'd like to talk about three other hypotheses.

Eric Topol (03:15):

Yeah, I'd love for you to do that.

Akiko Iwasaki (03:18):

Okay, great. So the second hypothesis that we've been working on is autoimmune disease. And so, this is clearly happening in a subset of people; again, it's a heterogeneous disease. But we can actually not only look at reactogenicity of antibodies from people with Long Covid, we can also transfer IgG from patients with Long Covid into an animal, a healthy animal, and really measure outcomes of pathogenesis. So that's functional evidence that antibodies in some people with Long Covid are really actually causing some of the damage that is occurring in vivo. And the third hypothesis is the reactivation of herpes viruses. So many of us adults have multiple latent herpes virus family members that are just dormant and are not really causing any pathologies. But in people with Long Covid, we're seeing elevated reactivation of viruses like Epstein-Barr virus (EBV) or Varicella-zoster virus (VZV), and that may again be just a signature of Long Covid, but it may also be driving some of the symptoms that people are suffering from.

(04:32):

So that's again, we see the signature over and over, not just our group, but multiple other groups, Michael Peluso's group, Jim Heath, and many others. So that's also emerging evidence from multiple groups showing that. And finally, we think that inflammation that occurs during the acute phase can sort of chronically change some tissue tone. For instance, in the brain with Michelle Monje’s team, we developed a sort of localized mild Covid model of infection and showed that changes in microglia can be seen seven weeks post infection even though the virus is completely gone. So that means that inflammation that's established as a result of this initial infection can have prolonged consequences and sequelae within the person and that may also be driving disease. And Eric, the reason we need to understand these diseases separately is not only for diagnostic purposes, but for therapeutic purposes, because to target a persistent virus is a very different approach from targeting autoantibodies, for example.

Eric Topol (05:49):

Well, that's great. There's a lot to unpack there as you laid out four distinct paths that could result in the clinical syndrome and sequelae. I think you know I had the chance to have a really fun conversation with Michelle about the joint work that you've done, and she reminded me how she made a cold call to you to start a collaboration, which I thought was fantastic. Look what that yielded. But yeah, this is fascinating because, as I think you're getting at, it may not be the same pathogenesis in any given individual, so that all these, and even others, might be operative. I guess maybe I'd first delve into the antibody story. As you're well aware, we see after people get Covid a higher rate of autoimmune diseases crop up, which is really interesting because it seems to rev up self-directed immune response. And this I think many people haven't really noted yet, although obviously you're well aware of this: it's across all the different autoimmune diseases, connective tissue disease, not just one in particular. And as you say, the idea that you could take the blood from a person suffering from Long Covid and give it to an experimental animal model and be able to recapitulate some of the abnormalities, it's really pretty striking. So the question I guess is, if you were to do plasmapheresis and try to basically expunge these autoantibodies, wouldn't you expect people to have some symptomatic benefit pretty rapidly, or is it just that the process is already far from the initiating step?

Akiko Iwasaki (07:54):

That's a great question. Plasmapheresis may be able to transiently improve the person if they're suffering from these autoantibody mediated diseases. People have reported, for example, IVIG treatment has dramatically improved their symptoms, but not in everybody. So it's really critical to understand who's suffering from this particular driver and appropriately treat those people. And there are many other very effective therapies in autoimmune disease field that can be repurposed for treating these patients as well.

Eric Topol (08:34):

The only clinical trial that has clicked so far, interestingly, came out of Hong Kong with different types of ways to manipulate the gut microbiome, which again, you know better than me is a major modulator of our immune system response. What are your thoughts about taking advantage of that way to somehow modulate this untoward immune response in people with this condition?

Akiko Iwasaki (09:07):

Yeah, so that is an exciting sort of development, and I don't mean to discount the importance of the microbiome at all. It's just that the drivers that I'm mentioning are something that can be directly linked to disease, but certainly dysbiosis and translocation of metabolites and the microbiome itself could trigger Long Covid as well. So it's something that we're definitely keeping our eyes on. And as you say, Eric, the immune system is in intimate contact with the gut microbiome and also the gut is in intimate contact with the brain. So there's a lot of connections that we really need to be paying attention to. So yeah, absolutely. This is a very exciting development.

Eric Topol (09:57):

And it is intriguing of course, the reactivation of viruses. I mean, we’ve learned in recent years how important EBV is in multiple sclerosis (MS). The question I have for you on that pathway, is this just an epiphenomenon or do you actually think that could be a driving force in some people?

Akiko Iwasaki (10:19):

Yeah, so that's really hard to untangle in people. I mean, David Putrino and my team, we're planning a clinical trial using Truvada. Truvada obviously is an HIV drug, but it has reported antiviral activity against Epstein-Barr virus (EBV) and others. So potentially we can try to interrogate that in people, but we're also developing mouse models that can sort of recapitulate EBV-like viral reactivation to see whether there's any sort of causal link between the reactivation and the disease process.

Eric Topol (10:57):

Right now, recently there's been a bunch of anecdotes of people who get the glucagon-like peptide one (GLP-1) drugs, which have a potent anti-inflammatory effect, both systemic and in the brain. I'd love to test these drugs, but of course these companies that make them have other interests outside of Long Covid. Do you think there's potential for a drug lik

Aviv Regev: The Revolution in Digital Biology

“Where do I think the next amazing revolution is going to come? … There’s no question that digital biology is going to be it. For the very first time in our history, in human history, biology has the opportunity to be engineering, not science.” – Jensen Huang, NVIDIA CEO

Aviv Regev is one of the leading life scientists of our time. In this conversation, we cover the ongoing revolution in digital biology that has been enabled by new deep knowledge on cells, proteins and genes, and the use of generative A.I.

Transcript with audio and external links

Eric Topol (00:05):

Hello, it's Eric Topol with Ground Truths and with me today I've really got the pleasure of welcoming Aviv Regev, who is the Executive Vice President of Research and Early Development at Genentech, having been 14 years a leader at the Broad Institute and who I view as one of the leading life scientists in the world. So Aviv, thanks so much for joining.

Aviv Regev (00:33):

Thank you for having me and for the very kind introduction.

The Human Cell Atlas

Eric Topol (00:36):

Well, there is no question in my view that is the truth, and I wanted to have a chance to visit a few of the principal areas that you have been nurturing over many years. First of all, the Human Cell Atlas (HCA): the 37 trillion cells in our body, approximately, a little affected by size and gender and whatnot. You founded the Human Cell Atlas, and maybe you can give us a little background on what you were thinking, forward thinking of course, when you and your colleagues initiated that big, big project.

Aviv Regev (01:18):

Thanks. Co-founded together with my very good friend and colleague, Sarah Teichmann, who was at the Sanger and just moved to Cambridge. I think our community at the time, which was still small at the time, really had the vision that has been playing out in the last several years, which is a huge gratification that if we had a systematic map of the cells of the body, we would be able both to understand biology better as well as to provide insight that would be meaningful in trying to diagnose and to treat disease. The basic idea behind that was that cells are the basic unit of life. They're often the first level at which you understand disease as well as in which you understand health and that in the human body, given the very large number of individual cells, 37.2 trillion give or take, and there are many different characteristics.

(02:16):

Even though biologists have been spending decades and centuries trying to characterize cells, they still had a haphazard view of them and that the advancing technology at the time – it was mostly single cell genomics, it was the beginnings also of spatial genomics – suggested that now there would be a systematic way, like a shared way of doing it across all cells in the human body rather than in ways that were niche and bespoke and as a result didn't unify together. I will also say, and if you go back to our old white paper, you will see some of it that we had this feeling because many of us were computational scientists by training, including both myself and Sarah Teichmann, that having a map like this, an atlas as we call it, a data set of this magnitude and scale, would really allow us to build a model to understand cells. Today, we call them foundational models or foundation models. We knew that machine learning is hungry for these kinds of data and that once you give it to machine learning, you get amazing things in return. We didn't know exactly what those things would be, and that has been playing out in front of our eyes as well in the last couple of years.

Spatial Omics

Eric Topol (03:30):

Well, that gets us to the topic you touched on the second area I wanted to get into, which is extraordinary, which is the spatial omics, which is related to the ability to the single cell sequencing of cells and nuclei and not just RNA and DNA and methylation and chromatin. I mean, this is incredible that you can track the evolution of cancer, that the old word that we would say is a tumor is heterogeneous, is obsolete because you can map every cell. I mean, this is just changing insights about so much of disease health mechanisms, so this is one of the hottest areas of all of life science. It's an outgrowth of knowing about cells. How do you summarize this whole era of spatial omics?

Aviv Regev (04:26):

Yeah, so there's a beautiful sentence in In Search of Lost Time by Marcel Proust that I'm going to mess up in paraphrasing, but it is roughly that going on new journeys is not about actually going somewhere physically but looking with new eyes, and I butchered the quote completely. [See below for actual quote.] I think that is actually what single cells and then spatial genomics, or spatial omics more broadly, has given us. It's the ability to look at the same phenomenon that we looked at all along, be it cancer or animal development or homeostasis in the lung or the way our brain works, but having new eyes in looking, and these new eyes are not just seeing more of something we've seen before, but actually seeing things that we couldn't realize were there before. It starts with finding cells we didn't know existed, but it's also the processes that these cells undergo, the mechanisms that actually control that, the causal mechanisms that control that, and especially in the case of spatial genomics, the ways in which cells come together.

(05:43):

And so we often like to think about the cell because it's the unit of life, but in a multicellular organism we just as much have to think about tissues and after that organs and systems and so on. In a tissue, you have this amazing orchestration of the interactions between different kinds of cells, and this happens in space and in time, and we're now able to look at this. In biology, often structure is tightly associated with function. So just as the structure of the protein relates to the function of the protein, in the same way, the way in which things are structured in tissue, which cells are next to each other, what molecules they are expressing, how they are physically interacting, really tells us how they conduct the business of the tissue. When the tissue functions well, it is this multicellular circuit that performs this amazing thing known as homeostasis.

(06:36):

Everything changes and yet the tissue stays the same and functions, and in disease, of course, when these connections break, when they're not done in the right way, you end up with pathology, which is of course something that even historically we have always looked at at the level of the tissue. So now we can see it in a much better way, and as we see it in a better way, we resolve better things. Yes, we can understand better the mechanisms that underlie the resistance to therapeutics. We can follow a temporal process like cancer as it unfortunately evolves. We can understand how autoimmune disease plays out with many cells that are actually bent out of shape in their interactions. We can also follow magnificent things like how we start from a single cell, the fertilized egg, and we become a 37.2 trillion cell marvel. These are all things that this ability to look in a different way allows us to do.

Eric Topol (07:34):

It's just extraordinary. I wrote at Ground Truths about this. I gave all the examples at that time, and now there's about 50 more. In the cardiovascular arena, knowing with single cell of the pineal gland the explanation of why people with heart failure have sleep disturbances, I mean, that's just one of so many things now. These new insights, it's really just so remarkable. Now we get to the current revolution, and I wanted to read to you a quote that I have.

Digital Biology

Aviv Regev (08:16):

I should have prepared mine. I did it off the top of my head.

Eric Topol (08:20):

It's actually from Jensen Huang at NVIDIA about the digital biology [at top of the transcript] and how it changes the world and how you're changing the world with AI and lab in the loop and all these things going on in three years that you've been at Genentech. So maybe you can tell us about this revolution of AI and how you're embracing it to have AI get into positive feedbacks as to what experiment to do next from all the data that is generated.

Aviv Regev (08:55):

Yeah, so Jensen and NVIDIA are actually great partners for us in Genentech, so it's fun to contemplate any quote that comes from there. I'll actually say this has been in the making since the early 2010s. 2012 I like to reflect on because I think it was a remarkable year for what we're seeing right now in biology, specifically in biology and medicine. In 2012, we had the beginnings of really robust protocols for single cell genomics, the first generation of those; we had CRISPR happen as a method to actually edit cells, so we had the ability to manipulate systems in a much better way than we had before; and deep learning happened in the same year as well. Wasn't that a nice year? But sometimes people only realize the magnitude of the year that happened years later. I think the deep learning impact people realized first, then the single cells, and then the CRISPR, then the single cells.

(09:49):

So in order maybe a little bit, but now we're really living through what that promise can deliver for us. It's still the early days of that, of the delivery, but we are really seeing it. The thing to realize there is that for many, many of the problems that we try to solve in biomedicine, the problem is bigger than we would ever be able to perform experiments or collect data. Even if we had the genomes of all the people in the world, all billions and billions of them, that's just a smidge compared to all of the ways in which their common var
