Daphne Koller: The Convergence of A.I. and Digital Biology

35m · Ground Truths · 10 Mar 16:35

Transcript

Eric Topol (00:06):

Well, hello, this is Eric Topol with Ground Truths and I am absolutely thrilled to welcome Daphne Koller, the founder and CEO of insitro, and a person who I've been wanting to meet for some time. Finally, we converged so welcome, Daphne.

Daphne Koller (00:21):

Thank you Eric. And it's a pleasure to finally meet you as well.

Eric Topol (00:24):

Yeah, I mean you have been rocking everybody over the years, being elected to the National Academy of Engineering and Science, and right at the interface of life science and computer science. In my view, there's hardly anyone I can imagine who's doing so much at that interface. I wanted to first start with your meeting in Davos last month because I kind of figured we'd start broad with AI rather than starting to get into what you're doing these days. And you had a really interesting panel with Yann LeCun, Andrew Ng and Kai-Fu Lee and others, and I wanted to get your impression about that and also kind of the general sense. I mean AI is just moving at a speed that is just crazy stuff. What were your thoughts about that panel just last month, where are we?

Video link for the WEF Panel

Daphne Koller (01:25):

I think we've been living on an exponential curve for multiple decades and the thing about exponential curves is they are very misleading things. In the early stages people basically take the line between wherever we were last year and this year and they interpolate linearly, and they say, God, things are moving so slowly. Then as the exponential curve starts to pick up, it becomes more and more evident that things are moving faster, but people still interpolate linearly, and it's only when things really hit that inflection point that people realize that even with the linear interpolation where we'll be next year is just mind blowing. And if you realize that you're on that exponential curve, where we will be next year is just totally unanticipatable. I think what we started to discuss in that panel was, are we in fact on an exponential curve? What are the rate limiting factors that may or may not enable that curve to continue, specifically availability of data, and what it would take to make that curve available in areas outside of the speech, whatever natural language, large language models that exist today and go far beyond that, which is what you would need to have these be applicable to areas such as biology and medicine.

Daphne Koller (02:47):

And so that was kind of the message to my mind from the panel.
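The point about reading an exponential curve with a straight line can be made concrete with a small worked example. This is only a sketch under a stated assumption: a hypothetical quantity that doubles every year (the doubling rate and the function names are illustrative, not figures from the panel). It compares a forecast that extends the line through the last two observations with one that extends the last observed growth ratio.

```python
def linear_forecast(prev, curr, steps=1):
    """Extend the straight line through the last two observations."""
    return curr + (curr - prev) * steps


def exponential_forecast(prev, curr, steps=1):
    """Extend the last observed growth ratio instead of the last difference."""
    return curr * (curr / prev) ** steps


# Hypothetical index that doubles every year (an assumption for illustration).
history = [2 ** year for year in range(11)]  # years 0..10

for year in (1, 5, 10):
    prev, curr = history[year - 1], history[year]
    actual_next = curr * 2
    line = linear_forecast(prev, curr)
    # The straight line always lands at 75% of the true value here,
    # but the absolute shortfall grows as fast as the curve itself.
    print(year, line, actual_next, actual_next - line)
```

Under this assumption the relative miss of the straight line stays constant while the absolute miss doubles each year, which is one way to read the remark that the same habit of linear thinking feels harmless early on and mind blowing later.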

Eric Topol (02:53):

And there were some differences in opinion, of course Yann can be a little strong and I think it was good to see that you're challenging on some things and how there is this “world view” of AI and how, I guess where we go from here. As you mentioned in the area of life science, there already had been, before large language models hit stride, so much progress particularly in imaging cells, subcellular, I mean rare cells, I mean just stuff that was just without any labeling, without fluorescein, just amazing stuff. And then now it's gone into another level. So as we get into that, just before I do that, I want to ask you about this convergence story. Jensen Huang, I'm sure you heard his quote about biology as the opportunity to be engineering, not science. I'm not sure if I understand, not science, but what about this convergence? Because it is quite extraordinary to see two fields coming together moving at such high velocity.

"Biology has the opportunity to be engineering not science. When something becomes engineering not science it becomes...exponentially improving, it can compound on the benefits of previous years." -Jensen Huang, NVIDIA.

Daphne Koller (04:08):

So, a quote that I will replace Jensen's or will propose a replacement for Jensen's quote, which is one that many people have articulated, is that math is to physics as machine learning is to biology. It is a mathematical foundation that allows you to take something that up until that point had been kind of mysterious and fuzzy and almost magical and create a formal foundation for it. Now physics, especially Newtonian physics, is simple enough that math is the right foundation to capture what goes on in a lot of physics. Biology as an evolved natural system is so complex that you can't articulate a mathematical model for that de novo. You need to actually let the data speak and then let machine learning find the patterns in those data and really help us create a predictability, if you will, for biological systems that you can start to ask what if questions, what would happen if we perturb the system in this way?

The Convergence

Daphne Koller (05:17):

How would it react? We're nowhere close to being able to answer those questions reliably today, but as you feed a machine learning system more and more data, hopefully it'll become capable of making those predictions. And in order to do that, and this is where it comes to this convergence of these two disciplines, the fodder, the foundation for all of machine learning is having enough data to feed the beast. The miracle of the convergence that we're seeing is that over the last 10, 15 years, maybe 20 years in biology, we've been on a similar, albeit somewhat slower exponential curve of data generation in biology where we are turning it into a quantitative discipline from something that is entirely observational and qualitative, which is where it started, to something that becomes much more quantitative and broad based in how we measure biology. And so those measurements, the tools that life scientists and bioengineers have developed that allow us to measure biological systems, are what produce that fodder, that energy that you can then feed into the machine learning models so that they can start making predictions.

Eric Topol (06:32):

Yeah, well I think the number of layers of data, no less what's in these layers, is quite extraordinary. So some years ago when all the single cell sequencing was started, I said, well, that's kind of academic interest, and now the field of spatial omics has exploded. And I wonder how you see the feeding the beast here. It's at every level. It's not just the cell level, subcellular and single cell nuclei sequencing, single cell epigenomics, and then you go all the way to these other layers of data. I know you plug into the human patient side as well, as it could be images, it could be path slides, it could be the outcomes and treatments and on and on and on. I mean, so when you think about multimodal AI, has anybody really done that yet?

Daphne Koller (07:30):

I think that there are certainly beginnings of multimodal AI and we have started to see some of the benefits of the convergence of say, imaging and omics. And I will give an example from some of the work that we've recently distributed on a preprint server work that we did at insitro, which took imaging data from standard histopathology slides, H&E slides and aligned them with simple bulk RNA-Seq taken from those same tumor samples. And what we find is that by training models that translate from one to the other, specifically from the imaging to the omics, you're able to, for a fairly large fraction of genes, make very accurate predictions of gene expression levels by looking at the histopath images alone. And in fact, because many of the predictions are made at the tile level, not at the entire slide level, even though the omics was captured in bulk, you're able to spatially resolve the signal and get kind of like a pseudo spatial biology just by making predictions from the H&E image into these omic modalities.
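The idea of training on bulk measurements while predicting per tile can be illustrated with a toy sketch. It is not the insitro method, which the conversation does not describe in detail. It assumes a plain linear model, two hypothetical numeric features per tile, simulated data, and a slide-level prediction defined as the mean of the tile-level predictions. Only the bulk value per slide is used as a training target, yet the fitted model still yields one prediction per tile.

```python
import random

random.seed(0)
N_FEATURES = 2
TRUE_WEIGHTS = [2.0, -1.0]  # hypothetical ground truth, used only to simulate data


def predict_tile(weights, tile):
    """Predicted expression of one gene for a single image tile."""
    return sum(w * x for w, x in zip(weights, tile))


def predict_slide(weights, tiles):
    """Slide-level (bulk) prediction: the mean of the tile-level predictions."""
    return sum(predict_tile(weights, t) for t in tiles) / len(tiles)


def make_slide(n_tiles=20):
    """Simulate a slide: tiles share a slide-specific baseline plus local variation."""
    base = [random.random() for _ in range(N_FEATURES)]
    return [[b + random.uniform(-0.2, 0.2) for b in base] for _ in range(n_tiles)]


def train(slides, bulk, lr=0.5, epochs=200):
    """Fit weights by gradient descent using only one bulk target per slide."""
    weights = [0.0] * N_FEATURES
    for _ in range(epochs):
        for tiles, target in zip(slides, bulk):
            error = predict_slide(weights, tiles) - target
            # Gradient of 0.5 * error**2 is error times the mean tile feature.
            mean_feat = [sum(t[j] for t in tiles) / len(tiles) for j in range(N_FEATURES)]
            weights = [w - lr * error * m for w, m in zip(weights, mean_feat)]
    return weights


slides = [make_slide() for _ in range(30)]
bulk = [predict_slide(TRUE_WEIGHTS, tiles) for tiles in slides]  # simulated bulk values

weights = train(slides, bulk)
tile_map = [predict_tile(weights, t) for t in slides[0]]  # per-tile "pseudo spatial" signal
print([round(w, 3) for w in weights])
print([round(v, 2) for v in tile_map[:5]])
```

Because the bulk target constrains only the average over tiles, the per-tile values are an inference from the model rather than a measurement, which is why the result is better described as pseudo spatial than as spatial biology proper.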

Multimodal A.I. and Life Science

Daphne Koller (08:44):

So there are I think beginnings of multimodality, but in order to get to multimodality, you really need to train on at least some data where the two modalities are captured simultaneously. And so at this point, I think the rate limiting factor is more a matter of data acquisition for training the models than it is for building the models themselves. And so that's where I think things like spatial biology, which I think like you are very excited about, are one of the places where we can really start to capture these paired modalities and get to some of those multimodal capabilities.

Eric Topol (09:23):

Yeah, I wanted to ask you because I mean spatial temporal is so perfect. It is two modes, and you have, as in the preprint you refer to, and you see things like electronic health records and genomics, electronic health records and medical images. The most we've done is getting two modes of data together. And the question is as this data starts to really accrue, do we need new models to work with it or do you actually foresee that that is not a limiting step?

Daphne Koller (09:57):

So I think currently data availability is the most significant rate limiting step. The nice thing about modern day machine learning is that it really is structured as a set of building blocks that you can start to put together in different ways for different situations. And so, do we have the exact right models available to us today for these multimodal systems? Probably not, but do we have the right building blocks that if we creatively put them together from what has already been deployed in other settings? Probably, yes. So of course there's still a model exploration to be done and a lot of creativity in how these building blocks should be put together, but I think we have the tools available to solve these problems. What we really need is first I think a really significant data acquisition effort. And the other thing that we need, which is also something that has been a priority for us at insitro, is the right mix of people to be put together so that you can, because what

The episode Daphne Koller: The Convergence of A.I. and Digital Biology from the podcast Ground Truths has a duration of 35:16. It was first published 10 Mar 16:35. The cover art and the content belong to their respective owners.

More episodes from Ground Truths

Venki Ramakrishnan: The New Science of Aging

Professor Venki Ramakrishnan, a Nobel laureate for his work on unraveling the structure and function of the ribosome, has written a new book WHY WE DIE which is outstanding. Among many posts and recognitions for his extraordinary work in molecular biology, Venki has been President of the Royal Society, knighted in 2012, and was made a Member of the Order of Merit in 2022. He is a group leader at the MRC Laboratory of Molecular Biology research institute in Cambridge, UK.

Full videos of all Ground Truths podcasts can be seen on YouTube. The audios are available on Apple and Spotify.

Transcript with links to audio and external links

Eric Topol (00:06):

Hello, this is Eric Topol with Ground Truths, and I have a really special guest today, Professor Venki Ramakrishnan from Cambridge who heads up the MRC Laboratory of Molecular Biology, and I think as you know a Nobel laureate for his seminal work on ribosomes. So thank you, welcome.

Venki Ramakrishnan (00:29):

Thank you. I just want to say that I'm not the head of the lab. I'm simply a staff member here.

Eric Topol (00:38):

Right. No, I don't want to give you more authority than you have, so that was certainly not implied. But today we're here to talk about this amazing book, Why We Die, which is a very provocative title and it mainly gets into the biology of aging, which Venki is especially well suited to be giving us a guided tour and his interpretations and views. And I read this book with fascination, Venki. I have three pages of typed notes from your book.

The Compression of Morbidity

Eric Topol (01:13):

And we could talk obviously for hours, but this is fascinating delving into this hot area, as you know, very hot area of aging. So I thought I'd start off more towards the end of the book where you kind of get philosophical into the ethics. And there's this famous concept by James Fries of compression of morbidity that's been circulating for well over two decades. That's really the big question about all this aging effort. So maybe you could give us, do you think there is evidence for compression of morbidity so that you can just extend healthy aging and then you just fall off the cliff?

Venki Ramakrishnan (02:00):

I think the goal of most of what I call the saner end of the aging research community is to improve our health span, that is, the number of years we have healthy lives, not so much to extend lifespan, which is how long we live. And the idea is that you take those years that we now spend in poor health or decrepitude and compress them down to just a very short time, so you're healthy almost your entire life, and then suddenly go into a rapid decline and die. Now Fries, who actually coined that term compression of morbidity, compares this to the One-Hoss Shay, after a poem by Oliver Wendell Holmes from the 19th century, which is about this horse carriage that was designed so perfectly that all its parts wore out equally. And so, a farmer was riding along in this carriage one minute, and the next minute he found himself on the ground surrounded by a heap of dust, which was the entire carriage that had disintegrated.

Venki Ramakrishnan (03:09):

So the question I would ask is, if you are healthy and everything about you is healthy, why would you suddenly go into decline? And it's a fair question. And every advance we've made that has kept us healthier in one respect or another, for example, tackling diabetes or tackling heart disease, has also extended our lifespan. So people are not living a bigger fraction of their lives healthily now, even though we're living longer. So the result is we're spending the same or even more number of years with one or more health problems in our old age. And you can see that in the explosion of nursing homes and care homes in almost all western countries. And as you know, they were big factors in Covid deaths. So I'm not sure it can be accomplished. I think that if we push forward with health, we're also going to extend our lifespan.

Venki Ramakrishnan (04:17):

Now the argument against that comes from studies of these so-called supercentenarians and semi-supercentenarians. These are people who live to be over 105 or 110. And Tom Perls who runs the New England study of centenarians has published findings which show that these supercentenarians live extraordinarily healthy lives for most of their life and undergo rapid decline and then die. So that's almost exactly what we would want. So they have somehow accomplished compression of morbidity. Now, I would say there are two problems with that. One is, I don't know about the data sample size. The number of people who live over 110 is very, very small. The other is they may be benefiting from their own unique genetics. So they may have a particular combination of genetics against a broad genetic background that's unique to each person. So I'm not sure it's a generally translatable thing, and it also may have to do with their particular life history and lifestyle. So I don't know how much of what we learned from these centenarians is going to be applicable to the population as a whole. And otherwise, I don't even know how this would be accomplished. Although some people feel there's a natural limit to our biology, which restricts our lifespan to about 115 or 120 years. Nobody has lived more than 122. And so, as we improve our health, we may come up against that natural limit. And so, you might get a compression of morbidity. I'm skeptical. I think it's an unsolved problem.

Eric Topol (06:14):

I think I'm with you about this, but there's a lot of conflation of the two concepts. One is to suppress age related diseases, and the other is to actually somehow modulate or control the biologic aging process. And we lump it all together as you're getting at, which is one of the things I loved about your book is you really give a balanced view. You present the contrarians and the different perspectives, the perspective about people having age limits potentially much greater than 120, even though as you say, we haven't seen anyone live past 122 since 1997, so it's quite a long time. So this, I think, conflation of what we do today as far as things that will reduce heart disease or diabetes, that’s age related diseases, that's very different than controlling the biologic aging process. Now getting into that, one of the things that's particularly alluring right now, my friend here in San Diego, Juan Carlos Belmonte, who went over from Salk, which surprised me, to the Altos Labs, as you pointed out in the book.

Venki Ramakrishnan (07:38):

I'm not surprised. I mean, you have a huge salary and all the resources you want to carry out the same kind of research. I wouldn't blame any of these guys.

Rejuvenating Animals With Yamanaka Factors

Eric Topol (07:50):

No, I understand. I understand. It's kind of like the LIV Golf tournament versus the PGA. It's pretty wild. At any rate, he's a good friend of mine, and I visited with him recently, and as you mentioned, he has over a hundred people working on this partial epigenetic reprogramming. And just so reviewing this for the uninitiated is giving the four Yamanaka transcription factors here to the whole animal or the mouse and rejuvenating old mice, essentially at least those with progeria. And then others have, as you point out in the book, done this with just old mice. So one of the things that strikes me about this, and in talking with him recently is it's going to be pretty hard to give these Yamanaka factors to a person, an intravenous infusion. So what are your thoughts about this rejuvenation of a whole person? What do you think?

Venki Ramakrishnan (08:52):

If I hadn't seen some of these papers I would've been even more skeptical. But the data from, well, Belmonte's work was done initially on progeria mice. These are mice that age prematurely. And then people thought, well, they may not represent natural aging, and what you're doing is simply helping with some abnormal form of aging. But he and other groups have now done it with normal mice and observed similar effects. Now, I would say reprogramming is one way. It's a very exciting and powerful way to almost try to reverse aging because you're trying to take cells back developmentally. You're taking possibly fully differentiated cells back to stem cells and then helping regenerate tissue, which one of the problems as we age is we start losing stem cells. So we have stem cell depletion, so we can no longer replace our tissues as we do when we're younger. And I think anyone who's had a scrape or been hurt in a fall or something knows this because if I fall and scrape my elbow and get a big bruise and my grandson falls, we repair our tissues at very, very different rates. It takes me days or weeks to recover, and my grandson's fine in two or three days. You can hardly see he had a scrape at all. So I think that's the thing that these guys want to do.

Venki Ramakrishnan (10:48):

And the problem is Yamanaka factors are cancer. Two of them are oncogenic factors, right? If you give Yamanaka factors to cells, you can take them all the way back to what are called pluripotent cells, which are the cells that are capable of forming any tissue in the body. So for example, a fertilized egg or an early embryo cells from the early embryo are pluripotent. They could form anything in the body. Now, if you do that to cells with Yamanaka factors, they often form teratomas, which are these unusual forms of cancer tumors. And so, I think there's a real risk. And so, what these guys say is, well, we'll give these factors transiently, so we'll only take the cells back a little ways and not all the way back to pluripotency. And that way if you start with

Svetlana Blitshteyn: On the Front Line With Long Covid and POTS

After finishing her training in neurology at Mayo Clinic, Dr. Svetlana Blitshteyn started a Dysautonomia Clinic in 2009. Little did she know what was in store many years later when Covid hit!

Ground Truths podcasts are on Apple and Spotify. The video interviews are on YouTube

Transcript with audio and external links

Eric Topol (00:07):

Well, hello, it's Eric Topol from Ground Truths, and I have with me a really great authority on dysautonomia and POTS. We will get into what that is for those who aren't following this closely. And it's Svetlana Blitshteyn who is a faculty member at University of Buffalo and a neurologist who long before there was such a thing as Covid was already onto one of the most important pathways of the body, the autonomic nervous system and how it can go off track. So welcome, Svetlana.

Svetlana Blitshteyn (00:40):

Thank you so much, Eric for having me. And I want to say it's a great honor for me to be here and just to be on the list with your other guests. It's remarkable and I'm very grateful and congratulations on being on the TIME100 Health list for influential people in 2024. And I am grateful for everything that you've done. As I mentioned earlier, I'm a big fan of your work before the pandemic and of course with Covid I followed your podcast and posts because you became the best science communicator and I'm very happy to see you being a strong advocate and thank you for everything you've done.

Eric Topol (01:27):

Well, that's so kind of you. And I think talking about getting things going before the pandemic, back in 2011, you published a book with Jodi Epstein Rhum called POTS - Together We Stand: Riding the Waves of Dysautonomia. And you probably didn't have an idea that there would be an epidemic of that more than a decade later, I guess, right?

Svetlana Blitshteyn (01:54):

Yeah, absolutely. Of course, SARS-CoV-2 is a new virus and we can technically say that Long Covid and post Covid complications could be viewed as a new entity. But practically speaking, we know that post-infectious syndromes have been happening for many decades. And so, the most common trigger for POTS happened to be infection, whether it was influenza or mononucleosis or Lyme or enterovirus. We knew this was happening. So I think it didn't take long for me and my colleagues to realize that we're going to be seeing a lot of patients with autonomic dysfunction after Covid.

On the Front Line

Eric Topol (02:40):

Well, one of the things that's important for having you on is you're on the front lines taking care of lots of patients with Long Covid and this postural orthostatic tachycardia syndrome (POTS). And I wonder if you could tell us what it's like to care for these patients because so many of them are incapacitated. As a cardiologist, I see of course some because of the cardiovascular aspects, but you are dealing with this on a day-to-day basis.

Svetlana Blitshteyn (03:14):

Yeah, absolutely. As early as April 2020 when everything was closed, I got a call from a young doctor in New York City saying that he had Covid and he couldn't recover, he couldn't return to the hospital. And his colleagues and cardiology attendings also had the same symptoms and the symptoms were palpitations, orthostatic intolerance, tachycardia, fatigue. Now, how he knew to contact me is that his sister was my patient with POTS before the Covid pandemic. So he kind of figured this looked like my sister, let me check this out. And it didn't take long for me to have a lot of patients from the early wave. And then fairly soon, I think within months I was thinking, we have to write this up because this is important. And to some of us it was not news, but I was sure that to many physicians and public health officials, this would be something new.

Svetlana Blitshteyn (04:18):

So because I'm a busy clinician and don't have a lot of time for publications, I had to recruit a graduate student from McMaster and together we had this paper out, which was the first and largest case series on post Covid POTS and other autonomic disorders. And interestingly, even though it came out I think in 2021, by the time it was published, it became the most citable paper for me. And so I think from then on organizations and societies became interested in the work that I do because prior to that, I must say, in this kind of a niche specialty, I don't think it was very popular or of interest to many.

How Did You Get Interested in Dysautonomia?

Eric Topol (05:06):

Yeah, so that's why I wanted to just take a step back with you Svetlana, because you had the foresight to be the founder and director of the Dysautonomia Clinic when a lot of people weren't in touch with this as an important entity. What prompted you as a neurologist to really zoom in on dysautonomia when you started this clinic?

Svetlana Blitshteyn (05:28):

Sure. So how I ended up in this field is kind of a convoluted road and the reasons are many, but one, I will say that I trained at Mayo Clinic where we received very good training on autonomic disorders and EMG, and returning back to Buffalo, I began working at the large multiple sclerosis clinic because Western New York has a high incidence of MS. And so, what they quickly realized in that clinic is that there was a subset of women who did not qualify for the diagnostic criteria of multiple sclerosis, yet they had a lot of the same symptoms and they were certainly very disabled. Now I recognized that these women had autonomic disorders of all sorts and small fiber neuropathy, and I think this population sort of grew and eventually I realized there is no one not only in Buffalo but the entire Western New York who is doing this work.

Svetlana Blitshteyn (06:34):

So I kind of fell into that. But another reason is actually more personal that I haven’t talked about. So years ago I was traveling to Toronto, Canada for a neurology meeting to present my big study on meningioma and hormone replacement therapy using Mayo Clinic database. And so, in that year, the study received top 10 noteworthy studies of the year award from the Society of Neuro-Oncology, and it was profiled in Reuters Health. Now, on the way back from the conference, I had the flu, and when I returned I could no longer walk the same hallways of the hospital where I walked previously. And no matter how hard I tried to push my body, we all do this in medicine, we push through, I just couldn’t do it. No amount of wishing or positive thinking. And so, I think that’s how I came to know personally the post-infectious syndromes. And I think it almost became a duality of experiencing this and also practicing it.

Eric Topol (07:52):

No, that’s really striking and it wasn’t so common to hear about this post flu, but certainly it changed in 2020. So how does a person with POTS typically present to you?

Clinical Presentation

Svetlana Blitshteyn (08:08):

So these are very important questions because what I want to stress is that though POTS is one of the most common autonomic disorders, even if you don’t have POTS by the diagnostic criteria, you may still have autonomic dysfunction and significant autonomic symptoms. How do they present? Well, they present like most Long Covid patients, the most common symptoms are orthostatic intolerance, fatigue, exercise intolerance, post exertional malaise, dizziness, tachycardia, brain fog. And these are common themes across the board in Long Covid patients, but also in pre-Covid post-acute infection syndrome patients. And you have to recognize because I think what I tell my colleagues is that oftentimes patients are not going to present to you saying, I have orthostatic intolerance. Many times they will say, I’m very tired. I can no longer go to the gym or when I go to the store, I have to be out of there in 15 minutes because the orthostatic intolerance symptoms come up.

Svetlana Blitshteyn (09:22):

So sometimes the patients themselves don’t recognize that and it’s up to us physicians to ask the right questions to get the information down. History is very important, knowing the pattern. And then of course, as I always say in all of my papers and lectures, you have to do a 10-minute stand test by measuring supine and standing blood pressure and heart rate on every Long Covid patient. And that’s how you spot those that have excessive postural tachycardia or their blood pressure dropping and so forth. So we have the tools. We don’t need fancy autonomic labs. We don’t even need a tilt table test. The diagnostic criteria for POTS is that you need to have either a 10-minute stand test or a tilt table test to get the diagnosis for POTS, orthostatic hypotension or even neurocardiogenic syncope. Now I think it's important to stress that even if a patient doesn't qualify, and let's say many patients with Long Covid will not elevate their heart rate by at least 30 beats per minute, it could be 20, it could be 25. These criteria are of course essential when we do research studies. But I think practically speaking, in patient care where everything is gray and nothing is black or white, especially in autonomic disorders, you really have to make a diagnosis saying, this sounds like autonomic dysfunction. Let me treat the patient for this problem.
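The arithmetic behind the stand test can be sketched in a few lines. This is an illustration of the calculation only, not a diagnostic tool: the 30 beats per minute threshold is the figure mentioned in the conversation, the function name and the sample readings are hypothetical, and the sketch looks at heart rate alone, leaving out blood pressure and every other part of a clinical assessment.

```python
def stand_test_summary(supine_hr, standing_hr, threshold_bpm=30):
    """Summarize the heart-rate change from a 10-minute stand test.

    supine_hr:     resting heart rate lying down, in beats per minute
    standing_hr:   list of heart-rate readings taken while standing
    threshold_bpm: increase used as the research criterion (30 bpm, per the conversation)
    """
    max_increase = max(standing_hr) - supine_hr
    return {
        "max_increase_bpm": max_increase,
        "meets_threshold": max_increase >= threshold_bpm,
    }


# Hypothetical readings: one patient above the threshold, one just below it.
print(stand_test_summary(70, [85, 95, 102, 98]))
print(stand_test_summary(70, [88, 92, 95]))
```

The second case is the one the speaker highlights: an increase of 25 falls short of the research criterion, so the sketch reports the raw increase alongside the flag rather than reducing the result to a yes or no.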

Eric Topol (11:07):

Well, you brought up something that’s really important because doctors don’t have much time and they’re impatient. They don’t wait 10 minutes to do a test to check your blood pressure. They send the patients for a tilt table, which nobody likes to have that test done, and it’s an unnecessary added appointment and expense and whatnot. So that’s a good tip right there that you can get the same information just by checking the blood pressure and heart rate on standi

Kate Crawford: A Leading Scholar and Conscience for A.I.

“We haven't invested this much money into an infrastructure like this really until you go back to the pyramids”—Kate Crawford

Transcript with links to audio and external links. Ground Truths podcasts are on Apple and Spotify. The video interviews are on YouTube

Eric Topol (00:06):

Well, hello, this is Eric Topol with Ground Truths, and I'm really delighted today to welcome Kate Crawford, who we're very lucky to have as an Australian here in the United States. And she's multidimensional, as I've learned, not just a scholar of AI, all the dimensions of AI, but also an artist, a musician. We're going to get into all this today, so welcome Kate.

Kate Crawford (00:31):

Thank you so much, Eric. It's a pleasure to be here.

Eric Topol (00:34):

Well, I knew of your work coming out of the University of Southern California (USC) as a professor there and at Microsoft Research, and I'm only now learning about all these other things that you've been up to including being recognized in TIME 2023 as one of 100 most influential people in AI and it's really fascinating to see all the things that you've been doing. But I guess I'd start off with one of your recent publications in Nature. It was a world view, and it was about generative AI is guzzling water and energy. And in that you wrote about how these large AI systems, which are getting larger seemingly every day are needing as much energy as entire nations and the water consumption is rampant. So maybe we can just start off with that. You wrote a really compelling piece expressing concerns, and obviously this is not just the beginning of all the different aspects you've been tackling with AI.

Exponential Growth, Exponential Concerns

Kate Crawford (01:39):

Well, we're in a really interesting moment. What I've done as a researcher in this space for a very long time now is really introduce a material analysis of artificial intelligence. So we are often told that AI is a very immaterial technology. It's algorithms in the cloud, it's objective mathematics, but in actual fact, it comes with an enormous material infrastructure. And this is something that I took five years to research for my last book, Atlas of AI. It meant going to the mines where lithium and cobalt are being extracted. It meant going into the Amazon fulfillment warehouses to see how humans collaborate with robotic and AI systems. And it also meant looking at the large-scale labs where training data is being gathered and then labeled by crowd workers. And for me, this really changed my thinking. It meant that going from being a professor for 15 years focusing on AI from a very traditional perspective where we write papers, we're sitting in our offices behind desks, that I really had to go and do these journeys, these field trips, to understand that full extractive infrastructure that is needed to run AI at a planetary scale.

(02:58):

So I've been keeping a very close eye on what would change with generative AI, and what we've seen, particularly in the last two years, has been an extraordinary expansion of the three core elements that I really write about in Atlas: the extraction of data, of non-renewable resources, and of course hidden labor. So what we've seen, particularly on the resources side, is a gigantic spike both in terms of energy and water, and that's often the story that we don't hear. We're not aware, when we're told about the fact that there are gigantic hundred billion computers that are now being developed for the next stage of generative AI, that this has an enormous energy and water footprint. So I've been researching that along with many others who are now increasingly concerned about how we might think about AI more holistically.

Eric Topol (03:52):

Well, let's go back to your book, which is an extraordinary book, the AI Atlas, and how you dissected the power, politics, and planetary costs. That has won awards and it was a few years back, and I wonder, so much has changed since then. I mean, ChatGPT in late 2022 caught everybody off guard who wasn't into this, not knowing that this had been incubating for a number of years, and as you said, these base models are just extraordinary in every parameter you can think about, particularly the computing resource and consumption. So your concerns were of course registered then. Have they gone to exponential growth now?

Kate Crawford (04:45):

I love the way you put that. I think you're right. I think my concerns have grown exponentially with the models. But I was like everybody else: even though I've been doing this for a long time and I had something of a heads up in terms of where we were moving with transformer models, I was also quite taken aback at the extraordinary uptake of ChatGPT back in November 2022. In fact, gosh, it still feels like yesterday; it's been such an extraordinary timescale. But looking at that shift to a hundred million users in two months, and then the sort of rapid competition that was emerging from the major tech companies, that I think really took me by surprise: the degree to which everybody was jumping on the bandwagon, applying some form of large language model to everything and anything. Suddenly the hammer was being applied to every single nail.

(05:42):

And in all of that sound and fury and excitement, I think there will be some really useful applications of these tools. But I also think there's a risk that we apply it in spaces where it's really not well suited, that we are not looking at the societal and political risks that come along with these approaches, particularly next token prediction as a way of generating knowledge. And then finally this bigger set of questions around what is it really costing the planet to build these infrastructures that are really gargantuan? I mean, as a species, we haven't invested this much money into an infrastructure like this really until you go back to the pyramids. You've really got to go very far back to see that type of just gargantuan spending in terms of capital, in terms of labor, in terms of all of the things that are required to really build these kinds of systems. So for me, that's the moment that we're in right now, and perhaps here together in 2024, we can take a breath from that extraordinary 18-month period and hopefully be a little more reflective on what we're building and why and where it will be best used.

Propagation of Biases

Eric Topol (06:57):

Yeah. Well, there are so many aspects of this that I'd like to get into with you. I mean, one, of course, as a keen observer and activist in this whole space, you've made I think a very clear point about how our culture is mirrored in our AI, that is, our biases, and people are of course very quick to blame AI per se, but it seems like it's a bigger problem than just that. Maybe you could comment about that. Obviously biases, and the propagation of them, are a profound concern. Where do you see the problem is and how can it be attacked?

Kate Crawford (07:43):

Well, it is an enormous problem, and it has been for many years. I was first really interested in this question in the era that was known as the big data era. So we can think about the mid-2000s, and I really started studying large scale uses of data in scientific applications, but also in what you call social scientific settings using things like social media to detect and predict opinion, movement, the way that people were assessing key issues. And time and time again, I saw the same problem, which is that we have this tendency to assume that with scale comes greater accuracy without looking at the skews from the data sources. Where is that data coming from? What are the potential skews there? Is there a population that's overrepresented compared to others? And so, I began very early on looking at those questions. And then when we had very large-scale data sets start to emerge, like ImageNet, which was really perhaps the most influential dataset behind computer vision that was released in 2009, it was used widely, it was freely available.

(09:00):

That version was available for over a decade and no one had really looked inside it. And so, working with Trevor Paglen and others, we analyzed how people were being represented in this data set. And it was really quite extraordinary because initially people are labeled with terms that might seem relatively unsurprising, like this is a picture of a nurse, or this is a picture of a doctor, or this is a picture of a CEO. But then you look to see who is the archetypical CEO, and it's all pictures of white men, or if it's a basketball player, it's all pictures of black men. And then the labeling became more and more extreme, and there are terms like, this is an alcoholic, this is a corrupt politician, this is a kleptomaniac, this is a bad person. And then a whole series of labels that are simply not repeatable on your podcast.

(09:54):

So in finding this, we were absolutely horrified. And again, to know that so many AI models had trained on this as a way of doing visual recognition was so concerning because, of course, very few people had even traced who was using this model. So trying to do the reverse engineering of where these really problematic assumptions were being built in, hardcoded into how AI models see and interpret the world, that was a giant unknown and remains to this day quite problematic. We did a recent study that just came out a couple of months ago looking at one of the biggest data sets behind generative AI systems that are doing text to image generation. It's called LAION-5B, which stands for 5 billion. It has 5 billion images and text captions drawn from the internet. And you might think, as you said, this will just mirror societal biases, but it's actually far more weird than you

Akiko Iwasaki: The Immunology of Covid and the Future

If there’s one person you’d want to talk to about immunology, the immune system and Covid, holes in our knowledge base about the complex immune system, and where the field is headed, it would be Professor Iwasaki. And add to that the topic of Women in Science. Here’s our wide-ranging conversation.

Transcript with many external links and links to the audio, recorded 30 April 2024

Eric Topol (00:06):

Hello, it's Eric Topol and I'm really thrilled to have my friend Akiko Iwasaki from Yale, and before I start talking with Akiko, I just want to mention there aren't too many silver linings of the pandemic, but one for me was getting to know Professor Iwasaki. She is my go-to immunologist. I've learned so much from her over the last four years and she's amazing. As you may know, she was just recently named one of the most influential people in the world by TIME100 [and also recognized this week in TIME 100 Health]. And besides that, she's been elected to the National Academy of Medicine and the National Academy of Sciences. She's the president of the American Association of Immunologists and she's a Howard Hughes principal investigator. So Akiko, it's wonderful to have you join in an extended discussion of things that we have of mutual interest.

Akiko Iwasaki (01:04):

Thank you so much, Eric, for having me. I equally appreciate all of what you do, and I follow your blog and tweets and everything. So thank you Eric.

Eric Topol (01:14):

Well, you are a phenom. I mean just, that's all I can say because I think it was so appropriate that TIME recognize your contributions, not just over the pandemic, but of course throughout your career, a brilliant career in immunology. I thought we'd start out with our topic of great interest on Long Covid. You've done seminal work here and this is an evolving topic obviously. I wonder what your latest thoughts are on the pathogenesis and where things are headed.

Long Covid

Akiko Iwasaki (01:55):

Yeah, so as I have been saying throughout the pandemic, I think that Long Covid is not one disease. It's a collection of multiple diseases that are sort of ending up in similar sets of symptoms. Obviously, there are over 200 symptoms and not everyone has the same set of symptoms, but what we are going for is trying to understand the disease drivers, so persistent viral infection is one of them. There is overwhelming evidence for that theory now, all the way from autopsy and biopsy studies to looking at peripheral blood RNA signatures as well as circulating spike protein and nucleocapsid proteins that are detected in people with Long Covid. Now whether that persistent virus or remnants of virus is driving the disease itself is unclear still. And that's why trials like the one that we are engaging in with Harlan Krumholz on Paxlovid should tell us what percentage of the people are suffering from that type of driver and whether antivirals like Paxlovid might be able to mitigate those. If I may, I'd like to talk about three other hypotheses.

Eric Topol (03:15):

Yeah, I'd love for you to do that.

Akiko Iwasaki (03:18):

Okay, great. So the second hypothesis that we've been working on is autoimmune disease. And so, this is clearly happening in a subset of people; again, it's a heterogeneous disease. But we can actually not only look at reactogenicity of antibodies from people with Long Covid, we can also transfer IgG from patients with Long Covid into an animal, a healthy animal, and really measure outcomes of pathogenesis. So that's functional evidence that antibodies in some people with Long Covid are really actually causing some of the damage that is occurring in vivo. And the third hypothesis is the reactivation of herpes viruses. So many of us adults have multiple latent herpes virus family members that are just dormant and are not really causing any pathologies. But in people with Long Covid, we're seeing elevated reactivation of viruses like Epstein-Barr virus (EBV) or Varicella-zoster virus (VZV), and that may again be just a signature of Long Covid, but it may also be driving some of the symptoms that people are suffering from.

(04:32):

So again, we see that signature over and over, not just in our group, but in multiple other groups, Michael Peluso's group, Jim Heath, and many others. So that's also emerging evidence from multiple groups showing that. And finally, we think that inflammation that occurs during the acute phase can sort of chronically change some tissue tone. For instance, in the brain, with Michelle Monje’s team, we developed a sort of localized mild Covid model of infection and showed that changes in microglia can be seen seven weeks post infection even though the virus is completely gone. So that means that inflammation that's established as a result of this initial infection can have prolonged consequences and sequelae within the person, and that may also be driving disease. And Eric, the reason we need to understand these diseases separately is not only for diagnostic purposes, but also for therapeutic purposes, because to target a persistent virus is a very different approach from targeting autoantibodies, for example.

Eric Topol (05:49):

Well, that's great. There's a lot to unpack there as you laid out four distinct paths that could result in the clinical syndrome and sequelae. I think you know I had the chance to have a really fun conversation with Michelle about the joint work that you've done, and she reminded me how she made a cold call to you to start a collaboration, which I thought was fantastic. Look what that yielded. But yeah, this is fascinating because, as I think you're getting at, it may not be the same pathogenesis in any given individual, so that all these, and even others, might be operative. I guess maybe I'd first delve into the antibody story. As you're well aware, we see after people get Covid a higher rate of autoimmune diseases crop up, which is really interesting because it seems to rev up a self-directed immune response. And this I think many people haven't really noted yet, although obviously you're well aware of this: it's across all the different autoimmune diseases, connective tissue disease, not just one in particular. And, as you say, the idea that you could take the blood from a person suffering from Long Covid and give it to an experimental animal model and be able to recapitulate some of the abnormalities, it's really pretty striking. So the question I guess is, if you were to do plasmapheresis and try to basically expunge these autoantibodies, wouldn't you expect people to have some symptomatic benefit pretty rapidly, or is it just that the process is already far from the initiating step?

Akiko Iwasaki (07:54):

That's a great question. Plasmapheresis may be able to transiently improve the person if they're suffering from these autoantibody-mediated diseases. People have reported, for example, that IVIG treatment has dramatically improved their symptoms, but not in everybody. So it's really critical to understand who's suffering from this particular driver and appropriately treat those people. And there are many other very effective therapies in the autoimmune disease field that can be repurposed for treating these patients as well.

Eric Topol (08:34):

The only clinical trial that has clicked so far, interestingly, came out of Hong Kong with different types of ways to manipulate the gut microbiome, which again, you know better than me is a major modulator of our immune system response. What are your thoughts about taking advantage of that way to somehow modulate this untoward immune response in people with this condition?

Akiko Iwasaki (09:07):

Yeah, so that is an exciting sort of development, and I don't mean to discount the importance of the microbiome at all. It's just that the drivers that I'm mentioning are something that can be directly linked to disease, but certainly dysbiosis and translocation of metabolites and the microbiome itself could trigger Long Covid as well. So it's something that we're definitely keeping our eyes on. And as you say, Eric, the immune system is in intimate contact with the gut microbiome and also the gut is in intimate contact with the brain. So there are a lot of connections that we really need to be paying attention to. So yeah, absolutely. This is a very exciting development.

Eric Topol (09:57):

And it is intriguing, of course, the reactivation of viruses. I mean, we’ve learned in recent years how important EBV is in multiple sclerosis (MS). The question I have for you on that pathway: is this just an epiphenomenon or do you actually think that could be a driving force in some people?

Akiko Iwasaki (10:19):

Yeah, so that's really hard to untangle in people. I mean, David Putrino and my team, we're planning a clinical trial using Truvada. Truvada obviously is an HIV drug, but it has reported antiviral activity against Epstein-Barr virus (EBV) and others. So potentially we can try to interrogate that in people, but we're also developing mouse models that can sort of recapitulate EBV-like viral reactivation to see whether there's any sort of causal link between the reactivation and the disease process.

Eric Topol (10:57):

Right now, recently there's been a bunch of anecdotes of people who get the glucagon-like peptide-1 (GLP-1) drugs, which have a potent anti-inflammatory effect, both systemic and in the brain. I'd love to test these drugs, but of course the companies that make them have other interests outside of Long Covid. Do you think there's potential for a drug lik

Aviv Regev: The Revolution in Digital Biology

“Where do I think the next amazing revolution is going to come? … There’s no question that digital biology is going to be it. For the very first time in our history, in human history, biology has the opportunity to be engineering, not science.”—Jensen Huang, NVIDIA CEO

Aviv Regev is one of the leading life scientists of our time. In this conversation, we cover the ongoing revolution in digital biology that has been enabled by new deep knowledge on cells, proteins and genes, and the use of generative A.I.

Transcript with audio and external links

Eric Topol (00:05):

Hello, it's Eric Topol with Ground Truths and with me today I've really got the pleasure of welcoming Aviv Regev, who is the Executive Vice President of Research and Early Development at Genentech, having been 14 years a leader at the Broad Institute and who I view as one of the leading life scientists in the world. So Aviv, thanks so much for joining.

Aviv Regev (00:33):

Thank you for having me and for the very kind introduction.

The Human Cell Atlas

Eric Topol (00:36):

Well, there is no question in my view that that is the truth, and I wanted to have a chance to visit a few of the principal areas that you have been nurturing over many years. First of all, the Human Cell Atlas (HCA): the 37 trillion cells in our body, approximately, a little affected by size and gender and whatnot. But you founded the Human Cell Atlas, and maybe you can give us a little background on what you were thinking, forward thinking of course, when you and your colleagues initiated that big, big project.

Aviv Regev (01:18):

Thanks. Co-founded together with my very good friend and colleague, Sarah Teichmann, who was at the Sanger and just moved to Cambridge. I think our community at the time, which was still small at the time, really had the vision that has been playing out in the last several years, which is a huge gratification that if we had a systematic map of the cells of the body, we would be able both to understand biology better as well as to provide insight that would be meaningful in trying to diagnose and to treat disease. The basic idea behind that was that cells are the basic unit of life. They're often the first level at which you understand disease as well as in which you understand health and that in the human body, given the very large number of individual cells, 37.2 trillion give or take, and there are many different characteristics.

(02:16):

Even though biologists have been spending decades and centuries trying to characterize cells, they still had a haphazard view of them and that the advancing technology at the time – it was mostly single cell genomics, it was the beginnings also of spatial genomics – suggested that now there would be a systematic way, like a shared way of doing it across all cells in the human body rather than in ways that were niche and bespoke and as a result didn't unify together. I will also say, and if you go back to our old white paper, you will see some of it that we had this feeling because many of us were computational scientists by training, including both myself and Sarah Teichmann, that having a map like this, an atlas as we call it, a data set of this magnitude and scale, would really allow us to build a model to understand cells. Today, we call them foundational models or foundation models. We knew that machine learning is hungry for these kinds of data and that once you give it to machine learning, you get amazing things in return. We didn't know exactly what those things would be, and that has been playing out in front of our eyes as well in the last couple of years.

Spatial Omics

Eric Topol (03:30):

Well, that gets us to the topic you touched on the second area I wanted to get into, which is extraordinary, which is the spatial omics, which is related to the ability to the single cell sequencing of cells and nuclei and not just RNA and DNA and methylation and chromatin. I mean, this is incredible that you can track the evolution of cancer, that the old word that we would say is a tumor is heterogeneous, is obsolete because you can map every cell. I mean, this is just changing insights about so much of disease health mechanisms, so this is one of the hottest areas of all of life science. It's an outgrowth of knowing about cells. How do you summarize this whole era of spatial omics?

Aviv Regev (04:26):

Yeah, so there's a beautiful sentence in In Search of Lost Time by Marcel Proust that I'm going to mess up in paraphrasing, but it is roughly that going on new journeys is not about actually going somewhere physically but looking with new eyes, and I butchered the quote completely. [See below for actual quote.] I think that is actually what single cells and then spatial genomics, or spatial omics more broadly, has given us. It's the ability to look at the same phenomenon that we looked at all along, be it cancer or animal development or homeostasis in the lung or the way our brain works, but having new eyes in looking, because these new eyes are not just seeing more of something we've seen before, but actually seeing things that we couldn't realize were there before. It starts with finding cells we didn't know existed, but it's also the processes that these cells undergo, the mechanisms that actually control that, the causal mechanisms that control that, and especially in the case of spatial genomics, the ways in which cells come together.

(05:43):

And so we often like to think about the cell because it's the unit of life, but in a multicellular organism we just as much have to think about tissues and after that organs and systems and so on. In a tissue, you have this amazing orchestration of the interactions between different kinds of cells, and this happens in space and in time, and we're able to look at this. In biology, structure is often tightly associated with function, so the structure of the protein to the function of the protein. In the same way, the way in which things are structured in tissue, which cells are next to each other, what molecules they are expressing, how they are physically interacting, really tells us how they conduct the business of the tissue. When the tissue functions well, it is this multicellular circuit that performs this amazing thing known as homeostasis.

(06:36):

Everything changes and yet the tissue stays the same and functions, and in disease, of course, when these connections break, when they're not done in the right way, you end up with pathology, which is of course something that even historically we have always looked at at the level of the tissue. So now we can see it in a much better way, and as we see it in a better way, we resolve better things. Yes, we can understand better the mechanisms that underlie the resistance to therapeutics. We can follow a temporal process like cancer as it unfortunately evolves. We can understand how autoimmune disease plays out with many cells that are actually bent out of shape in their interactions. We can also follow magnificent things like how we start from a single cell, the fertilized egg, and we become a 37.2 trillion cell marvel. These are all things that this ability to look in a different way allows us to do.

Eric Topol (07:34):

It's just extraordinary. I wrote at Ground Truths about this. I gave all the examples at that time, and now there's about 50 more in the cardiovascular arena, knowing with single cell analysis of the pineal gland the explanation of why people with heart failure have sleep disturbances. I mean, that's just one of so many of these new insights; it's really just so remarkable. Now we get to the current revolution, and I wanted to read to you a quote that I have.

Digital Biology

Aviv Regev (08:16):

I should have prepared mine. I did it off the top of my head.

Eric Topol (08:20):

It's actually from Jensen Huang at NVIDIA about the digital biology [at top of the transcript] and how it changes the world and how you're changing the world with AI and lab in the loop and all these things going on in three years that you've been at Genentech. So maybe you can tell us about this revolution of AI and how you're embracing it to have AI get into positive feedbacks as to what experiment to do next from all the data that is generated.

Aviv Regev (08:55):

Yeah, so Jensen and NVIDIA are actually great partners for us in Genentech, so it's fun to contemplate any quote that comes from there. I'll actually say this has been in the making since the early 2010s. 2012 I like to reflect on because I think it was a remarkable year for what we're seeing right now in biology, specifically in biology and medicine. In 2012, we had the beginnings of really robust protocols for single cell genomics, the first generation of those, we had CRISPR happen as a method to actually edit cells, so we had the ability to manipulate systems in a much better way than we had before, and deep learning happened in the same year as well. Wasn't that a nice year? But sometimes people only realize the magnitude of the year that happened years later. I think the deep learning impact people realized first, then the single cells, and then the CRISPR, then the single cells.

(09:49):

So in order maybe a little bit, but now we're really living through what that promise can deliver for us. It's still the early days of that, of the delivery, but we are really seeing it. The thing to realize there is that for many, many of the problems that we try to solve in biomedicine, the problem is bigger than we would ever be able to perform experiments or collect data. Even if we had the genomes of all the people in the world, all billions and billions of them, that's just a smidge compared to all of the ways in which their common var
