Putting AI Medicine into Practice

There’s been a lot of talk about technology, and AI, deep learning, and machine learning specifically, finally reaching the healthcare sector. But AI in medicine isn’t new; it has been around since the 1960s. And yet we haven’t seen it effect a true change, or even become a real part of our doctors’ offices, let alone routine healthcare services. So: what’s different now? And what does AI in medicine look like, practically speaking, whether it’s ensuring the best data, versioning software for healthcare, or other aspects?

In this episode of the a16z Podcast, Brandon Ballinger, CEO of Cardiogram; Mintu Turakhia, cardiologist at Stanford and Director of the Center for Digital Health; and Vijay Pande, general partner and head of the a16z bio fund, talk with Hanne Winarsky about everything from where we will first see AI in healthcare (diagnosis, treatment, or system management) to what it will take for it to succeed. Will we perhaps see a “levels of AI” framework for doctors, as we have for autonomous cars?

Show Notes

Discussion of how AI has been used so far in medicine [0:45] and its potential for the future [3:55]
Questions about data sets [7:46] and how AI might be scaled [12:45]
Unknowns about integrating AI into medicine [15:01], and a discussion of incentives for making the transition [21:04]

Transcript

Hanne: Hi, and welcome to the “a16z Podcast.” I’m Hanne, and today we’re talking about AI in medicine. But we want to talk about it in a really practical way. What it means to use it in practice and in a medical practice, what it means to build medical tools with it — but also what creates the conditions for AI to really succeed in medicine, and how we design for those conditions, both from the medical side and from the software side.

The guests joining for this conversation, in the order in which you will hear their voices, are Mintu Turakhia, a cardiologist at Stanford and director of the Center For Digital Health; Brandon Ballinger, CEO and founder of Cardiogram, a company that uses heart rate data to predict and prevent heart disease; and Vijay Pande, a general partner here at a16z and head of our bio fund.

So, let’s maybe just do a quick breakdown of what we’re actually talking about when we talk about introducing AI to medicine. What does that actually mean? How will we actually start to see AI intervene in medicine and in hospitals and in waiting rooms?

AI’s history and its future potential

Dr. Turakhia: AI is not new to medicine. Automated systems in healthcare have been described since the 1960s. And they went through various iterations of expert systems and neural networks and [were] called many different things.

Hanne: In what way would those show up in the ’60s and ’70s?

Dr. Turakhia: So, at that time, there were no high resolutions. There weren’t too many sensors. And it was about a synthetic brain that could take what a patient describes as the inputs, and what a doctor finds on the exam as the inputs.

Hanne: Using verbal descriptions?

Dr. Turakhia: Yeah. Basically words. People created what are called ontologies and classification structures. You put in the ten things you felt, and a computer would spit out the top ten diagnoses in order of probability. And even back then, they were outperforming average physicians. So, this is not a new concept.

Hanne: So, basically doing what hypochondriacs do with Google today, but verbally.

Dr. Turakhia: Right. So, Google is, in some ways, an AI expression of that, where it’s actually used ongoing inputs and classification to do that over time. [It’s a] much more robust neural network, so to speak.

Brandon: So, an interesting case study is the MYCIN system, which is from 1978, I believe. And so, this was an expert system trained at Stanford. It would take inputs that were just typed in manually, and then it would essentially try to predict what a pathologist would show. And it was put to the test against five pathologists, and it beat all five of them.

Hanne: And it was already outperforming.

Brandon: It was already outperforming doctors, but when you go to the hospital, they don’t use MYCIN or anything similar. And I think this illustrates that sometimes, the challenge isn’t just the technical aspects or the accuracy, it’s the deployment path. Some of the issues around there are — okay, is there a convenient way to deploy this to actual physicians? Who takes the risk? What’s the financial model for reimbursement? And so, if you look at the way the financial incentives work, there are some things that are backwards. For example, if you think about a hospital from the CFO’s perspective, a misdiagnosis actually earns them more money…

Hanne: What?

Brandon: …because when you misdiagnose, you do follow-up tests, right? And those — and our billing system is fee-for-service. So, every little test that’s done is billed for.

Hanne: But nobody wants to be giving out wrong diagnoses, so where’s the incentive? The incentive is just in the system — the money that results from it.

Brandon: No one wants to give an incorrect diagnosis. On the other hand, there’s no budget to invest in better diagnoses.

Hanne: In making sure it doesn’t happen.

Brandon: I think that’s been part of the problem. And so, things like fee-for-value are interesting because now, you’re paying people for an accurate diagnosis or for a reduction in hospitalizations, depending on the exact system. And so, I think that’s the case where accuracy is rewarded with a greater payment, which sets up the incentives so that AI can actually win in this circumstance.

Dr. Turakhia: Where I think AI has come back at us with a force is — it came to healthcare as a hammer looking for a nail. What we’re trying to figure out is where you can implement it easily and safely, with not too much friction and with not a lot of physicians going crazy, and where it’s going to be very, very hard. And that, I think, is the challenge in terms of building, developing these technologies, commercializing them, and seeing how they scale. And so, the use cases really vary across that spectrum.

Brandon: Yeah, I think of there being a couple of different cases where AI can intervene. One is to substitute for what doctors do already, and people use radiology as an example. The other area, which I think is maybe more interesting, is that AI can complement doctors by doing what they can’t do already.

So, it would be possible for a doctor to, say, read an ECG and tell you whether you’re in an abnormal heart rhythm. No doctor right now can read your Fitbit data and tell you whether you have a condition like sleep apnea. I mean, if you look at your own data, you can kind of see restful sleep as real structured REM cycles, so you can see some patterns there. That said, the gold standard that a sleep doctor would use is a sleep study, where they wire you up with six different sensors and tell you to sleep naturally. There’s a big difference here between the very noisy consumer sensors that may be less interpretable, and what a doctor is used to seeing.

Vijay: Or it could be that the data is on the device, but the analysis can’t be done yet. Maybe the analysis needs a gold standard data set to compare to. There are a lot of missing parts beyond just gathering the data from the patient in the first place.

Dr. Turakhia: I think there’s some inherent challenges in the nature of the beast. Healthcare is unpredictable. It’s stochastic. You can predict a cumulative probability — like a probability of getting condition X, or diagnosis X, over a time horizon of 5 or 10 years — but we are nowhere near saying, “You’re going to have a heart attack in the next 3 days.” Prediction is very, very, very difficult, and so where prediction might have a place is where you’re getting high fidelity data, whether it’s from a wearable or a sensor.

It’s so dense that a human can’t possibly do it. Like, a doctor’s not going to look at it. And two, it’s relatively noisy. Inaccurate, poor classifiers, missing — periods where you don’t have this continuous data that you really want for prediction. In fact, the biggest predictor of someone getting ill with a lot of wearable studies is missing data, because they were too sick to wear the sensor.

Hanne: Oh, so the very absence of the data is a big indicator.

Dr. Turakhia: Yes. Exactly.
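The point about missing data can be made concrete with a minimal sketch, assuming a wearable that is expected to deliver a fixed number of samples per day. The function names, the sample counts, and the 50% threshold are all hypothetical, and a long gap is only a signal worth a look: on its own it says nothing about why the sensor was off.

```python
from datetime import date, timedelta

def wear_coverage(readings, start, end, expected_per_day):
    """Return {day: fraction of expected samples actually received}.

    `readings` is an iterable of dates, one entry per received sample.
    """
    counts = {}
    for d in readings:
        counts[d] = counts.get(d, 0) + 1
    coverage = {}
    day = start
    while day <= end:
        coverage[day] = min(counts.get(day, 0) / expected_per_day, 1.0)
        day += timedelta(days=1)
    return coverage

def longest_gap(coverage, threshold=0.5):
    """Length in days of the longest run of days below `threshold` coverage."""
    longest = current = 0
    for day in sorted(coverage):
        if coverage[day] < threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest

# Hypothetical week: 4 samples expected per day, nothing on Jan 3 to Jan 5.
start = date(2024, 1, 1)
end = date(2024, 1, 7)
readings = [start + timedelta(days=i) for i in (0, 1, 5, 6) for _ in range(4)]

coverage = wear_coverage(readings, start, end, expected_per_day=4)
gap_days = longest_gap(coverage)  # usable as a model feature alongside the signal itself
```

Treating the gap length as an input feature, rather than silently dropping the empty days, is one way to let a model use the absence of data as information.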

Hanne: That’s so interesting. I don’t feel well enough to put on my what-have-you, and that means something’s not right.

Dr. Turakhia: Possibly, or you’re on vacation. And so that’s the problem. That’s the other challenge of AI: context. And so, what are some of the simpler problems, where you have clean data structures, you have less noise, you have very clear training for these algorithms? And I think that’s where we’ve seen AI really pick up, in imaging-like studies. It’s a closed-loop diagnosis. You know, there is a nodule on an x-ray that is cancer, based on a biopsy proven later in the training dataset, or there isn’t. In the case of an EKG, we already have expert systems that can give us a provisional diagnosis on an EKG. They’re not really learning. And so, that’s a great problem, because most arrhythmias don’t need context. You can look at it and make the diagnosis.

Hanne: We don’t need them to learn, so that’s why it’s good to use right away, to apply this technology immediately.

Dr. Turakhia: You don’t need everything. You don’t need to mine the EMR to get all this other stuff. You can look at the image and say, “Is it probably — does it have a diagnosis or does it not?” And so, imaging of the retina, imaging of skin lesions, x-rays, MRIs, echocardiograms, EKGs — that’s where we’re really seeing AI pick up.

The importance of big data

Brandon: I sort of divide the problems into inputs and outputs. We talked a little bit about some of the inputs that have become newly available, like EMR and imaging data. I think it’s also interesting to think about what the outputs of an AI algorithm would be. And these examples are self-contained, well-defined outputs that fit into the existing medical system. But I think it’s also interesting to imagine what could happen if you were to reinvent the entire medical system with this assumption that we have a lot of data — intelligence is artificial, and therefore cheap — so we can do continuous monitoring. So, one of the things I think about is, “What are the gaps of people who do not have access to EKGs?” I’ve actually never had an EKG done aside from the ones I do myself. So, and most people in the U.S. — you get your first EKG when you turn 65 as part of your Medicare checkup, and they won’t reimburse for anything after that.

Hanne: Oh, wow, I didn’t realize it’s so late.

Brandon: My dad’s an excavator, so he digs foundations for houses, and he hasn’t seen a doctor in 20 years. And if he leaves a job site, the entire job site would shut down. So, it’s hard for some people, I think, to go into the doctor’s office between the hours of 9:00 a.m. and 5:00 p.m. If you look at that in aggregate, only about half of people in the U.S. have a primary care physician at all, which seems astonishingly low, but that’s the fact. There’s a gap: about a third of people with diabetes don’t realize they have it, and about a fifth of people with hypertension. For AFib, it’s 30% or 40%. For sleep apnea, it’s like 80%.

Vijay: I think it’s one thing just finding out but not being able to do anything about it, but the actionable aspect, I think, really is a huge game-changer. It means that you can have both better outcomes for patients, and, in principle, lower costs for payers.

Hanne: Right. And these are areas where there are clear ways of addressing these specific conditions.

Dr. Turakhia: I will take a little bit of a different view here, which is that I don’t know if AI — artificial intelligence — is needed for earlier and better detection and treatment. To me, that may be a data collection issue.

Hanne: How is that different from what we’re saying about finding it early? How can that not be good?

Dr. Turakhia: Because that may have to do with getting sensors out of hospitals and getting them to patients. And that’s not inherently an AI problem. It could be a last mile AI problem — so that if you want to scale the ability to get this stuff. So, let’s say we get to a point where our bathroom tiles have built-in EKG sensors and scales, and the data is just collected while we brush our teeth. It’s the sensing technology that may detect things discreetly, like an arrhythmia. You may not necessarily need intelligence, but who’s going to look at the data? And so that’s a scaling issue.

Vijay: The AI could look at the data. And the other thing is, if you’re using this as screening, you want to make the accuracy as high as possible to avoid false positives. And AI would have a very natural role there too.

Hanne: But it’s interesting that you’re saying it’s not necessarily about the analysis, it’s about where the data comes from and when.

Dr. Turakhia: I think there are two different problems. There may be a point that it truly outperforms the cognitive abilities of physicians. And we have seen that with imaging so far, and one of the most promising aspects of the imaging studies and the EKG studies is that the confusion matrices (the way humans misclassify things) are recapitulated by the convolutional neural networks.

Hanne: Can you actually break that down for a second? So, what are those confusion matrices?

Dr. Turakhia: So, a confusion matrix is a way to graph the errors and which directions they go. For rhythms on an EKG, a rhythm that’s truly atrial fibrillation could get classified as normal sinus rhythm or atrial tachycardia, or supraventricular tachycardia. The names are not important. What’s important is that the algorithms are making the same types of mistakes that humans make. It’s not that it’s making a mistake that’s necessarily more lethal or just nonsensical, so to speak. It recapitulates humans.
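As a rough illustration of the confusion matrix described here, the sketch below counts (true rhythm, predicted rhythm) pairs for a physician and for a model on the same strips, then checks which kinds of error they share. The six readings are invented for the example, and the helper names are hypothetical.

```python
from collections import Counter

def confusion_matrix(true_labels, predicted_labels):
    """Count (true, predicted) pairs; off-diagonal entries are the errors."""
    return Counter(zip(true_labels, predicted_labels))

def error_pairs(matrix):
    """Set of (true, predicted) pairs where the reader was wrong."""
    return {pair for pair, n in matrix.items() if n > 0 and pair[0] != pair[1]}

# Hypothetical readings of the same six ECG strips.
truth     = ["AFib", "AFib",  "AFib", "Sinus", "Sinus", "SVT"]
physician = ["AFib", "Sinus", "AFib", "Sinus", "Sinus", "AFib"]
model     = ["AFib", "Sinus", "SVT",  "Sinus", "Sinus", "AFib"]

human_cm = confusion_matrix(truth, physician)
model_cm = confusion_matrix(truth, model)

# Error types both readers make, and error types only the model makes.
shared_errors = error_pairs(human_cm) & error_pairs(model_cm)
model_only_errors = error_pairs(model_cm) - error_pairs(human_cm)
```

In this toy data the model repeats both of the physician's error types and adds one of its own. An error type that appears only on the model's side is the kind of mistake that would deserve a closer look before trusting it at scale.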

To me, that’s the core thesis of AI in medicine. Because if you can show that you’re recapitulating human error, you’re not going to make it perfect. But that tells you that, in check and with control, you can allow this to scale safely since it’s liable to do what humans do. And so, now you’re automating tasks that, you know — I’m a cardiologist, I’m an electrophysiologist, but I don’t enjoy reading 400 ECGs when it’s my week to read them.

Hanne: So, you’re saying it doesn’t have to be better — it just has to be making the same kinds of mistakes to feel that you can trust the decision-making.

Dr. Turakhia: Right. You dip your toe in the water by having it be assistive, and then at some point, we as a society will decide if it can go fully auto. Fully autonomous without a doctor in the loop. That’s a societal issue. That’s not a technical hurdle at this point.

Hanne: Right.

Vijay: Well, you can imagine, just as — let’s say, self-driving cars, you have different levels of autonomy. It’s not nothing versus everything.

Dr. Turakhia: It’s not.

Vijay: You can imagine level one, level two, level four, level five in self-driving cars. I think that would be the most natural way because we wouldn’t want to go from nothing to everything.

Dr. Turakhia: Exactly. And just like a self-driving car, we as a society have to define who’s taking that risk on. You can’t really sue a convolutional neural network, but you might be able to make a claim against the physician, the group, the hospital that implements it. You know, how does that shake out?

Hanne: To figure out, literally, how to insure against these kinds of errors.

Brandon: I think the way you think about some of these error cases depends on whether the AI is substituting for part of what a doctor does today, or if the AI is doing something that’s truly novel. I think, in the novel cases, you might not actually care whether it makes mistakes that would be similar to humans.

Hanne: That’s an interesting point, because it’s doing something that we couldn’t achieve. What kinds of novel cases like that can you imagine?

Brandon: Wearables are an interesting case, because they’ll generate about 2 trillion data points this year. So, there’s no cardiologist or team of cardiologists who could even possibly look at those. That’s a case where you can actually invert maybe the way the medical system works today. Rather than being reactive to symptoms, you can actually be proactive, and the AI can be essentially something that tells you to go to the doctor rather than something that the doctor uses when you’re already there.

Vijay: Let’s take radiology as an example, where you could have one level where it’s as good as a common doctor, another level where it’s as good as the consensus of doctors.

Hanne: Right.

Vijay: Another level is that it’s not just using the labels [that a] radiologist would say on the image. It’s using a higher-level gold standard. It’s predicting what the biopsy would say. And so, now you’re doing something that…

Hanne: Which would be into your kind of novel, plus…

Vijay: Yeah, something that no human being could do. It can do that because in principle, it could fuse the data from the image, and maybe blood work, and other things that are easier to get and much less risk inducing than removing tissue in a biopsy.

Hanne: So, pulling those multiple streams of information into one and sort of synthesizing them is another area that…

Vijay: Yeah. It’s very difficult for a human being to do, and very natural for a computer.

The unknowns of AI in healthcare

Dr. Turakhia: It is very natural, but I think we need a couple of things to get there. We need really dense, high-quality data to train. And the more data you put in a model — I mean, so, machine learning, by definition, is statistical overfitting, and sometimes…

Vijay: Well, actually, I think that’s wrong. Machine learning done poorly — I mean, it’s like saying driving is driving a car off a cliff. Poor driving is poor driving, but machine learning tries to avoid statistical overfitting.

Dr. Turakhia: It does. My point is that one of the unknowns with any model, it doesn’t matter if it’s machine learning or regression or a risk score, is calibration. And as you start including fuzzy and noisy data elements in there — first of all, often the validation data sets don’t perform as well as the training dataset, no matter what math you use.

Hanne: Okay. And why is that?

Vijay: Well, that’s a sign of overfitting, and usually it’s because there wasn’t sufficient regularization during the training process.

Dr. Turakhia: So overfitting is a concept in statistics to effectively indicate that your model has been so highly tuned and specified to the data you see in front of it, that it may not apply to other cases.

Hanne: It can’t generalize.

Dr. Turakhia: If you had to use a model to identify a bunch of kids in a classroom and pick the kid who’s the fastest, an overfitted model might say it’s the red-headed kid wearing Nikes. Because in that class, that was the case.

Hanne: That was the one child.

Dr. Turakhia: But that has no biological or other plausibility.

Hanne: You can’t extrapolate, it doesn’t generalize, yeah.

Dr. Turakhia: You can’t use that. If you take that to a place where the prevalence of Nike shoes or redheads is low, you might miss the fastest person, right?

Hanne: Not helpful, yeah.

Dr. Turakhia: These are some of the issues. The underlying shifts in population, the natural language processing that’s embedded in AI, the lexicon that people use. How doctors and clinicians write what it is that they’re seeing with their patient differs not just from specialty to specialty, but from hospital to hospital, in sort of mini subcultures.

Brandon: It’s going to be different at Stanford than it was at UCSF, which is going to be different at Cleveland Clinic. I think that’s actually a nice thing about wearable data, is that Fitbits are the same all over the world. This label problem though is interesting because, in our context, each label represents a human life at risk. It’s a person who came into the hospital with an arrhythmia, and so you’re not going to get a million labels the way you might for a computer vision application. It would be unconscionable to ask for a million labels in this case. So, I think one of the interesting challenges is training deep learning-based models, which tend to be very data-hungry, with limited label data.

Dr. Turakhia: The kitchen-sink approach of taking every single data element, even if you’re looking at an image, can lead to these problems of overfitting. And what Brandon and Vijay are both alluding to is, you limit the labels to really high-quality labeling, and see if you can go there. And so don’t complicate your models unnecessarily.

Vijay: And don’t build models that are overly complicated for the amount of data that you have. Because if you have the case where you’re doing so much better on the training set than the test set, that’s proof that you’re overfitting, and you’re doing the ML wrong.
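The gap between training and test performance can be shown with a small, self-contained sketch. It is an assumption-laden toy, not anyone's production method: the data is synthetic with 20% label noise, the "overly complicated" model is a 1-nearest-neighbour memorizer, and the simple model is a single threshold.

```python
import random

def make_data(n, rng, noise=0.2):
    """Points x in [0, 1); the true label is x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = x > 0.5
        if rng.random() < noise:
            label = not label
        data.append((x, label))
    return data

def threshold_model(x):
    """Simple model: one parameter, matches the true underlying rule."""
    return x > 0.5

class Memorizer:
    """1-nearest-neighbour: reproduces the training labels, noise included."""
    def __init__(self, train):
        self.train = train

    def predict(self, x):
        nearest = min(self.train, key=lambda point: abs(point[0] - x))
        return nearest[1]

def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)

rng = random.Random(0)
train = make_data(500, rng)
validation = make_data(2000, rng)  # never seen during "training"

memorizer = Memorizer(train)
mem_train = accuracy(memorizer.predict, train)
mem_val = accuracy(memorizer.predict, validation)
thr_train = accuracy(threshold_model, train)
thr_val = accuracy(threshold_model, validation)

print(f"memorizer: train={mem_train:.2f} validation={mem_val:.2f}")
print(f"threshold: train={thr_train:.2f} validation={thr_val:.2f}")
```

The memorizer scores perfectly on the data it was fitted to and noticeably worse on held-out data, while the one-parameter threshold scores about the same on both. That difference between the two gaps is the warning sign being described.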

Brandon: Modern ML practitioners have a whole set of techniques to deal with overfitting. So, I think that problem is very solvable with well-trained practitioners. One thing you’ve alluded to, which is the interpretability aspect. So, let’s say you train on a population that’s very high in diabetes, but then you’re testing on a different population, which has a higher or lower prevalence. That is kind of interesting — so, identifying shifts in the underlying data and how you get…

Hanne: What would that mean?

Brandon: So, let’s say we train on people in San Francisco, and everyone runs to work and eats quinoa all day. But then we go to a different part of the country where maybe obesity is higher, or you could be somewhere in the stroke belt where the rate of stroke is higher. It may be that the statistics you trained on don’t match the statistics that you’re now testing on. It’s fundamentally a data quality problem. If you collect data from all over the world, you can address this. But it’s something you have to be careful with.
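One concrete way a shift in the underlying population shows up is in positive predictive value, which follows from Bayes' rule. The sketch below is a worked example with invented numbers: a hypothetical screen with 90% sensitivity and 90% specificity, applied first where the condition's prevalence is 50% and then where it is 10%.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(condition | positive result), by Bayes' rule."""
    true_positive = sensitivity * prevalence
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive / (true_positive + false_positive)

# Same hypothetical test, two populations with different prevalence.
ppv_high_prevalence = positive_predictive_value(0.9, 0.9, prevalence=0.5)
ppv_low_prevalence = positive_predictive_value(0.9, 0.9, prevalence=0.1)

print(f"PPV at 50% prevalence: {ppv_high_prevalence:.2f}")  # about 0.90
print(f"PPV at 10% prevalence: {ppv_low_prevalence:.2f}")   # about 0.50
```

Nothing about the test itself changed between the two lines, yet a positive result goes from being right about nine times in ten to about one time in two. This is only one facet of the mismatch between training and deployment populations, but it is an easy one to compute ahead of time.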

Hanne: But it will take a while for that to happen as we start gathering the data in different ways. How does that actually even happen? How are these streams of data funneled in and examined and fed into a useful system?

Brandon: So, used to be, the way you’d run a clinical trial, you would have one hospital. You’d recruit patients from that hospital, that’d be it. If you got a couple of hundred patients, that might actually be quite difficult to attain. I think with ResearchKit, HealthKit, Google Fit, all of these things, you can now get 40,000 or 50,000 people into a study from all over the world, which is great, except the challenge that the first five ResearchKit apps had is that they got 40,000 people, and then they lost 90% of them in the first 90 days.

Hanne: So, everybody just drops out?

Brandon: Everyone just drops out, because the initial versions of the apps weren’t engaging. So, this adds an interesting new dimension. As a medical researcher, you might not think about building an engaging, well-designed app, but actually, you have to bring mobile design in as now a discipline that you’re good at.

Hanne: So, there has to be some incentive to continue to engage.

Brandon: Yeah, exactly. You need to measure cohorts the same way Instagram, or Facebook, or Snapchat does. So, I think the teams that are starting to succeed here tend to be very interdisciplinary. They bring in the clinical aspect, because you need that to choose the right problem to solve, but also design the study well so that you have convincing evidence. You need the AI aspect, but you also often need mobile design, if it’s a mobile study. You may need domain expertise in other areas if your data is coming from somewhere else.
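Measuring cohorts the way a consumer app would can be sketched in a few lines. The example assumes each participant's last active study day is known (day 0 being enrollment); the ten-person cohort and the checkpoint days are hypothetical.

```python
def retention_curve(last_active_day, checkpoints):
    """Fraction of enrolled participants still active on or after each checkpoint day.

    `last_active_day` maps participant id -> last study day with any activity.
    """
    total = len(last_active_day)
    return {
        day: sum(1 for last in last_active_day.values() if last >= day) / total
        for day in checkpoints
    }

# Hypothetical cohort of ten participants who all enrolled on day 0.
cohort = {
    "p01": 2, "p02": 5, "p03": 6, "p04": 10, "p05": 14,
    "p06": 20, "p07": 29, "p08": 45, "p09": 60, "p10": 120,
}

curve = retention_curve(cohort, checkpoints=[7, 30, 90])
for day, fraction in curve.items():
    print(f"day {day}: {fraction:.0%} retained")
```

Tracking this curve per enrollment cohort makes it visible whether a design change to the study app actually keeps people in the study longer.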

Hanne: And then it all has to be gamified and fun to do.

Brandon: Yeah. Well, I mean, gamification is sort of extrinsic motivation, but you can also give people intrinsic motivation — giving them insights into their own health, for instance. It’s a pretty good way to hook people.

Urging the current system to adopt AI

Hanne: What are the system’s incentives? I mean, of course, doctors want it if it makes them more accurate or helps them scale better. Patients want it if you can predict whether or not you’re going to have a problem. How do we incentivize the system as a whole?

Dr. Turakhia: I believe fundamentally it is going to come down to cost and scale, and to how willing a healthcare entity is, whoever that may be: employer-based programs, insurer-based programs, accountable care organizations. Are they going to be willing to take on risk to see the rewards of cost and scale? And so, the early adopters will be ones who’ve taken on a little more risk.

Vijay: Yeah. I think, you know, it is — the challenge is where the hope is, and in terms of value, and in terms of better outcomes. But one has to prove it out and hospitals will want to see.

Dr. Turakhia: The regulatory risk thing is being largely addressed by this new office of digital health at the FDA, and they really seem much more forward-thinking about it. But there are going to be challenges that we have to solve, and I’ll give you one just to get the group’s input here. Should you be versioning AI, or do you just let it learn on the fly? And so, normally, when we have firmware, hardware, software updates in regulated and FDA-approved products, they’re static. They don’t learn on the fly. If you keep them static, you’re losing the benefit of learning as you go. On the other hand, bad data could heavily bias the system and cause harm, right? So, if you start learning from bad inputs that come into the system for whatever reason, you could intentionally or unintentionally cause harm. And so, how do we deal with versioning in deep learning?

Vijay: I mean, to just freeze the parameterization — so versioning, from a computer science point of view, is trivial. There’s the deeper statistical question, which you could version every day, every week, every month.

Hanne: Right. It’s when and how often.

Vijay: And just freeze the parameters. What you want to do, to the point we were talking about earlier, is bring in new validation sets: things that it has never seen before. You don’t want to just test each version on the same validation set, because then you’re intrinsically overfitting to it. What you always want to be doing [is] holding out bits of data [so that] you can test each version separately, because you want to have very strict confidence that this is doing no harm, and that it is helpful.

Hanne: Right. It’s like, we’re introducing this whole new data set of a different kind of thing, and that’s when you make new considerations.

Vijay: Yeah. Data’s coming in all the time, and so you just version on what came in today, and that’s it. It’s pretty straightforward. And as you’re training it…

Brandon: This is the way speech recognition works on Android phones. Obviously, data is coming in continuously, and every time someone says, “Okay, Google,” or, “Hey, Siri,” it’s coming into either Google or Apple. But you train a model in batch and then you test it very carefully and then you deploy it. The versions are indeed tagged by the date of the training data.
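The batch-train, freeze, and test pattern described in this exchange can be sketched as follows. This is a minimal illustration under stated assumptions, not how any vendor's pipeline works: `ModelVersion`, `split_by_cutoff`, `release`, and `fit_mean` are hypothetical names, and the "training" step is just a mean so the example stays runnable.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class ModelVersion:
    """A frozen snapshot: parameters cannot be changed after release."""
    tag: str            # cutoff date of the training data
    parameters: tuple   # immutable copy of the trained parameters

def split_by_cutoff(records, cutoff):
    """Train on records dated up to `cutoff`; hold out everything newer."""
    train = [r for r in records if r["date"] <= cutoff]
    fresh_holdout = [r for r in records if r["date"] > cutoff]
    return train, fresh_holdout

def release(records, cutoff, fit):
    """Fit in batch on data up to `cutoff`, freeze, and return the unseen holdout."""
    train, fresh_holdout = split_by_cutoff(records, cutoff)
    version = ModelVersion(tag=cutoff.isoformat(), parameters=tuple(fit(train)))
    return version, fresh_holdout

def fit_mean(rows):
    """Stand-in for a real training step: one parameter, the mean value."""
    return [sum(r["value"] for r in rows) / len(rows)]

# Hypothetical incoming data.
records = [
    {"date": date(2024, 1, 5), "value": 1.0},
    {"date": date(2024, 1, 20), "value": 3.0},
    {"date": date(2024, 2, 3), "value": 5.0},
]

v1, holdout = release(records, date(2024, 1, 31), fit_mean)
print(v1.tag, v1.parameters, len(holdout))
```

Because each release is validated on records that arrived after its own cutoff, no version is ever scored on data it was fitted to, and the date tag records exactly which data a deployed version has seen.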

Hanne: It’s already embedded in the system. Who are the decision-makers that are green lighting when, like, “Okay, we’re going to try this new algorithm. We’re going to start applying this to these radiology images.” What are the decision points?

Dr. Turakhia: So, with EKGs, the early companies used expert systems to just ease the pain points of me having to write out and code out every single diagnosis.

Hanne: The super low-hanging fruit.

Dr. Turakhia: Yeah. Can you improve the accuracy of physicians with this? Can you increase their volume and bandwidth? Can you actually use it to see which physicians are maybe going off course? What if you start having a couple of physicians whose error rates are going up?

Right now with quality, the QI process isn’t really based on random sampling. There’s actually no standardized metrics for QI in any of this. When people read EKGs and sign them off, they just sign them. There’s nothing telling anyone that this guy has a high error rate. And so, that is a great use case of this, where you’re not making diagnoses, but you’re helping anchor and show that, well, if you believe this algorithm is good and broadly generalizable across data, you’re restating the calibration problem now.

It’s not that the algorithm has gotten necessarily worse, because, in fact, in seven of the eight doctors, it’s right on par with them. But with this other doctor: if that doctor is not agreeing with the algorithm, which is agreeing with the other seven, then that doctor is actually not agreeing with the other seven. So now you have an opportunity to train and relearn. Those are the use cases to go off of.
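The quality-improvement use case can be sketched as a per-reader agreement rate against the algorithm, with a flag for anyone well below the group. Everything here is illustrative: the eight readers, their counts, and the 0.2 margin are hypothetical, and a flag is a prompt for review rather than a verdict on the reader.

```python
from statistics import median

def agreement_rates(reads):
    """Map reader -> fraction of ECGs where the reader matched the algorithm.

    `reads` is a list of (reader, reader_label, algorithm_label) tuples.
    """
    totals, matches = {}, {}
    for reader, reader_label, algorithm_label in reads:
        totals[reader] = totals.get(reader, 0) + 1
        matches[reader] = matches.get(reader, 0) + (reader_label == algorithm_label)
    return {reader: matches[reader] / totals[reader] for reader in totals}

def flag_outliers(rates, margin=0.2):
    """Readers whose agreement falls more than `margin` below the group median."""
    cutoff = median(rates.values()) - margin
    return sorted(reader for reader, rate in rates.items() if rate < cutoff)

# Hypothetical data: seven readers agree on 9 of 10 ECGs, one on 5 of 10.
reads = []
for reader in "ABCDEFG":
    reads += [(reader, "AFib", "AFib")] * 9 + [(reader, "Sinus", "AFib")]
reads += [("H", "AFib", "AFib")] * 5 + [("H", "Sinus", "AFib")] * 5

rates = agreement_rates(reads)
flagged = flag_outliers(rates)
print(flagged)
```

Comparing each reader with the group median, rather than with a fixed target, keeps the check meaningful even if the algorithm itself is imperfect, since it is the one reader diverging from the other seven that stands out.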

Vijay: You can train and relearn the person?

Dr. Turakhia: The person. Address their reading errors, coding errors, see what’s going on. And that qualitative look, I think, is very, very valuable.

Hanne: So, what are the ways we’re actually going to start seeing it in the clinical setting? You know, the tools that we might see our doctor actually use with us or not.

Dr. Turakhia: I think it’s going to be these adjacencies around treatment with management. There are a lot of things that happen in the hospital that seem really primitive and arcane, and no one wants to do them. I’ll give you a simple one, which is OR scheduling.

Hanne: So, is it actually the way it looks like it is in “Grey’s Anatomy?” Is it just a whiteboard and an eraser?

Dr. Turakhia: It is a whiteboard and somebody on the phone. The OR front desk.

Hanne: That’s unbelievable.

Dr. Turakhia: There’s a backend of scheduling that happens for the outpatients, but you have add-ons, you have emergencies, you have finite…

Hanne: I mean, it seems like even an Excel sheet would be better than a whiteboard.

Dr. Turakhia: The way OR scheduling works now is primitive, and it also involves gaming. It involves convincing staff X, Y, and Z to stay to get the case done, or do it tomorrow.

Hanne: So, there’s so much behind the scenes, like, human negotiation?

Dr. Turakhia: When I do catheter ablations, we have many different moving parts. Equipment, the support personnel of the equipment manufacturer, anesthesia, fellows, nurses, whatever. Everyone has little pieces of that scheduling. It all comes together, but it comes together in the art of human negotiation and very simple things like, “This is your block time, and if you want to go outside your block time, you need to write a Hallmark card to person X.” So, it’s a very simple problem where there are huge returns in efficiency if you could have AI do that. With the AI inputs over time, you could really know which physicians are quick and speedy, which ones tend to go over their allotted times, which patient cases might be high risk, which ones may need more backup, which should be done during daytime hours.

Brandon: You could add their Fitbit data and then you could tell who’s drowsy at any given moment, for a little elaboration there.

Hanne: Oh, that’s fascinating. Yeah. Whether or not they want to do it.

Brandon: How stressed are they feeling.

Dr. Turakhia: And so, people stay at times that they’re really needed. That kind of elasticity can come with automation where we fail right now. This is a great place where you are not making diagnoses. There’s nothing you’re being committed to from a, kind of, basic regulatory framework. You’re just optimizing scheduling.

Hanne: So, who actually — so, say that that technology is available. How do you actually get it in — where’s the confluence of the regulation and the actual rollout, and how does it actually make its way into a hospital and to a waiting room?

Brandon: There’s an alternative model I’ve seen, which is startups acting as full-stack healthcare providers. So, Omada Health or Virta Health would be examples of this, where if you have pre-diabetes or diabetes, respectively, the physician can actually refer the patient to one of these services. They have on-staff physicians. They’re registered as providers with national provider IDs. They bill insurance just like a doctor would, and they’re essentially acting as a provider who addresses the whole condition end to end.

I think that case actually simplifies decision-making, because you don’t necessarily have to convince both Stanford and United Healthcare to adopt this thing. You can actually convince just a self-insured employer that they want to include one of these startups as part of their health plan. And so, I think that simplifies the decision-making process and ensures that the physicians and the AI folks are under the same roof. I think that’s a model that we’re going to see probably get the quickest adoption, at least in the <inaudible> world.

Vijay: There are many models, and which is the best model will depend on how you’re helping in the indication, and on the accuracy, and what you’re competing against, and so on.

Brandon: This is a case where, probably, we’ll see the healthcare industry maybe reconstitute itself by vertical, with AI-based diagnostics or therapeutics. Because, if you think — right now, providers are geographically structured. But with AI, every data point makes the system more accurate. Presumably, in an AI-based world, providers will be more oriented around a particular vertical. So, you might have the best data network in radiology, the best data network in pathology, the best data network in…

Hanne: That’s interesting. Yeah. Thank you so much for joining us on the “a16z Podcast.”

Vijay: Great. Thank you.

Dr. Turakhia: Thank you.

Brandon: Thanks for having us.

Posted November 3, 2017
