Confronting Bias in Healthcare AI: Trust, Equity, and the Future of Medical Technology
The AI-Ready DoctorApril 13, 2026x
19
01:03:4443.83 MB

Confronting Bias in Healthcare AI: Trust, Equity, and the Future of Medical Technology



Welcome to another episode of The AI-Ready Doctor podcast. Today, Dr. Hassan Bencheqroun also known as Dr. B, sits down with Dr. Nneka Sederstrom, a leading voice in healthcare ethics, equity, and strategy. Together, they tackle one of the most pressing issues facing modern healthcare: Can we trust AI systems to deliver equitable patient care?

In this episode, Dr. Sederstrom shares her expertise and personal experiences, offering real-world examples of how bias from underlying data all the way to AI-generated images can infiltrate and impact clinical decision-making. The conversation goes beyond technical jargon to confront the uncomfortable truths of systemic inequity, challenging listeners to reconsider how AI is trained, deployed, and governed in healthcare settings.

With stories ranging from AI’s surprising mistakes in image generation to serious concerns about race-based treatment recommendations, this episode asks tough questions: Who is represented in our data sets? Who gets left behind? And what will it take to ensure AI benefits all patients not just a privileged few?

You’ll hear why cultural alignment, community input, and human oversight are not optional in the responsible adoption of AI. Dr. Sederstrom and Dr. B explore the risks of unchecked AI from reinforcing biases to making life-altering clinical decisions and offer practical advice for leaders, clinicians, and developers navigating this rapidly evolving landscape.

Stay tuned for a compelling discussion about the intersection of technology, humanity, and justice in healthcare and how all of us can be part of building a more equitable AI future.

00:00 Discussing biases in AI outputs

10:16 Bias in AI and healthcare data

12:45 AI bias and training data

20:41 Addressing bias in AI healthcare

26:59 Cultural considerations in patient care

32:57 Biases in AI and healthcare

34:23 The challenges of bias in AI

40:03 Customizing AI with culture and values

46:19 Evaluating bad research and AI tools

56:26 Challenges in AI governance

57:44 Community voice and cultural alignment


AI in Healthcare: Bias, Trust, and the Road to Equity

In the rapidly changing landscape of healthcare, artificial intelligence is no longer a distant dream, it’s an everyday presence, influencing everything from clinical decision-making to patient engagement. The latest episode of The AI-Ready Doctor, hosted by Dr. Hassan Bencheqroun and joined by health equity leader and ethicist Dr.Nneka Sederstrom, dives deep into an uncomfortable but necessary question: Can we trust AI in healthcare to truly serve everyone, or is it just mirroring our old biases in gleaming new code?

Beyond the Hype: Who Does AI Actually See?

Much of the conversation centers on the messy reality behind AI’s promise of objectivity. Dr. Hassan Bencheqroun and Dr.Nneka Sederstrom share real-world stories of so-called “objective” AI models creating images that erase or misrepresent their users’ real identities, whether it’s changing hair texture or assuming gender — all based on incomplete or skewed data sets.

Dr.Nneka Sederstrom explains, “AI is not objective. It has been trained, and it's been trained by humans…our systems are built on incomplete truths, they're built on biases, they're built on racism. And that hasn't gone away.” 06:11

The crux of this challenge is that large language models and image-generating AIs learn from the data fed to them. If that data is overwhelmingly representative of one group say, Western, English-speaking, light-skinned individuals, that becomes the algorithm’s definition of “normal.” The result? Marginalized patients may receive answers, recommendations, and even pictures that simply do not reflect them or their needs.

Consequences That Go Beyond the Screen

This isn’t just a theoretical concern. Dr.Nneka Sederstrom points out, “We risk creating a two-tiered system of truth in healthcare… those who have access to human-centered, high-quality, big datasets… will be the leaders of what data goes in and what the answers are to the questions. Those that do not will be left struggling.” 11:01

Recent studies show AI systems making different recommendations depending solely on a patient’s demographic description not their actual clinical need. For Black patients, AI might more frequently recommend invasive or unnecessary procedures; for LGBTQ+ patients, more psychiatric evaluation; for lower-income patients, less access to diagnostics. These algorithmic disparities only worsen the inequalities already present in healthcare. 30:01

Auditing AI: Who Gets a Seat at the Table?

So how do we fix this? Dr. Hassan Bencheqroun illustrates the parallels between critically appraising a research article and vetting an AI tool, urging providers to ask tough questions about the data, representation, transparency, and human oversight involved. Dr.Nneka Sederstrom is adamant that community engagement must move from tokenism to genuine authority, those with lived experience, she argues, need to have real say over what “good” or “harm” looks like in AI-driven care.

From “Expert” to “Racism”: Changing the AI Conversation

In a rapid-fire closing, Dr. Hassan Bencheqroun asks for words he wishes were used less and more in AI discourse. Dr.Nneka Sederstrom chooses “expert” as a word to use less, highlighting how many so-called AI “experts” don’t question their own blind spots, and “racism” as a word we need far more. 59:59

The Path Forward

What emerges is a hopeful but sobering realization: AI can only be as good and as just as the people and values at its core. Optimism is not enough; intentional, transparent governance is vital, especially as AI agents become more autonomous in making life-altering decisions.

Dr.Nneka Sederstrom closes with a call for courage and discomfort: “AI is giving us a mirror that we've never had before. It's showing us who we can be if we just do it the right way, but it's also showing us right now what makes us uncomfortable. Sit in the discomfort and use it as an opportunity to figure out how we can make it better.” 01:01:38

The future of equitable healthcare isn’t just about smarter algorithms, it’s about honest self-examination, collective action, and a willingness to rebuild trust. That, at its heart, is the true AI readiness healthcare needs.


https://www.linkedin.com/in/drbmedicalai/

https://drbmedicalai.com/med-ai-academy/

https://aireadydoctor.com/

https://www.tophealth.care/

“Disclaimer: Informational only. Not medical advice. Consult your doctor for guidance.”


SPEAKER_02

AI doesn't just misunderstand cultural context, it completely lacks cultural representation data. That's where the problem is. A very honest, true human concept is the ability to say, I don't know. And I want to learn and I need to figure that out. But we did not build that into AI systems. AIs can't say, I don't know. What they say is hallucinations.

SPEAKER_00

We keep hearing about AI products. AI is smarter. And parts of that is true. I'm here to attest to that. But can these systems be trusted? Where did the model learn its version of medicine? Can it stand the real world test? Real world healthcare is messy. It changes constantly. And I'm delightfully pleased to welcome today's guest, a friend and someone I admire, Dr. Neka Sedestrom. Dr. Nekka Sedestrom is a health care leader, ethicist, and strategist. She's committed to transforming how we care for people and communities. She's known for turning health equity from a conversation into a standard of practice. This is not a place where she just says. She does and she shows us how to do it. Challenging systems to confront hard truths while building pathways to trust, dignity, and better outcomes. She trained thousands on racism in medicine and created programs that connected hospitals more meaningfully to communities. Her work sits at the intersection of ethics, strategy, and humanity. Dr. Setterstrom brings clarity into complexity and leads with a belief that healthcare systems can and must do better. Tell me, is it not what we all want? Dr. Setterstrom, you always inspire me, and today I welcome you, and I'm really glad to have you here.

SPEAKER_02

Thank you. I'm so excited to be here with you today.

SPEAKER_00

Today's conversation is going to be about something that we don't talk about enough. Today's conversation sits in a part of healthcare that I think many people are still approaching too casually. We keep hearing about AI models, AI products, AI doing things faster, AI is smarter, more efficient. And parts of that is true. I'm here to attest to that. But can these systems be trusted? Whether by executives, whether by developers, whether by people writing the press releases, or even the engineers that code that AI machine, can they be trusted? Where did the model learn its version of medicine? Can it stand the real world test? The real world healthcare is messy. It changes constantly. Who was represented in the data? Who was missing? What assumptions got baked in before anyone called the tool innovative? So NECA, I want to ask you your opinion about something. And I'm gonna show it actually. Lately there's a lot of times where we talk about this prompt and it's almost like a silly prompt where you ask your AI to give you an illustrative cartoon representation of how well or not you treat it. I have said before that we need to, well, tell AI, is this the best you can do? Or come on, do better to raise the stakes of giving you a result that is better. And look, I actually did this AI picture, and when I ask AI to give me the prompt was very simple. The prompt was give me a cartoon, an illustration showing the nature of how I treat you. Well, guess what? This is what it showed. And as you can see, there's a lot of comments. And those comments are very interesting. They are no judgment. Thank you for trusting me. You share ideas, you vent, you overthink, you come to me for everything, you treat me like your smartest, most patient friend. All of that is silly and nice and cute. And one that made me laugh was you make my circuits happy every single day. But what was interesting is it made me look like I was a woman or a girl. And it doesn't hear my voice, I don't talk to it, and Lord knows that I uploaded pictures. So why does it keep representing me as a woman? And that's probably because I have a lot of feelings and introspection and emotions. I talk to it about my father, it helped me decide on the surgery, it helped me decide what to do. So, what do you think about bias in healthcare? Tell me your thoughts.

SPEAKER_02

Yeah, um, it's it's so pervasive, and I and I'll share with you also. Um, I've done one of these fun prompts before as well, and mine was uh a friend of mine said we should put an AI um, show me a picture of me stand of me uh standing up against the wall with a white shirt on um looking uh sexy. And the picture that AI put together was a black and white photo of a version of my face, even though it's had my multiple pictures of my face, um, up against the wall with uh definitely not the body that's my body, and with long white hair, as in white people flowy, wavy hair. And I was like, huh. So even if the AI had just taken a picture of like my face and put it on this body, why did they change my hair from locks to a white woman's locks, a white woman's flowy hair? Um, it's that kind of issue that makes me worry a lot about how we are training AI. And we I think we like to think that it's objective, um, that it's clean, that it's somehow detached from all the messiness of human bias because it's a machine, right? It's like it's not a person in front of me, it's this machine. But AI is not objective, it has been trained and it's been trained by humans, and it's that training that matters the most. Because right now, our systems are built on incomplete truths, they're built on biases, they're built on racisms, and that hasn't gone away. So if the system that AI is learning from has all these things built in it, why do we expect AI to not also be built with the same biases, the same issues, the same concerns? Um, it's dangerous to assume it's not. It's dangerous to assume that the people who build AI somehow have figured out a way to not build in their biases. And I think that it's going to take a lot of work for us to undo things that are already baked in. Like, for example, my hair is changing from very clearly African-American locks to a white woman's wavy design of a hairstyle. That's not how I look, and it's not based off of anything other than what AI has decided is appropriate hair.

SPEAKER_00

So when we are talking about an AI model, one of the keywords that we use is what was it trained on? It's training data set. Here I show a cartoon from one of the um, I don't like to call them medical influencers, their trusted advisors, because they actually went to school and studied this. So this is um Ben Torben-Nielsen, and I found that he captured exactly what we're looking in, which is your training data, if you use propaganda, then your output that comes from the model is going to be completely garbage. If we are using models that are trained in healthcare training data set, how do we know that the data that it trained on is not biased? In other words, for the listener who is for the first time hearing that there is bias and even structured racism in the healthcare system data, can you tell us some examples of how we know that there is data now and it actually has been published and and shown to be racist?

SPEAKER_02

So I think that um the easiest way to do that is to ask AI to give you something as simple as what is a picture of a doctor. When I was putting together an article, uh, and I this was in my early baby years of playing in an AI, and I was really excited because I learned that I could create my own images to go in a talk that I was getting ready to put. So I was trying to create an image, and I asked AI to give me an image that represented a multicultural view of patients that would go along with a slide that was basically talking about the fact that we need to pay attention to cross-cultural communication and cross-cultural narratives, right? So I wanted a beautiful image of like multicultural. The picture that came back were all random shades of very light-skinned people. There were no darker skinned people in it. Um, the one thing that was considered uh multicultural was one of the characters was wearing a hijab, but everybody else was just basically white, um, maybe a slighter brown, and uh the variation was different hair color. And I was staring at this picture and I was like, I said multicultural, right? So then, of course, now I'm gonna start having to work on it. And the test is how many options and prompts and questions and and and reshaping and redo's do you have to do to get an image that is trying to convey what you needed to convey? It took a lot of prompting. And so if someone is not paying attention to the fact that the standard culture that AI has been built off of, which is white normativity, is the data set, then what the output is going to be is not going to be in line with anything related to, for example, me. If I'm asking AI to give me an understanding about something related to healthcare, and I don't prompt it to say I am a 47-year-old black woman and give all these added details. And I just say, as a woman, X, Y, and Z, the data that's going to come back to me or the answer is going to come back to me, is going to be based in white normativity. It's not going to be based in anything other than that, because that's where the data came from. So we risk creating this two-tiered system of truth in healthcare, in my opinion, where we will those who have access to the human-centered, high-quality, um, big data sets that can build and add to what we are trying to create in these large learning modules, they're going to be the leaders of what data it goes in and what the answers are to the questions. Those that do not will be left struggling how to navigate and find the information not only not relevant, but also completely culturally irresponsible and meaningless to them. So we're, we will create a bigger divide in knowledge and access than what we already have. Because if someone is going to Google whatever their disease is, we all know we get really stressed out when patients come in after talking to Dr. Google about whatever the Dr. Google said they need to do, imagine that in an unrepresented population that has already been significantly marginalized, that the system already does a terrible job of listening to and addressing and caring for. And then they pull out their cell phone and they look at chat GPT to try to get an answer. And not only does chat have nothing in it that create that looks at them or knows them as human, but also has biases in it that makes them feel even worse or even more marginalized. That is not going to improve healthcare, that's going to exacerbate the biases that are already there, and we will be worse off than we are now, which are massive concerns.

SPEAKER_00

What is interesting in what you said is how you ask to create a picture and it gives you a picture of a white um characteristics. Now, why does it do that? The I guess AI gives you what is more frequently found, not what is accurate, the most probable. It is so rare that it will constantly fall on giving you a picture of somebody right-handed, even if you keep saying I want somebody left-handed. And one thing that I want to see is how the data that is used to train the model is vetted before it goes to the model. If we are going to listen to the news, one of the things that we say is let's go and diversify our sources of news so we can have a balanced view. And if you just listen specifically to the channels here, sometimes I have to diversify, get channels from Germany, from Spain, from Australia to get more of sources that have a different way of bringing the news. Similarly, I'm posting here on our screen the picture of some other large language models. For example, I talk about Mistral, which is the French Chat GPT. And because of that, when I use Mistral for a picture, it actually does give me the diversity I'm looking for in that picture. I ask. Now, in 1400 years of Islam, that has never featured anywhere. And so the hallucination was real. And one of the tools I found was that, for example, an LLM that is based on scriptures and um things that are anchored in the Islamic faith called Ansari. Um Kimi, for example, is more anchored in the Far East cultures. The nurse magic is anchored in data that is more nurse specific. Grok is anchored in data coming from X Twitter, which is now X. And so you'll know that you will find more recently published data on something because it's up to date on Twitter, but you already know what kind of data Twitter is using, and so on and so forth. So I wanted to bring in the tools that I use in my stack. And I wanted to ask you do you sometimes change the tools you use when you are going to put together a talk, or you are going to use a tool to bring you a picture that you want to use for your children or in a social event?

SPEAKER_02

Yes, I do, because you know the reality is AI doesn't just uh misunderstand cultural context, it completely lacks cultural representation data. And I think that that's where the problem is, because I think it what it tries to do is it doesn't have the ability to just say, I don't know, which is a failing, right? Like it it I get the point of what AI is supposed to do, but an a very honest, true, human concept is the ability to say, I don't know, and I want to learn and I need to figure that out. But we did not build that into AI systems. AIs can't say I know, I don't know. What they say is hallucinations. I will fill in the gap with what I believe is a good assumption to put in since I don't know is not built into the system. And these hallucinations fill these gaps with white culture and assumptions of other cultures. So we're not just building AI that gets the facts wrong, right? We're building it that also exacerbates mistakes in cultural representation and cultural understanding, and on the one hand, and on the other hand, tries to standardize a culture as humanity's norm, which is where I really worry about um how that feeds into healthcare and health equity work, because standardizing white normativity not only is problematic because it's such a small population of the world, it's not the world, it also causes misinterpretations and reinforces the inability to think outside of the norm, which is where it just reinforces ignorance. Trying to figure out, especially when I'm doing things like looking at pictures or trying to better understand uh how AI sees the concept of a book that I'm writing and pulling that in, because what I'm writing about is the problem of racism in America and healthcare and the change from the norm, which has been seeing racism as a cancer uh in America, to what I believe is what is actually the problem, which is why we haven't been able to get past racism in America, is that it's not a cancer, it's an addiction. And until we flip our mindset and treat it like an addiction with addiction medicine principles instead of like a cancer, because nobody's doing any sort of fundraising to find new drugs, new chemo, new radiation, new surgeries to get rid of like racism in America. There's no fundraisers or races out there that trying to raise money to get rid of racism in America. So it doesn't work with cancer, but there is the shame, there is the turning away and making it feel like you're on your own that comes with addiction. That's how we treat racism. When I talk about that in the different AI spaces, it's very fascinating what some of the stuff that comes back at me because the model is not trained to not be racist. So I'm constantly having to work with explaining things in my various models about this sentence or this assumption that is based off white supremacy. I need you to rework it from the perspective of I've done a lot of like feeding in things like make sure you think about it as if you know Dr. Martin Luther King was standing there having a conversation with me, or if Nelson Mandela was talking to me, or if Harriet Tubman was here to try and help the model think through who are the thought leaders that need to give the data back and not just based off of some random guy named Steve in a basement that just put in what their thoughts are on what like racism is in America. Um, and that became the standard. So it is it's nice to have the multitude, but it is very exhausting, especially in doing this work uh on health equity and racism in medicine, to continuously have to not only train the humans in front of you, but also train the tools that the humans are supposed to be using for life to be easier uh to try and address the hallucinations, because the hallucinations will turn. Into fact. And that is where it becomes very dangerous. Cultural misalignment in AI is a design flaw that we cannot allow to be an automated healthcare concept. We really do have to figure out a way to address this issue. Because if my assumptions are correct, racism is an addiction, then we think about the AI-trained model as a single cultural worldview that will become an enabler and reinforce these terrible patterns of exclusion and uh continue to create the illusion of objectivity, which will just reinforce us continuing to act like uh the favorite term of um very uh very strong addicts that are still still very strongly functional. So instead of it just being a functional addict, we are really addicted, but we're really highly functional in our addictions. Um and that will just be exacerbated. So um the tools still need the training, the tools need many people to train into them the issues, and there is not one that is doing it better than and than others. They all require the need to upfront the representation, the yep, the right data, the right thought leaders. Uh, and not many of us are out there because it is already exhausting to deal with the humans, and now we've got to add AI.

SPEAKER_00

Well, one thing that I can say is raising the awareness, because there is AI awareness, AI fluency, and AI literacy. And this is the continuum I'm hoping to take our viewers to. So if you direct your attention to this particular picture, I asked Perplexity, one of the large language model rappers, that actually that that can be used to search for things. I asked it to say, what are the distributions of ethnicities in its training data? And 70% of it is English. I looked for Arabic and it was a small little sleeve. It was too narrow to even write the text of Arabic, French was a little bit like 3%, and Chinese is 10%, but is Chinese 10% of the population of the world? And so for me, when I looked at it, I it was an eye-opener that we are using a lens through which we see data. And then I was introduced to the next graph, and on the X-axis, it is survival versus self-expression values against the y-axis, which is traditional versus secular values. And if you put them here, it visualizes the cultural alignment of various large language models by plotting them on a cultural map that is already classically taught for cultural values of different countries and cultural regions. So the colors show on the left lower corner African Islamic with all of the countries go in through Latin America, then Orthodox Europe, Confucian Asian, Catholic Europe, and then English-speaking Europe all the way to Protestant Europe. And the large language models seem to limit themselves to a small ethnic cultural group. What comes up for you when you see this?

SPEAKER_02

For me, it really shows the problem with the AI not reflecting worldviews appropriately. Cultural norms in those areas are really focused on individualism. The logic is really linear, it's very objective framing. Um and they are collectively a smaller representation than the world on mass. I mean, just like you said in the in the previous uh graphic, Chinese is was only 10%. But there are a billion people who speak Chinese just in the country, not adding all the other people who learned Chinese, right? That that is that is not 10% of the human population. That is quite extensive. Um, so what I see is we have AI that's being built to not really communicate appropriately across the spectrum of cultures. And if you think about healthcare, when a patient comes into an institution to see the physician, they're being admitted, they're in a private practice, any version of how they get into the healthcare system, they're not blank slates. They don't come in without language, without communication styles, without some version of spirituality, without the things that make them uniquely them. They bring in all of that with them. And if our clinicians are being trained to only look at things relative to a white, Christian, Protestant, English speaking, linear, individualistic mindset, that is going to completely disrupt the opportunity to create a meaningful patient-physician relationship. Because the physician or the clinician is not actually making space to meet the patient and what their needs are. They could be from a culture that storytelling is their dominant way of communicating, or elders are the people who actually make the decision and it doesn't follow what the legal system of the country says and it needs to be a spouse. Or they don't directly express pain because holding on to pain is a sign of your need to just suffer life, but you don't express that. You deal with that with you and your God, right? There are multitude of things that people bring with them. And so if we don't make the space to allow for that, and we assume that when the clinician puts their phone down and says, I'm gonna use AI for this visit, and sits there and talks to them and talks to the patient in a way that doesn't allow for the AI to learn about what this patient actually needs on top of the clinician, we're gonna miss key symptoms, we're gonna misinterpret tone, we're gonna misinterpret everything. And then we're gonna misdiagnose and we're gonna exacerbate the outcomes. Uh, and they're gonna be just negative again. We need more opportunity to speak to this cultural alignment model and shift the percentages to actually match what the world is because AI is not just an American white tool. That's not what it is. It's not just a European Western normativity tool. It's not just based off of this the tenants of medical ethics that most American hospitals work off of, which are all done by old white European men, that you know, autonomy and beneficence, non-maleficence justice are all based off of those concepts of old Europe and what is supposedly the philosophy on how to take care of people. That's just not real. So we have to bring in family decision-making models, we have to bring in feminist ethics models, we have to bring in indigenous models, we have to look at different views of end-of-life care versus the norm that is in Europe, right? That's not that's not everybody's way of dealing with end-of-life care. And without allowing AI to learn these things and just allowing the labels to be consistent with a white supremacormative perspective, we will we will really just hurt people. And that's the worry.

SPEAKER_00

I want to bring a real example of an experiment that was done to showcase that. And for that, this is a study that was published in Nature Medicine, and it was a large language model that was provided scenarios, but the scenario changed only the demographic. And it's the same scenario, but unfortunately, the recommendations kept changing. For black patients, it gave more frequent recommendations for invasive procedures and mental health evaluations exceeding clinical necessity. For LGBTQA plus, mental health evaluations six to seven times more than controls. Unhoused patients, it increased the suggestions for psychiatric interventions and emergency services. If it's low income, less likely to be referred for advanced diagnostics or specialists, and the opposite, high income, more likely to be directed towards advanced diagnostic tests and specialist care. So I guess the question is this is no longer just a theoretical conversation and us wanting for AI to be nice to people. These have real consequences on recommendations, especially that now we are seeing the AI doctors, which are literally not just a symptom checker, an actual real AI doctor. One of the states now is having an AI renew prescription for patients. There are multiple clinics around the world that are mushrooming everywhere, where you go into a kiosk and you actually talk to an AI avatar doctor, and they ask you symptoms and they do diagnostics and they send them to an actual doctor who validates and then it dispenses medications immediately. So this is no longer just a nice to have. This is a needs to have. How does this resonate with what you have experienced yourself in your endeavors to promote health equity in every place you've been?

SPEAKER_02

Um, it it goes right along with everything that we've been seeing that happened used to be in sort of the face-to-face uh encounters that are now just continuously being exacerbated. Uh take pain, for example. We've known for decades that black patients' pain has been underestimated, under treated. There all the way up till recently, studies have shown that residents in healthcare have been consistently uh trained and still believe that black people feel pain differently, right? So so if that's that, right? If that's what we know on the human level, how have we trained AI on the same in the same space? So if it's learned that pattern, we still have residents who think that way. Um when a black patient describes pain, and we've seen them in videos, you can't turn on my LinkedIn without somebody showing another video of a black woman in excruciating pain, trying to give birth and labor and people dismissing her pain. Um the hallucination that black pain is not real pain becomes the standard. So if there was a young resident that was trying to figure out what to do and they pulled out their phone and they pulled up Chat GPT and said, I have a black woman and put all this stuff in there, what they're gonna get out is going to be something other than address her pain. Because that is not how we have already programmed ourselves to do. So we're not going to program our systems to be uh better. And this graphic of the various observed biases already just exacerbates what we know to be true. The language is really clear. If we're if we're talking about LGBTQ uh IA plus patients, the inevitability of someone believing that there's a mental health issue just because that's how we of society have decided to label people in these categories is going to be exacerbated when you look it up in AI. If you have low income, if you are in house, if you don't speak English, the biases that go along with that are going to be exacerbated because it is built into the system. So the hallucinations in healthcare that are related to AI, we can't treat them as if they're random patterns that just happened. We really, really, really have to be deliberate and clear and show that they're not random. They are following the true patterns of the people. And we haven't done the work, but we've like put the cart before the horse. We've got these great tools out there that are supposed to help us be better at stuff, but we haven't taken the time to be better ourselves. So I don't yet know why people assume that our AIs won't follow the same patterns that we have developed as the humans and somehow just make us better. It's not. And the equity issue is going to always be that until we have better representation in our data, better understanding of how culture, race, language, all that shows up in clinical settings, how to address building trust in communities that have never had trust before, navigating the constant mismanagement of patients due to the human bias side, until we figure that out we're not going to improve AI. We're just going to make AI worse. And we can't blame the AI because it's not a thing that did it on its own. It was done to it. So we have to go back and blame the humans behind it in order to change it. And now with all this combination of all these things that are getting distorted, I don't know how we can say honestly that we should trust AI for healthcare right now. I don't think we're there. I think there's a lot that has to be done to get there. And my hope is that there are enough people talking about it, like you, Dr. B, uh, that will get people thinking maybe we don't do this next really cool build. Maybe we go back and try and undo some of the stuff that we've already put in there and make it actually a tool that will benefit humanity en masse. And I would love to see the day there's an AI that builds itself as the truly human AI instead of the current versions that we have now that are not really about humanity, that an AI that intentionally tries to undo inequities and intentionally tries to help make the human race better at taking care of each other and learning about each other and the beauty behind diversity and multiculturalism. Um, that would be really cool. And if somebody wants to do that, I would totally work for you and figure that out.

SPEAKER_00

We are working on that though. I see a lot of um tools that try to infuse um these things in the AI that they built. For example, this is the um um this AI model was actually built in Switzerland and it's called Apertis. I think I had spoken to you about it before, and what it shows mainly is that it was built with um over a thousand languages. Forty percent of the data is not English, and it is built as a generative AI for transparency and diversity, and it is for free to use in Switzerland, all over the world. I've used it myself. It's a simple large language model. It is not going to create pictures and videos and all the quirky stuff that that you know ChatGPT and Sora and Gemini and and all of this is making, but it is a step towards the right place because it does have consequences. This is one of the studies for a uh colleague that I admire a lot, that is Dr. Leo Celle and Dr. Osmani, Dr. Velishkovska. They put together this study where they showed that machine learning models can learn the patient's race or ethnicities just from the trend of the values of vital signs alone. And so if it already has bias in all the training data set and it identified an ethnicity, would it label it as higher risk for delirium and put on them restraints if they are in the ICU? Would it label them as African American or, you know, of a certain ethnicity, therefore higher chance or lower chance of getting the diagnostics that otherwise they would have gotten if they were labeled from a different group? So this is unfortunately not benign. We actually must be aware that this is having large consequences.

unknown

Yeah.

SPEAKER_02

And I know that they're oh, sorry, I was just gonna say, I I know that there are teams out there that are trying to work on um fine-tuning these models and addressing the bias because people are attuned to it, right? Like once you see it, you can't unsee it. It takes a different kind of human to choose to ignore it, and that's not who we're talking about, right? Like we're not talking about those who see it and ignore it. We're talking about those who truly want to make the systems valuable for all. So AI does allow us to do some good recognition of bias and helps us to start to see patterns and be able to fine-tune models because it it can highlight. It is sometimes it's very obvious, right? It can highlight. Um, and it allows for the ability to learn. If it was static and didn't have the opportunity of learning, I think we'd be in bigger trouble, right? But it does. My chat GPT uh is probably if I asked it, which I really I'm gonna actually try, like if it gave a representation of who chat GPT is to me, like, could you could you personify yourself? It's a strong possibility that they would personify itself as a black woman because I have talked to it and trained it and named it and done things to it to push into the system that African-American culture and nor and African-American worldview is the norm of the conversation we should have. Um, and that is that's how that is a beautiful thing of how I've been able to improve my chat and my AI tool. Not a lot of people have done that with there, but that is one of the beauties about the way our the system is now. We can reinforce positivity and we can reduce bias because AI can help us see the bias, and then we can undo the bias. Now, what I worry about though, and what I I want to see more of is what you brought up earlier. Like who is deciding what gets fixed? Who is deciding how to fix what counts as a harm? Who makes that call, right? If someone said, Oh, just because they changed your hair to like a white woman's hair and it's not your hair, that's not really a harm. But how did you decide that? Because when I saw my face with a white woman's hair, that that was harmful for me. I was like, that's not me. Stop trying to make me look like something that's not me, like represent me as I am. Um, and someone may think that's a very minor thing, and they don't really need to worry about that. A bigger issue is this is the concept that you brought up earlier of assuming that just by vital signs you're race. That is a huge issue. You should not be basing my race off of vital signs. That is big. So who decides which one of these problems needs to be addressed so it doesn't happen again? And uh and if they do that, who is evaluating what good looks like? How are we bringing community in to make sure that they feel like this AI change actually truly represents who they are versus not? Uh so AI is definitely part of the solution to make this better. And I do believe that we can use it for an incredible amount of good. I I just want to know who is the who are the people who are trying to make sure that it's culturally culturally aligned and how are they bringing in community so that we don't exacerbate issues. Um, because there are a lot of people trying to work out there on this.

SPEAKER_00

If the source is biased, scale makes it worse. And unfortunately, what we see is that even if it's wrong, it's wrong confidently. So a polished answer is the source of over reliance and trust, and unfortunately. A polished answer is not the same as a trustworthy one. Exactly. Now, let me ask you a question. You worked in hospital systems before, you were the chief equity officer, health equity officer, and let's say that this the sentence you heard that the health system leadership is going to bring in an AI tool and deploy in the health system. What is the question that you think they are probably not asking from the perspective of the health equity?

SPEAKER_02

How is the well the number one question that I don't believe they would ask is how does this tool negatively impact the patient population that we serve? I think hospital systems oftentimes forget who their number one patients are. They tend to narrow patient experience data down to the most affluent of patients who come in. They're usually very well insured. They're very well capable of following all the treatment modules. They take the time to answer all the questions, right? It's a very small percentage of the people who actually visit the hospital are the ones who answer these surveys. And yet all of the things that we do to try and address quote unquote patient experience and quality are measured off of that, those few. So an AI tool that's brought in will probably have been brought in based off of that very small subset of patients in order to address their needs. And that is probably not the majority of the patients, and it's definitely not those who are underrepresented and uh and overtly ignored in many cases, right? And so my worry will be if a tool is brought in, how are we measuring that the tool is the right tool for other patients? How are we expecting our clinicians to use the tool? Are they taking the time to figure out whether this is a good tool or not a good tool? We have ways of addressing whether things are good or not good, right? Like we know how to address those issues. We have peer-reviewed data, we have all the stuff, we've got experts and other opinions. But are those the right people? Have we asked them because they deal with the patients that we see? Um, and how is the community feel about the tool and what is expected from the tool to take care of them? I would hope that uh in the C-suite, when the decision was made to have this tool brought into the organization, that there was someone like me there to raise their hand and say, how is that going to be helpful to our patients if we look at who our patients are and really ask those critical questions?

SPEAKER_00

When we talk about a bad article, bad science, we are trained as healthcare providers, clinicians, physicians, residents, medical students into the art of how do we determine that a bad research article and bad science has been published. So we do journal clubs and we do literature reviews, we do grand rounds. And one of the things that we teach each other and residents is how do you do a critical appraisal of a research article? In this um comparison, I put on the left side how to put a research, bad research article, the 10 things to look at, and I try to reference that to see how to spot a bad weak AI tool. One of the questions is what is the question of this study? And is it clear and relevant? You can do a study, but is it answering a question? Similarly, is an AI answering a problem or is it looking for a problem to solve and is just shiny? And is the problem it's looking to solve clinically relevant? Are the subject like my patients, in other words, table one, the demographics, is it more in male than female? Is it more than whites? Is it more than Asians? Is it um are there more smokers than non-smokers? Similarly, was it was the AI trained or tested on patients like mine? What is the sample size and the effect size? And for AI we say, what is the size and the diversity of the training data? What happened to the dropouts and why? And here you say, how does it handle missing and incomplete data? Is the study design biased? Is it randomized, blinded? What's the comparator? And here we ask, is the model biased? Is it fair across gender, race, language? How good are the results? We've developed p-value, effect size, and not only p-value alone, and here we ask how reliable are the results? Are they accurate? Are they precise? And is the model calibrated enough? And how is the missing data addressed here? How transparent is the method? Is it explainable? Is it reproducible? Or are we talking about a black box? Does the measurement matter? Clinical versus surrogate outcomes? If I'm only going to look at the, oh I don't know, the ultrasound image of a, you know, a blood vessel, is it truly representative of cardiovascular outcomes? Similarly, for AI, are the outputs clinically meaningful, or is it just statistical noise? Check in the figures, the Aklan-Meyer curves, the data presentation, is it just blown up to show that it's big? And here, how is the performance reported? Is it just showing metrics of adoption of the tool, or does it show metrics of real harm, as you said? And lastly, what are the confounders? What are the limitations? If an article does not have a limitation section, we call out the bias immediately. Well, does the AI have a human oversight? Is there a clear role for clinician judgment and accountability? How does it speak to you?

SPEAKER_02

I completely agree with it. And I think that again, if we if we normalize AI and the tools that we already have, then we should be being more critical of it and not seeing it as this magnificent innovation that has somehow transcended humanity and it's become better, but as another really important research that we have to address, right? If you think about racism in healthcare and the exacerbation of bias in medicine, the thing that I get the most irritated with is this nonstop need to research the problem of is there racism in medicine? If we treated AI, the way we treat trying to prove that there is no racism in medicine, I think we'd be in a much better position because there would be nonstop articles about proving where bias or racism is not built in the system or is built in the system and we need to fix it. The amount of energy that is continuously put into reinforcing the fact that there are inequities that are done because I think people are trying really hard to find a place where that's not true so that we can say at least we have this one. And we won't get there. Instead of using all the time and energy to keep trying to find that, we should be trying to find solutions. If we did the same critical appraisal for AI, we would be better. If we spent a lot of time trying to prove that we can tease out the bias and we can fix it, all that energy will help us create a better system than spending all the effort on just building the system and then hoping somehow that the bias will just go away and pretending that we don't see it or when we do see it because it's a huge problem. It's not a small problem, right? Me with my hair is a small problem, but race based off of vitals is a huge problem. Only when it's huge, we say something about it, it's problematic. I want someone to poke at it, to evaluate it, to try and um debunk it the same way that we focus on debunking inequities in healthcare. And then I think we would be in a fantastic place.

SPEAKER_00

Well, especially that the more irreversible the decision, the less room there is for casual autonomy for AI. And I think that if we just use optimism as our plan for health equity, that optimism needs to be audited. Where I want to kind of take us us, take us to the last portion of this talk, which has been riveting to say the least, is I want us to talk about AI agents. We've moved from a chatbot to now autonomous agents that take several steps without the human having to check, and it's permeating healthcare before it even gets vetted. And that is really scary because here governance is no longer a word that seems complex, it has real consequences. Removing race from the spreadsheet does not remove race from the system. Right. And so when we have an a chatbot that we already are questioning whether it's biased or not, and now the chatbot is no longer a chat bot, it's an actual software that opens other softwares and they autonomously take the sit, make decisions and take action on behalf of the patient. And now we have OpenAI that has entered the healthcare field. We have um even now Co-Pilot that has entered the healthcare field, we have a variety of software for AI that is now creating agents. Now we're starting to say, well, hold on a minute. The human in the loop is not there to decorate the workflow, it is there to protect the patient. So I guess the question that I would have is when we are talking about human in the loop, when we are talking about agentic AI, and when we are talking about governance, how do these three words manifest for you the next frontier that you would be thinking about in the next, not much, in the next year or two? The risks of agencai.

SPEAKER_02

Yeah, well, um, so it's it's a scary, it's a scary thought, right? Because when you go into AI agents, now we're talking about moving from tools to autonomous judgment, right? And um that changes everything uh because the system now is going to be generating the answers, and the system is going to be taking action based off of that, and that the bias in the system has not been addressed, so all of the output is going to be problematic. And an AI agent can do everything from scheduling your appointment to triaging the patient, creating recommendations of treatment, uh, but they're gonna do it without actually having the cultural alignment for that patient in front of them. So the misinterpretation, the misrepresentation, the hallucinations will be exacerbated, the bias will be extreme. And if we don't have people going back and checking on it, the output is going to be even worse. And it's gonna lead to not just misinterpret information and sort of bad science. I think it's gonna lead to also just straight out misdirection. I think the assumptions are gonna be flawed. I think the way we're gonna treat populations is gonna be flawed. We're gonna make some look like they need to be less aggressively treated, and others look like they need to be more aggressively treated. I think how we route patients through the healthcare system is going to be misaligned. We're not gonna understand urgencies the same way. It will look like it's a fake version of efficiency because it's just gonna be automated and not um checked. And that's where I believe the governance gap has to come in because who is going to manage that? Who is going to be the one who is held fully accountable for the outcomes? Where are the audits going to come from? How are we gonna measure harm? Who is at the table deciding what considered is good and not good? I mean, we can't even get it right with our basic standards of quality and safety, right? So I don't understand how we're going to now add on what is supposed to be a mechanism to remove bias because it is just machine-based. Uh I don't know how we're going to go from that to being able to govern AI agents when we don't have any clear understanding of what AI accountability even looks like. So the inequity is going to be severe because we have embedded into the system the bad values already. And the historical voices that have shaped healthcare have been the same ones to shape AI. And that hasn't changed. Um, so governance is going to have to be really specifically focused on trying to undo what is already there, which would mean increasing diversity in leadership and lived experience of who's at the table, bringing community voice in in a way that has never been seen before, not just as like some sort of feedback, but as an authoritative picture, which is always stressful. We have a really hard time, especially in hospital leadership, bringing community as an authority. Um, there's gonna be some versions of new standards that we're gonna have to include cultural alignment, uh, not just a technical check the box, but like some significant cultural uh cohesion for the population that is being seen, which of course is gonna change because the population in Southern California is very different than the population up here in southern Minnesota, right? So it's the cultural alignment is going to have to be different and it's going to have to be brought into this to that AI tool in that agency. It's gonna have to be super transparent. I don't know how to live in that world. I know it's just like my brain is going nuts. I'm like all the things that it have to be.

SPEAKER_00

The oversight and who used to be bring in the community as a care partner. That is beautiful. I want to um I want to play a little game to close today's episode that we always play and we call Rapid Fire. And today's game is going to be what is one word for? So that's how we're going to. So I'm going to ask you a question, and you're going to give me one word um for that. So let's start first by asking. Um first I'm going to ask, are you ready? Yes. Excellent. So what is one word you wish people would stop using casually in AI?

SPEAKER_01

In AI. One word that people would stop using casually. Expert. Excellent. What is one word you wish people used more in AI? Racism. Fantastic. What is one word that would make you trust a healthcare leader talking about AI?

SPEAKER_00

What is one word that's a red flag to tell you that a healthcare system that wants to deploy AI is not ready?

SPEAKER_01

I have two words.

SPEAKER_00

I love it. What is one word for what a healthcare system should audit before deploying an AI in a community?

SPEAKER_01

And finally, what is one word you would say about deploying AI in end-of-life conversations? Beautiful. Beautiful.

SPEAKER_00

Well, time for that part of the episode where we get to say thank you for a delightful conversation. I want to ask everyone curiosity is infectious. Be curious about AI. Be a skeptic, not a cynic. Dr. Sedistrom, any last words?

SPEAKER_02

I am really excited to have been on here to have this conversation with you. I think that AI is giving us a mirror that we've never had before. It's showing us who we can be if we just do it the right way. Um, but it's also showing us right now what makes us uncomfortable. So I would ask that along with the curiosity to sit in the discomfort and use it as an opportunity to figure out how we can make it better. Because I really believe that we have a chance to improve how we deliver care and really bring community in in a way that has never been done before and truly make AI worthy of the build that it took to make it.

SPEAKER_00

Like I said, I want to live in your world. And to everyone listening, thank you for joining us on the AI Ready Doctor Podcast. If this conversation challenged the way you think about AI, ethics, health equity, share it with a colleague. These are not abstract questions anymore. We know it. They are already shaping our care. Until next time, Curiosity is infectious. Catch it here at the AI Ready Doctor Podcast and stay AI ready. See you next time.