
re:Invent 2025: How Heidi Health Leverages GenAI for Medical Documentation Automation and Global Scaling Challenges


Introduction

By transcribing various overseas lectures into Japanese articles, we aim to make valuable but hard-to-find information more accessible. This project embodies that idea, and the presentation covered this time is:

For re:Invent 2025 transcription articles, information is summarized in this Spreadsheet. Please check it as well.

📖 AWS re:Invent 2025: How Heidi Health is leveraging GenAI to transform the global healthcare industry

In this video, Ocha, an engineer at Heidi, shares his experience deploying an AI scribe to over 370,000 clinicians worldwide. Heidi transcribes doctors' consultations and automatically generates clinical notes, reducing administrative tasks. Key learnings include the indispensability of doctors for validating non-deterministic AI outputs, systematizing the evaluation process that started with JupyterHub as "clinicians in the loop," and facing four challenges in global expansion: data sovereignty, model availability, regional differences in healthcare, and regulatory compliance. These were addressed through standardization with Infrastructure as Code and a hybrid approach using Amazon Bedrock and EKS. The keys to success are highlighted as pivoting from a broad platform to a single workflow, incorporating experts as guardians of quality into product development, and building a flexible architecture.

https://www.youtube.com/watch?v=fboYXjGSWJI

  • This article is automatically generated while maintaining the content of the existing lecture as much as possible. Please note that typos or incorrect information may be included.

Main Part


Heidi's Birth and Growth: Revolutionizing Healthcare Administration with AI Scribes and Building Trust with "Clinicians in the Loop"

Hi, I'm Ocha. It's awesome to be here at re:Invent. I've learned so much these past three days, from sessions, mixers, and other startups, about how they're building AI. So it's my turn to contribute my learnings from scaling generative AI in a real healthcare setting. For today's agenda, first I want to tell you a bit about Heidi, our journey to becoming one of the world's largest AI scribes, and how we're navigating the healthcare system with AI. By the end of this lightning talk, I hope you all have some valuable takeaways you can take back to the products you're building.


So before I start, can I get a show of hands? Who here has gone to the doctor and felt rushed, or felt like the doctor was staring at the computer and not giving you the attention you deserve? Yeah, all of you. It's a sad reality. In that moment we feel frustrated, but it's not actually the doctor's intention; they have to deal with the administrative task of writing notes. Heidi is here to solve that problem. At Heidi, we want clinicians to enjoy their jobs without the burden of administrative tasks. We give time back to doctors to provide better patient experiences. And this is just one step in our mission of doubling the world's healthcare capacity by providing an AI care partner.


So let me show you Heidi in action. Imagine you're a doctor, currently in a consultation session with a patient. Once you start transcribing, Heidi will generate clinical notes without requiring any corrections or additional actions. And that's not all Heidi can do. Based on templates created by the doctor, those notes can then be used as a reference to create a patient-facing explanation letter. The doctor can also ask the AI to do clinical research, and Heidi can even suggest follow-up tasks after the consultation. This is exciting because doctors can now focus on the patient while Heidi handles the administrative tasks.


So let me tell you a little bit about our journey. Our founder, Tom Kelly, a practicing doctor, started building a chatbot tool called OSCER to help medical students master clinical exams using early transformer models. Seeing some success, we then dove fully into healthcare, creating a care platform that supported doctors and improved patient care. This worked well for a while, until generative AI came along. That's when we could leverage the power of non-deterministic outputs. We focused on one workflow, clinical note generation, and ended up with Heidi, an ambient AI scribe that reduced the burden of documentation. We grew from a small company in Australia to being recognized as a global player. Looking at the numbers, Heidi is the most used AI scribe in the world: over 370,000 clinicians using Heidi, 10 million consultations a month, the number one adopted AI scribe in Canada, backed by notable investors. And this is just the beginning.

And this is me. I've been with Heidi for almost four years. As one of Heidi's founding engineers, I developed the software infrastructure, observability, security compliance, and all those fun things by myself. Now, as we're growing at incredible speed, I'm focusing on platform engineering: providing the tools and experiences engineers need to develop safely and efficiently.


So let's dive into our first lesson. And that's the challenge of AI in healthcare, which is building trust in AI. When we started building Heidi, we immediately thought about how notes could be personalized and customized to write just like the clinician. Our custom templates were a huge success, and doctors loved them, but as engineers, we kept on thinking how we could be more efficient. We thought about latency, we thought about context window for large language models. But as more clinicians started using Heidi, we encountered more unique cases related to their specialty. For example, how do we make the tone and details of the note summary accurate so that each doctor feels confident and relies on it? This is where we realized that truth matters. Healthcare requires clinical accuracy, and we were trying to validate non-deterministic outputs at scale. You cannot write unit tests for clinical empathy or diagnostic nuances. We needed doctors.


So what did this mean at the very beginning? Since we only had a couple of doctors at the time, we started by giving each doctor a Jupyter notebook. A Jupyter notebook is a tool that data scientists use to write code and do experiments.

And doctors can start doing experiments to connect with LLMs, inputting prompts and transcripts, changing temperatures, and so on. But the problem is that at the very end, each doctor has to aggregate their results to summarize all their tests. To deal with that problem, we started providing a JupyterHub hosted on EC2 as a collaboration tool, so that doctors don't have to aggregate everything at the end.
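To make the notebook workflow concrete, here is a hypothetical sketch (not Heidi's actual code) of the kind of experiment a clinician might run: sweep temperatures for one prompt/transcript pair and collect the results in a single structure, so nothing has to be aggregated by hand afterwards. `call_llm` is a stand-in for a real model client.

```python
def call_llm(prompt: str, temperature: float) -> str:
    """Stub LLM call; a real notebook would hit a model endpoint here."""
    return f"[note generated at T={temperature:.1f} for {len(prompt)} chars]"

def sweep_temperatures(prompt_template: str, transcript: str,
                       temperatures: list[float]) -> list[dict]:
    """Run one prompt/transcript pair at several temperatures and
    collect every output in one table-like structure."""
    prompt = prompt_template.format(transcript=transcript)
    return [
        {"temperature": t, "output": call_llm(prompt, t)}
        for t in temperatures
    ]

results = sweep_temperatures(
    "Write a clinical note for this consultation:\n{transcript}",
    "Patient reports a persistent cough for two weeks...",
    temperatures=[0.0, 0.3, 0.7],
)
for row in results:
    print(row["temperature"], row["output"])
```

Because every run lands in the same `results` structure, summarizing across doctors becomes a merge rather than a manual copy-paste exercise.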


Obviously, this doesn't scale. Not every clinician or doctor is going to be a coder. It works fine for small-scale tests with a few doctors. So how do we scale it? Think about it this way: what does a typical clinician's workflow in Heidi look like for transcription and note generation? First, we needed data points that we could use to conduct evaluations, but we can't use user data, so how do we do that in a test environment?


One way doctors do this is by conducting mock consultations or case studies with Heidi users, but most importantly, the latest technique we're using is to generate synthetic data using LLMs to generate consultations in both audio and text formats. With enough data, clinicians can then start doing more evaluations, such as word error rates, template adherence checks – basically ensuring that the template created by the clinician is medically safe – hallucination rate checks, and so on. And this process is what we call "clinicians in the loop."
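As an illustration of one of the metrics mentioned above, here is a textbook word error rate (WER) computation over a reference and a hypothesis transcript. This is a generic dynamic-programming implementation, not Heidi's evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + insertions + deletions) / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # Classic edit-distance table over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

# One substituted word out of five -> WER of 0.2.
print(word_error_rate("the patient has a cough", "the patient had a cough"))
```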


As Heidi began to hire more clinicians, the evaluation process needed to be scalable. At this stage, engineers began to introduce more tools that could assist clinicians. For example, internal tools to evaluate flagged sessions reviewed in a test environment, building connections between sessions and LLM contexts to gain a better understanding, and LLM as a judge tools to run at scale. All these processes can be used as a feedback loop to improve models, prompts, and medical safety. This has shaped our product and engineering decisions, hiring, and go-to-market strategy.
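The "LLM as a judge" pattern can be sketched as follows; the rubric wording and the `SCORE: <n> | ISSUES: <text>` reply format are invented for illustration and are not Heidi's internal tooling.

```python
# Hypothetical judge rubric: ask a second model to grade a generated note
# against the source transcript and flag hallucinated claims.
JUDGE_TEMPLATE = """You are a clinical-safety reviewer.
Transcript:
{transcript}

Generated note:
{note}

Score the note from 1-5 for factual accuracy against the transcript,
and list any hallucinated claims. Reply as: SCORE: <n> | ISSUES: <text>"""

def build_judge_prompt(transcript: str, note: str) -> str:
    """Render the judge prompt for one transcript/note pair."""
    return JUDGE_TEMPLATE.format(transcript=transcript, note=note)

def parse_verdict(reply: str) -> dict:
    """Parse a reply like 'SCORE: 4 | ISSUES: none' into a record."""
    score_part, issues_part = reply.split("|", 1)
    return {
        "score": int(score_part.replace("SCORE:", "").strip()),
        "issues": issues_part.replace("ISSUES:", "").strip(),
    }

print(parse_verdict("SCORE: 4 | ISSUES: none"))
```

Because the verdict is parsed into a structured record, these judgments can be run at scale and fed back into the same loop that improves models, prompts, and medical safety.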


Challenges of Global Expansion: Overcoming Data Sovereignty, Model Availability, Healthcare Diversity, and Regulatory Compliance with Standardization and a Hybrid Approach

Now, speaking of scaling, scaling clinical experts is one challenge. Next is how we scale Heidi outside of Australia. As we started rolling Heidi out globally, we quickly realized that there is no single standard for healthcare. We faced four distinct layers of complexity that needed to be solved simultaneously.


First is data sovereignty. This is not just a storage issue; it's about strict data locality and network architecture. For example, in Australia we must strictly use ap-southeast-2 or the newer Melbourne region, ap-southeast-4, whereas in the US we might utilize us-east-1 or us-west-2. This is not just about where the data is stored, but also how it moves. We need to keep our workloads private using a well-architected VPC network and precisely control how systems communicate with each other within those specific boundaries.


And second is model availability. If you're only building for the US, it's relatively easy because models are available everywhere here in the US. You can choose almost any provider you want, but the moment you try to expand into a new region, that luxury disappears. We suddenly have to consider other options because the models we wanted were simply not available or compliant in those local zones.


And third is the reality of healthcare itself. A GP consultation in Australia looks very different from a primary care consultation in New York. This is not just an accent issue, but it's about training, consultation flow, and medical terminology. Heidi must adapt to these nuances to accurately document consultations.


And lastly, we're building on shifting sands. Gen AI is a new frontier that's actively influencing the regulatory landscape. Navigating different regions means managing different compliance requirements simultaneously. This is not just a legal headache, but it directly influences our product roadmap and daily engineering decisions.


Faced with these four major hurdles, we had to design a solution that could address all of them. So, from a technical perspective, how did we actually address these challenges? The answer lies in standardization. All of our AWS infrastructure needs to be standardized across all regions. We leverage Infrastructure as Code to ensure consistency in our deployments. This enables a flexible architecture that allows us to easily deploy to new regions without reinventing the wheel. Essentially, we treat new regions as plug-and-play templates. You'll notice an EKS cluster in the diagram. It's a bit small, so I'll make it bigger over there. This is central to our strategy for model availability.
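The "plug-and-play template" idea can be sketched as follows. Field names and the `heidi-` naming are invented for the example, and a real deployment would use an IaC tool such as Terraform or CDK rather than plain Python; the point is that every region renders from one shared template.

```python
def region_stack(region: str, vpc_cidr: str) -> dict:
    """Render the standard per-region stack from one shared template."""
    return {
        "region": region,
        "vpc": {"cidr": vpc_cidr, "private_subnets_only": True},
        "eks_cluster": f"heidi-{region}",
        "data_residency": region,  # data never leaves this region
    }

# Spinning up a new region is just another call with new parameters.
sydney = region_stack("ap-southeast-2", "10.1.0.0/16")
virginia = region_stack("us-east-1", "10.2.0.0/16")
print(sydney["eks_cluster"], virginia["eks_cluster"])
```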


When we talk about immediate availability, when entering new regions, we use LLM providers like Amazon Bedrock, which are already available and compliant in that specified region. This solves the immediate cold start problem. However, in the long term, having the infrastructure to support self-hosted models is crucial. This is where EKS comes into play because AWS EKS supports most global regions. With our infrastructure templates ready, we can provide our own inference models anywhere. This hybrid approach – Bedrock for speed and EKS for control – solves global model availability.
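The routing decision behind this hybrid approach can be sketched like this; the availability table, model names, and endpoint strings are all invented for the example, not real Bedrock model IDs or Heidi endpoints.

```python
# Hypothetical availability table: which managed models exist per region.
BEDROCK_MODELS = {
    "us-east-1": {"model-a", "model-b"},
    "ap-southeast-2": {"model-a"},
}

def resolve_endpoint(region: str, model: str) -> str:
    """Pick a serving backend for `model` in `region`, honoring data locality:
    a managed provider where the model is already available and compliant,
    otherwise a self-hosted inference endpoint on the regional EKS cluster."""
    if model in BEDROCK_MODELS.get(region, set()):
        return f"bedrock:{region}:{model}"          # managed, already compliant
    return f"eks:{region}:self-hosted/{model}"      # our own inference cluster

print(resolve_endpoint("ap-southeast-2", "model-b"))
```

Either branch keeps inference inside the caller's region, so the routing choice never conflicts with the data-sovereignty constraints described earlier.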


But as I mentioned, healthcare isn't just code, right? It's people. Even after the technical pipes are laid, we still face huge non-technical hurdles. It's about building trust. Trust begins by speaking the language. And I'm not just talking about French or Spanish; I'm talking about medicine. We employ clinical ambassadors in every region where we operate. These are doctors who believe in Heidi's mission and provide concrete local support. They're not just consultants. They provide specific support to ensure Heidi speaks the local medical dialect. They validate that Heidi not only translates words but also understands regional practice patterns, ensuring outputs that feel natural to a New York GP or a Sydney specialist.


Finally, we tackle complex regulatory requirements through a rigorous compliance network. We've established a dedicated in-house legal and compliance team that manages the changing landscape of international laws. We also collaborate with external partners, especially focusing on medical safety. This ensures that while we move fast with our infrastructure, we never compromise on safety. This plug-and-play technical architecture, combined with a human-in-the-loop trust strategy, allows us to scale globally while staying local.


So, if there's one thing I want you all to take away today, it's that technology isn't the only product. What made Heidi successful wasn't just the release of foundation models; it was the pivot. We shifted from a broad care platform that tried to do everything to focusing on a single workflow that brings immediate, tangible value to doctors. Don't try to boil the ocean. Just perfectly solve one painful problem.

And second, in a generative AI world, humans are more important than ever. Doctors and clinicians are the core part of our product. We learned to treat our experts not just as testers, but as our biggest asset. They are the guardians of quality. And finally, to all the developers out there, build a flexible architecture from day one. This isn't just about code quality. It's about business survival. It's this flexibility that allows you to respond to changing regulatory environments and expand into new regions with entirely different requirements. Your architecture should be an enabler of expansion, not a bottleneck.


And that's it. I hope you learned one or two things from our journey. And if you're interested in building the future of healthcare or have any questions about Heidi, I'd love to chat. You can learn more about Heidi via the QR link on the left, and you can find me via the QR link on the right. Thank you very much.


  • This article was automatically generated using Amazon Bedrock, maintaining the original video's information as much as possible.

Discussion