Thoughts on Large Language Models in pre-clinical medical education
I gave a talk on AI/LLMs (aka ChatGPT and its cousins) in medical education at the UMEDS section of the Association of Pathology Chairs meeting on July 17, 2023.
https://www.apcprods.org/apc2023
Some highlights.
With respect to LLMs, any literature from before 2022 is irrelevant. Forget the IBM Watson debacle.
Generative LLMs should be thought of in a different context from the various black-box algorithms that predict patient outcomes or classify images. I find those problematic because they distract from understanding the underlying biology. But LLMs are a different technology for different uses.
With respect to education, there are two major categories of use cases. The first is that the instructor can use an LLM to generate an essentially unlimited number of clinical scenarios and accompanying questions for formative and summative exercises and tests. The system can generate harder or easier questions as needed. The material can be based on pasted-in reference works or on the LLM's existing knowledge base. Generated questions can be multiple choice, matching, or essay, and probably other formats I have not explored. ChatGPT can probably grade the essays as well, although this needs to be explored further. It can also summarize chapters, generate learning objectives, and do content creation generally. In these cases, errors and hallucinations are not a serious issue because the instructor can edit, correct, or enhance the material as needed.
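For instructors comfortable with a little scripting, this workflow can also be automated rather than done by hand in the chat window. Here is a minimal sketch, assuming the OpenAI Python client; the function name, prompt wording, and model choice are my own illustrative assumptions, not a recommendation.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def generate_questions(reference_text: str, n: int = 5,
                       difficulty: str = "moderate") -> str:
    """Hypothetical helper: draft board-style questions from pasted-in material."""
    prompt = (
        f"Using only the reference material below, write {n} {difficulty} "
        "USMLE-style multiple-choice questions, each with one correct answer, "
        "four plausible distractors, and a brief explanation.\n\n"
        f"REFERENCE:\n{reference_text}"
    )
    response = client.chat.completions.create(
        model="gpt-4",  # any chat-capable model works here
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

Either way, the instructor reviews and edits the output before it reaches students, which is exactly why hallucinations are tolerable in this use case.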
The more controversial case is for students to use it for the same things, e.g., generating flash cards or looking things up. The concern is that the known error rate of LLMs would be problematic. First of all, students moved on from conventional textbooks and lectures years ago; Sketchy, Pathoma, First Aid, and their like are the dominant texts. The incoming class of first-year medical students have all already used some LLM (probably ChatGPT). So educators need to get on the wave or they will be under it.
Secondly, students I have talked to are acutely aware that current-generation LLMs make errors. When I have asked students about errors in earlier generations of study aids, the usual response is that they don't care; they just want to get into the game and will worry about details later. I think this is a healthy recognition that medicine is lifelong learning, coupled with skepticism about any particular data source.
It is worth noting that the convenience of LLMs encourages exploration of scenarios that would be harder to pursue with other resources. For example, a student could ask for a differential diagnosis for a given patient presentation. The LLM will make suggestions. The student can then ask about those entities, including their definitions and how to distinguish among them (history, physical exam, imaging, labs). As additional answers come back, the student can keep asking questions and go down any rabbit hole they choose. At some point they can switch to more conventional sources, but by then they will have a framework on which to hang the other material.
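To make the mechanics of that rabbit hole concrete: a chat LLM is re-sent the entire conversation on every turn, which is what lets each follow-up question build on the previous answers. A minimal sketch, again assuming the OpenAI Python client; the ask helper and the sample questions are hypothetical.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
messages = []      # running conversation; the model sees all of it each turn

def ask(question: str) -> str:
    """Hypothetical helper: send one follow-up in the ongoing conversation."""
    messages.append({"role": "user", "content": question})
    reply = client.chat.completions.create(model="gpt-4", messages=messages)
    answer = reply.choices[0].message.content
    messages.append({"role": "assistant", "content": answer})  # keep context
    return answer

print(ask("Give a differential diagnosis for a 24-year-old with fatigue, "
          "lymphadenopathy, and splenomegaly."))
print(ask("How would history, physical exam, imaging, and labs "
          "distinguish the entities you listed?"))
```

In practice students will simply use the ChatGPT web interface, which maintains this history for them; the point of the sketch is only that the context accumulates, so each question can be narrower than the last.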
One concern is that this approach will deepen student isolation and keep students from learning from each other. One remedy is to include the LLM in in-class exercises. In a team-based learning event, students traditionally complete one round by themselves and one round with their team. A third round can be added in which an LLM answers the same question. The students can then compare the LLM's answer with their own and decide whether they agree or disagree. This promotes a critical eye toward LLM answers while also letting students see whether the answer adds something to theirs.
What is missing from this discussion is, of course, data. It will be the work of academic educators to explore these uses, determine best practices, and highlight the traps. This is the way.