The integration of artificial intelligence (AI), particularly generative tools such as chatbots, into clinical reasoning offers new opportunities to enhance diagnostic decision-making and learning. Yet evidence on the perceived utility of such tools is limited. This session will present the findings of a randomized controlled study, conducted in Indonesia, Kenya and the Netherlands, that evaluated whether the use of a large language model (LLM) improved diagnostic performance.