CINECA IRIS Institutional Research Information System

Quality and Readability of Large Language Models' Responses to Oral Lichen Planus Patients' FAQs

Alessandro Polizzi; Gaetano Isola; Vito Carlo Alberto Caponio; Alan Roger Santos‐Silva; José González‐Serrano; Rui Albuquerque; Vlaho Brailo; Arwa Mohammad Farag; María Pía López Jornet; Jairo Robledo Sierra; Thomas Peter Sollecito; Hongxia Dan; Márcio Diniz Freitas; Caroline Bissonnette; Paswach Wiriyakijja; Gonzalo Hernández; Rosa María López‐Pintor

2026-01-01

Abstract

Objective: To evaluate the quality and readability of responses from large language models (LLMs) to frequently asked questions (FAQs) about oral lichen planus (OLP).

Methods: We evaluated the responses of three LLMs (ChatGPT-4o, Gemini 2.0 Flash Experimental, and Copilot) to 13 patient-centered FAQs about OLP. Questions were identified using query tools, and answers were assessed by 14 oral medicine experts using the Quality Assessment of Medical Artificial Intelligence (QAMAI) tool. Readability was analyzed with the Flesch Reading Ease (FRE) and Flesch–Kincaid Grade Level (FKG) tools.

Results: All LLMs provided generally accurate and relevant responses, with median QAMAI scores indicating “good” to “very good” quality. ChatGPT achieved slightly higher completeness, particularly for questions on OLP definition and treatment. Provision of references was inconsistent across all chatbots. Readability analysis showed that most responses required college-level literacy: ChatGPT produced the most complex texts, Gemini occasionally produced more accessible outputs, and Copilot fell in between.

Conclusions: LLMs may have potential as adjunctive tools for patient education in OLP, although they remain limited by incomplete information, inconsistent references, and suboptimal readability. Future research should incorporate longitudinal evaluation and training of LLMs to develop models that deliver accurate, accessible information tailored to users' literacy levels.
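For readers unfamiliar with the two readability measures named in the abstract, the sketch below computes them from the standard published Flesch Reading Ease and Flesch–Kincaid Grade Level formulas. It is an illustration only: the abstract does not say which software the authors used, and the helper names (`count_syllables`, `text_stats`, `flesch_reading_ease`, `flesch_kincaid_grade`) as well as the vowel-group syllable heuristic are assumptions made here, so scores will differ somewhat from those of dedicated tools.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count vowel groups (a heuristic assumed here)."""
    w = word.lower()
    count = len(re.findall(r"[aeiouy]+", w))
    # Treat a trailing "e" as silent ("made"), except in "-le" endings ("table").
    if w.endswith("e") and not w.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text: str) -> tuple[int, int, int]:
    """Return (words, sentences, syllables) using simple regex tokenization."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    syllables = sum(count_syllables(w) for w in words)
    return len(words), len(sentences), syllables


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade required."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    # Hypothetical sample sentence, not taken from the study's chatbot outputs.
    sample = "Oral lichen planus is a chronic condition. It affects the mouth."
    w, s, syl = text_stats(sample)
    print(f"FRE: {flesch_reading_ease(w, s, syl):.1f}")
    print(f"FKGL: {flesch_kincaid_grade(w, s, syl):.1f}")
```

Both formulas depend only on average sentence length and average syllables per word, which is why long sentences and polysyllabic medical terms push a response toward college-level scores.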

Year: 2026

Keywords: accuracy; large language models; oral lichen planus; oral potentially malignant disorders; patient education; readability

Appears in types: 1.1 Journal article

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14085/60741

Warning: the data displayed have not been validated by the university.
