Evaluating and Advancing Large Language Models for Water Knowledge Tasks in Engineering and Research
Reference Type:
Journal Article
Xu, Boyan, Zihao Li, Yuxin Yang, Guanlan Wu, Chengzhi Wang, Xiongpeng Tang, Yu Li, et al. 2025. “Evaluating and Advancing Large Language Models for Water Knowledge Tasks in Engineering and Research.” Environmental Science & Technology Letters 12 (3): 289–96. https://doi.org/10.1021/acs.estlett.5c00038
Although large language models (LLMs) have demonstrated significant value in numerous fields, little research has evaluated their performance or enhanced their capabilities within water science and technology. This study first assessed the performance of eight foundational models (i.e., GPT-4, GPT-3.5, Gemini, GLM-4, ERNIE, QWEN, Llama3-8B, and Llama3-70B) on a wide range of water knowledge tasks in engineering and research, using a purpose-built evaluation suite called WaterER (1043 tasks). GPT-4 excelled across diverse water knowledge tasks in engineering and research. Llama3-70B was best for Chinese engineering queries, while Chinese-oriented models outperformed GPT-3.5 in English engineering tasks. Gemini demonstrated specialized academic capabilities in wastewater treatment, environmental restoration, drinking water treatment, sanitation, anaerobic digestion, and contaminants. To further advance LLMs, we employed prompt engineering (i.e., five-shot learning) and fine-tuned the open-source Llama3-8B into a specialized model, namely, WaterGPT. WaterGPT exhibited enhanced reasoning capabilities, outperforming Llama3-8B by more than 135.4% on English engineering tasks and 18.8% on research tasks. Additionally, fine-tuning proved to be more reliable and effective than prompt engineering. Collectively, this study established various LLMs’ baseline performance in water sectors while highlighting the importance of robust evaluation frameworks and augmentation techniques for ensuring the effective and reliable use of LLMs.