
Multi-Modal LLMs in Agriculture: A Comprehensive Review

  • Ranjan Sapkota
  • Rizwan Qureshi
  • Muhammad Usman Hadi
  • Syed Zohaib Hassan
  • Ferhat Sadak
  • Maged Shoman
  • Muhammad Sajjad
  • Fayaz Ali Dharejo
  • Achyut Paudel
  • Jiajia Li
  • Zhichao Meng
  • John Shutske
  • Manoj Karkee

Research output: Contribution to journal, Article, peer-reviewed

Citations (SciVal): 6

Abstract

Although Multi-Modal Large Language Models (MM-LLMs) have emerged rapidly and found applications across various scientific fields, their applicability in agriculture remains only partially explored. This paper conducts an in-depth review of MM-LLMs in agriculture, focusing on how they can be developed and implemented to optimize agricultural processes, increase efficiency, and reduce costs. Recent studies have explored the capabilities of MM-LLMs in agricultural information processing and decision-making. Despite these advancements, significant gaps persist, particularly in addressing domain-specific challenges such as variable data quality and availability, integration with existing agricultural systems, and the creation of robust training datasets that accurately represent complex agricultural environments. Moreover, a comprehensive understanding of the capabilities, challenges, and limitations of MM-LLMs in agricultural information processing and application is still missing. Exploring these areas is crucial to giving the community a broader perspective and a clearer understanding of MM-LLMs' applications, and to establishing a benchmark for the current state and emerging trends in this field. To bridge this gap, this survey reviews the progress of MM-LLMs and their utilization in agriculture, with an additional focus on 11 key research questions (RQs), of which 4 are general and 7 are agriculture-focused. By addressing these RQs, this review outlines the current opportunities, challenges, limitations, and future roadmap for MM-LLMs in agriculture. The findings indicate that MM-LLMs not only simplify complex agricultural challenges but also significantly enhance decision-making and improve the efficiency of agricultural image processing. These advancements position MM-LLMs as an essential tool for the future of farming.
For continued research and understanding, an organized and regularly updated list of papers on MM-LLMs is available at https://github.com/JiajiaLi04/Multi-Modal-LLMs-in-Agriculture

Note to Practitioners: Motivated by the need to optimize agricultural practices, this paper investigates the use of Multi-Modal Large Language Models (MM-LLMs) to improve efficiency and decision-making in agriculture. We examine critical RQs to reveal the capabilities and challenges of MM-LLMs and their potential applications in the agricultural sector. Looking ahead, our findings suggest a promising future for the integration of MM-LLMs in agriculture, potentially changing how farms are managed and operated.

Original language: English
Pages (from-to): 22510-22540
Number of pages: 31
Journal: IEEE Transactions on Automation Science and Engineering
Volume: 22
Publication status: Published - 19 Sept 2025

Funding

Funders: Funder number
National Science Foundation
UCF AI Research To Novel Agricultural EngineeRing Applications
US-DANIFA
PARTNER
National Institute of Food and Agriculture: AWD003473, AWD004595, 1029004
University of Central Florida: 2024-67022-41788
U.S. Department of Agriculture: 1031712
