FiLLM -- A Filipino-optimized Large Language Model based on Southeast Asia Large Language Model (SEALLM)

Authors: Carlos Jude G. Maminta, Isaiah Job Enriquez, Deandre Nigel Nunez, Michael B. Dela Fuente

This study presents FiLLM, a Filipino-optimized large language model designed to enhance natural language processing (NLP) capabilities in the Filipino language. Built upon the SeaLLM-7B 2.5 model, FiLLM uses Low-Rank Adaptation (LoRA) fine-tuning to reduce memory requirements while maintaining task-specific performance. The model was trained and evaluated on diverse Filipino datasets to address key NLP tasks, including Named Entity Recognition (NER), Part-of-Speech (POS) tagging, Dependency Parsing, and Text Summarization. Performance comparisons with the calamanCy model were conducted using F1 Score, Precision, Recall, Compression Rate, and Keyword Overlap metrics. Results indicate that calamanCy outperforms FiLLM in several aspects, demonstrating its effectiveness in processing Filipino text with improved linguistic comprehension and adaptability. This research contributes to the advancement of Filipino NLP applications by providing an optimized, efficient, and scalable language model tailored for local linguistic needs.
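The abstract names its evaluation metrics without defining them. As a minimal sketch, the following Python shows one common way such scores could be computed. The function names are hypothetical, and the definitions of compression rate (summary tokens divided by source tokens) and keyword overlap (share of reference tokens that reappear in the summary) are assumptions, not necessarily the definitions the authors used.

```python
def precision_recall_f1(gold, predicted):
    """Score predicted annotations against gold ones.

    Both arguments are collections of hashable items, for example
    (start, end, label) tuples for NER spans. Exact match is assumed.
    """
    gold, predicted = set(gold), set(predicted)
    true_pos = len(gold & predicted)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def compression_rate(source, summary):
    # Assumed definition: whitespace-token count of summary over source.
    source_tokens = source.split()
    return len(summary.split()) / len(source_tokens) if source_tokens else 0.0


def keyword_overlap(reference, summary):
    # Assumed definition: fraction of distinct reference tokens
    # (lowercased, whitespace-split) that also occur in the summary.
    reference_tokens = set(reference.lower().split())
    summary_tokens = set(summary.lower().split())
    if not reference_tokens:
        return 0.0
    return len(reference_tokens & summary_tokens) / len(reference_tokens)
```

For instance, `precision_recall_f1` could be called with the span sets produced by each system on the same test sentences, so that both models are scored against identical gold annotations.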

Subjects: Computation and Language, Artificial Intelligence

Published: 2025-05-25 06:36:26 UTC

2505.18995
