2025.emnlp-industry.78@ACL

#1 GSID: Generative Semantic Indexing for E-Commerce Product Understanding

Authors: Haiyang Yang, Qinye Xie, Qingheng Zhang, Chen Li Yu, Huike Zou, Chengbao Lian, Shuguang Han, Fei Huang, Jufeng Chen, Bo Zheng

Structured representation of product information is a major bottleneck for the efficiency of e-commerce platforms, especially second-hand e-commerce platforms. Currently, most product information is organized around manually curated product categories and attributes, which often fail to adequately cover long-tail products and do not align well with buyer preferences. To address these problems, we propose Generative Semantic InDexings (GSID), a data-driven approach to generating structured product representations. GSID consists of two key components: (1) pre-training on unstructured product metadata to learn in-domain semantic embeddings, and (2) generating more effective semantic codes tailored for downstream product-centric applications. Extensive experiments validate the effectiveness of GSID, and it has been successfully deployed on a real-world e-commerce platform, achieving promising results on product understanding and other downstream tasks.

Subject: EMNLP.2025 - Industry Track