Item type: Conference Paper

Self-Refinement Strategies for LLM-based Product Attribute Value Extraction

Document type

Text/Conference Paper

Authors

Brinkmann, Alexander
Bizer, Christian

Publisher

Gesellschaft für Informatik, Bonn

Abstract

Structured product data, represented as attribute-value pairs, is crucial for e-commerce platforms because it enables features such as faceted product search and attribute-based product comparison. However, vendors often supply unstructured product descriptions, so attribute values must be extracted to ensure data consistency and usability. Large language models (LLMs), including OpenAI's GPT-4o, have demonstrated their potential for product attribute value extraction in few-shot scenarios. Recent research has shown that self-refinement techniques can improve the performance of LLMs on tasks such as code generation and text-to-SQL translation. For other tasks, applying these techniques has only increased costs, because additional tokens must be processed, without improving performance. This paper investigates applying two self-refinement techniques, error-based prompt rewriting and self-correction, to the product attribute value extraction task. The self-refinement techniques are evaluated across zero-shot, few-shot in-context learning, and fine-tuning scenarios. Experimental results reveal that both self-refinement techniques have only a marginal impact on the performance of GPT-4o across the different scenarios while significantly increasing processing costs. For attribute value extraction scenarios involving training data, fine-tuning yields the highest performance, and the ramp-up costs of fine-tuning are amortized as the number of product descriptions grows.
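
The cost claim in the final sentence can be read as a break-even calculation. The following is a minimal sketch, not the authors' cost model: it assumes a one-time ramp-up cost for fine-tuning and constant per-description costs for both the fine-tuned model and a prompted model. The function name, parameters, and figures are hypothetical placeholders and are not results from the paper.

```python
import math


def break_even_descriptions(ramp_up_cost, cost_per_item_finetuned, cost_per_item_prompted):
    """Return the smallest number of product descriptions at which the total
    cost of fine-tuning is no higher than the total cost of prompting.

    Returns None when the fine-tuned model is not cheaper per description,
    since the ramp-up cost is then never recovered.
    All costs are in the same arbitrary unit (hypothetical values).
    """
    saving_per_item = cost_per_item_prompted - cost_per_item_finetuned
    if saving_per_item <= 0:
        return None
    return math.ceil(ramp_up_cost / saving_per_item)


# Hypothetical figures, chosen only to illustrate the calculation.
n = break_even_descriptions(
    ramp_up_cost=100.0,
    cost_per_item_finetuned=0.25,
    cost_per_item_prompted=0.75,
)
print(n)
```

Under these assumptions, fine-tuning pays off only when the fine-tuned model is cheaper per description than the prompted alternative; the larger that saving, the sooner the ramp-up cost is recovered.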

Description

Brinkmann, Alexander; Bizer, Christian (2025): Self-Refinement Strategies for LLM-based Product Attribute Value Extraction. Datenbanksysteme für Business, Technologie und Web - Workshopband (BTW 2025). DOI: 10.18420/BTW2025-132. Gesellschaft für Informatik, Bonn. PISSN: 2944-7682. pp. 291-304. Workshop Data Engineering for Data Science (DE4DS). Bamberg. 3-7 March 2025
