Schütz, Michael and Kludt, Niklas (2024) AI-based Review Analysis. Other thesis, OST Ostschweizer Fachhochschule.
Full text not available from this repository.
Abstract
The rapid expansion of e-commerce in Switzerland, especially during the COVID-19 pandemic, has led to a significant increase in product reviews.
This term paper addresses the challenge of efficiently summarizing key aspects of a product from its reviews into keywords to help customers make decisions.
The goal was to develop an AI solution capable of extracting and condensing product review insights into concise positive and negative keywords, following an approach similar to the AI-driven keyword generation used by Digitec Galaxus AG.
The final solution uses a single large language model (LLM) to provide both flexibility and robust performance while requiring no training.
OpenAI's GPT-4o mini model was chosen for its cost-effectiveness and large context window.
The workflow, implemented in Python using Jupyter notebooks, systematically extracts review aspects and aggregates them into three positive and three negative keywords per product.
The process follows a four-stage pipeline:
(0) Review Data Composition, where raw review data is organized for analysis;
(1) Aspect Extraction, where positive and negative aspects are isolated for easier reprocessing;
(2) Chunk Summarization, where similar aspects are consolidated across multiple reviews to reduce complexity and fit within the context limit of the LLM; and
(3) Aspect Unification, where aspects are aggregated across all chunks to produce the final keywords.
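The four stages above can be pictured as a small map-reduce style pipeline. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `(stage, text)` call interface, the chunk size, and the line-based serialization are all hypothetical, and `stub_llm` is a toy keyword matcher that only stands in for the model so the sketch runs offline.

```python
from collections import Counter
from typing import Callable, Dict, List

Aspects = Dict[str, List[str]]            # {"positive": [...], "negative": [...]}
LLMCall = Callable[[str, str], Aspects]   # hypothetical interface: (stage name, input text) -> aspects
POLARITIES = ("positive", "negative")


def compose_reviews(raw_reviews: List[str]) -> List[str]:
    """Stage 0: normalize whitespace and drop empty reviews."""
    cleaned = (" ".join(r.split()) for r in raw_reviews)
    return [r for r in cleaned if r]


def chunked(items: List[Aspects], size: int) -> List[List[Aspects]]:
    """Split the extracted aspects into groups small enough for one model call."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def render(aspect_sets: List[Aspects]) -> str:
    """Serialize aspects as text, one 'polarity: aspect' entry per line."""
    lines = []
    for aspects in aspect_sets:
        for polarity in POLARITIES:
            lines.extend(f"{polarity}: {a}" for a in aspects.get(polarity, []))
    return "\n".join(lines)


def run_pipeline(raw_reviews: List[str], llm: LLMCall,
                 chunk_size: int = 20, top_k: int = 3) -> Aspects:
    reviews = compose_reviews(raw_reviews)                    # stage 0
    extracted = [llm("extract", r) for r in reviews]          # stage 1
    summaries = [llm("summarize", render(chunk))              # stage 2
                 for chunk in chunked(extracted, chunk_size)]
    unified = llm("unify", render(summaries))                 # stage 3
    return {p: unified.get(p, [])[:top_k] for p in POLARITIES}


def stub_llm(stage: str, text: str) -> Aspects:
    """Offline stand-in for the model (assumption: a real LLM call replaces this)."""
    if stage == "extract":
        lexicon = {"positive": ["battery", "display"], "negative": ["price", "noise"]}
        lowered = text.lower()
        return {p: [w for w in words if w in lowered] for p, words in lexicon.items()}
    # "summarize" and "unify": merge duplicates and rank by frequency.
    counts = {p: Counter() for p in POLARITIES}
    for line in text.splitlines():
        polarity, _, aspect = line.partition(": ")
        if polarity in counts and aspect:
            counts[polarity][aspect] += 1
    return {p: [a for a, _ in c.most_common()] for p, c in counts.items()}
```

Calling `run_pipeline(reviews, stub_llm)` on a list of review strings returns a dictionary with at most three positive and three negative keywords. Passing the model call in as a parameter keeps the stage logic testable without network access.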
Additional stages to extend functionality, such as generating summaries or incorporating user-defined criteria, are described in this term paper.
Verification showed that the pipeline can effectively summarize large numbers of product reviews into positive and negative keywords, capturing many of the key aspects mentioned in the reviews.
However, some problems were observed during the verification phase.
The keywords generated were often overly broad or generic, making it difficult to identify specific product features.
In addition, the system sometimes merged multiple features into a single keyword, so that individual features were lost.
Despite these challenges, the approach demonstrated promising results and strong potential as a proof of concept.
However, further development is needed to improve the precision of keyword generation before production use.
Item Type: | Thesis (Other) |
---|---|
Subjects: | Area of Application > Consumer oriented; Area of Application > Multimedia; Technologies > Programming Languages > Python; Technologies > Databases |
Divisions: | Bachelor of Science FHO in Informatik > Student Research Project |
Depositing User: | OST Deposit User |
Contributors: | Thesis advisor: Politze, Daniel (email: UNSPECIFIED) |
Date Deposited: | 18 Feb 2025 12:29 |
Last Modified: | 18 Feb 2025 12:29 |
URI: | https://eprints.ost.ch/id/eprint/1256 |