Semantic Clustering Toolbox

Derungs, Lukas (2025) Semantic Clustering Toolbox. Other thesis, OST Ostschweizer Fachhochschule.

FS 2025-BA-EP-Derungs-Entwicklung einer Toolbox zur Evaluation von LLM-basiertem s.pdf - Supplemental Material

Abstract

This thesis presents the design, development, and evaluation of a semantic clustering toolbox intended to support non-technical users in analyzing open-ended survey responses.
The work is motivated by the need to speed up the labor- and time-intensive analysis of survey data in research and evaluation contexts.

The toolbox allows users to upload survey data, perform semantic clustering, analyze sentiment, and export results through a simplified interface.
Developed using a Design Science Research methodology, it integrates embedding models for semantic representation, the K-means algorithm for clustering, dimensionality reduction for visualization, and language models for sentiment analysis.
A notable feature of the system is the inclusion of cluster stability visualizations, which help users interpret the consistency of clustering outcomes across multiple runs.
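
To make the idea of clustering consistency across multiple runs concrete, the following is a minimal sketch and not the thesis implementation. It assumes that stability is measured as the mean pairwise Rand index between K-means runs started from different random seeds; the abstract does not say which measure the toolbox uses. The names `kmeans`, `rand_index`, and `stability` are hypothetical, and the 2D points stand in for real text embeddings.

```python
import random
from itertools import combinations


def kmeans(points, k, seed, iters=50):
    """Plain Lloyd's K-means on a list of equal-length tuples; returns one label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments no longer change
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def rand_index(a, b):
    """Fraction of point pairs that two labelings treat the same way (together or apart)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def stability(points, k, seeds):
    """Mean pairwise Rand index over K-means runs with different seeds (assumed measure)."""
    runs = [kmeans(points, k, s) for s in seeds]
    if len(runs) < 2:
        raise ValueError("need at least two runs to compare")
    scores = [rand_index(x, y) for x, y in combinations(runs, 2)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy stand-in for embeddings: two loose groups of 2D points.
    toy = [(0.0, 0.0), (0.3, 0.1), (0.1, 0.4), (5.0, 5.0), (5.3, 5.2), (4.9, 5.4)]
    print(stability(toy, k=2, seeds=range(5)))
```

A score near 1 means the runs agree on which responses belong together; lower values suggest that the chosen number of clusters does not fit the data well. The Rand index compares pairs of points rather than label values, so it is unaffected by the arbitrary numbering of clusters in each run. In practice a library such as scikit-learn would replace the hand-written K-means.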

The artifact was evaluated using internal clustering metrics, user feedback, and requirements validation, all based on real-world survey data provided by the IFSAR research institute.
Results indicate that the toolbox effectively identifies dominant themes and supports exploratory analysis, while remaining accessible to non-technical users.
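
As an illustration of what an internal clustering metric looks like, the sketch below computes the mean silhouette coefficient, which needs no ground truth labels. This is an assumption made for illustration: the abstract does not name the metrics used in the evaluation, and the function `silhouette_score` here is a hypothetical hand-written version, not the toolbox code.

```python
from math import dist


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points; needs at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # by convention a singleton cluster contributes 0
        # a: mean distance to the other members of the same cluster
        # (the distance from p to itself is 0, so dividing by len - 1 excludes it).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


if __name__ == "__main__":
    toy = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    print(silhouette_score(toy, [0, 0, 1, 1]))  # compact, well-separated grouping
    print(silhouette_score(toy, [0, 1, 0, 1]))  # grouping that mixes the two sides
```

Values range from -1 to 1. Higher values indicate compact, well-separated clusters, and negative values indicate points that sit closer to another cluster than to their own.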

Despite its utility, the toolbox has limitations, including sensitivity to input quality and the inherent subjectivity of interpreting clusters without ground truth labels.
Nonetheless, the artifact fulfills its primary goal and offers a practical foundation for future enhancements and research.

Overall, this work contributes a practical and extensible tool for the semantic clustering of textual data.

Item Type: Thesis (Other)
Subjects: Topics > User Interface Design
Technologies > Programming Languages > Python
Technologies > Frameworks and Libraries
Divisions: Bachelor of Science FHO in Informatik > Bachelor Thesis
Depositing User: OST Deposit User
Date Deposited: 29 Sep 2025 10:51
Last Modified: 29 Sep 2025 10:51
URI: https://eprints.ost.ch/id/eprint/1317
