Document Management using Large Language Model

Wymann, Momoko and Willi, Andrew (2024) Document Management using Large Language Model. Other thesis, OST Ostschweizer Fachhochschule.

Abstract

Initial Situation:
The management and retrieval of documents can be challenging due to the unstructured nature of their content. Traditional search methods, which rely on document names, are often inefficient and ineffective. With the rise of Transformer-based Large Language Models (LLMs), which significantly enhance our everyday tasks, there is an opportunity to improve the search and management of these documents. The integration of LLMs can transform unstructured data into structured metadata, making documents more accessible and organized.

Objective:
Within the scope of this project, we aimed to build a Single Page Application that puts this idea into practice. We created a prototype application that is able to store, read, and process PDF documents with unstructured data and generate metadata using an LLM. This metadata improves document search and management and streamlines repetitive tasks such as sending an email reminder to a customer. The prototype is designed to be easily expandable to facilitate the continuous development and implementation of new features.

Conclusion:
In this project, we developed a prototype application using React and TypeScript for the frontend and NestJS with TypeScript for the backend. We integrated external services such as Hugging Face and Zapier. The application enables users to upload PDF documents, extract a suitable title, summary, and tags using a Large Language Model (LLM), and search for documents based on this metadata. Additionally, it includes a feature for sending email reminders to customers for specific files. Designed with scalability in mind, the application allows for the easy addition of new features in the future, such as support for other file types like pictures, which could also make use of the LLM. This proof of concept demonstrates the potential of using LLMs to enhance document management and retrieval.
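
To illustrate the metadata-based search described above, a minimal TypeScript sketch might look as follows. It is not taken from the thesis: the `DocumentMetadata` interface, its field names, and the `searchDocuments` function are hypothetical, and the matching rule (every query term must occur in the title, summary, or tags) is an assumption chosen for simplicity.

```typescript
// Hypothetical shape of the metadata an LLM could generate per PDF.
// Field names are assumptions for illustration, not the thesis's schema.
interface DocumentMetadata {
  id: string;
  title: string;
  summary: string;
  tags: string[];
}

// Case-insensitive search over title, summary, and tags.
// A document matches only if every whitespace-separated query term
// occurs somewhere in its combined metadata text.
function searchDocuments(
  docs: DocumentMetadata[],
  query: string
): DocumentMetadata[] {
  const terms = query
    .toLowerCase()
    .split(/\s+/)
    .filter((term) => term.length > 0);

  // An empty query returns all documents unchanged.
  if (terms.length === 0) {
    return docs;
  }

  return docs.filter((doc) => {
    const haystack = [doc.title, doc.summary, ...doc.tags]
      .join(" ")
      .toLowerCase();
    return terms.every((term) => haystack.includes(term));
  });
}
```

In such a design, the search never touches the PDF content itself. It relies only on the structured metadata, which is what makes documents findable even when their file names are uninformative.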

Item Type: Thesis (Other)
Subjects: Technologies > Databases > MySQL
Technologies > Frameworks and Libraries > React
Technologies > Programming Languages > TypeScript
Metatags > IFS (Institute for Software)
Divisions: Bachelor of Science FHO in Informatik > Bachelor Thesis
Depositing User: OST Deposit User
Contributors:
Contribution | Name | Email
Thesis advisor | Koch, Frank | UNSPECIFIED
Expert | Purandare, Mitra | UNSPECIFIED
Other | Güntensperger, Michael | UNSPECIFIED
Date Deposited: 04 Oct 2024 05:49
Last Modified: 04 Oct 2024 05:49
URI: https://eprints.ost.ch/id/eprint/1226
