AI-assisted Digitalization of Landscaping plans

Löffler, Kevin (2024) AI-assisted Digitalization of Landscaping plans. Other thesis, OST Ostschweizer Fachhochschule.

Abstract

This thesis proposes a full-stack solution for the digitisation process of the Swiss Archive for Landscape Architecture (ASLA).
A desktop app is developed using Tauri, enabling the management of the archive’s data as well as the correction of AI predictions.
The AI pipeline is containerised using Docker and is accessible via a web API. Each plan is formatted and preprocessed before three deep learning models are applied in sequence. A pretrained layout model (LayoutLMv3) detects all text occurrences, and k-means clustering groups the resulting text boxes into logical blocks. A transformer-based OCR model (TrOCR) then extracts the text, and a custom-trained German BERT model identifies the relevant entities. The output undergoes post-processing for formatting and normalisation, with project-specific keywords like the architect’s name filtered out. The predicted metadata is sent back to the client app, where metadata files track all changes to the image, ensuring non-destructive editing.
Every weekend, the machine learning models are retrained on all manually corrected predictions. The thesis focuses more on implementing a robust pipeline and continuous retraining than on improving the models themselves, because continuous retraining is expected to enhance the AI pipeline’s performance over time.
The app significantly speeds up the digitisation process for the archive and is a substantial improvement over the old Excel-based workflow. The AI pipeline’s prediction accuracy varies by model. Marker detection is 100% reliable, and the OCR model reaches 98% accuracy after retraining on only 77 images. The NER model, currently at 46% accuracy, is about 10% better than the model from the SA project. If the accuracy continues to increase with additional training data, an F1-score of over 80% can be foreseen with 600 images. Thanks to the new app, these images can be collected in less than a month of archival work.
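
The grouping step mentioned in the abstract (k-means clustering of detected text boxes into logical blocks) can be illustrated with a minimal sketch. The abstract does not say which features are clustered or how the number of clusters is chosen, so the following assumes that box centres are clustered and that `k` is given. The function names `kmeans` and `group_text_boxes`, the farthest-point initialisation, and the sample boxes are hypothetical and not taken from the thesis.

```python
from math import dist  # Python 3.8+


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means on 2D points; returns one cluster index per point."""
    # Assumption: deterministic farthest-point initialisation for reproducibility.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda i: dist(p, centroids[i])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                centroids[i] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return labels


def group_text_boxes(boxes, k):
    """Group (x0, y0, x1, y1) boxes into k blocks by clustering box centres."""
    centres = [((x0 + x1) / 2, (y0 + y1) / 2) for x0, y0, x1, y1 in boxes]
    labels = kmeans(centres, k)
    blocks = {}
    for box, label in zip(boxes, labels):
        blocks.setdefault(label, []).append(box)
    return list(blocks.values())


# Hypothetical sample: three lines of a title block and two lines of a stamp.
boxes = [
    (10, 10, 120, 30), (10, 35, 140, 55), (12, 60, 100, 80),
    (800, 900, 950, 920), (805, 925, 940, 945),
]
blocks = group_text_boxes(boxes, k=2)
```

In this sketch each resulting block would be passed to the OCR model as one unit. A production pipeline would more likely rely on an established implementation such as scikit-learn's `KMeans` than on a hand-written loop.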

Item Type:	Thesis (Other)
Subjects:	Topics > Software > Software Modeling
	Technologies > Programming Languages > Java Script
	Technologies > Virtualization > Docker
	Metatags > IFS (Institute for Software)
Divisions:	Bachelor of Science FHO in Informatik > Bachelor Thesis
Depositing User:	OST Deposit User
Contributors:	Thesis advisor: Purandare, Mitra
	Expert: Röösli, Marc
Date Deposited:	04 Oct 2024 05:49
Last Modified:	04 Oct 2024 05:49
URI:	https://eprints.ost.ch/id/eprint/1231
