Hazeraj, Aziz and Uthayakumar, Thashvar (2025) Controlled Image Generation for Reflecting Eating Habits in Virtual Avatars. Other thesis, OST Ostschweizer Fachhochschule.
Abstract
This thesis explores the use of AI-based image generation to visualize potential physical changes resulting from adherence to personalized meal plans provided by the Smart Eating Platform. The central goal is to generate realistic image transformations that reflect anticipated changes in body composition, thereby enhancing user engagement and motivation.
To identify suitable image generation techniques, we conducted manual testing of several models and selected two open-source methods, ControlNet and Null-text Inversion, for deeper evaluation. Both were systematically assessed by generating several hundred images using different combinations of input images, prompt templates, and parameters. This evaluation led to the identification of an optimal parameter set for each model.
During the project, OpenAI released the GPT-4o image generation model. It arrived too late to be included in the full evaluation pipeline, but informal testing showed superior performance in both realism and fidelity to the original image. As a result, GPT-4o was integrated into the final system alongside the two open-source pipelines.
The full solution consists of a modular system architecture with a Python-based FastAPI backend and a React.js frontend. The backend pipeline handles parameter validation, converts meal plans into descriptive textual prompts (termed "reflection in appearance"), and generates corresponding images using the selected image model. The system supports both OpenAI's GPT-4o and Qwen3 as language backends, and allows users to select between the three image generation pipelines based on their preferences or technical constraints.
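The backend flow described above (validate parameters, turn the meal plan into a "reflection in appearance" prompt, dispatch to the selected image pipeline) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the thesis implementation: the names `GenerationRequest`, `duration_weeks`, `validate`, `describe_reflection_in_appearance`, `GENERATORS`, and `run_pipeline` are hypothetical, the prompt step is a plain string template standing in for the GPT-4o or Qwen3 call, and the generators are stubs that return text instead of images.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict


class ImagePipeline(str, Enum):
    """The three image generation pipelines named in the abstract."""
    CONTROLNET = "controlnet"
    NULL_TEXT_INVERSION = "null_text_inversion"
    GPT_4O = "gpt-4o"


@dataclass
class GenerationRequest:
    """Hypothetical request shape; the real API fields are not given in the abstract."""
    meal_plan: str
    pipeline: str
    duration_weeks: int  # assumed parameter, for illustration only


def validate(request: GenerationRequest) -> ImagePipeline:
    """Step 1: parameter validation. Raises ValueError on bad input."""
    if not request.meal_plan.strip():
        raise ValueError("meal_plan must not be empty")
    if request.duration_weeks <= 0:
        raise ValueError("duration_weeks must be positive")
    # Enum lookup raises ValueError for an unknown pipeline name.
    return ImagePipeline(request.pipeline)


def describe_reflection_in_appearance(meal_plan: str, duration_weeks: int) -> str:
    """Step 2: meal plan -> descriptive prompt.

    Placeholder for the language-model step; a real system would call
    the chosen language backend here instead of formatting a string.
    """
    return (
        f"Appearance after {duration_weeks} weeks of following "
        f"this meal plan: {meal_plan.strip()}"
    )


def _stub_generator(name: str) -> Callable[[str], str]:
    """Build a stand-in generator that returns a text label, not an image."""
    def generate(prompt: str) -> str:
        return f"[{name}] image for: {prompt}"
    return generate


# Step 3: one generator per pipeline, selected by the user's choice.
GENERATORS: Dict[ImagePipeline, Callable[[str], str]] = {
    pipeline: _stub_generator(pipeline.value) for pipeline in ImagePipeline
}


def run_pipeline(request: GenerationRequest) -> str:
    """Validate, build the prompt, then dispatch to the selected generator."""
    pipeline = validate(request)
    prompt = describe_reflection_in_appearance(
        request.meal_plan, request.duration_weeks
    )
    return GENERATORS[pipeline](prompt)
```

Keeping the generators behind a single lookup table is one way to let users switch between a hosted model and a locally executed one without touching the validation or prompt steps.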
A user study with nine participants was conducted to evaluate image quality and prompt adherence. GPT-4o emerged as the most reliable and well-rated model overall. ControlNet outperformed Null-text Inversion in average ratings, although with greater variability. These findings validate the decision to include multiple generation backends, providing both high-fidelity results and privacy-conscious alternatives for local execution.
In summary, this thesis presents a flexible, user-configurable pipeline for meal plan–driven image transformation, contributing a novel motivational tool to the Smart Eating Platform.
| Item Type: | Thesis (Other) |
|---|---|
| Subjects: | Area of Application > Consumer oriented; Area of Application > Healthcare, Medical Sector; Technologies > Programming Languages > Python; Metatags > IFS (Institute for Software) |
| Divisions: | Bachelor of Science FHO in Informatik > Bachelor Thesis |
| Depositing User: | OST Deposit User |
| Date Deposited: | 29 Sep 2025 10:52 |
| Last Modified: | 29 Sep 2025 10:52 |
| URI: | https://eprints.ost.ch/id/eprint/1318 |
