Arendar, Martin and Mahler, Matteo (2025) Evaluating machine learning models in a SIEM environment. Other thesis, OST Ostschweizer Fachhochschule.
Abstract
This project presents the design and implementation of a testbed for detecting user behavior anomalies in web applications using machine learning within the Elastic Stack.
Traditional Security Information and Event Management (SIEM) systems rely largely on rule-based detection, which offers limited flexibility and struggles with complex behavioral patterns.
Modern machine learning approaches address these limitations but introduce new challenges, especially regarding data availability, model evaluation, and integration into operational monitoring systems.
The goal of this work is therefore not to build a single high-performing model, but to create an extensible and reproducible environment in which different machine learning models can be trained, deployed, and compared under identical conditions.
Because no suitable dataset for user-centric anomaly detection existed, we developed a synthetic data generation pipeline that simulates user interactions in a controlled web application.
A custom student management platform was implemented to produce ECS compliant logs representing normal and anomalous behavior.
A traffic generator produces user flows and defined anomaly patterns over configurable time ranges, allowing the creation of reproducible datasets.
These logs are ingested through Filebeat and Logstash into Elasticsearch, enriched through ingest pipelines (e.g., GeoIP), and transformed into session-level features using Elastic Transforms.
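The enrichment step described above can be sketched as an Elasticsearch ingest pipeline with a GeoIP processor. The pipeline id and field names below are illustrative assumptions, not the project's actual configuration:

```python
# Sketch of an ingest pipeline definition with a GeoIP processor.
# Field names follow ECS conventions; the pipeline id is assumed.
geoip_pipeline = {
    "description": "Enrich web-app logs with GeoIP data",
    "processors": [
        {
            "geoip": {
                "field": "source.ip",         # ECS field holding the client IP
                "target_field": "source.geo", # where the GeoIP result lands
                "ignore_missing": True,       # skip events without an IP
            }
        }
    ],
}

# Registering it with the official Python client (requires a running
# cluster, so shown here as a comment):
# from elasticsearch import Elasticsearch
# Elasticsearch("http://localhost:9200").ingest.put_pipeline(
#     id="webapp-geoip", **geoip_pipeline)
```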
This pivot from action-level events to session-level aggregates provides meaningful input for supervised machine learning models.
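Conceptually, the pivot performed by the Elastic Transform does the following (a stdlib-only sketch; the field names `session_id`, `action`, `status_code`, and `bytes` are illustrative assumptions, not the thesis's actual ECS schema):

```python
from collections import defaultdict

def pivot_sessions(events):
    """Aggregate action-level log events into session-level feature rows.

    Field names used here are illustrative, not the project's schema.
    """
    sessions = defaultdict(lambda: {"action_count": 0,
                                    "error_count": 0,
                                    "distinct_actions": set(),
                                    "bytes_total": 0})
    for ev in events:
        s = sessions[ev["session_id"]]
        s["action_count"] += 1
        s["distinct_actions"].add(ev["action"])
        s["bytes_total"] += ev.get("bytes", 0)
        if ev.get("status_code", 200) >= 400:
            s["error_count"] += 1
    # Finalize: turn sets into counts so every feature is numeric.
    return {sid: {**s, "distinct_actions": len(s["distinct_actions"])}
            for sid, s in sessions.items()}

events = [
    {"session_id": "a", "action": "login", "status_code": 200, "bytes": 120},
    {"session_id": "a", "action": "view_grades", "status_code": 200, "bytes": 340},
    {"session_id": "a", "action": "view_grades", "status_code": 403, "bytes": 15},
    {"session_id": "b", "action": "login", "status_code": 200, "bytes": 118},
]
features = pivot_sessions(events)
print(features["a"])
# -> {'action_count': 3, 'error_count': 1, 'distinct_actions': 2, 'bytes_total': 475}
```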
Two machine learning approaches are evaluated: Elastic's built-in Data Frame Analytics (DFA) classifier and a custom Random Forest model trained in Python using scikit-learn, imported into Elastic via eland.
Both models operate on the same labeled session dataset, enabling direct comparison.
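The custom-model path can be sketched as follows, assuming scikit-learn and eland are installed; the feature names, training values, model id, and Elasticsearch endpoint are illustrative placeholders, not the project's actual configuration:

```python
# Train a Random Forest on labeled session features; the feature names
# and tiny training set below are illustrative assumptions.
from sklearn.ensemble import RandomForestClassifier

feature_names = ["action_count", "error_count", "distinct_actions", "bytes_total"]
X = [
    [12, 0, 5, 4200],      # normal session
    [11, 1, 4, 3900],      # normal session
    [250, 40, 2, 90000],   # anomalous: scripted bulk access
    [300, 55, 3, 120000],  # anomalous
]
y = [0, 0, 1, 1]  # 0 = normal, 1 = anomaly

model = RandomForestClassifier(n_estimators=50, random_state=42)
model.fit(X, y)
print(model.predict([[280, 50, 2, 100000]]))  # -> [1]

# Importing the trained model into Elasticsearch via eland (requires a
# running cluster, so shown here as a comment):
# from elasticsearch import Elasticsearch
# from eland.ml import MLModel
# es = Elasticsearch("http://localhost:9200")
# MLModel.import_model(
#     es_client=es,
#     model_id="custom_rf_session_anomaly",
#     model=model,
#     feature_names=feature_names,
#     es_if_exists="replace",
# )
```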
A dedicated evaluation dashboard, the foundation of the testbed, visualizes confusion matrices, classification metrics, feature importance, and anomaly distributions.
This dashboard allows users to compare models on standardized performance indicators and supports repeated experimentation with newly trained models.
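The standardized indicators such a dashboard reports can be derived from confusion-matrix counts, sketched here with the stdlib only (the convention 1 = anomaly, 0 = normal is an assumption):

```python
def classification_metrics(y_true, y_pred):
    """Confusion-matrix counts and derived metrics for binary labels.

    Assumes the convention 1 = anomaly, 0 = normal.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"tp": tp, "tn": tn, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}

m = classification_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0])
print(m["tp"], m["fp"], m["fn"], m["tn"])  # -> 2 1 1 2
```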
The results demonstrate that the system functions reliably as a testbed.
Data generation, ingestion, transforms, and dashboards work as intended, and both models can be evaluated consistently.
While Elastic's DFA provides strong baseline performance with minimal configuration, the custom model highlights the challenges of feature engineering, data realism, and training strategy.
Several findings indicate opportunities for improvement, particularly regarding model tuning, dataset variability, and automation of dashboard creation.
The project concludes that a machine-learning-based anomaly detection workflow in Elastic is feasible and that the developed testbed offers a solid foundation for future research, such as improved model engineering, integration of real-world data, or real-time anomaly tracking.
| Item Type: | Thesis (Other) |
|---|---|
| Subjects: | Area of Application > Security; Technologies > Databases > PostgreSQL; Technologies > Virtualization > Docker; Technologies > Programming Languages > TypeScript; Metatags > INS (Institute for Networked Solutions) |
| Divisions: | Bachelor of Science FHO in Informatik > Student Research Project |
| Depositing User: | OST Deposit User |
| Date Deposited: | 26 Feb 2026 09:04 |
| Last Modified: | 26 Feb 2026 09:04 |
| URI: | https://eprints.ost.ch/id/eprint/1362 |
