PhD thesis "Novelty detection for multivariate data streams with probabilistic models" available online
Christian Gruhl's dissertation, "Novelty detection for multivariate data streams with probabilistic models", which was successfully submitted in 2022, is now available online. A summary of its content:
The autonomous detection of unexpected changes in data is called novelty detection. Multivariate data streams consisting of measurements from multiple sensors often form the basis for detecting such changes. Examples include cardiac arrhythmias, power failures, storms, and network attacks. Changes can therefore affect both a system itself and the environment in which it is embedded. This doctoral thesis investigates methods for online novelty detection in multivariate data streams and presents the CANDIES methodology. A unique feature of this method is the explicit separation of the input space of a probabilistic model into two kinds of regions, High-Density Regions (HDR) and Low-Density Regions (LDR), with detection techniques specifically designed for each. While other detectors can usually detect novelties or anomalies only in LDR, the CANDIES method can also identify novelties in HDR. It also offers ways to handle concept drift and noise in data streams. Another distinctive feature of CANDIES is its notion of a novelty as an agglomeration of anomalies that are related to each other, either spatially or temporally. The thesis also focuses on the experimental evaluation of novelty detection algorithms in general. For this purpose, it presents a data generator that can synthesise data streams and novelties, and a new evaluation measure, the FDS, designed specifically for evaluating novelty detection methods. All methods, algorithms and tools developed and used in this thesis are publicly and freely available online.
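
To make the HDR/LDR idea more concrete, the following is a minimal sketch and not the CANDIES implementation. It assumes a single two-dimensional Gaussian with known mean and covariance as the probabilistic model, takes the 95% highest-density region as the HDR, and reports a novelty only when several LDR samples accumulate within a short window. All function names and parameters (`coverage`, `window`, `min_ldr`) are hypothetical and chosen for illustration; the sketch covers only the LDR side and does not attempt the HDR detection described in the thesis.

```python
import math


def mahalanobis_sq_2d(x, mean, cov):
    """Squared Mahalanobis distance of a 2-D point from a Gaussian."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]].
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def hdr_threshold_2d(coverage):
    """Squared-distance bound enclosing `coverage` of the probability mass.

    For a 2-D Gaussian the squared Mahalanobis distance follows a
    chi-square distribution with 2 degrees of freedom, whose CDF is
    1 - exp(-x / 2), so the quantile has a closed form.
    """
    return -2.0 * math.log(1.0 - coverage)


def region(x, mean, cov, coverage=0.95):
    """Label a sample as lying in the high- or low-density region."""
    inside = mahalanobis_sq_2d(x, mean, cov) <= hdr_threshold_2d(coverage)
    return "HDR" if inside else "LDR"


def novelty_flags(stream, mean, cov, window=5, min_ldr=3, coverage=0.95):
    """Flag time steps where LDR samples agglomerate.

    A single LDR sample is treated as an isolated anomaly (possibly
    noise); a novelty is reported only when at least `min_ldr` of the
    last `window` samples fall into the LDR (illustrative assumption).
    """
    recent = []
    flags = []
    for x in stream:
        recent.append(region(x, mean, cov, coverage) == "LDR")
        recent = recent[-window:]  # keep only the sliding window
        flags.append(sum(recent) >= min_ldr)
    return flags


if __name__ == "__main__":
    mean = (0.0, 0.0)
    cov = ((1.0, 0.0), (0.0, 1.0))
    stream = [(0, 0), (3, 3), (0.1, 0), (4, 0), (0, 4), (5, 5)]
    print([region(x, mean, cov) for x in stream])
    print(novelty_flags(stream, mean, cov))
```

In this sketch an isolated outlier does not trigger a report; only a temporal cluster of LDR samples does, which mirrors the idea of a novelty as an agglomeration of related anomalies. A spatial relation between the anomalies, concept drift, and detection inside the HDR would all need additional machinery that is not shown here.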