Department of Computer Science

MSc Defences Winter 2024

See the list of MSc defences at DIKU this winter. The list will be updated continuously.

Information about the thesis, supervisor, location of the defence, etc. can be found on the respective events below.

Computer Science

5 January: Matilde Mouritsen Broløs

Name of student(s)	Matilde Mouritsen Broløs
Study Programme	Computer Science
Title	A DSL for eBPF in ML style programming language
Abstract	eBPF is a subsystem allowing users to monitor activity in the Linux kernel. With eBPF it is possible to write programs that are loaded into the kernel and attached to an event, such that each time the event is performed, the program is executed. This gives a lot of possibilities, including packet filtering and tracing. However, eBPF programs are written in bytecode, which is not desirable for all users, and therefore it is attractive to have a higher-level language for writing the programs, which can then be compiled to eBPF bytecode. As of now there are multiple solutions to this problem, two of which are libbpf and bpftrace. The first implements a very C-like syntax, where the second implements an even higher-level syntax that abstracts away a lot of the business logic. The goal of this project is to design a new domain specific language for writing eBPF programs, which is based on an analysis of the two existing tools libbpf and bpftrace. This analysis will help form a list of criteria or design goals for the new language. Furthermore this new language is inspired by functional programming languages such as ML and F#. The result is a new language with an abstraction level somewhere between libbpf and bpftrace, which seeks to aid the programmer with type safety among other functional programming language traits. The goal is to have a language that is user-friendly and at the same time does not hide away the logic of the program, thus enhancing the understanding of the eBPF subsystem. This project consists of an analysis of the two existing solutions, a design vision for a new DSL, and a proof-of-concept for the new DSL. The proof-of-concept is a very limited subset of the entire design, but from the results of the project it is concluded that the design seems like a good foundation for a functional approach to writing eBPF applications.
Supervisor(s)	Ken Friis Larsen
External examiner(s)	Michael R. Hansen
Date and time	05.01.2024 13:30 - 14:30
Room	Mødelokalet PLTC gangen, HCO-01-0-029

5 January: Emil Nærvig Isenbecker

Name of student(s)	Emil Nærvig Isenbecker
Study Programme	Computer Science
Title	Procedural Generation of City Maps Assisted by Heightmaps
Abstract	Procedural content generation remains a consistently popular way of creating video game worlds, and titles such as Rogue (1980) and Minecraft (2011) continue to inspire the next generation of video games. Recently, Unreal Engine 5 (2022), a leading commercial video game engine and real-time 3D creation tool, has implemented a comprehensive framework dedicated to assisting with procedural generation of content. In this thesis, an original method for procedural generation of city maps is designed and implemented. First, a survey of existing methods for procedural generation of city maps is conducted, and the different methods are discussed and compared. Following the survey, an algorithm derived from the work of Parish and Müller (2001) [1] is chosen as the core for development of the original method. The implementation of this core algorithm, which functions by growing a branching network of street segments, is expanded upon through integrating Strahler numbers as a control mechanism for branching. Finally, heightmaps are integrated into the process, acting as a believable background which guides the direction of the growing street network. The result is a flexible procedural generation system, which is able to generate believable street networks that fit the given terrain. The developed method has the potential to function in environments with minimal user input, which is often required from procedural generation methods when applied to video game development.
Supervisor(s)	Torben Ægidius Mogensen
External examiner(s)	Claus Witfelt
Date and time	05.01.2024 9:30-10:30
Room	Lokale S029 (i PLTC-gangen)

5 January: Karen Qvist Larsen & Nanna Munk Berg

Name of student(s)	Karen Qvist Larsen and Nanna Munk Berg
Study Programme	Computer Science
Title	Metric First-Order Temporal Logic: Aggregation and the non-recursive and past-recursive let operators
Abstract	A monitor receives events generated by a system, and checks whether the event sequence satisfies a given specification. Monitoring distributed systems is challenging due to the individual concurrent events that may arrive out-of-order at the monitor. This challenge is addressed by the unverfied monitoring algorithm TimelyMon that supports a fragment of the specification language Metric First-Order Temporal Logic (MFOTL) in an online setting. It is implemented in the Timely Dataflow framework written in the programming language Rust. We add to the expressiveness of TimelyMon by extending the specification language to include the aggregation, non-recursive let and past-recursive let operators. Furthermore, the temporal operator previous is optimized for the interval zero to infinity. We replicate a benchmark test originally used to evaluate the performance of the recursive let operator in VeriMon, and demonstrate that the TimelyMon recursive let operator performance is better than that of VeriMon in some cases but worse in others. Furthermore, we show that the performance of the recursive let operator of TimelyMon when used on this particular benchmark data log does not consistently scale well as more workers are added.
Supervisor(s)	Dmitriy Traytel (primary), Rafael Castro G. Silva (co-supervisor)
External examiner(s)	Alceste Scalas
Date and time	05.01.2024 14:00 - 15:00
Room	Online

12 January: Daniel Gert Rosenørn Ørsted

Name of student(s)	Daniel Gert Rosenørn Ørsted
Study Programme	Computer Science
Title	3D Cell Instance Segmentation and Tracking in Time-Lapse Microscopy
Abstract	In this master thesis project, I will focus on developing and testing methodologies for 3D cell instance segmentation and tracking in time-lapse microscopy images. I will explore the challenges in accurately identifying cells and tracking their cell fates over time, particularly in the context of Type 1 diabetes research. I will present a study of Convolutional Neural Networks, particularly the U-Net architectures, which I use, for instance segmentation in complex cellular environments. I will implement various tracking algorithms to track the cells over time using the segmented cells, then I evaluate the performance of these methods using metrics like the Jaccard Similarity Coefficient and Higher Order Tracking Accuracy.
Supervisor(s)	Jon Sporring, Silja Heilmann (Co-Supervisor)
External examiner(s)	Rasmus Reinhold Paulsen
Date and time	12.01.2024 14:30 - 16:00
Room	tbd

18 January: Christian Kuke Marslev and Jonas Slot Grønborg

Name of student(s)	Christian Kuke Marslev and Jonas Slot Grønborg
Study Programme	Computer Science
Title	Efficient Sequentialization of Parallelism
Abstract	Writing parallel programs can be difficult, but can be offloaded by compilers through code transformations. Since parallelism is a finite resource on actual hardware, once the parallelism available is saturated, choosing to sequentialize some parts of computation can potentially result in speedups. In this thesis, it is shown how efficient sequentialization can be implemented as an automatic code transformation technique targeting intra-block parallelism on GPUs. Transformation rules and implementations are shown for the four parallel operators, map, reduce, scan, and scatter in the Futhark compiler. The optimization showcases how affected programs can achieve a speedup of 2 by introducing additional sequentialization to otherwise parallel programs.
Supervisor(s)	Cosmin Eugen Oancea (primary), Troels Henriksen (co-supervisor)
External examiner(s)	Mircea Filip Lungu
Date and time	18.01.2024 15:20 - 16:20
Room	PLTC meeting room
Your email	uddannelse@diku.dk

18 January: Yi Jing

Name of student(s)	Yi Jing
Study Programme	Computer Science
Title	Automated Audio Analysis Tool to Support Research on Group Work
Abstract	When researching how small groups collaborate in education, a large amount of audio and video data is generally collected. Researchers code the interactions and the conversations to understand how groups collaborate. One aspect of the investigation is the conversational patterns of who talks when, how long, turn-taking, and conversation pauses. The coding is time-consuming; therefore, tools are needed to help researchers. To extract richer insights from the audio data, the research question was raised: what kinds of conversational patterns can be automatically identified for supporting research on educational group work? The research took place in an informal group learning environment, design science was used as the methodology to support the research, and the core focus was on constructing an automated tool to help the researchers. After doing the speaker diarization and visualization, the conversation patterns were found among the students from different levels of engagement. The speaking ratio for the whole process and the speaking ratio for each person can reveal the students’ engagement in group work. This can help future educational researchers identify the critical moments in group interactions by quickly analyzing the audio files.
Supervisor(s)	Daniel Spikol
External examiner(s)	Andrea Corradini
Date and time	18.01.2024 8:45-10:30
Room	Online

18 January: Jonas Masiulionis and Xinzhi Huo

Name of student(s)	Jonas Masiulionis and Xinzhi Huo
Study Programme	Computer Science
Title	Harmony in Player Experience: Explore the Impact of Game Element Combinations on Player Satisfaction in 2D Platformer Games
Abstract	Movement is one of the basic mechanics in video games, it plays a crucial role in contributing to a joyful player experience. However, it does not work alone. Game movements as long as background music, levels, the visual art styles of the player avatar, and in-game items contribute a harmony of satisfaction to players. Despite their prevalence in games, we have a limited scientific understanding of how to create this harmony to provide a satisfying player experience. In this work, we investigate whether we can affect players’ satisfaction with 2D platformer games by providing several specific combinations of game movement, background music, level, and visual style. We built several 2D platformer games that provide different player experiences by offering different game element combinations. In a between-design user study(N=9), we explored the impact of different movements, music styles, visual styles, and levels on players’ overall satisfaction with games. We found that to make a successful game, carefully selecting the audience is as important as the game design. Different players can have opposite preferences in video games. Thus, developing a game with a specific audience in mind is one of the most important tasks for game developers. Based on the findings, we discuss implications for game design and future directions.
Supervisor(s)	Valkyrie Arline Savage
External examiner(s)	Louise Petersen Matjeka
Date and time	18.01.2024 9:00 - 10:30
Room	Sigurdsgade 0-11

19 January: Henrik Windum Petersen

Name of student(s)	Henrik Windum Petersen
Study Programme	Computer Science
Title	Dataset Condensation for Object Detection
Abstract	Recently, various dataset distillation techniques have emerged to resolve the issue of training a deep neural network in an unreasonable amount of time. These methods suggest the potential to train models on synthetic datasets, essentially guiding the network parameters to a similar state as if trained with the actual, full dataset. Currently, dataset distillation techniques have shown promise in multi-label image classification. In our study, however, we aim to extend the use cases of dataset distillation to include semantic segmentation as the underlying objective. We have modified and implemented two methods: matching training trajectories and distribution matching. In comparison, the synthetic datasets derived by either method did not come close to achieving similar generalizing capability to the actual dataset. This study explores the challenges and potential future prospects for dataset distillation in semantic segmentation applications.
Supervisor(s)	Raghavendra Selvan
External examiner(s)	Lee Herluf Lund Lassen
Date and time	19.01.2024 14:00 - 15:00
Room	SCI-DIKU-UP1-1-1-N116A
Your email	uddannelse@diku.dk

24 January: Kinga Agnieszka Bieniarz

Name of student(s)	Kinga Agnieszka Bieniarz
Study Programme	Computer Science
Title	Enhancing eBPF Verifier for Safe Kernel Code Execution: Specification, Implementation, and Analysis
Abstract	In the ever-evolving landscape of operating systems and kernel development, ensuring the security, reliability, and performance of the Linux kernel is of paramount importance. The Linux kernel, being at the core of numerous computing systems, serves as the foundation for a wide array of applications, ranging from cloud services to embedded devices. To meet the diverse needs of modern computing, the Linux kernel has embraced extensibility through a technology known as eBPF (Extended Berkeley Packet Filter). This technology empowers developers to augment the kernel’s functionality without compromising its safety and stability. To guarantee safety and security, eBPF comes with a verifier that checks that eBPF code will not threaten the confidentiality and integrity of kernel data. The current eBPF verifier implementation, while important, may have limitations, proven by many vulnerabilities that were discovered it the verifier throughout the years. This thesis project aims at providing a formal specification of the eBPF verifier as well as an implementation of the verifier, using the theory of abstract interpretation.
Supervisor(s)	Thomas Philip Jensen
External examiner(s)	Mads Rosendahl
Date and time	24.01.2024 13.30-14.30
Room	Auditorium 8 på HCØ

25 January: Oscar Martti Nelin and Bjarke Lohmann Pedersen

Name of student(s)	Oscar Martti Nelin and Bjarke Lohmann Pedersen
Study Programme	Computer Science
Title	General Array Locality Optimization by Permutation (GALOP)
Abstract	Programming in a high-level, data-parallel language such as Futhark alleviates programmers from hardware-specific details. This, in turn, requires such languages to have a general notion of optimal array layouts for their data-parallel operations. Both CPU and GPU architectures may benefit from good locality of reference, and when the user has no control over the program intrinsics, it is the compiler’s responsibility not to produce programs with poor locality of reference. Research on this topic is very limited; therefore, we present a novel, hardware-agnostic, best-effort algorithm designed to perform statical analysis and optimize the locality of reference in programs by General Array Locality Optimization by Permutation (GALOP). That is, by strategic permutation on the order of array dimensions. We show that GALOP outperforms the existing solution on the GPU in most cases but achieves this mainly by being more conservative and making fewer bad decisions.
Supervisor(s)	Troels Henriksen
External examiner(s)	Magnus Madsen
Date and time	25.01.2024 time:tbd
Room	PLTC-mødelokalet

26 January: Christian Arboe Franck

Name of student(s)	Christian Arboe Franck
Study Programme	Computer Science
Title	Propagating Negative Information in Streaming First-Order Monitoring
Abstract	In recent years, the surge in data storage, access, and processing within data centers has elevated concerns regarding their correct behavior and security. This thesis builds upon an existing monitoring system that can process real-time activity logs in the form of data streams to check violations of security policies. The thesis outlines how the fact of knowing data will not occur may improve performance, this concept is described as negative information. By formulating this analysis as a streaming dataflow computation in the Timely Dataflow framework, the system ensures high parallelizability, low latency, and high throughput. Beyond showcasing the effectiveness of these algorithms in real-time monitoring, the thesis explores the utilization of negative information to enhance system efficiency and reduce latency in reporting violations. Keywords: Monitoring · Negative Information · Stream Processing
Supervisor(s)	Dmitriy Traytel
External examiner(s)	Alceste Scalas
Date and time	26.01.2024 14:00 - 15:00
Room	Sigurdsgade 41, room 2.03

TBD: Daniel Ellebæk Nielsen

Name of student(s)	Daniel Ellebæk Nielsen
Study Programme	Computer Science
Title	Conversion of BPMN graphs to DCR graphs
Abstract	Process modeling is often used in companies or organizations to model internal processes. Two popular approaches to modeling are imperative and declarative methods. Imperative notation is rigid and defines exact sequences of valid executions, while declarative is flexible and allows all executions that do not violate the defined constraints. Business Process Model and Notation (BPMN) and Dynamic Condition Response (DCR) graphs are respectively imperative and declarative modeling notations. In this project, I set out to build a tool that converts a BPMN graph to an equivalent DCR graph. Due to time constraints, I will not provide a mapping for all of BPMN, but only a subset of it containing the most used elements. I will also not provide proof using formal semantics to prove the validity of the mapping, because of time constraints and the lack of semantics for BPMN. Instead, I will show equivalence through thorough testing of the tool. During the project, I developed this tool. It can parse a BPMN file, convert it, and output a corresponding DCR graph. It can output the DCR graph as either XML or as an API call to DCR Solutions. It supports testing through traces, which can be used to define the expected behavior. I have used this testing setup to define a testing suite, which tests a wide range of conversion cases quickly. The mapping between BPMN and DCR that I found was not exactly equivalent, since start events are turned into DCR activities, because of the BPMN exclusive gateway. Otherwise, I fulfilled the requirements I set out to.
Supervisor(s)	Thomas Troels Hildebrandt
External examiner(s)	Søren Debois
Date and time	tbd
Room	tbd

Health Informatics

16 January: Katrine Koizumi Gundtoft

Name of student(s)	Katrine Koizumi Gundtoft
Study Programme	Sundhed og Informatik
Title	Data needs in selfmanagement of diabetes with Hedia Diabetes Assistant
Abstract	Background: Diabetes is a complex disease, and the treatment relies heavily on the patient’s abilities to monitor blood glucose and administer insulin daily. With the society’s shift towards a person-centred health system, that actively include the patients in treatment plans, digital services have become an important source of patient generated data in the management of diabetes. Purpose: The purpose of this project is to explore the data needs of patients in insulin treatment and healthcare personnel in supporting the patients in self-management, how Hedia Diabetes Assistant fulfils the identified data needs and how the functionality of the bolus calculator app can be further developed to support the communication between patient and healthcare personnel. Research design: A qualitative study by semi structured interviews of a doctor, two diabetes nurses and a patient is conducted, and the data is transcribed and coded using NVivo. Research literature on self-management and challenges of life with diabetes will accompany the theoretical framework User Experience and its principles to address the research question. Results: Six patient data needs and five healthcare personnel data needs are identified in the analysis with themes such as seeking safety, normalisation, and motivation, targeting treatment, and patient empowerment. Hedia Diabetes Assistant is evaluated on usability, usefulness, meaningfulness, and emotional impact to find that Hedia Diabetes Assistant can be further developed. Three proposals for this development are an expanded function for logging possibilities, different activity profiles for individualised and flexible use, and report formatting to better facilitate the cooperation of patients and healthcare personnel. Conclusion: Self-management of diabetes is difficult, and there are many challenges in the daily lives of patients with diabetes. Therefore, technology that can be adjusted to each patient’s needs and individual situation is essential. Bolus calculators should act as a tool that effectively supports documentation and exchange of data to healthcare personnel, and enhances the experiences of self-management, glucose control, and healthcare personnel workflow.
Supervisor(s)	Erling Carl Havn
External examiner(s)	Jens Pedersen
Date and time	16.01.2024 16:00-17:00
Room	UP1-2-0-04

IT and Cognition

25 January: Chenxiao Ma

Name of student(s)	Chenxiao Ma
Study Programme	IT and Cognition
Title	On the Efficiency of No-regret Exploration: Episodic RL in MDPs and MDPRMs
Abstract	Reinforcement learning (RL) is a powerful framework for learning optimal behaviors in unknown dynamical systems from experience and has been successfully applied to many applications in science and engineering. Most state-of-the-art RL methods build on the Markov assumption, implying that the underlying transition and reward functions only depend on the current state and action of the agent. However, in the real world, this assumption may be too simplistic: Whether we can open the door to the office depends not only on where we are, but also on whether we take the key with us when we leave home. Thus, in the natural conception of the state yielding Markov transition functions, it is more intuitive to consider the non-Markovian reward function. In this thesis, we study the class of non-Markovian reward functions that can be fully described by some deterministic finite-state transducer with the deterministic transitions of high-level events, such as whether the key is brought. In the recent literature, such automata are often called Reward Machines (RM). They allow one to formulate an equivalent (cross-product) Markov Decision Process (MDP), where one can apply state-of-the-art RL algorithms. In the context of MDP, we focus on two model-based RL algorithms in the episodic setting: RLSVI powered by the posterior sampling approach and UCBVI implementing the optimism in the face of uncertainty principle. RLSVI is shown to attain a near-optimal regret bound in episodic MDPs, whereas UCBVI enjoys a minimax regret bound. Nonetheless, they have the fame of yielding poor empirical performance even in toy MDPs. We investigate their empirical performance in terms of empirical regret and that of their variants derived using refined confidence sets based on the Laplace and peeling methods. Furthermore, although the two algorithms and their variants can be blindly adapted to the environment modeled by a MDP with a Reward Machine (MDPRM), they perform poorly because the associated cross-product MDP has a large state-space, making strategic exploration empirically inefficient. Leveraging the intrinsic structure in the MDPRM, we propose reward-machine-specific variants of the two algorithms that explicitly exploit the known knowledge of the underlying high-level tension. We further explore the empirical performance of these variants using experiments on some standard domains and verify the performance gain over structure-oblivious counterparts.
Supervisor(s)	Sadegh Talebi
External examiner(s)	Melih Kandemir
Date and time	25.01.2024 13:00-14:00
Room	Online

30 January: Kyara Catharina Helena Rozman

Name of student(s)	Kyara Catharina Helena Rozman
Study Programme	IT and Cognition
Title	Development of an analysis pipeline for the automatic behavioral assessment and classification of rat models of neurodevelopmental conditions
Abstract	In the search to unravel the inner workings of the brain, rodent models are a widely used tool to help us study the complex relationship between genes, brain activity and behavior, and how these may be affected in neurobehavioral disorders such as Autism Spectrum Disorders or Alzheimer’s disease. Quantification of behavior in rodent models heavily relies on humanly predefined and intuitive measurements, which could limit detection of robust and subtle behavior of rodents. We are in need of a more robust way of describing behavior. The goal of this study is to develop an analysis pipeline to automatically assess behavior and genetic classification of rat models of neurodevelopmental disorders, without the use of humanly predefined behavior. I leverage a dataset from Simons Initiative for the developing brain, containing 759 videos spread over 7 genetic models of neurodevelopmental conditions and use DeepLabCut for pose estimation. Subsequently, I process the pose data, extract and robust features (distance, angle and area) and extract statistical metrics on top of the features. I then apply 2 pipelines (Random Forest and Principal Component Analysis combined with Random Forest) with 1 cross-validation and permutation test to classify between transgenic and wild type rats.. The extended models (median, variance, standard deviation features) outperform the baseline models (mean features) for both pipelines (accuracy 0.66 and 0.63), also significantly after permuted null distribution. Therefore, I can conclude that the models can classify between binary classes of transgenic and wild type rats
Supervisor(s)	Melanie Ganz-Benjaminsen (supervisor), Martin Nørgaard (co-supervisor), Antonios Asiminas (company supervisor)
External examiner(s)	Veronika Vladimirovna Cheplygina
Date and time	30.01.2024 14:00-15:00
Room	DIKU, UP1 rum 2-0-04