MSc Defences Winter 2024

See the list of MSc defences at DIKU this winter. The list will be updated continuously.

Information about the thesis, supervisor, location of the defence, etc. can be found on the respective events below.

Computer Science

 

 

Name of student(s)  

Matilde Mouritsen Broløs

Study Programme  

Computer Science

Title  

A DSL for eBPF in ML style programming language

Abstract  

eBPF is a subsystem allowing users to monitor activity in the Linux kernel. With eBPF it is possible to write programs that are loaded into the kernel and attached to an event, such that each time the event is performed, the program is executed. This gives a lot of possibilities, including packet filtering and tracing.

However, eBPF programs are written in bytecode, which is not desirable for all users, and therefore it is attractive to have a higher-level language for writing the programs, which can then be compiled to eBPF bytecode.

As of now there are multiple solutions to this problem, two of which are libbpf and bpftrace. The first implements a very C-like syntax, where the second implements an even higher-level syntax that abstracts away a lot of the business logic.

The goal of this project is to design a new domain specific language for writing eBPF programs, which is based on an analysis of the two existing tools libbpf and bpftrace. This analysis will help form a list of criteria or design goals for the new language. Furthermore this new language
is inspired by functional programming languages such as ML and F#.

The result is a new language with an abstraction level somewhere between libbpf and bpftrace, which seeks to aid the programmer with type safety among other functional programming language traits. The goal is to have a language that is user-friendly and at the same time does not hide away the logic of the program, thus enhancing the understanding of the eBPF
subsystem.

This project consists of an analysis of the two existing solutions, a design vision for a new DSL, and a proof-of-concept for the new DSL. The proof-of-concept is a very limited subset of the entire design, but from the results of the project it is concluded that the design seems like a good foundation for a functional approach to writing eBPF applications.

Supervisor(s)  

Ken Friis Larsen

External examiner(s)  

Michael R. Hansen

Date and time  

05.01.2024 13:30 - 14:30

Room  

Mødelokalet PLTC gangen, HCO-01-0-029

 

Name of student(s)  

Emil Nærvig Isenbecker

Study Programme  

Computer Science

Title  

Procedural Generation of City Maps Assisted by Heightmaps

Abstract  

Procedural content generation remains a consistently popular way of creating video game worlds, and titles such as Rogue (1980) and Minecraft (2011) continue to inspire the next generation of video games. Recently, Unreal Engine 5 (2022), a leading commercial video game engine and real-time 3D creation tool, has implemented a comprehensive framework dedicated to assisting with procedural generation of content. In this thesis, an original method for procedural generation of city maps is designed and implemented. First, a survey of existing methods for procedural generation of city maps is conducted, and the different methods are discussed and compared. Following the survey, an algorithm derived from the work of Parish and Müller (2001) [1] is chosen as the core for development of the original method. The implementation of this core algorithm, which functions by growing a branching network of street segments, is expanded upon through integrating Strahler numbers as a control mechanism for branching. Finally, heightmaps are integrated into the process, acting as a believable background which guides the direction of the growing street network. The result is a flexible procedural generation system, which is able to generate believable street networks that fit the given terrain. The developed method has the potential to function in environments
with minimal user input, which is often required from procedural generation methods when applied to video game development.

Supervisor(s)  

Torben Ægidius Mogensen

External examiner(s)  

Claus Witfelt

Date and time  

05.01.2024 9:30-10:30

Room  

Lokale S029 (i PLTC-gangen)

 

Name of student(s)  

Karen Qvist Larsen and Nanna Munk Berg

Study Programme  

Computer Science

Title  

Metric First-Order Temporal Logic: Aggregation and the non-recursive and past-recursive let operators

Abstract  

A monitor receives events generated by a system, and checks whether the event sequence satisfies a given specification. Monitoring distributed systems is challenging due to the individual concurrent events that may arrive out-of-order at the monitor.
This challenge is addressed by the unverfied monitoring algorithm TimelyMon that supports a fragment of the specification language Metric First-Order Temporal Logic (MFOTL) in an online setting. It is implemented in the Timely Dataflow framework written in the programming language Rust. We add to the expressiveness of TimelyMon by extending
the specification language to include the aggregation, non-recursive let and past-recursive let operators. Furthermore, the temporal operator previous is optimized for the interval zero to infinity. We replicate a benchmark test originally used to evaluate the performance of the recursive let operator in VeriMon, and demonstrate that the TimelyMon recursive
let operator performance is better than that of VeriMon in some cases but worse in others.
Furthermore, we show that the performance of the recursive let operator of TimelyMon when used on this particular benchmark data log does not consistently scale well as more workers are added.

Supervisor(s)  

Dmitriy Traytel (primary), Rafael Castro G. Silva (co-supervisor)

External examiner(s)  

Alceste Scalas

Date and time  

05.01.2024 14:00 - 15:00

Room  

Online

 

 

Name of student(s)  

Daniel Gert Rosenørn Ørsted

Study Programme  

Computer Science

Title  

3D Cell Instance Segmentation and Tracking in Time-Lapse Microscopy

Abstract  

In this master thesis project, I will focus on developing and testing methodologies for 3D cell instance segmentation and tracking in time-lapse microscopy images. I will explore the challenges in accurately identifying cells and tracking their cell fates over time, particularly in the context of Type 1 diabetes research. I will present a study of Convolutional Neural Networks, particularly the U-Net architectures, which I use, for instance segmentation in complex cellular environments. I will implement various tracking algorithms to track the cells over time using the segmented cells, then I evaluate the performance of these methods using metrics like the Jaccard Similarity Coefficient and Higher Order Tracking Accuracy.

Supervisor(s)  

Jon Sporring, Silja Heilmann (Co-Supervisor)

External examiner(s)  

Rasmus Reinhold Paulsen

Date and time  

12.01.2024 14:30 - 16:00

Room  

tbd

 

Name of student(s)  

Christian Kuke Marslev and Jonas Slot Grønborg

Study Programme  

Computer Science

Title  

Efficient Sequentialization of Parallelism

Abstract  

Writing parallel programs can be difficult, but can be offloaded by compilers through code transformations. Since parallelism is a finite resource on actual hardware, once the parallelism available is saturated, choosing to sequentialize some parts of computation can potentially
result in speedups. In this thesis, it is shown how efficient sequentialization can be implemented as an automatic code transformation technique targeting intra-block parallelism on GPUs. Transformation rules and implementations are shown for the four parallel operators,
map, reduce, scan, and scatter in the Futhark compiler. The optimization showcases how affected programs can achieve a speedup of 2 by introducing additional sequentialization to otherwise parallel programs.

Supervisor(s)  

Cosmin Eugen Oancea (primary), Troels Henriksen (co-supervisor)

External examiner(s)  

Mircea Filip Lungu

Date and time  

18.01.2024 15:20 - 16:20

Room  

PLTC meeting room

Your email  

uddannelse@diku.dk

 

Name of student(s)  

Yi Jing

Study Programme  

Computer Science

Title  

Automated Audio Analysis Tool to Support Research on Group Work

Abstract  

When researching how small groups collaborate in education, a large amount of audio and video data is generally collected. Researchers code the interactions and the conversations to understand how groups collaborate. One aspect of the investigation is the conversational patterns of who talks when, how long, turn-taking, and conversation pauses. The coding is time-consuming; therefore, tools are needed to help researchers. To extract richer insights from the audio data, the research question was raised: what kinds of conversational patterns can be automatically identified for supporting research on educational group work? The research took place in an informal group learning environment, design science was used as the methodology to support the research, and the core focus was on constructing an automated tool to help the researchers. After doing the speaker diarization and visualization, the conversation patterns were found among the students from different levels of engagement. The speaking ratio for the whole process and the speaking ratio for each person can reveal the students’ engagement in group work. This can help future educational researchers identify the critical moments in group interactions by quickly analyzing the audio files.

Supervisor(s)  

Daniel Spikol

External examiner(s)  

Andrea Corradini

Date and time  

18.01.2024 8:45-10:30

Room  

Online

 

Name of student(s)  

Jonas Masiulionis and Xinzhi Huo

Study Programme  

Computer Science

Title  

Harmony in Player Experience: Explore the Impact of Game Element Combinations on Player Satisfaction in 2D Platformer Games

Abstract  

Movement is one of the basic mechanics in video games, it plays a crucial role in contributing to a joyful player experience. However, it does not work alone. Game movements as long as background music, levels, the visual art styles of the player avatar, and in-game items contribute a harmony of satisfaction to players. Despite their prevalence in games, we have a
limited scientific understanding of how to create this harmony to provide a satisfying player experience. In this work, we investigate whether we can affect players’ satisfaction with 2D platformer games by providing several specific combinations of game movement, background music, level, and visual style. We built several 2D platformer games that provide different
player experiences by offering different game element combinations. In a between-design user study(N=9), we explored the impact of different movements, music styles, visual styles, and levels on players’ overall satisfaction with games. We found that to make a successful game,
carefully selecting the audience is as important as the game design. Different players can have opposite preferences in video games. Thus, developing a game with a specific audience in mind is one of the most important tasks for game developers. Based on the findings, we discuss implications for game design and future directions.

Supervisor(s)  

Valkyrie Arline Savage

External examiner(s)  

Louise Petersen Matjeka

Date and time  

18.01.2024 9:00 - 10:30

Room  

Sigurdsgade 0-11

 

 

 

Name of student(s)  

Henrik Windum Petersen

Study Programme  

Computer Science

Title  

Dataset Condensation for Object Detection

Abstract  

Recently, various dataset distillation techniques have emerged
to resolve the issue of training a deep neural network in an unreasonable amount of time. These methods suggest the potential
to train models on synthetic datasets, essentially guiding the
network parameters to a similar state as if trained with the
actual, full dataset. Currently, dataset distillation techniques
have shown promise in multi-label image classification. In our
study, however, we aim to extend the use cases of dataset distillation to include semantic segmentation as the underlying
objective. We have modified and implemented two methods:
matching training trajectories and distribution matching. In
comparison, the synthetic datasets derived by either method
did not come close to achieving similar generalizing capability
to the actual dataset. This study explores the challenges and
potential future prospects for dataset distillation in semantic
segmentation applications.

Supervisor(s)  

Raghavendra Selvan

External examiner(s)  

Lee Herluf Lund Lassen

Date and time  

19.01.2024 14:00 - 15:00

Room  

SCI-DIKU-UP1-1-1-N116A

Your email  

uddannelse@diku.dk

 

 

Name of student(s)  

Kinga Agnieszka Bieniarz

Study Programme  

Computer Science

Title  

Enhancing eBPF Verifier for Safe Kernel Code Execution: Specification, Implementation, and Analysis

Abstract  

In the ever-evolving landscape of operating systems and kernel development, ensuring the security, reliability, and performance of the Linux kernel is of paramount importance. The Linux kernel, being at the core of numerous computing systems, serves as the foundation for a wide array of applications, ranging from cloud services to embedded devices. To meet the diverse needs of modern computing, the Linux kernel has embraced extensibility through a technology known as eBPF (Extended Berkeley Packet Filter). This technology empowers developers to augment the kernel’s functionality without compromising its safety and stability. To guarantee safety and security, eBPF comes with a verifier that checks that eBPF code will not threaten the confidentiality and integrity of kernel data. The current eBPF verifier implementation, while important, may have limitations, proven by many vulnerabilities that were discovered it the verifier throughout the years. This thesis project aims at providing a formal specification of the eBPF verifier as well as an implementation of the verifier, using the theory of abstract interpretation.

Supervisor(s)  

Thomas Philip Jensen

External examiner(s)  

Mads Rosendahl

Date and time  

24.01.2024 13.30-14.30

Room  

Auditorium 8 på HCØ

 

 

Name of student(s)  

Oscar Martti Nelin and Bjarke Lohmann Pedersen

Study Programme  

Computer Science

Title  

General Array Locality Optimization by Permutation (GALOP)

Abstract  

Programming in a high-level, data-parallel language such as Futhark alleviates programmers from hardware-specific details. This, in turn, requires such languages to have a general notion of optimal array layouts for their data-parallel operations. Both CPU and GPU architectures may benefit from good locality of reference, and when the user has
no control over the program intrinsics, it is the compiler’s responsibility not to produce programs with poor locality of reference. Research on this topic is very limited; therefore, we present a novel, hardware-agnostic, best-effort algorithm designed to perform statical analysis and optimize the locality of reference in programs by General Array Locality
Optimization by Permutation (GALOP). That is, by strategic permutation on the order of array dimensions. We show that GALOP outperforms the existing solution on the GPU in most cases but achieves this mainly by being more conservative and making fewer bad decisions.

Supervisor(s)  

Troels Henriksen

External examiner(s)  

Magnus Madsen

Date and time  

25.01.2024 time:tbd

Room  

PLTC-mødelokalet

 

Name of student(s)  

Christian Arboe Franck

Study Programme  

Computer Science

Title  

Propagating Negative Information in Streaming First-Order Monitoring

Abstract  

In recent years, the surge in data storage, access, and processing within
data centers has elevated concerns regarding their correct behavior
and security. This thesis builds upon an existing monitoring system
that can process real-time activity logs in the form of data streams to
check violations of security policies. The thesis outlines how the fact of
knowing data will not occur may improve performance, this concept
is described as negative information. By formulating this analysis as a
streaming dataflow computation in the Timely Dataflow framework, the
system ensures high parallelizability, low latency, and high throughput.
Beyond showcasing the effectiveness of these algorithms in real-time
monitoring, the thesis explores the utilization of negative information
to enhance system efficiency and reduce latency in reporting violations.
Keywords: Monitoring · Negative Information · Stream Processing

Supervisor(s)  

Dmitriy Traytel

External examiner(s)  

Alceste Scalas

Date and time  

26.01.2024 14:00 - 15:00

Room  

Sigurdsgade 41, room 2.03

 

Name of student(s)  

Daniel Ellebæk Nielsen

Study Programme  

Computer Science

Title  

Conversion of BPMN graphs to DCR graphs

Abstract  

Process modeling is often used in companies or organizations to model internal processes. Two popular approaches to modeling are imperative and declarative methods. Imperative notation is rigid and defines exact sequences of valid executions, while declarative is flexible and allows all executions that do not violate the defined constraints. Business Process Model and Notation (BPMN) and Dynamic Condition Response (DCR) graphs are respectively imperative and declarative modeling notations.

In this project, I set out to build a tool that converts a BPMN graph to an equivalent DCR graph. Due to time constraints, I will not provide a mapping for all of BPMN, but only a subset of it containing the most used elements. I will also not provide proof using formal semantics to prove the validity of the mapping, because of time constraints and the lack of semantics for BPMN. Instead, I will show equivalence through thorough testing of the tool.

During the project, I developed this tool. It can parse a BPMN file, convert it, and output a corresponding DCR graph. It can output the DCR graph as either XML or as an API call to DCR Solutions. It supports testing through traces, which can be used to define the expected behavior. I have used this testing setup to define a testing suite, which tests a wide range of conversion cases quickly. The mapping between BPMN and DCR that I found was not exactly equivalent, since start events are turned into DCR activities, because of the BPMN exclusive gateway. Otherwise, I fulfilled the requirements I set out to.

Supervisor(s)  

Thomas Troels Hildebrandt

External examiner(s)  

Søren Debois

Date and time  

tbd

Room  

tbd

 

 

 

 

 

 

 

 

 

 

 

 

Health Informatics

 

Name of student(s)  

Katrine Koizumi Gundtoft

Study Programme  

Sundhed og Informatik

Title  

Data needs in selfmanagement of diabetes with Hedia Diabetes Assistant

Abstract  

Background: Diabetes is a complex disease, and the treatment relies heavily on the patient’s abilities to monitor blood glucose and administer insulin daily. With the society’s shift towards a person-centred health system, that actively include the patients in treatment plans, digital
services have become an important source of patient generated data in the management of diabetes.
Purpose: The purpose of this project is to explore the data needs of patients in insulin treatment and healthcare personnel in supporting the patients in self-management, how Hedia Diabetes Assistant fulfils the identified data needs and how the functionality of the bolus calculator app
can be further developed to support the communication between patient and healthcare personnel.
Research design: A qualitative study by semi structured interviews of a doctor, two diabetes nurses and a patient is conducted, and the data is transcribed and coded using NVivo. Research literature on self-management and challenges of life with diabetes will accompany the theoretical framework User Experience and its principles to address the research question.
Results: Six patient data needs and five healthcare personnel data needs are identified in the analysis with themes such as seeking safety, normalisation, and motivation, targeting treatment, and patient empowerment. Hedia Diabetes Assistant is evaluated on usability, usefulness, meaningfulness, and emotional impact to find that Hedia Diabetes Assistant can be further developed. Three proposals for this development are an expanded function for logging possibilities, different activity profiles for individualised and flexible use, and report formatting
to better facilitate the cooperation of patients and healthcare personnel.
Conclusion: Self-management of diabetes is difficult, and there are many challenges in the daily lives of patients with diabetes. Therefore, technology that can be adjusted to each patient’s needs and individual situation is essential. Bolus calculators should act as a tool that effectively
supports documentation and exchange of data to healthcare personnel, and enhances the experiences of self-management, glucose control, and healthcare personnel workflow.

Supervisor(s)  

Erling Carl Havn

External examiner(s)  

Jens Pedersen

Date and time  

16.01.2024 16:00-17:00

Room  

UP1-2-0-04

 

 

 

 

 

 

IT and Cognition

 

Name of student(s)  

Chenxiao Ma

Study Programme  

IT and Cognition

Title  

On the Efficiency of No-regret Exploration: Episodic RL in MDPs and MDPRMs

Abstract  

Reinforcement learning (RL) is a powerful framework for learning optimal
behaviors in unknown dynamical systems from experience and has been
successfully applied to many applications in science and engineering. Most
state-of-the-art RL methods build on the Markov assumption, implying that
the underlying transition and reward functions only depend on the current
state and action of the agent. However, in the real world, this assumption
may be too simplistic: Whether we can open the door to the office depends
not only on where we are, but also on whether we take the key with us when we leave home. Thus, in the natural conception of the state yielding Markov transition functions, it is more intuitive to consider the non-Markovian reward function.

In this thesis, we study the class of non-Markovian reward functions that
can be fully described by some deterministic finite-state transducer with the deterministic transitions of high-level events, such as whether the key is
brought. In the recent literature, such automata are often called Reward
Machines (RM). They allow one to formulate an equivalent (cross-product)
Markov Decision Process (MDP), where one can apply state-of-the-art RL
algorithms. In the context of MDP, we focus on two model-based RL algorithms in the episodic setting: RLSVI powered by the posterior sampling approach and UCBVI implementing the optimism in the face of uncertainty principle. RLSVI is shown to attain a near-optimal regret bound in episodic MDPs, whereas UCBVI enjoys a minimax regret bound. Nonetheless, they have the fame of yielding poor empirical performance even in toy MDPs. We investigate their empirical performance in terms of empirical regret and that of their variants derived using refined confidence sets based on the Laplace and peeling methods.

Furthermore, although the two algorithms and their variants can be blindly
adapted to the environment modeled by a MDP with a Reward Machine (MDPRM), they perform poorly because the associated cross-product MDP has a large state-space, making strategic exploration empirically inefficient. Leveraging the intrinsic structure in the MDPRM, we propose reward-machine-specific variants of the two algorithms that explicitly exploit the known knowledge of the underlying high-level tension. We further explore the empirical performance of these variants using experiments on some standard domains and verify the performance gain over structure-oblivious counterparts.

Supervisor(s)  

Sadegh Talebi

External examiner(s)  

Melih Kandemir

Date and time  

25.01.2024 13:00-14:00

Room  

Online

 

 

 

 

 

 

 

 

Name of student(s)  

Kyara Catharina Helena Rozman

Study Programme  

IT and Cognition

Title  

Development of an analysis pipeline for the automatic behavioral assessment and classification of rat models of neurodevelopmental conditions

Abstract  

In the search to unravel the inner workings of the brain, rodent models are a widely used tool to help us study the complex relationship between genes, brain activity and behavior, and how these may be affected in neurobehavioral disorders such as Autism Spectrum Disorders
or Alzheimer’s disease. Quantification of behavior in rodent models heavily relies on humanly predefined and intuitive measurements, which could limit detection of robust and subtle behavior of rodents. We are in need of a more robust way of describing behavior.

The goal of this study is to develop an analysis pipeline to automatically assess behavior and genetic classification of rat models of neurodevelopmental disorders, without the use of humanly predefined behavior. I leverage a dataset from Simons Initiative for the developing
brain, containing 759 videos spread over 7 genetic models of neurodevelopmental conditions and use DeepLabCut for pose estimation. Subsequently, I process the pose data, extract and robust features (distance, angle and area) and extract statistical metrics on top of the
features. I then apply 2 pipelines (Random Forest and Principal Component Analysis combined with Random Forest) with 1 cross-validation and permutation test to classify between transgenic and wild type rats..

The extended models (median, variance, standard deviation features) outperform the baseline models (mean features) for both pipelines (accuracy 0.66 and 0.63), also significantly after permuted null distribution. Therefore, I can conclude that the models can classify between binary classes of transgenic and wild type rats

Supervisor(s)  

Melanie Ganz-Benjaminsen (supervisor), Martin Nørgaard (co-supervisor), Antonios Asiminas (company supervisor)

External examiner(s)  

Veronika Vladimirovna Cheplygina

Date and time  

30.01.2024 14:00-15:00

Room  

DIKU, UP1 rum 2-0-04