DataNinja sAIOnARA Conference

Keynote Talks

Ulrike Kuhl — 2024-10-28

Tuesday, 25.06.2024

Prof. Dr. Anand Subramoney

Scalable Architectures for Neuromorphic Machine Learning

I will discuss how to design architectures for neuromorphic machine learning from first principles. These architectures take inspiration from biology without being constrained by biological details. Two major themes will be sparsity and asynchrony, and their significant role in scalable neuromorphic systems. I will present recent work from my group on using various forms of sparsity and distributed learning to improve the scalability and efficiency of neuromorphic deep learning models.

Prof. Dr. Kerstin Bunte

Scientific Machine Learning for Partially Observed Dynamical Systems

Nowadays, most successful machine learning (ML) techniques for the analysis of complex interdisciplinary data use large numbers of measurements as input to a statistical system. Domain expert knowledge is often used only in data preprocessing. The subsequently trained technique appears as a “black box” that is difficult to interpret and rarely allows insight into the underlying natural process. Especially in critical domains such as medicine and engineering, the analysis of dynamic data in the form of sequences and time series is often difficult. Due to natural or cost limitations and ethical considerations, data is often irregularly and sparsely sampled, and the underlying dynamic system is complex. Therefore, domain experts currently enter a time-consuming and laborious cycle of mechanistic model construction and simulation, often without direct use of the experimental data or the task at hand. We now combine the predictive power of ML and the explanatory power of mechanistic models.
To this end, we perform learning in the space of dynamic models that represent the complex underlying natural processes, with potentially very few and limited measurements. We use principles of dimensionality reduction, such as subspace learning, to determine relevant areas in the parameter space of the underlying model as a first step to achieve task-driven model reduction. We furthermore incorporate identifiability analysis for informed posterior construction to improve learning with ill-posed systems caused by data limitations. Findings indicate the possibility of an alternative handling of epistemic uncertainties for scientific machine learning techniques, applicable to all linear models and to classes of non-linear mechanistic models based on Lie symmetries.

Joint work of:
Bunte, Kerstin; Tino, Peter; Oostwal, Elisa; Norden, Janis; Chappell, Michael; Smith, Dave

Prof. Dr. Holger Hoos

How and Why AI Will Shape the Future of Science and Engineering

Recent progress in artificial intelligence has elevated what used to be a highly specialised research area to a topic of public discourse and debate. In this presentation I will discuss why, beyond the hype, there are good reasons to be excited, but also concerned, about AI. Specifically, I will explain how and why AI will have a transformative impact on all sciences and engineering disciplines. Based on my own research on the robustness of neural networks, I will discuss some of the fundamental strengths, weaknesses and limitations of current AI systems. Finally, I will share some thoughts on the most serious risks of deploying these systems quickly and broadly, as well as on what needs to be done in order to manage these risks and to realise the benefits AI can bring.

Prof. Dr. Christian Igel

Deep Learning for Large-Scale Tree Carbon Stock Estimation From Satellite Imagery

Trees play an important role in carbon sequestration, biodiversity, and timber and food production. We need a better characterization of woody resources at global scale to understand how they are affected by climate change and human management. Recent advances in satellite remote sensing and machine learning (ML) based computer vision make this possible. This talk discusses large-scale mapping of individual trees using deep learning applied to high-resolution satellite imagery. The biomass of each tree, and thereby its carbon content, is estimated from the crown size using allometric equations. The parameters of these equations are learned from data. The functional relation is assumed to be non-decreasing. Such monotonicity constraints are powerful regularizers in ML in general. They can support fairness in computer-aided decision making and increase plausibility in data-driven scientific models. This talk introduces a conceptually simple and efficient neural network architecture for monotonic modelling that compares favorably to state-of-the-art alternatives. After this technical excursion, we present an application of our tree monitoring in Rwanda, where it helps quantify the progress of restoration projects and develop a pathway to reach the country’s goal of net zero emissions by 2050.

Wednesday, 26.06.2024

Prof. Dr. Lucie Flek

Perspective Taking in Large Language Models

Perspective-taking, the process of conceptualizing the point of view of another person, remains a challenge for LLMs. Understanding the mental state of others – emotions, beliefs, intentions – is central to the ability to empathize in social interactions. It is also the key to choosing the best action to take next.
Enhancing the perspective-taking capabilities of LLMs can unlock their potential to react better and more safely to hints of distress, to engage in more receptive argumentation, or to target an explanation to an audience. In this talk, I will present our recent perspective-taking experiments and discuss further opportunities for bringing the human-centered perspectivist paradigm into LLMs.

Prof. Dr. Henning Wachsmuth

LLM-based Argument Quality Improvement

Natural language processing (NLP) has recently seen a revolutionary breakthrough due to the impressive capabilities of large language models (LLMs). This also affects NLP research on computational argumentation: the computational analysis and synthesis of natural language arguments. While one of the core tasks studied in computational argumentation is the assessment of an argument’s quality, in this talk I look one step beyond, namely at how to improve argument quality. Starting from the basics of argumentation, I present insights from selected research of my group involving LLMs for improving argument quality. As part of this, I also look at the recent breakthroughs of LLMs and the paradigm shift that comes with them for computational argumentation in particular and for NLP in general.

Thursday, 27.06.2024

Prof. Dr. Sebastian Trimpe

Trustworthy AI for Physical Machines: Integrating Machine Learning and Control

AI promises significant advancements in engineering, enhancing both design and operation processes. Given that engineering focuses on physical machines like vehicles or robots, ensuring trustworthy solutions is crucial. This talk will explore how combining classical control methods with modern machine learning can create reliable algorithms for real-world applications. Specifically, we will discuss some of our recent research on (i) Bayesian optimization for controller learning, (ii) deep reinforcement learning, and (iii) approximate model-predictive control via imitation learning. The effectiveness of the developed algorithms will be demonstrated through experimental results on robotic hardware.

Prof. Dr. Malte Schilling

Biological Biases for Learning Robust Robot Behavior: Does Deep Reinforcement Learning Run into the Alignment Problem?

Shaping Trustworthy AI: An Introduction to This Issue

Ulrike Kuhl — 2024-07-18

Is it Possible to Characterize Group Fairness in Rankings in Terms of Individual Fairness and Diversity?

Chiara Balestra — 2024-10-11

Rankings are ever-present in everyday life. Examples are the results of personalized recommendations and web search queries. Rankings can result from an algorithm, from importance scores, or from human-based rankings of items. As long as we are not concerned with societal applications, the “fairness” of the ranking is often irrelevant; however, problems appear when switching from depersonalized items to individuals. Then, suddenly, fairness becomes an issue. We investigate the relationships among group fairness, individual fairness, diversity, and Shapley values. Far from being a comprehensive survey of fairness-related papers or proposing a new method, we want to raise awareness of the chaos we are trying to navigate and propose some new research directions we are trying to follow.

Comparing Shapley Value Approximation Methods for Unsupervised Feature Importance

Patrick Kolpaczki — 2024-10-11

Assigning importance scores to features is a common approach to gain insights about a prediction model’s behavior or even the data itself. Beyond explainability, such scores can also be useful for conducting feature selection and making unlabeled high-dimensional data manageable. One way to derive scores is by adopting a game-theoretical view in which features are understood as agents that can form groups and cooperate, for which they obtain a reward. Splitting the reward among the features appropriately yields the desired scores. The Shapley value is the most popular reward sharing solution. However, its exponential complexity renders it inapplicable for high-dimensional data unless an efficient approximation is available. We empirically compare selected approximation algorithms for quantifying feature importance on unlabeled data.
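As an illustration of what such an approximation can look like, the following minimal sketch estimates Shapley values by averaging marginal contributions over randomly sampled feature orderings. This is one generic sampling scheme and not necessarily among the algorithms compared in the paper. The function name, the feature names, and the toy value function are hypothetical placeholders; on unlabeled data the value function would be replaced by some measure of how informative a feature subset is.

```python
import random


def shapley_permutation_estimate(players, value, num_permutations=200, seed=0):
    """Estimate Shapley values by sampling random orderings of the players.

    For each ordering, every player is credited with the change in value
    caused by adding it to the players that precede it.
    """
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    for _ in range(num_permutations):
        order = list(players)
        rng.shuffle(order)
        coalition = set()
        previous = value(frozenset(coalition))
        for p in order:
            coalition.add(p)
            current = value(frozenset(coalition))
            totals[p] += current - previous
            previous = current
    return {p: total / num_permutations for p, total in totals.items()}


# Hypothetical toy game (an assumption for illustration only):
# the value of a feature subset is the squared sum of made-up weights.
WEIGHTS = {"f1": 1.0, "f2": 2.0, "f3": 3.0}


def toy_value(coalition):
    return sum(WEIGHTS[f] for f in coalition) ** 2


phi = shapley_permutation_estimate(list(WEIGHTS), toy_value)
print(phi)
```

Each sampled ordering costs one value-function evaluation per feature, so the total cost grows linearly with the number of sampled orderings instead of exponentially with the number of features. The estimates always sum to the value of the full feature set minus the value of the empty set, because the marginal contributions within one ordering telescope.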

Question Answering from Healthcare Fora

David M. Schmidt — 2024-10-11

Assessing the quality of life of cancer patients is an important aspect of patient-focused drug development and real-world evidence generation. Specialized quality of life questionnaires exist for this purpose, and different types of cancer, such as breast cancer or lung cancer, can be assessed. However, conducting these surveys is a time-consuming process for both patients and clinical staff. At the same time, many patients discuss their experiences with and symptoms of their specific diseases in online healthcare fora. These forum posts may contain information that could be used to answer quality of life questions. Our objective is to determine whether forum posts can be used to answer quality of life questionnaires and, if so, whether this process can be automated successfully.

Linguistic-Based Reflection on Trust Calibration in Conversations with LLM-Based Chatbots

Milena Belosevic — 2024-10-11

This paper presents a linguistic approach to trust in human conversations with LLM-based chatbots. Using the concept of trust calibration as a starting point, we aim to address the question of how to increase user AI literacy and prevent misuse of, as well as overtrust in, the information provided by LLM-based chatbots in educational contexts. We propose a linguistic-based model of trust calibration that supports users in adopting a critical perspective on trust calibration and controlling their trust level. The method combines previous studies on trust in human interaction, specifically linguistic trust cues displayed by human trustors to indicate their level of trustworthiness in naturally occurring contexts, with studies on proactive human-computer interaction and the social influence of conversational agents' embodiment in educational contexts.

COMETH - An Active Learning Approach Enhanced with Large Language Models

Franziska Zelba — 2024-10-11

We present a system for the supervision of technical processes, called COMETH, which involves an active learning approach. The system is able to identify anomalies with very little training data through an efficient feedback process. COMETH has been successfully applied in the context of heating, ventilation, and air conditioning systems and in industrial machinery. Here, we describe the idea of combining the time series analysis of COMETH with large language models to integrate further context information and thus provide the user with specific recommendations.

Finding Commonalities in Dynamical Systems with Gaussian Processes

Andreas Besginow — 2024-10-11

Gaussian processes can be utilized in the area of equation discovery to identify differential equations describing the physical processes present in time series data.
Furthermore, automatically constructed models can be split into components that facilitate comparisons between time series on a structural level. We consider the potential combination of these two methods and describe how they could be used to detect shared physical properties in multiple recordings of dynamical systems as time series. This approach provides insights into the underlying dynamics of the observed systems, facilitating a deeper understanding of complex processes.

Dueling Bandits with Delayed Feedback

Jasmin Brandt — 2024-10-11

Dueling Bandits is a well-studied extension of the Multi-Armed Bandits problem, in which the learner must select two arms in each time step and receives binary feedback as the outcome of the chosen duel. However, all of the existing best arm identification algorithms for the Dueling Bandits setting assume that the feedback can be observed immediately after selecting the two arms. If this is not the case, the algorithms simply do nothing and wait until the feedback of the most recent duel can be observed, which is a waste of runtime. We propose an algorithm that can start a new duel even if the previous one is not finished and thus is much more time efficient. Our arm selection strategy balances the expected information gain of the chosen duel and the expected delay until we observe the feedback. Using theoretically grounded confidence bounds, we can ensure that, with high probability, the arms we discard are not the best arm.
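To make the role of confidence bounds concrete, here is a minimal sketch of a discard test for a single pair of arms, based on a standard Hoeffding bound on the empirical win rate. It is a generic illustration and not the algorithm proposed in the paper: it ignores feedback delay entirely, and the function names and the default `delta` are assumptions.

```python
import math


def hoeffding_radius(num_duels, delta):
    """Half-width of a two-sided Hoeffding confidence interval for a win rate."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * num_duels))


def can_discard(wins, num_duels, delta=0.05):
    """Return True if an arm's win probability against its opponent is below 0.5
    with confidence at least 1 - delta, given `wins` out of `num_duels` observed duels.
    """
    if num_duels == 0:
        return False  # no feedback observed yet, so nothing can be concluded
    win_rate = wins / num_duels
    upper_bound = win_rate + hoeffding_radius(num_duels, delta)
    return upper_bound < 0.5


print(can_discard(10, 100))  # clearly losing arm
print(can_discard(45, 100))  # too close to 0.5 to decide
```

The bound above holds for one pair at a fixed number of duels. An algorithm that tests many pairs repeatedly over time would need to tighten `delta` accordingly, for example with a union bound.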

Leveraging Desirable and Undesirable Event Logs in Process Mining Tasks

Ali Norouzifar — 2024-10-11

Traditional process mining techniques utilize one event log as input to offer organizational insights. In many applications, information regarding undesirable process aspects may exist. However, the literature lacks a comprehensive overview of their integration into process mining tasks. In our paper, we explore leveraging data from both desirable and undesirable event logs to augment existing process mining tasks and develop innovative applications. Our aim is to systematically outline the potential for enhancements in this realm.

Feeling Socially Excluded When Working With Robots

Clarissa Sabrina Arlinghaus — 2024-10-11

Work is not just about money, but also about satisfying social needs. We examine processes of social inclusion and exclusion among human employees and robot employees. For our current study, we chose the restaurant industry as a contemporary use case where humans and robots work together as waiters. We assume that social exclusion from either human or robot colleagues will threaten people’s needs (i.e., belonging, control, meaningful existence, self-esteem) but will be interpreted differently depending on the excluding agent (i.e., human colleague or robot colleague). Assuming different attribution processes challenges the “Computers Are Social Actors” theory and could lead to rethinking human-robot interactions or even humans interacting with technology in general.

Trade-offs Between Privacy and Performance in Encrypted Dataset using Machine Learning Models

Sanaullah — 2024-10-11

In recent years, with the increasing importance of dataset privacy in machine learning (ML) applications, there has been an increased demand for secure and privacy-preserving solutions. Consequently, encryption techniques have emerged as a critical tool for protecting data privacy in an era of massive data use, exchange, and analysis. Encryption protects data against illegal access and disclosure by changing it into unreadable ciphertext that can only be decrypted by authorized parties. In ML, where sensitive data is often used, encryption techniques have significant potential for providing privacy-preserving model training and inference. Therefore, this article analyzes, investigates, and compares three widely used encryption techniques. Each encryption method offers unique advantages and trade-offs. We evaluate the performance of Convolutional Neural Network (CNN) models trained on datasets encrypted with these techniques, within the context of computer vision, to provide detailed information on the effectiveness, practical concerns, and applicability of the various methods for real-world applications. Parameters such as training time, memory usage, and classification accuracy are analyzed and compared between encryption methods. We also look into the effect of encryption on model interpretability and robustness against adversarial attacks. Furthermore, to support our study, we demonstrate our approach with a practical implementation that showcases the performance and efficiency of each encryption strategy in protecting data privacy while maintaining model accuracy, and we test it in a real-time recognition application on an edge device such as the NVIDIA Jetson.
Through this comparative analysis, researchers and developers can gain a more in-depth understanding of the importance of, and the issues involved in, integrating encryption techniques into ML workflows, especially in computer vision applications.

Advancements in Neural Network Generations

Sanaullah — 2024-10-11

Innovations in neural network generations reflect the continual evolution, optimization, and development of artificial neural networks (ANNs) over time. These improvements combine methodologies, approaches, and technical breakthroughs aimed at increasing the efficiency and capabilities of neural network models. Researchers and engineers have repeatedly attempted to push the boundaries of neural network performance, scalability, and applicability across multiple fields. These improvements usually involve changes to network designs, training algorithms, optimization methodologies, and hardware acceleration methods. Moreover, the neural network generations are closely related to key achievements in the machine learning (ML) research domain, such as the development of deep learning (DL) designs like convolutional neural networks (CNNs) and spiking neural networks (SNNs), and the use of both generations in natural language processing and computer vision applications. Accordingly, researchers have categorized ANN models into generations based on their computational design and capabilities. This study explores the continual evolution and optimization of ANNs, highlighting advancements in methodologies and technical innovation. We discuss the different generations of ANNs, emphasizing their role in shaping achievements in ML research. The study underscores the significance of these generational milestones in enhancing the adaptability and efficacy of neural network models for computational tasks such as image classification.

Nonlinear Prediction in a Smart Shoe Insole

Markus Vieth — 2024-10-11

In our previous work, we have investigated different methods to compute the ideal placement of pressure sensors in a smart shoe insole. There, we used a linear model to predict the weight put on the foot/leg. In this work, we investigate how using a quadratic model instead changes the sensor placement and improves prediction performance.
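One way to read the step from a linear to a quadratic model is as a feature expansion: a quadratic model is still linear in its weights once the sensor readings are augmented with all pairwise products. The sketch below shows this expansion. The function names and the example readings are hypothetical, and the paper's actual model fitting and sensor placement procedure are not reproduced here.

```python
from itertools import combinations_with_replacement


def quadratic_features(readings):
    """Expand sensor readings into [1, x_i ..., x_i * x_j ...] with i <= j."""
    features = [1.0]
    features.extend(readings)
    for i, j in combinations_with_replacement(range(len(readings)), 2):
        features.append(readings[i] * readings[j])
    return features


def predict(weights, readings):
    """Predict a load value as a weighted sum of the expanded features."""
    features = quadratic_features(readings)
    if len(weights) != len(features):
        raise ValueError("expected one weight per expanded feature")
    return sum(w * f for w, f in zip(weights, features))


# Two hypothetical pressure readings yield six features:
# bias, x1, x2, x1*x1, x1*x2, x2*x2.
print(quadratic_features([2.0, 3.0]))
```

With d sensors the expanded vector has 1 + d + d(d + 1)/2 entries, so the number of weights grows quadratically with the number of sensors.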

Prediction of Intermuscular Co-contraction Based on the sEMG of Only One Muscle With the Same Biomechanical Direction of Action

Nils Grimmelsmann — 2024-10-11

Research aims to enhance physical abilities using exoskeletons and limb movement prediction. Surface electromyography (sEMG) signals are used for intuitive control, but their measurement is limited to superficial muscles lying just beneath the skin, so signals from deep muscles are used less frequently.
Here we extend a previously proposed method to train a virtual sensor for difficult-to-access deep muscles (e.g., the brachialis).
The method is extended from signals from the same muscle to intermuscular signals, and the results confirm simple biomechanical assumptions. The trained virtual sensors are ready for further investigation in a biomechanical model.

Bioinspired Decentralized Hexapod Control with a Graph Neural Network

Luca Hermes — 2024-10-11

Legged locomotion enables animals to navigate challenging terrains. However, it demands intricate coordination between the legs, with varying levels of information exchange depending on the task. For instance, in more demanding scenarios such as an insect climbing on a twig, greater coordination between the legs is necessary to achieve adaptive behavior. To address this challenge for legged robots, we present a concept and preliminary results of a decentralized, biologically inspired controller for a hexapod robot: based on insights into coordination influences between legs in stick insects, our approach models inter-leg information flow as message passing through a Graph Neural Network.
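The idea of message passing between legs can be sketched in a few lines. In the toy example below, each leg is a graph node holding a single scalar state, and one update step mixes a leg's own state with the mean state of its neighbors. The leg names, the edge list, the scalar states, and the fixed mixing weights are all assumptions made for illustration; they do not reproduce the coordination influences or the learned network described in the paper.

```python
import math

LEGS = ["L1", "L2", "L3", "R1", "R2", "R3"]

# Hypothetical connectivity: neighbors along each body side plus the
# opposite leg of the same segment.
EDGES = [
    ("L1", "L2"), ("L2", "L3"),
    ("R1", "R2"), ("R2", "R3"),
    ("L1", "R1"), ("L2", "R2"), ("L3", "R3"),
]


def build_neighbors(legs, edges):
    """Turn an undirected edge list into a neighbor lookup table."""
    neighbors = {leg: [] for leg in legs}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    return neighbors


def message_passing_step(states, neighbors, w_self=0.5, w_msg=0.5):
    """One round of message passing with mean aggregation and a tanh update."""
    new_states = {}
    for leg, own_state in states.items():
        incoming = [states[n] for n in neighbors[leg]]
        mean_message = sum(incoming) / len(incoming)
        new_states[leg] = math.tanh(w_self * own_state + w_msg * mean_message)
    return new_states


NEIGHBORS = build_neighbors(LEGS, EDGES)
initial = {leg: 0.0 for leg in LEGS}
initial["L1"] = 1.0  # only the front left leg carries a signal at the start
after_one = message_passing_step(initial, NEIGHBORS)
print(after_one)
```

After one step only the direct neighbors of `L1` have received information, which reflects the decentralized character of the scheme: every leg updates from local messages, and influence spreads across the body only over repeated rounds.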

Improving Trust in AI Through Sustainable and Trustworthy Reporting

Raphael Fischer — 2024-10-11

This extended abstract outlines STREP, our (S)ustainable and (T)rustworthy (REP)orting framework. It communicates performance indicators of systems that build on artificial intelligence and thus makes them more trustworthy.

Beyond Trial and Error in Reinforcement Learning

Moritz Lange — 2024-10-11

In this work, we address the trial-and-error nature of modern reinforcement learning (RL) methods by investigating approaches inspired by human cognition. By enhancing state representations and advancing causal reasoning and planning, we aim to improve RL performance, robustness, and explainability. Through diverse examples, we showcase the potential of these approaches to improve RL agents.

Closing the Loop with Concept Regularization

Andres Felipe Posada-Moreno — 2024-10-11

Convolutional Neural Networks (CNNs) are widely adopted in industrial settings, but are prone to biases and lack transparency. Explainable Artificial Intelligence (XAI), particularly through concept extraction (CE), allows for global explanations and bias detection, yet fails to offer corrective measures for identified biases. To bridge this gap, we introduce Concept Regularization (CoRe), which uses CE capabilities alongside human feedback to embed a regularization term during retraining. CoRe allows for adjustments of model sensitivities based on identified biases, aligning the model's prediction process with expert human assessments. Our evaluations on a modified metal casting dataset demonstrate CoRe's efficacy in bias mitigation, highlighting its potential to refine models in practical applications.

Provable Guarantees for Deep Learning-Based Anomaly Detection through Logical Constraints

Tim Katzke — 2024-10-11

Incorporating constraints expressed as logical formulas and based on foundational prior knowledge into deep learning models can provide formal guarantees for the fulfillment of critical model properties, improve model performance, and ensure that relevant structures can be inferred from less data. We propose to thoroughly explore such logical constraints over input-output relations in the context of deep learning-based anomaly detection, specifically by extending the capabilities of the MultiplexNet framework.

Study on the Influence of Texture Variation on the Validation Performance of a Synthetically Trained Object Detector

Alexander Moriz — 2024-10-11

In recent years, the use of synthetic data for training Deep Learning (DL) approaches has emerged as a valid alternative to the costly process of real data acquisition. Yet, the influence of the sim-to-real gap on model performance still poses an obstacle to the broader usage of synthetic data. To investigate the major contributing factors, this study focuses on the influence of texture variation as a first step. Examining different strategies for generating synthetic validation sets for the training process of an object detector, the results of this study indicate that the influence of textures alone is insufficient to cause the observable performance gap.

Interpretable Machine Learning via Linear Temporal Logic

Simon Lutz — 2024-10-11

In recent years, deep neural networks have shown excellent performance, outperforming even human experts in various tasks. However, their inherent complexity and black-box nature often make it hard, if not impossible, to understand the decisions made by these models, hindering their practical application in high-stakes scenarios.

We propose a framework for learning linear temporal logic (LTL) formulas as inherently interpretable machine learning models. These models can be trained in both supervised and unsupervised settings. Furthermore, they can easily be extended to handle noisy data and to incorporate expert knowledge.

Distributive Justice of Resource Allocation Through Artificial Intelligence

Paul Hellwig — 2024-10-11

Artificial intelligence will take over leadership functions such as rewarding employee performance. It will therefore make decisions about employee outcomes and most likely allocate different resources to employees. Resource Theory of Social Exchange distinguishes six resource classes. The theory postulates that the value of some resources depends on the identity of the provider of the resource and on the relationship with the provider. This raises the question of whether certain resources, such as the resource affiliation, have a value when they are allocated by artificial intelligence. This contribution calls for studies that investigate the value of different resources allocated by artificial intelligence in leadership functions.

Concept Extraction for Time Series With ECLAD

Antonia Holzapfel — 2024-10-11

Concept Extraction (CE) methods are being increasingly used in the image domain for explaining deep learning models, which are not inherently interpretable. However, there have not been transfer studies yet for their usage in the time series domain. The purpose of this work is to explore the use of CE methods in time series. We propose to modify the ECLAD algorithm for this domain by changing the latent space representation used to extract concepts. This method is then tested on an InceptionTime model trained on the Gunpoint dataset. Preliminary results show that we can successfully extract concepts from time series models on datasets with local features and provide conceptual explanations that effectively explain how the model works.

Trustworthy Virtual Measurements in Battery Manufacturing

Lukas Krebs — 2024-10-11

The growing demand for electric cars necessitates an increase in battery production efficiency and cost-effectiveness. Productivity can be increased by reducing joint testing efforts. To achieve this reduction while maintaining high quality standards and increasing the informational content about current production, the use of virtual measurements is examined. Ensuring the trustworthiness of virtual measurements is crucial for informed decision making, necessitating validation. This paper explores the requirements and challenges in battery manufacturing for implementing trustworthy virtual measurements. Two central requirements are identified to enable virtual measurements. Firstly, a traceability system based on the production meta-model is needed to track process parameters and quality characteristics. Secondly, a framework is proposed to facilitate reliable virtual measurements. The primary challenge for virtual measurement in battery manufacturing systems stems from the complexity of the process chain and products. It is crucial to assess how virtual measurements perform across various processes and to evaluate their transferability to different process parameters and products.