The variety and complexity of data-intensive applications and systems have increased drastically over the past decade. The tasks generated by a SQL-based big data analytics request can be very different from those of deep learning training. Nevertheless, these data-intensive applications increasingly run either on powerful shared hardware in data centers and high-performance computing (HPC) centers or on resource-constrained edge and Internet-of-Things (IoT) devices. These hardware resources are also diverse, ranging from general-purpose CPUs and GPUs to programmable FPGAs and specialized hardware such as TPUs. There is a pressing need for a more resource-aware infrastructure that effectively orchestrates different data-intensive tasks over heterogeneous processing units. To achieve this, our approach is first to investigate the resource consumption characteristics of different data-intensive workloads, and then to establish and implement guidelines for hardware resource management in data-intensive systems. Currently, our team focuses more specifically on resource-aware and resource-constrained machine learning.
This research has been supported by the Independent Research Fund Denmark and the Novo Nordisk Foundation.