Publication details

Using Kubernetes in Academic Environment: Problems and Approaches

Authors

SPIŠAKOVÁ Viktória, KLUSÁČEK Dalibor, HEJTMÁNEK Lukáš

Year of publication 2023
Type Article in Proceedings
Conference Job Scheduling Strategies for Parallel Processing
MU Faculty or unit Institute of Computer Science

DOI http://dx.doi.org/10.1007/978-3-031-22698-4_12
Keywords cloud; HPC; scheduling; Kubernetes; resource management
Description In this work, we discuss our experience with using the Kubernetes orchestrator (K8s) to efficiently allocate resources in a heterogeneous and dynamic academic environment. In the commercial world, the "pay per use" model is a strong regulating factor for efficient resource usage. In the academic environment, resources are usually provided "for free" to end users, who therefore often lack a clear motivation to plan their use efficiently. In this paper, we show three major sources of inefficiency. The first is the users' requirement for interactive computing environments, where users need resources for their applications as soon as possible. Users do not appreciate waiting for interactive environments, but constantly keeping some resources available for interactive tasks is inefficient. The second is observable in both interactive and batch workloads: users tend to overestimate the resource limits their computations need, thus wasting resources. Finally, Kubernetes does not support fair-sharing functionality (dynamic user priorities), which hampers efforts to develop a fair scheme for Pod/job scheduling and/or eviction. We discuss various approaches to deal with these problems, such as scavenger jobs, placeholder jobs, Kubernetes-specific resource allocation policies, separate clusters, priority classes, and a novel hybrid cloud approach. We also show that all these proposals open interesting scheduling-related questions that are hard to answer with existing Kubernetes tools and policies. Last but not least, we provide the scheduling community with a real workload trace from our installation that captures these phenomena.
