Journée du GT SCALE


Journée du GT SCALE
Vendredi, 3 Décembre, 2021


ENS-Lyon, site Monod (46 allée d’Italie)
Salles des conseils (2ème étage, centre)




10h00 Accueil café

10h30 Redouane Elghazi
Shelf schedules for independent moldable tasks to minimize the energy consumption
Scheduling independent tasks on a parallel platform is a widely-studied problem, in particular
when the goal is to minimize the total execution time, or makespan (P||Cmax problem
in Graham’s notations). Also, many applications do not consist of sequential tasks, but rather
parallel moldable tasks that can decide their degree of parallelism at execution (i.e., on how
many processors they are executed). Furthermore, since the energy consumption of data centers
is a growing concern, both from an environmental and economical point of view, minimizing the
energy consumption of a schedule is a main challenge to be addressed. One can then decide, for
each task, on how many processors it is executed, and at which speed the processors are operated,
with the goal to minimize the total energy consumption. We further focus on co-schedules,
where tasks are partitioned into shelves, and we prove that the problem of minimizing the energy
consumption remains NP-complete when static energy is consumed during the whole duration
of the application. We are however able to provide an optimal algorithm for the schedule within
one shelf, i.e., for a set of tasks that start at the same time. Several approximation results are
derived, and simulations are performed to show the performance of the proposed algorithms.

11h00 Lucas Perotin
Resilient Scheduling of Moldable Parallel Jobs to Cope with Silent Errors
This presentation features resilient scheduling of moldable parallel jobs on high-performance
computing (HPC) platforms. Moldable jobs allow for choosing a processor allocation before
execution, and their execution time obeys various speedup models. The objective is to minimize
the overall completion time or the makespan, when jobs can fail due to silent errors and hence
may need to be re-executed after each failure until successful completion. Our work introduce two
resilient scheduling algorithms that generalizes the classical scheduling framework for failure-free
jobs and cope with silent errors.

11h30 Vincent Fagnon
Scheduling avec oracle

Après Midi

14h00 Chifaa Dahik
Optimisation discrète robuste en présence d’incertitude ellipsoïdale
On s’intéresse à la version robuste des problèmes linéaires à variables binaires avec un ensemble
d’incertitude ellipsoïdal corrélé. Puisque ce problème est NP-difficile, une approche heuristique
intitulée DFW et basée sur l’algorithme de Frank-Wolfe est proposée. Dans cette approche, nous examinons la puissance d’exploration des itérations internes binaires de la méthode.
Pour les problèmes de petites tailles, la méthode est capable de fournir la solution optimale
fournie par CPLEX, après quelques centaines d’itérations. De plus, contrairement à la méthode
exacte, notre approche s’applique à des problèmes de grandes tailles également. Les résultats
numériques ont été appliqués au plus court chemin robuste. Un autre objectif est de proposer
une relaxation semi-définie positive (SDP) pour le plus court chemin robuste qui fournit une
borne inférieure pour valider des approches telles que l’algorithme DFW. Le problème relaxé
est le résultant d’une bidualisation du problème. Puis le problème relaxé est résolu en utilisant
une version creuse d’une méthode de décomposition dans un espace produit. Cette méthode
de validation est adaptée aux problèmes de grande taille. Finalement, une autre adaptation de
l’algorithme de Frank-Wolfe a été réalisé pour le problème du k-médiane, accompagnée d’un
algorithme d’arrondissement qui satisfait les contraintes

14h30 Anthony Dugois
Bounding the Flow Time in Online Scheduling with Structured Processing Sets
Replication in distributed key-value stores makes scheduling more challenging, as it introduces
processing set restrictions, which limits the number of machines that can process a given
task. We focus on the online minimization of the maximum response time in such systems, that
is, we aim at bounding the latency of each task. When processing sets have no structure, Anand
et al. (Algorithmica, 2017) derive a strong lower bound on the competitiveness of the problem :
no online scheduling algorithm can have a competitive ratio smaller than Omega(m), where m is the
number of machines. In practice, data replication schemes are regular, and structured processing
sets may make the problem easier to solve. We derive new lower bounds for various common
structures, including inclusive, nested or interval structures. In particular, we consider fixed sized
intervals of machines, which mimic the standard replication strategy of key-value stores. We
prove that EFT scheduling is (3 -2/k )-competitive when optimizing max-flow on disjoint intervals
of size k. However, we show that the competitive ratio of EFT is at least m - k + 1 when
these intervals overlap, even when unit tasks are considered. We compare these two replication
strategies in simulations and assess their efficiency when popularity biases are introduced, i.e.,
when some machines are accessed more frequently than others because they hold popular data.

15h00 Andrei Da Silva
Scheduling of Functions on Serverless Platforms

15h30 Manal Benaissa
Standalone Data-Center Sizing Combating the Over-provisioning of Both the IT
and Electrical Parts

During the two last decades, sustainability of IT infrastructures like data centers became a
major concern for computer giants like Google or Amazon. Data centers powered with renewable
energy have been proposed. But because of the intermittency of these power alternatives, these
platforms remains connected to the classical power grid. IT structure and electrical constraints
were often questioned separately, leading to a non-efficient global system. In this paper, an energy
self-sufficient green data center is modeled and designed, by proposing an electrically autonomous infrastructure including wind turbines, solar panels and its own short and long-term storage
devices mainly based on batteries and hydrogen system. The sizing of such an infrastructure one
of the main issue of the DATAZERO2 project. Based of previous sizing suggestions, the paper
shows how to reduce and combat the overprovioning by questioning its impact on the QoS, when
decreasing the number of computing or storage elements (servers and batteries).

16h00 Break / Discussion scientifique puis clôture de la journée