NSDI '22 - Efficient Scheduling Policies for Microsecond-Scale Tasks

Sarah McClure and Amy Ousterhout, UC Berkeley; Scott Shenker, UC Berkeley, ICSI; Sylvia Ratnasamy, UC Berkeley

Datacenter operators today strive to support microsecond-latency applications while also using their limited CPU resources as efficiently as possible. To achieve this, several recent systems allow multiple applications to run on the same server, granting each a dedicated set of cores and reallocating cores across applications over time as load varies. Unfortunately, many of these systems do a poor job of navigating the tradeoff between latency and efficiency, sacrificing one or both, especially when handling tasks as short as 1μs.

While the implementations of these systems (threading libraries, network stacks, etc.) have been heavily optimized, the policy choices that they make have received less scrutiny. Most systems implement a single choice of policy for allocating cores across applications and for load-balancing tasks across cores within an application. In this paper, we use simulations to compare these different policy options and explore which yield the best combination of latency and efficiency. We conclude that work stealing performs best among load-balancing policies, multiple policies can perform well for core allocations, and, surprisingly, static core allocations often outperform reallocation with small tasks. We implement the best-performing policy choices by building on Caladan, an existing core-allocating system, and demonstrate that they can yield efficiency improvements of up to 13-22% without degrading (median or tail) latency.
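To make the load-balancing claim concrete, the following is a minimal toy sketch of work stealing. It is not the paper's simulator or Caladan's implementation. The function name `simulate`, the input format, and the victim-selection rule (steal from the longest queue) are assumptions made for illustration, and stealing is treated as free, which real systems cannot assume at microsecond scale.

```python
from collections import deque


def simulate(queues, steal=True):
    """Return the time at which all tasks have finished.

    queues: one list of task durations per core (hypothetical input format).
    steal:  if True, a core with no local work takes a task from the tail
            of the longest remaining queue; if False, it stays idle.
    All tasks are assumed to be available at time 0.
    """
    local = [deque(q) for q in queues]
    busy_until = [0.0] * len(local)  # time at which each core becomes free
    finish = 0.0
    while any(local):
        # Pick the core that becomes free earliest.
        core = min(range(len(local)), key=lambda c: busy_until[c])
        if local[core]:
            task = local[core].popleft()  # own work: take from the head
        elif steal:
            victim = max(range(len(local)), key=lambda c: len(local[c]))
            task = local[victim].pop()    # stolen work: take from the tail
        else:
            busy_until[core] = float("inf")  # nothing to run, never rescheduled
            continue
        busy_until[core] += task
        finish = max(finish, busy_until[core])
    return finish


# Eight 1-unit tasks all land on core 0 of a 4-core application.
skewed = [[1] * 8, [], [], []]
print(simulate(skewed, steal=False))  # 8.0
print(simulate(skewed, steal=True))   # 2.0
```

In this skewed example, the idle cores drain the overloaded queue and the completion time drops from 8 to 2 time units. The sketch ignores steal overheads and core reallocation, both of which the paper treats as central to the latency and efficiency tradeoff.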

Comments

Interesting and good job on presenting.

allanwind
