Skip to content

runtime: make the page allocator scale #35112

Closed
@mknyszek

Description

@mknyszek

Over the course of the last year, we've seen several cases where making relatively minor changes to the allocator slow path, which allocates pages from the heap, caused serious performance issues (#32828, #31678). The problem stemmed largely from contention on the heap lock: a central m-based spinlock (so we can't schedule another g while it's waiting!) which guards nearly all operations on the page heap. Since Go 1.11, (and likely earlier) we've observed barging behavior on this lock in applications which allocate larger objects (~1K) frequently, which indicates a collapsing lock. Furthermore, these applications tend to stay the same or worsen in performance as the number of available cores on the machine increases. This represents a fundamental scalability bottleneck in the runtime.

Currently the page allocator is based on a treap (and formerly had a small linked-list based cache for smaller free spans), but I propose we rethink it and rewrite to something that:

  1. Is more cache-friendly, branch-predictor friendly, etc. on the whole.
  2. Has a lockless fast path.

The former just makes the page allocator faster and less likely to fall into bad paths in the microarchitecture, while the latter directly reduces contention on the lock.

While increased performance across the board is what we want, what we're most interested in solving here is the scalability bottleneck: when we increase the number of cores available to a Go application, we want to see a performance improvement.

Edit: Design doc

I've already built both an out-of-tree and in-tree prototype of a solution which is based on a bitmap representing the whole heap and a radix tree over that bitmap. The key idea with bitmaps here being that we may "land grab" several pages at once and cache them in a P. Two RPC benchmarks, a Google internal one, and one based on Tile38, show up to a 30% increase in throughput and up to a 40% drop in tail latencies (90th, 99th percentile) on a 48-core machine, with the benefit increasing with the number of cores available to the benchmark.

Activity

added this to the Go1.14 milestone on Oct 23, 2019
self-assigned this
on Oct 23, 2019
gopherbot

gopherbot commented on Oct 23, 2019

@gopherbot
Contributor

Change https://golang.org/cl/200439 mentions this issue: runtime: place lower limit on trigger ratio

gopherbot

gopherbot commented on Oct 23, 2019

@gopherbot
Contributor

Change https://golang.org/cl/195698 mentions this issue: runtime: count scavenged bits for new allocation for new page allocator

gopherbot

gopherbot commented on Oct 23, 2019

@gopherbot
Contributor

Change https://golang.org/cl/190621 mentions this issue: runtime: add packed bitmap summaries

gopherbot

gopherbot commented on Oct 23, 2019

@gopherbot
Contributor

Change https://golang.org/cl/196640 mentions this issue: runtime: make more page sweeper operations atomic

gopherbot

gopherbot commented on Oct 23, 2019

@gopherbot
Contributor

Change https://golang.org/cl/195697 mentions this issue: runtime: add scavenging code for new page allocator

gopherbot

gopherbot commented on Oct 23, 2019

@gopherbot
Contributor

Change https://golang.org/cl/196643 mentions this issue: runtime: add page cache and tests

gopherbot

gopherbot commented on Oct 23, 2019

@gopherbot
Contributor

Change https://golang.org/cl/190622 mentions this issue: runtime: add new page allocator core

gopherbot

gopherbot commented on Oct 23, 2019

@gopherbot
Contributor

Change https://golang.org/cl/190619 mentions this issue: runtime: add new page allocator constants and description

gopherbot

gopherbot commented on Oct 23, 2019

@gopherbot
Contributor

Change https://golang.org/cl/196642 mentions this issue: runtime: add per-p mspan cache

gopherbot

gopherbot commented on Oct 23, 2019

@gopherbot
Contributor

Change https://golang.org/cl/201763 mentions this issue: runtime: make the scavenger self-paced

gopherbot

gopherbot commented on Oct 23, 2019

@gopherbot
Contributor

Change https://golang.org/cl/190620 mentions this issue: runtime: add mallocbits and tests

gopherbot

gopherbot commented on Oct 23, 2019

@gopherbot
Contributor

Change https://golang.org/cl/196639 mentions this issue: runtime: remove unnecessary large parameter to mheap_.alloc

77 remaining items

Loading
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @mknyszek@ianlancetaylor@ALTree@gopherbot@un000

        Issue actions

          runtime: make the page allocator scale · Issue #35112 · golang/go