Google Research / TILOS / academic AI4EDA

Learning-Based Macro Placement

Learning-based placement is relevant to us only after we have a robust conventional flow. Google Circuit Training frames placement as a learned sequential decision problem. TILOS and follow-on work emphasize reproducibility, stronger baselines, and full-flow evaluation.

Class

Reinforcement-learning and learning-assisted floorplanning

Core Stance

Promising but contested: learned priors can accelerate macro placement across repeated design families, but claims need reproducible end-to-end benchmarking.

Architecture

How It Works

Circuit Training treats chip floorplanning as a reinforcement-learning problem: an agent places macros one at a time and is rewarded according to proxies for downstream quality rather than the final routed result.
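A minimal sketch of the sequential framing, assuming a toy grid and a single half-perimeter wirelength (HPWL) proxy. Circuit Training's published proxy also includes congestion and density terms, which are omitted here. The names hpwl, place_sequentially, and greedy_policy are hypothetical, and the greedy policy is a deterministic stand-in for a learned agent, not the actual model.

```python
from itertools import product


def hpwl(nets, positions):
    """Half-perimeter wirelength, counting only nets with 2+ placed macros."""
    total = 0
    for net in nets:
        pts = [positions[m] for m in net if m in positions]
        if len(pts) < 2:
            continue
        xs, ys = zip(*pts)
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def place_sequentially(macros, nets, grid_w, grid_h, policy):
    """Place macros one at a time; the reward is computed once at the end."""
    positions = {}
    free = set(product(range(grid_w), range(grid_h)))
    for macro in macros:
        cell = policy(macro, positions, free, nets)
        if cell not in free:
            raise ValueError(f"policy chose an occupied or invalid cell: {cell}")
        positions[macro] = cell
        free.remove(cell)
    reward = -hpwl(nets, positions)  # lower wirelength means higher reward
    return positions, reward


def greedy_policy(macro, positions, free, nets):
    """Stand-in for a learned policy: pick the free cell with the lowest
    proxy cost so far. Sorting the free cells makes tie-breaking deterministic."""
    return min(sorted(free), key=lambda c: hpwl(nets, {**positions, macro: c}))


if __name__ == "__main__":
    placed, reward = place_sequentially(
        ["a", "b", "c"], [("a", "b"), ("b", "c")], 3, 3, greedy_policy
    )
    print(placed, reward)
```

Swapping greedy_policy for a trained model changes only the policy argument. The episode loop and the proxy stay the same, which is what makes the proxy choice so important to the final result.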

The model uses graph representations of the netlist so experience from prior designs can transfer to new placements.

The open-source release includes a pre-trained checkpoint and positions learning as improving speed, reliability, and quality as more related examples are seen.

Independent assessment efforts focus on reconstructing missing pieces, publishing benchmark cases, and evaluating post-route outcomes rather than only proxy scores.

The main architectural lesson is dataset leverage: if many boards share circuit patterns, footprints, constraints, or connectors, a learned policy can propose better initial placements or hyperparameters.

Comparison

Compared With Our Flow

We have too little data and too much router instability to make RL the core placer today.

We do have repeated board families and generated designs, which could become a training/evaluation corpus.

Our deterministic analytical placer is easier to debug. Learning should augment it with initializations, cost weights, and macro ordering, not replace it yet.

Our site/benchmark pipeline could become the reproducibility harness that learning-based methods require.

Gaps

Gaps It Exposes

No dataset of routed/failed board attempts with normalized features and outcomes.

No learned warm-starts for component placement, net weighting, route ordering, or ECO blocker selection.

No strict train/test separation or benchmark reporting. Without both, any improvement attributed to learning would be suspect.

No synthetic board generator tuned to our actual design domain.
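One way to close the train/test gap is to split by board family rather than by individual run, so that near-duplicate boards never straddle the two sets. The sketch below assumes each logged run carries a "family" key. That key and the function names are hypothetical, not part of our current pipeline.

```python
import hashlib


def split_for_family(family_id, test_fraction=0.2):
    """Map a family ID to 'train' or 'test' with a stable hash.

    The same family always lands in the same split, regardless of run order
    or of which other families exist.
    """
    digest = hashlib.sha256(family_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return "test" if bucket < test_fraction else "train"


def split_runs(runs, test_fraction=0.2):
    """Partition run records so that no family appears in both splits."""
    splits = {"train": [], "test": []}
    for run in runs:
        splits[split_for_family(run["family"], test_fraction)].append(run)
    return splits
```

Hashing the family ID instead of shuffling keeps the split reproducible as new runs are logged, at the cost of only approximately matching the requested test fraction on a small corpus.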

Actions

What We Should Steal

Do not start with RL. First log every placement/routing run as a structured example: constraints, placements, route metrics, DRC results, and images.
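A structured example could be as small as one JSON line per run. The record below is a sketch under assumed field names: RunRecord, its fields, and schema_version are all hypothetical and would need to match whatever our placer and router actually emit.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RunRecord:
    run_id: str
    board_family: str
    constraints: dict       # e.g. keepouts, fixed components
    placements: dict        # component -> [x, y, rotation]
    route_metrics: dict     # e.g. wirelength, via count, runtime
    drc_violations: int
    unconnected_nets: int
    image_paths: list = field(default_factory=list)
    schema_version: int = 1  # bump when fields change so old logs stay readable


def to_jsonl_line(record):
    """Serialize one run as a single JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


def from_jsonl_line(line):
    """Rebuild a RunRecord from a line written by to_jsonl_line."""
    return RunRecord(**json.loads(line))
```

Failed runs should be logged with the same schema as successful ones, since the failures are the examples a later model needs in order to predict congestion or unroutable nets.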

Use supervised learning first: predict congested regions, failed nets, route ordering, and useful net-weight boosts.

Use learned priors only as suggestions into deterministic optimization, with reproducible fallbacks.
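The suggestion-plus-fallback pattern can be sketched as a small gate in front of the deterministic optimizer. Here choose_start, cost_fn, and is_legal are hypothetical names: cost_fn stands for the placer's existing objective and is_legal for its existing legality check.

```python
def choose_start(default_start, suggestion, cost_fn, is_legal):
    """Return (start, source) for the deterministic optimizer.

    The learned suggestion is used only if it is legal and strictly better
    than the deterministic default under the same cost function. Otherwise
    the default is returned, so the flow stays reproducible when the model
    is missing, wrong, or disabled.
    """
    if suggestion is not None and is_legal(suggestion):
        if cost_fn(suggestion) < cost_fn(default_start):
            return suggestion, "learned"
    return default_start, "fallback"
```

Recording the returned source alongside each run makes it possible to measure later how often the learned prior was accepted and whether accepted suggestions led to better routed results.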

Adopt TILOS-style full-flow benchmarking: compare post-route DRC violations, unconnected nets, area, wirelength, via count, and runtime, not just the proxy objective.
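A full-flow comparison can start as a per-metric delta between a baseline run and a candidate run on the same board. The metric keys and the regression rule below are assumptions for illustration, not an established reporting format.

```python
# Hypothetical metric keys; lower is better for every one of them.
METRICS = (
    "drc_violations",
    "unconnected_nets",
    "area_mm2",
    "wirelength_mm",
    "via_count",
    "runtime_s",
)


def compare_runs(baseline, candidate):
    """Return candidate minus baseline for each post-route metric."""
    return {m: candidate[m] - baseline[m] for m in METRICS}


def is_regression(deltas):
    """Treat any increase in DRC violations or unconnected nets as a
    regression, regardless of gains in the other metrics."""
    return deltas["drc_violations"] > 0 or deltas["unconnected_nets"] > 0
```

Under this rule a candidate that shortens wirelength but leaves one more net unconnected is still reported as a regression, which is the case a proxy-only comparison would miss.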
