Add Human-SL KGS-rank ladder (gtp_human<rank>.cfg, 9d→20k) + tunehuman subcommand by ChinChangYang · Pull Request #1209 · lightvector/KataGo

ChinChangYang · 2026-06-26T03:55:40Z

Summary

Adds a `tunehuman` subcommand and a set of GTP configs (gtp_human<rank>.cfg) that make
KataGo + the Human-SL net play at a chosen amateur KGS rank, from 9d down to 20k,
with consecutive configs tuned to be exactly 1 KGS rank (1 stone) apart.

`tunehuman` subcommand

Calibrates humanSLChosenMovePiklLambda (the Human-SL strength dial) to a target winrate by
playing in-process candidate-vs-baseline games at fixed visits:

- `-komi` / `-cand-color` flags realize the KGS 1-rank handicap (weaker rank as Black,
  komi 0.5, no color alternation).
- Inherits the baseline config's ruleset, so tuning is scored exactly like deployed play.
- Resumable per-round checkpointing (`-resume-file`).
- Unit tests in `cpp/tests/testhumansltuner.cpp` (run via `katago runtests`).
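
The calibration described above can be pictured as a one-dimensional root-find on λ. The sketch below is not the PR's tuner: it swaps real games for a hypothetical `simulated_win_prob` curve and uses plain bisection. It assumes only that the candidate's win rate falls as λ rises, which is how the ladder's λ values read (larger λ for weaker ranks). All function names and numbers here are illustrative.

```python
import math
import random


def simulated_win_prob(lam: float) -> float:
    """Hypothetical stand-in for real games: candidate win probability vs. baseline.

    Assumed to decrease monotonically as lambda grows (larger lambda plays weaker).
    """
    return 1.0 / (1.0 + math.exp(8.0 * (lam - 0.5)))


def play_round(lam: float, games: int, rng: random.Random) -> float:
    """Play `games` simulated games at this lambda and return the raw win rate."""
    wins = sum(rng.random() < simulated_win_prob(lam) for _ in range(games))
    return wins / games


def calibrate_lambda(target: float, lo: float, hi: float, rounds: int,
                     games_per_round: int, rng: random.Random) -> float:
    """Bisect on lambda for `rounds` rounds and return the final midpoint."""
    for _ in range(rounds):
        mid = (lo + hi) / 2.0
        if play_round(mid, games_per_round, rng) > target:
            lo = mid  # candidate too strong: raise lambda to weaken it
        else:
            hi = mid  # candidate too weak: lower lambda to strengthen it
    return (lo + hi) / 2.0


if __name__ == "__main__":
    rng = random.Random(0)
    lam = calibrate_lambda(target=0.5, lo=0.0, hi=1.5, rounds=12,
                           games_per_round=2000, rng=rng)
    print(f"calibrated lambda ~ {lam:.3f}")
```

Because each round's win rate is noisy, a bisection step near the root can go the wrong way; more games per round shrink that risk at the cost of run time.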

The ladder

29 configs, gtp_human{9d…1d,1k…20k}.cfg. The Human-SL net is conditioned on KGS rank, so a
1-rank gap is well-defined: per KGS, 1 rank = 1 stone, realized as an even game where the
stronger side (White) gets no komi compensation (komi 0.5). Each rung is tuned so that the
weaker rank, playing Black at komi 0.5, wins about 50% of its games against the rung above,
with the 95% CI ⊂ [40%, 60%], under Japanese rules.

- λ rises from 0.045 (9d) to 1.223 (20k), though not monotonically: it hovers around 0.5
  from 2d to 9k. The deep-kyu rungs need near-pure-human play because the net's rank
  profiles compress at the weakest end.
- `docs/HumanSL_Rank_Ladder.md` documents the method, ruleset rationale, reproduction commands,
  the full results table (λ / win rate / 95% CI / games per rung), and findings.
- `ladder_step.sh` + `tune_decide.py` + `tune_{lambda,maxvisits}.sh` are the automated
  sequential root-finder harness that produced the ladder.
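
As a rough illustration of the ladder's shape, the sketch below enumerates the 29 ranks and pairs each rung with the one above it under the KGS 1-rank handicap (weaker rank as Black, komi 0.5). The function names and the dictionary layout are hypothetical; only the rank range, the colors, the komi, and the `gtp_human<rank>.cfg` naming come from the description. The 9d anchor has no rung above it and is calibrated separately.

```python
def ladder_ranks() -> list[str]:
    """Ranks from strongest to weakest: 9d..1d, then 1k..20k."""
    dans = [f"{n}d" for n in range(9, 0, -1)]
    kyus = [f"{n}k" for n in range(1, 21)]
    return dans + kyus


def ladder_pairings(ranks: list[str]) -> list[dict]:
    """Pair every rung below the anchor with the rung directly above it."""
    pairings = []
    for stronger, weaker in zip(ranks, ranks[1:]):
        pairings.append({
            "black": weaker,    # weaker rank takes Black
            "white": stronger,  # rung above takes White
            "komi": 0.5,        # White gets no komi compensation
            "config": f"gtp_human{weaker}.cfg",  # config being tuned
        })
    return pairings


ranks = ladder_ranks()
pairings = ladder_pairings(ranks)
print(len(ranks), len(pairings))
print(pairings[0])
```

Tuning proceeds down the list, so each rung's baseline is the already-calibrated config one rank stronger.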

Results

Experimental calibration results: the 95% Wilson CI of the win rate lies within [40%, 60%]
for every rung from 8d to 20k; the 9d anchor's interval, [39.4%, 58.7%], dips just below
the lower bound. The 9d row is even-game parity vs the modern rank_9d profile; every other
row is the komi-0.5 handicap win rate (weaker rank as Black) vs the rung above it.

| Rank | λ (`humanSLChosenMovePiklLambda`) | 95% CI of win rate |
| --- | --- | --- |
| 9d (anchor) | 0.04500 | [39.4%, 58.7%] |
| 8d | 0.08680 | [40.5%, 53.6%] |
| 7d | 0.12670 | [40.6%, 56.7%] |
| 6d | 0.19830 | [44.1%, 59.8%] |
| 5d | 0.28064 | [43.6%, 58.8%] |
| 4d | 0.37300 | [43.9%, 56.1%] |
| 3d | 0.45556 | [43.1%, 59.7%] |
| 2d | 0.51330 | [41.9%, 58.1%] |
| 1d | 0.50930 | [42.5%, 55.7%] |
| 1k | 0.48988 | [42.5%, 58.9%] |
| 2k | 0.46755 | [40.8%, 55.7%] |
| 3k | 0.49173 | [41.5%, 58.5%] |
| 4k | 0.47130 | [40.5%, 55.8%] |
| 5k | 0.50720 | [43.6%, 58.9%] |
| 6k | 0.48925 | [42.0%, 59.6%] |
| 7k | 0.53370 | [41.8%, 60.0%] |
| 8k | 0.50640 | [40.2%, 58.1%] |
| 9k | 0.53880 | [41.3%, 54.9%] |
| 10k | 0.59036 | [42.0%, 58.0%] |
| 11k | 0.56458 | [40.5%, 55.8%] |
| 12k | 0.54297 | [42.2%, 59.3%] |
| 13k | 0.58977 | [42.1%, 59.4%] |
| 14k | 0.61625 | [41.3%, 58.7%] |
| 15k | 0.61839 | [40.2%, 58.1%] |
| 16k | 0.67050 | [42.1%, 57.9%] |
| 17k | 0.74130 | [40.9%, 55.7%] |
| 18k | 0.78210 | [40.4%, 52.2%] |
| 19k | 0.89820 | [41.0%, 59.0%] |
| 20k | 1.22270 | [40.6%, 59.4%] |
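
For readers who want to check an interval themselves, here is a minimal sketch of the standard Wilson score interval and of the acceptance test against the [40%, 60%] band. It is an independent illustration, not code from the PR; the helper names are hypothetical, and the win/game counts in the usage lines are made up, since per-rung game counts are not listed in the table above.

```python
import math


def wilson_interval(wins: int, games: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives about 95%)."""
    if games <= 0:
        raise ValueError("games must be positive")
    p_hat = wins / games
    denom = 1.0 + z * z / games
    center = (p_hat + z * z / (2.0 * games)) / denom
    half = (z / denom) * math.sqrt(
        p_hat * (1.0 - p_hat) / games + z * z / (4.0 * games * games)
    )
    return center - half, center + half


def within_band(wins: int, games: int, lo: float = 0.40, hi: float = 0.60) -> bool:
    """True when the whole 95% Wilson interval sits inside [lo, hi]."""
    low, high = wilson_interval(wins, games)
    return lo <= low and high <= hi


# Hypothetical counts, for illustration only.
print(within_band(75, 150))
print(within_band(40, 80))
```

At an observed win rate of exactly 50%, the half-width simplifies to z / (2·sqrt(n + z²)), so the interval can only fit inside [40%, 60%] once n ≥ 24·z² ≈ 92.2, that is, with roughly 93 or more games.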

Notes

Draft — opening for discussion/review. Configs use humanSLProfile = preaz_<rank>
(pre-AlphaZero KGS-rank profiles) at 400 visits. The subcommand is backend-agnostic.
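
To make the note concrete, the sketch below renders the handful of settings that distinguish one rung from another. `humanSLProfile` and `humanSLChosenMovePiklLambda` are named in this PR; using `maxVisits` for the 400-visit cap is an assumption about how the configs express it, and the real `gtp_human<rank>.cfg` files presumably set many more keys (rules, komi handling, model paths) that are omitted here. The helper name is hypothetical.

```python
def render_human_cfg_fragment(rank: str, lam: float, visits: int = 400) -> str:
    """Render an illustrative fragment of the per-rank settings, not a full config."""
    lines = [
        f"humanSLProfile = preaz_{rank}",            # pre-AlphaZero KGS-rank profile
        f"humanSLChosenMovePiklLambda = {lam:.5f}",  # tuned strength dial for this rung
        f"maxVisits = {visits}",                     # assumed key for the fixed visit count
    ]
    return "\n".join(lines) + "\n"


# 5k row from the results table.
print(render_human_cfg_fragment("5k", 0.50720))
```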

🤖 Generated with Claude Code

https://claude.ai/code/session_01L2nqY5X9rSVpH65nWHCPaF

…n subcommand

Adds a `tunehuman` subcommand and a complete set of GTP configs that make KataGo (with the Human-SL net) play at a chosen amateur rank from 9d down to 20k, where each consecutive rank is exactly 1 KGS rank (1 stone) apart.

tunehuman subcommand (cpp/command/tunehuman.cpp, cpp/program/humansltuner.{cpp,h}):
- Plays in-process candidate-vs-baseline games and calibrates humanSLChosenMovePiklLambda (the strength dial) to a target winrate at fixed visits, reading the raw winrate (robust to the steep, ceiling-biased λ curve).
- -komi / -cand-color flags for the KGS 1-rank handicap (weaker rank as Black, komi 0.5, no color alternation); inherits the baseline config's ruleset so tuning is scored exactly like deployed play.
- Resumable per-round checkpointing (-resume-file) to survive long runs.
- Unit tests in cpp/tests/testhumansltuner.cpp (run via `katago runtests`).

The ladder (cpp/configs/gtp_human{9d..1d,1k..20k}.cfg):
- 29 configs. The 9d anchor is even-game parity vs the modern rank_9d profile; every weaker rank is tuned so it (Black, komi 0.5) is an even game (50%) vs the rung above it, with the 95% Wilson CI inside [40%, 60%]. Japanese rules.
- λ rises 0.045 (9d) → 1.223 (20k); the deep-kyu rungs need near-pure-human play because the Human-SL net's rank profiles compress at the weakest end.

docs/HumanSL_Rank_Ladder.md documents the method, ruleset rationale, reproduction commands, the full results table (λ / win rate / 95% CI / games per rung), and findings.

ladder_step.sh + tune_decide.py + tune_{lambda,maxvisits}.sh are the automated sequential root-finder harness that produced the ladder.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01L2nqY5X9rSVpH65nWHCPaF

…<year>) Consolidate rank_*/preaz_* into unified <rank> keys driven by the empirically-tuned KGS-rank ladder from lightvector/KataGo PR lightvector#1209 (preaz_<rank> + tuned humanSLChosenMovePiklLambda), and rename proyear_<year> to "Pro <year>" derived from the 9d config with lambda 0.06. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01WoQvc49FJf5btyZsHLawb3

TDD plan: rewrite HumanSLModel around clean menu keys (lightvector#1209 preaz_<rank> ladder + Pro <year>), legacy input normalization, picker wiring, and a 3-platform verification pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01WoQvc49FJf5btyZsHLawb3

Consolidate rank_*/preaz_* into single <rank> keys driven by PR lightvector#1209's tuned KGS-rank ladder (preaz_<rank> + per-rank humanSLChosenMovePiklLambda), rename proyear_<year> to 'Pro <year>' (9d-derived, lambda 0.06), and add legacy input normalization. AI profile behavior preserved. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01WoQvc49FJf5btyZsHLawb3

Human-profile engine moves search a fixed 400 visits (lightvector#1209 calibration point) instead of the time budget, so a rank plays at its calibrated strength; "Time per move" applies only to the AI profile (human profiles get an Engine/Human toggle). Continuous analysis stays unbounded. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01WoQvc49FJf5btyZsHLawb3

Add GtpCommandBuilder.searchBudgetCommands: AI profile stays time-bounded with unbounded visits; human rank/pro profiles use a fixed 400 visits (lightvector#1209 calibration point) + a safety time cap, ignoring Time per move. genMoveAnalyzeCommands now takes the effective profile. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01WoQvc49FJf5btyZsHLawb3

ChinChangYang wants to merge 1 commit into lightvector:master from ChinChangYang:gtp-human-rank-configs
