Pull Request: Enhanced Training and Analysis Features for KataGo #1072

Open: 52 commits to be merged into base: master

@anonym-g anonym-g commented Jun 10, 2025

This pull request introduces several new features to improve the training and analysis capabilities of the KataGo project. The additions include scripts for noise injection, model merging, Elo estimation, and training statistics visualization, each designed to streamline workflows and enhance model performance.

New Features

  1. Noise Injection Scripts

    • A Python script (python/noise.py) and Bash wrapper (python/selfplay/noise.sh) to apply noise after training.
    • These scripts leverage statistical data logged in stdout.txt by the modified train.py.
    • Purpose: Facilitate fine-tuning of pre-trained models with controlled noise to improve performance efficiently.
  2. Model Merging Scripts

    • A Python script (python/merge.py) and Bash wrapper (python/selfplay/merge.sh) for merging multiple checkpoint files into a single .bin.gz model.
    • Purpose: Consolidate trained models into one network. This may be similar to the technique used for the experimental network released on April 28, 2025.
  3. Elo Estimation Script

    • A Python script (python/elo_estimate.py) for streamlined estimation of model Elo ratings.
    • Purpose: Provide a user-friendly tool to evaluate model strength relative to official releases.
  4. Training Statistics Visualization

    • Tools to visualize the key statistics collected during training; the same statistics are used for noise generation.
    • Note: Currently located in my training directory for convenience, but paths may need adjustment to accommodate other users' setups.

Statistical Noise Injection (SNI) Results

The noise injection scripts enable efficient fine-tuning of pre-trained models. Key findings from testing include:

  • Using approximately 12,000 training rows and 1,000 noise iterations (via noise.sh), fine-tuned models achieved performance comparable to official updates trained on 150,000–200,000 games (6.9M–9.25M rows).
  • Across three randomly selected pairs of consecutive official releases, fine-tuned models averaged a ~51% win rate against the newer model of each pair, i.e., they performed roughly on par with the newer release.
  • A paper documenting these results is in progress, and I welcome community testing to validate the approach.

Example from Pair 3:

  • Older model: kata1-b28c512nbt-s8003120896-d4541551568 (2024-11-23, Elo: 13935.7 ± 16.3, 3113 games)
  • Newer model: kata1-b28c512nbt-s8032072448-d4548958859 (2024-11-28, Elo: 13950.1 ± 16.4, 3372 games)
  • Fine-tuned models: kata1-b28c512nbt-s8003240928-d120263 (baseline, no noise) and noisy-1.0-1000iters-s8003240928-d120263 (with noise)
  • Results:
    • Baseline (no noise): Win rate 49.47% (139/281), Elo boost -3.71 ± 20.73
    • With noise (1,000 iterations): Win rate 51.37% (150/292), Elo boost +9.52 ± 20.34
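
The Elo figures above are consistent with the standard logistic Elo formula plus a one-standard-error interval obtained by the delta method. This is an inference from the reported numbers, not a confirmed description of how elo_estimate.py works, and the function name below is hypothetical. A minimal sketch, assuming every game is a plain win or loss (no draws):

```python
import math


def elo_diff_from_record(wins: int, games: int) -> tuple[float, float]:
    """Return (elo_difference, standard_error) for a win/loss record.

    Assumption: the logistic Elo model, with draws ignored.
    """
    if not 0 < wins < games:
        raise ValueError("need at least one win and one loss")
    p = wins / games
    # Logistic Elo model: p = 1 / (1 + 10 ** (-elo / 400)), solved for elo.
    elo = 400.0 * math.log10(p / (1.0 - p))
    # Binomial standard error of the observed win rate.
    se_p = math.sqrt(p * (1.0 - p) / games)
    # Delta method: multiply by d(elo)/dp = 400 / (ln(10) * p * (1 - p)).
    se_elo = 400.0 / (math.log(10.0) * p * (1.0 - p)) * se_p
    return elo, se_elo


for label, wins, games in [("baseline", 139, 281), ("with noise", 150, 292)]:
    elo, se = elo_diff_from_record(wins, games)
    print(f"{label}: {elo:+.2f} ± {se:.2f}")
```

Under these assumptions the two records from Pair 3 reproduce the reported -3.71 ± 20.73 and +9.52 ± 20.34. Both intervals span zero, so a few hundred games per match-up cannot separate the two variants on their own.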

Model Merging

The merging scripts consolidate multiple checkpoint files into a unified .bin.gz model, simplifying deployment and analysis. This approach may resemble the methodology behind the experimental network released on April 28, 2025.
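
The description does not state which merge rule merge.py applies. One common way to combine checkpoints of the same architecture is a (weighted) average of their parameters, and the sketch below illustrates only that idea. The `Params` type, the function name, and the use of plain Python lists in place of real tensors are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical stand-in for a checkpoint: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def average_checkpoints(
    checkpoints: Sequence[Params],
    weights: Optional[Sequence[float]] = None,
) -> Params:
    """Return the weighted average of several same-shaped checkpoints."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    if weights is None:
        weights = [1.0] * len(checkpoints)  # uniform average by default
    if len(weights) != len(checkpoints):
        raise ValueError("one weight per checkpoint is required")
    total = float(sum(weights))
    if total <= 0.0:
        raise ValueError("weights must sum to a positive value")

    reference = checkpoints[0]
    for ckpt in checkpoints[1:]:
        if set(ckpt) != set(reference):
            raise ValueError("checkpoints have different parameter names")

    merged: Params = {}
    for name, values in reference.items():
        if any(len(ckpt[name]) != len(values) for ckpt in checkpoints):
            raise ValueError(f"size mismatch for parameter {name!r}")
        # Element-wise weighted mean across all checkpoints.
        merged[name] = [
            sum(w * ckpt[name][i] for w, ckpt in zip(weights, checkpoints)) / total
            for i in range(len(values))
        ]
    return merged
```

For example, `average_checkpoints([{"w": [1.0, 2.0]}, {"w": [3.0, 6.0]}])` gives `{"w": [2.0, 4.0]}`. Exporting the merged weights to .bin.gz is a separate step that this sketch does not cover.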

Elo Estimation

The elo_estimate.py script provides a convenient way to estimate the Elo rating difference between two models.
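
To relate a rating difference back to a win rate, the standard logistic Elo model can be applied in the other direction. The helper below is a hypothetical illustration and is not part of elo_estimate.py; it uses the official ratings listed for Pair 3 as input.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Official ratings quoted for Pair 3 (older vs. newer release).
older, newer = 13935.7, 13950.1
print(f"older vs newer: {expected_score(older, newer):.1%}")
```

Under this model the 14.4-point gap between the two official releases corresponds to an expected score of about 47.9% for the older model, which gives a reference point for the 49.47% and 51.37% win rates reported above.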

Notes and Next Steps

  • The visualization tools are currently tailored to my directory structure and may require path adjustments for broader compatibility.
  • I invite feedback on the scripts’ usability and performance, particularly for the noise injection approach, which shows promising results.
  • Community validation of the fine-tuning results would be valuable, and I’m happy to collaborate on further testing or integration.
