v0.9.2

@futurisold futurisold released this 27 Apr 16:05
· 106 commits to main since this release

Release Notes: v0.9.2


✨ New Features

  • Unified Drawing Interface

    • Added a new high-level drawing interface with two main options:
      • gpt_image: Unified wrapper for OpenAI image APIs (supports dall-e-2, dall-e-3, gpt-image-*). Exposes OpenAI’s full Images API, including the variation and edit operations and advanced parameters such as quality, style, moderation, background, and output_compression (see updated docs).
      • flux: Simplified interface for Black Forest Labs’ Flux models via api.us1.bfl.ai.
    • Both interfaces now return a list of local PNG file paths for easy downstream consumption.
    • Documented all parameters and new interface usage for both engines.
  • New Engines

    • Added symai.backend.engines.drawing.engine_gpt_image for OpenAI's latest Images API.
    • Deprecated/removed legacy engine_dall_e.py in favor of unified engine_gpt_image.py.
  • Extended Interfaces

    • New public classes: symai.extended.interfaces.gpt_image and updated flux interface for consistency and enhanced discoverability.
    • Added comprehensive tests for drawing engines covering all models and modes (create, variation, edit).

🛠️ Improvements & Fixes

  • Flux Engine

    • Now downloads result images as temporary local PNG files. Handles non-None payload.
    • Uses correct API endpoint (api.us1.bfl.ai).
    • Cleans up error handling and makes API parameters robust against None values.
  • OpenAI Model Support

    • Added support for cutting-edge OpenAI models:
      • Chat/Vision: gpt-4.1, gpt-4.1-mini, gpt-4.1-nano
      • Reasoning: o4-mini, o3
    • Updated max context/response tokens for new models (gpt-4.1* supports up to ~1M context, 32k response tokens).
    • Tiktoken fallback: If initialization fails or support is missing for a new OpenAI model, the tokenizer falls back to the "o200k_base" encoding and shows a warning.
  • OpenAI Mixin Enhancements

    • Refined token calculations and model support for new OpenAI and BFL models.
    • Ensured consistent handling of context/response tokens as new models are released.
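
The tokenizer fallback described above can be illustrated with a minimal sketch. This is not the library's implementation: the lookup table `KNOWN_ENCODINGS` and the helper `resolve_encoding_name` are hypothetical names introduced here, and the sketch only resolves an encoding name rather than loading a tokenizer.

```python
import warnings

# Encoding used when a model is not recognised (as stated in the notes above).
FALLBACK_ENCODING = "o200k_base"

# Hypothetical lookup table for illustration only; a real tokenizer library
# keeps its own model-to-encoding registry.
KNOWN_ENCODINGS = {
    "gpt-4o": "o200k_base",
    "gpt-4": "cl100k_base",
}


def resolve_encoding_name(model, known=KNOWN_ENCODINGS):
    """Return the encoding name for `model`, warning and falling back if unknown."""
    try:
        return known[model]
    except KeyError:
        warnings.warn(
            f"No tokenizer registered for {model!r}; "
            f"falling back to {FALLBACK_ENCODING!r}.",
            stacklevel=2,
        )
        return FALLBACK_ENCODING
```

With this shape, a newly released model name keeps working with an approximate token count instead of raising, and the warning makes the approximation visible to the caller.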

📚 Documentation

  • Overhauled docs/source/ENGINES/drawing_engine.md:
    • Clearly describes new unified drawing API, how to use models, available parameters, and best practices.
    • Includes ready-to-use code examples for both OpenAI and Flux pathways.

🧪 Testing

  • Comprehensive pytest suite for drawing engines now included (tests/engines/drawing/test_drawing_engine.py).
  • Covers gpt_image in create, variation, and edit modes, and Flux for all supported models.
  • Verifies correct output (generated images exist and are valid).
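
One way to check that returned images "exist and are valid" is to compare each file's leading bytes with the PNG signature. The sketch below is an assumption about what such a check could look like, not the code in the test suite; `is_png` and `invalid_outputs` are hypothetical helper names.

```python
from pathlib import Path

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def is_png(path):
    """Return True if `path` is an existing file that starts with the PNG signature."""
    p = Path(path)
    if not p.is_file():
        return False
    with p.open("rb") as fh:
        return fh.read(len(PNG_SIGNATURE)) == PNG_SIGNATURE


def invalid_outputs(paths):
    """Return the subset of `paths` that is missing or is not a PNG file."""
    return [str(p) for p in paths if not is_png(p)]
```

A test could then assert that `invalid_outputs(result)` is empty for the list of paths an engine call returns. The signature check is cheap but shallow: it does not decode the image.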

⚠️ Breaking/Behavioral Changes

  • Legacy DALL·E Engine removed (engine_dall_e.py). Use gpt_image for all OpenAI image generation.
  • All engine calls now return image file paths (as a list), not just URLs.
  • Some parameter names and behaviors have changed (see updated docs).

If you use programmatic image generation, especially OpenAI’s DALL·E or gpt-image models, please update your code and refer to the new documentation. The new design offers greater flexibility, future-proofing for new models and APIs, and consistent developer ergonomics.
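
Because engine calls now return a list of local file paths, and the Flux engine writes its results as temporary files, downstream code may want to copy the images somewhere durable. The following is a minimal sketch under that assumption; `persist_images` is a hypothetical helper and not part of the library.

```python
import shutil
from pathlib import Path


def persist_images(paths, dest_dir):
    """Copy each image in `paths` into `dest_dir` and return the new paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, src in enumerate(paths):
        src = Path(src)
        # Prefix with an index so files with identical names do not collide.
        target = dest / f"{index:03d}_{src.name}"
        shutil.copy2(src, target)
        saved.append(target)
    return saved
```

Usage would be along the lines of `persist_images(result, "outputs/")`, where `result` is the list returned by a drawing call.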


Full Changelog: v0.9.1...v0.9.2