Feature request
We could leverage the grammar-based sampling introduced in ggml-org/llama.cpp#1773 to constrain generation so that the Python bindings reliably return well-formed JSON.
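To make the idea concrete, here is a rough sketch of what usage might look like. Everything in it is an assumption for illustration: `JSON_GRAMMAR` is a simplified grammar written in the GBNF notation from that PR (not the grammar shipped with llama.cpp), and `generate_json` and the `generate(prompt, grammar)` callable are hypothetical names, not part of the current bindings.

```python
import json
from typing import Any, Callable

# Simplified JSON grammar in GBNF notation (illustrative only; an assumption,
# not the grammar file shipped with llama.cpp).
JSON_GRAMMAR = r'''
root   ::= object
value  ::= object | array | string | number | "true" | "false" | "null"
object ::= "{" ws ( string ws ":" ws value ws ("," ws string ws ":" ws value ws)* )? "}"
array  ::= "[" ws ( value ws ("," ws value ws)* )? "]"
string ::= "\"" ( [^"\\] | "\\" ["\\/bfnrt] )* "\""
number ::= "-"? ([0-9] | [1-9] [0-9]*) ("." [0-9]+)? ([eE] [-+]? [0-9]+)?
ws     ::= [ \t\n]*
'''


def generate_json(generate: Callable[[str, str], str], prompt: str) -> Any:
    """Run a grammar-constrained completion and parse the result.

    `generate` is a hypothetical hook standing in for whatever the bindings
    would expose: it takes a prompt and a grammar string and returns text.
    """
    text = generate(prompt, JSON_GRAMMAR)
    # If sampling really was constrained by the grammar and generation ran to
    # completion, this parse should succeed; json.loads raises ValueError
    # otherwise (for example, if output was cut off by a token limit).
    return json.loads(text)


def fake_generate(prompt: str, grammar: str) -> str:
    """Stand-in backend so the sketch runs without a model."""
    return '{"name": "llama", "legs": 4}'


result = generate_json(fake_generate, "Describe a llama as JSON.")
```

The point of passing the grammar through a single argument is that callers could swap in a stricter grammar (for a specific schema, say) without any other API change.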
Motivation
This would make the Python bindings much more useful: downstream code could parse the model's output predictably, which matters for all kinds of tasks.
Your contribution
I intend to submit a PR implementing this feature.