alloy-llm-benchmark-creation

Scripts for building an Alloy benchmark workflow:

collect candidate Alloy models,
filter for compatibility,
generate English descriptions,
generate model instances.

1) Setup

Python environment

python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install openai ollama

API key setup

This repository expects your OpenAI key in:

./secret/key

Create the file and paste only the raw key value into it: no quotes, and no newline other than a single trailing one.
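As an illustration of the file format described above, here is a minimal sketch of reading and sanity-checking the key. The function name `read_api_key` is hypothetical, and the repository's scripts may load the key differently; only the `./secret/key` location comes from this README.

```python
from pathlib import Path


def read_api_key(path="./secret/key"):
    """Return the key stored at `path` with surrounding whitespace removed.

    Hypothetical helper: it only illustrates the expected file format.
    """
    key = Path(path).read_text(encoding="utf-8").strip()
    if not key:
        raise ValueError(f"{path} is empty")
    # Reject quoted values and anything spanning several lines or words.
    if key[0] in "\"'" or any(ch.isspace() for ch in key):
        raise ValueError(f"{path} should contain only the raw key value")
    return key
```

Calling `read_api_key()` from the repository root is a quick way to confirm that the file is readable and contains a single unquoted token.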

2) Java requirements (for filtering and instance generation)

Two Java versions are used:

Java 17 for alloy-diff.jar
Java 8 for CompoSAT.jar

These scripts require explicit environment variables:

JAVA_HOME_17
JAVA_HOME_8

Run these commands in your shell before running filter/instance scripts:

export JAVA_HOME_17="/path/to/jdk-17"
export JAVA_HOME_8="/path/to/jdk-8"

Verify both are set and correct:

echo "$JAVA_HOME_17"
"$JAVA_HOME_17/bin/java" -version

echo "$JAVA_HOME_8"
"$JAVA_HOME_8/bin/java" -version

Expected major versions:

JAVA_HOME_17/bin/java reports 17
JAVA_HOME_8/bin/java reports 1.8 (or 8)
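The manual check above can also be scripted. The sketch below parses the version string printed by `java -version` and maps the legacy `1.8` scheme to major version 8. The helper names `parse_java_major` and `java_major` are hypothetical and are not part of this repository.

```python
import os
import re
import subprocess


def parse_java_major(version_output):
    """Extract the major version from `java -version` output.

    Handles the legacy scheme ("1.8.0_392" gives 8) and the
    modern scheme ("17.0.9" gives 17).
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no version string found")
    first = int(match.group(1))
    second = match.group(2)
    if first == 1 and second is not None:
        return int(second)
    return first


def java_major(env_var):
    """Run $env_var/bin/java -version and return its major version."""
    java = os.path.join(os.environ[env_var], "bin", "java")
    result = subprocess.run(
        [java, "-version"], capture_output=True, text=True, check=True
    )
    # `java -version` writes to stderr; stdout is included for safety.
    return parse_java_major(result.stderr + result.stdout)
```

With both variables exported, `java_major("JAVA_HOME_17")` should return 17 and `java_major("JAVA_HOME_8")` should return 8.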

3) Where to put models

Use this location for your input Alloy models:

./validModels/models/

You can keep nested folders; scripts recurse through subdirectories.
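To preview which files a recursive traversal would pick up, a short sketch like the following can help. It assumes models are identified by the `.als` extension; `find_models` is a hypothetical name, and the actual scripts may discover files differently.

```python
from pathlib import Path


def find_models(root):
    """Return every .als file under `root`, at any depth, in sorted order."""
    return sorted(p for p in Path(root).rglob("*.als") if p.is_file())
```

For example, `find_models("./validModels/models")` lists the models in nested folders alongside those at the top level.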

4) Run the compatibility filter

Filter models before generation:

./validModels/checkValidity.sh ./validModels/models

What this does:

checks Alloy-diff compatibility (Java 17)
checks CompoSAT compatibility (Java 8)
copies passing models into:

./validModels/validModels/

5) Generate English descriptions for all compatible models

Create a folder called benchmark in the repository root. Copy all models you wish to use into benchmark/models.

Run the following command to generate descriptions:

python ./scripts/master.py openAI ./benchmark/models ./benchmark/descriptions

This traverses all .als files recursively and writes one .md description per model.
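One way to picture the one-description-per-model mapping is sketched below. It assumes the output mirrors the input's relative path with the extension changed to `.md`; this README does not confirm that layout, and `description_path` is a hypothetical helper rather than a function from scripts/master.py.

```python
from pathlib import Path


def description_path(model, models_root, out_root):
    """Map a model file to a description file under `out_root`.

    Assumption: the relative path is preserved and only the
    suffix changes from .als to .md.
    """
    rel = Path(model).relative_to(models_root)
    return Path(out_root) / rel.with_suffix(".md")
```

Under this assumption, a model at benchmark/models/a/b.als would be described in benchmark/descriptions/a/b.md.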

6) Generate instances for all compatible models

./scripts/generate-instances.sh ./benchmark/models ./benchmark/instances 120 4

Arguments:

input .als file or directory
output directory
time limit in seconds per model (default 120)
max parallel jobs in directory mode (default 4)
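The last two arguments combine a per-model time limit with a cap on concurrent jobs. The sketch below shows that general pattern in Python. It is not the implementation of generate-instances.sh, and the names `run_with_limit` and `run_all` are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_with_limit(cmd, time_limit=120):
    """Run `cmd` and return 'ok', 'failed', or 'timeout'."""
    try:
        # subprocess.run kills the child when the timeout expires.
        result = subprocess.run(cmd, timeout=time_limit, capture_output=True)
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if result.returncode == 0 else "failed"


def run_all(commands, time_limit=120, max_jobs=4):
    """Run commands with at most `max_jobs` at once; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_jobs) as pool:
        return list(pool.map(lambda c: run_with_limit(c, time_limit), commands))
```

Threads are sufficient here because each worker only waits on an external process, so the cap on workers is effectively a cap on concurrent solver runs.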

7) Generate exact-scope general instances (new)

This step uses scripts/InstanceGenerator.java and runs exact scopes for top-level signatures.

Default command (uses x=10, y=5):

./scripts/generate-general-instances.sh

Explicit command (same defaults shown):

./scripts/generate-general-instances.sh ./benchmark/models ./benchmark/generalInstances 10 5

Arguments:

models directory (default ./benchmark/models)
output directory (default ./benchmark/generalInstances)
x: max instances kept per scope (default 10)
y: max scope, runs 1..y (default 5)
timeout in seconds per model/scope run (optional, default 45)

Notes:

Traverses all .als files recursively under the models directory.
For each model and each scope from 1 to y, keeps up to x XML instances.
If fewer than x satisfiable instances exist at a scope, it keeps however many exist.
Output layout is: benchmark/generalInstances/<model-relative-path>/scope_<n>/.
Random selection uses a larger candidate pool per scope, then samples down to x.
Candidate pool size defaults to 3*x; override with CANDIDATE_MULTIPLIER.
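The selection rule in these notes (build a pool of up to 3*x candidates, then sample down to x, keeping everything when fewer exist) can be sketched as follows. It assumes CANDIDATE_MULTIPLIER is read as an environment variable, which this README does not state explicitly. The function names are hypothetical and do not come from scripts/InstanceGenerator.java.

```python
import os
import random


def candidate_pool_size(x=10):
    """Pool size per scope: multiplier * x, with the multiplier defaulting to 3."""
    multiplier = int(os.environ.get("CANDIDATE_MULTIPLIER", "3"))
    return multiplier * x


def sample_instances(candidates, x=10, seed=None):
    """Pick up to x candidates at random, without repeats.

    If fewer than x candidates exist, all of them are kept.
    """
    rng = random.Random(seed)
    candidates = list(candidates)
    if len(candidates) <= x:
        return candidates
    return rng.sample(candidates, x)
```

Sampling from a pool larger than x, rather than keeping the first x instances the solver returns, is what gives the kept instances some variety.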
