AI Model Attestation
How do you verify the provenance of the models you run on your laptop or in production? Who built them? Who has edited them?
Chalk can help.
Signing a GGUF file
Make sure you have a working installation of Chalk (see the Installation Guide).
Then grab llama.cpp from your favorite package manager.
And grab a model. We’ll use qwen2.5-0.5b-instruct-q2_k.gguf since it’s small.
curl -LO https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct-GGUF/resolve/main/qwen2.5-0.5b-instruct-q2_k.gguf
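As a quick sanity check after the download: every GGUF file starts with the four ASCII magic bytes “GGUF” at offset 0. A small helper (the function name is ours, not part of any tool):

```shell
#!/bin/sh
# Check that a file begins with the GGUF magic bytes ("GGUF").
is_gguf() {
  [ "$(head -c 4 "$1")" = "GGUF" ]
}
```

For example, `is_gguf qwen2.5-0.5b-instruct-q2_k.gguf && echo OK` should print OK for the file we just fetched.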
Now sign it with Chalk.
chalk insert ./qwen2.5-0.5b-instruct-q2_k.gguf
The model is a binary file, but let’s tweak it slightly anyway, replacing the final “t” in the model’s name “instruct” with an “e” to make it “instruce”.
printf '\x65' | dd of=qwen2.5-0.5b-instruct-q2_k.gguf bs=1 seek=158 count=1 conv=notrunc
00000000 47 47 55 46 03 00 00 00 23 01 00 00 00 00 00 00 GGUF....#.......
00000010 1B 00 00 00 00 00 00 00 14 00 00 00 00 00 00 00 ................
00000020 67 65 6E 65 72 61 6C 2E 61 72 63 68 69 74 65 63 general.architec
00000030 74 75 72 65 08 00 00 00 05 00 00 00 00 00 00 00 ture............
00000040 71 77 65 6E 32 0C 00 00 00 00 00 00 00 67 65 6E qwen2........gen
00000050 65 72 61 6C 2E 74 79 70 65 08 00 00 00 05 00 00 eral.type.......
00000060 00 00 00 00 00 6D 6F 64 65 6C 0C 00 00 00 00 00 .....model......
00000070 00 00 67 65 6E 65 72 61 6C 2E 6E 61 6D 65 08 00 ..general.name..
00000080 00 00 15 00 00 00 00 00 00 00 71 77 65 6E 32 2E ..........qwen2.
00000090 35 2D 30 2E 35 62 2D 69 6E 73 74 72 75 63 65 0F 5-0.5b-instruce.
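The same technique can be replayed on a throwaway file: use grep’s byte-offset mode to locate the string, then patch one byte in place with dd. (The offset 158 in the real model depends on that specific file’s metadata layout; the offsets below are for this toy file only.)

```shell
#!/bin/sh
# Build a toy file that mimics the GGUF metadata region.
printf 'GGUF....general.name....instruct' > /tmp/demo.gguf
# -a treats the binary as text, -b prints byte offsets, -o prints only matches.
grep -abo 'instruct' /tmp/demo.gguf    # -> 24:instruct
# Patch the final 't' (offset 24 + 7 = 31) to 'e' (0x65);
# conv=notrunc stops dd from truncating the file after the write.
printf '\x65' | dd of=/tmp/demo.gguf bs=1 seek=31 count=1 conv=notrunc 2>/dev/null
grep -abo 'instruce' /tmp/demo.gguf    # -> 24:instruce
```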
No one is doing any integrity checks against the model itself, so even after this edit we can run llama-cli (from the llama.cpp project above) against the edited model:
$ llama-cli -m ./qwen2.5-0.5b-instruct-q2_k.gguf -p "What's 1+1?" -st --no-display-prompt 2>/dev/null
... omitted ...
> What's 1+1?
1 + 1 equals 2.
But if we ever run chalk extract against the model file, we’d see:
$ ./chalk extract ./qwen2.5-0.5b-instruct-q2_k.gguf > /dev/null
... truncated ...
error: /Users/phil/crashappsec/chalk/qwen2.5-0.5b-instruct-q2_k.gguf: extracted CHALK_ID doesn't match computed CHALK_ID
error: CNHP2S-K461-J3GE-1J61J3 vs: 75JPAC-SP64-W38S-9P64V6
Chalk catches the edit. Now, this little tweak we made wasn’t harmful, but who knows who’s messing with your model files?
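Conceptually, the check above boils down to comparing an identifier derived from the file’s contents at signing time against one recomputed at extract time. A minimal sketch of that principle in plain shell, using sha256sum as a stand-in for Chalk’s actual ID scheme (which differs, and excludes the mark itself):

```shell
#!/bin/sh
# A single flipped byte changes the content hash, which is why the
# recomputed ID no longer matches the one recorded at signing time.
printf 'instruct' > /tmp/model.bin
before=$(sha256sum /tmp/model.bin | cut -d' ' -f1)   # "signing-time" ID
printf '\x65' | dd of=/tmp/model.bin bs=1 seek=7 count=1 conv=notrunc 2>/dev/null
after=$(sha256sum /tmp/model.bin | cut -d' ' -f1)    # recomputed ID
[ "$before" != "$after" ] && echo "tamper detected"
```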
And maybe llama.cpp will catch up and impose integrity checks itself. That would be good! But Chalk also records build provenance information, and it supports (or will support) every major artifact format, whether that’s a binary, a Docker image, a GGUF model, or something else. It’s universal. If there’s an artifact format you’d like us to support that we don’t yet, just let us know!