GitHub - block/cachew: Cachew (pronounced cashew) is a super-fast application-level pass-through cache
Cachew (pronounced "cashew") is a tiered, protocol-aware, caching HTTP proxy for software engineering infrastructure. It understands higher-level protocols (Git, Docker, Go modules, etc.) and makes smarter caching decisions than a naive HTTP proxy.
Strategies
Git
Caches Git repositories with two complementary techniques:
- Snapshots — periodic
.tar.zstarchives that restore 4–5x faster thangit clone. - Pack caching — passthrough caching of packs from
git-upload-packfor incremental pulls.
Redirect Git traffic through cachew:
[url "https://cachew.example.com/git/github.com/"] insteadOf = https://github.com/
Restore a repository from a snapshot (with automatic delta bundle to reach HEAD):
cachew git restore https://github.com/org/repo ./repo
git { snapshot-interval = "1h" repack-interval = "1h" }
GitHub Releases
Caches public and private GitHub release assets. Private orgs use a token or GitHub App for authentication.
URL pattern: /github-releases/{owner}/{repo}/{tag}/{asset}
github-releases { token = "${GITHUB_TOKEN}" private-orgs = ["myorg"] }
Go Modules
Go module proxy (GOPROXY-compatible). Private modules are fetched via git clone.
URL pattern: /gomod/...
export GOPROXY=http://cachew.example.com/gomod,directgomod { proxy = "https://proxy.golang.org" private-paths = ["github.com/myorg/*"] }
Hermit
Caches Hermit package downloads. GitHub release URLs are automatically routed through the github-releases strategy.
URL pattern: /hermit/{host}/{path...}
Artifactory
Caches artifacts from JFrog Artifactory with host-based or path-based routing.
artifactory "example.jfrog.io" { target = "https://example.jfrog.io" }
Host
Generic reverse-proxy caching for arbitrary HTTP hosts, with optional custom headers.
host "https://ghcr.io" { headers = { "Authorization": "Bearer QQ==" } } host "https://w3.org" {}
HTTP Proxy
Caching proxy for clients that use absolute-form HTTP requests (e.g. Android sdkmanager --proxy_host).
Cache Backends
Multiple backends can be configured simultaneously — they are automatically combined into a tiered cache. Reads check each tier in order and backfill lower tiers on a hit. Writes go to all tiers in parallel.
Memory
In-memory LRU cache.
memory { limit-mb = 1024 # default max-ttl = "1h" # default }
Disk
On-disk LRU cache with TTL-based eviction.
disk { limit-mb = 250000 max-ttl = "8h" }
S3
S3-compatible object storage (AWS S3, MinIO, etc.).
s3 { bucket = "my-cache-bucket" endpoint = "s3.amazonaws.com" region = "us-east-1" }
Authorization (OPA)
Cachew uses Open Policy Agent for request authorization. The default policy allows all requests from localhost and restricts remote access to non-admin paths (/api/*, /admin/*).
Policies must be in package cachew.authz and define an allow rule. If allow is true the request proceeds; otherwise it is rejected with 403.
opa { policy = <<EOF package cachew.authz default allow := false allow if input.headers["authorization"] EOF }
Or reference an external file with optional data:
opa { policy-file = "./policy.rego" data-file = "./opa-data.json" }
Input fields: input.method, input.path (string array), input.headers, input.remote_addr (includes port — use startswith to match by IP).
Testing policies
The test field holds a Rego test module that is run against the policy when cachewd starts. Any rule prefixed with test_ is executed; if a test fails, cachewd exits.
opa { policy = <<EOF package cachew.authz default allow := false allow if input.method == "POST" EOF test = <<EOF package cachew.authz_test import data.cachew.authz test_post_allowed if authz.allow with input as {"method": "POST"} test_get_denied if not authz.allow with input as {"method": "GET"} EOF }
GitHub App Authentication
For private Git repositories and GitHub release assets, configure a GitHub App:
github-app { app-id = "12345" private-key-path = "./github-app.pem" installations = { "myorg": "67890" } }
Installations can also be discovered dynamically via the GitHub API.
CLI
Server (cachewd)
cachewd --config cachew.hcl
cachewd --schema # print config schemaClient (cachew)
# Object operations cachew get <namespace> <key> [-o file] cachew put <namespace> <key> [file] [--ttl 1h] cachew stat <namespace> <key> cachew delete <namespace> <key> cachew namespaces # Directory snapshots cachew save <namespace> <directory> [paths...] (--key <key> | -H <glob>) [--ttl 1h] [--exclude pattern] cachew restore <namespace> <directory> (--key <key> | -H <glob>) # exit 0 hit, 2 miss, 1 error # Git cachew git restore <repo-url> <directory> [--no-bundle]
Global flags: --url (CACHEW_URL), --authorization (CACHEW_AUTHORIZATION), --platform (prefix keys with os-arch), --daily/--hourly (prefix keys with date).
Observability
log { level = "info" # debug, info, warn, error } metrics { service-name = "cachew" }
Admin endpoints: /_liveness, /_readiness, PUT /admin/log/level, /admin/pprof/.
Full Configuration Example
state = "./state" bind = "0.0.0.0:8080" url = "http://cachew.example.com:8080/" log { level = "info" } opa { policy = <<EOF package cachew.authz default allow := false allow if startswith(input.remote_addr, "127.0.0.1:") EOF } metrics {} github-app { app-id = "12345" private-key-path = "./github-app.pem" } git-clone {} git { snapshot-interval = "1h" repack-interval = "1h" } github-releases { token = "${GITHUB_TOKEN}" private-orgs = ["myorg"] } gomod { proxy = "https://proxy.golang.org" private-paths = ["github.com/myorg/*"] } hermit {} host "https://ghcr.io" { headers = { "Authorization": "Bearer ${GHCR_TOKEN}" } } disk { limit-mb = 250000 max-ttl = "8h" } proxy {}