Reads like Parquet.
cozip makes a ZIP behave like Parquet. Random access on every entry, queryable manifest, predicate pushdown, byte-range reads from S3 or GCS.
Still a ZIP. Now queryable. Filter the manifest first, then fetch only the bytes you need.
-- query the manifest before reading any file
SELECT name, offset, size
FROM read_cozip('s3://bucket/dataset.zip')
WHERE name LIKE 'tile_%'
LIMIT 3;
Why cozip?
cozip adds a Parquet-powered table of contents to ordinary ZIP archives. Query it in place, then fetch each file with a single byte-range request.
cozip conforms to APPNOTE 6.3.10, PKWARE's ZIP file format specification. Every existing ZIP tool opens it without changes: unzip, libarchive, your browser, your collaborators.
Writers run on vendored libzip behind a stable C ABI. Readers run on a DuckDB extension for Python, R, Julia, and SQL, plus a standalone JavaScript package for the browser and Node.
How it works
A tiny binary index at byte zero points to a Parquet manifest. Query it in place, then fetch only the files you need.
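The "filter the manifest, then fetch a byte range" idea can be illustrated with nothing but the Python standard library. This is a rough sketch, not cozip's reader or its on-disk format. The manifest here is a hypothetical list of dicts rebuilt from the ZIP's own headers rather than a Parquet table, the columns `name`, `offset`, and `size` are borrowed from the query above, and `offset` is assumed to point at the first data byte of a stored (uncompressed) entry. The helpers `build_manifest` and `read_range` are made up for the example, and an in-memory byte string stands in for an object on S3 or GCS.

```python
import io
import struct
import zipfile

# Build a small ordinary ZIP in memory with stored (uncompressed) entries.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("tile_0.bin", b"first tile")
    zf.writestr("tile_1.bin", b"second tile")
    zf.writestr("readme.txt", b"not a tile")
archive = buf.getvalue()


def build_manifest(data: bytes) -> list[dict]:
    """Hypothetical stand-in for the manifest: one row per entry."""
    rows = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            # A ZIP local file header is 30 fixed bytes followed by the
            # file name and the extra field; their lengths sit at bytes 26-29.
            header = data[info.header_offset : info.header_offset + 30]
            name_len, extra_len = struct.unpack("<HH", header[26:30])
            data_offset = info.header_offset + 30 + name_len + extra_len
            rows.append(
                {"name": info.filename, "offset": data_offset, "size": info.compress_size}
            )
    return rows


def read_range(data: bytes, offset: int, size: int) -> bytes:
    """Stands in for one HTTP request with a `Range: bytes=...` header."""
    return data[offset : offset + size]


# Step 1: filter the manifest without touching any entry data.
manifest = build_manifest(archive)
tiles = [row for row in manifest if row["name"].startswith("tile_")]

# Step 2: fetch only the selected entries, one range read each.
payloads = {row["name"]: read_range(archive, row["offset"], row["size"]) for row in tiles}
```

Against remote storage, `read_range` would become a single ranged GET per selected entry, so the bytes of `readme.txt` are never transferred. Compressed entries would also need a decompression step after the range read.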
Install
Same archive on disk, five ways to read or write it. Each page is a focused quickstart.