Support indexing WACZ files #710

@machawk1

Via @ikreymer, Web Archive Collection Zipped (WACZ) Format, https://github.com/webrecorder/wacz-format (MIT, potentially reusable)

Example of MDN WACZ at https://twitter.com/webrecorder_io/status/1293730279824089088

https://dh-preserve.sfo2.cdn.digitaloceanspaces.com/webarchives/mdn.wacz (1.6GB)

Finalizing Issue #604 (resolving #631) would help here, depending on the WACZ's contents. Hosting larger WARCs remotely like this, since they exceed GitHub's file size limits, could also provide a way to test scalability.
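
As a rough starting point for discussion, here is a minimal sketch of how the WARCs inside a WACZ might be located before handing them to an indexer. It assumes the layout described in the wacz-format repository, where a WACZ is a ZIP archive with its WARC files under an `archive/` directory; that prefix and the helper name `list_warc_members` are assumptions for illustration, not existing code in this project.

```python
import zipfile


def list_warc_members(wacz):
    """Return the names of WARC files found inside a WACZ.

    `wacz` may be a filesystem path or a binary file-like object.
    Assumes WARCs live under "archive/" (per the WACZ format description).
    """
    with zipfile.ZipFile(wacz) as zf:
        return sorted(
            name
            for name in zf.namelist()
            # Hypothetical filter: plain or gzipped WARCs in archive/
            if name.startswith("archive/")
            and name.endswith((".warc", ".warc.gz"))
        )
```

Each returned member could then be opened with `ZipFile.open()` and streamed to the existing WARC indexing code, which would avoid extracting a multi-gigabyte file to disk first.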
