What is a .WARC file?
WARC (Web ARChive) is the standard format web archives use to store crawled pages. It is the format behind the Internet Archive’s Wayback Machine.
- Did you know?
- WARC is how the Internet Archive’s Wayback Machine preserves the web.
- WARC is the successor to the older ARC format, adding richer metadata about how each page was captured.
- It was standardised as ISO 28500 and is produced by crawlers such as the Internet Archive’s Heritrix.
- What Analyser reads
- Open documents, ebooks and publishing files beyond Office: comic books (CBZ/CBT with ComicInfo + first-page preview; CBR/CB7 identified), Microsoft XPS, FictionBook FB3, iBooks, Scrivener, Visio VSDX, R Markdown/Quarto, RTFD, WARC/MAFF web archives, TeX DVI, legacy Hangul HWP, and WordPerfect/QuarkXPress/PageMaker identification.
- Depth of analysis
- .WARC is an identification-grade format: Analyser recognises it from its bytes and decodes the header metadata it carries, rather than opening it in a full viewer. Formats that do get a full viewer are marked "Full" on the formats page.
- Open a .WARC file
- Drag a .WARC file onto the Analyser home page (or tap to pick one). It is identified entirely in your browser: nothing is uploaded, there is no account, and it works offline once installed.
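
To make "recognises it from its bytes and decodes the header metadata" concrete, here is a minimal Python sketch of that kind of check. It is not Analyser's actual code. It assumes an uncompressed file whose first record starts with a `WARC/<version>` line followed by `Name: value` fields and a blank line, as the WARC format lays out. The function name `read_warc_header` and the sample record are hypothetical, made up for illustration. Gzip-compressed archives (.warc.gz) and folded continuation lines are deliberately not handled.

```python
from typing import Optional

# Every WARC record begins with a version line such as "WARC/1.0".
WARC_MAGIC = b"WARC/"


def read_warc_header(data: bytes) -> Optional[dict]:
    """Return the version and named fields of the first record, or None.

    Hypothetical helper: a sketch of byte-level identification only.
    """
    if not data.startswith(WARC_MAGIC):
        return None  # not a (plain, uncompressed) WARC file

    # The header block ends at the first blank line (CRLF CRLF).
    head, sep, _ = data.partition(b"\r\n\r\n")
    if not sep:
        return None  # header is truncated

    lines = head.decode("utf-8", errors="replace").split("\r\n")
    version = lines[0][len("WARC/"):]

    fields = {}
    for line in lines[1:]:
        # Split on the first colon only, so values such as timestamps
        # and URNs keep their own colons.
        name, colon, value = line.partition(":")
        if colon:
            fields[name.strip()] = value.strip()
    return {"version": version, "fields": fields}


# Made-up sample record, used only to exercise the function.
sample = (
    b"WARC/1.0\r\n"
    b"WARC-Type: warcinfo\r\n"
    b"WARC-Date: 2024-01-01T00:00:00Z\r\n"
    b"WARC-Record-ID: <urn:uuid:00000000-0000-0000-0000-000000000000>\r\n"
    b"Content-Length: 0\r\n"
    b"\r\n"
)
header = read_warc_header(sample)
```

Only the first few hundred bytes are needed for this, which is why identification can be done without loading or rendering the archived pages themselves.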