tar-xz - v6.1.1

    Universal tar.xz library — stream-first, Node and Browser, same API.

    Create and extract .tar.xz archives with streaming support for Node.js and WebAssembly-powered support for browsers — using the same create/extract/list function names in both environments.

• Unified API — create, extract, and list work identically in Node.js and browsers
• Stream-shaped API — all functions return AsyncIterable<…> and accept stream-shaped inputs. Node extract()/list() now stream chunks as XZ decompresses them, so memory stays O(largest single entry). v6.0.0 introduced the stream-first API contract; v6.1.0 delivers the planned optimization that fulfills it.
• Flexible input — extract() and list() accept AsyncIterable, Uint8Array, ArrayBuffer, Web ReadableStream, or Node ReadableStream
• Flexible source — create() accepts fs paths (Node), Buffer/Uint8Array, or AsyncIterable<Uint8Array> per file
• File helpers — tar-xz/file subpath for disk I/O (Node only)
    • Full TAR support — POSIX ustar with PAX extensions for long filenames and metadata
    • TypeScript — full type definitions included
• One runtime dependency — node-liblzma (native addon in Node, WASM in browser)

Install with your package manager of choice:

npm install tar-xz
    pnpm add tar-xz
    yarn add tar-xz

    Node.js and browser use the same import:

    import { create, extract, list } from 'tar-xz';
    

    Bundlers (Vite, Webpack, esbuild) resolve to the WASM implementation in browser builds automatically via the browser condition in package.json.

    create() returns an AsyncIterable<Uint8Array>. Pipe it wherever you need:

    import { create } from 'tar-xz';

    const enc = new TextEncoder();
const archiveStream = create({
  files: [
    { name: 'hello.txt', source: enc.encode('Hello, world!') },
    { name: 'data.json', source: enc.encode(JSON.stringify({ ok: true })) },
  ],
  preset: 6, // XZ compression level 0–9 (default: 6)
  filter: (file) => !file.name.endsWith('.tmp'), // optional
});

    // Collect to a Uint8Array (browser / in-memory use)
    const chunks: Uint8Array[] = [];
for await (const chunk of archiveStream) {
  chunks.push(chunk);
}
    const archive = new Uint8Array(chunks.reduce((n, c) => n + c.length, 0));
    let offset = 0;
    for (const chunk of chunks) { archive.set(chunk, offset); offset += chunk.length; }

// Or pipe to a WritableStream (browser)
// (assumes some WritableStream<Uint8Array> named `writable`, e.g. from the File System Access API)
const writer = writable.getWriter();
for await (const chunk of create({ files: [...] })) {
  await writer.write(chunk);
}
await writer.close();

    extract() yields TarEntryWithData objects. Consume entry.data or use the convenience helpers entry.text() and entry.bytes():

    import { extract } from 'tar-xz';

for await (const entry of extract(archiveStream)) {
  if (entry.type === '0') { // regular file — TarEntryType.FILE
    const content = await entry.text();
    console.log(entry.name, content);
  }
}

// Or collect raw bytes
for await (const entry of extract(archive)) {
  if (entry.type === '0') {
    const bytes = await entry.bytes();
    console.log(entry.name, bytes.byteLength, 'bytes');
  }
}

    Important: entry.data is a lazy AsyncIterable — consume or skip each entry before the for await loop advances to the next one. Calling entry.bytes() or iterating entry.data fully satisfies this requirement.
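
If you want to stream an entry straight to disk instead of buffering it, a minimal Node sketch (the archive path and output handling are simplified; it assumes regular files whose target directories already exist):

import { extract } from 'tar-xz';
import { createReadStream, createWriteStream } from 'node:fs';
import { Readable } from 'node:stream';
import { pipeline } from 'node:stream/promises';

for await (const entry of extract(createReadStream('archive.tar.xz'))) {
  if (entry.type === '0') {
    // Consume entry.data before the outer loop advances to the next entry;
    // the entry is piped chunk by chunk rather than collected in memory.
    // NOTE: validate entry.name (or use extractFile) before writing
    // untrusted archives to disk.
    await pipeline(Readable.from(entry.data), createWriteStream(entry.name));
  }
}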

    list() yields TarEntry metadata without reading file content:

    import { list } from 'tar-xz';

for await (const entry of list(archiveStream)) {
  console.log(entry.name, entry.size, entry.mtime);
}

All of the following are valid as the first argument to extract() and list():

extract(asyncIterable)       // AsyncIterable<Uint8Array>
extract(syncIterable)        // Iterable<Uint8Array> (e.g. [chunk1, chunk2])
extract(uint8array)          // Uint8Array (in-memory archive)
extract(arrayBuffer)         // ArrayBuffer
extract(webReadableStream)   // ReadableStream<Uint8Array> (Web Streams)
extract(nodeReadableStream)  // NodeJS.ReadableStream (Node only)

create() file sources can likewise take several shapes:

create({
  files: [
    { name: 'from-disk.txt', source: '/absolute/or/relative/path' }, // Node only
    { name: 'from-buffer.bin', source: Buffer.from([0x01, 0x02]) },
    { name: 'from-uint8.bin', source: new Uint8Array([0x03, 0x04]) },
    { name: 'from-stream.bin', source: asyncIterableOfChunks },
  ],
});

Passing a string source (an fs path) throws a helpful error in browser environments, since browsers have no filesystem access.
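
When the input comes from the user in a browser, read File objects into memory first. A short sketch, assuming `picked` holds File objects from an <input type="file"> element or a drop event (the variable name is illustrative):

import { create } from 'tar-xz';

// Illustrative: `picked` comes from a file input or drag-and-drop.
declare const picked: File[];

const archiveStream = create({
  files: await Promise.all(
    picked.map(async (file) => ({
      name: file.name,
      // Read each File into memory — fs path sources are Node-only.
      source: new Uint8Array(await file.arrayBuffer()),
    })),
  ),
});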

    tar-xz/file wraps the stream API with disk I/O convenience functions:

    import { createFile, extractFile, listFile } from 'tar-xz/file';

    // Write archive to disk
await createFile('archive.tar.xz', {
  files: [
    { name: 'a.txt', source: '/path/to/a.txt' },
    { name: 'b.txt', source: Buffer.from('hello') },
  ],
});

    // Extract archive from disk to a directory
await extractFile('archive.tar.xz', {
  cwd: './output', // target directory (default: process.cwd())
  strip: 1, // strip N leading path components (default: 0)
  filter: (entry) => entry.name.endsWith('.ts'), // optional
});

    // List archive on disk (returns TarEntry[])
    const entries = await listFile('archive.tar.xz');
for (const entry of entries) {
  console.log(entry.name, entry.size);
}

    Do not import tar-xz/file in browser bundles — it imports node:fs and will fail at runtime. Use create/extract/list directly in browser code.

    extractFile enforces layered path-safety checks before writing any bytes to disk: traversal detection (.. and absolute paths), leaf-symlink rejection (O_NOFOLLOW on POSIX), ancestor-symlink TOCTOU guard, hardlink validation, NUL/empty name rejection, and setuid/setgid/sticky-bit stripping.

    The TOCTOU mitigation differs by platform:

    POSIX (Linux, macOS): FILE entries are written through a file descriptor opened with O_NOFOLLOW. The fd is held open for the entire streaming write, so the window between the safety check and the last write is effectively zero.

Windows: O_NOFOLLOW is not available, so extraction previously fell back to by-path stream operations (createWriteStream). With streaming delivery (v6.1.0), the window between the initial safety check and the last written byte scales with the entry's size, so a co-tenant process able to modify cwd could race a symlink swap during that window.

That gap is now closed: the Windows path uses open(target, 'wx') (atomic exclusive create) with an unlink-and-retry pattern for legitimate overwrites. If a symlink is injected between the unlink and the retry-open, extraction fails with a security error. All writes and metadata operations are fd-based. See SECURITY.md for the full reparse-tag coverage table and user mitigations.

Windows recommendation, as defense in depth: always extract to a directory owned exclusively by the calling process, and do not extract user-supplied archives into shared, world-writable, or TEMP-like directories.
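
Roughly, the exclusive-create pattern described above looks like the following sketch (not the library's actual code; error handling is simplified and the helper name is illustrative):

import { open, unlink } from 'node:fs/promises';

// Sketch only: open a write target without following a pre-existing symlink.
async function openExclusive(target: string) {
  try {
    // 'wx' (O_CREAT | O_EXCL) fails with EEXIST if anything exists at target.
    return await open(target, 'wx');
  } catch (err: any) {
    if (err.code !== 'EEXIST') throw err;
    await unlink(target); // legitimate overwrite: remove the existing file...
    // ...and retry exclusively; a symlink injected in between makes this
    // second open fail again instead of silently following the link.
    return await open(target, 'wx');
  }
}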

    Compute a checksum over the compressed bytes as they are produced:

    import { create } from 'tar-xz';
    import { createHash } from 'node:crypto';

    const hasher = createHash('sha256');
    const chunks: Uint8Array[] = [];

for await (const chunk of create({ files: [...] })) {
  hasher.update(chunk);
  chunks.push(chunk);
}

    const digest = hasher.digest('hex');
    console.log('SHA-256:', digest);

    Stream the archive directly to an HTTP endpoint without buffering:

    import { create } from 'tar-xz';
    import { Readable } from 'node:stream';
    import { pipeline } from 'node:stream/promises';

    const archiveStream = Readable.from(create({ files: [...] }));
    // Use with node:http or any streaming HTTP client
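
For example, a minimal sketch of uploading the stream with node:http (the endpoint URL and file contents are illustrative):

import { create } from 'tar-xz';
import { Readable } from 'node:stream';
import { pipeline } from 'node:stream/promises';
import { request } from 'node:http';

// Illustrative endpoint; handle the response as your client requires.
const req = request('http://example.com/upload', { method: 'PUT' });
req.on('response', (res) => console.log('upload status:', res.statusCode));

await pipeline(
  Readable.from(create({ files: [{ name: 'a.txt', source: new TextEncoder().encode('hi') }] })),
  req, // pipeline ends the request when the archive stream completes
);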

To stream-extract a remote archive, pass the fetch response body to extract() directly:

import { extract } from 'tar-xz';

    const response = await fetch('https://example.com/archive.tar.xz');
    // ReadableStream<Uint8Array> is accepted directly
for await (const entry of extract(response.body!)) {
  if (entry.type === '0') {
    const text = await entry.text();
    console.log(entry.name, text);
  }
}

In the browser, the same pattern can persist entries to IndexedDB, OPFS, or similar storage:

import { extract } from 'tar-xz';

    const response = await fetch('large.tar.xz');
for await (const entry of extract(response.body!)) {
  if (entry.type === '0') {
    const bytes = await entry.bytes();
    // write to IndexedDB, OPFS, etc.
    await saveToStorage(entry.name, bytes);
  }
}

The main export provides the stream API:

• create(options: CreateOptions) => AsyncIterable<Uint8Array> — compressed archive chunks
• extract(input: TarInput, options?: ExtractOptions) => AsyncIterable<TarEntryWithData> — entries with data
• list(input: TarInput) => AsyncIterable<TarEntry> — metadata only

The tar-xz/file subpath provides the file helpers:

• createFile(path: string, options: CreateOptions) => Promise<void> — writes the archive to path
• extractFile(archivePath: string, options?: ExtractOptions & { cwd?: string }) => Promise<void> — extracts to cwd
• listFile(archivePath: string) => Promise<TarEntry[]> — collected entry array

CreateOptions fields:

• files: TarSourceFile[] (required) — files to include
• preset: number (default: 6) — XZ compression level 0–9
• filter?: (file: TarSourceFile) => boolean — return false to exclude a file

TarSourceFile fields:

• name: string — path inside the archive
• source: string | Uint8Array | ArrayBuffer | AsyncIterable<Uint8Array> — file content or fs path (Node only)
• mode?: number — file permissions (default: 0o644)
• mtime?: Date — modification time (default: now)

ExtractOptions fields:

• strip: number (default: 0) — strip N leading path components
• filter?: (entry: TarEntry) => boolean — return false to skip an entry

    Metadata yielded by list() and attached to each TarEntryWithData:

• name: string — relative file path
• type: TarEntryTypeValue — entry type ('0'=file, '5'=dir, '2'=symlink, …)
• size: number — file size in bytes
• mode: number — file permissions
• uid / gid: number — owner user/group IDs
• uname / gname: string — owner user/group names
• mtime: number — modification time (seconds since epoch)
• linkname: string — symlink / hardlink target

TarEntryWithData additionally exposes:

• data: AsyncIterable<Uint8Array> — lazy content stream (consume once, in order)
• text(encoding?) — collect all chunks and decode to a string (default UTF-8)
• bytes() — collect all chunks into a single Uint8Array

XZ preset trade-offs (approximate WASM memory, speed, compression ratio):

• Preset 1 — ~10 MB, fastest, lowest ratio; good for batches of many small files
• Preset 3 — ~20 MB, fast, good ratio; memory-constrained environments
• Preset 6 — ~100 MB, medium speed, very good ratio; the default (Node + browser)
• Preset 9 — ~700 MB, slowest, best ratio; archive longevity

Browser limitations:

• No fs path sources — source: '/path/to/file' throws; use Uint8Array or AsyncIterable instead (see the sketch after this list)
    • 256 MB WASM memory cap — single-file content exceeding this limit will fail; batch large files carefully
    • No synchronous APIs — all browser operations are async
    • Preset 1–6 recommended — presets 7–9 approach or exceed the memory cap
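
For instance, a browser-side sketch that keeps sources in memory and picks a low preset (the file name and contents are illustrative):

import { create } from 'tar-xz';

const enc = new TextEncoder();
const chunks: Uint8Array[] = [];
for await (const chunk of create({
  files: [{ name: 'notes.txt', source: enc.encode('hello from the browser') }],
  preset: 3, // stays well inside the WASM memory cap
})) {
  chunks.push(chunk);
}
const blob = new Blob(chunks); // hand off to download, upload, OPFS, etc.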

    Archives produced by tar-xz are fully compatible with standard tooling:

    # Extract with system tar
    tar -xJf archive.tar.xz

    # List contents
    tar -tJf archive.tar.xz

    v6 unifies the Node and browser APIs under a single set of function names.

Renamed (v5 → v6):

• createTarXz(options) → create(options) (browser entry point)
• extractTarXz(archive) → extract(archive) (browser entry point)
• listTarXz(archive) → list(archive) (browser entry point)
• extractToMemory(path) → extract(createReadStream(path)) + entry.bytes()
• BrowserCreateOptions → CreateOptions
• BrowserExtractOptions → ExtractOptions
• ExtractedFile → TarEntryWithData

    create() now returns AsyncIterable<Uint8Array> instead of Promise<Uint8Array>. Collect all chunks if you need the full buffer:

    // v5
    const archive = await createTarXz({ files: [...] });

    // v6
    const chunks: Uint8Array[] = [];
    for await (const chunk of create({ files: [...] })) chunks.push(chunk);
    const archive = Buffer.concat(chunks); // Node
    // or new Blob(chunks) for a Blob in browser

    extract() now returns AsyncIterable<TarEntryWithData> instead of Promise<Array<...>>. Iterate with for await:

    // v5
    const files = await extractTarXz(archive);
    for (const f of files) { /* f.name, f.data */ }

    // v6
for await (const entry of extract(archive)) {
  if (entry.type === '0') {
    const data = await entry.bytes(); // f.data equivalent
  }
}

// v5 (mixed in main export)
    import { create, extract } from 'tar-xz';
    await create({ file: 'archive.tar.xz', cwd: '.', files: ['a.txt'] });

    // v6 (dedicated subpath)
    import { createFile, extractFile } from 'tar-xz/file';
    await createFile('archive.tar.xz', { files: [{ name: 'a.txt', source: 'a.txt' }] });

    node-tar (230M+ downloads/month) does not support .tar.xz. tar-xz fills that gap by combining:

    • node-liblzma for XZ compression (native addon in Node, WASM in browser)
    • A minimal POSIX ustar TAR implementation (no external dependencies)

    LGPL-3.0 — same as node-liblzma.