Google's JSIR: An MLIR-Based Intermediate Representation for JavaScript Analysis

When a compiler intermediate representation (IR) makes the news, you know it matters. Google has published JSIR, a next-generation JavaScript analysis tool built on MLIR, and it's already used internally for tasks that reveal just how ambitious the project is: decompiling Hermes bytecode back to JavaScript, and powering AI-assisted deobfuscation pipelines that combine JSIR with Gemini.

Why This Matters for Tooling

An intermediate representation is the data structure that a compiler or analysis tool uses to represent code between parsing and code generation. If an AST tells you what the code looks like structurally, an IR tells you what it does. The quality of your IR determines what kind of analysis and transformation you can perform.

JavaScript tooling has long suffered from fragmented IR approaches. Babel plugins work on ASTs. ESLint rules work on ASTs. Bundlers often work on their own internal representations with limited interoperability. A common, well-designed IR could let these tools share analysis work, and that's exactly what Google is proposing with JSIR.

High-Level and Low-Level Simultaneously

The core technical challenge JSIR solves is a familiar one in compiler design: you typically have to choose between a high-level IR (preserves AST structure, can be lifted back to source) and a low-level IR (enables deep dataflow analysis like taint tracking and constant propagation). Most systems pick one.
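To make the low-level side of that trade-off concrete, here is a minimal sketch of constant propagation over a flat, three-address style IR. The `Instr` format, the op names, and `propagate_constants` are hypothetical, made up for illustration; they are not JSIR's operations or API. The point is only that once code is flattened into simple instructions, this kind of analysis becomes a short loop.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union

Operand = Union[int, str]  # an integer literal or a variable name


@dataclass(frozen=True)
class Instr:
    """One toy three-address instruction: dest = lhs op rhs (not JSIR's format)."""
    dest: str
    op: str  # "const", "add", or "mul"
    lhs: Operand
    rhs: Optional[Operand] = None


def propagate_constants(program: List[Instr]) -> Dict[str, Optional[int]]:
    """Map each assigned variable to its constant value, or None if unknown."""
    known: Dict[str, Optional[int]] = {}

    def value_of(operand: Operand) -> Optional[int]:
        if isinstance(operand, int):
            return operand
        # Variables never assigned in the program (e.g. inputs) are unknown.
        return known.get(operand)

    for instr in program:
        a = value_of(instr.lhs)
        if instr.op == "const":
            known[instr.dest] = a
            continue
        b = value_of(instr.rhs) if instr.rhs is not None else None
        if a is None or b is None:
            known[instr.dest] = None  # any unknown operand poisons the result
        elif instr.op == "add":
            known[instr.dest] = a + b
        elif instr.op == "mul":
            known[instr.dest] = a * b
        else:
            raise ValueError(f"unknown op: {instr.op}")
    return known


program = [
    Instr("a", "const", 2),
    Instr("b", "const", 3),
    Instr("c", "add", "a", "b"),
    Instr("d", "mul", "c", "user_input"),  # depends on a runtime input
]
result = propagate_constants(program)
print(result)
```

The analysis is easy here, but nothing in this flat list records whether `c` came from a loop body, a branch, or a closure, which is why lifting a purely low-level IR back to readable source is hard.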

JSIR uses MLIR regions to model JavaScript's control flow structures (closures, try-catch-finally, async functions, and generator frames, for example) in a way that supports both directions at once. You can transform code and lift it back to source, or run taint analysis across the same representation.
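The idea of one representation serving both directions can be sketched with a toy nested IR. Everything below is an assumption made for illustration: the `Assign` and `If` ops, `lift_lines`, and `tainted_vars` are hypothetical names, and real MLIR regions are considerably richer than Python lists. The sketch only shows that when branches are kept as nested regions, the same structure can be printed back as source and walked by a dataflow analysis.

```python
from dataclasses import dataclass, field
from typing import List, Set, Union


@dataclass
class Assign:
    """target = expr; `uses` lists the variables that expr reads."""
    target: str
    expr: str
    uses: List[str] = field(default_factory=list)


@dataclass
class If:
    """Structured conditional: each branch is a nested region (a list of ops)."""
    cond: str
    then_region: List["Op"]
    else_region: List["Op"]


Op = Union[Assign, If]


def lift_lines(region: List[Op], indent: int = 0) -> List[str]:
    """Direction 1: print the nested ops back out as JavaScript-like source."""
    pad = "  " * indent
    lines: List[str] = []
    for op in region:
        if isinstance(op, Assign):
            lines.append(f"{pad}{op.target} = {op.expr};")
        else:
            lines.append(f"{pad}if ({op.cond}) {{")
            lines.extend(lift_lines(op.then_region, indent + 1))
            lines.append(f"{pad}}} else {{")
            lines.extend(lift_lines(op.else_region, indent + 1))
            lines.append(f"{pad}}}")
    return lines


def tainted_vars(region: List[Op], tainted: Set[str]) -> Set[str]:
    """Direction 2: forward taint propagation over the same nested ops.

    Simplification: only explicit data flow is tracked; the branch
    condition itself does not taint anything.
    """
    tainted = set(tainted)
    for op in region:
        if isinstance(op, Assign):
            if any(u in tainted for u in op.uses):
                tainted.add(op.target)
            else:
                tainted.discard(op.target)  # overwritten with clean data
        else:
            # Either branch may run, so join the two results with a union.
            then_out = tainted_vars(op.then_region, tainted)
            else_out = tainted_vars(op.else_region, tainted)
            tainted = then_out | else_out
    return tainted


program = [
    Assign("x", "location.hash", uses=["location"]),
    If("flag", [Assign("y", "x + 1", uses=["x"])], [Assign("y", "0")]),
    Assign("z", "y", uses=["y"]),
]
source = "\n".join(lift_lines(program))
tainted = tainted_vars(program, {"location"})
print(source)
print(sorted(tainted))
```

Because the `if` survives as a nested structure rather than being lowered to jumps, printing it back is a simple recursive walk, while the taint pass still sees every assignment in both branches.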

This unlocks use cases that were previously impractical:

Decompilation: JSIR is used at Google to decompile Hermes bytecode all the way back to JavaScript. Hermes compiles React Native apps to a compact bytecode for faster startup; JSIR's source-liftability is what makes this decompilation possible when other tools would hit a dead end.

Deobfuscation: Google published research (CASCADE) on combining the Gemini LLM with JSIR for JavaScript deobfuscation. The AI operates on JSIR's structured representation rather than raw obfuscated source, producing transformations that JSIR applies back to reconstruct clean code.

The MLIR Foundation

JSIR isn't a standalone project; it's built on MLIR, the LLVM project's flexible IR framework. This matters for ecosystem compatibility: MLIR already has a broad set of existing dialects, transformations, and tooling. By expressing JavaScript analysis in MLIR terms, JSIR can plug into that ecosystem rather than reinventing infrastructure.

Getting Started

JSIR is available on GitHub at github.com/google/jsir. The project recommends using Docker for local experimentation:

docker build -t jsir:latest .
docker run --rm -v "$(pwd)":/workspace jsir:latest jsir_gen --input_file=/workspace/yourfile.js

Building from source requires clang, Bazel, and significant build time; the project notes that fetching and building LLVM takes a while. The Docker path is the practical entry point for most developers.

What This Means for the Ecosystem

Most developers won't interact with JSIR directly in the near term; it's a foundation for tooling developers to build on. But the long-term implications are significant. A shared, well-designed IR could enable:

Linters with deeper semantic understanding (not just pattern matching on AST nodes)
Bundlers with better dead code elimination using dataflow analysis
Refactoring tools that can safely transform code across complex control flow
Cross-framework analysis that works consistently regardless of which framework or build tool is used

Google has open-sourced JSIR, which means the community can build on this foundation. Whether it gains traction depends on whether tooling maintainers see enough benefit to integrate JSIR-based analysis into their pipelines, but the technical foundation is solid.
