Formatter development (Emit)
How the Beskid pretty-printer works: Emit, EmitCtx, modules, and extending the formatter.
The formatter lives in the beskid_analysis crate under src/format/. It is an opinionated pretty-printer: it walks the concrete syntax AST after a successful parse and writes canonical text. It does not round-trip arbitrary whitespace or comments (except structured /// leading documentation carried on the AST).
For normative formatter and front-end contracts, see Parser and AST contracts and Command surface.
Mental model
- Parse → Spanned<Program> (or nested nodes).
- Emit → each node implements Emit::emit and writes tokens and layout into a fmt::Write target (usually a String).
- EmitCtx carries indent depth and spacing policy so nested structures share one layout discipline.
The public entry point for a whole file is:
    beskid_analysis::format::format_program(&Spanned<Program>) -> Result<String, EmitError>

Internally that constructs an Emitter, a fresh EmitCtx, and calls emit on program.node (the inner Program).
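The flow from a parsed program to formatted text can be sketched with toy types. This is a minimal sketch, not the crate's code: Spanned, Program, EmitCtx, and EmitError below are simplified stand-ins with assumed fields, the Emitter wrapper is left out, and the blank line between items is hard-coded where the real formatter would consult its policy helpers.

```rust
use std::fmt::{self, Write};

// Hypothetical stand-ins for the real AST and formatter types.
pub struct Spanned<T> {
    pub node: T,
    pub span: (usize, usize),
}

pub struct Program {
    pub items: Vec<String>, // assumption: items are pre-rendered text here
}

#[derive(Debug)]
pub struct EmitError(fmt::Error);

impl From<fmt::Error> for EmitError {
    fn from(e: fmt::Error) -> Self {
        EmitError(e)
    }
}

#[derive(Default)]
pub struct EmitCtx {
    pub indent: usize,
}

pub trait Emit {
    fn emit<W: Write>(&self, w: &mut W, cx: &mut EmitCtx) -> Result<(), EmitError>;
}

impl Emit for Program {
    fn emit<W: Write>(&self, w: &mut W, _cx: &mut EmitCtx) -> Result<(), EmitError> {
        for (i, item) in self.items.iter().enumerate() {
            if i > 0 {
                // One blank line between top-level items (hard-coded in this sketch).
                w.write_str("\n")?;
            }
            w.write_str(item)?;
            w.write_str("\n")?;
        }
        Ok(())
    }
}

// Whole-file entry point: fresh context, emit the inner Program into a String.
pub fn format_program(program: &Spanned<Program>) -> Result<String, EmitError> {
    let mut out = String::new();
    let mut cx = EmitCtx::default();
    program.node.emit(&mut out, &mut cx)?;
    Ok(out)
}
```

With two items "a" and "b", this sketch returns "a\n\nb\n": each item on its own line, separated by one blank line.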
The Emit trait
Defined in format/emit.rs:

    pub trait Emit {
        fn emit<W: Write>(&self, w: &mut W, cx: &mut EmitCtx) -> Result<(), EmitError>;
    }

Contract

- w: append-only sink; implementations compose smaller emit calls on child nodes instead of building intermediate strings (avoids quadratic concat patterns).
- cx: mutable context; indent is pushed/popped around braced regions; policy helpers insert blank lines where the style guide requires.
- Return value: Ok(()) on success; EmitError wraps fmt::Error from the writer (typically only on OOM-style writer failures).
Implement Emit for AST types (and sometimes for Spanned<T> wrappers) so the tree formats itself structurally: Program loops items and delegates to Node; Node matches on item kind; expressions delegate to subexpressions.
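As an illustration of structural delegation, here is a minimal sketch in which a toy expression tree formats itself by calling emit on its children and writing straight into the shared sink. The Expr enum and its variants are hypothetical, and EmitCtx and EmitError are reduced stand-ins; only the trait signature follows the one quoted above.

```rust
use std::fmt::{self, Write};

#[derive(Debug)]
pub struct EmitError(fmt::Error);

impl From<fmt::Error> for EmitError {
    fn from(e: fmt::Error) -> Self {
        EmitError(e)
    }
}

#[derive(Default)]
pub struct EmitCtx {
    pub indent: usize,
}

pub trait Emit {
    fn emit<W: Write>(&self, w: &mut W, cx: &mut EmitCtx) -> Result<(), EmitError>;
}

// Hypothetical expression shapes, not the real Beskid AST.
pub enum Expr {
    Ident(String),
    Binary { lhs: Box<Expr>, op: &'static str, rhs: Box<Expr> },
    Call { callee: Box<Expr>, args: Vec<Expr> },
}

impl Emit for Expr {
    fn emit<W: Write>(&self, w: &mut W, cx: &mut EmitCtx) -> Result<(), EmitError> {
        match self {
            Expr::Ident(name) => w.write_str(name)?,
            Expr::Binary { lhs, op, rhs } => {
                // Children write directly into `w`; no temporary Strings are built.
                lhs.emit(w, cx)?;
                write!(w, " {} ", op)?;
                rhs.emit(w, cx)?;
            }
            Expr::Call { callee, args } => {
                callee.emit(w, cx)?;
                w.write_char('(')?;
                for (i, arg) in args.iter().enumerate() {
                    if i > 0 {
                        w.write_str(", ")?;
                    }
                    arg.emit(w, cx)?;
                }
                w.write_char(')')?;
            }
        }
        Ok(())
    }
}
```

Emitting a Call of f with the arguments a + b and c into an empty String yields "f(a + b, c)". Because every level appends to the same writer, the cost stays proportional to the output size.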
EmitCtx: indentation and layout helpers
EmitCtx (same file) tracks:
- indent: usize, the logical nesting level; write_indent writes four spaces per level.
- policy_blank_line_between_members, which toggles extra newline policy between type/enum/contract members (tests may disable for tighter snapshots).
Common helpers used everywhere:
| Method | Role |
|---|---|
| nl / ln | Newline; ln also indents the new line |
| space / token | Single space or a keyword/punctuation literal |
| open_brace / close_brace | Allman-style bracing: { on its own line after increasing indent; } outdented to the enclosing block |
| between_top_level_declarations | Blank line policy between file-level items |
| between_members | Blank line policy inside aggregate bodies |
| between_block_items | Statement-block spacing (e.g. control flow followed by let) |
Policy bodies live in format/policy.rs so spacing rules stay centralized.
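The indentation helpers can be sketched as follows. The method names come from the table above, but the signatures and bodies are assumptions: the real helpers may take different arguments, and this sketch reads "Allman-style" as placing { at the enclosing level before the indent increases. emit_block is a hypothetical driver added only to exercise the helpers.

```rust
use std::fmt::{self, Write};

pub struct EmitCtx {
    pub indent: usize,
}

impl EmitCtx {
    // Four spaces per logical nesting level.
    pub fn write_indent<W: Write>(&self, w: &mut W) -> fmt::Result {
        for _ in 0..self.indent {
            w.write_str("    ")?;
        }
        Ok(())
    }

    // Bare newline.
    pub fn nl<W: Write>(&self, w: &mut W) -> fmt::Result {
        w.write_char('\n')
    }

    // Newline followed by the current indentation.
    pub fn ln<W: Write>(&self, w: &mut W) -> fmt::Result {
        self.nl(w)?;
        self.write_indent(w)
    }

    // Assumed Allman reading: `{` on its own line at the enclosing level,
    // then the body is one level deeper.
    pub fn open_brace<W: Write>(&mut self, w: &mut W) -> fmt::Result {
        self.ln(w)?;
        w.write_char('{')?;
        self.indent += 1;
        Ok(())
    }

    // Outdent first so `}` lines up with the matching `{`.
    pub fn close_brace<W: Write>(&mut self, w: &mut W) -> fmt::Result {
        self.indent = self.indent.saturating_sub(1);
        self.ln(w)?;
        w.write_char('}')
    }
}

// Hypothetical driver: a header line followed by a braced list of statements.
pub fn emit_block(header: &str, statements: &[&str]) -> Result<String, fmt::Error> {
    let mut out = String::new();
    let mut cx = EmitCtx { indent: 0 };
    out.write_str(header)?;
    cx.open_brace(&mut out)?;
    for stmt in statements {
        cx.ln(&mut out)?;
        out.write_str(stmt)?;
    }
    cx.close_brace(&mut out)?;
    cx.nl(&mut out)?;
    Ok(out)
}
```

Calling emit_block("while (x)", &["a;", "b;"]) produces the header, an opening brace on the next line, two statements indented by four spaces, and a closing brace aligned with the opening one.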
Module layout (src/format/)
| Path | Responsibility |
|---|---|
| emit.rs | Emit, EmitCtx, EmitError, Emitter, format_program, Block / Spanned<Block> emission |
| policy.rs | Blank-line decisions |
| expressions_emit.rs | Expression and related expression shapes |
| statements_emit.rs | Statements; parenthesized if / while conditions |
| types_emit.rs | Types, paths, parameters, fields |
| items/ | Top-level and member items split by file (root_emit, declarations_emit, functions_emit, attributes_emit, helpers) |
format/mod.rs re-exports the public surface: Emit, EmitCtx, EmitError, Emitter, format_program.
Adding a new syntax node to the formatter
- Parse / AST: ensure the node exists on the concrete AST used by analysis.
- Emit impl: add fn emit in the most natural module (expressions_emit.rs vs statements_emit.rs vs items/…).
- Delegate: prefer child.emit(w, cx)? over duplicating indent logic.
- Policy: if the node introduces new vertical spacing needs, extend policy.rs and thread through EmitCtx rather than hard-coding double newlines at call sites.
- Tests: add *.input.bd / *.expected.bd under beskid_tests/fixtures/format/ (any subdirectory; the harness walks recursively), then run python3 scripts/bless_format_fixtures.py from the compiler/ tree after cargo build -p beskid_cli.
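The Delegate and Policy steps can be sketched for a made-up node. Everything here is illustrative: EnumDecl, Variant, the enum keyword, and the trailing-comma layout are hypothetical, the trait returns fmt::Result rather than the real EmitError for brevity, and between_members is an assumed shape for the policy hook. The point is that the node asks the context for vertical spacing instead of writing a double newline itself.

```rust
use std::fmt::{self, Write};

pub struct EmitCtx {
    pub indent: usize,
    pub policy_blank_line_between_members: bool,
}

impl EmitCtx {
    // Newline followed by the current indentation (four spaces per level).
    pub fn ln<W: Write>(&self, w: &mut W) -> fmt::Result {
        w.write_char('\n')?;
        for _ in 0..self.indent {
            w.write_str("    ")?;
        }
        Ok(())
    }

    // Policy hook: the single place that decides on the extra blank line.
    pub fn between_members<W: Write>(&self, w: &mut W) -> fmt::Result {
        if self.policy_blank_line_between_members {
            w.write_char('\n')?;
        }
        Ok(())
    }
}

// Simplified trait for this sketch (the real one returns Result<(), EmitError>).
pub trait Emit {
    fn emit<W: Write>(&self, w: &mut W, cx: &mut EmitCtx) -> fmt::Result;
}

// Hypothetical new node and its child.
pub struct Variant {
    pub name: String,
}

pub struct EnumDecl {
    pub name: String,
    pub variants: Vec<Variant>,
}

impl Emit for Variant {
    fn emit<W: Write>(&self, w: &mut W, _cx: &mut EmitCtx) -> fmt::Result {
        w.write_str(&self.name)?;
        w.write_char(',')
    }
}

impl Emit for EnumDecl {
    fn emit<W: Write>(&self, w: &mut W, cx: &mut EmitCtx) -> fmt::Result {
        write!(w, "enum {}", self.name)?;
        cx.ln(w)?;
        w.write_char('{')?;
        cx.indent += 1;
        for (i, variant) in self.variants.iter().enumerate() {
            if i > 0 {
                cx.between_members(w)?; // spacing comes from policy, not the call site
            }
            cx.ln(w)?;
            variant.emit(w, cx)?; // delegate to the child node
        }
        cx.indent -= 1;
        cx.ln(w)?;
        w.write_char('}')?;
        w.write_char('\n')
    }
}
```

Flipping policy_blank_line_between_members changes only whether a blank line separates the variants, which is the kind of toggle the fixture snapshots can rely on.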
Idempotence and grouped expressions
The formatter aims for idempotence: formatting already-formatted output should return it unchanged, so format(parse(format(parse(x)))) should equal format(parse(x)). Concrete example: if and while headers emit a parenthesized condition for a C#-like look. If the condition is already a grouped expression (( … ) in the AST), the emitter must not add another pair of parentheses, or the tree would grow on each pass.
That logic lives in emit_parenthesized_condition in statements_emit.rs: grouped conditions delegate to condition.emit only; other shapes wrap with literal ( ).
Use the same pattern whenever syntax allows redundant grouping that your emitter might re-introduce.
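The grouped-condition rule can be sketched with a toy expression type. The function name emit_parenthesized_condition comes from the text above, but its signature, the Expr enum, and emit_expr are assumptions made for this sketch; the real implementation goes through Emit and EmitCtx.

```rust
use std::fmt::{self, Write};

// Hypothetical expression shapes; Grouped models an explicit ( … ) node.
pub enum Expr {
    Ident(String),
    Binary { lhs: Box<Expr>, op: &'static str, rhs: Box<Expr> },
    Grouped(Box<Expr>),
}

pub fn emit_expr<W: Write>(e: &Expr, w: &mut W) -> fmt::Result {
    match e {
        Expr::Ident(name) => w.write_str(name),
        Expr::Binary { lhs, op, rhs } => {
            emit_expr(lhs, w)?;
            write!(w, " {} ", op)?;
            emit_expr(rhs, w)
        }
        Expr::Grouped(inner) => {
            w.write_char('(')?;
            emit_expr(inner, w)?;
            w.write_char(')')
        }
    }
}

// Emit the condition of an if/while header with exactly one pair of parentheses.
pub fn emit_parenthesized_condition<W: Write>(cond: &Expr, w: &mut W) -> fmt::Result {
    match cond {
        // Already grouped: the node prints its own parentheses, so add none.
        Expr::Grouped(_) => emit_expr(cond, w),
        // Any other shape: wrap with literal parentheses.
        _ => {
            w.write_char('(')?;
            emit_expr(cond, w)?;
            w.write_char(')')
        }
    }
}
```

A bare a < b and a grouped (a < b) both come out as "(a < b)". Since re-parsing that output would presumably yield the grouped form, a second formatting pass leaves the text unchanged.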
CLI and LSP
- CLI: beskid format reads a file or directory, parses, runs format_program, then writes stdout, --output, --write, or validates with --check.
- LSP: the formatting handler calls the same format_program on the parsed buffer; range formatting currently replaces the full document (see LSP architecture).
Both paths require a successful parse; there is no best-effort partial format on parse errors.
Regression testing (compiler repo)
- Coverage policy: Test harnesses and fixtures.
- CI: nox -s format_regression and beskid format --check on the fixture tree; LSP unit tests include_str! the docs_and_control fixture to assert the handler matches format_program.
- Corelib (optional): BESKID_FORMAT_CORPUS=1 with nox -s format_corpus_corelib runs scripts/format_corpus_check.py.
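For readers writing their own tooling around the fixture tree, the *.input.bd / *.expected.bd naming convention can be mapped with a small helper. This is a sketch only: expected_for is a hypothetical function, not part of the test harness or the scripts named above.

```rust
use std::path::{Path, PathBuf};

// Map `<dir>/<name>.input.bd` to `<dir>/<name>.expected.bd`.
// Returns None for paths that do not follow the input naming convention.
pub fn expected_for(input: &Path) -> Option<PathBuf> {
    let file_name = input.file_name()?.to_str()?;
    let stem = file_name.strip_suffix(".input.bd")?;
    Some(input.with_file_name(format!("{}.expected.bd", stem)))
}
```

A regression check built on this would format each input file and compare the result byte for byte with the contents of the paired expected file.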