Beskid The Beskid Book

Formatter development (Emit)

How the Beskid pretty-printer works: Emit, EmitCtx, modules, and extending the formatter.

The formatter lives in the beskid_analysis crate under src/format/. It is an opinionated pretty-printer: it walks the concrete syntax AST after a successful parse and writes canonical text. It does not round-trip arbitrary whitespace or comments (except structured /// leading documentation carried on the AST).

For normative formatter and front-end contracts, see Parser and AST contracts and Command surface.

  1. Parse → Spanned<Program> (or nested nodes).
  2. Emit → each node implements Emit::emit and writes tokens and layout into a fmt::Write target (usually a String).
  3. EmitCtx carries indent depth and spacing policy so nested structures share one layout discipline.

The public entry point for a whole file is:

beskid_analysis::format::format_program(&Spanned<Program>) -> Result<String, EmitError>

Internally that constructs an Emitter, a fresh EmitCtx, and calls emit on program.node (the inner Program).

Defined in format/emit.rs:

pub trait Emit {
fn emit<W: Write>(&self, w: &mut W, cx: &mut EmitCtx) -> Result<(), EmitError>;
}

Contract

  • w — append-only sink; implementations compose smaller emit calls on child nodes instead of building intermediate strings (avoids quadratic concat patterns).
  • cx — mutable context; indent is pushed/popped around braced regions; policy helpers insert blank lines where the style guide requires.
  • Return value — Ok(()) on success; EmitError wraps fmt::Error from the writer (typically only on OOM-style writer failures).

Implement Emit for AST types (and sometimes for Spanned<T> wrappers) so the tree formats itself structurally: Program loops over its items and delegates to Node; Node matches on the item kind; expressions delegate to their subexpressions.
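As a minimal sketch of that structural style, the block below reimplements the trait shape on a toy tree. Everything except the emit signature is an assumption made for illustration: the Expr enum, the reduced EmitCtx, the EmitError wrapper and format_expr are hypothetical stand-ins, not the types in beskid_analysis.

```rust
use std::fmt::{self, Write};

/// Stand-in for the real EmitError: it only wraps fmt::Error from the writer.
#[derive(Debug)]
pub struct EmitError(pub fmt::Error);

impl From<fmt::Error> for EmitError {
    fn from(e: fmt::Error) -> Self {
        EmitError(e)
    }
}

/// Stand-in context: only the indent level is modelled here.
#[derive(Default)]
pub struct EmitCtx {
    pub indent: usize,
}

pub trait Emit {
    fn emit<W: Write>(&self, w: &mut W, cx: &mut EmitCtx) -> Result<(), EmitError>;
}

/// Hypothetical expression node, not the Beskid AST.
pub enum Expr {
    Int(i64),
    Name(String),
    Add(Box<Expr>, Box<Expr>),
}

impl Emit for Expr {
    fn emit<W: Write>(&self, w: &mut W, cx: &mut EmitCtx) -> Result<(), EmitError> {
        match self {
            Expr::Int(n) => write!(w, "{}", n)?,
            Expr::Name(s) => w.write_str(s)?,
            Expr::Add(lhs, rhs) => {
                // Delegate to the children and write straight into the shared
                // sink; no intermediate String is built per subexpression.
                lhs.emit(w, cx)?;
                w.write_str(" + ")?;
                rhs.emit(w, cx)?;
            }
        }
        Ok(())
    }
}

/// Hypothetical whole-tree entry point with the same shape as format_program:
/// fresh output buffer, fresh context, one emit call on the root.
pub fn format_expr(e: &Expr) -> Result<String, EmitError> {
    let mut out = String::new();
    let mut cx = EmitCtx::default();
    e.emit(&mut out, &mut cx)?;
    Ok(out)
}
```

Calling format_expr on Add(Name("x"), Int(2)) yields the text x + 2. The ? operator converts fmt::Error into the wrapper through the From impl, which is one plausible way for an error type like EmitError to wrap writer failures.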

EmitCtx (same file) tracks:

  • indent: usize — logical nesting level; write_indent writes four spaces per level.
  • policy_blank_line_between_members — toggles extra newline policy between type/enum/contract members (tests may disable for tighter snapshots).

Common helpers used everywhere:

Method                          Role
nl / ln                         Newline; ln also indents the new line
space / token                   Single space or a keyword/punctuation literal
open_brace / close_brace        Allman-style bracing: { on its own line after increasing indent; } outdented to the enclosing block
between_top_level_declarations  Blank line policy between file-level items
between_members                 Blank line policy inside aggregate bodies
between_block_items             Statement-block spacing (e.g. control flow followed by let)

Policy bodies live in format/policy.rs so spacing rules stay centralized.
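To make the helper roles concrete, here is a small sketch of how such helpers could fit together. The field names and the four-space indent come from the description above; the method signatures, the placement of the helpers on EmitCtx, the exact brace behaviour and the emit_block function are assumptions for illustration only.

```rust
use std::fmt::{self, Write};

/// Toy context; field names follow the text, method signatures are guesses.
pub struct EmitCtx {
    pub indent: usize,
    pub policy_blank_line_between_members: bool,
}

impl EmitCtx {
    pub fn new() -> Self {
        EmitCtx { indent: 0, policy_blank_line_between_members: true }
    }

    /// Four spaces per logical nesting level.
    pub fn write_indent<W: Write>(&self, w: &mut W) -> fmt::Result {
        for _ in 0..self.indent {
            w.write_str("    ")?;
        }
        Ok(())
    }

    /// Newline followed by the current indentation.
    pub fn ln<W: Write>(&self, w: &mut W) -> fmt::Result {
        w.write_char('\n')?;
        self.write_indent(w)
    }

    /// `{` on its own line at the enclosing level, then one level deeper.
    pub fn open_brace<W: Write>(&mut self, w: &mut W) -> fmt::Result {
        self.ln(w)?;
        w.write_char('{')?;
        self.indent += 1;
        Ok(())
    }

    /// Back out one level and put `}` on its own line.
    pub fn close_brace<W: Write>(&mut self, w: &mut W) -> fmt::Result {
        self.indent = self.indent.saturating_sub(1);
        self.ln(w)?;
        w.write_char('}')
    }

    /// Centralized spacing decision: one extra newline when the policy is on.
    pub fn between_members<W: Write>(&self, w: &mut W) -> fmt::Result {
        if self.policy_blank_line_between_members {
            w.write_char('\n')?;
        }
        Ok(())
    }
}

/// Emits a header followed by a braced body with one member per line.
/// The header and member strings are placeholders, not Beskid syntax.
pub fn emit_block<W: Write>(
    w: &mut W,
    cx: &mut EmitCtx,
    header: &str,
    members: &[&str],
) -> fmt::Result {
    w.write_str(header)?;
    cx.open_brace(w)?;
    for (i, member) in members.iter().enumerate() {
        if i > 0 {
            // The call site asks the policy; it never hard-codes "\n\n".
            cx.between_members(w)?;
        }
        cx.ln(w)?;
        w.write_str(member)?;
    }
    cx.close_brace(w)
}
```

Because the blank-line decision sits behind between_members, flipping policy_blank_line_between_members to false tightens every aggregate body at once, which is the kind of switch a snapshot test could use.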

Path                 Responsibility
emit.rs              Emit, EmitCtx, EmitError, Emitter, format_program, Block / Spanned<Block> emission
policy.rs            Blank-line decisions
expressions_emit.rs  Expression and related expression shapes
statements_emit.rs   Statements; parenthesized if / while conditions
types_emit.rs        Types, paths, parameters, fields
items/               Top-level and member items split by file (root_emit, declarations_emit, functions_emit, attributes_emit, helpers)

format/mod.rs re-exports the public surface: Emit, EmitCtx, EmitError, Emitter, format_program.

  1. Parse / AST — ensure the node exists on the concrete AST used by analysis.
  2. Emit impl — add fn emit in the most natural module (expressions_emit.rs vs statements_emit.rs vs items/…).
  3. Delegate — prefer child.emit(w, cx)? over duplicating indent logic.
  4. Policy — if the node introduces new vertical spacing needs, extend policy.rs and thread through EmitCtx rather than hard-coding double newlines at call sites.
  5. Tests — add *.input.bd / *.expected.bd under beskid_tests/fixtures/format/ (any subdirectory; the harness walks recursively), then run python3 scripts/bless_format_fixtures.py from the compiler/ tree after cargo build -p beskid_cli.

The formatter aims for idempotence: formatting already-formatted source should change nothing, so format(parse(format(parse(x)))) should equal format(parse(x)). Concrete example: if and while headers emit a parenthesized condition for a C#-like look. If the condition is already a grouped expression (a ( … ) node in the AST), the emitter must not add another pair of parentheses, or the tree would grow on each pass.

That logic lives in emit_parenthesized_condition in statements_emit.rs: grouped conditions delegate to condition.emit only; other shapes wrap with literal ( ).

Use the same pattern whenever syntax allows redundant grouping that your emitter might re-introduce.
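The grouping rule can be sketched on a toy expression type. Only the function name emit_parenthesized_condition and the rule itself come from the text; the Expr enum, emit_expr, format_if_header and the signatures are hypothetical, and the re-parse step is modelled by constructing the Grouped node by hand rather than by running a parser.

```rust
use std::fmt::{self, Write};

/// Hypothetical expression shapes; only `Grouped` matters for the example.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Name(String),
    Less(Box<Expr>, Box<Expr>),
    /// An explicit `( inner )` that the parser kept in the tree.
    Grouped(Box<Expr>),
}

pub fn emit_expr<W: Write>(e: &Expr, w: &mut W) -> fmt::Result {
    match e {
        Expr::Name(n) => w.write_str(n),
        Expr::Less(l, r) => {
            emit_expr(l, w)?;
            w.write_str(" < ")?;
            emit_expr(r, w)
        }
        Expr::Grouped(inner) => {
            w.write_char('(')?;
            emit_expr(inner, w)?;
            w.write_char(')')
        }
    }
}

/// Grouped conditions already print their own parentheses, so they are
/// emitted as-is; every other shape is wrapped in a literal `(` `)`.
pub fn emit_parenthesized_condition<W: Write>(cond: &Expr, w: &mut W) -> fmt::Result {
    match cond {
        Expr::Grouped(_) => emit_expr(cond, w),
        _ => {
            w.write_char('(')?;
            emit_expr(cond, w)?;
            w.write_char(')')
        }
    }
}

/// Formats just the `if` header for a given condition.
pub fn format_if_header(cond: &Expr) -> String {
    let mut out = String::from("if ");
    emit_parenthesized_condition(cond, &mut out).expect("writing to a String cannot fail");
    out
}
```

A bare Less(a, b) condition and the Grouped(Less(a, b)) node that re-parsing the first output would produce both format to if (a < b), so a second pass leaves the text unchanged. Without the Grouped arm, the second pass would print if ((a < b)) and keep growing.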

  • CLI: beskid format reads a file or directory, parses it, runs format_program, then either writes the result (to stdout, or as directed by --output or --write) or validates the input with --check.
  • LSP: the formatting handler calls the same format_program on the parsed buffer; range formatting currently replaces the full document (see LSP architecture).

Both paths require a successful parse; there is no best-effort partial format on parse errors.
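The parse-then-format gate can be illustrated with a toy pipeline. Nothing here is the Beskid implementation: the brace-counting parse, format_tokens, FormatFailure and the reading of a check mode as "formatted output equals the input" are all assumptions chosen to keep the sketch self-contained.

```rust
/// Hypothetical failure type: formatting is refused when parsing fails.
#[derive(Debug, PartialEq)]
pub enum FormatFailure {
    Parse(String),
}

/// Toy "parser": splits on whitespace and rejects unbalanced braces.
fn parse(src: &str) -> Result<Vec<String>, String> {
    let opens = src.matches('{').count();
    let closes = src.matches('}').count();
    if opens != closes {
        return Err(format!("unbalanced braces: {} open, {} close", opens, closes));
    }
    Ok(src.split_whitespace().map(str::to_owned).collect())
}

/// Toy "formatter": canonical text is the tokens joined by single spaces.
fn format_tokens(tokens: &[String]) -> String {
    tokens.join(" ")
}

/// Formats only after a successful parse; a parse error is returned
/// unchanged and no partial output is produced.
pub fn format_source(src: &str) -> Result<String, FormatFailure> {
    let tokens = parse(src).map_err(FormatFailure::Parse)?;
    Ok(format_tokens(&tokens))
}

/// Assumed meaning of a check mode: true when the input is already canonical.
pub fn check(src: &str) -> Result<bool, FormatFailure> {
    Ok(format_source(src)? == src)
}
```

Routing both a command-line path and an editor path through one function like format_source is what keeps their output identical, and returning the parse error early is the simplest way to guarantee that a broken buffer is never rewritten.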

  • Coverage policy: Test harnesses and fixtures.
  • CI: nox -s format_regression and beskid format --check on the fixture tree; LSP unit tests include_str! the docs_and_control fixture to assert the handler matches format_program.
  • Corelib (optional): BESKID_FORMAT_CORPUS=1 with nox -s format_corpus_corelib runs scripts/format_corpus_check.py.