Calor Normal Form (CNF)

Version: 1.0.0

This document specifies the Calor Normal Form (CNF), an intermediate representation that makes evaluation semantics explicit.


The Problem: Backend-Dependent Semantics

When compiling directly to a backend language (like C#), semantic decisions can become implicit:

Plain Text
result = f(a(), b()) + c()

Direct compilation to C# might produce:

C#
var result = f(a(), b()) + c();

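This delegates evaluation order to C#. The Calor program then behaves as intended only because C# happens to evaluate operands and arguments left to right. If C# changes its evaluation order (or if a different backend has different rules), the code's behavior changes silently.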

CNF prevents this by making every semantic decision explicit in the IR.


Purpose

CNF is an intermediate representation (IR) between the Calor AST and backend code generation. Its purpose is to:

  1. Make evaluation order explicit - Temporaries enforce left-to-right evaluation
  2. Introduce explicit temporaries - All intermediate values have names
  3. Linearize control flow - Branch/label/goto instead of structured control
  4. Remove implicit conversions - All conversions are explicit nodes

By lowering to CNF before emitting backend code, we guarantee that Calor semantics are enforced regardless of backend.


Pipeline Position

Plain Text
Source -> Parser -> AST -> TypeChecker -> Binder -> CNF Lowering -> CNF -> Backend -> Output
                                                    ^^^^^^^^^^^^^^^^^^^^^^^^
                                                    Semantics enforced here

CNF Node Types

Expressions (Atomic)

CNF expressions are always atomic: their operands are literals, variables, or the results of previous operations, never nested expressions.

Node            Description
CnfLiteral      Constant value (42, 3.14, true, "hello")
CnfVariableRef  Reference to a named variable
CnfBinaryOp     Binary operation on two atomic operands
CnfUnaryOp      Unary operation on one atomic operand
CnfCall         Function call with atomic arguments
CnfConversion   Explicit type conversion

Statements

Node         Description
CnfAssign    Assign value to variable
CnfSequence  Ordered list of statements
CnfBranch    Conditional jump to label
CnfLabel     Target for jumps
CnfGoto      Unconditional jump
CnfReturn    Return from function
CnfThrow     Throw exception
CnfTry       Try/catch/finally block
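
One way to picture these node types is as small immutable records. The sketch below is written in Python purely for illustration and covers only a subset of the nodes. The field names (`op`, `left`, `right`, `target`, `true_label`, and so on) are assumptions made for this example, not the Calor compiler's actual definitions.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical shapes for a few CNF nodes; field names are illustrative only.

@dataclass(frozen=True)
class CnfLiteral:
    value: object            # e.g. 42, 3.14, True, "hello"

@dataclass(frozen=True)
class CnfVariableRef:
    name: str

# An atomic operand is a literal or a variable reference, never a nested operation.
Atom = Union[CnfLiteral, CnfVariableRef]

@dataclass(frozen=True)
class CnfBinaryOp:
    op: str
    left: Atom
    right: Atom

@dataclass(frozen=True)
class CnfAssign:
    target: str
    value: object            # an Atom or a single operation over Atoms

@dataclass(frozen=True)
class CnfBranch:
    condition: Atom
    true_label: str
    false_label: str

@dataclass(frozen=True)
class CnfLabel:
    name: str

@dataclass(frozen=True)
class CnfReturn:
    value: Atom

# Corresponds to the CNF line: t1 = CnfBinaryOp(Multiply, a, b)
stmt = CnfAssign("t1", CnfBinaryOp("Multiply", CnfVariableRef("a"), CnfVariableRef("b")))
```

Restricting operand fields to `Atom` is what keeps the structure flat: a nested operation has nowhere to go except into its own assignment.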

Lowering Examples

Binary Operations

Source:

Plain Text
§R (+ (* a b) c)

CNF:

Plain Text
t1 = CnfBinaryOp(Multiply, a, b)
t2 = CnfBinaryOp(Add, t1, c)
return t2

The evaluation order is now explicit: a * b is computed first, stored in t1, then t1 + c is computed.
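
The lowering step itself can be sketched as a post-order walk that hands out a fresh temporary for every operation. This is a minimal Python illustration, not the compiler's implementation: the tuple encoding of the source expression and the helper name `lower` are assumptions made for the example.

```python
from itertools import count

def lower(expr, out, fresh):
    """Lower a nested expression to flat CNF-style lines.

    expr is a variable name (str) or a tuple (op, left, right).
    Returns the name that holds the value of expr.
    """
    if isinstance(expr, str):
        return expr                      # a variable is already atomic
    op, left, right = expr
    l = lower(left, out, fresh)          # left operand is lowered first
    r = lower(right, out, fresh)         # then the right operand
    temp = f"t{next(fresh)}"
    out.append(f"{temp} = CnfBinaryOp({op}, {l}, {r})")
    return temp

lines = []
result = lower(("Add", ("Multiply", "a", "b"), "c"), lines, count(1))
lines.append(f"return {result}")
print("\n".join(lines))
# t1 = CnfBinaryOp(Multiply, a, b)
# t2 = CnfBinaryOp(Add, t1, c)
# return t2
```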

Short-Circuit AND

Source:

Plain Text
§R (&& A B)

CNF:

Plain Text
t_result = false
branch A -> then_block, end_block
then_block:
  t_result = B
  goto end_block
end_block:
return t_result

Short-circuit semantics are now explicit control flow. If A is false, B is never evaluated.
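
To check that claim mechanically, the listing can be executed by a tiny interpreter that records which inputs are read. The Python sketch below is illustrative only: the tuple encoding of statements and the `run` helper are assumptions for this example and do not reflect the compiler's data structures.

```python
def run(program, env):
    """Execute tuple-encoded CNF statements; return (result, inputs read from env)."""
    labels = {stmt[1]: i for i, stmt in enumerate(program) if stmt[0] == "label"}
    values = dict(env)
    reads = []

    def read(atom):
        if isinstance(atom, bool):
            return atom                      # boolean literal
        if atom in env:
            reads.append(atom)               # record reads of source-level inputs
        return values[atom]

    pc = 0
    while pc < len(program):
        stmt = program[pc]
        kind = stmt[0]
        if kind == "assign":                 # ("assign", target, atom)
            values[stmt[1]] = read(stmt[2])
        elif kind == "branch":               # ("branch", cond, true_label, false_label)
            pc = labels[stmt[2] if read(stmt[1]) else stmt[3]]
            continue
        elif kind == "goto":                 # ("goto", label)
            pc = labels[stmt[1]]
            continue
        elif kind == "return":               # ("return", atom)
            return read(stmt[1]), reads
        pc += 1                              # assignments and labels fall through
    raise RuntimeError("fell off the end without returning")

# The CNF listing for (&& A B), statement by statement.
AND = [
    ("assign", "t_result", False),
    ("branch", "A", "then_block", "end_block"),
    ("label", "then_block"),
    ("assign", "t_result", "B"),
    ("goto", "end_block"),
    ("label", "end_block"),
    ("return", "t_result"),
]

print(run(AND, {"A": False, "B": True}))   # (False, ['A'])
print(run(AND, {"A": True, "B": True}))    # (True, ['A', 'B'])
```

When `A` is false, the read log contains only `A`: the branch jumps straight to `end_block` and the assignment from `B` is never reached.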

Short-Circuit OR

Source:

Plain Text
§R (|| A B)

CNF:

Plain Text
t_result = true
branch A -> end_block, else_block
else_block:
  t_result = B
  goto end_block
end_block:
return t_result

If A is true, B is never evaluated.
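
The listing uses fixed names (`t_result`, `else_block`, `end_block`) and an atomic `B`. The Python sketch below shows one way a lowering pass might handle two details the listing leaves open: a right-hand side that is a call, and several `||` expressions in one function. Generating numbered labels and temporaries is an assumption of this example, as are the helper name `lower_or` and the callee names `g` and `h`.

```python
from itertools import count

def lower_or(a, b_call, ids):
    """Emit CNF-style text for `a || b_call()`.

    a is an atomic name, b_call a function name. Returns (result_name, lines).
    """
    n = next(ids)
    result, tmp = f"t_result{n}", f"t_b{n}"
    else_label, end_label = f"else_block{n}", f"end_block{n}"
    return result, [
        f"{result} = true",
        f"branch {a} -> {end_label}, {else_label}",
        f"{else_label}:",
        f"  {tmp} = call {b_call}()",    # the call exists only on the a-is-false path
        f"  {result} = {tmp}",
        f"  goto {end_label}",
        f"{end_label}:",
    ]

ids = count(1)
first, lines = lower_or("A", "g", ids)
second, more = lower_or(first, "h", ids)     # (A || g()) || h()
print("\n".join(lines + more + [f"return {second}"]))
```

Because the call to `g` is emitted inside `else_block1`, skipping the block skips the call, which is exactly the short-circuit guarantee.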

Function Call

Source:

Plain Text
§R (f (a) (b) (c))

CNF:

Plain Text
t1 = call a()
t2 = call b()
t3 = call c()
t4 = call f(t1, t2, t3)
return t4

Arguments are evaluated left-to-right, stored in temporaries, then the function is called.
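
A lowering routine for calls follows the same pattern as for operators: lower each argument in source order, then emit the call over the resulting atoms. The Python sketch below is illustrative, and the `("call", name, args)` tuple encoding and the `lower_call` helper are assumptions made for this example.

```python
from itertools import count

def lower_call(expr, out, fresh):
    """Lower ("call", name, [args...]) to flat lines; return the atom holding its value."""
    if isinstance(expr, str):
        return expr                                        # a variable is already atomic
    _, name, args = expr
    atoms = [lower_call(arg, out, fresh) for arg in args]  # strictly left to right
    temp = f"t{next(fresh)}"
    out.append(f"{temp} = call {name}({', '.join(atoms)})")
    return temp

# (f (a) (b) (c))
source = ("call", "f", [("call", "a", []), ("call", "b", []), ("call", "c", [])])
lines = []
result = lower_call(source, lines, count(1))
lines.append(f"return {result}")
print("\n".join(lines))
# t1 = call a()
# t2 = call b()
# t3 = call c()
# t4 = call f(t1, t2, t3)
# return t4
```

Once the order is fixed in the statement sequence, a backend only has to emit the statements in order; it never sees a nested argument expression whose evaluation order it could choose.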


Design Principles

Explicit Over Implicit

Every semantic operation is a visible node. There are no hidden evaluation orders or implicit conversions.

Flat Structure

No deeply nested expressions. Every intermediate value has a name.

Backend Agnostic

The same CNF serves all backends. Whether emitting C#, IL, or LLVM IR, the semantics are already decided.

Verifiable

CNF can be validated against the semantics specification. A CnfValidator checks:

  • Variables are defined before use
  • Types are consistent
  • Control flow is well-formed

CNF Validation

The CnfValidator checks CNF for correctness:

C#
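using System;

// `func` is the lowered CNF function to check (constructed elsewhere).
var validator = new CnfValidator();
validator.ValidateFunction(func);

// Report every violation the validator found.
if (!validator.IsValid)
{
    foreach (var error in validator.Errors)
    {
        Console.WriteLine(error);
    }
}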

Validation Rules

  1. Definition before use: Variables must be assigned before they are referenced
  2. Type consistency: Operations must have compatible operand types
  3. Label targets: All branch targets must exist
  4. Return paths: All code paths must return or throw
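
Rules 1 and 3 can be sketched over a simple tuple encoding of CNF statements. This Python example is a deliberately simplified illustration and not the `CnfValidator` implementation: the encoding and the `validate` helper are assumptions, and definition-before-use is checked in textual order only, whereas a real validator would need to follow control flow.

```python
def validate(program, params=()):
    """Return error strings for rule 1 (textual order only) and rule 3."""
    errors = []
    labels = {s[1] for s in program if s[0] == "label"}
    defined = set(params)                 # parameters count as already assigned

    def use(atom, index):
        # Non-string atoms are treated as literals and need no definition.
        if isinstance(atom, str) and atom not in defined:
            errors.append(f"statement {index}: '{atom}' used before assignment")

    for i, stmt in enumerate(program):
        kind = stmt[0]
        if kind == "assign":              # ("assign", target, atom)
            use(stmt[2], i)
            defined.add(stmt[1])
        elif kind == "branch":            # ("branch", cond, true_label, false_label)
            use(stmt[1], i)
            for target in stmt[2:]:
                if target not in labels:
                    errors.append(f"statement {i}: unknown label '{target}'")
        elif kind == "goto":              # ("goto", label)
            if stmt[1] not in labels:
                errors.append(f"statement {i}: unknown label '{stmt[1]}'")
        elif kind == "return":            # ("return", atom)
            use(stmt[1], i)
    return errors

bad = [
    ("branch", "A", "then_block", "missing"),
    ("label", "then_block"),
    ("return", "t_result"),
]
for message in validate(bad, params=("A",)):
    print(message)
# statement 0: unknown label 'missing'
# statement 2: 't_result' used before assignment
```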

References