Error Detection
Category: Error Detection Result: Calor wins (1.45x) What it measures: Bug identification and contract violation detection
Overview
The Error Detection metric measures how effectively an AI agent can identify bugs and contract violations using explicit semantics.
Why It Matters
When reviewing code, agents need to detect:
- Precondition violations at call sites
- Postcondition violations in implementations
- Missing null checks
- Bounds errors
- Invariant violations
Explicit contracts make these violations visible without deep analysis.
How It's Measured
Calor Detection Factors
| Factor | Points | Rationale |
|---|---|---|
Has §Q (requires) | 0.25 | Preconditions catch caller errors |
Has §S (ensures) | 0.20 | Postconditions catch implementation errors |
Has §IV (invariant) | 0.15 | Invariants catch state errors |
Has §E{...} (effects) | 0.10 | Effect tracking catches side-effect bugs |
| Has typed inputs | 0.05 | Type errors caught at compile time |
| Has typed output | 0.05 | Return type errors caught |
Base score: 0.30
C# Detection Factors
| Factor | Points | Rationale |
|---|---|---|
Has Debug.Assert | 0.15 | Runtime assertion checking |
Has Contract.Requires | 0.20 | Code Contracts precondition |
Has Contract.Ensures | 0.15 | Code Contracts postcondition |
Has null checks (??, ?.) | 0.05 | Null safety |
Has ArgumentNullException | 0.10 | Explicit null validation |
Has throw new | 0.05 | Exception-based validation |
Has readonly | 0.05 | Immutability enforcement |
Has const | 0.05 | Compile-time constants |
Base score: 0.30, cap at 0.90
Example Comparison
Calor with Contracts
§F{f001:Divide:pub}
§I{i32:a}
§I{i32:b}
§O{i32}
§Q (!= b 0) // Explicit: b must not be zero
§Q (>= a 0) // Explicit: a must be non-negative
§S (>= result 0) // Explicit: result is non-negative
§R (/ a b)
§/F{f001}Detection capability:
- Can flag any call where
bmight be 0 - Can flag any call where
amight be negative - Can verify implementation satisfies postcondition
C# Equivalent
public static int Divide(int a, int b)
{
if (b == 0)
throw new ArgumentException("b cannot be zero");
if (a < 0)
throw new ArgumentException("a must be non-negative");
var result = a / b;
Debug.Assert(result >= 0);
return result;
}Detection capability:
- Must recognize exception pattern as precondition
- Must recognize Debug.Assert as postcondition
- Less structured, more inference required
Bug Categories
Null Reference
Calor detection:
§Q (!= input null) // Explicit null check requirementC# detection:
if (input == null) throw new ArgumentNullException();
// or
input ?? throw new ArgumentNullException();Bounds Check
Calor detection:
§Q (>= index 0)
§Q (< index (len array))C# detection:
if (index < 0 || index >= array.Length)
throw new ArgumentOutOfRangeException();Contract Violation
Calor: Direct contract syntax enables automated verification.
C#: Must parse exception throws and assertions.
Real-World Example
Finding a Bug
Given this buggy call:
// Calling Divide with potentially zero divisor
§B{result} §C{Divide} §A x §A (- y y) §/C // (y - y) = 0!Calor agent detection:
- See
§C{Divide} - Look up
Dividecontracts:§Q (!= b 0) - Evaluate second argument:
(- y y)= 0 - Report: "Precondition violation:
(!= b 0)fails whenb = (- y y) = 0"
C# agent detection:
- See
Divide(x, y - y) - Find Divide implementation
- Parse exception throw pattern
- Infer precondition
- Evaluate argument
- Report potential issue
Scoring Example
Calor Code
§F{f001:SafeDivide:pub}
§I{i32:a}
§I{i32:b}
§O{i32!str}
§Q (!= b 0)
§S (|| (== result.IsOk true) (!= result.Error ""))
§IF{if1} (== b 0) → §R §ERR "Division by zero"
§EL → §R §OK (/ a b)
§/I{if1}
§/F{f001}Score:
- Base: 0.30
§Q: +0.25§S: +0.20§I{: +0.05§O{: +0.05
Total: 0.85
C# Equivalent
public static Result<int, string> SafeDivide(int a, int b)
{
if (b == 0)
return Result.Err("Division by zero");
return Result.Ok(a / b);
}Score:
- Base: 0.30
- Has type annotations: +0.10
- Has return: +0.05
Total: 0.45
Ratio: 0.85 / 0.45 = 1.89x (Calor significantly better for this example)
Interpretation
The 1.45x advantage indicates that Calor's first-class contracts provide substantially better bug detection signals.
The advantage is highest when:
- Functions have explicit preconditions and postconditions
- Multiple contracts are present
- Contracts use complex conditions
Next
- Edit Precision - How IDs enable precise changes