Typeclass Dictionary Internals

This guide documents how a definstance is lowered to C and, in particular, the rule for resolving a method field's C type -- including the case that historically miscompiled: a method whose return type is a closure.

Dictionary lowering

Each definstance C [T] emits, at file scope (see src/compiler/emit_stmt.c, EX_INSTANCE_DEF):

  1. A dictionary struct dict_<Class>_<TypeArgs> with one function-pointer field per method. The field's C signature mirrors the method impl's emitted C signature: return type, then one parameter slot per method parameter (poly-fn params become tur_poly_fn_t; by-pointer structs become const T *).
  2. A global singleton dict_<Class>_<TypeArgs>_singleton whose fields point at the per-instance method impls (__inst_<Class>_<method>_<TypeArgs>, or the binding name computed in elab_definstance).
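
As a rough illustration, the two file-scope artifacts could look like the sketch below for a hypothetical instance Show [int] with a single method show. The class, the method, its parameter list, and the stand-in body are assumptions made for the example; only the naming scheme (dict_<Class>_<TypeArgs>, the _singleton suffix, and __inst_<Class>_<method>_<TypeArgs>) is taken from the description above.

```c
#include <stdint.h>

/* Hypothetical per-instance method impl for (definstance Show [int]).
   The name follows the __inst_<Class>_<method>_<TypeArgs> scheme. */
static int64_t __inst_Show_show_int(int64_t self) {
    return self * 10; /* stand-in body, not real emitted code */
}

/* 1. Dictionary struct: one function-pointer field per method.
      The field signature mirrors the impl signature exactly. */
typedef struct dict_Show_int {
    int64_t (*show)(int64_t self);
} dict_Show_int;

/* 2. Global singleton whose fields point at the method impls. */
static const dict_Show_int dict_Show_int_singleton = {
    .show = __inst_Show_show_int,
};
```

A call through the singleton, such as dict_Show_int_singleton.show(4), reaches the impl only via the field, which is why the field type and the impl type have to match.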

Dispatch (EX_DICT with a method name, in src/compiler/emit_expr.c) loads the function-pointer field; the indirect-call path then casts it to ret_t (*)(...) before invoking it. The dict-field type, the impl-signature type, and the call-site cast type must all agree -- they are three views of the same function pointer.
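
The load-then-cast shape of a dispatch can be sketched as follows. This is not the emitted code: the Size class, the generic_fn_t type used to model the loaded field, and the dispatch_size helper are all hypothetical. The point it illustrates is that the cast target must be the same function type as the impl, since in C calling through an incompatible function-pointer type is undefined behavior.

```c
#include <stdint.h>

/* Hypothetical type-erased view of a loaded dict field. */
typedef void (*generic_fn_t)(void);

/* View 1: the impl signature. */
static int64_t __inst_Size_size_int(int64_t self) {
    return self + 1; /* stand-in body */
}

/* View 2: the dict field, declared with the same signature. */
typedef struct dict_Size_int {
    int64_t (*size)(int64_t self);
} dict_Size_int;

static const dict_Size_int dict_Size_int_singleton = {
    .size = __inst_Size_size_int,
};

/* View 3: the call site loads the field, casts it back to the
   impl's function type, and only then invokes it. */
static int64_t dispatch_size(const dict_Size_int *d, int64_t x) {
    generic_fn_t loaded = (generic_fn_t)d->size;
    return ((int64_t (*)(int64_t))loaded)(x);
}
```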

Method-field C-type resolution

The field's return type is chosen in emit_stmt.c by the same priority that emit_fns.c uses for the impl signature:

  1. emit_carrier_return_override -- the body returns a by-value struct that resolves a dispatch tyvar (e.g. HasSchema decode -> User): use the struct.
  2. binding->type.as.fn.result_full_type -- the method's fn type carries a full return Type. This is the load-bearing path for closure returns (below).
  3. result_kind via emit_type_from_kind -- primitives and bare references.
  4. body->type fallback.
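
The priority order above can be modelled as a simple first-match chain. The sketch below is a toy model, not the compiler's code: method_info_t, c_name_from_kind, and resolve_return_c_type are hypothetical names, C type names are represented as strings, and the assumption that step 3 yields nothing for non-primitive kinds (so that step 4 is reached) is made only for the illustration.

```c
#include <stddef.h>

/* Hypothetical subset of type kinds. */
typedef enum { TY_UNKNOWN, TY_INT, TY_FN, TY_STRUCT } kind_t;

/* Hypothetical summary of what codegen knows about one method. */
typedef struct {
    const char *carrier_override; /* step 1: by-value struct name, or NULL */
    const char *result_full_type; /* step 2: C name of full return Type, or NULL */
    kind_t      result_kind;      /* step 3: bare return kind */
    const char *body_type;        /* step 4: C name derived from the body */
} method_info_t;

/* Toy stand-in for lowering a bare kind; only primitives resolve here. */
static const char *c_name_from_kind(kind_t k) {
    return k == TY_INT ? "int64_t" : NULL;
}

/* First matching rule wins, in the documented order. */
static const char *resolve_return_c_type(const method_info_t *m) {
    if (m->carrier_override) return m->carrier_override;
    if (m->result_full_type) return m->result_full_type;
    const char *by_kind = c_name_from_kind(m->result_kind);
    if (by_kind) return by_kind;
    return m->body_type;
}
```

In this model a closure-returning method resolves at step 2, which is the path the rest of this guide depends on.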

The closure-handle convention

A first-class closure value is carried as an int64_t at runtime: a handle to the heap fat-closure box { thunk, env... } (or, for a captureless lambda, a bare function pointer). This is the same handle that TUR_APPLY0/1/2/... consume by reading the thunk from slot 0. type_c_name lowers closure carriers accordingly: a closure-typed value, including a curried closure whose result is itself a closure, is carried as int64_t.
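
A minimal sketch of the fat-closure side of this convention, under stated assumptions: the box layout (thunk in slot 0 followed by one captured int64_t), the thunk signature that receives its own handle, and the names fat_closure_t, make_adder, and apply1 are all hypothetical. The real TUR_APPLY macros and box layout may differ; the sketch only shows how an int64_t handle can round-trip through a heap box.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical one-argument thunk: receives its own handle so it can
   reach the captured environment. */
typedef int64_t (*thunk1_t)(int64_t self_handle, int64_t arg);

/* Hypothetical fat-closure box: thunk in slot 0, then captures. */
typedef struct {
    thunk1_t thunk;
    int64_t  env[1];
} fat_closure_t;

static int64_t add_captured(int64_t self_handle, int64_t arg) {
    const fat_closure_t *box = (const fat_closure_t *)(intptr_t)self_handle;
    return box->env[0] + arg;
}

/* Allocate a box capturing n and return it as an int64_t handle. */
static int64_t make_adder(int64_t n) {
    fat_closure_t *box = malloc(sizeof *box);
    if (!box) abort();
    box->thunk = add_captured;
    box->env[0] = n;
    return (int64_t)(intptr_t)box;
}

/* Apply a one-argument closure: read the thunk from slot 0, call it. */
static int64_t apply1(int64_t handle, int64_t arg) {
    const fat_closure_t *box = (const fat_closure_t *)(intptr_t)handle;
    return box->thunk(handle, arg);
}
```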

Why a closure-returning method needs result_full_type

When a method declares a function return type, e.g.

(defclass HasArr [a]
  (arr-of [self n : int] : (fn [int] int)))

the instance-method elaborator (elab_definstance, src/compiler/elab_typeclasses.c) builds the method's fn type with type_fn(param_kinds, n, return_type.kind). That stores only the return kind (TY_FN) and drops the full return Type. Without the full type, codegen falls back to emit_type_from_kind(TY_FN) -- a zeroed shell whose result kind is TY_UNKNOWN -- and the dict field / impl signature lower to an unknown-void carrier. The emitted impl then returns a closure handle from a void-returning function: a silent miscompile.

The fix mirrors the regular defn path ("Issue 1b" in elab_fns.c): when the declared return is a TY_FN, attach it as result_full_type so the carrier resolves to the int64_t closure handle. Three sites must agree, and all three now carry the handle:

  Site                            File                           Carrier
  dict struct field               emit_stmt.c (EX_INSTANCE_DEF)  int64_t (*m)(...)
  method impl signature           emit_fns.c                     static int64_t __inst_...
  let-binding of the call result  emit_expr.c (EX_LET)           int64_t v = (int64_t)(intptr_t)(...)
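
Put together, the three sites might look like the sketch below for a hypothetical HasArr [int] instance. The C-side mangling arr_of for arr-of, the int64_t type chosen for self, the returned lambda double_it, and the call_site helper are assumptions for illustration. The function-pointer-to-integer casts rely on the usual implementation-defined behavior of mainstream platforms rather than on strict ISO C.

```c
#include <stdint.h>

/* Hypothetical captureless lambda returned by the method; its handle
   is the bare function pointer. */
static int64_t double_it(int64_t x) {
    return 2 * x;
}

/* Site 2: the impl returns an int64_t closure handle, not void. */
static int64_t __inst_HasArr_arr_of_int(int64_t self, int64_t n) {
    (void)self;
    (void)n;
    return (int64_t)(intptr_t)double_it;
}

/* Site 1: the dict field carries the same int64_t return type. */
typedef struct dict_HasArr_int {
    int64_t (*arr_of)(int64_t self, int64_t n);
} dict_HasArr_int;

static const dict_HasArr_int dict_HasArr_int_singleton = {
    .arr_of = __inst_HasArr_arr_of_int,
};

static int64_t call_site(void) {
    /* Site 3: the let-binding stores the call result as a handle. */
    int64_t v = (int64_t)(intptr_t)(dict_HasArr_int_singleton.arr_of(0, 3));
    /* Applying the captureless closure: cast the handle back and call. */
    return ((int64_t (*)(int64_t))(intptr_t)v)(21);
}
```

If any one of the three sites used void instead of int64_t, the handle would be lost between the impl and the binding, which is the miscompile described above.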

For a curried return (a closure returning a closure) the result_full_type is a TY_FN whose result kind is again TY_FN; type_c_name's curried-closure rule (above) carries it as int64_t rather than recursing to unknown-void.
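
To make the curried case concrete, here is a sketch in which both levels travel as int64_t handles. Everything named here is hypothetical (the HasAdd class, the adder method, box_t, and the demo_curried driver), and the choice of a captureless outer closure with a fat inner closure is just one possible combination.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical fat-closure box: thunk in slot 0, one captured value. */
typedef struct {
    int64_t (*thunk)(int64_t self_handle, int64_t arg);
    int64_t env0;
} box_t;

/* Inner closure body: adds the captured value to its argument. */
static int64_t inner_add(int64_t self_handle, int64_t b) {
    return ((const box_t *)(intptr_t)self_handle)->env0 + b;
}

/* Outer closure (captureless): returns a handle to a new inner closure. */
static int64_t outer_add(int64_t a) {
    box_t *box = malloc(sizeof *box);
    if (!box) abort();
    box->thunk = inner_add;
    box->env0 = a;
    return (int64_t)(intptr_t)box;
}

/* Method with a curried return, (fn [int] (fn [int] int)):
   the C return type is still a single int64_t handle. */
static int64_t __inst_HasAdd_adder_int(int64_t self) {
    (void)self;
    return (int64_t)(intptr_t)outer_add;
}

static int64_t demo_curried(int64_t a, int64_t b) {
    int64_t outer = __inst_HasAdd_adder_int(0);
    int64_t inner = ((int64_t (*)(int64_t))(intptr_t)outer)(a);
    const box_t *box = (const box_t *)(intptr_t)inner;
    int64_t result = box->thunk(inner, b);
    free((void *)(intptr_t)inner);
    return result;
}
```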

Calling a closure returned from a method

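The returned handle obeys the usual captureless-vs-fat distinction from the closure-handle convention: a captureless lambda comes back as a bare function pointer, while a capturing lambda comes back as a handle to a heap fat-closure box whose slot 0 holds the thunk.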

The tests/fixtures/instance-closure-return-* fixtures cover both, plus int and struct captures, curried returns, and two methods composed at the call site.

definstance is idempotent (reload-safe)

A definstance C [T] form is idempotent: re-running it -- whether at the REPL, via (reload), or by re-loading a module -- replaces the singleton entry in the instance table rather than appending a duplicate. The most recent elaboration wins. This means a typeclass-heavy session can call (reload) freely without "instance already defined" errors and without ambiguous resolution caused by stale duplicates.
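
The replace-rather-than-append behavior amounts to an upsert keyed on the class and its type arguments. The sketch below is a toy model of that idea, not the compiler's instance table: the fixed-size array, the string keys, the generation field standing in for the elaborated instance, and the name register_instance are all assumptions.

```c
#include <stddef.h>
#include <string.h>

#define MAX_INSTANCES 16 /* hypothetical capacity */

/* Hypothetical table entry keyed by (class, type args). */
typedef struct {
    const char *class_name;
    const char *type_args;
    int         generation; /* stand-in for the elaborated instance */
} instance_entry_t;

static instance_entry_t table[MAX_INSTANCES];
static size_t table_len = 0;

/* Upsert: replace an existing entry for the same key, else append.
   Returns 0 on success, -1 if the table is full. */
static int register_instance(const char *cls, const char *targs, int generation) {
    for (size_t i = 0; i < table_len; i++) {
        if (strcmp(table[i].class_name, cls) == 0 &&
            strcmp(table[i].type_args, targs) == 0) {
            table[i].generation = generation; /* most recent wins */
            return 0;
        }
    }
    if (table_len == MAX_INSTANCES) return -1;
    table[table_len++] = (instance_entry_t){ cls, targs, generation };
    return 0;
}
```

Registering the same (class, type args) pair twice leaves one entry holding the later generation, so resolution never sees a stale duplicate.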

TUR-W0039: method name shadowing a free defn

A typeclass method name shares a global namespace with ordinary defn bindings. When a defn and a defclass method use the same name in the same scope, the compiler emits TUR-W0039 at the shadowing site. Both bindings coexist (the method resolves through the dictionary; the free defn is still callable by its qualified name), but the warning flags the ambiguity for the reader. Rename one of them when the shadowing is unintentional. See arrows-guide.md for the worked example that originally surfaced the diagnostic.

See also -- the arrow / Category hierarchy

The arrow surface in stdlib/arrow.tur and stdlib/kleisli.tur is a production-grade example of the dictionary-lowering rules above: Category (the superclass providing ident/comp), Arrow, ArrowChoice, ArrowZero, and a second Category instance (Kleisli) all share method names and exercise both the closure-handle convention and the method-vs-defn namespace rules. See arrows-guide.md.
