This guide documents how a definstance is lowered to C and, in particular, the
rule for resolving a method field's C type -- including the case that
historically miscompiled: a method whose return type is a closure.

Each definstance C [T] emits, at file scope (see
src/compiler/emit_stmt.c, EX_INSTANCE_DEF):

- dict_<Class>_<TypeArgs> with one function-pointer
  field per method. The field's C signature mirrors the method impl's emitted
  C signature: return type, then one parameter slot per method parameter
  (poly-fn params become tur_poly_fn_t; by-pointer structs become
  const T *).
- dict_<Class>_<TypeArgs>_singleton whose fields point
  at the per-instance method impls (__inst_<Class>_<method>_<TypeArgs>, or
  the binding name computed in elab_definstance).

Dispatch (EX_DICT with a method name, in src/compiler/emit_expr.c) loads the
function-pointer field and the indirect-call path casts it to
ret_t (*)(...) before invoking it. The dict-field type, the impl-signature
type, and the call-site cast type must all agree -- they are three views of
the same function pointer.
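
The three views described above can be sketched in plain C. This is a minimal illustration under stated assumptions, not actual compiler output: the class `Doubler`, its method `dbl`, and the helper `dispatch_dbl` are hypothetical names chosen only to follow the naming pattern given in the text.

```c
#include <stdint.h>

/* Illustrative only: a hypothetical class `Doubler` with one method `dbl`,
   instantiated at `int`. None of these names is real compiler output. */

/* Dictionary struct: one function-pointer field per method. */
typedef struct dict_Doubler_int {
    int64_t (*dbl)(int64_t self);
} dict_Doubler_int;

/* Per-instance method impl. The field type above mirrors this signature. */
static int64_t __inst_Doubler_dbl_int(int64_t self) {
    return self * 2;
}

/* Singleton whose field points at the impl. */
static const dict_Doubler_int dict_Doubler_int_singleton = {
    .dbl = __inst_Doubler_dbl_int,
};

/* Dispatch: load the field, cast it to the call-site function type, then
   make the indirect call. The cast type must match the field and impl types. */
static int64_t dispatch_dbl(const dict_Doubler_int *dict, int64_t x) {
    int64_t (*fn)(int64_t) = (int64_t (*)(int64_t))dict->dbl;
    return fn(x);
}
```

Calling `dispatch_dbl(&dict_Doubler_int_singleton, 21)` goes through the singleton's field to the impl. If any one of the three types named a different return type, the call would be made through a mismatched function-pointer type, which is undefined behavior in C.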

The field's return type is chosen by the same priority emit_fns.c uses for the
impl signature (emit_stmt.c):

1. emit_carrier_return_override -- the body returns a by-value struct that
   resolves a dispatch tyvar (e.g. HasSchema decode -> User): use the struct.
2. binding->type.as.fn.result_full_type -- the method's fn type carries a full
   return Type. This is the load-bearing path for closure returns (below).
3. result_kind via emit_type_from_kind -- primitives and bare references.
4. body->type fallback.

A first-class closure value is carried as an int64_t at runtime: a handle
to the heap fat-closure box { thunk, env... } (or, for a captureless lambda, a
bare function pointer). This is the same handle TUR_APPLY0/1/2/... consume by
reading the thunk from slot 0. type_c_name lowers closure carriers
accordingly:

- TY_FN (a stored first-class closure) -> void *.
- TY_FN reference -> its result type's C name; but a closure whose
  result is itself a function (a curried return, (fn [int] (fn [int] int)))
  or whose result is unknown -> int64_t (the handle). Recursing
  into the result kind there would bottom out at a zeroed TY_FN shell and emit
  an unknown-void carrier, silently dropping the handle.

result_full_type

When a method declares a function return type, e.g.

    (defclass HasArr [a]
      (arr-of [self n : int] : (fn [int] int)))

the instance-method elaborator (elab_definstance,
src/compiler/elab_typeclasses.c) builds the method's fn type with
type_fn(param_kinds, n, return_type.kind). That stores only the return kind
(TY_FN) and drops the full return Type. Without the full type, codegen
falls back to emit_type_from_kind(TY_FN) -- a zeroed shell whose result kind is
TY_UNKNOWN -- and the dict field / impl signature lower to an unknown-void
carrier. The emitted impl then returns a closure handle from a void-returning
function: a silent miscompile.
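
The fallback behavior can be sketched as a small priority chain in C. This is a deliberately simplified model, not the compiler's code: `Kind`, `Type`, `MethodSig`, `c_name_from_kind`, and `return_c_type` are hypothetical stand-ins, and the real type model and carrier rules are richer than shown. The sketch only illustrates why a method that keeps just the return kind lowers to a void carrier, while one that also carries a full return type lowers to the `int64_t` handle.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the compiler's type model. */
typedef enum { K_UNKNOWN, K_INT, K_FN } Kind;

typedef struct Type {
    Kind kind;
} Type;

typedef struct MethodSig {
    const char *carrier_override;  /* step 1: by-value struct name, or NULL */
    const Type *result_full_type;  /* step 2: full return type, or NULL */
    Kind        result_kind;       /* step 3: bare return kind */
    const char *body_c_type;       /* step 4: type of the body, or NULL */
} MethodSig;

/* A bare K_FN carries no result information, so it lowers to "void". */
static const char *c_name_from_kind(Kind k) {
    return k == K_INT ? "int64_t" : "void";
}

/* Walk the four steps in priority order and return the C return type. */
static const char *return_c_type(const MethodSig *m) {
    if (m->carrier_override) return m->carrier_override;
    if (m->result_full_type) {
        /* A closure return is carried as the int64_t handle. */
        if (m->result_full_type->kind == K_FN) return "int64_t";
        return c_name_from_kind(m->result_full_type->kind);
    }
    if (m->result_kind != K_UNKNOWN) return c_name_from_kind(m->result_kind);
    return m->body_c_type ? m->body_c_type : "void";
}

/* Only the return kind is stored: step 3 sees a bare K_FN. */
static const char *carrier_without_full_type(void) {
    MethodSig m = { NULL, NULL, K_FN, NULL };
    return return_c_type(&m);
}

/* The full return type is attached: step 2 resolves the handle. */
static const char *carrier_with_full_type(void) {
    static const Type closure_ret = { K_FN };
    MethodSig m = { NULL, &closure_ret, K_FN, NULL };
    return return_c_type(&m);
}
```

In this model `carrier_without_full_type()` yields `"void"` and `carrier_with_full_type()` yields `"int64_t"`; the only difference between the two signatures is whether step 2 has anything to look at.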
The fix mirrors the regular defn path ("Issue 1b" in elab_fns.c): when the
declared return is a TY_FN, attach it as result_full_type so the carrier
resolves to the int64_t closure handle. Three sites must agree, and all three
now carry the handle:

| Site | File | Carrier |
|---|---|---|
| dict struct field | emit_stmt.c (EX_INSTANCE_DEF) | int64_t (*m)(...) |
| method impl signature | emit_fns.c | static int64_t __inst_... |
| let-binding of the call result | emit_expr.c (EX_LET) | int64_t v = (int64_t)(intptr_t)(...) |

For a curried return (a closure returning a closure) the result_full_type
is a TY_FN whose result kind is again TY_FN; type_c_name's curried-closure
rule (above) carries it as int64_t rather than recursing to unknown-void.

The returned handle obeys the usual captureless-vs-fat distinction:

- fat closure: TUR_APPLY1(f, x) (the standard protocol).
- captureless lambda: ((int64_t(*)(int64_t))(intptr_t)f)(x).

The tests/fixtures/instance-closure-return-* fixtures cover both, plus int and
struct captures, curried returns, and two methods composed at the call site.
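
The two calling conventions can be sketched in C as follows. Everything here is an assumption made for illustration: `SKETCH_APPLY1` is a hypothetical stand-in for the real `TUR_APPLY1` (whose definition is not shown in this guide), the thunk is assumed to receive the handle as its first argument, and the box layout is reduced to two `int64_t` slots. The sketch also assumes a platform where function pointers round-trip through `intptr_t`, as the cast in the text already does.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed thunk shape: the handle itself, then the one user argument. */
typedef int64_t (*thunk1_t)(int64_t self, int64_t x);

/* Hypothetical stand-in for TUR_APPLY1: read the thunk from slot 0 of the
   box and call it. Note that `f` is evaluated twice. */
#define SKETCH_APPLY1(f, x) \
    (((thunk1_t)(intptr_t)((int64_t *)(intptr_t)(f))[0])((f), (x)))

/* Captureless lambda: the handle is the bare function pointer. */
static int64_t add_one(int64_t x) { return x + 1; }

static int64_t make_captureless(void) {
    return (int64_t)(intptr_t)add_one;
}

/* Fat closure: slot 0 holds the thunk, slot 1 holds the captured int. */
static int64_t add_k_thunk(int64_t self, int64_t x) {
    const int64_t *box = (const int64_t *)(intptr_t)self;
    return x + box[1];
}

static int64_t make_adder(int64_t k) {
    int64_t *box = malloc(2 * sizeof *box);
    if (box == NULL) abort();
    box[0] = (int64_t)(intptr_t)add_k_thunk;
    box[1] = k;
    return (int64_t)(intptr_t)box;
}

/* Release a fat closure created by make_adder. */
static void free_adder(int64_t handle) {
    free((void *)(intptr_t)handle);
}
```

A captureless handle from `make_captureless()` is called with the direct cast, `((int64_t (*)(int64_t))(intptr_t)f)(x)`, while a handle from `make_adder(k)` is called with `SKETCH_APPLY1(f, x)`. Both travel as the same `int64_t`, which is why the dict field, impl signature, and let-binding can share one carrier type.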

definstance is idempotent (reload-safe)

A definstance C [T] form is idempotent: re-running it -- whether at the
REPL, via (reload), or by re-loading a module -- replaces the singleton entry
in the instance table rather than appending a duplicate. The most recent
elaboration wins. This means a typeclass-heavy session can call (reload)
freely without "instance already defined" errors and without ambiguous
resolution caused by stale duplicates.
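
The replace-rather-than-append behavior can be illustrated with a small table keyed by class and type. This is a sketch under assumptions, not the compiler's data structure: `InstanceTable`, `InstanceEntry`, `register_instance`, the fixed capacity, and the `generation` counter are all hypothetical names introduced here.

```c
#include <stddef.h>
#include <string.h>

#define MAX_INSTANCES 64  /* arbitrary capacity for the sketch */

typedef struct InstanceEntry {
    const char *class_name;
    const char *type_name;
    int generation;  /* which elaboration produced this entry */
} InstanceEntry;

typedef struct InstanceTable {
    InstanceEntry entries[MAX_INSTANCES];
    size_t count;
} InstanceTable;

/* Register (class, type). If the pair already exists, overwrite it in place
   so the most recent elaboration wins. Returns 0 on success, -1 if full. */
static int register_instance(InstanceTable *t, const char *cls,
                             const char *ty, int generation) {
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->entries[i].class_name, cls) == 0 &&
            strcmp(t->entries[i].type_name, ty) == 0) {
            t->entries[i].generation = generation;
            return 0;
        }
    }
    if (t->count == MAX_INSTANCES) return -1;
    t->entries[t->count++] = (InstanceEntry){ cls, ty, generation };
    return 0;
}
```

Registering the same (class, type) pair twice leaves `count` at 1 and keeps only the later `generation`, so a lookup never has two candidates to choose between.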

defn

A typeclass method name shares a global namespace with ordinary defn
bindings. When a defn and a defclass method use the same name in the same
scope, the compiler emits TUR-W0039 at the shadowing site. Both bindings
coexist (the method resolves through the dictionary; the free defn is still
callable by its qualified name), but the warning flags the ambiguity for the
reader. Rename one of them when the shadowing is unintentional. See
arrows-guide.md for the worked example that originally
surfaced the diagnostic.

Category hierarchy

The arrow surface in stdlib/arrow.tur and stdlib/kleisli.tur is a
production-grade example of the dictionary-lowering rules above: Category
(the superclass providing ident/comp), Arrow, ArrowChoice,
ArrowZero, and a second Category instance (Kleisli) all share method
names and exercise both the closure-handle convention and the method-vs-defn
namespace rules. See arrows-guide.md.

- docs/archive/history/nested-closure-transitive-capture.md -- an orthogonal
  capture-set defect (a grandchild closure's free var not threaded through the
  middle closure), independent of the carrier type.
- docs/archive/history/intra-instance-method-dispatch-unsupported.md -- calling a
  sibling method via (.other self ...) inside an instance body.