Higher-Kinded Types in Turmeric

Turmeric supports higher-kinded types (HKT), allowing you to write generic code over type constructors like Option, Vec, or any user-defined container.

Overview

A type constructor takes one or more types and produces a new type. For example, Option is a type constructor: given int, it produces Option<int>.

In Turmeric's kind system: - * -- a plain type (e.g. int, bool) - * -> * -- a unary type constructor (e.g. Option, Vec) - * -> * -> * -- a binary type constructor (e.g. Either, Pair)

Typeclasses can be parameterised over type constructors using the ^f or ^^f syntax.

Defining HKT Typeclasses

Unary type constructor (^f -- kind * -> *)

(defclass Functor [^f]
  (fmap [container fn] :int))

(defclass Monad [^m]
  (bind [ma fn] :int))

(defclass Foldable [^t]
  (fold-left  [container acc fn] :int)
  (fold-right [container acc fn] :int))

The ^f parameter means "f has kind * -> *". Using a primitive type (like int) in the type argument position is a type error:

;; ERROR: int has kind *, not * -> *
(definstance Functor [int]
  (fmap [c f] c))

Binary type constructor (^^f -- kind * -> * -> *)

(defclass Bifunctor [^^f]
  (bimap [container fn-left fn-right] :int))

The ^^f parameter means "f has kind * -> * -> *".

Kind aliases with defkind

defkind lets you give a name to a kind for documentation purposes. It is currently informational only (parsed and ignored):

(defkind Unary  (* -> *))
(defkind Binary (* -> * -> *))

Defining Instances

Provide a concrete type constructor when implementing an HKT typeclass:

;; Functor for Option
(definstance Functor [option]
  (fmap [container fn] (__fmap_option container fn)))

;; Monad for Option
(definstance Monad [option]
  (bind [ma fn] (__bind_option ma fn)))

;; Bifunctor for Pair
(definstance Bifunctor [pair]
  (bimap [container fn-left fn-right] (__bimap_pair container fn-left fn-right)))

Using HKT Typeclasses

Method dispatch (.method)

Call typeclass methods using the dot-dispatch syntax:

;; fmap over an Option
(let [opt (__opt_some 5)]
  (.fmap opt (fn [x] (* x 2))))    ;; dispatches Functor.fmap

Note: Method dispatch on HKT containers uses the first-found instance as a fallback when the container type is stored as int64_t (opaque handle). This works reliably when only one instance of the typeclass is in scope.

Direct function calls

For reliability with multiple instances, call the implementation function directly:

(__fmap_option opt (fn [x] (* x 2)))
(__bind_option opt (fn [x] (__opt_some (* x 2))))

Container Values at Runtime

HKT container values (like Option<int>) are stored as opaque int64_t handles. Use inline C blocks to allocate and dereference them:

(defn __opt_some [x] :int
  ```
  struct { bool is_some; int64_t value; } *r = malloc(sizeof(*r));
  r->is_some = true;
  r->value = x;
  return (int64_t)(intptr_t)r;
  ```)

(defn __opt_none [] :int
  ```c return 0; ```)

Implement fmap and bind using inline C to call the closure function pointer:

(defn __fmap_option [container fn] :int
  ```
  struct { bool is_some; int64_t value; } *c =
      (struct { bool is_some; int64_t value; } *)(intptr_t)container;
  if (!c || !c->is_some) return 0;
  struct { bool is_some; int64_t value; } *r = malloc(sizeof(*r));
  r->is_some = true;
  r->value = ((int64_t(*)(int64_t))(intptr_t)fn)(c->value);
  return (int64_t)(intptr_t)r;
  ```)

(defn __bind_option [ma fn] :int
  ```c
  struct { bool is_some; int64_t value; } *opt =
      (struct { bool is_some; int64_t value; } *)(intptr_t)ma;
  if (!opt || !opt->is_some) return 0;
  return ((int64_t(*)(int64_t))(intptr_t)fn)(opt->value);
  ```)

Standard HKT Typeclasses

The standard library (stdlib/typeclass.tur) provides:

Typeclass Kind Methods
Functor * -> * fmap [container fn]
Applicative * -> * pure [x], ap [fn-container container]
Monad * -> * bind [ma fn]
Foldable * -> * fold-left, fold-right
Traversable * -> * traverse [container fn]
Bifunctor * -> * -> * bimap [container fn-left fn-right]

Functor Laws Example

(defclass Functor [^f]
  (fmap [container fn] :int))

(defn id [x] :int x)
(defn times2 [x] :int (* x 2))
(defn inc [x] :int (+ x 1))

(definstance Functor [option]
  (fmap [container fn] (__fmap_option container fn)))

(defn main [] :int
  (do
    ;; Identity law: fmap id x = x
    (let [opt    (__opt_some 42)
          result (__fmap_option opt id)]
      (println (__opt_unwrap result)))  ;; 42

    ;; Composition law: fmap (f . g) x = fmap f (fmap g x)
    (let [opt (__opt_some 5)
          lhs (__fmap_option opt (fn [x] (times2 (inc x))))
          rhs (__fmap_option (__fmap_option opt inc) times2)]
      (println (= (__opt_unwrap lhs) (__opt_unwrap rhs))))  ;; true
    0))

Monad Chaining

Use bind directly to chain monadic operations:

;; Sequence two Option computations
(let [step1  (__bind_option (__opt_some 3) (fn [x] (__opt_some (* x 2))))  ;; step1 = some 6
      result (__bind_option step1 (fn [y] (__opt_some (+ y 1))))]           ;; result = some 7
  (println (__opt_unwrap result)))  ;; 7

do-m Notation

The do-m macro provides monadic do-notation. It desugars to nested .bind calls:

;; (do-m x ma1 y ma2 body) desugars to:
;; (.bind ma1 (fn [x] (.bind ma2 (fn [y] body))))

Simple usage (single binding, no variable capture in body):

(definstance Monad [option]
  (bind [ma fn] (__bind_option ma fn)))

;; Single expression: returned as-is
(let [r (do-m (__opt_some 42))]
  (println (__opt_unwrap r)))  ;; 42

;; One binding: desugars to .bind call
(let [r (do-m x (__opt_some 5) (__opt_some (* x 3)))]
  (println (__opt_unwrap r)))  ;; 15

;; None propagates automatically
(let [r (do-m x (__opt_none) (__opt_some (* x 3)))]
  (println (__opt_some? r)))  ;; false

Closures with fmap

Both non-capturing and capturing closures work with .fmap and .bind typeclass dispatch.

;; Non-capturing closure
(let [opt    (__opt_some 10)
      result (.fmap opt (fn [x] (* x x)))]
  (println (__opt_unwrap result)))  ;; 100

;; Capturing closure -- delta is captured from the enclosing scope
(let [delta  5
      opt    (__opt_some 10)
      result (.fmap opt (fn [x] (+ x delta)))]
  (println (__opt_unwrap result)))  ;; 15

;; Multi-capture
(let [a      2
      b      3
      opt    (__opt_some 10)
      result (.fmap opt (fn [x] (+ (* x a) b)))]
  (println (__opt_unwrap result)))  ;; 23

;; Capturing closure through .bind
(let [scale 3
      r     (.bind (__opt_some 4) (fn [x] (__opt_some (* x scale))))]
  (println (__opt_unwrap r)))  ;; 12

Named functions (non-closures) also work unchanged:

(defn double [x] :int (* x 2))

(let [opt    (__opt_some 7)
      result (.fmap opt double)]
  (println (__opt_unwrap result)))  ;; 14

do-m with captured variables works too:

(let [factor 3
      r      (do-m x (__opt_some 5) (__opt_some (* x factor)))]
  (println (__opt_unwrap r)))  ;; 15

Binary Type Constructors (Bifunctor)

(defclass Bifunctor [^^f]
  (bimap [container fn-left fn-right] :int))

(definstance Bifunctor [pair]
  (bimap [container fn-left fn-right]
    (__bimap_pair container fn-left fn-right)))

Performance

Dispatch model: dictionary passing

By default Turmeric implements typeclass method calls via dictionary passing. For each typeclass instance, the compiler generates a struct ("dictionary") that holds one function pointer per method:

/* generated for (definstance Functor [option] ...) */
/* fn params annotated with :fn receive tur_poly_fn_t so fat closures work */
typedef struct { int64_t (*fmap)(int64_t container, tur_poly_fn_t fn); } dict_Functor_option;
static dict_Functor_option __dict_Functor_option = { .__fmap_option };

At a call site such as .fmap opt f, the compiler locates the relevant dictionary at compile time and emits a direct function-pointer call through it. The overhead is one pointer dereference, comparable to a C virtual dispatch.

Controlling the C back-end optimisation level

tur build and tur run shell out to a C compiler. Two environment variables let you control that step:

Variable Default Effect
CC cc C compiler executable
TUR_CC_FLAGS -O2 -std=c99 -Wall Flags passed to every cc invocation

Examples:

# Ship a fast binary (aggressive optimisation, link-time optimisation)
CC=clang TUR_CC_FLAGS="-O3 -flto -std=c99" tur build app.tur -o app

# Quick iteration build (fast compile, no optimisation)
TUR_CC_FLAGS="-O0 -std=c99" tur build app.tur -o app

# Debug symbols + sanitizer
CC=clang TUR_CC_FLAGS="-O1 -g -fsanitize=address -std=c99" tur build app.tur -o app

Both variables are inherited by tur run and by the test runner (TUR_CC_FLAGS is explicitly documented in tests/run.sh).

Benchmarking HKT dispatch overhead

The benchmark suite in tests/benchmarks/ measures typeclass dispatch overhead. The primary benchmark is hkt-dict-pass.tur, which exercises Functor.fmap over option in a tight loop:

./tests/run-bench.sh                  # run all benchmarks
./tests/run-bench.sh hkt-dict-pass    # run only the HKT dispatch benchmark
BENCHMINIT=10000 ./tests/run-bench.sh # increase minimum iterations

Results are written to tests/benchmarks/output/.

Planned: -O monomorphization flag

Status: planned, not yet implemented.

In a future release, passing -O to tur build will trigger monomorphization: the compiler will specialise each typeclass method call for the concrete type known at the call site, eliminating the dictionary indirection entirely:

# planned syntax -- not yet available
tur build -O app.tur -o app

Under -O, a call like .fmap opt f where opt is a known option container is lowered directly to __fmap_option(opt, f) with no dictionary lookup. Dictionary passing remains the default and is safe for all cases; -O is intended as a manual hot-path annotation for tight loops where the profiler shows typeclass dispatch is a bottleneck.

Recursive Types

Turmeric supports self-referential and mutually-recursive defdata/defstruct definitions. Because all user-defined type values are stored as opaque int64_t pointers, recursive field types require no special C-level treatment -- the elaborator simply recognises the type name and records it as TY_INT.

Self-referential ADTs

(defdata IntList
  (Cons :int :IntList)
  (Nil))

(defn sum [lst :int] :int
  (match lst
    (Cons h t) (+ h (sum t))
    (Nil)      0))

Mutually-recursive ADTs

(defdata Expr
  (Lit :int)
  (Add :Expr :Expr))

(defn eval-expr [e :int] :int
  (match e
    (Lit n)   n
    (Add l r) (+ (eval-expr l) (eval-expr r))))

Fix -- fixed-point of a functor

stdlib/fix.tur provides the Fix type and the catamorphism/anamorphism pair:

(load "stdlib/fix.tur")

(defdata NatF [^f]
  (ZeroF)
  (SuccF :int))

(defn to-nat [n :int] :int
  (if (= n 0)
    (roll (ZeroF))
    (roll (SuccF (to-nat (- n 1))))))

See stdlib/fix.tur for the full API.

Free monad

stdlib/free.tur implements the Free monad, enabling pure DSL interpreters without committing to a concrete effect type:

(load "stdlib/free.tur")

;; Define a small effect algebra
(defdata CalcOp
  (CalcAdd :int :int)
  (CalcMul :int :int))

;; Lift operations into Free
(defn calc-add [a :int b :int] :int (free-lift (CalcAdd a b)))

;; Interpret with free-run
(defn run-op [op :int] :int
  (match op
    (CalcAdd a b) (+ a b)
    (CalcMul a b) (* a b)))

(defn main [] :int
  (let [prog (calc-add 3 4)]
    (println (free-run (fn [op] (let [_ prog] (run-op op))) prog))
    0))

See stdlib/free.tur for free-pure, free-lift, free-bind, free-fmap, and free-run.

Known Limitations

  1. Method dispatch on containers: Since HKT container values are stored as int64_t, type-based dispatch may fall back to the first matching instance. When exactly one typeclass instance is in scope for the given method, the fallback is accepted silently and the program behaves correctly. When two or more instances are in scope and the receiver type has been erased to int64_t, the compiler emits TUR_E0020_AMBIGUOUS_DISPATCH (Phase D0). Use the @TypeName witness syntax (Phase D1) to resolve the ambiguity at zero runtime cost:

turmeric (.fmap @option opt f) ; disambiguate: use the option Functor instance

sweet-exp .fmap(@option opt f) ; disambiguate: use the option Functor instance

See docs/archive/hkt-opaque-dispatch-plan.md for background.

  1. defkind: Currently parsed and ignored. Future versions may use it for documentation generation and kind inference.

See also