Definite Clause Grammars (DCGs)¶
DCGs are a notation for defining grammars and other list-processing tasks. Clausal uses >> syntax for grammar rules, which are rewritten to ordinary <- clauses with two hidden difference-list arguments at compile time.
The implementation lives in clausal/templating/term_rewriting.py (source-level rewriting) and clausal/logic/builtins.py (phrase/2,3).
The -table directive is often used with DCGs to memoize recursive grammar rules. See Directives.
Syntax¶
Grammar rules use >> instead of <-:
This rewrites to a clause with two hidden arguments (the input list and the remainder list):
Terminals¶
Terminals are list literals — they consume tokens from the input:
The empty list [] matches without consuming any input:
Non-terminals¶
Non-terminals are predicate references — they delegate to other grammar rules:
This chains three grammar rules: noun_phrase consumes some tokens, then verb_phrase, then noun_phrase again.
Extra Arguments¶
DCG rules can have extra arguments beyond the hidden state pair:
Inline Goals¶
Curly braces {...} embed arbitrary Clausal goals inside a grammar rule. They do not consume input:
The goals D >= 0 and D <= 9 are CLP(FD) constraints checked without consuming tokens.
Conjunction and Disjunction¶
Multiple items in a rule are joined with , (conjunction):
Alternatives use or:
noun_phrase >> (["the", "dog"] or ["the", "cat"] or ["a", "bird"])
verb_phrase >> (["chases"] or ["sees"] or ["likes"])
Negation¶
not tests that a terminal does NOT match:
This matches any single token that is not "a".
Pushback (Semicontext)¶
A rule can peek at the next token without consuming it using pushback notation:
The left side (look_ahead(T), [T]) means: match look_ahead(T) and push back [T]. The right side ([T]) consumes T. Net effect: T is unified with the next token but remains in the input.
Recursive Rules¶
DCG rules can be recursive:
This matches strings of as and bs in any order.
phrase/2 and phrase/3¶
The phrase builtin invokes a grammar rule on an input list.
phrase/2¶
phrase(RuleName, InputList) — parse InputList with the named rule, succeeding if the entire list is consumed:
Query: valid_sentence(["the", "dog", "chases", "the", "cat"]) succeeds.
phrase/3¶
phrase(RuleName, S0, S) — parse with explicit remainder. S is the unconsumed suffix:
phrase/3 is also used for state threading (see below).
phrase with extra arguments¶
For rules with extra arguments, pass them as part of the rule:
State Threading¶
DCGs are a general state-passing mechanism — not just for parsing lists of tokens. The hidden difference-list pair can thread any state.
Core Pattern¶
Two helper non-terminals provide state access:
# state/1: read current state (passthrough)
(state(S), [S]) >> ([S])
# state/2: read old state, replace with new
(state2(S0, S), [S]) >> ([S0])
Counter Example¶
Thread an integer counter through phrase/3:
Usage:
The initial state [0] is passed as the input list; the final state [N] is the remainder.
Tree Leaf Counting¶
Thread a counter to count leaves in a binary tree:
count_leaves("leaf") >> (state(N0), {N := N0 + 1}, state2(_, N))
count_leaves([L, R]) >> (count_leaves(L), count_leaves(R))
num_leaves(T, N) <- phrase(count_leaves(T), [0], [N])
Accumulator¶
Thread a list accumulator to collect items:
push(X) >> (state(ACC0), {ACC is [X, *ACC0]}, state2(_, ACC))
push_all([]) >> ([])
push_all([X, *XS]) >> (push(X), push_all(XS))
collect_items(XS, R) <- phrase(push_all(XS), [[]], [R])
Coexisting with Regular Predicates¶
DCG rules and regular <- clauses can coexist in the same module:
sentence >> (noun_phrase, verb_phrase, noun_phrase)
# Regular predicate that uses the DCG rule
valid_sentence(S) <- phrase(sentence, S)
Test coverage
Tests are in tests/test_dcg.py (47 tests).
- Terminals: single, multiple, empty
- Non-terminals: chaining, extra args
- Inline goals: CLP(FD) constraints, arithmetic
- Conjunction/disjunction: multiple alternatives
- Negation:
not [terminal] - Pushback: peek without consuming
- Recursive rules:
abgrammar - phrase/2,3: full parse, partial parse, remainder
- State threading: counter, tree counting, accumulator
- Fixture integration:
dcg_grammar.clausalwith mixed rules and regular predicates