Interpreter pattern in PHP

Interpreter pattern in PHP

Some recurring problems can be expressed as sentences in simple language which is usually called Domain Specific Language (DSL). Interpreter pattern allows you:

  1. To express the grammar rules of a DSL as classes,
  2. To represent sentences in DSL as a tree of objects. Each object is an instance of some class from p.1. This tree is usually called AST (Abstract Syntax Tree).
  3. To define methods to interpret sentences in DSL that is the methods to evaluate AST.

I will develop this succint introduction in a moment.

BNF of DSL

A grammar of a DSL can be defined using [Backus–Naur form] (BNF). It is a notation technique for the exact description of languages syntax.

BNF is a finite set of rules (sometimes called “productions”). Each rule is written as:

<symbol> ::= __expression__

<symbol> is an artificial name (variable) for a language construct. Tt is defined by __expression__. In other words, <symbol> is derived from __expression__; that is why BNF rules are called derivation rules. This means that <symbol> can be replaced by __expression__.

__expression__ can consist of:

  • individual characters (a, ,, /, etc.),
  • individual words (begin, or, end, etc.),
  • any string (to be or not to be),
  • alternation (choice) of characters, words, strings : (come | leave | stay). The | character separates the alternatives,
  • other symbols or even recursively include the symbol that is being currently defined.

There are two kind of symbols: terminal and non-terminal ones.

A symbol is called terminal if its __expression__ does not include another symbol. Terminal symbols are evaluated immediately:

<terminal symbol> ::= good dog
# corresponds exactly to the text "good dog" in a sentece

or

<terminal symbol with alternation> ::= one | two | three
# corresponds to one of these values: "one" or "two" or "three"

Non-terminal symbols are defined through other symbols that in turn can be both terminal ones and non-terminal ones:

   <description> ::= a <features> <animal>
      <features> ::= <single feature> | <single feature> <features>
<single feature> ::= red | kind | young
        <animal> ::= dog | cat

<description> corresponds to many velues: “a red dog”, “a young red dog”, “a red young dog”, “a kind kind king dog”, etc.

  • <description> is a non-terminal symbol defined through another symbol <features>.
  • <features> symbol is defined through two symbols : <single feature>
    and itself. It’s a recursive definition.
  • <single feature> is a terminal symbol with three alternative vales.
  • <animal> is a simple terminal symbol with two alternative vales.

The pattern

Interpreter pattern suggests crating a class for each derivation rule of BNF. For the example above it would be Description, Features, SingleFeature and Animal.

It’s clear that the Description class aggregates one instance of Features and one instance of Animal. Features class in turn aggregates an instance of SingleFeature and optionally an instance of Features recursively. Description and Features are non-terminals. The classes SingleFeature and Animal do not aggregate other objects, that is why they are terminals.

While the classes represent grammar rules of a DSL, the objects of these classes represent sentences of the DSL. More precisely a three of objects represent a sentence. This kind of tree is usually called [Abstract Syntax Tree] (AST). E.g. the sentence “a big kind dog” is represented by this AST:

A big kind dog AST

AST clearly depicts the difference between terminal and non-terminal symbols: terminal symbols are always the leafs of AST while non-terminal symbols always have children.

The process of constructing AST from the sentence in DSL is called parsing. It worth nothing that parsing is out of the scope of the Interpreter pattern. In fact the pattern defines a method to evaluate an AST that was constructed beforehand. It is literally some public method of the classes representing grammar rules and available in every object of AST.

Implementation: BNF of DSL

Imagine you want to provide the users of your library/API/application with a possibility to write sentences in some basic language. These sentences are assertions or predicates that are evaluated and produce a result. Let us say that these sentences act as security expressions to permit or reject client’s requests to some resources. Every request contains one of these actions: “view”, “change”, “delete”. Every resource is in one of these states: “processing”, “treated”, “rejected”.

For example, this expression results to “true” or “false” depending on the current request’s action:

doAction('view')

This depends on the current state of a resource:

inState('treated') OR inState('rejected')

A change request to a resource in the processing state:

doAction('change') AND inState('processing')

A delete request to a resource in the processing or in the treated state:

doAction('delete') AND (inState('processing') OR inState('treated'))

The BNF of this DSL is:

 <expression> ::= <doAction> | <inState> | <alternation> | <sequence> | (<expression>)
<alternation> ::= <expression> OR <expression>
   <sequence> ::= <expression> AND <expression>
   <doAction> ::= doAction(<permission>)
    <inState> ::= inState(<state>)
     <action> ::= view | change | delete
      <state> ::= processing | treated | rejected

Implementation: the pattern

The corresponding classes are Expression, Alternation, Sequence, DoAction, InState, Action, State.

Expression

It is a common interface for all grammar rules. Also it is a common interface for all nodes in AST.

/**
 * Defines a common interface for all grammar rules.
 * Also all nodes in an AST (abstract syntax tree) implement this interface.
 * An AST represents a particular sentence in a language defined by grammar
 * rules.
 *
 * It is an `AbstractExpression` in terms of the Interpreter pattern.
 */
interface Expression
{
    /**
     * Evaluates an expression:
     * - subclasses of `TerminalExpression` type return a value immediately,
     * - subclasses of `NonTerminalExpression` type call `interpret` methods of
     *   theirs subexpressions.
     *
     * @return mixed
     */
    public function interpret();
}

Alternation

It is the expression that evaluates boolean OR operator.

/**
 * The expression that evaluates boolean `OR` operator in a sentence.
 *
 * It is a `NonTerminalExpression` in terms of the Interpreter pattern. It
 * always calls the `interpret()` method of its subexpressions.
 */
class Alternation implements Expression
{
    /**
     * The first alternative expression.
     * 
     * @var Expression
     */
    private $alt1;

    /**
     * The second alternative expression.
     * 
     * @var Expression
     */
    private $alt2;

    /**
     * @param Expression $alt1
     * @param Expression $alt2
     */
    public function __construct(Expression $alt1, Expression $alt2)
    {
        $this->alt1 = $alt1;
        $this->alt2 = $alt2;
    }

    /**
     * Joins the subexpressions with boolean `OR` operator.
     */
    public function interpret()
    {
        return $this->alt1->interpret() || $this->alt2->interpret();
    }
}

alternation expression

Sequence

Is is the expression that evaluates boolean AND operator.

/**
 * The expression that evaluates boolean `AND` operator in a sentence.
 *
 * It is a `NonTerminalExpression` in terms of the Interpreter pattern. It
 * always calls the `interpret()` method of its subexpressions.
 */
class Sequence implements Expression
{
    /**
     * The first required operand.
     * 
     * @var Expression
     */
    private $seq1;

    /**
     * The second required operand.
     * 
     * @var Expression
     */
    private $seq2;

    /**
     * @param Expression $seq1
     * @param Expression $seq2
     */
    public function __construct(Expression $seq1, Expression $seq2)
    {
        $this->seq1 = $seq1;
        $this->seq2 = $seq2;
    }

    /**
     * Joins the subexpressions with boolean `AND` operator.
     */
    public function interpret()
    {
        return $this->seq1->interpret() && $this->seq2->interpret();
    }
}

Sequence expression

DoAction

Evaluates the doAction function. It retrieves the current request action from the $context and compares it to its argument in a sentence.

/**
 * Expression that evaluates the `doAction` function.
 *
 * It is a `NonTerminalExpression` in terms of the Interpreter pattern. It
 * always calls the `interpret()` method of its subexpressions.
 */
class DoAction implements Expression
{
    /**
     * The subexpression corresponding to the argument of `doAction` function.
     * For example if a sentence contains `doAction('delete')` then the argument
     * is the instance of @see Action expression representing a 'delete' action.
     *
     * @var Action
     */
    private $argument;

    /**
     * @var array
     */
    private $context;

    /**
     * @param Action $argument
     * @param array $context
     */
    public function __construct(Action $argument, array $context)
    {
        $this->argument = $argument;
        $this->context = $context;
    }

    /**
     * Compares the argument given in a sentence and the current action from
     * the external context.
     */
    public function interpret()
    {
        return $this->argument->interpret() === $this->context['current_action'];
    }
}

Action

An action (view, change, delete) in a sentence.

/**
 * Represents an action which corresponds to either of these values: `view`,
 * `change`, `delete`.
 *
 * `TerminalExpression` in terms of the Interpreter pattern.
 */
class Action implements Expression
{
    /**
     * `view`, `change`, `delete`.
     *
     * @var string
     */
    private $action;

    /**
     * @param string $action
     */
    public function __construct($action)
    {
        if (!in_array($action, ['view', 'change', 'delete'])) {
            throw new \InvalidArgumentException('Unknown action.');
        }

        $this->action = $action;
    }

    /**
     * @inheritdoc
     */
    public function interpret()
    {
        return $this->action;
    }
}

DoAction and Action expressions

InState

Evaluates the inState function. It retrieves the current request action from the $context and compares it to its argument in a sentence.

/**
 * Expression that evaluates the `inState` function.
 *
 * It is a `NonTerminalExpression` in terms of the Interpreter pattern. It
 * always calls the `interpret()` method of its subexpressions.
 */
class InState implements Expression
{
    /**
     * The subexpression corresponding to the argument of `inState` function.
     * For example if a sentence contains `inState('view')` then the argument
     * is the instance of @see State expression representing a 'view' state.
     *
     * @var State
     */
    private $argument;

    /**
     * External context in which the expression is evaluated.
     *
     * @var array
     */
    private $context;

    /**
     * @param State $argument
     * @param array $context
     */
    public function __construct(State $argument, array $context)
    {
        $this->argument = $argument;
        $this->context = $context;
    }

    /**
     * Compares the argument given in a sentence and the current state from
     * the external context.
     */
    public function interpret()
    {
        return $this->argument->interpret() === $this->context['current_state'];
    }
}

State

A state (processing, treated, rejected) in a sentence.

/**
 * Represents a state which corresponds to either of these values:
 * `processing`, `treated`, `rejected`.
 *
 * `TerminalExpression` in terms of the Interpreter pattern.
 */
class State implements Expression
{
    /**
     * `processing`, `treated`, `rejected`.
     *
     * @var string
     */
    private $state;

    /**
     * @param string $state
     */
    public function __construct($state)
    {
        if (!in_array($state, ['processing', 'treated', 'rejected'])) {
            throw new \InvalidArgumentException('Unknown state.');
        }

        $this->state = $state;
    }

    /**
     * Evaluates to its state.
     */
    public function interpret()
    {
        return $this->state;
    }
}

InState and State expressions

For example, AST for the sentence doAction('change') AND (inState('processing') OR inState('treated')) is constructed as follows:

$expr = new Sequence(
    new DoAction(new Action('change'), $context),
    new Alternation(
        new InState(new State('processing'), $context),
        new InState(new State('treated'), $context)
    )
);

AST looks like:

doAction('change') AND (inState('processing') OR 
  inState('treated')) AST

DoAction and InState classes require a $context that holds current request’s action and current resource state:

$context = [
    'current_action' => 'delete',
    'current_state' => 'processing',
];

Other dependencies (parameters, services, couple of variables) could replace $context if appropriate.

AST is evaluated recursively. Terminal nodes are evaluated immediately while non-terminal nodes are evaluated using the values of theirs children. But sometimes a node needs to know the evaluated value of its sibling node or subtree. In that case Interpreter pattern suggests storing already evaluated values in the evaluation context in the form of array or hash or object. In this way a node can obtain a evaluated value of its sibling node.

Check out the full code as well as its tests.