Dice Roller

“Ruby Quiz #61 – Dice Roller was posted by Matthew D Moss. The solution shown here is more in-depth than the Bracket Packing example, so please read through that one if you haven’t already.

Specifying the Grammar

The synopsis of the quiz includes a grammar specification in pseudo-BNF form that has a pretty straightforward transcription in Dhaka. The only complications are in the operator precedences we specify in the beginning. Parentheses bind highest, then the dice operator d, then * and / and, lastly, + and -. If you’re familiar with Yacc, then Dhaka’s notation should come as no surprise. To specify precedences we list groups of tokens while declaring the associativity of each group. The groups are listed in ascending order of precedence. For example, d, which has the highest binding strength, is listed after * and /. The rest is easy:

class DiceGrammar < Dhaka::Grammar

  precedences do
    left %w| + - |
    left %w| * / |
    left %w| d |
  end
  
  for_symbol(Dhaka::START_SYMBOL_NAME) do
    start                     %w| Expr |
  end
  
  for_symbol('Expr') do
    multiplication            %w| Expr * Expr |
    division                  %w| Expr / Expr |
    subtraction               %w| Expr - Expr |
    addition                  %w| Expr + Expr |
    integer                   %w| n |
    parenthetized_expression  %w| ( Expr ) |
    dice_roll_with_prefix     %w| Expr d Expr |
    dice_roll_without_prefix  %w| d Expr |
  end
  
end

Generating the Parser

Next, generate and ‘compile’ the parser to Ruby:

require 'dhaka'
require 'dice_grammar'
parser = Dhaka::Parser.new(DiceGrammar)
File.open('dice_parser.rb', 'w') {|file| file << parser.compile_to_ruby_source_as(:DiceParser)}
File.open('dice_parser.dot', 'w') {|file| file << parser.to_dot}

We’ve also generated a dot file. If you open this with GraphViz, you’re in for quite a sight. It should become clear why we don’t write these things by hand.

Generating the Lexer

The lexical structure for this language is slightly more complicated than in the bracket packing example. We have to recognize multi-character tokens and deal with whitespace. It’s incredibly easy, however, to write a lexer specification for this using regular expression patterns:

class DiceLexerSpecification < Dhaka::LexerSpecification

  operators = { 
    '(' => '\(',
    ')' => '\)',
    '+' => '',
    '-' => '-',
    '/' => '\/',
    '*' => '\*',
    'd' => 'd'
  }

  operators.each do |operator, regex|
    for_pattern(regex) do
      create_token(operator)
    end
  end
  
  for_pattern('\d+') do
    create_token('n')
  end
  
  for_pattern('\s+') do
    # ignore whitespace
  end

end

We’ve defined a hash that maps the operators to their regular expression patterns and for each of these pairs, we tell the lexer to create a token with the grammar symbol corresponding to the operator. For strings consisting of one or more digits, we tell the lexer to create a token with symbol 'n'. Lastly, we tell the lexer to ignore whitespace. Here’s how we generate and compile a lexer from this specification:

require 'dhaka'
require 'dice_lexer_specification'
lexer = Dhaka::Lexer.new(DiceLexerSpecification)
File.open('dice_lexer.rb', 'w') {|file| file << lexer.compile_to_ruby_source_as(:DiceLexer)}

It’s important to note that the regular expressions used in the lexer specification are not Ruby regular expressions. The regular expression language that Dhaka uses is a subset of Ruby’s regexes implemented in Ruby. Here’s the current grammar for this language in pseudo-BNF.

Visualizing Parse Trees

It’s hard to test dice, since they produce random output. For the purposes of this tutorial, however, we’re more interested in whether the parse tree is generated correctly. Parse trees can be exported to dot format as well:

File.open('parse_tree.dot', 'w') do |file| 
  file << DiceParser.parse(DiceLexer.lex("(5d5 -4)d(16 /d4)+ 3")).to_dot
end

Open this file with GraphViz and you get:

Note that this is not an abstract syntax tree – by default, the parser generates parse-trees and creating an evaluator is an easy way to perform syntax-directed translation of this parse-tree. It is possible, however, to specify actions in the grammar (as in Yacc) that directly create an abstract syntax tree from the input.

Defining an Evaluator

The evaluator is pretty straightforward. For the purposes of demonstration, however, we’ll implement the ‘d’ operator to behave like a ’*’. It’s easy to change it to do the right thing.

require 'dice_grammar'

class DiceEvaluator < Dhaka::Evaluator
  
  self.grammar = DiceGrammar
  
  define_evaluation_rules do
    
    for_subtraction do 
      evaluate(child_nodes[0]) - evaluate(child_nodes[2])
    end

    for_addition do
      evaluate(child_nodes[0]) + evaluate(child_nodes[2])
    end

    for_division do
      evaluate(child_nodes[0]).to_f/evaluate(child_nodes[2])
    end

    for_multiplication do
      evaluate(child_nodes[0]) * evaluate(child_nodes[2])
    end

    for_integer do
      child_nodes[0].token.value.to_i
    end

    for_parenthetized_expression do
      evaluate(child_nodes[1])
    end
    
    for_dice_roll_with_prefix do
      evaluate(child_nodes[0]) * evaluate(child_nodes[2])
    end

    for_dice_roll_without_prefix do
      evaluate(child_nodes[1])
    end

  end
  
end

Testing it

require 'test/unit'
require 'dhaka/runtime'
require 'dice_grammar'
require 'dice_parser'
require 'dice_evaluator'
require 'dice_lexer_specification'
require 'dice_lexer'

class TestDice < Test::Unit::TestCase
  def test_parses_dice_expressions
    assert_equal -7, evaluate("5-2d6")
    assert_equal 4.5, evaluate("5-10 / 2d5 /2")
    assert_equal 11, evaluate("5- (1 -2 )d6")
    assert_equal 32, evaluate("(2d3)d6-4")
    assert_equal 0, evaluate("(2*3) - d6")
  end
  
  def evaluate(input)
    DiceEvaluator.new.evaluate(DiceParser.parse(DiceLexer.lex(input)))
  end
end