
Dice Roller
“Ruby Quiz #61 – Dice Roller was posted by Matthew D Moss. The solution shown here is more in-depth than the Bracket Packing example, so please read through that one if you haven’t already.
Specifying the Grammar
The synopsis of the quiz includes a grammar specification in pseudo-BNF form that has a pretty straightforward transcription in Dhaka. The only complications are in the operator precedences we specify in the beginning. Parentheses bind highest, then the dice operator d, then * and / and, lastly, + and -. If you’re familiar with Yacc, then Dhaka’s notation should come as no surprise. To specify precedences we list groups of tokens while declaring the associativity of each group. The groups are listed in ascending order of precedence. For example, d, which has the highest binding strength, is listed after * and /. The rest is easy:
class DiceGrammar < Dhaka::Grammar precedences do left %w| + - | left %w| * / | left %w| d | end for_symbol(Dhaka::START_SYMBOL_NAME) do start %w| Expr | end for_symbol('Expr') do multiplication %w| Expr * Expr | division %w| Expr / Expr | subtraction %w| Expr - Expr | addition %w| Expr + Expr | integer %w| n | parenthetized_expression %w| ( Expr ) | dice_roll_with_prefix %w| Expr d Expr | dice_roll_without_prefix %w| d Expr | end end
Generating the Parser
Next, generate and ‘compile’ the parser to Ruby:
require 'dhaka' require 'dice_grammar' parser = Dhaka::Parser.new(DiceGrammar) File.open('dice_parser.rb', 'w') {|file| file << parser.compile_to_ruby_source_as(:DiceParser)} File.open('dice_parser.dot', 'w') {|file| file << parser.to_dot}
We’ve also generated a dot file. If you open this with GraphViz, you’re in for quite a sight. It should become clear why we don’t write these things by hand.
Generating the Lexer
The lexical structure for this language is slightly more complicated than in the bracket packing example. We have to recognize multi-character tokens and deal with whitespace. It’s incredibly easy, however, to write a lexer specification for this using regular expression patterns:
class DiceLexerSpecification < Dhaka::LexerSpecification operators = { '(' => '\(', ')' => '\)', '+' => '', '-' => '-', '/' => '\/', '*' => '\*', 'd' => 'd' } operators.each do |operator, regex| for_pattern(regex) do create_token(operator) end end for_pattern('\d+') do create_token('n') end for_pattern('\s+') do # ignore whitespace end end
We’ve defined a hash that maps the operators to their regular expression patterns and for each of these pairs, we tell the lexer to create a token with the grammar symbol corresponding to the operator. For strings consisting of one or more digits, we tell the lexer to create a token with symbol 'n'. Lastly, we tell the lexer to ignore whitespace. Here’s how we generate and compile a lexer from this specification:
require 'dhaka' require 'dice_lexer_specification' lexer = Dhaka::Lexer.new(DiceLexerSpecification) File.open('dice_lexer.rb', 'w') {|file| file << lexer.compile_to_ruby_source_as(:DiceLexer)}
It’s important to note that the regular expressions used in the lexer specification are not Ruby regular expressions. The regular expression language that Dhaka uses is a subset of Ruby’s regexes implemented in Ruby. Here’s the current grammar for this language in pseudo-BNF.
Visualizing Parse Trees
It’s hard to test dice, since they produce random output. For the purposes of this tutorial, however, we’re more interested in whether the parse tree is generated correctly. Parse trees can be exported to dot format as well:
File.open('parse_tree.dot', 'w') do |file| file << DiceParser.parse(DiceLexer.lex("(5d5 -4)d(16 /d4)+ 3")).to_dot end
Open this file with GraphViz and you get:
Note that this is not an abstract syntax tree – by default, the parser generates parse-trees and creating an evaluator is an easy way to perform syntax-directed translation of this parse-tree. It is possible, however, to specify actions in the grammar (as in Yacc) that directly create an abstract syntax tree from the input.
Defining an Evaluator
The evaluator is pretty straightforward. For the purposes of demonstration, however, we’ll implement the ‘d’ operator to behave like a ’*’. It’s easy to change it to do the right thing.
require 'dice_grammar' class DiceEvaluator < Dhaka::Evaluator self.grammar = DiceGrammar define_evaluation_rules do for_subtraction do evaluate(child_nodes[0]) - evaluate(child_nodes[2]) end for_addition do evaluate(child_nodes[0]) + evaluate(child_nodes[2]) end for_division do evaluate(child_nodes[0]).to_f/evaluate(child_nodes[2]) end for_multiplication do evaluate(child_nodes[0]) * evaluate(child_nodes[2]) end for_integer do child_nodes[0].token.value.to_i end for_parenthetized_expression do evaluate(child_nodes[1]) end for_dice_roll_with_prefix do evaluate(child_nodes[0]) * evaluate(child_nodes[2]) end for_dice_roll_without_prefix do evaluate(child_nodes[1]) end end end
Testing it
require 'test/unit' require 'dhaka/runtime' require 'dice_grammar' require 'dice_parser' require 'dice_evaluator' require 'dice_lexer_specification' require 'dice_lexer' class TestDice < Test::Unit::TestCase def test_parses_dice_expressions assert_equal -7, evaluate("5-2d6") assert_equal 4.5, evaluate("5-10 / 2d5 /2") assert_equal 11, evaluate("5- (1 -2 )d6") assert_equal 32, evaluate("(2d3)d6-4") assert_equal 0, evaluate("(2*3) - d6") end def evaluate(input) DiceEvaluator.new.evaluate(DiceParser.parse(DiceLexer.lex(input))) end end
