Description
A very dumb PEG parser library.
Son to aabro, grandson to neg, grand-grandson to parslet. There is also a javascript version jaabro.
raabro alternatives and similar gems
Based on the "Parsers" category.
Alternatively, view raabro alternatives based on common mentions on social networks and blogs.
-
Shale
Shale is a Ruby object mapper and serializer for JSON, YAML and XML. It allows you to parse JSON, YAML and XML data and convert it into Ruby data structures, as well as serialize data structures into JSON, YAML or XML. -
ROXML
ROXML is a module for binding Ruby classes to XML. It supports custom mapping and bidirectional marshalling between Ruby and XML using annotation-style class methods, via Nokogiri or LibXML. -
Nokolexbor
High-performance HTML5 parser for Ruby based on Lexbor, with support for both CSS selectors and XPath.
SaaSHub - Software Alternatives and Reviews
* Code Quality Rankings and insights are calculated and provided by Lumnify.
They vary from L1 to L5 with "L5" being the highest.
Do you think we are missing an alternative of raabro or a related project?
Popular Comparisons
README
raabro
A very dumb PEG parser library.
Son to aabro, grandson to neg, grand-grandson to parslet. There is also a javascript version jaabro.
a sample parser/rewriter
You use raabro by providing the parsing rules, then some rewrite rules.
The parsing rules make use of the raabro basic parsers seq
, alt
, str
, rex
, eseq
, ...
The rewrite rules match names passed as first argument to the basic parsers to rewrite the resulting parse trees.
require 'raabro'
module Fun include Raabro
# parse
#
# Last function is the root, "i" stands for "input".
def pstart(i); rex(nil, i, /\(\s*/); end
def pend(i); rex(nil, i, /\)\s*/); end
# parenthese start and end, including trailing white space
def comma(i); rex(nil, i, /,\s*/); end
# a comma, including trailing white space
def num(i); rex(:num, i, /-?[0-9]+\s*/); end
# name is :num, a positive or negative integer
def args(i); eseq(nil, i, :pstart, :exp, :comma, :pend); end
# a set of :exp, beginning with a (, punctuated by commas and ending with )
def funame(i); rex(nil, i, /[a-z][a-z0-9]*/); end
def fun(i); seq(:fun, i, :funame, :args); end
# name is :fun, a function composed of a function name
# followed by arguments
def exp(i); alt(nil, i, :fun, :num); end
# an expression is either (alt) a function or a number
# rewrite
#
# Names above (:num, :fun, ...) get a rewrite_xxx function.
# "t" stands for "tree".
def rewrite_exp(t); rewrite(t.children[0]); end
def rewrite_num(t); t.string.to_i; end
def rewrite_fun(t)
funame, args = t.children
[ funame.string ] +
args.gather.collect { |e| rewrite(e) }
#
# #gather collect all the children in a tree that have
# a name, in this example, names can be :exp, :num, :fun
end
end
p Fun.parse('mul(1, 2)')
# => ["mul", 1, 2]
p Fun.parse('mul(1, add(-2, 3))')
# => ["mul", 1, ["add", -2, 3]]
p Fun.parse('mul (1, 2)')
# => nil (doesn't accept a space after the function name)
This sample is available at: [doc/readme0.rb](doc/readme0.rb).
custom rewrite()
By default, a parser gets a rewrite(t)
that looks at the parse tree node names and calls the corresponding rewrite_{node_name}()
.
It's OK to provide a custom rewrite(t)
function.
module Hello include Raabro
def hello(i); str(:hello, i, 'hello'); end
def rewrite(t)
[ :ok, t.string ]
end
end
basic parsers
One makes a parser by composing basic parsers, for example:
def args(i); eseq(:args, i, :pa, :exp, :com, :pz); end
def funame(i); rex(:funame, i, /[a-z][a-z0-9]*/); end
def fun(i); seq(:fun, i, :funame, :args); end
where the fun
parser is a sequence combining the funame
parser then the args
one. :fun
(the first argument to the basic parser seq
) will be the name of the resulting (local) parse tree.
Below is a list of the basic parsers provided by Raabro.
The first parameter to the basic parser is the name used by rewrite rules.
The second parameter is a Raabro::Input
instance, mostly a wrapped string.
def str(name, input, string)
# matching a string
def rex(name, input, regex_or_string)
# matching a regexp
# no need for ^ or \A, checks the match occurs at current offset
def seq(name, input, *parsers)
# a sequence of parsers
def alt(name, input, *parsers)
# tries the parsers returns as soon as one succeeds
def altg(name, input, *parsers)
# tries all the parsers, returns with the longest match
def rep(name, input, parser, min, max=0)
# repeats the the wrapped parser
def nott(name, input, parser)
# succeeds if the wrapped parser fails, fails if it succeeds
def ren(name, input, parser)
# renames the output of the wrapped parser
def jseq(name, input, eltpa, seppa)
#
# seq(name, input, eltpa, seppa, eltpa, seppa, eltpa, seppa, ...)
#
# a sequence of `eltpa` parsers separated (joined) by `seppa` parsers
def eseq(name, input, startpa, eltpa, seppa, endpa)
#
# seq(name, input, startpa, eltpa, seppa, eltpa, seppa, ..., endpa)
#
# a sequence of `eltpa` parsers separated (joined) by `seppa` parsers
# preceded by a `startpa` parser and followed by a `endpa` parser
the seq
parser and its quantifiers
seq
is special, it understands "quantifiers": '?'
, '+'
or '*'
. They make behave seq
a bit like a classical regex.
The '!'
(bang, not) quantifier is explained at the end of this section.
module CartParser include Raabro
def fruit(i)
rex(:fruit, i, /(tomato|apple|orange)/)
end
def vegetable(i)
rex(:vegetable, i, /(potato|cabbage|carrot)/)
end
def cart(i)
seq(:cart, i, :fruit, '*', :vegetable, '*')
end
# zero or more fruits followed by zero or more vegetables
end
(Yes, this sample parser parses string like "appletomatocabbage", it's not very useful, but I hope you get the point about .seq
)
The '!'
(bang, not) quantifier is a kind of "negative lookahead".
def menu(i)
seq(:menu, i, :mise_en_bouche, :main, :main, '!', :dessert)
end
Lousy example, but here a main cannot follow a main.
trees
An instance of Raabro::Tree
is passed to rewrite()
and rewrite_{name}()
functions.
The most useful methods of this class are:
class Raabro::Tree
# Look for the first child or sub-child with the given name.
# If the given name is nil, looks for the first child with a name (not nil).
#
def sublookup(name=nil)
# Gathers all the children or sub-children with the given name.
# If the given name is nil, gathers all the children with a name (not nil).
# When a child matches, does not pursue gathering from the children of the
# matching child.
#
def subgather(name=nil)
end
I'm using "child or sub-child" instead of "descendant" because once a child or sub-child matches, those methods do not consider the children or sub-children of that matching entity.
Here is a closeup on the rewrite functions of the sample parser at [doc/readme1.rb](doc/readme1.rb) (extracted from an early version of floraison/dense):
require 'raabro'
module PathParser include Raabro
# (...)
def rewrite_name(t); t.string; end
def rewrite_off(t); t.string.to_i; end
def rewrite_index(t); rewrite(t.sublookup); end
def rewrite_path(t); t.subgather(:index).collect { |tt| rewrite(tt) }; end
end
Where rewrite_index(t)
returns the result of the rewrite of the first of its children that has a name and rewrite_path(t)
collects the result of the rewrite of all of its children that have the "index" name.
errors
By default, a parser will return nil when it cannot successfully parse the input.
For example, given the above Fun
parser, parsing some truncated input would yield nil
:
tree = Sample::Fun.parse('f(a, b')
# yields `nil`...
One can reparse with error: true
and receive an error array with the parse error details:
err = Sample::Fun.parse('f(a, b', error: true)
# yields:
# [ line, column, offest, error_message, error_visual ]
[ 1, 4, 3, 'parsing failed .../:exp/:fun/:arg', "f(a, b\n ^---" ]
The last string in the error array looks like when printed out:
f(a, b
^---
error when not all is consumed
Consider the following toy parser:
module ToPlus include Raabro
# parse
def to_plus(input); rep(:tos, input, :to, 1); end
# rewrite
def rewrite(t); [ :ok, t.string ]; end
end
Sample::ToPlus.parse('totota')
# yields nil since all the input was not parsed, "ta" is remaining
Sample::ToPlus.parse('totota', all: false)
# yields
[ :ok, "toto" ]
# and doesn't care about the remaining input "ta"
Sample::ToPlus.parse('totota', error: true)
# yields
[ 1, 5, 4, "parsing failed, not all input was consumed", "totota\n ^---" ]
The last string in the error array looks like when printed out:
totota
^---
LICENSE
MIT, see [LICENSE.txt](LICENSE.txt)
*Note that all licence references and agreements mentioned in the raabro README section above
are relevant to that project's source code only.