Ragel State Machine Compiler

What is Ragel?

Ragel compiles executable finite state machines from regular languages. Ragel targets C, C++, Objective-C, D, Java and Ruby. Ragel state machines can not only recognize byte sequences as regular expression machines do, but can also execute code at arbitrary points in the recognition of a regular language. Code embedding is done using inline operators that do not disrupt the regular language syntax.

The core language consists of standard regular expression operators (such as union, concatenation and Kleene star) and action embedding operators. The user's regular expressions are compiled to a deterministic state machine and the embedded actions are associated with the transitions of the machine. Understanding the formal relationship between regular expressions and deterministic finite automata is key to using Ragel effectively.
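To make the idea of actions attached to transitions concrete, here is a minimal hand-written sketch in C of a deterministic machine for the pattern [0-9]+ ( '.' [0-9]+ )?. It is not Ragel output. The names match_decimal, step, struct counts and the ST_* states are hypothetical and chosen only for illustration. The two counters stand in for embedded actions: each is bumped on the transition that consumes the corresponding character.

```c
#include <string.h>

/* Hypothetical names throughout; a hand-written illustration,
 * not code generated by Ragel. */
enum state { ST_START, ST_INT, ST_DOT, ST_FRAC, ST_ERR };

struct counts {
    int digits;  /* times the digit action ran */
    int points;  /* times the decimal-point action ran */
};

static int is_digit(char c) { return c >= '0' && c <= '9'; }

/* One step of the DFA for [0-9]+ ( '.' [0-9]+ )?
 * Actions sit on transitions: they run as the machine changes state. */
static enum state step(enum state s, char c, struct counts *out)
{
    switch (s) {
    case ST_START:
    case ST_INT:
        if (is_digit(c)) { out->digits++; return ST_INT; }
        if (s == ST_INT && c == '.') { out->points++; return ST_DOT; }
        return ST_ERR;
    case ST_DOT:
    case ST_FRAC:
        if (is_digit(c)) { out->digits++; return ST_FRAC; }
        return ST_ERR;
    default:
        return ST_ERR;
    }
}

/* Returns 1 if the whole input matches the pattern, 0 otherwise. */
static int match_decimal(const char *input, struct counts *out)
{
    enum state s = ST_START;
    const char *p = input;
    const char *pe = input + strlen(input);

    out->digits = 0;
    out->points = 0;
    for (; p != pe && s != ST_ERR; ++p)
        s = step(s, *p, out);
    return s == ST_INT || s == ST_FRAC;  /* the two accepting states */
}
```

Calling match_decimal("12.5", &c) accepts the input and leaves c.digits at 3 and c.points at 1, while "12." is rejected because the machine stops in a non-accepting state. Ragel derives this kind of state table from the regular expression itself, so the states never have to be written out by hand.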

Ragel also provides operators that let you control any non-determinism that you create, construct scanners, and build state machines using a statechart model. It is also possible to influence the execution of a state machine from inside an embedded action by jumping or calling to other parts of the machine, or reprocessing input.

Ragel provides a very flexible interface to the host language that attempts to place minimal restrictions on how the generated code is integrated into the application. The generated code has no dependencies.

action dgt      { printf("DGT: %c\n", fc); }
action dec      { printf("DEC: .\n"); }
action exp      { printf("EXP: %c\n", fc); }
action exp_sign { printf("SGN: %c\n", fc); }
action number   { /*NUMBER*/ }

number = (
    [0-9]+ $dgt ( '.' @dec [0-9]+ $dgt )?
    ( [eE] ( [+\-] $exp_sign )? [0-9]+ $exp )?
) %number;

main := ( number '\n' )*;
Ragel compiles a machine like this into host-language code. An excerpt of the kind of C it generates:

st0:
    if ( ++p == pe )
        goto out0;
    if ( 48 <= (*p) && (*p) <= 57 )
        goto tr0;
    goto st_err;
tr0:
    { printf("DGT: %c\n", (*p)); }
st1:
    if ( ++p == pe )
        goto out1;
    switch ( (*p) ) {
        case 10: goto tr5;
        case 46: goto tr7;
        case 69: goto st4;
        case 101: goto st4;
    }
    if ( 48 <= (*p) && (*p) <= 57 )
        goto tr0;
    goto st_err;
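The excerpt above shows only the first two states. As a rough guide to how the rest of such a machine might look, the following is a hand-written sketch in the same goto-driven style that covers the whole ( number '\n' )* grammar. It is an assumption-laden reconstruction, not actual Ragel output: the function name scan_numbers, the label names, and the trace buffer (used in place of the printf actions) are all hypothetical.

```c
#include <stddef.h>

#define IS_DIGIT(c) ('0' <= (c) && (c) <= '9')
/* Record one action in the trace, always leaving room for the final NUL. */
#define EMIT(ch) do { if (n + 1 < cap) trace[n++] = (ch); } while (0)

/* Hypothetical helper, not Ragel output. Returns 1 if data matches
 * ( number '\n' )* and writes one letter per executed action to trace:
 * D = dgt, . = dec, S = exp_sign, E = exp, N = number. */
static int scan_numbers(const char *data, size_t len, char *trace, size_t cap)
{
    const char *p = data, *pe = data + len;
    size_t n = 0;
    int ok = 0;

    if (p == pe) { ok = 1; goto done; }     /* empty input is accepted */
    goto start;

st_start:                                    /* between numbers (accepting) */
    if (++p == pe) { ok = 1; goto done; }
start:
    if (IS_DIGIT(*p)) goto tr_int;
    goto done;
tr_int:                                      /* integer digits */
    EMIT('D');
    if (++p == pe) goto done;
    switch (*p) {
    case '\n': EMIT('N'); goto st_start;     /* leaving action on '\n' */
    case '.':  EMIT('.'); goto st_dot;
    case 'E': case 'e': goto st_e;
    }
    if (IS_DIGIT(*p)) goto tr_int;
    goto done;
st_dot:                                      /* just after the '.' */
    if (++p == pe) goto done;
    if (IS_DIGIT(*p)) goto tr_frac;
    goto done;
tr_frac:                                     /* fraction digits */
    EMIT('D');
    if (++p == pe) goto done;
    switch (*p) {
    case '\n': EMIT('N'); goto st_start;
    case 'E': case 'e': goto st_e;
    }
    if (IS_DIGIT(*p)) goto tr_frac;
    goto done;
st_e:                                        /* just after 'e' or 'E' */
    if (++p == pe) goto done;
    if (*p == '+' || *p == '-') { EMIT('S'); goto st_sign; }
    if (IS_DIGIT(*p)) goto tr_exp;
    goto done;
st_sign:                                     /* just after the sign */
    if (++p == pe) goto done;
    if (IS_DIGIT(*p)) goto tr_exp;
    goto done;
tr_exp:                                      /* exponent digits */
    EMIT('E');
    if (++p == pe) goto done;
    if (*p == '\n') { EMIT('N'); goto st_start; }
    if (IS_DIGIT(*p)) goto tr_exp;
    goto done;
done:
    if (cap > 0) trace[n] = '\0';
    return ok;
}
```

As in the excerpt, p points at the current character and pe marks the end of the input. Input that ends in the middle of a number, or contains an unexpected character, falls through to done with ok still 0.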

What kind of task is Ragel good for?

Features

Publications

[1] Adrian D. Thurston. "Parsing Computer Languages with an Automaton Compiled from a Single Regular Expression." In 11th International Conference on Implementation and Application of Automata (CIAA 2006), Lecture Notes in Computer Science, volume 4094, pp. 285-286, Taipei, Taiwan, August 2006. pdf.

Documentation, Editors and Mailing List

Ragel has a user guide available in PDF format as well as a man page. Major version number releases contain language changes. See the ChangeLog and Release Notes for details.

If you use Vim, there is a syntax file ragel.vim for your editing pleasure. If you use TextMate there is a Ragel bundle Ragel.tmbundle.

The Ragel mailing list is available here: ragel-users. Ask for help, post parsing problems, or tell us what you think of Ragel.

Links

  • Using Ragel and XCode. link, link
  • Mongrel is an HTTP library and server for Ruby.
  • Hpricot is an HTML parsing and manipulation library for Ruby.
  • RFuzz is an HTTP destroyer.
  • Zed Shaw on Ragel State Charts. link
  • SuperRedCloth is a snappy implementation of Textile for Ruby.
  • json is a JSON parser and generator for Ruby.
  • A JSON parser for Pike. link
  • Lib2geom: a computational geometry library for Inkscape.
  • Utu: internet communication with cryptographically enforced identity, reputation and retribution.
  • appid: single-pass application protocol identification.
  • RaSPF is an SPF library in C.
  • Layout Expression Language (part of Profligacy) is for building Swing GUIs with JRuby.
  • Qfsm is a graphical tool for designing state machines. It includes a Ragel export feature.
  • ABNF Generator is a tool which accepts grammars in ABNF and outputs Ragel definitions.
  • Perian is a QuickTime component that adds native support for many popular video formats.
  • devChix article: A Hello World for Ruby on Ragel. (updated)
  • A Brazilian Portuguese translation of the above: Um Hello World para o Ruby em Ragel 6.0
  • Using Ragel for scanning wikitext. link.
  • An article in Japanese on using Ragel. link.
  • A little assembler that uses Ragel for scanning and Lemon for parsing. link.
  • An ESI server derived from Mongrel. link.
  • CroMo: Morphological analysis of Croatian and other languages. link.
  • An include file dependency scanner, mostly for C files. link.
  • EaRing, an assembler using Ragel and Lemon. link.
  • Screenplay typesetting. link.
  • Source code line counting. link.
  • An implementation of W3C Selectors in Java. link.
  • There are Ragel lexers for the Pygments syntax highlighting system. link.
  • Lighttpd sandbox (which will become 2.0) uses Ragel. link.

Examples

Clang: a scanner for a simple C like language. clang.rl

Mailbox: parses Unix mailbox files. It breaks files into separate messages, and each message into its headers and body. mailbox.rl

AwkEmu: performs the basic parsing that the awk program performs on input. awkemu.rl

Atoi: converts a string to an integer. atoi.rl

Concurrent: performs multiple independent tasks concurrently. concurrent.rl

StateChart: the Atoi machine built with the named state and transition list paradigm. statechart.rl

GotoCallRet: demonstrates the use of fgoto, fcall, fret and fhold. gotocallret.rl

Params: parses command line arguments. params.rl

RagelScan: scans Ragel input files. rlscan.rl

CppScan: a C++ scanner that uses the longest-match method of scanning. cppscan.rl

Download

Development Version: Please use this for creating patches.

http://svn.complang.org/ragel/trunk/

Tar.Gz: The latest release is ragel-6.5.tar.gz (sig).

Older: Previous versions are available here.

Debian: The homepage for the Debian package of Ragel is here. The package is by Robert Lemmen.

OpenPKG: Ragel has been included in the OpenPKG project.

FreeBSD: A port for Ragel is available in the FreeBSD ports system.

NetBSD: There is a package for Ragel in the pkgsrc database.

Mac OS X: A port is available in the MacPorts repository.

Crux: A port for the Crux Linux distribution is available here.

Gentoo: A Gentoo port is available.

Suse: Packages for Suse can be found here.

Windows: Ragel can be compiled using Cygwin or MinGW. Binaries compiled with Visual Studio are here (6.4+ r766), provided by Joseph Goettgens.

Redhat/Fedora: A package for Ragel is available in Fedora Extras.

Slackware: A package is available at LinuxPackages.net

Version:         6.5
Date:            May 18, 2009
Change Log:      ChangeLog
Tar.Gz:          ragel-6.5.tar.gz (sig)
Zip:             ragel-6.5.zip (sig)
User Guide PDF:  ragel-guide-6.5.pdf

The public key for package signing is here.

License

Ragel is released under the GNU General Public License. A copy of the license is included in the distribution. It is also available from GNU.

Note: parts of Ragel output are copied from Ragel source covered by the GNU GPL. As a special exception to the GPL, you may use the parts of Ragel output copied from Ragel source without restriction. The remainder of Ragel output is derived from the input and inherits the copyright status of the input file. Use of Ragel makes no requirements about the license of generated code.

Credits

Ragel was written by Adrian Thurston. It was originally developed in early 2000 and was first released in January 2002. Many people have contributed feedback, ideas and code. Please have a look at the CREDITS file.