Most of the time I try to avoid reinventing the wheel. And most of the time I fail and am forced to do so. That’s what happened when I decided to write my own textual modeling framework. I’ve used (and am currently using) Xtext in several projects and tried EMFText once. Both are great projects with great possibilities, but I came across a problem that required a different approach. I needed a more dynamic toolkit, where the language is extensible and easier to maintain.
Xtext is a true leviathan when it comes to features: it works out of the box with a few clicks and gives you a feature-rich, customizable editor for your DSL. However, if you want something lightweight, it quickly becomes dependency hell. Even if you don’t use Xbase, you can’t leave JDT out of your project. It’s also a pain in the ass to configure a headless parser. And on top of this, newer versions of Xtext are often not compatible with code generated by older versions, and vice versa. One of the main motivations of EMFText is to address these problems. The code generated by EMFText is truly standalone; it only depends on a common Antlr plugin.
But both technologies come with a bunch of generated code to keep up to date and maintain. Of course, the generated code can be removed from version control and the generation itself can be moved into a build script that runs on CI, but some parts are generated once and then extended by hand (e.g. scoping in Xtext). So I thought: do we really need all this code generation? Couldn’t a grammar model be used as it is to parse text? I gave it a shot and found out it’s possible.
To eliminate code generation, I had to drop Antlr and any other parser generator toolkit. The parsing algorithm has to be independent of the grammar model in use, so I decided to use an Earley parser, which interprets the grammar at parse time and can handle any context-free grammar. Obviously, the downside of this approach is performance, but that’s the point: trading speed for flexibility.
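To illustrate what "the grammar is just data" means, here is a minimal sketch of an Earley recognizer in Java (16 or newer, because it uses a record). It is not the Textualmodeler implementation: the class name EarleySketch, the map-based grammar representation and the toy grammar are assumptions made up for this example. It only answers whether the input matches, builds no model or syntax tree, and assumes the grammar has no empty rules.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration, not part of Textualmodeler.
class EarleySketch {

    /** A rule with a dot position in its right-hand side and the input position where it started. */
    record Item(String lhs, List<String> rhs, int dot, int origin) {
        boolean complete() { return dot == rhs.size(); }
        String next() { return rhs.get(dot); }
        Item advance() { return new Item(lhs, rhs, dot + 1, origin); }
    }

    /**
     * The grammar maps each nonterminal to its alternatives. Any symbol that is
     * not a key of the map is treated as a terminal and compared to the tokens.
     * Assumption: no alternative is empty (no epsilon rules).
     */
    static boolean recognizes(Map<String, List<List<String>>> grammar,
                              String start, List<String> tokens) {
        int n = tokens.size();
        List<Set<Item>> chart = new ArrayList<>();
        for (int i = 0; i <= n; i++) chart.add(new LinkedHashSet<>());
        for (List<String> rhs : grammar.get(start)) {
            chart.get(0).add(new Item(start, rhs, 0, 0));
        }

        for (int i = 0; i <= n; i++) {
            // Work list for position i; it grows while items are processed.
            List<Item> work = new ArrayList<>(chart.get(i));
            for (int k = 0; k < work.size(); k++) {
                Item item = work.get(k);
                if (item.complete()) {
                    // Completer: advance every item that was waiting for this nonterminal.
                    for (Item parent : new ArrayList<>(chart.get(item.origin()))) {
                        if (!parent.complete() && parent.next().equals(item.lhs())) {
                            add(chart.get(i), work, parent.advance());
                        }
                    }
                } else if (grammar.containsKey(item.next())) {
                    // Predictor: expand the expected nonterminal at this position.
                    for (List<String> rhs : grammar.get(item.next())) {
                        add(chart.get(i), work, new Item(item.next(), rhs, 0, i));
                    }
                } else if (i < n && item.next().equals(tokens.get(i))) {
                    // Scanner: the expected terminal matches the current token.
                    chart.get(i + 1).add(item.advance());
                }
            }
        }
        // Accept if a start rule that began at position 0 is complete at the end.
        for (Item item : chart.get(n)) {
            if (item.complete() && item.origin() == 0 && item.lhs().equals(start)) return true;
        }
        return false;
    }

    private static void add(Set<Item> set, List<Item> work, Item item) {
        if (set.add(item)) work.add(item);
    }

    /** Toy left-recursive grammar: Sum -> Sum "+" Num | Num ; Num -> "1" | "2". */
    static Map<String, List<List<String>>> demoGrammar() {
        return Map.of(
            "Sum", List.of(List.of("Sum", "+", "Num"), List.of("Num")),
            "Num", List.of(List.of("1"), List.of("2")));
    }

    public static void main(String[] args) {
        System.out.println(recognizes(demoGrammar(), "Sum", List.of("1", "+", "2"))); // true
        System.out.println(recognizes(demoGrammar(), "Sum", List.of("1", "+")));      // false
    }
}
```

The grammar is passed in as an ordinary value, so swapping or extending the language at runtime needs no regeneration step. The price is that every parse walks the rules again, which is the speed-for-flexibility trade mentioned above.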
A simple example
For a quick demonstration of the features, I’ve created a simple example grammar. First, as expected from every textual modeling toolkit, there is a grammar definition for the grammar model itself, which makes it possible to edit the grammar in a convenient syntax-highlighting editor:
To make the grammar above work, a few hand-written parts are necessary:
- A resource factory implementation, to register the file type
- A resource implementation based on AbstractTextualResource, which connects the grammar to the resource type and delegates feature resolution to Java code.
- Extensions to register the grammar file, the resource factory and the editor
If everything goes as expected, you can try your new language with a convenient syntax-highlighting editor:
The parsed model and the abstract syntax tree (for debugging the grammar) are shown in the outline view of the editor. The view gets the icon and label decorations of the elements from the generated EMF adapter factories:
Textualmodeler on GitHub: https://github.com/balazsgrill/textualmodeler/