Enabling documentation comments in LPG parsers

Sometimes I feel that using LPG parsers is like walking a fine line between total madness and fine, easy results. The results are (often) fine, as the generator produces a good enough parser with error recovery mechanisms and similar features (I don’t want to compare it with other generators), but sometimes the lack of documentation can be prohibitive.

My problem: I wanted to use both documentation comments and multiline comments in a Java-like syntax (so /* comment */ is a plain comment, while /** comment */ is a documentation comment). I tried to find either examples or documentation about the solution to this problem, but couldn’t find any (at least until recently).

The solution consists of two parts: the lexer should be capable of differentiating between the two comment types (it should be able to emit the corresponding tokens), and then either the parser or the resolver should be able to read the comments.

Lexing issues

The first problem is hard, as we have to create the rules describing the comment tokens in a constructive way: there is no negation in the grammar language, and it is better if the two rules do not conflict. The basic idea is the following: a multiline comment starts with the two characters '/' and '*', followed by a non-star character, and then by a series of characters that does not contain the closing '*' '/' pair.

So far it sounds easy, but the grammar language of the LPG generator does not support negation. Luckily we have a finite alphabet, so the number of two-character sequences is also finite, which means it is possible to write the rule described above in a constructive way:

doc ::= '/' '*' '*' CommentBody Stars '/'
mlc ::= '/' '*' NotStar CommentBody Stars '/'
CommentBody$$CommentBody ::= CommentBody Stars NotSlashOrStar |
CommentBody '/' | CommentBody NotSlashOrStar |
Stars NotSlashOrStar | '/' | NotSlashOrStar

NotSlashOrStar ::= letter | digit | whiteChar | AfterASCII |
'+' | '-' | '(' | ')' | '"' | '!' | '@' | '`' | '~' | '.' |
'%' | '&' | '^' | ':' | ';' | "'" | '\' | '|' | '{' | '}' |
'[' | ']' | '?' | ',' | '<' | '>' | '=' | '#' | '$' | '_'

Yes, it is quite hard to write, and even harder to understand at first, but at least it works.
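
To make the intent of these rules easier to follow, here is a hand-written Java sketch of the same recognition logic. It is not LPG output and uses no LPG API; the names CommentClassifier, endOfComment and classify are made up for illustration. Treating /**/ as an empty plain comment is an assumption of the sketch.

```java
class CommentClassifier {
    enum Kind { DOC, MULTILINE, NONE }

    // Returns the index just past the closing star-slash pair, or -1 if the
    // text at 'start' is not a terminated comment.
    static int endOfComment(String text, int start) {
        if (!text.startsWith("/*", start)) {
            return -1;
        }
        boolean seenStar = false; // true if the previous character was '*'
        for (int i = start + 2; i < text.length(); i++) {
            char c = text.charAt(i);
            if (seenStar && c == '/') {
                return i + 1;
            }
            seenStar = (c == '*');
        }
        return -1;
    }

    // Decides which kind of comment starts at 'start'.
    static Kind classify(String text, int start) {
        int end = endOfComment(text, start);
        if (end < 0) {
            return Kind.NONE;
        }
        // Assumption: "/**/" (4 characters) is an empty plain comment,
        // so a doc comment needs something after the second star.
        boolean doc = end - start > 4 && text.charAt(start + 2) == '*';
        return doc ? Kind.DOC : Kind.MULTILINE;
    }

    public static void main(String[] args) {
        System.out.println(classify("/* plain */", 0));              // MULTILINE
        System.out.println(classify("/** documented */", 0));        // DOC
        System.out.println(classify("/*****\n* box *\n*****/", 0));  // DOC
        System.out.println(classify("/* unterminated", 0));          // NONE
    }
}
```

The seenStar flag plays the role of the Stars rule: a slash only closes the comment if it directly follows a star. Note that a comment box starting with a row of stars is classified as a documentation comment.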

Parsing the comments

Once the lexer understands the comments, the parser also needs to be aware of them. A (seemingly) easy way to handle this is the following: the multiline comment is exported by the lexer as a comment token, while the documentation comment is exported as a regular token. After this the parser can include references to documentation comments, and they are handled similarly to all other tokens.

On the other hand, this approach has a serious drawback: the documentation comment is no longer a comment, so it cannot be placed anywhere; its position is restricted by the grammar. At first this does not look like a big problem, but it causes trouble in two ways. First, if you are extending an existing language, this could break some existing texts (possibly created by other users), and the default error message is hard to understand. Second, the system disallows long lines of stars at the beginning of plain multiline comments, as in the following snippet:

/*********************
*                    *
*  Comment in a box  *
*********************/

Why: a plain comment must not start with two or more star characters, so this block is interpreted as a documentation comment.

If these problems cannot be ignored, both the plain comments and the documentation comments should be marked as comment tokens that are swallowed by the lexer, so the parser does not see any comment tokens (as it should be). The drawback of this approach is that the position of the documentation comments cannot be controlled in the parser, so the handling code has to ignore documentation comments at invalid locations.

A final step is needed to read these documentation comments: they have to be read, but they are not present in the AST, at least not directly. Following a tip from the EclipseCon 2009 tutorial “PIMP Your Eclipse: Building an IDE using IMP”, the preceding adjuncts of an AST node can be read. This terminology was not clear to me, and the missing Javadoc did not help either, so this tutorial was a great help.

To tell the truth, I am still interested in the exact definition of adjuncts in LPG, but as far as I know, the comment tokens are included in them, in their raw form. Raw form means that every character is included, including whitespace and the starting and ending characters, so the comments might need another parsing step with a different grammar.

But for quick handling, the following code can be used to read the first documentation comment before an element:

public String parseDocComments(ASTNode node) {
  IToken[] adjuncts = node.getPrecedingAdjuncts();
  for (IToken adjunct : adjuncts) {
    String adjString = adjunct.toString();
    if (adjString.startsWith("/**")) {
      return adjString;
    }
  }
  return null;
}
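
Since the adjunct holds the raw comment text, a small cleanup step is usually needed before the text can be shown anywhere. The following is only a minimal sketch in plain Java, not the second grammar mentioned above: DocCommentCleaner and clean are made-up names, and it assumes the usual Javadoc layout where continuation lines start with optional decoration stars.

```java
class DocCommentCleaner {
    // Strips the opening and closing markers and the leading stars of each
    // line; blank lines are dropped, the remaining lines are joined with '\n'.
    static String clean(String raw) {
        String body = raw.trim();
        if (body.length() >= 5 && body.startsWith("/**") && body.endsWith("*/")) {
            body = body.substring(3, body.length() - 2);
        }
        StringBuilder out = new StringBuilder();
        for (String line : body.split("\r?\n")) {
            String t = line.trim();
            int i = 0;
            while (i < t.length() && t.charAt(i) == '*') {
                i++; // skip decoration stars at the start of the line
            }
            t = t.substring(i).trim();
            if (!t.isEmpty()) {
                if (out.length() > 0) {
                    out.append('\n');
                }
                out.append(t);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "/**\n * Adds two numbers.\n * Second line.\n */";
        System.out.println(clean(raw));
    }
}
```

The result of parseDocComments can be passed straight to clean, after a null check for elements without a documentation comment.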

Conclusion

Allowing documentation comments can be quite a challenge, especially getting the syntax right without any conflict between the rules, but it is certainly possible.

On the other hand, the language of the documentation comments has to be defined separately, and not even the lexer can be reused from the original grammar, as it uses different terminal rules (e.g. the documentation comment must not be a comment token in this language). Even worse, having two different grammars makes it harder to provide correct coding help in the IDE (e.g. content assist, source coloring, etc.). These issues need further experimenting with the tools, but at least the solution is working right now.

How to turn a printer power unit to a fancy lab power supply

A dead printer is a great source of useful components. You can salvage many kinds of parts, like DC motors (stepper motors if you’re lucky), wires, gears, etc. And of course a power supply unit. The power supply can be useful in many ways, for example as a lab power supply. But without a case it is a bit dangerous, as you can accidentally touch something, which would be painful. Warning! Don’t try this at home unless you know perfectly well what you are doing and have practice working with 220 Volts.

Now, if you dare, run up to the attic, bring down the dead printer, clean off the dust and grab a screwdriver. After tearing the device down to the smallest parts possible without breaking anything, you should see a component like this:

DSCF7533

It’s rather simple: it has an input, 220 Volts AC in Europe, usually connected with thick wires. The output varies by brand and type, but it’s surely low-voltage DC, connected with thin wires. Look for the numbers printed on the board; they usually tell the output voltage and the maximum power. In this case, it’s 18 Volts and 25 Watts. This is more than enough for experimental purposes.

Before you continue, check it carefully. The printer stopped working for a reason, and maybe that reason is the power supply itself. The easiest way to check is to connect a multimeter to the output and plug the supply in. Be very careful with this step, and try not to touch the device while it is under power.

If the power supply is not working, check the fuse. If it’s blown, replace it and try again. If that solves the problem, then maybe this was what rendered the printer dead. At this point you can just put the printer back together and move on. And of course take the credit for fixing a dead printer. Either way, continue reading; the lab supply is far from finished yet.

To make it safe and easy to use, it should be put in a case. You can buy one in any electronics shop. I’ve used a plastic one as it’s easier to work with, but many think that a metal box looks better. If you use a metal box, more work is needed, as you have to take care of grounding the box itself and insulating the power supply board. A printer is usually not grounded, so the power cable of the printer won’t be adequate. I’ve reused the AC connector of the printer; it clips easily into a matching hole, which was easy to cut into the back panel of the box using a drill and a rasp:

DSCF7537

I’ve used a similar method on the front panel, which holds two banana connectors for the output and a switch that makes it possible to turn off the device without unplugging it:

DSCF7536

Soldering is hard in small places, and the hot iron can damage the plastic box badly. So instead of soldering the wires together in the box, I recommend using a terminal block as I did. The block can be fixed to the box using a glue gun. I’ve also used the glue gun to mount a screw and two copper wires that hold the power supply board in place:

DSCF7540

DSCF7543

The hard work is done; all that is left is to put the whole thing together. Doing some wiring work:

DSCF7546

And then voilà, it’s finished:

DSCF7548

DSCF7549

Have fun building your own lab power supply! More pictures here: http://www.flickr.com/photos/gbalage/sets/72157624262231051/

Short story of the day

Never, I mean never write code like this:

boolean isRel1 = false, isRel2 = false;
if((isRel1 = element1 instanceof IRelation)
|| (isRel2 = element2 instanceof IRelation)){
...
}

Especially do not use such code in comparators. If not at the first try, then later it will mess things up badly, as you rely on a variable that will not be set because of short-circuit evaluation: if the first operand of || is true, the second one is never evaluated.

This fact cost me three or four hours today…

A better solution (for those who look for usable code snippets):

boolean isRel1 = element1 instanceof IRelation;
boolean isRel2 = element2 instanceof IRelation;
if(isRel1 || isRel2) {
...
}
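
To see the effect in isolation, here is a small self-contained sketch. CharSequence stands in for IRelation (an assumption, only to keep the example compilable without the original types), and the class and method names are made up.

```java
import java.util.Arrays;

class ShortCircuitDemo {
    // The problematic pattern: the second assignment is skipped whenever
    // the first operand of || is already true.
    static boolean[] buggy(Object element1, Object element2) {
        boolean isRel1 = false, isRel2 = false;
        if ((isRel1 = element1 instanceof CharSequence)
                || (isRel2 = element2 instanceof CharSequence)) {
            // ...
        }
        return new boolean[] { isRel1, isRel2 };
    }

    // The safer pattern: both flags are computed before the condition.
    static boolean[] fixed(Object element1, Object element2) {
        boolean isRel1 = element1 instanceof CharSequence;
        boolean isRel2 = element2 instanceof CharSequence;
        if (isRel1 || isRel2) {
            // ...
        }
        return new boolean[] { isRel1, isRel2 };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(buggy("a", "b"))); // [true, false]
        System.out.println(Arrays.toString(fixed("a", "b"))); // [true, true]
    }
}
```

Both arguments are strings, yet the buggy variant reports false for the second one, which is exactly the kind of result that breaks a comparator.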

Generating LPG 1.0 parsers on OSX using Eclipse

In fall I began maintaining the parser of the VIATRA2 framework. Funny.

Mostly because it uses the LPG parser generator framework, and to make things worse, a very old version (v1.1) of it. A new 2.0 version has been available since 2008, but the two are not compatible at all, e.g. they define different packages in the LPG runtime. As the release was near, there was no chance of upgrading the parser, so we were stuck with version 1.0.

The problem with the old version is that although it is written in C++, even its makefile explicitly uses the Visual C++ compiler, so simply compiling it for OSX is not possible. That means that every time I have to change the grammar file, I have to start a Windows binary. And I like to do it from Eclipse.

My two options were Wine and VMware (not Parallels, because I don’t have a licence for it 🙂 ). The latter is too hard on resources and is much harder to integrate with my Eclipse on OSX, so the first choice was Wine. Luckily the Wine developers did quality work, so the LPG generator binary can be run with Wine.

The Eclipse integration is not too hard (at least in a basic way that works for a while), as there is support for running External tools using the appropriate icon on the toolbar (or from the Run menu).

Such an External tool can be parameterized using various variables of Eclipse, of which two are needed:

  • [cci]$resource_loc[/cci]: the file system path (not workspace-relative path) of the selected resource
  • [cci]$container_loc[/cci]: the file system location of the container folder (directory) that holds the selected resource

The tool to run will be the Wine executable, which executes the lpg.exe binary that it receives as a runtime parameter. This way both the location of the lpg.exe binary and the LPG parameters have to be written into the arguments section of the tool. It is important to note that the location of the lpg binary can be given using OSX paths; there is no need to translate them into Wine paths, as Wine can handle native OSX paths.

LPG uses a working folder, where it puts the generated parser and AST classes. This will be defined using the [cci]$container_loc[/cci] variable.

LPG needs three types of information: the grammar file (that can be given as a parameter to LPG, we will use the [cci]$resource_loc[/cci] variable), an includes directory (for grammar snippets) and a templates directory (for parser and lexer templates).

The directories can either be found in the working directory (this is needed for local templates), given as parameters or set as environment variables. I chose the third option, as it seemed the most maintainable solution.

For this reason the [cci]LPG_INCLUDE[/cci] and the [cci]LPG_TEMPLATE[/cci] environment variables have to be set on the Environment tab to the includes and templates directories, respectively.
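
Outside Eclipse, the same launch can be approximated from plain Java with ProcessBuilder. This is only a sketch of what the External tool configuration amounts to: all paths are hypothetical, wine is assumed to be on the PATH, the grammar file is passed as the only LPG argument, and the working directory plays the role of the container location (in the Eclipse dialog the two variables are written as ${resource_loc} and ${container_loc}).

```java
import java.io.File;
import java.util.Arrays;
import java.util.Map;

class LpgWineLauncher {
    // Prepares (but does not start) a Wine process that runs lpg.exe
    // on the given grammar file.
    static ProcessBuilder build(String lpgExe, String grammarFile,
                                String includeDir, String templateDir) {
        ProcessBuilder pb = new ProcessBuilder(
                Arrays.asList("wine", lpgExe, grammarFile));
        // Working directory: the folder that contains the grammar file.
        pb.directory(new File(grammarFile).getAbsoluteFile().getParentFile());
        // The two environment variables named in the text above.
        Map<String, String> env = pb.environment();
        env.put("LPG_INCLUDE", includeDir);
        env.put("LPG_TEMPLATE", templateDir);
        // Show the generator's console output in our own console.
        pb.inheritIO();
        return pb;
    }

    public static void main(String[] args) {
        // Hypothetical locations, adjust to your own installation.
        ProcessBuilder pb = build("/opt/lpg/lpg.exe", "/work/project/parser.g",
                "/opt/lpg/include", "/opt/lpg/templates");
        System.out.println(pb.command());
        System.out.println(pb.directory());
        // Calling pb.start() here would actually launch Wine.
    }
}
```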

The described settings (except the environment variables) are shown on the following screenshot:

Running LPG with Wine on the current selection

After these settings are done, selecting the parser.g file makes it possible to run this new tool, which generates the various parser-related Java classes.

After running the tool, the console output of the lpg generator is shown, where all paths are listed beginning with [cci]Y:\[/cci], although the selected files appear in the folder structure of the Eclipse workspace.

There are some minor shortcomings of this integration. First, I cannot use the pop-up menu to execute this tool, as the external tools are not listed there. Another annoyance is that the file has to be selected in the Navigator view; having it open in the editor is not enough.

This means that I first have to select the file in the Project Navigator (or Package Explorer, etc.), then run the tool manually from the Run configuration menu. Quite annoying, but the grammar does not need to be changed too often.

Another problem is that the error output of the generator is not back-annotated as Eclipse errors (problem markers); only console output is available. For a brand new grammar this would not be the best solution, but for maintenance it is enough.

The LPG IDE of the IMP (IDE Metatooling Platform) project overcomes this challenge by using a newer version of LPG that is written in cross-platform C (or C++). It uses a builder (which automatically calls the LPG binary when the grammar files change), and the builder results are shown as proper error messages.

This means that the future of LPG development in Eclipse is the LPG IDE, but it cannot be used for legacy projects. In these cases my solution can be a good alternative.

Packaging Eclipse in OSX

Recently I experimented a bit with Eclipse packaging. At first it does not seem very important, given that the folks at Eclipse work hard to produce executable packages. On the other hand, the Mac OSX packaging is not the best possible one.

The default folder structure of Eclipse applications on Mac OSX is something like the following:
[cc]eclipse
–configuration/
–dropins/
–features/
–p2/
–plugins/
–Eclipse.app/
–artifacts.xml[/cc]

In this structure Eclipse.app is a special folder that acts as an executable item on OSX.

This structure is easy to produce and very similar to the ones on Windows or Linux, but there are some drawbacks. First, in the /Applications folder the icon is a generic folder icon instead of an Eclipse icon (okay, this one is easy to resolve, as every folder can have a custom icon). More importantly, every indexer tries to identify the executables by name. If there are multiple Eclipse instances installed, then every instance will be displayed with the same name. Only if the path is also displayed is it possible to distinguish between the instances.
Multiple Eclipse instances shown by the same name

Some time ago (~1 year) I tried simply renaming the application bundle, but it did not work, as there seems to be some kind of configuration that breaks after the rename. But this was quite a while ago.

Now I found another possible solution: there is an Eclipse repackager script shared on GitHub that I could give a try.

The script is a simple bash script with simple parameters:

[cc_bash]./EclipseOSXRepackager «eclipse source folder» «target.app»[/cc_bash]

Quick testing showed that it does not handle dropins, so I hacked together and shared a new version (and meanwhile I was able to try Git for the first time; btw. thanks for the fine tutorials, GitHub team 🙂 ).

My updated solution is also available from GitHub: http://github.com/ujhelyiz/yoursway-eclipse-osx-repackager

To tell the truth, even the updated script has some serious issues. I could break the app in two ways: in the milder case P2 could not install or remove anything, and in the worse case the bundle couldn’t even start.

So I have a quick question: does anyone have a working solution for creating proper, working app bundles for OSX from Eclipse? Or could someone simply help fix the repackager script?