Source code visualisation for my projects

Today I read the announcement of SourceCloud, a new Zest/Cloudio-based visualization for source code. As I like cool visualizations, I tried it out on the open projects I am participating in.At first I created the cloud for the source code of the Debug Visualisation project. Globally it consists of about half Eclipse debugger/UI and Java keywords – as this is quite a small project (about 4 kLOC), this is somewhat expected with 200 words.

SourceCloud of the Debug Visualisation Project

However, it is interesting, that the strings ‘end_of_line‘ and ‘sp_cleanup‘ found their way in. Luckily, only a few one-character names and numbers are present, meaning, we managed to created longer variable names…

The cloud of the VIATRA2 model transformation framework is more interesting: it is a much larger codebase, developed for several year by around a dozen people. Because of the size I generated the cloud using 300 words, but almost no Java keywords found their way into the graph.

SourceCloud of the VIATRA2 framework

However, the mandatory comment parts did, e.g. nbsp or the names of some creators. A lot of EMF-specific keywords are also present. They are present because we have quite a large EMF metamodel that provides several hundred generated Java files.

At last, but not least I also checked the cloud of the EMF-IncQuery project. This is smaller, then the VIATRA2 project, but larger then the Debug Visualisation. Additionally, it has a shorter history, but with several key committers.

SourceCloud of the EMF-IncQuery project

My personal favorite is this diagram, as it shows a lot of nice things: e.g. four digit numbers in large quantities (magic numbers FTW) or the absolute champion string: ‘HUKfyM7ahfg‘. All of them are present in the Ecore Diagram of the defined metamodel (the meaningless string 152 times as a postfix of an identifier 🙂 ). At least, I learned something about the internal model representation of GMF.

And how do your projects look like?

Incremental validation of UML models using EMF-IncQuery

Last week we created in the university a nice demo application to demonstrate the validation capabilities of our EMF-IncQuery framework. The application was presented on the conference ECMFA 2011 as a tutorial.

EMF-IncQuery provides an incrementally evaluated model query framework: as the underlying EMF model is changed, the query results are re-evaluated, and are instantly returned when needed. Queries are formulated using the pattern language of the VIATRA2 transformation system, and the Java code necessary to execute queries are generated.

In our demo application we extended the Papyrus UML editor with on-the-fly validation of well-formedness constraints. During editing we check for non-local constraints (e.g. a Behaviour must have the same number of parameters as the corresponding operation); display the results in the editor and in the problems view; we try to provide and remove these error markers without saving or manual need to start validation.

Basically, the constraint validators are generated from specially annotated VIATRA2 patterns; the generation is currently somewhat specific to the Papyrus editor, but can be extended to support most EMF-based editors easily.

A screencast of the demo is presented below. It is important to note, that the Papyrus editor was not modified in any way, and the framework generated all Java code used during validation. More details about the demo application, such as source code, tutorial slides, can be found on the homepage of the application.

(If the embedded video does not show – e.g. in a feed reader – you could watch it on on Youtube)

Custom markers and annotations – the bright side of Eclipse


I have discovered a truly remarkable proof of this theorem which this margin is too small to contain.
Pierre de Fermat

During the development of editors it is often needed to display additional information next to the document, such as error markers, search results or various coverage metrics. However, it is not a good idea to hardcode the list of supported markers during development, as the best ideas will be introduced by others – who do not want to have anything to do with our editor.

Luckily, Eclipse provides a simple way to attach markers to documents without explicit support needed from the editor developer: the markers (and annotations) created by their corresponding extensions are stored independently from the document, so these additions can be made without disturbing any existing user of the files.

As I described it in my previous post about error markers, such markers can be defined creating an extension for the extension point org.eclipse.core.resources.marker, and then using the IResource.createMarker() method to create the marker instances as needed. Then using some magic it can be displayed as the yellow or red wriggly lines during the text.

Other markers (and annotations) can be created in similarly: we create our own marker type, and register an annotation type for it, and the platform manages the rest. In my case I wanted to represent a calculated set of program statements similar to the code coverage display used by the EclEmma tool: the covered statements are highlighted with the green background.

To support such a display, I created a new marker type, that is a text marker; and to make the lifecycle-management easier, I did not set it persistent. This has been reached using the following code:

    name="GTASM Slice"

The most important difference between this marker and the one defined in the previous Markers and Annotations post is that the newly defined marker is not a problem marker, but a simple text marker. This means, the created marker will not show in the Problems view (that is nice – we don’t want to display problems now), and no default annotation is created (that will need fixing).

In theory it is possible to create annotations programmatically together with the markers, but in practice it is really a bad idea. First of all, annotations are created for the document model (thus assuming an open editor and a code dependency to that document model), are never persisted, and the resulting code fails indeterministically (I once debugged a code like that, every annotation creation code ran correctly, but some annotations were missing; re-running the code re-added the missing ones, but sometimes removed others), so this idea will not work.

Luckily, the platform provides two extension points that can be used to automatically assign annotations to marker instances. To define an automatically generated annotation type for the markers, the org.eclipse.ui.editors.annotationTypes extension point is used: basically a marker and an annotation ID is stored this way.




To define the appearance of the created annotation the org.eclipse.ui.editors.markerAnnotationSpecification extension point can be used. In the extensions we have to specify the annotation display methods used together with color and formatting settings.


            label="GTASM Slice Marker"

Even better, by assigning preference keys to the various settings the platform also offers the users the possibility to configure the annotations: the General/Editors/Text Editors/Annotations preference page will list our annotation, and allows setting its preference values. If the preference keys are unique (that is our responsibility 🙂 ), then creating this extension ends our work: annotations will be generated automatically, and they will be displayed in the opened text editors.

Annotation Preferences

The marker creation process is entirely the same as the creation of error markers: a new marker is initialized using the IResource.createMarker(String markerId) method, and the IMarker.CHAR_START and IMarker.CHAR_END markers are set for positioning the markers.

An editor opened with the annotated code

Alltogether, the markers and annotations provide an easy way to extend existing editors with meta-information (e.g. execution state, dependencies, coverage, etc.), that can be used with a very little amount of Java programming. Sadly, the content assists for extensions is not that discoverable as the Java APIs, so a lot of experimenting and/or documentation reading is needed. Luckily, documentation is available for these extension points, so knowing their name will be enough for most cases.

So, lets hope, Eclipse gives us a big enough margin to contain our proo metainformation.

Generating Java code from EMF models automatically

EMF provides a nice code generation facility for generating Java class hierarchy from ecore models. There is support for hand-made modifications to the generated code, or a large selection of generation parameters are available through a generator model.

However, we had a different need together with the VIATRA2 project: we explicitly forbid in our internal policy modifications to the generated Java code in order to avoid non-reusable models, and as this way the generated code does not contain any extra information to the generator model, we do not want to store the Java classes in our Subversion repository.

Another minor obstacle was that the code generation does not happen automatically (e.g. as in the LPG editor of the IMP project) – so simply not committing the generated code to the repository would mean a manual invocation of the generator that is inconvenient.

Luckily, the code generator is available using specific ANT tasks, allowing to write a build file that describes how to generate the required Java code, reusing the existing build model if necessary:

 reconcilegenmodel="reload" >

Some interesting points:

  • As the selected Ant task is provided by an OSGi plug-in, it is important to set the Runtime JRE to the same one as used by the workspace, otherwise the called task will not be found.
  • As the genmodel generator is not available as an Ant task, the ecore2Java task was used. In this case it is important to use the reconcilegenmodel attribute to avoid the unnecessary override of the genmodel
  • Regardless of the value of the generate*project attributes if the *project were not given as sub-tags, the corresponding code was not generated. This seems like a bug for me, as this duplicates information already present in the genmodel, but I am not sure.
  • The code generator does not set the generated files Derived attribute, thus it is not excluded from the svn repository. Unfortunately, it is not possible without writing custom tasks, as there is no such command (yet) in Eclipse (but hopefully the patch gets committed in Indigo).

This ant task mimics the GUI-based code generators functionality, thus fulfilling our needs, except two things: it does not exclude generated code from the repository, and does not run automatically.

The repository exclusion is resolved using a hack: the src folders are listed in all generated projects in the svn:ignore variable – thus manually disallowing their sharing. This approach works well, because if no code is generated, the src folder might be missing, that is marked with a very explicit error message (as it is still referenced in the project properties), but if the task is executed, the folder is created as well (and the source folder reference will be working).

For the automatic building Eclipse provides a neat way: External tools (e.g. Ant scripts, or shell scripts, etc.) can be added as project builders. A Run configuration has to be created for the selected task, that can be added to the project. This means, our script is executed each time the selected project changes

The Ant task has been set as a custom builder
Builders page in the Project properties

Finally, some other modifications were made to the task in order to provide a more standard build: first, as not the entire project is generated at every invocation of the script (the projects could not be deleted, because the generator does not support setting the version number of the generated edit and editor projects), if all project stubs are downloaded from the repository, the build order might not be adequate. To overcome this, a manual build is executed for all projects at the end.

A second modification is a clean task, that removes all source folders, in order to provide a “clean”, out-of-the-box experience for a regeneration. This task can also be registered at the builder properties.

The full, modified script looks as following (using Gist from Github):

Update: the tasks are not available in Eclipse, as mentioned by a comment. They are provided by the Buckminster tool, and are not required for the EMF code generation, so it is safe to remove them. However, in this case the refreshing and building of the projects have to be managed in another way (e.g. registering the Ant script as an External builder for the projects).

This simple script helps in a lot of cases – and because the ecore models does not change very often, its performance is not really an issue (we have quite a large ecore model, so the generator runs about 30-40 seconds). My main issues are the following:

  • I don’t like to reproduce information from the genmodel in the build script. I am not entirely sure which information is read from the genmodel and which is inferred from the parameters, making the solution a bit volatile.
  • The missing derived flag command. It would be quite a nice touch if the code would take care of it.

Alltogether, if you like it, feel free to use it (licensed under EPL). I also would be glad if somebody could give me a hint about how to enhance the script – this is only my second Ant build script, so I don’t know the language very well. 🙂