JPropel brings LINQ, reified collections and concise utilities to Java

This is a post about JPropel-light, a free and open-source Java library aiming to cut down code bloat, boilerplate and generally the number of lines of code Java developers have to write in order to complete a task.

It comes with full LINQ support, reified generic collections and concise one-liners for performing standard tasks such as reading an entire text file into memory, array manipulations, XML processing and querying, etc. Let's examine some of them.


LINQ

A very popular language feature of C#, Language INtegrated Query adds native data querying capabilities. These can, for example, be used to project and filter data in arrays and enumerable classes (source: LINQ)

var results = someCollection.Where(c => c.SomeProperty < someValue * 2);

Here is an equivalent Java snippet:

List<SomeElement> results = new ArrayList<SomeElement>();
for(SomeElement c : someCollection)
    if(c.SomeProperty < someValue * 2)
        results.add(c);

This is a trivial example, but for more complex queries, such as group by, select many, etc., the code becomes much more verbose. Even the low level of verbosity shown above hides the real purpose of the snippet, so one has to read multiple lines of code in their entirety to understand it. This gets in the way of the developer and is not good for productivity. So is the following *valid* Java code perhaps better?

List results = someCollection.where(propertyLessThan(someValue*2));

How is this possible? Well, firstly we have to import the right class, which would be the Linq class of the framework. It has implementations for all methods found in the .NET implementation such as: all, any, select, where, distinct, first, last, zip, etc.

The second step is to annotate the class that contains your code with the @ExtensionMethod annotation, imported from a library called lombok-pg. This library makes a lot of the shown syntactic sugar possible by instructing your compiler to pre-process the class code before compiling. This is done transparently to the developer. As far as bytecode is concerned, the code is converted to standard Java calls. So for the snippet above, the code effectively becomes:

List results = Linq.where(someCollection, propertyLessThan(someValue*2));

This, you will recognise, is a standard Java static method call. It is also how extension methods work under the hood in C#, for example. But crucially, you as a developer do not have to write code this way and can be much more expressive, using a builder-like fluent API:
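To make the rewriting step concrete, here is a minimal plain-Java sketch of what a static helper of this kind could look like. The names Predicate1, LinqSketch and lessThan are hypothetical stand-ins chosen for illustration; they are not JPropel's actual types, and the real Linq class may well be implemented differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a one-argument predicate type (not JPropel's own type).
interface Predicate1<T> {
    boolean test(T value);
}

// Hypothetical sketch of a Linq-style helper class; the real Linq class may differ.
final class LinqSketch {
    private LinqSketch() {}

    // Returns the elements for which the predicate holds, in encounter order.
    static <T> List<T> where(Iterable<T> values, Predicate1<? super T> predicate) {
        List<T> result = new ArrayList<T>();
        for (T item : values) {
            if (predicate.test(item)) {
                result.add(item);
            }
        }
        return result;
    }

    // Builds a predicate object, playing the role of propertyLessThan(...) above.
    static Predicate1<Integer> lessThan(final int limit) {
        return new Predicate1<Integer>() {
            public boolean test(Integer value) {
                return value < limit;
            }
        };
    }

    public static void main(String[] args) {
        List<Integer> someCollection = Arrays.asList(1, 5, 8, 12);
        int someValue = 4;
        // What someCollection.where(lessThan(someValue * 2)) would be rewritten to:
        List<Integer> results = LinqSketch.where(someCollection, lessThan(someValue * 2));
        System.out.println(results); // prints [1, 5]
    }
}
```

The extension-method syntax only changes how the call is written; the work is still done by an ordinary static method that receives the collection as its first argument.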

List names = new String[] {"james","john","john","eddie"}.where(startsWith("j")).select(toUppercase()).distinct().toList();

The above statement returns ["JAMES", "JOHN"] and is arguably more readable and concise than the equivalent imperative series of Java statements that it gets compiled down to, don't you agree?

String[] array = new String[] {"james","john","john","eddie"};
List names = Linq.toList(Linq.distinct(Linq.select(Linq.where(array, startsWith("j")), toUppercase())));

The other interesting thing to note here is how we can seemingly pass functions as arguments (e.g. see startsWith, toUppercase). Java does not have first-class functions; in other words, you cannot pass methods/functions around as objects. This is another area where lombok-pg helps us, allowing for the annotation of methods/functions with @Function, enabling easier functional programming in Java.

@Function
private static String toUppercase(String element) {
    return element.toUpperCase();
}

The above annotation will pre-process the code by wrapping the annotated function in an anonymous class, allowing us to pass it around as an object. This is what the resulting code would look like if you decompiled the class:

private static Function1<String, String> toUppercase() {
    return new Function1<String, String>() {
        public String apply(String element) {
            return element.toUpperCase();
        }
    };
}

This is similar to how Scala functions are implemented. You can then pass these 'functions' (which are really just Java objects / anonymous classes) around, just like you would pass function objects around in other languages, e.g. C#. And these are the types of objects that the Linq class accepts as selectors, predicates, filters, etc. Here, for example, is the source code of the select() method:

public static <TSource, TResult> TResult[] select(final TSource[] values, final Function1<TSource, TResult> selector)
{
    List<TResult> result = new ArrayList<TResult>(values.length);

    for(TSource item : values)
        result.add(selector.apply(item));

    return toArray(result, selector.getReturnType());
}

As you can see, all the above method does is call the apply() method of the function to extract the required data, oblivious to the actual implementation of your selector. The JPropel library comes with a number of predicates and projections built-in, such as the ones shown above e.g. startsWith, endsWith, contains, equal, etc. The Linq class static methods accept these as well as custom (written by you!) functions to perform select, where and other such operations, in a way that is suitable to your program.

The JVM deals with anonymous classes very frequently when you code against it using languages such as Scala. A benchmark between C++, Go, Java and Scala, which made the news recently, showed that Scala's performance is very much on par with Java's. So clearly the JVM can cope with wrapping methods into anonymous classes pretty well. The performance cost of such code is therefore minimal, and you can also cache such 'functions' when you know you will use them very frequently.
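Caching a function object can be as simple as keeping it in a static final field. The sketch below assumes a hypothetical Mapper interface and CachedFunctions class; neither is part of JPropel or lombok-pg.

```java
// Hypothetical one-argument function type used only for this sketch.
interface Mapper<A, R> {
    R apply(A arg);
}

final class CachedFunctions {
    private CachedFunctions() {}

    // Created once when the class is loaded and reused on every call.
    private static final Mapper<String, String> TO_UPPERCASE = new Mapper<String, String>() {
        public String apply(String element) {
            return element.toUpperCase();
        }
    };

    // Hands out the shared instance instead of allocating a new wrapper each time.
    static Mapper<String, String> toUppercase() {
        return TO_UPPERCASE;
    }

    public static void main(String[] args) {
        System.out.println(toUppercase().apply("james"));     // prints JAMES
        System.out.println(toUppercase() == toUppercase());   // prints true
    }
}
```

This works because the function object holds no state, so sharing one instance is safe.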

Most Linq methods come in two flavors. One accepts generic arrays and one accepts generic Iterables. There are two reasons for this. Firstly, arrays and Iterables do not share a common super class which would allow for traversal of items. Secondly, arrays and Iterables are handled completely differently from the JPropel library perspective.

So when you pass an array to be processed, the resultant array is allocated and all results are inserted before it is returned. This is not the case with Iterables. Iterables are created using a block iterator, similarly to how yielding works in C#. This means that if you do not iterate over the entire collection, only some of the elements are processed. This is more efficient, because you can, for instance, break out of the iteration early and avoid spending memory and processing time on elements you never look at.
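The lazy behaviour can be illustrated with a hand-written iterator. This is only a sketch of the general technique in plain Java; the Condition interface and LazyWhereSketch class are hypothetical, and JPropel's block iterator is not shown here.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical predicate type for this sketch (not JPropel's own type).
interface Condition<T> {
    boolean holds(T value);
}

final class LazyWhereSketch {
    private LazyWhereSketch() {}

    // Returns a view that filters on demand: nothing is tested until iteration starts.
    static <T> Iterable<T> where(final Iterable<T> source, final Condition<? super T> condition) {
        return new Iterable<T>() {
            public Iterator<T> iterator() {
                final Iterator<T> it = source.iterator();
                return new Iterator<T>() {
                    private T pending;
                    private boolean ready;

                    public boolean hasNext() {
                        // Advance only as far as the next matching element.
                        while (!ready && it.hasNext()) {
                            T candidate = it.next();
                            if (condition.holds(candidate)) {
                                pending = candidate;
                                ready = true;
                            }
                        }
                        return ready;
                    }

                    public T next() {
                        if (!hasNext()) {
                            throw new NoSuchElementException();
                        }
                        ready = false;
                        return pending;
                    }

                    public void remove() {
                        throw new UnsupportedOperationException();
                    }
                };
            }
        };
    }

    // Counts how many source elements are tested when iteration stops at the first match.
    static int testedUntilFirstMatch() {
        final int[] tested = {0};
        Iterable<String> matches = where(Arrays.asList("eddie", "james", "john", "jack"),
            new Condition<String>() {
                public boolean holds(String s) {
                    tested[0]++;
                    return s.startsWith("j");
                }
            });
        for (String name : matches) {
            break; // stop after the first match
        }
        return tested[0];
    }

    public static void main(String[] args) {
        System.out.println(testedUntilFirstMatch()); // prints 2
    }
}
```

Only "eddie" and "james" are tested before the loop exits; an eager, array-based version would have tested all four elements up front.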


Reified generic collections

Generics in Java are implemented using type erasure. This means that generic type information is removed at run-time. A list of integers and a list of strings have the same class type in Java, which creates an interesting set of problems in areas such as overloading, instantiation and introspection. Type erasure is considered by many an inferior way of implementing generics. The subject of erasure vs reification has been covered neatly already so let's focus on what's important here. Since the JVM erases the run-time type of generics, we have to live with this. Or do we?

Neal Gafter came up with a workaround to this problem, called super type tokens. The whole idea is based on the fact that the generic superclass of a class, including an anonymous one, is recorded in its class file and is not erased. Therefore if you instantiate a class using anonymous class semantics, you can query its run-time type parameter(s):

List list = new ArrayList<String>() {}; // note: braces after parentheses
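The query itself uses standard JDK reflection. Here is a minimal sketch; the SuperTypeTokenSketch class and its firstTypeArgument method are hypothetical names for illustration and are not JPropel's SuperTypeToken class.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

final class SuperTypeTokenSketch {
    private SuperTypeTokenSketch() {}

    // Reads the first type argument recorded for the generic superclass of the object's class.
    static Type firstTypeArgument(Object instance) {
        Type superType = instance.getClass().getGenericSuperclass();
        if (!(superType instanceof ParameterizedType)) {
            throw new IllegalArgumentException("Superclass is not parameterized");
        }
        return ((ParameterizedType) superType).getActualTypeArguments()[0];
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<String>() {}; // anonymous subclass records <String>
        System.out.println(firstTypeArgument(list)); // prints class java.lang.String
    }
}
```

Without the trailing braces the object is a plain ArrayList, whose generic superclass only mentions the type variable E, so no concrete element type can be recovered.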

The JPropel library embraces this pattern and provides a wide array of collections that follow it, such as Lists, Maps, Hashtables and others. This allows you to do common sense things such as:

LinkedList<String> list = new LinkedList<String>(){};
String[] items = list.toArray();

With type-erased collections you would have to do nonsensical things such as:

Object[] items = list.toArray();

or

String[] items = list.toArray(new String[list.size()]);
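As a rough idea of how a reified list can hand back a correctly typed array without being given one, consider the sketch below. ReifiedListSketch, toTypedArray and ReifiedListDemo are hypothetical names; the method is deliberately not called toArray() so that the example does not depend on how JPropel's own collections declare it.

```java
import java.lang.reflect.Array;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;

// Hypothetical reified list; JPropel's own collections may be implemented differently.
abstract class ReifiedListSketch<T> extends ArrayList<T> {
    private final Class<T> elementType;

    @SuppressWarnings("unchecked")
    protected ReifiedListSketch() {
        // Assumes a direct anonymous subclass with a concrete class as type argument,
        // e.g. new ReifiedListSketch<String>() {}; anything else fails with a ClassCastException.
        Type superType = getClass().getGenericSuperclass();
        Type argument = ((ParameterizedType) superType).getActualTypeArguments()[0];
        this.elementType = (Class<T>) argument;
    }

    // Allocates an array of the captured element type, so the caller passes nothing in.
    @SuppressWarnings("unchecked")
    T[] toTypedArray() {
        T[] array = (T[]) Array.newInstance(elementType, size());
        return toArray(array);
    }
}

final class ReifiedListDemo {
    private ReifiedListDemo() {}

    static String[] demo() {
        ReifiedListSketch<String> list = new ReifiedListSketch<String>() {};
        list.add("a");
        list.add("b");
        return list.toTypedArray(); // a real String[], not an Object[]
    }

    public static void main(String[] args) {
        System.out.println(demo().getClass().getSimpleName()); // prints String[]
    }
}
```

Making the class abstract is a design choice in this sketch: it forces callers to use the anonymous-class form, which is what makes the element type available.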

Being able to query the run-time type parameter can bring several benefits, especially if you are creating a generic container, e.g. your own collection classes. But in such scenarios, if you have multiple layers of generic containers, you may not be able to reliably obtain the run-time type information. One example:

public class MyClass<T> {
    public MyClass() {
        List<T> list = new ReifiedArrayList<T>() {}; // no can do
    }
}

This is not possible because the type T of MyClass is not known at that point. In such scenarios you have to pass the type explicitly, as such:

public class MyClass<T> {
    public MyClass() {
        Class clazz = SuperTypeToken.getClazz(this);
        List<T> list = new ReifiedArrayList<T>(clazz) {}; // OK
    }
}

You then go ahead and instantiate MyClass using anonymous class semantics, as shown above (or pass the type explicitly if you wish). Although this is not ideal, at least it is possible to accommodate multi-level generic containers.
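The limitation is easy to reproduce with plain JDK classes. In this sketch, HolderSketch is a hypothetical container: its inner list is created as an anonymous subclass, yet reflection can only report the type variable T for it, while the concrete type is recoverable from an anonymous subclass of the outer container.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

// Hypothetical container used only to demonstrate the limitation.
class HolderSketch<T> {
    final List<T> inner = new ArrayList<T>() {}; // T is still a variable at this point

    // Returns whatever was recorded as the type argument of the inner list.
    Type innerTypeArgument() {
        ParameterizedType superType = (ParameterizedType) inner.getClass().getGenericSuperclass();
        return superType.getActualTypeArguments()[0];
    }

    // Returns the type argument recorded for an anonymous subclass of HolderSketch itself.
    // Fails with a ClassCastException if the holder was not created with trailing braces.
    Type ownTypeArgument() {
        ParameterizedType superType = (ParameterizedType) getClass().getGenericSuperclass();
        return superType.getActualTypeArguments()[0];
    }

    public static void main(String[] args) {
        HolderSketch<String> holder = new HolderSketch<String>() {};
        System.out.println(holder.innerTypeArgument()); // the type variable T, not a concrete class
        System.out.println(holder.ownTypeArgument());   // prints class java.lang.String
    }
}
```

This is why the outer type has to be looked up first and then handed to the inner collection explicitly.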

Finally, it is also possible to instantiate collections with multiple type parameters:

AvlHashtable lookup = new AvlHashtable() {};
Iterable keys = lookup.getKeys();
Iterable values = lookup.getValues();

This is possible because the SuperTypeToken class has an overloaded method that allows you to obtain the generic run-time parameter type based on its zero-based index.
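An index-based lookup of this kind can be sketched with the JDK reflection API alone. The IndexedTypeTokenSketch class and its typeArgument method are hypothetical and only mirror the idea; the signature of the SuperTypeToken overload itself is not shown in this post.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.HashMap;
import java.util.Map;

final class IndexedTypeTokenSketch {
    private IndexedTypeTokenSketch() {}

    // Returns the type argument at the given zero-based position of the generic superclass.
    static Type typeArgument(Object instance, int index) {
        ParameterizedType superType = (ParameterizedType) instance.getClass().getGenericSuperclass();
        return superType.getActualTypeArguments()[index];
    }

    public static void main(String[] args) {
        Map<String, Integer> lookup = new HashMap<String, Integer>() {};
        System.out.println(typeArgument(lookup, 0)); // prints class java.lang.String
        System.out.println(typeArgument(lookup, 1)); // prints class java.lang.Integer
    }
}
```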


Fun with utilities!

There are a lot of very concise one-liners supported by the library, allowing you to focus on the actual problem you are trying to solve instead of writing boilerplate. Here are some examples.

// create alphabet
char[] alphabet = new Character('A').to(new Character('Z'));
char[] numbers = new Character('1').to(new Character('9'));

// join two arrays and put in a list
List allowed = alphabet.join(numbers).toList();


// load entire text file in memory
String data = file.readFileToEnd();

// append to file
file.appendText("Some more text");

// XML compacting
String compacted = someXml.compact();

// copy stream over
inputStream.copy(outputStream);

// Culture-aware case-insensitive string comparison
boolean trueInGermany = StringUtils.equal("straße", "STRASSE", StringComparison.CurrentCultureIgnoreCase);

// check if all elements are present
boolean itsTrue = "abcdef".containsAll(new String[] {"bc", "cd"}, StringComparison.Ordinal);


There are too many utilities to cover here, around the areas of reflection, XML, conversions (e.g. binary, octal, hex, decimal, base64, etc.), character escaping, hashing, strings, etc. Have a look at the propel.core.utils.* package for more.


JPropel

JPropel-light is a lightweight version of the JPropel library. If you do not mind a couple of extra dependencies (BouncyCastle and SLF4J), have a look at the full library, as it contains some extra utility functionality.

For instance, the CryptographicString class allows you to store a secret in memory in encrypted form:

CryptographicString cs = new CryptographicString("super secret");
char[] unencrypted1 = cs.asCharArray();
byte[] unencrypted2 = cs.asByteArray();
String unencrypted3 = cs.asString(); // beware of String interning

Another interesting construct is the ability to trace method calls, logging all input and output data (or exceptions thrown) in a highly configurable way.

@Trace(level=LogLevel.INFO)
public int add(int a, int b) {
return a+b;
}

This allows you to focus on the task at hand without worrying if you have enough logging within a method. To enable the tracing you have to instantiate your class (perhaps via a factory newInstance() method) in a way that allows the tracing to take place:

IMyClass myClass = new Tracer(new MyClass());

All calls to myClass methods that have been annotated with @Trace will emit logging statements of all method arguments, results (if not void) and exceptions (if thrown). The logger uses SLF4J, so it will plug into any popular logging library you are using. The format of messages is highly configurable as well.
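One common way to get this kind of behaviour in plain Java is a dynamic proxy around an interface. The sketch below shows that general technique only; it is an assumption that JPropel's Tracer works along similar lines. The Traced annotation, Calculator, SimpleCalculator and TracerSketch are all hypothetical names, and messages are collected in a list rather than sent to SLF4J.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical marker annotation standing in for @Trace.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Traced {}

interface Calculator {
    int add(int a, int b);
}

class SimpleCalculator implements Calculator {
    @Traced
    public int add(int a, int b) {
        return a + b;
    }
}

final class TracerSketch {
    private TracerSketch() {}

    // Collected messages; a real tracer would hand these to a logger instead.
    static final List<String> LOG = new ArrayList<String>();

    // Wraps target in a proxy that records calls to methods annotated with @Traced.
    @SuppressWarnings("unchecked")
    static <I> I trace(final I target, Class<I> iface) {
        InvocationHandler handler = new InvocationHandler() {
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                // The annotation sits on the implementation, so look the method up there.
                Method impl = target.getClass().getMethod(method.getName(), method.getParameterTypes());
                boolean traced = impl.isAnnotationPresent(Traced.class);
                try {
                    Object result = method.invoke(target, args);
                    if (traced) {
                        LOG.add(method.getName() + Arrays.toString(args) + " -> " + result);
                    }
                    return result;
                } catch (InvocationTargetException e) {
                    if (traced) {
                        LOG.add(method.getName() + " threw " + e.getCause());
                    }
                    throw e.getCause(); // rethrow the original exception to the caller
                }
            }
        };
        return (I) Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }

    public static void main(String[] args) {
        Calculator calculator = trace(new SimpleCalculator(), Calculator.class);
        calculator.add(1, 2);
        System.out.println(LOG.get(0)); // prints add[1, 2] -> 3
    }
}
```

A proxy-based design explains why the traced object is referred to through an interface type: the proxy can only stand in for interfaces, not for the concrete class.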


Lombok-pg

JPropel and JPropel-light depend on the lombok-pg library.

If you are using an IDE such as Eclipse or NetBeans, you should patch it so that it knows that lombok is used as a Java agent. The process is very simple: just double-click on the lombok.jar found in the "lib" folder. This will enable much of the syntactic sugar shown above.

Javac and Ant do not require any patching as such.

You may visit the lombok project website for more info, or ask Philipp more about it.