“You know Java—but do you really know Java?” In this article, Voxxed Days Berlin speaker Sean Patrick Floyd serves up some choice hacks he’s picked up in his modest 18 years tooling on the platform. Happy hacking!

The scope of my Voxxed Days Berlin talk was to present different techniques in and around Java that you don’t usually learn about. The common theme is: code that interacts with code on a meta level. Having a set of quirky techniques is one thing, but the more interesting question is: what can I do with them? For this reason, I structured the talk (and this article) by use cases. Here’s what I came up with:

  • Value Objects

    In Java, creating simple value objects is one of the most common, and most painful, tasks. A value object should not have any logic, but obey several implicit or explicit contracts. According to Effective Java, value objects should encapsulate fields, ideally be immutable, make defensive copies of mutable fields, and implement equals / hashCode and toString correctly etc. In Java, all of this is very verbose and error prone. I will introduce several techniques to make this functionality less tedious, some of them sensible and others insane.

  • Patching a Third Party Library

    This is an insane use case, though unfortunately a common one. Your company forces you to use a library, but you are aware that it’s buggy, and that it’s no longer maintained, or the author is unwilling to fix the bugs. Still, you don’t want to fork the library, because you want to stay current with the updates, or the sources are not available. I know this sounds crazy, and it is. Anyway, I will show ways to patch a library automatically, on source code or bytecode level.

  • Compile-time Defect Analysis

    Static code analysis tools have been around in Java for many years, working on source code (CheckStyle etc.) or byte code (FindBugs). I’d like to concentrate on techniques that can be integrated directly into the compilation process. The idea is to have a single source of truth, and not several tools that have to be run in sequence. After all: the build should be fast, and maintaining several sets of rules is a pain.

Value Objects

Java value objects are expected to respect the JavaBeans convention (a public no-arg instance method named “getFoo” maps to the property “foo”). By writing a common test framework that uses the JavaBeans Introspector mechanism, I can turn any value object into a map of property names to values and make assertions on the contents of that map.

Listing: turn JavaBeans properties into a Map<String, Object>

private static Map<String, Object> getPropertyMap(final Object instance) {
  final Map<String, Object> propertyMap = new TreeMap<>();
  try {
    Arrays.stream(Introspector.getBeanInfo(instance.getClass(), Object.class)
                              .getPropertyDescriptors())
        .filter((it) -> it.getReadMethod() != null)
        .forEach((propertyDescriptor) -> {
            final Method method = propertyDescriptor.getReadMethod();
            try {
              final Object result = method.invoke(instance);
              propertyMap.put(propertyDescriptor.getName(), result);
            } catch (IllegalAccessException | InvocationTargetException e) {
              throw new IllegalStateException(e);
            }
          });
  } catch (IntrospectionException e) {
    throw new IllegalStateException(e);
  }
  return propertyMap;
}
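As a rough sketch of how such a property map might be used, here is a self-contained variant of the listing above together with a tiny sample bean. The `Person` class and the `PropertyMapDemo` wrapper are made up for illustration; only the Introspector calls come from the JDK.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.InvocationTargetException;
import java.util.Map;
import java.util.TreeMap;

final class PropertyMapDemo {

  /** Hypothetical value object, only here to have something to inspect. */
  public static final class Person {
    private final String firstName;
    private final int age;

    public Person(String firstName, int age) {
      this.firstName = firstName;
      this.age = age;
    }

    public String getFirstName() { return firstName; }
    public int getAge() { return age; }
  }

  /** Same idea as the listing above, written as a plain loop. */
  static Map<String, Object> getPropertyMap(Object instance) {
    Map<String, Object> propertyMap = new TreeMap<>();
    try {
      // Object.class as stop class keeps the "class" pseudo-property out of the map
      for (PropertyDescriptor pd
          : Introspector.getBeanInfo(instance.getClass(), Object.class).getPropertyDescriptors()) {
        if (pd.getReadMethod() != null) {
          propertyMap.put(pd.getName(), pd.getReadMethod().invoke(instance));
        }
      }
    } catch (IntrospectionException | IllegalAccessException | InvocationTargetException e) {
      throw new IllegalStateException(e);
    }
    return propertyMap;
  }

  public static void main(String[] args) {
    // TreeMap sorts by key, so the output order is stable
    System.out.println(getPropertyMap(new Person("Ada", 42))); // prints {age=42, firstName=Ada}
  }
}
```

A test can then compare the whole map against an expected map in a single assertion, instead of calling every getter by hand.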

When it comes to simplifying Value Objects, the best-known method is via Project Lombok. It uses Pluggable Annotation Processing, a compiler hook introduced in Java 6. However, it uses it in a non-standard way, casting the read-only interfaces of the AST (Abstract Syntax Tree) into the internal implementation types and extending the AST on the fly. The advantage of this is that source code can be very minimal:

Listing: mutable User class implemented with Lombok

@Data
public class User{
    private String firstName;
    private String lastName;
    private Date birthDate;
    private List<Address> addresses;
}

Lombok will generate all the missing methods: equals(), hashCode(), toString(), getters, setters.

If you want the same class to be immutable, all you need to do is make the fields final, and Lombok will generate a constructor instead of the setters. This is where things get complicated though: if your immutable class contains mutable state (e.g. Lists), you need to use one of several techniques to ensure immutability (defensive copy, decorate with Collections.unmodifiableList(), etc.), and any of these techniques has to be written manually, taking a lot of the fun out of Lombok.

Listing: user class re-implemented as immutable

@Data
public class User {
  private final String firstName;
  private final String lastName;
  private final LocalDate birthDate;
  private final List<Address> addresses;

  public User(String firstName, String lastName, LocalDate birthDate,
              List<Address> addresses) {
    this.firstName = firstName;
    this.lastName = lastName;
    this.birthDate = birthDate;
    // defensive copy
    this.addresses = new ArrayList<>(addresses);
  }

  public List<Address> getAddresses() {
    // unmodifiable wrapper
    return Collections.unmodifiableList(addresses);
  }
}
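To make the two techniques from this listing concrete without any Lombok dependency, here is a minimal JDK-only sketch. The `Team` class and its `String` members are stand-ins I made up; the point is only to show what the defensive copy and the unmodifiable wrapper each protect against.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical immutable holder, reduced to the one mutable-looking field. */
final class Team {
  private final List<String> members;

  Team(List<String> members) {
    // defensive copy: later changes to the caller's list do not leak in
    this.members = new ArrayList<>(members);
  }

  List<String> getMembers() {
    // unmodifiable wrapper: callers cannot change our internal list
    return Collections.unmodifiableList(members);
  }
}
```

Without the copy in the constructor, whoever created the `Team` could still change its contents through the original list. Without the wrapper in the getter, every caller of `getMembers()` could.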

This is where Lombok’s competitor kicks in – Google’s AutoValue. AutoValue embraces immutability, and it doesn’t pretend to be magic the way Lombok does. With AutoValue, you define an abstract class with abstract getters, and a factory method. AutoValue will generate an implementation of the abstract class, which you can instantiate from your factory method. AutoValue assumes everything is immutable, so a defensive copy in a getter is not supported. Instead, you would copy your List to an immutable version in the factory method.

Listing: user class implemented in AutoValue

@AutoValue
public abstract class User{
  public abstract String getFirstName();
  public abstract String getLastName();
  public abstract LocalDate getBirthDate();
  public abstract List<Address> getAddresses();

  public static User create(String firstName, String lastName,
                            LocalDate birthDate, Address address,
                            Address... moreAddresses) {
    return new AutoValue_User(firstName, lastName, birthDate,
       ImmutableList.copyOf(Lists.asList(address, moreAddresses)));
  }
}

As you can see, the factory methods add some verbosity. Sometimes though, this can be a good thing, as this gives you the ability to add custom logic (in this case, converting a varargs parameter to an immutable List). As usual, there will be those who prefer Lombok’s “Magic” approach, and those who like Google’s well-documented no-BS approach better.
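To give an idea of the pattern behind this (an abstract class with abstract getters, a factory method, and a concrete subclass hidden behind it), here is a hand-written sketch. It is not AutoValue’s actual output; the names `FullName` and `HandWrittenFullName` are invented, and the subclass only approximates the kind of code a generator would write for you.

```java
import java.util.Objects;

/** What you would write yourself: abstract getters plus a factory method. */
abstract class FullName {
  abstract String getFirstName();
  abstract String getLastName();

  static FullName create(String firstName, String lastName) {
    return new HandWrittenFullName(firstName, lastName);
  }
}

/** Hand-written stand-in for the subclass a generator would produce. */
final class HandWrittenFullName extends FullName {
  private final String firstName;
  private final String lastName;

  HandWrittenFullName(String firstName, String lastName) {
    // reject nulls up front so equals() and hashCode() never have to care
    this.firstName = Objects.requireNonNull(firstName, "firstName");
    this.lastName = Objects.requireNonNull(lastName, "lastName");
  }

  @Override String getFirstName() { return firstName; }
  @Override String getLastName() { return lastName; }

  @Override public boolean equals(Object o) {
    if (o == this) return true;
    if (!(o instanceof FullName)) return false;
    FullName that = (FullName) o;
    return firstName.equals(that.getFirstName()) && lastName.equals(that.getLastName());
  }

  @Override public int hashCode() { return Objects.hash(firstName, lastName); }

  @Override public String toString() {
    return "FullName{firstName=" + firstName + ", lastName=" + lastName + "}";
  }
}
```

Callers only ever see `FullName.create(...)`; the concrete class stays an implementation detail, which is what makes it safe to generate.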

I have some unique ideas when it comes to value objects, so I often write custom source code generators. That way I get the exact code style I want and not the library maintainer’s favorite style. This approach is more complicated, however, so it’s definitely not for everyone.

The framework I am using here is JCodeModel, a recent fork off a framework that was originally created by Sun’s JAXB project, for generating stubs from SOAP WSDLs. JCodeModel has a very nice interface that abstracts away the management of types and imports, which is usually the nastiest part of Code Generation. In my sample project, I use Groovy to integrate JCodeModel into the Maven build process. That’s of course a hacky way of doing things (we are here to hack Java, right?), but the alternatives would be too complicated (two separate compile cycles, moving the code generator to a library, writing a Maven Plugin, etc.).

Listing: creating constructor and fields from a Map using JCodeModel (in Groovy)

private defineConstructor(JDefinedClass clazz, Map fields) {
    // public [ClassName]()
    def constructor = clazz.constructor(JMod.PUBLIC)
    def body = constructor.body()
    fields.each {
        entry ->
            def type = resolveType(entry.value)

            // private final [Type] [field];
            def field = clazz.field(PRIVATE | FINAL, type, entry.key)   

            // final [Type] [param]
            def param = constructor.param(type, entry.key)                        
            param.mods().setFinal(true)

            // this.[field] = [param]
            body.assign(THIS.ref(field), param)                                   
    }
}

Of course, code generation could also be done via templates (usually Velocity or Mustache), but in my opinion that’s just not elegant, so I’ll leave that to somebody else.

The fourth technique I describe is dynamic class generation at runtime, using CGLib. This is the hackiest of these solutions, and one I’d certainly not recommend using in production code.

Listing: creating a Bean class at runtime

static Class<?> createBeanClass(
  /* fully qualified class name */
  final String className,
  /* bean properties, name -> type */
  final Map<String, Class<?>> properties) {

  final BeanGenerator beanGenerator = new BeanGenerator();

  /* use our own hard coded class name instead of a real naming policy */
  beanGenerator.setNamingPolicy(new NamingPolicy() {
    @Override
    public String getClassName(String prefix, String source, Object key,
      final Predicate names) {
      return className;
    }
  });
  BeanGenerator.addProperties(beanGenerator, properties);
  return (Class<?>) beanGenerator.createClass();
}

This generates a mutable bean class with the specified properties, but it doesn’t generate equals(), hashCode() or toString(), so I’m creating a proxy to provide this functionality, but this is where it gets really hacky. To cut a long story short, this technique is a proof of concept at best.
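The proxy idea can be sketched with the JDK’s own `java.lang.reflect.Proxy` instead of CGLib. This is a deliberately different technique (it needs an interface and keeps its state in a map rather than in generated fields), and the `MapBackedBean` class and `Person` interface are hypothetical names, but it shows how equals(), hashCode() and toString() can be supplied from one place at runtime.

```java
import java.beans.Introspector;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.TreeMap;

final class MapBackedBean implements InvocationHandler {

  /** Hypothetical bean interface for the demo. */
  interface Person {
    String getFirstName();
    int getAge();
  }

  private final Class<?> type;
  private final Map<String, Object> state;

  private MapBackedBean(Class<?> type, Map<String, ?> values) {
    this.type = type;
    this.state = new TreeMap<>(values); // copy, sorted for a stable toString()
  }

  static <T> T create(Class<T> type, Map<String, ?> values) {
    return type.cast(Proxy.newProxyInstance(
        type.getClassLoader(), new Class<?>[] {type}, new MapBackedBean(type, values)));
  }

  @Override
  public Object invoke(Object proxy, Method method, Object[] args) {
    String name = method.getName();
    if (name.equals("equals") && args != null && args.length == 1) return isEqual(args[0]);
    if (name.equals("hashCode") && args == null) return state.hashCode();
    if (name.equals("toString") && args == null) return type.getSimpleName() + state;
    if (name.startsWith("get") && args == null) {
      // getFirstName -> firstName, following the JavaBeans naming rule
      return state.get(Introspector.decapitalize(name.substring(3)));
    }
    throw new UnsupportedOperationException(name);
  }

  private boolean isEqual(Object other) {
    if (other == null || !Proxy.isProxyClass(other.getClass())) return false;
    InvocationHandler handler = Proxy.getInvocationHandler(other);
    if (!(handler instanceof MapBackedBean)) return false;
    MapBackedBean that = (MapBackedBean) handler;
    return type == that.type && state.equals(that.state);
  }
}
```

Like the CGLib variant, I would treat this as a proof of concept: a missing map entry for a primitive property, for example, only shows up as a NullPointerException when the getter is called.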

Patching a Third Party Library

Imagine the situation: For reasons outlined at the start of the article, you have to use a buggy third party library, and you don’t want to fork it. So you need an automated patching pipeline. If your reaction to this has the initials WTF, then I have to congratulate you on your sanity, but bear with me. For the time being, we will assume this is a sane requirement, you have your CI / CD pipeline set up, versioning and triggers figured out and now you need to apply the actual patch. This is where it gets interesting.

You basically have two options: change the source code or the byte code. If you have access to the sources, you should prefer the former, since you then have a regular compile process and the compiler acts as a sanity check to verify your code is still valid (you’ll need tests too, preferably lots of them).

If you want to patch the source code, the easiest option is of course to apply a standard UNIX diff with your changes. But this is also the flakiest option, since it will break on very minor changes in the source code.

The next step would be to use a regular expression. But as Stack Overflow users will be aware, regular expressions are ill-suited for parsing structured languages. It can feel like “every time you attempt to parse HTML with regular expressions, the unholy child weeps the blood of virgins, and Russian hackers pwn your webapp”. This of course applies even more to programming languages like Java. Don’t do it, just don’t. Use a parser.
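A small, admittedly contrived sketch of what goes wrong: the naive pattern below tries to find methods returning `int`, and happily reports one that only exists inside a comment. The class name and the pattern are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NaiveMethodFinder {

  // Naive idea: "int", a name, then an opening parenthesis must be a method declaration.
  private static final Pattern INT_METHOD = Pattern.compile("\\bint\\s+(\\w+)\\s*\\(");

  static List<String> findIntMethods(String source) {
    List<String> names = new ArrayList<>();
    Matcher matcher = INT_METHOD.matcher(source);
    while (matcher.find()) {
      names.add(matcher.group(1));
    }
    return names;
  }

  public static void main(String[] args) {
    String source = String.join("\n",
        "class A {",
        "  // TODO: drop int legacy(int x) once all callers are gone",
        "  int current(int x) { return x; }",
        "}");
    // The class has one method, but the regex also matches the comment.
    System.out.println(findIntMethods(source)); // prints [legacy, current]
  }
}
```

You can keep bolting special cases onto the pattern (comments, string literals, generics, annotations), but at some point you have written a bad parser.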

The only freely available and Java 8 ready standalone parser library is JavaParser, another new fork of a previously dead project. JavaParser’s parser is generated from a grammar by JavaCC, a parser generator for Java.

JavaParser lets you read a Compilation Unit (a .java source file, which may contain multiple classes), change it programmatically and write it to an output of your choice. Just what we need. In my sample code, I will again do this in Groovy, but it could also easily be done in Java, with a more complicated build setup.

The preferred way to work on a CompilationUnit is to write a visitor. In this case I will be implementing VoidVisitor, or rather extending its convenience adapter VoidVisitorAdapter.

Here is a sample implementation of a Visitor that replaces the body of a method named “yUNoReuseInteger” with a call to “Integer.valueOf()”.

class PatchVisitor extends VoidVisitorAdapter<Void> {

    public void visit(final MethodDeclaration n, final Object arg) {
        if (n.name == "yUNoReuseInteger") {
            n.body = new BlockStmt() // delete existing method body
            patchIntegerMethod(n.body, n.parameters[0])
        } else super.visit(n, arg)
    }

    private patchIntegerMethod(BlockStmt blockStatement, Parameter param) {
        Type type = new ClassOrInterfaceType("Integer")
        def typeExpr = new TypeExpr()
        typeExpr.type = type
        MethodCallExpr methodCall = new MethodCallExpr(typeExpr, "valueOf")
        methodCall.args.add(new NameExpr(param.id.name))
        blockStatement.getStmts().add(new ReturnStmt(methodCall))
    }
}

As you can see, it’s very verbose. One line of generated code requires 6 lines of generator code. Also, JavaParser does not manage types for you, so you will have to keep track of all imports that need to be added. Needless to say, this is a technique I would recommend for small patches only. That being said, I have used this technique in the past, creating a custom Maven distribution for a previous client of mine, with an automated pipeline that created a patched version of every new Maven version.
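If the visitor-plus-adapter structure is unfamiliar, the following toy model may help. It has nothing to do with JavaParser’s real classes: `ToyAst` and everything inside it are invented for this sketch, and method bodies are plain strings. It only shows the division of labour, where the adapter walks the tree and your subclass overrides the one callback it cares about.

```java
import java.util.ArrayList;
import java.util.List;

final class ToyAst {

  interface Visitor {
    void visit(ClassNode node);
    void visit(MethodNode node);
  }

  /** Convenience adapter: walks the tree and otherwise does nothing. */
  static class VisitorAdapter implements Visitor {
    @Override public void visit(ClassNode node) {
      for (MethodNode method : node.methods) {
        method.accept(this);
      }
    }
    @Override public void visit(MethodNode node) { }
  }

  static final class ClassNode {
    final List<MethodNode> methods = new ArrayList<>();
    void accept(Visitor visitor) { visitor.visit(this); }
  }

  static final class MethodNode {
    final String name;
    String body; // a real AST would hold statement nodes here
    MethodNode(String name, String body) {
      this.name = name;
      this.body = body;
    }
    void accept(Visitor visitor) { visitor.visit(this); }
  }

  /** Replaces the body of the one method we want to patch. */
  static final class PatchVisitor extends VisitorAdapter {
    @Override public void visit(MethodNode node) {
      if (node.name.equals("yUNoReuseInteger")) {
        node.body = "return Integer.valueOf(value);";
      } else {
        super.visit(node); // keep default traversal for everything else
      }
    }
  }
}
```

Calling `classNode.accept(new ToyAst.PatchVisitor())` rewrites the matching method and leaves all others untouched.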

If you don’t have access to the sources, you could resort to bytecode hacking using tools like CGLib or ByteBuddy, although the programming model is far from intuitive and very error-prone. So I prefer AspectJ, an aspect-oriented language which integrates nicely with Java.

AspectJ has a separate compiler, ajc, which can be used either instead of, or in addition to, javac. With AspectJ, you identify pointcuts, a kind of selector that matches code patterns, like calling methods or assigning fields. You then link these pointcuts to different types of advice (before, after, around etc.), which means your code will be executed before, after or instead of the code identified by the selector.

AspectJ is a very powerful technology, but it’s also pretty hard to use, and the documentation is almost non-existent. Your only real chance of understanding it is through Ramnivas Laddad’s book “AspectJ in Action”, 2nd Ed. In the example I give, the code replacement is almost trivial. The only downside is that your code will have a dependency on the AspectJ runtime.

Listing: replacing the contents of a method with an around advice

public aspect FicticiousExamplePatch{
      // pointcut
      public pointcut integerMethodCalled(int value) : 
          execution(* com.yourcompany.FicticiousExample.yUNoReuseInteger(..))
          && args(value);
      
      // advice
      Integer around(int value) : integerMethodCalled(value){
          // note that the actual method body is replaced, not wrapped
          return Integer.valueOf(value);
      }

}
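Why bother with this particular patch? Assuming the original method allocates a fresh Integer on every call (its name suggests as much, the body is not shown here), the replacement lets the JDK reuse cached instances for small values. A minimal plain-Java sketch of the patched behavior, with a made-up class name:

```java
final class IntegerReuse {

  /** Same body as the around advice above. */
  static Integer patched(int value) {
    return Integer.valueOf(value);
  }

  /** Reference comparison on purpose: do two calls hand out the same object? */
  static boolean sameInstance(int value) {
    return patched(value) == patched(value);
  }

  public static void main(String[] args) {
    // Values from -128 to 127 are always served from the cache.
    System.out.println(sameInstance(100)); // prints true
  }
}
```

Outside that range the JDK is free to cache or not, so I would not rely on identity there.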

Compile-time Defect Analysis

Wouldn’t it be great to have a smart compiler? Don’t get me wrong, the Java Compiler is amazing: it moves initializer blocks to constructors, manages string constants and does many other things we take for granted. But one thing it doesn’t do is prevent you from writing buggy code. In Java, it’s very easy to write code that fails miserably at runtime, and it’s very hard to write tools that prevent you from doing so. Sure, you can (and should!) use static analysis tools like SonarQube in your CI build to identify common bug patterns, but wouldn’t it be nice, for example, to have the compiler itself reject code that is liable to throw NullPointerExceptions? Let’s look at techniques that plug into, or entirely replace, the Java compiler.

One word of caution: In my sample code on GitHub, none of these technologies are incorporated into the build cycle in the suggested way. The reason is that I am unit testing the actual compilation process, and you can’t test a compilation from inside. Consult the corresponding documentation for the correct build setup.
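For readers wondering what “unit testing the compilation process” can look like, here is one possible sketch using the JDK’s standard `javax.tools` API. It is not the harness from my sample project; `CompilationProbe` and `StringSource` are names invented for this example, and writing class files to the temp directory is just a shortcut.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import javax.tools.Diagnostic;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

final class CompilationProbe {

  /** In-memory source file, so no .java file has to exist on disk. */
  static final class StringSource extends SimpleJavaFileObject {
    private final String code;

    StringSource(String className, String code) {
      super(URI.create("string:///" + className.replace('.', '/') + Kind.SOURCE.extension),
          Kind.SOURCE);
      this.code = code;
    }

    @Override public CharSequence getCharContent(boolean ignoreEncodingErrors) {
      return code;
    }
  }

  /** Compiles one class and returns the messages of all compile errors. */
  static List<String> compileErrors(String className, String code) {
    JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
    if (compiler == null) {
      throw new IllegalStateException("No system compiler found, this needs a JDK");
    }
    DiagnosticCollector<JavaFileObject> diagnostics = new DiagnosticCollector<>();
    // -d: keep generated class files out of the working directory
    // -proc:none: no annotation processing in this plain compile
    List<String> options =
        Arrays.asList("-d", System.getProperty("java.io.tmpdir"), "-proc:none");
    compiler.getTask(null, null, diagnostics, options, null,
        Collections.singletonList(new StringSource(className, code))).call();

    List<String> errors = new ArrayList<>();
    for (Diagnostic<? extends JavaFileObject> diagnostic : diagnostics.getDiagnostics()) {
      if (diagnostic.getKind() == Diagnostic.Kind.ERROR) {
        errors.add(diagnostic.getMessage(null));
      }
    }
    return errors;
  }
}
```

A test can then assert that a deliberately ill-behaved class produces at least one error and that its well-behaved twin produces none.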

The first compiler technology I am looking at is Google’s ErrorProne, a library that provides a wrapper around the javac compiler, with a similar API. ErrorProne uses an internal javac hook to inspect the AST for known bug patterns, which it then rejects with compile-time errors. It replaces big parts of the compiler, so its libraries need to run on the bootclasspath, not the regular classpath.

Some of the supported bug patterns are useful, but most of them are only really helpful if you use some of Google’s APIs (Guice, Protobuf, Android etc.). ErrorProne does not have an open API for bug patterns; the only way to get new patterns added is to submit a Pull Request (or to fork the project). This leads me to the conclusion that the idea is nice, but the implementation is too specific to be useful outside of Google. That said, one slightly useful example I feature in my talk is the ability to detect bad regex patterns at compile time.

Listing: this class contains a bad regex pattern that causes a compilation error in ErrorProne

public class IllBehavedRegex implements IllBehaved {
    public static List<String> splitByBadPattern(String input) {
        return Arrays.asList(input.split("[a-z")); // ← bad pattern
    }
}
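For comparison, this is what happens with plain javac, which compiles the same pattern without complaint: the mistake only surfaces when the code runs. The helper class below is a made-up illustration.

```java
import java.util.regex.PatternSyntaxException;

final class BadPatternAtRuntime {

  /** Returns true if splitting by the given regex blows up at run time. */
  static boolean failsAtRuntime(String regex) {
    try {
      "some input".split(regex);
      return false;
    } catch (PatternSyntaxException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(failsAtRuntime("[a-z")); // prints true (unclosed character class)
    System.out.println(failsAtRuntime("[a-z]")); // prints false
  }
}
```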

A different, more standard approach is used by the Checker Framework, which uses annotation processing to create defect analysis based on Java 8’s powerful new type annotations. By default, the Checker Framework comes as a downloaded package, which is installed locally in a way that it shadows and replaces javac, which I personally find a bit hacky. You can also use individual processors by either passing them as a compiler flag or by using the ServiceLoader SPI. Here I’m using nullness as an example. The Checker Framework supports pretty fancy semantics in this check, e.g. methods that return null if and only if their parameter is null. Since NullPointerExceptions are the most common RuntimeException, this type of check is extremely valuable.

Listing: illustrating some of the nullness strategies the Checker Framework can validate

public class WellbehavedNullness implements WellBehaved {

    @PolyNull // will return null, if and only if the input is null
    public String nullInNullOut(@PolyNull String input) {
        if (input == null) return input;
        return input.toUpperCase().trim();
    }

    @MonotonicNonNull // starts out as null, but once initialized,
                      // will never again be null
    private String initallyNull;

    public void assignField(@NonNull String value) {
        initallyNull = checkNotNull(value, "Value required");
    }

    @NonNull // will never return null
    public String getValue() {
        if (initallyNull == null) initallyNull = "wasNull";
        return initallyNull;
    }
}
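The Checker Framework’s processors are far more sophisticated than anything that fits here, but the basic plumbing of a standard annotation processor that reports defects as compile errors can be sketched with JDK-only APIs. The processor below is a toy of my own making (`LegacyFieldChecker` is not part of any library): it only looks at field declarations and rejects the types Vector and Hashtable. The `check` helper runs it through the `javax.tools` compiler API against an in-memory source.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.TypeElement;
import javax.lang.model.element.VariableElement;
import javax.lang.model.util.ElementFilter;
import javax.tools.Diagnostic;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

@SupportedAnnotationTypes("*") // run on every class, annotated or not
final class LegacyFieldChecker extends AbstractProcessor {

  private static final Set<String> FORBIDDEN =
      new HashSet<>(Arrays.asList("java.util.Vector", "java.util.Hashtable"));

  @Override public SourceVersion getSupportedSourceVersion() {
    return SourceVersion.latestSupported();
  }

  @Override
  public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
    for (TypeElement type : ElementFilter.typesIn(roundEnv.getRootElements())) {
      for (VariableElement field : ElementFilter.fieldsIn(type.getEnclosedElements())) {
        // erasure strips type arguments: Vector<String> becomes java.util.Vector
        String erased = processingEnv.getTypeUtils().erasure(field.asType()).toString();
        if (FORBIDDEN.contains(erased)) {
          processingEnv.getMessager().printMessage(Diagnostic.Kind.ERROR,
              "Legacy collection type " + erased + " is not allowed", field);
        }
      }
    }
    return false; // we do not claim any annotations
  }

  /** Compiles the given source with this processor and returns all error messages. */
  static List<String> check(String className, String code) {
    JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
    if (compiler == null) {
      throw new IllegalStateException("No system compiler found, this needs a JDK");
    }
    JavaFileObject source = new SimpleJavaFileObject(
        URI.create("string:///" + className + ".java"), JavaFileObject.Kind.SOURCE) {
      @Override public CharSequence getCharContent(boolean ignoreEncodingErrors) {
        return code;
      }
    };
    DiagnosticCollector<JavaFileObject> diagnostics = new DiagnosticCollector<>();
    JavaCompiler.CompilationTask task = compiler.getTask(null, null, diagnostics,
        Arrays.asList("-d", System.getProperty("java.io.tmpdir")), null,
        Collections.singletonList(source));
    task.setProcessors(Collections.singletonList(new LegacyFieldChecker()));
    task.call();

    List<String> errors = new ArrayList<>();
    for (Diagnostic<? extends JavaFileObject> diagnostic : diagnostics.getDiagnostics()) {
      if (diagnostic.getKind() == Diagnostic.Kind.ERROR) {
        errors.add(diagnostic.getMessage(null));
      }
    }
    return errors;
  }
}
```

Note the limitation: a processor that sticks to the public API sees declarations (classes, fields, method signatures), not the statements inside method bodies, so anything flow-sensitive like real nullness analysis needs considerably more machinery than this.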

A different approach is to re-visit AspectJ and use one of its lesser-known features: policy enforcement through custom compiler errors. The usage pattern is to again define pointcuts, and to assign them to custom compiler warnings or errors. While this is a very nice feature, it is unfortunately limited to so-called static pointcuts, i.e. pointcuts which can be statically evaluated at compile time. Static pointcuts can take into account the hierarchy, but they can’t evaluate the control flow, which makes it impossible to detect advanced bug patterns like bad implementations of nullness or immutability. What you can do is enforce very simple usage patterns (e.g. outlaw the use of soft-deprecated classes like Hashtable, Vector or StringBuffer), or enforce architectural decisions in a monolithic architecture (only the service package may access the persistence package, whereas the web frontend package may only access the service package).

Listing: policy enforcement with AspectJ

public aspect FicticiousExamplePatch{
    // forbidden class usage
    pointcut hashTableOrVector() : call(* Hashtable.*(..)) // method call
                                || call(* Vector.*(..))
                                || call(Hashtable.new(..)) // instantiation
                                || call(Vector.new(..));
    declare error : hashTableOrVector() :
    "Hashtable and Vector are deprecated classes, don't use them!";
    
    // forbidden package access
    pointcut inFrontendPackage()     : within(org..frontend..*);
    pointcut persistenceCall()       : call(* org..persistence..*.*(..));
    
    declare error : inFrontendPackage() && persistenceCall() :
    "The persistence package may not be accessed from the frontend package";
}

This concludes our little hacking session. I hope I’ve made some of you curious, and I’m definitely open to more suggestions (or to pull requests improving my sample code).

 

Meta Level Code on Code – Let’s Do Some Java Hacking

About The Author
- Java / Scala developer with ~20 years of experience, working as a Search Engineer for Zalando SE. Austrian / American living in Berlin, martial arts geek, whiskey lover, married with children
