Random notes on software, programming and languages.
By Adrian Kuhn

Archive for the ‘Java’ Category

Dependency Injection vs. Virtual Classes


Wednesday, December 30th, 2009

Every time you create an instance of a class by using the new keyword, you tightly couple your code to that class, and you will not be able to substitute that particular implementation with a different one (at least not without recompiling the code).

This quote, taken from an answer on Stack Overflow, motivates dependency injection. Dependency injection, however, is not the real solution; it is rather a workaround for missing virtual classes. In this post I will show you why.

For example in

public class MyClass {
  public String sendMessage(String contents) {
    return new MailService().sendMessage(contents);
  }
}

you cannot use a different mail service. This might not seem limiting at first sight, but think for example of testing. Your tests might be better off using a stub service that does not send actual emails.

This can be, and is being, solved with dependency injection, yes. But the actual problem is rather a language problem: classes are not virtual. While methods are virtual and thus looked up at runtime, classes are not.
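For comparison, here is a minimal sketch of the dependency injection workaround in its simplest form, constructor injection. All names (MessageService, StubMessageService, InjectedClient) are made up for illustration and no particular framework is implied.

```java
// Hypothetical names throughout; a sketch of constructor injection, not a specific framework.
interface MessageService {
    String sendMessage(String contents);
}

class StubMessageService implements MessageService {
    // Records the message instead of sending an actual email.
    String lastMessage;

    public String sendMessage(String contents) {
        lastMessage = contents;
        return "stubbed: " + contents;
    }
}

class InjectedClient {
    private final MessageService service;

    // The collaborator is passed in rather than created with `new` inside the class.
    InjectedClient(MessageService service) {
        this.service = service;
    }

    String sendMessage(String contents) {
        return service.sendMessage(contents);
    }
}
```

A test would then construct the client with new InjectedClient(new StubMessageService()), while production code passes in a real implementation.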

In the following example, we can change the mail service by subclassing and overriding a method. This works because methods are virtual and thus late bound.

public class MyClass {
  public String sendMessage(String contents) {
    return newMailService().sendMessage(contents);
  }
  public MailService newMailService() {
    return new MailService();
  }
}

What a difference a single whitespace can make!

When newMailService is called, that method name is looked up in the context of the calling method. The context of the calling method is its class. Therefore, by subclassing the containing class we can provide a different mail service.
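To make the late binding concrete, here is a small sketch of such a subclass. MailService is a stand-in for the class used in the post, and StubMailService and TestableMyClass are hypothetical names.

```java
// Sketch only: MailService is a stand-in for the post's class, and
// StubMailService / TestableMyClass are hypothetical names.
class MailService {
    String sendMessage(String contents) {
        return "sent: " + contents; // imagine a real email going out here
    }
}

class StubMailService extends MailService {
    @Override
    String sendMessage(String contents) {
        return "stub: " + contents; // nothing is actually sent
    }
}

class MyClass {
    String sendMessage(String contents) {
        return newMailService().sendMessage(contents);
    }

    // Virtual, hence late bound: subclasses may return a different service.
    MailService newMailService() {
        return new MailService();
    }
}

class TestableMyClass extends MyClass {
    @Override
    MailService newMailService() {
        return new StubMailService();
    }
}
```

Calling sendMessage on a TestableMyClass goes through the overridden newMailService, so the stub is used without touching the code of MyClass.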

Imagine the same happened with class names. Whenever we call new MailService(), the class name would be looked up in the containing context. Imagine further that the lookup did not stop at the class but bubbled up to the containing package. Then, by “subclassing” that package we might provide a different mail service.

Subclassing a package is nonsense terminology; let’s rather say that we instantiate a package … and let’s further assume that packages and classes are the same, and that our system is built of nested classes.

public class MyApplication {
  public class MyClass {
    public String sendMessage(String contents) {
      return new "MailService"().sendMessage(contents);
    }
  }
}

I have put the class name MailService in quotes to remind you that it is the name of a virtual class and thus looked up in the containing context.

Imagine further that we could pass the binding of classes to a class when creating it. Then we could instantiate two versions of our application: one for real, and one for testing.

MyApplication forTesting = new MyApplication(MailService = TestMailer);
MyApplication forReal = new MyApplication(MailService = InterwebMailer);

Testing and production are not the only possible contexts. Imagine for example an XML parser. Typically it creates DOM nodes, but maybe you would prefer your own implementation of nodes. Maybe even the DAO objects of your application? There we go:

XMLParser custom = new XMLParser(Node = Application.DAO);
XMLDocument doc = custom.new XMLDocument("example.xml");

NB, the second line is actually valid Java syntax: this is how you create an instance of an inner class from outside its outer class, by naming the enclosing instance explicitly.
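For readers who have never seen that syntax, here is a small sketch in today's Java. Parser and Document are hypothetical classes; the only point is the outer.new Inner(...) expression.

```java
// Hypothetical classes; only the `outer.new Inner(...)` syntax is the point.
class Parser {
    final String nodeKind;

    Parser(String nodeKind) {
        this.nodeKind = nodeKind;
    }

    // A (non-static) inner class: every instance is tied to an enclosing Parser.
    class Document {
        final String file;

        Document(String file) {
            this.file = file;
        }

        String describe() {
            // The inner instance can see the fields of its enclosing instance.
            return file + " parsed into " + nodeKind + " nodes";
        }
    }
}

class InnerClassDemo {
    public static void main(String[] args) {
        Parser custom = new Parser("DAO");
        Parser.Document doc = custom.new Document("example.xml");
        System.out.println(doc.describe());
    }
}
```

Note that the binding here is an ordinary field, not a virtual class; plain Java can only approximate the idea.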

— So this is how dependency injection should really work!

And it is not even that far from reality. Both Beta and Newspeak (the new language by Gilad Bracha) have virtual classes and nested classes, and what I just described is how you decouple your packages in both languages.

And, hup hup hup, another design pattern has been unmasked as a missing language feature :)

To Clone or not to Clone


Wednesday, August 26th, 2009

If two or more JExample tests depend on the same return value, something must be done to prevent side-effects. The default policy is to clone the cached return value before injection. More policies are available through the @Injection annotation.

Let’s consider an example with one producer and two consumers:

@RunWith(JExample.class)
@Injection(InjectionPolicy.NONE) // disables cloning, don't do this at home!
public class StackTest {

    @Test
    public Stack emptyStack() {
        return new Stack();
    }

    @Given("#emptyStack")
    public void shouldPushFoo(Stack stack) {
        stack.push("foo");
        assertEquals(1, stack.size());
    }

    @Given("#emptyStack")
    public void shouldPushBar(Stack stack) {
        stack.push("bar");
        assertEquals(1, stack.size());
    }

}

If we run this test class, either of the consumers #shouldPushFoo or #shouldPushBar (depending on the order of execution on your machine) will fail with “expected:<1> but was <2>” as the error message.

Why?

The test framework runs the producer #emptyStack once and caches its return value. When the first consumer is about to run, it is called with the cached return value as parameter. The consumer then modifies the provided value in order to execute its test. However, if no special measures are taken, this modification will be visible to the second consumer, which is later called with the same cached return value!
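The effect can be reproduced without any test framework. The following sketch uses plain java.util.Stack and only mimics "run the producer once, hand the cached value to every consumer"; the class and method names are made up.

```java
import java.util.Stack;

// Plain Java, no JExample involved: this only mimics "run the producer once,
// hand the cached value to every consumer".
class TaintedInjectionDemo {

    // Consumer: pushes one element and reports the resulting size.
    static int pushAndGetSize(Stack<String> stack, String element) {
        stack.push(element);
        return stack.size();
    }

    // Both consumers receive the very same cached object.
    static int[] withoutCloning() {
        Stack<String> cached = new Stack<String>();
        return new int[] { pushAndGetSize(cached, "foo"), pushAndGetSize(cached, "bar") };
    }

    // Each consumer receives its own clone of the cached object.
    @SuppressWarnings("unchecked")
    static int[] withCloning() {
        Stack<String> cached = new Stack<String>();
        int first = pushAndGetSize((Stack<String>) cached.clone(), "foo");
        int second = pushAndGetSize((Stack<String>) cached.clone(), "bar");
        return new int[] { first, second };
    }
}
```

Without cloning the second consumer sees a stack of size 2; with cloning both consumers see size 1.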

JExample offers three strategies to avoid tainted injection values:

  • InjectionPolicy.CLONE uses the #clone method to clone all provided return values. If cloning fails for a value (typically because the value does not implement Cloneable), its provider is rerun to obtain an untainted value.

  • InjectionPolicy.DEEPCOPY uses internal reflection to create a field-by-field copy of all provided injection values and all objects reachable through these values (i.e., a deep copy). This strategy does not call the #clone method and thus even copies values that don’t implement Cloneable. Use with caution: this strategy might cause the VM to die without further notice, since some objects are just not meant to be copied in that radical way.

  • InjectionPolicy.RERUN reruns all providers whenever injection values are required. This strategy is the most stable; however, you lose the benefit of faster test execution.

By default, JExample uses the CLONE policy.

Lookup of injection policies is resolved as follows: first JExample looks at the consumer method, then at the current class and all its superclasses, then at the current package and the package of all superclasses, and eventually it defaults to InjectionPolicy.DEFAULT (which, by default, defaults to InjectionPolicy.CLONE).
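The lookup order described above can be sketched with ordinary reflection. SketchPolicy and @SketchInjection are hypothetical stand-ins, not JExample's real types, and the actual implementation may well differ.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;

// Hypothetical stand-ins, not JExample's real types.
enum SketchPolicy { CLONE, DEEPCOPY, RERUN, NONE }

@Retention(RetentionPolicy.RUNTIME)
@interface SketchInjection {
    SketchPolicy value();
}

class PolicyLookup {
    static final SketchPolicy DEFAULT = SketchPolicy.CLONE;

    static SketchPolicy resolve(Method consumer) {
        // 1. the consumer method itself
        SketchInjection found = consumer.getAnnotation(SketchInjection.class);
        if (found != null) return found.value();
        // 2. the declaring class and all its superclasses
        for (Class<?> c = consumer.getDeclaringClass(); c != null; c = c.getSuperclass()) {
            found = c.getAnnotation(SketchInjection.class);
            if (found != null) return found.value();
        }
        // 3. the package of the class and of all its superclasses
        for (Class<?> c = consumer.getDeclaringClass(); c != null; c = c.getSuperclass()) {
            Package p = c.getPackage();
            found = (p == null) ? null : p.getAnnotation(SketchInjection.class);
            if (found != null) return found.value();
        }
        // 4. nothing found: fall back to the default
        return DEFAULT;
    }

    // Convenience for the usage example: looks up a no-argument method by name.
    static SketchPolicy resolve(Class<?> type, String name) {
        try {
            return resolve(type.getDeclaredMethod(name));
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }
}

@SketchInjection(SketchPolicy.RERUN)
class AnnotatedBase {
}

class AnnotatedChild extends AnnotatedBase {
    @SketchInjection(SketchPolicy.NONE)
    public void ownPolicy() {}

    public void classPolicy() {}
}

class Unannotated {
    public void plain() {}
}
```

Here ownPolicy resolves to NONE (method level), classPolicy to RERUN (found on the superclass), and plain to the default CLONE.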

Injection policies are available since JExample revision 374, released in August 2009. You can obtain the latest version either as plain JAR file or as Eclipse library (update site).

Check printf at compile-time


Monday, March 16th, 2009

Since the introduction of #printf in Java 1.5, I have always wished to check my format strings and their arguments at compile-time.

Now, it can be done.

The following compiler plugin checks the format string and format arguments of any call to #printf and #format. Of course, checking is limited to format strings known at compile time.

Just put printf.jar in your classpath (works with javac only!)

javac -classpath printf.jar Example.java

Since there are numerous #printf and #format methods in the Java API, any method with one of said names and with parameters of type String and Object[] is checked. (A cleaner solution would use a @Printf annotation to mark these methods. Alas, I cannot touch the Java API just like that.)
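To illustrate the kind of mismatch such a check is after, here is a runtime sketch that simply asks java.util.Formatter (via String.format) whether a format string accepts the given arguments. It is not how the plugin works internally, since the plugin inspects the AST at compile time; FormatCheck is a made-up name.

```java
import java.util.IllegalFormatException;

// Runtime illustration only; the plugin performs a comparable check at compile time.
class FormatCheck {
    // Returns null if the format string accepts the arguments,
    // otherwise the simple name of the exception the formatter raised.
    static String problem(String format, Object... args) {
        try {
            String.format(format, args);
            return null;
        } catch (IllegalFormatException e) {
            return e.getClass().getSimpleName();
        }
    }
}
```

For example, problem("%d items", "three") reports an IllegalFormatConversionException and problem("%s and %s", "one") a MissingFormatArgumentException. Surplus arguments are silently ignored by the formatter, so this sketch does not catch them.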

The above plugin works with javac only. Why? Annotation processing is done before type attribution and thus type information is not available to a JSR 269 plugin. However, checking the type of format arguments cannot be done without type information. Thus, what I do is to manually invoke the compiler’s type attribution phase. This is done using the internal Attr component as follows.

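import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.ProcessingEnvironment;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.element.Element;
import javax.lang.model.element.ElementKind;
import javax.lang.model.element.TypeElement;
import com.sun.tools.javac.code.Symbol.ClassSymbol;
import com.sun.tools.javac.comp.Attr;
import com.sun.tools.javac.processing.JavacProcessingEnvironment;

@SupportedAnnotationTypes("*")
public class Printf extends AbstractProcessor {

    private Attr attr; // javac's internal type attribution phase

    @Override
    public synchronized void init(ProcessingEnvironment processingEnv) {
        attr = Attr.instance(((JavacProcessingEnvironment) processingEnv).getContext());
        ...
    }

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        for (Element each: roundEnv.getRootElements()) {
            if (each.getKind() == ElementKind.CLASS) {
                attr.attribClass(null, (ClassSymbol) each); // attribute types manually
                ...
            }
        }
        return false; // do not claim any annotations
    }

    ...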

PS, type information is discarded after each annotation processor round. It is thus safe to apply type-manipulating changes to the AST from within an annotation processor, since after the plugins have been run, type information is attributed again from scratch.

For the full sources, please refer to /Sources/b/Printf on the SCG repository.

Have fun!

Create Annotation Instance with Defaults


Thursday, March 5th, 2009

Working often with annotations, I sometimes feel the need to create an annotation instance that is filled with default values. Annotations are interfaces, thus we cannot create instances directly. Using dynamic proxies, however, it can be done as simply as follows:

Ann ann = Defaults.of(Ann.class);

Just copy and paste the following class into your project to make this work:

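import java.lang.annotation.Annotation;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

class Defaults implements InvocationHandler {
    @SuppressWarnings("unchecked")
    public static <A extends Annotation> A of(Class<A> annotation) {
        return (A) Proxy.newProxyInstance(annotation.getClassLoader(), new Class[] {annotation}, new Defaults());
    }
    // Every annotation element simply answers its declared default value.
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        return method.getDefaultValue();
    }
}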

This creates a dynamic proxy that implements a given annotation and returns the default value for each invoked method.

A more complete implementation should further distinguish between methods that represent actual annotation values and methods inherited from Annotation and Object. The latter need to be handled separately, since they have no default value. But how do they say? Left to the reader as an exercise :)
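One possible way to approach that exercise is sketched below. FullDefaults and SampleAnn are hypothetical names, and the handling of equals, hashCode and toString is deliberately simplistic: it uses identity rather than the value-based contract that java.lang.annotation.Annotation specifies. Elements without a default still come back as null.

```java
import java.lang.annotation.Annotation;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical sample annotation for the usage example.
@interface SampleAnn {
    String name() default "anonymous";
    int retries() default 3;
}

// Sketch: identity-based equals/hashCode, which is simpler than the Annotation contract.
class FullDefaults implements InvocationHandler {
    private final Class<? extends Annotation> type;

    private FullDefaults(Class<? extends Annotation> type) {
        this.type = type;
    }

    static <A extends Annotation> A of(Class<A> annotation) {
        return annotation.cast(Proxy.newProxyInstance(
                annotation.getClassLoader(),
                new Class<?>[] { annotation },
                new FullDefaults(annotation)));
    }

    public Object invoke(Object proxy, Method method, Object[] args) {
        // Methods declared by the annotation itself are its elements.
        if (method.getDeclaringClass() == type) return method.getDefaultValue();
        // Everything else comes from Annotation or Object.
        String name = method.getName();
        if (name.equals("annotationType")) return type;
        if (name.equals("toString")) return "@" + type.getName() + "(defaults)";
        if (name.equals("hashCode")) return System.identityHashCode(proxy);
        if (name.equals("equals")) return proxy == args[0];
        throw new UnsupportedOperationException(name);
    }
}
```

With this in place, FullDefaults.of(SampleAnn.class).annotationType() answers SampleAnn.class instead of null.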

Have fun!

Foreach in 1 Billion with Index


Tuesday, December 9th, 2008

The previous post presented a DSL around Java’s foreach loop but did not fix perhaps its most apparent limitation: the inability to access the index of a running loop.

The following solution is taken from Python’s enumerate function.

for (Each<String> word: withIndex(lorem)) {
    …
    word.index;
    word.element;
    …
}

This works the same way as the remaining ForEach DSL. We hook a custom Iterable into the loop and iterate over a series of wrappers rather than the original elements. Each wrapper has two fields, one containing the original element and one containing its index. Given a collection with 1 billion elements we would thus create 1 billion …
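A minimal sketch of such a custom Iterable, creating one wrapper per element, might look as follows. It is a reconstruction for illustration; the class name Indexed is made up and the actual DSL sources may differ.

```java
import java.util.Iterator;

// Hypothetical reconstruction; the actual ForEach DSL sources may differ.
class Each<T> {
    final int index;
    final T element;

    Each(int index, T element) {
        this.index = index;
        this.element = element;
    }
}

class Indexed {
    static <T> Iterable<Each<T>> withIndex(final Iterable<T> elements) {
        return new Iterable<Each<T>>() {
            public Iterator<Each<T>> iterator() {
                final Iterator<T> inner = elements.iterator();
                return new Iterator<Each<T>>() {
                    private int index = 0;

                    public boolean hasNext() {
                        return inner.hasNext();
                    }

                    // A fresh wrapper for every element of the underlying collection.
                    public Each<T> next() {
                        return new Each<T>(index++, inner.next());
                    }

                    public void remove() {
                        throw new UnsupportedOperationException();
                    }
                };
            }
        };
    }
}
```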

Err … creating ONE billion objects? This is Madness!

Let’s fix that.

While looping, the wrappers are used sequentially only. One after the other. Why not reuse a single wrapper instance for all elements? And thus avoid creating another 999′999′999 wrappers? Clearly a difference that matters, I thought. But a couple of benchmarks later I learned that creating a wrapper for each element is not significantly slower.
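The single-wrapper variant could be sketched like this. ReusableEach and IndexedReusing are hypothetical names; the obvious caveat is that the loop body must not keep a reference to the wrapper beyond the current iteration, since its fields are overwritten on every step.

```java
import java.util.Iterator;

// Hypothetical sketch: one mutable wrapper shared by all iterations.
class ReusableEach<T> {
    int index;
    T element;
}

class IndexedReusing {
    static <T> Iterable<ReusableEach<T>> withIndex(final Iterable<T> elements) {
        return new Iterable<ReusableEach<T>>() {
            public Iterator<ReusableEach<T>> iterator() {
                final Iterator<T> inner = elements.iterator();
                final ReusableEach<T> each = new ReusableEach<T>();
                each.index = -1; // incremented to 0 on the first call to next()
                return new Iterator<ReusableEach<T>>() {
                    public boolean hasNext() {
                        return inner.hasNext();
                    }

                    // Overwrites the fields of the single wrapper and hands it out again.
                    public ReusableEach<T> next() {
                        each.element = inner.next();
                        each.index++;
                        return each;
                    }

                    public void remove() {
                        throw new UnsupportedOperationException();
                    }
                };
            }
        };
    }
}
```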

Apparently the JVM knows how to optimize its loops. This is a nice example of how premature optimization is rendered futile by today’s virtual machine technology.

Actually, I wasn’t that surprised by these results. I already expected both approaches to be close, but still, it was impressive to see that the difference between both approaches is not statistically relevant. At least not with my primitive benchmarking capabilities. It is hard to write benchmarks that are neither too trivial nor shoot themselves in the foot. Given all the optimizations going on under Java’s hood, I won’t claim that mine avoid both pitfalls.

 

Updated sources of the ForEach DSL are available on SCG subversion.

Pimp my Foreach Loop


Thursday, December 4th, 2008

Java’s foreach loop is useful but limited. Neither can we access the iterator of a running loop, nor can we write Ruby-style foreach loops (a style of looping that can, in fact, be traced back to the ancient times of legendary Smalltalk), such as

names = words.select { |each| each.first.upcase? }

bool = words.any? { |each| each.size > 7 }

length = words.inject(0) { |sum,each| sum + each.size }

We all know this is because Java is – even though in its 6th release – still without the most basic structure of a programming language: Clo-clo-closures! (And rumours are, they won’t fix that for the 7th release either.)

There are workarounds. Typically, anonymous inner classes are (ab)used as closures. I don’t like that. All the syntactic clutter and implementation overhead of using full-blown classes, only to inject some simple behavior into a loop? IMHO, not worth it.
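For reference, this is roughly what the inner-class workaround looks like. Predicate and select are hypothetical helpers written for this sketch, not part of any library mentioned in this post.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical Predicate and select; shown only to illustrate the clutter.
interface Predicate<T> {
    boolean accept(T each);
}

class InnerClassClosures {
    static <T> List<T> select(Iterable<T> elements, Predicate<T> predicate) {
        List<T> result = new ArrayList<T>();
        for (T each : elements) {
            if (predicate.accept(each)) result.add(each);
        }
        return result;
    }

    static List<String> longWords(List<String> words) {
        // Five lines of ceremony around a one-line condition.
        return select(words, new Predicate<String>() {
            public boolean accept(String each) {
                return each.length() > 7;
            }
        });
    }
}
```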

Instead, here is a small DSL that pimps ye old Java foreach loop for you:

for (Select<String> each: select(words)) {
  each.yield = Character.isUpperCase(each.value.charAt(0));
}
Collection<String> names = $result();

for (AnySatisfy<String> each: anySatisfy(words)) {
  each.yield = each.value.length() > 7;
}
boolean bool = $result();

for (Inject<Integer,String> each: inject(0,words)) {
  each.yield += each.value.length();
}
int length = $result();

How does it work?

Behind the scenes, Java’s foreach loop operates on an instance of Iterable. Thus we can hook a custom object into the loop and get called back upon starting and terminating the loop, as well as before and after each iteration.

In the first example, #select is a static method that wraps the given collection in a custom object. The custom object is of type Iterable<Select>, where Select has two fields. One field is used as an input parameter, the other as an output parameter. Before each iteration of the loop, value is populated with the current element. After each iteration, yield is polled for a boolean value. In between, the loop body is executed.

While running the loop, all elements for which the loop yields true are copied into a new collection. Upon terminating the loop, this collection is assigned to $result. To keep things thread-safe, the result is stored in a ThreadLocal variable.

The same technique is used in the two other examples. #anySatisfy checks if any iteration of the loop yields true. #inject combines all elements by injecting an accumulator value into each iteration of the loop.
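The mechanism can be condensed into one class for illustration. The sketch below is a reconstruction, not the DSL's actual sources: SelectSketch is a made-up name, and the output field is called keep rather than yield, because yield became a restricted identifier in later Java versions.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;

// Hypothetical reconstruction of the mechanism, not the DSL's actual sources.
class SelectSketch<T> implements Iterable<SelectSketch<T>> {
    private static final ThreadLocal<Object> RESULT = new ThreadLocal<Object>();

    public T value;       // input: the current element
    public boolean keep;  // output: set by the loop body

    private final Iterable<T> source;

    private SelectSketch(Iterable<T> source) {
        this.source = source;
    }

    static <T> SelectSketch<T> select(Iterable<T> source) {
        return new SelectSketch<T>(source);
    }

    @SuppressWarnings("unchecked")
    static <R> R $result() {
        return (R) RESULT.get();
    }

    public Iterator<SelectSketch<T>> iterator() {
        final Iterator<T> inner = source.iterator();
        final Collection<T> selected = new ArrayList<T>();
        return new Iterator<SelectSketch<T>>() {
            private boolean started = false;

            public boolean hasNext() {
                // The foreach loop calls this once before each iteration and once
                // after the last one, so the previous verdict is collected here.
                if (started && keep) selected.add(value);
                if (inner.hasNext()) return true;
                RESULT.set(selected); // loop terminated: publish the result
                return false;
            }

            public SelectSketch<T> next() {
                started = true;
                value = inner.next();
                keep = false;
                return SelectSketch.this;
            }

            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }
}
```

One limitation of this sketch: if the loop is left early with break, hasNext is never called again and no result is published.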

The list of currently supported queries includes

  • AllSatisfy
  • AnySatisfy
  • Cardinality
  • Collect
  • Count
  • CutPieces
  • Detect
  • Fold
  • GroupedBy
  • IndexOf
  • Inject
  • Reject
  • Select

If you need more, you can subclass For<Each,This<Each>> and implement your own query. As an example, I shall leave the implementation of Count here:

public class Count<Each> extends For<Each,Count<Each>> {
  public Each value;
  public boolean yield;
  private int count;
  protected void afterEach() {
    if (yield) count++;
  }
  protected Object afterLoop() {
    return count;
  }
  protected void beforeLoop() {
    count = 0;
  }
  protected void beforeEach(Each element) {
    value = element;
    yield = false;
  }
}

As usual, the complete sources are available on SCG subversion.

Have fun!

Roman Numerals, in your Java


Wednesday, November 5th, 2008

Today’s hack is based on Lukas’s recent work on DSLs. Imagine you could compile and run a Java program that uses Roman numerals.

public class Example {
    public static void main(String[] args) {
        System.out.println(
            MCMLXXVII + XXIV
        );
    }
}

It can be done—put this 7.5 kb jar in your compiler’s classpath!

% javac -cp roman.jar Example.java
Note: 2 roman numerals processed.
% java Example
2001
%

Obviously, this jar extends the syntax of Java … close, but not quite. The source code that contains the Roman numerals is already valid Java syntax. It is valid Java with undeclared variables! If you compile it without our small jar, the compiler will tell you that the Roman numerals are undeclared variable names. Thus, our small jar hooks into the compiler, bending JSR 269 beyond its limits, and replaces all Roman variable names with integer literals.

Caveat, requires Java 6.

Below is the rewrite rule: class Transform extends TreeTranslator, an internal class of the Java compiler. Transform visits all statements in the source code and replaces each variable whose name matches a Roman numeral with an int literal of the same numeric value.

public class Transform extends TreeTranslator {
    @Override
    public void visitIdent(JCIdent tree) {
        String name = tree.getName().toString();
        if (isRoman(name)) {
            result = make.Literal(numberize(name));
            result.pos = tree.pos;
        } else {
            super.visitIdent(tree);
        }
    }
}
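The helpers isRoman and numberize are not shown above. A plausible sketch, assuming standard subtractive notation up to 3999, could look like this; RomanNumerals is a hypothetical name and the original implementation may differ.

```java
// Hypothetical helpers; the post does not show the originals.
class RomanNumerals {
    private static final String SYMBOLS = "IVXLCDM";
    private static final int[] VALUES = { 1, 5, 10, 50, 100, 500, 1000 };

    // Accepts standard subtractive notation up to 3999; rejects the empty string.
    static boolean isRoman(String name) {
        return name.length() > 0
            && name.matches("M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})");
    }

    // Assumes isRoman(roman) holds. A symbol that is smaller than its
    // right neighbour is subtracted, every other symbol is added.
    static int numberize(String roman) {
        int total = 0;
        for (int i = 0; i < roman.length(); i++) {
            int value = VALUES[SYMBOLS.indexOf(roman.charAt(i))];
            boolean smallerThanNext = i + 1 < roman.length()
                && value < VALUES[SYMBOLS.indexOf(roman.charAt(i + 1))];
            total += smallerThanNext ? -value : value;
        }
        return total;
    }
}
```

With these definitions, MCMLXXVII evaluates to 1977 and XXIV to 24, matching the sum of 2001 printed in the example above.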

Our small jar implements an internal DSL, but an unusual one. We reuse the host language’s original syntax in a creative way that allows us to express new semantics. Lukas refers to this kind of language extension as “pidgin”, since human pidgin languages play in a similar way with natural language.

For the technically inclined, the complete source code is available on SCG subversion.

 
