Java Language Freakshow

Apr 29, 01:12 PM

In the course of developing and maintaining a tool[1] that has to parse Java source, we are sometimes confronted with Java constructions that are, well, more than a little freaky. When we encounter such a beastie, we carefully extract it from its surrounding source and (in full biohazard gear) herd it into the Cenqua Vault for Java Nasties[2].

Here, for your voyeuristic pleasure, I’m opening up the vault and giving you a sampling of the curios we’ve collected over the years.

Fun with Unicode

First stop on the tour is the Unicode ward. Don’t tap on the glass. What would you expect as the result of executing this snippet:

java:
  if (Character.isJavaIdentifierPart((char) 0)) {
      System.out.println("Go figure");
  }

Go figure. Unicode character U+0000 is a valid Java identifier part, so it’s perfectly legal to have the character \u0000 inside a Java identifier. How can this be so? It turns out that U+0000 is preserved in Unicode for backward compatibility with the ISO-646/ASCII standard, where 0x00 is the control character NULL, designed as a “filler”:

A control character used to accomplish media-fill or time-fill. Null characters may be inserted into or removed from a stream of data without affecting the information content of that stream[3].

So it’s kind of like a NOP. Except that it’s not, because if you try this, you’ll get a compilation error:

java:
  String hello = "hello";
  System.out.println(hell\u0000o);

But you can quite legally have something like this:

java:
  String hell\u0000 = "hell, no!";

Why you would want to have something like this I’m not sure. People do. Go figure. Between you and me, it’s also legal to write

java:
  String hell\uuuuuuu0000 = "please, stop!";

but that is another story.
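
If you’d like to poke at this yourself, here is a small sketch (the class and method names are mine, purely for illustration). It asks the Character API how it classifies U+0000, and shows that an escape with several u’s still decodes to a single character. It deliberately sticks to runtime checks and a string literal, since I wouldn’t assume every compiler treats ignorable characters inside identifiers the same way.

```java
public class NulIdentifierProbe {

    // Summarises how the Character API classifies the given char.
    static String classify(char c) {
        return "start=" + Character.isJavaIdentifierStart(c)
             + " part=" + Character.isJavaIdentifierPart(c)
             + " ignorable=" + Character.isIdentifierIgnorable(c);
    }

    // A Unicode escape may carry any number of 'u' characters;
    // the literal below is the escape for the letter A written with four of them.
    static String manyU() {
        return "\uuuu0041";
    }

    public static void main(String[] args) {
        System.out.println("NUL: " + classify((char) 0));
        System.out.println("'a': " + classify('a'));
        System.out.println("escape decodes to: " + manyU());
    }
}
```

The NUL character reports itself as an identifier part but not an identifier start, and as "ignorable", which is the API’s own name for this family of filler characters.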

Finally a way to break things

Next stop is the ward of disrupted transfer of control. We won’t stay long here; things can get messy quickly. Have a quick look at this next example and ask yourself what its output is:

java:
  int count = 0;
  do {
      try {
          try {
              break;
          }
          finally {
              throw new Exception();
          }
      } catch (Exception e) {}
      count++;
  } while (count < 10);
  System.out.println(count);

The correct answer is “10”. As you’ve probably worked out, the break attempts to transfer control to the enclosing loop, but the finally clause kicks in and throws an Exception, which is caught inside the loop and discarded. So the break is effectively snuffed.
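
The same rule is easier to see in isolation. Here is a minimal sketch with made-up names: whenever a finally block completes abruptly (by throw, return, break or continue), whatever the try block was attempting gets discarded.

```java
public class FinallyWins {

    // The return in the try block is discarded because finally also returns.
    static int returnOverride() {
        try {
            return 1;
        } finally {
            return 2;
        }
    }

    // Each break is replaced by the continue in finally,
    // so the loop only ends when its condition fails.
    static int loopIterations() {
        int count = 0;
        while (count < 3) {
            count++;
            try {
                break;
            } finally {
                continue;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(returnOverride());
        System.out.println(loopIterations());
    }
}
```

Compilers tend to warn that such a finally clause cannot complete normally, which is a fair hint that this belongs in a vault rather than in production code.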

Label mania

Labels are a curious beast in Java. For me they’ve always been on the fringe of the language, and I’m wary of code that uses them heavily, if only because I have to cognitively downshift in order to follow the flow of control.

Labels have their own namespace, separate from other identifiers. So it’s legal to write

java:
  Object: break Object;

which, apart from being pure code poetry (thanks Matt), doesn’t actually achieve much. A label is also scoped only to the statement it labels, so the same name can be reused on consecutive statements, and it’s also legal to write

java:
  String: { String string = "String"; }
  String: break String;
  String: while (true) break String;

which again doesn’t achieve much, besides perhaps a headache.
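
For something you can actually run, here is a hedged sketch (class, method and variable names are invented for the example). The first method uses a label that shares its name with a parameter, and the second reuses the label String on two consecutive blocks while still using String as a type inside one of them.

```java
public class LabelDemo {

    // The label "grid" and the parameter "grid" live in separate namespaces.
    static int firstRowContaining(int[][] grid, int target) {
        int found = -1;
        grid:
        for (int r = 0; r < grid.length; r++) {
            for (int c = 0; c < grid[r].length; c++) {
                if (grid[r][c] == target) {
                    found = r;
                    break grid;   // leaves both loops at once
                }
            }
        }
        return found;
    }

    // A label is only in scope inside the statement it labels,
    // so the same label can be used again on the next statement.
    static String reuse() {
        StringBuilder out = new StringBuilder();
        String: {
            out.append("a");
            break String;
        }
        String: {
            String s = "b";       // the type name is unaffected by the label
            out.append(s);
            break String;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(firstRowContaining(new int[][] {{1, 2}, {3, 4}}, 3));
        System.out.println(reuse());
    }
}
```

Breaking out of nested loops, as in the first method, is about the only use of labels I can read without downshifting.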

Throwing what?

Next up is this code snippet. Do you think it compiles? If so, what do you think it prints?

java:
  try {
      throw null;
  }
  catch (Throwable t) {
      System.out.println(t.getClass().getName());
  }
  try {
      throw (Throwable) new Object();
  }
  catch (Throwable t) {
      System.out.println(t.getClass().getName());
  }

I’ll leave this one as an exercise for the reader. If you want a hint, try the spec.

Constant (head) case

Last stop on the tour is the ward for constant expressions. A constant expression is, roughly speaking, one that can be evaluated at compile-time. Turns out, you can use constant expressions in case labels. Yikes:

java:
  final int a = 1;
  final int b = 2;
  final int c = a + b;
  switch (foo) {
      case c - b == a ? 1 : 0:
          System.out.println("head case");
          break;
  }
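
The snippet above leaves foo undeclared, so here is a self-contained sketch of the same idea. The class name, the method and the extra case are my own additions for illustration; the point is only that each case label is folded to a plain int at compile-time.

```java
public class ConstantCase {
    static final int A = 1;
    static final int B = 2;
    static final int C = A + B;   // constant variable, value 3

    static String describe(int foo) {
        switch (foo) {
            case C - B == A ? 1 : 0:   // folds to: case 1
                return "head case";
            case C * 2:                // folds to: case 6
                return "six";
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(1));
        System.out.println(describe(6));
        System.out.println(describe(0));
    }
}
```

If two labels happened to fold to the same value, the compiler would reject the switch for having a duplicate case, which is one more way to confuse a reviewer.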

Okay, I think that is enough of the Cenqua Vault of Java Nasties for today. I hope you weren’t too scarred by the experience.

If nothing else you can impress your friends by checking one in to your favourite open source project, or dazzle your workmates by slipping one into your next peer code review ;-)

[1] Clover

[2] The Cenqua Vault for Java Nasties is a part of our unit test suite :-)

[3] From the ASCII C0 control codes. I would have quoted ISO-646 directly, but it appears you have to pay for a copy of the standard. The idea of a standards organisation making you pay for an electronic version of a standard makes absolutely no sense to me.

Comments:

  1. Keep it away from me! It is infectious! I hope you have sterilized them.

    BTW: Do people actually use this type of code? And is it still working?
    Angsuman Chakraborty, Apr 30, 02:52 AM
  2. Yup, labels are pretty funky; PMD’s data flow analysis code still locks up on some cases.

    Hey, don’t forget about the weird inner class scoping stuff like

    class Foo {
        class Inner {
            void bar() {
                Foo.this.buz();
            }
        }
        void buz() {}
    }

    Blah.
    Tom Copeland, Apr 30, 08:37 AM
  3. Great post! Keep up the good work.

    I just have to say that FishEye is an awesome product. It really saved my butt last year when I was using the beta version with CVS.
    ken liu, Apr 30, 08:52 AM
