Sleep Interrupted

I’m currently using optimistic locking to implement a bulk-upload use-case with low concurrency requirements. The uploaded “numbers” simply need to be matched, one-to-one, with any unused “aliases” already present in the database. The optimistic locking is in the form of a “version” field on the “alias” that gets incremented when the “alias” is assigned to a “number”. The easiest way that springs to mind to handle the potential optimistic locking version conflict of concurrent access is to retry n times, sleeping a tiny bit in between. As a side note, if there’s a better way to approach optimistic locking recovery in this rare case where no user input is required, and the system can rationally try something different on every retry, I’d like to know about it.

In the meantime, I’m befriending Thread.sleep(), and I’m reminded of the importance of handling the InterruptedException properly. By far the most common approach is to simply ignore the exception and carry on immediately from where it was caught. The problem with this is that the most likely reason your thread might be getting interrupted is if the application is trying to shut down. Imagine your optimistic locking retry loop is set to retry 10 times, waiting 100ms each time (since it’s low concurrency, and highly unlikely it will have to retry ever, let alone more than once), and ignoring interruptions. On application shutdown, if this loop has only just begun, its first sleep will be interrupted, but that leaves it with 900ms of sleep left to go before it will shut down. That’s likely to cause some issues regarding unclean shutdown of that thread.

So if you get interrupted in a loop while waiting for something, your best bet is to respond by throwing an exception. The way I see it, your options are to “throws” the InterruptedException and don’t bother catching it, requiring callers of your method to catch it and hope they deal with it nicely (this is probably the right approach for library code), or choose an application-specific runtime exception which is obviously terminal so that no other part of the application will attempt to catch it and handle it. Roll back your transactions and walk away.

Ordered Processing of JMS Messages

This is something I’ve taken for granted for as long as I’ve known about JMS – that messages are often going to need to be processed in the order in which they were sent, therefore it must be possible. Thankfully it is possible, though it’s not the default behavior, and the mechanism for achieving it was not as obvious as I was expecting. Hence this blog post. Everything below is JBoss AS 7.1.1 / HornetQ specific, though the concepts should be translatable between vendors to some degree.

The Two Types

Queues

The key here is “message grouping”. The JMS spec states that all messages with the same group ID (“JMSXGroupID”) will be consumed in order. In practice this usually means that in an environment where a queue has multiple consumers (eg: a single application server instance with a pool of consumers per queue), all messages with the same group ID value will be sent to the same consumer, in order. Since each consumer typically runs on its own thread, using the same consumer prevents race conditions that would otherwise occur if messages are delivered to the consumers in order but processed out of order due to thread scheduling.

Happily, for JBoss AS 7.1.1 this even holds true when using a cluster where there are pools of consumers running on multiple servers, though it does take some configuration, particularly if you want to avoid a single point of failure. The general approach is exactly the same, with the additional requirement that if the assigned consumer for a given group ID is on a different server to the one that the message was sent to, the message will simply be forwarded to the server containing the appropriate consumer.

Topics

While topics are usually associated with multiple concurrent processors, they can be configured such that only one consumer in the whole cluster will process each message, using durable subscriptions. This gives effectively the same result as the above queue setup, but the configuration is all done on the MDBs that consume the messages, so no specific server config is required, and no single point of failure is introduced. Additionally, if you have multiple applications and want each application to process each message once only, that can also be achieved. The trick here is to configure each MDB with a unique client ID, and a limit of 1 concurrent session. In the example below, all messages to the HELLOTopic will be processed only once, in order, by the HelloWorldTopicMDB. In contrast to a queue, this does not preclude other applications from also consuming these messages. They can either process the messages normally (concurrently), or if they too want to process messages in order, one at a time, they would just use a different value for “subscriptionName” and “clientId”.

@MessageDriven(name = "HelloWorldQTopicMDB", activationConfig = {
@ActivationConfigProperty(
    propertyName = "destinationType", propertyValue = "javax.jms.Topic"),
@ActivationConfigProperty(
    propertyName = "destination", propertyValue = "topic/HELLOTopic"),
@ActivationConfigProperty(
    propertyName = "subscriptionName", propertyValue = "helloWorldApp"),
@ActivationConfigProperty(
    propertyName = "clientId", propertyValue = "helloWorldApp"),
@ActivationConfigProperty(
    propertyName = "subscriptionDurability", propertyValue = "Durable"),
@ActivationConfigProperty(
    propertyName = "maxSession", propertyValue = "1"),
@ActivationConfigProperty(
    propertyName = "acknowledgeMode", propertyValue = "Auto-acknowledge")})
public class HelloWorldTopicMDB implements MessageListener {
    public void onMessage(Message rcvMessage) {
        ...
    }
}

The only thing you might need to do here is to lift the security restriction on creating durable subscriptions. Something like this in standalone-full(-ha).xml, but obviously more secure for production:

<security-settings>
  <security-setting match="#">
    ...
    <permission type="createDurableQueue" roles="guest"/>
    <permission type="deleteDurableQueue" roles="guest"/>
  </security-setting>
</security-settings>

Summary

Out of these two options, Queues provide potential for a performant solution. That’s because if your business logic allows you can use multiple dynamic group IDs so that messages from different groups can still be processed concurrently, improving overall throughput. For example, if it is enough to serialise all messages for each individual customer, customer.id can be used as the group ID, and many different customer’s messages can be processed simultaneously.

The downside to using queues is that in a clustered deployment they require some configuration and introduce a single point of failure into the system (surmountable with yet more configuration – a whole hot backup instance for the one-true-grouping-handler).

Using topics on the other hand requires no special server configurations in a clustered deployment, and no single point of failure is introduced. Messages will happily be balanced across the cluster, processed one at a time, even as servers come and go.

The downside to using Topics is that although their load is balance across the JMS thread pool / cluster, there is only ever one message being processed at a time. If your business logic requires this anyway, or you can live with this performance constraint, then using Topics may be preferable for their simplicity.

Cache Dup

I love how simple is it to configure and scale an Infinispan instance. My biggest bugbear with NoSQL in general is the problem of denormalisation, leading to lots of duplicate records which need to be “manually” kept in sync by the application. So I’ve been working on a wrapper around Infinispan that tracks what goes in and automatically maintains referential integrity across denormalised data. To demonstrate, here’s an extract from a passing test.

john = (Tennant)cacheDup.get(johnsId);
mark = (Tennant)cacheDup.get(marksId);
// John and Mark are tennants in the same house, in Hilldale
assert john.getHouse().getFullAddress()
  .equals(mark.getHouse().getFullAddress());
assert john.getHouse().getSuburb().equals("Hilldale");
// But they're not the same instance in memory
// (eg: due to networking/persistence serialisation)
assert john.getHouse() != mark.getHouse();
// Oops, John just told me it's actually Lakedale, not Hilldale
john.getHouse().setSuburb("Lakedale");
cacheDup.put(johnsId, john);
// Mark's house has also been updated, automatically
mark = (Tennant)cacheDup.get(marksId);
assert mark.getHouse().getSuburb().equals("Lakedale");

In order for Cache Dup to be able to work with your “entities”, they need to follow these rules, which I’m trying to keep as simple as possible:

  • Have stable hashcode and equals implementations, based on immutable field(s). Ideally this should be something with business meaning, but could also just be a UUID.
  • Implement Serializable (otherwise you wouldn’t need Cache Dup)

The cacheDup variable in the previous example is an instance of CacheDupDelegator, which implements org.infinispan.Cache, delegating to a standard Infinispan Cache instance which you provide to its constructor. I plan on adding a CDI decorator to make this step unnecessary for CDI-managed Cache instances.

CacheDupDelegator cacheDup = new CacheDupDelegator(cache);

This is obviously very early days. Current limitations that I know of are:

  • List is the only type of collection supported
  • If an object contains (directly or indirectly) a reference to itself, a stack overflow will probably result.

All kinds of feedback welcome. My biggest hurdles with this are going to be things I don’t know I don’t know.

Seam Cron: Scheduling Portable Extension for CDI

Update: This has now made its way into Seam 3 proper, as Seam Cron. Unfortunately it won’t be available for the initial Seam 3.0 release, but will become available soonishly.

Introducing Web Beans Scheduling (now Weld Scheduling (Now Seam Cron)) – a way to run scheduled events in JBoss Weld, Seam 3 and possibly any JSR-299 implementation. It makes use of CDI’s typesafe event model for tying business logic to schedules. That is, you define your schedules using the provided qualifiers, which you apply to observer methods containing the business logic that you wish to be run at those times. In other words:

    public void onSchedule(@Observes @Scheduled("20 */2 * ? * *") CronEvent e) {
        // do something every 2 minutes, at 20 seconds past the minute.
    }

The CDI container will fire the @Scheduled(“20 */2 * ? * *”) CronEvent at 20 seconds past every second minute, causing the onSchedule method to be executed each time. When CDI starts up with this module on the classpath, all observers of the CronEvent class are detected by the module using standard CDI APIs. The module then inspects each associated @Scheduled binding annotation and sets up a schedule to fire a CronEvent with that binding at the schedule found. Currently Quartz is used as the underlying scheduler.

One obvious shortcoming of this is that we’ve managed to hard-code scheduling information in our Java code. The answer to this is to define the schedule as a property in the scheduler.properties file at the root of your classpath, for example:

# This schedule is named "test.one" and runs every 2 minutes
test.one=20 */2 * ? * *
# This schedule is named "after.hours" and runs in the wee hours every day
after.hours=0 0 2 ? * *

You can then observe that schedule like this:

    public void onNamedSchedule(@Observes @Scheduled("test.one") CronEvent event) {
        // the schedule is defined in scheduler.properties
    }

This is getting better, but that “test.one” String is still setting off some refactoring alarm bells. No worries, we can deal with this pretty easily using meta-annotations. We just create a custom qualifier like so:

@Scheduled("after.hours")
@Qualifier
@Retention(RetentionPolicy.RUNTIME)
@Target( { ElementType.PARAMETER, ElementType.METHOD, ElementType.FIELD, ElementType.TYPE })
public @interface AfterHours {}

And now we can observe the event in a typesafe manner, in as many places as we want throughout our codebase with all the benefits of code-completion and none of the refactoring headaches:

    public void onTypesafeSchedule(@Observes @AfterHours CronEvent e) {
        // do something after hours
    }

There are also some built-in convenience events for regular schedules:

    public void everySecond(@Observes @Every Second second) {
        // this gets executed every second
    }

    public void everyMinute(@Observes @Every Minute minute) {
        // this gets executed every minute
    }

    public void everyHour(@Observes @Every Hour hour) {
        // this gets executed every hour
    }

Note though that none of these built-in events will be scheduled, let alone fired, unless the module finds an observer for them on startup.

This project has been submitted to the Seam 3 sandbox (find it in seam/sandbox/modules). An early release of the Weld Scheduling module and Memory Grapher example app can be downloaded from here: WeldScheduling.tgz. They’re both built with Maven 2.0.10+. To run the example app, ‘mvn clean install‘ both projects (‘scheduling’ first, then ‘MemoryGrapher’) and then run ‘mvn -Drun install’ from inside MemoryGrapher. It uses the Weld SE extension to run it without an app server (it’s a Swing app).

I Know Shoes

Finally! My copy of Nobody Knows Shoes arrived in the mail this week. As an adoring fan of Why’s Poignant Guide I had perhaps unfairly high expectations of NKS. As can be seen in the downloadable PDF version it’s not as long nor as entertaining as the Guide, seems targeted at a slightly younger crowd and is decidedly less poignant. Unlike the Guide, the comic strips in NKS make no sense whatsoever and have abs(zero) relevance to the actual subject matter. There’s not even any mention of Chunky Bacon. But this is not to say that NKS is somehow inferior compared to WPG, more that I really had no business comparing the two in the first place. After all, Shoes is just a tiny toolkit by it’s own admission, designed with new programmers in mind. For those with a passing interest I recommend simply downloading the on-line PDF version (it is printed at cost anyway so you’ll just be saving _why the effort). But I would recommend it to any teacher-types with a class full of wanna-be programmers. Just hand them each a copy and let _why’s deranged cosmo-babble do the rest.