Cache Dup

I love how simple it is to configure and scale an Infinispan instance. My biggest bugbear with NoSQL in general is denormalisation, which leads to lots of duplicate records that the application has to keep in sync “manually”. So I’ve been working on a wrapper around Infinispan that tracks what goes in and automatically maintains referential integrity across denormalised data. To demonstrate, here’s an extract from a passing test.

john = (Tennant)cacheDup.get(johnsId);
mark = (Tennant)cacheDup.get(marksId);
// John and Mark are tenants in the same house, in Hilldale
assert john.getHouse().getFullAddress()
  .equals(mark.getHouse().getFullAddress());
assert john.getHouse().getSuburb().equals("Hilldale");
// But they're not the same instance in memory
// (e.g. due to networking/persistence serialisation)
assert john.getHouse() != mark.getHouse();
// Oops, John just told me it's actually Lakedale, not Hilldale
john.getHouse().setSuburb("Lakedale");
cacheDup.put(johnsId, john);
// Mark's house has also been updated, automatically
mark = (Tennant)cacheDup.get(marksId);
assert mark.getHouse().getSuburb().equals("Lakedale");

In order for Cache Dup to be able to work with your “entities”, they need to follow these rules, which I’m trying to keep as simple as possible:

  • Have stable hashCode() and equals() implementations, based on immutable field(s). Ideally the field(s) should have business meaning, but a UUID would also do.
  • Implement Serializable (otherwise you wouldn’t need Cache Dup)
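
As a rough illustration of the two rules above, here is a minimal sketch of an entity that would satisfy them. This House class is hypothetical and is not taken from Cache Dup's tests; the UUID id field and the constructor are assumptions made for the example. The point is that identity comes from a final field, while mutable state such as the suburb stays out of equals() and hashCode().

```java
import java.io.Serializable;
import java.util.Objects;
import java.util.UUID;

// Hypothetical entity for illustration only; not part of Cache Dup.
class House implements Serializable {
    private static final long serialVersionUID = 1L;

    // Immutable identity: assigned once in the constructor, never changed.
    private final UUID id;
    // Mutable state: deliberately excluded from equals() and hashCode().
    private String suburb;

    House(UUID id, String suburb) {
        this.id = Objects.requireNonNull(id);
        this.suburb = suburb;
    }

    UUID getId() { return id; }
    String getSuburb() { return suburb; }
    void setSuburb(String suburb) { this.suburb = suburb; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof House)) return false;
        return id.equals(((House) o).id);
    }

    @Override
    public int hashCode() {
        return id.hashCode();
    }
}
```

With this shape, two deserialised copies of the same house compare as equal even after one of them has had its suburb changed, and the hash code never shifts underneath a collection that holds the object.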

The cacheDup variable in the previous example is an instance of CacheDupDelegator, which implements org.infinispan.Cache, delegating to a standard Infinispan Cache instance which you provide to its constructor. I plan on adding a CDI decorator to make this step unnecessary for CDI-managed Cache instances.

CacheDupDelegator cacheDup = new CacheDupDelegator(cache);

This is obviously very early days. Current limitations that I know of are:

  • List is the only type of collection supported
  • If an object contains (directly or indirectly) a reference to itself, a stack overflow will probably result.
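
For the self-reference limitation, one common guard (not something Cache Dup does today, as far as this post describes) is to remember which instances have already been visited while walking the object graph. The sketch below shows the idea on a hypothetical Node type; Node, GraphWalker and countReachable are names invented for the example, and a real version would have to discover references by inspecting entity fields rather than reading a fixed list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for an entity that can reference other entities.
class Node {
    final String name;
    final List<Node> refs = new ArrayList<>();

    Node(String name) { this.name = name; }
}

class GraphWalker {
    // Counts the distinct instances reachable from root, visiting each one at most once.
    static int countReachable(Node root) {
        // Identity-based set: tracks instances by reference, independent of equals()/hashCode().
        Set<Node> visited = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        walk(root, visited);
        return visited.size();
    }

    private static void walk(Node node, Set<Node> visited) {
        // add() returns false for an instance already seen, which stops the recursion on cycles.
        if (node == null || !visited.add(node)) return;
        for (Node ref : node.refs) {
            walk(ref, visited);
        }
    }
}
```

The recursion depth is then bounded by the number of distinct objects rather than by the length of a cycle, so a direct or indirect self-reference no longer loops forever.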

All kinds of feedback welcome. My biggest hurdles with this are going to be things I don’t know I don’t know.