Terracotta, Domain Driven Design and Anti-Patterns

Yesterday I was going to participate in a technical meeting when the following statement suddenly appeared on my Twitter timeline:

One interesting reflection is that the "super static" property of Terracotta roots helps us with Domain Driven Design.
Finally we have (what I feel a nice) way of using Repositories in our Entities without Aspects or Register style coding.


My immediate reaction was to reply by saying:

Strongly disagree. That means your business/domain code depends on Terracotta semantics.


I thought my comment was pretty intuitive: if Terracotta is a transparent clustering solution (and it actually is), how can it help you write your domain code without polluting the code itself?

Too bad, I was wrong: it generated a lot of discussions with a few of my favorite tech-friends, and what's worse, I wasn't able to further clarify my statement!

Let me blame Twitter's 140-character limit, please ... I don't want to hurt my own self-respect ;)

So here is a simple-stupid sample, which I hope will clarify the following simple concept: if your code relies on Terracotta super-static variables to access repositories (as apparently stated by the first quote above), then your code is IMO wrong.

Let's go.

We have an Amazon-style application, dealing with a lot of users and books: so we choose to use Terracotta for transparently clustering our application and make it scale.
Taking a deeper look at our domain, and using the Domain Driven Design terminology, we have two aggregate roots: User and Book.


public class User {

    private final String id;

    // ...
}

public class Book {

    private final String isbn;

    // ...
}


They're two separate roots because they are completely independent and have different lifecycles.
So we have two different repositories, too:


public class UserRepository {

    public User findById(String id) {
        // ...
    }

    // ...
}

public class BookRepository {

    public Book findByIsbn(String isbn) {
        // ...
    }

    // ...
}


Now, we want to add the following responsibility to our User: telling us all the books they viewed.


public class User {

    private final String id;

    // ...

    public Collection<Book> getViewedBooks() {
        // ...
    }
}


We don't want to establish a permanent relation between User and Book, i.e., make the collection of viewed books an instance variable of user: it would clutter the User entity, and associate with it a lot of additional data which is related to the user, but in no way part of the user itself.
So we choose to access Book entities through the BookRepository, meaning that User entities must have access to the repository instance.

The repository is implemented as a collection of objects persisted and distributed by Terracotta, so the following idea comes to our mind: we may configure the repository as a Terracotta root, and access its instance everywhere in our cluster of User objects thanks to its super-static nature!

For those unfamiliar with Terracotta, a Terracotta root, also known as a super-static variable, is an object instance shared among all cluster nodes: it is actually created only the first time the new operator is called. After that, the instance stays the same on all nodes, and any later new invocation has no effect.

It means that we have a number of super-easy ways to access our super-static repository from inside our User entity, and here is one:


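import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class User {

    // Configured as a Terracotta root, so the same instance is shared cluster-wide.
    private static final BookRepository bookRepository =
        new BookRepository();

    private final String id;

    // ISBNs of the books this user viewed (the field the loop below iterates over).
    private final List<String> viewed = new ArrayList<>();

    // ...

    public Collection<Book> getViewedBooks() {
        List<Book> books = new ArrayList<>();
        for (String isbn : viewed) {
            books.add(bookRepository.findByIsbn(isbn));
        }
        return books;
    }
}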


Thanks to Terracotta, the BookRepository will actually be created only the first time and then always reused, even if we shut down our application and restart it: the persisted/distributed state of bookRepository will always be there.

Well, this is IMHO an anti-pattern.

While it is true that Terracotta makes it easy to access repositories from entities, it is IMO wrong because the code becomes coupled to Terracotta itself: if you swap out Terracotta, that static variable will be reset at every shutdown/restart, as you would expect in any Java application, and your data will be gone!

I hope it's clearer now.
And if someone disagrees, or just thinks I misread the original quote, I'd be more than happy to hear their thoughts.

Finally, a disclaimer.
I'm a happy Terracotta user, and there are many ways to use Terracotta correctly in your application: put simply, just hide the way it works behind the repository implementation, and access the repository itself as if it were a normal Java object, not a super-static one.
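To make that last point a bit more concrete, here is a minimal sketch of one possible wiring. Everything in it beyond User, Book, BookRepository and findByIsbn is my own assumption for illustration: the repository becomes an interface, MapBookRepository is a hypothetical implementation whose internal map is the only thing you would declare as a Terracotta root (in the Terracotta configuration, not in the code), and User simply receives the repository through its constructor.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal Book aggregate root, identified by its ISBN.
final class Book {
    private final String isbn;

    Book(String isbn) {
        this.isbn = isbn;
    }

    String getIsbn() {
        return isbn;
    }
}

// The domain code only ever sees this interface.
interface BookRepository {
    void add(Book book);

    Book findByIsbn(String isbn);
}

// Hypothetical implementation: the map below is what could be clustered
// by Terracotta, but nothing outside this class needs to know about it.
final class MapBookRepository implements BookRepository {
    private final Map<String, Book> books = new ConcurrentHashMap<>();

    public void add(Book book) {
        books.put(book.getIsbn(), book);
    }

    public Book findByIsbn(String isbn) {
        return books.get(isbn);
    }
}

// User gets the repository as an ordinary collaborator, not as a static field.
final class User {
    private final String id;
    private final BookRepository bookRepository;
    private final List<String> viewed = new ArrayList<>();

    User(String id, BookRepository bookRepository) {
        this.id = id;
        this.bookRepository = bookRepository;
    }

    String getId() {
        return id;
    }

    void view(String isbn) {
        viewed.add(isbn);
    }

    // Resolves the viewed ISBNs through the repository, skipping unknown ones.
    Collection<Book> getViewedBooks() {
        List<Book> result = new ArrayList<>();
        for (String isbn : viewed) {
            Book book = bookRepository.findByIsbn(isbn);
            if (book != null) {
                result.add(book);
            }
        }
        return result;
    }
}

final class RepositoryInjectionDemo {
    public static void main(String[] args) {
        BookRepository repository = new MapBookRepository();
        repository.add(new Book("isbn-1"));
        User user = new User("u1", repository);
        user.view("isbn-1");
        System.out.println(user.getViewedBooks().size());
    }
}
```

With this shape, swapping Terracotta for a database or a plain in-memory map only means writing another BookRepository implementation: User never changes.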

And that's all.

Actor concurrency model in a nutshell

While the need to write software applications capable of exploiting the multi-processor architecture of today's computers is more and more common, concurrent programming is often perceived as a hard task.
No wonder, then, that many languages come to our rescue by supporting concurrent programming through first-class syntax support, or through higher-level user libraries.
Two well-known languages providing explicit concurrent programming support are Erlang and Scala, and both share the same concurrency model: actors.

The actor model is a concurrency abstraction based on the concepts of message-passing concurrency: very different from the shared state concurrency model we're used to in general purpose languages such as Java, but more efficient and easier to program with.

Let's see the difference between the two.

Shared-state concurrency is based on two fundamental concepts: resource sharing and resource synchronization.
As already said, it's the most common scenario with general purpose OO languages such as Java: it's composed of computational units (often implemented as threads) concurrently executing code sections that access shared resources, which must therefore be synchronized in order to guarantee correct ordering, visibility and data consistency.
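Just to fix ideas, here is a minimal sketch of the shared-state style in Java. The class names (SharedCounter, SharedStateDemo) are made up for this example: several threads update the same counter, and every access goes through the same lock.

```java
import java.util.ArrayList;
import java.util.List;

// The shared resource: every thread reads and writes the same field,
// so each access is guarded by the object's lock.
final class SharedCounter {
    private long value = 0;

    synchronized void increment() {
        value++;
    }

    synchronized long get() {
        return value;
    }
}

final class SharedStateDemo {

    // Starts `threads` threads that each increment the same counter `perThread` times.
    static long run(int threads, int perThread) {
        SharedCounter counter = new SharedCounter();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread worker = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers.add(worker);
            worker.start();
        }
        try {
            // Wait for every worker before reading the final value.
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10_000));
    }
}
```

Without the synchronized keyword, concurrent increments could overwrite each other and updates would typically be lost: that is exactly the kind of care shared state demands.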

Message-passing concurrency, also known as share-nothing concurrency, is the exact opposite: here, computational units are just endpoints exchanging immutable messages with one another, and reacting to received messages by executing a given computation.
In such a concurrency model, then, there isn't any shared resource, nor is there any need for resource synchronization: messages are immutable, and each computational unit is only able to change its own state in response to a received message.
This has several interesting consequences, making message-passing concurrency preferable to the shared-state one:

  • Message-passing concurrency is safer: there are no shared resources, so there is no possibility to corrupt data due to concurrent access.

  • Message-passing concurrency is faster: there is no resource synchronization, so there are no bottlenecks, deadlocks, livelocks, or similar locking issues.

  • Message-passing concurrency is easier: once you get used to the new paradigm, not having to think about how to share and synchronize resources is a big relief.


The actor concurrency model is a form of message-passing concurrency, based on:

  • Actors: the computational units capable of sending messages and reacting to received ones by executing a given function.

  • Messages: the form of communication used by actors in order to exchange data and carry on some kind of computation based on that data.

  • Mailboxes (or channels): a kind of buffer every actor has for storing received messages which haven't been processed yet.


Every actor can communicate with other actors by obtaining their mailbox (or channel) "address", and then sending a message to it: this is the only way an actor has for changing the state of the system.
Every actor can receive messages and process them by executing a behavior function: such a function can only change the state of the actor itself, or send new messages to other actors (already existent or created on-the-fly).
Communication between actors is completely asynchronous and decoupled: that is, actors do not block waiting for responses to their messages; they just send messages and forget, reacting to incoming messages without any correlation to previously sent messages.
A concurrent system implemented through the actor concurrency model is considered to be:

  • Parallel: several actors can process several messages in parallel, each one independently from the other.

  • Scalable: actors can be implemented in several ways, as local computational units or distributed ones, so they can easily scale out to the number of available processors or computers.

  • Reconfigurable: actors can be dynamically added and removed from the system, and then communicate the new topology through special purpose messages.


Obtaining such properties through a shared-state implementation is harder and poses a lot of challenges.
The actor model instead provides a well-defined, clearly stated way to easily build highly concurrent systems, at the cost of a paradigm shift.
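As a rough illustration of the three ingredients described above (actors, messages, mailboxes), here is a deliberately tiny sketch in plain Java. It is not how Erlang or Scala implement actors, and all the names (Actor, Doubler, Summer, ActorDemo) are hypothetical: each actor owns a mailbox backed by a blocking queue and a single thread that processes one message at a time, so its state is never shared.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical minimal actor: a mailbox plus one thread processing messages in order.
abstract class Actor<M> {
    private final BlockingQueue<M> mailbox = new LinkedBlockingQueue<>();
    private final Thread worker = new Thread(this::loop);

    Actor<M> start() {
        worker.setDaemon(true); // do not keep the JVM alive for idle actors
        worker.start();
        return this;
    }

    // Asynchronous send: enqueue the message and return immediately.
    void send(M message) {
        mailbox.add(message);
    }

    // Behavior function: may change this actor's own state or send new messages.
    protected abstract void onMessage(M message);

    private void loop() {
        try {
            while (true) {
                onMessage(mailbox.take());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

// Stateless actor: doubles each number and forwards it to the next actor.
final class Doubler extends Actor<Integer> {
    private final Actor<Integer> next;

    Doubler(Actor<Integer> next) {
        this.next = next;
    }

    protected void onMessage(Integer n) {
        next.send(2 * n);
    }
}

// Stateful actor: `total` is only ever modified by this actor's own thread.
final class Summer extends Actor<Integer> {
    private int total = 0;
    private final CountDownLatch done;

    Summer(int expectedMessages) {
        this.done = new CountDownLatch(expectedMessages);
    }

    protected void onMessage(Integer n) {
        total += n;
        done.countDown();
    }

    // Demo-only helper: waits until all expected messages were processed.
    int awaitTotal() {
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return total;
    }
}

final class ActorDemo {
    static int run() {
        Summer summer = new Summer(3);
        summer.start();
        Actor<Integer> doubler = new Doubler(summer).start();
        for (int i = 1; i <= 3; i++) {
            doubler.send(i); // fire and forget
        }
        return summer.awaitTotal();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Note that there is no synchronized block anywhere in the actors themselves: the only coordination point is the mailbox, and the latch exists just so the demo can read the final result.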

Next time we will see how to implement an actor-based concurrent application through our favorite OO language: Java!

Against the viral Commons-Logging

Years ago I wrote a blog post about how to make your application Commons-Logging (JCL) free by using SLF4J.

However, there is still a problem if you're writing a Maven2 based application: even if you're using SLF4J, you will probably end up with a viral JCL jar in your classpath!
That's because JCL is still used by a lot of projects, and Maven places it in your classpath as a transitive dependency.

Fortunately enough, there's still a way to completely, and easily, get rid of it thanks to the jcl-over-slf4j module, and a little Maven trick.

Here's the recipe:


  1. Add the following SLF4J dependencies:

    • slf4j-api

    • logback-classic (or your SLF4J implementation of choice)

    • jcl-over-slf4j

  2. Configure the JCL dependency with scope provided.


So here's how your Maven dependencies should look:
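A sketch of such a dependencies section, following the two steps above. The Maven coordinates are the standard ones for these libraries, while the version properties (slf4j.version, logback.version, commons-logging.version) are placeholders of mine that you would define in your own properties section:

```xml
<dependencies>
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-api</artifactId>
    <version>${slf4j.version}</version>
  </dependency>
  <dependency>
    <groupId>ch.qos.logback</groupId>
    <artifactId>logback-classic</artifactId>
    <version>${logback.version}</version>
  </dependency>
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>jcl-over-slf4j</artifactId>
    <version>${slf4j.version}</version>
  </dependency>
  <!-- Declared as provided so that the real JCL jar is not pulled in transitively. -->
  <dependency>
    <groupId>commons-logging</groupId>
    <artifactId>commons-logging</artifactId>
    <version>${commons-logging.version}</version>
    <scope>provided</scope>
  </dependency>
</dependencies>
```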

By doing so, JCL will not end up in your runtime classpath (because Maven considers it to be externally provided), while under the hood its API will be implemented by SLF4J thanks to the jcl-over-slf4j module.

That's all, folks.
Now you have one, and only one, safe logging library!

Real Terracotta @ Rome JavaDay 2009

I'm just back from the Rome JavaDay 2009, and here is the presentation I gave: Real Terracotta - Real-world scalability patterns with Terracotta.



I hope you enjoyed it!
I'd be very glad to hear your feedback, so feel free to comment with any questions or thoughts you may have!

Follow me on Twitter

Just subscribed to Twitter: http://twitter.com/sbtourist

Feel free to follow me, hoping to tweet interesting stuff at a higher rate than my blogging one ;)