Monday, March 15, 2010

Announcing transitivity utilities project

Check out my new project: http://code.google.com/p/transitivity-utils/

This is the recycled message I sent about this at the guava mailing list:

--------
I have implemented some new data structures that may be of interest to someone. It is about maintaining transitive relations efficiently, both in terms of memory and time complexity. Internally it uses interval-encoding for elements, which for example can handle reachability queries in trees in just O(1) time and O(1) memory per element. It's not good for graphs that are close to full bipartite graphs - then memory degenerates to O(N) per element and O(logN) for testing reachability.

Historically, the motivation for this line of research has been encoding inheritance relations in knowledge bases. (For a more familiar example, this can be used to directly implement the instanceof operator of Java, but I don't suggest that!).

The central type amounts to:

TransitiveRelation {
   void relate(E subject, E object);
   boolean areRelated(E subject, E object);
}



This should be considered for any algorithm that relies on reachability queries.

I regard the API rather stable (except for I might need better names here and there).
----
 Boy, the effort it takes.

Long story short, I believe I've found the sweetest spot in the design space for this problem. It contains magic sauce found nowhere else. The gist of the underlying solution is so (deceptively?!) simple that it's hard not to ask: "how could that be overlooked for 20 years?". The details will wait for a future post (but the source is there already).