Lazy Reads - a structured approach
I was reading a post by Steve Eichert on lazy reads in which he describes how his lazy read approach was generating a lot of database traffic. I've run into this situation before, so I thought I'd blog some thoughts here.
First, some background...
A good example of a lazy read on an object is where you do not load all properties upon instantiation. You delay the load until the property is first accessed. Each property checks the validity of the property object (typically for nullness) and loads the value before returning it.
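As a minimal sketch of that idea, here is a single lazily loaded value in C++ (the language the COM layer described below was written in). The `Lazy` class and its loader callback are hypothetical names chosen for illustration, not code from that project.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical holder for one lazily loaded property value.
template <typename T>
class Lazy {
public:
    explicit Lazy(std::function<T()> loader) : loader_(std::move(loader)) {
        assert(loader_ && "a loader callback is required");
    }

    // The null check mirrors the "nullness" test described above:
    // the loader runs on first access only, later calls reuse the value.
    const T& get() {
        if (!value_) {
            value_ = std::make_unique<T>(loader_());
        }
        return *value_;
    }

    // True once the value has been loaded.
    bool loaded() const { return value_ != nullptr; }

private:
    std::function<T()> loader_;
    std::unique_ptr<T> value_;
};
```

Constructing a `Lazy<std::string>` costs nothing up front; the loader runs the first time `get()` is called and never again after that.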
A few years back I coded a significant COM layer that used lazy reads on top of a device network library. The networks contained autonomous devices ranging from temperature sensors to large building chiller units. The underlying library was written in C++, and the COM library I provided was written in C++ as well. The consuming target was Visual Basic, but the principles of the solution apply to any language or platform. Each device exposed properties, and multiple properties were fetched in a single packet transmission. There would typically be multiple property sets per object due to physical data size constraints imposed by the network protocol. This provided an interesting mapping to the overlying object model. In relation to the lazy read issue, I needed to be concerned with (1) minimizing network traffic, (2) concurrency issues caused by caching and (3) multi-threaded access.
Minimizing Network Traffic
When dealing with an abstraction sitting on a network, packets are generally handed back containing multiple pieces of information. Given that I had abstracted a 'NetworkNode' object, there may be a number of packets that provide information about that node, and multiple properties that map to each of those packets. Within the class definition I created private nested classes that contained all the information gathered from the packets. Each internal class held the stateful data and eliminated the need for multiple hits on the network wire. In the code I've included below, requesting the DSN property on NetworkNode loads the cached data class and returns the node's DSN value. Subsequent calls to other properties backed by the same packet, such as NodeName, do not require any network traffic.
class NetworkNode
{
    // Cached data from one packet; constructing it performs the network read.
    private class BaseNodeInfo
    {
        public BaseNodeInfo(...) { /*perform network read from provided parameters and hold state data*/ }
        public DSN dsn
        {
            get { /*retrieve dsn from cached data structure*/ }
        }
        public string NodeName
        {
            get { /*retrieve nodename from cached data structure*/ }
        }
    }
    private class ExtendedNodeInfo
    { /* identical in functional nature to BaseNodeInfo */ }
    // null means "not loaded yet"
    private BaseNodeInfo _bni = null;
    private ExtendedNodeInfo _eni = null;
    private BaseNodeInfo GetBaseNodeInfo()
    {
        if( _bni == null )
            _bni = new BaseNodeInfo(...); // this line generates network traffic.
        return _bni;
    }
    public DSN dsn
    {
        get
        {
            BaseNodeInfo bni = GetBaseNodeInfo();
            return bni.dsn;
        }
    }
    public string NodeName
    {
        get
        {
            BaseNodeInfo bni = GetBaseNodeInfo();
            return bni.NodeName;
        }
    }
}
Cache Concurrency
The network library would generate callbacks to the application when a node's properties were changed. To accommodate this situation, I simply created a method on the NetworkNode class to nullify the cached data. The callback would transmit the signal through the object hierarchy to the node, where we called 'Invalidate()'. Because every property getter goes through the null check in its GetBaseNodeInfo()-style method, the next property access after Invalidate() reloads fresh data from the network. The code for Invalidate is shown below:
public class NetworkNode
{
    // Drop the cached data; the next property access reloads it from the network.
    public void Invalidate()
    {
        _bni = null;
        _eni = null;
    }
}
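To make the load, reuse, invalidate, reload cycle concrete, here is a small self-contained sketch in C++. It is an illustration under assumptions rather than the original code: the `Reader` callback stands in for the real network read, and `BaseNodeInfo` is reduced to two hypothetical string fields.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the data carried by one packet.
struct BaseNodeInfo {
    std::string dsn;
    std::string nodeName;
};

class NetworkNode {
public:
    // Assumed callback that performs the (simulated) network read.
    using Reader = std::function<BaseNodeInfo()>;

    explicit NetworkNode(Reader readBaseInfo) : read_(std::move(readBaseInfo)) {
        assert(read_ && "a reader callback is required");
    }

    std::string dsn() { return baseInfo().dsn; }
    std::string nodeName() { return baseInfo().nodeName; }

    // Drop the cached packet data; the next getter call reloads it.
    void invalidate() { bni_.reset(); }

private:
    const BaseNodeInfo& baseInfo() {
        if (!bni_) {
            bni_ = std::make_unique<BaseNodeInfo>(read_());  // network traffic happens here
        }
        return *bni_;
    }

    Reader read_;
    std::unique_ptr<BaseNodeInfo> bni_;
};
```

With a reader that counts its calls, reading `dsn()` and then `nodeName()` costs one read in total, and a second read happens only after `invalidate()` has been called.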
Multithreading Issues
Properly handling the multi-threading issues was a bit trickier (for example, what happens when an Invalidate() call is fired on a different thread between calls to NodeName and DSN?). Fortunately, my client didn't mind dirty reads in this instance, because we were also floating up events on the objects. I handled both notification and invalidation by marshalling them onto a dedicated thread, which avoided locking one of the network library's own threads. With dirty reads allowed, the implementation became extremely simple: I coded critical sections around all GetBaseNodeInfo()-style methods and the Invalidate() methods.
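The critical-section idea can be sketched roughly as follows. This is a hedged illustration, not the original COM code: it uses `std::mutex` in place of a Win32 critical section, `LockedNetworkNode` and `Reader` are hypothetical names, and handing out a `shared_ptr` snapshot is one possible way to keep a reader's data alive if an invalidation lands right after the lock is released.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for the data carried by one packet.
struct BaseNodeInfo {
    std::string dsn;
    std::string nodeName;
};

class LockedNetworkNode {
public:
    // Assumed callback that performs the (simulated) network read.
    using Reader = std::function<BaseNodeInfo()>;

    explicit LockedNetworkNode(Reader readBaseInfo) : read_(std::move(readBaseInfo)) {
        assert(read_ && "a reader callback is required");
    }

    std::string dsn() { return baseInfo()->dsn; }
    std::string nodeName() { return baseInfo()->nodeName; }

    // Called from the notification thread when the node changes.
    void invalidate() {
        std::lock_guard<std::mutex> guard(lock_);
        bni_.reset();
    }

private:
    // Load-or-reuse under the lock, then return a snapshot. The snapshot
    // stays valid even if invalidate() runs immediately afterwards, so a
    // caller may see slightly stale (dirty) data but never a dangling one.
    std::shared_ptr<const BaseNodeInfo> baseInfo() {
        std::lock_guard<std::mutex> guard(lock_);
        if (!bni_) {
            bni_ = std::make_shared<BaseNodeInfo>(read_());  // network read, lock held
        }
        return bni_;
    }

    Reader read_;
    std::mutex lock_;
    std::shared_ptr<const BaseNodeInfo> bni_;
};
```

Note that two consecutive calls such as `nodeName()` and `dsn()` can still straddle an invalidation and return values from different packets; that is the dirty read the client accepted. The sketch also holds the lock during the read itself, which keeps the code simple at the cost of blocking other callers while the network responds.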
In Summary...
So there is the basis for a solution that meets the objectives: minimize network traffic, handle cache concurrency issues and deal with multi-threaded access. What I've shown is greatly simplified; the real code had parent classes that held some of the detailed caching mechanism (to eliminate code redundancy) as well as the async messaging. Still, I think this gives an adequate feel for the solution. In the years since, I've used the same approach on other applications, including database-centric ones. There are a couple of 'design patterns' that played into this solution; I'll have to post those when I remember them.