I was out sick as a dog yesterday, so I missed Jeffrey Palermo's blog entry on detecting memory leaks. This is some good stuff. I'm blogging this entry for my own use because I'll definitely be following these steps shortly...
I see that the Microsoft Newsgroup Admin, John Eddy, is now blogging at the Microsoft Community Kitchen. I had written a post a while back about combining Wikis and Newsgroups - how they are far apart now but a merging of these two technologies would be very cool. It's nice to see blogging throughout Microsoft.
I've been reading about a number of programming/security topics lately, but the topic of security goes much further as we alter business processes by developing and implementing new tools in the workplace. Books like "Writing Secure Code" focus on individual systems, which is a great step, but there is a larger picture at stake here as well.
I remember taking "Accounting Systems" in school, a course that focused on developing procedures that introduce multiple parties throughout a process for the sake of checks and balances. In my experience, these types of systems are not adequate - bordering on grossly inadequate. When you create a multi-step solution with differing parties reviewing the information, no one party retains continuity or control over the whole process, and that gap can be exploited. The individual steps are very secure, but in turn the entire process becomes vulnerable (in programming, we're talking about integration points here). People introduce their scheme at a given step in the process and go to great lengths to make it look legitimate. Once it is accepted, the people at that entry point become familiar with the "transaction type" and it's never questioned again. It's really very similar to hacking techniques.
I've seen this on a number of levels. I've seen program managers introduce consulting firms owned by themselves to their own company to work on their own projects (this person was physically escorted off the premises by security while kicking and screaming). I've seen DNR wardens who start up their own "Friends of the Environment" group whose sole registered purpose is to raise money - no member lists, no minutes. These are all products of systems that are out of control.
I've been looking for books that cover security from this angle - books that go beyond the traditional 'Accounting System Security' to analyze the unintended implications of security decisions. I haven't found any yet. I keep thinking I should blog some of these experiences. On one hand they land outside the realm of programming, but on the other hand they ultimately apply to all the processes we engineer.
I was reading a post by Steve Eichert on lazy reads in which he discusses how his lazy read approach was generating a lot of database traffic. I've run into this situation before, so I thought I'd blog some thoughts here.
First, some background...
A good example of a lazy read on an object is where you do not load all properties upon instantiation. You delay the load until the property is first accessed. Each property checks the validity of the property object (typically for nullness) and loads the value before returning it.
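To make the idea concrete, here is a minimal sketch of a lazily read property. The Sensor class, its loader delegate, and the LoadCount counter are hypothetical names made up for illustration; the loader simply stands in for whatever expensive read (database or network) sits behind the property.

```csharp
using System;

// Hypothetical example class; none of these names come from the original library.
public class Sensor
{
    private readonly Func<string> _loadName; // stands in for an expensive read
    private string _name;                    // stays null until first access

    // Counts how many times the expensive read actually ran.
    public int LoadCount { get; private set; }

    public Sensor(Func<string> loadName)
    {
        _loadName = loadName;
    }

    public string Name
    {
        get
        {
            // Lazy read: load on first access only, then reuse the cached value.
            if (_name == null)
            {
                _name = _loadName();
                LoadCount++;
            }
            return _name;
        }
    }
}

public static class LazyReadDemo
{
    public static void Main()
    {
        var sensor = new Sensor(() => "Chiller-1");
        Console.WriteLine(sensor.LoadCount); // 0: nothing loaded at construction
        Console.WriteLine(sensor.Name);      // Chiller-1: first access triggers the load
        Console.WriteLine(sensor.Name);      // Chiller-1: served from the cached field
        Console.WriteLine(sensor.LoadCount); // 1: the loader ran only once
    }
}
```

Note that a null check like this cannot tell "not loaded yet" apart from a value that is legitimately null, so a property that can really be null would need a separate loaded flag.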
A few years back I coded a significant COM layer on top of a device network library that used lazy reads. The networks contained autonomous devices ranging from temperature sensors to large building chiller units. The underlying library was written in C++ and I provided the COM library written in C++ as well. The consuming target was Visual Basic, but the principles of the solution apply to any language or platform. Each device exposed properties, where multiple properties were grabbed in a single packet transmission. There would typically be multiple property sets per object due to physical data size constraints imposed by the network protocol. This provided an interesting mapping to the overlying object model. Related to the lazy read issue, I needed to be concerned with (1) minimizing network traffic, (2) concurrency issues caused by caching and (3) multi-threaded access.
Minimizing Network Traffic
When dealing with an abstraction sitting on a network, packets are generally handed back containing multiple pieces of information. Given I had abstracted a 'NetworkNode' object, there may be a number of packets that provide information about that node, and multiple properties that map to each of those packets. Within the class definition I created private nested classes that contained all the information gathered from the packets. Each internal class held the stateful data and eliminated the need for multiple hits on the network wire. Following the code I've included below, requesting the DSN property on NetworkNode loads the cached data class and returns the node's DSN value. Subsequent calls to other properties such as NodeName do not require any network traffic.
class NetworkNode
{
    // Holds the state returned by one network read; several properties map to it.
    private class BaseNodeInfo
    {
        public BaseNodeInfo(...) { /* perform network read from provided parameters and hold state data */ }
        public DSN dsn
        {
            get { /* retrieve dsn from cached data structure */ }
        }
        public string NodeName
        {
            get { /* retrieve nodename from cached data structure */ }
        }
    }
    private class ExtendedNodeInfo
    { /* identical in functional nature to BaseNodeInfo */ }
    // Cached data; null means "not loaded yet".
    private BaseNodeInfo _bni = null;
    private ExtendedNodeInfo _eni = null;
    private BaseNodeInfo GetBaseNodeInfo()
    {
        if( _bni == null )
            _bni = new BaseNodeInfo(...); // this line generates network traffic.
        return _bni;
    }
    public DSN dsn
    {
        get
        {
            BaseNodeInfo bni = GetBaseNodeInfo();
            return bni.dsn;
        }
    }
    public string NodeName
    {
        get
        {
            BaseNodeInfo bni = GetBaseNodeInfo();
            return bni.NodeName;
        }
    }
}
Cache Concurrency
The network library would generate callbacks to the application when a node's properties were changed. To accommodate this situation, I simply created a method on the NetworkNode class to nullify the cached data. The callback would transmit the signal through the object hierarchy to the node - where we called 'Invalidate()'. The code for Invalidate is shown below:
class NetworkNode
{
    // Drops the cached data so the next property access re-reads from the network.
    public void Invalidate()
    {
        _bni = null;
        _eni = null;
    }
}
Multithreading Issues
Properly handling the multi-threading issues became a bit more tricky (i.e. what happens when an Invalidate() call is fired on a different thread between calls to NodeName and DSN?). Fortunately, my client didn't mind dirty reads in this instance because we were also floating up events on the objects, and I handled both notification and invalidation by marshalling them onto a dedicated thread. I used a dedicated thread to avoid blocking one of the network library's threads. The dirty read implementation became extremely simple: I coded critical sections around all GetBaseNodeInfo()-style methods and the Invalidate() methods.
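As a rough sketch of that locking, here is how it might look in C#, with the lock statement standing in for the critical sections of the original C++ code. The _cacheLock field, the reader delegate passed to the constructor, and the simplified BaseNodeInfo are assumptions made for illustration; only the NetworkNode, GetBaseNodeInfo() and Invalidate() names come from the code above.

```csharp
using System;

// Simplified stand-in for the cached packet data (assumed shape, for illustration only).
public class BaseNodeInfo
{
    public string NodeName { get; }
    public BaseNodeInfo(string nodeName) { NodeName = nodeName; }
}

public class NetworkNode
{
    private readonly object _cacheLock = new object();    // hypothetical lock object
    private readonly Func<BaseNodeInfo> _readFromNetwork; // stand-in for the network read
    private BaseNodeInfo _bni;

    public NetworkNode(Func<BaseNodeInfo> readFromNetwork)
    {
        _readFromNetwork = readFromNetwork;
    }

    private BaseNodeInfo GetBaseNodeInfo()
    {
        // The null check and the load happen atomically with respect to Invalidate().
        lock (_cacheLock)
        {
            if (_bni == null)
                _bni = _readFromNetwork(); // this line would generate network traffic
            return _bni;
        }
    }

    // The caller reads from the snapshot it was handed, even if the cache
    // is invalidated right afterwards (a dirty read, accepted in this design).
    public string NodeName => GetBaseNodeInfo().NodeName;

    public void Invalidate()
    {
        lock (_cacheLock)
        {
            _bni = null;
        }
    }
}

public static class InvalidateDemo
{
    public static void Main()
    {
        int reads = 0;
        var node = new NetworkNode(() => new BaseNodeInfo("node-" + (++reads)));
        Console.WriteLine(node.NodeName); // node-1: first access reads from the "network"
        Console.WriteLine(node.NodeName); // node-1: served from the cache
        node.Invalidate();
        Console.WriteLine(node.NodeName); // node-2: cache was cleared, so it reads again
    }
}
```

In this sketch the lock is held while the read runs, so concurrent callers wait rather than issue duplicate reads. Whether that trade-off is acceptable depends on how slow the underlying read is.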
In Summary...
So there is the basis for a solution that meets the objectives - minimize network traffic, handle cache concurrency issues and deal with multi-threaded access. This is greatly simplified, as there were parent classes that held some of the detailed caching mechanism (to eliminate code redundancy) and the async messaging, but I think this gives an adequate feel for the solution. In the years since, I've used the same approach on other applications, including database-centric ones. There are a couple of 'design patterns' that played into this solution; I'll have to post those when I remember them.