April 2005 - Posts

Partial Classes are OOP agnostic

There is a pretty good article by Dino Esposito entitled “Implications and Repercussions of Partial Classes in the .NET Framework 2.0” in the latest issue of CODE magazine. Overall, Dino does a good job of introducing the reader to partial classes. I particularly like the section under the topic of “Usage of Partial Classes”, where he talks about splitting the CRUD functions and business behaviors of a class into separate parts of the class definition. I like this because I talked about it back here, and it shows that I’m not the only person in the world thinking that there could be some value in this. As a matter of fact, if you change the word “employee” for “customer” and “.vb” for “.cs”, the two discussions are really saying the same thing.

However, a few times in the article Dino makes the following statement about partial classes: “partial classes are primarily a sophisticated form of text-merging and have nothing to do with inheritance and object-oriented programming”. This is 100% correct. Partial classes have nothing to do with inheritance or with OOP. As a matter of fact, they are completely agnostic to the concepts of OOP; not for it or against it. Partial classes don’t ask or answer any OOP questions. The whole story they tell is simply about allowing more than one source file to take part in a class definition. That’s it. Whether you use this to assist with code gen, splitting work responsibilities, layering, OOP, AOP, or whatever else you’d like to add here is totally based on how you use them.

For some reason, the fact that partial classes have nothing to do with OOP is often misinterpreted to mean that partial classes are against OOP. That somehow, partial classes by their very nature are anti-OOP. That is silly. Yet, a lot of partial class discussions often bring it up.

Dino even prepares for this misconception. Immediately after his excellent point about using partial classes to introduce some layering concepts inside a class, he states “If you're a design and OOP purist, you're already putting a curse on me. So, guess what? In this case, I wouldn't even suggest this to you.” IMO this is an unnecessary concession to the misconception. An OOP purist should not be concerned in the least with partial classes. They answer no questions for or against OOP. If somebody is trying to use partial classes to implement some OOP concepts, then the fault lies with this person’s misconception of partial classes, not with partial classes.

Partial classes are about class file definitions; that’s it. Inherently (pun intended), they don't say anything good or bad about OOP. Partial classes could be used to create a 100% pure OOP solution or not, just like single code file class definitions can. There is nothing about partial classes that an OOP purist should fear or scoff at. In my mind they are purely a means for breaking the 1:1 coupling between a class definition and a single .vb or .cs file. OOP says nothing about .vb or .cs files, so it shouldn't be concerned with partial classes.

So, to go out on a limb and begin to change this misconception, I’m going to take a different stance. I’m going to say that if used smartly, partial classes can actually promote better OOP code. Yep, I’m not afraid of the OOP purists who will bark over this, because they wouldn’t really be OOP purists. They may be people on some other mission and soap box, but it certainly wouldn’t be OOP purism.

In a previous blog, I talked about some of the reasons that we introduce layering into our code, and why layering in general can be a good thing. Well, in an attempt to promote better layering strategies, developers very often do so at the cost of OOP principles. As a matter of fact, I would suggest that layering strategies (as important and good as they may be) often do more to break good OOP design than people like to admit. Actually, I would assume that “OOP purists” would be more concerned about the damage done to OOP concepts by the current layering practices and trends than by anything partial classes could dream of doing. For example, as soon as you introduce a data layer outside the scope of a class, based on the most common practices being done today, you are in OOP trouble. As soon as your data layer needs to know about the internal private data of a business object, you’ve parted ways to some degree with OOP. Likewise, as soon as you start assuming that a BO uses a traditional Data Layer as you would see in almost any MSDN article (i.e. a separate assembly or at least a separate class), you usually end up talking about ways of getting data from the data layer into or onto the business object’s private variables. This usually involves some form of Data Transfer Object (or perhaps a handoff of a datareader from the DAL to the BAL). Almost all of these solutions actually start leading you away from pure OOP.

If, however, partial classes allow you to fully encapsulate an object’s behavior, including its persistence behavior (CRUD), within a single object, and yet still allow you to utilize some layering principles via separate code files, then in a sense partial classes can promote better OOP. That’s right, they could provide a mechanism by which you are able to adhere to better OOP principles and still gain the layering value of a separate code file; in this case, an object’s CRUD code can be maintained, supported, modified, etc. in a separate code file from the class’s business behavior definition.
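As a rough illustration of that split, here is a minimal sketch in VB 2005 syntax. The Customer class, its members, and the two file names in the comments are hypothetical, and an in-memory dictionary stands in for a real database so the example stays self-contained. The two declarations would normally live in separate files; they are shown together here only for brevity.

```vbnet
Imports System.Collections.Generic

' Customer.vb (hypothetical file): business behavior only.
Partial Public Class Customer
    Private _name As String
    Private _creditLimit As Decimal

    Public Sub New(ByVal name As String, ByVal creditLimit As Decimal)
        _name = name
        _creditLimit = creditLimit
    End Sub

    Public ReadOnly Property Name() As String
        Get
            Return _name
        End Get
    End Property

    ' Business rule: an order is allowed only within the credit limit.
    Public Function CanPlaceOrder(ByVal amount As Decimal) As Boolean
        Return amount <= _creditLimit
    End Function
End Class

' Customer.Data.vb (hypothetical file): persistence (CRUD) behavior.
Partial Public Class Customer
    ' This is still the same class, so the CRUD code reads the private
    ' fields directly; nothing is exposed to an external data layer.
    ' The dictionary is an assumed stand-in for a real data store.
    Public Sub Save(ByVal store As IDictionary(Of String, Decimal))
        store(_name) = _creditLimit
    End Sub

    Public Shared Function Load(ByVal name As String, _
            ByVal store As IDictionary(Of String, Decimal)) As Customer
        Return New Customer(name, store(name))
    End Function
End Class
```

Callers see one Customer type with both CanPlaceOrder and Save; the private fields never leave the class, yet the CRUD code can be maintained in its own file.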

Hey, currently in .Net 1 you are able to create 1 .vb file with a ton of classes if you should like to do so. There may be (in your environment and development practices) some valid reasons this makes sense. I can’t think of any, but hey you could do this, and it really wouldn’t speak at all to your decision to follow OOP. On the other hand, partial classes allow you to separate a single class definition between multiple code files, and while this itself doesn’t really speak to anything OOP, it may allow you to make some purer OOP decisions and still gain some layering benefits that separate code files can provide.

Event handler declarations

This is a bit of old news, but seeing as I just stumbled on it again, I thought it made sense to blog it.

When declaring an event, it’s best practice to create and/or define the type of that event, instead of allowing implicit creation. So, for example, if you declare an event as:

Public Event SomePropertyChanged

And don’t define the type, then the VB.NET compiler will do this for you. You end up with a new class being created for you (if you examine the IL you can see this). The class will be called SomePropertyChangedEventHandler and it extends (inherits from) System.MulticastDelegate. A field called SomePropertyChangedEvent is then declared, which holds the invocation list of subscribers to your event.

One of the reasons this is not considered best practice is that for every event declared this way, you will essentially be creating a new class. If you have a project with a fair number of events, you will be creating an additional class for each of them. Additionally, you have no defined signature arguments for the event.

So, for the following code:

Public Class Class1

     Public Event ColorPropertyChanged()

     Public Event HeightPropertyChanged()

     Public Event WidthPropertyChanged()

     Public Event TransparencyPropertyChanged()

     Public Sub DoSomething()

          RaiseEvent ColorPropertyChanged()

          RaiseEvent HeightPropertyChanged()

          RaiseEvent WidthPropertyChanged()

          RaiseEvent TransparencyPropertyChanged()

     End Sub

End Class

You will end up with IL that looks like this:
[Figure: EventHandler1]

Notice that besides the 4 event declarations, we also have 4 class declarations; 1 new class created for each event declaration.

If you instead define the type of your event, such as:


Public Event SomePropertyChanged As EventHandler


The .Net compiler won’t need to implicitly create a class to handle your events because you’ve declared your event as a predefined class type; in this case System.EventHandler (which is a delegate type that derives from System.MulticastDelegate).

So, if you instead declare the events as such:

Public Class Class1

     Public Event ColorPropertyChanged As EventHandler

     Public Event HeightPropertyChanged As EventHandler

     Public Event WidthPropertyChanged As EventHandler

     Public Event TransparencyPropertyChanged As EventHandler

     Public Sub DoSomething()

          RaiseEvent ColorPropertyChanged(Me, New EventArgs())

          RaiseEvent HeightPropertyChanged(Me, New EventArgs())

          RaiseEvent WidthPropertyChanged(Me, New EventArgs())

          RaiseEvent TransparencyPropertyChanged(Me, New EventArgs())

     End Sub

End Class

You will be creating IL that looks like this:
[Figure: EventHandler2]

Notice that in this sample, there is not a separate class created for each Event type. Instead, all of the event declarations are of type System.EventHandler, which is a system-defined delegate that extends System.MulticastDelegate.

If you need to provide a different signature and arguments from System.EventHandler, then you can define your own delegate type. For example:

Public Class Class1

     Delegate Sub MyPropertyChangedEventHandler(ByVal sender As Object, ByVal e As MyEventArgs)

     Public Event ColorPropertyChanged As MyPropertyChangedEventHandler

     Public Event HeightPropertyChanged As MyPropertyChangedEventHandler

     Public Event WidthPropertyChanged As MyPropertyChangedEventHandler

     Public Event TransparencyPropertyChanged As MyPropertyChangedEventHandler

     Public Sub DoSomething()

          RaiseEvent ColorPropertyChanged(Me, New MyEventArgs())

          RaiseEvent HeightPropertyChanged(Me, New MyEventArgs())

          RaiseEvent WidthPropertyChanged(Me, New MyEventArgs())

          RaiseEvent TransparencyPropertyChanged(Me, New MyEventArgs())

     End Sub

End Class

This allows .Net to create one type that each event field can be declared as. When this code is compiled, the IL looks like this:
[Figure: EventHandler3]

Notice in this IL, there is a single class defined for the MyPropertyChangedEventHandler delegate, and then each respective Event is defined as this type.

So, by defining the type of your Event declaration, you control the number of new event handler classes created. This is better practice than allowing .Net to implicitly create a new event handler class for each and every event declaration, especially if you have a large number of events declared. Additionally, by defining your own event handler delegate, you are able to control and define the signature of the event.
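The custom-delegate sample above uses MyEventArgs without showing it. Here is a minimal self-contained sketch of how the pieces might fit together, including a subscriber. The shape of MyEventArgs (a single PropertyName value passed to the constructor), the Shape class, and the handler are all assumptions made for illustration; the original sample constructs MyEventArgs with no arguments.

```vbnet
Imports System

' Hypothetical event-args type carrying the name of the changed property.
Public Class MyEventArgs
    Inherits EventArgs

    Private ReadOnly _propertyName As String

    Public Sub New(ByVal propertyName As String)
        _propertyName = propertyName
    End Sub

    Public ReadOnly Property PropertyName() As String
        Get
            Return _propertyName
        End Get
    End Property
End Class

' One delegate type shared by every event declared below.
Public Delegate Sub MyPropertyChangedEventHandler( _
    ByVal sender As Object, ByVal e As MyEventArgs)

Public Class Shape
    Public Event ColorPropertyChanged As MyPropertyChangedEventHandler
    Public Event HeightPropertyChanged As MyPropertyChangedEventHandler

    Public Sub DoSomething()
        RaiseEvent ColorPropertyChanged(Me, New MyEventArgs("Color"))
        RaiseEvent HeightPropertyChanged(Me, New MyEventArgs("Height"))
    End Sub
End Class

Module Program
    ' A single handler can serve both events because they share a signature.
    Private Sub OnPropertyChanged(ByVal sender As Object, ByVal e As MyEventArgs)
        Console.WriteLine(e.PropertyName & " changed")
    End Sub

    Sub Main()
        Dim s As New Shape()
        AddHandler s.ColorPropertyChanged, AddressOf OnPropertyChanged
        AddHandler s.HeightPropertyChanged, AddressOf OnPropertyChanged
        s.DoSomething()
    End Sub
End Module
```

A side benefit of the shared delegate type is visible in Main: one handler method can be attached to any number of events, which is not possible when each event gets its own implicitly generated delegate with a different type.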

Layers and Tiers and Bears, oh my - Part1

OK, so we really won’t be talking about Bears, but the title had a nice ring ;-)

I was scheduled to present at our local .Net user group (TVUG) back on March 8th, but it ended up getting canceled due to some bad weather. Hopefully, we’ll be able to work out a reschedule. In the meantime, I thought I’d post a few pictures and discussion topics planned for the presentation. The topic, Communicating Between Tiers, is a sensitive one (at least for me). The discussion can get off-track early, simply due to personal differences in how a few terms are used. Layers, Tiers, Services, and Applications are things we developers talk about every day. We use them so much that you would think the words meant the same to all of us, but they seldom do. I often find when talking to others that everyone has a slightly different twist on what these terms mean. It’s scary to think that we as developers and software architects use these words almost every day, but are often saying something slightly different. My goal here is not to describe the universal right meaning. However, in order to have a productive conversation about Communicating Between Tiers, it’s very important that we at least establish a baseline (at least for the duration of the presentation). So a good chunk of the beginning of the presentation is about establishing that baseline.

I often hear people intermingle the words Layers and Tiers, like they are the same thing. Additionally, everyone seems to have a slightly different description of what Applications and Services are. So, let’s give it a shot, remembering that this is in the context of setting a baseline for a presentation, not setting the universal right.

Layers

  • Logical separations of responsibility that can be realized by separate code files, separate assemblies, or simply separate methods and components. 
  • Promote code reuse
    • If employee data is shown on 3 screens and we are not using layers, we might find ourselves repeating the code to get the employee data 3 times (open a connection, create a command, add parameters, execute the command, etc.) Layers give us reusable blocks of code that can prevent this duplication.
  • Ease maintainability concerns 
    • If a stored procedure is modified, we know that the first place we need to look to understand the initial impact is our Data Layer. 
  • Ease Extensibility 
    • If a new piece of business logic needs to be added, we know that it needs to be added to the Business Layer and not to just anywhere in all of our code 
  • Better support for team development efforts 
    • Code that is organized into layers allows some developers to concentrate on UI, while others concentrate on Business, others Data etc. 
  • Logical separations promote understandable code 
    • It is just easier to read and follow code that is organized into responsibility groupings.

Notice that in this definition of Layers, we do not say anything about the physical location of code. By this definition alone, it would be generally assumed that these Layers are all “in process”. There is no implication of cross-process or out-of-process communications (remoting, web services, enterprise services, etc.). Once again, these layers can be realized by separate code files, separate assemblies, or simply separate methods or components.

For some additional fun on layering, you can take a look at these results of a Layering Principals Vote ;-)

Tiers

“My First Law of Distributed Object Design: Don’t distribute your objects” – Martin Fowler, from Patterns of Enterprise Application Architecture

“You should be forced into implementing physical tiers kicking and screaming. There should be substantial justification for using tiers, and those justifications should be questioned at every step along the way.” – Rocky Lhotka, from 2/18/2005 blog posting “Fire bad, Tree pretty”

Tiers describe the physical deployment and distribution of layers. Tiers encompass 1-n layers. Tiers of an application can run in process with one another, inter-process on a local machine, or cross process communications between different machines. 

  • Tiers describe the physical distribution of code 
  • Tiers can be designed to impact the following areas of concern: 
    • Scalability – how many concurrent users can I support, how many concurrent requests can I support 
    • Performance – how fast is each individual application request
    • Security – define multiple domains of trusted environments, i.e. each Tier can demand and/or impersonate different security users and or models. 
    • Fault tolerance – failure point redundancy 
  • Tiers provide tradeoffs between these concerns, not simultaneous resolution! 
  • Adding Tiers does not indicate added performance!
  • Adding tiers greatly increases the complexity of a system and should be well justified!

IMO “Tiers of an application” is probably one of the most abused, most misunderstood software architecture topics going. The terms “Data Layer” and “Data Tier” are used almost interchangeably, yet they have significant differences. Additionally, the most commonly abused justification for Tiers is performance, when in fact, for the average application being built, each addition of a Tier will almost always have a negative impact on performance. Generally, the performance gained by adding Tiers is only significant if you’ve reached a concurrent user threshold, and then we are really talking about scalability. So, if my system is really performant for 100 simultaneous users, but sucks when we get 1000+ users, then adding Tiers will almost never add performance benefits for the 100 user case. At best, it can help you keep the 1000+ user case performance more in line with the 100 user case, at the cost of a slight negative impact on the 100 user case. This is significant, and says that adding Tiers in this example case can be a very good architecture decision if you expect 1000 concurrent users, but could be foolish and unnecessary if you only expect 100-500.

Generally, when I hear somebody speak about Tiers for only performance reasons, I get a bad feeling about their understanding of Tiers. I feel better when I hear them talking about the tradeoff of performance, security, fault tolerance and scalability that is made for each Tier decision. However, even then we should really do our homework in justifying additional Tiers.

It should be well proven that existing Layers need to be separated into more than one Tier before making the decision to do so. A better approach is to design your Layering in such a way that adding additional Tiers, when necessary, does not require a complete system redesign.

So, with all that said about Layers and Tiers, why in the world would I decide to do a presentation entitled “Communicating Between Tiers”; especially if I’m promoting the fact that a significant majority of the judgments I see and hear involving decisions to divide an application into multiple Tiers are ill conceived? Answer: “the world is a strange place”

Inevitably, someone, somewhere on the “totem pole of decision-making responsibility”, will have read or heard something about Tiers and the value of “n-Tier” systems. And this person will have a completely misinformed understanding of what all this really means. They will say things like, “I want our system to be a highly performant n-Tier system”, “I want us to have the fastest most scalable system. Therefore, I have decided that we will separate our Application Tier from our UI Tier”, “We are going to design a UI Tier, Business Tier and Data Tier.” - this is one that scares me the most. These are just generalities, but the tone of these types of generalities usually means a blurred vision of Tiers, and n-Tier.

There are a lot of reasons people make these statements and jump to such conclusions. Is Microsoft marketing to blame? Partially. Are the software vendors to blame because of their broadcasted hype over today’s hot topic that they apparently solve better than everybody else? Partially. Truth is, I don’t think we will ever really know how these things get so blown out of perspective. I’m fully confident that today’s SOA hype is well on its way to causing people to make similar ill-conceived Service decisions. I see it already. Am I saying that Tiers are bad? No, tiers are very important architecture decisions that have a very significant place. Am I saying that a very large number of decisions to separate an application into Tiers are ill-conceived? Yes.

Let’s leave off the definitions for Applications and Services for now, and move on to look at some pictures of Layers and Tiers. We will then circle back on the Application and Service definitions.

Here is a picture of a good old-fashioned single Tier application, with an external Data Storage system such as MS SQL Server.

[Figure: 1Tier4LayerApp]

Notice how the application encompasses a single Tier, 4 Layers and a Database. The GUI Layer is at the edge of our Tier, and it’s also at the edge of our application. For this application, the GUI layer is the only point of contact that our user can have with the application. The other layers never reach the edge of the application, so the user can’t see or touch them. Additionally, there is no need for communication between Tiers in this application because there is only a single Tier, and therefore all layers are in process with one another.

Notice how the Data Access Layer is at the edge of our single Tier. It needs to be at the edge of this Tier so that it can communicate with our external data storage. If we were using an internal in-memory data storage system, then we might want to redraw this single tier to include the data storage inside the Tier and also move the Data Access Layer’s edge inside the Tier. But in this case, we are assuming external data storage and therefore need the Data Access Layer to have an edge exposed to communicate with the Data Storage. Therefore, if we use ADO.NET to communicate between our data layer and our data storage, we need to realize that this puts the data layer at the edge of its Tier. By following this definition, Layers not at the edge of a Tier cannot communicate with anything outside of the Tier (this would include web service calls, which should be placed similarly in a Layer at the edge of the Tier).

Some people consider the Data Storage to be a separate Tier, because it can run on a separate server and process space. However, I’m trying to demonstrate a cohesive explanation for Applications, Tiers and Layers. And, this definition is assuming that Tiers represent distributions of Layers, and Layers are logical organization and groupings of code. Therefore, Data Storage systems, like MS SQL Server are hardly Tiers by this definition. I did draw the Data Storage “inside” the application, just not inside the Tier. This is suggesting that the Data Storage service (i.e. MS SQL Server) belongs to the overall description of this application, it’s just not a component of a Tier.

This type of application is often blindly scoffed at because it isn’t “n-Tier”. However, in most cases it will outperform similar n-Tier versions of the same application (on a per-user basis). Occasionally, people refer to this as a two-tier model. But, for the purpose of this discussion, this is clearly a 1 Tier model; none of our layers can be physically distributed. Yes, the database can be, but it is looked at as a separate service that interfaces with our Data Access Layer. The Data Access Layer is at the edge of our single Tier, which is all within the domain of the entire application.

Let’s take a peek at a slight modification to this Application and look at a 2 Tier system, which distributes the Data Access Layer to a separate physical Tier:

[Figure: 2Tier4LayerApp]

This example is very similar to the first, except we are now distributing our DataAccess Layer onto a separate physical tier. We still have one Application and we still have 4 Layers. However, we now have 2 Tiers. By our definition, these two Tiers can run in the same process space, a separate process space on the same machine, or a process space on another machine. At this point, we are talking about a distributed application. We still have one application, but we have Tiers that could be distributed across process and machine boundaries. I think it is still important at this point that we describe our Application as a single Application, even though some of its Tiers are distributed. We’ll talk more about that later. Additionally, because this application contains multiple Tiers, we will eventually need to think about how we are going to communicate between these Tiers.

Notice how the Business Layer now sits on the bottom edge of the top Tier; in the previous diagram, the Business Layer lived completely within the boundaries of the Tier. This tells us that the Business Layer will now have some inter-process (or cross-process) communication concerns with the second Tier. Additionally, the Data Access Layer (which now lives on its own Tier) has frontage on two edges of its Tier. The top edge allows us to recognize that this layer will take part in the inter-process communication with the top Tier, and namely the Business Layer. Also, just as in the first diagram, the Data Access Layer has frontage on the bottom of its Tier, allowing for the recognition that it will be communicating out of its Tier space with the Data Storage.

Sometimes, when this type of application is designed, the Data Access Layer is broken into 2 different Layers; usually a Client Data Layer (which would be located in the top Tier) and a Server Data Layer (which would be located in the bottom Tier). The idea being that the Business Layer remains completely inside the top Tier (with no edge concerns) and is coupled exactly to the Client Data Layer as it was to the Data Layer in the Single Tier first diagram. Then, the Client Data Layer and Server Data Layer abstract the differences involved with communicating between one another over a Tier boundary. However, as this diagram is drawn, it would be the Business Layer and Data Access Layer that need to be involved with the communication process between the Tiers.

Earlier, we discussed that adding Tiers adds complexity. The complexity added here would be the complexity involved with getting these two layers (which now live in separate Tiers) communicating with some form of inter-process or cross process communication. We also mentioned earlier that adding Tiers should be something we kick and scream going into. Martin Fowler suggests you consider selling “any grandparent that you can get your hands on to avoid this” split of an application.

So, why would we do this? Unfortunately, I am convinced that in a lot of circumstances the decision to split Tiers is purely the misguided, ill-conceived totem pole stuff described above. I think we often blur the definitions of Tiers and Layers, and unfortunately Data Layer becomes Data Tier. We see pictures on MSDN in articles that describe the suggested distributed Tiers of an application and how we pass data between them, implying that distributing Tiers is a given.

No matter how you slice it, on a per-request basis this design will not perform as well as the first diagram, because of the inter-process or cross-process communication. That said, there are some very valid reasons to decide to split your application’s Tiers like this. But remember that these are all tradeoffs; i.e. you gain scalability or security at the cost of performance and complexity.

Here are some examples where it might make sense to consider splitting Tiers:

  • Due to security restrictions, we may want to have all our Data Access impersonate an account that lives in a different trust domain than the first Tier. So, we may want to run the Data Access Layer as an Enterprise Service component with a configured impersonation account different than the web application’s.
  • To increase security in a public web-hosting environment, it might be necessary to prevent the first Tier from accessing the SQL Server directly. This might be because the first Tier resides on a Web Server living in a DMZ, and we are not allowed to connect directly to the database living inside the firewall.
  • Perhaps the first diagram represents a Windows application, and we are expected to support 1000+ concurrent users. In the first diagram this might mean 1000 concurrent DB connections. The second diagram can allow for all 1000 Windows clients to communicate with a common Data Layer running on its own Tier. Therefore, the Data Layer as a second Tier could allow us to reduce the number of concurrently open DB connections to just a few; i.e. connection pooling. Rocky describes this justification for Tiers in his blog “Fire bad, Tree pretty”.

The point being, there are some tradeoff decisions that need to be made here, and sometimes these tradeoffs can be valid. Remember though, that any decision to split Tiers, will always bring some performance hit and additional system complexity.

Now, let’s look at another diagram. This will be very similar to the last, but this time we will shuffle the Tier decision points:

[Figure: 2Tier4LayerApp_2]

Just as in the last diagram, we still have 1 Application, 2 Tiers and 4 Layers. However, in this application, we are using the Business Layer on both the top Tier and the bottom Tier. The idea behind this approach is that our Business Layer is going to be operating within the context of both Tiers. We are still describing a 2 Tier system, just as the last diagram. But, we’ve decided to distribute our Business Layer to both the top Tier and the bottom Tier. Because we have positioned the Business Layer at the edge of both Tiers, we are assuming in this diagram that at some level of abstraction the Business Layer is responsible for making the cross process communication decisions. This does not necessarily mean that the Business code is explicitly responsible for the cross process communication plumbing code. It is possible to create a framework of base classes that encapsulate this work, which our actual business classes inherit from.

Depending on how detailed we wanted the diagram to get, this particular one could be drawn slightly differently to better demonstrate this framework approach. It might make more sense to explicitly place the Business Layer within each Tier and remove its frontage from the edge of both. Then, crossing over the edge of each Tier (as the Business Layer does now), we add a Framework Layer. This could be redrawn like this:

[Figure: 2Tier5LayerApp]

The purpose of this diagram is to better suggest that the Business Layer exists in both Tiers, and that it is the Framework Layer’s responsibility to handle the cross process communications between the Tiers. This is basically the approach Rocky has taken with his awesome CSLA framework. With this approach, the Business Layer in the top Tier calls upon the Framework’s Data Portal plumbing to perform CRUD operations. This plumbing code is responsible for the cross process communication to ferry the Business Object across to the second Tier. The Data Portal plumbing code on the second Tier calls upon the respective Business object (i.e. the Business Layer) to perform its required CRUD work, which then uses its Data Layer to make this happen. The really powerful thing about this particular approach is the well thought out Layering and Tier decisions. The Business Layer is actually coupled with the Data Layer, so there is no reason to ever assume a need to cross a Tier between these two layers. Therefore, we get fast in process communication between the Data Layer and the Business Layer. Additionally, the Business Layer is still an in process Layer with the presentation Layer, so we get nice quick responsive UI behavior. There are only a few highly grained CRUD service points within the framework that things actually rely on cross process communication between the Tiers, and then with the flip of a switch, the framework layer can be told not to ferry the CRUD call across to the second Tier, but call directly through in process.

In other words, a simple configuration setting can tell the framework to either make a cross process communication call to the second Tier or not. If there is no need (based on the tradeoff decisions discussed earlier) to introduce a second Tier, the framework can actually behave just like our first diagram above; and therefore have all the benefits of its in process best-case performance metrics.
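The "flip of a switch" idea can be sketched roughly as follows. This is not CSLA's actual API; the IDataPortal interface, both portal classes, the factory, and the useSecondTier flag are hypothetical names invented for illustration, and the "remote" portal only simulates the hop instead of doing real remoting or web service calls.

```vbnet
Imports System

' Hypothetical contract for the framework's CRUD entry point.
Public Interface IDataPortal
    Function Fetch(ByVal customerId As Integer) As String
End Interface

' In-process: calls straight through to the business/data code.
Public Class LocalDataPortal
    Implements IDataPortal

    Public Function Fetch(ByVal customerId As Integer) As String _
            Implements IDataPortal.Fetch
        Return "Customer " & customerId.ToString()
    End Function
End Class

' Cross-process stand-in: a real version would ferry the call to the
' second Tier. Here it only marks the hop and then delegates to the
' same code that would run on the server side.
Public Class RemoteDataPortal
    Implements IDataPortal

    Private ReadOnly _server As IDataPortal = New LocalDataPortal()

    Public Function Fetch(ByVal customerId As Integer) As String _
            Implements IDataPortal.Fetch
        Console.WriteLine("(crossing Tier boundary)")
        Return _server.Fetch(customerId)
    End Function
End Class

Public Class DataPortalFactory
    ' The "switch": in practice this value would come from configuration.
    Public Shared Function Create(ByVal useSecondTier As Boolean) As IDataPortal
        If useSecondTier Then
            Return New RemoteDataPortal()
        End If
        Return New LocalDataPortal()
    End Function
End Class

Module Program
    Sub Main()
        ' False keeps everything in process, as in the single Tier diagram.
        Dim portal As IDataPortal = DataPortalFactory.Create(False)
        Console.WriteLine(portal.Fetch(42))
    End Sub
End Module
```

The business code only ever talks to IDataPortal, so the decision to add or remove the second Tier stays in one place and does not ripple through the Business or Presentation Layers.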

I included this discussion as a means of demonstrating how good Layering and Tier planning can make the decision to introduce Tiers an optional one. We should always attempt to follow Martin Fowler’s first law of distributed design, and therefore only decide upon distributing Tiers after we have applied good, sound tradeoff reasons and justifications. Good Layering and Tier strategies can be the ammunition we developers use to combat, or at least work with, the “totem pole of decision-making responsibility”. So, the next time the totem pole speaks and says, “we have a highly performant n-Tier application”, you can kindly remind them that better than that, it’s a highly performant n-Layered application, with multiple n-Tier options.

This is by no means an exhaustive look at Layers and Tiers, but the purpose here was to establish a baseline for a presentation on Communicating Between Tiers. It would be tough to have an objective discussion on Communicating Between Tiers if everybody thought something different every time the word Layer or Tier was used.

This particular blog entry is getting a little long, so I’ll leave the discussion about Applications and Services for a future “Part 2” blog.