posted on Wednesday, February 02, 2005 2:25 PM by mlorengo

Optimizations in Complex Object Creation

In comments to my article on Handling Complex Objects in the GridView Control, Eric Wise mentions that one needs to be careful about Handling Complex Objects in Grids capabilities with complex objects. Why? If I understand correctly, it's a database issue. You want to avoid loading up all of that information up from the database unless your really want to.

He proposes two alternatives

  1. Expose inital load methods that make the huge database call and prepopulate the referenced objects in one shot.
  2. Expose domain level queries that execute single sql statements and return datatables.

I'm going to explore method 1 in this post, because I can already use method 2 with my object model. I simply don't include those properties in my binding list that require complex types. Unless I'm missing something, I don't see any advantages of method 2 over this solution. I can then drill down from the summary information as needed to the detail. This is known as Lazy Loading.

So, with the assumption (mine), that I can do method 2 above with my current object model, and still retain a level of separation, let's look at how we can improve the data loading, *if* we've decided that we *do* want to display "complex" objects on the GridView.

One way to optimize the object loading would be to reduce the number of times we go to the database. Here's a layout (via Access) of my data tables.

 If I wanted to load all of the Wines associated with a Producer, I could do it via the following methods

  1. Return a DataReader of Wine records, and then lazy load the WineVarietal (and Varietal) records in another DataReader as needed
  2. Return a DataReader of Wine records and for each one get another DataReader to load the associated Varietal information
  3. Return a DataReader of Wine records with multiple INNER JOIN's and iterate through the records building the Wine and Varietal information
  4. Issue a batch of Sql Statements for each set of information and use the  IDataReader's  NextResult() method to advance the DataReader to the next result set while building the Wine and Varietal information.

I'm currently using method 2. And it goes something like this:

Get the WineDataProvider
Get an IDataReader from the WineDataProvider for all the Wine produced by the specified Producer.Id

For each record in the IDataReader
    Build a Wine instance from the IDataReader
    Get the VarietalDataProvider
    Get an IDataReader from the VarietalDataProvider for the current Wine.Id

    For each record in the IDataReader
        Build a Varietal instance and add it to the Varietals collection

    Add the Wine to the collection

Return the collection of Wines

Method 1 (Lazy Load) would look very similar to Method 2, with the exception that the second IDataReader would be delayed until the Varietals collection was accessed. In my GridView example, it would be deferred until I bound the Wine instance to the row and accessed the Varietal collection in the <ItemTemplate>.  So, again I would expect very similar performance. Of course you don't know until you measure right!

Method 3 would retrieve all the necessary  information in one database call. The stored procedure would something like this. Now, there's most likely some optimizations that I can make to the stored procedure, but that's not the purpose of this post. This would give me all of the Wines for a particular Producer. In fact it would give me multiple records for a Wine if the wine had multiple Varietals.

SELECT Wine.ProducerId AS ProducerId, Max(Wine.Id) AS WineId, Max(Wine.Name) AS WineName, WineVarietal.Percent, Varietal.Name
FROM Varietal 
INNER JOIN ((Producer INNER JOIN Wine ON Producer.Id = Wine.ProducerId) 
INNER JOIN WineVarietal ON Wine.Id = WineVarietal.WineId) ON Varietal.Id = WineVarietal.VarietalId
GROUP BY Wine.ProducerId, WineVarietal.Percent, Varietal.Name
HAVING (((Wine.ProducerId)=[@producerId]));

Some sample output for the above statement could look something like this

Query1
ProducerId WineId WineName Name Percent
181 1 2001 Ravenswood Winery Shiraz-Cabernet Cabernet Sauvignon 75
181 1 2001 Ravenswood Winery Shiraz-Cabernet Shiraz 25
181 82 2000 Ravenswood Winery Pickberry Vineyards Sonoma Mountain Merlot 53
181 82 2000 Ravenswood Winery Pickberry Vineyards Sonoma Mountain Cabernet Sauvignon 45
181 82 2000 Ravenswood Winery Pickberry Vineyards Sonoma Mountain Cabernet Franc 2

So, now my logic for building up an instance of a Wine would look something like this

For each record in the IDataReader
    If the WineId = current Wine instance Id
        Build a Varietal instance and add it to the Varietals collection
    else
        Build a Wine instance from the IDataReader
        Add it to the Wine collection
        Build a Varietal instance and add it to the Varietals collection

Return the Wine collection

Again, check my logic, but you get the idea. Here we tradeoff making a database call for every object and instead load it up all at once.

Finally Method 4

John Kilgo discusses using the IDataReader.NextResult() method here. In this method I would forego the inner joins of the previous example and instead issue 2 selects, 1 to retrieve the Wine records, and the second to retrieve the Varietal records. I would imagine the logic would go something like this

For each record in the IDataReader
    Build a Wine instance and add it to the collection

Advance the IDataReader to the next result set

For each record in the IDataReader
    Retrieve the Wine instance from the Wine collection based on the Wine.Id
    Build a Varietal class and add it to the Varietals collection

Some optimizations can be made to the above code as well

Conclusion

As you can see there are many different ways to handle the data, it all depends on how you are going to use it. In fact you could implement the data retrieval algorithms based on a strategy pattern, then once you understand what method works best for a specific situation, you simply adjust the data retrieval strategy.
  
Did I mention how much I love blogging? It's great to get feedback. It's the one thing I've missed since I took about 5 months off. Did I mention I'm starting back to work on Monday? Yup.

Comments