# Saturday, August 23, 2008
by Jeff Klawiter - Saturday, August 23, 2008 2:39:02 PM (Central Standard Time, UTC-06:00)

Over the summer I was able to run my first large project in .NET 3.5. I had a chance to put all the new features to use and learned quite a bit along the way. I've blogged a bit about this project before. The project had two data backends: a local SQL database and a third-party ASMX service. I took the approach of having a business object library that contained only class definitions that I had full control over. I split the backends out into their own libraries, with a main Datalayer library that handled communication with the two beneath it.

Initially I started coding the two bottom layers to do the object instantiation inline in the LINQ queries. As the layers grew I began to refactor much of the instantiation to methods that mapped the layer objects to the business objects.

Converting an inline expression like this

var result = from id in dc.OrderDetails
             where id.OrderID == OrderId
             select new DataObjects.OrderItem()
             {
                 PartID = id.PartID,
                 Price = id.Price,
                 Quantity = id.Quantity
             };

To this

var result = from id in dc.OrderDetails
             where id.OrderID == OrderId
             select MapOrderDetailToOrderItem(id);

Using the LINQ to SQL classes was a perfect fit. They are far easier to use as an ORM than SQL DataSets and DataAdapters. When retrieving lists such as the items on an order, the relationship properties on the SQL objects made it extremely easy to keep the data access clean.

After completing the project I started wondering what impact the refactoring had on data access performance, so I decided to run some tests. Initially I thought the inline instantiation would probably be faster, since it constructs the object in the expression instead of getting the SQL objects and then passing them to a function.

I wrote a quick program to do some performance testing on both implementations. I set up examples using a common real-world call: retrieving an order from a database along with the line items on the order. Below you can see the inline call and the refactored call.

[Tests.cs]
    static class Tests
    {
        public static void RunInlineInitialization()
        {
            using (SqlOrdersDataContext dc = new SqlOrdersDataContext())
            {
                var result = from o in dc.OrderHeaders
                             select new DataObjects.Order()
                             {
                                 CustomerID = o.CustomerID,
                                 OrderDate = o.OrderDate,
                                 OrderID = o.OrderID,
                                 OrderTotal = o.Total,
                                 ShippingAddress1 = o.ShippingAddress1,
                                 ShippingAddress2 = o.ShippingAddress2,
                                 ShippingCity = o.ShippingCity,
                                 ShippingDate = o.ShippingDate,
                                 ShippingMethod = o.ShippingMethod,
                                 ShippingState = o.ShippingState,
                                 ShippingTotal = o.ShippingTotal,
                                 ShippingZip = o.ShippingZip,
                                 SubTotal = o.SubTotal,
                                 TaxTotal = o.TaxTotal,
                                 TrackingNumber = o.TrackingNumber,
                                 Details = o.OrderDetails.Select(od => new DataObjects.OrderItem()
                                 {
                                     PartID = od.PartID,
                                     Price = od.Price,
                                     Quantity = od.Quantity
                                 }).ToList()
                             };
                DataObjects.Order order = result.FirstOrDefault();
            }
        }
        public static void RunRefactoredInitialization()
        {
            using (SqlOrdersDataContext dc = new SqlOrdersDataContext())
            {
                var result = from o in dc.OrderHeaders
                             select MapOrderHeaderToDataObjectOrder(o);
                DataObjects.Order order = result.FirstOrDefault();
            }
        }

        private static LinqTest.DataObjects.Order MapOrderHeaderToDataObjectOrder(OrderHeader o)
        {
            return new DataObjects.Order()
            {
                CustomerID = o.CustomerID,
                OrderDate = o.OrderDate,
                OrderID = o.OrderID,
                OrderTotal = o.Total,
                ShippingAddress1 = o.ShippingAddress1,
                ShippingAddress2 = o.ShippingAddress2,
                ShippingCity = o.ShippingCity,
                ShippingDate = o.ShippingDate,
                ShippingMethod = o.ShippingMethod,
                ShippingState = o.ShippingState,
                ShippingTotal = o.ShippingTotal,
                ShippingZip = o.ShippingZip,
                SubTotal = o.SubTotal,
                TaxTotal = o.TaxTotal,
                TrackingNumber = o.TrackingNumber,
                Details = MapOrderDetailToDataObjectOrderItem(o)
            };
        }

        private static List<LinqTest.DataObjects.OrderItem> MapOrderDetailToDataObjectOrderItem(OrderHeader o)
        {
            return o.OrderDetails.Select(od => new DataObjects.OrderItem()
            {
                PartID = od.PartID,
                Price = od.Price,
                Quantity = od.Quantity
            }).ToList();
        }
    }

As you can see, both public methods do the same thing. The second test was refactored easily using the Refactor > Extract Method menu item in Visual Studio. I load the orders from the database, take the SQL OrderHeader object and map it to the business object. The Details property on the Order object is simply a generic list of OrderItems. To retrieve them I use a quick lambda expression to query the OrderDetails relationship property.

For some, the refactoring goes without saying. Modularizing code like this makes it more maintainable and reusable. This concept can be foreign to some procedural programmers. With Visual Studio and add-ins like ReSharper, refactoring becomes so easy it's almost an afterthought. For anyone who still doesn't see the benefit of refactoring, I hope this article will help you.

The testing program is pretty simple: pass in the number of iterations and a boolean to turn pre-JITing on or off.

[Program.cs - (some code removed for brevity)]
        static System.Diagnostics.Stopwatch stp = new System.Diagnostics.Stopwatch();
        static int Runs = 10;
        static bool PreJitRoutines = false;
        
        static void Main(string[] args)
        {
            ProcessCommandLineArguments(args);

            //Let's get the JIT compilation of all the methods in question out of the way
            if (PreJitRoutines)
            {
                PreJitTestRoutines();
            }
            //Display Current Selected Options
            Console.WriteLine("Number of Runs: {0}", Runs);
            Console.WriteLine("Pre JIT Enabled: {0}", PreJitRoutines);
            
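            //Run and Measure Inline Test
            stp.Start();
            //Note: i <= Runs executes Runs + 1 iterations (the same in both tests)
            for (int i = 0; i <= Runs; i++)
            {
                Tests.RunInlineInitialization();
            }
            stp.Stop();
            //Display Test Results
            //Note: ElapsedTicks is in Stopwatch ticks (1/Stopwatch.Frequency seconds), not the
            //100 ns ticks TimeSpan expects, so the Average column is scaled by
            //Stopwatch.Frequency / TimeSpan.TicksPerSecond; stp.Elapsed.Ticks avoids that
            Console.WriteLine("Inline Initialization Test: {0} , Average: {1}", stp.Elapsed, new TimeSpan(stp.ElapsedTicks/Runs));

            //Save Result for later calculations
            TimeSpan FirstRun = stp.Elapsed;

            //Reset StopWatch
            stp.Reset();

            //Run and Measure Refactored Test
            stp.Start();
            //Note: i <= Runs executes Runs + 1 iterations (the same in both tests)
            for (int i = 0; i <= Runs; i++)
            {
                Tests.RunRefactoredInitialization();
            }
            stp.Stop();
            //Display Refactored test results
            //Note: the Average column has the same Stopwatch-tick scaling as the inline test
            Console.WriteLine("Refactored Initialization Test: {0} , Average: {1}", stp.Elapsed, new TimeSpan(stp.ElapsedTicks / Runs));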
            
            //Perform and report comparisons between tests
            if (FirstRun.CompareTo(stp.Elapsed)<0)
                Console.WriteLine("Inline Construction Faster: {0:f}", stp.Elapsed.TotalMilliseconds / FirstRun.TotalMilliseconds);
            else
                Console.WriteLine("Refactored Construction Faster: {0:f}", FirstRun.TotalMilliseconds / stp.Elapsed.TotalMilliseconds);

        }
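
Two details are worth double-checking in a harness like this. `Stopwatch.ElapsedTicks` is measured in units of `1 / Stopwatch.Frequency` seconds, while the `TimeSpan(long)` constructor expects 100-nanosecond ticks, and a loop written as `i <= Runs` executes `Runs + 1` times. Neither detail changes the comparison between the two tests, since both loops are measured the same way and the ratio is computed from `Elapsed`. The following is only a minimal sketch of how the measuring and averaging could be tightened; `TimingSketch`, `Measure` and `Average` are hypothetical names that are not part of the original program.

```csharp
using System;
using System.Diagnostics;

static class TimingSketch
{
    // Runs the action exactly `runs` times and returns the total elapsed time.
    public static TimeSpan Measure(Action action, int runs)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < runs; i++)   // i < runs: exactly `runs` iterations
        {
            action();
        }
        watch.Stop();
        return watch.Elapsed;
    }

    // Average per run, computed from TimeSpan ticks (100 ns units)
    // rather than from raw Stopwatch ticks.
    public static TimeSpan Average(TimeSpan total, int runs)
    {
        return new TimeSpan(total.Ticks / runs);
    }

    static void Main()
    {
        int counter = 0;
        // Stand-in workload; the real tests would be passed in here instead.
        TimeSpan total = Measure(delegate { counter++; }, 10);
        Console.WriteLine("Iterations executed: {0}", counter);
        Console.WriteLine("Total: {0} , Average: {1}", total, Average(total, 10));
    }
}
```

Passing `Tests.RunInlineInitialization` or `Tests.RunRefactoredInitialization` as the action would give per-run averages that line up with the reported totals.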

I ran the tests in release mode with iterations of 1, 10, 100 and 1000.

>LinqTest.exe 1 true
Number of Runs: 1
Pre JIT Enabled: True
Inline Initialization Test: 00:00:00.0124051 , Average: 00:00:00.0177619
Refactored Initialization Test: 00:00:00.0121951 , Average: 00:00:00.0174613
Refactored Construction Faster: 1.02

>LinqTest.exe 10 true
Number of Runs: 10
Pre JIT Enabled: True
Inline Initialization Test: 00:00:00.0701888 , Average: 00:00:00.0100497
Refactored Initialization Test: 00:00:00.0650985 , Average: 00:00:00.0093209
Refactored Construction Faster: 1.08

>LinqTest.exe 100 true
Number of Runs: 100
Pre JIT Enabled: True
Inline Initialization Test: 00:00:00.6291376 , Average: 00:00:00.0090081
Refactored Initialization Test: 00:00:00.5354964 , Average: 00:00:00.0076673
Refactored Construction Faster: 1.17

>LinqTest.exe 1000 true
Number of Runs: 1000
Pre JIT Enabled: True
Inline Initialization Test: 00:00:06.2699034 , Average: 00:00:00.0089773
Refactored Initialization Test: 00:00:05.3725538 , Average: 00:00:00.0076925
Refactored Construction Faster: 1.17

As you can see, at 1 iteration there's barely a difference. As we move up the scale, the refactored code consistently outperforms the inline expression. This outcome was different from my initial hypothesis, so I decided to dig a bit deeper and find out why. I pulled out ILDasm to see what was going on, and was surprised to see that the IL generated for the inline test was twice as long as for the refactored test. Looking at the code, it became clear what was going on.

Inline IL
 IL_0042:   stloc.3
 IL_0043:   ldloc.3
 IL_0044:   ldc.i4.0
 IL_0045:   ldtoken    method instance void LinqTest.DataObjects.Order::set_CustomerID(int32)
 IL_004a:   call       class [mscorlib]System.Reflection.MethodBase [mscorlib]System.Reflection.MethodBase::GetMethodFromHandle(valuetype [mscorlib]System.RuntimeMethodHandle)
 IL_004f:   castclass  [mscorlib]System.Reflection.MethodInfo
 IL_0054:   ldloc.2
 IL_0055:   ldtoken    method instance int32 LinqTest.OrderHeader::get_CustomerID()
 IL_005a:   call       class [mscorlib]System.Reflection.MethodBase [mscorlib]System.Reflection.MethodBase::GetMethodFromHandle(valuetype [mscorlib]System.RuntimeMethodHandle)
 IL_005f:   castclass  [mscorlib]System.Reflection.MethodInfo
 IL_0064:   call       class [System.Core]System.Linq.Expressions.MemberExpression [System.Core]System.Linq.Expressions.Expression::Property(class [System.Core]System.Linq.Expressions.Expression,
                       class [mscorlib]System.Reflection.MethodInfo)
 IL_0069:   call       class [System.Core]System.Linq.Expressions.MemberAssignment [System.Core]System.Linq.Expressions.Expression::Bind(class [mscorlib]System.Reflection.MethodInfo,
                       class [System.Core]System.Linq.Expressions.Expression)

Refactored IL
 IL_0005:   stloc.0
 IL_0006:   ldloc.0
 IL_0007:   ldarg.0
 IL_0008:   callvirt   instance int32 LinqTest.OrderHeader::get_CustomerID()
 IL_000d:   callvirt   instance void LinqTest.DataObjects.Order::set_CustomerID(int32)

The inline call uses reflection on the objects to build the instantiation into the expression tree. It has to load the information about the SQL OrderHeader.CustomerID property via reflection. It then does the same thing for Order.CustomerID on the business object. After that it takes the value loaded from the SQL object and binds it to the business object.

The refactored code skips the reflection entirely. Since the method expects an OrderHeader object, LINQ to SQL just needs to do what it does best: load data from the database and map it to the ORM object. The refactored methods then just need to do straight property assignment.
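
The difference between the two IL listings can be reproduced in isolation. The sketch below is an illustration under stated assumptions, not the project's code: `Source`, `Target` and `ExpressionSketch` are hypothetical stand-ins for `OrderHeader`, `DataObjects.Order` and the test class. It shows the same object initializer written once as a plain method and once as an `Expression<Func<...>>`, which is the form a lambda takes when it is handed to an `IQueryable` provider such as LINQ to SQL.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical stand-ins for the SQL entity and the business object.
class Source { public int CustomerID { get; set; } }
class Target { public int CustomerID { get; set; } }

static class ExpressionSketch
{
    // Plain method: compiles to direct getter and setter calls,
    // like the refactored IL.
    public static Target Map(Source s)
    {
        return new Target { CustomerID = s.CustomerID };
    }

    // The same initializer as an expression tree: the compiler emits calls
    // such as Expression.Property and Expression.Bind that describe the
    // code as data, like the inline IL.
    public static readonly Expression<Func<Source, Target>> InlineMap =
        s => new Target { CustomerID = s.CustomerID };

    static void Main()
    {
        // The body is a MemberInit node holding one member binding.
        MemberInitExpression init = (MemberInitExpression)InlineMap.Body;
        Console.WriteLine(InlineMap.Body.NodeType);        // MemberInit
        Console.WriteLine(init.Bindings[0].Member.Name);   // CustomerID
        Console.WriteLine(init.Bindings[0].BindingType);   // Assignment

        // Both forms produce the same object once executed.
        Func<Source, Target> compiled = InlineMap.Compile();
        Console.WriteLine(compiled(new Source { CustomerID = 7 }).CustomerID);  // 7
        Console.WriteLine(Map(new Source { CustomerID = 7 }).CustomerID);       // 7
    }
}
```

The expression-tree form has to be rebuilt each time the enclosing method runs and then interpreted by the query provider, so a larger initializer means a larger tree to build and walk on every call.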

Now, most of this testing was done with the C# LINQ query syntax and not the chained method calls. I'm going to dig a bit deeper and recreate this with pure lambda expressions and see how that stacks up. I have a feeling they will probably perform close to the refactored examples.

Another thing that makes me curious is the plateau reached and the differences between 1 and 100 iterations. I have a sneaking suspicion that some of this may be related to the JIT compiler and the garbage collector optimizing for the pattern of execution. I'll probably throw in some garbage collection counters to see how different they are.
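
One way such counters might look is sketched below, using `GC.CollectionCount` to compare the number of collections before and after a batch of runs. `GcCounterSketch` and `CollectionsDuring` are hypothetical names, and the allocation in `Main` is only a stand-in workload, not one of the tests above.

```csharp
using System;

static class GcCounterSketch
{
    // Returns how many collections of the given generation happened
    // while the action was executed `runs` times.
    public static int CollectionsDuring(Action action, int runs, int generation)
    {
        int before = GC.CollectionCount(generation);
        for (int i = 0; i < runs; i++)
        {
            action();
        }
        return GC.CollectionCount(generation) - before;
    }

    static void Main()
    {
        // Stand-in workload that allocates a short-lived buffer on every run.
        Action allocate = delegate { GC.KeepAlive(new byte[1024]); };

        int gen0 = CollectionsDuring(allocate, 100000, 0);
        Console.WriteLine("Gen 0 collections: {0}", gen0);
    }
}
```

Calling it once with each test method and the same number of runs would show whether one version triggers noticeably more collections than the other.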

So for now the moral of the story is that refactoring is not only good for code reuse and simplicity, it can also help increase performance. Without moving the mapping into another method, other calls would have increased JIT time due to having more code than needed. The mapping call only needs to be JITed once.
