Data Access Framework Comparison

Introduction

For some time now I have been working on a project that utilizes a custom-built data access framework, rather than popular ORM frameworks such as Entity Framework or NHibernate.

While the custom framework has worked well for the project, I had questions about it.  For example, it uses stored procedures to implement basic CRUD operations, and I wondered if inline parameterized SQL statements might perform better.  Also, I wondered about the performance of the custom framework compared to the leading ORMs.

Besides my questions about the custom framework, I recognized the importance of having at least a basic understanding of how to use the other ORM frameworks.

In order to answer my questions about the custom framework and to gain some practical experience with the other ORMs, I created a simple web application that uses each of those frameworks to perform basic CRUD operations.  While executing the CRUD operations, the application times them and produces a summary report of the results.

The code for the test application can be found at https://github.com/mlichtenberg/ORMComparison.

NOTE: I assume that most readers are familiar with the basics of Entity Framework and NHibernate, so I will not provide an overview of them here.

Using the custom framework is similar to Entity Framework and NHibernate’s “database-first” approach.  Any project that uses the library references a single assembly containing the base functionality of the library.  A T4 template is used to generate additional classes based on tables in a SQL Server database.  Some of the classes are similar to EF’s Model classes and NHibernate’s Domain classes.  The others provide the basic CRUD functionality for the domain/model classes. 

For these tests I made a second copy of the custom framework classes that provide the basic CRUD functionality, and edited them to replace the CRUD stored procedures with parameterized SQL statements.
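
The generated DAL classes are not reproduced in this post, but the difference between the two variants comes down to how the ADO.NET command is constructed.  Here is a minimal sketch of the two insert methods, with the procedure name, table schema, method signatures, and the _connectionString field all assumed for illustration (System.Data and System.Data.SqlClient are also assumed; the real generated code is in the GitHub repository):

    // Hypothetical sketch of the two DAL variants (names and schema are assumptions)
    public int InsertViaStoredProcedure(int courseId, int studentId, int? grade)
    {
        using (SqlConnection conn = new SqlConnection(_connectionString))
        using (SqlCommand cmd = new SqlCommand("EnrollmentInsert", conn))
        {
            cmd.CommandType = CommandType.StoredProcedure;   // CRUD via stored procedure
            cmd.Parameters.AddWithValue("@CourseID", courseId);
            cmd.Parameters.AddWithValue("@StudentID", studentId);
            cmd.Parameters.AddWithValue("@Grade", (object)grade ?? DBNull.Value);
            conn.Open();
            return Convert.ToInt32(cmd.ExecuteScalar());     // new EnrollmentID
        }
    }

    public int InsertViaParameterizedSql(int courseId, int studentId, int? grade)
    {
        using (SqlConnection conn = new SqlConnection(_connectionString))
        using (SqlCommand cmd = new SqlCommand(
            @"INSERT INTO Enrollment (CourseID, StudentID, Grade)
              VALUES (@CourseID, @StudentID, @Grade);
              SELECT CAST(SCOPE_IDENTITY() AS int);", conn))  // CRUD via inline SQL
        {
            cmd.Parameters.AddWithValue("@CourseID", courseId);
            cmd.Parameters.AddWithValue("@StudentID", studentId);
            cmd.Parameters.AddWithValue("@Grade", (object)grade ?? DBNull.Value);
            conn.Open();
            return (int)cmd.ExecuteScalar();
        }
    }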

The custom framework imposes much less overhead on top of ADO.NET than the popular ORMs, so I expected the tests to show that it was the best-performing framework.  The question was, how much better?

In the rest of this post, I will describe the results of my experiment, as well as some of the optimization tips I learned along the way.  Use the following links to jump directly to a topic.

Test Application Overview
“Out-of-the-Box” Performance
Entity Framework Performance After Code Optimization
     AutoDetectChangesEnabled and DetectChanges()
     Recycling the DbContext
NHibernate Performance After Configuration Optimization
     What’s Up with Update Performance in NHibernate?
Results Summary

Test Application Overview

A SQL Express database was used for the tests.  The data model is borrowed from Microsoft’s Contoso University sample application.  Here is the ER diagram for the database:

[ER diagram: Department, Course, Person, and Enrollment tables and their relationships]

The database was pre-populated with sample data.  The number of rows added to each table was as follows:

Department: 20
Course: 200
Person: 100000
Enrollment: 200000

This was done because SQL Server’s optimizer will behave differently with an empty database than it will with a database containing data, and I wanted the database to respond as it would in a “real-world” situation.  For the tests, all CRUD operations were performed against the Enrollment table.
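
For reference, here is roughly what the Enrollment entity looks like (a sketch based on the property names used in the code samples later in this post; the exact Grade type is an assumption):

    public class Enrollment
    {
        public int EnrollmentID { get; set; }   // identity primary key
        public int CourseID { get; set; }       // foreign key to Course
        public int StudentID { get; set; }      // foreign key to Person
        public int? Grade { get; set; }         // nullable; exact type assumed
    }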

Five different data access frameworks were tested:

  1. Custom framework with stored procedures
  2. Custom framework with parameterized SQL statements
  3. Entity Framework
  4. NHibernate
  5. Fluent NHibernate

The testing algorithm follows the same pattern for each of the frameworks:

01) Start timer
02) For a user-specified number of iterations 
03)      Submit an INSERT statement to the database
04)      Save the identifier of the new database record
05) End timer
06) Start timer
07) For each new database record identifier
08)      Submit a SELECT statement to the database
09) End timer
10) Start timer
11) For each new database record identifier
12)      Submit an UPDATE statement to the database
13) End timer
14) Start timer
15) For each new database record identifier
16)      Submit a DELETE statement to the database
17) End timer

Note that after the test algorithm completes, the database is in the same state as when the tests began.
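
In C#, each of the timed phases follows the same Stopwatch pattern.  Here is a minimal sketch of the INSERT phase (System.Diagnostics and System.Collections.Generic assumed); InsertOne() is a hypothetical stand-in for the framework-specific code shown later in this post:

    // Minimal sketch of one timed phase (the INSERT phase)
    int iterations = 10000;                  // user-specified iteration count
    List<int> ids = new List<int>();

    Stopwatch timer = Stopwatch.StartNew();
    for (int x = 0; x < iterations; x++)
    {
        int newId = InsertOne();             // hypothetical framework-specific insert
        ids.Add(newId);                      // save the new record identifier
    }
    timer.Stop();

    Console.WriteLine("Insert: {0} seconds", timer.Elapsed.TotalSeconds);
    // ...the SELECT, UPDATE, and DELETE phases are timed the same way...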

To see the actual code, visit https://github.com/mlichtenberg/ORMComparison/blob/master/MVCTestHarness/Controllers/TestController.cs.

"Out-of-the-Box" Performance

I first created very basic tests for each framework. Essentially, these were the “Hello World” versions of the CRUD code for each framework.  No optimization was attempted.

Here is an example of the code that performs the INSERTs for the custom framework.  There is no difference between the version with stored procedures and the version without, other than the namespace from which EnrollmentDAL is instantiated.

    DA.EnrollmentDAL enrollmentDAL = new DA.EnrollmentDAL();

    for (int x = 0; x < Convert.ToInt32(iterations); x++)
    {
        DataObjects.Enrollment enrollment = enrollmentDAL.EnrollmentInsertAuto
            (null, null, 101, 1, null);
        ids.Add(enrollment.EnrollmentID);
    }

And here is the equivalent code for Entity Framework:

    using (SchoolContext db = new SchoolContext())
    {
        for (int x = 0; x < Convert.ToInt32(iterations); x++)
        {
            Models.Enrollment enrollment = new Models.Enrollment {
                CourseID = 101, StudentID = 1, Grade = null };
            db.Enrollments.Add(enrollment);
            db.SaveChanges();
            ids.Add(enrollment.EnrollmentID);
        }

    }

The code for NHibernate and Fluent NHibernate is almost identical.  Here is the NHibernate version:

    using (var session = NH.NhibernateSession.OpenSession("SchoolContext"))
    {
        var course = session.Get<NHDomain.Course>(101);
        var student = session.Get<NHDomain.Person>(1);

        for (int x = 0; x < Convert.ToInt32(iterations); x++)
        {
            var enrollment = new NHDomain.Enrollment {
                Course = course, Person = student, Grade = null };
            session.SaveOrUpdate(enrollment);

            ids.Add(enrollment.Enrollmentid);
        }
    }

The SELECT, UPDATE, and DELETE code for each framework followed similar patterns. 
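
For example, here is a minimal sketch of the Entity Framework SELECT and DELETE loops, assuming the "ids" list captured during the INSERT phase (the UPDATE loop has the same shape, with a property change followed by SaveChanges(); the exact code is in the repository):

    using (SchoolContext db = new SchoolContext())
    {
        // SELECT: fetch each record by the identifier saved during the INSERTs
        foreach (int id in ids)
        {
            Models.Enrollment enrollment = db.Enrollments.Find(id);
        }
    }

    using (SchoolContext db = new SchoolContext())
    {
        // DELETE: remove each record, one DELETE statement per iteration
        foreach (int id in ids)
        {
            Models.Enrollment enrollment = db.Enrollments.Find(id);
            db.Enrollments.Remove(enrollment);
            db.SaveChanges();
        }
    }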

NOTE: A SQL Server Profiler trace confirmed that the actual interactions with the database were the same for each framework.  The same database connections were established, and equivalent CRUD statements were submitted by each framework.  Therefore, any measured differences in performance are due to the overhead of the frameworks themselves.

        Here are the results of the tests of the “out-of-the-box” code:

      Framework              Operation     Elapsed Time (seconds)
      Custom                 Insert        5.9526039
      Custom                 Select        1.9980745
      Custom                 Update        5.0850357
      Custom                 Delete        3.7785886

      Custom (no SPs)        Insert        5.2251725
      Custom (no SPs)        Select        2.0028176
      Custom (no SPs)        Update        4.5381994
      Custom (no SPs)        Delete        3.7064278

      Entity Framework       Insert        1029.5544975
      Entity Framework       Select        8.6153572
      Entity Framework       Update        2362.7183765
      Entity Framework       Delete        25.6118191

      NHibernate             Insert        9.9498188
      NHibernate             Select        7.3306331
      NHibernate             Update        274.7429862
      NHibernate             Delete        12.4241886

      Fluent NHibernate      Insert        11.796126
      Fluent NHibernate      Select        7.3961941
      Fluent NHibernate      Update        283.1575124
      Fluent NHibernate      Delete        10.791648

      NOTE: For all tests, each combination of Framework and Operation was executed 10000 times.   Looking at the first line of the preceding results, this means that the custom framework took about 5.95 seconds to perform 10000 INSERTs.

      As you can see, both versions of the custom framework outperformed Entity Framework and NHibernate.  In addition, the version of the custom framework that used parameterized SQL was very slightly faster than the version that used stored procedures.  Most interesting, however, was the performance of the INSERT and UPDATE operations.  Entity Framework and both versions of NHibernate were not just worse than the two custom framework versions, they were much, MUCH worse.  Clearly, some optimization and/or configuration changes were needed.

      Entity Framework Performance After Code Optimization

      AutoDetectChangesEnabled and DetectChanges()  

      It turns out that much of Entity Framework’s poor performance appears to have been due to the nature of the tests themselves.  Information on Microsoft’s MSDN website notes that if you are tracking a lot of objects in your DbContext object and call methods like Add() and SaveChanges() many times in a loop, your performance may suffer.  That scenario describes the test almost perfectly.

      The solution is to turn off Entity Framework’s automatic detection of changes by setting AutoDetectChangesEnabled to false, and to instead call DetectChanges() explicitly.  This instructs Entity Framework to only detect changes to entities when told to do so.  Here is what the updated code for performing INSERTs with Entity Framework looks like (the changes are the AutoDetectChangesEnabled setting and the explicit DetectChanges() call):

      using (SchoolContext db = new SchoolContext())
      {
          db.Configuration.AutoDetectChangesEnabled = false;

          for (int x = 0; x < Convert.ToInt32(iterations); x++)
          {
              Models.Enrollment enrollment = new Models.Enrollment {
                  CourseID = 101, StudentID = 1, Grade = null };
              db.Enrollments.Add(enrollment);
              db.ChangeTracker.DetectChanges();
              db.SaveChanges();
              ids.Add(enrollment.EnrollmentID);
          }
      }

      Here are the results of tests with AutoDetectChangesEnabled set to false:

      Framework           Operation    Elapsed Time (seconds)
      Entity Framework    Insert       606.5569332
      Entity Framework    Select       6.4425741
      Entity Framework    Update       605.6206616
      Entity Framework    Delete       21.0813293

      As you can see, INSERT and UPDATE performance improved significantly, and SELECT and DELETE performance also improved slightly.

      Note that turning off AutoDetectChangesEnabled and calling DetectChanges() explicitly will generally improve the performance of Entity Framework.  However, a forgotten call to DetectChanges() can also cause subtle bugs.  Therefore, it is best to only use this optimization technique in very specific scenarios and allow the default behavior otherwise.
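
      If the context lives longer than the bulk operation, one way to limit that risk is to scope the setting tightly and restore it afterwards.  A minimal sketch (not from the test project):

      try
      {
          db.Configuration.AutoDetectChangesEnabled = false;
          // ... perform the bulk Add()/SaveChanges() work here ...
      }
      finally
      {
          // Restore the default so later code gets automatic change detection
          db.Configuration.AutoDetectChangesEnabled = true;
      }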

      Recycling the DbContext

      While Entity Framework performance certainly improved by changing the AutoDetectChangesEnabled value, it was still relatively poor. 

      Another problem with the tests is that the same DbContext was used for every iteration of an operation (i.e. one DbContext object was used for all 10000 INSERT operations).  This is a problem because the context maintains a record of all entities added to it during its lifetime.  The effect of this was a gradual slowdown of the INSERT (and UPDATE) operations as more and more entities were added to the context.

      Here is what the Entity Framework INSERT code looks like after modifying it to periodically create a new context (the change is the batching structure that recycles the DbContext):

      int total = Convert.ToInt32(iterations);
      int batchSize = 100;

      for (int x = 0; x < total; x += batchSize)
      {
          // Use a new context for every batch of 100 Insert operations
          using (SchoolContext db = new SchoolContext())
          {
              db.Configuration.AutoDetectChangesEnabled = false;

              for (int y = x; y < Math.Min(x + batchSize, total); y++)
              {
                  Models.Enrollment enrollment = new Models.Enrollment {
                      CourseID = 101, StudentID = 1, Grade = null };
                  db.Enrollments.Add(enrollment);
                  db.ChangeTracker.DetectChanges();
                  db.SaveChanges();
                  ids.Add(enrollment.EnrollmentID);
              }
          }
      }

      And here are the results of the Entity Framework tests with the additional optimization added:

      Framework            Operation     Elapsed Time (seconds)
      Entity Framework     Insert        14.7847024
      Entity Framework     Select        5.5516514
      Entity Framework     Update        13.823694
      Entity Framework     Delete        10.0770142

      Much better!  The time to perform the SELECT operations was little changed, but the DELETE time was reduced by half, and the INSERT and UPDATE times decreased from a little more than 10 minutes to about 14 seconds.

      NHibernate Performance After Configuration Optimization

      For the NHibernate frameworks, the tests themselves were not the problem.  NHibernate itself needs some tuning. 

      An optimized solution was achieved by changing the configuration settings used to build the NHibernate SessionFactory.  Here is the definition of the SessionFactory for NHibernate (the additions are the SetProperty calls):

      private static ISessionFactory SessionFactory
      {
          get
          {
              if (_sessionFactory == null)
              {
                  string connectionString = ConfigurationManager.ConnectionStrings
                      [_connectionKeyName].ToString();

                  var configuration = new NHConfig.Configuration();
                  configuration.Configure();

                  configuration.SetProperty(NHConfig.Environment.ConnectionString,
                      connectionString);

                  configuration.SetProperty(NHibernate.Cfg.Environment.FormatSql,
                      Boolean.FalseString);
                  configuration.SetProperty
                     (NHibernate.Cfg.Environment.GenerateStatistics,
                          Boolean.FalseString);
                  configuration.SetProperty
                     (NHibernate.Cfg.Environment.Hbm2ddlKeyWords,
                          NHConfig.Hbm2DDLKeyWords.None.ToString());
                  configuration.SetProperty(NHibernate.Cfg.Environment.PrepareSql,
                          Boolean.TrueString);
                  configuration.SetProperty
                      (NHibernate.Cfg.Environment.PropertyBytecodeProvider,
                          "lcg");
                  configuration.SetProperty
                      (NHibernate.Cfg.Environment.PropertyUseReflectionOptimizer,
                          Boolean.TrueString);
                  configuration.SetProperty
                      (NHibernate.Cfg.Environment.QueryStartupChecking,
                          Boolean.FalseString);
                  configuration.SetProperty(NHibernate.Cfg.Environment.ShowSql, 
                      Boolean.FalseString);
                  configuration.SetProperty
                      (NHibernate.Cfg.Environment.UseProxyValidator, 
                          Boolean.FalseString);
                  configuration.SetProperty
                      (NHibernate.Cfg.Environment.UseSecondLevelCache,
                          Boolean.FalseString);

                  configuration.AddAssembly(typeof(Enrollment).Assembly);
                  _sessionFactory = configuration.BuildSessionFactory();
              }
              return _sessionFactory;
          }
      }

      And here is the InitializeSessionFactory method for Fluent NHibernate, with the equivalent changes included:

      private static void InitializeSessionFactory()
      {
          string connectionString = ConfigurationManager.ConnectionStrings[_connectionKeyName]
              .ToString();

          _sessionFactory = Fluently.Configure()
              .Database(MsSqlConfiguration.MsSql2012.ConnectionString(connectionString).ShowSql())
              .Mappings(m => m.FluentMappings.AddFromAssemblyOf<Enrollment>())
              .BuildConfiguration().SetProperty
                  (NHibernate.Cfg.Environment.FormatSql, Boolean.FalseString)
              .SetProperty(NHibernate.Cfg.Environment.GenerateStatistics,
                  Boolean.FalseString)
              .SetProperty(NHibernate.Cfg.Environment.Hbm2ddlKeyWords,
                  NHibernate.Cfg.Hbm2DDLKeyWords.None.ToString())
              .SetProperty(NHibernate.Cfg.Environment.PrepareSql,
                  Boolean.TrueString)
              .SetProperty(NHibernate.Cfg.Environment.PropertyBytecodeProvider,
                  "lcg")
              .SetProperty
                  (NHibernate.Cfg.Environment.PropertyUseReflectionOptimizer,
                      Boolean.TrueString)
              .SetProperty(NHibernate.Cfg.Environment.QueryStartupChecking,
                  Boolean.FalseString)
              .SetProperty(NHibernate.Cfg.Environment.ShowSql, Boolean.FalseString)
              .SetProperty(NHibernate.Cfg.Environment.UseProxyValidator,
                  Boolean.FalseString)
              .SetProperty(NHibernate.Cfg.Environment.UseSecondLevelCache,
                  Boolean.FalseString)
              .BuildSessionFactory();
      }

      The following table gives a brief description of the purpose of these settings:

      Setting                   Purpose
      FormatSql                 Format the SQL before sending it to the database
      GenerateStatistics        Produce statistics on the operations performed
      Hbm2ddlKeyWords           Should NHibernate automatically quote all db object names
      PrepareSql                Compiles the SQL before executing it
      PropertyBytecodeProvider  What bytecode provider to use for the generation of code
      QueryStartupChecking      Check all named queries present in the startup configuration
      ShowSql                   Show the produced SQL
      UseProxyValidator         Validate that mapped entities can be used as proxies
      UseSecondLevelCache       Enable the second level cache

      Notice that several of these (FormatSql, GenerateStatistics, ShowSql) are most useful for debugging.  It is not clear why they are enabled by default in NHibernate; it seems to me that these should be opt-in settings, rather than opt-out.

      Here are the results of tests of the NHibernate frameworks with these changes in place:

      Framework                        Operation     Elapsed Time (seconds)
      NHibernate (Optimized)           Insert        5.0894047
      NHibernate (Optimized)           Select        5.2877312
      NHibernate (Optimized)           Update        133.9417387
      NHibernate (Optimized)           Delete        5.6669841

      Fluent NHibernate (Optimized)    Insert        5.0175024
      Fluent NHibernate (Optimized)    Select        5.2698945
      Fluent NHibernate (Optimized)    Update        128.3563561
      Fluent NHibernate (Optimized)    Delete        5.5299521

      These results are much improved, with the INSERT, SELECT, and DELETE operations nearly matching the results achieved by the custom framework.   The UPDATE performance, while improved, is still relatively poor.

      What’s Up with Update Performance in NHibernate?

      The poor update performance is a mystery to me.  I have researched NHibernate optimization techniques and configuration settings, and have searched for other people reporting problems with UPDATE operations.  Unfortunately, I have not been able to find a solution.
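
      For readers who want to experiment, two suggestions that come up frequently in NHibernate performance discussions are ADO.NET batching and explicit transactions.  A sketch of both follows; I cannot confirm that either one resolves the UPDATE slowness.

      // 1) Enable ADO.NET batching when building the SessionFactory
      configuration.SetProperty(NHibernate.Cfg.Environment.BatchSize, "50");

      // 2) Wrap the UPDATE loop in an explicit transaction, so that each
      //    statement does not run in its own implicit transaction
      using (var session = NH.NhibernateSession.OpenSession("SchoolContext"))
      using (var tx = session.BeginTransaction())
      {
          foreach (int id in ids)
          {
              var enrollment = session.Get<NHDomain.Enrollment>(id);
              // ... modify the enrollment here ...
              session.Update(enrollment);
          }
          tx.Commit();
      }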

      This is disappointing, as I personally found NHibernate more comfortable to work with than Entity Framework, and because it beats or matches the performance of Entity Framework for SELECT, INSERT, and DELETE operations.

      If anyone out there knows of a solution, please leave a comment!

      Results Summary

      The following table summarizes the results of the tests using the optimal configuration for each framework.  These are the same results shown earlier in this post, combined here in a single table.

      Framework                        Operation     Elapsed Time (seconds)
      Custom                           Insert        5.9526039
      Custom                           Select        1.9980745
      Custom                           Update        5.0850357
      Custom                           Delete        3.7785886

      Custom (no SPs)                  Insert        5.2251725
      Custom (no SPs)                  Select        2.0028176
      Custom (no SPs)                  Update        4.5381994
      Custom (no SPs)                  Delete        3.7064278

      Entity Framework (Optimized)     Insert        14.7847024
      Entity Framework (Optimized)     Select        5.5516514
      Entity Framework (Optimized)     Update        13.823694
      Entity Framework (Optimized)     Delete        10.0770142

      NHibernate (Optimized)           Insert        5.0894047
      NHibernate (Optimized)           Select        5.2877312
      NHibernate (Optimized)           Update        133.9417387
      NHibernate (Optimized)           Delete        5.6669841

      Fluent NHibernate (Optimized)    Insert        5.0175024
      Fluent NHibernate (Optimized)    Select        5.2698945
      Fluent NHibernate (Optimized)    Update        128.3563561
      Fluent NHibernate (Optimized)    Delete        5.5299521

      And here is a graph showing the same information:

      [Graph: elapsed times for each framework and CRUD operation]

    That Conference 2016 – Session Resources

    Last week I had the pleasure of attending the 2016 edition of That Conference.

    It was an all-around excellent experience.  The venue, topics, speakers, sponsors, food, after-hours events, and swag all left little to complain about.  In addition, like many technical conferences, That Conference set aside space and time for free-form open discussions led by conference attendees on topics of their choosing.  That Conference’s version of this was called Open Spaces, and by all accounts it was a success.  While I only took part in a single discussion, I observed that the area designated for those discussions was always busy.

    Conference experiences can be spoiled by inexperienced or ill-prepared session speakers.  At That Conference I was pleased by the quality of the speakers in all twelve sessions and three keynotes that I attended.  However, there were too many interesting sessions (a good thing!) and too little time (can’t be helped).  Therefore, since returning home I have been watching social media and the conference website in order to compile links to as many of the session materials as possible.

    Here are the links to everything that I have been able to find.  (If you know of others, please post a comment with the links!)

    Against Best Practices – Embracing the Avant Garde for a Weirder Web
    Chelsea Maxwell

    As Seen On TV: Developing Apps for Apple TV and TVOS
    Matthew Soucoup

    Back to the Future => C# 7
    Mike Harris

    Battle of the CLI: Gulp vs. Grunt
    Abbey Gwayambadde

    Be An Expert Xamarin Outdoorsman with the Ultimate Xamarin Toolchain
    Vince Bullinger

    Bear Proof Applications: Using Continuous Security to Mitigate Threats
    Wendy Istanick

    Boost Your Immune System with DevOps
    Michelle Munstedt

    Build and Deploy Your ASP.NET Core Applications… Automatically!
    Brandon Martinez

    Build Your Own Smart Home
    Brandon Satrom

    Building Mobile Games That Make Money
    Scott Davis

    C#: You Don’t Know Jack
    George Heeres

    Clean Architecture: Patterns, Practices, and Principles
    Matthew Renze

    Common T-SQL Mistakes
    Kevin Boles

    Computer Science: The Good Parts
    Jeffery Cohen

    Daring to Develop With Docker
    Philip Nelson

    Date and Time: Odds, Ends, and Oddities
    Maggie Pint

    Domain Driven Data
    Bradley Holt

    Enough Cryptography to be Dangerous
    Steve Marx

    The Experimentation Mindset
    Doc Norton

    Finding Your Way to the App Store
    Matthew Ridley

    From Inception to Production: A Continuous Delivery Story
    Ian Randall

    From Mobile First to Offline First
    Bradley Holt

    Full-Stack ASP.NET MVC Performance Tuning
    Dustin Ewers

    Happy Full-Stack Javascript Campers
    Ryan Niemeyer

    How I Learned To Love Dependency Injection
    James Bender

    Identity Management in ASP.NET Core
    Ondrej Balas

    An Internet Of Beers
    Wade Wegner

    Intro to Typescript
    Jody Gustafson

    Introduction to Angular 2.0
    Jeremy Foster

    Javascript Code Quality
    Md Khan

    Keynote: Family Keynote
    Neely Drake and Emily Davis

    Keynote: From 0 to 100,000: How Particle Failed, then Succeeded, then Scaled
    Zach Supalla

    Keynote: Stop Writing Code
    Keith Casey

    Keynote: You Have Too Much Time
    Jeff Blankenburg

    Mastering Voice UX Featuring Amazon’s Echo (AKA Alexa)
    Chris Pauly

    A Microservices Architecture That Emphasizes Rapid Development
    Rob Hruska

    Microsoft Bot Framework: Hiking Up the Trail of Automation
    David Hauck

    The Millennials R Coming
    Heather Shapiro

    Node.JS Crash Course
    David Neal

    Not Just Arts & Crafts: A Developer’s Guide to Incorporating Lean UX Practices into the Development Process
    Rachel Krause

    Out With the Old, In With the New: A Comparison of Angular 1 and 2
    Tony Gemoll

    Pavlov Yourself!
    Alexandra Feldman

    React Native to the Rescue
    Josh Gretz

    React vs. Angular – Dawn of Changes
    John Ptacek

    ReactJS For Beginners
    Arthur Kay

    Ruby on Rails from 0 to Deploy in 60 Minutes
    Chris Johnson

    Ruby Writing Ruby – Campfire Tales of Metaprogramming
    Sara Gibbons

    Service Bus Summer Camp
    David Boike

    So Many Analytics Tools, Which One Is Right For Me?
    Jason Groom

    Start Your Own Business, Dammit!
    Terra Fletcher

    A Tale of Two Redesigns
    Jess Bertling

    Tell SQL Server Profiler To Take A Hike
    Jes Borland

    Understanding Git, Part 2
    Keith Dahlby

    UX Beyond the UI – How the Rest of Software Development Affects User Experience
    Joe Regan

    Why Your Site Is Slow
    Steve Persch

    Working From Whereever
    Aaron Douglas

    Log Parser – Transforming Plain Text Files

    This post describes how to solve a specific problem with Microsoft’s Log Parser tool.  For background on the tool (and lots of examples), start here.

    The Problem

    Given a file named MyLog.log that looks like this…

    ip=0.0.0.0 date=20160620 time=06:00:00 device=A23456789 log=00013
    ip=0.0.0.1 date=20160621 time=06:00:01 device=A13456789 log=00014
    ip=0.0.0.2 date=20160622 time=06:00:02 device=A12456789 log=00015
    ip=0.0.0.3 date=20160623 time=06:00:03 device=A12356789 log=00016
    ip=0.0.0.4 date=20160624 time=06:00:04 device=A12346789 log=00017
    ip=0.0.0.5 date=20160625 time=06:00:05 device=A12345789 log=00018
    ip=0.0.0.6 date=20160626 time=06:00:06 device=A12345689 log=00019
    ip=0.0.0.7 date=20160627 time=06:00:07 device=A12345679 log=00020
    ip=0.0.0.8 date=20160628 time=06:00:08 device=A12345678 log=00021
    ip=0.0.0.9 date=20160629 time=06:00:09 device=A123456789 log=00022

    …transform it into a tab-separated file with a header row.  Each field should include only the field value (and not the field name).

    Notice that the original file has no header, the fields are separated with spaces, and the field name is part of each field (e.g. "ip=").

    The Solution

    Step 1)

    logparser -i:TSV -iSeparator:space -headerRow:OFF
         "select * into 'MyLogTemp.log' from 'MyLog.log'"
         -o:TSV -oSeparator:space -headers:ON

    In this command, -i:TSV -iSeparator:space informs Log Parser that the input file is a space-separated text file, and -headerRow:OFF lets Log Parser know that the file has no headers.  Likewise, -o:TSV -oSeparator:space -headers:ON tells Log Parser to output a space-separated text file with headers.

    This produces a file named MyLogTemp.log with the following content:

    Filename RowNumber Field1 Field2 Field3 Field4 Field5
    MyLog.log 1 ip=0.0.0.0 date=20160620 time=06:00:00 device=A23456789 log=00013
    MyLog.log 2 ip=0.0.0.1 date=20160621 time=06:00:01 device=A13456789 log=00014
    MyLog.log 3 ip=0.0.0.2 date=20160622 time=06:00:02 device=A12456789 log=00015
    MyLog.log 4 ip=0.0.0.3 date=20160623 time=06:00:03 device=A12356789 log=00016
    MyLog.log 5 ip=0.0.0.4 date=20160624 time=06:00:04 device=A12346789 log=00017
    MyLog.log 6 ip=0.0.0.5 date=20160625 time=06:00:05 device=A12345789 log=00018
    MyLog.log 7 ip=0.0.0.6 date=20160626 time=06:00:06 device=A12345689 log=00019
    MyLog.log 8 ip=0.0.0.7 date=20160627 time=06:00:07 device=A12345679 log=00020
    MyLog.log 9 ip=0.0.0.8 date=20160628 time=06:00:08 device=A12345678 log=00021
    MyLog.log 10 ip=0.0.0.9 date=20160629 time=06:00:09 device=A123456789 log=00022

    This hasn’t done much.  In fact it has added some stuff that is not relevant (the Filename and RowNumber columns), while leaving the field names in each field and maintaining the space field separator.  However, it HAS added headers (Field1, Field2, etc.), which are needed for the second step.

    Step 2)

    logparser -i:TSV -iSeparator:space -headerRow:ON
         "select REPLACE_STR(Field1, 'ip=', '') AS ip,
                   REPLACE_STR(Field2, 'date=', '') AS date,
                   REPLACE_STR(Field3, 'time=', '') AS time,
                   REPLACE_STR(Field4, 'device=', '') AS device,
                   REPLACE_STR(Field5, 'log=', '') AS log
         into 'MyLogTransformed.log'
         from 'MyLogTemp.log'"
         -o:TSV -oSeparator:tab -headers:ON

    The input and output specifications in this command are similar to those in Step 1, except here the input file has headers (-headerRow:ON) and the output file is tab-separated (-oSeparator:tab) instead of space-separated.  The main difference is in the SELECT statement itself, where the REPLACE_STR function removes the field names from the field values and the AS clause assigns the desired header to each column of data.  Notice that the REPLACE_STR function uses the headers that were added in Step 1.

    This produces the final result in a file named MyLogTransformed.log:

    ip     date     time     device     log
    0.0.0.0     20160620     06:00:00     A23456789     00013
    0.0.0.1     20160621     06:00:01     A13456789     00014
    0.0.0.2     20160622     06:00:02     A12456789     00015
    0.0.0.3     20160623     06:00:03     A12356789     00016
    0.0.0.4     20160624     06:00:04     A12346789     00017
    0.0.0.5     20160625     06:00:05     A12345789     00018
    0.0.0.6     20160626     06:00:06     A12345689     00019
    0.0.0.7     20160627     06:00:07     A12345679     00020
    0.0.0.8     20160628     06:00:08     A12345678     00021
    0.0.0.9     20160629     06:00:09     A123456789     00022
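
    As an aside, if Log Parser is not available, the same transformation can be done with a few lines of C# (a minimal sketch, using the file names from this example; System.IO and System.Linq assumed):

    // Read name=value fields, then write a tab-separated file with a header row
    var rows = File.ReadLines("MyLog.log")
        .Select(line => line.Split(' ')
                            .Select(field => field.Split(new[] { '=' }, 2))
                            .ToArray())
        .ToList();

    using (var writer = new StreamWriter("MyLogTransformed.log"))
    {
        // Header row comes from the field names on the first line (ip, date, ...)
        writer.WriteLine(string.Join("\t", rows[0].Select(pair => pair[0])));
        foreach (var row in rows)
            writer.WriteLine(string.Join("\t", row.Select(pair => pair[1])));
    }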

    More Information

    See Log Parser’s built-in help for additional explanations of the Log Parser features used in the solution.  In particular, look at the following:

    logparser -h
    logparser -h -i:TSV
    logparser -h -o:TSV
    logparser -h FUNCTIONS REPLACE_STR

    Recommended Tool: Express Profiler for SQL Server Databases

    NOTE:  As I was writing up this post I discovered that SQL Profiler is deprecated as of the release of SQL Server 2016.  If this also affects the underlying SQL Server tracing APIs, it may cloud the long-term future of the Express Profiler.  For now, however, it is a tool that I recommend.

    Express Profiler is a simple Open Source alternative to the SQL Profiler that ships with the full SQL Server Management Studio.  This is particularly useful when working with SQL Server Express databases, as the Express version of the Management Studio does NOT include the SQL Profiler.

    Usage of the Express Profiler should be self-explanatory to anyone familiar with the SQL Profiler.

    Here are some details about Express Profiler from the project page:

    • ExpressProfiler (aka SqlExpress Profiler) is a simple and fast replacement for SQL Server Profiler with basic GUI
    • Can be used with both Express and non-Express editions of SQL Server 2005/2008/2008r2/2012/2014 (including LocalDB)
    • Tracing of basic set of events (Batch/RPC/SP:Stmt Starting/Completed, Audit login/logout, User error messages, Blocked Process report) and columns (Event Class, Text Data,Login, CPU, Reads, Writes, Duration, SPID, Start/End time, Database/Object/Application name) – both selectable
    • Filters on most data columns
    • Copy all/selected event rows to clipboard in form of XML
    • Find in "Text data" column
    • Export data in Excel’s clipboard format

    While I have found Express Profiler to be a good and useful tool, it is not as fully-featured as the SQL Profiler.  Here are some key "missing" features in Express Profiler:

    • No way to load a saved trace output, although that feature is on the roadmap for the tool.
    • No way to save trace output directly to a database table.
    • Fewer columns can be included in the trace output, and many fewer events can be traced.  In my experience, however, the columns and events that I find myself using in most cases are all available.
    • As there are fewer columns in the output, there are fewer columns on which to filter.  Again, the most common/useful columns and events are covered.
    • No way to create trace templates for use with future traces.

    Despite these limitations, I recommend this tool for situations where the full SQL Profiler is not available.

    Installing Docker on Windows

    I recently set up Docker on both Windows 10 and Windows 7.  In both cases, it was a slightly bumpy experience, so I am recording the steps I followed here.

    Windows 10

    Machine Specifications
    • 64-bit Windows 10
    • VT-X/AMD-v support enabled in the BIOS
    • Hyper-V installed
    • Latest version of VirtualBox (version 5.0.16 at the time of this writing) installed
    Process

    STEP 1

    Download and install Docker Toolbox.  This is straightforward, and detailed instructions for doing this are available online.

    STEP 2

    Docker on Windows requires VirtualBox in order to run a lightweight Linux virtual machine.  VirtualBox and Windows Hyper-V are mutually exclusive technologies; you cannot use both at the same time.  Therefore, if Hyper-V is installed and enabled on your machine, you should disable it by following the directions found here.

    STEP 3

    The Docker startup script checks for the presence of Hyper-V, and halts if it is found.  It does NOT check the enabled/disabled state of Hyper-V.  So, if Hyper-V is installed on your computer, add a parameter to the Docker startup script to avoid the Hyper-V check.

    1. Locate the file C:\Program Files\Docker Toolbox\start.sh, and make a backup copy.
    2. Open the file C:\Program Files\Docker Toolbox\start.sh for editing.
    3. Look for this line: "${DOCKER_MACHINE}" create -d virtualbox "${VM}"
    4. Add a parameter to the line: "${DOCKER_MACHINE}" create --virtualbox-no-vtx-check -d virtualbox "${VM}"
    5. Save the file.

    STEP 4

    The Docker Toolbox installation adds an icon to the desktop labeled “Docker Quickstart Terminal”.  Click on this icon to run Docker for the first time. 

    At this point the script will create the Docker virtual machine and attempt to configure it.  If the configuration proceeds successfully, you will have a running Docker instance.

    In my case, configuration failed.  I received the following error, which seems to be not uncommon:

    Error creating machine: Error in driver during machine creation: Unable to start the VM: C:\Program Files\Oracle\VirtualBox\VBoxManage.exe startvm default --type headless failed:

    VBoxManage.exe: error: Failed to open/create the internal network 'HostInterfaceNetworking-VirtualBox Host-Only Ethernet Adapter #5' (VERR_INTNET_FLT_IF_NOT_FOUND).

    VBoxManage.exe: error: Failed to attach the network LUN (VERR_INTNET_FLT_IF_NOT_FOUND)

    VBoxManage.exe: error: Details: code E_FAIL (0x80004005), component ConsoleWrap, interface IConsole

    Details: 00:00:03.020271 Power up failed (vrc=VERR_INTNET_FLT_IF_NOT_FOUND, rc=E_FAIL (0X80004005))

    To fix this error, I followed the instructions at https://caffinc.github.io/2015/11/fix-vbox-network/ to change the properties of the VirtualBox network adapter.  The key instructions are repeated here:

    I opened my Control Panel > Network & Internet > Network Connections and located my VirtualBox Host-Only Network adapter, called VirtualBox Host-Only Network #6 on my machine. I right-clicked it and selected the Properties, and checked the un-checked VirtualBox NDIS6 Bridged Networking Driver and started VirtualBox again.

    Then I used the VirtualBox Manager to delete the Docker virtual machine (named "default") and delete all associated files.

    Once that additional configuration was complete, I was able to re-run the “Docker Quickstart Terminal” successfully.

    Windows 7

    Machine Specifications
    • 64-bit Windows 7
    • VT-X/AMD-v support enabled in the BIOS.
    • Latest version of VirtualBox (version 5.0.16 at the time of this writing) installed
    Process

    STEP 1

    Download and install Docker Toolbox.

    STEP 2

    The Toolbox installation adds an icon to the desktop labeled “Docker Quickstart Terminal”.  Click on this icon to run Docker for the first time.  

    At this point the script will create the Docker virtual machine and attempt to configure it.  If the configuration proceeds successfully, you will have a running Docker instance.

    My configuration failed.  I received the following error:

    Running pre-create checks…
    Creating machine…
    (default) Copying C:\Users\mlichtenberg\.docker\machine\cache\boot2docker.iso to
    C:\Users\mlichtenberg\.docker\machine\machines\default\boot2docker.iso…
    (default) Creating VirtualBox VM…
    (default) Creating SSH key…
    (default) Starting the VM…
    (default) Check network to re-create if needed…
    (default) Windows might ask for the permission to create a network adapter. Some
    times, such confirmation window is minimized in the taskbar.
    (default) Creating a new host-only adapter produced an error: C:\Program Files\O
    racle\VirtualBox\VBoxManage.exe hostonlyif create failed:
    (default) 0%…
    (default) Progress state: E_INVALIDARG
    (default) VBoxManage.exe: error: Failed to create the host-only adapter
    (default) VBoxManage.exe: error: Assertion failed: [!aInterfaceName.isEmpty()] a
    t ‘F:\tinderbox\win-5.0\src\VBox\Main\src-server\HostNetworkInterfaceImpl.cpp’ (
    74) in long __cdecl HostNetworkInterface::init(class com::Bstr,class com::Bstr,c
    lass com::Guid,enum __MIDL___MIDL_itf_VirtualBox_0000_0000_0036).
    (default) VBoxManage.exe: error: Please contact the product vendor!
    (default) VBoxManage.exe: error: Details: code E_FAIL (0x80004005), component Ho
    stNetworkInterfaceWrap, interface IHostNetworkInterface
    (default) VBoxManage.exe: error: Context: "enum RTEXITCODE __cdecl handleCreate(
    struct HandlerArg *)" at line 71 of file VBoxManageHostonly.cpp
    (default)
    (default) This is a known VirtualBox bug. Let’s try to recover anyway…
    Error creating machine: Error in driver during machine creation: Error setting u
    p host only network on machine start: The host-only adapter we just created is n
    ot visible. This is a well known VirtualBox bug. You might want to uninstall it
    and reinstall at least version 5.0.12 that is is supposed to fix this issue

    To attempt to fix this error, I first followed the instructions at https://caffinc.github.io/2015/11/fix-vbox-network/ to change the properties of the VirtualBox network adapter.  The key instructions are repeated here:

    I opened my Control Panel > Network & Internet > Network Connections and located my VirtualBox Host-Only Network adapter, called VirtualBox Host-Only Network #6 on my machine. I right-clicked it and selected the Properties, and checked the un-checked VirtualBox NDIS6 Bridged Networking Driver and started VirtualBox again.

    This did NOT work.  However, I found additional advice online that suggested doing the opposite (checking the “VirtualBox NDIS6 Bridged Networking Driver” instead of unchecking), and then disabling and enabling the network adapter.

    Disabling and re-enabling the network adapter may have been the key to resolving the error, but I cannot say for sure.  In any case, I had success after 1) returning the properties of the network adapter to their original state and 2) restarting the network adapter.

    Once the network adapter was reset, I used the VirtualBox Manager to delete the Docker virtual machine (named "default") and delete all associated files.

    Once the additional configuration was complete, I was able to re-run the “Docker Quickstart Terminal” successfully.

    Testing / Validation

    On both Windows 7 and Windows 10, once the Docker installation completes successfully you will be left with a terminal window opened to a Docker command line.

    An Internet connection is required for the following.

    To test a basic Docker image, run the following command:

    docker run hello-world

    You will receive output that looks like the following:

    Unable to find image ‘hello-world:latest’ locally
    511136ea3c5a: Pull complete
    31cbccb51277: Pull complete
    e45a5af57b00: Pull complete
    hello-world:latest: The image you are pulling has been verified.
    Important: image verification is a tech preview feature and should not be
    relied on to provide security.
    Status: Downloaded newer image for hello-world:latest
    Hello from Docker.
    This message shows that your installation appears to be working correctly.

    To generate this message, Docker took the following steps:
    1. The Docker client contacted the Docker daemon.
    2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
       (Assuming it was not already locally available.)
    3. The Docker daemon created a new container from that image which runs the
       executable that produces the output you are currently reading.
    4. The Docker daemon streamed that output to the Docker client, which sent it
       to your terminal.

     

    That is it.  For next steps with Docker, reading the User’s Guide is a good place to start. 

    If you want to skip the specifics of Docker itself and dive right in to using a Docker container, pre-built Docker images for popular technologies such as nginx, Mongo, MySQL, Node, PostgreSQL, SOLR, ElasticSearch, and many others can be found at the Docker Hub.

    How To Restore a Database in a SQL Server AlwaysOn Availability Group

    Background

    There are two clustered servers running SQL Server 2014.  The servers host production databases as well as databases used for QA and testing.  One AlwaysOn Availability Group has been created for the production databases and one for the QA databases.  The same server serves as the primary for both AlwaysOn Availability Groups.

    One of the production databases needs to be restored from backup.  Backups are taken from the secondary server, not the primary.  The backups should be restored to the same server from which they were taken.

    The following tasks need to be completed in order to restore the database:

    • Make the secondary server from which the backups were taken the primary server
    • Remove the database to be restored from the AlwaysOn Availability Group
    • Restore the database
    • Add the database back into the AlwaysOn Availability Group

    Following are detailed instructions for completing these tasks.
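
    (The same sequence can also be scripted.  Here is a rough C#/ADO.NET sketch of the four tasks; the connection strings, availability group name, database name, and backup path are all assumptions, and re-seeding of the secondary is omitted.  The SSMS steps below are the recommended path.)

    // Hypothetical helper: run one T-SQL statement against a server
    void ExecuteSql(string connectionString, string sql)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(sql, conn) { CommandTimeout = 0 })
        {
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }

    // Task 1: run against the SECONDARY server to make it the new primary
    ExecuteSql(secondaryConn, "ALTER AVAILABILITY GROUP [ProductionAG] FAILOVER");

    // Task 2: remove the database from the availability group (on the new primary)
    ExecuteSql(primaryConn, "ALTER AVAILABILITY GROUP [ProductionAG] REMOVE DATABASE [MyDatabase]");

    // Task 3: restore the last full backup; RESTORE LOG statements for the
    // transaction log backups would follow
    ExecuteSql(primaryConn, @"RESTORE DATABASE [MyDatabase]
        FROM DISK = N'\\backupshare\MyDatabase.bak' WITH REPLACE");

    // Task 4: add the database back into the availability group
    ExecuteSql(primaryConn, "ALTER AVAILABILITY GROUP [ProductionAG] ADD DATABASE [MyDatabase]");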

    Task 1: Switch the Primary and Secondary Servers

    1) Connect to both servers in the cluster in SQL Server Management Studio.

    2) On the Secondary server, expand the "Availability Groups" folder under the "AlwaysOn High Availability" folder.

    3) Right-click on the availability group containing the database to be restored and select "Failover…" from the context menu.  Click “Next >”.

    4) Select the new primary server.  Click “Next >”.

    5) Verify the choices and click “Finish”.

    6) Repeat steps 3-5 for the remaining availability group.

    Task 2: Remove the Database from the Availability Group

    A restore operation cannot be performed on a database that is part of an availability group, so the next task is to remove the database from the group.

    1) On the new primary server, expand the list of Availability Databases for the availability group.

    2) Right-click the database to be restored and select "Remove Database from Availability Group…" from the context-menu. 

    3) Click “OK” to remove the database from the availability group. 

    Task 3: Restore the Database

    1) In the “Databases” folder on the primary server, right-click on the database to be restored and select "Properties" to open the “Database Properties” dialog.  Select the “Options” page, scroll down to the “Restrict Access” option, and change the value from MULTI_USER to SINGLE_USER.  Click “OK”.

    2) In the “Databases” folder on the primary server, right-click on the database to be restored and select Tasks->Restore->Database… from the context menu.

    3) On the “General” page of the “Restore Database” dialog, select the last Full backup and all Transaction log backups.

    4) Select the “Options” page of the “Restore Database” dialog and click the "Overwrite the existing database (WITH REPLACE)" option.  Click "OK".

    Task 4: Add the Database Back to the Availability Group

    After the restore of the database to the new primary server is complete, it can be put back into the availability group.

    1) In the “Databases” folder on the secondary server, right-click the database and select "Delete" from the context menu.  Click “OK”.

    2) Right-click “Availability Databases” in the availability group on the primary server and select "Add Database…" from the context menu.  Click “Next >”.

    3) Select the database to be added to the group and click "Next >".

    4) Select "Full" as the data synchronization preference.  This will take a full backup of the database on the primary and restore it on the secondary server(s).  Specify a network location accessible to the primary and all secondary servers in which to place the backup files.  Click "Next >".

    5) Use the “Connect…” button to establish a connection to the secondary server(s).  Click "Next >".

    6) The "Add Database to Availability Group" wizard will validate all of the settings for the new availability group database.  When it completes, click "Next >".

    7) Verify the choices and click "Finish" to add the database to the availability group.

    Final Tasks

    The restore of a database in an AlwaysOn Availability Group is now complete.

    At this point it is recommended to immediately perform a backup of the restored database on the secondary server.  This will establish a new backup chain for the database.

    The backup files created during synchronization of the primary and secondary server(s) can be deleted.  The location of those files was specified in Step 4 of the “Add the Database Back to the Availability Group” task.

    Note that the restored database should now be back in MULTI_USER mode.  Recall that it had been set to SINGLE_USER in Step 1 of the “Restore the Database” task.