Dev Up Conference 2017 – Session Resources

This week I attended the Dev Up Conference in St. Louis.  I thought that the (new) venue, food, and speakers were all excellent, and improved upon past editions of the conference.  Kudos to all involved; they must have put in a lot of hard work.

As usual after attending a conference such as this, I try to gather links to as many of the session resources as I can find (on Twitter, YouTube, blogs, and so on) and share them here on my blog.  I do this because I figure that I am not the only one who had to skip great sessions because they were scheduled at the same time as other equally great sessions.

So without further ado, here is the list of all sessions from this year’s Dev Up Conference, with as many links to additional information as I could find.  Apologies for any that I missed.

Also, a number of the sessions were recorded.  I assume that they will be posted online by the Dev Up organizers, so keep an eye on the conference web site for those.

.CSS {Display: What?}
Martine Dowden

.NET and Couchbase: Using NoSQL Is Easier Than You Think
Don Schenck

.NET, Linux and Microservices Architecture
Don Schenck

1 Billion Records IS NOT BIG DATA PEOPLE!
Steve Howard

5 Popular Choices for NoSQL on a Microsoft Platform
Matthew Groves

A Brisk Stroll Through AzureML Studio
Kevin Queen

A feature based approach to software development
Ryan Lanciaux

A Guide to JavaScript’s Scary Side
Jonathan Mills

A Lap Around Xamarin.Forms
Douglas Starnes

A Skeptics Guide to Functional Style JavaScript
Jonathan Mills

Accessibility Cookbook: 10 Easy Recipes
Martine Dowden

Adding Realtime Features to Your Applications with SignalR
Javier Lozano

Agile Delivery in a Waterfall World
John Gobble

Agile Failures: Stories From The Trenches
Philip Japikse

Agile Metrics That Matter
Clint Edmonson

Agile: You Keep Using That Word…
Philip Japikse

All The New Things: A Madcap Tour of the latest in Microsoft Web Development
Brad Tutterow

An Entrepreneur’s Tale
Randy Walker

An Extended Explanation of Caching
Tom Cudd

An Introduction to Microservices
Mike Green

Angular vs. React: A live demonstration, comparison, and discussion
Kevin Grossnicklaus

Angular, the ASP.NET Pitch
Ed Charbeneau

ASP.NET Core + React Equals Awesome
Lee Brandt

Authentication and Security Strategies for the Modern Web
Spencer Schneidenbach

Azure SQL Data Warehouse, Cloud BI
Randy Walker

Becoming an Architect
Ken Sipe

Beginner Reactive Programming with RxJS
Cory Rylan

Between Two Form Tags
Danielle Cooley

Build a JavaScript Dev Environment in 1 Hour
Cory House

Building a Chat Bot with
Erin Page

Building A Highly Scalable Service that Survived A Super Bowl
Keith Elder

Building Powerful Applications with Angular and TypeScript
David Giard

Building Reusable UI Components in ASP.NET Core MVC
Scott Addie

Building Your Evil(?) Empire with Azure Functions
Bryan Soltis

Bus Accident Management
James West

Career Management – Better than Career Development
John Maglione

Career Paths Beyond Sr. Developer
Jim Drewes

Cloud Networking: What’s Underneath?
James Nugent

Code Is Communication
Steven Hicks

Compromise Less and Deliver More with Xamarin
David Ortinau

Confronting Your Fears: Entity Framework Performance Deep-dive
Mitchel Sellers

Continuous Delivery at Enterprise Scale
Jason Whittington

Custom Middleware & Microservices with ASP.NET Core
Ondrej Balas

Data Science Platform Architecture
Ryan Metcalf

Database DevOps in Visual Studio 2017 Enterprise with ReadyRoll Core
Ronnie Hicks

Dockerize Your Development
Lee Brandt

Domain Driven Design: The Good Parts
Jimmy Bogard

Effective Data Visualization
David Giard

Electron: Desktop Development for Web Developers
Chris Woodruff

Everything I Didn’t Know About JavaScript
Brad Tutterow

Fear and (Self) Loathing in IT – A Healthy Discussion on Imposter Syndrome
Angela Dugan

Feed Your Inner Data Scientist: JavaScript Tools for Data Visualization and Filtering
Doug Mair

Forget Velocity, Let’s Talk Acceleration
Jessica Kerr

From C# 6 to C# 7, Then and Now!
David Pine

From Developer to Data Scientist
Gaines Kergosien

Getting Started with Machine Learning, for Non-Data Scientists
Yung Chou

Git Demystified
Kent Peek

Giving Clarity to LINQ Queries by Extending Expressions
Ed Charbeneau

Growing a Dev Team from Bootstrap to Enterprise
Scott Connerly

Have Your Best Season Yet: Becoming a (Microsoft) MVP
Lisa Anderson

HoloLens Mixed Reality for Fun & Profit
Gaines Kergosien

How do You Measure up? Collect the Right Metrics for the Right Reasons
Angela Dugan

How Mobile Web Works at Twitch
Matt Follett

I, for One, Welcome Our Robot Overlords: Intro to the Bot Framework
John Alexander

Implementing a Modern Web Stack in a Legacy Environment
James West

Implementing Web Security in Your ASP.NET Applications
Javier Lozano

Intellectual Property Fundamentals for the Technologist
Jeff Strauss

Intro to Hacking with the Raspberry Pi
Sarah Withee

Intro to Xamarin
Ryan Overton

Introduction to Amazon AWS
Brian Korzynski

Introduction to Angular
Muljadi Budiman

Introduction to Asynchronous Code in .NET
Bill Dinger

Introduction to Online Security
Michael Dowden

Introduction to the D3.js visualization library
Bryan Nehl

Introduction To the Microsoft Bot Framework
Becky Bertram

Javascript Asynchronous Roundup (Promises Promises…)
Mark Meadows

JavaScript Futures: ES2017 and the Road Ahead
Jeff Strauss

Jewelbots: How to Get More Girls Coding!
Jennifer Wadella

Kotlin: What’s in it For You
Douglas Starnes

Learning the Language of HTTP for a Better Data Experience in Your Mobile Apps
Chris Woodruff

Let’s Talk About Mental Health
Arthur Doler

Leveraging Microsoft Azure to enable your Internet of Things
Ralph Wheaton

Linux and Windows Containers, Not All Are Created Equal
Yung Chou

Love and Hate, Having Conversations About Going to the Cloud
Bryan Roberts

Make .NET Great Again!
Sam Basu

Managing Millennials
Jim Drewes

Maximize Professional Growth By Doing Scary Things
Steven Hicks

Mechanics and Moxie: Modernizing Quality Assurance
Kylie Schleicher

Microservice-Powered Applications – It worked for Voltron, it can work for you!
Bryan Soltis

Microservices – A Pattern for Success
David Davids

Microsoft Azure Makes Machine Learning Accessible and Affordable
Douglas Starnes

Migrating from desktop to serverless with AWS
Bryan Nehl

Mobile Development For Web Developers
Justin James

Moving into mobile with Angular 2 and Ionic Framework
Mike Hamilton

Moving into mobile with React Native
Mike Hamilton

Naked and Not Afraid: How to Better Serve Your Clients
Rick Petersen

Neural Networks: The Good Bits
Chase Aucoin

Next-level test-driven development
Alison Hawke

Optimizing Application Performance
Jason Turan

Planet scale data with CosmosDB
Bryan Roberts

Planning for Failure
Jesse Phelps

Practical Security Practices: Threat Modeling
Josh Gillespie

Productivity: How to Get Things Done in this Digital Age
Keith Elder

React for the Uninitiated
Mark Meadows

Refactoring Towards Resilience
Jimmy Bogard

ReSharper: Discover the Secrets
Ondrej Balas

Respond To and Troubleshoot Production Incidents Like an SA
Tom Cudd

Reverse Engineering a Bluetooth Lightbulb
Jesse Phelps

Securing ASP.NET Core APIs and Websites with IdentityServer4
Jeffrey St. Germain

Securing your Applications with Azure AD
Mike Green

Self-Assembling, Self-Healing Systems in the AWS cloud
James Nugent

Serilog: Logging All Grown Up
Brian Korzynski

Serverless JavaScript OMG
Burke Holland

Should I be a generalist or a specialist?
Eric Potter

Should I make the Transition to ASP.NET MVC Core? Will it Hurt?
Mitchel Sellers

Software Development to Leadership
Cori Kristoff

SQL Server For The .NET Developer
Clayton Hoyt

SQL Server Power Hour With Dan and Kathi
Dan Guzman

Strategies for learning React
Ryan Lanciaux

Survival Guide to the Robot Apocalypse – Intro to Deep Learning for Developers
Steve Howard

Swift start on iOS development
Muljadi Budiman

Take Each Day and Work on Making it Better
Dean Furness

Taking Azure Application Insights to the Next Level
Ralph Wheaton

Taming the Tentacles of Octopus
Kevin Fitzpatrick

Teaching Kids Programming
Sarah Phelps

The Business Case for UX
Danielle Cooley

The Hardest Part of Being an Architect: A Death Star Story
Rick Petersen

The Lean & Agile Transformation Playbook
Clint Edmonson

The Modern ASP.NET Tech Stack!
Sam Basu

The Power of Secrets
Sarah Withee

The Reusable JavaScript Revolution
Cory House

The Saboteur in Your Retrospectives: How Your Brain Works Against You
Arthur Doler

The Thrill of the Hunt: The Return to Exploratory Testing
Kylie Schleicher

The Two Question Code Quiz: How to Interview Programmers Effectively
Scott Connerly

To Infinity and Beyond: Build Serverless APIs
Bryan Roberts

TypeScript — JavaScript Reimagined
David Pine

Understanding Azure Resource Templates
Paul Hacker

Unit Testing Strategies & Patterns in C#
Bill Dinger

Visual Studio Code Can Do THAT?!?
Burke Holland

What C# Programmers Need to Know About Pattern Matching
Eric Potter

What Is Data Science?
Ryan Metcalf

What Makes a Good Developer? – Increasing Your Value in a Polyglot World
Eric Lynn

What’s New in ASP.NET Core 2.0?
Scott Addie

What’s New in Java 9
Billy Korando

What’s New in VS 2017 and C# 7
Doug Mair

Why Aren’t There More Women Developers?
Jennifer Wadella

Windows IoT Core Development on a Raspberry Pi
Kevin Grossnicklaus

You Got Your Dev in My Ops, You Got Your Ops in My Dev
Paul Hacker

Your JavaScript Needs Types
Spencer Schneidenbach


Data Access Framework Comparison


For some time now I have been working on a project that utilizes a custom-built data access framework, rather than popular ORM frameworks such as Entity Framework or NHibernate.

While the custom framework has worked well for the project, I had questions about it.  For example, it uses stored procedures to implement basic CRUD operations, and I wondered if inline parameterized SQL statements might perform better.  Also, I wondered about the performance of the custom framework compared to the leading ORMs.

Besides my questions about the custom framework, I recognized the importance of having at least a basic understanding of how to use the other ORM frameworks.

In order to answer my questions about the custom framework and to gain some practical experience with the other ORMs, I created a simple web application that uses each of those frameworks to perform basic CRUD operations.  While executing the CRUD operations, the application times them and produces a summary report of the results.

The code for the test application can be found at

NOTE: I assume that most readers are familiar with the basics of Entity Framework and NHibernate, so I will not provide an overview of them here.

Using the custom framework is similar to Entity Framework and NHibernate’s “database-first” approach.  Any project that uses the library references a single assembly containing the base functionality of the library.  A T4 template is used to generate additional classes based on tables in a SQL Server database.  Some of the classes are similar to EF’s Model classes and NHibernate’s Domain classes.  The others provide the basic CRUD functionality for the domain/model classes. 

For these tests I made a second copy of the custom framework classes that provide the basic CRUD functionality, and edited them to replace the CRUD stored procedures with parameterized SQL statements.

The custom framework includes much less overhead on top of ADO.NET than the popular ORMs, so I expected the tests to show that it was the best-performing framework.  The question was, how much better?

In the rest of this post, I will describe the results of my experiment, as well as some of the optimization tips I learned along the way.  The topics covered are listed below.

Test Application Overview
“Out-of-the-Box” Performance
Entity Framework Performance After Code Optimization
     AutoDetectChangesEnabled and DetectChanges()
     Recycling the DbContext
NHibernate Performance After Configuration Optimization
     What’s Up with Update Performance in NHibernate?
Results Summary

Test Application Overview

    A SQL Express database was used for the tests.  The data model is borrowed from Microsoft’s Contoso University sample application.  Here is the ER diagram for the database:



The database was pre-populated with sample data.  The number of rows added to each table was:

Department: 20
Course: 200
Person: 100000
Enrollment: 200000

This was done because SQL Server’s optimizer will behave differently with an empty database than it will with a database containing data, and I wanted the database to respond as it would in a “real-world” situation.  For the tests, all CRUD operations were performed against the Enrollment table.

Five different data access frameworks were tested:

  1. Custom framework with stored procedures
  2. Custom framework with parameterized SQL statements
  3. Entity Framework
  4. NHibernate
  5. Fluent NHibernate

The testing algorithm follows the same pattern for each of the frameworks:

Start timer
For a user-specified number of iterations
     Submit an INSERT statement to the database
     Save the identifier of the new database record
End timer
Start timer
For each new database record identifier
     Submit a SELECT statement to the database
End timer
Start timer
For each new database record identifier
     Submit an UPDATE statement to the database
End timer
Start timer
For each new database record identifier
     Submit a DELETE statement to the database
End timer

Note that after the test algorithm completes, the database is in the same state as when the tests began.
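As a rough illustration of that pattern, a framework-agnostic harness might look like the sketch below.  It is not the test application's actual code: CrudTimer and its four delegate parameters are hypothetical names, and each delegate stands in for whatever single-row call a given framework provides.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical harness: the delegates stand in for one framework's CRUD calls.
public static class CrudTimer
{
    public static Dictionary<string, TimeSpan> Run(
        int iterations,
        Func<int> insert,        // performs one INSERT and returns the new row's identifier
        Action<int> select,      // performs one SELECT by identifier
        Action<int> update,      // performs one UPDATE by identifier
        Action<int> delete)      // performs one DELETE by identifier
    {
        var results = new Dictionary<string, TimeSpan>();
        var ids = new List<int>(iterations);

        // Time the INSERTs, remembering each new identifier for the later passes.
        Stopwatch timer = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            ids.Add(insert());
        timer.Stop();
        results["Insert"] = timer.Elapsed;

        results["Select"] = TimeEach(ids, select);
        results["Update"] = TimeEach(ids, update);
        results["Delete"] = TimeEach(ids, delete);
        return results;
    }

    // Times one operation applied to every saved identifier.
    private static TimeSpan TimeEach(List<int> ids, Action<int> operation)
    {
        Stopwatch timer = Stopwatch.StartNew();
        foreach (int id in ids)
            operation(id);
        timer.Stop();
        return timer.Elapsed;
    }
}
```

Because every inserted identifier is deleted in the final pass, a harness shaped like this leaves the table with the same rows it started with.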

To see the actual code, visit

"Out-of-the-Box" Performance

I first created very basic tests for each framework. Essentially, these were the “Hello World” versions of the CRUD code for each framework.  No optimization was attempted.

Here is an example of the code that performs the INSERTs for the custom framework.  There is no difference between the version with stored procedures and the version without, other than the namespace from which EnrollmentDAL is instantiated.

    DA.EnrollmentDAL enrollmentDAL = new DA.EnrollmentDAL();

    for (int x = 0; x < Convert.ToInt32(iterations); x++) {
        DataObjects.Enrollment enrollment = enrollmentDAL.EnrollmentInsertAuto
            (null, null, 101, 1, null);
    }

      And here is the equivalent code for Entity Framework:

    using (SchoolContext db = new SchoolContext()) {
        for (int x = 0; x < Convert.ToInt32(iterations); x++) {
            Models.Enrollment enrollment = new Models.Enrollment {
                CourseID = 101, StudentID = 1, Grade = null };
            // Add() the entity to the context, then SaveChanges()
        }
    }


    The code for NHibernate and Fluent NHibernate is almost identical.  Here is the NHibernate version:

using (var session = NH.NhibernateSession.OpenSession("SchoolContext")) {
    var course = session.Get<NHDomain.Course>(101);
    var student = session.Get<NHDomain.Person>(1);

    for (int x = 0; x < Convert.ToInt32(iterations); x++) {
        var enrollment = new NHDomain.Enrollment {
            Course = course, Person = student, Grade = null };
        // Persist the entity through the session, once per iteration
    }
}



The SELECT, UPDATE, and DELETE code for each framework followed similar patterns. 

    NOTE: A SQL Server Profiler trace proved that the actual interactions with the database were the same for each framework.  The same database connections were established, and equivalent CRUD statements were submitted by each framework.  Therefore, any measured differences in performance are due to the overhead of the frameworks themselves.

        Here are the results of the tests of the “out-of-the-box” code:

      Framework              Operation     Elapsed Time (seconds)
      Custom                 Insert        5.9526039
      Custom                 Select        1.9980745
      Custom                 Update        5.0850357
      Custom                 Delete        3.7785886

      Custom (no SPs)        Insert        5.2251725
      Custom (no SPs)        Select        2.0028176
      Custom (no SPs)        Update        4.5381994
      Custom (no SPs)        Delete        3.7064278

      Entity Framework       Insert        1029.5544975
      Entity Framework       Select        8.6153572
      Entity Framework       Update        2362.7183765
      Entity Framework       Delete        25.6118191

      NHibernate             Insert        9.9498188
      NHibernate             Select        7.3306331
      NHibernate             Update        274.7429862
      NHibernate             Delete        12.4241886

      Fluent NHibernate      Insert        11.796126
      Fluent NHibernate      Select        7.3961941
      Fluent NHibernate      Update        283.1575124
      Fluent NHibernate      Delete        10.791648

      NOTE: For all tests, each combination of Framework and Operation was executed 10000 times.   Looking at the first line of the preceding results, this means that the Custom framework took about 5.95 seconds to perform 10000 INSERTs.

      As you can see, both versions of the custom framework outperformed Entity Framework and NHibernate.  In addition, the version of the custom framework that used parameterized SQL was very slightly faster than the version that used stored procedures for INSERT, UPDATE, and DELETE (SELECT times were essentially identical).  Most interesting, however, was the performance for INSERT and UPDATE operations.  Entity Framework took roughly 17 minutes for the INSERTs and 39 minutes for the UPDATEs, and both versions of NHibernate needed more than four and a half minutes for the UPDATEs, compared with 5 to 6 seconds for the custom framework.  Clearly, some optimization and/or configuration changes were needed.
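To put those gaps in perspective, the elapsed times in the table can be converted into a per-operation average and a slowdown factor relative to the custom framework.  The helper below is only a sketch of that arithmetic; TimingMath and its method names are hypothetical and are not part of the test application.

```csharp
using System;

public static class TimingMath
{
    // Average milliseconds per operation for a batch timed in seconds.
    public static double MillisecondsPerOperation(double elapsedSeconds, int operations)
    {
        return elapsedSeconds * 1000.0 / operations;
    }

    // How many times slower one elapsed time is than a baseline elapsed time.
    public static double SlowdownFactor(double elapsedSeconds, double baselineSeconds)
    {
        return elapsedSeconds / baselineSeconds;
    }
}
```

Using the table's figures, MillisecondsPerOperation(5.9526039, 10000) works out to roughly 0.6 ms per INSERT for the custom framework, and SlowdownFactor(1029.5544975, 5.9526039) comes to about 173, meaning the out-of-the-box Entity Framework INSERTs took about 173 times as long.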

      Entity Framework Performance After Code Optimization

      AutoDetectChangesEnabled and DetectChanges()  

      It turns out that much of Entity Framework’s poor performance appears to have been due to the nature of the tests themselves.  Information on Microsoft’s MSDN website notes that if you are tracking a lot of objects in your DbContext object and call methods like Add() and SaveChanges() many times in a loop, your performance may suffer.  That scenario describes the test almost perfectly.

      The solution is to turn off Entity Framework’s automatic detection of changes by setting AutoDetectChangesEnabled to false and explicitly calling DetectChanges().  This instructs Entity Framework to only detect changes to entities when explicitly instructed to do so.  Here is what the updated code for performing INSERTs with Entity Framework looks like:

      using (SchoolContext db = new SchoolContext()) {
          db.Configuration.AutoDetectChangesEnabled = false;

          for (int x = 0; x < Convert.ToInt32(iterations); x++) {
              Models.Enrollment enrollment = new Models.Enrollment {
                  CourseID = 101, StudentID = 1, Grade = null };
              // Add() the entity, call db.ChangeTracker.DetectChanges(), then SaveChanges()
          }
      }

      Here are the results of tests with AutoDetectChangesEnabled set to false:

      Framework           Operation    Elapsed Time (seconds)
      Entity Framework    Insert       606.5569332
      Entity Framework    Select       6.4425741
      Entity Framework    Update       605.6206616
      Entity Framework    Delete       21.0813293

      As you can see, INSERT and UPDATE performance improved significantly, and SELECT and DELETE performance also improved slightly.

      Note that turning off AutoDetectChangesEnabled and calling DetectChanges() explicitly in all cases WILL slightly improve the performance of Entity Framework.  However, it could also cause subtle bugs.  Therefore, it is best to only use this optimization technique in very specific scenarios and allow the default behavior otherwise.

      Recycling the DbContext

      While Entity Framework performance certainly improved by changing the AutoDetectChangesEnabled value, it was still relatively poor. 

      Another problem with the tests is that the same DbContext was used for every iteration of an operation (i.e. one DbContext object was used for all 10000 INSERT operations).  This is a problem because the context maintains a record of all entities added to it during its lifetime.  The effect of this was a gradual slowdown of the INSERT (and UPDATE) operations as more and more entities were added to the context.

      Here is what the Entity Framework INSERT code looks like after modifying it to periodically create a new Context:

      for (int x = 0; x < Convert.ToInt32(iterations); ) {
          // Use a new context after every 100 Insert operations
          using (SchoolContext db = new SchoolContext()) {
              db.Configuration.AutoDetectChangesEnabled = false;

              int count = 0;
              for (int y = x; y < Convert.ToInt32(iterations); y++) {
                  Models.Enrollment enrollment = new Models.Enrollment {
                      CourseID = 101, StudentID = 1, Grade = null };
                  // Add() the entity, call DetectChanges(), then SaveChanges()

                  x++;        // total INSERTs completed so far
                  count++;    // INSERTs completed on this context
                  if (count >= 100) break;
              }
          }
      }

      And here are the results of the Entity Framework tests with the additional optimization added:

      Framework            Operation     Elapsed Time (seconds)
      Entity Framework     Insert        14.7847024
      Entity Framework     Select        5.5516514
      Entity Framework     Update        13.823694
      Entity Framework     Delete        10.0770142

      Much better!  The time to perform the SELECT operations was little changed, but the DELETE time was reduced by half, and the INSERT and UPDATE times decreased from a little more than 10 minutes to about 14 seconds.
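Why recycling the context helps so much can be illustrated with a toy model.  The sketch below is not Entity Framework's implementation; ToyContext and ContextRecyclingDemo are hypothetical classes that assume only one thing, namely that every save has to look at every entity the context is currently tracking.

```csharp
using System;
using System.Collections.Generic;

// Toy stand-in for a change-tracking context (an assumption, not EF's real behavior).
public sealed class ToyContext
{
    private readonly List<object> _tracked = new List<object>();

    // Running total of entities examined across all saves on this context.
    public long EntitiesScanned { get; private set; }

    public void AddAndSave(object entity)
    {
        _tracked.Add(entity);
        EntitiesScanned += _tracked.Count;   // each save scans the whole tracked set
    }
}

public static class ContextRecyclingDemo
{
    // Total scan work for `inserts` operations when a fresh context
    // is created after every `batchSize` operations.
    public static long ScanCost(int inserts, int batchSize)
    {
        long total = 0;
        for (int done = 0; done < inserts; done += batchSize)
        {
            var context = new ToyContext();
            int batch = Math.Min(batchSize, inserts - done);
            for (int i = 0; i < batch; i++)
                context.AddAndSave(new object());
            total += context.EntitiesScanned;
        }
        return total;
    }
}
```

Under this model, ScanCost(10000, 10000) is 50,005,000 while ScanCost(10000, 100) is 505,000, because the work per context grows with the square of the number of entities it holds.  The measured timings do not follow the model exactly, but it explains the shape of the improvement.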

      NHibernate Performance After Configuration Optimization

      For the NHibernate frameworks, the tests themselves were not the problem.  NHibernate itself needs some tuning. 

      An optimized solution was achieved by changing the configuration settings used to build the NHibernate session factory.  Here is the definition of the SessionFactory for NHibernate:

      private static ISessionFactory SessionFactory {
          get {
              if (_sessionFactory == null) {
                  string connectionString = ConfigurationManager.ConnectionStrings

                  var configuration = new NHConfig.Configuration();
                  // configuration.SetProperty(...) calls for the settings described after these listings

                  _sessionFactory = configuration.BuildSessionFactory();
              }
              return _sessionFactory;
          }
      }

      And here is the InitializeSessionFactory method for Fluent NHibernate, with the equivalent changes included:

      private static void InitializeSessionFactory() {
          string connectionString = ConfigurationManager.ConnectionStrings[_connectionKeyName]
              .ConnectionString;

          _sessionFactory = Fluently.Configure()
              .Mappings(m => m.FluentMappings.AddFromAssemblyOf<Enrollment>())
              // Assumption: the properties are applied via ExposeConfiguration
              .ExposeConfiguration(cfg => cfg
                  .SetProperty(NHibernate.Cfg.Environment.FormatSql, Boolean.FalseString)
                  .SetProperty(NHibernate.Cfg.Environment.ShowSql, Boolean.FalseString))
              .BuildSessionFactory();
      }

      The following table gives a brief description of the purpose of these settings:

      Setting                   Purpose
      FormatSql                 Format the SQL before sending it to the database
      GenerateStatistics        Produce statistics on the operations performed
      Hbm2ddlKeyWords           Should NHibernate automatically quote all db object names
      PrepareSql                Compiles the SQL before executing it
      PropertyBytecodeProvider  What bytecode provider to use for the generation of code
      QueryStartupChecking      Check all named queries present in the startup configuration
      ShowSql                   Show the produced SQL
      UseProxyValidator         Validate that mapped entities can be used as proxies
      UseSecondLevelCache       Enable the second level cache

      Notice that several of these (FormatSql, GenerateStatistics, ShowSql) are most useful for debugging.  It is not clear why they are enabled by default in NHibernate; it seems to me that these should be opt-in settings, rather than opt-out.

      Here are the results of tests of the NHibernate frameworks with these changes in place:

      Framework                        Operation     Elapsed Time (seconds)
      NHibernate (Optimized)           Insert        5.0894047
      NHibernate (Optimized)           Select        5.2877312
      NHibernate (Optimized)           Update        133.9417387
      NHibernate (Optimized)           Delete        5.6669841

      Fluent NHibernate (Optimized)    Insert        5.0175024
      Fluent NHibernate (Optimized)    Select        5.2698945
      Fluent NHibernate (Optimized)    Update        128.3563561
      Fluent NHibernate (Optimized)    Delete        5.5299521

      These results are much improved, with the INSERT, SELECT, and DELETE operations nearly matching the results achieved by the custom framework.   The UPDATE performance, while improved, is still relatively poor.

      What’s Up with Update Performance in NHibernate?

      The poor update performance is a mystery to me.  I have researched NHibernate optimization techniques and configuration settings, and have searched for other people reporting problems with UPDATE operations.  Unfortunately, I have not been able to find a solution.

      This is disappointing, as I personally found NHibernate more comfortable to work with than Entity Framework, and because it beats or matches the performance of Entity Framework for SELECT, INSERT, and DELETE operations.

      If anyone out there knows of a solution, please leave a comment!

      Results Summary

      The following table summarizes the results of the tests using the optimal configuration for each framework.  These are the same results shown earlier in this post, combined here in a single table.

      Framework                        Operation     Elapsed Time (seconds)
      Custom                           Insert        5.9526039
      Custom                           Select        1.9980745
      Custom                           Update        5.0850357
      Custom                           Delete        3.7785886

      Custom (no SPs)                  Insert        5.2251725
      Custom (no SPs)                  Select        2.0028176
      Custom (no SPs)                  Update        4.5381994
      Custom (no SPs)                  Delete        3.7064278

      Entity Framework (Optimized)     Insert        14.7847024
      Entity Framework (Optimized)     Select        5.5516514
      Entity Framework (Optimized)     Update        13.823694
      Entity Framework (Optimized)     Delete        10.0770142

      NHibernate (Optimized)           Insert        5.0894047
      NHibernate (Optimized)           Select        5.2877312
      NHibernate (Optimized)           Update        133.9417387
      NHibernate (Optimized)           Delete        5.6669841

      Fluent NHibernate (Optimized)    Insert        5.0175024
      Fluent NHibernate (Optimized)    Select        5.2698945
      Fluent NHibernate (Optimized)    Update        128.3563561
      Fluent NHibernate (Optimized)    Delete        5.5299521

      And here is a graph showing the same information:


    hOCRImageMapper: A Tool For Visualizing hOCR Files

    Just uploaded to GitHub, this simple application provides a way to visualize hOCR output.

    Per Wikipedia: "hOCR is an open standard of data representation for formatted text obtained from optical character recognition (OCR). The definition encodes text, style, layout information, recognition confidence metrics and other information using Extensible Markup Language (XML) in form of Hypertext Markup Language (HTML) or XHTML."

    hOCR is produced by the Tesseract, Cuneiform, and OCRopus OCR software.  My motivation for creating this tool was a need to analyze hOCR output produced by Tesseract.

    This application has been implemented as a simple WinForms application  (yeah, I know, but it was quick) written in C#.

    When using the application, the text contained in an hOCR file is loaded alongside the image that is the source of the OCR output.  Hovering over a word in the text highlights the word in the image. 

    Hovering over the word “quantitative” in the left panel highlights the word in the source image on the right.

    Clicking a word in the text displays the coordinates for the bounding box used to highlight the word.  (This bounding box is extracted from the hOCR output.)  The coordinates are displayed as two pairs of X-Y coordinates that represent the upper left and lower right corners of the bounding box.

    Clicking the word displays its coordinates.  In
    this case, the X-Y pairs are (513, 540) for the
    upper left and (846, 600) for the lower right.
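In hOCR, a word's bounding box is carried in the element's title attribute as "bbox x0 y0 x1 y1".  The following is a minimal sketch of pulling those four numbers out of a title string.  It is not taken from hOCRImageMapper's source; BoundingBox and HocrBox are hypothetical names used only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Pixel coordinates of a box, measured from the image's upper left corner.
public struct BoundingBox
{
    public int Left, Top, Right, Bottom;
}

public static class HocrBox
{
    // Matches the "bbox x0 y0 x1 y1" property inside an hOCR title attribute.
    private static readonly Regex BboxPattern =
        new Regex(@"\bbbox\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)");

    // Returns false when the title string carries no bbox property.
    public static bool TryParse(string title, out BoundingBox box)
    {
        box = default(BoundingBox);
        Match match = BboxPattern.Match(title ?? string.Empty);
        if (!match.Success)
            return false;

        box.Left = int.Parse(match.Groups[1].Value);
        box.Top = int.Parse(match.Groups[2].Value);
        box.Right = int.Parse(match.Groups[3].Value);
        box.Bottom = int.Parse(match.Groups[4].Value);
        return true;
    }
}
```

For a title such as "bbox 513 540 846 600; x_wconf 95", TryParse returns true with Left = 513, Top = 540, Right = 846, and Bottom = 600.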

    The source code can be downloaded from the GitHub repository, or the compiled executable can be downloaded directly.

    St. Louis Days of .NET 2014

    My notes from the 2014 edition of St. Louis Days of .NET.  I was only able to attend the first day of the conference this year.

    Front-End Design Patterns: SOLID CSS + JS for Backend Developers

    Session Materials:

    Use namespaced, unambiguous classes.   For example, use “.product_list_item” instead of “.product_list li” , and “.h1” instead of “h1”.

    No cascading

    Limit overriding

    CSS Specificity – Specificity is the means by which a browser decides which property values are the most relevant to an element and get to be applied.
        Each CSS rule is assigned a specificity value
        Plot specificity values on a graph where the x-axis represents the line number in the CSS
        Line should be relatively flat, and only trend toward high specificity towards the end of the CSS
        Specificity graph generator:
        Another option of what a graph should look like:
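To make the idea of a specificity value concrete, here is a rough sketch that computes the usual three-part value (IDs, then classes/attributes/pseudo-classes, then element names/pseudo-elements) for a simple selector.  It was not shown in the session; Specificity and SpecificityCalculator are hypothetical names, and the sketch deliberately ignores functional pseudo-classes such as :not(), comma-separated selector lists, and inline styles.

```csharp
using System;
using System.Text.RegularExpressions;

public struct Specificity
{
    public int Ids, Classes, Types;

    public override string ToString()
    {
        return Ids + "," + Classes + "," + Types;
    }
}

public static class SpecificityCalculator
{
    public static Specificity Of(string selector)
    {
        string rest = selector;
        var result = new Specificity();

        // Strip each kind of simple selector in turn so later patterns cannot re-match it.
        result.Classes += CountAndRemove(ref rest, @"\[[^\]]*\]");    // [attribute] selectors
        result.Ids += CountAndRemove(ref rest, @"#[\w-]+");           // #id
        result.Classes += CountAndRemove(ref rest, @"\.[\w-]+");      // .class
        result.Types += CountAndRemove(ref rest, @"::[\w-]+");        // ::pseudo-element
        result.Classes += CountAndRemove(ref rest, @":[\w-]+");       // :pseudo-class
        result.Types += CountAndRemove(ref rest, @"[A-Za-z][\w-]*");  // element names
        return result;
    }

    // Counts matches of the pattern, then blanks them out of the text.
    private static int CountAndRemove(ref string text, string pattern)
    {
        int count = Regex.Matches(text, pattern).Count;
        text = Regex.Replace(text, pattern, " ");
        return count;
    }
}
```

With this sketch, ".product_list_item" scores 0,1,0 while ".product_list li" scores 0,1,1, which is one way to see why the flat, single-class style keeps the specificity graph level.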

    Important JavaScript patterns and concepts
        Revealing Module
        Revealing Prototype

    Optimizing Your Website’s Performance (End-To-End Diagnostics)

    Session Materials:

    If your test environment is different than your production environment, look for linear differences in order to estimate the differences between the servers.  For example, if the production server is a quad-core server and the test server is a dual-core server, measure the performance of the test server twice: once with one core active and once with both cores.  The difference between running with one core vs. two cores should allow you to estimate the difference between the dual-core server and the quad-core server.  Obviously, this will not be perfect, but it does provide some baseline for estimating the differences between servers.
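A minimal sketch of that estimate, assuming the scaling really is linear, is shown below.  CapacityEstimate is a hypothetical helper, not something presented in the session, and the numbers in the usage note are made up for illustration.

```csharp
using System;

public static class CapacityEstimate
{
    // Linear extrapolation: each additional core is assumed to add the same
    // throughput that was gained when going from one core to two.
    public static double EstimateThroughput(double oneCore, double twoCores, int targetCores)
    {
        double gainPerCore = twoCores - oneCore;
        return oneCore + gainPerCore * (targetCores - 1);
    }
}
```

For example, if the test server handled 100 requests per second on one core and 180 on two, EstimateThroughput(100, 180, 4) suggests about 340 requests per second for a quad-core server.  Real servers rarely scale this cleanly, so treat the result as a starting point only.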

    Different browsers have different limits on how many simultaneous requests can be made to a single domain (varies from 4 to 10).

    Simple stuff to look at when optimizing a web site:
        Large images
        Long-running JavaScript
        Large viewstate

    Make sure cache-expiration is set correctly for static content.  This is done in the web.config file.
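As an illustration, a web.config fragment along the following lines tells IIS to send a max-age caching header with static files.  The seven-day value is an arbitrary example, not a recommendation from the session.

```xml
<configuration>
  <system.webServer>
    <staticContent>
      <!-- Let browsers cache static files for 7 days (d.hh:mm:ss) -->
      <clientCache cacheControlMode="UseMaxAge" cacheControlMaxAge="7.00:00:00" />
    </staticContent>
  </system.webServer>
</configuration>
```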


    Google PageSpeed
        Provides mobile and desktop scores
        Used in Google search rankings!
        Not useful for internal sites
        Similar to YSlow
        Blocked by pages requiring a login

    Google Analytics (or similar)
        Useful for investigating daily loads (determine why site is slow at certain times)
        Use to investigate traffic patterns

        Reasonably priced and free options available
        Use to simulate traffic load on your site
        Only tests static html

        More expensive
        Use to simulate traffic load
        Tests everything; not just static content

    New Relic
        Internal server monitoring

    Hadoop For The SQL Ninja


    Hive is a SQL-like query language for Hadoop.
        Originated at Facebook
        Compiles to Map/Reduce jobs
        Queries tables/catalogs defined on top of underlying data stores
        Data stores can be text files, Mongo, etc.
        Data stores just need to provide rows and columns of data
        Custom data providers can be created to provide rows/columns of data

    Hive is good for:
        Large scale queries
        A variety of formats
        UDF extensibility

    Hive is NOT good for:
        Interactive querying
        Small tables

    Hive connectivity
        ODBC/JDBC – responsive queries
        Oozie – job-based workflows
        Azure Toolkit/API – now includes Visual Studio integration for viewing/executing queries

    Angular for .NET Developers

    Session Materials:

    AngularJS is a Javascript MVC framework
        Model-View-Controller are all on the client
        Data is exchanged via AJAX calls to REST web services
        Makes use of dependency injection

    Benefits of AngularJS
        Unobtrusive Javascript
        Clean HTML
        Limits the need for third party libraries (like jQuery)
        Works well with ASP.NET MVC
        Easy Single-Page Applications (SPA)
        Testing is easy.  Jasmine is the test framework of choice.

    HTML attributes provide AngularJS “hooks”.  For example, notice the attributes on the elements <html ng-app="AngularApp"> and <input ng-model="" />

    Data binding example:

        <input ng-model=""/>
        <p>Hello {{}}</p>

        In this example, data entered into the input text box is echoed in the paragraph below the input element.

    Making Rich, Interactive, Multi-Platform Applications with SignalR

    Session Materials:

    Use cases for SignalR
        Any application that involves polling
        Chat applications
        Real-time score updates
        Voting results
        Real-time stock prices

    The Smooth Transition to TypeScript


    TypeScript provides compile-time errors in Visual Studio.

    TypeScript has type-checking
        Optional types on variables and parameters
        Primitive types are number, string, boolean, and any
        The “any” type tells the compiler to treat the variable like Javascript would
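
    A minimal sketch of those type annotations follows.  The function and variable names are invented for illustration and are not from the session.

```typescript
// Parameter and return types are optional; here they are spelled out.
function summarize(sessionTitle: string, attendees: number, wasRecorded: boolean): string {
  const suffix = wasRecorded ? "recorded" : "not recorded";
  return sessionTitle + ": " + attendees + " attendees, " + suffix;
}

// Typed variables: the compiler rejects assignments of the wrong type.
const sessionTitle: string = "TypeScript";
const attendees: number = 40;
const wasRecorded: boolean = false;

// "any" opts out of type-checking, so this variable behaves like plain JavaScript.
let loose: any = 42;
loose = "now a string"; // allowed because the variable is typed "any"

const summary = summarize(sessionTitle, attendees, wasRecorded);
// summarize(sessionTitle, "forty", wasRecorded); // compile-time error: string is not a number
```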

    Intellisense for TypeScript is very good, and other typical Visual Studio tooling works as well.

    TypeScript files compile to JavaScript (example.ts -> example.js), and the JavaScript is what gets referenced in your web applications.

    TypeScript class definitions become javascript types.

    The usual Visual Studio design and compile-time errors are available when working with classes.

    A NuGet package exists that provides “jQuery typing files” that enable working with jQuery in TypeScript.

    TypeScript supports generics and lambdas.
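
    To make the class, generics, and lambda notes concrete, here is a small sketch.  The Point class and the largest function are hypothetical examples, not code shown in the session.

```typescript
// A class definition; the compiler emits the equivalent JavaScript type.
class Point {
  x: number;
  y: number;

  constructor(x: number, y: number) {
    this.x = x;
    this.y = y;
  }

  distanceFromOrigin(): number {
    return Math.sqrt(this.x * this.x + this.y * this.y);
  }
}

// A generic function: T is inferred from the arguments at each call site.
// Returns the item with the highest score, or undefined for an empty array.
function largest<T>(items: T[], score: (item: T) => number): T | undefined {
  let best: T | undefined = undefined;
  let bestScore = -Infinity;
  for (const item of items) {
    const itemScore = score(item);
    if (itemScore > bestScore) {
      best = item;
      bestScore = itemScore;
    }
  }
  return best;
}

const points = [new Point(3, 4), new Point(6, 8)];

// Lambdas (arrow functions) passed as arguments.
const distances = points.map(p => p.distanceFromOrigin());
const farthest = largest(points, p => p.distanceFromOrigin());
```

    Here T is inferred as Point, so the lambda passed to largest gets full type-checking on p without any explicit annotation.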

    St. Louis Day of .NET 2013

    This post is long overdue, as the 2013 Day of .NET took place almost two months ago.  I set aside my notes while I waited for presenters to post their session materials online… and then I forgot about it.  So, without further ado, here are my notes from the event:

    DAY 1

    Session: Entity Framework in the Enterprise

    Session Materials:

    Getting Started with Entity Framework (EF6 and MVC5)
    (EF5 and MVC4)

    SQL Server Data Tools 
         Use LocalDB 
         Allows for loading of test data 
         Allows for data to be "reset" to a known state 
         Remember to check the "Target Connection String" in the DB project properties dialog

    Entity Framework Power Tools v.4 (Beta)
         Provides reverse engineering of databases into code-first classes, using the Fluent API

    Unit Testing
         Entity Framework 6 has support for mocking frameworks
         Allows you to create your own test doubles
         It is recommended to test against a "real" DB for last-mile tests and performance tests

    Audit Tracking
         SQL Server Change Data Capture
              Available in SQL Server 2008 and beyond (Enterprise Editions only)
              Uses change tables that mirror structure of tables being tracked
              Populates the change tables by analyzing the transaction log (not via triggers)
         If using EF natively
              Override the "SaveChanges" method
              Loop through the contents of the "ChangeTracker" collection (saving the details along the way)

    Performance Tracking
         Entity Framework 6 includes/allows logging of SQL statements and execution times
         Other useful tools include NLog and Glimpse

    Session:  Introduction to MongoDB


         6th most popular database in the world, just behind PostgreSQL and DB2
         There are drivers for many languages, as well as a LINQ provider.
         Data stored as BSON (binary JSON)
         Everything is case-sensitive

         Speed – basic queries are much faster than SQL DBs
         Rich Dynamic Queries – not as limited as other NoSQL DBs
         Easy Replication and Failover
         Automatic Sharding

         No transactions
         No joins
         RAM intensive
         No referential integrity
         "Eventual consistency" – periods of inconsistency usually measured in milliseconds

         MongoDB shell (command line)
         Various GUI tools

         Can query by regular expression
         Can return entire records or specific fields

    Object IDs
         Object IDs (auto-generated unique IDs) contain timestamp of record creation.
         Timestamps contained in Object IDs can be retrieved.
         Can define your own IDs, which is useful for sharding
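
    As an illustration of retrieving that timestamp, the sketch below decodes it directly from the hex form of an Object ID, relying on the fact that the first 4 bytes hold the creation time in seconds since the Unix epoch.  The helper name and the sample ID are made up; the official drivers expose their own method for this.

```typescript
// Read the creation time embedded in a MongoDB Object ID.
// The first 4 bytes (8 hex characters) are seconds since the Unix epoch.
function objectIdToDate(objectIdHex: string): Date {
  if (!/^[0-9a-fA-F]{24}$/.test(objectIdHex)) {
    throw new Error("Expected a 24-character hex Object ID");
  }
  const seconds = parseInt(objectIdHex.substring(0, 8), 16);
  return new Date(seconds * 1000);
}

// Made-up Object ID: 0x51f9a500 is 1375315200 seconds,
// which is 2013-08-01T00:00:00Z.
const created = objectIdToDate("51f9a5000000000000000000");
```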

         Can index pretty much any part (or parts) of a record, up to and including the entire record

         If the primary fails, a secondary is auto-elected as the new primary

    Session:  Modern Web Diagnostics with A Glimpse into ASP.NET


         Installed via NuGet
         New versions are released approximately every two weeks

         Gives insight into ASP.NET, WebForms, and others
         Gives diagnostics on networks, databases, page lifecycle, viewstate, and more
         Can trace individual users
         Can be enabled/disabled in various ways (cookies, roles, etc)
         Keeps the history of the last 50 requests, so recent requests can be examined after they occur

    Platform Support
         Cross browser (latest versions of browsers supported) and cross platform
         Support exists for tracing NHibernate, Entity Framework, MVC, WebForms
         WebAPI support is on the way (not there now)

    Session:  Parallelism in .NET

    Session Materials:

         More threads means more memory usage and more context switching
         Developers need to find the appropriate balance between the # of threads and resource usage
         Available since .NET 1.0

         Similar to database connection pooling
         Resources are managed much better
         Available since .NET 1.0

    Parallel Linq (PLINQ)
         Example: from r in object.AsParallel() select r
         When using this, you must watch out for shared resources, and lock them correctly

    Parallel Library
         Provides the parallel For, ForEach, and Invoke methods
         Allows processing to be stopped via the "ParallelLoopState" object passed to the loop body

    Tasks (TPL => Task Parallel Library)
         The most complex option to use, but also the most flexible
         Can be used "as needed"; they are not bound to the loop processing of the Parallel Library
         Allow parallel processes to be stopped
         Necessary for the use of Await/Async

         See the slide deck for the details of how "await" works
         Async methods must return Task, Task<T>, or void (void is intended for event handlers)
         Await can always be used on a Task, whether it is "async" or not
         When calling an async method, always await it (best practice)

    Debugging Support
         When a breakpoint is hit, all running tasks stop
         Several parallel debugging windows are available under the Visual Studio "Debug" menu
              Tasks – shows all running tasks; click a task to go to the currently executing statement
              Parallel Stacks – visual display of running tasks and the call stack; click a task to see the current statement
              Parallel Watch – allows watching a variable in a particular task

    Session:  A Deeper Dive Into Xamarin.Android

    Session Materials:

         Xamarin Studio (native) – not free
         Xamarin Studio plug-in for Visual Studio – not free

    Recommended components for easing cross-platform development:
         Xamarin.Mobile – abstracts your code for location/photos/contacts across all platforms
         Xamarin.Social – similar to the Mobile component, only for social services
         Xamarin.Auth – makes OAuth easier to use

    Components for Android
         Google Play Services
         Backward compatibility component (for supporting older versions of Android)


    Notes about developing for Android
         Turn on Hardware Acceleration in the application manifest
         Activity (app) lifecycle events reminiscent of ASP.NET page lifecycle events (or Windows 8 app events)
         Lots of XML involved in app creation
         "Layouts" are used to create app UIs.  Reminiscent of XAML.

         Android SDK is more robust and complicated than iOS
         Not as prescriptive in UI/design
         Device fragmentation is a challenge
         Emulators are poor; use a real device for testing
         Platform is more innovative than iOS, but not as polished

    DAY 2

    Session:  All You Ever Wanted to Know About Hadoop

    Presenter: Matt Winkler

    Written in Java (runs on the JVM)

    Installation Options
         Single computer
              HDInsight (Microsoft’s implementation) can be installed from Web Platform Installer
              Various installation packages
              Azure – multiple nodes running HDInsight can be easily provisioned
              Amazon Cloud Services

    MapReduce is the tool for querying data with Hadoop
         White Paper: Data-Intensive Text Processing with MapReduce
         MapReduce can be thought of as the assembly language for Hadoop.
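
    To give a feel for the programming model, here is an in-memory word-count sketch of the map and reduce phases.  It does not use any Hadoop API; the function names and sample input are invented, and the shuffle step that Hadoop performs between the two phases is collapsed into a simple array.

```typescript
// Map phase: turn one line of text into (word, 1) pairs.
function mapLine(line: string): Array<[string, number]> {
  return line
    .toLowerCase()
    .split(/\s+/)
    .filter(word => word.length > 0)
    .map((word): [string, number] => [word, 1]);
}

// Reduce phase: sum the counts emitted for each distinct word.
function reduceCounts(pairs: Array<[string, number]>): { [word: string]: number } {
  // Object.create(null) avoids collisions with inherited keys such as "constructor".
  const totals: { [word: string]: number } = Object.create(null);
  for (const [word, count] of pairs) {
    totals[word] = (totals[word] || 0) + count;
  }
  return totals;
}

// Made-up input; in Hadoop each line could be mapped on a different node.
const lines = ["big data big cluster", "data node"];
const emitted: Array<[string, number]> = [];
for (const line of lines) {
  for (const pair of mapLine(line)) {
    emitted.push(pair);
  }
}

const wordCounts = reduceCounts(emitted);
// wordCounts: big = 2, data = 2, cluster = 1, node = 1
```

    Writing even a simple count this way takes two functions and some plumbing, which is why higher-level tools that generate MapReduce jobs are popular.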

    Extensions to MapReduce
         Most of these compile down to MapReduce packages

         Hive – SQL-like query language
         Pig – another query platform
         SCALDING – Scala-like query language.  The syntax is LINQ-like.

    Other Tools
         SQOOP – used for loading traditional RDBMS data into Hadoop
         STORM – tool for complex event processing
         OOZIE – Workflow Management for Hadoop

    Session:  Building A REST API With Node.js and MongoDB

    Session Materials:

    Useful Node.js packages (similar to NuGet packages in .NET)
         Restify – adds REST capabilities
         Toaster – UI functionality
         Moment – date handling
         MongoDB – MongoDB client tools

    WebStorm from JetBrains is a recommended Javascript editor ($49 individual developer license) – offers *free* online course on MongoDB

    Session:  Starting with Code-First Entity Framework

    Session materials:

    Create a class that inherits from DbContext… within that class, define the tables to create

    Create classes to represent each table

    Database is created automatically the first time that it is accessed

    Handling DB changes
         1) Update the code
         2) Via attributes, databases can be set to drop/create always, or to drop/create only when the model changes
         3) Database migrations are another option

    Database Migrations
         Package Manager Console can be used to generate classes to handle migrations. 
         Alternately, create a Configuration class in a Migrations folder
         Use the Configuration class with the MigrateDatabaseToLatestVersion class in the SetInitializer method of the Database object.
         Or, if you choose not to trust the auto-migration, generate a TSQL script to perform the migration.
         TSQL scripts can be generated from the Package Manager Console

    ExpressProfiler is a simple SQL profiler… find it on CodePlex.

    Session:  Introduction to Knockout.js

    Session Materials:!APKGSzHxmC91400

    What is it?

         JS library for dynamic web-based UIs
         Applies MVVM to automate data binding


         Declarative bindings
         Dependency Tracking
         Automatic UI Refresh
         Dependency Injection

    MVVM (Model-View-ViewModel) Pattern

         Combination of the MVC/MVP patterns
         View – UI and UI Logic, talks with ViewModel and receives notifications from ViewModel
         ViewModel – Presentation Logic, talks with View (bi-directional data binding and commands, plus notifications to the View) and with Model (bi-directional)
         Model – Business Logic and Data, talks with ViewModel

    Data Binding


         Knockout.js implements the ViewModel

         var myViewModel = function() {
              // "data" is an array of products
              var data = [{ productid: 1, productname: "shoe", productprice: 1.99 }];

              // A single observable value and an observable array
              this.property = ko.observable("value");
              this.products = ko.observableArray(data);

              // Event handler that can be bound to clicks, etc.
              this.handler = function(data, event) {};
         };

         ko.applyBindings(new myViewModel());

         "ko" is the global identifier for Knockout


    Attributes of HTML elements are bound to the ViewModel properties (also CSS and conditional logic like "foreach" and "if")

         <input data-bind="value: property" />

         <button data-bind="click: handler"></button>

         <tbody data-bind="foreach: products">
              <tr><td><input data-bind="value: productid"></td></tr>
         </tbody>


    Individual elements can be bound to more than one property (example: "text" bound to one thing, "visible" bound to another)

    Session:  Real World Azure – How We Use Azure at Swank HealthCare

    Presenter: Brad Tutterow

    SQL in an Azure VM vs SQL Azure Database
         VM option does place your database in the cloud
         VM option still requires you to do your own backups/restores/server maintenance
         VM option does not provide for scalability of a "true" cloud DB

         SQL Azure DB is Microsoft’s preference

    DB Changes that were needed for SQL Azure Database
         Remove Cross-DB triggers
         Remove file groups in CREATE scripts
         Account for cloud-based SQL being a limited subset of full SQL
              Example: No "USE" statement, so scripts may need update
         Modify backup strategy (no traditional Backup/Restore in the cloud)

    Always run two of every Web Role
         Roles are frequently recycled by Azure
         If only one Role exists, your site is down when the Role is recycled
         If two Roles exist, Azure will switch between the two as needed, and not recycle both at the same time

    Deployment best practices
         Determine which application settings (web.config) need to be changed at runtime
              Move those settings to Azure settings
              Everything else can stay in the web.config
              Changing web.config in production doesn’t "stick"… Role recycle will wipe the changes
         Create deployment packages
              Role recycle will wipe updates if not deployed via a package
         Make no assumptions about what is available on the server
              You must deploy everything your app needs (all NuGet packages, etc)
              Role recycle produces fresh copy of Windows

    Database updates handled via EF Code-First Database Migrations
         Question: How would updates be handled without Code-First, or with some other ORM?

    Pain points
         Local Azure emulator is unreliable and inconsistent
         No effective way to do QA on-premise (means more cost for a QA environment in Azure)
         Learning curve (not too bad)
         Azure SDK versioning (keeping everything in sync… updates are quarterly)
         EF migrations and Azure SQL (scripts don’t always work in Azure; need to be edited)

    Good things
         Uptime and reliability
         Buy-in from sales/operations/infrastructure
         Enforced best practices for design and deployment
         Pristine/clean production environments
         QA/Prod environments are identical
         No IIS or Windows OS management
         Easy deployments


    • Investigate SQL Server Change Data Capture as a replacement for auditing with triggers.
    • Check out DurandalJS (mentioned in several sessions)
    • Check out Twitter Bootstrap
    • Check out LESS
    • Check out Glimpse
    • Think about what could be done with a large OCR corpus and Hadoop 

    Parsing Delimited Text Files with LINQ

    A simple LINQ query can be used to parse delimited text files into a list of objects.

    Consider a tab-delimited file named Data.txt that contains contact information.  Specifically, it contains Names, Phone Numbers, Birth Dates, and Email Addresses, like this:

    Joe Smith    111-222-3333     1/1/1980
    John Doe     444-555-6666     7/31/1970
    Jane Doe     666-777-8888     4/25/1975

    Assume that the following class exists:

    class Contact
    {
        public string Name { get; set; }
        public string Phone { get; set; }
        public string BirthDate { get; set; }
        public string Email { get; set; }
    }

    This LINQ query will produce a list of Contact objects that are populated with the information in the text file:

    var contacts = from line in System.IO.File.ReadAllLines(@"Data.txt")
                   let parts = line.Split('\t')
                   select new Contact
                   {
                       Name = parts[0],
                       Phone = parts[1],
                       BirthDate = parts[2],
                       Email = parts[3]
                   };