Encoding XML in UTF-8 with .NET

The solution described here was inspired by the blog post at http://rlacovara.blogspot.com/2011/02/how-to-create-xml-in-c-with-utf-8.html, which explains how to replace the default UTF-16 encoding with UTF-8.  I have implemented a variation of it.  A more generic solution, which I have not implemented, is available at http://www.experts-exchange.com/Programming/Languages/C_Sharp/Q_20554526.html; it allows the output encoding to be varied.

By default, XML documents produced by serializing to a StringWriter with C# and the .NET XmlSerializer class are encoded as UTF-16.  I recently needed to change this to the more commonly used UTF-8, and learned a few things along the way.

The first thing that I discovered (and perhaps should have already known) is that .NET stores all strings internally as UTF-16.  That is why, if you don’t change the default encoding, the XML is produced as UTF-16.

Next, I found that the Encoding property of the StringWriter class is read-only, so you can interrogate the default encoding (and see that it is in fact UTF-16) but cannot change it. 

As I learned from the blog posts that I referenced above, the solution to changing the default UTF-16 encoding is to subclass the native .NET StringWriter class and override the default Encoding property value.

Following is a solution for producing a UTF-8-encoded XML document.  The “StringWriterUtf8” class is the key to the solution.  It inherits from the native System.IO.StringWriter class and overrides the Encoding property (returning Encoding.UTF8 instead of the default UTF-16 encoding, Encoding.Unicode).  Using an instance of this class as the target for the XML serialization output produces a document whose XML declaration specifies UTF-8.

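using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml.Serialization;

[Serializable]
public class ClassToSerialize
{
   public string ToXml()
   {
       XmlSerializer xml = new XmlSerializer(typeof(ClassToSerialize));
       // Dispose the writer when done; it reports UTF-8 as its encoding.
       using (StringWriterUtf8 text = new StringWriterUtf8())
       {
           xml.Serialize(text, this);
           return text.ToString();
       }
   }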

   private String _errorMessage = String.Empty;
   public string Message
   {
       get { return _errorMessage; }
       set { _errorMessage = value; }
   }

   private List<string> _citations = new List<string>();
   public List<string> citations
   {
       get { return _citations; }
       set { _citations = value; }
   }
}

// Subclass the StringWriter class and override the default encoding.  This
// allows us to produce XML encoded as UTF-8.
public class StringWriterUtf8 : System.IO.StringWriter
{
   public override Encoding Encoding
   {
       get
       {
           return Encoding.UTF8;
       }
   }
}

St. Louis Day of .NET – S.O.L.I.D.

This is part of a series of posts containing my notes from the sessions I attended at the 2011 St. Louis Day of .NET conference.

This series does not attempt to give complete accounts of the information presented in each session; it is just a way to capture the bullet points, notes, and opinions that I recorded while attending the conference. I have previously posted a list of all of the session materials and sample code that I have been able to find online, so if you are looking for a more precise account of a session, try looking there.

My favorite presenter at this year’s conference was Steve Bohlen.  He presented at three sessions; I attended two: “Taming Dependency Chaos with Inversion of Control Containers” and “Refactoring to a SOLID Foundation”.  Both were excellent.  Following are my notes from the SOLID session.

Single Responsibility Principle

There should never be more than one reason for a class to change.  Each class should do one thing.

Open-Closed Principle

Software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification.

Instead of this:

     public class Report
     {
          public void Print()
          {
          }
     }

Use this:

     public class Report
     {
          public virtual void Print()
          {
          }
     }

So that you can do this:

     public class Report2 : Report
     {
          public override void Print()
          {
          }
     }

In this case, the old working code ("Report" class) still works, and we have also added new functionality ("Report2" class).

Liskov Substitution Principle

Functions that use pointers or references to base classes must be able to use objects of derived classes without knowing it.  (This is polymorphism.)

Instead of:

     public class LetterReport
     {
          public virtual void Print()
          {
          }
     }

     public class TabloidReport : LetterReport
     {
          public override void Print()
          {
          }
     }

Here, TabloidReport derives from a particular kind of report (LetterReport).

Do this instead:

     public abstract class Report
     {
          // An abstract method has no body; derived classes must supply one.
          public abstract void Print();
     }

     public class LetterReport : Report
     {
          public override void Print()
          {
          }
     }

     public class TabloidReport : Report
     {
          public override void Print()
          {
          }
     }

Now, the base class for the reports is truly generic (it’s not a particular kind of report).

Interface Segregation Principle

Clients should not be forced to depend upon interfaces that they do not use.

Do not build catch-all interfaces like this:

     public interface IDataAccess
     {
          void SetConnectionString();
          void Connect();
          Data GetReportData();
     }

Instead, keep each interface small and let more specific interfaces build upon it (interface composition):

     public interface IDataAccess
     {
          void SetConnectionString();
          void Connect();
     }

     public interface IReportDataAccess : IDataAccess
     {
          Data GetReportData();
     }

Now, classes can select the interface that makes the most sense, rather than getting a single interface with everything.

Dependency Inversion Principle

High-level modules should not depend on low-level modules.  Both should depend on abstractions.  Abstractions should not depend upon details. Details should depend upon abstractions.

This is where dependency injection and object composition come into play.  No easy code example to give here.

St. Louis Day of .NET – jQuery Plug-ins

One of the better sessions I attended at this year’s conference was Ian Robinson’s “Building jQuery Plug-Ins”.  Too many sessions I attended this year skipped the how-to-get-started and jumped right into examining the code of a finished product.  This session, on the other hand, stepped through the entire process of building a jQuery plug-in.  Here are my notes from the session:

The Process

  • Wrap business logic with plug-in logic
  • The business logic can stay largely the same
  • Modify to inject settings and context
  • Modify for safety and to play well with others

First Step

Wrap in a closure…

(function($) {
     // ahh… safety
})(jQuery);

This structure ensures that our code doesn’t conflict with others (takes it out of global scope and defines the plug-in’s own scope).  It also ensures that when a dollar sign ($) is used in the body of this code, it means jQuery (and not anything else).
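
To see the scoping effect without loading jQuery, here is a minimal sketch that runs in plain JavaScript.  The fakeJQuery object and the greet plug-in are hypothetical stand-ins, not part of jQuery; the point is only that $ inside the wrapper refers to whatever was passed in, even when another script has claimed the global $.

```javascript
// Hypothetical stand-in for the jQuery object, so the sketch runs without a browser.
var fakeJQuery = { fn: {} };

// Pretend another library has already claimed the global dollar sign.
var $ = "some other library";

(function ($) {
     // Inside the wrapper, $ is the argument (fakeJQuery), not the global $.
     var secret = "only visible inside the closure";

     $.fn.greet = function () {
          return "hello from the plug-in";
     };
})(fakeJQuery);

var greeting = fakeJQuery.fn.greet();   // the plug-in was attached to the object passed in
var globalDollar = $;                   // the global $ is untouched
var leaked = typeof secret;             // "undefined": secret never reached the outer scope
```

Anything declared with var inside the wrapper, like secret above, stays private to the plug-in.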

http://nathansjslessons.appspot.com/ (What’s A Closure?)

Define the Plug-in

No selecting needed:

     $.fn.myPlugin = function(options) {…};

Select and return (chain):

     $.fn.myPlugin = function(options) {
          return this.each(function() {
               …
          });
     };
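
The reason for returning this.each(…) is chaining: the caller gets the same collection back and can keep calling methods on it.  Here is a minimal sketch of that idea, using a hypothetical Collection type in place of a real jQuery object.  Its each mimics jQuery's by calling the callback with "this" set to each element and then returning the collection; markVisited and label are made-up plug-ins.

```javascript
// Hypothetical miniature of a jQuery-style collection; not jQuery itself.
function Collection(items) {
     this.items = items;
}

// Like jQuery's each: run the callback with `this` bound to each element,
// then return the collection so calls can be chained.
Collection.prototype.each = function (callback) {
     for (var i = 0; i < this.items.length; i++) {
          callback.call(this.items[i], i, this.items[i]);
     }
     return this;
};

// A plug-in written in the "select and return" style.
Collection.prototype.markVisited = function () {
     return this.each(function () {
          this.visited = true;
     });
};

// A second plug-in, to show the chain continuing.
Collection.prototype.label = function (text) {
     return this.each(function () {
          this.label = text;
     });
};

var elements = new Collection([{ id: "a" }, { id: "b" }]);
var result = elements.markVisited().label("nav");   // result is the same collection
```

If markVisited did not return the result of each, the call to label would fail because it would be invoked on undefined.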

Establish Default Options

     $.fn.myPlugin.defaultOptions = {
          speed: 'slow'
     };

     var opts = $.extend({}, $.fn.myPlugin.defaultOptions, options);
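
The order of the arguments matters: the empty object comes first so that neither the defaults nor the caller's object is modified, and the caller's options come last so that they win.  The sketch below uses the standard Object.assign as a stand-in for $.extend so that it runs without jQuery; for flat option objects with no undefined values, the two should behave alike.  The resolveOptions helper is hypothetical.

```javascript
var defaultOptions = {
     speed: 'slow',
     subNavSelector: 'li.subnav'
};

// Hypothetical helper: merge the caller's options over the defaults into a fresh object.
function resolveOptions(options) {
     return Object.assign({}, defaultOptions, options);
}

// The caller overrides speed; subNavSelector falls back to the default.
var opts = resolveOptions({ speed: 'fast' });

// No options passed: the result is a copy of the defaults, not the defaults object itself.
var untouched = resolveOptions();
```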

Inject HTML Context

Options aren’t just for settings.

Inject all context through options.  For example:

     $('.nav').myPlugin({
          speed: 'fast',
          subNavSelector: 'li.subnav'
     });

Events

Determine if you need to enforce document.ready.  If not, let the user of the plug-in do it.

If you’re ever unbinding events, bind with a namespace first.  For example:

     $('elem').bind('click.Namespace', function(){…});
     $('elem').unbind('click.Namespace');
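
The namespace is what lets a plug-in remove only its own handlers and leave everyone else's in place.  The following is a simplified model of that bookkeeping, not jQuery's actual implementation; TinyEvents, parseEventName, and the myPlugin namespace are hypothetical names.

```javascript
// Simplified, hypothetical model of namespaced event handlers.
function TinyEvents() {
     this.handlers = [];
}

// Split "click.myPlugin" into { type: "click", namespace: "myPlugin" }.
function parseEventName(name) {
     var parts = name.split('.');
     return { type: parts[0], namespace: parts[1] || '' };
}

TinyEvents.prototype.bind = function (name, handler) {
     var parsed = parseEventName(name);
     this.handlers.push({ type: parsed.type, namespace: parsed.namespace, handler: handler });
     return this;
};

// Remove handlers of the given type; when a namespace is given, only
// handlers bound with that namespace are removed.
TinyEvents.prototype.unbind = function (name) {
     var parsed = parseEventName(name);
     this.handlers = this.handlers.filter(function (h) {
          var sameType = h.type === parsed.type;
          var sameNamespace = parsed.namespace === '' || h.namespace === parsed.namespace;
          return !(sameType && sameNamespace);
     });
     return this;
};

// Call every handler registered for the given event type.
TinyEvents.prototype.trigger = function (type) {
     this.handlers.forEach(function (h) {
          if (h.type === type) { h.handler(); }
     });
     return this;
};

var log = [];
var elem = new TinyEvents();
elem.bind('click.myPlugin', function () { log.push('plug-in'); });
elem.bind('click', function () { log.push('page'); });

elem.unbind('click.myPlugin');   // removes only the plug-in's handler
elem.trigger('click');           // only the page's own handler runs
```

Without the namespace, unbinding 'click' in this model would remove the page's handler as well.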

Complete Example

(function($) {
     $.fn.myPlugin = function(options) {
          var opts = $.extend({}, $.fn.myPlugin.defaultOptions, options), $moduleWrap = this;
          // Iterate over every element in the current selection; $moduleWrap is the
          // jQuery object the plug-in was called on, e.g. $('.myClass')
          return $moduleWrap.each(function() {
               // Grab the element being evaluated
               var $module = $(this);
               // … do something …
          });
     };

     $.fn.myPlugin.defaultOptions = {
          speed: 'slow',
          subNavSelector: 'li.subnav'
     };

     $(document).ready(function() {
          $('.myClass').myPlugin();
     });
})(jQuery);

More examples and documentation
http://css-tricks.com/snippets/jquery/jquery-plugin-template/
http://docs.jquery.com/Plugins/Authoring
http://jquery.ian-robinson.com (jQuery Crash Course)
