Wednesday, October 22, 2014

The Game - Artillery


So Battlefield One has gotten a little more serious.  My goal in this project is to design and build a strategy game engine.  Eventually I'll demonstrate how to modularize the interface so that the game can operate with DirectX instead of a web-based interface using SVG.  My method of development is to work in tiny chunks, the way Extreme Programming (XP) is practiced.  I never want to get into a situation where I need to do months' worth of work in order to get the program back into a playable state.  So I'm building a feature and making the game playable, then building the next feature, etc.

Artillery Units

In this blog post I'm going to show how to incorporate the unit range feature by adding artillery.  For those who have no military knowledge, artillery is nothing more than a large cannon with lots of range.  Artillery positions are normally miles away from the target.  But an artillery crew cannot see that far, so it needs a spotter.  There is also a movement restriction on artillery, but I'll be ignoring that for now (most artillery needs a vehicle to tow it into position).

So I drew a unit type and modified the game to include this unit in its case statement (see the UnitClass.cs file).  Here's what the new unit looks like:

This unit has an attack strength of 14 and a defense of 2; its range is 3 hex cells and it can move one cell per turn.  The tiny "14" at the top is the unit number, which I use for troubleshooting purposes.  I have updated the JavaScript side of this program to prevent this unit from attacking an enemy unit that is not visible.  I also updated the SnapTo() JavaScript function to allow a range greater than 1.
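The range and visibility rule can be sketched on the C# side as follows.  This is only an illustration, not the game's actual code: the HexRange class, its method names and the "odd-q" offset layout (flat-topped hexes with odd columns shifted down) are all assumptions.

```csharp
using System;

// Hypothetical helper, not taken from the game's source.
// Assumes an "odd-q" offset layout: x is the column, y is the row,
// and odd columns are shifted half a cell down.
public static class HexRange
{
    // Number of hex steps between two cells.
    public static int Distance(int x1, int y1, int x2, int y2)
    {
        // Convert offset (column, row) to axial (q, r) coordinates.
        int q1 = x1, r1 = y1 - (x1 - (x1 & 1)) / 2;
        int q2 = x2, r2 = y2 - (x2 - (x2 & 1)) / 2;
        int dq = q2 - q1, dr = r2 - r1;
        return Math.Max(Math.Abs(dq), Math.Max(Math.Abs(dr), Math.Abs(dq + dr)));
    }

    // A unit may attack only a visible target that is within its range.
    public static bool CanAttack(int attackerX, int attackerY, int range,
                                 int targetX, int targetY, bool targetVisible)
    {
        return targetVisible && Distance(attackerX, attackerY, targetX, targetY) <= range;
    }
}
```

With a range of 3, an artillery unit at 0,0 could hit a visible target at 0,3 but not one at 0,4, and it could not hit any target that no spotter can see.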

New Map

I designed a new map to represent a beach invasion scenario.  The new map looks like this:

The ocean cells will block any unit from passing.  The beach cells act the same as grass.  The goal of this game is to take all cities that the German units are protecting.  So I modified the CheckForEndOfGameCondition() method to call a new object that can be initialized with different goal conditions.  The new object is called VictoryCalculator().  It can be set up for a condition where the enemy (or the allies) must defend a given number of cities instead of just destroying all enemy units.  I have also designed it to be extendable to allow game conditions where German or Allied units must retain a certain number of units to win.  I don't currently have a game turn limit (number of turns to meet the objective), but it could also be added to this object in the future.
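A configurable victory check along these lines might look like the sketch below.  The class name, the Winner enum and the constructor parameter are made up for illustration; the real VictoryCalculator in the repository may be organized quite differently.

```csharp
// Hypothetical sketch of a goal-driven end-of-game check.
public enum Winner { None, Allies, Germans }

public class VictoryCalculatorSketch
{
    // Assumed goal setting: how many cities the Germans must still hold to stay in the game.
    private readonly int germanCitiesToDefend;

    public VictoryCalculatorSketch(int germanCitiesToDefend)
    {
        this.germanCitiesToDefend = germanCitiesToDefend;
    }

    public Winner Check(int alliedUnits, int germanUnits, int germanHeldCities)
    {
        if (alliedUnits == 0) return Winner.Germans;   // all Allied units destroyed
        if (germanUnits == 0) return Winner.Allies;    // all German units destroyed
        if (germanHeldCities < germanCitiesToDefend) return Winner.Allies;
        return Winner.None;                            // keep playing
    }
}
```

Setting the goal to 1 gives the "take all cities" scenario: the Allies win as soon as the Germans hold no city at all.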

Getting the Code

You can go to my GitHub account and download the zipped up version of this project (Click Here).

Sunday, October 19, 2014

The Game - Forest Terrain


In my last blog post on the game I demonstrated how to set up blocked terrain cells (aka mountains).  Now I'm going to introduce a forest cell that will block tanks but not infantry.

The Forest Terrain Type

I wanted to introduce a forest terrain type so that I could prevent tanks from penetrating (assume this is a forest of very large trees), but allow infantry to penetrate.  First, I used the same Google Maps screen shot technique to create a forest hex cell:

Then I set up a test map to test my shortest path algorithm:

Modifying the Code

In my previous code, I created a property to handle the blocked terrain.  This was located inside the GameMap object.  In order to test both the terrain and unit combinations I had to change this into a method that uses the unit type integer as a parameter.  The forest terrain number is 7 and a tank is unit type 2.  So I set up the Blocked() method to check for this combination and return true (as in, terrain will block this unit type).  Eventually, I'll need to extend this simplistic method to include an array of terrain types versus unit types.  Before I do that, I'll probably introduce a technique for slowing down the progress of a unit.  For instance, I'll make the unit travel slower through the forest.  For now though, I'm going to just provide a flag for blocked or not blocked.

Now that the Blocked property has been converted to a method, any calling methods need to be refactored to pass the unit type.  Then I tested to see if the tank went around the wall of forest and the infantry unit went through it.
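As a rough sketch of that method, using the terrain and unit numbers mentioned in these posts (mountain = 6, forest = 7, tank = 2): the class name and constants below are illustrative and not copied from the GameMap source.

```csharp
// Hypothetical stand-in for the GameMap cell object.
public class MapCellSketch
{
    public const int MountainTerrain = 6;
    public const int ForestTerrain = 7;
    public const int TankUnitType = 2;

    public int Terrain { get; set; }

    // True when this terrain stops the given unit type from entering the cell.
    public bool Blocked(int unitType)
    {
        if (Terrain == MountainTerrain) return true;                  // blocks every unit
        return Terrain == ForestTerrain && unitType == TankUnitType;  // forest blocks tanks only
    }
}
```

Any unit type other than 2 (infantry, for example) passes through forest in this sketch, while mountains stop everything.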

Adding Unit Tests

I ended up adding the unit tests after I made this work.  Technically, I could have added them first and used the unit tests in order to build the code.  The purpose of adding unit tests after the fact is that I want to make sure that future changes don't break this feature.  So you'll see two new unit tests that will test an infantry unit and a tank unit path calculation for the sample map above.

Getting the Code

You can go to my GitHub account and download the zipped up version of this project (Click Here).

The Game - A* Algorithm


I've been working on this game called Battlefield One and it has a very simple algorithm for computing the next move that an enemy unit will take in order to reach its goal.  The algorithm searches all 6 surrounding hexes and eliminates illegal hex points (off the grid or occupied by a unit).  Then it computes the distance from each remaining grid point to the destination and takes the shortest one.  In this blog post, I'm going to put some obstacles in the game and demonstrate the flaw in the algorithm I chose.  Then I'm going to demonstrate how to make the A* algorithm work on a hex grid instead of an 8-way or 4-way square matrix.

Adding Mountains

First, I painted a mountain hex in Photoshop.  You can make up any hex terrain you wish, but I just went to Google Maps, copied some of the Rocky Mountains onto the clipboard, pasted it into a hex block and erased to match the hex shape:

Then I turned off the green hex shape layer and saved my hex-shaped mountain terrain as a mountain.png file:

Fortunately, I had planned on having many terrain types, so I coded the DrawTerrain method to contain a switch for each terrain picture to render.  If you look at the code at GitHub you'll see there are 4 grass terrains (I added some variety) and a new "mountains_01.png" terrain file type.

Next I modified the GameMap object to include a new "Blocked" property.  If the terrain is equal to 6 (representing a mountain terrain type), then Blocked returns true, otherwise it's false.

The FindAdjacentCells(x,y) method inside the GameBoard object was modified to use this Blocked property.  This method will return all cells that are on the map board but not mountain cells.  There are several unit tests that check different conditions (edge of map, corner of map, near mountains and in the middle).

Now to set up a test board to test the blocked cells:

This is a 7 by 6 cell game board with a German unit on cell 0,3, a city on cell 6,3 and a spare allied unit in the lower right corner (to keep the game from ending with a German win).  Finally, I put a line of mountains blocking the German unit from reaching the city.  With the simple straight-line navigation algorithm, the German unit will go to the wall and toggle between cells 2,3 and 2,2.

The A* Algorithm

Next I searched for an A* algorithm that I could adapt.  I ended up using this web site to explain the details and I wrote the entire thing from the ground up:

A* Pathfinding for Beginners

The A* algorithm uses two lists to contain search nodes: the open list and the closed list.  I created an object that represented one search node and called it AStarNode.  This node needs to contain the F, G and H variables as described in the beginner guide (I set those to integers).  The F variable is nothing more than G + H, so I just made a getter that adds G and H and returns the result as F.  The next variables I needed were the X,Y coordinates of the cell that this node will represent and finally the location of the cell that was searched from (called Source).  The constructor for the AStarNode just stores the values in the properties and then it computes the distance to the destination (which is an approximation of the distance as the crow flies).
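A minimal sketch of such a node is shown below.  The F, G, H and Source ideas come from the description above; the exact property names, the constructor signature and the hex distance used for H (an "odd-q" offset layout) are assumptions rather than the repository's code.

```csharp
using System;

public class AStarNode
{
    public int X { get; private set; }
    public int Y { get; private set; }
    public int SourceX { get; private set; }   // cell this node was reached from
    public int SourceY { get; private set; }
    public int G { get; private set; }         // steps taken from the start cell
    public int H { get; private set; }         // estimated steps left to the destination
    public int F { get { return G + H; } }     // always derived, never stored

    public AStarNode(int x, int y, int sourceX, int sourceY, int g, int destX, int destY)
    {
        X = x; Y = y;
        SourceX = sourceX; SourceY = sourceY;
        G = g;
        H = HexDistance(x, y, destX, destY);
    }

    // Assumed heuristic: hex step count on an "odd-q" offset grid.
    private static int HexDistance(int x1, int y1, int x2, int y2)
    {
        int q1 = x1, r1 = y1 - (x1 - (x1 & 1)) / 2;
        int q2 = x2, r2 = y2 - (x2 - (x2 & 1)) / 2;
        int dq = q2 - q1, dr = r2 - r1;
        return Math.Max(Math.Abs(dq), Math.Max(Math.Abs(dr), Math.Abs(dq + dr)));
    }
}
```

Under this assumed layout, a node for cell 0,2 reached from 0,3 with the destination at 6,3 gets G=1, H=6 and F=7.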

Next, I created a list to contain AStarNodes.  This generic list class is called AStarNodeList (yeah, not creative, but obvious).  Instances of this list become the open and closed lists.  That means that I can put all the methods I need inside this list to manipulate the nodes being processed.  The FindSmallestNode() method is very useful.  It finds the node with the lowest F value and returns it (after it removes it from the list).  This is where I will grab the smallest node, find all its surrounding nodes and then push it on the closed list.  I created a Contains() method to be used when I get surrounding nodes and I want to not save the ones that are already in the closed list.  I created a GetNode(x,y) so I can read the nodes off the list in order to build the way point list (the goal of this whole exercise).
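The three list operations described above could be sketched like this.  The node type here is a cut-down placeholder and the class names are hypothetical; only the behaviour (remove-and-return the lowest F, membership test, lookup by coordinates) follows the text.

```csharp
using System.Collections.Generic;

// Placeholder node with just enough fields for the list operations.
public class NodeSketch
{
    public int X, Y, G, H;
    public int F { get { return G + H; } }
}

public class NodeListSketch
{
    private readonly List<NodeSketch> items = new List<NodeSketch>();

    public int Count { get { return items.Count; } }

    public void Add(NodeSketch node) { items.Add(node); }

    // Returns the node at x,y or null when the cell is not on this list.
    public NodeSketch GetNode(int x, int y)
    {
        return items.Find(n => n.X == x && n.Y == y);
    }

    public bool Contains(int x, int y) { return GetNode(x, y) != null; }

    // Removes and returns the node with the lowest F, or null when the list is empty.
    public NodeSketch FindSmallestNode()
    {
        if (items.Count == 0) return null;
        NodeSketch best = items[0];
        foreach (NodeSketch n in items)
            if (n.F < best.F) best = n;
        items.Remove(best);
        return best;
    }
}
```

A linear scan is fine for a map of a few dozen cells; a priority queue would only start to matter on much larger boards.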

Next is an object called ShortestPath.  This object is what does the work of putting the first node on the open list, then calling the FindPath() method which will recursively find the smallest node and find its neighbors until it runs into the destination point.  I put in an iterations counter and made sure I exited if it hit 50.  This max might need to be increased if the map size is larger (you'll know if the unit goes almost to its destination and stops).  I wanted to make sure I didn't get an infinite loop while I'm testing.
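For readers who want to see the whole search in one place, here is a compact sketch.  It is not the ShortestPath class from the repository: it uses a loop instead of recursion, it assumes an "odd-q" hex layout, and every name in it is made up for the example.  The iteration cap plays the same role as the counter described above.

```csharp
using System;
using System.Collections.Generic;

public static class HexPathSketch
{
    private class Node
    {
        public int X, Y, G, H;
        public Node Source;                        // node this one was reached from
        public int F { get { return G + H; } }
    }

    // Returns the cells to step through (start excluded, destination included),
    // or an empty list when no path was found within maxIterations.
    public static List<int[]> FindPath(int width, int height, bool[,] blocked,
        int startX, int startY, int destX, int destY, int maxIterations)
    {
        var open = new List<Node>();
        var closed = new HashSet<int>();           // cell index = y * width + x
        open.Add(new Node { X = startX, Y = startY, H = Distance(startX, startY, destX, destY) });

        for (int i = 0; i < maxIterations && open.Count > 0; i++)
        {
            Node current = open[0];                // pick the open node with the lowest F
            foreach (Node n in open) if (n.F < current.F) current = n;
            open.Remove(current);
            closed.Add(current.Y * width + current.X);
            if (current.X == destX && current.Y == destY) return WalkBack(current);

            foreach (int[] d in Offsets(current.X))
            {
                int nx = current.X + d[0], ny = current.Y + d[1];
                if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                if (blocked[nx, ny] || closed.Contains(ny * width + nx)) continue;
                Node known = open.Find(c => c.X == nx && c.Y == ny);
                if (known == null)
                    open.Add(new Node { X = nx, Y = ny, G = current.G + 1,
                                        H = Distance(nx, ny, destX, destY), Source = current });
                else if (current.G + 1 < known.G)
                {
                    known.G = current.G + 1;       // found a shorter route to this cell
                    known.Source = current;
                }
            }
        }
        return new List<int[]>();
    }

    // Follows the Source links back to the start and returns the cells in travel order.
    private static List<int[]> WalkBack(Node node)
    {
        var path = new List<int[]>();
        while (node.Source != null) { path.Insert(0, new[] { node.X, node.Y }); node = node.Source; }
        return path;
    }

    // Neighbor offsets for even and odd columns of an "odd-q" grid.
    private static int[][] Offsets(int x)
    {
        return (x & 1) == 0
            ? new[] { new[] { 0, -1 }, new[] { 0, 1 }, new[] { -1, -1 }, new[] { -1, 0 }, new[] { 1, -1 }, new[] { 1, 0 } }
            : new[] { new[] { 0, -1 }, new[] { 0, 1 }, new[] { -1, 0 }, new[] { -1, 1 }, new[] { 1, 0 }, new[] { 1, 1 } };
    }

    private static int Distance(int x1, int y1, int x2, int y2)
    {
        int r1 = y1 - (x1 - (x1 & 1)) / 2, r2 = y2 - (x2 - (x2 & 1)) / 2;
        int dq = x2 - x1, dr = r2 - r1;
        return Math.Max(Math.Abs(dq), Math.Max(Math.Abs(dr), Math.Abs(dq + dr)));
    }
}
```

The blocked array is indexed as blocked[x, y].  On a 7 by 6 board a cap of 50 is enough because each cell is closed at most once, which matches the observation that the cap has to grow with the map.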

There is a method called GetWayPoint(x,y) inside the ShortestPath object.  This is used by the game to get the next way point.  Basically, to ensure that I can provide backwards compatibility, I just made my CollectGermanMovementData() method call this method first.  This method will check to see if the WayPoint list has any nodes.  If not, then a null is returned and CollectGermanMovementData() will grab the coordinates using the old fashioned direct calculation method (because lCoordinates will be null).  If the first way point variable is equal to the unit x,y coordinates, then it is removed and the next coordinates are returned (but not removed from the list).  The reason this is left on the list is in case the unit runs into an allied or even a German unit and is temporarily blocked.  For now, it will stop and attack (if Allied unit) and continue on its way when it is no longer blocked.  In the future, I'll make it recompute the path (if it's just going around and not attacking).  Baby steps.
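The peek-but-do-not-remove behaviour described above can be sketched as follows.  The class and its Add() helper are hypothetical; only the GetWayPoint(x,y) name and its rules come from the text.

```csharp
using System.Collections.Generic;

public class WayPointListSketch
{
    private readonly List<int[]> wayPoints = new List<int[]>();

    public void Add(int x, int y) { wayPoints.Add(new[] { x, y }); }

    // Returns the next cell {x, y} to move to, or null when no path is stored
    // (the caller then falls back to the direct calculation).
    public int[] GetWayPoint(int unitX, int unitY)
    {
        // Drop the first way point only once the unit is standing on it.
        if (wayPoints.Count > 0 && wayPoints[0][0] == unitX && wayPoints[0][1] == unitY)
            wayPoints.RemoveAt(0);

        // Peek at the next way point without removing it, so a temporarily
        // blocked unit gets the same answer again on its next turn.
        return wayPoints.Count > 0 ? wayPoints[0] : null;
    }
}
```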

How to Build Something This Complicated

This algorithm is a bit complicated and there is a lot going on.  So I created a Visio document with the test map fully numbered.  Then I plotted the F, G and H numbers for the first iteration:

I also put arrows on the diagram to point to the previous cell.  Then I put a breakpoint in my AStarNode object and I ran the program.  The first AStarNode was node 0,3.  Then 0,2.  That's when I checked to make sure that G=1, H=6 and F=7 (the red numbers in the corners of each cell).  Then I checked the next cell, which turned out to be cell 0,4, and so on.  On the next iteration, things got a bit complex so I added Log4Net to my application, spit out the node being worked on and listed the nodes that were pushed onto the open list.  Eventually, I was able to walk through the log file and see the order that the algorithm was using to walk toward the goal of cell 6,3.  

The last task of this project was to make sure the list was read back into the way point list.  You have to walk the list backwards because the AStarNode contains the x,y coordinates for the previous node.  A simple while loop walked this back to the starting point and I just inserted it backwards into the WayPoint list, leaving the starting point out of the list (since the GetNextWaypoint() method takes care of that anyway).

Get the Code

You can go to my GitHub account and download the zipped up version of this project (Click Here).  Assuming you have Visual Studio 2012 or better, you should be able to re-build the project and make this work.  If you want to run the sample as shown in this blog post, you can go to the HomeController.cs file and change the GameData.InitializeGame(1) to a 3.  Then go to the GameBoard.cs file (in the core project) and at the top is a TestMode boolean variable that you should set to true.  Then run the game and you'll see the map pictured earlier in this post.  You can just click the next phase button and watch the German unit navigate to the city on the right side of the map.  If you want to dig through the code, you can un-comment the log4net logging inside the ShortestPath.cs file and run the program, then look at the log file (which will be located in c:\logs).

Last, you should take a look at how the non-A* algorithm failed.  You can do that by going to the GameClass.cs file.  Search for the SetEnemyStrategy() method and you'll see two places where the ComputePath() method is called.  Comment these two lines and run the program.  You'll see the German unit walk right up to the mountains and get stuck going back and forth.

Tuesday, October 14, 2014

Turning off Log4Net Logging for NHibernate

Short Post!

In one of my previous posts I showed how to set up Log4Net to log errors and warnings.
One of the nice features of NHibernate is that it directly interfaces with Log4Net and generates all kinds of information.  If you use Log4Net, you'll see pages of information about what your NHibernate ORM is doing when it runs.  The problem with this information is that it's too much raw data and rapidly turns into spam rather than helpful information.  So I'm going to demonstrate how to turn off some of that output so you can see your Errors, without all the extra information.

Sample of full error reporting from NHibernate as a default:

Right after the closing </root> tag you can paste this into your app.config (or web.config) file:

   <logger name="NHibernate">
     <level value="ERROR" />
   </logger>
   <logger name="NHibernate.SQL">
     <level value="ERROR" />
   </logger>

This will cause NHibernate to only output errors and reduce the amount of information that goes into your log file.  If your program runs without bugs, you will not see anything from NHibernate in your log files.

You can also use "WARN" and "DEBUG" like this:

   <logger name="NHibernate">
     <level value="WARN" />
   </logger>
   <logger name="NHibernate.SQL">
     <level value="DEBUG" />
   </logger>

This will cause NHibernate to output the SQL that it generates, but keep other output to warning level and above.

Here's an example of a log file using the above configuration:

Notice that only the SQL information was sent to the log file (unless your NHibernate queries or mappings fail).

Which brings me to the fact that NHibernate uses two loggers.  You can independently control the level of logging for NHibernate itself and the logging for the SQL that NHibernate generates.  This is helpful when you want to verify your LINQ or you notice a performance issue that might be caused by NHibernate generating a bad query.

OK, so maybe this wasn't a short post after all.  You might want to tuck this information away for future use.  It could save a lot of time when trying to debug your database application.

Sample Code

You can download the sample code here:

Sunday, October 12, 2014

Game Design, Back to Battlefield One


A while back I wrote a sample game called Battlefield One.  This game was a turn-based war game that was built on SVG, JavaScript and C#.  I have since converted this game into an MVC application (although it still uses the same code-behind techniques and I keep track of the game state with a session).  I have also refactored much of the game to expand it, added unit tests, made the dimensions more flexible and added tanks.


OK, here's how the game was designed:

1. The game board is rendered in hex cells.
2. Game units do not stack; only one unit can occupy a cell at a time.
3. Attack distance is assumed to always be 1 for simplicity.
4. No terrain effects.
5. Capture all cities to win.
6. Destroy all enemy units to win.
7. Movement phase then attack phase.
8. Areas of board not visited will be blacked out.
9. Areas not visible to any unit will be dimmed and not display any enemy units.

I improved the AI some since the first game was written:

1. Enemy units will attack the unit with the lowest defense number.  This causes many enemy units to gang up on one nearby unit, reducing the number of Allied units quicker and reducing the number of future Allied attacks.

2. All enemy units will choose a city to move toward.  

3. There is a test flag in the GameBoard object that will allow you to turn off the masks so you can test the game.  This mode will also draw red lines between the units and the destination they have chosen.

4. If there are two allied units next to an enemy unit and one allied unit is on a city, the enemy unit will attack the unit on the city first.
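Rules 1 and 4 above amount to a small selection function.  The sketch below is an illustration of that logic only; the TargetSketch type and its fields are hypothetical stand-ins for the game's unit class.

```csharp
using System.Collections.Generic;

// Hypothetical view of an allied unit that an enemy unit could attack.
public class TargetSketch
{
    public int Id;
    public int Defense;
    public bool OnCity;
}

public static class TargetPickerSketch
{
    // Prefers a unit sitting on a city; otherwise the lowest defense wins.
    // Returns null when there is nothing to attack.
    public static TargetSketch Choose(IEnumerable<TargetSketch> attackable)
    {
        TargetSketch best = null;
        foreach (TargetSketch t in attackable)
        {
            if (best == null
                || (t.OnCity && !best.OnCity)
                || (t.OnCity == best.OnCity && t.Defense < best.Defense))
            {
                best = t;
            }
        }
        return best;
    }
}
```

Because every enemy unit applies the same rule, several of them tend to pick the same weak target, which is the ganging-up effect described in rule 1.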

Algorithms Used

The Die Roller object was improved.  Instead of using one 6-sided die, I used a combination of three die rolls.  The problem with one die is that every outcome is equally likely, which makes the game boring.  As you use more dice, the distribution of totals looks more like a normal distribution and your attack and defense outcomes become more realistic.  This was not much of a problem until I introduced tanks which could inflict a damage of 1 or 2.  I wanted to have a damage of 1 occur more often than a damage of 2.  Wolfram Alpha has an article on this here.  My interest was in this chart (from the wolfram article), which explains it all visually:
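The difference between one die and three can also be checked by counting outcomes.  This short sketch is not part of the game's Die Roller object; the DiceOdds class is made up for the illustration.

```csharp
public static class DiceOdds
{
    // Returns an array where element s is the number of ways to roll a
    // total of s using the given number of six-sided dice.
    public static int[] CountTotals(int dice)
    {
        int[] ways = new int[6 * dice + 1];
        ways[0] = 1;                               // one way to total 0 with no dice
        for (int d = 0; d < dice; d++)
        {
            int[] next = new int[ways.Length];
            for (int total = 0; total < ways.Length; total++)
                for (int face = 1; face <= 6 && total + face < ways.Length; face++)
                    next[total + face] += ways[total];
            ways = next;
        }
        return ways;
    }
}
```

With one die every total from 1 to 6 can be rolled exactly one way.  With three dice there are 216 combinations: totals of 10 and 11 each come up 27 ways, while 3 and 18 come up only one way each, so results near the middle are far more common than the extremes.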

I added unit tests to complex parts of the game.  The original game was written without any unit tests.  This was due to the fact that the original game was only a demonstration of what can be done and I only needed it for an example in a blog post.  Now I intend to turn this into a project (aka hobby) that I will add features to and blog about (like AI enhancements).

I added a unit list class.  This is a list of units for the entire game and it allowed me to refactor some methods that were performed against the list of units.   This object also has an AddUnit method to insert a unit into the list without referencing the list of units (called "Items") directly.

I added a game board class to contain all the information and methods pertaining to the game board itself.  Each cell now references a GameMap object containing variables for the terrain, mask and visibility settings.  The previous version of this game contained three arrays to keep track of this information and I wanted a way to be able to change the game board size at game startup.

I changed the map background pieces into individual hex pictures so I could create a board of any size.  The original game had a large painted background that was fixed in size and had 4 cities painted in place.  By creating hex images of the green areas and a hex image of a city, I can just set the array at game start with the map size and city positions I want.  Future versions can contain a mixture of different textures to make the game more interesting.

There was a bug involving the movement of playing pieces.  The movement calculator would follow a straight line and then snap the unit to the closest board hex.  The problem with this algorithm is that any piece traveling close to the edge of the map would stop because the closest hex position was off the map and the edge detector prevented the unit from going off the board.  The new algorithm gathers a list of up to 6 surrounding hex cells.  It will eliminate any hex that is off the board and any hex that is already occupied by another unit.  The remaining hex positions are ranked by distance to the destination and the closest one is chosen as the next hex to enter.  This allows units to go around each other if necessary and the destination will always be reached unless it is totally blocked.
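The neighbor-filtering step can be sketched like this.  It is an illustration under assumptions, not the game's movement code: the class name, the "odd-q" hex layout, the hex step count used as the distance, and the occupied-cell encoding (y * width + x) are all made up for the example.

```csharp
using System;
using System.Collections.Generic;

public static class GreedyStepSketch
{
    // Returns the neighboring cell {x, y} closest to the destination,
    // or null when every neighbor is off the board or occupied.
    public static int[] NextCell(int x, int y, int destX, int destY,
                                 int width, int height, HashSet<int> occupied)
    {
        int[] best = null;
        int bestDistance = int.MaxValue;
        foreach (int[] d in Offsets(x))
        {
            int nx = x + d[0], ny = y + d[1];
            if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;  // off the board
            if (occupied.Contains(ny * width + nx)) continue;               // another unit is there
            int distance = Distance(nx, ny, destX, destY);
            if (distance < bestDistance) { bestDistance = distance; best = new[] { nx, ny }; }
        }
        return best;
    }

    // Neighbor offsets for even and odd columns of an "odd-q" grid.
    private static int[][] Offsets(int x)
    {
        return (x & 1) == 0
            ? new[] { new[] { 0, -1 }, new[] { 0, 1 }, new[] { -1, -1 }, new[] { -1, 0 }, new[] { 1, -1 }, new[] { 1, 0 } }
            : new[] { new[] { 0, -1 }, new[] { 0, 1 }, new[] { -1, 0 }, new[] { -1, 1 }, new[] { 1, 0 }, new[] { 1, 1 } };
    }

    public static int Distance(int x1, int y1, int x2, int y2)
    {
        int r1 = y1 - (x1 - (x1 & 1)) / 2, r2 = y2 - (x2 - (x2 & 1)) / 2;
        int dq = x2 - x1, dr = r2 - r1;
        return Math.Max(Math.Abs(dq), Math.Max(Math.Abs(dr), Math.Abs(dq + dr)));
    }
}
```

A step chosen this way only looks one cell ahead, so it handles other units in the way but has no memory of where it has already been.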

Playing the Game

You can go here to play the game on-line, or you can download the source code and experiment with the code yourself.  To download the source code, go to and download the zip file (there's a button in the lower right corner).

Saturday, October 11, 2014

Using Log4Net Logging Utility


In this blog post I'm going to show you how to use Log4Net.  Log4Net is a logging library that you can add to an application in order to log errors, warnings and informational messages.  Your log can be a file on your hard drive or messages can be sent to an email address.

The Basics

If you're building software that will run on a web server or as a service on a server to process data, you're going to want some sophisticated logging.  Logging can give you clues as to how your software is performing, it can indicate if there were errors and you can log information so that you can trace back and find out what your program did before it crashed (assuming it crashed).  If you have many systems, then you'll probably want one central location where you can gather logs and the easiest way to do that is to set up logging to send emails to a debug email account.  Then you can set up filters to sort your log errors by system.  Once you look at the emails you're receiving from your logger you can see if something is going wrong once in a while, or if your system is going crazy, just by the rate at which your error emails are arriving.

Setting up Log4Net

You can install Log4Net from NuGet.  Just go to Tools -> NuGet Package Manager -> Manage NuGet Packages for Solution.. and type in "log4net" in the search box.  Then install the Log4Net package.

Next you'll need to set up your app.config file.  This file is where all the magic occurs.
First, you'll need a line at the top, inside the <configSections> element, that tells your application that a log4net configuration follows:

    <section name="log4net" type="log4net.Config.Log4NetConfigurationSectionHandler, log4net" />

Next, you'll need to define one or more "appenders".  These are definitions for each output you want your errors to go to.  You can set up an appender to go to your console window and configure it for a colored display.  You can set up an appender for a log file stored on a local hard drive.  You can set up an appender for an email address to send an email for each error/warning that occurs.  If you set up more than one appender then each destination will receive error messages.

Here's an example of a log file appender that you can copy and paste into your app.config file:

    <appender name="RollingLogFileAppender" type="log4net.Appender.RollingFileAppender">
        <file value="C:\logs\Log4NetDemoApplicationLogFile.txt" />
        <appendToFile value="false" />
        <rollingStyle value="Size" />
        <maxSizeRollBackups value="3" />
        <maximumFileSize value="50MB" />
        <layout type="log4net.Layout.PatternLayout">
            <conversionPattern value="%date [%thread] %-5level %logger%newline%message%newline%newline" />
        </layout>
    </appender>

To make this work, you'll need a root element block to reference this appender (put this before the closing </log4net> tag):

    <root>
        <level value="DEBUG" />
        <appender-ref ref="RollingLogFileAppender" />
    </root>

I'm going to discuss what all this means in a minute, but let's get this thing up and running first, so you can see something happen.  In your Program.cs file add a using for "log4net" and then replace your program with something like this:

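class Program
{
    private static ILog log = LogManager.GetLogger(typeof(Program));

    static void Main(string[] args)
    {
        // Assumed: load the log4net section from app.config.
        log4net.Config.XmlConfigurator.Configure();
        log.Info("Program initialize");
        DoSomethingBad();
    }

    static void DoSomethingBad()
    {
        try
        {
            throw new ArgumentException();
        }
        catch (Exception ex)
        {
            log.Error("Something bad happened", ex);
        }
    }
}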

Now you can run your program and observe that a c:\logs directory was created on your PC (if you didn't have one already).  Inside that directory is a file titled: Log4NetDemoApplicationLogFile.txt.  If you run your program again, you'll see another file with the same file name but a ".1" after it.  If you continue to run your console application, you'll see a file added each time with a ".2", then a ".3" etc.  What's happening here is that the first file is renamed with a ".1" after it and then a new file with only the ".txt" extension is created.  That is always the latest log file.  The remaining files will be incremented by one each time a new log file is created, causing them to shift down in your directory list.

If you look at the appender block above you'll see a setting called "maxSizeRollBackups".  You can set this to the maximum number of files you wish to save.  In this example it's set to 3.  So the last file in your logs directory will end with a ".3".  If you start your program after that, then all files will shift down and the last ".3" file will disappear.

The "appendToFile" setting controls whether log4net appends to the existing file.  In this example it is false, so we create a new file every time we restart the application, but you can set it to true and just append to an existing file until it reaches its maximum size.

The maximum size of the file can be set using the "maximumFileSize" setting.  When the log file reaches this size, a new file will be created.  This sample is set to 50 megabytes.

The layout conversion pattern allows you to define what you want to display in your error message.  You can customize this to include things like the thread number (or leave it out if you don't care), the date, etc.

The Colored Console Appender

Now, let's add an appender to report log entries on the console itself.  Copy this after your first <log4net> tag:

<appender name="ColoredConsoleAppender" type="log4net.Appender.ColoredConsoleAppender">
    <mapping>
        <level value="DEBUG" />
        <foreColor value="White" />
    </mapping>
    <mapping>
        <level value="INFO" />
        <foreColor value="Green" />
    </mapping>
    <mapping>
        <level value="WARN" />
        <foreColor value="Yellow, HighIntensity" />
    </mapping>
    <mapping>
        <level value="ERROR" />
        <foreColor value="Red, HighIntensity" />
    </mapping>
    <mapping>
        <level value="FATAL" />
        <foreColor value="White" />
        <backColor value="Red, HighIntensity" />
    </mapping>
    <layout type="log4net.Layout.PatternLayout">
        <conversionPattern value="%date [%thread] %-5level %logger%newline%message%newline%newline" />
    </layout>
</appender>

You'll also need to add this to your root block:

<appender-ref ref="ColoredConsoleAppender" />

Now if you run your program, you'll see error logging in your console as well as the log file.  You can alter which colors represent each type of logging.

The SMTP Appender

As I mentioned earlier, you can add an appender to send log entries to an email location.  Here's an example email or SMTP appender that will send log messages that are errors:

<appender name="SmtpAppenderError" type="log4net.Appender.SmtpAppender">
    <to value="" />
    <from value="" />
    <subject value="Export Notification - Error" />
    <smtpHost value="" />
    <bufferSize value="512" />
    <lossy value="true" />
    <evaluator type="log4net.Core.LevelEvaluator">
        <threshold value="ERROR" />
    </evaluator>
    <filter type="log4net.Filter.LevelRangeFilter">
        <levelMin value="DEBUG" />
        <levelMax value="FATAL" />
        <acceptOnMatch value="true" />
    </filter>
    <layout type="log4net.Layout.PatternLayout">
        <conversionPattern value="%property{log4net:HostName} :: %level :: %message %newlineLogger: %logger%newlineThread: %thread%newlineDate: %date%newlineNDC: %property{NDC}%newline%newline" />
    </layout>
</appender>

After you paste the above code into your <log4net> block you'll also need a reference for this block:

<appender-ref ref="SmtpAppenderError" />

Now you need to doctor some values in the appender block.  Change the "to" email address to your email address.  You'll need an smtp server address to relay your emails (smtpHost). Then you can run your program and see error messages go to your in-box.

Download Sample File

You can download the sample file here:


Saturday, October 4, 2014

Fluent NHibernate Mapping Generator


I finally did it.  I completed my new Fluent NHibernate mapping generator.  It has several features that make it more powerful than the hacked together console application that I did earlier this year.  Included in this package are objects that use the generated code to create unit tests around SQLLocalDB.

The Mapping Generator

The mapping generator has an interface allowing you to choose which SQL server to create mappings from.  Once you select your server, a check list of databases will be shown allowing the selection to the database level.  Once one or more databases are selected, click the "Generate" button and your database mappings will be generated in the project named "NHibernateDataLayer".  You can change the location of this directory to any directory or project file.  If you choose a directory, your mappings will be created inside the directory and you will need to manually add your cs files to a project to use them.  If you designate a project, the mapping cs files will be generated in the project directory and the project file xml will be updated so they show up in the Visual Studio project list.

The generator will create a directory with the name of the database.  Inside that directory will be four directories for tables, views, stored procedures and constraints.  The tables are used by Fluent NHibernate to become the mappings for the ORM.  The views, stored procedures and constraints are definitions that will allow your unit tests to generate these objects in a SQLLocalDB before running a unit test.  If you change your stored procedure in your database, you can re-run the generator to update the stored procedure that will be used in your unit tests.

The Unit Testing Package

This unit test package has the ability to insert data defined in an XML file or a JSON file.  To use this feature you will need to create a file like the ones in the sample package, and make sure the file is set as embedded.  Then you can use the UnitTestHelpers.ReadData() command to read all the xml data into the database.  If you specify a primary key data point, then any identity columns will be overridden while inserting this data.  If you don't specify a key in your xml data and the column is setup as an identity column, then a new primary key will be generated.  You can leave out optional fields (i.e. nullable fields) from your xml definition and they will be ignored by the insert command in the ReadData() method.  

The UnitTestHelpers.CreateConstraint() method allows you to create a constraint that has been defined in your database between two specified tables.  This allows you to create only the constraints you'll need for the tables that you will be running your unit tests against.  I did it this way to cut down on the number of constraints to create; otherwise you would have to pre-populate every table that had a constraint on it, instead of just the sub-set of tables you wish to test.

The UnitTestHelpers.ClearConstraints() method will clear all constraints.  Make sure you call this at the end of your unit test, otherwise the TruncateData() method used in the cleanup method of your unit tests will fail.

What can you do with this Source Code?

Anything you want.  Extract the code you wish to add to your own projects.  Download the code and add the features you desire.  I hope to some day make this into a real application, but for now, I just want to get this posted so other people can use it.  No warranties are implied and you'll have to use this code at your own risk.  If you find any bugs, you can drop a comment here, email me or create an issue on GitHub (you'll need to sign up for a free GitHub account to post issues).

Where to get the Source Code

I used GitHub to check in my source code.  That will allow me to fix any bugs and add any features in the future.  You can click here to find the source.

Dealing with Legacy Code


In this blog post I'm going to talk about legacy code and my experience with transitioning from Legacy into a new model.


I've worked for quite a few companies now and I've converted many systems from antiquated legacy code into something more modern, like C#, PHP, Object Oriented, MVC, etc.  Legacy code that I have worked with included languages like VB.Net, Basic, FoxPro, MS Access, C, ASP and other systems.  My experience has been that a company will hire a "programmer" with little qualifications, other than the fact that they can write code and make small programs work.  Once the software created by this person (and it's almost always one person at first), grows to a limit where it becomes an overwhelming burden on the programmer and starts to drag down the company, some changes need to be made.  Usually this is where the professionals get hired and the real work begins.  Don't get me  wrong, I'm not bragging here, I was once that guy that produced some code in Basic in a non-structured layout with no design in hand.  That's a fun job as long as you can get out before you have to add features to the mess that was created.

Sometimes legacy code survives because it is so big, so expensive or so risky to upgrade that a company will hang on to it forever.  I can tell horror stories about a company that used Fortran for their accounting with thousands of rules that applied to their union negotiated benefits.  All of it was customized for the company on a mainframe computer that needed to be fixed in time for the year 2000.  I'm certain that software was well designed.  Unfortunately, the company had two Fortran programmers who had full-time jobs maintaining that software and no plans for replacing the software in the near future.  To top that off, the programmers who maintained that system must have been close to retirement.  I'm not sure what that company did, but I'm betting that events probably forced their hand.

Initial Handling of Legacy Code

First of all, I think developers need to think like a doctor.  Doctors have a maxim that is often attributed to the Hippocratic Oath: "First, do no harm".  Let's face some facts: First, the legacy code is running the business, otherwise you could just throw it away.  Second, the legacy code in place might be the only actual "documentation" of what it was intended to do.  It's always best if there are design documents, because then you can read those and find out what the original intent was.  If there are meeting notes, that's good too.  Otherwise, the only thing left (assuming the original programmer is long gone) is the software itself.  Does it have comments?  Probably not.  Are variable names descriptive?  That usually doesn't happen either.  You might need to do some refactoring, but I would do some recon work before attempting anything.

One of the biggest mistakes inexperienced programmers make is ignoring version control.  So it's probably also true that you can't look at earlier versions of the legacy code you are about to work on.  The first step in all of this is to make sure you have the entire source code package checked into a repository and have some mechanism in place to track changes.  This is critical, because you'll probably make a mistake and need to roll back.

Converting Code - My Experiences

I've never worked on legacy code that contained unit tests.  Can unit tests be created to test the legacy code?  This is always a huge challenge.  It would be nice if a unit or integration test could be applied to legacy code without changing the code.  That would make it easy to convert.  Once the unit tests were in place, you could convert the code and use the same unit tests to verify that the output of the changed code matched the output of the original code.  Unfortunately, legacy code is usually tightly coupled and many times it's tied into other systems that cannot be accessed safely from a unit test environment.  You might be able to perform some sort of end-to-end automated tests.  This could become a time-sink in itself, and my experience is that the company keeps making demands to enhance the existing legacy code while you're attempting to fix it.
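
To make the "same tests against old and new code" idea concrete, here is a minimal sketch in C#.  It assumes the legacy routine and its replacement can both be called as plain functions, which is often not true of tightly coupled legacy code.  The names CharacterizationCheck and Compare are made up for this illustration and do not come from any real project.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: feeds the same inputs to the legacy routine and to
// its converted replacement, then reports every input where they disagree.
public static class CharacterizationCheck
{
    public static List<string> Compare<TIn, TOut>(
        IEnumerable<TIn> inputs,
        Func<TIn, TOut> legacy,
        Func<TIn, TOut> converted)
    {
        var mismatches = new List<string>();
        foreach (var input in inputs)
        {
            // The legacy output is treated as the expected result, bugs and all.
            TOut expected = legacy(input);
            TOut actual = converted(input);
            if (!EqualityComparer<TOut>.Default.Equals(expected, actual))
            {
                mismatches.Add($"input={input}: legacy={expected}, converted={actual}");
            }
        }
        return mismatches;
    }
}
```

An empty list means the converted code matched the legacy code for every input you supplied.  It says nothing about inputs you did not try, so the check is only as good as the sample data you feed it.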

One of the methods I used when I converted BTA's software from two un-normalized databases (using SQL and FoxPro) into one normalized Oracle database server was to write a data converter and re-write the software.  This was quite a painful experience and I would not recommend this technique unless your company has full buy-in and the enhancement requests can be reduced to a minimum.  In the case of BTA, the data quality issue was so severe that fixing it was their customer's number one demand.  It took about a year to complete and the roll-out was difficult, but it worked.  In order to be effective I hired two contractors to maintain the existing system and write the converter.  My remaining developers worked on the new system.  The converter was run at least once a week to refresh our new system database and to find and fix any normalization issues that cropped up in the legacy system.  Once the new system was on-line and the initial bugs were worked out, everything ran noticeably smoother.  Having one code base that was well structured helped a lot.  At that time unit tests were not available and most of our code was in PHP and C.

Years later BTA ran into another issue.  This time it was an issue involving the demise of Borland.  Borland made a great product at one time called C++ Builder.  This was C++ that had forms, allowing the developer to create a stand-alone program in C++ without programming and connecting all the resources necessary to handle events, windows, menus, etc.  The product worked the way C# does today.  Visual C++ did not have this capability, which made it difficult and time-consuming to build an application.  Unfortunately, C++ Builder went through some changes and the 2006 version was too buggy for our developers to use.  So we tested Visual Studio 2005 with C# and discovered that we could quickly convert our software from C++ Builder into Visual C# and have an improved product.  So we did, and it cost us about 4 months of conversion time using one developer.  This legacy code was easy to convert because we had design documentation, we did not change the database and we could transition from the old to the new system because both worked at the same time.  We just continued to use the new software and fix bugs that cropped up until it was better than the legacy software.

About one year after the conversion to C#, BTA ran into an issue that had happened many times before.  We were a Unix shop (Oracle and PHP), but a smaller company with little possibility of major growth in the near future.  The issue was that we could not keep a Unix expert on staff full-time and there were very few contract companies in the area that could cover our issues.  In addition, BTA took advantage of a one-time tax change that allowed them to collect a large amount of cash that year.  So I did some research into what it would take to transform BTA's Unix-based systems into Microsoft-based systems.  This was outside-the-box thinking, and I proposed that we convert our Oracle server into MS SQL, which cost us next to nothing since the licensing was comparable and the conversion of our software to point to SQL instead of Oracle was very minor (our system had very few stored procedures and field types were kept to a minimum).  Our biggest headache would be converting a large PHP web site into C# and IIS.  I wrote some helper classes in C# that mimicked the PHP functions that C# did not reproduce exactly (like PHP's substring functions).  Then I estimated what it would take in man-hours to convert the entire website into C# and I was tasked with finding a contracting company that could perform the job.  This was a job that involved copying the PHP code and pasting it into the code-behind of a C# .net application, then removing the dollar signs on variables and declaring all variables for the page.  Then there was some cleanup that occurred for each page.  In the end it took about 4-6 months with 3 contractors at a cost of about $140k.
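
To give a flavor of what such a helper looks like, here is a minimal sketch of a C# method that imitates PHP's substr().  This is not the original BTA code.  The class name PhpString and the method name Substr are invented for this example, and it follows PHP 8 behavior, where an out-of-range request returns an empty string (older PHP versions returned false in some of those cases).  The point is that String.Substring() in C# throws an exception when the requested range runs past the end of the string, while PHP quietly clamps it and also accepts negative offsets and lengths.

```csharp
using System;

// Hypothetical helper that mimics PHP's substr($string, $offset, $length).
public static class PhpString
{
    public static string Substr(string input, int offset, int? length = null)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));
        int n = input.Length;

        // A negative offset counts back from the end, clamped to the start.
        int start = offset < 0 ? Math.Max(n + offset, 0) : offset;
        if (start >= n) return string.Empty;

        int end;
        if (length == null)
        {
            end = n;                      // no length: take the rest of the string
        }
        else if (length.Value < 0)
        {
            end = n + length.Value;       // negative length: drop that many chars from the end
        }
        else
        {
            // Positive length: clamp to the end of the string instead of throwing.
            end = length.Value >= n - start ? n : start + length.Value;
        }

        if (end <= start) return string.Empty;
        return input.Substring(start, end - start);
    }
}
```

With a helper like this, a pasted PHP call such as substr($name, -3, 1) becomes PhpString.Substr(name, -3, 1) and the surrounding logic does not need to be rewritten.  One caveat: PHP's substr() counts bytes while C# strings count UTF-16 characters, so the two can differ on non-ASCII text.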

BTA's benefit was a code base that was entirely C#.  After that point, we were able to merge many common objects between our stand-alone application and our website.  We were also able to find network administrators with a Microsoft background more easily than Unix administrators.  In addition, and this was our biggest improvement to developer productivity, we were using Visual Studio for all of our code with built-in tools for refactoring code, syntax highlighting, auto-formatting and other capabilities we didn't have with PHP at the time.

At DealerOn the situation is different.  We have a solid database.  It might not be the most optimal database design, but it's organized pretty well and it doesn't suffer from many integrity issues.  What we currently have is a platform in VB.Net that we are slowly converting into C#.Net and eventually we will convert that into an MVC environment.  There is a lot of progress in this area of legacy code replacement.  Our biggest headache lies with our CMS (Content Management System) software that was built in classic ASP.  Our plan is to build a new system (which is already in progress) to utilize C# with MVC and build it with all the new features.  We are very averse to adding features to the legacy CMS.  Unfortunately, progress is slow and success is uncertain.  I am not a big fan of this method of conversion.  It only works if a company can staff up and build the new system as quickly as possible.  Otherwise the technology used on the new system will become obsolete and possibly be abandoned. 

Going Forward

I anticipate that advances in software development technology will bring us to a point where the legacy code will have unit tests.  At the very least it will be possible to add unit tests.  This will make the job of converting legacy code easier because there will be a more reliable method of verifying the new code.  I don't think the culture of businesses will change such that they will spend the money to hire the best developer when they start their custom software projects.  So we'll always be saddled with code that is tightly coupled, poorly designed and not documented.  Therefore, there will always be a painful legacy code conversion project in the future.  How you handle these projects will determine how successful you are in converting the code.  DealerOn's platform code conversion is one of the best solutions I've seen.  We will have a mix of VB and C# for some time until we get to a critical mass of C# vs. VB and we do a final "clean-up" push to convert the remaining VB code.  At this time the conversion from VB to C# is not costing the company much money since it is only done when a new enhancement is implemented.  This limits the amount of code converted at one time and allows a smaller amount of code to be tested before the next conversion takes place.

If you have the option of running two systems side-by-side, it can be a good solution, especially in accounting situations where verification is necessary.  If you can mix legacy and modern code, that is a good solution in situations where you are transforming a website.  If you have a faulty database design, then you're dealing with a more serious situation and any solution can get expensive fast.

Reasons to Convert Legacy Code

Conversion of code from one language to another or from one database system to another is expensive and does not necessarily benefit the company.  You should never convert a system just to change to the latest and coolest language.  The only time you should convert to a new language is if there is some sort of structural reason for converting.  In my experience, I converted a FoxPro database system into Oracle and PHP.  Why?  Because FoxPro was not a web language and BTA had an application to allow roofing contractors to enter their bid prices.  The FoxPro application they had was designed to be packaged into an installer, shipped by floppy (later downloaded by FTP) and installed on the contractor's PC with a subset of data to bid on.  Then the contractor was required to export their bid prices and email the zip file to BTA for analysis alongside the bids of other contractors.  By building a website, all of the headache of preparing data, shipping software, dealing with installation issues (contractors do not buy top of the line PCs) and importing corrupt data vanished overnight.  Now contractors get a login id and password.  They log in, enter their prices and log out.  It goes right into a database server.  FoxPro is a stand-alone database system with limited security capabilities and no web capabilities.  This needed to change and our target was Oracle and PHP.  If MySQL had been a mature product in the late 90's we probably would have opted for MySQL.  Today, I would investigate MongoDB and work up an estimate based on that technology.

Converting from PHP to C# was a very rare example of a conversion that could have been avoided.  It would not have been necessary if BTA had been a growing company that could afford a Unix administrator on staff, or if PHP had had better tools at the time.  Today, I probably would look at Ruby and Python.

Another consideration you should factor in before taking the conversion plunge: What languages do your developers know?  Are you going to have to re-staff?  Switching from PHP to C# did not break the hearts of my team of developers at BTA.  Some developers might take issue with converting from a Unix environment to a Microsoft environment (or the other direction).  I personally have no allegiance to either, but I'm a rare developer with extensive knowledge of both types of infrastructure.

Finally, if you have a valid reason and the benefits outweigh the costs, then don't hesitate.  Set down your plan and get to work.  The sooner you get through the pain of conversion the sooner you can move on to better things.