Sunday, December 29, 2013

MVC 4, Django Research

In earlier posts I've talked about the necessity of research.  I always keep my eyes open for new technology.  At DealerOn we are about to embark on a new and major project (of which I can't give details at this time).  All I can say is that it will be a complete ground-up design.  At this point all possibilities are open.  We have kicked around a bunch of ideas on how to design this new web application.  Django with Python is still being considered, and I have done some preliminary analysis of Visual Studio's Python/django implementation.  My first assessment of this implementation is that we should either go all in or not attempt it at all.  In other words: stick to Python's native OS (Linux) so we are relatively assured that we can build this web application without running into technical problems partway through the project. 

The other environment I was researching is MVC 4, specifically the Web API with AngularJS.  I found a CRUD sample application that combines AngularJS with Web API, and it works rather well.  My only real turn-off was the sheer number of source files it took to make that beast run.  Here's a link to the code: CRUD grid using AngularJS, web api, Entity Framework and Bootstrap.  One other appealing aspect of this project is that it uses EF and Bootstrap (Bootstrap provides responsive design for portable devices). 

In addition to downloading samples from the web, I have purchased two books to read and gain further insight into these two frameworks (django and MVC).  The first book is for django:


I've only thumbed through this book so far, but I own a lot of Apress books and they're generally good sources of information. 

For MVC, I purchased this book:

 
 
I started reading this book yesterday and I have to say that I like it a lot.  If you have never done MVC programming before, buy this book.  It starts with an empty project and covers one concept per chapter.  The first concept it covers is the controller.  The first test program is nothing more than a controller source file that returns a string.  The chapter describes the default controller names ("HomeController"), the default controller directory and how to use controller methods.  This is all taught without regard to views or models (each covered in its own later chapter).  The appeal of this method of learning is that you can learn one concept before moving on to the next.  I have looked at a lot of examples and reverse engineered some of this stuff (like the Web API interface, which I learned by reverse-engineering a working example).  My knowledge is much more solid when I can focus on one piece at a time.
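A controller-only first example like the one the book describes might look something like this (a minimal sketch of my own; the namespace and message are not from the book):

```csharp
using System.Web.Mvc;

namespace FirstMvcApp.Controllers
{
    // MVC finds this class by convention: the "Controller" suffix is
    // required, and /Home/Index routes here by default.
    public class HomeController : Controller
    {
        // Returning a plain string skips views entirely; the string
        // is written directly into the HTTP response.
        public string Index()
        {
            return "Hello from the controller. No view or model needed.";
        }
    }
}
```

Browsing to /Home displays the string, which makes it easy to confirm that routing works before views enter the picture.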
 
So now you're probably thinking "Frank, you've never done an MVC project before?"  Nope.  The reality of software development is that you only get to use a new technology when you create a new application or rewrite an existing application from the ground up.  After that, you're stuck for a long time with the technology you have chosen.  So choose wisely. 
 
I typically try to choose a technology that has been around for a while; that way all the quirks have been discovered by other developers and the answers are posted on forums like Stack Overflow (Stack Overflow, in case you don't already know, is the holy grail of answers for programming problems).  I also strive to choose something that is close to the bleeding edge so I know that it won't be obsolete before I deploy the first application (i.e. I could just create this new app as a standard ASP code-behind application and be done with it, but that's really no fun and I hate code-behind pages).  Currently, frameworks are all the rage and there's a reason for it: separation of concerns.  It is easier to maintain an application if each file or module is concerned with one aspect of your application.  Concerns are things like database access, URL routing, business logic, views, etc.  By dividing these tasks into different files or modules, the developer can focus on one aspect at a time and not mix it all together in one source file (remember ASP?  Yuck).
 
When I first looked at a demo of the MVC setup, I was horrified by the fact that there are three primary directories containing all the major code files for the web application (the Model, View and Controller directories).  In a real-world application, I can visualize these folders growing to thousands of files each.  However, I have another plan for our application.  I plan to leave the model directory empty and wire the controllers directly to the business objects, which will be contained in a separate project or projects.  All the Entity Framework database stuff will also be in a separate project.  This will give a clean break between the web side of things and the business code for unit testing purposes (and the ability to use the business code for other applications, if necessary). 
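Sketched in code, the idea is that the controller becomes a thin pass-through to a business class living in its own assembly (all names here are hypothetical, not from an actual project):

```csharp
// In a separate class library project, e.g. MyApp.Business:
namespace MyApp.Business
{
    public class QuoteCalculator
    {
        // The business rule lives here, unit-testable with no web context.
        public decimal MonthlyPayment(decimal price, int months)
        {
            return months > 0 ? price / months : 0m;
        }
    }
}

// In the MVC project, the controller just delegates:
namespace MyApp.Web.Controllers
{
    using System.Web.Mvc;
    using MyApp.Business;

    public class QuoteController : Controller
    {
        private readonly QuoteCalculator _calculator = new QuoteCalculator();

        public ActionResult Payment(decimal price, int months)
        {
            // No model classes in the web project; the business
            // assembly owns all the domain logic.
            return Content(_calculator.MonthlyPayment(price, months).ToString());
        }
    }
}
```

The web project references the business assembly, so unit tests can target QuoteCalculator directly without spinning up MVC at all.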
 
Django
 
I spent some time looking at django and Python for Visual Studio.  If you'd like to experiment with this setup, go here: Python tools for Visual Studio.  There are a few tricks to overcome to get django to work properly, but they're contained in the samples.  One turn-off is the amount of stuff that needs to be installed and configured before it will work at all.  However, I'm willing to overlook a difficult configuration if it is only needed to get the project started and running.  The appeal of the demo project that I looked at was the minimal number of files necessary to make it work.  Django was designed to just get things rolling and it certainly does that (once the initial configuration and setup hurdles are crossed).
 
My purpose in testing the Visual Studio version was to find out whether it would work as a project within a solution.  In other words, I was hoping to create my business logic in another project (i.e. an assembly) and just reference it in my Python code.  This doesn't seem to work with the 32-bit version of Python.  The 64-bit version of IronPython doesn't seem to work with django (or perhaps I haven't stumbled on the proper way to install django yet, since IronPython doesn't use pip for installations). 
 
Another avenue I explored was connecting to SQL Server from Python.  There are database connectors for SQL Server, and there are a lot of people having issues with them.  The SQL Server side has its own drivers and requires other software to be installed and configured.  I have a lot more research to do on this subject.
 
The django avenue will create other issues in our environment if we decide to go that route, since there is no ability to import and use objects from C# projects.  One issue will be the ORM that we use: it will have to be an ORM that is native to django.  We would also need to verify how compliant any SQL Server driver is, so we know up front whether we are digging a big hole for ourselves.  We are not in a position to convert our database from SQL Server to any other technology at this time.  Our database is currently tied to other applications and it contains a lot of stored procedures.
 
Conclusion
 
At this point you're probably thinking that I'm biased toward MVC because most of my recent development work has been in C# and SQL Server.  However, I have switched technologies in the past.  I have built large web applications using PHP and Oracle on top of the Linux OS, so I'm not tied to Microsoft.  I have also ported a PHP/Oracle system to C#/SQL Server, so I know how to switch between two different technologies.  My gut feeling at this point is that I need more knowledge of both MVC and django to make an informed decision on which way to go.  This decision will probably come sooner rather than later, since I really want to get the show on the road and this project has been on the back burner for too long already.

Saturday, November 30, 2013

Entity Framework 6 vs. LINQ-to-SQL smackdown!

Today, I'm doing my own performance testing on Entity Framework 6 and LINQ to SQL.  I created two tables in a sample SQL Server database.  Here is the ERD for the two tables I created:


Next, I created two console application projects.  One to test LINQ to SQL and the other with EF-6.  You can download the LINQ to SQL test project here: linq2sqltest.zip and the EF6 test project here: ef6test.zip.

I started with the LINQ to SQL test using a basic insert loop like this:

for (int j = 0; j < 10; j++)
{
    for (int i = 0; i < 1000; i++)
    {
        person personRecord = new person()
        {
            first = firstnames[i],
            last = lastnames[i],
            department = 1
        };

        db.persons.InsertOnSubmit(personRecord);
    }
}

db.SubmitChanges();


I used the same code for the normal EF6 test:

for (int j = 0; j < 10; j++)
{
    for (int i = 0; i < 1000; i++)
    {
        person personRecord = new person()
        {
            first = firstnames[i],
            last = lastnames[i],
            department = 1
        };

        db.persons.Add(personRecord);
    }
}

db.SaveChanges();

I am using EF 6 version 6.0.1 as of this writing.  Microsoft has indicated that they are working on a bug fix version of EF6 that will be version 6.0.2.  There is no expected date when that version will become available, so keep your eyes open for this one.

In order to make the test measurable, I downloaded two text files full of first and last names, each with at least 1,000 rows.  Then I ran the loop through 10 times to insert 10,000 rows of data into the person table.  You can download the text files from the census like I did (if you want to include more names) by going to this stack overflow article and clicking on the suggested links: Raw list of person names.
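Loading the name files into the arrays used by the test loops takes only a couple of lines (the file names here are placeholders; use whatever the downloaded census files are called on disk):

```csharp
using System;
using System.IO;
using System.Linq;

class LoadNames
{
    static void Main()
    {
        // One name per line; Take(1000) guards against files that are
        // longer than the 1,000 iterations the inner loop expects.
        string[] firstnames = File.ReadAllLines("firstnames.txt").Take(1000).ToArray();
        string[] lastnames = File.ReadAllLines("lastnames.txt").Take(1000).ToArray();

        Console.WriteLine("{0} first names, {1} last names loaded",
            firstnames.Length, lastnames.Length);
    }
}
```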

I also attempted to move the db.SaveChanges(); inside the loop to see what effect that would have on the timing, and received the expected result: slower than dirt!  So I did some research to find a method to speed up the inserts and came across this stack overflow hint on bulk inserts: EF codefirst bulk insert.  By changing the configuration of the context before the inserts were performed, I was able to increase the insert speed significantly:

db.Configuration.AutoDetectChangesEnabled = false;
db.Configuration.ValidateOnSaveEnabled = false;
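Putting it all together, the bulk-insert-friendly version of the EF6 test would look something like this (a sketch assuming the same generated context and name arrays as above):

```csharp
// Turn off per-entity change tracking and validation before the
// inserts; both are unnecessary overhead for a pure bulk insert.
db.Configuration.AutoDetectChangesEnabled = false;
db.Configuration.ValidateOnSaveEnabled = false;

for (int j = 0; j < 10; j++)
{
    for (int i = 0; i < 1000; i++)
    {
        db.persons.Add(new person
        {
            first = firstnames[i],
            last = lastnames[i],
            department = 1
        });
    }
}

// One round trip at the end, not one per record.
db.SaveChanges();
```

The stack overflow answer referenced above also suggests disposing and recreating the context every few thousand rows; I haven't measured that variation here.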

Here are the final results of all four tests:


The first result is the baseline LINQ-to-SQL test.  I did not attempt to optimize this query (and I'm sure there is a way to make it go faster as well).  My interest was to see if I could make EF6 perform as fast as or faster than straight LINQ-to-SQL.  T1 is the slow EF6 test where I put the SaveChanges() inside the loop (epic fail).  T2 is the second test, which is shown in the code above.  The third test (T3) is the code where I added the two configuration-changing lines before running the insert loops.  All timings are in seconds.  As you can see from the results, un-optimized LINQ-to-SQL ran for 18.985 seconds to insert 10,000 records.  EF6 in test 3 ran for 2.371 seconds.  Just using the same code in EF6 produced a poor result at just over 46 seconds.  Be aware that EF6 is going to take some extra work to make it perform.

UPDATE: I have since written a test for NHibernate inserts and re-ran the above tests.  The LINQ-to-SQL test was able to perform the same inserts in a measured time of 3.49 seconds.  Because of this discrepancy, I re-ran all tests several times and it seems that all the measurements are close except for the LINQ-to-SQL one.  This is not a scientifically accurate test and I would recommend downloading my examples and running your own tests on your own hardware.


Friday, November 29, 2013

Entity Framework 6 Mocking and Unit Testing

Last week I attempted to mimic this article from Microsoft: Testing and Mocking Framework.  I ran into a problem involving some code that was in the article last week, but has since been removed (I'm assuming that Microsoft changed something in their EF6 framework or .Net and didn't update the article before I tested it).  Anyway, I have now duplicated the unit tests that Microsoft demonstrates with my own table.

Two "gotchas" were discovered while attempting to make this work.  First, I had to pass the context into my object (instead of just using a "using" statement with the context I needed).  The reason for this is so the unit test can pass in a mock context.  The second "gotcha" was that the DbSet property inside the generated EF code needs to be virtual.  This was more difficult due to the fact that the code itself was auto-generated (because I used the EF designer to create it).  Fortunately, there is a note in the article that states that "virtual" can be added to the T4 template used by EF (inside the <model_name>.Context.tt file).  I added the virtual keyword and my unit test worked like a charm.
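After the T4 change, the regenerated context contains a virtual DbSet property along these lines (a sketch; the actual generated file carries more boilerplate than this):

```csharp
using System.Data.Entity;

public partial class DatabaseContext : DbContext
{
    public DatabaseContext() : base("name=DatabaseContext")
    {
    }

    // "virtual" is what lets Moq override this property and hand
    // back an in-memory set instead of hitting SQL Server.
    public virtual DbSet<account> accounts { get; set; }
}
```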

The Details

So I wrote a console application and created one table.  The table was generated in MS SQL Server.  I named the table "account" and put a couple of fields in it (the primary key field was set up as an "identity" field so it will generate a unique key upon insert).  Here's the ERD:


Don't laugh, I like to start off with really simple stuff.  Things seem to get complicated on their own.

Make sure you use NuGet to download version 6.0.1 (or later) of Entity Framework.  I just opened the NuGet window (see "Tools -> Library Package Manager -> Manage NuGet Packages for Solution..."), then typed "Entity Framework" in the "Search Online" search box.

My console application main program looks like this:

using System;

namespace DatabaseTestConsole
{
    class Program
    {
        static void Main(string[] args)
        {
            UserRights rights = new UserRights(new DatabaseContext());

            string test = rights.LookupPassword("test");
            Console.WriteLine(test);
        }
    }
}


My entity container is named "DatabaseContext".  I created it using the project right-click, then "Add" -> "New Item", then selecting "ADO .Net Entity Data Model".  I added a connection and dragged my table onto the EF model diagram.

Then I created a new class called "UserRights" (right-click on project, "add" -> "class").  This is the content of the UserRights.cs file:

using System.Linq;

namespace DatabaseTestConsole
{
    public class UserRights
    {
        private DatabaseContext _context;

        public UserRights(DatabaseContext context)
        {
            _context = context;
        }

        public string LookupPassword(string userName)
        {
            var query = (from a in _context.accounts
                         where a.username == userName
                         select a).FirstOrDefault();

            // FirstOrDefault() returns null when no matching account exists.
            return query == null ? null : query.pass;
        }
    }
}


I manually added some data to my table and tested my program, just to make sure it worked.  Then I added a unit test source file (I named it "UnitTests.cs"), using the same "add -> class" method that I used to create the UserRights.cs file above.  Then I added references and usings for unit testing and Moq.  Here's the entire source code for the test:

using System.Collections.Generic;
using System.Data.Entity;
using System.Linq;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Moq;


namespace DatabaseTestConsole
{
    [TestClass]
    public class UnitTests
    {
        [TestMethod]
        public void TestQuery()
        {
            var data = new List<account>
            {
                new account { username = "test", pass = "testpass1" },
                new account { username = "ZZZ", pass = "testpass2" },
                new account { username = "AAA", pass = "testpass3" },
            }.AsQueryable();


            var mockSet = new Mock<DbSet<account>>();
            mockSet.As<IQueryable<account>>().Setup(m => m.Provider)
                   .Returns(data.Provider);
            mockSet.As<IQueryable<account>>().Setup(m => m.Expression)
                   .Returns(data.Expression);
            mockSet.As<IQueryable<account>>().Setup(m => m.ElementType)
                   .Returns(data.ElementType);
            mockSet.As<IQueryable<account>>().Setup(m => m.GetEnumerator())
                   .Returns(data.GetEnumerator());

            var mockContext = new Mock<DatabaseContext>();
            mockContext.Setup(c => c.accounts).Returns(mockSet.Object);


            UserRights rights = new UserRights(mockContext.Object);

            Assert.AreEqual("testpass1", rights.LookupPassword("test"),
                  "password for account test is incorrect");
            Assert.AreEqual("testpass2", rights.LookupPassword("ZZZ"),
                  "password for account ZZZ is incorrect");
            Assert.AreEqual("testpass3", rights.LookupPassword("AAA"),
                  "password for account AAA is incorrect");
        }
    }
}


As you can see from the unit test (called "TestQuery") above, three rows of data are inserted into the mocked-up account table.  Then the mock context is set up and the UserRights object is exercised to see if the correct result is read from the mock data.  If you want to test this for yourself, go ahead and copy the code segments from this article into your own project.  Unit testing methods that perform a lot of database operations will be easy using this technique, and I also plan to use this for end-to-end integration testing.


Update:

I have posted the code on GitHub, you can click https://github.com/fdecaire/MockingEF6 to download the code and try it yourself.


Sunday, November 24, 2013

Getting the Show on the Road

Introduction

So I've been evaluating different ORMs and I've decided to stick with Entity Framework.  I've done some testing with NHibernate and discovered that it is very difficult to get running.  I still need to spend more time researching NHibernate, but for now I think I'll just run with Entity Framework.  One of the positives of EF over NHibernate is the visual tool that makes it easy to set up the data objects.  Our company is planning to employ interns starting in the Spring of 2014 and I'm thinking forward along the lines of using the KISS principle wherever I can.  Later, when things are moving again, I can take another hard look at NHibernate and determine if we want to switch to that ORM instead.

So Friday I began the tedious task of converting one of the LINQ-to-SQL subsystems to EF.  Most of it went smoothly.  We don't currently have a lot of data access using an ORM, so now is the time to determine which tool we're going to stick with.  In order to get around the namespace weakness of EF, we are going to put our database access inside its own project or subdirectory.  Our code will need to share tables that will be defined in one place.  I think the ability to refactor will assist us in weeding out deprecated tables and fields in the future.

One of my co-workers made me aware of an article called "Performance Considerations for Entity Framework 5".  This is quite lengthy and very detailed.  I would recommend putting it on your "favorites" list and keep it handy when you're ready to use EF.  This article talks about EF version 5, but I'm sure they'll update it for version 6 soon.  Here's an interesting side article for unit testing and mocking EF6: Testing with a mocking framework (EF6 onwards).

So What's the Point of Using an ORM?

Speed is not necessarily the only reason for using a different data access model.  In this case, the point is to catch SQL query mistakes at compile-time.  In the good-old-days when queries were sent back to the database as a string, any errors in SQL were only detected when the query executed (i.e. run-time).   

ORMs create objects that represent the tables, fields and other components of the database so that the developer can write a query directly in code (like C# or VB).  The query written in code gets automatic syntax highlighting, and errors are detected at compile time.  Of course, it's still up to the software developer to write a correct query.  At least this is one more step in reducing the amount of troubleshooting time a developer must take to get the software right.
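As a tiny illustration of the compile-time safety argument (using LINQ-to-Objects here so it stands alone; with an ORM the same query syntax runs against real tables):

```csharp
using System;
using System.Linq;

class CompileTimeCheckDemo
{
    class Person
    {
        public string First;
        public string Last;
    }

    static void Main()
    {
        var people = new[]
        {
            new Person { First = "Frank", Last = "DeCaire" },
            new Person { First = "Jane",  Last = "Doe" }
        };

        // If "Last" were misspelled here, the compiler would flag it
        // immediately; a string-based SQL query would only fail at
        // run-time against the database.
        var query = from p in people
                    where p.Last == "Doe"
                    select p.First;

        Console.WriteLine(query.First()); // prints "Jane"
    }
}
```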

Unit Testing

Unit testing is the latest new-hotness.  Technically, unit testing has been around for some time (so it's not new, but it's still hot).  In EF6 there is support for in-memory database mocking that can be used to test parts of your code.  In the past I have used test databases that I generated in a local copy of MS SQL server.  Using a test database means that I had to generate the tables and then tear them down when I was done.  By creating data in memory, the whole process takes less time and resources.

The reason why unit testing is so important in this instance is that many web applications are mostly database queries.  I know that the software at my current company consists mostly of code to access and manipulate data.  So unit tests that don't exercise the queries that access the database are very limited.

I'm currently looking at the unit testing and mocking of EF described here: Testing with a mocking framework (EF6 onwards).  Specifically the "Testing query scenarios" section.  If you're using Moq, make sure you download the NuGet package for Moq (it's not included in the base install of Visual Studio 2012).  I'm hoping to be back with an example soon.  Stay tuned...

Saturday, November 2, 2013

LINQ and Entity Framework ... sigh

Our company has a ton of legacy code using VB and simple text-based queries that are submitted to the database.  There are several disadvantages to this method versus newer techniques like LINQ and EF.  First, syntax errors in the query itself can only be caught at run-time.  Second, syntax highlighting, automatic formatting and IntelliSense are not supported.

One of the improvements started earlier this year was the use of LINQ in place of string queries.  Immediately a flaw was discovered in the interface for setting up tables: there is no capability to synchronize changes made to tables in the corresponding database.  The developer is forced to delete the table and re-add it in order to reflect any field changes.  This is OK if there are no stored procedures that return a result set for that table.  Otherwise you have to go to each stored procedure and switch it to something else, replace the table, then switch the stored procedures back.  There are third-party add-ons that can sync the tables, but they seem to be of the bloat-ware variety and are not very effective (some of them don't work right).

A second "feature" of LINQ is that it creates a settings.settings file in the project to add the ability to change database settings from an interface.  This seems like a great feature until one realizes that it overwrites the app.config file when changes are saved.  The side effect is that comments in app.config are removed.  Our company uses comments to switch between databases located on our local PCs, the staging server and the production server.  PowerShell scripts comment and uncomment the sections for each of the connection strings to enable the correct database connection.  This goes out the window when the settings.settings control panel wipes out any commented connection strings (usually the staging server and production server settings). 

So we began looking at Entity Framework.  This was a dream.  It has a synchronize capability, making it much easier to use.  I jumped in with both feet and started replacing LINQ setups with EF (they are compatible with each other; just keep the table names the same).  My plan was to create a data context of table definitions in each object directory.  This would create an organization where the tables used by the current object were in the same directory, and any abandoned tables would not be used by some far-off code unrelated to the object that might be refactored in the future.  Then I ran into a serious problem: EF doesn't recognize namespaces.  Yikes!  So I changed plans quickly.  I placed the data context in a central directory under the project and decided we would just use one data context for all objects.  I further discovered that only one database can be used per data context.  While that hasn't been an issue yet, it will be if we decide to use this for more of our larger objects (those with queries spanning two or more databases).  I theorized that we might be able to use nested "using" statements, but I haven't tried it yet.  This issue, coupled with the namespace issue, has made me put on the brakes until I can do further analysis.  I have also heard rumors that EF is not as efficient at queries as LINQ. 
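The nested "using" idea I haven't tried yet would look something like this (purely a sketch; the context and table names are hypothetical, and the join happens in memory since each context only talks to its own database):

```csharp
// Two contexts, each pointing at a different database.
using (var salesDb = new SalesContext())
using (var hrDb = new HumanResourcesContext())
{
    // Pull from each database separately...
    var orders = salesDb.orders.Where(o => o.total > 1000).ToList();
    var reps = hrDb.employees.ToList();

    // ...then combine the results with LINQ-to-Objects.
    var report = from o in orders
                 join r in reps on o.repId equals r.id
                 select new { r.name, o.total };
}
```

The ToList() calls matter: a single query can't span two contexts, so each side has to be materialized before joining.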

Other issues I discovered: the objects created for the tables dragged into the context may have an "s" appended to them.  This causes all kinds of confusion about the table names.  It's also a headache to have to trace back to the data context to get the real table name to find it in the database (if you are troubleshooting code you are unfamiliar with).  I would prefer an ORM* that creates object names matching the tables exactly, follows the standard namespace convention, and has a visual interface that doesn't stomp on my app settings and is flexible enough to synchronize changes with the database (because we make database changes).  Am I asking for too much?

So now what's my plan?  At this point, I'm going to put some time and effort into NHibernate (I have posted on that subject before).  Stay tuned for my results on this ORM.



*LINQ and EF are also known as ORMs (Object Relational Mappers).

Saturday, October 26, 2013

Software Testing

This past Wednesday (10/23/2013) I gave a tech talk at the University of Maryland.  The students there were awesome.  It's been a while since I had the chance to talk to a lot of young, energetic future computer scientists.  I still remember the days when I dreamed of putting my knowledge into practice.  Payam (my co-worker/boss) set up the whole shindig and we decided to ad-lib the actual talk.  I have so much material to draw from that I could have talked for hours on end.  I hope that the students learned a few hints to help them avoid some of the errors that I've made in the past.  Someday I'm going to encapsulate this information into one-hour lectures.  Maybe I'll get motivated and write a book!

I keep reading news about the new Affordable Care Act website and I'm horrified by the level of amateurism that went into developing that website.  Today I read that the integration testing was only done for two weeks, just before release.  Yikes!  Integration testing should be set up before the modules are even started.  In fact, a testing environment can be designed and put into place as the requirements are nailed down.  The first integration test run should fail and list every requirement that did not pass.  As modules are completed, the tests are re-run and items should start to pass.  That gives the software developers time to fix any issues that crop up in the tests.  Of course, I'm assuming that the integration tests done on the ACA website were automated and not performed by hand (double-yikes!).

DealerOn's system has been built around coding techniques that involved hand-testing only.  We have recently hired a bona-fide quality person and she has coordinated the infrastructure changes necessary for automated testing.  I'm talking about automation beyond just unit testing.  We are currently using Telerik to perform some of these tasks.  She also uses scripts to manually test things that are not yet automated.  DealerOn is also in the process of completely revamping our staging system.  We are going to use tools to record HTML transactions coming into our live site and replay them against our staging system to test for concurrency and performance bugs.  As a computer nerd, I'm really excited about this stuff!

Testing Your Software

Testing is hard.  It's more of an art than a science.  That doesn't stop me from treating it like a science.  I just understand that no matter how hard I try, I'm probably not going to be 100% successful.  I've found that experience is the great teacher in testing.  No matter how many books I've consumed, or how many classes I've taken, my actual disasters are what stick in my mind the most.  Probably because they were rather painful.  Story time: I remember a job I worked at involving a handful of amateur programmers (some were newly minted computer scientists) working on legacy software.  When one of the programmers completed a program that would be distributed to contractors (by mailing packs of floppies), he informed his boss that he needed someone else to do some beta testing.  His boss was more than kind in doing the testing himself.  The first thing he did was install the program on his PC and run it.  He started entering letters in the text field used to enter bid prices.  CRASH!  The programmer in question said "You're not supposed to do that!"  Of course.  That's just something that wasn't thought of when he coded the program.  The obvious and stupid stuff.  Users are not going to be computer scientists. 

So what's the lesson here?  For every user input you should always test the boundaries of what can be entered.  If you are expecting numeric input, make sure your software only accepts numeric input.  You might need to inform the user of an error (preferably in a nice way, like a red background in the text box, or a small message above or next to the text box).  You should also try not to make the user angry.  If your input screen detects an error on submit, then you should provide gentle feedback and make absolutely sure you keep all their data entry points intact.  I can't tell you how many times I've filled out a form on-line and submitted it, only to get an error because my credit card number was not supposed to have dashes, and the whole data entry screen is blank.  This requires me to re-fill in all the information that was correct, and I could potentially typo something else, causing a different error.  Angry customers are not something you need.
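The bid-price crash above comes down to trusting raw input.  A defensive parse in C# might look like this (a minimal sketch of my own, not the program from the story):

```csharp
using System;

class BidInputDemo
{
    // Returns true (and the parsed price) only when the input is a
    // valid non-negative number; never throws on bad input.
    static bool TryReadBidPrice(string input, out decimal price)
    {
        return decimal.TryParse(input, out price) && price >= 0m;
    }

    static void Main()
    {
        decimal price;

        Console.WriteLine(TryReadBidPrice("1250.50", out price)); // True
        Console.WriteLine(TryReadBidPrice("abc", out price));     // False, and no crash
        Console.WriteLine(TryReadBidPrice("-5", out price));      // False
    }
}
```

TryParse is the key: it reports failure through its return value instead of throwing, so letters in a price field become a polite validation message instead of a crash.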

Automated Testing

It's nice that today's technology includes automated testing.  There are free tools for unit/integration testing such as NUnit.  Get familiar with these tools.  I normally use MSTest, which is built into the professional version of Visual Studio.  It's nearly identical to NUnit (and personally, I think NUnit has a leg up on some features).  My point is that every student should understand how to use unit tests, even if it's not taught at the university.  Unit testing is going to become much more important in the future of software development.  Here's why: one of the biggest advantages of unit tests is that they can be re-run at any time.  This means that you can do regression testing without consuming a lot of time.  Regression testing is where you test unrelated features of your software.  Because, unlike any other engineering practice devised by man, in software everything is related to everything.  Sometimes unintentionally.  In other words, when an engineer designs a jet, fixing a problem with the landing gear is not going to affect the performance of the jet engine.  In software, you can't guarantee or assume anything.

One other thing to remember with unit/integration testing: when you find and fix a bug, that means you found a bug that got past your unit tests.  That also means that you're missing a unit test.  So create a test that will find that bug in the future.  Don't assume that the bug you just fixed won't magically re-appear.  Yes, I have fixed the same bug over and over on systems where multiple developers are involved.  This occurs because there are two or more conflicting requirements.  Fixing the software one way causes a bug in the other requirement.  Re-fixing for the alternate requirement causes the original bug to recur.  If you create a unit test, any other developer will see that unit test (make sure it's properly documented as to what behavior is being tested) and realize that they should not be breaking that requirement.  If a conflicting requirement is discovered, then a decision can be made to fix the problem right.
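A regression test that pins down a fixed bug can be as small as this MSTest sketch (the class under test and the bug itself are hypothetical, just to show the shape):

```csharp
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Hypothetical class that once crashed on empty input.
public static class PriceFormatter
{
    public static string Format(string raw)
    {
        // The original (hypothetical) bug: empty input used to throw.
        // This guard is the fix; the test below locks it in.
        if (string.IsNullOrEmpty(raw))
        {
            return "0.00";
        }
        return decimal.Parse(raw).ToString("0.00");
    }
}

[TestClass]
public class PriceFormatterRegressionTests
{
    // The test name documents WHICH behavior is protected, so a
    // developer "fixing" a conflicting requirement sees this go red.
    [TestMethod]
    public void Format_EmptyInput_ReturnsZero_NotException()
    {
        Assert.AreEqual("0.00", PriceFormatter.Format(""));
    }
}
```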

Test Early

Tests should be written as early as possible.  It's easy to fall into the trap of just programming.  I've done it.  I still do it.  Try to plan your unit tests before you start to code.  If you can identify functionality early on, write unit tests to verify that functionality.  If you have a document listing each feature that must be supported, you should be able to create a set of unit or integration tests to satisfy each feature.  Initially, these tests will all fail.  As you complete each feature, you can watch your unit tests turn green and see how much you have completed and how much is left to complete.  Add tests as you identify problems.  When all tests are green, you should be ready for final testing.  It's also easier to report your progress when you can show the percentage of tests left to complete.

Unit tests are also a way to test tricky areas of your software.  Sometimes real data is too big to test effectively, especially when starting a project.  A unit test can be set up with a small portion of test data.  This can be used for the initial development of the software until it comes time to test on a bigger data set.  It can also save you development time if your program is written to read data from a remote system (think WCF or SOAP).  If I expect XML data (or JSON) from a remote site, I normally grab a one-record example from the remote site and put it into an XML file in the unit test directory.  Then I feed that data into the first object that will receive the data for processing.  I do this because there is no delay in feeding data directly, whereas an actual SOAP or WCF connection needs to wait for a response from a remote server.  Once the processing object is completed, I can do a full system test to make sure it works correctly with real data.  If I find that a different set of data breaks my code, I create a new unit test (notice how I always keep my original test) and feed that data into my object and fix the problem(s).  Eventually, I'll have a group of unit tests that cover many possible input configurations.
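Here's a minimal sketch of that idea, feeding a saved one-record sample straight into the processing code instead of waiting on a remote call.  The record shape and function names are hypothetical; in practice the sample would live in a file in the unit test directory:

```javascript
// A saved one-record sample (inline here for brevity; normally read from
// a file captured from the real remote service). Field names are made up.
const sampleRecord = JSON.stringify({ id: 42, name: 'Test Dealer', active: 'Y' });

// The first object/function that would receive the remote data.
function processRecord(rawJson) {
  const record = JSON.parse(rawJson);
  // Normalize the vendor's 'Y'/'N' flag into a boolean.
  return { id: record.id, name: record.name, active: record.active === 'Y' };
}

const result = processRecord(sampleRecord);
console.log(result.active); // true
```

When a new input shape breaks the code, capture it the same way and add it as another test alongside the original.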

Education

One of the things I've learned from my experience in the world of software engineering is that you must keep on top of stuff.  Education is very important.  I consume a lot of books.  I have also started viewing on-line classes.  Udacity is one great source of educational materials.  It's free and they have subjects such as software testing (see course CS258).  If you want to learn testing from the very basics on up, this is an excellent course to view and interact with. 

Saturday, June 15, 2013

The Game (Part 4)

I have implemented the mask and view capabilities.  I'm calling this version 2.  Here's a sample screenshot of how it looks in action:



The first thing you'll notice about this map is that unexplored areas are colored in black.  These are uncovered as allied units move around the map.  The darker areas have been explored but are not currently visible; they could have enemy units hiding in them.  Every time an allied unit is moved, the view map is recomputed in JavaScript (and a duplicate array is maintained in C# in case the player refreshes their browser).  The unexplored areas are initialized at the beginning of the game and are cleared as units are moved.  Both the JavaScript and C# code have functions/methods to deal with this mask.  The difference is that the JavaScript doesn't maintain an array of explored cells; it simply hides the mask whenever a unit passes by, and the mask is never set back during the game.
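The bookkeeping for the two masks can be sketched like this (a simplified illustration with hypothetical names, not the game's actual code; `neighborsOf` stands in for whatever visual-range function the game uses):

```javascript
// 'explored' is permanent once cleared; 'visible' is recomputed from
// scratch on every move. Cells are keyed as "x,y" strings.
// neighborsOf(x, y) is assumed to return the cells within visual range.
function recomputeMasks(alliedUnits, explored, visible, neighborsOf) {
  // Reset visibility; explored cells stay explored forever.
  for (const key of Object.keys(visible)) visible[key] = false;
  for (const unit of alliedUnits) {
    for (const [x, y] of neighborsOf(unit.x, unit.y).concat([[unit.x, unit.y]])) {
      visible[`${x},${y}`] = true;
      explored[`${x},${y}`] = true; // never set back during the game
    }
  }
}
```

Rendering then only has to consult the two maps: black mask where unexplored, dimmed mask (and no enemy units) where explored but not visible.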

The two masks are identical and use the same image painted over the top of the entire board.  I set the opacity to 0.3 for the view hex images so the player can see the terrain underneath.  I also manually set the display value of each enemy unit to match the value of the view mask.

Here's the code: Battlefield One Code V2

You can go directly to my website to play the game as is: (temporarily down).

I've played the game a few times and it is a bit more challenging than the first version without the masks.  I think my next enhancement will be adding a variety of different unit types.  I might have to enlarge the playing board to allow more playing pieces.

Stay tuned...

Note: I have recently upgraded my website to use MVC4 and .NET 4.5. I have refactored my code to make the game work on MVC4.  Click here to try it out.


The Game (Part 3)

So I'm working on the visibility and mask of the game I wrote in the last two blog posts.  Here's a problem that I ran into right away and it involves not paying attention to detail.  What I'm referring to here is the hex board layout.  If you look at the documentation:

Battle Field One Specifications

You'll see a page titled "Unit Movement", and at the bottom is the definition for finding the surrounding 6 cells.  What I didn't verify is that my game has the cells shifted the exact same way.  You see, there are two different ways a hex board can be drawn (actually there are rotated arrangements too, but I'm going to ignore those).  If the first column starts lower than the second column, then it follows my specification.  However, if you look at the actual game, you'll see that I drew the map starting with the first column higher than the second column.
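The surrounding-cells calculation depends entirely on which way the columns are shifted.  Here's a hedged sketch of the even/odd column test for the convention where odd-numbered columns are drawn half a cell lower than even ones (an assumption for illustration; as this post shows, picking the wrong branch silently breaks adjacency):

```javascript
// Returns the 6 neighbors of hex (x, y) on an offset grid where odd
// columns are shifted DOWN half a cell. If your board shifts odd columns
// UP instead, the parity test must be inverted or the map is "wrong".
function neighbors(x, y) {
  const shifted = x % 2 === 1;  // odd column: shifted down half a cell
  const dy = shifted ? 0 : -1;  // diagonal rows differ by column parity
  return [
    [x, y - 1], [x, y + 1],               // straight up and down
    [x - 1, y + dy], [x - 1, y + dy + 1], // left diagonals
    [x + 1, y + dy], [x + 1, y + dy + 1], // right diagonals
  ];
}
```

A quick unit test against a hand-drawn board is the cheapest way to catch exactly the mismatch described here.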

Old Documentation:
 
 
Game:



So that means that the game was designed wrong to begin with.  Now the choice is to change the documentation and fix the calculations inside the game or change the game board and leave everything else clean.

I'm choosing to change the specification for the game, and I'm going to have to fix the even/odd test inside version 1 to make sure I am searching the right surrounding cells.  If you're just now downloading the specification, you'll notice that it's already been corrected.

These things happen and it's fortunate that I was able to catch this before getting any deeper into the game code. 

Thursday, June 13, 2013

Follow-Up on Designing a Game

So I designed and created a small game to demonstrate several techniques.  First, I demonstrated how to contain the scope of a project to ensure it gets out the door.  While this was only a tiny project, my goal was to create a working game over the span of a weekend or possibly a week (it took me about 24 man-hours of labor in total over a span of 5 or 6 days).  Technically, I did not track my hours and I did not estimate the development time of this project.  For any project taking more than a week, estimates should also be included.  Also, if you compare the design specifications with the actual game, you'll notice a few missing features.  Notably, there are no armor units, it's legal to move past enemy units, and the "next phase" indicator is not very visually appealing.  Chalk it up to time constraints.  I wanted to get this out the door so I could blog about it.  If this were a real game, I'd have to spend much more time polishing it to make it more visually appealing.

About the Game

One important thing I forgot to do in my last blog post was explain how to play the game.  It's somewhat trivial to figure out, but for those who have never played any Avalon Hill-style military board games, it goes like this:

1. The game phase controls what can be done on the board, starting with the allied move phase.
2. When in the move phase, you can move your units.  Each unit that has been moved has its movement allocation decremented to zero, so you can't move that unit again during the current turn.
3. Once you are satisfied with your unit moves or you have no more units to move, you can click the "next" button to advance to the allied attack phase.
4. During the allied attack phase you can attack any enemy unit that is next to one of your allied units.  If you are not next to any enemy units, then you must skip the attack phase.  Each unit can only attack once for this turn.
5. Once you are satisfied with this phase, you can click the "next" button to advance to the next phase, which is the axis movement phase.
6. Now it's the computer's turn to move its units.  The computer will move units around the board and automatically advance the phase to the axis attack phase.
7. When the computer is in the axis attack phase it will attack any allied units that are next to one of its own units.  Then it will advance the phase to the allied movement phase.
8. Continue to step 2 until an end condition is met.
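The steps above boil down to a four-phase cycle.  A minimal sketch of the phase advance (phase names are hypothetical, not the game's actual identifiers):

```javascript
// The turn cycle: allied move -> allied attack -> axis move -> axis attack,
// then back to the allied move phase until an end condition is met.
const PHASES = ['alliedMove', 'alliedAttack', 'axisMove', 'axisAttack'];

function nextPhase(phase) {
  const i = PHASES.indexOf(phase);
  return PHASES[(i + 1) % PHASES.length];
}

// Clicking "next" during the allied attack phase hands the turn to the computer:
console.log(nextPhase('alliedAttack')); // axisMove
```

Keeping the phase in a single cycle like this is what lets the computer auto-advance through its own two phases.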

Game End Condition

The game ends when either all 4 cities are captured or all units on the allied or axis side are eliminated.

Weaknesses in the Current Game

If you play this game a few times you'll notice that it's about as difficult to play as tic-tac-toe.  It becomes predictable and it's easy to formulate a strategy to beat the computer almost every time.  One problem is that the dice roll is biased toward the defense.  This is due to the fact that all units have an offense of one and a defense of two.  If you look at the ComputeBattleResult method of the GameClass object, you'll notice that it's just hard-coded at the moment.  I did that because there is only one type of unit on the board.  To expand this game it will be necessary to use the unit index parameters to look up the offense and defense of the two units and use a lookup table (or a formula) to compute the odds.
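Such a lookup could be sketched like this (a hypothetical illustration, not the game's ComputeBattleResult; the odds values are made-up placeholders):

```javascript
// Odds table keyed by "attacker offense : defender defense". Values are
// hypothetical kill probabilities, not the game's actual numbers.
const ODDS = {
  '1:2': 0.33, // the current game: offense 1 vs defense 2, biased to defense
  '2:2': 0.50,
  '3:2': 0.67,
};

function computeBattleResult(offense, defense, roll /* 0..1 */) {
  const kill = ODDS[`${offense}:${defense}`] ?? 0.5; // default for unlisted matchups
  return roll < kill ? 'defenderDestroyed' : 'attackerRepelled';
}
```

Once new unit types exist, only the table grows; the battle logic itself stays put.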

The second problem is that the enemy has only one strategy: divide its units into 4 groups and march each group to one of the four cities, attacking allied units along the way.  If you manage to destroy the 3 units heading to one particular city, the enemy will not dispatch additional units to reinforce the hole in its strategy.

Options for Fixing the Game Play

One option to fix the game play is to introduce a variety of different units.  Many of the methods that assume there is only one type of unit will need to be altered to make sure the defense and offense are accounted for.  New units should maintain the single movement allocation and range of one to keep this under control for now.

A second option is to add a mask to the play field.  This could be done by creating black hexagon images that can be drawn over the board positions not yet explored by the allied units, obscuring what has not been seen.  This will have only minimal impact for players who play the game more than once, because they don't need to re-explore a map they have already memorized.

An alternative or addition to the second option is to create a mask of areas not currently visible by allied units.  This can be accomplished by dimming the hex areas where allied units cannot see and not rendering any enemy units under the dimmed areas.  The background will be visible but enemy units not in visual range will not be seen.  Using this technique in combination with alternate enemy strategies has the potential to make the game more challenging.

Other Possible Enhancements

1. Allow stackable units.  The top unit could contain an icon with the word "stack" in it to indicate that it is a stack of different units.  Offense and defense numbers should be totaled for the top unit.  Complications arise from the fact that units must now be stacked and un-stacked, and moving a stack needs to be accounted for.  Attacking an enemy unit should account for the entire stack.  I would propose doing the stack bookkeeping in the C# code and replacing all the units in the stack with a special unit in the JavaScript.  There is also the issue of attack: it should be changed to provide a partial kill instead of all or nothing.  Offense and defense numbers would need to be decremented as the attack occurs, and units in the stack need to be destroyed if the defense level is too low to support them.

2. Combine attack and movement phases together.  There really is no reason to have a separate attack phase.  If the allied unit is moved into an enemy unit, then it would be considered an attack.  There is one possible issue:  If the range factor is used and a unit has an attack range of 2 or more, then there needs to be a control to determine if that unit is attacking an enemy.

3. Enhance enemy strategy.  The enemy AI can be expanded to provide 2 or more different strategies.  Even an unsophisticated algorithm could use a random number at the beginning of the game to choose a strategy and then use it throughout the game.  Each strategy can be designed and tested one by one, then randomized among the finished strategies for the final version.  Also, each strategy can have some intelligence built in.  For instance, in the current strategy of marching to each city, if a group of units is destroyed, one unit from each other group can replace them.  If the enemy has too few units to hold all cities, the AI can switch strategies to attempt to hold one city, or just battle the allied units in the hope of destroying all of them.

4. Allow longer movement allowances.  This is tricky because now we'll need to implement the ability to block units so they can't just slide past allied units (and vice versa). 

5. Add terrain effects.  One weakness of the map board is that it has only flat grassy fields with a movement factor of 1.  Adding roads with a double movement factor could make the game more interesting, as could a river with bridges: units are blocked by rivers and must use a bridge, so rivers can form natural boundaries and choke points.  This will complicate the destination problem of the AI, because a path-finding algorithm (like the A* algorithm) must then be implemented.  Mountains could have a 1/3 movement factor and forests a 1/2 movement factor, allowing for more choke points.

6. Weather factors.  Most physical board games have a random weather factor that changes for each turn.  Weather factors can modify movement (like snow, all movement cut in half) and offense effectiveness or visibility.

7. If a visibility mask is used, then scouts should be added to the game.  These would be fast units with long visibility, but possibly a zero defense and offense.

8. Random map boards.  Civilization does this very well.  A completely random map can be generated at the beginning of the game.  This would pair well with the enhancement of obscuring the map until explored, since there would be no way to be certain what terrain is under the obscured areas.  This also increases the challenge of the game for repeat players.  Every game is unique.
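The terrain factors mentioned in item 5 translate naturally into a movement-cost table.  A minimal sketch (names and exact costs are illustrative assumptions; a doubled movement factor becomes a halved cost, a 1/3 factor becomes cost 3, and rivers block outright):

```javascript
// Cost to enter a cell of each terrain type, derived from the movement
// factors in item 5. Infinity models "blocked without a bridge".
const MOVE_COST = { road: 0.5, grass: 1, forest: 2, mountain: 3, river: Infinity };

function canEnter(terrain, remainingMovement) {
  return MOVE_COST[terrain] <= remainingMovement; // rivers always fail this test
}
```

A cost table like this is also exactly the input a path-finding algorithm such as A* would consume for the AI's destination problem.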

Conclusion

I think I've indicated how fast the scope of this game can expand to occupy all available free time of any programmer.  I'm sure anybody can think up other features to add to this game that would make it more interesting to play.  The best way to tackle a project like this is to prioritize the desired enhancements and focus on one enhancement at a time.  This will keep the game playable while each enhancement is added rather than attempting to do it all and working endlessly for months on a game that doesn't work (not to mention attempting to debug it all at once).



Monday, June 10, 2013

Designing a Game

It's been a while since I posted something new.  That's because I've been working on a small game.  My intent was to design and build a fully-functional game to demonstrate AJAX, C# and SVG.  The trick to completing something like this is to make sure the scope is small enough that it can be completed in a short time.  I chose a turn-based game to keep the code simple.  This game is a basic Axis-vs-Allies style board game using a hex layout.  I chose the hex layout to make the game a bit more challenging, since a hex board has 6 movement directions instead of 4 primary and 4 diagonals.

I named the game "Battlefield One."  Not overly creative, just a temporary name.  Here's the basic design document:

Battlefield One

Feel free to download and thumb through the details.  There's mention of two types of units in the game, but I trimmed it down to just infantry.  No tanks at this time.  That's easy to add though, since there is an object called "Unit" that I used to contain the defense, offense, movement, etc.  This object can easily be expanded to represent other units.

Follow this link to play the game:

Battlefield One Game

Here's a screenshot of part of the game board:



The first task I performed was rendering the playing board.  There are a lot of decisions that can be made here, but I chose to render the beehive grid with lines on top of a background picture.  The trick was getting the picture to line up, so I wrote some SVG code to paint one hexagon and repeated a grid of these shapes.  As I've mentioned in a previous post, this board is really no more than an offset checkerboard, where every other column is offset by 1/2 the height of the previous cell.

Once I had a grid printed on the screen, I had to make sure my browser was set to 100% zoom (yup, I forgot that step and had to redo the next step).  Next, I took a snapshot of the entire screen and put it into Photoshop.  You can use any paint program you like; I'm comfortable with Photoshop, and this problem just screamed "layers!" to me.  After I put the screenshot on a layer in Photoshop, I filled in the green areas for all the cells except the four cities.  I made up an easy pattern of city building roofs and roads and then copied it to all 4 city locations.  Then I put my background into the SVG code before rendering the beehive cells.
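Generating that offset grid can be sketched as follows (a simplified illustration, not the game's actual code; the geometry assumes flat-topped regular hexes of width `w`, with odd columns shifted down half a cell):

```javascript
// Corner points of one flat-topped hexagon centered at (cx, cy), width w.
function hexPoints(cx, cy, w) {
  const h = w * Math.sqrt(3) / 2; // height of a regular flat-topped hex
  return [
    [cx - w / 2, cy], [cx - w / 4, cy - h / 2], [cx + w / 4, cy - h / 2],
    [cx + w / 2, cy], [cx + w / 4, cy + h / 2], [cx - w / 4, cy + h / 2],
  ].map(p => p.join(',')).join(' ');
}

// The "beehive": columns overlap by a quarter width; odd columns drop half a cell.
function hexGridSvg(cols, rows, w) {
  const h = w * Math.sqrt(3) / 2;
  const polys = [];
  for (let col = 0; col < cols; col++) {
    for (let row = 0; row < rows; row++) {
      const cx = w / 2 + col * (0.75 * w);
      const cy = h / 2 + row * h + (col % 2 ? h / 2 : 0); // offset odd columns
      polys.push(`<polygon points="${hexPoints(cx, cy, w)}" fill="none" stroke="black"/>`);
    }
  }
  return polys.join('\n');
}
```

The resulting markup drops straight into an `<svg>` element on top of the background image.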

That pretty much completed the background picture.  In future versions of this game, I hope to add roads and make the movement factor double when a unit uses a road.  But for now, it's just a green field.

Next, I defined a unit and worked out the rendering portion.  For units, the center icon is just a PNG file that I created; the unit type defines which PNG file is rendered in the center of the unit box.  The four corner numbers represent the defense, offense, movement and fire range.  The top center number is one I added for troubleshooting purposes and left in for now.  It represents the unique unit number, which came out of a snafu I ran into while debugging a deleted-unit problem.

The unit objects are grouped into a list:

private List<Unit> AllUnits = new List<Unit>();

The only issue with using a list is that deleting a unit causes the list to compress, so the indexes of the items after the deleted unit all change.  So I added a variable to the Unit class called "Number", which holds the unique unit number:

public class Unit
{
   public int Number { get; private set; }
   public int Defense { get; set; }
   public int Offense { get; private set; }
   public int Movement { get; private set; }
   public int Range { get; private set; }
   public int UnitType { get; private set; }
   public int X { get; set; }
   public int Y { get; set; }
   public NATIONALITY Nationality { get; private set; }
   public UNITCOMMAND Command { get; set; } // used by enemy units only
   public int DestX; // used by enemy units only
   public int DestY; // used by enemy units only
}


I created a couple of enums for NATIONALITY and UNITCOMMAND.  There's nothing fancy going on here; it's just my way of keeping track of what is set.  It also makes "if" statements clearer about what is being compared.  The enums are as follows:

public enum NATIONALITY { Allied, German };
public enum UNITCOMMAND { None, Wait, Destination, Defend };


At the moment, I'm only using the "None" and "Destination" values of the UNITCOMMAND variable.  I initialize the Unit class to a Command of "None" and then change the command to "Destination" once I have set a unit's DestX and DestY for enemy units. 

The Game AI

I managed to crank out enough C# and JavaScript to allow the allied units to move around the map board.  The computer player was still dead (i.e., not moving or attacking), so I decided to work on the enemy move phase of the game.  I kept the game AI as simple as possible: the enemy counts its total number of units, divides by 4 (the number of cities on this map), then sets the destination of each group to a different city. 
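That group-assignment step can be sketched like this (hypothetical names, not the game's actual code; it assumes unit and city objects with simple x/y fields):

```javascript
// Split the enemy units into one group per city and point each group at
// its city, mirroring the Command/DestX/DestY fields of the Unit class.
function assignDestinations(units, cities) {
  const groupSize = Math.ceil(units.length / cities.length);
  units.forEach((unit, i) => {
    const city = cities[Math.min(Math.floor(i / groupSize), cities.length - 1)];
    unit.command = 'Destination';
    unit.destX = city.x;
    unit.destY = city.y;
  });
}
```

With 12 enemy units and 4 cities, each city gets a group of 3 marching toward it.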

That was all I set up initially.  I didn't bother with attacking or detecting any end-game results.  The movement of units included a method to enforce move restrictions on the desired movement (I called this the SnapTo method).  This method makes sure the unit travels only one cell from its current location, doesn't overlap another unit, and doesn't go off the edge of the map. 

There was a problem that I ran into.  My algorithm computed which units would move and did the SnapTo operation inside the C# code, then sent the results back through an AJAX return to the calling JavaScript, which performed the actual move.  The problem is that the order in which the units are moved determines whether another unit still occupies the square being moved into.  So I decided to just send the data to the JavaScript in no particular order, then loop through a queue of move orders.  If a unit was blocked, I swapped that unit to the back of the queue and continued moving other units.  After a few attempts to move a unit and passing it over, eventually the space it was destined to move into becomes empty (vacated by another unit just moved). 
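The queue trick can be sketched like this (an illustrative simplification with hypothetical names, not the game's actual loop; `occupied` is a set of "x,y" keys):

```javascript
// Try each move in order; if the target cell is occupied, push the move
// to the back of the queue and retry later, since the blocker may vacate
// the cell on a later pass. A pass cap guards against permanent blocks.
function applyMoves(moves, occupied, maxPasses = 10) {
  const queue = [...moves];
  let passes = 0;
  while (queue.length > 0 && passes < maxPasses * moves.length) {
    const move = queue.shift();
    const target = `${move.toX},${move.toY}`;
    if (occupied.has(target)) {
      queue.push(move); // blocked: retry after other units have moved
    } else {
      occupied.delete(`${move.fromX},${move.fromY}`);
      occupied.add(target);
    }
    passes++;
  }
}
```

The pass cap matters: two units ordered into each other's cells would otherwise cycle through the queue forever.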

Next, I did the enemy attack phase.  I decided that if the enemy bumped into an allied unit, it would just attack during the attack phase.  So I wrote a method to search for allied units that are one hex away from a German unit, then send both unit numbers and the battle result back to the JavaScript through the AJAX response. 

Last, I completed the code for the allied attack phase.  The JavaScript that triggers this sends an AJAX command to the C# code, which does the battle calculation and sends back the results.  The returning JavaScript function deletes the unit object from the SVG code. 

Game Goals

There are 2 possible goals that can be achieved to win or lose the game.  The first goal is to capture all 4 cities; the German or Allied side needs to capture all 4 to win the game.  If either side has 3 or fewer units remaining, then that goal is no longer achievable for it.  The second goal is to destroy all the enemy units (or, for the enemy, to destroy all the allied units).  When one side destroys all of the other's units, the game is won.

The game can end in a draw because the enemy AI splits its units into groups, one per city.  Once a group of units reaches its destination city, they just sit there waiting for something to happen.  Unfortunately, it doesn't do the enemy much good to have 3 or 4 units sitting on one city while the allied player sits on another city with one remaining unit.  The enemy will not seek out and destroy the allied unit.  This is a weakness that I'll be fixing next.

Game Code

I'm going to spend some time and refactor the code to clean up all the scraps I left on the floor of the sausage factory.  Once I do that, I'll post the code for anybody to download and examine.  Though I haven't completely tested this on an iPad, I have done some preliminary testing and it works pretty well.  The total amount of code for this game is about 860 lines of JavaScript and 730 lines of C#.  Just a little something I do on a weekend.


Update 6/12/2013:

I performed a few bug fixes while refactoring the code.  Now that I have been able to play the game several times in a row without a hiccup, I feel confident that it's pretty close to bug free.

Update 6/13/2013:

Here's the entire project for download: BattleField One Project.

Tuesday, June 4, 2013

Cross-browser Testing

I subscribed to the "Code Project" daily build email, and I have found these daily emails to contain some real gems.  Today, I was looking at an article titled "HTML4 - we're done", which piqued my interest because I've been focused on HTML5 for the past couple of years. 

Click here for the article

As I read down toward the section called "Multi-browser testing options" I started to click on the links to check out the tools that the author was using.  The last tool called "Modern.IE" took me to this website:

Modern.IE

The top part goes to a tool called "Browser Stack" and it has a list of OS's to test against.  These are emulators that are run in a virtual environment:


It also has a selector for available browsers which are real browsers executed inside the virtual environment:


Signing up gives the user a 3-month free trial; after that it costs $19 a month for one person.  This means the tool is aimed primarily at businesses or those who program for pay and are concerned about compatibility.

Still, I'm happy to see that there are tools out there that can test on multiple platforms.  I typically only test the top 5 browsers, which are installed on my PC and the iPad Safari (because I have an iPad).  I rarely get the opportunity to test against an Android device due to the fact that I don't have one on-hand.  Having the ability to test against a virtual device is better than not testing at all.

Friday, May 31, 2013

Big-O

For those who have a degree in computer science, you'll immediately recognize Big-O notation.  My data structures class required students to figure out the Big-O for best- and worst-case situations for sorting algorithms.  Today, I stumbled across a Big-O cheat-sheet web page that is rather interesting:

Know Thy Complexities!

This is handy for students or developers that are about to use an algorithm and want to check how it should perform depending on the data set they are working on.

Each link takes you to a wiki page on the algorithm in question so you can get details without searching around the web. 

My purpose in posting this on my blog is to make people aware of the existence of this handy website and so I can find it again in the future if I need it myself.

Tuesday, May 28, 2013

More Database Basics

So I just completed a post about database basics, and at the end I showed how students (and faculty members) can download Oracle or MS SQL Server, install it on their PC, and create a database to match the book they are using in database class.  Then I thought about the fact that I haven't really used Oracle since 2008 and maybe, just maybe, I'm out of practice.  There's nothing worse than creating a post containing untested code or information that might not be valid, so I went out and downloaded Oracle 11g XE (Express).  First, I noticed that there was only a 32-bit Windows version and a warning that it doesn't work on x64.  Being a skillful developer, I ignored that warning and tried it for myself.  There was some sort of error during the installation, but it was more of a warning that something was missing.  It did not affect the capabilities of the installed program, from what I've used so far.  I'm sure it involves stored procedures or maybe some capability that I'm not going to use for an entry-level class.  So I dragged out my "Database Systems: Principles, Design, & Implementation" book from the early 1800's (OK, it's copyrighted 1990). 

Here's the starting sample database in the book:


and here's the matching tables with data:


So, I installed Oracle (which took a very long time), then jumped right into the control panel, which is all new.  In fact, it's a web interface now instead of a Java application.  No problem: I clicked around the menu options until I found a table-designer interface (I'll leave it to the student to read up on how to create a table; it's easy).  I created all three tables, two foreign key constraints, some not-null constraints, and primary keys for the student and class tables.  Yes, I was cringing as I typed the data in, and my mind was just screaming "that should be broken out into a separate table!"  But the purpose of this exercise in the book is to start with a somewhat functional but incorrectly designed database and work through the normalization rules to turn it into a properly designed one.

So here are the three final tables and their data in Oracle:

Student Table
 
Class Table


Enrollment Table


As you can see, getting back into Oracle is like riding a bicycle.  Technically, all databases function similarly: they all have tables, fields, foreign key constraints, etc.  There are some nuances in the way different databases handle cascading deletes, triggers, and other capabilities, but the basic functions are nearly identical.

Any student should be able to download the Express edition of Oracle and create the three tables above in an afternoon.  It only took me two hours (one and a half of which was install time).  Warning: if you're an MS SQL Server user like me, try not to hit the F5 key to execute your query in Oracle.  It just refreshes the browser and erases the query you typed in.

Now you should be able to join the student table to the enrollment table to get a list of students and their grades:
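The join described above could look something like this (the table and column names are my guesses based on the book's schema, so adjust to match the tables you actually created):

```sql
-- Hypothetical column names; a sketch of the student-to-enrollment join:
SELECT s.student_name, c.class_name, e.grade
FROM student s
JOIN enrollment e ON e.student_id = s.student_id
JOIN class c ON c.class_id = e.class_id
ORDER BY s.student_name;
```

The foreign key constraints created earlier are exactly what guarantee that every enrollment row matches a real student and a real class.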

Student Grades

 
Now this is learning!