Tuesday, April 30, 2013

Resource Allocation

I'm going to talk about resource allocation.  This is where you (as an IT manager or CIO) decide which people to combine into your team.  When I started designing software, I used to estimate the project, define a deadline and then divide it by an 8-hour work day to determine how many programmers I needed.  Over the years, I've learned a few things.  One important lesson to understand is that a mixture of different expertise can be a cheaper way to build software.

The reason this is true is that people must wear different hats to perform all the tasks it takes to run a software shop, especially if you are also attempting to provide services for existing software.  I'm going to start with a very simple example.  This is, of course, an example that I made up and it might not work in your situation.  I'll show you how to build the numbers to determine if a different mix of people will help your project's bottom line.

Here's what I'm starting with:

What I've done is set up a project that requires about 1000 man-hours of labor to complete (the total coding hours is shown as 930 above).  The total time it will take the three programmers is 600 hours (they're all working 600 hours in parallel).  That's about 4 months (OK, it's 3.75 months, but I'm trying to keep the numbers even).

If you look at all the daily tasks listed, you'll immediately notice that these three programmers are spending a lot of time on help desk, paperwork and debugging.  So my question for this simulation is "what would happen if we replaced one programmer with a help desk/debugger person?"

Help desk is taking about 240 man-hours away from coding and debugging is taking 450 man-hours.  That's a really large chunk of production time.  Plus, it's a little more than the 600 hours that the extra programmer contributes overall.  I'm going to ignore the paperwork time and assume that 10% is normal for all employees of this particular department.

Here's the new spreadsheet:

OK, so I re-tasked all the help desk calls to the new help desk/programmer and I allocated some of the debugging.  In reality, this person might not be as efficient at finding and fixing bugs as the programmers, but the numbers should still hold up.  Notice how the programmers have been able to do 1020 man-hours of coding in a 600 hour time-frame.  That's better than the mix of people above.

There is also one more consideration here.  You will want to sink your money into some really good programmers and possibly hire an intern, or maybe someone straight out of college, for the help desk/debugger position.  This will be cheaper than hiring three expensive programmers.  Your project will still be delivered on time.  Also, the morale of your programmers will be high because they don't have to deal with help-desk calls.


OK, so I've shown a really simple example.  In more complex examples it's possible to separate programmers by specialty.  If you can get a mix of programmers who have strengths in different fields, say interface design, business code and database design, and put them in a team where each can focus on their skilled area, you can benefit from the strengths of all of them combined.

Also, you might be asking yourself, "where did you get all those numbers for that spreadsheet?"  I got the numbers from metrics.  Each person enters their time into a time tracking system (computerized or on paper), then I roll up their times per week to get their average percentages.  I do this all the time for my employees so I can see where their time is being allocated.  These percentages are then used to estimate how long it will take for my development team to complete a project.  It's not as easy as saying "160 man-hours / 2 people = 80 hours, errr, 2 weeks!"  You have to account for the percentage of time allocated to the actual task.  The rest is what I consider waste.
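That last point is easy to sketch in code.  Here's a quick Python illustration; the 50% allocation figure is made up, so substitute your own metrics:

```python
def calendar_hours(task_hours, on_task_fractions):
    """Calendar hours needed to finish a task when each team member
    spends only a fraction of their time on it (all in parallel)."""
    return task_hours / sum(on_task_fractions)

# The naive estimate: 160 man-hours / 2 people = 80 hours.
naive = 160 / 2

# The metrics-based estimate: two programmers who each spend only
# half their day coding (the rest is help desk, paperwork, debugging).
realistic = calendar_hours(160, [0.5, 0.5])

print(naive)      # 80.0
print(realistic)  # 160.0 -- double the naive estimate
```

The same function also shows why adding a dedicated help desk person works: raising each programmer's on-task fraction shrinks the calendar time without adding coding staff.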

Monday, April 29, 2013

Software Development Disasters

If you've designed and built software for as long as I have (more than 20 years), you'll have a whole lot of experience dealing with disasters.  I've had a couple in my lifetime that could have been expensive disasters if not for my knowledge of trading stocks.

You see, there's this little secret to trading stocks, and it's the one technique that separates the wheat from the chaff: knowing when to cut your loss.  That's right.  You're going to run into something some day when you'll just have to know when to quit.  If you can see the disaster coming early enough, you can avoid the cost of the disaster.  Sort of like seeing a tornado far enough in advance to jump out of the way.  In the case of stock trading, you need to cut your loss before you lose too much money.  In the case of software, you need to cut your loss as soon as it starts to look like a loser.

Enough of the metaphors, time to get serious.  The number one thing you need to do when designing and estimating is break your software project into the smallest pieces possible.  Then estimate the tiny pieces and roll it all up.  If you have a chunk of software that is kind of gray in your head, you'll need to focus on that piece and maybe do some mock-ups or a small and cheap demo.  When you have your numbers, your resources are in place, you're authorized to start the project, then you need to actively track your progress.  By the time you're 5% into your project, you should start to see a pattern emerge.  Is it behind schedule?  Is it really behind schedule (more than 20%)?  How do the tasks that are currently completed compare with the tasks left to do?  If the remaining tasks are comparable to the tasks that have been completed, then you'll never finish within your budget.
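The early-warning check in that paragraph boils down to a simple linear projection.  Here's a quick Python sketch; all of the numbers are hypothetical:

```python
def projected_total(hours_spent, fraction_complete):
    """Crude linear projection of total effort from early progress.
    It assumes the remaining tasks are comparable in difficulty to
    the completed ones -- exactly the question you should be asking
    at the 5% mark."""
    return hours_spent / fraction_complete

estimate = 1000                  # original estimate, in man-hours
spent = 75                       # hours burned so far
done = 0.05                      # 5% of tasks complete
projection = projected_total(spent, done)
print(projection)                # 1500.0
print(projection / estimate)     # 1.5 -- heading into ugly territory
```

If that ratio comes back at 1.5 to 2, you're in the territory the next few paragraphs talk about.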

Let me clarify what I'm talking about here.  I'm not talking about meeting a scheduled deadline.  If your company needs this software and is willing to just throw more money at resources, then you should attempt to throw in more resources early on to get ahead.  However, most companies base their software development on ROI.  That means cost.  So being behind means that either you have to work your current programmers more hours and pay them more, or you have to add more programmers, which costs more (I know, that's really obvious, except "programmers" don't think that way).

So you're already behind schedule and it appears that the project is going to go into ugly territory.  In other words, it looks like the whole thing will probably cost 1.5 to 2 times as much as originally estimated.  What do you do? 

The first thing to look at is the features that this software will contain.  Are there any features that have questionable value for the customer?  Maybe you can trim back on those and make up the difference.  The ROI will come out the same if the value of the overall software package is perceived as the same.  Maybe the software you are developing has only 5 primary features and all other features are secondary in nature.  It's possible that you could defer these features to a future version of this software, say, after the profits start rolling in.

Let's say that your software is cut to the bone.  It's lean and mean, and any feature left out will reduce its value as a package.  Like leaving out the accounts payable section of an accounting package.  Just can't do that.  Now what do you do?  This is the point where I usually think about cutting my loss.  I would build my supporting evidence, show my projection of how much this project will cost to the people who authorized the funding, then give my recommendation.  Maybe they will think it's worth the extra money.  Maybe budgets are tight and the funding will not be available to complete the project.  If that happens, then it's ALL STOP!

Fortunately, you've put a stop to potential loss.  Now you can re-evaluate the project and possibly use a different technology.  Maybe the interface is not worth the effort and an easier to build interface could do the trick.  Maybe a different programming technique can be used to make things easier.  Programming tools?  Different mix of programmers?  There are so many options at this point.  Fortunately, you're not bleeding money while you're re-analyzing the project.

Cutting your losses is not always the right call.  Sometimes you have to go with your gut, grit your teeth and push through.  If the remaining tasks in your project appear to be easier than the tasks that were completed, you can probably pull it out of the fire (assuming you're not too far behind).  Don't be too averse to risk or you'll never finish a project.  I'm just telling you that you should be aware of your situation at all times and never ride a disaster into the ground.

Sunday, April 28, 2013

Hiring Personnel

Hiring a technical person can be a very tedious and stressful job.  I work for a small company (yeah, I know, I've mentioned that before) and we don't have a fully-staffed personnel department that can go out and find candidates.  So I'm tasked with doing 100% of the legwork for finding a technical person.  Sometimes I opt for the contract agency and let them perform the legwork.  In the case of network administrators, I usually find them myself.  I've been through a few network administrators in my time at my current workplace.  Our company used to be a Unix shop and it was nearly impossible to find a competent, Unix-knowledgeable administrator.  In 2007, the stars aligned and our company decided to convert from Oracle/Unix to Microsoft/SQL Server.  I'll talk about that project some day, but for now, let's just say that our network administrator requirements changed.

One of the most important requirements of a network administrator (and programmers), which I discovered after years of hiring people, is that they must have an interest or an inner drive to do the job.  If they're the kind of person who likes to do network stuff as a hobby, then they'll be the best network administrator around.  People who go to a tech school and obtain an education because they thought it would be a "good paying job," but don't have an interest, are never going to cut it.  Let's face it, IT is a constantly changing world and it takes someone with the interest to constantly learn to be good at it.

Thinking outside the box

A couple years ago, I was faced with finding another network administrator.  My previous guy found a help-desk job at a large company (I'm not sure what he was thinking).  So I decided to cut through the bull and get down to business.  There's a local technical school called ITT Tech that trains network administrators in a 2-year course.  I decided to call them this time and ask if they had an instructor I could talk to who could identify students about to graduate with an actual inner drive.  The person I talked to said I could talk to their top instructor who teaches networking classes and invited me to come down to visit their campus.  So I jumped in my car and brought my lead programmer with me (I like getting a second opinion on people).  The lady I talked to at ITT met with us and took us around their facility, which was not very big but seemed really focused.  They only offer a handful of degrees at ITT Tech and they are not a college.

Then she stopped at the classroom of a network class that was in session (they were at the last 15 minutes of class time).  She ran in and talked to the instructor, who happened to be the instructor that I really wanted to see.  I expected to meet with him after class, but he just invited me right into his classroom.  At the front of the class.  So I started tap dancing.  I first described who I was, what my company did and what I was looking for.  I described our server/network infrastructure so the students had a context of what they would be working on if we hired one of them.  I then told them that we were looking for an administrator to work for our company, with an immediate opening to fill.  I also described the fact that a small company is one of the best places for a new administrator to get experience because the network administrator does everything (and I mean everything).

So I took questions from some of the students; only a handful asked questions.  Two in particular started drilling me with all kinds of questions.  Then the class was dismissed and I continued the tour of ITT.  When we returned to the front office, the two students who had asked me the most questions were filling out paperwork to apply at my company.  So I talked to them for a few minutes and then we left.  The first guy that showed up at our company ended up with the position.  He impressed me the most.  His personality fit our organization and he had the drive to learn and apply his knowledge.


The guy that I hired, his name is Jim, is still working for us.  He's dedicated; he has recently taken it upon himself to take the CCNA classes and is studying for his certification.  He's the guy that I trust with our system.  He's the guy who can get the job done, and if he can't figure it out in a reasonable time, he knows how to contact an expert to get the answers he needs.  My point in this article is that the traditional method of hiring a person might not fit your situation.  You need to think about the actual person you would like to hire, not the minor facts that a potential candidate can spit out in an interview.  Can they get the job done?  Can they find a solution and close out the issues your company has?  Can they do this without tying up other company resources?  Don't be afraid to try something new.

Duplicate Records

Similar Words

So this post will be an extension of the data quality post I just completed.  I'm going to talk a bit about duplicate records.  Let's pretend for the moment that we have a list of companies (I know, I'm beating this example to death).  One method of finding duplicate records is to sort by name, which works when the names are close and the leading characters aren't misspelled.  This method can reveal the obvious duplicates, and a verification of identical addresses would give the final conclusion of two records being the same.  Let's assume for the moment that we are really looking for possible duplicates that might be close, but might not begin with the same letter.  There is an algorithm that can be used to determine how close two words are to each other.  That algorithm is called the Levenshtein distance algorithm.

For C# code go here: C# Levenshtein

I'm not going to rehash the details of the algorithm, because anybody can click the link, copy the code and make it work.  I'm going to talk about its use and how to analyze what you can do with it.  The output they show looks like this:

ant -> aunt = 1
Sam -> Samantha = 5
clozapine -> olanzapine = 3
flomax -> volmax = 3
toradol -> tramadol = 3
kitten -> sitting = 3

In other words, if you compare "ant" to "aunt" you'll get a distance of 1.  "Sam" with "Samantha" gives a distance of 5 and so on.  The distance number is just a relative difference between the two words.  Theoretically, you can compare each company name against all other names and list companies where the Levenshtein number is at say 4 or less (you'll have to adjust the number to capture what you're looking for).
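If you'd rather experiment outside of C#, here's a minimal Python version of the same dynamic-programming algorithm; it reproduces the distances listed above:

```python
def levenshtein(a, b):
    """Edit distance: the minimum number of single-character
    insertions, deletions and substitutions to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

print(levenshtein("ant", "aunt"))        # 1
print(levenshtein("Sam", "Samantha"))    # 5
print(levenshtein("kitten", "sitting"))  # 3
```

Note that the comparison is case-sensitive as written; for company names you'd want to lowercase both strings first.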

This should give you a small list of companies that are similarly named.  From that list you'll have to make a determination if they are the same company and fix the situation.

The downside is that this duplicate comparison is an O(N²) problem.  So basically, if you have a million records, you'll have to match the first record with a million minus 1 other names, and then the second record with a million minus 2 names, and so on.  That works out to N(N-1)/2 comparisons, or about 500 billion for a million records.  That's far too many to solve by brute force with an expensive string algorithm.  Yikes!

One method to get around this explosive number of comparisons is to try to catch the mistake at data entry time.  When a new company record is created, the data entry screen would require the company name to be entered first.  It could then be checked using the Levenshtein distance algorithm to see if there is a company that is similarly named, and if so, display a side screen that shows the address.  The data entry person can double-check the address that they are about to enter to see if it matches, and skip creating a new record.  Further verification can occur if the address matches up when the operator attempts to save the data for the first time.
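Here's one way that entry-time check might look.  This sketch uses Python's built-in difflib (a similarity ratio rather than Levenshtein distance, but the idea is the same); the company names and the 0.6 cutoff are made up for illustration:

```python
import difflib

def similar_existing(new_name, existing_names, cutoff=0.6):
    """Return existing company names that look suspiciously close
    to the one being entered.  The cutoff (0 to 1) is a knob you
    would tune against your real data."""
    return difflib.get_close_matches(new_name, existing_names,
                                     n=5, cutoff=cutoff)

existing = ["Acme Tool Company", "Dice Game Co.", "Truck-R"]
matches = similar_existing("Acme Tool Co", existing)
print(matches)  # ['Acme Tool Company'] -- warn the operator before saving
```

Checking one new name against a million existing names at save time is a linear scan, which is cheap compared to the all-pairs cleanup job.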


There are algorithms that can be used to compare words by their phonetic properties (how close they sound when spoken).  MS SQL server has a function to compare two words according to soundex rules.  If you have SQL server, here's an example:

SELECT DIFFERENCE('Smithers', 'Smythers');
The result will be 4 (DIFFERENCE returns a value from 0 to 4, with 4 being the closest match).

Also, you can use the SOUNDEX() function to return a code that represents the sound of the word.  Similar sounding words will return the same code:

SELECT SOUNDEX('Smith'), SOUNDEX('Smyth');

Both will return S530 as a result because they sound the same.

For this example, you could build a table containing the soundex code (or just add a field to the company table to store the soundex code).  Then you can group by the soundex code to determine if there are duplicate-sounding companies.
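If your data lives outside SQL Server, the same grouping can be done in application code.  Below is a rough Python equivalent of SOUNDEX() based on the standard American Soundex rules (SQL Server's edge-case behavior may differ slightly), grouping some made-up company names by the code of their first word:

```python
# Letter-to-digit table for American Soundex.
CODES = {c: d for d, letters in {
    "1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
    "4": "L", "5": "MN", "6": "R"}.items() for c in letters}

def soundex(word):
    """First letter plus up to three digits, padded with zeros."""
    word = word.upper()
    digits, prev = [], CODES.get(word[0], "")
    for c in word[1:]:
        if c in "HW":            # H and W don't break up duplicate codes
            continue
        d = CODES.get(c, "")     # vowels code to "" and reset prev
        if d and d != prev:
            digits.append(d)
        prev = d
    return (word[0] + "".join(digits) + "000")[:4]

# Group made-up company names by code to spot possible duplicates.
names = ["Smith Tool Co", "Smyth Tool Co", "Dice Game Co"]
groups = {}
for n in names:
    groups.setdefault(soundex(n.split()[0]), []).append(n)
print(groups)  # the two Smith/Smyth entries share the S530 bucket
```

Grouping by a precomputed code is a single pass over the data, which neatly sidesteps the all-pairs comparison problem mentioned earlier.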

Other algorithms include the Double Metaphone algorithm.


I'll leave it to the reader of this article to research the Double Metaphone algorithm if they so desire.  My point in this article is that there are a lot of options beyond a simple direct comparison and the SQL "LIKE" operator.  Think carefully about how you plan to implement your data quality plan, or the results could be unattainable.

I hope this helps with your endless quest for improving data quality!

Saturday, April 27, 2013

Data Quality

I've touched on this subject in an earlier post called "The Perils of Bad Data".  That was a specific example, but it applies to the larger subject of overall data quality.  One technique that I've used in the past to determine data quality is to write verification methods that check for missing data, incorrectly formatted data, or obviously bad data, then total up the number of errors detected in each record and show a list of records sorted from worst to best.

OK, it's example time.  Imagine you have a database full of records of companies and their addresses and phone numbers:

So I have a list of 4 companies.  Try to imagine that the database contains over a million records with varying amounts of information.  I'm also going to ignore the fact that the data input screen should do some minimal verification before allowing data to be entered into the system.  You can see from the example above that there are some obviously missing fields.  It might be because the data is optional, or that the data was not available when it was entered into the system.  So the first data quality rule is to count the number of missing data points that are not optional.  I'll assume the fax number is optional and I'm going to put the number of errors in a new column:

Now you can see that the data that I made up has one high quality record and three varying quality records.  The "Truck-R" company is the lowest quality record since it is missing 3 pieces of information.  Obviously data such as the state can be fixed if the city is unique and filled in, but I'm going to ignore automatic data fixing routines and just show how to measure basic data quality.

Next, I'm going to look at the zip code.  This is required to be in one of two formats: a 5-digit code (12345) or a 9-digit ZIP+4 code with a dash (12345-6789).

If the dash is missing, I'm going to flag it as an error.  We could, technically, auto-format the data if it contains the right number of digits, but I'll just count it as bad.  So now our total looks like this:

In this example I didn't put in any 9 digit zip codes, but I'm sure you can use your imagination.  We could also count 4 digit zip codes as incomplete, but I'm going to ignore that too.

By now you're probably getting the point of this exercise.  Verify the phone format, and verify the fax number format even though the fax is allowed to be blank.  There is no sense in inputting bad data just because it's optional.  The phone number should only contain numbers and can be checked for correct format and number of digits.  Don't forget that phone numbers can contain an extension, an area code, a possible country code and other variations.
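Here's a minimal sketch of this kind of record scoring in Python.  The field names, the required-field list and the format rules are all made up to match the example; real rules would be more forgiving (extensions, country codes, and so on):

```python
import re

ZIP_RE = re.compile(r"^\d{5}(-\d{4})?$")       # 12345 or 12345-6789
PHONE_RE = re.compile(r"^\d{3}-\d{3}-\d{4}$")  # one of many valid layouts

REQUIRED = ("name", "address", "city", "state", "zip", "phone")

def error_count(record):
    """Count data-quality problems in one company record (a dict)."""
    errors = sum(1 for f in REQUIRED if not record.get(f))
    if record.get("zip") and not ZIP_RE.match(record["zip"]):
        errors += 1                 # present but badly formatted
    if record.get("phone") and not PHONE_RE.match(record["phone"]):
        errors += 1
    return errors

records = [
    {"name": "Dice Game Co.", "address": "1 Main St", "city": "Springfield",
     "state": "IL", "zip": "627015000", "phone": "217-555-0100"},
    {"name": "Truck-R", "address": "", "city": "", "zip": "", "phone": ""},
]

# Sort worst-first, like the error column in the example above.
worst_first = sorted(records, key=error_count, reverse=True)
print([r["name"] for r in worst_first])       # ['Truck-R', 'Dice Game Co.']
print([error_count(r) for r in worst_first])  # [5, 1]
```

Each new rule (fax format, stale last-edit dates, and so on) is just another increment inside error_count, so the worst-first report grows with your quality plan.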

One other metric to consider (and there are probably endless other tests you could devise) is the last edit date or last viewed date.  My company does an annual review of data.  We flag old or rarely used records and run down the list to determine whether the companies in our database still exist and whether we need to continue maintaining data on them.  For some of our records we track the number of times they've been accessed, as well as the date, so we can view the frequency.  If a record is only accessed once a year, then it was probably just viewed by mistake.  Finally, access tracking can also be based on how often data associated with that company is accessed.

What to do with the bad records

When records are identified as having potential errors, then some sort of investigation and repair must be made.  As I mentioned earlier, some information can be repaired automatically, but the most important errors need a human to make a determination.  This is where large data becomes expensive.  Somebody needs to get online and look up the missing zip codes for "Dice Game Co." and "Truck-R" (as an example).  If they can't find the information, they might be required to call the company in question and ask for the correct information.  That goes especially for the contact person, which might need to be checked periodically (once per year, or every 5 years, for example).

Other Quality Checks

Duplicate records are the bane of my existence, especially company records.  It never fails that a person re-enters a record and spells it differently, or shortens a title (like "co" instead of "company").  Then there are two companies with identical addresses.  Some sort of basic test to see if there are two or more records at identical addresses (this can be done by sorting by address) should be performed.  Two companies at the same address might not be duplicates, but 99% of the time they are.

Misspelled company names, incorrect addresses, wrong states or cities, and misspelled contact names are examples of data errors that cannot be detected by a program unless there is another database to reference.  In the case of the company name, it might be possible to do a Google search and verify that the company name exists.  This is crossing into the "iffy" zone of data verification.  State and city can be verified.  Lists of cities are readily available, as is zip code validation.  It's relatively easy to write code to verify that the zip code given is in the city and/or state entered.

I hope this information helps keep your data clean and accurate!

Friday, April 26, 2013


Spam - Sheesh, Are They Kidding?

I get spam every now and then.  I used to get a lot, but we have a really good spam filter (Barracuda, by the way).  I've been receiving spam since I signed up for email back in the early '90s.  I'm so familiar with the social engineering tricks that spammers use that I can smell a spam message the instant I view it (many times I only have to see the subject line).  They're pretty clever, but not that clever.

This spam message I just had to blog about.  Why?  Because it cracked me up.  OK, here's a screenshot of the email:


Oh yeah, they had me fooled right up until I realized that I didn't book a flight (sure they did).  I started to crack up, since the dominant airline in my area is Delta, I haven't flown since last fall and I have no plans to fly any time soon.  Anyway, I was curious how this was set up, so I did what I normally do and hovered my cursor over the "download it" link.  That's obviously where they're going to execute a Trojan horse and make my life miserable (so I'm not going to click on it).  Here's what I got:

So the address goes to gentedecente.com.br.  Which is a Brazilian domain name.  The ".com" part in the middle is clever, to kind of distract the eye from noticing that it's NOT a dot com address.  So I decided to check if anybody else is receiving this particular spam and I typed "airline ticket email spam" into Google and it looks like it's so common that there are different variations and American Airlines is already aware of the problem.  AA has a webpage describing the scams and that you need to be cautious.

One other thing I did to see why this got past the spam filter was check the return address.  It was from ticketlesstravel.com, which happens to go to a web hosting company, but no site.  That's a dead giveaway.  They could have at least bought a fake site or linked it back to a real site.  Maybe I shouldn't give pointers here...

MS SQL Server Tricks

I've talked about SQL server tricks before, but this time I'm going to show you something really useful.  First, here are the assumptions:

1. You have a database with lots of tables.
2. Your database is normalized.
3. You named your foreign key fields the same name as the primary key of the table that they match.

Let's say you have a situation where you are merging two records into one, but there are child records to consider and the child records span multiple tables.  How do you know you merged all data from all tables?

Use the INFORMATION_SCHEMA.columns view to list all the tables that have matching columns.  The following SQL command will query for all tables that have a field named "ccompany":

select * from mydatabase.INFORMATION_SCHEMA.columns where column_name='ccompany'

Here's a partial result on a real database:

Armed with this information, you can go from table to table and merge the appropriate information.

That's not all.  There are numerous tables that can be queried to determine indexes, constraints, list of table names, etc.  Open SQL Server Management Studio and navigate down to: Databases -> System Databases -> msdb -> Views -> System Views.  You'll see a list of views that you can tap into:

Just run a select query on each view to see what is available.  


I'm really getting into the swing of blogging.  When I started, it was harder than I imagined.  It took me a long time to come up with a subject that I could write about and keep my interest.  So I settled on this mix of technical programming and IT manager "how to" format.  I've watched the stats, just to see if anybody is interested and the search engines are bringing people in.  If you're one of those individuals who obtained that tiny piece of information that put your project over the top, then you're welcome.  That's why I'm doing this. 

Unfortunately, I don't get any comments.  Technical subjects don't warrant a lot of comments.  Let's face it, most people just want to learn something and move on.  I turned on anonymous commenting when I set up this blog, so feel free to leave me a note.  Just say "hi," or if you see a possible error, I promise I'll look at it in more detail.  Sometimes software works great in one environment, and then it breaks when someone else doesn't have a certain patch for their Visual Studio (or maybe I'm not on the latest patch), or they are using a different browser, etc.

As of this writing, I have a couple dozen subjects that I jotted down in my Evernote notebook.  The only thing limiting the number of posts I generate at this point is the amount of free time I have available.  The words are flowing.  No more writer's block.  I must have broken through my confidence barrier (assuming I had one).  Now that I've become acclimated to free-form writing, it's getting easier (I'm used to writing technical papers or executive summaries, so this style is a bit liberating).

If you have taken up an interest in a subject that I've blogged about, drop me a comment and tell me that you want to know more about that subject.  I can always pull more information out of my brain and put it into words.

And... thanks for reading!

Sunday, April 21, 2013

Software End of Life

So I was checking the end-of-life date of Windows 7 to see if Microsoft has set a deadline for the end of the OS that we are currently using at our company.  We have been procrastinating on the Windows 8 decision.  At this point we haven't even decided if we are going to upgrade our company to Windows 8 or if we're going to skip over it and wait for the next version of Windows.

To check the end of life of a Windows product, you can click here:

Microsoft Windows End of Life

I was surprised to find that there is no official end-of-life date for Windows 7.  Normally a product will have an end of life assigned as soon as, or soon after, a new product is released.  When Vista was released, Windows XP was expected to be dead in a couple of years.  The negative reception of Vista caused a backlash in the computer community (mostly businesses) because of a lot of compatibility problems.  Microsoft decided to extend Windows XP's support dates; mainstream support didn't end until 2009, around the time Windows 7 shipped, and extended support continues even now.

I'm guessing here, but I think Microsoft is testing the waters before assigning an end of life date to Windows 7 until they see what kind of reception Windows 8 receives.  So far Windows 8 is not being received very well.  Don't get me wrong, I like the concept of having an OS that runs on tablets and desktops and allows touch input as well as a mouse.  I'm just not sure it's good for general businesses.

So what am I babbling about?  OK, we loaded Windows 8 on an old PC for testing and the first thing we discovered is that we needed to find a way to turn on the start button.  It took a lot of time and training to get people to learn the start button and what it was for.  Now it's turned off by default.  Probably because it's not very convenient for a touch display.  The second problem is the initial screen with its square boxes.  This screen isn't very useful for a PC user.  I'm sure we could arrange applications on this screen to make it easier to use, but I'm dreading the training that it will require to use this interface.  Here's the kicker: Nobody's job will become more efficient after they have learned how to use Windows 8.  It just doesn't translate to more efficiency.

My Dream OS

OK, I'm going to give my idea of a dream operating system.  I know it'll never happen, because the challenges are just beyond mankind at this point and it probably doesn't boost sales like a shiny new interface.  Also, almost all OSes fail at some of these things that I'd like to have.

1. Everything that makes the OS go is just a bunch of files that can be zipped up and copied to another machine.  Wouldn't that be cool?  Wouldn't it be nice to be able to save a copy of a freshly installed machine in a directory on a server and easily restore it to any other hardware, without compatibility problems like registry settings, DLL versions, locked system files, etc.?

2. No reboot needed.  Ever.  I don't think I'm the only one here that wants that.

3. Two-second boot up.  That includes the motherboard BIOS.  The BIOS on some servers and PCs that we use is really slow at getting started.  How about a button on the front that can be pushed to switch directly into the BIOS to set it up, instead of trying to catch that stupid screen before it goes by?  And why should we wait for that screen the other 99% of the time when we don't care to change the BIOS?

4. Print drivers... sigh.  Can't we get a standard here?  What if every manufacturer placed the driver in a directory on their website and every printer had that location embedded in its firmware?  Then the computer could look it up, download the driver and install it automatically, without anybody doing anything.  Provide a disc with the normal installation for computers that have no Internet connection.  It's also painful that for a Windows 7 restricted account, the administrator cannot give the user rights to install only print drivers (that, by the way, is a limitation that thousands seem to know about but has never been fixed, probably due to a security concern).

I remember when DOS was king and we were able to just run "SYS" to put the two files on a hard drive needed to boot it up.  Then the rest of the command files could be copied.  That made it easy to save a copy of an OS someplace else, and run a batch file that would SYS the drive and then copy everything to it to restore a PC to its initial condition.  I'm not advocating that we bring back DOS... NoooOOOO!!!!  DOS has its own problems, like requiring a CONFIG.SYS file, and Windows 3.11 required .ini files for configuration information (though I do like INI files better than the registry, they had their own problems, like parse errors, corrupt files, etc.).

Thursday, April 18, 2013


Ahh, Cross-Training

Everybody should have heard that phrase at least once in their life.  The military is big on cross-training, for obvious reasons.  But I'm going to assume that my readers don't know what I'm babbling about, so I'll give a quick definition off the top of my head:

Cross-training is where multiple people with different skills teach each other those skills, so that there are at least two people who can do any given job.

Sounds good, doesn't it?  It's not always that simple.  First, there's the advanced-skill problem.  If your company is small like mine, you might have only one person with advanced skills in a subject that just can't be easily taught to other employees.  For such an instance, I would recommend finding a skilled contractor who can be contacted quickly.  Then get them on your speed-dial.

So how do you cross-train someone?  The way I do it is to have one person do another person's job for a task.  My company has one network administrator, but I have a senior-level programmer and myself who are fully capable of doing network tasks, even if we are not formally trained in the subject.  So I will occasionally assign network tasks to a programmer (especially if my network person is going to be on vacation).  I also make it a point to cross-train myself.  I like to know how to do other people's jobs, even if I'm never planning to use that skill.  If something goes wrong, I can fumble through.

Important IT Tasks

Here are a few very important tasks that should be cross-trained:

1. How to disable a user account.
2. How to startup each server.
3. How to properly shut down each server.

How to disable a user account.  This might be a one-step process, assuming that your company uses a single login, like a domain controller.  Most companies have multiple systems, and the user must be shut down in each of them.  In the case of my company, we have a Microsoft network of servers where we can shut off a person's login with one click of the mouse, and we need to shut them out of our website (also one click of the mouse).

How to start up each server.  If you have a server room and only a handful of network personnel (in our case: one), then you'll have to train someone else on how to start things up, just in case a lengthy power outage forces someone to start things manually.  Most servers these days can start on their own and don't require anything special.  In the case of my company, there is a certain order required to start up our servers.

How to properly shut down each server.  Just pulling the plug is, how do I say it, NOT OPTIMAL!  There is a safe, right way to shut down a server, and then there is the wrong way.  Most of the time your servers will run 24/7, and nobody will shut them down unless they are due for maintenance, patches, upgrades or replacement.  However, if there is a long power outage (beyond the capacity of your UPS system, if you have one), then you'll need to systematically shut down low-priority servers first, then all servers.

So you might have a ready-made excuse: no free time to train people.  My answer is that you should swap a pair of people to do each other's jobs.  It'll be slow and tedious, but it can be done.  Another option is to schedule a person to shadow another person for a couple of hours; some important tasks only take a few minutes.  When shutting off an employee's user rights (let's pretend they found a better job and are moving on), the IT person training on the task can shadow the network administrator while it's done.  Then there should be a follow-up session where the cross-trained person does the task on their own with the expert looking over their shoulder.

One other thing you should do as an administrator: keep a list of tasks that you want people to cross-train on.  Check them off when you feel they are competent in each task.  You'll need to record the date of the training and schedule refresher training about once a year.

The Log Book

You should keep a log book.  It needs to be stored someplace secure, probably in the  main computer room itself.  Our security plan requires a log book, and it must be constantly updated and maintained.

Log Network Changes

I know network guys hate doing paperwork, but a log can be used to do detective work.  Say the backup system quit working last Wednesday at 4:30pm, and there is a log entry showing the administrator password was changed at around 1:00pm that same day: it can be helpful to see what coincides (I know, that was a really bad example).  How about this: a firewall was replaced at 1:00pm, and a person attempting to use web-based email couldn't connect from home at 6:30pm, but the network administrator on duty is not the one who changed the firewall (and that information wasn't passed on to the night shift).  The log is a good place to see what happened earlier in the day.  The log book can also be your friend if a lawsuit occurs and you must turn over information for the discovery process.  If you're a good network administrator, then keeping a paper trail is your best protection from unforeseen events.

Log Database Configuration Changes

Keep track of changes to the database.  If your company has software that accesses your database, you'll have to make sure there is a distinct roll-out date planned around your changes.  Log the date and time that the roll-out occurred so you can trace bugs back to that point.  It also helps when analyzing the number of bugs per day/week/month: if there is a spike in the number of bugs, it might be right after a deployment.

Log Employee Changes

Make an entry in the log book when adding an employee or removing an employee from the system.  This information will be used in situations where an investigation occurs and you are questioned by other managers or security.  It might also be useful to log when user rights are changed.

Electronic Logging

I'm advocating a physical book with non-removable pages; such a log cannot be deleted by a determined hacker.  If you prefer to record log entries in an electronic format (Word, Excel, etc.), then make sure it is backed up on the regular backup system.

Wednesday, April 17, 2013

How to Talk Non-Nerd

Talking Nerd is the accepted language between two experts in a common subject.  It's most common in the computer field, but it also occurs in fields such as chemistry, biology and physics (to name a few).  The problem is that "nerd" is like slang.  Technically the nerd language is not really slang, it's just uber-technical (see how I did that?  Huh, nerd-ism, right in the middle of that sentence).  My prior CEO used to ask me to "dumb it down" to his level.  I never liked to call it "dumbing down" because it sounds insulting.  The politically correct phrase is "translate it into normal language" or "translate it into people-speak."

One of the things I like to do when reporting information to our CEO or COO is to give them the broad strokes.  If a question is going to lead into technical mumbo-jumbo, I usually ask if they want the technical version.  Usually, that's a NO.  So it's best to give an analogy or a short summary using non-computer terms.

Nerdy Example

I'll give an example: our Internet connectivity consists of two connections (yeah, I think I mentioned that we are a small company).  One connection is fiber optic at 20 megabits (20Meg up, 20Meg down) and the other is coax (20Meg down, 2Meg up).  We host our own web site and we have other servers (like email and teamserver, etc.).  The coax is a cheap data line purchased primarily as a backup, in case the fiber goes down (there's nothing worse than not being able to get to the Internet to test your Internet connection).  So we (my network administrator and I) decided to purchase a Cisco router and configure it to dedicate a block of fiber bandwidth to the web server and to route the web traffic of all the office PCs through the coax connection.  We purchased the device and installed it.  Now the question from our CEO would be "what was the purpose of that equipment you guys installed?"  (He did not ask, by the way, but it was in my weekly status.)

Not so Nerdy Response

My response would be "do you want the technical reason or the broad strokes?"  He would probably say something along the lines of "keep it short and simple."  I would tell him it was to solve our website bottleneck issues. 

If he wanted the technical, but not too technical, I would say we bought a router, which is like a "Y" that you put on a hose to direct half of the water one way and half the other way.  We programmed the router to ensure that we have bandwidth available to the web server for our customers, even when our own people are downloading a lot of stuff at once.

The technical stuff should be reserved for the log book, which is only reviewed by technical people in the future, if an investigation happens.  Why am I talking about this?  Because the CEO, the COO and other company heads do not have time to sift through all the details when all they really want to know is how this fixes a problem, or how it improves the company.  It's not the CEO's job to understand the network; that's why he hired you!

Tuesday, April 16, 2013

Doing Something New - Continued

OK, so I got a bit sidetracked.  My intention with the Pirelli story was to show that you should never be afraid to try something new.  In fact, the IT field requires it.  If you're not willing to attempt something new, your career will end abruptly.  For some people, doing something new is really stressful.  What if it doesn't work?  What if I get stuck?  Can't I just play it safe and keep the status quo?

For me, attempting something new and challenging is exciting.  That's why I'm in this career.  What if it doesn't work?  That's where thinking on your feet comes in handy.  I can always come up with a plan B.  I haven't run into a project yet that I've had to scrap and start over.  Normally, I can identify a bad project very early on and change directions before anybody really notices.

What if I get stuck?  Google!  If Google isn't the answer, do what I do: find some experts and get them on speed-dial.  I have contacts who can configure a Cisco router.  I have experts for Unix, Microsoft servers, firewalls, programming, etc.  If you've done this job as long as I have, you should have some contacts: companies that have experts.  Don't forget about books (see Amazon for a really good selection), and sometimes there are classes (depending on how much time you have to execute your plan).

Can't I play it safe and keep the status quo?  That's the kiss of death.  WalMart killed K-Mart because they were the first to computerize their supply chain and reduce their warehousing.  K-Mart played it safe, WalMart did something new.  Don't like WalMart?  How about Amazon?  I would love to see the insides of their supply-chain software and hardware.  Now that's a real challenge!

Most of the technical knowledge I have was learned outside of the university.  U of M gave me the foundation to work with.  After that, it was up to me to figure out how to solve the real-world problems that were presented.  Before I worked for Pirelli, I had no idea what a PLC was.  A few weeks later and I knew how to program them, wire them and transfer data between a PLC and a PC.  At first it seemed daunting, but after I got over the initial shock of starting with something I never saw before, it was a piece of cake.

I like to say that a computer "expert" is a person who never stops saying "what if we try this?"

Sunday, April 14, 2013

Doing Something New

The title of this post is about doing something new.  Not me, you.  I work in a world of technology where my job changes constantly, at a pace that is mind boggling.  For some, it's difficult to catch their breath, for people like me, it's exciting.  In the past I've blogged about doing research and this is sort of the same subject.  Let me give a little story of my background to show an example of what I'm talking about.

Once upon a time... I worked for Pirelli.  I was hired to fix a problem they had at a tire production factory in California.  The problem boiled down to this: they had an electro-mechanical system to count the number of tires built on the factory floor, but it was so old that only the older machines were connected to it.  They also had some newer machines tied into the Allen-Bradley network, and the data for their counts was inconsistent due to differences in the programming.  So I needed to come up with a system to fix this problem.  I looked at different vendor products and some were very intriguing, but they cost too much.  I settled on a solution using more Allen-Bradley displays and PLCs (industrial computers), because the overall system would provide the needed capabilities at a reasonable cost.

So where to begin?

The first task I had was research.  Yup, I had to figure out how I was going to design and build this thing, so I had to learn what the AB displays and PLCs were capable of, and they don't teach that technology in college.  Fortunately, the factory repair shop leader was an electrician, and he loaned me a small PLC cage, some cards and a couple of displays.  That was enough to build a tiny proof-of-concept system in the computer room and get some experience programming in PLC ladder logic.  At that time, the Internet was new; finding answers using Google didn't exist.  So I read the programming manuals, the wiring manuals, everything I could get my hands on.  I had to be able to scale this thing to track 100 tire-building machines, and I didn't have any experience building such a thing.  This was my first serious after-college project.

Once I had filled in the blanks in my knowledge of what I needed to build, I created diagrams and details of how it would all fit together: a list of materials, how much time it would take, what types of people I would need to contract (one assistant programmer, one or more electricians, a welder, etc.).  After the documentation was completed, I put together an estimate of the total cost.  Allen-Bradley gave me an estimate for the equipment (that was the easy part); I had to determine whether I was going to use in-house labor or hire contractors.  I opted for in-house labor for the electricians and a contract programmer to do some of the programming.

Once the budget was set, I ran around and got buy-in from the plant manager, the operations manager and my own IT manager (who was on board from day one).  That part was easy.

Next it was off to Capital, a department in the administration office of Pirelli that handled capital investments.  They gave the thumbs up or down on a project.  I discovered some politics in this department: apparently, they were not happy with a young, inexperienced IT person receiving money for a project of this size.  It took some meetings with the plant manager and my IT manager before they finally gave the go-ahead.  Then one day I received notification that I was authorized to spend the money, along with an account I would use to issue purchase orders (I've discovered over the years that every company is different in this regard).

So I asked my IT manager, "now what?" and she said, "start buying stuff, put it together, it's your baby."  That, my friends, is when I achieved a really big smile.  I immediately picked up the phone, called Allen-Bradley and authorized the purchase of parts.  I ran around and notified people that I would be doing electrical work in the factory in the near future.  I issued a job description for a contract employee (this person was hired by the personnel department; again, this process is different at different companies).

The parts arrive

Yikes!  I had no idea that 100 displays, spools of wire, conduit, mounting boxes, PLCs, etc. would look like pallets and pallets of stuff.  Yes, my first real lesson.  Some of my stuff was stored in the electrician storage area, but most of it went into the computer room storage room (stacked to the ceiling).

I worked weekends so I could take advantage of electricians that were working during factory down time.  This expense did not penalize my project capital allocation.  They did an excellent job of running data wires to all the older machines that were not connected.

My programmer arrived and I tasked her with writing the PLC program that would be used to collect data from all the machines.  She was also tasked with writing some code for the displays.

Some in-house welders installed the boxes for the displays, one for each machine.  These were welded to the frame of the machine in a position where the worker could see the display while building tires.  The displays were touch displays.  The worker would enter their clock number into the display and change the status to "operating" to indicate that they were building tires.  There was an entire system of maintenance codes to call maintenance and to record when the machine went down and came back up.  The total tires built by this worker per day was displayed and updated as the machine completed its cycle.

During this time my IT manager moved on to another job and was replaced by an MS Access programmer from the R&D department.  He obtained his database knowledge from technical classes and only had experience in MS Access.

Deployment Day

I was still a bit wet behind the ears back then.  I was under the impression that everything was going to roll out smoothly.  Fortunately, this system wasn't replacing anything, just adding capabilities to the plant, so a failed startup just meant the plant would still operate as before.  I activated the system, and the plant manager made the announcement that the workers would start using it on Monday of the following week.  Training was performed so the workers knew what was required, and the maintenance shop was on hand for any possible user issues.

Things were running smooth... but slow.  Real slow.  Apparently, the refresh rate of an Allen-Bradley PLC was not as fast as I anticipated; the system didn't scale up linearly due to limitations of the CPU and backplane.  So I decided to order another PLC and split the load across three PLCs instead of two.  I also had to buy a bigger cabinet for the PLC in the middle of the factory, since that's where I decided to put the third unit (right next to one of the others).  I did this to reduce the amount of rewiring, since 50 machines were wired to the unit at the front of the factory and 50 machines to the middle.  I also had to re-wire 17 machines from the front unit to the back unit.  The cost of all this equipment came out of my 7% contingency money, and the labor was free to my project (in-house labor).

I just learned a valuable lesson: overdesign.  Give a little extra.  Nobody complains if it runs too fast.


So now I had massive amounts of data streaming into a PC connected to the Allen-Bradley data network.  I had plenty of disk space; what I didn't have was a lot of querying power.  One of my constraints, set by the new IT manager, was that I had to use Visual Basic and MS Access.  There was some history of a programmer they hired in the past who used C++ and UNIX and created a lot of little programs that nobody understood or could fix.  Since my IT manager knew a little VB, his assumption was that he could always follow my work.  This was VB 6.0 at the time.  So I did it.  When I completed the project, it worked, but it was a bit sluggish.  My next project proposal was to add an Oracle or SQL database.  The factory already had an Oracle server being used for other purposes, so I proposed a database on the Oracle server.

Before I completed my research, I was contacted by a former colleague who was working for Building Technology Associates, Inc.  He told me he needed someone who could fix a problem involving two large databases that were incompatible with each other and maintained by two different IT groups within the same company.  I jumped at the new challenge.  When I left Pirelli, I provided detailed documentation of all the hardware, wiring, software and databases that I had built for the factory.  My IT manager reviewed these documents in anticipation of hiring my replacement.  His final comment was that he wished he had not constrained me to Visual Basic, since the programs I created were beyond his comprehension.  That was not intentional; I had no idea how weak his programming skills were, but I'm confident that a knowledgeable person could have picked up right where I left off.

The lesson to be learned is to leave good documentation.  That lesson I learned long before working for Pirelli.  My intention from day one was to leave a lasting impression that I was the best developer they ever hired, bringing value to their company and not headaches.

Accessing the Clipboard from JavaScript

Accessing the clipboard from JavaScript,
or how to implement copy and paste of a web page

I almost forgot about this feature until the other day when I was consolidating data records.  I scratched my head and said "gee, wouldn't it be nice to just hit a copy button, navigate to another record and press the paste button and have all the text fields filled with the data."  Then I remembered that I've used the system clipboard before.  So I dug around my code and came up with this little gem:

function CopyToClipboard() {
    // clipboardData is the IE-specific clipboard object
    if (window.clipboardData && window.clipboardData.setData) {
        var laFields = ['txtName', 'txtAddress1', 'txtCity'];
        var lsResult = '';

        for (var i = 0; i < laFields.length; i++) {
            var lsFieldData = document.getElementById(laFields[i]).value;

            if (lsResult != '')
                lsResult += '|';

            lsResult += lsFieldData;
        }

        window.clipboardData.setData("Text", lsResult);
    }
}

function PasteFromClipboard() {
    if (window.clipboardData && window.clipboardData.getData) {
        var lsData = window.clipboardData.getData("Text");

        // strip any line breaks picked up while the text was edited elsewhere
        lsData = lsData.replace(/(\r\n|\n|\r)/gm, "");
        var laData = lsData.split('|');

        if (laData.length > 0) {
            var laFields = ['txtName', 'txtAddress1', 'txtCity'];

            for (var i = 0; i < laData.length; i++)
                document.getElementById(laFields[i]).value = laData[i];
        }
    }
}
The code here is obviously JavaScript.  The "CopyToClipboard()" function should be called by the onclick event of the copy button and the "PasteFromClipboard()" function should be called from the onclick event of the paste button.

This sample is looking for three fields (the actual routine copies a dozen fields; I cut them down for this sample).  The fields are named "txtName", "txtAddress1" and "txtCity".  You could also implement this with a document.getElementsByName() call and iterate through all the input objects.  In my case, I simplified by specifying the ids of the input boxes I want to copy explicitly.
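
If you go the getElementsByName() route, the collection logic can be pulled out into its own function.  Here is a sketch of my own (not the routine from the post; the shared name attribute "copyField" is an assumption about your markup):

```javascript
// Build the pipe-separated string from any list of input elements,
// e.g. the result of document.getElementsByName('copyField').
// Each element only needs a .value property, so plain objects work too.
function JoinFieldValues(laElements) {
    var lsResult = '';
    for (var i = 0; i < laElements.length; i++) {
        if (lsResult != '')
            lsResult += '|';
        lsResult += laElements[i].value;
    }
    return lsResult;
}
```

CopyToClipboard() could then call JoinFieldValues(document.getElementsByName('copyField')) instead of looping over hard-coded ids.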

I used the pipe "|" symbol to separate values since there is the possibility that the data might contain commas or dashes.  Pipes are rarely used in text entry fields.
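
If a pipe could ever show up in the data, a small escape scheme keeps the round trip lossless.  This is a sketch of my own, not part of the original routine:

```javascript
// Escape backslashes first, then encode pipes as "\p" so a literal "|"
// in a field can't be confused with a separator.
function EscapeField(lsValue) {
    return lsValue.replace(/\\/g, '\\\\').replace(/\|/g, '\\p');
}

// Decode in a single pass so "\\" and "\p" can't be confused.
function UnescapeField(lsValue) {
    return lsValue.replace(/\\(\\|p)/g, function (lsMatch, lsChar) {
        return lsChar === 'p' ? '|' : '\\';
    });
}
```

Escape each field before joining with pipes, and unescape each piece after splitting.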

When you use this code you can click on the copy button and then open MS Word (or textpad) and paste into an empty sheet.  You'll see your data separated by pipe symbols in one line of text.  You can edit this text and copy it back to the clipboard and then paste into your form.

There are some complications with various browser versions.  I noticed that this does not work in Opera, Firefox or Chrome.  I'm still looking into the details of how to fix this problem, but for now I use this in IE (our customers are in-house people using IE 9).
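
For what it's worth, newer browsers have since added an asynchronous clipboard API (navigator.clipboard).  Here is a hedged sketch of a copy function that tries it first and falls back to the IE object; treat the exact permission behavior as browser-dependent:

```javascript
// Try the modern async clipboard API, fall back to IE's clipboardData.
// Returns a Promise in modern browsers, a boolean in IE, or null when
// no clipboard access is available (e.g. outside a browser).
function CopyTextCrossBrowser(lsText) {
    if (typeof navigator !== 'undefined' && navigator.clipboard &&
            navigator.clipboard.writeText) {
        return navigator.clipboard.writeText(lsText);
    }
    if (typeof window !== 'undefined' && window.clipboardData &&
            window.clipboardData.setData) {
        return window.clipboardData.setData("Text", lsText);
    }
    return null;
}
```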

For a more detailed discussion of accessing the clipboard from JavaScript, I would recommend this article:

Automated Testing

Back to the subject of automated testing...

As I've mentioned in previous posts, I have been struggling with unit tests and how to apply them to our legacy code.  Since I've been involved in the development of my company's software for 15 years, I know where our weak spots are when testing.  One weak spot involves database changes: any time we change tables, fields or constraints, we seem to have difficulty finding all the code that uses the altered tables.

So every time we change our data, I take great caution in the steps I perform to change the code to match.  The problem always seems to resurface: we always miss something.

I've analyzed this problem, and the difficulty stems from the use of search tools to find fields by name, tables by name, or places where insert and update queries are performed.  Inconsistent formatting of query strings, specialized queries and issues such as mixed-case characters can cause unanticipated search problems.  Let's face it: a missed item in a search does not return an error message; the search just silently misses points of interest.
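
To make the pitfall concrete, here is a tiny sketch (the table name and code snippets are hypothetical) of why a plain substring search misses things, and how a case-insensitive word-boundary regex does better:

```javascript
// A plain indexOf('Company') misses 'COMPANY' and 'company'; a
// case-insensitive regex with word boundaries finds the table name
// regardless of case while rejecting near-miss names like 'Companies'.
function ReferencesTable(lsCode, lsTable) {
    var loPattern = new RegExp('\\b' + lsTable + '\\b', 'i');
    return loPattern.test(lsCode);
}
```

Even this is only a heuristic: dynamically built query strings can still hide a table name from any text search, which is exactly why tests beat searching.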

My initial goal for automated testing was to try to get each query into a test harness.  Since our software is so large, it'll be a long time before we manage to put a test on every query, and we're not going to commission a development effort to apply tests to all of our legacy code.  Instead, we're going to add tests to each subsystem as we enhance our software.  The final goal is to be able to make a database change, run the tests and see what breaks first, then fix the code and tests to account for the database changes.  If the tests are correct and they provide enough coverage, this should work better than search and repair.

So far, the number of lines of code breaks down like this:

Which means that about half of our code is covered.  Also, 100% of our queries are exercised by 24 test methods.  Each test method contains several tests.  I've toyed with the idea of breaking each assert into an individual test, but there is no real point.  Each test method tests one method, which can contain more than one query.  The object of all the Assert calls is to test every capability that the method under test has.

So how did I do it?  I had to roll my own database mock-up object.  It wasn't too complicated, because our company maintains an Excel spreadsheet with the table definitions in it.  I use that sheet to generate a test database in my local MS SQL Server, run the tests, destroy the database and end.  It sounds slow, but I devised methods to create tables and set the data, so I don't actually create the entire database (just the tables I need to run the test).  I also created a method called "AddConstraints()".  This method runs down the constraint list and applies each constraint when both tables to be constrained exist.
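
The flavor of it, in a hedged sketch (the table names, columns and constraint list are made up; the real version reads the Excel sheet and executes the SQL against a local server):

```javascript
// Hypothetical table definitions, standing in for the Excel sheet.
var goTableDefs = {
    Company: 'CompanyId INT PRIMARY KEY, Name VARCHAR(100)',
    Contact: 'ContactId INT PRIMARY KEY, CompanyId INT'
};

// Hypothetical constraint list: each entry names the two tables involved.
var goConstraints = [
    { name: 'FK_Contact_Company', from: 'Contact', to: 'Company',
      column: 'CompanyId' }
];

// Generate the CREATE TABLE statement for one table.
function CreateTableSql(lsTable) {
    return 'CREATE TABLE ' + lsTable + ' (' + goTableDefs[lsTable] + ')';
}

// The AddConstraints() idea: only emit a constraint when both of its
// tables were actually created for this test run.
function AddConstraints(laCreatedTables) {
    var laSql = [];
    for (var i = 0; i < goConstraints.length; i++) {
        var loC = goConstraints[i];
        if (laCreatedTables.indexOf(loC.from) >= 0 &&
                laCreatedTables.indexOf(loC.to) >= 0) {
            laSql.push('ALTER TABLE ' + loC.from + ' ADD CONSTRAINT ' +
                loC.name + ' FOREIGN KEY (' + loC.column + ') REFERENCES ' +
                loC.to + ' (' + loC.column + ')');
        }
    }
    return laSql;
}
```

A test that only needs the Contact table creates just that table, and AddConstraints() quietly skips the foreign key because Company doesn't exist.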

I'm not going to go into details on my implementation in this blog post (maybe I'll get into that in a future post).  My implementation requires Aspose.Cells and MS SQL server.

What is left to test?

The next question to ask is: what is the code not covered by these tests?  Most of it is front-page code and wire-up code.  I diligently moved all the business logic into the methods of objects that are under the automated tests.  That still leaves 49% of the code that must be tested manually.

My next project will be to implement some sort of unit testing of the wire-up code.  I would like to be able to inject HTTP GET and POST requests to test this code.  After I'm satisfied with testing the wire-up code, the only thing left is the interface rendering code (mostly HTML and JavaScript).  This last piece will always be manual, since it takes a critical eye to see if the web page is rendered correctly.


I have a list of subjects I want to blog about.  Starting this blog was a bit slow because I didn't know precisely what I wanted to post.  So what I did was scan through the subjects I work with and build a list of things I do that could contribute to someone else's project.  Now I have quite a large list and not enough hours in the day to blog.  Therefore, I'm jumping from subject to subject on the weekends.

I plan to expand on this exact subject and how I pulled it off in a future blog post.  If you're interested in more details on this subject, leave me a message and I'll expedite it.

Saturday, April 13, 2013

The Perils of Bad Data

If you've spent any amount of time with a database, or dealt with customer service from "name your vendor," you'll have some experience with bad data.  Bad data is where somebody made a data entry mistake and the data sits in a database that is not actively managed.  Sometimes I get junk mail with my name spelled wrong.  That's bad data.  Sometimes I call a help desk and my account information is wrong.  That's bad data.  Junk mail for former employees (some of whom moved on 10 or more years ago) still arrives at my workplace.  That is bad data.

Is my company immune to bad data?  Oh, do I wish.  I was fixing a data problem last week involving a duplicated company record.  Our databases contain lists of companies that perform roof construction work for our clients.  Instead of fixing one record and moving on, I wanted to take a peek and see if this was a serious problem that had just gone undetected.  Sure enough, out of 1,200 company records, about 60 were obvious duplicates (company names were close, addresses were identical, etc.).

My job is about finding solutions to issues, and this is one of those issues.  My immediate solution is to fix the duplicate records by merging data from the duplicates into one master company record for each company.  This is not a long-term solution, but an immediate fix to keep on-going data operations from getting more corrupt than they already are.  Here are a few possible solutions I have devised off the top of my head:

1. Flag any new company records for review.  This will require someone to "own" the process of reviewing records.  I'm not proposing that the new records be locked until reviewed.  I'm proposing that the reviewing person merge duplicates if they occur, quickly, before a lot of new data is entered.

2. Design a report to list company records that have not been accessed in years (I'll need to come up with a realistic trigger point for this, 5 years? 10?).  The idea is to store a date in a field that signifies the last time someone looked at the record.  This could be tricky since bulk operations must be excluded and there is always the trap that someone stumbled into the record and decided they had the wrong company and left (marking the record as viewed recently).

3. Create a table or field to track how often a record is used.  This could get complicated, but it might be worthwhile to track frequency of viewing, which would include if the record was queried for other purposes (it might show up in somebody's personnel record).  Then the least viewed records can be examined to determine if the data is obsolete and should be refreshed.

4. Use the Levenshtein distance algorithm to measure how close a new company name is to existing company names.  Flag a possible duplicate when the distance between two names is small.
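
For reference, the standard dynamic-programming version of Levenshtein distance is short.  A sketch (the threshold below is a guess to be tuned against real data, and the duplicate check is my own framing, not an existing routine):

```javascript
// Classic Levenshtein distance: the minimum number of single-character
// insertions, deletions and substitutions needed to turn string a into b.
function Levenshtein(a, b) {
    var d = [];
    for (var i = 0; i <= a.length; i++) d[i] = [i];
    for (var j = 0; j <= b.length; j++) d[0][j] = j;
    for (var i = 1; i <= a.length; i++) {
        for (var j = 1; j <= b.length; j++) {
            var cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
            d[i][j] = Math.min(
                d[i - 1][j] + 1,        // delete from a
                d[i][j - 1] + 1,        // insert into a
                d[i - 1][j - 1] + cost  // substitute
            );
        }
    }
    return d[a.length][b.length];
}

// Flag a possible duplicate when the names are within a couple of edits
// of each other, ignoring case.
function LooksLikeDuplicate(lsNameA, lsNameB) {
    return Levenshtein(lsNameA.toLowerCase(), lsNameB.toLowerCase()) <= 2;
}
```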

I'm confident that the contact information in some of our records is long out of date, and I'm guessing that some of the companies in that list are gone as well.  I'm also going to have to do a quick analysis of all our databases to make sure bad company data hasn't propagated to other locations.

Bad data can cost your company man-hours, reduce the quality of your product and cause customer frustration.  Don't let it get out of hand.