infectionNet - for people who manage infections

Better data collection

Ambiguity is the root of all evil when you’re collecting data. Although the differences between '07 July 2010', 'July 07, 2010' and '07/07/2010' are insignificant when writing a letter, they are very significant when importing into a database. If some of your data collectors write '07 July' while others write 'July 07', you’re going to end up spending time rewriting it. The same goes for names, identification numbers, nursing units, facilities, and almost any other field you can think of.

For example

Mary Smith | 111222333444 | July 7, 2010 | Unit 5 | General Hospital
Smith Mary | 111222333444 | 07/07/2010 | 5th Floor | General

To a database, these are completely different records. This is an easy fix for a programmer, but what are the odds you have one of those on hand? Luckily, there are a few simple things you can do to get rid of bad data like this.
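To see why, here is a minimal Python sketch that compares the two sample rows field by field, the way a literal text comparison would. The field names (name, id, date, unit, facility) are my own labels for illustration; they are not part of any real form.

```python
# The two sample rows from above, split into fields on " | ".
row_a = "Mary Smith | 111222333444 | July 7, 2010 | Unit 5 | General Hospital".split(" | ")
row_b = "Smith Mary | 111222333444 | 07/07/2010 | 5th Floor | General".split(" | ")

# Hypothetical field labels, used only to make the output readable.
field_names = ["name", "id", "date", "unit", "facility"]

# A literal comparison only matches fields whose text is identical.
matches = {name: a == b for name, a, b in zip(field_names, row_a, row_b)}

print(matches)
print("Same record?", row_a == row_b)
```

Only the identification number matches. Everything else would have to be cleaned up by hand or by a program before the two rows could be recognised as the same patient.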

Simple steps

  1. Tell them what you want
    If you want someone to fill in a date, tell them how you want it written. On all my data collection forms, I always write 'DD-MMM-YYYY', so for my example above, I’ll always get 07-Jul-2010. When I want a name, I tell people I want ‘Lname’ and ‘Fname’, in that order. I get the same result every time.

  2. Ask only one question per field
    I saw a field on a collection form last week that asked: “Were antibiotics given? Were they appropriate?” Logically, in your head, this is one field. For data collection purposes, however, these represent two distinct fields. Even if you end up with hundreds of fields, you need to make sure that each field addresses only one question.

  3. Take out all the guesswork
    This one sounds obvious, but it gets overlooked all the time, and it will save you major headaches down the road. Putting a simple (Y/N) after your questions will clean up a lot of your data issues. For example, questions like ‘Were antibiotics given? (Y/N)’ and ‘Were the antibiotics appropriate? (Y/N)’ will yield useful answers. If these questions aren’t black and white, leave room for a ‘Comments’ field.
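If your forms are filled in electronically, the three steps above can also be checked automatically at entry time. The following is a rough Python sketch, not a finished tool. The field names (date, antibiotics_given, antibiotics_appropriate) and the function names are assumptions made up for this example, and the date check assumes English month abbreviations.

```python
import re
from datetime import datetime

# Exactly two digits, a three-letter month such as "Jul", and four digits.
DATE_PATTERN = re.compile(r"\d{2}-[A-Z][a-z]{2}-\d{4}")


def is_valid_date(text):
    """Return True only for real dates written as DD-MMM-YYYY, e.g. 07-Jul-2010."""
    if not DATE_PATTERN.fullmatch(text):
        return False
    try:
        # %b assumes English month abbreviations (the default "C" locale).
        datetime.strptime(text, "%d-%b-%Y")
    except ValueError:
        return False
    return True


def is_valid_yes_no(text):
    """Return True only for a bare Y or N, with no notes attached."""
    return text in ("Y", "N")


def check_entry(entry):
    """Return the names of the fields that break the rules (hypothetical field names)."""
    problems = []
    if not is_valid_date(entry["date"]):
        problems.append("date")
    for field in ("antibiotics_given", "antibiotics_appropriate"):
        if not is_valid_yes_no(entry[field]):
            problems.append(field)
    return problems


good = {"date": "07-Jul-2010", "antibiotics_given": "Y", "antibiotics_appropriate": "N"}
bad = {"date": "July 7, 2010", "antibiotics_given": "Yes, but late", "antibiotics_appropriate": "N"}

print(check_entry(good))
print(check_entry(bad))
```

The first entry passes with no problems. The second is flagged for its date and for the note squeezed into a Y/N field, which is exactly the kind of thing that belongs in a separate ‘Comments’ field.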

Get it right the first time

This sounds like absolutely basic information that anyone would know, yet I see reams of poorly constructed forms and piles of unusable data every day. Imagine if you had records of 400 MRSA cases, all with different date formats, all with little notes in Y/N fields, some with full patient names, some with just patient initials, etc. It’s a nightmare that’s going to cost you time to fix.
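To give a feel for what that cleanup involves, here is a small Python sketch that tries to convert mixed date formats into DD-MMM-YYYY. The list of formats is an assumption about what might appear in the data, and the function name is made up for this example. Note that a date like 03/04/2010 cannot be resolved by code alone: the sketch simply assumes day first, which may be wrong for some of your collectors.

```python
from datetime import datetime

# Formats assumed to appear in the messy data; this list is illustrative only.
KNOWN_FORMATS = [
    "%d-%b-%Y",   # 07-Jul-2010
    "%B %d, %Y",  # July 7, 2010
    "%d %B %Y",   # 07 July 2010
    "%d/%m/%Y",   # 07/07/2010, assumed to be day first
]


def normalize_date(text):
    """Return the date as DD-MMM-YYYY, or None if no known format matches."""
    for fmt in KNOWN_FORMATS:
        try:
            parsed = datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue  # Not this format; try the next one.
        return parsed.strftime("%d-%b-%Y")
    return None


for raw in ["July 7, 2010", "07 July 2010", "07/07/2010", "sometime in July"]:
    print(raw, "->", normalize_date(raw))
```

Anything that comes back as None still has to be checked by a person against the original chart, and that is the time you save by asking for one format up front.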

The difference between this:

Mary Smith | 111222333444 | July 7, 2010 | Unit 5 | General Hospital | Antibiotics were given, but they were not appropriate

and this

Smith | Mary | 111222333444 | 07-Jul-2010 | 5th Floor | General | Y | N

is huge. Especially when there are 400+ Mary Smiths.