Vary De-identified Names Across Clinical Data Using Caristix Cloak

Published: January 6th, 2012

One of our Cloak customers is de-identifying several GB of clinical data from several healthcare information systems (including two ADT systems and a lab system) at an IDN. This customer is asking some great questions, and the answers could help other Cloak users get more out of the software. Here’s an excerpt from our conversations. We’ll be posting new Q&As in the coming weeks.

Is there a means by which the names in a message can be de-identified, i.e. patient, physician, etc., without it being the SAME across the message? For example, using the names.xls spreadsheet (which is such a time-saver… oh man!), I’ve replaced the patient and the caregiver names. However, in my PV1 segment, I’m finding that all the caregivers are the same name as the patient.

You can use Excel files in Cloak to generate replacement data. For instance, I might have an Excel file listing cities and zip codes. Cloak will manage de-identification so that when a replacement zip code is chosen at random, you get a city associated with the zip code. That way, the data still make sense. The same technique lets you build an Excel file with names and genders, so that Cloak provides female first names to female patients.
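
To make the row-based idea concrete, here is a minimal Python sketch. It is not Cloak's code: the `ROWS` table and the `pick_replacement` function are hypothetical stand-ins for an Excel sheet and for whatever Cloak does internally. It only illustrates why picking a whole row at random keeps related values consistent.

```python
import random

# Hypothetical stand-in for an Excel sheet: each tuple is one row (zip, city).
ROWS = [
    ("10001", "New York"),
    ("60601", "Chicago"),
    ("94103", "San Francisco"),
]

def pick_replacement(rng):
    """Pick one row at random and return both of its columns together."""
    zip_code, city = rng.choice(ROWS)
    return {"zip": zip_code, "city": city}

replacement = pick_replacement(random.Random())
# Both values come from the same row, so the zip code always matches the city.
print(replacement["zip"], replacement["city"])
```

Because the random choice is made once per row rather than once per column, a replacement zip code can never be paired with a city from a different row.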

If you use the same Excel column to cover several fields, the same row (and therefore the same value) is used for all of them. That is why the caregivers in your PV1 segment end up with the same name as the patient.

To get a different name, you can do one of two things:
1. Open the Excel file and add a second column of names (physician names, for instance), and use that column for the physician fields. Each patient then gets the physician listed on the same row.
2. Copy the Excel file and edit the copy so that you have two separate files, one for patient names and one for physician names. Once you set the de-identification generator type parameters, the association between patient and physician is random.
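
The difference between the original behavior and the two options can be sketched in Python. This is an illustration under assumptions, not Cloak's implementation: the tables, the invented names, and the three functions are all hypothetical, and each function simply mimics one way of drawing a patient name and a physician name.

```python
import random

# Hypothetical stand-in for names.xls with an added physician column (option 1).
NAMES = [
    {"patient": "Alice Martin", "physician": "Dr. Chen"},
    {"patient": "Brian Lopez", "physician": "Dr. Patel"},
    {"patient": "Carla Nguyen", "physician": "Dr. Owens"},
]
# Hypothetical stand-in for a separate, copied file of physician names (option 2).
PHYSICIANS_FILE = ["Dr. Chen", "Dr. Patel", "Dr. Owens"]

def same_column(rng):
    """One column covers both fields: patient and physician get the same value."""
    row = rng.choice(NAMES)
    return row["patient"], row["patient"]

def extra_column(rng):
    """Option 1: one row, two columns. The pairing is fixed by the row."""
    row = rng.choice(NAMES)
    return row["patient"], row["physician"]

def separate_files(rng):
    """Option 2: two independent draws, so the pairing is random."""
    patient = rng.choice(NAMES)["patient"]
    physician = rng.choice(PHYSICIANS_FILE)
    return patient, physician

rng = random.Random()
for draw in (same_column, extra_column, separate_files):
    print(draw.__name__, draw(rng))
```

Option 1 is the better fit when a given patient should always be paired with the same physician; option 2 suits cases where any pairing is acceptable.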

Read more about using Excel files to generate replacement data in Cloak.
