Define Layout of Output Data

In clinical research, we typically work with “flat” datasets: data tables that can be readily handled by Excel, SAS, or other statistical packages. To work with database programmers most efficiently, however, researchers should describe the expected layout of the output data, for example by including an empty table shell.

Data output can take one of two forms:

“Long” Data Form: A “long” dataset has multiple rows per patient. Each row represents one observation (e.g., heparin infused on a given day). A patient receiving multiple infusions during an admission will have multiple rows with the same patient ID, as shown below.

Example of a table shell:

Patient Name/MRN (or De-Identified ID) | Date      | Total Heparin Infused
1234234                                | 4/12/2011 | 10 mL
1234234                                | 4/13/2011 | 15 mL
2341231                                | 4/12/2011 | 6 mL
2341231                                | 4/13/2011 | 8 mL
2341231                                | 4/14/2011 | 5 mL

 

“Wide” Data Form: A “wide” dataset has only one row for each patient. Each observation for that patient is stored in its own column (or set of columns), so the table becomes “wider” as the number of observations grows.

Example:

Patient Identifier | Datetime1 | Glucose Value Time 1 | Datetime2 | Glucose Value Time 2 | Datetime3 | Glucose Value Time 3
341234             |           | 70 mg/dL             |           | 124 mg/dL            |           | 137 mg/dL
1234123412         |           | 180 mg/dL            |           | 145 mg/dL            |           | 210 mg/dL
12341324           |           | 302 mg/dL            |           | 287 mg/dL            |           | 176 mg/dL
2341234123         |           | 145 mg/dL            |           | 199 mg/dL            |           | 197 mg/dL

 
Tip: Discuss with your programmer to determine which format is easier to compile. Generally speaking, however, the LONG form is easier and will take less time to deliver. Statistical software can be used to transform “long data forms” into “wide data forms” and vice versa.
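To illustrate the transformation mentioned in the tip, here is a minimal sketch in plain Python that reshapes the heparin example from long to wide form and back. The column names (patient_id, date, heparin_ml, and the numbered date1/heparin_ml1 columns) and both helper functions are hypothetical, chosen only for this illustration. Statistical packages provide their own built-in reshaping tools, which would normally be used instead.

```python
from collections import defaultdict

# Long form: one row per patient per day (values from the heparin table shell above).
# Rows are assumed to already be in date order within each patient.
long_rows = [
    {"patient_id": "1234234", "date": "4/12/2011", "heparin_ml": 10},
    {"patient_id": "1234234", "date": "4/13/2011", "heparin_ml": 15},
    {"patient_id": "2341231", "date": "4/12/2011", "heparin_ml": 6},
    {"patient_id": "2341231", "date": "4/13/2011", "heparin_ml": 8},
    {"patient_id": "2341231", "date": "4/14/2011", "heparin_ml": 5},
]


def long_to_wide(rows):
    """Collapse long rows into one record per patient with numbered columns."""
    by_patient = defaultdict(list)
    for row in rows:
        by_patient[row["patient_id"]].append(row)

    wide = []
    for patient_id, entries in by_patient.items():
        record = {"patient_id": patient_id}
        # The i-th observation becomes the column pair date{i} / heparin_ml{i}.
        for i, entry in enumerate(entries, start=1):
            record[f"date{i}"] = entry["date"]
            record[f"heparin_ml{i}"] = entry["heparin_ml"]
        wide.append(record)
    return wide


def wide_to_long(records):
    """Expand each wide record back into one row per numbered column pair."""
    rows = []
    for record in records:
        i = 1
        while f"date{i}" in record:
            rows.append({
                "patient_id": record["patient_id"],
                "date": record[f"date{i}"],
                "heparin_ml": record[f"heparin_ml{i}"],
            })
            i += 1
    return rows


wide_rows = long_to_wide(long_rows)
for record in wide_rows:
    print(record)
```

In this sketch, a patient with fewer observations simply has fewer numbered columns. In a real wide table those cells would be left blank, which is one reason the wide form can be more work to compile.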
