In clinical research, we are typically used to dealing with “flat” datasets: the type of data tables that can be readily handled by Excel, SAS, or other statistical packages. To work most efficiently with database programmers, researchers are advised to describe the expected layout of the output data, for example by including an empty table shell.
Data output can take one of two forms:
“Long” Data Form: A “long” dataset has multiple rows per patient. Each row represents one observation (e.g., the heparin infused on a given day). A patient receiving multiple infusions during an admission will therefore have multiple rows with the same patient ID, laid out as in the table shell below.
Example of a table shell:
|Patient Name/MRN (or De-Identified ID)||Date||Total Heparin Infused|
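To make the long layout concrete, here is a minimal Python sketch. The patient IDs, dates, and heparin totals are made-up illustrative values, not data from any real source, and the tuple layout simply mirrors the three columns of the table shell above.

```python
from collections import Counter

# Hypothetical long-form records, one tuple per row:
# (patient_id, date, total_heparin_infused). All values are invented.
long_records = [
    ("P001", "2020-01-01", 18000),
    ("P001", "2020-01-02", 24000),
    ("P001", "2020-01-03", 12000),
    ("P002", "2020-01-01", 20000),
]

# In long form the same patient ID repeats, once per observation.
rows_per_patient = Counter(patient_id for patient_id, _, _ in long_records)

for patient_id, n_rows in rows_per_patient.items():
    print(patient_id, n_rows)
```

A patient with three infusion days occupies three rows, while a patient with one infusion day occupies a single row.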
“Wide” Data Form: A “wide” dataset has only one row for each patient. Each repeated observation for that patient is stored in its own column, so the table becomes “wider” as observations are added.
|Patient Identifier||Datetime1||Glucose Value Time 1||Datetime2||Glucose Value Time 2||Datetime3||Glucose Value Time 3|
|341234|| ||70 mg/dL|| ||124 mg/dL|| ||137 mg/dL|
|1234123412|| ||180 mg/dL|| ||145 mg/dL|| ||210 mg/dL|
|12341324|| ||302 mg/dL|| ||287 mg/dL|| ||176 mg/dL|
|2341234123|| ||145 mg/dL|| ||199 mg/dL|| ||197 mg/dL|
Tip: Ask your programmer which format is easier to compile. Generally speaking, the long form is easier to produce and takes less time to deliver. Statistical software can be used to transform long data into wide data and vice versa.
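The conversion between the two forms can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the field names (`patient_id`, `time_index`, `glucose_mg_dl`) and the `glucose_time_N` column naming are hypothetical, and the glucose values are taken from the first two rows of the wide table above. Statistical packages offer their own reshape commands, which will usually be more convenient than hand-written code.

```python
from collections import defaultdict

# Long form: one row per (patient, measurement). Field names are assumptions.
long_rows = [
    {"patient_id": "341234", "time_index": 1, "glucose_mg_dl": 70},
    {"patient_id": "341234", "time_index": 2, "glucose_mg_dl": 124},
    {"patient_id": "341234", "time_index": 3, "glucose_mg_dl": 137},
    {"patient_id": "1234123412", "time_index": 1, "glucose_mg_dl": 180},
    {"patient_id": "1234123412", "time_index": 2, "glucose_mg_dl": 145},
    {"patient_id": "1234123412", "time_index": 3, "glucose_mg_dl": 210},
]


def long_to_wide(rows):
    """Collapse long rows into one row per patient, one column per time point."""
    wide = defaultdict(dict)
    for row in rows:
        column = f"glucose_time_{row['time_index']}"
        wide[row["patient_id"]][column] = row["glucose_mg_dl"]
    return [{"patient_id": pid, **values} for pid, values in wide.items()]


def wide_to_long(rows):
    """Expand each glucose_time_N column back into its own row."""
    long = []
    for row in rows:
        for column, value in row.items():
            if column.startswith("glucose_time_"):
                long.append({
                    "patient_id": row["patient_id"],
                    "time_index": int(column.rsplit("_", 1)[1]),
                    "glucose_mg_dl": value,
                })
    return long


wide_rows = long_to_wide(long_rows)
round_trip = wide_to_long(wide_rows)

for row in wide_rows:
    print(row)
```

Because the round trip returns the original long rows, no information is lost in either direction, which is why it is usually safe to request whichever form the programmer finds easier to deliver.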