How to Work with a Programmer
You’ve submitted your DISCOVERY form and been assigned to a programmer. Below are key tips to help you set the right expectations in the process of working with the programmer.
- Miscommunication can and will happen. Health services researchers, clinicians, and programmers often use different technical terms for the same data. For example, when researchers say “database,” they may mean a table or flat dataset that can be analyzed and manipulated in programs like Excel and SAS. For a programmer, however, a database is the “relational” database where the clinical data is stored and maintained.
- Becoming familiar with how the data is presented and where it is located on the application side (e.g. Eclipsys, WebCIS) will be helpful when making your data request.
- The programmer may discover a better source of the data you’re looking for.
- Be aware that the process of obtaining data is iterative when working with a programmer.
- Expect that the dataset you receive may be missing certain elements.
- You will likely need to meet with the programmer more than once to clarify your request point by point.
- You may think of additional variables to add to the query later; be patient and polite when asking for them.
- If possible, speak with another researcher who has gone through the DISCOVERY process before. This will help you to better understand what is and can be expected. You may refer to the “How to Find a Researcher to Work With” section for tips.
- Be aware that your perception of how the data are structured may be wrong.
- The data generated for you will likely arrive as a separate dataset for each variable (e.g., one spreadsheet with all patient lab glucose values, another with all patient blood pressure values).
- Similar to when you are using a dataset (e.g. NHANES) for the first time, you will likely have to spend time understanding the codebook and values delivered.
- Expect to invest time in error checking and cleaning the data once it is received from the programmer.
- Be realistic about turnaround times for data requests. The structure of the Clinical Data Warehouse places inherent limits on how programmers can access, retrieve, and format data.
- Parsing of free-text notes is complex and often requires natural language processing, which is not always feasible or reliable. Programmers will be able to provide the notes, but parsing them would require the requester to hire an analyst or seek other solutions.
- Be aware of different naming and organizational conventions: how the data is entered, stored, and delivered can differ. For example, in one setting viewed from the clinician’s side, the patient’s prescription information may be spread across four dropdown menus and a free-text note. On the programmer’s end, however, the system aggregates the different medications into a single cell, separating individual medications with commas.
- Be aware that extreme data values are possible. Knowing the plausible ranges for your variables of interest can help you filter out errors from data entry.
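
Several of the tips above (separate datasets per variable, medications aggregated into one comma-separated cell, and extreme values) can be illustrated with a short Python sketch. Everything in it is hypothetical: the column names, the sample rows, and the plausible ranges are made up for illustration and do not reflect the actual layout of any DISCOVERY delivery or any clinical guidance. It also assumes one row per patient in each dataset, which real deliveries with repeated measurements will not satisfy.

```python
from collections import defaultdict

# Hypothetical per-variable deliveries: one list of rows per variable,
# mimicking "one spreadsheet per variable". All names and values are made up.
glucose_rows = [
    {"patient_id": "P1", "glucose_mg_dl": 95.0},
    {"patient_id": "P2", "glucose_mg_dl": 9500.0},  # likely a data-entry error
]
bp_rows = [
    {"patient_id": "P1", "systolic_mm_hg": 120.0},
    {"patient_id": "P3", "systolic_mm_hg": 135.0},
]
medication_rows = [
    {"patient_id": "P1", "medications": "metformin, lisinopril"},
    {"patient_id": "P2", "medications": "atorvastatin"},
]

# Illustrative ranges only; choose real limits with clinical input.
PLAUSIBLE_RANGES = {
    "glucose_mg_dl": (10.0, 2000.0),
    "systolic_mm_hg": (40.0, 300.0),
}


def merge_by_patient(*tables):
    """Combine per-variable tables into one record per patient.

    Assumes at most one row per patient per table; a later row for the
    same patient and field would overwrite the earlier one.
    """
    merged = defaultdict(dict)
    for table in tables:
        for row in table:
            fields = {k: v for k, v in row.items() if k != "patient_id"}
            merged[row["patient_id"]].update(fields)
    return dict(merged)


def split_medications(cell):
    """Split a comma-separated medication cell into a list of names."""
    return [name.strip() for name in cell.split(",") if name.strip()]


def flag_implausible(record, ranges=PLAUSIBLE_RANGES):
    """Return the names of variables whose values fall outside their range."""
    flagged = []
    for variable, (low, high) in ranges.items():
        value = record.get(variable)
        if value is not None and not (low <= value <= high):
            flagged.append(variable)
    return flagged


patients = merge_by_patient(glucose_rows, bp_rows, medication_rows)
for patient_id, record in sorted(patients.items()):
    meds = split_medications(record.get("medications", ""))
    print(patient_id, meds, flag_implausible(record))
```

Flagging values rather than deleting them keeps the original delivery intact, so you can review suspicious entries with the programmer before deciding what to exclude. Splitting on commas is also only a first pass: it will break if a medication name or dose itself contains a comma.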