Step 1: Define the Study Objective

State the purpose of the study (the study objective) and identify its goals.  This initial step may include a quick check of the database's capabilities (e.g., the available data fields) to ensure that the study design and conclusions are consistent with what the database can support, although this check is not a formal requirement of the first step.

Step 2: Identify the Data Elements

Define the clinical elements required to answer the study objective.  Each clinical element either corresponds to a single data element in the database or is derived from multiple data elements.  Each data element is extracted from the database to help define the patient groups or the other units of comparison.  Data elements cannot be changed, so define them as clearly as possible.

Examples of Data Elements
  • Unique identifier number (MRN)
  • Date of Birth
  • Gender
  • Date of service (Unique Visit Number: CSN)
  • Primary diagnosis (ICD10 Code)
  • Secondary diagnosis (ICD10 Code)
  • Medication Information (name, strength, National Drug Code, quantity, days supply)
  • Clinical Laboratory Report
  • Physician Services
  • Pharmacy Report
  • Duration of hospitalization
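
As an illustration only, a subset of the data elements listed above could be captured in a single record type before any analysis begins. The class name, field names, and sample values below are hypothetical and do not come from any particular database schema. The record is frozen to reflect that extracted data elements are not meant to be changed.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class VisitRecord:
    """One extracted row: a single visit for a single patient (hypothetical layout)."""
    mrn: str                              # unique patient identifier
    csn: str                              # unique visit number
    date_of_birth: date
    gender: str
    date_of_service: date
    primary_dx: str                       # ICD-10 code
    secondary_dx: Optional[str] = None    # ICD-10 code, if any
    medications: tuple = ()               # medication names listed for the visit
    length_of_stay_days: Optional[int] = None


# Made-up example values, for illustration only.
example = VisitRecord(
    mrn="0001",
    csn="V100",
    date_of_birth=date(1960, 5, 17),
    gender="F",
    date_of_service=date(2004, 3, 2),
    primary_dx="E11.9",
    medications=("medication A",),
)
print(example.mrn, example.primary_dx)
```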

Step 3: Identify and Apply Specific Inclusion/Exclusion Criteria

Eligible participants must be identified for further analysis.  To create a list of eligible participants, inclusion/exclusion criteria are applied to extracted data. 

The criteria should be defined based on the structure of the database elements. Apply each criterion one at a time and examine the results between steps to check for anything unexpected and adjust accordingly.

For example:  

Exclude: Patients pregnant at visit
Include: Patients with medication A

First, individuals who were pregnant at the visit are removed from the extracted data.  Second, of those remaining, only individuals who have medication A listed are kept.
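
The example can be sketched in Python as a minimal illustration of applying criteria one at a time and recording the count after each step. The patient rows, field names, and the `apply_criteria` helper are hypothetical and made up for this sketch.

```python
# Made-up extracted rows; field names are assumptions for illustration.
patients = [
    {"mrn": "0001", "pregnant_at_visit": False, "medications": ["medication A"]},
    {"mrn": "0002", "pregnant_at_visit": True, "medications": ["medication A"]},
    {"mrn": "0003", "pregnant_at_visit": False, "medications": ["medication B"]},
    {"mrn": "0004", "pregnant_at_visit": False, "medications": ["medication A", "medication B"]},
]

# Each criterion is a label plus a function that returns True for rows to keep.
criteria = [
    ("Exclude: pregnant at visit", lambda p: not p["pregnant_at_visit"]),
    ("Include: medication A", lambda p: "medication A" in p["medications"]),
]


def apply_criteria(rows, criteria):
    """Apply criteria in order and record how many rows remain after each step."""
    attrition = [("Extracted", len(rows))]
    for label, keep in criteria:
        rows = [row for row in rows if keep(row)]
        attrition.append((label, len(rows)))
    return rows, attrition


eligible, attrition = apply_criteria(patients, criteria)
for label, remaining in attrition:
    print(f"{label}: {remaining} remaining")
```

Reviewing the count after each step makes it easier to spot a criterion that removes far more, or far fewer, patients than expected.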

Database studies require that control groups be created. Control groups must be well matched for disease, diagnoses, comorbid conditions, age, and gender. Differences in patient characteristics (e.g., age, comorbidities, and severity of illness), pharmaceutical therapy, and other important clinical differences between the groups may then be summarized and reported.
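
A minimal sketch of summarizing patient characteristics for two groups is shown below. It does not perform any matching. It only reports each group's size, mean age, and percentage of female patients so that differences can be inspected. The `summarize` helper, the field names, and the data are hypothetical.

```python
from statistics import mean


def summarize(group):
    """Return basic descriptive characteristics for one group of patients."""
    n = len(group)
    return {
        "n": n,
        "mean_age": round(mean(p["age"] for p in group), 1),
        "pct_female": round(100 * sum(p["gender"] == "F" for p in group) / n, 1),
    }


# Made-up groups, for illustration only.
treated = [{"age": 64, "gender": "F"}, {"age": 58, "gender": "M"}, {"age": 71, "gender": "F"}]
control = [{"age": 62, "gender": "F"}, {"age": 60, "gender": "M"}, {"age": 70, "gender": "M"}]

treated_summary = summarize(treated)
control_summary = summarize(control)
print("Treated:", treated_summary)
print("Control:", control_summary)
```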

Step 4: Complete a Preliminary Data Query and Review Results

During the preliminary database query, an initial analysis determines the total number of patients remaining after the inclusion/exclusion criteria are applied.  Determine whether this number meets the sample size required by the predefined power analysis.  Data elements and patient characteristics are also reviewed as part of the initial analysis, for example, the number and percentage of patients who meet the inclusion criteria, have the diagnosis of interest, etc.
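
The checks in this step might look like the following sketch. The counts and the required sample size are made-up numbers, and `preliminary_summary` is a hypothetical helper. In practice the required sample size would come from the study's own power analysis.

```python
def preliminary_summary(n_extracted, n_eligible, n_with_diagnosis, required_n):
    """Summarize a preliminary query and compare it with the required sample size."""
    return {
        "pct_eligible": round(100 * n_eligible / n_extracted, 1),
        "pct_with_diagnosis": round(100 * n_with_diagnosis / n_eligible, 1),
        "meets_required_n": n_eligible >= required_n,
    }


# Made-up counts, for illustration only.
summary = preliminary_summary(
    n_extracted=5000,
    n_eligible=1200,
    n_with_diagnosis=300,
    required_n=800,
)
print(summary)
```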

Step 5: Create and Modify Analysis Variables

Data analysis and summarization may require the modification of existing variables or the creation of new calculated variables.  For example, the patient's age at the date of visit is calculated as the difference between the date of service and the date of birth.  The newly calculated age may then need to be placed in a bin to set an age range.
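
The age example can be sketched as follows. The bin boundaries are assumptions chosen for illustration. A real study would use ranges that suit its objective.

```python
from datetime import date


def age_at_visit(date_of_birth, date_of_service):
    """Age in completed years on the date of service."""
    years = date_of_service.year - date_of_birth.year
    # Subtract one year if the birthday has not yet occurred in the service year.
    if (date_of_service.month, date_of_service.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


# Hypothetical age ranges as (lowest age, highest age, label).
AGE_BINS = [(0, 17, "0-17"), (18, 44, "18-44"), (45, 64, "45-64"), (65, 200, "65+")]


def age_bin(age):
    """Return the label of the age range that contains the given age."""
    for low, high, label in AGE_BINS:
        if low <= age <= high:
            return label
    raise ValueError(f"age out of range: {age}")


age = age_at_visit(date(1960, 5, 17), date(2004, 3, 2))
print(age, age_bin(age))
```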

Step 6: Apply the Appropriate Statistical Tests

The last step is to determine how the patient database responds to the study objective.  The appropriate statistical tests are applied to evaluate the significance of differences in the study population.  When reporting study results, it is important to distinguish statistical significance from practical significance.  A study may show a statistically significant result, but it is still important to interpret how the data element relates to the population in question.
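
To illustrate the gap between statistical and practical significance, the sketch below uses a two-proportion z-test. This test is an assumption chosen for the example, not a test prescribed by this process, and the event counts are made up. With very large groups, a difference of a fraction of a percentage point can still produce a small p-value.

```python
import math


def two_proportion_z_test(events_a, n_a, events_b, n_b):
    """Return (absolute difference in proportions, two-sided p-value)."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (events_a + events_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_a - p_b, p_value


# Made-up counts: 5.2% versus 5.0% event rates in two groups of 100,000 patients.
difference, p_value = two_proportion_z_test(5200, 100_000, 5000, 100_000)
print(f"Absolute difference: {difference:.3%}, p-value: {p_value:.3f}")
```

Whether a difference of this size matters clinically is a separate judgment that the p-value alone does not answer.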

References:

  1. Johnson N. The six-step process for conducting outcomes analyses using administrative databases. Formulary. 2002;37:362-64.
  2. Thomas L. A review of statistical power analysis software. Bulletin of the Ecological Society of America. 1997;78(2):126-39.
  3. Lenth R. Some practical guidelines for sample size determination. Department of Statistics, University of Iowa. 2001.
  4. Lorence DP, Ibrahim IA. Benchmarking variation in coding accuracy across the United States. J Health Care Finance. 2003;29:29-42.
  5. Motheral B, Brooks J, Clark MA, et al. A checklist for retrospective database studies—report of the ISPOR Task Force on Retrospective Databases. Value Health. 2003;6:90-97.
  6. Motheral BR, Fairman KA. The use of claims databases for outcomes research: rationale, challenges, and strategies. Clin Ther. 1997.
  7. Sax MJ. Essential steps and practical applications for database studies. Supplement to Journal of Managed Care Pharmacy. January 2005;11:1.