Data analysis

  • Obtain appropriate programming and statistical support given your expertise
    • Do not underestimate the time for data cleaning and data analysis
    • May require the vast majority of programmer time
  • Be aware of tools available to assist with analyzing data
  • Look at methods/statistical reports available through various database projects
    • For HCUP data, read the Methods Series – methodology and programming code
    • Medicare/Medicaid data – ResDAC website has links to statistical resources and technical publications
    • SEER/Medicare data – NCI Health Services and Economics website – analytic support
    • MarketScan Data – publications in literature
  • Think about population-based rates and how to obtain them
    • Age, gender, rural/urban location, region, median household income
    • Payer (enrollment files are available for Medicare and Medicaid but not for HCUP)
  • Think about how you will handle missing data – impute, or drop the encounter/person from analysis
    • If dropping the person/encounter, consider the potential bias when information is not missing at random
  • Create tables that you will populate to help guide your analysis and clarify what you are trying to obtain from the data
  • Remember that with these large databases, statistical significance is frequently obtained – clinical significance is not guaranteed
  • Consider analytic techniques that can address some (not all) of the limitations of administrative data analysis
    • Multivariable analysis, propensity scores, instrumental variable analysis
  • DOCUMENT all analytic decisions regarding coding of variables and process of selecting population
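
One quick check before dropping records with missing data is to compare the outcome between records with and without the missing field. The sketch below is illustrative only: the field names (`income`, `readmitted`) and the toy records are hypothetical, and a gap between the two groups is a warning sign rather than proof of bias.

```python
def rate(rows, key):
    """Share of rows where row[key] is truthy; None if there are no rows."""
    if not rows:
        return None
    return sum(1 for r in rows if r[key]) / len(rows)

# Hypothetical encounters; 'income' is missing (None) for some of them.
encounters = [
    {"income": 52000, "readmitted": False},
    {"income": None,  "readmitted": True},
    {"income": 61000, "readmitted": False},
    {"income": None,  "readmitted": True},
    {"income": 48000, "readmitted": True},
    {"income": None,  "readmitted": False},
]

# Split on whether the covariate is present, then compare the outcome.
complete = [r for r in encounters if r["income"] is not None]
missing = [r for r in encounters if r["income"] is None]

print(f"readmission rate, income present: {rate(complete, 'readmitted'):.2f}")
print(f"readmission rate, income missing: {rate(missing, 'readmitted'):.2f}")
```

If the two rates differ noticeably, dropping the incomplete records changes the outcome mix of the analytic sample, which is the bias the missing-data bullet warns about.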
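
The point about statistical versus clinical significance can be illustrated with a small calculation. This is a sketch under stated assumptions: a hypothetical 0.05-day difference in mean length of stay, a common standard deviation of 3 days treated as known, and a simple two-sided z-test. None of these numbers come from a real database.

```python
import math

def two_sample_z(mean_a, mean_b, sd, n_per_group):
    """Two-sided z-test for a difference in means, assuming a known common SD
    and equal group sizes."""
    se = sd * math.sqrt(2.0 / n_per_group)   # standard error of the difference
    z = (mean_a - mean_b) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided p-value under the normal
    return z, p

# Same tiny difference (hypothetical), evaluated at two sample sizes.
results = {}
for n in (1_000, 1_000_000):
    results[n] = two_sample_z(5.05, 5.00, 3.0, n)
    z, p = results[n]
    print(f"n per group = {n:>9,}: z = {z:.2f}, p = {p:.3g}")
```

The difference is identical in both runs; only the sample size changes. With a million records per group the p-value becomes very small even though a 0.05-day difference is unlikely to matter clinically.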
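
One way to document the process of selecting the population is to log the record count before and after every inclusion or exclusion rule. The following is a minimal sketch, not a prescribed method: the `CohortLog` class, the field names, and the toy records are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, object]

@dataclass
class CohortLog:
    """Applies selection rules in order and records how many records each removes."""
    steps: List[dict] = field(default_factory=list)

    def apply(self, records: List[Record], rule: str,
              keep: Callable[[Record], bool]) -> List[Record]:
        kept = [r for r in records if keep(r)]
        self.steps.append({
            "rule": rule,
            "before": len(records),
            "after": len(kept),
            "dropped": len(records) - len(kept),
        })
        return kept

# Hypothetical toy records; field names are illustrative only.
records = [
    {"id": 1, "age": 70, "dx": "I21"},
    {"id": 2, "age": 45, "dx": "I21"},
    {"id": 3, "age": 82, "dx": None},
    {"id": 4, "age": 67, "dx": "J18"},
]

log = CohortLog()
cohort = log.apply(records, "age >= 65", lambda r: r["age"] >= 65)
cohort = log.apply(cohort, "diagnosis code present", lambda r: r["dx"] is not None)

# The log doubles as the attrition table for the methods section.
for step in log.steps:
    print(f'{step["rule"]}: {step["before"]} -> {step["after"]} '
          f'(dropped {step["dropped"]})')
```

Keeping the rule text next to the counts means the written record of analytic decisions is produced by the same code that makes them, so the two cannot drift apart.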