Dissertation

My doctoral dissertation focused on measuring and predicting emergency department (ED) use. This retrospective, longitudinal, observational study was designed to: 1) analyze the types and frequency of ED use in two commercially insured populations; 2) develop prediction models of ED use based on administrative (insurance claims) data; and 3) use additional data (medical records, payor/provider data, and neighborhood socioeconomic status) to improve the accuracy of claims-based models.

We used medical claims data to identify diagnosis codes for ED visits, which were used to categorize those visits according to the New York University Emergency Department Algorithm. The nonemergent, emergent but primary-care treatable, and emergent but preventable/avoidable categories defined primary-care sensitive (PCS) ED visits.
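
As an illustration of how a PCS visit count can be estimated from diagnosis codes, the sketch below sums, for each visit, the category weights that fall into the three PCS categories. It assumes the algorithm supplies, for each diagnosis code, a set of weights across categories. The diagnosis codes (DX_A, DX_B), the weight values, and the function names are hypothetical placeholders, and this is not necessarily the calculation method proposed in the dissertation.

```python
# Illustrative only: these codes and weights are made-up placeholders,
# not values from the NYU Emergency Department Algorithm tables.
PCS_CATEGORIES = ("nonemergent", "pc_treatable", "preventable")

ILLUSTRATIVE_WEIGHTS = {
    "DX_A": {"nonemergent": 0.5, "pc_treatable": 0.25,
             "preventable": 0.0, "not_preventable": 0.25},
    "DX_B": {"nonemergent": 0.0, "pc_treatable": 0.0,
             "preventable": 0.25, "not_preventable": 0.75},
}


def pcs_weight(dx_code, weights=ILLUSTRATIVE_WEIGHTS):
    """Return the share of a visit attributed to the PCS categories.

    Codes without an entry in the weight table contribute 0.0
    (an assumption of this sketch).
    """
    category_weights = weights.get(dx_code)
    if category_weights is None:
        return 0.0
    return sum(category_weights[c] for c in PCS_CATEGORIES)


def estimated_pcs_visits(visit_dx_codes, weights=ILLUSTRATIVE_WEIGHTS):
    """Sum PCS weights over one person's ED visits (one code per visit)."""
    return sum(pcs_weight(dx, weights) for dx in visit_dx_codes)
```

Under this approach a person's estimated number of PCS ED visits can be fractional, since each visit contributes only its PCS share.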

We obtained two datasets: one from a Massachusetts managed-care network (MCN) with data from 2009-11, and one from the Truven Health Analytics MarketScan database with data from 2007-08. All persons had commercial insurance. In the MCN sub-study, we studied 107,449 individuals in Massachusetts who were enrolled for at least one base-year month and one prediction-year month and were assigned to a participating PCP in the MCN. In the MarketScan sub-study, we studied 15,136,261 individuals in the US who were enrolled for at least one base-year month and one prediction-year month.

In both sub-studies, the claims data were used to create morbidity scores using DxCG software. In the MCN sub-study, we linked the claims data to patients’ electronic medical records (EMRs), which contained additional information about patients’ medical history and health behaviors (via problem lists). We geocoded the patients’ addresses to Census tracts using ArcGIS. Census tract information was used to assign neighborhood-level characteristics, such as income, to patient records. We also used data on patients’ primary care practices regarding the subspecialty of primary care and the quality of both the practice and the provider.

We used the data to: 1) calculate the prevalence of any ED use, number of ED visits, and estimated number of PCS ED visits; 2) build and validate ED risk prediction models using administrative claims data; 3) compare the performance of models predicting any ED use, number of ED visits, and PCS ED use; and 4) refine these models by adding patient characteristics from EMRs, neighborhood characteristics, and provider characteristics.

This process allowed us to determine the predictors of different measures of ED use and create predictive ED risk scores (percentiles of risk) that could be used to identify patients at highest risk of ED use during the subsequent 6-month and 12-month periods. We found that the PCS ED use measure, for which we propose a new method of calculation, was more discriminating, showed higher sensitivity to neighborhood-based risk factors, and was both conceptually and statistically preferable as an outcome measure. We demonstrated that adding neighborhood-level risk factors and payor/provider data improves the accuracy of prediction models. We also found that age, living in poorer neighborhoods, prior ED use, and morbidity are the strongest predictors of ED use.
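
The conversion of model output into percentiles of risk can be sketched as follows. This is a minimal illustration, assuming each patient already has a predicted risk from some model. The function names, the example scores, and the percentile cutoff are hypothetical and are not taken from the dissertation.

```python
from bisect import bisect_right


def risk_percentiles(predicted_risk):
    """Map each patient ID to a percentile of risk (0-100].

    The percentile is the percentage of patients whose predicted risk is
    less than or equal to this patient's, so tied patients share a value.
    """
    if not predicted_risk:
        return {}
    ordered = sorted(predicted_risk.values())
    n = len(ordered)
    return {
        patient_id: 100.0 * bisect_right(ordered, score) / n
        for patient_id, score in predicted_risk.items()
    }


def flag_high_risk(predicted_risk, min_percentile=90.0):
    """Return sorted IDs of patients at or above the chosen percentile."""
    percentiles = risk_percentiles(predicted_risk)
    return sorted(
        patient_id
        for patient_id, pct in percentiles.items()
        if pct >= min_percentile
    )


# Hypothetical predicted risks for four patients.
example_scores = {"p1": 0.1, "p2": 0.2, "p3": 0.3, "p4": 0.4}
high_risk = flag_high_risk(example_scores, min_percentile=75.0)
```

In practice the cutoff would depend on how many patients a care-management program can reach; the 75th and 90th percentiles used here are arbitrary.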

Developing accurate predictive models for ED use will enable patients deemed at higher risk (and/or their providers) to be targeted for education and care management. Additionally, ED prediction models could help create performance measures that reward providers for delivering better care and better access to care for their patients.