United States of America's CTO Wants You to Kick Ass with Big Data

I recently watched an 8-minute TechCrunch interview with the United States of America’s Chief Technology Officer, Todd Park, that got me really excited. It turns out that the Federal government has a lot of free data. In the interview, Mr. Park encourages developers and entrepreneurs to download these data for the purpose of building new products, services, and companies. Park emphasizes that the President of the United States has fully endorsed the idea that key datasets be made available to the public. The Obama administration recently announced its “Big Data Research and Development Initiative,” which includes more than $200 million in new commitments to Big Data projects. As Park states in the interview, the government wants entrepreneurs to use the free data to “… kick ass and create useful services for people…” I’d like to try.

Free Data from Data.Gov

So, being the data lover that I am, I examined the different types of data sets on the data.gov site. The data cover a broad range of topics, from Energy and Education to Safety and Health, each including various types of data sets on a given topic. If you like data, have a flair for product development or just like solving problems, I highly recommend you browse the list of free data sets available for download.

I downloaded six data sets from the health.gov site.  Each data set contained unique metrics for each hospital. The six data sets were:

  1. Survey of Patients’ Hospital Experience: Percent of respondents who gave a top-box response (e.g., “Always”; an overall rating of 9-10; “Yes, definitely recommend”) across seven customer experience questions and two patient loyalty questions.
  2. General Hospital Information: Describes the hospital type and the owner.
  3. Outcome Measures: Includes three mortality rates and three readmission rates, one each for heart attack, heart failure, and pneumonia.
  4. Process of Care Measures: 12 measures related to surgical care improvement.
  5. Hospital Acquired Condition (HAC) Measures: Percent of patients who acquire an HAC.
  6. Medicare Spend per Patient: This measure shows whether Medicare spends more, less, or about the same per Medicare patient treated in a specific hospital, compared to how much Medicare spends per patient nationally.

My Big Data and Patient Experience Management

Analyzing each data set separately would provide insight about the metrics it contains. What percentage of hospitals falls into each hospital type? What is the average patient rating across hospitals? What is the typical mortality rate across all hospitals? What is the average Medicare spend across hospitals? While the answers to these questions do provide value, the true value of Big Data lies in understanding the relationships (in a statistical sense) among different variables. By understanding relationships among different metrics, you can build predictive models that help explain the reasons behind the numbers (e.g., Are mortality rates related to patient satisfaction? Do efficient hospitals deliver better service?).

To understand the relationships among different variables, I merged the six data sets together into one Big Data set; so, in the basic form, this super data set included 4610 hospitals on which I had all the metrics from each data set, including patient satisfaction, mortality rate, and Medicare spend. Using this Big Data set, I will be able to examine how the variables are related to each other, building predictive models of patient satisfaction/loyalty ratings. The analysis of these different metrics may help hospitals understand how to deliver a better patient experience through customer experience management practices.
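To make the merge step concrete, here is a minimal sketch of joining hospital files on a shared hospital identifier. I have not described how the merge was actually done, so treat this as one plausible approach: the key name `provider_id`, the column names, and every value below are made up for illustration, and only two toy tables stand in for the six data sets. It keeps only hospitals that appear in every table, which matches the idea of a super data set where each hospital has all the metrics.

```python
# Toy rows standing in for two of the downloaded files.
# ASSUMPTION: the key "provider_id", the column names, and all values
# are hypothetical and are not taken from the real data sets.
patient_survey = [
    {"provider_id": "A001", "pct_top_box_overall": 71.0},
    {"provider_id": "A002", "pct_top_box_overall": 64.0},
    {"provider_id": "A003", "pct_top_box_overall": 80.0},
]
outcomes = [
    {"provider_id": "A001", "heart_attack_mortality": 15.2},
    {"provider_id": "A003", "heart_attack_mortality": 14.1},
    {"provider_id": "A004", "heart_attack_mortality": 16.8},
]


def merge_on_key(tables, key):
    """Inner-join a list of tables (lists of dicts) on a shared key.

    Only hospitals present in every table are kept, so each merged row
    carries the full set of metrics.
    """
    # Index each table by hospital id for fast lookup.
    indexed = [{row[key]: row for row in table} for table in tables]

    # Keep the ids that appear in all tables.
    shared_ids = set(indexed[0])
    for index in indexed[1:]:
        shared_ids &= set(index)

    merged = []
    for hospital_id in sorted(shared_ids):
        combined = {}
        for index in indexed:
            combined.update(index[hospital_id])
        merged.append(combined)
    return merged


merged = merge_on_key([patient_survey, outcomes], key="provider_id")
for row in merged:
    print(row)
```

Hospitals missing from any one file drop out of the result, so it is worth comparing the merged row count against each source file to see how much data the join discards.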

My Analytics Plan

In upcoming posts, I will present the analysis of these hospital data. I am not an expert in patient care but I do understand the metrics well enough to give it the ol’ college try. In my analyses, I will try to accomplish a few things. Here are three that immediately come to mind.

  1. Create Meaningful Patient Metrics. To accomplish this, I will look at many metrics simultaneously via a factor analysis. This approach will help me see if I can aggregate/combine some questions together into a single metric (e.g., average all seven patient experience ratings into one metric). The ultimate goal is to create a metric that is reliable, valid and useful.
  2. Understand Predictors of Patient Satisfaction.  I will use correlational and regression analysis to understand the drivers of patient loyalty. In addition to using patient experience ratings in the analyses, I will also be able to include objective hospital metrics (e.g., mortality rates, process measures, Medicare spend) to understand many more factors that could impact patient loyalty.
  3. Understand Merits of Different Hospital Metrics. How do you measure the quality of a hospital? Is patient satisfaction/loyalty the best hospital metric? Is mortality rate? By simultaneously looking at different performance metrics for many hospitals, we can help understand what each metric means in the context of all other metrics. Creating an overall hospital quality metric can only be accomplished when we understand how all metrics are related to each other.
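As a rough illustration of the first two items, the sketch below averages several patient experience ratings into one composite score, checks how consistently the ratings hang together, and correlates the composite with an objective hospital metric. This is not the factor analysis or regression described above: it swaps in Cronbach's alpha as a simpler reliability check and a plain Pearson correlation as a first look at a relationship. The column names and all values are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance

# Each row is one hypothetical hospital: three patient experience ratings
# (percent top-box) plus one objective metric.
# ASSUMPTION: all names and values are invented, not real hospital data.
hospitals = [
    {"nurse_comm": 78, "doctor_comm": 80, "cleanliness": 70, "readmission": 19.5},
    {"nurse_comm": 65, "doctor_comm": 68, "cleanliness": 60, "readmission": 21.0},
    {"nurse_comm": 85, "doctor_comm": 83, "cleanliness": 79, "readmission": 18.2},
    {"nurse_comm": 72, "doctor_comm": 75, "cleanliness": 66, "readmission": 20.1},
    {"nurse_comm": 90, "doctor_comm": 88, "cleanliness": 84, "readmission": 17.6},
]
RATING_ITEMS = ["nurse_comm", "doctor_comm", "cleanliness"]


def cronbach_alpha(rows, items):
    """Internal-consistency reliability of a set of items.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(items)
    item_vars = [variance([row[item] for row in rows]) for item in items]
    total_var = variance([sum(row[item] for item in items) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mean_x, mean_y = mean(xs), mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Composite patient experience score: the mean of the rating items.
composite = [mean(row[item] for item in RATING_ITEMS) for row in hospitals]
readmission = [row["readmission"] for row in hospitals]

alpha = cronbach_alpha(hospitals, RATING_ITEMS)
r = pearson_r(composite, readmission)

print(f"Cronbach's alpha for the rating items: {alpha:.2f}")
print(f"Correlation of composite with readmission rate: {r:.2f}")
```

A high alpha suggests the ratings can reasonably be averaged into one metric, while the sign and size of the correlation give a first hint about whether an objective measure moves with patient experience. With real data, a factor analysis would still be needed to check whether the ratings reflect one underlying dimension or several.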

If you have any ideas on how I can analyze these data, I would love to hear them.

I will be watching the Health Data Initiative (HDI) Forum, also known as the Health Datapalooza, via webcast on June 5 and 6 to learn what other entrepreneurs are doing in the area of healthcare data. The HDI is a public-private collaboration that encourages innovators to utilize health data to develop applications that raise awareness of health system performance and spark community action to improve health.