Last updated: June 10, 2020
CPP collects daily data from 53 sources: each state prison system, ICE, the Federal Bureau of Prisons, and Puerto Rico. Over the past few months, we have carefully considered how to define our data. Each system reports its data differently, defines its data differently, and includes different data points. Some states include community supervision in their counts, and some include jail systems. Many states’ data reporting systems also change frequently. So, in the name of data integrity and transparency, we have created a data dictionary, both for our team and for anyone viewing and using the data. It is a living document that we update continually. You can find it here: https://docs.google.com/document/d/1XKLrAtT2UEsQ_fcicwWJRnHCh6zR5eZptG06bjtfEZg/edit?usp=sharing
For each state, our data dictionary includes:
Where the data was sourced from, with a link to the system’s data page.
The system’s definition of its data and which variables it reports.
CPP’s definitions of the data. Sometimes systems remove positive cases once they are “recovered”; we add them back in to get a cumulative count. Sometimes systems don’t report cumulative testing numbers but instead report the number of pending, positive, and negative tests; we add those up to get a total. In the data dictionary, we detail the specific process we use for each state.
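As a rough illustration of the two adjustments described above, here is a minimal Python sketch. It is not CPP's actual pipeline: the class name, field names, and numbers are all hypothetical, and it assumes a system publishes its "recovered" count separately so that it can be added back in.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SystemReport:
    """One day's figures as a system might publish them (hypothetical fields)."""
    active_positive: int                          # positives still listed by the system
    recovered: int = 0                            # cases the system removed as "recovered"
    pending_tests: Optional[int] = None
    positive_tests: Optional[int] = None
    negative_tests: Optional[int] = None
    reported_total_tests: Optional[int] = None    # present only if the system reports it


def cumulative_positive(report: SystemReport) -> int:
    """Add recovered cases back in to get a cumulative positive count."""
    return report.active_positive + report.recovered


def cumulative_tests(report: SystemReport) -> Optional[int]:
    """Use the system's own total if it reports one; otherwise sum the parts."""
    if report.reported_total_tests is not None:
        return report.reported_total_tests
    parts = (report.pending_tests, report.positive_tests, report.negative_tests)
    if any(p is None for p in parts):
        return None  # not enough information to build a total
    return sum(parts)


# Made-up numbers, for illustration only.
example = SystemReport(
    active_positive=40,
    recovered=15,
    pending_tests=10,
    positive_tests=55,
    negative_tests=200,
)
print(cumulative_positive(example))  # 55
print(cumulative_tests(example))     # 265
```

Returning None when a component is missing, rather than treating it as zero, keeps an incomplete report from being mistaken for a low testing total.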
If you have questions about the data dictionary or thoughts on how we can improve it, please reach out to us!