Objective Data for a Polarizing Subject

Benjamin Ian Witter
Nov 19, 2020

Working for transparency on a murky subject.

I was excited to work for an organization focused on human rights, and Human Rights First says it all in the name. They are a non-profit focused on human rights in the USA, which is a sensitive topic for many people, as some believe there is no work to be done. A group dedicated to pushing the nation to live up to its own hype is something I can get behind.

The specific issue my team was tasked with was bringing transparency to police use of force in America. My part in this project was acquiring and preparing the data to be presented. The end product would be a website mapping the locations and dates of as many instances of police use of force as we could find reliable data for.

A source of inspiration and data: MappingPoliceViolence.org

Because it was such a volatile subject, I was a bit concerned about how this site might be used, but as a publicly available resource with objective data, I felt it was ethical to work on. As I worked, I realized there was a hidden issue with the project. The data we could show would present the regions and departments that were the most honest in the worst light, and the departments that do not share their data as blameless. I wanted to work on this project to increase transparency, not punish it. I voiced my concern and left it to the web design team to include warnings about how the information appears.

Example of data formatting

Step one of my work was always going to be locating data. Thankfully, much of this work had been done by previous teams, and I was able to use journalistic sources for additional data. My next task was to prepare the data so it could be formatted into something usable. Much of the cleaning had already been done, and I was very grateful to the previous teams. After a short time cleaning data that had been missed, I set about compiling all the data into one usable set. This was the most time-consuming step: although the data was clean, each source had its own structure, so I had to reformat the data while combining the individual parts. There were occasional errors in the data (dates, coordinates), so I had to correct those as well to keep the data clean, uniform, and usable.
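To give a sense of what that combining step can look like, here is a minimal sketch in plain Python. It is not the project's actual code: the source names, the column names in `COLUMN_MAPS`, the accepted date formats, and the helper functions are all hypothetical, chosen only to illustrate renaming each source's columns to one shared layout and flagging dates or coordinates that do not parse.

```python
from datetime import datetime

# Hypothetical column mappings: the real sources and their field names
# are not listed in this post.
COLUMN_MAPS = {
    "source_a": {"Date": "date", "City": "city", "State": "state",
                 "Lat": "lat", "Lon": "lon"},
    "source_b": {"incident_date": "date", "location_city": "city",
                 "location_state": "state", "latitude": "lat",
                 "longitude": "lon"},
}

# Date layouts assumed for illustration only.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def parse_date(raw):
    """Return an ISO date string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def parse_coord(raw, low, high):
    """Return the coordinate as a float, or None if missing or out of range."""
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return None
    return value if low <= value <= high else None


def normalize(record, source):
    """Rename one source record's fields to the shared layout and clean them."""
    mapping = COLUMN_MAPS[source]
    row = {target: record.get(original) for original, target in mapping.items()}
    row["date"] = parse_date(row["date"] or "")
    row["lat"] = parse_coord(row["lat"], -90.0, 90.0)
    row["lon"] = parse_coord(row["lon"], -180.0, 180.0)
    row["source"] = source  # keep track of where each row came from
    return row


def combine(sources):
    """Merge records from every source into one uniformly shaped list."""
    combined = []
    for source, records in sources.items():
        combined.extend(normalize(r, source) for r in records)
    return combined
```

Rows whose date or coordinates come back as `None` are the ones that would need a manual look, which matches the kind of occasional corrections described above.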

The final total of incidents found 1995–2020

I coordinated with the web backend designer to make sure the data would be received in a usable form. The data I worked on was used to populate the database that feeds the frontend. This was static data; the API created by the rest of the Data Science team adds updated data onto the existing set.
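As a rough illustration of adding updated data onto an existing static set, the sketch below appends only the rows that are not already present. This is an assumption about how such a merge could work, not a description of the team's API: the `incident_key` fields and the `append_updates` helper are hypothetical.

```python
def incident_key(row):
    """Hypothetical identity for an incident: same date, place, and coordinates."""
    return (row.get("date"), row.get("city"), row.get("state"),
            row.get("lat"), row.get("lon"))


def append_updates(existing, updates):
    """Return the existing rows plus any update rows not already in the set."""
    seen = {incident_key(r) for r in existing}
    merged = list(existing)  # leave the original static list untouched
    for row in updates:
        key = incident_key(row)
        if key not in seen:
            seen.add(key)
            merged.append(row)
    return merged
```

Keying on a handful of fields is a simplification; a real pipeline would likely need a more careful definition of what counts as the same incident.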

Trello for coordinating tasks

After completing my primary tasks, I was able to provide source information as well as write some code to expose more information from the API. I worked with the web team to help where I could and to suggest directions for where we could go with the data.

Where I Left the Project

We were able to add many features and realize a lot of the work done in preparation by previous teams.

  1. We were able to send real data from the Data Science API, which is accepted by the web team's backend.
  2. We populated the map with an extensive base of data, giving the project more weight than a sparsely populated product would have.
  3. We added more data types to the base of the project to allow more information to be displayed to a user.

Future teams working on this project will have more categories of data, and data that is easier to manipulate, to draw on when building features. Likely additions include searching the map by type of force or other criteria. Another feature may be showing the types of force used in each city and their prevalence. The next team may even decide to include an actual predictive model, with the site passing data from the user rather than just adding updated data. I foresee trouble with the limited data available from reliable sources, as well as the limited coverage of that data across the USA. I would recommend looking for publicly available police websites, which might provide even more data from a broader sample of America.

In the end, I feel even stronger as a team member because I was able to communicate with almost every other member of the team and offer assistance. I was able to explain the limitations of the data and why we could not implement certain features or provide a global model, even when other members were pushing for them. I think this experience will give me some perspective for my future work. I hope working on such a topical project will make me attractive to other organizations that can see how challenging the data work was.
