My Reflections on Data

10/30/2020

By Richard Lin

As is typical of pandemic times, my internship at BAG has been largely atypical. If I had told myself a year ago that I would be working at the Gleaners, my past self would have expected a ten-week summer internship, with fields and crops and an office. In reality, I have experienced almost none of those things. The office is instead my home; “ten weeks” have lasted well into October; and for me, food isn’t found in banana boxes and gleaning trips, but in numbers and spreadsheets.

In almost every way, then, I have departed from the known and the comfortable, and I have so many thoughts about navigating the unusual. But I don’t have unlimited space, so for now I am glossing over the difficulties of home-office life (after all, I am sure many of us have lived through it firsthand) and the struggles I have had juggling thesis writing and my project at BAG (it’s my first time working on one long-term, sustained, self-led project, let alone two). In the short space I have, I want to talk about something that is at the heart of my work here at BAG. Something dreary and beautiful, noisy and elegant: data.

In our modern, technologically ingrained lives, data is monolithic. It is the engine behind everything from county funding decisions to the incredibly accurate targeted ads you see on Google. Yet, in its ubiquity, it becomes mundane. We are so deeply situated in data, so surrounded by its effects, that we easily forget about the massive efforts required to make it useful—and, when it does become useful, the truly remarkable and versatile things that it can do.

Of course, I may be speaking too broadly, and you may in fact be in constant appreciation and awe of data. But personally, when I started my data project at BAG, I ran into some fundamental difficulties. For context, the task at hand was essentially to address the following question:

“How can we use data from the distribution records that we and partner agencies have to better the food system?”

And all of a sudden, data wasn’t something that directly served me. It wasn’t something that built and supported the structures that surrounded my life. It was, instead, a giant heap of numbers and letters that had no structure to it. And so, I ran into my first realization, and the big problem.

Data doesn’t do anything. Someone has to do something to it.

Given any amount of thought, this statement may seem incredibly obvious. But generally, when we interface with data, the system around it has already been built. It becomes a real mountain to climb, however, when no previous work has been done. Most raw data won’t scream out any discernible conclusions. You have to go searching for them instead.

So then…where do you start? Well, at first, I had a big spreadsheet listing all the food pantries that the Greater Boston Food Bank serves. So I just began calling up each pantry and getting information about its food stocks and demands. And in talking to the pantry directors, in learning their systems and terminology and procedures, I began to understand which things were important to focus on and which could be shelved for another time. Using that, I dove into the distribution data that each partner organization had given me. Of course, each organization’s spreadsheets had different categories and phrasing, so I had to digest each dataset, get clarification from its sources, and make compatible what was initially incompatible. I spent my whole summer doing this (in fact, I am still doing it), and it was through this process that I had my second realization, and with it, a solution to the big problem and a process for my project.
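For readers curious what "making compatible what was initially incompatible" can look like in practice, here is a minimal Python sketch. It is not the actual process used at BAG: the column names, the alias table, and the two tiny example spreadsheets are all made up for illustration. The idea is simply to map each partner's column headings onto one shared set of names before combining the rows.

```python
import csv
import io

# Hypothetical alias table: each partner's heading -> one shared column name.
COLUMN_ALIASES = {
    "pantry name": "pantry_name",
    "agency": "pantry_name",
    "pounds distributed": "pounds",
    "lbs": "pounds",
}

def read_csv_text(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))

def harmonize(rows):
    """Rename known columns to the shared names and parse pounds as a number.

    Columns that are not in the alias table are dropped. A row without
    any pounds column raises KeyError, which flags it for a follow-up call.
    """
    cleaned = []
    for row in rows:
        out = {}
        for key, value in row.items():
            name = COLUMN_ALIASES.get(key.strip().lower())
            if name is not None:
                out[name] = value.strip()
        # Strip thousands separators such as "1,200" before converting.
        out["pounds"] = float(out["pounds"].replace(",", ""))
        cleaned.append(out)
    return cleaned

# Two made-up partner spreadsheets with different headings and number styles.
partner_a = 'Pantry Name,Pounds Distributed\nExample Pantry A,"1,200"\n'
partner_b = "Agency,LBS\nExample Pantry B,850\n"

combined = harmonize(read_csv_text(partner_a)) + harmonize(read_csv_text(partner_b))
print(combined)
```

In real data the alias table is the hard part: it only gets filled in by asking each organization what its categories actually mean.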

Data is as deep as you let it be. And its uses are what you make of it.

Refraining from making any grand analogies to life here, I really did appreciate this realization. Because unlike the first realization, it’s deceptively NOT obvious. Given a list of hundreds of pantry locations, what do you do with it? Well, you can plot it:

But then what? You could run a density analysis:
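As a rough illustration of the idea, one very simple form of density analysis is to lay a grid over the map and count how many pantries fall in each cell. The Python sketch below does only that; it is an assumption about what a basic version could look like, not the method behind the maps in this post (mapping software typically offers smoother techniques). The coordinates and the cell size are invented.

```python
import math
from collections import Counter

def grid_density(points, cell_size):
    """Count points per square grid cell.

    Each (lat, lon) pair is assigned to the cell whose index is the
    floor of the coordinate divided by the cell size, in degrees.
    """
    counts = Counter()
    for lat, lon in points:
        cell = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        counts[cell] += 1
    return counts

# Made-up pantry coordinates, for illustration only.
pantries = [
    (42.36, -71.06),
    (42.37, -71.07),
    (42.33, -71.12),
    (42.41, -71.02),
]

density = grid_density(pantries, cell_size=0.05)
busiest_cell, busiest_count = density.most_common(1)[0]
print(busiest_cell, busiest_count)
```

Cells with high counts are where pantries cluster; cells with none, once compared against where people live, are candidates for a closer look.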

And then, at the same time, you can download some publicly accessible census data, run that through a couple of algorithms (which you can borrow or develop yourself), and plot that too:

And then you can compare that to the pantry locations, and see which neighborhoods are potentially being underserved. Or, you could plot the distribution data you received from various organizations, and compare that instead:

And you can show this on maps, like I’ve done above. Or you could do it with charts:

But it all came from data sets that looked like this (blurred because of contact information):

Or this:

And here is my third, and final, realization.

Data is dreary and beautiful, noisy and elegant. And it is immensely powerful.

If you can sift through the overwhelming amounts of raw information and piece it all together, you can create something that delivers information clearly and accessibly. You can discover something that you didn’t already know, or confirm something that you had a hunch about. And given all that, you can then do further things: send food, feed mouths. And at the heart of it all, there is data.
