5 Questions with Colin Koopman, Author of “How We Became Our Data”
From banking to social media, our lives are becoming ever more entwined with our data, and questions about the truthfulness and privacy of our records feel increasingly pressing. In his new book, How We Became Our Data, Colin Koopman takes us back to where this explosion of record-keeping all started. To understand more about our data and why it’s so important, we asked Colin a few questions. He shows us how central data is to so much of our lives and what it has to do with cultural phenomena from redlining to The Great Gatsby and Buzzfeed quizzes.
First, can you give us a quick introduction to the “informational person”? Who does this term describe?
Informational persons live through, are recognized by, and reflect on themselves in terms of their data. How We Became Our Data describes how, in a period of a few quick decades in the early twentieth century, it became obligatory for us to live as informational persons. Why does this matter for us now? In short, because we are poignantly aware of how much of our lives is today transacted in terms of data. Consider social media selfhood and mass surveillance dossiers as two convenient exemplars. The repeated scandals of Facebook and the NSA are equally disturbing even if in profoundly different ways. That we feel scandalized, that we know that something is wrong when a company gives away our personal data or exploits it, or when a government conducts clandestine surveillance on hundreds of millions of individuals, is a function in part of how central data is to our lives, our living together, and our prospects for continuing to do so.
Scandalized though we are, we typically feel as if there is nothing we can do, or nothing to be done, when we, or others, or all of us, are subject to infopower. Our lack of confidence and resolve stems from our lack of understanding about what it means and how it matters for data to be installed in the very kernel of our lives. How We Became Our Data looks to gain such understanding by investigating the history of that installation process. The main focus of this investigation is on the power that data have—that is, on how we informational persons are the subjects of what I call “infopower.”
You discuss how central data is to our daily lives. We offer up our data in a variety of ways: when we make purchases, create accounts, attend school, travel, and find employment. Is all this data ultimately necessary? Would it be possible for our society to function without such extensive record-keeping?
Is data necessary for society in some absolute sense? No, but nothing seems to be. Is data necessary for our society? Absolutely. There have of course been societies in the past without extensive informational infrastructure. There will be such societies again in some distant future day. But for the horizon of our times, the societies that we know across the modern and developing world depend deeply on data systems.
Consider the U.S. context, which is the main focus of the book. Nearly every feature of the American dream as it is envisioned today depends on data. What has long been the quintessence of that dream, a home of one’s own, depends on an extensive data apparatus spanning from recordkeeping in title to mortgage markets to the credit instruments through which those markets function. Reams of carefully curated data are obligatory for anyone who wants to be a homeowner in the U.S. today. Why does this matter? Well, it matters to you personally if you want to get a mortgage. But why it really matters is for political reasons of justice and fairness. When data systems become obligatory, they become a terrain for political inequality. For example, in one chapter of the book, I trace how racial data (or what I call the “datafication of race”) came to be an explicit feature of home appraisal, mortgage markets, and government-subsidized lending. One result was the launch of racially discriminatory lending practices in the 1930s, exemplified by, but not isolated to, infamous ‘redlining’ maps. Decades later, glaringly unequal rates of homeownership across racial lines persist to this day. The American dream, we all know, is a fantasy for some, and an injustice to others. That injustice was perpetrated in part through systems of data. By no means is data the whole story behind racial segregation. But in an age of Big Data we need to remember how innocent it felt so many decades ago to deploy information systems into which racial bias was being coded. Of course, that feeling of innocence, just like that bias, continues to be reproduced today.
In the book, you locate the beginning of informational personhood in the period running from the mid-1910s to the mid-1930s. What kinds of data were being collected then and why?
Domain after domain was subject to datafication in these decades. It was an exuberant age of information creation, curation, and computation. The long decade of the 1920s was roaring with data, alongside all the jazz, baseball, and alcohol. We can even see how the most famous literary artifact of the period, The Great Gatsby, was premised in part on an informational exploit. The ordinary James Gatz fashioned himself into the searing personality of Jay Gatsby in precisely that decade in which it was becoming increasingly difficult for Americans to become someone other than who their records say they are.
In the book, I trace this extensive process of datafication across three domains that remain crucial to us today. I noted above my chapter on datafication of race as it featured in homeownership and financial markets. Another chapter of the book examines the emergence of personality psychology as a stable and respectable field of science. The history here begins in 1917, the year of the first-ever personality questionnaire of a type that remains familiar to anyone who has taken a Myers-Briggs test or even most Buzzfeed quizzes. By 1937, the field could be consolidated into the first textbook on personality psychology, written by Gordon Allport of Harvard University. A third historical chapter in the book charts the emergence of the standard birth certificate as the gold standard of documentary identity, a process that was initiated by the Census Bureau in 1903 and was deemed completed by 1933. Prior to this period, Americans were registered at birth sporadically and unevenly. It was across these decades, then, that a Gatsby conducting their life outside of official databases became increasingly fantastic. Now, we are all expected to have birth certificates. The rule is proven by the exception, for those who do not have birth certificates or other such paperwork are today so readily subject to the political burdens of being undocumented. There is a history behind how seemingly innocent scraps of paper have become a terrain for some of today’s most massive injustices.
Concerns about privacy with social media and surveillance feel particularly pressing. How—and why—might we protect our data?
This is a question I am frequently asked when I give talks about the book. It is a natural question for all of us to have on our minds today. But I think there are two issues here—one is less interesting to me while the other feels much more pressing. The less interesting question is the one rooted in personal concerns about one’s own data. How do I protect my data? That question is, to be sure, important, but it is one that has a ready answer. Just Google it—or better yet, search for it at a privacy-protective search engine like DuckDuckGo over a browser like Firefox Focus piped through a Tor-enabled internet connection that you can set up using Orbot for your Android phone. That said, we clearly need stronger legal remedies to back up these technological fixes. And we need them most of all for those who are too vulnerable to have easy access to some of these technologies.
This takes us to the better question, which is the one you, in fact, asked—the question not about ‘me’ but about ‘we’. How do we protect our data? This question enables us to think about data as a social resource and as a social terrain upon which burdens and benefits might be unequally distributed. This question enables us to think about the ways in which different information architectures, and different data formats, shape the way that computational systems and processes like algorithms produce information about who we are. If an algorithm produces data that is used to help make a decision about us, then those data can be subject to audit to see if they are being implemented in a way that burdens some populations while benefitting others. If so, we can then consider those audits as part of a series of legal and moral questions about discriminatory uses of data by private corporations, public agencies, and even in some contexts isolated individuals. This, of course, is just one of many questions we might ask about how our data can be protected. For all of these questions, we need to refuse to give priority to the self-centered question of how to protect myself in a digital ecosystem where so many others are left vulnerable. We need to prioritize questions about how we can design fairer information ecologies that do not reproduce and deepen already-existing social inequalities.
In the final chapter, you briefly nod to methods of resistance for countering infopower. As people who are shaped by our data, how might we resist the politics of information and reclaim control over the production of our personhood?
Well, again here, I think we need to frame the question of resistance as a political question that concerns all of us, rather than as a personal question about how I can protect myself along the wild frontiers of unregulated information systems. The political question is urgent. It is also, unfortunately, a question that is currently sharper than any actual answer to it that has been offered. In other words, I think we are still in a phase of diagnosis when it comes to the bugs and exploits internal to infopower. We need to understand what these bugs are, that is, in what ways data systems function as a terrain where power can be exercised. Once we have an adequate diagnosis of our condition, we can then begin to think seriously about repair. My sense is that there are today numerous piecemeal efforts underway for implementing better data designs. These efforts are happening on all three legs of the stool of political reform: law, technology, and education. There are research labs conducting audits of algorithmic systems. There are design teams looking to code privacy-enhancing options into information architectures. There are investigators working to open up data that should be available to all. There are others who are devising ways to educate future generations about keeping certain kinds of personal data purposefully obscured. Yet none of these efforts looks like an overall solution.
So far, nobody has been able to connect together all these disparate efforts into a synoptic perspective that looks like an overall solution. That said, maybe we do not need an overall solution. Maybe we just need the extraordinary energy of lots and lots of talented individuals devoting themselves to working out lots of little solutions to the manifold potential problems that our information systems are producing. If so, then probably what really matters most will be education for the next generation of data designers and information architects. We need to find ways to teach the tens of thousands of data scientists building machine learning applications to be cognizant upfront about the potential discriminations baked into the data formats their systems will implement. This will be anything but a simple task. The most challenging problem any of us can ever face is the task of education. Data scientists, like everyone else today, want to be impressed with the complexities of artificial intelligence systems. But that can only ever be a faint glimmer of the in-principle-unpredictable complexity of human intelligence and the learning it aspires to.
Colin Koopman is associate professor of philosophy and director of the New Media & Culture Program at the University of Oregon.