Kyle Greene

The dominant form of artificial intelligence in use today is machine learning, which refers to algorithms that use vast sets of data to improve their accuracy in tasks like identifying patterns, labeling information, and predicting outcomes. Machine-learning-based artificial intelligence is everywhere—both in the sense that our daily lives constantly intersect with it, and in the sense that we are constantly bombarded with new, often negative, stories about it. For instance, today is April 10th, 2019. In the New York Times, Sarah Jeong writes about “How A.I. is Changing Insurance” and warns that “[s]ome technologies are better left in the laboratory.” Meanwhile, the Washington Post notes that “Congress is starting to show interest in prying open the ‘black box’ of tech companies’ artificial intelligence with oversight that parallels how the federal government checks under car hoods and audits banks.”

There is undoubtedly great profit for prophets of disaster, but skepticism about the growing role of artificial intelligence in the economy and society is a serious trend and not merely the province of sensationalists. Academics, journalists, policymakers, and a variety of social and political activists have raised a long list of pressing concerns about the expanding presence of artificial intelligence. There are dozens of important and fruitful discussions being held, but two areas of concern stand out. First, proponents of brawny antitrust point out the danger of a few large technology firms dominating the artificial intelligence space because they control troves of data that rivals cannot hope to match. Second, advocates seeking to ensure that people of different identities are treated equitably by artificial intelligence warn of situations where insufficiently diverse sets of training data lead to inaccurate predictions and diagnoses. Both concerns are weighty and serious. Yet they often pull in opposite directions, highlighting the need for concerns over artificial intelligence to achieve greater coherence and synthesis before discussions turn into actions.

Recently, the Columbia Law Review held a wide-ranging symposium on “Common Law for the Age of AI.” During the second panel, titled “Responsibility & Liability,” Frank Pasquale discussed the possibility of tort liability when inappropriate or inadequate health data is used to develop artificial intelligence systems. For instance, could a hospital be liable if it misdiagnosed a patient after failing to purchase an additional data pack that included a more diverse and comprehensive set of training information? In the third panel, titled “Public and Private,” C. Scott Hemphill wondered whether the barriers to entry raised by big data, and the transfer of monopoly power into secondary markets made possible by such a stark data advantage, would leave artificial intelligence innovation solely to the Facebooks and Googles (and Alibabas) of the world. Both panelists, and both panels, were compelling and thought-provoking.

But what do we do with these concerns? If regulators require companies to meet high minimum thresholds of data breadth and diversity, or if courts impose tort liability for insufficient training data sets, do we foreclose nascent competitors from entering the market? In the long run, that could prevent the development of, say, a more accurate artificial intelligence diagnostic system that would yield better diagnoses for all patients. Or, if we break up companies with too much data, will we condemn ourselves to a future populated by corporate archipelagos employing inaccurate and biased algorithms? In this possible future, the discerning patient would first need to inquire into the training data used before going in for their diagnostic scan.

Of course, the easiest solution might be to break everybody up and mandate open sharing of data, which seems to meet both sets of concerns. But you do not always get more of something when you make it cheaper to “buy,” especially if that something costs time or money to produce in the first place. Although it is not likely that data would dry up and disappear, the quality of shared data would fall off as sophisticated and powerful firms looked for their advantage at another step in the process (perhaps by retaining the top programming talent needed to make sense of such messy data). Never mind the standardization of data storage and other data practices that would be necessary for successful broad-based sharing—who would decide and implement all of this? And, finally, what of privacy? If I release my data, or sell my data, to one company, I have not decided that it is now fair game for the entire business world to use.

As a basic level of fluency in discussions about artificial intelligence begins to propagate throughout legal networks and society, the landscape of concerns should settle into something increasingly coherent and decreasingly chaotic. But until that takes place, our regulatory, legislative, and judicial responses must be cautious and searching. Decisions must be made with an eye towards their impact on a dynamic and complex system, not with an assumption that all else is holding still while the decision-maker toils in solitude. After all, the growing presence of artificial intelligence is so rife with potential problems partly because it is such a potent source of potential solutions. While there are many ways to grapple with the drawbacks, there are far fewer ways to do so without also hamstringing the benefits.