Putting the “I” in Data: The Ethics of Using Human-Generated Data as a Research Tool

Authors: Sara Mannheimer, Scott W.H. Young, Doralyn Rossmann


Conference paper

Summary

The widespread use of social networks means that people increasingly communicate thoughts and activities in a public forum, and social network data seemingly have the potential to provide insights into human behavior on an unprecedented scale. But the research community hasn’t yet developed ethical guidelines for working with social network data, and neither do most institutions have policies in place to address the ethical implications of social network research. This lightning talk explores the ethical implications of using human-generated data to conduct research, and suggests an ethical framework structured around the ideas of context, expectation, and value analysis.

Abstract

Human-generated data such as social network data lie in a liminal space between public and private. The widespread use of social networks means that people increasingly communicate thoughts and activities in a public forum. As a result, the stuff of daily life is easily converted into data streams that can be examined, mined, and analyzed for research purposes: Twitter provides API access to user content (Twitter 2016); Gnip sells realtime and historical social network data to both academic and corporate researchers (Gnip 2016); and Facebook has collaborated with academic researchers to conduct studies with its user data (Kramer et al. 2014). Social network data seemingly have the potential to provide insights into human behavior on an unprecedented scale. Academics have mined posts by social network users to conduct inquiry into human behavior, including measuring sentiment (Kouloumpis et al. 2011), tracking the spread of disease (Lampos and Cristianini 2010), and forecasting election results (Tumasjan et al. 2010). As social networking sites continue to be developed, and as users continue to post content, social network data become an ever-bigger “big” data source.

But do social network users expect that their posts are being broadcast beyond their immediate online community? And do they expect that by participating in social networking sites, they are also consenting to have their posts examined for research purposes? While social network data may not fit the traditional definition of “human subject data” (Department of Health and Human Services 2009), they are nonetheless generated by humans, and should be treated with extra care. The research community hasn’t yet developed ethical guidelines for working with social network data, and neither do most institutions have policies in place to address the ethical implications of social network research (Moreno et al. 2013). A month after Proceedings of the National Academy of Sciences published the controversial “emotional contagion” study, in which Facebook and Cornell University researchers manipulated content on Facebook users’ News Feeds to elicit either positive or negative emotions (Kramer et al. 2014), the journal released an editorial statement of concern regarding the ethical conduct of the researchers (Verma 2014). However, Cornell’s institutional review board (IRB) concluded that no review was required for the study, since Cornell researchers “did not participate in data collection and did not have access to user data” (Carberry 2014). Even when academic researchers work directly with social network data, IRBs may consider the data to be “existing data,” and therefore exempt from IRB oversight (Zimmer 2010). The academic community is just beginning to address the ethics of data-driven research. In a 2012 article, Gearhart suggests that “there is no reason to think that potential abuses are less likely to occur in the instantaneous use of social media than they are in traditional modes of communication” (Gearhart 2012). And in a 2015 report for the Council for Big Data, Ethics, and Society, Metcalf proposes that data science researchers “should be positioned in continuity with…a long-running conversation in the humanities and social sciences about researchers’ responsibilities toward human subjects” (Metcalf 2015). But without formal policies in place, the research community must establish its own guidelines to ethically engage with social networks and protect human-generated data.

This lightning talk explores the ethical implications of social network research, including perceived publics, informed consent, and publication of social network datasets. The talk also proposes a novel ethical framework that is structured around the ideas of context, expectation, and value analysis (Mannheimer et al. 2016). Developed for use by librarian-researchers, the framework can be adapted to fit any research project working with human-generated data. The lightning talk aims to further the ethical conversation surrounding research with human-generated data, and encourages the research community to come together to interrogate the ethical dimensions of data-driven research.

Competing Interests

The authors declare that they have no competing interests.

References

Carberry, J 2014 Media statement on Cornell University’s role in Facebook ‘emotional contagion’ research. Cornell University Media Relations Office. Available at: http://mediarelations.cornell.edu/2014/06/30/media-statement-on-cornell-universitys-role-in-facebook-emotional-contagion-research/ [Last accessed 9 May 2016].

Department of Health and Human Services 2009 Code of Federal Regulations, Title 45: Public Welfare, Part 46: Protection of Human Subjects. §46.102 Definitions. Available at: http://www.hhs.gov/ohrp/regulations-and-policy/regulations/45-cfr-46/index.html#46.102 [Last accessed 10 May 2016].

Gearhart, C 2012 IRB Review of the Use of Social Media in Research, The Monitor. Available at: http://www.quorumreview.com/wp-content/uploads/2012/12/IRB-Review-of-the-Use-of-Social-Media-in-Research_Gearhart_Quorum-Review_Monitor_2012_12_01.pdf [Last accessed 12 May 2016].

Gnip, Inc. 2015 About Gnip. Available at: https://gnip.com/about/ [Last accessed 9 May 2016].

Kouloumpis, E, Wilson, T and Moore, J D 2011 Twitter sentiment analysis: The good the bad and the OMG!. Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media. Barcelona, Catalonia, Spain on July 17-21, 2011. Menlo Park: The AAAI Press, pp.538-541. Available at: http://www.aaai.org/ocs/index.php/ICWSM/ICWSM11/paper/view/2857 [Last accessed 9 May 2016].

Kramer, A D, Guillory, J E and Hancock, J T 2014 Experimental evidence of massive-scale emotional contagion through social networks. Proceedings of the National Academy of Sciences, 111(24), pp.8788-8790. DOI:10.1073/pnas.1320040111

Lampos, V and Cristianini, N 2010 Tracking the flu pandemic by monitoring the social web. In: 2010 2nd International Workshop on Cognitive Information Processing (CIP), Elba, Italy on 14-16 June 2010, pp. 411-416.

Mannheimer, S, Young, S W H and Rossmann, D 2016 On the Ethics of Social Network Research in Libraries. Journal of Information, Communication, and Ethics in Society 14(2). DOI:10.1108/JICES-05-2015-0013

Metcalf, J 2016 Human-subjects protections and big data: open questions and changing landscapes. Council for Big Data, Ethics, and Society. Available at: http://bdes.datasociety.net/council-output/human-subjects-protections-and-big-data-open-questions-and-changing-landscapes/ [Last accessed 9 May 2016].

Moreno, M A, Goniu, N, Moreno, P S and Diekema, D 2013 Ethics of social media research: common concerns and practical considerations. Cyberpsychology, Behavior, and Social Networking, 16(9), pp. 708-713.

Tumasjan, A, Sprenger, T O, Sandner, P G and Welpe, I M 2010 Election forecasts with Twitter: How 140 characters reflect the political landscape. Social Science Computer Review. DOI:10.1177/0894439310386557

Twitter, Inc. 2016 Twitter Developers Documentation: Rest API. Available at: https://dev.twitter.com/rest/public [Last accessed 9 May 2016].

Verma, I M 2014 Editorial Expression of Concern: Experimental evidence of massive-scale emotional contagion through social networks. Proceedings of the National Academy of Sciences. Available at: http://www.pnas.org/content/111/29/10779.1.full [Last accessed 9 May 2016].

Zimmer, M 2010 “But the data is already public”: on the ethics of research in Facebook. Ethics and Information Technology. 12(4), pp.313-325. DOI:10.1007/s10676-010-9227-5