APIs continued

Dad Joke

My neighbor had the nerve to ring my doorbell at 3 o’clock in the morning.

Luckily I was still up playing the drums.

Housekeeping

  • Self-assessment reflection + proposal

Working with online data

  • Save the raw data
  • Make your process as reproducible as possible

Limitations of online data

  • Found, not made
  • Missing (sometimes invisibly)
  • Can result from changing/hidden processes

Code Challenge Review

Ethics Discussion

Vitak, J., Shilton, K., & Ashktorab, Z. (2016). Beyond the Belmont Principles: Ethical Challenges, Practices, and Beliefs in the Online Data Research Community. Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing, 941–953.

Ethics in online data

  • Some dangers
    • Privacy
    • Transparency
      • Data is “exhaust” and people may not know it exists
      • What is “public” data? Expected audiences may differ greatly
      • Informed consent is difficult
    • Unintended consequences
      • Data can be aggregated or used by bad actors
      • Salganik gives 18 examples of states using population data for genocide/forced migration

Ethics in online data

  • More dangers!
    • Data release
      • Data can be “de-anonymized”
      • NYC taxi data
      • ZIP + birthdate + gender is enough for ~1/3 of people
    • Generalization
      • Twitter users != all people
      • Most active users != all users

Balance is needed

What are the dangers of too much caution? > - Emotional contagion as example case

Some principles

  • Respect for persons, beneficence, justice, respect for law, and public interest
  • Think through possible harms to participants and the systems
  • Do work that has benefits
  • Consult with others
  • Assume that data is sensitive
  • Aggregate when possible
  • Notify participants of data collection and obtain consent when possible
  • Alter quotes/usernames
  • Share research with participants