This week, a handful of academic research teams will gain access to a new tool from Facebook designed to collect near-universal real-time data on the world’s largest social network.
When it comes to who gets access to Facebook data and how, the company now known as Meta still feels the echoes of the Cambridge Analytica scandal of 2018, in which a political consulting firm harvested the personal data of millions of Facebook users without their knowledge to build detailed profiles of potential voters. The company shut down thousands of APIs over the three years that followed and is only now beginning to restore broader access for academic research.
Nerdshala previewed Facebook’s new Academic Research API and spoke with Kiran Jagdish, the Facebook product manager who led the project on the Facebook Open Research and Transparency (FORT) team.
“This is just the beginning,” Jagdish told Nerdshala, describing the Researcher API as a beta version of the toolkit the team hopes to eventually introduce. The API, first announced at F8 this year, is Python-based and runs in JupyterLab, an open source notebook interface.
In light of Facebook’s many past privacy woes, the new Researcher API comes with some caveats up front. First, the API will be made available only to a small group of established academic researchers through an invitation system. The company plans to expand access beyond the initial test group in February 2022, incorporating feedback from the test into a wider launch open to all academics.
Another precaution: the Researcher API runs in a tightly controlled environment that Jagdish describes as a “digital clean room.” Academic researchers with access to the API can log into the environment via a Facebook VPN, collect data and crunch numbers, but they cannot export raw data, only the resulting analysis.
The idea is to protect user privacy and prevent any of the analyzed data from being re-identified, but this limitation may strike some of the company’s critics as unnecessary, given that all the public data that goes into the Researcher API is already out there, though it is difficult to collect and analyze with Facebook’s existing tools.
At launch, the API will provide access to four buckets of real-time Facebook data: Pages, Groups, Events and Posts. In each case, the tool will draw only from public data, and only from sources within the US and EU. For groups and pages, at least one administrator must be in a supported country for that data to be available through the API.
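To make those availability rules concrete, here is a minimal Python sketch of how such a filter could work. It is not Facebook’s code: the `Source` class, its field names, the `is_available` function and the region codes are all hypothetical, invented here only to restate the rules described above.

```python
from dataclasses import dataclass, field

# Hypothetical region codes; the article only says "the US and EU".
SUPPORTED_REGIONS = {"US", "EU"}


@dataclass
class Source:
    """Illustrative stand-in for a Page, Group, Event or Post (not a real API type)."""
    name: str
    kind: str                 # "page", "group", "event" or "post"
    is_public: bool
    region: str = ""          # where the source originates
    admin_regions: list = field(default_factory=list)  # regions of its administrators


def is_available(source: Source) -> bool:
    """Apply the rules described in the article to one source."""
    # Only public data is eligible.
    if not source.is_public:
        return False
    # Groups and pages need at least one administrator in a supported region.
    if source.kind in ("page", "group"):
        return any(r in SUPPORTED_REGIONS for r in source.admin_regions)
    # Events and posts are checked against their own region in this sketch.
    return source.region in SUPPORTED_REGIONS


public_group = Source("Gardening Club", "group", True, admin_regions=["BR", "EU"])
private_group = Source("Family Chat", "group", False, admin_regions=["US"])
print(is_available(public_group), is_available(private_group))
```

In this sketch a public group with one EU-based administrator passes, while a private group is excluded regardless of where its administrators are.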
Through the tool, researchers can analyze large swaths of raw text using methods such as sentiment analysis, which tracks the feelings and emotions people express when they talk about a given topic. Beyond the text-based posts that make up most of the available data, researchers can also access related information such as group and page descriptions, the dates they were created and reactions to posts.
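For readers unfamiliar with sentiment analysis, the toy Python example below shows the basic idea using a tiny hand-made word list. It does not use the Researcher API, and the word lists and the `sentiment_score` function are assumptions made purely for illustration; real research would rely on far larger lexicons or trained models.

```python
import re

# Tiny illustrative word lists (assumed for this example only).
POSITIVE = {"good", "great", "love", "happy"}
NEGATIVE = {"bad", "awful", "hate", "angry"}


def sentiment_score(text: str) -> float:
    """Return a score from -1.0 (all negative) to 1.0 (all positive).

    Text with no words from either list scores 0.0.
    """
    words = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(word in POSITIVE for word in words)
    negative_hits = sum(word in NEGATIVE for word in words)
    total_hits = positive_hits + negative_hits
    if total_hits == 0:
        return 0.0
    return (positive_hits - negative_hits) / total_hits


posts = [
    "I love the new community garden, great work!",
    "Awful traffic again, I hate this road.",
    "The meeting starts at noon.",
]
for post in posts:
    print(sentiment_score(post), post)
```

Averaging scores like these across many public posts on one topic is the kind of aggregate result that could leave the clean room, whereas the posts themselves could not.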
Multimedia data such as raw images will not be included, nor will comments or user demographic data (age, gender, etc.). The API also won’t collect any data from Instagram, though Jagdish admits the platform is very valuable to researchers and the team is exploring ways to make Instagram data available.
The FORT team expects to work closely with academic researchers to develop and refine the existing tools, which Jagdish describes as a work in progress. While Meta indicated that its initial set of academic partners has yet to be finalized, the company has invited researchers from 23 academic institutions around the world to kick the tires.
Researchers who completed the team’s onboarding process and agreed to its privacy policies were granted access on Monday, November 15. Facebook requires anyone accessing the Researcher API to agree to privacy constraints, including not re-identifying specific individuals within the data.
The Researcher API is available only to a handful of academic institutions for now, but the FORT team plans to extend access to other groups, including journalists. The goal is to create a public roadmap that gives researchers and journalists a transparent look at what the team is working towards.
The company has a lot of trust to rebuild with the research community. In August, Facebook cut off access to advertising data for two leading researchers affiliated with NYU’s Cybersecurity for Democracy Project, prompting rebukes from many academics and regulators. Those researchers focused on tracking misinformation and political advertisements through an opt-in browser tool called Ad Observer. In September, Facebook apologized for providing incomplete data to an elite group of researchers known as Social Science One, a mistake that undermined months of work and analysis.