For many years, my research focused on organization, co-production, and innovation. Since 2013, public debate has shifted toward platforms and peer2peer production, hence the need to identify which tools and methodologies can be used to study this phenomenon. The two main solutions I found correspond to two types of digital ethnography: net-nography and digital methods. Each has advantages and limitations, but mixing the two can produce a synergistic collection of data and mutual reinforcement in verifying hypotheses and interpreting results.
By the end of this case, students should be able to
- Recognize the opportunities and limits of the different digital ethnographic approaches
- Understand the specific difficulties of studying the peer2peer economy (such as difficulty interacting with the community or censorship by the moderators of the platform)
- Combine ethnographic techniques as a way to overcome part of these limits
Studying the Peer2peer Economy as a Methodological Challenge
When we refer to the peer2peer economy (Arvidsson et al., 2016), we are speaking of a new mode of production and exchange of goods and services based on human networks that collaborate through a platform. Examples of the peer2peer economy include the sharing of audio/video files, experiences like Wikipedia or Open Street Map, and, more recently, sharing economy platforms like Airbnb or Lyft.
Studying the peer2peer economy raises some specific methodological issues:
- Peer production is hybrid, a mixture of interactions between digital and social infrastructures. The peers operate mainly through a digital platform, but their relations are not always or exclusively generated online. Moreover, peers’ digital interactions can produce physical outputs and vice versa: For example, is it the peculiar structure of Airbnb’s platform that convinces more and more people of the advantages of short-term P2P renting, or does the structure of the housing market better explain the success or failure of Airbnb’s business model in specific contexts? Answering these questions is not easy.
- The actors involved are hybrid too: They are pro-sumers (producers + consumers) (Bruns, 2008; Toffler, 1970), social actors with fluid motivations and roles in the value chain. For example, is a BlaBlaCar user a driver or simply a consumer of the platform? The peers share their assets, skills, and knowledge, acting as a community of professionals. At the same time, the platform apparently makes no recruitment selection because participation is mostly open, and the peers do not have to show specific credentials.
- However, the platform itself plays a crucial role in “designing” the quality of peers’ interactions. In some cases, the reward system can discourage professionalization (in the case of BlaBlaCar, in some countries the platform itself caps the cost of the ride so that it counts as a reimbursement and not as an income); in other cases, the platform can foster the professionalization of the peers (like Uber, through its digital reputation system that excludes those who cannot guarantee a certain quality of service).
Addressing all these complexities requires an open-ended and flexible research design that cannot be exclusively quantitative or qualitative. That is why mixed methods could be the right approach to this issue.
What Kind of Digital Ethnography Can I Use?
There is more than one way to engage the digital. We could actually distinguish two different digital ethnographic approaches (Table 1): net-nography, theorized by Kozinets (2009), and the Digital Methods approach (also called digital ethnography) of Rogers (2013).
Table 1. Net-nography and digital methods. (Recoverable labels: type of method; type of analysis; as a context; as a meta-source.)
Net-nography is a research approach based on human interaction in digital and non-digital spaces, applied through the digitalization of different traditional social research techniques (so-called virtual methods): chat interviews, computer-assisted web interviewing (CAWI) surveys, and online shadowing observation. However, focus groups or face-to-face interviews can also be used, following people in their offline meetings as well. The approach is used mainly in the study of brand communities and often in digital marketing analysis.
Conversely, the Digital Methods approach considers the Internet a direct source of methods and techniques and not simply a space of interaction. Therefore, we need to adopt intrinsically digital methods, using the same language (code) and infrastructure as the web. As Rogers said, we need to follow the medium and not people. We need to develop specific digital research tools, mainly algorithm based, that take into account how data are already collected and saved routinely by the platforms. For example, on Twitter, knowledge is organized in messages of 140/280 characters and through a very specific technical object, the hashtag, which identifies keywords and is also useful for monitoring or searching specific content. Studying how hashtags are related to each other on Twitter can help you understand how issues and people are related too. In another example, Rogers, analyzing statistics obtained through the application programming interface (API) of Wikipedia, showed that bots are more active than human users within the platform, despite the rhetoric of user-generated content. So we work mainly with Big Data, the ever-growing amount of data that we produce unconsciously simply by using the web. For example, look at the case of Inside Airbnb (http://insideairbnb.com/about.html), an independent, non-commercial set of tools and data that allows you to explore how Airbnb is really being used, analyzing publicly available information about a city’s Airbnb listings and providing key metrics about its impact on the housing market in a specific context.
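The idea of relating hashtags to one another can be illustrated with a minimal Python sketch. It assumes that the posts have already been collected as plain text; the sample posts and the function name `hashtag_cooccurrence` are invented for illustration and are not part of the study or of any Digital Methods tool.

```python
import re
from collections import Counter
from itertools import combinations

HASHTAG = re.compile(r"#(\w+)")

def hashtag_cooccurrence(posts):
    """Count how often each pair of hashtags appears in the same post."""
    pairs = Counter()
    for text in posts:
        # Lower-case and de-duplicate the hashtags of one post, then sort
        # them so that a given pair is always stored under the same key.
        tags = sorted({tag.lower() for tag in HASHTAG.findall(text)})
        pairs.update(combinations(tags, 2))
    return pairs

# Invented example posts, for illustration only.
posts = [
    "New listings map released #opendata #airbnb",
    "Short-term rentals and housing prices #airbnb #housing",
    "Workshop on civic apps #OpenData #civictech",
    "Who really hosts? #airbnb #opendata #housing",
]

for (first, second), count in hashtag_cooccurrence(posts).most_common():
    print(first, second, count)
```

Each counted pair can be read as a weighted tie between two issues, which is the kind of input a network visualization tool expects.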
Table 1 summarizes the main differences and similarities between the two ethnographies. Net-nography and digital methods could always be treated as two separate approaches but, as Rogers admits, not all interactions happen exclusively online, and the digital traces examined may not be sufficient for a complete understanding of the phenomenon analyzed. These limitations suggest developing a triangulation-oriented methodology (Denzin, 1978). Triangulation is a term that the social sciences borrowed from topography to indicate a process of “convergence” that mixes the results obtained through “different” tools for a better “location” (or understanding) of a concrete object/issue. As argued by Venturini, Bounegru, Gray, and Rogers (2018), the phenomenon under investigation, here P2P, must to some extent be performed on such platforms, but some crucial social dynamics play out mostly in face-to-face interactions. At the same time, the non-digital places where media are consumed (at work, at home, or at specific events) remain outside the grasp of digital methods, although they could play some role. Therefore, they should be assessed with other techniques. An advantage of the “follow the medium” strategy is that the School of Amsterdam led by Rogers has developed many tools, many of them free, that build on the platforms’ APIs and could be useful for carrying out such analysis (for a complete list, see https://wiki.digitalmethods.net/Dmi/ToolDatabase). However, the digital infrastructures we want to analyze are not always open and accessible, and APIs can change at any time. Combining Rogers’s approach with a net-nographic “follow the person” strategy allows us to partly overcome these limits or to find other ways to access the data through interaction with key actors/gatekeepers such as the founders or the main influencers of a digital community.
My research activity focuses on understanding peer2peer communities. This model of production rests on a horizontal and collaborative structure enabled by digital disintermediation. However, some research questions arise:
- Research Question 1: Can we always talk of a community of peers, horizontally structured (as promoted in the public debate)? Can such a structure coexist with hierarchies and power asymmetries?
- Research Question 2: Can possible asymmetries and hierarchies in P2P production be the “natural” result of a technological design of the platform, or are they related to the motives and profiles of the peers?
To answer such questions, we need to apply the “follow the person” and “follow the medium” strategies at the same time. Moreover, we chose a most different systems design for the comparison: We compared two very different cases within the peer2peer phenomenon:
- a community of a digital time bank that uses a specific platform to exchange skills and information using time units as a complementary currency (https://timerepublik.com/?locale=en);
- an Open Data community that brings together a group of active citizens using open license data released by the public administration to produce new smart services (http://opendatasicilia.it/).
The first case is based on the digitalization of a traditionally “offline” peer2peer practice, but it is run by a for-profit organization through a single proprietary “closed” platform. The second is closer to the history of peer2peer activities, like the open source and open software movement, where independent activists interact and participate in collaborative projects through a multi-platform strategy (some platforms are open, like Github or Nabble; some are closed, like Facebook or Twitter).
The research design was articulated into three main phases (see Figure 1). Each phase is a combination of net-nographical research activities (described in red boxes) and digital method activities (described in green boxes) in a complementary perspective.
Figure 1. Research phases and activities.
- Phase 1 (Meet the Community): This is a preliminary stage for arranging the activities and tools of the next phases. First, we need to analyze the history of the community/platform (how long it has been operating, how it has changed over time). This analysis can be done either by finding documents/articles online or by participating in offline meetings (if they happen), identifying and interviewing some key actors (founders or particularly active members). From the digital ethnography perspective, this phase also involves becoming familiar with the digital infrastructure: How is it organized? Does it have an open structure (e.g., how visible and accessible are conversations and messages to all users)? Are there limitations in the APIs? Which other digital channels are used alongside it (a Facebook page, a Twitter account, a newsletter or a blog, a YouTube channel, etc.)? The answers to these questions guide practical decisions and actions in the next phases of the research. If the platform is a closed infrastructure (as in the case of the digital time bank), many of the available digital tools are not suitable for the analysis. Therefore, we need specific technical support to “scrape” the data, creating a dedicated algorithm to do it. The net-nographic interaction with some members of the community can help us resolve some doubts or critical issues: Do we need some authorization (ethically important)? In the case of the digital time bank, interviewing the founders and making them aware of the aims of our analysis were crucial to gaining access to the data and a concrete look at the digital infrastructure from the inside. Talking with them also helped the members of our research team to define specific tools for analyzing the available data, and we decided to work side by side with the founders.
- Phase 2 (Join the Community): We can organize some weeks of ethnographic observation online and offline. From a net-nographic point of view, we could use chat interviews or keep a diary of our offline ethnographic observation, trying to elicit the main variables we are interested in (how people interact, what their language style is, how they plan their activities, etc.). We can start from the socio-biography of some members and explore mainly their motivation and the social drivers of participation in such communities.
At the same time, from a digital methods perspective, we can carry out a quantitative analysis of the members, using the data within the platform (as in the time bank) or in the Facebook group used by the community (for the Open Data Community), which helps us to understand their social profile. In this case, we can adopt specific tools available online for extracting data and calculating analytics from a Facebook page, like Grytics (https://grytics.com/).
Another important activity could be a content analysis of the messages exchanged, to reconstruct the narratives and compare them with what emerged during the chat or face-to-face interviews. On the one hand, through direct interaction or a CAWI survey, we can ask the members directly; on the other hand, we can cross-check this information with what emerges from a computational analysis of the messages and feedback exchanged within the platform, also using free tools like Wordle (http://www.wordle.net/), which produces word clouds, or TextStat (http://neon.niederlandistik.fu-berlin.de/static/textstat/TextSTAT-Doku-EN.html), which gives you the occurrences of the words used within a specific text. Content analysis is crucial to understanding the identity, values, and self-representation of the members. To explore these analytical objects in more depth, we can also use specialized software, like NVivo or T-Lab, which provides more sophisticated content analysis (such as thematic maps that cluster the main topics debated within the community).
- Phase 3 (Work With the Community): This phase allows us to examine how the community concretely works and how effective the P2P mode is, verifying at the same time how horizontal and collaborative the production process really is. From a net-nographic point of view, we can adopt the technique of mystery shopping, a specific form of shadowing used in marketing to evaluate the quality of a service, so that we gain direct experience by acting as a member/user of the community. In this way, we are able to see directly how workflows and information are organized and structured around specific projects: Who are the more (inter)active members? Is there any informal leader?
At the same time, we look at how the network of transactions is structured. The available digital methods tools (e.g., Netvizz or Gephi) can help in this phase to simplify data analysis and the graphic visualization of the social networks and of the information flows during the P2P production activities. Social network analysis can help us to detect the central nodes and the concrete structure of P2P exchanges, using typical metrics such as density (number of ties expressed as a percentage of the number of ordered/unordered pairs) and centrality (the degree to which a network revolves around a single node/actor).
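The two metrics can be made concrete with a short Python sketch that uses only the standard library. It is one possible reading, not the computation performed by Gephi or by the study: ties are treated as undirected, and “the degree to which a network revolves around a single node” is interpreted as Freeman’s degree centralization. All function names and the toy networks are illustrative assumptions.

```python
from collections import defaultdict

def build_graph(edges, nodes=()):
    """Return undirected adjacency sets; `nodes` adds members without ties."""
    adj = defaultdict(set)
    for v in nodes:
        adj[v]  # touching the key registers an isolated member
    for a, b in edges:
        if a != b:  # ignore self-loops
            adj[a].add(b)
            adj[b].add(a)
    return dict(adj)

def density(adj):
    """Observed ties divided by the number of unordered pairs of nodes."""
    n = len(adj)
    if n < 2:
        return 0.0
    ties = sum(len(neighbours) for neighbours in adj.values()) / 2
    return ties / (n * (n - 1) / 2)

def degree_centrality(adj):
    """Share of the other nodes that each node is directly tied to."""
    n = len(adj)
    if n < 2:
        return {v: 0.0 for v in adj}
    return {v: len(neighbours) / (n - 1) for v, neighbours in adj.items()}

def degree_centralization(adj):
    """Freeman's index: 1.0 for a perfect star, 0.0 when all degrees are equal."""
    n = len(adj)
    if n < 3:
        return 0.0
    degrees = [len(neighbours) for neighbours in adj.values()]
    top = max(degrees)
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))

# Invented toy networks: a star around one hub, and a ring of equals.
star = build_graph([("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d")])
ring = build_graph([("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")])

for name, graph in (("star", star), ("ring", ring)):
    print(name, round(density(graph), 2), round(degree_centralization(graph), 2))
```

In the star, every exchange passes through one member, so centralization is at its maximum even though density is low; in the ring, density is higher and no node dominates. Comparing the two numbers is a simple way to question how horizontal a peer network really is.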
Method in Action
In Phase 1, the team focused on identifying the characteristics of the two platforms: how they were designed, how the interaction with the user was built, what forms of mediation were carried out by the founders, and which data were immediately visible and usable for the analysis and which were not. From this, we immediately noticed some crucial differences in how the two communities worked, which posed different problems for accessing data and understanding the peer production process. Of course, if the exploratory Phase 1 had failed, for example, because of scarce availability of the peers or of the platform managers, we would have been forced to find another community for our analysis.
The two cases are somewhat polarized in terms of data collection and analysis. The time bank analyzed is based on a proprietary platform. Therefore, most information is closed and the website is the only “space” of users’ interaction. This makes it impossible to use ready-made digital tools like Gephi or Netvizz. In this sense, the preliminary observation phase was central to understanding what was possible and what was not; it also helped us to gauge the willingness of the platform managers to support the research and to make them aware of the usefulness of our analysis. In this case, we had to bring in a computer specialist capable of working with the platform manager to extract the data files in .csv format and to adapt them to the best-known programs for social analysis, like SPSS, STATA, and T-Lab. Explaining how our hypotheses could improve the management of the time bank was the most persuasive argument to convince the founders to concretely support our research activity. This allowed us to involve them in the research, access the data directly, and overcome the problem of closed APIs.
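As an illustration of what can be done once a .csv export exists, the following Python sketch reads a file of exchanges and sums the hours for each pair of members. The column names (`giver`, `receiver`, `hours`) and the sample rows are hypothetical; a real export from the platform will have its own structure and should be handled according to the agreement with its managers.

```python
import csv
import io
from collections import Counter

# Invented sample standing in for a platform export; real columns will differ.
SAMPLE_CSV = """giver,receiver,hours
anna,bruno,2
anna,carla,1
bruno,anna,1
anna,bruno,3
"""

def load_exchanges(handle):
    """Sum the hours exchanged for each directed (giver, receiver) pair."""
    hours = Counter()
    for row in csv.DictReader(handle):
        pair = (row["giver"].strip(), row["receiver"].strip())
        hours[pair] += float(row["hours"])
    return hours

# io.StringIO lets the sketch run without a file; with a real export,
# pass an open file object instead, e.g. open(path, newline="").
exchanges = load_exchanges(io.StringIO(SAMPLE_CSV))
for (giver, receiver), total in sorted(exchanges.items()):
    print(giver, "->", receiver, total)
```

The resulting weighted list of directed ties can then be saved again as a table and imported into statistical or network analysis software.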
Conversely, the Open Data Community is a more informal organization that mainly uses public platforms like Facebook (to exchange ideas and best practices), Github (to produce together), and Nabble (to communicate with each other). This facilitates the use of the digital tools already available, and the support of the founders was less relevant for our analysis. In this case, we could adopt tools like Github scraper and Github organizations meta-data lookup (available on the website of the Digital Methods School). The community also uses the open platform Nabble (http://www.nabble.com), which allows us to download directly all the online conversations among the community members.
Another difference that emerged during the analysis was that the time bank is sustained by a public aim of solidarity but, operationally, it is based on one-to-one interactions/exchanges. Interaction was more difficult because people were neither helpful nor trusting at our first attempt to approach them, and every request for interaction had to be managed individually. The activity of the Open Data Community, instead, is the result of a different value code, based on open source principles and the hacking framework, so its members believe in disclosing the data of their work. By explicitly invoking the Open Source and Open Access principles (also recalled in their online manifesto), we gained easier access to their data, interacting smoothly with the members as a group, reading all content, and extracting information from the social networks they used without any restriction or limitation.
During Phase 2, we started to contact the peers directly, using the platform or the social network page of the communities. Interviewing people online was less time-consuming and less expensive than traditional interviews, while offline ethnographic observation could be difficult to plan because the two communities do not interact offline in the same way. For example, the time bankers interact mainly online and rarely decide to meet each other, whereas open data activists organize public meetings once or twice a year. The availability of the two communities also differed greatly in online interaction. Open data activists were more willing to talk about their work. In the time bank, the users were less willing to be interviewed, and our attempts to interact with them were perceived as a sort of “spamming” activity. This also created some problems with the founders, who did not want us to bother their “clients,” and we had to arrange with them a softer way to interact with the peers and ask for their collaboration (such as offering some incentives in terms of time credits). In this case, the managers of the platform offered to act as gatekeepers in the interaction with the users, creating a sort of selection bias in the peers interviewed.
At the same time, with technical support and the permission of the founders, we started to scrape all the data on the users’ social profiles and their messages on the platforms. The large amount of data and text collected was subjected to statistical and content analysis using different software (e.g., Grytics or T-Lab). Thanks to this computational social analysis, we were able to profile the users of the two communities quickly.
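The simplest form of such computational content analysis is a count of word occurrences, similar in spirit to what a concordance tool reports. The Python sketch below is only a rough approximation under stated assumptions: the messages are invented, the stopword list is a tiny hand-made example, and no stemming or lemmatization is applied.

```python
import re
from collections import Counter

# A deliberately small, hand-made stopword list (an assumption of this sketch).
STOPWORDS = {"a", "an", "and", "can", "for", "i", "in", "is", "me", "my",
             "of", "the", "to", "with", "you"}

def word_frequencies(messages, stopwords=STOPWORDS):
    """Count word occurrences across messages, ignoring case and stopwords."""
    counts = Counter()
    for text in messages:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if len(w) > 1 and w not in stopwords)
    return counts

# Invented notice-board messages, for illustration only.
messages = [
    "I can help you with Windows and Excel",
    "Looking for help with my website",
    "Website design offered, contact me for help",
]

frequencies = word_frequencies(messages)
for word, count in frequencies.most_common(5):
    print(word, count)
```

Frequency lists like this one are only a starting point; they indicate which terms deserve a closer qualitative reading of the messages in which they occur.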
Relating the social profile of the users to the content analysis of the messages revealed some strong similarities between the two communities. For example, in the case of the digital time bank, the platform explicitly projects the image of a sharing community. However, looking at the notice board of offers and requests for help, it was easy to notice a certain similarity with online job applications, sometimes written in a commercial and catchy style. For example, “Hello, Windows is making you crazy??? Before finishing insane in an asylum contact me, I have the right solution for you.” The same aim of visibility and reputation is strongly present in the comments and feedback, also in the Open Data Community. Given the significant prevalence of figures with skills related to the digital economy, the analysis showed that the majority of them are young computer or communication experts without a steady job who join the community also as a personal branding tool to promote their skills and expand their relational network.
During the third phase, from a net-nographic point of view, we acted as users of the community, asking for help or getting involved in a specific task through the platform. For example, we placed messages on the bulletin board such as, “Hey, I just signed up and I cannot use this tool … can you help me?” Or, “I would like to use this program, but I would need to have some clarification on this aspect. Could someone contact me in chat?” In the time bank, this was not so easy because the majority of the members are not particularly active and our requests for help remained mostly unanswered. Therefore, the collection of information in this phase took longer and was harder than expected. In the case of the Open Data Community, we detected more interactions, but they were always mediated or provided by the same members, the most evident leaders of the community.
This evidence was corroborated by the differences that emerged in the computational analysis of the structure of transactions. For example, in the case of the Open Data Community, existing tools can provide metrics about transactions and help with the network representation: We used Netvizz (http://www.up2.fr/index.php?n=Main.Netvizz) to extract the communication network data of the Open Data Facebook group. These were processed via Gephi (https://gephi.org), a free tool for network representation compatible with Netvizz meta-data (see Figure 2). The darker nodes/spots identify the brokers and the most relevant actors in the communities.
Figure 2. Example—Network analysis with Netvizz and Gephi on the Open Data Community.
In the case of the time bank platform, the support of the managers was crucial to extracting the data and using more traditional social network software like NodeXL or UCINet. In this case, we could compute the same centrality and density scores and compare them with those of the Open Data Community. The graph also clearly shows the peculiarities of the time bank community compared with the Open Data Community (see Figure 3). In this case, there are fewer crucial nodes, but it is more evident that a larger number of non-active/less active users are totally detached from the rest of the network.
Figure 3. Example—Network analysis on the digital time bank community.
Despite the rhetoric of a peer network, these communities are both asymmetrical and vertical. Their exchanges are rarely based on symmetric reciprocity: Those who give something rarely receive something in return from someone else; rather, there is mostly a distinction of roles between people who produce and people who do not play any active role (they simply receive, learn from others, etc.). Those who produce are the most central nodes of the network, from which the majority of communicative flows start, and they are also the most influential actors within the community.
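Both observations, the lack of reciprocity and the presence of detached members, can be expressed as simple ratios. The Python sketch below is a hedged illustration rather than the computation used in the study: it measures direct reciprocity only (whether a tie from one member to another is returned by that same member), and the member names, ties, and function names are invented.

```python
def reciprocity(ties):
    """Share of directed ties (a, b) that are matched by a return tie (b, a)."""
    unique = {(a, b) for a, b in ties if a != b}
    if not unique:
        return 0.0
    returned = sum(1 for a, b in unique if (b, a) in unique)
    return returned / len(unique)

def inactive_share(members, ties):
    """Share of registered members who do not appear in any tie."""
    members = set(members)
    if not members:
        return 0.0
    active = {member for tie in ties for member in tie}
    return len(members - active) / len(members)

# Invented toy data: "anna" gives to three members, only "bruno" gives back,
# and two registered members never exchange anything.
members = ["anna", "bruno", "carla", "dario", "elsa", "fabio"]
ties = [("anna", "bruno"), ("bruno", "anna"),
        ("anna", "carla"), ("anna", "dario")]

print(round(reciprocity(ties), 2), round(inactive_share(members, ties), 2))
```

A low reciprocity value combined with a high share of inactive members would point to the kind of producer/receiver split described above.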
Practical Lessons Learned
Looking at the development of the research activities, there are some critical issues and lessons to keep in mind.
The different difficulties in accessing data and in interacting with the communities represent a methodological issue but are also relevant results of our analysis. The different levels of control over information reveal not only how challenging studying P2P communities can be but also the different degrees of asymmetry within them.
Data about the users’ profiles could not be used directly for the analysis, not only because some information was hidden on the platform at the behest of the founders or managers but also because of a lack of involvement of the peers themselves, who were very reluctant to fill in all the available fields and provide detailed information about themselves. For example, in the Open Data Community, the absence of a single platform for interaction reduced the data available about the members’ profiles. These limits may be a hindrance to using only digital methods. Interacting directly with the members can compensate for the absence of important data about their profiles, which can be collected using more traditional tools like a questionnaire.
Ethical and privacy concerns about Internet data should be taken into account, although safeguards such as informed consent are hard to apply in a digital environment, partly because of the elusive boundaries between public and private information. Making an agreement with the founders and communicating properly with the members during the study are crucial, above all in the case of a proprietary for-profit platform like the time bank analyzed. Choosing a more informal group of activists, like the Open Data Community, is easier for a researcher because they are more willing to share their data. They did not ask for specific agreements on their data because they want to be known for their activity, and they did not seem very concerned about the publicity of their information.
Online data are dynamic and precarious and may change just as you are doing your research. Being within the community and establishing significant human contact with it allows you to keep this process under control and find a strategy to adapt if changes occur. For example, during the analysis of the digital time bank community, a new release of the platform changed the visibility of some content and the system of interaction. These changes are results in themselves, but we need to record every change and keep a copy, trace, and record of all the interactions and information collected. Establishing relevant relations with the founders or members of the community allowed us to recover some data or information that risked getting lost with the new release.
The applicable digital tools are less expensive than other software for empirical analysis and are more suitable for analyzing digital analytics, but they have limits too (e.g., in the amount of information they can process). Researchers need to know and understand these tools closely because they can have a relevant impact on the research budget as well: In some cases, they are free (e.g., Gephi); in other cases, you can increase their functionality by paying a fee (e.g., Grytics).
The case study shows how the two approaches can be used in a triangulation research design. Individually, they may be incomplete or limited in understanding such complex, ambiguous realities as peer2peer networks. The distinction between the two types of digital ethnography risks becoming artificial in concrete research practice. The positive process of mutual reinforcement between the two methods is confirmed by our analysis, which reveals the dualistic representation of the P2P phenomenon.
An exclusive use of digital tools risks a purely data-driven approach that lacks the fruitful interactions with members that can provide interesting data about peers’ profiles and support the interpretation of the results.
As argued by Venturini et al. (2018), media inscriptions are not created by or for the academic community, and private companies and private groups continue to be the real gatekeepers of their traceability. Without the support and collaboration of the community analyzed, it is very difficult to open up such a “black box,” especially in the case of the time bank community. That is why a preliminary net-nographic approach can be so helpful, also for a more effective digital methods strategy. The initial net-nographic activities enable us to identify the founders, giving us the opportunity to elicit their intentions and gain full access to the community information. Their support was fundamental to better understanding the functioning and the potential of the platform.
Finally, it can be argued that cross-platform cases (such as the Open Data Community) are richer in data and tools than a single-platform case (such as the time bank), despite the heterogeneity of the empirical material and the specific issues we had to face with the APIs of each platform (e.g., Facebook and Twitter changed them very often to limit data scraping from the outside).
In conclusion, net-nography can give more information about people’s motives and profiles, whereas digital tools give more information about the structures of relations, values, and transactions. This is where their complementarity lies. Combining net-nographic activity with digital methods can help researchers to make overall sense of the functioning and the outputs of digital collaborative production.
Exercises and Discussion Questions
Take into consideration a peer2peer platform that you know or use and try to answer these questions:
- What are the main characteristics of the digital infrastructures adopted? How do they work? What actions are possible and what are not?
- Could you easily detect some key actors? Is it possible to detect people who get more resources? If so, what kind of resources, and how do they spend them within the community?
- What kind of data are available directly in the platform? What information is missing?
- What ethnographic activities or digital tools could be applicable to this case?
Netnography: An Overview (Schulich MBA class taught by Robert Kozinets): https://www.youtube.com/watch?v=UWApBu2ERTU
Digital Methods Summer School in Amsterdam: https://wiki.digitalmethods.net/Dmi/DmiSummerSchool