Episode 2 Show Notes

If you enjoyed this episode, here are some more materials to check out:

Susan Halford’s Bio and Twitter

Some of Susan Halford’s Articles 

Tinati, R., Halford, S., Carr, L., & Pope, C. (2014). Big Data: Methodological Challenges and Approaches for Sociological Analysis. Sociology, 48(4), 663–681. (Access through Paperpile)

Halford, S., & Savage, M. (2010). Reconceptualizing digital social inequality. Information, Communication and Society. (Available online)

Halford, S., Pope, C., & Weal, M. (2013). Digital Futures? Sociological Challenges and Opportunities in the Emergent Semantic Web. Sociology, 47(1), 173–189. (Access through Paperpile)

Staab, S., Halford, S., & Hall, W. (April 2019). Web Science in Europe: Beyond boundaries. Communications of the ACM, 62(4), 74–79. (Available online)

Halford, S., Pope, C., & Carr, L. (2010). A manifesto for Web Science. (Access through Paperpile)

Some of Susan Halford’s Books 

Halford, S., Savage, M., & Witz, A. (1997). Gender, careers and organisations. London, UK: Macmillan Publishers Limited.

Halford, S., & Leonard, P. (2001). Gender, power and organisations. Palgrave Macmillan.

 

Episode 1 Show Notes

If you enjoyed this episode, here are some more materials to check out:

Jim Hendler’s Bio and Twitter

Some of Jim Hendler’s Articles:

Berners-Lee, Tim, Hall, W., Hendler, JA, O’Hara, K., Shadbolt, N., Weitzner, DJ. “A framework for Web Science.” Foundations and Trends in Web Science, vol. 1, no. 1, 2006. Gale Academic OneFile

Hendler, J., Shadbolt, N., Hall, W., Berners-Lee, T., & Weitzner, D. (2008). Web science: an interdisciplinary approach to understanding the web. Communications of the ACM, 51(7), 60–69. (Access through Paperpile)

O’Hara, K., Contractor, N. S., Hall, W., Hendler, J., & Shadbolt, N. (2013). Web Science: Understanding the Emergence of Macro-Level Features on the World Wide Web. Foundations and Trends® in Web Science, 4(2–3), 103–267. (Access through Paperpile)

Shadbolt, N., Hall, W., Hendler, J. A., & Dutton, W. H. (2013). Web science: a new frontier. Philosophical Transactions. Series A, Mathematical, Physical, and Engineering Sciences, 371(1987), 20120512. https://paperpile.com/shared/Okfz6s

Horowit-Hendler, S., & Hendler, J. A. (January 14, 2020). Conversational AI Can Propel Social Stereotypes | WIRED

Jim Hendler’s Books:

Semantic Web for the Working Ontologist: Effective Modeling for Linked Data, RDFS, and OWL 

Social Machines: The Coming Collision of Artificial Intelligence, Social Networking, and Humanity 

Web Science: Understanding the Emergence of Macro-Level Features on the World Wide Web

Foundations and Trends: A Framework for Web Science

Episode 27 Transcript

Deborah McGuinness: So I’m kind of famous for this wine and foods ontology that I literally did in my very early days, when I was taking a graduate class. You know, I had to write an expert system that would make a recommendation. And so I said, “Okay, well, what am I passionate about?” Well, I happen to be passionate about wine and food. 

Noshir Contractor: Welcome to this episode of Untangling the Web, a podcast of the Web Science Trust. I am Noshir Contractor, and I will be your host today. On this podcast, we bring in thought leaders to explore how the web is shaping society, and how society, in turn, is shaping the web. 

My guest today is Deborah McGuinness, who you just heard talking about creating ontologies for computers. These ontologies can help us pair the perfect glass of wine with our steak – or develop personal health management plans. Deborah is the Tetherless World Senior Constellation Chair and Professor of Computer, Cognitive, and Web Sciences at Rensselaer Polytechnic Institute, or RPI, in the United States. She is also the founding director of the Web Science Research Center and is a fellow of the American Association for the Advancement of Science. She’s also the recipient of the Robert Engelmore Award from the Association for the Advancement of Artificial Intelligence. Welcome, Deb.

Deborah McGuinness: Thanks for that wonderful introduction. It’s wonderful to be here.

Noshir Contractor: So I have to say that the title of your chair intrigues me. Tell us more about the Tetherless World Constellation.

Deborah McGuinness: Well, a constellation is a feature that our university president put together. Usually, universities have one professor in one area and then don’t have overlapping professors. But her idea was to bring together constellations or groups of stars to make significant contributions in carefully chosen areas. So the original plan was for this contribution to be in, kind of, mobile computing and the future of the web. And then we modified that some to be really, less about mobility, and more about the future of how we work with tetherless communications, as well as tethered communications. I typically refer to that as the future of the web.

Noshir Contractor: That’s a fascinating vision.

Deborah McGuinness: Yes. One of the reasons I left my position directing the Knowledge Systems Lab at Stanford University was because of the interdisciplinary nature and strengths of RPI. I find that my most fascinating work is at the intersection of communities. And actually, that kind of is a perfect tie in to web science, because I don’t think I know of any discipline that’s more interdisciplinary than web science.

Noshir Contractor: That’s absolutely where I wanted to go next, given your interest and skills at being able to navigate interdisciplinary work. You’ve been one of the pioneers in this area, so take us back a little bit to how you got started in the area of web science.

Deborah McGuinness: Well, you know, I’ve been working in knowledge representation and reasoning and the languages and environments to model and reason with knowledge for my entire career. So in the early days, that was languages literally for the Semantic Web, but it was before we called it the Semantic Web. So it was languages that let you get to the implicit information from the explicit statements and were computationally amenable to working with computers. Then, when I went to Stanford, we did a lot of really big, often government-sponsored projects, to do ontology-enabled, or encodings, of meaning. So we did large applications that understood what terms meant, because we encoded those meanings in ontologies. And so, you know, for my entire career, I was making these languages and I was making these environments that were making kind of smart recommender systems or smart data portals. And then when I went to RPI, we kind of took that to another level and made it even bigger. And so when web science was emerging, they needed people who had languages that could not just encode how you’re going to write something on a page, or how you’re going to link one page to another, but actually, what those terms in the page mean. And then also, as I mentioned earlier, I’m really just fascinated by interdisciplinary work. And this just seemed to be a complete and total perfect match for that.

Noshir Contractor: So I’m going to take you back and try to help unpack some terms that you use in the context of web science, for somebody who may not be familiar. You use the words knowledge representation, language, ontology. By language, you mean computer languages, I guess?

Deborah McGuinness: Yes, I typically mean languages for computers. We might focus more on markup languages, so languages that help you annotate terms that you’re going to see in a description of something. And that has initially been, well, “I’m going to write this in red in a particular font.” You know, I’m kind of famous for this wine and foods ontology that I literally did in my very early days when I was taking a graduate class. You know, I had to write an expert system that would make a recommendation. And so I said, “Okay, well, what am I passionate about?” Well, I happen to be passionate about wine and food. Later, we called it The Semantic Sommelier. Someone would say “I’m having steak for dinner.” So we also had some rules in the background that said, with a meat dish without a spicy sauce, we might have a red, full-bodied, dry wine. Once we’ve got that markup, and I’ve got, say, Forman Cabernet Sauvignon in my database, then we can retrieve not just that particular wine, but we can also retrieve the description of the wine. So let’s say I’m in a restaurant and they don’t have that wine. I can say to the sommelier, “Well, do you have any other red, full-bodied, dry wines?” And then they could list off the ones that match that description.
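
For readers who want to see the idea in code, here is a minimal Python sketch of the pairing rule described above. It is an illustration only, not the actual wine and foods ontology or The Semantic Sommelier: the class, the function names, and every catalogue entry other than the Forman Cabernet Sauvignon are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Wine:
    name: str
    color: str      # e.g. "red" or "white"
    body: str       # e.g. "full" or "light"
    sweetness: str  # e.g. "dry" or "sweet"


# Hypothetical catalogue. Only the first wine is named in the episode;
# the other two entries are placeholders invented for this sketch.
WINES = [
    Wine("Forman Cabernet Sauvignon", "red", "full", "dry"),
    Wine("Example Zinfandel", "red", "full", "dry"),
    Wine("Example Riesling", "white", "light", "sweet"),
]


def recommend_profile(course: str, spicy_sauce: bool) -> Optional[Dict[str, str]]:
    """Return the wine description suggested for a dish.

    Encodes the single rule mentioned in the conversation: a meat dish
    without a spicy sauce goes with a red, full-bodied, dry wine.
    """
    if course == "meat" and not spicy_sauce:
        return {"color": "red", "body": "full", "sweetness": "dry"}
    return None  # no other rules are sketched here


def wines_matching(profile: Dict[str, str]) -> List[str]:
    """List every wine whose annotated properties match the description."""
    return [
        wine.name
        for wine in WINES
        if all(getattr(wine, key) == value for key, value in profile.items())
    ]


steak_profile = recommend_profile("meat", spicy_sauce=False)
print(wines_matching(steak_profile))
```

Because the recommendation is a description rather than a single bottle, the same query still returns an alternative when the first wine is not on the list, which is the point of the sommelier example.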

Noshir Contractor: And so in the context of a recommender system for a search, you would say, “Show me wines that have a certain quality.” And if the information about the wine is encoded in a markup language that includes those characteristics, then rather than just search for the word Sauvignon Blanc, you will now be able to get a Sauvignon Blanc recommendation based on certain attributes of the wine that have been encoded into the website. Can you amplify it in a more accessible manner than I just fumbled through that?

Deborah McGuinness: Oh, well, actually, I thought you did a pretty good job. So I created this wines ontology and this foods ontology. I made it public. And I was also very active in the description logic community. And so at the time, anybody who was doing work in description logic always looked around for a way to test their work. So almost everybody who did a thesis in the 80s and the 90s – and I think they’re still doing it – tests on some version of the wines and foods ontology. And then later, when I was very active in the World Wide Web Consortium’s standardization effort to make recommendations for languages for the web for encoding meaning, we also wrote a guide to how to use the language, and we used a version of my wines and foods ontology.

Noshir Contractor: That’s a great story. I recall you at web science summer schools and web science workshops for the Web Science Conference talking about these issues and getting excited about it. You mentioned that one of the things that the World Wide Web Consortium has tried to focus on is creating these standards. And the example you gave early on in terms of markup language for things like, you know, whether you want something in red or whether you want a particular kind of font – those kinds of markup languages are extremely standardized, I would argue, around the world. How do you assess the extent to which ontologies have been standardized and embraced and adopted on the web?

Deborah McGuinness: You know, that’s a really interesting question. To get a very detailed, precise description that you can really make critical decisions based on, like how you should treat somebody in a healthcare setting, for example, you really better have somebody who understands the domain – so in this case medicine – very well, and understands what the language that you’re going to encode that meaning in is capable of doing, and then further understands what the reasoning systems that are going to use that encoding can do with it. And that’s a couple of skills that not everybody has put together. So the ontologies see great success when people really understand what they can do. And then they start to see some disillusionment, when people understand how hard it is to get them really well done and very precise. So they’ve taken off, and they’ve kind of gone through the Gartner trough of disillusionment maybe a couple of times. And the reason, I believe, they’re on the upswing again is because, as the world knows, machine learning has exploded, and the datasets are getting larger and more accessible. The machine learning community and the extraction community and the embedding community are starting to realize that if they get a little bit of semantics, they can start to tell their algorithms how to use the meaning and get even better results.

Noshir Contractor: So most people have heard of things like tagging on the web, and in a sense, tagging is a form of ontology, but it’s a crowdsourced form of ontology, so it doesn’t have some of the rigor that you’re talking about.

Deborah McGuinness: Yes, so you can see efforts like ConceptNet. You know, in the very early days, MIT just said, “put a bunch of sentences together.” And so those sentences have words in them. And you didn’t have the connections between them, and you didn’t give people information about how to do it. But if you have even small synonyms, like automobile, or auto and car, we might call them synonymous, then you can make that link. You can start with just simple bits or small amounts of semantics from just making relationships between synonymous terms. But then you can also start to make more sophisticated relationships, like wines might have a color associated with them, and they might be made from a particular type of grape. And then, over time, when we’re trying to make more sophisticated recommendations, such as, say health advisors, you might start to have information about when your blood work is out of a range, say for a glucose measurement, which is related to diabetes. You might want to target an intervention with a drug that can help to get your blood work back into the right range.
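
The step from plain words to "small amounts of semantics" can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how ConceptNet or any real ontology is built: the synonym table, the relation names, and the helper functions are all hypothetical.

```python
from typing import List

# Simple semantics: a hand-written synonym table mapping terms to one
# canonical term, as in the auto / automobile / car example.
SYNONYMS = {"auto": "car", "automobile": "car"}

# Richer semantics: typed relationships stored as
# (subject, relation, object) triples. Relation names are assumptions.
RELATIONS = [
    ("Forman Cabernet Sauvignon", "has_color", "red"),
    ("Forman Cabernet Sauvignon", "made_from_grape", "Cabernet Sauvignon"),
]


def canonical(term: str) -> str:
    """Map a term to its canonical form, or return it lowercased."""
    lowered = term.lower()
    return SYNONYMS.get(lowered, lowered)


def mentions(query: str, sentence: str) -> bool:
    """True if the sentence contains the query term or one of its synonyms."""
    target = canonical(query)
    return any(canonical(word.strip(".,!?")) == target for word in sentence.split())


def objects(subject: str, relation: str) -> List[str]:
    """Follow one typed relationship outward from a subject."""
    return [o for s, r, o in RELATIONS if s == subject and r == relation]


print(mentions("car", "I parked my automobile outside."))
print(objects("Forman Cabernet Sauvignon", "has_color"))
```

A search for "car" now finds a sentence that only says "automobile", and the typed triples let a program ask what color a wine is rather than just whether a word appears.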

Noshir Contractor: You have actually been working for a long time, specifically, applying semantic web concepts in the area of health. Tell us a little bit about where things started in that area and where you think there is potential for further advancement in terms of health web science.

Deborah McGuinness: That’s really, I think, an up-and-coming area. One of my large projects right now is from the National Institute of Environmental Health Science. And it’s to create a data portal. And I’m in charge of the data science piece of that, where basically, I need to come up with the ontology or the terminology that allows us to integrate data. And in this case, it’s about exposure, like whether your mother might have been exposed to heavy metals at a time during your development, where that was not good for you. So it captures information about exposure and health outcomes. So that in itself, I think is critically important, because we can collect data, we can integrate it in the way that you could pool the data together and do studies on larger numbers of people, which might let you have more confidence in the outcome of any statistical correlation that you’re seeing. You’re only going to be able to do that integration and harmonization if you understand what terms mean. You typically get some data that comes with a data dictionary. So when I see education or ED1, that means that the mother went to junior high school as her highest level of education, which allows us to figure out whether I’ve got studies whose highest education level was college or beyond. And that lets me pull data together and look at more studies that might be compatible to put together. So that’s kind of step one. But then, the next step, the one that I think I’m even more excited about, is personalized health, and, you know, precision medicine, and where I can enable people to help themselves. I want to help everybody in some patient’s ecosystem. So I want to help the person to make wiser choices when they’re not going to their doctor. And I want to help the medical professional make suggestions that are aware of a person’s individual status. 
So if I’ve got some blood work and one of those numbers is out of range, we can see whether there’s some intervention that might be amenable to me, that I might be able to make a small lifestyle change before I start to make a medication change. But I think a future is something like a personal health knowledge graph. So a graph has nodes and arcs.

Noshir Contractor: What would be an example of that?

Deborah McGuinness: So I might have a personal knowledge graph about Deborah. So Deborah’s a node. And then she’s probably got a lot of arcs coming out. One of the arcs might be demographic information. We might have information about my age. We might have information about my location. And so all of those are going to be arcs that are going to have values. But then you might also feed in the information from the monitor that I wear on my wrist that captures my motion, my steps, and actually also my sleep score. And you might have information from my smart scale, for what I weighed this morning. And then you might actually also be able to track that over time.
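
A minimal Python sketch of such a graph follows, assuming a very simple design: one node per person, labeled arcs, and a time-stamped list of values on each arc so readings can be tracked over time. The class name, arc labels, dates, and all values are made up for illustration and do not describe any real system or person's data.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional, Tuple


class PersonalKnowledgeGraph:
    """Toy graph: (node, arc label) -> list of (date, value) readings."""

    def __init__(self) -> None:
        self._arcs: Dict[Tuple[str, str], List[Tuple[str, Any]]] = defaultdict(list)

    def add(self, node: str, arc: str, value: Any, date: str) -> None:
        """Attach a value to an arc leaving the node, keeping earlier values."""
        self._arcs[(node, arc)].append((date, value))

    def latest(self, node: str, arc: str) -> Optional[Any]:
        """Most recently added value on an arc, or None if the arc is empty."""
        readings = self._arcs.get((node, arc))
        return readings[-1][1] if readings else None

    def history(self, node: str, arc: str) -> List[Tuple[str, Any]]:
        """All readings on an arc, in the order they were added."""
        return list(self._arcs.get((node, arc), []))


graph = PersonalKnowledgeGraph()
# Placeholder values only; none of these come from the episode.
graph.add("Deborah", "location", "example city", "2021-06-01")
graph.add("Deborah", "steps", 7500, "2021-06-01")
graph.add("Deborah", "steps", 9100, "2021-06-02")
graph.add("Deborah", "sleep_score", 82, "2021-06-02")

print(graph.latest("Deborah", "steps"))
print(graph.history("Deborah", "steps"))
```

Keeping every reading rather than overwriting the last one is what makes the "track that over time" part possible.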

Noshir Contractor: Okay, so we’ve got the knowledge graph, I have an idea. How does this then translate into helping you leverage this personal health knowledge graph that you just described?

Deborah McGuinness: Yeah, so I want my personal health knowledge graph to appear to be locally available, you know, through probably my smartphone. And maybe I’m allowing it to give alerts. So I could actually also let it give me alerts, when I’m near a healthy venue, when it’s close to the time that I might eat when I’m away from home. I’m not aware of anything that does that today. But there’s probably some startup doing it somewhere.

Noshir Contractor: And so one of the ways it knows whether a place is healthy or not, is because they are using an ontology system, where they label themselves as “I am healthy.” Is that how this would work?

Deborah McGuinness: Another way of doing it is labeling the site with your menu items. A lot of sites these days have some kind of nutritional information about the things that they’re serving. So you could have a query that says, “Does this restaurant expose that it’s got items for sale that fit particular characteristics?” So let’s say under a certain number of grams of carbohydrates, maybe that have ethnic aspects, you know, maybe I want Indian food that meets those characteristics or something. So it’s not just that the restaurant says, “I put a label of ‘healthy’ on my restaurant,” but they expose information that lets the smart query ask the right kind of questions that are personalized to me.
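
The kind of personalized query described here might look like the following Python sketch. It assumes a restaurant exposes structured menu data with a cuisine label and carbohydrate grams per item; the data classes, field names, restaurant, and menu entries are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MenuItem:
    name: str
    cuisine: str
    carbs_g: float  # grams of carbohydrates per serving


@dataclass
class Restaurant:
    name: str
    menu: List[MenuItem]


def matching_items(
    restaurant: Restaurant, max_carbs_g: float, cuisine: Optional[str] = None
) -> List[str]:
    """Names of menu items under the carb limit, optionally for one cuisine."""
    return [
        item.name
        for item in restaurant.menu
        if item.carbs_g <= max_carbs_g
        and (cuisine is None or item.cuisine == cuisine)
    ]


# Invented example data, standing in for what a restaurant might publish.
example = Restaurant(
    name="Example Kitchen",
    menu=[
        MenuItem("Tandoori chicken", "Indian", 8.0),
        MenuItem("Vegetable biryani", "Indian", 60.0),
        MenuItem("Grilled salmon", "American", 2.0),
    ],
)

print(matching_items(example, max_carbs_g=20, cuisine="Indian"))
```

The restaurant never has to label itself "healthy"; it only publishes facts about its dishes, and each person's query supplies the thresholds that matter to them.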

Noshir Contractor: You mentioned a few minutes ago that you haven’t seen many of these applications out there. Why do you think we haven’t seen it? And why do you think now is the time that perhaps a startup is working in this area?

Deborah McGuinness: That’s a very good question. I think we’re poised to do that now, maybe better than, say, 10 years ago. There’s way more open data all over the world. I think it’s more common today that restaurants have this kind of information. And there’s also potentially more awareness. There’s more awareness that being overweight or metabolically unhealthy is a tremendous risk. It’s a tremendous risk for a lot of diseases. But it’s definitely a risk for COVID. 

Noshir Contractor: So you’ve given us a lot of food and drink for thought in talking about how web science is contributing to our health. You co-authored a book on this back in 2014. And as you look at that now, seven years moving forward, where do you see us going in the next few years in terms of health web science?

Deborah McGuinness: You know, I think we might have been a little bit early on health web science then. I think there was less acceptance in the medical community. I think these days, medical professionals, they have too big of a workload; they don’t have enough time. I think we’re starting to see a lot of apps or services that they’re beginning to trust because those apps or services are vetted, and they’re showing with evidence basis that they’re making good recommendations that the doctors can at least somewhat rely on. You don’t want your app to be taking over what your doctor did for you all the time. But you want that to be helping. And then at the same time, I think more and more we’re seeing the regular Joe looking for tools and applications that can help them lead a healthier, high-quality life. I don’t want to rely on having to go to my medical doctor every week, because, you know, nobody can afford that time or money wise. I want to be able to have tools that help me to do that in my day-to-day life. And you’re also seeing a push from technologists who realize that we’ve got a lot of foundational hardware, a lot of foundational data, what appears to be unlimited compute power. So the time is kind of ripe for these applications to take hold.

Noshir Contractor: Which brings us back full circle to the idea of web science being so interdisciplinary, because this is a classic example, as you’ve described it, of people like yourselves who come from a background in computer science and web science having to work very closely and gain the trust of, in this case, the professionals in the health community, as well as the laypeople, the general public. And until you have that sort of connections and trust amongst these various stakeholders, you’re not going to see health web science reach critical mass.

Deborah McGuinness: Yes, that’s exactly right. 

Noshir Contractor: And I can imagine that many doctors might be initially threatened or suspicious of these technologies, because, for example, there is quite a lot of chatter these days about whether the notion of you going for an annual physical checkup is somewhat antiquated. Why go once a year when you have all these health monitoring devices that are monitoring a lot of your vital statistics 24/7?

Deborah McGuinness: Well, I don’t think any of these tools are going to replace the need for a medically trained professional, I think they’re just going to augment the professional. I don’t think there’s really any replacement for a truly caring, trained medical professional seeing you at least now and then, and certainly helping in a time of crisis.

Noshir Contractor: I’m sure you have reassured many physicians who might be listening in on this podcast. Again, Deb, thank you so much for taking time to give us a lot of insight about how much more we can do in the area of health web science than maybe a few years ago when we were fascinated by websites like WebMD and so on. There’s so much more that we could be doing, and you have certainly been one of the thought leaders and visionaries in this area. And thank you again for taking time to talk with us about some of where you see health web science going.

Deborah McGuinness: And thank you very much for your insightful questions. And I look forward to continuing this discussion on the web and off.

Noshir Contractor: Absolutely. Untangling the Web is a production of the Web Science Trust. This episode was edited by Susanna Kemp. I am Noshir Contractor. You can find out more about our conversation today – while enjoying a highly recommended wine – in the show notes. Thanks for listening.

Episode 26 Transcript

Sandra González-Bailón: Even though there’s a lot of political organizations and a lot of politically motivated individuals who are trying to organize the next big thing to pursue their cause, it’s very difficult to predict social dynamics. I don’t think that’s depressing. I think that’s actually a reminder that nothing in social life is fully determined.

Noshir Contractor: Welcome to this episode of Untangling the Web, a podcast of the Web Science Trust. I am Noshir Contractor, and I’ll be your host today. On this podcast we bring in thought leaders to explore how the web is shaping society, and how society in turn is shaping the web.

My guest today is Sandra González-Bailón, who you just heard talking about the unpredictability of political mobilization and how we can study that on the web. Sandra is on the faculty at the Annenberg School for Communication and affiliated faculty at the Warren Center for Network and Data Sciences at the University of Pennsylvania. Her research lies at the intersection of network science, data mining, computational tools and political communication. Her articles have appeared in journals such as the Proceedings of the National Academy of Sciences, Nature, and Social Networks, among others. She is the author of Decoding the Social World, published by MIT Press in 2017, and was also the keynote speaker of the ACM Web Science Conference in 2019, in Boston. Welcome Sandra.

Sandra González-Bailón: Hi Nosh, thanks for the invitation to join! It’s a pleasure to be here.

Noshir Contractor: Thank you again for joining us today. Sandra, I’d like to believe that you are amongst the first of what I would call “bona-fide” web scientists, who began your career looking at the social world through the lens of the web. And I want to begin by asking you to help dissect the title of the book, Decoding the Social World, and the subtitle, Data Science and the Unintended Consequences of Communication.

Sandra González-Bailón: This idea of unintended consequences takes root in the fact that often in network systems, there’s no one who’s really in charge of the dynamics that take place in those networks. And so, you know, you might want to create a message that would go viral, but it’s really not up to you to allow that message to go viral. That depends on what other nodes in the network, other people in the network, will do. We can try to reverse engineer and unpack how those dynamics emerge and take place, and get a better sense as to how they happen, when no one really is designing those dynamics or in charge of determining how those dynamics emerge.

Noshir Contractor: Could you give an example from your own research where something that happened was not intended or not anticipated? Some collective behavior, for instance?

Sandra González-Bailón: A lot of my applied research in which I use social media data analyzes episodes of collective effervescence, right? Moments where the whole is more than the sum of the parts, when suddenly you have a critical mass of people who are communicating about a particular issue or a particular topic or organizing around a particular movement. And of course, we hear a lot about the episodes that are successful, the moments where those processes of collective effervescence result in massive mobilization on the streets, in massive protests. And what we often forget is that for every successful instance of massive mobilization, there are many examples of unsuccessful attempts at mobilizing that critical mass. And so even though on one level, those episodes of political mobilization are intended — of course, there’s a cause to fight for — what is unintended, is the level of success. You don’t always have control over how many people will be retweeting your hashtag, or how many people you will convince to mobilize and take to the streets. That’s how I understand unintended consequences. And it’s really a shorthand to refer to the lack of control that we often have on collective dynamics.

Noshir Contractor: Is the implication then, Sandra, that we can only explain some of these major events in retrospect? That we are incapable of engineering events such as these? That might sound a little depressing.

Sandra González-Bailón: It depends on how you think about it. Because to me, it’s the opposite of depressing. If we could anticipate and predict, it would mean that we live in a deterministic world, right? For me, unintended consequences offer really a window through which freedom and agency can squeeze in.

Even though there’s a lot of political organizations and a lot of politically motivated individuals who are trying to organize the next big thing to pursue their cause, it’s very difficult to predict social dynamics. I don’t think that’s depressing. I think that’s actually a reminder that we are not determined, and that nothing in social life is fully determined. And I think at the same time, there’s also value in trying to understand how these things happen.

Noshir Contractor: Let’s talk a little bit more about some other examples from your book, because you also look back at the historical developments in technologies. And I think that that’s really important, because we tend to be caught up in the moment and make it sound like what we are seeing today is truly different from what has happened in the past. In many ways it is, but what have your historical insights about technology taught us about how to better prepare to understand the web today?

Sandra González-Bailón: I start the book with a preamble where I explain that the book really is a story of recurrence and change, right? The recurrent aspect is that for some reason, we keep on using the same metaphors to refer to these technological breakthroughs, right? In the 1800s and 1900s, we talked about the telegraph as the global nervous system of the planet, right, which is exactly how we talk about the internet these days. And so some things don’t seem to change too much, right? Human imagination seems to keep traveling the same old cliches when it comes to coming up with metaphors. But what has changed, and definitely has changed a lot, is how we use the data that we generate through the use of those communication technologies to try to understand their impact on society better.

Like the fact that people suddenly could communicate across continents via the telegraph: it definitely shrank the world, it reduced social distance immensely. But they couldn’t get the sort of data we can get today. There’s a lot of progress when you look back. You know, with the sort of questions that we can answer now with that data, we have many more answers than they could aspire to have back then. The metaphors haven’t changed, but the answers to the same questions have improved immensely.

Noshir Contractor: One of your recent articles published in PNAS in 2020, focuses on how exposure to news grows less fragmented due to mobile access. What is it that you found in this article that surprised you?

Sandra González-Bailón: The findings in that article are counterintuitive in the sense that the prevalent view around how people get exposure to news suggests that technologies are entrapping classes of like-minded people. In the context of news consumption, this means that you would only consume those sources that will reinforce your opinions and your predispositions. And what we find in the article goes against those claims, in the sense that rather than narrowing down your news, digital technologies, and in particular mobile technologies, are widening the range of news sources that you consume.

Digital trace data is very rich, but there are very different ways in which we can collect that data. And so one of the things that we do in that paper is to incorporate mobile access to news into our analysis, which changes the sort of conclusions that you can draw compared to what you would find if you only tracked people through their desktop computers.

Noshir Contractor: One of the things that was interesting about this study was the kind of data that you use. Can you talk a little bit about the data, which was a five-year time window involving tens of thousands of panelists? How does that compare with pure digital trace data?

Sandra González-Bailón: Yeah, this is a data set that’s compiled by a media measurement company. And it does rely on log data, right. So they do have mechanisms to track the behavior of their panelists when they are online. And what’s a little bit novel is the fact that they also track what people do on their mobile devices. We all know, intuitively and through our own personal experience, that everything has gone more mobile now, right? And so from a scientific or measurement point of view, if we only track what happens on our computers, we are missing a huge part of online activities.

The other part is that, of course, the interface matters. The medium that you use as a content provider to deliver that content matters as well. Someone wrote a few years ago that the web was dead because of the rise of the app. So apps are like walled gardens, right? They are not this massive open space that the web is. The reality also is that online activities are more fragmented. One of the things that we also point at in the conclusion of the paper that you mentioned is that we are at a crossroads here, where we have to decide how we’re going to collect data moving forward, how we’re going to guarantee access to those walled gardens. Otherwise, even though we have more data than ever before, if only a handful of people can get access to that data, then there’s really not much difference, right?

Noshir Contractor: So there’s a paradox between the fact that there is a lot of data being generated, but a lot of that data is then not available for access to researchers like yourself to help reach these conclusions. I’m still puzzled by what seems to be a counterintuitive conclusion. 

You’re saying in the study that as people access news and other items through mobile devices, that we grow less fragmented, yet society seems to be increasing, at least in terms of popular media coverage, increasing in our fragmentation. How do you reconcile the findings of your research with what we are told in the public about us getting more and more fragmented? Does the public media have it wrong?

Sandra González-Bailón: Well, the public media often has it wrong, but in this particular case, it’s a matter of what’s the level of analysis. What the paper does is analyze exposure to news and political content, but we’re looking at the information sources that people get exposed to. We don’t really talk about the processing of that information. What are you going to do with that information? Because sometimes, you know, you may read Fox News, but that doesn’t mean that you are agreeing with Fox News. Exposure to content and information is one of those things that was very difficult to measure in the past. Now we have more fine-grained data to measure exposure, but it is just that: exposure to content.

Of course, the second part of the equation refers to the effects of that exposure, right? And so I think sometimes in the public discussions, we conflate a lot of things, right? We may be living in a very polarized society, but don’t blame digital technologies. What we see when we look at how people use those technologies to gain access to information is that their media diets are actually pretty rich and diverse. Now, of course, you know, that’s not the only reason why we might have fragmentation.

One of the beautiful, but also frustrating, things about research is that it is very specific to very specific questions, and then you can extrapolate only so much from that research, right? Reality is very complex; there are many moving parts. The conclusions that we draw from our analysis refer only to exposure to political content. It is not true that people get exposed only to a very narrow set of sources that they might agree with already, right? The number of sources you get exposed to widens over time, in part especially due to mobile technologies. Now, what the consequences of that are is a second question. And there might be another paper following up on that first paper where we consider that question.

Noshir Contractor: So to summarize, what I’m hearing you say is that we can’t blame the web, for not giving us wider exposure. What we do with the wider exposure is different in terms of polarization and creating echo chambers.

Sandra González-Bailón: Yes. And of course, again, the web is one network, one layer in a very complex media environment we inhabit these days. What happens within Facebook is a different world. Again, Facebook is a walled garden; maybe in Facebook we have this phenomenon of us getting trapped in echo chambers and ideological bubbles. Twitter is another layer, right? What happens on YouTube? These are kind of pockets outside of the web, very prominent pockets of activity. And maybe there, the answers would be different, right? What we analyze in that particular paper refers only to web activity, to what happens in this vast, public, open marketplace of ideas called the web, right? Apps are a different world.

Noshir Contractor: One of the things that I also want to talk about is an even more recent publication of yours, again in the Proceedings of the National Academy of Sciences, focusing on the role of bots versus verified accounts in contentious political events. Can you tell us a little bit about what you found in that study?

Sandra González-Bailón: Yeah, so that study was also motivated by journalistic accounts of how much influence bots have, these automated accounts that are engineered to meddle with organic communication in social media. Of course, the very first question you have to answer is, how are you going to identify bots, right? And so we capitalize on developments in automated classifiers that use a number of features to predict whether an account is a bot or not. And then we also look at the verified feature that Twitter itself uses to identify accounts of public interest.

And so we come up with three categories of Twitter accounts. We have what we call the media accounts: these are accounts that our classifier suggests are automated, but that have also been verified by the platform. Legitimate news sources oftentimes use bots to push notifications to your feed in a systematic fashion, right? And then we have the bots, which are accounts that our classifier suggests are automated, meaning non-human, but that are also not verified by the platform. And then we have the rest, which are what we call the human accounts.
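As an illustrative aside, the three categories described above can be written down as a small decision rule. The Python sketch below is not the study's code: the field names `bot_score` and `verified`, the 0.5 threshold, and the example accounts are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Account:
    handle: str
    bot_score: float  # hypothetical classifier output in [0, 1]
    verified: bool    # the platform's verified flag


def categorize(account: Account, threshold: float = 0.5) -> str:
    """Assign one of the three categories described in the interview.

    The threshold is an assumed cut-off, not a value from the paper.
    """
    automated = account.bot_score >= threshold
    if automated and account.verified:
        return "media"   # automated and verified
    if automated:
        return "bot"     # automated and not verified
    return "human"       # everything else


# Invented example accounts.
accounts = [
    Account("news_outlet", 0.9, True),
    Account("spam_amplifier", 0.8, False),
    Account("ordinary_user", 0.1, False),
]
labels = {a.handle: categorize(a) for a in accounts}
print(labels)
```

Under this rule a verified account that the classifier does not flag as automated falls into "the rest", which matches the description of the human category as whatever remains.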

One finding, which is consistent with what prior research suggests, is that a huge volume of the activity around the events that we analyze comes out of these automated accounts. But we also find that the verified media accounts are more central in the networks of information flow, meaning they were the reference points during these events. So people going on Twitter to find information about the protests that we analyze tend to pay more attention to these verified accounts and amplify those verified accounts more often.
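To make the distinction between volume and centrality concrete, here is a minimal Python sketch on invented toy data. It uses the number of distinct retweeters as a simple stand-in for centrality; the measures used in the study itself are not described in this conversation, so treat the choice of measure, the account names, and the numbers as assumptions.

```python
from collections import Counter

# Hypothetical toy data: one entry per tweet posted, by author.
posts = ["bot_a"] * 6 + ["media_b"] * 2 + ["user_c", "user_d", "user_e"]

# Hypothetical retweet events as (retweeter, original_author) pairs.
retweets = [
    ("user_c", "media_b"),
    ("user_d", "media_b"),
    ("user_e", "media_b"),
    ("user_c", "bot_a"),
]

# Volume: how many tweets each account posted.
volume = Counter(posts)

# Simple centrality proxy: how many distinct accounts retweeted each author.
retweeters = {}
for retweeter, author in retweets:
    retweeters.setdefault(author, set()).add(retweeter)
in_degree = {author: len(who) for author, who in retweeters.items()}

most_active = volume.most_common(1)[0][0]
most_central = max(in_degree, key=in_degree.get)
print(most_active, most_central)
```

In this toy network the unverified automated account posts the most, while the verified media account is the one other accounts amplify, which is the pattern the finding describes.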

Noshir Contractor: So of the two movements that you studied, one was the Yellow Vest movement in France in 2018, and the other was the act of civil disobedience in the Catalan referendum in 2017. Tell us a little bit about why you chose these two events, and give us examples of where a bot was more or less central than verified accounts in these two movements that you studied.

Sandra González-Bailón: The analysis that we run really pays attention to the overall patterns, the aggregated patterns, right? And this is partly to minimize cherry-picking episodes where the one account that turned out not to be a verified account got so many retweets. So we really look at the overall patterns of visibility.

And the reason why we decided to focus on these two events was partly because they got a lot of press coverage suggesting that these manipulated accounts were exacerbating conflict. And when you have episodes where people are protesting on the street, it’s highly volatile. And so if bots had that kind of influence, that would be something that we should know. It’s dangerous, right? Like, they could really turn things for the worse.

The other reason, and that was more in terms of research, is that a lot of the work that is done in this area focuses attention only on the US and/or the Anglo-Saxon world. That restricts the kind of knowledge that you can gain about how generalizable these dynamics are, but it also matters in terms of the methods and the tools that we have at our disposal, right? One of the things we do in the analysis is look at the sentiment, or kind of the emotional content, of these tweets. And many of these tools are designed for English only. So what we did in the paper is adapt one of these tools that allows us to extract sentiment. We adapted it to Spanish, Catalan and French. And I think there is value in doing that.

Noshir Contractor: So one of the things that I’m noticing, more broadly across these two papers that we’ve been discussing, is that in both of these cases, your findings are suggesting that the general sentiment of blaming the web is not always well-founded. Is that a sentiment that you see broadly, or does it just happen to be a coincidence based on the two papers that we’ve been discussing here? And what are the implications of your sense about the role of technology in these two cases?

Sandra González-Bailón: You echo the right spirit of those papers. I think there are a lot of things we could blame social media companies or technology companies for, but we have to blame them for the right things, not for what we think they’re doing wrong. So I’m a hardcore believer in evidence-based decision-making. So let’s pay attention to the evidence, so that we can think about how to redesign these platforms to make them do what we as a society want them to do. That discussion has to be based on the best available evidence, and technology companies need to be held accountable, because they do have an impact on the democratic process. But those discussions should be based on the best available evidence and not on headlines in newspapers.

Noshir Contractor: So I’m going to put you on the spot, Sandra, given that you are a scientist in this area and you pointed out that you would like to be able to influence change in media platforms. Do you have a pet suggestion that, based on your own research, you would like to see media companies adopt?

Sandra González-Bailón: Yeah, I think they do have a role to play in improving the quality of the information environment that we inhabit. They are not the only ones, right? Some of the old players have a responsibility too, and they don’t always do what they should be doing. And I do think that these companies have changed some important parameters. Algorithms, for example, the way they reinforce certain patterns, or the way they float certain information, that’s a new parameter. And we don’t fully understand how those algorithms are shaping everything, including the democratic process. This is not to say that we need to get rid of algorithms, but we have to understand the impact that those algorithms have.

These companies often operate on the basis of business models that prioritize certain parameters, that get optimized in these algorithms. And those parameters are not necessarily the quality of the online conversations or the quality of our democracy. I do believe that these platforms are not doing their best in facilitating or encouraging the sort of healthy conversations that healthy democracies require. And we can do better in that respect.

Noshir Contractor: And so the algorithms are optimizing for things, but not necessarily the things that are good for society. They might be optimizing for things that serve the interests of the platforms and a business model, as opposed to democracy, for example.

Sandra González-Bailón: Absolutely. And again, if we were to get together and start having a conversation about what’s good for democracy, I’m sure we wouldn’t agree, right? I think it’s also unfair that sometimes we impose demands on these companies that we as a society don’t have answers for. I do believe that we can settle on a common ground that has enough agreement to facilitate that kind of answer. And then once we have that answer, we can work, hopefully together, to make sure that we optimize for those parameters.

Noshir Contractor: I want to zoom out a little bit, Sandra, and ask you, where do you see the most important research that needs to be done in terms of the web, moving forward, either by yourself or by the web science community? What are the big questions that you see we need to be addressing?

Sandra González-Bailón: So one question relates to the impact of these technologies on democracy in general. And I think there’s been a lot of emphasis on data sharing, sort of forcing many of these companies to offer data. That’s important, but I think more important than having access to the data is defining the questions. It’s very difficult to come up with a data dump that’s got an answer to every question that any researcher may have.

One of the big challenges for us, and one of the priorities we should all be focusing on, is: what are the main questions that these companies, but also the research community, should be working to answer? And what are the kinds of data infrastructure and research infrastructure that we need to be able to answer those questions? Because this requires collaboration and creating bridges across labs and teamwork. And I’m not sure that academic institutions are designed to encourage that kind of teamwork. I think that’s one of the big challenges we face as well. We have to facilitate that kind of collaboration. And I think that’s where we should be putting all our energies.

Noshir Contractor: I want to thank you so much, Sandra, for joining us today. You’ve persuaded me about the lack of determinism in our models being a good sign for society. And you’ve also talked us through two examples in your own recent research where people tend to blame technology for certain kinds of phenomena, whether it’s contentious political events and bots or whether it’s increasing or decreasing polarization. And your research has convinced us that it’s one part of the puzzle, but that blaming technology is not, in and of itself, valid, at least in the case of these two studies that you’ve done. So thank you again so much for joining us, Sandra. And I look forward to seeing your continued research in these areas.

Sandra González-Bailón: Thank you, Nosh, and I look forward to meeting you in-person again soon.

Episode 25 Transcript

Nigel Shadbolt: The reason the web was taking off at scale, the reason we have these extraordinary constructs emerging, like the blogosphere, was that human beings were involved: human beings who were incentivized to participate, to share and join information together.

Noshir Contractor: Welcome to this very special 25th episode of Untangling the Web, a podcast of the Web Science Trust. I am Noshir Contractor and I will be your host today. On this podcast we bring in thought leaders to explore how the web is shaping society, and how society in turn is shaping the web.

My guest today is Professor Sir Nigel Shadbolt, one of the founders of web science. You just heard him talk about why studying the web goes beyond the technical. Nigel is Principal of Jesus College and Professorial Research Fellow in Computer Science at the University of Oxford. In 2009, he was appointed, along with Sir Tim Berners-Lee, as information adviser to the UK Government. This work led to the release of many thousands of public sector data sets as open data. He is the chairman and cofounder of the Open Data Institute, and a founder and chief technology officer of the ID protection company Garlik. He is a fellow of both the Royal Academy of Engineering and the British Computer Society, and was knighted in 2013 for services to science and engineering. Nigel has researched and published on topics ranging from cognitive psychology to computational neuroscience and the Semantic Web. Welcome, Nigel.

Nigel Shadbolt: Thanks, it’s great to be here.

Noshir Contractor: Take us back to what prompted you and your colleagues to come up with the idea of taking what was then a relatively young web, and recognizing the importance of creating a discipline called web science. 

Nigel Shadbolt: I’d begun my career a long time ago; my PhD was in artificial intelligence at the University of Edinburgh in the 1980s. And then I’d spent 15 years building an AI group within a department of psychology. I really found that extraordinarily enriching, you know, to understand the basis of human cognition, to understand, if you like, the basis of the existence proof of intelligent systems. And toward the end of my time at Nottingham, I got a series of PhD students who had been looking at this new explosive area of the web. And the web appeared as this extraordinary construct that suddenly brought data at scale together. So that was a turn for me.

And when I moved down to Southampton and joined Wendy Hall’s group, we were very much seeing the web as a decentralized data asset, as well as an extraordinary construct for combining human ingenuity. We can come to that later.

That project that we worked on together, through Directed Word, was called Advanced Knowledge Technologies (the AKT project), and it brought leading universities in the UK together to look at how we could exploit this emerging construct, the web, and tools from knowledge engineering and elsewhere, machine learning as it was then, to try and understand data and knowledge at scale. And that brought me into contact with Tim Berners-Lee. It was our interest around the Semantic Web that really brought us together, but we were also in contact with people like Jim Hendler, who I’d known earlier because he also had prior history in artificial intelligence, and Danny Weitzner. So we were sat there sharing ideas around the Semantic Web and the challenges therein. And the more we got into that, the more we felt there was an itch that needed to be scratched, which was all around this idea that too often the challenge became reduced to one simply of technical architectures, where of course, in fact, the reason the web was taking off at scale, the reason we have these extraordinary constructs emerging, like the blogosphere, was that human beings were involved: human beings who were incentivized to participate, to share and join information together. And as we shared our experiences (Danny with a background in law, Jim with a background in AI and cognitive science, somewhat like myself, Tim and Wendy), we realized that there were all sorts of aspects of what we were trying to understand in the web that would never be solved by simply appealing to the technical standards of web servers or web browsers.

So this immediately suggested a wider interdisciplinary need. And we had always been interested in convening larger groups to discuss these wider issues of the impact of the web. And we struggled for some time to think about whether this needed to be convened at all, or whether it would just simply take care of itself.

It is certainly the case that we were aware that there were cognate disciplines. But the unique phenomenon that the web presented us with was, for the first time, structures at scale that demanded to be explained and understood in their own terms. And we started with examples like the emergence of search engines at scale, like Google, the emergence of the blogosphere, the emergence of the beginnings of those social media platforms, the emergence of large collaborative activities like e-science. And I think we sat down and worked out that we wanted to persuade people that there were scientific questions that sat at the center of this intersection of methods that demanded their own singular attention.

Noshir Contractor: One of the things that you touched on briefly was the Semantic Web. For those who may not be familiar with that phrase, how does the Semantic Web distinguish itself from the web itself? We know the web as a set of web pages in its most primitive form that link to one another. But Semantic Web goes beyond that.

Nigel Shadbolt: It originates out of this really interesting idea, that if you could take some key ideas that were around in artificial intelligence and knowledge representation at the time, and distribute it at scale in the web, there was this notion that a little semantics went a long way. 

Now, what did that mean? It meant that, for the first time with the web protocols, we had ways of persistently pointing to objects of interest in the web, either concepts or relations, things in the world, things in cyberspace; we can argue about what those objects were. But they could be referenced, they could be dereferenced with a URL: you click on a link, you get something back. So how could you think of the web as a linked graph of connected structure and content? We’re very used to thinking in those terms now, but back then, we didn’t have this way of thinking. So one of the early efforts was to generate a semantic markup language that went beyond, for example, what people understood at the time in HTML. And so the idea was to develop languages or ontologies that could be machine processed to describe the content, the semantics, the meaning of the content on the web.

Noshir Contractor: So can you give us a use case example? If a page has semantic markup, what could it do differently, or more effectively, than a page that simply has HTML?

Nigel Shadbolt: The Semantic Web of the early 2000s was a really rich place. So it wasn’t widely distributed enough. But there were, for example, ways of linking to academic texts. In fact, again, we see the legacy in the way that bibliographic content is linked and threaded nowadays: there are controlled vocabularies for publications, there are controlled vocabularies for certain sorts of work we do, and we as researchers have our own identifiers, which describe something about the world we inhabit. And the vision was to try to do that much more at scale. So, you know, these experiments, these deployments still exist.

I would treat the web as a kind of distributed database. And I could send queries out to the web to find and collect information about all the conferences, the academic conferences in a particular subdomain, who were the key speakers. And that could be interrogated directly off the markup languages and the databases representing the markup languages in those pages.
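To give a feel for what querying markup "as a distributed database" means, here is a toy Python sketch. It stores statements as subject, predicate, object triples and answers the conference question from the passage above by pattern matching. Real Semantic Web deployments typically use RDF data and a query language such as SPARQL; this in-memory version, and every identifier in it (`conf:ExampleWebConf`, `ex:keynoteSpeaker`, `person:alice`, and so on), is invented purely for illustration.

```python
# Each statement is a (subject, predicate, object) triple.
# All identifiers below are hypothetical.
triples = [
    ("conf:ExampleWebConf", "rdf:type", "ex:Conference"),
    ("conf:ExampleWebConf", "ex:subdomain", "web science"),
    ("conf:ExampleWebConf", "ex:keynoteSpeaker", "person:alice"),
    ("conf:ExampleDataConf", "rdf:type", "ex:Conference"),
    ("conf:ExampleDataConf", "ex:subdomain", "databases"),
    ("conf:ExampleDataConf", "ex:keynoteSpeaker", "person:bob"),
]


def match(pattern, store):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in store
        if all(p is None or p == value for p, value in zip(pattern, t))
    ]


def keynote_speakers(subdomain, store):
    """Find the keynote speakers of all conferences in a given subdomain."""
    conferences = {s for s, _, _ in match((None, "ex:subdomain", subdomain), store)}
    return sorted(
        o for s, _, o in match((None, "ex:keynoteSpeaker", None), store)
        if s in conferences
    )


speakers = keynote_speakers("web science", triples)
print(speakers)
```

The point of the sketch is that the question is answered from the structured statements themselves rather than by scraping page text, which is what semantic markup on each page would make possible.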

Noshir Contractor: All of a sudden, I feel that what I’m able to get from the web today pales in comparison to what you’re describing, we could be getting from the web if you’re using these kinds of semantic mark-up languages.

Nigel Shadbolt: (Laughs). I think many of us wondered, yeah. If it had become widely distributed, and it’s about network effects, you could get really powerful affordances. And for a while, sites that were originally marked up using Semantic Web standards were extremely successful. The BBC, for example, ran its Olympics using this markup format, and it ran its natural history programs with a whole set of Semantic Web annotations that allowed you to literally query the content behind the web pages behind their great natural history programs.

I mean, some people think that there are important elements that have persisted, but not the full-blown inference at scale across the web. I think one of the things that got in the way is that the perfect was sometimes the enemy of the good, and the standards that were being promoted spent far too much time worrying about detailed niceties. The original web succeeded because, in a way, it allowed things to be a bit scruffy around the edges. You know, there’s that great phrase: to let the web scale, let the links fail. Pragmatism is always an important feature, I think, in understanding the different forces at work on a web at scale.

Noshir Contractor: You’ve just talked about one example of a challenge. Where in general, in the field of web science, have you seen progress? And where do you see continuing challenges for the next decade of web science?

Nigel Shadbolt: So I think one extraordinarily powerful area is the whole understanding of the network structure and topology of the web. And I remember in the original article, an article that Tim and I published in Scientific American, and then again when we published in the Communications of the ACM, we knew a use case would be understanding how to extract insight from the web graph. I think that’s been a tremendously powerful success. I would also say that the push for a certain sort of openness around the underpinning data that was the resource was one of the key elements of the Semantic Web. It was no accident, in a way, that a number of us who were involved in that Semantic Web effort also became involved in the open data movement, because the key to success at scale has been open resources that everybody can exploit. And the greatest example of that in the earliest days of the web was effectively the Google phenomenon. You know, Google became the extraordinary organization it is off the back of open data and crowdsourced effort: humans making links that expressed their interest and relevance in content.

Noshir Contractor: You spoke about this example of how humans were crowdsourcing the development of links across the web. That is one of the early examples I imagine, of what you have written about in terms of social machines. Tell us what social machines mean to you and what you have been doing and what you’ve learned from that in terms of the web?

Nigel Shadbolt: When we launched the web science initiative back in 2006, when we were kind of thinking about that as an enterprise, we were very aware that the confluence of challenges, the deep synergies that existed between disciplines, was something that we needed to understand. But of course, it was always understood that the web worked because it connected people at scale. And people are extraordinary information processors. There, of course, we have all the richness of our own humanity connected at scale. And when Tim wrote his book, Weaving the Web, he made a reference in that book to the concept of the social machine as being a world in which the machines did all the kind of routine boring stuff and that allowed humans to flourish. The truth, of course, decades later, is somewhat more complex. Some people worry that it’s the people being given the boring tasks to do. Why aren’t those things fully automated by our machines?

But what we do see in a social machine is this intermix of data assets, linkage, algorithms at scale on the web, and human cognitive capacity, and that interpenetration of machines and human problem-solving at scale defines a social machine. Now, some of them are realized very simply. So the social media platforms, which link people together (and largely, it’s the linking and sharing of experience and moments that define them), when they began had very little in terms of fundamental processing of the content of those interactions. As time has gone on, the amount of inference that can be drawn over our interactions, the amount of additional services that can be woven into a web at scale, from query answering to speech recognition through to photo recognition: there is so much now that machines are doing to organize and manage our own information that the social machine construct is very helpful.

It reminds us that really quite complex phenomena are made up of components of quite simple interactions, you know, likes and preferences, linkages, assemblages, aggregations. And in what we’ve done in the past, me and my team and others, in defining how to provide a classification, a taxonomy of different kinds of social machines, and how to understand their characteristics, we see a spectrum from highly routine automated forms, as which various forms of citizen science will count, through to much more creative, exploratory tools.

Noshir Contractor: What would be one of your favorite social machines that most people might not have heard of?

Nigel Shadbolt: Well, I don’t know about one that people wouldn’t have heard of, but one that I admired from the outset was a particular crowdsourcing platform, Galaxy Zoo. These were astronomers who didn’t have enough fundamental research funds to spend on software engineers to build them automatic classification and recognition software. And they had all of this data coming in from the sky surveys, endless numbers of pictures of nebulae and stars, etc., and not enough machine processing to classify it.

And what they did in that work was provide a platform that allowed people to participate, and train them and induce them to be able to classify objects of interest. The human recognition and visual system is extremely powerful at categorizing and recognizing subtle distinctions, and still very often best in class at recognizing differences and equivalences.

And so we had millions of images being processed by hundreds of thousands of volunteers, who ripped through this and began to make actually individual discoveries as well. Famously, participants in this citizen science effort are featured as authors on scientific papers of newly discovered astronomical phenomena. And that’s a lovely example where, again, what started out as a necessity for the scientists became a valuable resource in and of itself.

It introduced a wide community to the challenges of astronomy, and it solved the problem for them. And along the way, they learned something really interesting, which is: allow the people to participate. Because originally they had allowed a great deal of exchange or chat between users of the platform. Through time they realized that, in the side-channel conversations, some of these volunteers were becoming experts in their own right at forms of exotic phenomena that the astronomers were either too busy for or hadn’t noticed themselves.

Noshir Contractor: Which brings us to some of your more recent research. You made reference to the fact that social machines started out as computers doing the boring part and allowing the humans to focus on the more creative part. And you immediately gave a caveat that some people think that that might have flipped. In the new work that you’ve been doing, which is called human-centric AI, to what extent are you concerned that it will stay human-centric?

Nigel Shadbolt: If we go back to our concerns, right back when we began the web science effort, it was very much from the outset about recognizing the intrinsic value and worth of the human element, you know, in all of this. And in an age of a resurgence of AI and algorithmic decision-making, with new powerful methods being deployed, the concern is, do we retain and confer the values that matter?

We want, essentially, to imagine building systems which augment us and don’t oppress us. And I think that’s why you see what some people call the renaissance of ethics in scientific areas where the concern is the maintenance of human values, and certainly AI ethics has a huge amount of attendant work around it.

And in my group, that materializes as concerns around choice: do we as individual consumers and citizens actually have effective choice when it comes to how our data is used? How our data is actually analyzed and aggregated? How can I effectively opt out? How can I exert more self-determination? That’d just be one example. The second would be, do we think hard enough about age-appropriate design, about how, as humans develop and grow up, their sensibilities and their ability for agency change? And with persuasive design methods, which are all about clever software engineers working out how to hit the sweet spot to get the kid to click, we’ve got to think hard about the pros and cons of all of that.

Noshir Contractor: You know, one of the things that has changed since you first began to study web science, is the ubiquity of data. Initially, people were using it on desktops and creating a certain small category of websites and so on. How has the ubiquity of data and its impact on the use of AI changed the way we think of the web and web science?

Nigel Shadbolt: I think it’s a fundamental change. I think you’re absolutely right. In a way, it was the game changer for AI. I think AI, as a field, got the web quite late. We were busy building knowledge-based systems, we were busy very much with this kind of desktop or, at best, server-based model of our knowledge assets. And then suddenly, connectivity was out of the box.

And it used to be a very significant effort to arrange and integrate your data assets together. And data curation was a huge challenge. Suddenly, we have billions of pages of English text to analyze, billions of images, and so on. So that’s been a game changer.

Of course, it’s introduced new classes of concern, which is that, at scale, modern algorithms are extremely data-hungry. And do we know enough about the characteristics of data to understand that the outputs (the classifiers, the decision makers) are giving results that are representative? Well, they may be representative of the data that’s been collected. But is that data, even though it’s at large scale, representative of the problem you want to solve? And so I think we’ve become much more aware in data science of the need for understanding the qualities and characteristics of good and effective datasets for training.

There’s considerable concern now around the data assets themselves: how can we guarantee that they have not in some form or other been tampered with? How can we authenticate them? And my work with the Open Data Institute is very much now around things like data assurance; we talk about data institutions, new ways of putting governance structures in place. And again, it’s not just the technical; we need technical architects to deliver the web at scale. But we also need institutional architectures to make sure that data is held and governed ethically and responsibly.

Noshir Contractor: Nigel, you were there at the very onset of both the web and web science, certainly. Where do you see the field of web science going? Unlike some other fields, the early stages of web science were nurtured by the Web Science Trust, by the Web Science Research Institute before that, and that set up a trajectory that was somewhat different from perhaps the launch of other disciplines. How do you see that has shaped web science? And where do you see it going now?

Nigel Shadbolt: At the time, we felt we wanted to use the convening power we had to draw attention to urgent research questions. In some sense, the questions were sufficiently urgent that they were going to get attended to. We thought it was important that there was a framework and in fact, in trying to work out how we should be as broad-minded as possible when it came to methods and methodology and techniques, we spent quite a lot of time convening groups from other disciplines together, we spent a lot of time imagining what curricula could be that weren’t just about network science, for example, they went broader than that. 

So the question is, how do you stop it being about everything? You know, how do you provide a practical, pragmatic solution? And I think that the problems that we sought to understand are still problems that we are seeking to understand. We have a better understanding, but I wouldn’t say we have perfect understanding. I remember that particular meeting, we tried to imagine what the grand challenges for our field were and with the ubiquity of data and the power of computing, and the developments in cognate disciplines, and just the sheer amount of development that has been in network based analysis and graph based databases, for example, knowledge graph work, has provided for a real acceleration of work in that field. And I think we could take a number of areas and say, that similarly, was a topic that web science called out, people were contributing to it, and it has succeeded through time, developed and matured. It doesn’t worry me too much that people necessarily say, “I want to put the label web science on this.” We often see in the development of subjects that field labels come and go. And indeed, what remains are the questions and the methods that have been put in place. 

And I think what we would still argue is fundamentally important to web science is to be inclusive and admit and embrace diversity of disciplinary work. For me, the danger signs are always when people begin to patrol the boundaries of their discipline in a way that becomes exclusionary.

Noshir Contractor: Well, Nigel, you’ve been an incredible champion of this interdisciplinary work. And I suppose it comes somewhat easy to you, given that you yourself have had an interdisciplinary background, you’ve been interested in computer science in AI and philosophy and psychology. And so it makes sense, Nigel, that someone with your own interdisciplinary background would be championing for exactly that in the field of web science. And we’ve all been the beneficiaries of that. So thank you, again, Nigel, for what you’ve done to help advance web science. And certainly thank you very much for joining us today to share some of your ideas and your concerns.

Nigel Shadbolt: It’s been a great pleasure, Noshir — very, very good to talk with you.

Episode 24 Transcript

Azeem Azhar: Research is often blamed for being a bit slow moving, (jokingly) “You know, I’ve been wondering about this topic for 17 years.” Well, not in this case. And it was just over a month later that Moderna produced the first vials of its vaccine, 31 days later, after the sequence was initially released. And that is really, really remarkable ... hundreds and 1000s of people. 

Noshir Contractor: Welcome to this episode of Untangling the Web, a podcast of the Web Science Trust. I am Noshir Contractor and I will be your host today. On this podcast we bring in thought leaders to explore how the web is shaping society, and how society in turn is shaping the web.

Today, my guest is Azeem Azhar, an entrepreneur, investor and author. You just heard him talk about how exponential technology enabled rapid vaccine development and distribution. He’s the founder of Exponential View, a podcast and newsletter that explores the political economy of the exponential age, reaching an audience of more than 200,000 around the world. He’s also an active startup investor with investments in AI, work from home and climate change. He’s on the board of the Ada Lovelace Institute, and sits on the World Economic Forum’s Global Futures Council on Digital Economy and Society. Previously, he founded PeerIndex, a big data analytics firm acquired in 2015. His first book, “The Exponential Age: How Accelerating Technology is Transforming Business, Politics and Society,” was just published this month and was a featured book at this year’s ACM Web Science 2021 conference. Welcome, Azeem.

Azeem Azhar: Noshir, it’s wonderful to be with you.

Noshir Contractor: I’m so delighted to have you on the show, because you have been at the start of a lot of the things that will develop on the web. And I would love if you could start by talking to us about the role that you played — take us back to where things were at that time.

Azeem Azhar: Oh, I mean, it was just an absolutely amazing time. I first accessed the internet in 1991, through a green screen terminal at university. It just opened my mind — the idea that I could talk to anyone anywhere in the world, and there was a real innocence and intent about how people spoke about issues. They shared a lot of material. There was no sense of there being ownership in a strange way. It was a real sort of commons of contribution. I graduated university completely unable to get a job — 53 job rejections, until eventually, The Guardian, who had rejected me for several jobs, asked me to come in and help on a little event they were holding in an art gallery. And the event was a web event, and I showed up and the help they needed was setting up the modems to connect to the internet. And that’s where we were in those days. It was very, very primordial. But if I think about what The Guardian in particular was willing to do and how they were willing to experiment in 1994, it’s pretty remarkable. I mean, it’s pretty far sighted to say, well, we should try and play around with the web. 

Noshir Contractor: You mentioned in your book that you were amongst the first people to join social networking websites like 6 Degrees, like Friendster and MySpace. Tell us why you got interested in it. And what attracted you to those websites at those very early stages?

Azeem Azhar: Well, you know, I had already fallen down the hole that Tim Berners Lee, and then previous to him people like Jon Postel and John Licklider and then Leonard Kleinrock had created. And I was never going to climb out of that hole, as it were, there was too much to discover. And so you learned quite quickly, early on, if you’re an early internet user, that the internet was really about people. So, when these first websites that allowed you to connect with each other emerged, it was a really natural space. And of course, the challenges they had with this social network were that computers were slow, and you wondered what the purpose of it was, because not everyone was on the internet. It wasn’t really your friends — it was just a bunch of people who happened to have discovered the service. But the power of being able to connect people together was really visible at that time.

Noshir Contractor: And we’ve come such a long way from those early forays. And you describe this journey as being an exponential change. Tell us a little bit about what you mean by that word, exponential in the context of exponential change and the emergence of the exponential age.

Azeem Azhar: An exponential change, you know, mathematically, is essentially any change of a constant proportion. So it’s compound interest. And what I define as an exponential technology is a technology that improves at a 10%, or higher rate every year for the same cost over many, many decades. And the consequence of that with those key technologies is that prices declined very, very rapidly. 

As prices decline rapidly, elementary economics tells us that we’ll use more of this stuff. So as computing prices dropped, courtesy of the exponential decline in the cost of computing power, we used much more computing, and I mean billions, 10s of billions, hundreds of billions of times more computing. And because of that, what economics tells us is that complementary businesses emerge. There are things that you couldn’t do with this technology that you now can do and businesses and services emerge on top of them. So from Moore’s law and silicon chips, we got cheap computers. From cheap computers, we got a web that could connect everybody. 

But the thing that I found fascinating as I unpacked this question is the impact of this declining price: it’s not just that things got cheaper, we use them more frequently, we might use them in more areas. But that exponential reality transmitted up to products and services that are quite far removed from the underlying technology. So Facebook was the first product to reach 3 billion users. Many of us don’t think about Moore’s law when we use Facebook, but that’s why it got there. And we’ve just heard in the last few weeks, that TikTok is now the most downloaded app in the world, which didn’t even exist when I really started to think about the book. So this idea of exponential reality is that it weaves through from the kind of core technologies all the way through to the products that get built on them. And then the services and entrepreneurs and the market respond. So the technologies and the products demand very, very fast growth rates. And that requires rapid deployment of capital. And so this venture capital industry springs up around to fund these companies very, very quickly. And the thing feeds in on itself. So that’s what I mean by exponential technologies. And the exponential age is this notion that this pattern of accelerating change is becoming widely commonplace across our political economies. And I date that inflection point at some point between 2011 and 2014.
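
The "10% or higher rate every year" definition in the answer above is compound growth, and the arithmetic can be sketched in a few lines. This is a minimal illustration, not something from the episode: the function names `improvement_factor` and `relative_cost` are hypothetical, and the 10% figure is simply the threshold quoted in the conversation.

```python
def improvement_factor(annual_rate: float, years: int) -> float:
    """Cumulative improvement after compounding `annual_rate` for `years` years."""
    return (1.0 + annual_rate) ** years


def relative_cost(annual_rate: float, years: int) -> float:
    """Cost of a fixed amount of capability, relative to its cost in year 0."""
    return 1.0 / improvement_factor(annual_rate, years)


if __name__ == "__main__":
    # 0.10 is the threshold rate quoted in the conversation; the horizons are illustrative.
    for years in (10, 20, 30):
        factor = improvement_factor(0.10, years)
        cost = relative_cost(0.10, years)
        print(f"{years} years: {factor:.1f}x better, cost falls to {cost:.1%} of year 0")
```

At the 10% threshold the improvement is roughly 2.6x after one decade and about 17x after three, which is why the effect only looks dramatic over "many, many decades" or at rates well above the threshold.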

Noshir Contractor: And then what do you mean by the exponential gap in this context, but as you point out, that exponential age comes with an exponential gap?

Azeem Azhar: The technologies and the businesses that are built on them, and the people who can take advantage of them, improve exponentially, and they create new potentials, and new potentials that we perhaps don’t have words for. But we as humans live within societies that are regulated by habits, by norms, by conventions, by formal institutions, and by informal institutions. And largely, those institutions change incrementally at a linear pace. And so there is a gap that emerges, between the acceleration going upwards and this linear trajectory. And I think the exponential gap explains why we have a common pattern of a sense of friction, of division emerging about how we think of some of the fundaments of society in the political economy.
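
One way to picture the gap described above is to compare a compounding curve with a straight line that starts at the same level. The sketch below is an illustration under stated assumptions, not a model from the book: the function names are hypothetical, and both the 10% compounding rate and the 0.10-per-year linear step are arbitrary choices made so the two curves agree after one year.

```python
def exponential_level(rate: float, years: int) -> float:
    """Capability that compounds at `rate` per year, starting from 1.0."""
    return (1.0 + rate) ** years


def linear_level(step: float, years: int) -> float:
    """Capacity that grows by a fixed `step` per year, starting from 1.0."""
    return 1.0 + step * years


def gap(rate: float, step: float, years: int) -> float:
    """Difference between the compounding curve and the straight line."""
    return exponential_level(rate, years) - linear_level(step, years)


if __name__ == "__main__":
    # Assumed parameters: 10% compounding versus a linear step of 0.10 per year.
    for years in (1, 10, 20, 30):
        print(f"year {years}: gap = {gap(0.10, 0.10, years):.2f}")
```

With these assumed numbers the two curves coincide after one year and then diverge, and the distance between them keeps widening rather than settling at a fixed size.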

Noshir Contractor: What would be examples of our ability to try to address the exponential gap?

Azeem Azhar: I’ll give you one example. If we look at companies. Traditionally, the way that we economists have thought about companies and regulators have thought about companies is that companies benefit from increasing returns to scale, and at some point, get to some diminishing marginal return. And that diminishing marginal return is like a force of gravity to hold a company to a certain size. The other force of gravity was that industrial inputs progressively got more expensive. The 1,000,000th kilogram of iron ore that you extract costs a lot more than the first kilogram of iron ore. And those things would slow down companies’ abilities to grow very, very big. Now, courtesy of essentially web based technologies and databases, we start to see companies being able to break that force of gravity, and they do so in two ways. The first is that a lot of companies now benefit from network effects. While the millionth kilogram of iron ore is more expensive to extract than the first, with a network effect business, the millionth customer adds value to all the previous 999,999. That’s a phone network. That’s Facebook, that’s Twitter. There are other types of network effects that emerge in this AI world that relate to our data network effects. So we increasingly rely on machine learning and algorithms to derive value in businesses. The data network effect means that the more people who publish web pages, the more people who search on Google and click or don’t click on results, the more information Google has about what good looks like and no competitive entrant to the market, however hard they try, can get that insight. And with every cycle, every click, every search we do, Google gets better. And its economic moat gets deeper and wider. And so those two things fundamentally change how we need to think about companies. In the 20th century, if a company had 70% market share, you can bet your bottom dollar the CEO had done something dodgy. They had bought up all of the silver, they had fiddled something with the regulators, they’d colluded with a competitor. In the exponential age, companies just get to 70% market share, because that’s where network effects take them. 

And so then the question is not so much whether these companies are nefarious, or their bosses are good or bad. They may be, they may not be, it’s that the physics of exponential age companies is very different to the physics of an industrial age company. And that is the exponential gap.
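
The network-effect arithmetic in the answer above can be made concrete by counting potential connections. This is a minimal sketch under an assumption the speaker does not state: that value scales with the number of possible pairs of customers, a common simplification often associated with Metcalfe's law. The function names are hypothetical.

```python
def potential_connections(customers: int) -> int:
    """Number of distinct pairs among `customers` members of a network."""
    return customers * (customers - 1) // 2


def connections_added_by(nth_customer: int) -> int:
    """New pairs created when the nth customer joins: one with each earlier member."""
    return nth_customer - 1


if __name__ == "__main__":
    for n in (10, 1_000, 1_000_000):
        print(
            f"customer #{n:,} adds {connections_added_by(n):,} connections; "
            f"network total = {potential_connections(n):,}"
        )
```

Under this assumption each new customer adds more potential connections than the one before, which is the opposite of the iron ore case, where each additional kilogram is harder to extract than the last.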

Noshir Contractor: And this, of course, raises issues of ethical dilemmas that might come along with these rapid growths. And you founded PeerIndex, a big data analytics firm that was then acquired in 2015. And you talk in the book about how that experience in some ways shaped your thinking about the exponential age. 

Azeem Azhar: There was a standard that sort of evolved in the early 2000s, called FOAF, friend of a friend. And the idea was that you could use that standard as a way of keeping records of who you know, and what the nature of that relationship was. So there was some semantic depth to it. And I really fell in love with that idea. I built a FOAF browser, in a blogging platform that I was running in 2003, 2002. And I had fallen in love with network science, and the fact that you could learn a lot about a group of people through their relationships without necessarily knowing who they were. 

And by 2007, 2008, it was clear — Twitter had more than a million users, Facebook had more than 10 million — people were going to get addresses on the internet, they were going to be connected to other people. 

And at the time, these networks were all open. And so I thought, wouldn’t it be really interesting if we could mine and interrogate and analyze and construct analytics in order to help people discover the richness of other people more easily. So the initial idea behind PeerIndex was to help answer questions like, tell me who knows something about sushi in Chicago, or help me find someone who knows something about shin splints in London, by being able to look at the pattern of what people are posting on Facebook and Twitter and so on. But we could also then say to you, “Look, this is how you will be seen by systems.” And you can now look at the impact of what you say and do. And we could do that because Facebook and Twitter and these other networks were all open at the time.

Noshir Contractor: That sounds absolutely fantastic. What could possibly go wrong with it? Why are there troubling aspects? Because that sounds like an ability for us to globally know who knows who, who knows what, who knows who knows who knows what.

Azeem Azhar: It’s amazing and actually in this funny way, it’s the heart of the problem. The big issue I think ends up being partly around consent. We used a model of implied consent, which is you can always make your Twitter feed private. And you can always ask us not to be indexed, but leave your things public.

And then there’s the issue of the kinds of things that you can infer about people on the basis of their behavior. We didn’t do this, but we could predict many, many types of personal classes and behaviors. And I think that that’s also problematic. We battled with some of those questions. And in the end, the initial idea that we could provide this as a consumer product for consumers to use didn’t really work out. And what worked out was a marketing analytics product that brands wanted to use to understand audiences. What was quite interesting about moving to the brands was, they didn’t care about individuals, they cared about averages and aggregates. So actually, all those problems went away. But it led to the next issue. Once you understand that you can affect people’s behavior by tweaking aspects of an algorithm or giving them a score, you actually have some kind of power over them. And that is not power to which they have consented, or that they have any way of challenging. 

Noshir Contractor: One of the things you mentioned, as you described the development of PeerIndex, was that at the time these platforms were open. You were able to get the data from there, even if you were implying consent on the part of the users, it was still available. Since then, as you know, platforms like Facebook don’t make that data available any longer. Why do you think that is? Do you think that they are trying to internally monetize the kind of PeerIndex vision that you had? 

Azeem Azhar: I think they do it for exactly that reason, which is that the data is in the core heartland of their network effect. So not only does it drive their monetization, because it’s the data that drives the ad targeting. But the second issue is that, once you as a network make your user data entirely visible, I don’t have to be part of the network in order to access your network. And so people forget that there was a product called FriendFeed, and FriendFeed aggregated Twitter and Facebook and a bunch of other things. So in a single panel, that was not run by these companies, you could look at all of your social networks in one place, you wouldn’t see the adverts because those were not in the content feeds. And you could message back into those networks. And that weakens the network effect, which is ultimately the source of these companies’ scale. The data policies of the networks were changing very, very rapidly, and they were being tightened. I think the thing that would have been frustrating for me was that the honest reason for why they were being tightened, which was this is for our strategic long-term benefit from Facebook or from Twitter, was never the one that was presented, right, the one that was presented was, we want to provide users like you or I with a consistent user experience. And if you can access Facebook or Twitter from some third party application, they might not get a consistent user experience. So I think that the real argument, simply, it’s a business reason, “we wanted all — this is our pie.”

Noshir Contractor: One of the chapters in your book talks about the world being “spiky.” As you mentioned, this was obviously a play on Thomas Friedman’s 2005 “World is Flat.” And even before that William Gibson talked about the future being here, but it was not evenly distributed. How does your use of the word spiky build on or differentiate from those approaches?

Azeem Azhar: The key idea of the world being flat was this notion that there’s an equalizing force around technology tied to a particular type of economic paradigm, that if people adhere to those rules, then things would be better for everyone. And what I think has started to happen, and what we will see because of these technologies, is that, in fact, the local rather than the global will end up being economically and socially more desirable in many, many contexts.

But there’s another part of it, which is that if you randomly form a network, you get these nodes that have got more connections to them, you get agglomeration in a random network. But in a network where people are going to move for economic or emotional or cultural reasons, you are going to see even more agglomeration, because you’re going to see intent as to where people will go. And I think as the world moves to a more complex, advanced economic position, that kind of agglomeration will continue. So my view about the world is that while we will maintain global relationships, and we need to maintain a sense of global governance to certain types of problems — many relate to the web, many relate to things like climate change — we will also start to see increasing spikes emerge and some of the assumptions that were really of that neoliberal era unpick.

Noshir Contractor: So Azeem, I wanted to take us to the present, while you were writing the book, the world was confronted with the pandemic. Clearly, there are two aspects of exponential change. On the one hand, the spread of the disease, but to me more interestingly, in terms of development of the vaccine, and then in the getting people vaccinated, also represent exponential change. I was particularly intrigued by your description of a particular website, Virological.

Azeem Azhar: Virological is a sort of GitHub for virus scientists. And very early in 2020, on the sixth of January, an Australian University virologist put a very simple statement on Virological — This is a website that typically gets a few dozen visitors a month. And he simply said, Look, the Shanghai Public Health Clinical Center is releasing a Coronavirus genome from a case of respiratory disease from the Wuhan outbreak. The sequence has been deposited on GenBank and will be released as soon as possible. Now GenBank is a code repository for sequences run by the National Institutes of Health, and people flocked to it.

Within a matter of days, hundreds of researchers are looking at this genome, because it’s new and it’s interesting. And we’ve not really got cases outside of China by this point. But what I found fascinating, is that research is often blamed for being a bit slow moving, you know, “I’ve been wondering about this chapter for 17 years,” well, not in this case. And it was just over a month later that Moderna produced the first vials of its vaccine. 31 days later, after the sequence was initially released. And that is really, really remarkable. What’s remarkable is not just that we could sequence the virus so easily. And that’s as a consequence of another exponential technology, which is genome sequencing. But then courtesy of the web, which is another exponential technology, we were able to get it out to, you know, hundreds and 1000s of people. And then the techniques that Moderna used, many of which relied on a machine learning based system to help manage data, discover data and look for patterns, were also applications of exponential technologies. And so you end up within 12 months of the virus being identified, we had seven different vaccines that had been approved, and 24 million people had received their first shot of the vaccine. A large part of just being able to do this and coordinate people to deliver and then receive the vaccine is entirely dependent on computers and databases and smartphones.

Noshir Contractor: One of the things you talk about in the exponential thesis is that there was a change, an exponential change both in the amount of things being invented, and the ways in which they get scaled. Tell us a little bit about the difference that you see between the exponential change in invention versus scaling up?

Azeem Azhar: There are larger markets to go after. And it’s cheaper to do this invention than it ever has been. One core idea that I talked about is the idea of combinations, the fact that technologies from different domains can combine and they’re reliant on there being open standards and modularity.

On the other hand, the question is, why can we then adopt them so much more quickly? And I think this is where part of the thesis is a bit complex, right? It relates to the fact that there are global networks of information and global networks of distribution. And I think back to the first iPhone, which was launched in 2007. And it was available in one store in San Francisco, just off Union Square. And when the iPhone 12 was released, it was available in 300 cities around the world on the same day. And that is a testament to being able to coordinate and deliver these products all over the place at the same time. And I think that that’s one of the interesting wrappings of the book and my argument, which is that the exponential age isn’t just about a process where silicon chips get faster and faster and faster. It’s that that speed, that acceleration, has a way of echoing through other parts of industry, and then butting in quite quickly into the rest of our lives.

Noshir Contractor: That teases well for my closing question. We spent some time talking about the pace with which the exponential age is upon us. Will it ever stop?

Azeem Azhar: Well, I think in the timeframe that I’m thinking about in the book, which is, you know, decades, it will continue. I think we were scratching the surface, there are still incredible breakthroughs that are happening. And even things that happened while I was writing the book: I talk in the book about a Romanian company called UiPath. And when I wrote the first draft, UiPath was one of the fastest growing software companies in Europe and was worth a billion dollars. By the second draft, I’ve had to write that up to 7 billion, by the third, it was past 10. And just as we’re going to print, I had to quickly go in and change that number to $35 billion. So it’s a 35x increase in the valuation in the year or so that I went from first draft to going into print. So I think it does continue. There’s a more metaphysical, I suppose, question, maybe it’s a physics question, which is, can it continue forever? I mean, physicists will tell you that, ultimately, there are a limited number of atoms in the universe. And there’s sort of issues of their complexity and what can an atom really support. So I’m sure there could be some physical limit to all of this. But that is the subject of a book that will have to be written by somebody else.

Noshir Contractor: Very good then. But speaking of your book, I really enjoyed it. And I would recommend it very much — the title of the book, The Exponential Age, how accelerating technology is transforming business, politics and society. Azeem, thank you so much for taking time to talk with us today about the exponential change that we are witnessing and in particular, being able to tie in many cases to topics of interest to those who are following the web and in web science in particular. Thank you so much again. 

Azeem Azhar: Thank you, Noshir. Really appreciate it. 

Episode 23 Transcript

Rory Cellan-Jones: In the early stages, quite inexperienced and solo-bedroom developers, they were called, could make a big impact. And Edward Bentley, age 16 was one of them. He was this friend of my son’s, lived about a mile away. He developed this game, put it on the App Store. And one evening, the phone rang at the family home, and his father got a phone call from Apple on the West Coast saying, “Mr. Bentley, your app is being made App of the Week. And you’re going to need to open a bank account here for all the 1000s of dollars that you’re going to earn.” And he was mystified. Turned out his son had put his dad’s name against this because he was too young to be officially the owner of the app.

Noshir Contractor: Welcome to this episode of Untangling the Web, a podcast of the Web Science Trust. I am Noshir Contractor and I’ll be your host today. On this podcast we bring in thought leaders to explore how the web is shaping society, and how society in turn is shaping the web.

You just heard from my guest today, Rory Cellan-Jones, talking about how the introduction of app stores to smartphones produced an enormous amount of creativity on the web and cemented the social smartphone era. Rory has been a reporter for the BBC for 40 years, covering business and technology stories for much of that time. At the beginning of 2007, he was appointed technology correspondent to expand BBC coverage of the impact of the internet on business and society. His first big story was the unveiling of the iPhone by Steve Jobs, something that we will be talking about today. He now covers technology for television, radio and the BBC website. And in 2014, he began presenting a new weekly program, Tech Tent, on the BBC World Service, a personal favorite of mine. He’s just published a new book, titled “Always On: Hope and Fear in the Social Smartphone Era,” and he spoke about that book at the recent ACM Web Science conference. Welcome, Rory. 

Rory Cellan-Jones: Good to be here.

Noshir Contractor: I want to start again by thanking you for all the incredible coverage and storytelling and weaving that you have done over the years as a technology correspondent. And most recently, in the book that I have enjoyed reading, titled “Always On: Hope and Fear in the Social Smartphone Era.” I want to start by trying to punctuate how you define the social smartphone era.

Rory Cellan-Jones: Well, my job as technology correspondent at the BBC started, as you say, in January 2007, and the first big story I covered was Steve Jobs unveiling the iPhone in San Francisco, which was an extraordinary event and an extraordinary performance by a brilliant, charismatic and very difficult man. 

I made a big bet, really, on that event. I responded to complaints to the BBC, that we were plugging a product on our nightly news program with my story, by saying, ‘Well, I think this could end up being a Henry Ford Model T Ford moment,’ which I thought at the time, maybe I went over the top there. But I think it’s proved to be correct. It was the moment that smartphones really became mainstream from then on. There had been smartphones, but they’d been clunky, difficult devices. The iPhone transformed all that and brought mobile computing to the masses. But about the same time, all sorts of things were happening all together, if you think about those years. 2004, Facebook was created. 2005, YouTube was created. 2006, Twitter came along, and then 2007, the iPhone. And what you had, quite quickly, was not just these incredibly powerful devices in everybody’s pockets, but these extraordinarily powerful social networks. And my sort of thesis is those two combined to have an extraordinary impact on the way we lived.

Noshir Contractor: So back in 1991, Sir Tim Berners Lee had released into the wild the World Wide Web. But as you said, coming up before 2007, we had the release of platforms like Facebook, and YouTube and Twitter. But the central piece, as I understand, is that all of this changed dramatically when all of these events, activities on the web now became possible on mobile. 

Rory Cellan-Jones: Let’s think of what Sir Tim said about the web when it came along. He talked about it as a ‘read-write web.’ So I grew up in the age of television, the great mass medium that was this sort of big box in the corner of the room, which you did not interact with. The web was supposed to be an interactive medium. You know, we were all supposed to participate in it, build it, create it, and so on. And that did happen a little bit at the beginning, but not a vast amount. Don’t forget that for millions, billions of people, there was no access to the web because they didn’t own a computer. It was a fixed-line experience. It was an experience largely confined to the office and the home. So the arrival of smartphones, and the connectivity they provided, was a mass democratizing force along with those social networks, and we can discuss later the negatives as well as the positives. But what that unleashed was not just a democratization of the web, but a huge wave of creativity, of content being created by these extraordinary devices.

I started in television in 1981, a very long time ago. I could no more have thought of creating my own television program all on my own than I could have thought of landing on the moon. But these devices enabled anyone to create quite sophisticated content, which we see done today on YouTube, and so on. There are lots of problems from that. But that did begin to really bring Tim Berners-Lee’s original vision to life in a very full way.

Noshir Contractor: One of the things that you chronicle in the book is that at the time there was a battle between what you call the bell heads, and the net heads, the people who came from the phone systems and those who came from the computer systems. 

Rory Cellan-Jones: Yeah. That culminated, obviously, in the arrival of the iPhone. Up until then, don’t forget, the mobile phone industry had been around since the mid-80s. But it was telecoms people; it wasn’t so much software people. And from 2007 onwards, we know who the big victors in this battle were. They were Apple and Google. I mean, Google, obviously, is in some ways much more important, and Android is on, you know, 80% of the world’s phones. So it was a triumph of software and apps over just the pure sort of telecoms engineering types.

Noshir Contractor: And alongside the fact that people are now using the mobile platform to engage with Facebook and YouTube and Twitter. One of the early applications also in the mobile space was mobile payments, which curiously got its innovation and start in Kenya.

Rory Cellan-Jones: I mean, what’s fascinating about the whole mobile payments world is that it really took off through M-Pesa, which allowed people in Kenya to transfer money easily, and not between smartphones but between very basic phones. It was far ahead of anything that happened, for instance, in the United States. And actually, the United States, even compared to Europe, is way behind and has always been way behind in the payments area. Checks, I gather, are still quite big in the United States, whereas I’ve not written a check for years. So I think it’s all about need. There was a need in places like Kenya that there wasn’t quite in the United States. You had, you know, obviously a reasonably sophisticated payment system in the United States. You didn’t have that in Kenya, but they managed to leapfrog to being ahead.

Noshir Contractor: Alongside mobile payments, another area that also became, quote unquote, a killer app on mobile platforms was gaming. You talk about a very simple app developed by a 16-year-old, Edward Bentley, called The Impossible Game.

Rory Cellan-Jones: That’s a fun story. I mean, one of the things to remember is that although the iPhone was, of course, extraordinarily important when it came out in 2007, firstly, it was quite a primitive device. It only had 2G. And secondly, it only had the apps that Apple put on it. And actually the arrival of Apple’s App Store the following year, and then Google’s Play Store, was what really cemented this revolution.

Steve Jobs was the ultimate control freak. He didn’t really like the idea of people putting any old software on his phone, but was persuaded eventually to open this app store. And that, you know, sparked this extraordinary wave of, A, creativity and, B, economic activity. And in the early stages, quite inexperienced solo developers, bedroom developers, they were called, could make a big impact. And Edward Bentley, aged 16, was one of them. He was a friend of my son’s and lived about a mile away. He developed this game and put it on the App Store. And one evening, the phone rang at the family home and his father got a phone call from Apple on the West Coast saying, ‘Mr. Bentley, your app is being made App of the Week. And you’re going to need to open a bank account here for all the thousands of dollars that you’re going to earn.’ And he was mystified. His son had put his dad’s name against this because he was too young to be officially the owner of the app. But of course, Mr. Bentley senior was very pleased with the money that rolled in.

Noshir Contractor: And I’m sure if it was Americans who were paying it, they were sending it by checks at that time. 

Rory Cellan-Jones: Yes, yes. (Laughs).

Noshir Contractor: Of course, on a more serious note, you also talked about the sudden surge of messages that Biz Stone, cofounder of Twitter, began to get about a country that he hadn’t heard of called Moldova.

Rory Cellan-Jones: I talk in the book, obviously, about the positives and the negatives of social media. And we were incredibly optimistic, around 2011, 2012, about the impact social media was having. Biz Stone talked to me about the moment that really struck him, when he was suddenly being told that his company, Twitter, was helping to sort of foment a revolt in Moldova against the authorities. But more so the Arab Spring. I mean, don’t forget that Facebook was given a lot of credit. So that was the time when social media, obviously only made possible by smartphones, was an incredibly democratizing force. That’s what we felt back then.

Noshir Contractor: One of the things that you also point out is that the smartphone gave a big boost to AI, because it all of a sudden made a lot more data available that could be used by AI.

Rory Cellan-Jones: It was a sort of two-way relationship. AI helped, you know, make a lot of the things you do on a smartphone much more sophisticated. But the big breakthroughs in AI in the last decade in particular have been fed by vast amounts of data. And don’t forget, one of the huge changes the smartphone has brought is in the way photography works, in the sheer volume of pictures we’re taking. Computers being taught to, you know, recognize, for instance, the difference between a dog and a cat was one of the great triumphs of AI over the last decade. The sheer volume of data from all these billions of smartphones, and these people taking pictures of everything they saw, was one of the things that helped fuel that advance.

Noshir Contractor: And one of the things, though, that has now fermented is a fear of what can happen with all of these data. You did an interview with Stephen Hawking, where he famously said that AI will make humans obsolete.

Rory Cellan-Jones: Yeah, this was an extraordinary interview. And this was quite early in the big conversations that we’ve had about AI. It was 2014. The way an interview with Stephen Hawking worked is that you had to send off the questions in advance. And he would then write the replies, and then eventually you’d record it. And I sent off half a dozen questions about this, that and the other, with a final question about ‘Oh, what do you think about AI?’ And that produced his extraordinary answer, which basically said that if full AI was developed, it would be smarter than human beings and would therefore see no real use for us, and we would become obsolete. And it was such an extraordinary statement that I got very excited. And then I realized that this wouldn’t be news until he actually said it. And he got ill for a while. And it was about another six weeks before we could actually record the interview, where he pressed a button on his computer and his answer came out and rocketed around the world and helped to spark the ongoing debate we’re having about the ethics of AI.

Noshir Contractor: And in this he was joined by other comments. Elon Musk talked about AI being the biggest existential threat. Dame Wendy Hall, who you and I know from the web science community, was quoted as saying that AI might evolve faster than us and we might end up being slaves of the machine.

Rory Cellan-Jones: On the other hand, there was a certain amount of backlash when the Stephen Hawking interview came out, from people who were actual practitioners, who thought he was probably worrying about the wrong thing. And I think as years have gone by, we’ve all begun to think maybe we shouldn’t be worried about the kind of Terminator-style vision he was painting. There are far more imminent and immediate concerns about AI, things like bias being built in.

And there was an interesting postscript to that interview in the book. One of the great figures in AI, certainly in this country, is Demis Hassabis, the founder of DeepMind, which is now owned by Google. He told me that he’d gone and seen Stephen Hawking some months after my interview. Hassabis felt that he put Hawking’s mind at rest to some degree by explaining how far away that kind of vision of artificial general intelligence was.

Noshir Contractor: One of the things that I want to turn to was your comment and your interview with Sir Tim Berners-Lee, who, of course, in this very momentous Olympic moment in July of 2012, sent out that message, ‘This is for everyone,’ and caught a lot of journalists and the general public by surprise.

Rory Cellan-Jones: That was particularly the American NBC commentators, who said, ‘Who is this guy?’ And someone said, ‘Maybe you should google him.’ And of course, the point is that Google would not really have happened but for Tim.

I put that point as the high point of optimism about this area. I was actually there in the Olympic Stadium in London when that opening ceremony was happening. It was an amazing evening. And we did feel incredibly positive about all these developments: about the web, about mobile economic activity, about social media. And I’ve interviewed Tim Berners-Lee over the years, and in the last two or three years, his mood has darkened so much about what has been done with his creation. And he told me that what had really woken him up was the Cambridge Analytica affair, and the way that he saw the web being used for malign purposes: persecution of minorities, manipulation of elections, and so on. And he said that for years he hadn’t worried when people said there were all sorts of bad things on the web. He said, ‘I mix with the people that I want to talk to on the web. They’re all really interesting. I just don’t interact with those people doing that bad stuff.’ And then, he said, he came to realize that those people doing the bad stuff on the web, as he put it, ‘these people vote.’ In other words, they can determine the future of my country and other countries. And therefore, I need to worry if they are being manipulated in malign ways.

Noshir Contractor: Given that the smartphone has created such incredible opportunities for surveillance, we now have to think about ways in which we can be made aware of, and have full control over, who is surveilling us and how we are being surveilled. You spoke with James Chappell, cofounder of Digital Shadows, a company that is trying to help us in this enterprise. And he is quoted saying something we’ve heard many times: if you’re not paying for something, you are the product.

Rory Cellan-Jones: Yeah, I took my phone to him just so that he could explain what was inside it and how it was tracking me. He took me through just how many different sensors there are in a modern smartphone, and how many different radio systems (I think he counted about five), providing this huge flow of data, which we have been providing, until recently, very unconsciously, to advertisers. And of course, it’s advertising money that fuels the modern web and is the source of the huge power that companies like Facebook and Google have. For instance, every time you sign up to an app, you are very often signing up to being tracked wherever you go on the web. That’s why that pair of shoes that you happened to look at yesterday keeps following you around on the web. And we have begun to have the debate about whether we’re comfortable with that. And it’s a difficult debate because we get something from it, i.e. we get free services. Google, Facebook, and whatever are free to us. But in return, we are consenting to being tracked. And of course, in the last few months, Apple has brought in this new system whereby you’re asked if you want to be tracked, and of course that is changing the balance of power on the internet. There’s a debate to be had about whether Apple’s motives are as pure as it says, because it’s not really in the advertising business. But in any case, we are beginning to have that debate, because the other thing that’s been very prominent, just in recent weeks, is the use of these devices for surveillance by governments and cyber criminals. We’ve seen that story with the Israeli company providing software which can effectively turn your iPhone on, turn the camera on, turn the microphone on, and spy on you incredibly effectively. And that’s obviously a big challenge for the mobile phone industry and a big, big concern for all of us.

Noshir Contractor: That is indeed a very scary story. On the potentially positive aspects of monitoring, a lot of the ways in which the smartphone also has the ability to monitor us is in terms of health related issues. 

Rory Cellan-Jones: Health tech has become a real interest for me. And it’s been a global interest, obviously, in the last year during the pandemic: what role could smartphones play? For instance, there have been a number of contact tracing apps developed. But for me, personally, the interest has been that I was diagnosed with Parkinson’s a couple of years ago. And so I’ve taken an interest in what the technology can do to help me, and there’s quite a lot of work going on. It’s more in monitoring, rather than treating, the condition at the moment, because the condition is quite difficult to monitor. If you’re a patient like me, you see your specialist once every four to six months, and they say, ‘How’s it gone?’ And you think, ‘Kind of okay, too hard to tell, really.’ But I’ve been taking part in a trial that uses sensors. The hope is to develop a smartwatch, or a more sophisticated version of the Apple Watch, perhaps, that would measure your symptoms on an ongoing basis. And, for instance, it would be able to tell whether, an hour after you take a pill, which I take, you know, four times a day, your symptoms have improved or not. So there’s a lot of work going on in that area, and health tech generally is a huge, exciting and potentially life-changing area.

Noshir Contractor: And of course, talking about Parkinson’s disease and the fact that you’ve been so open about it, you narrate in the book the story about your conversation with then producer Priya Patel, who first sort of prompted you into putting a tweet out about this to share this news with the world.

Rory Cellan-Jones: I was doing a live broadcast about 5G one day on breakfast television. I’d been diagnosed some time earlier. I was talking about it, and my hand was shaking quite violently. I didn’t realize at the time. But then this great producer said to me, ‘Have you ever thought about going public, because that was pretty obvious?’ And I said, ‘Yeah.’ And I just sent out a tweet. And it showed the positive side of social media, because within milliseconds, it felt like, I was getting huge amounts of response, and, you know, very warm and helpful and positive responses. That was great. There was one person out of thousands who said, ‘Oh, you’ve been standing too close to a 5G mast. That’s why you got Parkinson’s,’ but I was able to ignore that.

Noshir Contractor: Unfortunately, the world was not able to ignore that during the pandemic, when once again the 5G virus theory was raised, especially in the UK.

Rory Cellan-Jones: I’ve spent a lot of the last year covering that. And one of the things that makes me actually quite angry is how much nonsense is talked generally about technology in that area, and the ridiculous rumors about some connection with the virus. I mean, there’s a spectrum. There are people who, you know, justifiably, I suppose, have concerns in general about the impact of mobile technology and whether it’s causing them harm. I don’t believe it is causing them harm, but a lot of them genuinely do. But then the spectrum goes way over into these wild conspiracy theories that say it was only because 5G was switched on in Wuhan that the virus started, or that it’s making people more vulnerable to the virus. And as a non-scientist, I’m very humble in front of science. So I trust the science. And I listen to the majority scientific opinion, just as I do on climate change. And the majority scientific opinion says this technology is not harmful.

Noshir Contractor: On a potentially more hopeful note, you discuss what the pandemic would have looked like if, instead of being COVID-19, it was COVID-05.

Rory Cellan-Jones: I think of how I’ve got through this pandemic, which has obviously been difficult, and how many, many millions of others have got through it. And if this had happened in 2005, it would have been just about impossible for me to carry on working from home. I use smartphones and great, fast connectivity, which I didn’t have in 2005, to do my work. It would have been very difficult to shop as effectively from home. It’s not just that there’s better connectivity. It’s that the arrival of the mobile phone kind of supercharged the online economy. It made things like, for instance, home delivery of food more economic and gave them scale. So without those kinds of facilities, those sorts of services that have come in the smartphone era, it would have been pretty challenging.

Noshir Contractor: There is a sweet irony though, that even though we talk about the pandemic as being in lockdown, we are still seeing the value of the mobile phone. Because even while we are locked down, a lot of the services that we rely on are indeed mobile services, or mobile-enabled services.

Rory Cellan-Jones: That’s a very good point. Yeah, I mean, all of those delivery services, all of those couriers, they’re all powered by apps. It’s the app economy that has really come of age during the pandemic.

Noshir Contractor: Thank you again so much for taking us through the hopes and fears around how the smartphone has helped us navigate and leverage the web in ways that we could not have imagined before its launch. I also want to thank you for all your incredible coverage, making technological progress accessible to people around the world. As I said, I’m personally a big fan of the show on the BBC World Service. And I certainly recommend your book to anyone who wants to get caught up quickly on the history, in terms of both the hopes and the fears, of the smartphone. Thank you again for joining us today.

Rory Cellan-Jones: Thank you. It’s been a lot of fun.

Episode 22 Transcript

Pablo Boczkowski: When I think of how my daughters access the world of information, how, for instance, they do homework with three screens, not two, so they have the computer screen for work, they have their phone next to the computer screen where they are monitoring Snapchat, and they have the TV, where they’re bingewatching their favorite show, and all of that at the same time. So in order to understand their world, and how much their affect and their sociality really resides on the screen,

Noshir Contractor: Welcome to this episode of Untangling the Web, a podcast of the Web Science Trust. I am Noshir Contractor and I will be your host today. On this podcast we bring in thought leaders to explore how the web is shaping society and how society in turn is shaping the web.

Our guest today is Pablo Boczkowski. You just heard him talking about how his experience as a parent impacts how he thinks about his own research. Pablo is Hamad Bin Khalifa Al-Thani Professor in the Department of Communication Studies at Northwestern University, as well as the founder and director of the Center for Latinx Digital Media, where he hosts his very own podcast titled El Café Latinx. He’s also the cofounder and the co-director of the Center for the Study of Media and Society in Argentina, and has been a senior research fellow at the Weizenbaum Institute for the Networked Society in Berlin, Germany. He’s the author of six books, four edited volumes and over 40 journal articles. Three of his books are being published in 2021: Abundance: On the Experience of Living in a World of Information Plenty, published by Oxford University Press; The Digital Environment: How We Live, Learn, Work, Play and Socialize Now, with Eugenia Mitchelstein, at MIT Press; and The Journalism Manifesto, with Barbie Zelizer and Chris Anderson, published by Polity. His work was featured at a Meet the Authors session at the 2021 ACM Web Science conference. Welcome, Pablo.

Pablo Boczkowski: Thank you very much, Noshir. It’s a pleasure to be here. Thank you for having me.

Noshir Contractor: You’ve had an incredibly productive 2021. And I’m not even sure which of these three books to start with. But let’s start by talking about the one that I know has gotten a lot of attention. And it’s the book titled Abundance. Tell us a little bit about what got you interested in this particular title, and why you chose to title the book abundance.

Pablo Boczkowski: Well, Abundance is a book that came out in May of this year, but it was in the works since March of 2016. Abundance draws from two major sources of information. The most important one is 21 months of fieldwork in Argentina, mostly in Buenos Aires and its suburbs, and also in several provinces, amounting to 158 interviews. And about a third into the field research we conducted an in-person survey with a nationally representative sample of people to get a better sense of the lay of the land, some of the larger structural issues.

The research project started generally as an exploration of the interrelationships between the consumption of news, entertainment, and digital technology, in particular mobile devices and social media platforms. And in December 2017, we ended the fieldwork. About six months after that I was in Buenos Aires. I was traveling with my eldest daughter, and we were walking down this avenue, which is the main artery in Buenos Aires, a very populous city of about 4 million people. And I saw an image that really stuck with me. It was an image that is sadly quite familiar in many large metropolises around the world. There were two people ostensibly living on the street. This was maybe 7 or 8 p.m., so it was dark already; it was winter time. They were sitting next to each other on a couple of worn-out chairs. They were surrounded by cardboard boxes turned upside down, as if they were somehow demarcating their semi-private space, you know, within the public space. They were facing the street with their possessions tucked away between their backs and the walls of a building. And they had sort of an improvised dinner table that essentially was, as far as I could see, a large cardboard box turned upside down. They were having dinner; they were eating from a plastic container. And they had a can of Coke next to it.

And all of this was very familiar, sadly. What really caught my attention was that one of them was holding a mobile phone that they were both looking at while they were eating. So there was that tiny bit of light emanating from the phone. And it was a little bit, if you wish, a popularized 21st century version of people eating in front of the TV as the iconic media moment of the 1960s, right?

So the reason why I mention the story is that I had been wrestling with many themes from the fieldwork and many findings from the survey. But what that image did for me was make coalesce what ended up being the main topic of the book, which was the contrast that, even in an extreme situation of material scarcity, they were connected to a world of abundant information.

Noshir Contractor: Well, the story is an extremely powerful and evocative way of capturing the central thesis of your book, in terms of seeing material scarcity coexisting simultaneously with the abundance of information in the digital environment. A lot of the time, what you’re describing as abundance is equated with terms like information overload or data smog. How do you distinguish what you’re doing as being perhaps more celebratory than words like information overload that make it sound more negative or pejorative?

Pablo Boczkowski: That’s an excellent question. There is a long tradition of thought dealing with this notion of information overload as the umbrella term. All the work about information overload has a few characteristics that cut across most of the scholarship. One of them is the idea that there is an optimum amount of information, and that after you reach that threshold, it sort of is an inverted U shape in which you start getting diminishing returns. Another is the idea that the information is used to make decisions about which there is a right and a wrong. The notion of information overload tends to focus on the cognitive side of the human experience, and much less on other dimensions of experience. If you look at the terabytes and terabytes and terabytes of information that are consumed today, most of this information is not consumed on a daily basis by most people to make decisions. It is consumed to entertain themselves on Netflix, to learn about others on social media and to express themselves on social media. And there is really no optimum. So what is the optimum number of episodes of your favorite thriller that you should watch on the day of release of a new season? How many hours should you spend on social media? It’s the same question as how many hours you should spend socializing with your friends at the park. There is no real optimum there. It is situationally dependent.

And when you consume that information, you’re not only processing it cognitively, you’re living it emotionally and relationally. So information overload has this discourse of deficit associated with it. I wanted to move away from that. So the notion of abundance is much more sort of agnostic with regards to valuation. The idea is that whether something is positive or negative depends on the situation and the values that people assign to it, that there is no right or wrong answer to most of the uses of information, and that it’s not only about the cognitive, but also the emotional and the interpersonal.

Noshir Contractor: Well, I think you just absolved a lot of people who have been feeling guilty about the amount of time they have been doing doomscrolling and bingewatching. And now all of a sudden, they’re gonna feel good about the fact that they are doing exactly what you’re calling for. And that is wallowing in the abundance of information that they receive on the web.

Pablo Boczkowski: I have a funny story. My favorite show on Netflix is Money Heist. When season number three came out, I told everybody, ‘The day it comes out, I’m not leaving my apartment.’ People laughed, but I didn’t feel guilty at all.

Noshir Contractor: That basically says that you live up to your research in your own practices as well. Good for you! One of the things that is a recurring theme, both in this book and in other work that you’ve done, is the observation that a lot of people who are looking at the study of media in general, and web science in particular, tend to focus on what is happening in the global north. You have made a very concerted effort and a very passionate plea for broadening that stage to include the global south, and in the case of Abundance, specifically Argentina. Why Argentina?

Pablo Boczkowski: There are three factors that I think make Argentina a very, very suitable national setting for the questions that I posed in the book. The first one has to do with the issue of material scarcity. You know, when we talk about the global north, we talk about 14% of the global population, between 13 and 14%. So it’s a minority in statistical terms, and these are countries with much more prosperous economic conditions. And they tend to be countries with more stable political situations. Now, most of the world is not like that. The other 86% registers much less prosperous economic conditions, much more income inequality and inequality in terms of social capital, access to opportunities, etc., and political environments and social institutions in general that are weaker and more uncertain. So Argentina sits a little bit in the middle. It is a middle-income country by World Bank standards. It has a long and very sad history of political and economic instability. During the 20th century, it had the recurrent cycle of democratic governments being interrupted by dictatorial regimes.

I mean, with a lot of the discussion of information overload, etc., most of the research has been done in the global north, and it assumes access to material resources; it takes that for granted. Looking at Argentina shows that you cannot really take for granted access to the wealth of information. So, for people to be able to access WhatsApp (and WhatsApp is by far the most popular platform in the country, more than Facebook), people have to make much more of an effort, you know, to get a smartphone, relative to their income, than they would have to make in a country like Norway, or Germany, or Canada, or the US or the UK. So it shows, you know, how much people care about this world of information, and the lengths to which they’re willing to go in order to access it.

And therefore, it puts the issue of inequality in a different light. The second reason why Argentina is a very good case for this is that, as I said before, people access this information and use these devices and platforms not only to make decisions, instrumentally, about work settings, but for the most part to connect, to relate to each other. So there’s been a lot of work over the past 10 years mostly, 20 at a stretch, that has looked at the relational side of this: what does this mean for our everyday sociality?

Most of this work has been done in the global north, where patterns of everyday sociality tend to be more instrumental. And the cultures are more individualistic than the more gregarious, collectivistic cultures that you find in South America and Southeast Asia, for instance, for that matter. So Argentina, in particular, has a very, very strong associational culture. It’s a very, very suitable space to test, then, whether really, these devices are making us more lonely, as it has been, in general, the idea circulating in academic and media settings, or whether in a context that has a very strong associational culture, the effects are different. And the third reason why Argentina is a very important case, I think, has to do with news, politics and trust. 

So Argentina is a country with a long, deeply held distrust of institutions, in particular news, and in that sense, it’s a little bit of an avant-garde of where the world has been going. So the coming together of these three things, and the role of information in the polity, I think presents a very good combination for why this is not a lesser version of what you find in the global north, but a country where you have national conditions that are particularly suitable for an inquiry of this kind, and that reflect much more what is happening in the other 86% than what you can come up with if you study only the 14% of the global north and try to imagine that that also applies to Ghana, Nigeria, Paraguay, the Philippines, or Pakistan.

Noshir Contractor: I think you’ve just made a very eloquent argument in support of why being able to expand web science to focus on the global south is not just a luxury, but a necessity. Another issue that you raised in your work is also the generational differences. You wrote this book while you were parenting two teenage daughters. Tell us about how the experience of parenting and listening to how media is being consumed by different generations influenced your thinking in the book.

Pablo Boczkowski: If I situate myself in my adolescence, like most Argentines, I’m a huge soccer fan. Argentina won its first World Cup in 1978, at home. I was 13 years old, and I watched that tournament on a black and white TV. We had a landline at home that we had to wait over a year to get installed. And maybe there was also some bribing of the local government, so they would please give us a phone, right? So when I think of, you know, how my daughters access the world of information, how, for instance, they do homework, with three screens not two, so they have the computer screen for work, they have their phone next to the computer screen where they are monitoring Snapchat, and they have the TV, where they’re binge-watching their favorite show, and all of that at the same time. So in order to understand their world, and how much their affect and their sociality really resides on the screen, that what happens on Snapchat, or on Instagram, a lot of their sociality of who they are as individuals, all of that happens through information that is mediated, that is not face to face. And in order to help them, you know, when they came to me with questions, or when they told me stories in angst or when they cried or when they laughed, for me, in order to fully participate in that I had to ask them a million questions in order to understand that world. The project is a little bit the result of a bridge that my children and I built to communicate so that I could partake of their world and they could express their world to me.

Now that crosses another dimension about age that was very surprising to me. And it has to do with the fact that, as measured, you know, by the survey and also clear in the interviews, age has become the dominant social structure organizer of access to and use of personal screens, social media platforms and the world of entertainment, more so than socioeconomic status.

Noshir Contractor: That is interesting. So in some ways, people share consumption of social media based more on age than on socioeconomic status.

Pablo Boczkowski: And which devices they use — not only the platforms or the hardware, if you wish, and how they entertain themselves. Now, the implications of this are humongous, because if you think about social structure, you know, social structure is something that in the daily lives of people is fairly stable. That is, what research has shown time and again is that you change socioeconomic status very, very rarely and very slowly. But you age every day, and you change cohorts, right? You know, when Mannheim — when Karl Mannheim, who was the first social scientist to talk about the importance of cohorts, and generations really in history, wrote, a generation lasted 20 years. A generation that is in part defined by access and use of technology lasts less than five years now. And we age every day, our experiences are changing all the time. So our society is much more in motion as a result of the dominance of age over socioeconomic status. It is changing constantly. It is in motion, and it’s much more uncertain. 

Noshir Contractor: Well, one of the things that you have been working on even before the current work is focusing not just on media consumption, but the production of news. I remember seeing your work a long time ago, where you were looking at how newspapers were trying to navigate what was happening with the digital environments, and with the web, etc. And then again, I see that you now have this book, titled The Journalism Manifesto. What is it that the journalism manifesto looks like today that was different from a few decades ago?

Pablo Boczkowski: A lot has changed. You know, more has changed in the news industry in the production of news in the past 25 years than probably in the previous 50 to 75. So The Journalism Manifesto is a manifesto as it says in the title. It’s a strong and polemic argument that the news needs to change and how we think of it needs to change. 

So my coauthors, Barbie Zelizer, Chris Anderson, and I focus on three main interfaces of the news. The role of elites — historically, the news has been made for the elites, and by elites, and we argue that that has led to a very narrow storytelling of the society we live in today, or first draft of history, that there are many groups that have been historically marginalized, even among the best intended, that have not been part of those telling the stories or those whose voices are represented in the news. The second interface is the interface of the norms. The idea of norms is how information is processed, right, norms of objectivity, neutrality, etc, etc. We argue that norms have historically favored certain kinds of processing of information at the expense of others. So, we argue for other norms to be included, like norms of inclusiveness, norms of cosmopolitanism. And the third interface is the interface of audiences. And the interesting thing about this is that 100 years ago, even 25 years ago, what the research shows is that newspaper people or news people told the stories for each other and to each other. They had very little knowledge of the audience; it was very badly seen that you would cater to the audience, and for that you needed to know them. And they assumed that the audiences were going to be there; they took the audiences for granted, essentially: if they build it, they will come; if they publish a story, somebody will read it. What the web has done is two things for the audiences. Number one, it revealed a lot of information about the audiences because, as we know, every time the server serves your page, the server records information about that. Number two, what that revealed is that the audience became not only known but much more uncertain, because the web opened competition up across the information industries, right? Before, you know, news organizations had sort of a quasi-monopoly, natural monopoly, oligopoly position. You know, in America, for instance, in 97% of the metropolitan markets, there was only one newspaper.

Now everybody competed with everybody else. And the other thing that we know now is that the audience is different than the audience we imagined in the 1960s. That is an audience that is much more emotional, driven by what news makes them feel, not only what makes them think. It is an audience that really cares about kin and being represented, not the abstract polity, but having their own kin, their own social network represented. It’s an audience that wants to express themselves as much as they want to consume — to tell their own stories. So if the news media are to survive, they need to engage the audience where they are at, they need to tell stories that are told by people from different groups who feature a broad spectrum of actors in society, guided by norms of inclusivity, cosmopolitanism, among others, and that therefore tell stories about kin in emotional ways, and allowing people to express themselves, not only to listen.

Noshir Contractor: As I look at the conversation we’ve had today, as well as the corpus of your scholarship, I think you are making a really compelling intellectual argument for much more of a cultural perspective on web science. Where do you see this work going in the future? And how do you think web science needs to be paying more attention to these aspects that you have raised today?

Pablo Boczkowski: I think the development of computational tools in the social sciences has been one of the most incredible and productive areas of growth in the social sciences. And I think it’s only the beginning of that, for obvious, you know, technical reasons. But I think the energy and the attention that has been paid to that has made us sometimes pay comparatively less attention to dimensions of the human experience that cannot really be captured by that. For instance, counting frequencies of words would not tell you what they mean to the people who are using them. So I wish that as computational methods and actual, you know, computing technology develop, we don’t forget to continue investing intellectual resources, and capital resources for that matter, in the development of intellectual work on a more cultural perspective that can complement, from a cultural interpretive standpoint, the incredibly exciting work that is at the forefront of computational social science.

Noshir Contractor: Thank you, again, Pablo, for all the work that you’ve been doing in this area, and for coming and sharing some of these insights with us. So I will certainly recommend Abundance to anyone who is interested in learning about different ways of rethinking the extent to which we are consuming media for cognitive purposes, rather than for affective purposes, as well as for understanding that we might be blinded by views of media consumption based on the global north versus the global south or by our own age groups, as you’ve described. Thanks again, Pablo, very much for taking time to talk with us.

Pablo Boczkowski: My pleasure. Thank you very much for the invitation.

Episode 21 Transcript

Taha Yasseri: On Tinder, both sides swipe on each other. And then when there is a mutual interest, they can talk to one another. It’s symmetric by design. But in practice, we see that 80% of conversations are initiated by males. And even in those cases, the 20% of conversations that females start to talk and take the initiative, they are punished for that.

Noshir Contractor: Welcome to this episode of Untangling The Web, a podcast of the Web Science Trust. I am Noshir Contractor and I will be your host today. On this podcast we bring thought leaders to explore how the web is shaping society and how society in turn is shaping the web.

My guest today is Taha Yasseri. You just heard him talking about the gender gap that exists when it comes to who starts conversations on the dating app, Tinder.  Taha is an associate professor at the School of Sociology and a Geary Fellow at the Geary Institute for Public Policy at University College Dublin, Ireland. He has been a Senior Research Fellow at the University of Oxford, a Turing Fellow at the Alan Turing Institute for Data Science and AI, and a Research Fellow at Wolfson College at the University of Oxford.  He has studied the dynamics of social machines on the Web, online collective memory — and my favorite: online dating. Welcome Taha.

Taha Yasseri: Thank you very much, Noshir. I didn’t know you’re on the market.

Noshir Contractor: I am not in the market, which is exactly why I like to be an observer. But there are lots of people who are in the market for online dating. And as you mentioned in one of your recent articles, it’s over a $2 billion business just in the United States and is expected to continue growing in the foreseeable future. 

Taha Yasseri: A lot of things that we do have changed due to internet-based technologies and web technologies. But to me, one of the most important things that have been revolutionized over the past 10 to 20 years is dating and mating — the way that we meet and we choose our partners for shorter relationships and for longer relationships, sometimes lifelong relationships. 

Noshir Contractor: I remember when online dating first began, and there was still a stigma against it. And people would not even be willing to admit that they were going online to look for dates. That changed. Tell us why you think it changed. What brought about that change? And what got you interested in doing research on this topic?

Taha Yasseri: Any new technology has its own stigma. Online dating in particular was seen as a tool or an environment for, you know, uncommitted relationships or behavior that is promiscuous, which in general is something that societies are less judgmental about at the moment. But also, people realized, no, actually people can find partners online, through these apps or websites, that they can, you know, settle down with and get married to and create lovely and happy families. 

Noshir Contractor: One of the things that we have been asking ourselves well before online dating is what are the kinds of traits that men find attractive about partners they are seeking, as well as females find attractive about partners that they are seeking? What has your research on online dating told us about differences or disparities in user behavior between male and female users?

Taha Yasseri: One of the most striking things that we have seen is the imbalance or the gender gap in initiation of the conversation. These modern technologies, they try to give equal weight to both genders. For example, let’s say on Tinder, both sides swipe on each other. And then when there is a mutual interest, they can talk to one another. It’s symmetric by design. But in practice, we see that 80% of conversations are initiated by males. And even in those cases, the 20% of conversations where females start to talk and take the initiative, they are punished for that. They receive fewer responses, compared to conversations that are initiated by males. As if collectively, we judge them because of cultural biases and the baggage that we, as a society, are still dealing with. So in a newer project, we looked at 10-year trends in online dating, and we were hoping to see that this gap is actually getting closed, and the balance is increasing. However, it wasn’t the case. Actually, we realized that over 10 years, the gap in initiation rates has increased. And that simply tells me that it’s not only the technology; we have cultural baggage and we have things that we want to move on from. And having a shiny website is not the only solution. We require other things as well, which might not even be easy to attain through that technology.

Noshir Contractor: And indeed, online platforms are reifying the norms that preceded the platforms in terms of males being the ones who were expected to initiate these kinds of interactions.

Taha Yasseri: That’s very true, and we can look at those trends at a very large scale. People click and people send messages. And we look into the logs generated by these activities rather than asking people, because of course when it comes to dating and mating, people’s behavior can be very, very different from what they say on a survey or on a questionnaire. And I think web science methods and computational social science methods are particularly adequate to address these questions, and to look at the traits in mating — our preferences and our behavior.

Noshir Contractor: Well, one of the things that we know historically, or at least, it’s socially been circulated, is that women put more emphasis on income and education when it comes to potential partners. And there is always the debate about the importance of physical attractiveness. What does your research show about these questions and if these criteria are changing, since online dating first began?

Taha Yasseri: You’re absolutely right. It has been predicted or reported based on small-scale studies that females put more emphasis on societal features like income or education, and males put more emphasis on physical attractiveness. This was something we observed in our analysis as well. But what we saw, and it was interesting, was that the emphasis on education and income is decreasing. People are more accepting of differences between their own education and income level and their potential partners’, particularly when you look at female users. And it could have many reasons. Of course, one important factor here is that women are much more independent today compared to 10, 15, 20 years ago. And that makes the income or education level of their potential partners less relevant, so they can focus their interest on other factors. 

When it comes to physical attractiveness, one of the things that I find fascinating coming out of the analysis was that we looked at the popularity of profiles, measured through the number of messages that people receive, versus the self-reported attractiveness. If I tell you, Noshir, I’m a 10 out of 10, you would think I would be very popular on these online dating websites. Well, actually, in practice, the most popular male users or profile owners on online dating websites are not the ones who think they are a 10 out of 10; those are not receiving as many messages. This is something we call the douchebag effect, because you know, someone who thinks “I’m a perfect 10,” particularly a man who thinks “I’m a 10 out of 10,” probably is lacking some other personality features that are attractive to female users. But when we look at the female users, the higher they rated themselves, the more messages they received. The ones who thought they were 10 out of 10 were actually the ones who received the most messages. 

The other factor why very attractive males do not receive a message could have to do with the self-confidence and self-esteem of female users. They might think, oh, that guy is out of my league, I might not even try. Whereas men don’t have this understanding of their capabilities. You know, even if they are sure that the potential female partner is out of their league, they still try.

Noshir Contractor: There’s also been a perennial debate about whether dating and partnerships and romance is more likely to succeed when birds of a feather flock together. Or the other saying, which is opposites attract. What did you find about the similarity between profiles and the extent to which it might have predicted future success in terms of online dating?

Taha Yasseri: One of the things that people have credited online dating for is the higher ratio of interracial marriages and relationships today compared to 10, 15 or 20 years ago. It is very difficult to argue that this is primarily due to the surge of online dating or it’s something that happens anyway, parallel to online dating. But I do think online dating provides us with a much more diverse pool of potential partners. When we looked at data, however, we realized that homophily, or similarity between potential partners, is not a very strong predictor of success. We couldn’t measure success after the relationship, of course, because the irony of online dating is that if it works, you lose your customers. But we could see, for example, if people exchanged phone numbers, or if people carried on chatting for a while; we took measures of success like that. And we realized homophily plays a very small role. This could be a reflection of a bigger change in our society, which is that now we are more curious and more accepting of people who are different to us.

Noshir Contractor: I want to move this to another part of your research where you have argued about whether we can use algorithms and intended bias injected within these algorithms, to move us away from our natural tendencies for homophily, for creating echo chambers or creating fragmentation. And so tell us a little bit about how you got interested in this notion, how we tend to naturally move towards segregated networks in segregated societies. And then you have a very depressing message, you say that, even if we are to use positive algorithms to try to break away from these tendencies for homophily and echo chamber — your research shows that we’re not likely to be successful.

Taha Yasseri: We all agree that we have become very fragmented in our political opinions, particularly in the US, I would say, in the UK, some countries that have gone through a lot of trouble in recent years just because of the divide in the society. So in that sense, bubbles have formed and echo chambers are there. I had heard and I had read that people say it’s up to the platforms to break the bubbles. They have to use algorithms to mix people up and connect people from other opinion camps. And we thought, okay, well, let’s see if that works. So we developed a mathematical model, and we realized that as long as we have homophily in the network, as long as there is the slightest tendency for an individual to prefer a connection to like-minded people over a connection to someone dissimilar to them, no matter how much algorithmic bias we introduce, bubbles will form. We might postpone them, but we never can break them. Because that tendency — that homophily tendency — is so strong that we basically practically need huge amounts of algorithmic intervention, which, of course, takes all of the joy out of the online social networks, right? We do not go on Facebook or Twitter just to fight. And that’s actually what social network companies have capitalized on. Because if we’re happy there, we interact with people who are like-minded and support our opinions, we spend more time there and we are more likely to click on the ads and so on. So confrontation is not something social networks advocate for, and combining that with homophily, and the ease of disconnecting from people who are different to us on social networks, all this together makes the formation of echo chambers and bubbles inevitable. It sounds very grim. I agree. But we also propose a couple of solutions. It’s not that that’s the end of the story.

You know, before the internet, I live in a village, not everyone thinks like me. My neighbor might vote differently, might think differently. It’s not that I just move the next day, you know. I still go to the same church or to the same strip club, depending on my interests. And I interact with people who are different to me, and through this interaction, I might not completely change my opinion. But at least I appreciate the differences. I learn to understand and acknowledge the existence of other opinions. On Twitter and Facebook, we are encouraged to block people, but we shouldn’t just block others or unfollow others because we don’t agree with them. This is such a new web thing that we just don’t talk to the person so easily, and the web gives us the opportunity not to see that person ever again. Whereas in that village, I had to see that neighbor anyway. We somehow have to introduce mechanisms which encourage people to keep the interaction on and carry on interacting with people who are not exactly the same as themselves. And can I think of an example? Of course: Wikipedia. That’s where these conflicts and these clashes of opinion happen. And I have spent years studying edit wars between editors of Wikipedia. One thing that we realized is that the more conflict and the more interaction between opinions in an article there is, the more the quality of the article increases. 

Noshir Contractor: And so what I’m hearing you say is that if Facebook were to work to make the algorithm make interventions that you think might help, then unlike what you or the algorithm might hope, which is that I will look at this and consider other points of view and it will broaden my perspective, what the user would do instead is simply walk away from Facebook because it’s not feeding them what they want to hear.

Taha Yasseri: Either that could happen. So that explains why social media platforms might not even try. The other thing, and that is based on research that Chris Bail and his colleagues have done in their controlled experiment: people who are exposed to content from a different opinion have become more extreme in their own opinion, because that wasn’t necessarily an interaction between humans, it was me seeing some content supporting the opposite opinion. And I never had the chance of having an active interaction with some human of that opinion. So I don’t think content sharing, mediated by algorithms, is the solution. All we need is human-to-human direct interaction. And it is not comfortable. We all know that. And the cost for the platform could be that people walk away and their revenue might go down. 

Noshir Contractor: So you already spoke about Wikipedia as an example of good engagement, where editors on Wikipedia would go after one another. And the more they argued with one another and debated one another, the higher the quality of the final Wikipedia page that they were debating. But you also talked about the role of not just human editors of Wikipedia battling with one another, but bots in Wikipedia battling with one another. 

Taha Yasseri: Yes, they do. They’re not doing much creative work there, but they do a lot. In some Wikipedia editions, more than half of the edits are coming from bots. Because they never sleep, they work 24 hours a day. And they do very little things. They fix typos, they add commas. As we continue with Wikipedia, and as we develop the technology, bots now do more sophisticated things. They detect vandalism, they even create articles based on structured information they are fed with. As I said, we were studying conflicts among humans. And it was a very long shot for me to think maybe we should also look at whether there are edit wars between bots. And my hypothesis was, there wouldn’t be any, because bots are not emotional. They don’t take things personally. But then as soon as we looked into the data, we realized there have been pairs of bots undoing each other’s contributions for more than three years. And no one has noticed, because no one is actually looking at bots; we trust these machines, because they’re predictable. Yes, they’re predictable at the individual level, to some extent. But if we have learned one thing from complex systems studies, it is that system behavior is very different to individual behavior. 
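
As a rough illustration of the kind of log analysis described here, the following Python sketch counts reciprocal reverts between pairs of accounts. The record format of (reverter, reverted) tuples and the bot names are hypothetical, chosen only for this example; this is not the data or the method used in the study.

```python
from collections import Counter


def mutual_revert_pairs(reverts):
    """Find pairs of accounts that have reverted each other.

    `reverts` is an iterable of (reverter, reverted) tuples. This format is an
    assumption for illustration, not the layout of any real Wikipedia dump.
    Returns {(a, b): (times a reverted b, times b reverted a)} with a < b.
    """
    counts = Counter(reverts)
    pairs = {}
    for (a, b), n_ab in counts.items():
        n_ba = counts.get((b, a), 0)
        # Visit each unordered pair once (a < b) and keep it only if
        # reverts go in both directions.
        if a < b and n_ba > 0:
            pairs[(a, b)] = (n_ab, n_ba)
    return pairs


# Hypothetical revert log: BotA and BotB undo each other; BotC only reverts BotA.
log = [("BotA", "BotB"), ("BotB", "BotA"), ("BotA", "BotB"), ("BotC", "BotA")]
print(mutual_revert_pairs(log))
```

A pair that shows up here with large counts in both directions over a long period would be a candidate for the kind of long-running bot conflict mentioned above; one-directional reverts, like BotC's, are left out.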

Noshir Contractor: How do you think that humans will play a role in brokering or mediating these kinds of arguments that emerge and don’t seem to end amongst bots?

Taha Yasseri: As long as we know the system, and we can predict its behavior, the good thing about sociology of machines, as opposed to sociology of humans, is that we have full power, and we have all the agency that we need. Whereas if we understand the problem in a society, we might not be able to come up with an immediate solution, and even if we come up with the solution, we might not be able to implement it. But the good thing is that those bots have no agency and they are serving the purpose of the owners and the society they work for. In that sense, I think things are easier. However, the difficulty comes from the fact that we have zero history of sociology of machines. We just arrived in this land and we just discovered these creatures or started to build them and embed them at every corner of our bedrooms and living rooms and the streets. We are creating the systems; it’s already a bit late to start analyzing, studying the social behavior. But as soon as we do that, and we understand how they behave, coming up with a solution and implementing it, I think, should be easier than the long-lasting problems we have in our own societies.

Noshir Contractor: Web science scholars have for a while thought about the web as being a social machine. And what you’re highlighting is that given that the web is a social machine, or a collection of social machines, we need to come up with a new sociology of these social machines.

Taha Yasseri: That’s a very elegant way of putting this, that is true. The thing that I might propose to change here is to turn machine to machines. Because we have different machines coming from the fact that we have different actors. One of the sad things we learned during the pandemic was that this utopian image of a global society is not relevant. Neighboring countries blocked each other’s purchases, because of competition, when it was about the masks and the tests, and then the vaccines and so on. Therefore, the social machine of the web is not just one entity. There are competing entities. And when we saw complexity in the behavior of Wikipedia bots, which are very, very nice and good, I can only imagine how things could go bad and wrong when we have competing interests among not very good and not very well behaving automated bots that are fighting for the benefits of their owners.

Noshir Contractor: I want to end by taking you to yet another issue on which you have been doing some really exciting research, and that is the topic of collective memory. One of the things that has been argued about the web in general is the fact that the internet doesn’t forget. We have archives that allow us to go back and rewind. At the same time, the European Union has led the way in terms of regulation that gives individuals the right to be forgotten, or at least for some of their actions to be forgotten. Tell us a little bit about how the web is able to advance our understanding of what our collective memory is, how we socially generate these common perceptions of any event on the web, and how those perceptions might change over time.

Taha Yasseri: Collective memory is not a new term; people have been talking about it for at least 100 years. But it’s the first time we can measure it, we can put a number on it. We can look at an airline crash, and measure how many people read the Wikipedia page about this event, how many people googled it, based on Google Trends data. And then 10 years later, look at the same rate and see how this number has declined over the years. This is a very materialistic and very operationalized, maybe oversimplified way of measuring memory. But it’s a good starting point. We have taken a similar approach, as I just described, and looked at logs of pageviews on Wikipedia and Google Search volumes, and so on. One of the first things we realized is that, well, our attention is biased. We are much more attentive to things that are closer to us, that are benefiting us and that are related to us. But then we also realized our memories are very much biased: we remember past events only if they are somehow connected to us. Web science allows us to study these patterns. 
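
One simple way to put a number on the decline described above is to compare average page views shortly after an event with average page views years later. The Python sketch below assumes a hypothetical list of yearly view counts for a single article; the function name and the figures are invented for illustration, and this is not the model used in the research.

```python
def attention_retained(yearly_views, early_years=1, late_years=1):
    """Ratio of mean views in the last `late_years` to the first `early_years`.

    `yearly_views` is a list of view counts, one per year, starting with the
    year of the event. A value near 0 means attention has almost vanished.
    """
    if len(yearly_views) < early_years + late_years:
        raise ValueError("not enough years of data")
    early = sum(yearly_views[:early_years]) / early_years
    late = sum(yearly_views[-late_years:]) / late_years
    if early == 0:
        raise ValueError("no views in the early window")
    return late / early


# Hypothetical yearly page views for an article about one event,
# from the year it happened (index 0) to ten years later (index 10).
views = [120000, 30000, 12000, 8000, 6000, 5000, 4500, 4000, 3500, 3200, 3000]
print(attention_retained(views))  # share of first-year attention left after 10 years
```

Widening `early_years` and `late_years` smooths out single-year spikes, such as anniversaries, at the cost of blurring how fast the decline happens.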

Noshir Contractor: What did your research show us about any difference in generations when looking back at events in the past? Were there differences in how one generation might view a set of events  compared to others?

Taha Yasseri: One of the limitations we have to admit that web science has is that it doesn’t give us much of a historical view. In our analysis, we of course had data for the last 20 years, but we couldn’t say how much our results generalize to 200 years ago. Based on the data that we had from recent years, one thing that we could say is that the time scale of collective memory is around 40 years. 

Noshir Contractor: But Wikipedia does have entries for events that happened centuries ago.

Taha Yasseri: That’s very true. And that’s exactly why we could see how people reacted to those events in the last 20 years, and how people reacted to much more recent events in the last 20 years. And these are people who are using Wikipedia. What we cannot talk about is how people would have reacted to those pairs of events 100 years ago, because we simply didn’t have any tool to measure their behavior.

Noshir Contractor: Exactly. Well, there are many things that web science can do and others that we may recognize as limitations, at least of the science at this point in time. So again, I want to thank you so much for talking with us about how the web has changed online dating or maybe hasn’t changed online dating, the extent to which algorithms may or may not be able to help us confront the challenges we face with echo chambers — the sociology of machines, as you said, about how we might be looking at bots fighting with bots, mediated by humans, and then again, how all of this shapes our collective memory. I want to thank you again for taking time to talk about this. You’ve been such an exciting scholar at the forefront of web science. And we all look forward to seeing continuing research come from you and your team of collaborators. So thank you again, for joining us today. 

Taha Yasseri: Thank you very much. It’s been a great pleasure. Thank you for having me. 

Episode 20 Transcript

Richard Rogers: What has interested me, and what I’ve developed as a so-called web epistemologist, is thinking about not just what’s specific about the culture, so what one would call web or platform vernaculars nowadays, but also what’s specific about the methods.

Noshir Contractor: Welcome to this episode of Untangling The Web, a podcast of the Web Science Trust. I am Noshir Contractor and I will be your host today. On this podcast we bring thought leaders to explore how the web is shaping society and how society in turn is shaping the web.

My guest today is Richard Rogers. You just heard him speak about what he terms “digital methods.” Richard is a professor and chair of New Media and Digital Culture at the University of Amsterdam. He is also Director of the Digital Methods Initiative, known for the development of software tools for the study of online data. And he is the author of two award-winning books, Information Politics on the Web and Digital Methods, among others. His most recent book is titled Doing Digital Methods. He is currently working on a book titled Mainstreaming the Fringe: How Misinformation Propagates in Social Media. And Richard was Program co-chair for one of the very first Web Science conferences back in 2013. Welcome, Richard.

Richard Rogers: Thanks very much. Great to be here. 

Noshir Contractor: I’m delighted that you’re able to join us today. Take us back to those early days when you were first getting involved in the web. What prompted you to think about focusing on the web as the object of study?

Richard Rogers: Oh, that takes me way back. So I think it was in the mid-90s, when I was asked to write an article about climate change. I started sort of surfing around and noticed that certain websites linked to other websites, but then the websites didn’t link back. So that’s when I started thinking about creating software that actually maps how websites link to one another, ultimately resulting in a piece of software called the issue crawler, which to this day is still crawling the web and mapping links between websites.

Noshir Contractor: Tell us more about the issue crawler. That was definitely one of the first tools to study the web. Tell us what you intended it to do, why you call it the issue crawler, and where it is headed these days.

Richard Rogers: So when we started looking at links between websites, what we noticed was that a lot of websites would be linking to one another around social issues. So we coined the term issue networks (well, not so much coined the term as repurposed it), looking at how not only NGOs and academics but also governments and corporations would be interlinking or not linking, and so we came up with a kind of link language. So there were critical links; this is like Greenpeace linking to Shell. There were aspirational links; these were NGOs linking to governmental organizations or international organizations, and then the international organizations wouldn’t link back. So there were these missing links. We call this sort of the politics of association. And that’s what we were putting on display with our link maps.

Noshir Contractor: How would you interpret it when one website links to another and the other does not return or reciprocate the link?

Richard Rogers: It’s about reputation, largely. We found that, for example, in one small study of Armenian NGOs, they would link copiously to one another, and then they would also sort of aspirationally link to UN organizations, and the UN organizations would link to one another, but then they wouldn’t link to the Armenian organizations. So it’s a kind of lack of recognition. It’s about reputation; it’s about relevance, in some sense.
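[Editor’s illustration] The “missing links” pattern described above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the issue crawler’s actual implementation: the site names are made up, and the `unreciprocated_links` helper is a hypothetical name introduced here for illustration.

```python
from typing import Iterable

Link = tuple[str, str]  # (source site, target site)


def unreciprocated_links(links: Iterable[Link]) -> list[Link]:
    """Return every link (a, b) for which no link (b, a) exists, sorted."""
    link_set = set(links)
    return sorted((a, b) for (a, b) in link_set if (b, a) not in link_set)


# Hypothetical data: two NGOs link to each other and to a UN body,
# and the UN body does not link back to either of them.
example_links = [
    ("ngo-a.example", "ngo-b.example"),
    ("ngo-b.example", "ngo-a.example"),
    ("ngo-a.example", "un-body.example"),
    ("ngo-b.example", "un-body.example"),
]

missing = unreciprocated_links(example_links)
for source, target in missing:
    print(f"{source} links to {target}, but not the other way around")
```

In this toy data, the links between the two NGOs are reciprocated, while both links to the UN body are reported as one-directional, which is the kind of aspirational linking described in the answer above.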

Noshir Contractor: How relevant is web linking today as compared to what it was when you were first developing issue crawler?

Richard Rogers: So it’s interesting. When I first started writing about hyperlinks, I talked about them in terms of a sort of link economy, with the link economy actually supplanting an earlier economy, which I referred to as the hit economy. And so now, you could argue that the like economy has taken over from the link economy. And of course, we’ve seen the sort of widespread industrialization of the hyperlink. You also see that links have changed, right? So it’s actually quite complicated, more complicated than it used to be, to map links.

Noshir Contractor: You talked about the evolution from the link economy to the like economy. Tell us more about what you mean by the like economy.

Richard Rogers: There’s a term that I sort of repurposed from sort of critical business studies called vanity metrics. And so I’ve been studying, quote, unquote, vanity metrics. And this is follower counts, like counts, view counts, all of these numbers that show how well you’re doing online, especially in social media. This is what you could summarize as the like economy. 

Noshir Contractor: One of your major contributions to web science over the years has been your work in the area of web epistemology. Can you tell us a little bit more about how you got interested in that, what it means and what have we learnt about that?

Richard Rogers: So generally speaking, web epistemology is the study of the web as a particular knowledge and/or information culture with its own specificities. What has interested me, and what I’ve developed as a so-called web epistemologist, is thinking about not just what’s specific about the culture, so what one would call web or platform vernaculars nowadays, but also what’s specific about the methods. And so what I’ve tried to develop over the years is what I’ve called digital methods.

Noshir Contractor: What are some of the things that we have unearthed that we would not have been able to do if we didn’t think about the web from an epistemological standpoint? 

Richard Rogers: If you think about web science in particular, I think it came from a particular insight about the web: that the web is not just a cyberspace as we once thought, this sort of realm apart. It’s not necessarily only to be studied as the virtual or as a virtual society, but rather, the web has interesting societal data, right? How do you then capture this data, and think about making findings that you then ground in some ways? Amongst those ways would be to ground them, quote, unquote, online. So this is one of the notions I’ve tried to develop, online groundedness: the idea of using web data to make findings about what’s happening in society and culture, and then grounding them in the online. Of course, we can triangulate; we can bring in other data from, you know, the ground. But we can also bring in data from different realms online.

Noshir Contractor: One of the things that you touched on here is the ability to be able to study all of society, not just the online world, but by using tools that are gleaning information from multiple platforms online. Could you give me an example to make this more tangible, a concrete example of an issue that is more pervasive, but that you’re able to glean information from one or more online sources to get insights into it?

Richard Rogers: Well, I mean, you know, the flagship project was Google Flu Trends. And that was a very interesting project, and it ran for a number of years, and what it did was anticipate the incidence of flu by search queries. And what went wrong with Google Flu Trends is sort of a general warning or admonition about this sort of work, right? So, when people are searching, are they searching because they have symptoms? Or are they searching because it’s flu season, and they’ve heard about flu season on the TV news? So is the phenomenon happening in the wild, or is it happening in media? I mean, that for me was one of the more interesting examples, also because of the critique thereof, but there are others as well. So for a number of years, for example, queries on AllRecipes.com were used in order to sort of map the geography of taste in the US.

Noshir Contractor: This area that you just talked about, the example that you gave, which is fascinating, is part of the infrastructure that you’ve been developing, more generally called the Digital Methods Initiative. The goal of that is to do research that goes beyond the study of online culture only. Can you tell us more about the genesis of the Digital Methods Initiative? And what are the kinds of things that you believe you could observe and study as part of the Digital Methods Initiative?

Richard Rogers: So it goes back to the beginning of web science, in fact. It goes back to 2007, and it’s been around since then. We’ve developed, I think, about 100 tools. And most of it is situated software. So we come up with software that we need for a particular research project, and then a lot of it sticks around and becomes more sort of general purpose, but other tools go away, depending on use. But right now, we maintain quite a lot.

And we use this software both for societal and cultural research, as well as sort of media research, media critique. More specifically, a recent study that we did looked at what happens to about 20 so-called extreme internet celebrities when they were deplatformed from mainstream social media platforms. And then they migrated to Telegram. So we built a Telegram data extraction tool in order to see what they were doing online there and to see whether or not they were acting in the same ways that they were acting before, for example.

Noshir Contractor: And what did you find?

Richard Rogers: We found a few things, some intuitive, but a couple of things that were really counterintuitive. So the intuitive findings were that their audiences had thinned considerably. Counterintuitive was that they were still posting the same amount, or they were posting very, very frequently. And this went on for quite a few months, despite the fact that you could say that the platform had less sort of oxygen-giving capacity, in the sense that there are fewer viewers. But the most counterintuitive thing that we found was that their language became far less offensive over time, which then led to a number of different speculations. One speculation was that maybe they were offensive before for their audience, and they’re not just generally that offensive, for example. Or that they entered such an offensive space that they couldn’t be more offensive than the space that they were in. So these are two different scenarios, let’s say. But nevertheless, those were some of the constitutive findings.

Noshir Contractor: I want to take us to an exhibit that you were involved in, which was featured at the ZKM, entitled Making Things Public: Atmospheres of Democracy, that was curated by Bruno Latour and Peter Weibel. That sounds fascinating. Tell us more about this exhibit.

Richard Rogers: We built a couple of exhibition interactives. One is called the issue barometer. And the issue barometer would basically show the rise and fall of attention to particular social issues. So we took a set of NGOs, multi-issue ones, also single-issue ones, and made an issue list on the basis of what it is that they were campaigning for on their websites. And then over the course of three years, we followed their campaigning behavior, showing how attention to particular issues rises and falls.
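[Editor’s illustration] The kind of tally described above, attention to issues rising and falling across organizations over time, might be sketched as follows. This is a hedged sketch only and not the exhibit’s actual software: the organizations, issues, years, and the `attention_by_year` helper are all hypothetical and introduced purely for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical observations: (year, organization, issue campaigned on).
observations = [
    (2003, "ngo-a", "climate change"),
    (2003, "ngo-b", "climate change"),
    (2003, "ngo-b", "landmines"),
    (2004, "ngo-a", "climate change"),
    (2004, "ngo-a", "landmines"),
    (2004, "ngo-b", "landmines"),
    (2005, "ngo-b", "landmines"),
]


def attention_by_year(rows):
    """Count, per issue, how many organizations campaigned on it each year."""
    trend = defaultdict(Counter)
    for year, _org, issue in set(rows):  # set() drops duplicate observations
        trend[issue][year] += 1
    return trend


trend = attention_by_year(observations)
for issue in sorted(trend):
    series = [(year, trend[issue][year]) for year in sorted(trend[issue])]
    print(issue, series)
```

Reading each issue’s series from year to year gives a crude rise-and-fall curve; a fuller version would presumably weight by how prominently each organization campaigns on an issue, which this sketch does not attempt.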

Noshir Contractor: To what extent do you think this helped illuminate this issue for the general audience or for policymakers? Do you see that these kinds of tools might increase literacy or awareness about some of these issues?

Richard Rogers: Yes, I think so. This is sort of issue trend research, if you will. You can imagine policymakers these days with issue trend dashboards, so this is one of the earlier ones. But this was also in some ways a mirror for non-governmental organizations. So are you demonstrating commitment, despite changes in funder agendas, and sticking with particular issues? Or are you sort of following the money, so to speak? And so this was also part of the critical angle to this particular exhibition.

Noshir Contractor: To what extent are you able to use these kinds of methods to uncover disparities that may exist between the global south and the West, for example, or other forms of disparities that we see in society? Are there some examples from your work that show how these methods can bring exposure and bring those issues and those disparities to light?

Richard Rogers: What I just described, colleagues and I termed issue drift: particularly nongovernmental organizations or governmental organizations sort of drifting away from things that are important when they could be sticking with them. One of the kind of critical projects that we undertook along these lines was called issue celebrities. We looked at a very important issue in the global south, and that is awareness of land mines, the clearing of mines, and landmine-related injuries. And we looked in particular at a charity or funding organization that was set up by Paul McCartney and his wife at the time, Heather Mills. And it was quite serious: they raised, year after year, something like $4 million, which was quite close to the total UN budget for the same activity. But then they broke up. So what happens to this global south issue when these celebrities break up and then leave it? It seems cynical on the one hand, but it’s quite serious on the other when we’re talking about this kind of money. So this is one project that addresses that particular aspect.

Noshir Contractor: You’re working on this book on mainstreaming the fringe: how misinformation propagates on social media.

Richard Rogers: In the run-up to the 2020 US elections, we studied the extent of the so-called misinformation problem with a cross-platform analytical approach on seven social media platforms, and we found that each of them does so in quite specific ways, but generally speaking, they all marginalize the mainstream. So for example, Twitter amplifies what is referred to oftentimes as hyperpartisan sources. On TikTok, they use particular sort of ironic sounds to instill mistrust when a mainstream media clip, for example, is played. But in all very specific ways, each of them sort of marginalizes the mainstream. And of course, this has, you know, quite some implications for, you know, taking news seriously.

Noshir Contractor: I’m still stuck back on what you mentioned earlier about TikTok sounds. Tell me more about what you mean by that.

Richard Rogers: TikTok is this sort of music-driven platform, and on the interface, when a particular sound is used, you can click on the sound and see other videos with the same sound. And so you can sort of map the use of particular sounds, okay? So there are certain sounds which are used to instill mistrust in what it is that you’re looking at. And so this is quite interesting. And it turns out that a lot of the top, let’s call them political, videos on TikTok in the run-up to the 2020 US presidential elections were using those sounds. It develops into a kind of new type of misinformation. A lot of the videos are satirical, right? So you think that, oh, it’s no big deal, but at the same time, the satirical videos are introducing other sorts of misinformation techniques. So you’re getting these hybrid types. Across social media platforms, you get new hybridities that complicate the sort of typical typologies of misinformation, but the one on TikTok, I found, was particularly interesting.

Noshir Contractor: I think one of the things that has recently emerged in web science is the endeavor to study multiple platforms. And you, Richard, have been at the forefront of being able to look at these multiple platforms. What I found interesting about the examples that you gave is that in many ways, while multiple platforms might allow us to triangulate some insights, you’re also finding that each of these platforms is used in distinct ways.

Richard Rogers: I’ve been working on the kind of difficult problem of commensurability and cross-platform analysis. Especially in marketing research, a lot of the work that’s done on cross-platform analysis is about the study of engagement. So each platform has metrics. But each platform is also quite specific, right? So you can’t just blindly think that hashtag usage is the same on Twitter as it is on Facebook, as it is somewhere else. My sort of short answer is that you need to understand the, quote unquote, platform vernacular: which types of digital objects are privileged and which are not privileged. And with that knowledge, you can then move towards something that is a more satisfactory striving for commensurability.

Noshir Contractor: That’s really been a challenge. I noticed that you’ve been spending some time focusing on a technical definition of “memes.” Tell us more about what got you interested in this particular topic at this particular point in time?

Richard Rogers: There was a Facebook engineer who was quoted a year or two ago saying, you know, 95% of the content that’s passing through is memes. And I was like, oh! What I came across is that, depending on the software, memes are defined differently. For example, Know Your Meme, which is this sort of well-known database that started in 2006 or 2007, has a particular way of thinking about a meme, and that is as sort of the special internet phenomenon that requires a literacy in order to understand. On the other hand, if you go to CrowdTangle, which is Facebook’s data collection software, both for research as well as for marketing, it has a meme search. And what it finds are images with text. Okay? So images with text is a very, very roomy definition of a meme. And the database definition is quite different.

And then in the middle are a number of other ones. What I was looking at recently was: Okay, so what’s a meme? What’s a meme according to, for example, IRA disinformation operatives? So I went through about six or seven of these different ways of thinking about memes.

Noshir Contractor: And what would these definitions allow us to do more specifically? You know, what is the advantage of creating this classification? What new insights do we gain by using this classification?

Richard Rogers: When thinking about how to study memes, you want to think about how to sort of demarcate this phenomenon, right? And there are a variety of different ways, and I think that’s the largest contribution. More specifically, what I’ve been doing is thinking about different kinds of sort of automation practices of meme detection. And what we’re finding, generally speaking, is that the automated detection mechanisms are currently not that good at detecting what a person or set of people who are doing close reading would call a meme.

Noshir Contractor: Well this is interesting, thank you again, Richard, for giving us these little peeks with specifics, and all the rich kind of research that you’ve been doing and all your contributions over the years to a broader understanding of web science. And I wish you the best as you continue some of these efforts, and we’ll be tracking them in the years ahead.

Richard Rogers: Yeah, till then. My pleasure.