Episode 17 Show Notes

Emilio’s Website:

Home

Some of Emilio’s Articles:

Ferrara, E., Varol, O., Davis, C., Menczer, F., & Flammini, A. (2016). The rise of social bots. Communications of the ACM, 59(7), 96-104.

Bessi, A., & Ferrara, E. (2016). Social bots distort the 2016 US Presidential election online discussion. First Monday, 21(11-7).

Ferrara, E. (2017). Disinformation and social bot operations in the run up to the 2017 French presidential election. First Monday, 22(8).

Badawy, A., Ferrara, E., & Lerman, K. (2018, August). Analyzing the digital traces of political manipulation: The 2016 russian interference twitter campaign. In 2018 IEEE/ACM international conference on advances in social networks analysis and mining (ASONAM) (pp. 258-265). IEEE.

Stella, M., Ferrara, E., & De Domenico, M. (2018). Bots increase exposure to negative and inflammatory content in online social systems. Proceedings of the National Academy of Sciences, 115(49), 12435-12440.

Badawy, A., Lerman, K., & Ferrara, E. (2019, May). Who falls for online political manipulation?. In Companion Proceedings of The 2019 World Wide Web Conference (pp. 162-168).

Chen, E., Lerman, K., & Ferrara, E. (2020). Tracking social media discourse about the covid-19 pandemic: Development of a public coronavirus twitter data set. JMIR Public Health and Surveillance, 6(2), e19273.

Jiang, J., Chen, E., Yan, S., Lerman, K., & Ferrara, E. (2020). Political polarization drives online conversations about COVID‐19 in the United States. Human Behavior and Emerging Technologies, 2(3), 200-211.

Ferrara, E. (2020). What types of covid-19 conspiracies are populated by twitter bots?. First Monday, 25(6).

Ferrara, E., Chang, H., Chen, E., Muric, G., & Patel, J. (2020). Characterizing social media manipulation in the 2020 US presidential election. First Monday, 25(11).

Chen, E., Chang, H., Rao, A., Lerman, K., Cowan, G., & Ferrara, E. (2021). COVID-19 misinformation and the 2020 US presidential election. The Harvard Kennedy School Misinformation Review.

Emilio’s Social Media:

Twitter: @emilio__ferrara

Linkedin: https://www.linkedin.com/in/emilio-ferrara-160a9215/

Episode 15 Show Notes

If you enjoyed this episode and want to learn more, here are some materials to check out:

Some of Munmun’s Work:

  • Ernala, S. K., Birnbaum, M. L., Candan, K., Rizvi, A., Sterling, W. A., Kane, J. M., and De Choudhury, M. (2019). Methodological Gaps in Predicting Mental Health States from Social Media: Triangulating Diagnostic Signals. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (Glasgow, Scotland, May 4-9, 2019). CHI 2019. http://www.munmund.net/pubs/CHI19_MethodGaps.pdf  
  • Birnbaum, M. L.*, Ernala, S. K.*, Rizvi, A., Arenare, E., Van Meter, A., De Choudhury, M.** and Kane, J. M.** (2019). Detecting Relapse in Youth with Psychotic Disorders Utilizing Patient-Generated and Patient-Contributed Digital Data from Facebook. In Nature Partner Journal – Schizophrenia. npj Schizophrenia. * Co-first authors; ** Co-supervising authors https://www.nature.com/articles/s41537-019-0085-9 
  • Choi, D., Sumner, S., Holland, K., Draper, J., Murphy, S., Bowen, D., Zwald, M., Wang, J., Law, R., Taylor, J., Konjeti, C., and De Choudhury, M. (2020). Development of a Machine Learning Model Using Multiple, Heterogeneous Data Sources to Estimate Weekly US Suicide Fatalities. JAMA Network Open. 2020;3(12):e2030932. https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2774462  
  • Chancellor, S., and De Choudhury, M. (2020). Methods in Predictive Techniques for Mental Health Status on Social Media: A Critical Review. In Nature Partner Journal – Digital Medicine. npj Digital Medicine. https://www.nature.com/articles/s41746-020-0233-7  
  • Saha, K., Torous, J. T., Caine, E. D., and De Choudhury, M. (2020). Psychosocial Effects of the COVID-19 Pandemic: Large-scale Quasi-Experimental Study on Social Media. In the Journal of Medical Internet Research. JMIR. https://www.jmir.org/2020/11/e22600  
  • Chancellor, S., Birnbaum, M. L., Caine, E., Silenzio, V., and De Choudhury, M. (2019). A Taxonomy of Ethical Tensions in Inferring Mental Health States from Social Media. In Proceedings of the 2nd ACM Conference on Fairness, Accountability, and Transparency (Atlanta GA, January 29-31, 2019), forthcoming. FAT* 2019. http://www.munmund.net/pubs/FAT*2019_EthicsTaxonomy.pdf 

Munmun’s Twitter: @munmun10 

Munmun’s Lab: Lab’s Twitter: @SocwebGT

Munmun’s Organization: @gtcomputing

 

Episode 3 Show Notes

If you enjoyed this episode, here are some more materials to check out:

Dame Wendy Hall’s Bio and Twitter

Some of Dame Wendy Hall’s Articles

Berners-Lee, Tim, et al. “A framework for Web Science.” Foundations and Trends in Web Science, vol. 1, no. 1, 2006, p. 1. Gale Academic OneFile

Hendler, J., Shadbolt, N., Hall, W., Berners-Lee, T., & Weitzner, D. (2008). Web science: an interdisciplinary approach to understanding the web. Communications of the ACM, 51(7), 60–69. (Access through Paperpile)

Tiropanis, T., Hall, W., & Shadbolt, N. (2013). The web science observatory. IEEE Intelligent. (Access through Paperpile)

Tiropanis, T., Hall, W., Crowcroft, J., Contractor, N., & Tassiulas, L. (2015). Network science, web science, and internet science. Communications of the ACM, 58(8), 76–82. (Access through Paperpile)

O’Hara, K., Contractor, N. S., Hall, W., Hendler, J., & Shadbolt, N. (2013). Web Science: Understanding the Emergence of Macro-Level Features on the World Wide Web. Foundations and Trends® in Web Science, 4(2–3), 103–267. (Access through Paperpile)

Related to this Episode

Web Science Trust Website 

The future of the four kingdoms of the internet  (An article in the Financial Times about the four internets that Dame Wendy Hall describes in this episode)

Episode 24 Transcript

Azeem Azhar: Research is often blamed for being a bit slow moving, (jokingly) “You know, I’ve been wondering about this topic for 17 years.” Well, not in this case. And it was just over a month later that Moderna produced the first vials of its vaccine, 31 days later, after the sequence was initially released. And that is really, really remarkable.

Noshir Contractor: Welcome to this episode of Untangling the Web, a podcast of the Web Science Trust. I am Noshir Contractor and I will be your host today. On this podcast we bring in thought leaders to explore how the web is shaping society, and how society in turn is shaping the web.

Today, my guest is Azeem Azhar, an entrepreneur, investor and author. You just heard him talk about how exponential technology enabled rapid vaccine development and distribution. He’s the founder of Exponential View, a podcast and newsletter that explores the political economy of the exponential age, reaching an audience of more than 200,000 around the world. He’s also an active startup investor with investments in AI, work from home and climate change. He’s on the board of the Ada Lovelace Institute, and sits on the World Economic Forum’s Global Futures Council on Digital Economy and Society. Previously, he founded PeerIndex, a big data analytics firm acquired in 2015. His first book, “The Exponential Age: How Accelerating Technology is Transforming Business, Politics and Society,” was just published this month and was a featured book at this year’s ACM Web Science 2021 conference. Welcome, Azeem.

Azeem Azhar: Noshir, it’s wonderful to be with you.

Noshir Contractor: I’m so delighted to have you on the show, because you have been at the start of a lot of the things that will develop on the web. And I would love if you could start by talking to us about the role that you played — take us back to where things were at that time.

Azeem Azhar: Oh, I mean, it was just an absolutely amazing time. I first accessed the internet in 1991, through a green screen terminal at university. It just opened my mind: the idea that I could talk to anyone anywhere in the world, and there was a real innocence and intent about how people spoke about issues. They shared a lot of material. There was no sense of there being ownership, in a strange way. It was a real sort of commons of contribution. I graduated university completely unable to get a job, with 53 job rejections, until eventually The Guardian, who had rejected me for several jobs, asked me to come in and help on a little event they were holding in an art gallery. And the event was a web event, and I showed up and the help they needed was setting up the modems to connect to the internet. And that’s where we were in those days. It was very, very primordial. But if I think about what The Guardian in particular was willing to do and how they were willing to experiment in 1994, it’s pretty remarkable. I mean, it’s pretty far-sighted to say, well, we should try and play around with the web.

Noshir Contractor: You mentioned in your book that you were amongst the first people to join social networking websites like 6 Degrees, like Friendster and MySpace. Tell us why you got interested in it. And what attracted you to those websites at those very early stages?

Azeem Azhar: Well, you know, I had already fallen down the hole that Tim Berners-Lee, and then previous to him people like Jon Postel and John Licklider and then Leonard Kleinrock, had created. And I was never going to climb out of that hole, as it were; there was too much to discover. And so you learned quite quickly, early on, if you were an early internet user, that the internet was really about people. So, when these first websites that allowed you to connect with each other emerged, it was a really natural space. And of course, the challenge they had with these social networks was that computers were slow. And you wondered what the purpose of it was, because not everyone was on the internet. It wasn’t really your friends; it was just a bunch of people who happened to have discovered the service. But the power of being able to connect people together was really visible at that time.

Noshir Contractor: And we’ve come such a long way from those early forays. And you describe this journey as being an exponential change. Tell us a little bit about what you mean by that word, exponential in the context of exponential change and the emergence of the exponential age.

Azeem Azhar: An exponential change, you know, mathematically, is essentially any change of a constant proportion. So it’s compound interest. And what I define as an exponential technology is a technology that improves at a rate of 10% or higher every year for the same cost over many, many decades. And the consequence of that with those key technologies is that prices decline very, very rapidly.

As prices decline rapidly, elementary economics tells us that we’ll use more of this stuff. So as computing prices dropped, courtesy of the exponential decline in the cost of computing power, we used much more computing, and I mean billions, tens of billions, hundreds of billions of times more computing. And because of that, what economics tells us is that complementary businesses emerge. There are things that you couldn’t do with this technology that you now can do, and businesses and services emerge on top of them. So from Moore’s law and silicon chips, we got cheap computers. From cheap computers, we got a web that could connect everybody.

But the thing that I found fascinating as I unpacked this question of the impact of this declining price was that it’s not just that things got cheaper, that we use them more frequently, that we might use them in more areas, but that the exponential reality transmitted up to products and services that are quite far removed from the underlying technology. So Facebook was the first product to reach 3 billion users. Many of us don’t think about Moore’s law when we use Facebook, but that’s why it got there. And we’ve just heard in the last few weeks that TikTok is now the most downloaded app in the world, which didn’t even exist when I really started to think about the book. So this idea of exponential reality is that it weaves through from the kind of core technologies all the way through to the products that get built on them. And then the services and entrepreneurs and the market respond. So the technologies and the products demand very, very fast growth rates. And that requires rapid deployment of capital. And so this venture capital industry springs up around to fund these companies very, very quickly. And the thing feeds in on itself. So that’s what I mean by exponential technologies. And the exponential age is this notion that this pattern of accelerating change is becoming widely commonplace across our political economies. And I date that inflection point at some point between 2011 and 2014.

Noshir Contractor: And then what do you mean by the exponential gap in this context? As you point out, the exponential age comes with an exponential gap.

Azeem Azhar: The technologies and the businesses that are built on them, and the people who can take advantage of them, improve exponentially, and they create new potentials, potentials that we perhaps don’t have words for. But we as humans live within societies that are regulated by habits, by norms, by conventions, by formal institutions, and by informal institutions. And largely, those institutions change incrementally, at a linear pace. And so there is a gap that emerges between the acceleration going upwards and this linear trajectory. And I think the exponential gap explains why we have a common pattern of a sense of friction, of division emerging about how we think of some of the fundaments of society in the political economy.
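
The gap described here can be made concrete with a small numerical sketch. It assumes the 10%-per-year threshold from the definition of an exponential technology given earlier in the conversation and, purely as an illustrative assumption, an institution that improves by a fixed 10% of its starting level each year. The function names are hypothetical and the numbers are not taken from the book.

```python
def exponential_level(rate: float, years: int) -> float:
    """Level after compounding `rate` every year, starting from 1.0."""
    return (1.0 + rate) ** years


def linear_level(step: float, years: int) -> float:
    """Level after adding a fixed `step` every year, starting from 1.0."""
    return 1.0 + step * years


def exponential_gap(rate: float, years: int) -> float:
    """Difference between compounding and fixed-step improvement."""
    return exponential_level(rate, years) - linear_level(rate, years)


if __name__ == "__main__":
    # Assumed rate: 10% per year for both curves (illustrative only).
    for years in (1, 10, 20, 30):
        print(
            f"year {years:2d}: "
            f"exponential {exponential_level(0.10, years):6.2f}, "
            f"linear {linear_level(0.10, years):5.2f}, "
            f"gap {exponential_gap(0.10, years):6.2f}"
        )
```

Under these assumptions the two curves coincide after the first year, but after 30 years the compounding curve has improved roughly 17-fold while the fixed-step curve has improved 4-fold.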

Noshir Contractor: What would be examples of our ability to try to address the exponential gap?

Azeem Azhar: I’ll give you one example. Let’s look at companies. Traditionally, the way that we economists have thought about companies, and regulators have thought about companies, is that companies benefit from increasing returns to scale and, at some point, get to some diminishing marginal return. And that diminishing marginal return is like a force of gravity to hold a company to a certain size. The other force of gravity was that industrial inputs progressively got more expensive. The 1,000,000th kilogram of iron ore that you extract costs a lot more than the first kilogram of iron ore. And those things would slow down companies’ abilities to grow very, very big. Now, courtesy of essentially web-based technologies and databases, we start to see companies being able to break that force of gravity, and they do so in two ways. The first is that a lot of companies now benefit from network effects. While the millionth kilogram of iron ore is more expensive to extract than the first, with a network effect business, the millionth customer adds value to all the previous 999,000. That’s a phone network. That’s Facebook, that’s Twitter. There are other types of network effects that emerge in this AI world: data network effects. So we increasingly rely on machine learning and algorithms to derive value in businesses. The data network effect means that the more people who publish web pages, the more people who search on Google and click or don’t click on results, the more information Google has about what good looks like, and no competitive entrant to the market, however hard they try, can get that insight. And with every cycle, every click, every search we do, Google gets better. And its economic moat gets deeper and wider. And so those two things fundamentally change how we need to think about companies. In the 20th century, if a company had 70% market share, you can bet your bottom dollar the CEO had done something dodgy. They had bought up all of the silver, they had fiddled something with the regulators, they had colluded with a competitor. In the exponential age, companies just get to 70% market share, because that’s where network effects take them.

And so then the question is not so much whether these companies are nefarious, or their bosses are good or bad. They may be, they may not be, it’s that the physics of exponential age companies is very different to the physics of an industrial age company. And that is the exponential gap.
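
The network-effect argument above can be illustrated with a toy model. It assumes, as a simplification in the spirit of Metcalfe's law and not something stated in the episode, that a network's value tracks the number of possible pairwise connections between its users. The function names are hypothetical.

```python
def potential_connections(users: int) -> int:
    """Number of distinct user pairs in a network of `users` people."""
    return users * (users - 1) // 2


def connections_added_by_newcomer(users_before: int) -> int:
    """New possible pairs created when one more user joins the network."""
    return potential_connections(users_before + 1) - potential_connections(users_before)


if __name__ == "__main__":
    # Each arrival adds more possible connections than the one before it.
    for users_before in (9, 999, 999_999):
        added = connections_added_by_newcomer(users_before)
        print(f"user number {users_before + 1:,} adds {added:,} possible connections")
```

In this toy model the millionth user creates 999,999 new possible connections, one with each existing user, which is the opposite of the iron-ore case where each additional unit is harder to obtain than the last.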

Noshir Contractor: And this, of course, raises issues of ethical dilemmas that might come along with these rapid growths. And you founded PeerIndex, a big data analytics firm that was then acquired in 2015. And you talk in the book about how that experience in some ways shaped your thinking about the exponential age.

Azeem Azhar: There was a standard that sort of evolved in the early 2000s, called FOAF, friend of a friend. And the idea was that you could use that standard as a way of keeping records of who you know, and what the nature of that relationship was. So there was some semantic depth to it. And I really fell in love with that idea. I built a FOAF browser in a blogging platform that I was running in 2003, 2002. And I had fallen in love with network science, and the fact that you could learn a lot about a group of people through their relationships without necessarily knowing who they were.

And by 2007, 2008, it was clear — Twitter had more than a million users, Facebook had more than 10 million — people were going to get addresses on the internet, they were going to be connected to other people. 

And at the time, these networks were all open. And so I thought, wouldn’t it be really interesting if we could mine and interrogate and analyze and construct analytics in order to help people discover the richness of other people more easily. So the initial idea behind PeerIndex was to help answer questions like, tell me who knows something about sushi in Chicago, or help me find someone who knows something about shin splints in London. And we could do that by being able to look at the pattern of what people are posting on Facebook and Twitter and so on. But we could also then say to you, “Look, this is how you will be seen by systems.” And you can now look at the impact of what you say and do. And we could do that because Facebook and Twitter and these other networks were all open at the time.

Noshir Contractor: That sounds absolutely fantastic. What could possibly go wrong with it? Why are there troubling aspects? Because that sounds like an ability for us to globally know who knows who, who knows what, who knows who knows who knows what.

Azeem Azhar: It’s amazing, and actually, in this funny way, it’s the heart of the problem. The big issue, I think, ends up being partly around consent. We used a model of implied consent, which is: you can always make your Twitter feed private, and you can always ask not to be indexed but leave your things public.

And then there’s the issue of the kinds of things that you can infer about people on the basis of their behavior. We didn’t do this, but we could predict many, many types of personal classes and behaviors. And I think that that’s also problematic. We battled with some of those questions. And in the end, the initial idea that we could provide this as a consumer product for consumers to use didn’t really work out. And what worked out was a marketing analytics product that brands wanted to use to understand audiences. What was quite interesting about moving to the brands was that they didn’t care about individuals; they cared about averages and aggregates. So actually, all those problems went away. But it led to the next issue. Once you understand that you can affect people’s behavior by tweaking aspects of an algorithm or by giving them a score, you actually have some kind of power over them. And that is not power to which they have consented, or that they have any way of challenging.

Noshir Contractor: One of the things you mentioned, as you described the development of PeerIndex, was that at the time these platforms were open. You were able to get the data from there; even if you were implying consent on the part of the users, it was still available. Since then, as you know, platforms like Facebook don’t make that data available any longer. Why do you think that is? Do you think that they are trying to internally monetize the kind of PeerIndex vision that you had?

Azeem Azhar: I think they do it for exactly that reason, which is that the data is in the core heartland of their network effect. So not only does it drive their monetization, because it’s the data that drives the ad targeting. But the second issue is that, once you as a network make your user data entirely visible, I don’t have to be part of the network in order to access your network. And so people forget that there was a product called FriendFeed, and FriendFeed aggregated Twitter and Facebook and a bunch of other things. So in a single panel that was not run by these companies, you could look at all of your social networks in one place, you wouldn’t see the adverts because those were not in the content feeds, and you could message back into those networks. And that weakens the network effect, which is ultimately the source of these companies’ scale. The data policies of the networks were changing very, very rapidly, and they were being tightened. I think the thing that would have been frustrating for me was that the honest reason for why they were being tightened, which was “this is for our strategic long-term benefit” from Facebook or from Twitter, was never the one that was presented, right? The one that was presented was, “we want to provide users like you or I with a consistent user experience. And if you can access Facebook or Twitter from some third party application, they might not get a consistent user experience.” So I think that the real argument, simply, is a business reason: “we wanted all of it, this is our pie.”

Noshir Contractor: One of the chapters in your book talks about the world being “spiky.” As you mentioned, this was obviously a play on Thomas Friedman’s 2005 “World is Flat.” And even before that, William Gibson talked about the future being here, but not evenly distributed. How does your use of the word spiky build on or differentiate from those approaches?

Azeem Azhar: The key idea of the world being flat was this notion that there’s an equalizing force around technology tied to a particular type of economic paradigm, that if people adhered to those rules, things would be better for everyone. And what I think has started to happen, and what we will see because of these technologies, is that, in fact, the local rather than the global will end up being economically and socially more desirable in many, many contexts.

But there’s another part of it, which is that if you randomly form a network, you get these nodes that have got more connections to them; you get agglomeration in a random network. But in a network where people are going to move for economic or emotional or cultural reasons, you are going to see even more agglomeration, because you’re going to see intent as to where people will go. And I think as the world moves to a more complex, advanced economic position, that kind of agglomeration will continue. So my view about the world is that while we will maintain global relationships, and we need to maintain a sense of global governance for certain types of problems (many relate to the web, many relate to things like climate change), we will also start to see increasing spikes emerge and some of the assumptions that were really of that neoliberal era unpick.

Noshir Contractor: So Azeem, I wanted to take us to the present. While you were writing the book, the world was confronted with the pandemic. Clearly, there are two aspects of exponential change. On the one hand, there is the spread of the disease, but to me more interestingly, the development of the vaccine and then getting people vaccinated also represent exponential change. I was particularly intrigued by your description of a particular website, Virological.

Azeem Azhar: Virological is a sort of GitHub for virus scientists. And very early in 2020, on the sixth of January, an Australian university virologist put a very simple statement on Virological. This is a website that typically gets a few dozen visitors a month. And he simply said, “Look, the Shanghai Public Health Clinical Center is releasing a coronavirus genome from a case of respiratory disease from the Wuhan outbreak. The sequence has been deposited on GenBank and will be released as soon as possible.” Now GenBank is a code repository for sequences run by the National Institutes of Health, and people flocked to it.

Within a matter of days, hundreds of researchers are looking at this genome, because it’s new and it’s interesting. And we’ve not really got cases outside of China by this point. But what I found fascinating is that research is often blamed for being a bit slow moving, you know, “I’ve been wondering about this chapter for 17 years.” Well, not in this case. And it was just over a month later that Moderna produced the first vials of its vaccine, 31 days later, after the sequence was initially released. And that is really, really remarkable. What’s remarkable is not just that we could sequence the virus so easily. And that’s as a consequence of another exponential technology, which is genome sequencing. But then courtesy of the web, which is another exponential technology, we were able to get it out to, you know, hundreds and thousands of people. And then the techniques that Moderna used, many of which relied on a machine learning based system to help manage data, discover data and look for patterns, were also applications of exponential technologies. And so you end up, within 12 months of the virus being identified, with seven different vaccines that had been approved, and 24 million people had received their first shot of the vaccine. A large part of just being able to do this and coordinate people to deliver and then receive the vaccine is entirely dependent on computers and databases and smartphones.

Noshir Contractor:  One of the things you talk about in the exponential thesis is that there was a change, an exponential change both in the amount of things being invented, and the ways in which they get scaled. Tell us a little bit about the difference that you see between the exponential change in invention versus scaling up?

Azeem Azhar: There are larger markets to go after. And it’s cheaper to do this invention than it ever has been. One core idea that I talked about is the idea of combinations, the fact that technologies from different domains can combine and they’re reliant on there being open standards and modularity.

On the other hand, the question is, why can we then adopt them so much more quickly? And I think this is where part of the thesis is a bit complex, right? The reason we can adopt them so much more quickly relates to the fact that there are global networks of information and global networks of distribution. And I think back to the first iPhone, which was launched in 2007. And it was available in one store in San Francisco, just off Union Square. And when the iPhone 12 was released, it was available in 300 cities around the world on the same day. And that is a testament to being able to coordinate and deliver these products all over the place at the same time. And I think that that’s one of the interesting wrappings of the book and my argument, which is that the exponential age isn’t just about a process where silicon chips get faster and faster and faster. It’s that that speed, that acceleration, has a way of echoing through other parts of industry, and then butting in quite quickly into the rest of our lives.

Noshir Contractor: That teases well for my closing question. We spent some time talking about the pace with which the exponential age is upon us. Will it ever stop?

Azeem Azhar: Well, I think in the timeframe that I’m thinking about in the book, which is, you know, decades, it will continue. I think we’re scratching the surface; there are still incredible breakthroughs that are happening. And even things that happened while I was writing the book: I talk in the book about a Romanian company called UiPath. And when I wrote the first draft, UiPath was one of the fastest growing software companies in Europe and was worth a billion dollars. By the second draft, I had to write that up to 7 billion; by the third, it was past 10. And just as we were going to print, I had to quickly go in and change that number to $35 billion. So it’s a 35x increase in the valuation in the year or so that I went from first draft to going into print. So I think it does continue. There’s a more metaphysical, I suppose, question, maybe it’s a physics question, which is, can it continue forever? I mean, physicists will tell you that, ultimately, there are a limited number of atoms in the universe. And there are sort of issues of their complexity and what an atom can really support. So I’m sure there could be some physical limit to all of this. But that is the subject of a book that will have to be written by somebody else.

Noshir Contractor: Very good then. But speaking of your book, I really enjoyed it, and I would recommend it very much. The title of the book is “The Exponential Age: How Accelerating Technology is Transforming Business, Politics and Society.” Azeem, thank you so much for taking time to talk with us today about the exponential change that we are witnessing and, in particular, being able to tie it in many cases to topics of interest to those who are following the web and web science in particular. Thank you so much again.

Azeem Azhar: Thank you, Noshir. Really appreciate it. 

Episode 20 Transcript

Richard Rogers: What has interested me, and what I’ve developed as a so-called web epistemologist, is thinking about not just what’s specific about the culture, so what one would call web or platform vernaculars nowadays, but also what’s specific about the methods.

Noshir Contractor: Welcome to this episode of Untangling the Web, a podcast of the Web Science Trust. I am Noshir Contractor and I will be your host today. On this podcast we bring in thought leaders to explore how the web is shaping society and how society in turn is shaping the web.

My guest today is Richard Rogers. You just heard him speak about what he terms “digital methods.” Richard is a professor and chair of New Media and Digital Culture at the University of Amsterdam. He is also Director of the Digital Methods Initiative, known for the development of software tools for the study of online data. And he is the author of two award-winning books, Information Politics on the Web and Digital Methods, among others. His most recent book is titled Doing Digital Methods. He is currently working on a book titled Mainstreaming the Fringe: How Misinformation Propagates in Social Media. And Richard was Program co-chair for one of the very first Web Science conferences back in 2013. Welcome, Richard.

Richard Rogers: Thanks very much. Great to be here. 

Noshir Contractor: I’m delighted that you’re able to join us today. Take us back to those early days when you were first getting involved in the web. What prompted you to think about focusing on the web as the object of study?

Richard Rogers: Oh, that takes me way back. So I think it was in the mid-90s, when I was asked to write an article about climate change, that I started sort of surfing around and noticed that certain websites linked to other websites, but then the websites didn’t link back. So that’s when I started thinking about creating software that actually maps how websites link to one another, ultimately resulting in a piece of software called the Issue Crawler, which to this day is still crawling the web and mapping links between websites.

Noshir Contractor: Tell us more about the issue crawler; that was definitely one of the first tools to study the web. And tell us what you intended it to do, why you call it the issue crawler, and where it is headed these days.

Richard Rogers: So when we started looking at links between websites, what we noticed was that a lot of websites would be linking to one another around social issues. So we coined the term issue networks (well, not so much coined the term as sort of repurposed it), looking at how not only NGOs and academics but also governments and corporations would be interlinking or not linking, and so we came up with a kind of link language. So there were critical links; this is like Greenpeace linking to Shell. There were aspirational links; these were NGOs linking to governmental organizations or international organizations, and then international organizations wouldn’t link back. So there were these missing links. We call this sort of the politics of association. And that’s what we were putting on display with our link maps.

Noshir Contractor: How would you interpret when one website links to another and the other does not return the link or reciprocate the link?

Richard Rogers: It’s about reputation, largely. We found that, for example, in one small study of Armenian NGOs, they would link copiously to one another, and then they would also sort of aspirationally link to UN organizations, and the UN organizations would link to one another, but then they wouldn’t link to the Armenian organizations. So it’s a kind of lack of recognition. It’s about reputation, it’s about relevance, in some sense.

Noshir Contractor: How relevant is web linking today as compared to what it was when you were first developing issue crawler?

Richard Rogers: So it’s interesting, when I first started writing about hyperlinks, I talked about them in terms of a sort of link economy, and the link economy actually supplanting an earlier economy, which I refer to as the hit economy. And so now, you could argue that the like economy has taken over from the link economy. And of course, we’ve seen the sort of widespread industrialization of the hyperlink. You also see that links have changed, right. So it’s actually quite complicated, more complicated than it used to be, to map links.

Noshir Contractor: You talked about the evolution from the link economy to the like economy. Tell us more about what you mean by the like economy.

Richard Rogers: There’s a term that I sort of repurposed from sort of critical business studies called vanity metrics. And so I’ve been studying, quote, unquote, vanity metrics. And this is follower counts, like counts, view counts, all of these numbers that show how well you’re doing online, especially in social media. This is what you could summarize as the like economy. 

Noshir Contractor: One of your major contributions to web science over the years has been your work in the area of web epistemology. Can you tell us a little bit more about how you got interested in that, what it means and what have we learnt about that?

Richard Rogers: So generally speaking, web epistemology is the study of the web as a particular knowledge and/or information culture with its own specificities. What has interested me, and what I’ve developed as a so-called web epistemologist, is thinking about not just what’s specific about the culture, so what one would call web or platform vernaculars nowadays, but also what’s specific about the methods. And so what I’ve tried to develop over the years are what I’ve called digital methods.

Noshir Contractor: What are some of the things that we have unearthed that we would not have been able to do if we didn’t think about the web from an epistemological standpoint? 

Richard Rogers: If you think about web science in particular, I think it came from a particular insight about the web: that the web is not just a cyberspace, as we once thought, this sort of realm apart. It’s not necessarily only to be studied as the virtual, or as a virtual society; rather, the web has interesting societal data, right? How do you then capture this data and think about making findings that you then ground in some ways? Amongst those ways would be to ground them, quote unquote, online. So this is one of the notions I’ve tried to develop: online groundedness. So the idea is using web data to make findings about what’s happening in society and culture, and then grounding them in the online. Of course, we can triangulate; we can bring in other data from, you know, the ground. But we can also bring in data from different realms online.

Noshir Contractor: One of the things that you touched on here is the ability to be able to study all of society, not just the online world, but by using tools that are gleaning information from multiple platforms online. Could you give me an example to make this more tangible, a concrete example of an issue that is more pervasive, but that you’re able to glean information from one or more online sources to get insights into it?

Richard Rogers: Well, I mean, you know, the flagship project was Google Flu Trends. And that was a very interesting project, and it ran for a number of years, and what it did was anticipate the incidence of flu by search queries. And what went wrong with Google Flu Trends is sort of a general warning, or admonition, about this sort of work, right? So, when people are searching, are they searching because they have symptoms? Or are they searching because it’s flu season, and they’ve heard about flu season on the TV news? So is the phenomenon happening in the wild? Or is it happening in media? I mean, that for me was one of the more interesting examples, also because of the critique thereof. But there are others as well. So for a number of years, for example, queries on AllRecipes.com were used in order to sort of map the geography of taste in the US.

Noshir Contractor: This area that you just talked about, the example that you gave, which is fascinating, is part of the infrastructure that you’ve been developing, more generally called the Digital Methods Initiative. The goal of that is to do research that goes beyond the study of online culture only. Can you tell us more about the genesis of the Digital Methods Initiative? And what are the kinds of things that you believe you could observe and study as part of the Digital Methods Initiative?

Richard Rogers: So it goes back to the beginning of web science, in fact, so it goes back to 2007. And it’s been around since then. We’ve developed, I think, about 100 tools. And most of it is situated software. So we come up with software that we need for a particular research project, and then a lot of it sticks around, it becomes more sort of general purpose, but other tools go away, depending on use. But right now, we maintain quite a lot.

And we use this software both for societal and cultural research, as well as sort of media research, media critique. More specifically, in a recent study we looked at what happened to about 20 so-called extreme internet celebrities when they were deplatformed from mainstream social media platforms. And then they migrated to Telegram. So we built a Telegram data extraction tool in order to see what they were doing online there and to see whether or not they were acting in the same ways that they were acting before, for example.

Noshir Contractor: And what did you find?

Richard Rogers: We found a few things, some intuitive, but a couple of things that were really counterintuitive. So the intuitive finding was that their audiences had thinned considerably. Counterintuitive was that they were still posting the same amount, or they were posting very, very frequently. And this went on for quite a few months, despite the fact that you could say that the platform had less sort of oxygen-giving capacity, in the sense that there are fewer viewers. But the most counterintuitive thing that we found was that their language became far less offensive over time, which then led to a number of different speculations. One speculation was that maybe they were offensive before for their audience, and they’re not just generally that offensive, for example. Or that they entered such an offensive space that they couldn’t be more offensive than the space that they were in. So these are two different scenarios, let’s say. But nevertheless, those were some of the constitutive findings.

Noshir Contractor: I want to take us to an exhibit that you were involved in, which was featured at the ZKM, entitled Making Things Public: Atmospheres of Democracy, that was curated by Bruno Latour and Peter Weibel. That sounds fascinating. Tell us more about this exhibit.

Richard Rogers: We built a couple of exhibition interactives. One is called the issue barometer. And the issue barometer would basically show the rise and fall of attention to particular social issues. So we took a set of NGOs, multi-issue ones, also single-issue ones, and made an issue list on the basis of what it is that they were campaigning for on their websites. And then over the course of three years, we followed their campaigning behavior, showing how attention to particular issues rises and falls.

Noshir Contractor: To what extent do you think this helped illuminate this issue for the general audience or policymakers? Do you see that these kinds of tools might increase literacy or awareness about some of these issues?

Richard Rogers: Yes, I think so. This is sort of issue trend research, if you will. You can imagine policymakers these days with issue trend dashboards, so this is one of the earlier ones. But this was also in some ways a mirror for non-governmental organizations. So are you demonstrating commitment, despite changes in funder agendas, and sticking with particular issues? Or are you sort of following the money, so to speak? And so this was also part of the critical angle to this particular exhibition.

Noshir Contractor: To what extent are you able to use these kinds of methods to uncover disparities that may exist between the global south and the West, for example, or other forms of disparities that we see in society? Are there some examples from your work that show how these methods can bring exposure and bring those issues and those disparities to light?

Richard Rogers: What I just described, colleagues and I termed issue drift. And so particularly, nongovernmental organizations or governmental organizations sort of drifting away from things that are important when they could be sticking with them. One of the kind of critical projects that we undertook along these lines was called issue celebrities. We looked at a very important issue in the global south. And that is awareness of mines, landmines, and the clearing of mines and landmine-related injuries. And we looked in particular at a charity or funding organization that was set up by Paul McCartney and his wife at the time, Heather Mills. And it was quite serious: they raised, year after year, something like $4 million, which was quite close to the total UN budget for the same activity, but then they broke up. So what happens to this global south issue when these celebrities break up and then leave it? It seems cynical on the one hand, but it’s quite serious on the other when we’re talking about this kind of money. So this is one project that addresses that particular aspect.

Noshir Contractor: You’re working on this book, Mainstreaming the Fringe: How Misinformation Propagates in Social Media.

Richard Rogers: In the run-up to the 2020 US elections, we studied the extent of the so-called misinformation problem with a cross-platform analytical approach on seven social media platforms, and we found that, each in quite specific ways, they all marginalize the mainstream. So for example, Twitter amplifies what is referred to oftentimes as hyperpartisan sources. On TikTok, they use particular sort of ironic sounds to instill mistrust when a mainstream media clip, for example, is played. But in all very specific ways, each of them sort of marginalized the mainstream. And of course, this has, you know, quite some implications for, you know, taking news seriously.

Noshir Contractor: I’m still stuck back on what you mentioned earlier about TikTok sounds, tell me more about what you mean by that.

Richard Rogers: TikTok is this sort of music-driven platform, and on the interface, when a particular sound is used, you can click on the sound and see other videos with the same sound. And so you can sort of map the use of particular sounds, okay. So there are certain sounds which are used to instill mistrust in what it is that you’re looking at. And so this is quite interesting. And it turns out that a lot of the top, let’s call them political, videos on TikTok in the run-up to the 2020 US presidential elections were using those sounds. It develops into a kind of new type of misinformation. A lot of the videos are satirical, right? So you think that it’s no big deal, but at the same time, the satirical videos are introducing other sorts of misinformation techniques. So you’re getting these hybrid types across social media platforms; you get new hybridities that complicate the sort of typical typologies of misinformation. But the one on TikTok, I found, was particularly interesting.

Noshir Contractor: I think one of the things that has recently emerged in web science is the endeavor to study multiple platforms. And you, Richard, have been at the forefront of being able to look at these multiple platforms. What I found interesting about the examples that you gave is that in many ways, while multiple platforms might allow us to triangulate some insights, you’re also finding that each of these platforms is used in distinct ways.

Richard Rogers: I’ve been working on the kind of difficult problem of commensurability in cross-platform analysis. Especially in marketing research, a lot of the work that’s done on cross-platform analysis is about the study of engagement. So each platform has metrics. But each platform is also quite specific, right? So you can’t just blindly think that hashtag usage is the same on Twitter as it is on Facebook, as it is somewhere else. My sort of short answer is that you need to understand the, quote unquote, platform vernacular: which types of digital objects are privileged and which are not privileged. And with that knowledge, you can then move towards something that is a more satisfactory striving for commensurability.

Noshir Contractor: That’s really been a challenge. I noticed that you’ve been spending some time focusing on a technical definition of “memes.” Tell us more about what got you interested in this particular topic at this particular point in time?

Richard Rogers: There was a Facebook engineer who was quoted a year or two ago saying, you know, 95% of the content that’s passing through is memes. And I was like, oh! What I came across is that, depending on the software, memes are defined differently. For example, Know Your Meme, which is this sort of well-known database that started in 2006 or 2007, has a particular way of thinking about a meme, and that is as a sort of special internet phenomenon that requires a literacy in order to understand. On the other hand, if you go to CrowdTangle, which is Facebook’s data collection software, both for research as well as for marketing, it has a meme search. And what it finds are images with text. Okay? So images with text is a very, very roomy definition of a meme. And the database definition is quite different.

And then in the middle are a number of other ones. What I was looking at recently was: Okay, so what’s a meme? What’s a meme according to, for example, IRA disinformation operatives? So I went through about six or seven of these different ways of thinking about memes.

Noshir Contractor: And what would these definitions allow us to do more specifically? You know, what is the advantage of creating this classification? What new insights do we gain by using this classification?

Richard Rogers: When thinking about how to study memes, you want to think about how to sort of demarcate this phenomenon, right. And there are a variety of different ways, and I think that’s the largest contribution. More specifically, what I’ve been doing is thinking about different kinds of automation practices of meme detection. And what we’re finding, generally speaking, is that the automated detection mechanisms are currently not that good at detecting what a person or set of people who are doing close reading would call a meme.

Noshir Contractor: Well this is interesting, thank you again, Richard, for giving us these little peeks with specifics, and all the rich kind of research that you’ve been doing and all your contributions over the years to a broader understanding of web science. And I wish you the best as you continue some of these efforts, and we’ll be tracking them in the years ahead.

Richard Rogers: Yeah, till then. My pleasure.

Episode 17 Transcript

Emilio Ferrara: I feel like in web science, jointly adopting theories and data science tools and computational tools allows you to come up with the right blend of theory and data. That allows us to understand this phenomenon beyond just simple characterization, or simple theoretical explanation without support from empirical evidence. This is really the fascinating power of web science, putting together these two things and balancing them together.

Noshir Contractor: You just heard from our guest today, Emilio Ferrara, who is an Associate Professor at the University of Southern California in Los Angeles. He has appointments in Communication at the Annenberg School for Communication & Journalism, in Computer Science at the Viterbi School of Engineering, and in Preventive Medicine at the Keck School of Medicine.

He’s also a Research Team Leader for AI at USC’s Information Sciences Institute and the Director of the Annenberg Networks Network, ANN for short. And earlier this year, Emilio became the Chair of the Web Science Trust Network of Laboratories (WSTNet for short), which makes him an especially valuable guest today to talk with us about what he envisions as the next generation of Web Science. Welcome, Emilio.

Emilio Ferrara: Thank you very much, Noshir. I appreciate being here today.

Noshir Contractor: I’m just delighted that we have an opportunity to talk with you. I want to start with a tagline that I see associated with you, and that is your interest in networks and societies plus humans and machines. Can you unpack what you mean?

Emilio Ferrara: These new technologies emerge within our society, and they have effects on our society. They have effects on the ways we connect with each other. They have effects on what we see on the web, and so on. So I feel like one of the most interesting opportunities in the context of web science that has been emerging over the last couple of years is certainly the ability to study how all these components interact: social media platforms, social networks, online and offline. And these emerging tools of artificial intelligence allow humans and machines to collaborate with each other. So this intersection of these different disciplines and areas has been the focus of my research for the last decade.

Noshir Contractor: You’ve done a lot of work, obviously, in the area of social bots. How would you consider or characterize the research that you’ve been doing as an instantiation of this tagline that you have, in terms of networks and societies and humans and machines?

Emilio Ferrara: So we’re seeing the effect of social bots, which are accounts controlled in part or entirely by software, rather than human users, on social media platforms in a number of application domains spanning from politics to public health, and so on. And these accounts are operating in public spaces in online networks, and they also interact with human users. So here you have the intersection between humans and machines affecting our social networks and our society.

Noshir Contractor: That’s terrific. Now, one of the first areas where I began to read about your work was the DARPA Twitter bot challenge that got a lot of visibility. DARPA has had an interesting history with technologies, having been involved in helping get the internet itself started. And this Twitter bot challenge was something that caught a lot of attention. Tell us what the challenge was about and what you learned from your experience with it.

Emilio Ferrara: Our team at Indiana University, led by Fil Menczer and Alex Flammini, was selected for this program. And I was lucky enough to be part of these efforts. And one of the goals of this program, as it developed, was to understand the possible effects of social bots on social media communication, especially in the context of public health.

So in 2014 DARPA organized this challenge for the detection of bots engaged in the vaccine debate online. The goal was to distinguish these anti-vaccine and pro-vaccine bots in the discussion. The challenge itself took place over several weeks, over which we would receive Twitter content, sort of a playback of the Twitter content, on which we would deploy our technologies to detect such accounts.

So three teams did extraordinarily well. They detected all the bots: the team led by Subrahmanian at the University of Maryland, the team at Indiana University led by Fil Menczer and Alex Flammini that I was part of, and the team at USC led by Aram Galstyan and Kristina Lerman. And ultimately, this was just before I moved from Indiana University to USC.

Noshir Contractor: So you moved from one winning team to the other winning team, our competition? 

Emilio Ferrara: Yes.

Noshir Contractor: Back then, you were already looking at understanding the role of social bots and social media to deal with issues associated with vaccination, way before we all were experiencing what we have over the past year with the COVID crisis.

Emilio Ferrara: I feel like public health has been maybe the most salient area where the manipulation of social media can have an impact on the real world and the change of behavior in the real world. Of course, there has been a lot of emphasis on politics, but public health sometimes goes under the radar. And it’s really not well established yet, the extent to which the manipulation of public health-related discussion can be detrimental and dangerous for our society, and social media definitely play a big role. So we looked at vaccine debates long before COVID-19; we started looking into that around 2013 or so.

And that was just around the time when a big measles outbreak occurred in California. And interestingly enough, the vast majority of anti-vaxxer and anti-vaccination groups were actually aligning with left-leaning or more liberal ideologies. Whereas today, when we look at the hesitancy around the COVID vaccine efforts, these emerge mostly from more conservative users. So this should tell you how much of a bipartisan issue vaccine hesitancy is, and how important it is to understand vaccine hesitancy through the lens of social media, because social media allow us to get a very diverse representation of political ideologies and how these ideologies interact with public health behavior.

Noshir Contractor: I want to go back to something you said, and that is that initially, the anti-vaccine groups were largely from the left, and now they’re from the right. I understand the second one. What is your explanation for the first?

Emilio Ferrara: That is an interesting phenomenon that we have seen, and you’re definitely right. Early on, you know, a decade ago or so, these anti-vaccination movements, especially those opposing mandatory vaccine regulations for children, emerged mostly from liberal, progressive users. These were individuals with high education, typically from a good upbringing, in urban areas in rich states like California and, you know, states on the West Coast. And this was not necessarily for religious beliefs, but really for personal beliefs, and concerns about vaccine safety, vaccine side effects in children and so on. Interestingly enough, we have seen this shift towards a more diverse population of users that are opposing vaccination campaigns, and a shift towards more conservative users opposing COVID vaccinations over this last year or so. So it’s really a bipartisan issue. And it’s a very complex issue that cannot be explained exclusively by political beliefs.

Noshir Contractor: You’ve talked about the sort of relationship between political polarization on the one hand and online conversations, not just about politics, but also about the pandemic. And you published an article late last year titled “What types of COVID-19 conspiracies are populated by Twitter bots?” So Emilio, my question to you: what types of COVID-19 conspiracies are populated by Twitter bots?

Emilio Ferrara: Unfortunately, a lot. And unfortunately, some of the worst conspiracies that you can imagine. So this is a paper in which I took an early look at the landscape of COVID-related discussion on social media. So we were lucky enough to have this foresight in our lab to start tracking COVID discussion early on, maybe before everyone else did, in January 2020. And we also published this data set. We made it openly available to the research community, and published the associated paper in the Journal of Medical Internet Research.

We made it available because we thought this would eliminate one of the barriers, allowing researchers to get large data sets and understand this phenomenon beyond what we could do in house in our lab.

In this study, I highlighted the role of tens of thousands of bots over the first couple of months of COVID. And it turned out the bots were active in the spread of political conspiracies, conspiracies of various types, conspiracies pertaining to the origin of the virus. Some of the bots suggested that the virus was a bioweapon that was deliberately created, for example, by China, and that it was deliberately spread to the United States and the rest of the world. And this created a lot of anti-Asian sentiment. So that was a very problematic kind of conspiracy. The article also highlights how other conspiracies focus on misinformation about treatments.

But I feel like the most concerning kinds of conspiracies that the study highlights are related to bots that spread extreme political beliefs, beliefs that are mostly aligned with the alt-right movements and far-right ideology and so on. So what this study highlights is the attempt to hijack the COVID discussion and turn it into political extremism. So some of the most active bots that we uncovered, that I documented in this study, are bots that effectively spread QAnon. And some of the most prominent hashtags that we see spread by these bots are hashtags that are very popular among white supremacists.

There are very many troublesome ideas and ideologies that have been spread and injected into COVID. And many of them have been pushed by bots. One thing that fortunately we have observed is the fact that Twitter ultimately suspended a large fraction of these accounts. So there was a mitigation strategy in place. But this took place many, many months after these accounts started to spread these ideas. So it was already late; in some sense, they had already had a large effect on the network in terms of spreading these problematic ideas.

Noshir Contractor: You’re using web science to study this phenomenon that was created on the web. What are some of the ways in which web science is providing you unique tools and techniques to study it, for better and for worse?

Emilio Ferrara: That is an absolutely interesting question. And actually, I feel like web science was the catalyst that started this entire research direction. In fact, early on in 2014, we published one of the first studies that looked at how to use the tools of web science to study social movements online. Over the last several years, we have been using the same tools of web science to study other social movements. For example, we have some work coming out immediately where we studied Black Lives Matter through the same lens of social media and social media discourse. So web science has allowed us to focus on the behavior of individuals and the communities and groups on these platforms, and understand how these collectives emerge and characterize their activities, not only from a computational standpoint, from a data-driven standpoint, but also from a theoretical standpoint, looking at these groups as organizations, looking at these groups as collectives.

And I feel like in web science, jointly adopting theories and data science tools and computational tools allows you to come up with the right blend of theory and data. That allows us to understand this phenomenon beyond just simple characterization, or simple theoretical explanation without support from empirical evidence. This is really the fascinating power of web science, putting together these two things and balancing them together. And we have been learning a lot about how to increase diversity, how to understand the biases, and so on, through the lens of the web.

Noshir Contractor: I want to touch on something you just mentioned, the extent to which web science should, in your opinion, be primarily concerned with identifying issues, identifying bias, recognizing things that might not be obvious. And then on the other hand, for lack of a better phrase, doing something about it. How do you see where web science is currently positioning itself? And how well is it doing on either of these? And how much should it be doing in each of these areas?

Emilio Ferrara: It’s dear to my own heart and research agenda to understand how we can use the web, and the technologies that are enabled by artificial intelligence and so on, to improve the web and to improve our society, you know, very directly, right?

I was delighted to see that this year’s Web Science Conference topic is actually revolving around making the web a more diverse, more equitable place, using web science as a framework. And I feel like web science, in its transdisciplinary nature, provides the best tools to do that. We have the artifacts, we have the machines, we have the technologies and tools. And on the other hand, we have the collectives, the networks, the aggregations where people come together, and they are at the same time using these tools, but they’re also affected by these tools, right. And in my opinion, if you look at these dimensions in a disjointed manner, you’re only going to be able to grasp a partial view of the problems. You need to look at both sides if you really want to understand how you can improve society using these tools, and how you can mitigate the negative effects of these tools on human users. And I feel like web science offers the best lens to do that.

Noshir Contractor: And you have gone beyond your own research. When I introduced you for this episode, I mentioned that you have recently become the director of the Web Science Trust Network of Laboratories. So first of all, congratulations, and thank you for taking on that important role. I would like you to tell our listeners a little bit about what the Web Science Trust Network of Laboratories is, and what you see as a vision going forward for this network.

Emilio Ferrara: So the Web Science Trust operates a network of laboratories that are spread around the whole world; there are more than 20 such labs affiliated in this network. And they are all very well-known groups of researchers whose work is often associated with web science and other sister disciplines, all revolving around the study of social networks, the web, human and machine behavior and so on. These centers have been pushing the discipline collectively over the last two decades or so. And the network itself plays a role in shepherding, in some sense, the community and the official direction of this discipline.

I feel like as a director, my dream would be to enable all these labs belonging to this network to collectively operate and create new initiatives that can push forward with data, and can push the impact of web science in our world into even more evident, obvious avenues. So we are embarking on collective initiatives to pursue larger projects, collaborative projects that try to cross national boundaries, and push for more diversity in this field and more diversity even in the labs and in the discipline as well. That means coming up with our moonshot, which would be at least one major research project, bringing on board as many of these labs, as many of these countries, as many of these sub-communities as possible, and trying to pursue such a moonshot project. So there is a bright future in my mind ahead for the web science community, for the Web Science Trust and the network of laboratories.

Noshir Contractor: You did mention the word moonshot. If I had to ask you at this point, what would be an example of a moonshot study that would involve all the WSNET labs or at least a large number of them? What would that look like?

Emilio Ferrara: As for the moonshot, I feel like one of the major roles that web science and the web science community can have in the future is really operating as a glue to bring together people from different fields and encourage them to pay attention to maybe the web as a societal glue, as a system of systems, and allow the study of the web, or systems of systems, with respect to some of the emerging problems that we collectively face as a society.

The pandemic, of course, may be the problem that is on everyone’s mind these days. But there are many other problems that emerge. And the web can provide a lens to study them: sustainability, climate change, and so on are definitely very important problems. I feel like that’s something where the Web Science Trust can contribute because we can use the web as a monitor, as an observatory, to understand how people think about these problems, how people pay or do not pay attention to climate change, to sustainability issues and so on.

Obviously, there is another big problem that revolves around artificial intelligence and automation. So you know, as this AI revolution keeps emerging, there are going to be issues with job displacement. Web science, again, provides a lens to study human behavior in these new contexts and a way, maybe, to anticipate issues and problems that will exist in the future of work and society.

And then, of course, there is always the aspect of democracy that is very dear to my heart, right. So we observe the world change as a reflection of all these phenomena: public health, the pandemic, automation, and so on. Our countries, our democracies are constantly in peril, in danger, right, because we have seen the rise of these nationalist movements, and extremism of every kind and sort, that have been growing on the web. And we should study them through the web, right? Because that’s their natural environment. And these are the tools that we have at our disposal, and we should maximize them. So I think these are going to be part of the big moonshot that the Web Science Trust, and web science as a community, should hopefully contribute to in the near future.

Noshir Contractor: That is extremely exciting. I think the idea of taking web science and training its focus on some of these grand societal challenges would be incredibly powerful and compelling, if for no other reason than because so much of what is happening in all of these contexts is being coordinated via the web. And so as you said, using the web as an observatory and as a monitoring platform becomes important. And as you said, beyond just using it to monitor, you also have the ability to change some of these phenomena as a result of the tools and the technologies that we have related to the web.

I want to thank you again, Emilio, so much for all the excellent work that you’ve been doing in this area, for your leadership on the Web Science Trust Network of Laboratories, and for coming and sharing some of these ideas and exciting plans that you have with us today. So thank you again, very much, and we look forward to getting much more insights and research and leadership from you in the decade ahead.

Emilio Ferrara: Thank you very much. It was my privilege.

Episode 16 Transcript

Aleks Krotoski: I think that is the thing that has surprised me the most about where the web is now. The requirement, the necessity for people to present as their offline selves, whether that’s for commercial purposes, or for social and psychological purposes. The great playground that we had of identity, the idea of being shielded from full identity revelation that we experienced, even as late as 2009. You know, we don’t have that as much anymore. We aren’t able to play with our identities as much as we were anymore. And I think that that has very interesting consequences for not just how we study web science, but also for the actual experience of the people who are living in this digital world.

Noshir Contractor: Today, our guest is Dr. Aleks Krotoski. Earlier, you heard her talk about the web’s impact on people’s offline and online identities. She’s an award-winning international broadcaster, author and academic, and she studies and writes about technology and interactivity. In 2009, she earned a Ph.D. from the University of Surrey, with her thesis focusing on information flow and the spread of ideas across digital spaces. Her book, “Untangling the Web: What the Internet is Doing to You,” based on her hit columns in the Guardian and Observer, was published in 2012. Since then, she’s continued to break ground in academia and journalism, and she’s currently a Visiting Fellow in the Media and Communications Department at the London School of Economics and Political Science and a Research Associate at the Oxford Internet Institute. Welcome, Aleks.

Aleks Krotoski: Hi, thank you Nosh, it’s wonderful to see you.

Noshir Contractor: It’s really good to see you and hear from you as well today. It’s been a long time since we spoke, and when I first met with you, you were working on your dissertation research, which I believe was one of the first efforts at doing a social science research project, which then we came to know as web science research. Tell us about your dissertation and your research and what got you interested in it.

Aleks Krotoski: You’re sending me down memory lane. Let me go back to what got me interested in it. So way back in the day, I mean, this is 150 years ago, I was presenting a television program in the UK about computer games. And over time, I became one of the assistant producers and became sort of instrumental in identifying the things that we should look at and the topics that we should cover and who went into these spaces. One of my co-presenters, I assigned her to review the game Asheron’s Call; this is the era that we’re talking about. And I thought she was gonna come back and tell me about all of these guys who sit around in their parents’ dark basements being really geeky, but she came back talking to me about the most extraordinary social phenomena I had come across to date.

As a social psychologist, I had been interested in looking at how social interactions, group dynamics were functioning. But when I sent her into the space, I didn’t realize that I was going to hear back from her about justice systems, I didn’t realize I was gonna hear back from her about identity and identity play and identity development, I didn’t realize that all of these dynamics were not only present in these spaces, but also, they were effectively recreating the systems that we already had offline. That blew my mind.

These were spaces, which in my mind, were completely separate from the physical environment in which we lived a place where we would be able to reinvent ourselves entirely, completely come up with new systems. And yet, here was evidence again and again — I started to recognize that we weren’t reinventing anything, we weren’t coming up with brand new systems, we were simply bringing our existing ideas about who we are, who society is, into the online space, and I really wanted to understand and explore that. So when I was looking initially at the work that I wanted to do, coming from a social psychology background, I was interested in patterns of communication. 

Now at that time, network science was really coming into its own and there were two distinct, I would argue, sort of brands of network science. There was the more mathematical science. Then you had the more sociological elements. And I was like, well, here we have a technology that actually describes those connections. And then you can ask the people about those connections. And you can track and trace and follow networks of information to see to what degree the online world, and those systems and those processes that we already knew offline, reflected, or were different from, the offline processes. So that was the nugget. It all came back from Asheron’s Call and me sending somebody into the virtual world.

Noshir Contractor: There was a definite assumption that the offline and the online were very separate worlds. And that what we might find in the online world would be very different, perhaps, than what happens in the offline world. And you were looking in particular at a particular phenomenon in these online games. Do you want to tell us a little bit more?

Aleks Krotoski: I was looking at influence. And I was looking at the adoption of an innovation. I was using the virtual world Second Life as my territory. And I remember, every time I logged in, I would write down the number of people who had accounts and the number of people who were active. The numbers that I was writing down were like 2,500 people have accounts, which eventually turned into something like 15, 16 million accounts. Like, I watched the explosion of this virtual world, just a phenomenal network, a profoundly enormous network.

And at that time, Voice over Internet Protocol (VoIP) was a system that was being introduced into Second Life. The developers themselves thought that this would be a very natural technology that people would adopt, because it would allow for business transactions, it would allow for interpersonal transactions, whatever transactions went on. But what they found is that people were not adopting it. And I was curious as to why. I wanted to know why this was stalling. So I mapped a subset of 47,000 accounts, got the reciprocal relationships, and then really dug into what those meant to the people who were within that network. What we found is that online, there was a difference in the adoption of innovations.

It came back to a very, very interesting, but a very well known phenomenon offline, which is about who you believe is credible, who you believe is trustworthy, who you believe is like yourself, and who you believe is a very prototypical member of your interpersonal network. Now, the difference between online and off is that usually those networks in face-to-face experiences, they’re quite rich.

As we know through web science, the nature of interpersonal interaction offline and online is different because of the richness, or the leanness, of the medium, just the amount of stuff that we can read without having to literally read the information. What I found in this research was that you had the initial adopters. And then it reached a point at which people were either gender playing, or they were age playing, or they were playing identities that were not identical to, or did not overlap with, their offline selves. So if somebody was presenting as female, they didn’t necessarily want to do VoIP, because people didn’t realize that, in fact, offline, they were male; their voice would give them away.

Noshir Contractor: And so what your research points out is that sometimes there is value and merit in not having all dimensions of our appearance and of ourselves be presented in a web environment, and that sometimes there is freedom in being able to conceal certain facets of your character in the online space.

Aleks Krotoski: 100%. That research came out, that was 2009, and then subsequently, in the decade since, I wrote the book that was sort of in part based upon that research, and then also on other research that I’ve done journalistically since. I think that is the thing that has surprised me the most about where the web is now. The requirement, the necessity for people to present as their offline selves, whether that’s for commercial purposes, or for social and psychological purposes. The great playground that we had of identity, the idea of being shielded from full identity revelation that we experienced, even as late as 2009. We don’t have that as much anymore, we aren’t able to play with our identities as much as we were anymore. And I think that that has very interesting consequences for not just how we study web science, but also for the actual experience of the people who are living in this digital world.

Noshir Contractor: I think you’re right, there has been so much more emphasis on making sure that we authenticate people in various contexts on the web. That predilection for authenticating people has come at the price of not enabling and empowering people to have alternate presences on the web, as they did back in 2009.

Aleks Krotoski: And indeed, it’s not just the presences on the web. This is something I find very important, that I do feel that we’ve lost, because we do live so much of our lives online. The internet, particularly over the last year, has finally in many ways become mundane for many people.

Our existences are fixed, right, it’s as if we are living our entire lives right in the past to the present in the now. But the context of that information feels like it has been lost. 

I remember Eric Schmidt many years ago, when he was the CEO of Google, said that he wanted to create a search where stuff from the past would just disappear. And the reason for that is because, you know, I am very different from who I was when I was a teenager. Right? And I’m even different from who I was 10 years ago when I was writing about this. But you pull me up online, right, and probably one of the first things that still comes up is a TV show that I presented, right, the thing that I was talking about earlier, the TV show that I presented between 1999 and 2002. Sort of having to explain to random people that I am not that person that I was 20-plus years ago is exhausting. And it also means that people coming up in this space are not able to naturally reinvent themselves or have spaces in which they can discard that old self and move into another space, with everybody mutually accepting that. This idea of being unable to psychosocially develop, and to discard the self, and sort of to always have to have the consequence: it’s ironic. In web science, we used to talk all the time about how the online space had no consequence. And now, the consequence is forever there.

Noshir Contractor: One of the areas where Europe has perhaps been a little more advanced than the US is in efforts to establish the right to be forgotten, which has been brought up in the EU. And I think it speaks to some of the concerns that you’re just expressing there.

Aleks Krotoski: Often the right to be forgotten is, you know, granted not on the basis of some kind of embarrassment that happened a few years previously, but usually to do with some kind of, you know, a bankruptcy or some kind of thing that you have served your time for.

I mean, I’m talking about like, you know, that embarrassing thing that you did in front of your Aunt Martha back when you were four, and every time you see Aunt Martha she reminds you of it, you know? Like, well, now I’m 44, can you please stop talking about this? Like, the internet is your Aunt Martha. And you’re not sort of allowed to move forward. I’m curious to what degree this has an impact on people’s development of self and their feeling of freedom to reinvent. I don’t know if anybody’s been doing research on this. It has been some time since I’ve thought about that.

Noshir Contractor: Yeah, well, speaking of research, you are, Aleks, one of the best examples of somebody who has straddled the academic and the sort of public space within web science. And you’ve had fellowships at the University of Oxford and at the London School of Economics. And at the same time, you made reference to those columns that you wrote in The Guardian, which were aptly titled Untangling the Web, the very namesake of this podcast series. And then the book that came out in 2013, with the same title.

And in that book, you unpack a few dimensions of untangling: you talk about untangling me, untangling us, untangling society, and then finally untangling the future. Tell us a little bit about why you called the book and column Untangling the Web. And which of these untangles has surprised you the most now, almost 10 years since the book first came out?

Aleks Krotoski: Well, the reason I wanted to call it Untangling the Web was somebody suggested it, and I was like, that’s really clever. It’s a great description, because we are, of course, all wrapped up in this space, again, even more so now, over the last year. As those of us who have been studying web science for a long time and have been living it, it’s sort of like, oh, welcome, everybody, we’ve been waiting for you to come to the party. That in and of itself has been so interesting, to witness the degree to which all of our research, all of our findings, were actually relevant and valid in a space in which the entire world, if they have been lucky enough to have digital technology, you know, has sort of graduated to the space. But I have always been of the opinion that we are always entangled in whatever technologies are within our lives, whether it’s television, whether it’s the pen, even electronic light. And it’s exciting. All of these innovations and inventions have had an enormous impact on our lives. And we have developed alongside them. So there was that kind of seeking to untangle ourselves from these spaces.

But another thing that I really wanted to get across, my sort of main aim with this book, was that I wanted to untangle what people’s expectations were of this technology as something that was other. This is the thing that I think drives me more crazy than anything else, the magical thinking around technology, because it devolves our responsibility as human beings for the decisions that we make, for the outcomes that happen, away from us to the technology. This book sought to look at the psychological research in each of these categories and the subcategories within them. Look at the psychological research, the findings, before the web, right? Things that we expected about how we thought about celebrity, how we thought about love, how we thought about death, how we performed these things, right, looked at that, and then looked at the meaning of those things after the web, to the degree that we had done any research in that space, as sort of before and after. And to this day, I’m pretty adamant about this, what I found is that almost nothing is different. Right? The idea, the meaning of privacy is exactly the same before and after. We still want privacy. It’s just we’re performing it in a different way.

But going back to your question about what is the thing that I think, you know, in some ways has changed, or has surprised me, in that time, I think part of it is the identity piece, is the fact that we are not as free to reinvent ourselves, or to develop our identities in this space.

Noshir Contractor: One of the things that you have lamented in your writings is that a lot of the research that you just described, and that you’ve been translating in this book, a lot of it is hidden behind the walls of the ivory tower. Why do you think that is, Aleks? And do you think it has gotten better or worse in the past decade?

Aleks Krotoski: It’s a wonderful question. I just simply think it’s just the nature of the ivory tower. I think in the last decade, it has become profoundly better. And that is because of initiatives like web science. That’s because of initiatives like open data. That’s because of initiatives by people like Sci-Hub, getting that information out to the public so that the public can read it and can be informed, right? I remember there were sort of movements of people to release their content online ahead of time, as it was being developed, but I don’t think we really started to see that truly, like a sort of show-your-work kind of thing, until the mid-2010s. I’m grateful for that. Why should the academy have the stranglehold on this information? Because these are things that absolutely profoundly affect people’s lives, everyday lives. It just requires critical thinking skills, and an ability to be a critical consumer of content.

Noshir Contractor: One of the things that you wrote about in the columns and in the book was the evolution of the web itself, the growing pains that it went through, the life stages of the web that you describe. Could you summarize some of those ideas, but also then project out: what do you see as sort of the next steps in the evolution of the web?

Aleks Krotoski: Great question. And I’m going to answer it not with reference to the book, but more with reference to some of the things that I have seen over the last year, now that we are kind of feeling the weight of the world in the web. One of the greatest and most profound moments of the web’s history, social history, was the Eternal September in September 1993, when AOL opened. And it was the first time that people who were new, or newbies, arrived and outnumbered the people who had already populated the web, thus irrevocably shifting the culture, because suddenly, the majority was not interested in what the old guard had to say. The majority was simply forging ahead and doing its own thing and creating its own norms.

Well, I think we are about to witness a really interesting moment where people like us who’ve been studying and diligent and living it, and all that kind of thing, the web science community is either going to be embraced by or embrace. And I think that’s going to be kind of an interesting tactical way to do it. I don’t know how that’s going to happen. You know, the hundreds of millions of people who thought that the web was an interesting place to visit, but didn’t really ever imagine living there. Now that this enormous population has opinions, because they’ve lived it for a year, I think that’s when we’re going to start to see some really interesting innovations that are not going to come out of the small pockets around the world that have historically been the places where innovations and technology come from, but more people are going to become empowered, because they see the ways that technology does not fit them. And they are able to define how it can fit them. So I think that’s what we’re going to see in terms of the future.

Noshir Contractor: Listening to you, I’m reminded of the notion that there were some people who were there initially, who might have been the so-called digital natives, and that the majority of people were tourists who would visit from time to time and be charmed. But many of those tourists are now digital immigrants that have come to set foot for a long time here. And that’s the big change that you see, and your sense is that they are going to change the web as much as the web might change what they’re doing, or perhaps change the web even more.

Aleks Krotoski: I think they’re gonna change the web even more. Because when you have a mob who comes on and has opinions, right, I watched it, it was so interesting, people kind of walked blindly into this space where they were like, I know the room, but I’m not really sure where to sit.

We have evolved norms that are perhaps different from the norms that existed before, perhaps completely uninformed by the norms that were before because that information wasn’t widely available, or even of interest to the masses, who suddenly had to go online and suddenly had to perform and be and do what it is that we have been doing as web scientists for decades and decades. 

So I think that there is going to be an enormous reinvention. I profoundly hope that one of the outcomes is that people will stop seeing the web as something that is magical, and something that is other, and something that does stuff to me or you or your dad or your mom or your kid or your dog, whomever, and actually see it as a tool and a technology that we operate in as much as the electrical system, you know, the water system and all of those other things that we operate within a society and that we use for our own purposes.

Noshir Contractor: That’s a fascinating vision. Aleks, thank you, again, so much for joining us today on this podcast. As I’ve said, you are one of the best exponents and champions of web science, both as a research scholar and as a public intellectual in the space. And we thank you for all your contributions. And we wish you the best and look forward to seeing your continuing insights evolve in the area of untangling the web. Thank you.

Aleks Krotoski: Thank you Nosh, I feel that your listeners are now witnessing my blush. They’re feeling it through their ears. Thank you so much. What a treat.

Episode 15 Transcript

Munmun De Choudhury: Mental health is one part of medical sciences where we have not seen as much progress.…And that’s where the research that I do finds its motivation. Can we find other ways of assessment that can improve the status quo in the way we both diagnose people with mental health risks, and also the way we treat them?

Noshir Contractor: Today, our guest is Munmun De Choudhury, who is a professor of interactive computing at the Georgia Institute of Technology, where she leads the Social Dynamics and Wellbeing Lab. You just heard her talking about what motivates her innovative research in Web Science, which uses social media in order to understand and improve mental health. She adopts a highly interdisciplinary approach, combining social computing, machine learning and natural language analysis with insights and theories from the social, behavioral and health sciences. She has been recognized with the 2021 ACM-W, or the Association for Computing Machinery’s Women, Rising Star Award, the 2019 Complex Systems Society Junior Scientific Award, and over a dozen best paper and honorable mention awards from the ACM and the Association for the Advancement of Artificial Intelligence. Her work has also received extensive coverage in popular press including the New York Times, NPR and the BBC. 

Welcome, Munmun. 

Munmun De Choudhury: Thank you very much for having me here.

Noshir Contractor: This is a pleasure to get a chance to talk with you. Your work focuses on how web science and the web more generally can help us to detect mental well-being issues, to mitigate those mental well-being issues, as well as to facilitate the treatment of these issues. I’m curious what got you interested in looking at and applying these computational approaches to study wellbeing?

Munmun De Choudhury: I loved math and science. But I also loved all of the other coursework that I did as a kid in social science and social studies and humanities. Until I was late in my college years, I didn’t know if there could be a possible way to connect and bridge the two, like, how do you do STEM work that is connected to people in some way or the other. But thankfully, you know, I found myself around people who have been thinking about intersections of different disciplines for many years. And I think that reignited my passion to connect what I do as a computer scientist with something about people.

I think the work that I do right now, that kind of started about a decade back when I joined Microsoft Research as a postdoc. It was also around the time when I had lost my father to cancer, and that was a moment of introspection for me in my life about what does research mean for me? What is it that I can do? And how can I kind of take these ideas about connecting computer science with social science in a direction that would help me find meaning in that personal loss? That’s how I started to connect my work with the health field, with the wellbeing field. And over the course of time, I found my home in mental health, broadly speaking.

Noshir Contractor: It’s always curious how certain personal events in one’s life can explain where we pivot in terms of our professional goals and aspirations. One of the things obviously that has motivated your work is the very high prevalence of mental health issues. The National Institute of Mental Health has identified that one in four adults, or about 61 million Americans, report experiencing mental health challenges in a given year. What do you think about the ways in which we are currently handling these issues, and how the web and social media approaches that you’re taking might be able to address some of the obstacles we face?

Munmun De Choudhury: So you know, the last 100 years or so have been incredible for medical science. We have made a lot of progress when it comes to illnesses such as infectious diseases. I mean, we are living through a pandemic right now, and if you just look at the pace of progress that we have made around it, it’s incredible. But mental health is one part of medical sciences where we have not seen as much progress. The methods that we currently use are pretty much still the methods that were prominent about the time of the First World War, which is when some of the earliest recognition was given to mental illness as illness.

We saw some developments in the 60s and 70s, with the invention of antidepressants and other drugs. But in terms of assessment and diagnosis, we are kind of still about 100 years old. A primary paradigm is self-reports from individuals. And unlike other illnesses, where we have objective tests for diagnosis, or to treat people across the course of their journeys, we don’t have it for mental health. And that’s where the research that I do finds its motivation. Can we find other ways of assessment that can improve the status quo in the way we both diagnose people with mental health risks, and also the way we treat them?

Noshir Contractor: And then one of the things that has also I think, contributed to some of the challenges is a general stigmatization of mental health issues, at least until relatively recently in society. And that brings me to a bit of a conundrum. If people are not in general willing to talk about these issues in public because of fear of being stigmatized, how does looking at social media help?

Munmun De Choudhury: So the beauty of our social media is that we can use it the way we want to. One of the interesting developments that we have seen as social media platforms have become a part of our lives more and more is that people are finding people with the same lived experience, who are probably going to understand the struggles that they’re going through, who are probably not going to be judgmental of the experiences that they have faced in life. And hopefully, it would be less stigmatizing.

So as much as there is the concern that social media platforms are performative, right, at the same time, we do see other uses where people are being candid. And this provides a window of opportunity to look at what struggles they’re going through in terms of their mental health.

Noshir Contractor: And I imagine that, with some irony, while you might be less inclined to talk about some of these issues with your close friends, you might actually be more comfortable talking to strangers.

Munmun De Choudhury: Sometimes we don’t feel comfortable talking about our deepest struggles with people we know in the offline world, because they might be our coworkers, they might be our bosses. And we don’t want to disclose something that we feel we could be judged on. 

Noshir Contractor: A lot of your work has also looked at how you could look at the passively shared data on social media to proactively detect one’s risk of mental wellbeing challenges. What you are using as a detection strategy has to be somewhat more nuanced than just literally filtering social media postings for those who say they are depressed. Tell us a little bit more about how you go about getting that kind of information about individuals. 

Munmun De Choudhury: You’re absolutely right, that we are looking at more subtle signals, nuances in the writings of people: what type of words they’re using. So I’ll give you an example. When we use a lot of first person pronouns, such as I, and me and myself, literature in psychology says that it shows an inward focus in terms of our attention, I’m talking mostly about myself. Sometimes, experiences of mental health can be detached from the external surroundings, from the social contexts that people live in. And that can manifest in this inward focused attention.

But on the other hand, when we use words, such as we and us, it shows that we find ourselves as part of a larger collective, or when we use second person pronouns, it shows that we are interacting with another person. And these are really valuable cues when it comes to somebody’s mental health. 

We also find that social interactions are very, very valuable signals. Am I having a lot of interactions with other people? Am I getting the support that I think I should be getting? So these kinds of signals that are less consciously regulated by people, those are the types of signals that we look for in our work. 
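The pronoun cues described above can be illustrated with a minimal Python sketch. It is not the method used in the research discussed here: the word lists, the tokenizer, and the function name `pronoun_rates` are assumptions made for illustration, and a real study would rely on a validated lexicon and many more signals.

```python
import re
from collections import Counter

# Illustrative word lists only (an assumption); real studies use validated lexicons.
FIRST_SINGULAR = {"i", "me", "my", "mine", "myself"}
FIRST_PLURAL = {"we", "us", "our", "ours", "ourselves"}
SECOND_PERSON = {"you", "your", "yours", "yourself", "yourselves"}


def pronoun_rates(text):
    """Return the share of tokens that fall in each pronoun category."""
    # Naive tokenizer: contractions such as "i'm" stay whole and are not counted.
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    counts = Counter()
    for tok in tokens:
        if tok in FIRST_SINGULAR:
            counts["first_singular"] += 1
        elif tok in FIRST_PLURAL:
            counts["first_plural"] += 1
        elif tok in SECOND_PERSON:
            counts["second_person"] += 1
    return {
        name: (counts[name] / total if total else 0.0)
        for name in ("first_singular", "first_plural", "second_person")
    }


# Made-up example sentence, not data from any study.
rates = pronoun_rates("I keep telling myself that we will get through this")
```

Here `rates` holds one proportion per category. A higher first-person singular share would be read, under the interpretation given in the interview, as a sign of inward-focused attention, and only as one weak cue among many.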

Noshir Contractor: Now, the signals that you’ve talked about in my mind fall into two categories. One is looking at the content and doing a sentiment analysis if you may, or parsing the words to decide what kinds of pronouns people are using. But the other sort of signals that you refer to, are things like just the amount of activity you have, the amount of friends or the amount of responses that either you give to others or others give to what you’re doing.

The latter gets sometimes referred to as metadata, which is not looking at the content, but data about the interactions. Do you see any differences in the utility of and the efficacy of looking at the words versus the data about the data? 

Munmun De Choudhury: Yeah, so that varies by the platform. So for instance, if you’re looking at Twitter, normally, the content words carry a lot more signal. And that might be because people are relatively more candid on Twitter, compared to, let’s say, a platform like Facebook. But on Facebook, we found that some of these metadata or some of these social interaction attributes or these network aspects are more valuable, because for a lot of people, their presence on Facebook is also closely tied with their presence in the offline world.

Noshir Contractor: How do you reconcile that the signal that you get about a person from any one platform may be incomplete, inadequate? And are there ways of being able to cut across different platforms to be able to get a richer picture of someone’s well being? 

Munmun De Choudhury: So I would answer that in two ways. The first part is whether any one platform or a couple of platforms, is that sufficient for us to get a more comprehensive understanding of their mental health?

The reality is that right now, if you look at the state of the art in mental health diagnosis, or treatment, none of those signals are being factored in. So now, even signal from one or two platforms can be additional knowledge to the person themselves, to their caregivers, to their family, or clinicians, whoever might be able to take actionable steps and use that information in helping the person. It’s some data, it’s not all data, but I think it’s still valuable data.

Still, there is the question of, we have our identities that are fragmented across different platforms. And that is more and more the case. So a lot of the work that I have been doing has been to go across platforms and think of these data in terms of their multimodal natures. So I absolutely agree with you that as much as information from a couple of platforms are valuable, nevertheless, there is still value to considering the fragmented nature of our identities.

Noshir Contractor: I know that this is initial work that you’re doing, but are there any examples of insights that were different or modified because you were looking at multiple digital service providers?

Munmun De Choudhury: What we have definitely seen is not maybe as much of a contrast, but having data from one platform giving us context about data that we see on another platform. Some of our work recently was looking at physical activity data that is collected through smartphone use. And then we also had, for these individuals, their Facebook data. So lining those two up was really insightful for us. So when we saw that the person’s, let’s say, heart rate increased at a certain point in time, we can go back and see what might have been going on on their Facebook, maybe they reported a major life event, they reported something difficult that they were going through. So I think those are definitely some of the strengths of an approach that cuts across different sources.

Noshir Contractor: Technically, how difficult is it today to be able to connect what someone said on Facebook with some of what they report, say on a fitness device?

Munmun De Choudhury: From a technical perspective, it’s quite difficult, because you need appropriate infrastructures that can collect that data. Social media data is longitudinal, it is fairly sparse, it is largely text. Fitness data, we are talking about, you know, a very high sampling rate, dense data. And again, I mean, these are largely time series data, for instance, they’re not text. 

There is the question of feasibility, right, like finding enough people who are willing to share not just their data stream on one platform, but across multiple platforms, there is a question there as well.
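The mismatch described above, dense sensor readings on one side and sparse, timestamped posts on the other, can be sketched in Python as a windowed lookup. This is only an illustration under stated assumptions: the data are made up, the function name `mean_around` is hypothetical, and the sketch does not describe the infrastructure used in the research.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta
from statistics import mean


def mean_around(sample_times, sample_values, event_time, window):
    """Mean of the samples within +/- window of event_time, or None if empty.

    sample_times must be sorted in ascending order.
    """
    lo = bisect_left(sample_times, event_time - window)
    hi = bisect_right(sample_times, event_time + window)
    values = sample_values[lo:hi]
    return mean(values) if values else None


# Made-up dense stream: one heart-rate reading per minute for an hour.
start = datetime(2021, 1, 1, 12, 0)
times = [start + timedelta(minutes=i) for i in range(60)]
readings = [70] * 30 + [95] * 30

# Made-up sparse event: the timestamp of a single social media post.
post_time = start + timedelta(minutes=45)
around_post = mean_around(times, readings, post_time, timedelta(minutes=5))
```

Summarizing the dense stream in a window around each sparse event is one simple way to line the two sources up; the window width is a free choice that would need justification in practice.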

Noshir Contractor: In terms of ethical issues, are you able to get information about individuals if they consent to sharing that information across platforms? Or do you have to deal with some of the providers themselves?

Munmun De Choudhury: It is a question that is getting harder by the day. There is some good reason why it is getting harder, because I think discussions of privacy and ethics are finally getting the momentum that they deserve, in the field of web science, but at the same time, there are these questions of multiple stakeholders, who have an interest in a data stream and have different policies, have different value systems around protecting or sharing the data.

I think at the center of it are the creators of the data, they’re the people who believe they would benefit from, you know, this research. And so we have been doing a lot of work with mental health patients, where they have been voluntarily sharing that data with us. And I feel that is probably the path forward for this kind of research. 

Noshir Contractor: You mentioned that a lot of the data on the social media platforms is largely text, for example. But I’m also thinking of some of the more recent platforms that have got a lot of activity like TikTok, like Clubhouse. TikTok is, you know, video based for the most part and then Clubhouse is audio based for the most part. Have you had much success in being able to parse through video and audio as a way of detecting wellbeing?

Munmun De Choudhury: That’s I think a direction where there could be a lot of research that happens going forward. We have done work in the image space, particularly on Instagram, which is an image heavy platform, we have done some work on Tumblr, which is also kind of multimodal with images and videos and text. The next frontier is platforms like TikTok, like Snapchat, where a lot of the young people are going and hanging out.

Noshir Contractor: A lot of young people now today rely on their social media presence for their own self esteem and for their mental wellbeing, and there have been some efforts recently by some of the platforms to actually not publicize the number of likes a particular post gets or the number of shares a particular post gets, so that people are not overly focused on building their self esteem on the basis of that, do you think that those approaches will work?

Munmun De Choudhury: I think there is definitely something that needs to happen there. Social comparison theory has often been put as sort of the causes of the negative impacts of these platforms on people’s mental health. I think the jury is still out whether these platforms are good or bad for people.

At least the current understanding in the scholarly community is that it just depends upon how somebody uses the platform. Whether it’s for good or for bad, I think the platforms do have a responsibility to consider that their platforms are being used by people, sometimes for improving mental health, sometimes not for improving mental health, and how can we change the design of the platforms or the features.

Noshir Contractor: Have you looked at ways in which the research that you’re doing and the tools that you’re developing might be used not just by the person who is in need of help with mental wellbeing issues, but also tools that might be used by family members, or a clinician. Can you talk a little about how the work you’re doing, might have audiences beyond the person, him or herself? 

Munmun De Choudhury: My view of mental health is that it is not a solitary experience. It is an experience that is shaped and it impacts people around you. And therefore, if you’re thinking about solutions that are grounded in these approaches in web science, it’s important to also think about what would that technology look like for these other people in somebody’s social ecology. 

So there are two stakeholders that we have engaged with quite extensively in the past few years. The first is, like you mentioned, the mental health practitioners. We have been working with Northwell Health, they’re a big health system based out of New York State. And we have been, you know, recruiting and working with mental health patients, but also their clinicians, they are part of our research teams. We are kind of adopting a participatory approach there, in building both the algorithms that use patients’ data, but also what kind of technologies could be built on top of those algorithms that could help these clinicians in the treatment that they provide. And so the clinicians form a very important stakeholder in there who can benefit from these algorithms, because they can get a fuller understanding of what might be happening to the individual.

The other stakeholder, kind of at a very different scale, is public health stakeholders. And for the last two years, we have been working pretty closely with the Centers for Disease Control and Prevention, or CDC, in helping with their public health efforts on suicide prevention. Organizations like CDC are realizing that, you know, a lot of these conversations on mental health, on suicide are happening online. That is an entire piece that is missing from their public health work.

So the work that we have been doing together is to extract meaningful information about how people are talking about suicide, what kind of stressors are being expressed by people. And that knowledge could provide evidence to organizations such as the CDC, to figure out which communities might be in need of greater help than others, how do we allocate budget, to assign mental health resources? And how can we do that in real time fashion.

Noshir Contractor: And based on your encounters working with these different stakeholders, could you comment on what you see as their receptivity? First of all, is a patient typically enthusiastic or concerned about sharing this kind of information, giving consent to share this information with physicians? Are physicians excited about using these kinds of tools? And are the CDC policy makers enthusiastic about it? What’s been your experience?

Munmun De Choudhury: In our interactions, what we have seen is clinicians are curious about how this will impact their decision making, questions of these power imbalances in therapeutic relationships. How will that impact their own relationship with the patient? Right, that’s an important one. And also, there are questions of liability. I mean, when you have an algorithm that looks at social media data and makes an assessment of risk, what happens if it’s correct, what happens if it’s wrong? I mean, who takes the responsibility for that? So there are definitely the legal questions, the questions of infrastructural, institutional support. So, as a clinician, they might not feel comfortable using such technology if there is no support from their whole institution in allowing them to do that.

Actually, from the patients, the people with the lived experience, we have seen the least skepticism among all stakeholders. And I think the reason is, they see the value that this can bring maybe directly to their own lives, or maybe the lives of other people like them.

The question of stakeholders like CDC is a very interesting one. I have been pleasantly surprised how open-minded they have been to technology. In my interactions with the researchers at CDC, I think there’s a great deal of interest in taking some of these algorithms that glean insights from the web, and somehow making them a part of their public health efforts.

Noshir Contractor: Well, we spend a lot of time talking about social media and websites that are there to be able to help individuals who are having these challenges. But there is also an undercurrent, a set of websites, that are actually there to facilitate people engaging in behavior, say eating disorders like anorexia and bulimia. What are your thoughts about those sites?

Munmun De Choudhury: We had so many aspirations from the web, in the 90s, about how it’s going to be liberating, and it could democratize our freedoms, in many ways that have been lacking until that point. And a lot of that has been true. But at the same time, it will be foolish for us to not recognize all of the things that are terrible on these platforms and about these technologies. And the example that you cited is a great one in how these platforms, while they can be used for good purposes, they can also be used for harmful reasons.

And we see health misinformation is a huge problem. And that we see in the context of mental health, we see in the context of substance misuse, there is a lot of misinformation that goes on around. That is an aspect we desperately need to attend to, when it comes to health more broadly, and also more particularly mental and psychological wellbeing. 

Noshir Contractor: As we begin to wrap up here, Munmun, can you talk a little bit about how the general work that you’re doing on wellbeing applies also in the context of workplace experiences, and of course, including now remote work, as well as perhaps hybrid work moving into the future?

Munmun De Choudhury: I mean, this was true even before the pandemic, that personal wellbeing and workplace wellbeing are deeply intertwined. But I think this blurring has only been intensified. What constitutes work, what constitutes not-work, those lines, we are not able to manage them very well. If we think about the future of work, there are also a lot of questions in that space. What does it mean to be able to understand workplace wellbeing now, and what is the role of technology?

Noshir Contractor: The way we’ve been working in the pandemic, a lot of our work even within the organization is using what is called enterprise social media, things like Microsoft Teams, Slack, Zoom and so on. And that of course means that the same kind of information that you’ve been studying in general social media platforms is now amenable as data to help detect issues within the workplace itself. Now, you have the tools and the data potentially to be able to look at interactions that are happening within the workplace.

Munmun De Choudhury: Absolutely, and workplace harassment needs more attention. I think there is tremendous opportunity to look at some of these workplace behaviors, but at the same time, how far should we go, so that it’s still justified, and at what point does it become like “Big Brother,” right? Remote work opens up these possibilities of using these technologies to both get an understanding of our struggles and difficulties, and at the same time, it can be deeply compromising to one’s privacy.

Noshir Contractor: Absolutely. Again, I want to thank you, Munmun, for taking time to talk about this really exciting area of research that you’ve been at the frontier of pushing, in terms of seeing how web science can help us with the general area of wellbeing, and mental wellbeing in particular. Your approaches and techniques are truly interdisciplinary, and the results and insights you’ve shared with us today, and the concerns that you’ve shared with us, are very reflective of the eclectic approaches you use, in terms of theories and models from a variety of social science and computer science disciplines. So thank you again for taking time to talk with us, Munmun. 

Munmun De Choudhury: Thank you so much for having me, and I enjoyed all the questions and conversations.

Episode 14 Transcript

Robert Ackland: Hyperlinks are connecting pages together, and allowing people to, as they surf the web, find new information. This, for me, has always been the thing that I’ve been most interested in, because there is a social science of why hyperlinks are created. And what does it mean for a website to create a hyperlink to another website? It’s used in order to guide people’s attention, shape people’s attention. And so the types of actors that I studied have been political parties, social movements, organizations, activists, and they are all making choices about who to hyperlink to, and why. And these choices have measurable impacts on shaping the attention of other people.

Noshir Contractor: My guest today is Professor Robert Ackland from the School of Sociology at the Australian National University in Canberra. You just heard him talk about his work with hyperlinks. Rob works at the intersection of network science and web science, to study networks on the Web. Under a 2005 special initiative of the Australian Research Council, he established the Virtual Observatory for the Study of Online Networks, VOSON for short. His research has been published in journals such as Social Networks, Journal of Social Structure, Computational Economics and Social Science Computer Review. And his book, Web Social Science: Concepts, Data and Tools for Social Scientists in the Digital Age, was published in 2013.

Robert Ackland: Great to be here, Nosh.

Noshir Contractor: Welcome, Rob. Thanks again for joining us from Down Under. I want to start by asking you what got you interested in web science, coming as you did from an economics background, initially interested in issues that were related to economic development?

Robert Ackland: As you mentioned, Nosh, all my training is actually in economics. In my first academic job after my PhD, in 2001, I was working in an interdisciplinary research center, and I started working with a political scientist, Rachel Gibson, who’s now at the University of Manchester. And Rachel and colleagues were working on studying how political parties were using the web in order to undertake various political functions, such as raising awareness about issues, engaging with potential voters, raising revenue. A big aspect of that work revolved around the hyperlink: the idea that if you have more hyperlinks pointing to your website, that can bring more eyeballs to your content, and therefore allow you to raise more awareness about issues that could concern you.

I saw an opportunity there to use web crawlers to collect large scale hyperlink network data sets, and then to start studying these networks. I produced, for example, networks of political parties, looking at mainstream versus major parties, and conservative versus liberal parties. So I started to look at the structure of hyperlinks of political parties.

I realized that what I was doing was, in fact, social network analysis applied to the web. And a big part of my career has been looking at how can methods and approaches from social network analysis be adapted to study online networks?

Noshir Contractor: You use the word hyperlink a few times. And I know that that word just rolls off your tongue with a lot of ease. But for most people, when we think of social networks, we think of potentially, links between people. So you have a friend on a social media platform or a follower on a social media platform. But when you’re talking about hyperlinks, these are links not between people, but between websites. And then you use these to crawl the web. Can you unpack that a little bit more?

Robert Ackland: My interest in the web has always been the fact that it’s a socially generated network of resources, the resources are web pages, and also, the other media files. The piece of engineering that connects these resources together is the hyperlink. Hyperlinks do not get formed randomly. 

Hyperlinks are connecting pages together, and allowing people to, as they surf the web, find new information. This, for me, has always been the thing that I’ve been most interested in, because there is a social science of why hyperlinks are created. And what does it mean for a website to create a hyperlink to another website? It’s used in order to guide people’s attention, shape people’s attention. And so the types of actors that I studied have been political parties, social movements, organizations, activists, and they are all making choices about who to hyperlink to, and why. And these choices have measurable impacts on shaping the attention of other people.

When I started studying the web, there was not the availability of tools and techniques to allow a broad range of social scientists, particularly those with an interest in social network analysis, to easily access and collect hyperlink data, and turn these data into what I call research-ready data sets, data sets that are amenable to social network analysis. And so I really designed the VOSON software to be a tool for social network analysis using hyperlink data.

The VOSON software was effectively a web crawler that allowed researchers to easily select a set of websites, and then find how those websites connected to one another through hyperlinks. 

Nosh, you made the point that today, it’s very common to think of people networking on the web, or via social media. But in the early days, before social media, web 1.0 was an era where you had to have quite a lot of resources in order to be able to put material on the web, for example, newspapers or academic institutions. The typical user of the web was a consumer of information. Web 2.0, which started with blogs, but then moved on to the social media era, became an era where it was possible to not only consume information but produce information.

And so today, it’s very easy to conceptualize this idea that people go onto social media and connect with one another. In the web 1.0 era, it was less easy to conceptualize this. But I really saw the hyperlink as the tool that allows organizations and groups to connect to one another. And I was interested in using social network analysis to study that phenomenon.
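The crawling step described above, starting from a chosen set of websites and finding how they link to one another, can be sketched in Python with only the standard library. This is not the VOSON implementation: the class `LinkCollector`, the function `site_edges`, and the example pages are hypothetical, and the pages are held in memory so the sketch needs no network access.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def site_edges(pages, seeds):
    """Return the set of (source_site, target_site) links among seed sites.

    pages maps a page URL to its HTML; seeds is a set of hostnames.
    """
    edges = set()
    for url, html in pages.items():
        source = urlparse(url).hostname
        if source not in seeds:
            continue
        parser = LinkCollector()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links against the page URL before comparing hosts.
            target = urlparse(urljoin(url, href)).hostname
            if target in seeds and target != source:
                edges.add((source, target))
    return edges


# Hypothetical pages for two made-up party websites.
pages = {
    "http://party-a.example/": '<a href="http://party-b.example/news">B</a> <a href="/about">About</a>',
    "http://party-b.example/": '<a href="http://other.example/">Other</a>',
}
seeds = {"party-a.example", "party-b.example"}
edges = site_edges(pages, seeds)
```

The result is a directed edge list at the level of whole sites, which is the kind of input a social network analysis package expects. Links within a site and links to sites outside the seed set are dropped, which is a design choice for this sketch rather than a rule.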

Noshir Contractor: So one of the things that you were pointing out is that websites are very strategic about which other websites they point to, because that’s how they represent themselves to the public, and are also very interested in which websites are pointing to them. And to the extent that we know in society that you’re judged by the friends you keep, what you’re saying, Rob, is that a website is judged by the hyperlinks it keeps.

Robert Ackland: It’s always of great interest to know, well, who is hyperlinking, to whom? It’s a measure of popularity. In an information context, it’s a measure of authority, is your website an authoritative source on a particular topic. 

It’s very important to know who is linking to you, and also, the perception of your organization is very much influenced by who you direct your hyperlinks to, and so on. It’s one of the aspects of web science that I find very interesting and compelling.
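The idea that inbound hyperlinks indicate popularity or authority corresponds to what network analysts call in-degree. A minimal Python sketch, using a made-up edge list and a hypothetical helper named `in_degree`, counts inbound links per site.

```python
from collections import Counter


def in_degree(edges):
    """Count inbound hyperlinks per site from (source, target) pairs."""
    return Counter(target for _source, target in edges)


# Hypothetical site-level hyperlink edges.
edges = [
    ("a.example", "c.example"),
    ("b.example", "c.example"),
    ("c.example", "a.example"),
]
ranking = in_degree(edges).most_common()
```

Raw in-degree is a crude measure, since it treats every inbound link as equal regardless of who created it; it is offered here only as the simplest version of the idea.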

Noshir Contractor: One of the things of course, that can happen with hyperlinks, as it does today with friend links, or follow links, is that you can create them and at some point, you could dissolve them or you can unfollow someone or unfriend someone or remove a link that you have with someone. So as you look at it from a historic point of view, is it possible to be able to go back in time and look at the archive to see when someone might have created a link from one website to another and when it might have dissolved. And what that might tell us about society?

Robert Ackland: It’s a really important aspect of research in the sense that the web is constantly changing. This is one of the reasons why governments are very concerned about preserving the web, because it’s a digital record of a country or of a society. 

From the perspective of a web scientist, I think there are really two aspects of hyperlinks that are, in some ways, the holy grail for research. Number one: I find that when I present my hyperlink research to people, one of the first things they say is, you know, how has it changed over time? Another aspect that is very important is knowing what amount of attention is traveling through a hyperlink; it’s difficult to know exactly how many people were following that hyperlink.

Noshir Contractor: So one of the interesting and important contributions, Rob, that you have made to the study of web science, is the development of this virtual observatory for the study of online networks. When you began that effort, it was focused largely on mapping hyperlinks between websites, and since then you have evolved the entire project to also look at mapping links that happen between people or organizations that have Twitter accounts. Tell us a little bit about why you got interested in creating what I think has become a remarkable public good for anyone interested in studying web science.

Robert Ackland: I was always interested in developing tools that could be used by non-programmers. Web science brings people from a whole lot of disciplines. And the whole point of web science in my mind is studying how the web is contributing to society from a lot of different dimensions. It’s not just about the engineering, but it’s about the social, political and economic impacts of the web. As the web evolved to the social media era, I wanted to make sure the VOSON software evolved.

We started then collecting data relating to Facebook. So an early version of the VOSON software enabled research of Facebook, and of course, API and privacy changes on the part of the social media companies in terms of access to data mean that a tool like VOSON has to constantly be evolving as well. So VOSON’s designed to allow researchers to collect data from major social media platforms using application programming interfaces. So the current tool does enable collection of Twitter network data. This is, I believe, really important for the study of political deliberation, how that is occurring on social media. And so to the extent that the social media companies continue to provide open access to their data through APIs, then I’m very keen for the VOSON software to be a part of the web science toolkit.

Noshir Contractor: Along with your evolution of work, from hyperlinks to looking at other social media platforms, you’ve also evolved in your conceptualization of bots, where initially you could think of a bot as being a web crawler. We now know a lot about spam bots and chat bots and bots that can conduct automated high frequency trading in global financial markets. You also talk a lot about what you refer to as social bots. First of all, how do you define a social bot? And what differentiates a social bot from some of these other bots that we’ve just talked about?

Robert Ackland: So, my interest in social bots came about around the 2016 US presidential election. I think the 2016 US presidential election and also the Brexit referendum in that year really raised awareness about the potential for social media to be a vector for influence. And the influence might be coming from foreign influence operations. So, troll accounts, for example, set up by foreign governments in order to try and influence political conversations, but another area of concern related to so-called social bots. The idea of intelligent agents or bots is not new, but around the time of the 2016 US presidential election, there were concerns about how bots were being used in order to shape conversations on social media. I became very interested in how to understand how bots might be having an impact on political conversations on Twitter.

And so this really gets back to a long standing and interesting question in social science research, which is how we measure influence. Is it the case that they are influential because they’re very active and they’re tweeting a lot? Or is it the case that they’re influential because the tweets somehow help to propagate particular information or raise prominence on particular themes or frames?

I think the presence and impact of bots is a core issue and potential concern for web science.

Noshir Contractor: While there is a lot of research that highlights the dangers or the risks associated with social bots, can you talk a little bit about why and how you believe that social bots can actually help promote deliberative democracy in social media?

Robert Ackland: If we think about bots in other areas of society and the economy, they’re generally designed to be useful, in the sense that they provide information that helps people make decisions in a financial market setting, for example.

The first work that I got involved in the area of social bots was with Tim Graham. We were interested in the potential for social bots to play a positive role in political deliberation online. If you think about what political deliberation involves, it’s this idea that people are engaging with one another, often with people who do not share the same views. And they are able to develop a common set of terms and understandings about a potentially divisive social issue, and potentially change minds, or at least come to a common understanding about what the problems are.

We were interested in the idea that it might be possible for social bots to be designed to have a positive impact on political deliberation, for example, by connecting groups of people who otherwise are not connected in online conversations. One of the bots that could be designed in such a setting was what we call the bridger bot, and the idea was that such a bot might have to try to meet communities in social media, who otherwise are not connected to one another, to help promote cross community dialogue. 

Another thought that we had with regard to the potential positive role of social bots was the idea that certain clusters of social media users could potentially benefit from being exposed to ideas that are different to the ones that they currently have. And so, the idea was a bot that could somehow start to operate or be present in their conversation, and participate by raising ideas that were somehow counter to what the current thinking was. However, I would like to say that this is where I think web science is really important, because it’s one thing for social scientists to conceptualize a popper bot or bridger bot, but this is an engineering and design issue. And so this is where web science plays a role in terms of connecting engineers, computer scientists and social scientists in projects that are trying to study, for example, the potential positive role of social bots.

Noshir Contractor: Speaking of positive roles of bots, you and Tim, inspired by Isaac Asimov’s three laws of robotics, postulate three principles of social bots.

Robert Ackland: So the paper on social bots that I co-authored with Tim Graham was partly inspired by our common interest in studying the web from a social network analysis perspective. But there are actually two literary inspirations for this work. The first inspiration is evident in the title of the paper, which is “Do Social Bots Dream of Popping the Filter Bubble?” So this was a reference to Philip K. Dick’s seminal novel, Do Androids Dream of Electric Sheep? And this is the novel that inspired the Blade Runner film. So we were interested in this idea of social bots as autonomous agents, with a purpose. And we were interested in the idea that the purpose of social bots could be a positive one, in the sense of making a positive contribution to deliberative democracy online.

However, this is an engineering problem. We’re social scientists, but we realized that the design of a social bot is an engineering task. And so another literary inspiration for this work was Isaac Asimov, who famously proposed the three laws of robotics. And so we drew on those laws.

And I want to emphasize that this is not an engineering paper that we’re proposing. In some ways it’s a thought piece. But the first principle, in our adaptation of Asimov’s laws, was that a social bot must do no harm to a human being. And so how might we think of a social bot causing harm to a human being? Well, by being annoying, for example, by butting into conversations where it’s not required, by creating noise in a social media conversation.

The second is that social bots must protect their own existence, except where doing so would conflict with the first principle. The idea there is that a social bot has to be designed well, in the sense that it’s not annoying and it doesn’t get outed straight away as being a bot rather than a human. Because then that can lead to people on the social media platform banning it.

And then the third principle, again adapted from Asimov’s three laws of robotics, was that social bots must make a significant improvement to deliberative democracy.

Noshir Contractor: That’s brilliant. I love it. Another major contribution that you’ve made to web science is the book that you published in 2013, titled “Web Social Science: Concepts, Data and Tools for Social Scientists in the Digital Age”. Tell us a little about your thinking when you decided that you would write this book, and tell us what you’re hoping to achieve for the people who read it.

Robert Ackland: I’ve been involved in teaching at the ANU for the last 10 years now. My teaching has been in the area of the social science of the internet and online research methods. Essentially, my goal in my teaching has been to equip social science students with the concepts, and also the tools and the methodological training, to allow them to do web science, in the sense that they can work with data being generated from the web to understand the social, political, and economic impacts of the web.

My book really had two goals. Firstly, it was to introduce students and researchers to the web as a source of new data for studying social, political and economic behavior, with a heavy emphasis on social network analysis, but also other methodological approaches. The second aspect of the book was to provide an understanding of how social scientists can contribute to the future development and pathway of the web, in order to allow the web to reach its full potential, or to continue to have its full potential, in terms of making a positive contribution to society.

Noshir Contractor: I’d highly recommend that book to anyone who’s interested in helping us understand how we live online, and what are the consequences of that. It has been a true delight to get a chance to catch up with you, and to hear all about the ways in which you’ve been thinking about the past of web science, the present of web science, and also the future of web science. And I’m very encouraged and inspired by everything you’ve done to contribute to the web science community in terms of your own research, in terms of the platforms like VOSON, that you have helped develop, and the book that you helped write, to help shape the next generation of students. So thank you again, Rob, for joining us today.

Robert Ackland: Thank you. It’s been my pleasure to participate in Web Science and to participate in this podcast. Thanks, Nosh.