Episode 13 Transcript

Jaime Teevan: When you start thinking about returning to the workplace, you can look at what we lose when we move remote, and what we gain. Let’s do the stuff that’s better remote, remote, and do the stuff that’s worse remote back in person. That suggests large group meetings we can probably keep remote. First, there’s some pretty cool things about being able to share the slides, have in-meeting parallel chat, or see people’s names and know who everybody is, like, those are actually benefits. On the other hand, meeting new people is something that you should really do face to face.

Noshir Contractor: Welcome to this episode of Untangling The Web, a podcast of the Web Science Trust. I am Noshir Contractor and I will be your host today. On this podcast we bring thought leaders to explore how the web is shaping society and how society in turn is shaping the web.

Our guest today is Jaime Teevan — you just listened to her talk about what work is better done in-person versus remotely as we prepare for the Next Normal. Dr. Teevan is chief scientist for Microsoft’s Experiences and Devices, where she’s charged with creating the future of productivity. Previously, she was the technical adviser to Microsoft CEO Satya Nadella and a principal researcher at Microsoft Research, where she led its productivity team. She developed the first personalized search algorithm used by Bing and introduced microproductivity into the office. Jaime was recognized as one of the MIT Technology Review Innovators under 35, and has received many awards, including the Anita Borg Early Career Award, the Karen Spärck Jones award, and the Special Interest Group on Information Retrieval (SIGIR for short) Test of Time award. Welcome, Jaime.

Jaime Teevan: Yeah, it’s a pleasure to be here.

Noshir Contractor: I want to start right now with the new report that you helped put together at Microsoft titled “The New Future of Work,” research from Microsoft into the pandemic’s impact on work practices. This was an excellent compilation of insights that came out of what I understand was an ongoing cross-company initiative to coordinate efforts towards understanding the impact of remote work.

Jaime Teevan: We’re in the middle of a pretty significant transition that, if we don’t come out of it better, then we’re going to come out of it worse. It’s really an opportunity ahead of us to create a new and better future of work. Prior to the pandemic, we were already in the midst of a pretty significant change in how people get things done, with a move to the cloud, and the proliferation of edge devices and real advances in artificial intelligence. But COVID took this primordial goo that was ripe for innovation, and it provided a spark. I mean, most of us moved from working from the office to working from home, literally, in a day. I’m pretty sure I have a plant back at my office that is dead. I haven’t seen it since last March.

And as a productivity company, Microsoft is really interested in understanding work practices and how people get things done. And we have a lot of sensors in place with which we can see work. So we obviously have large-scale telemetry data of how people are using our products. We have really rich customer panels that we set up with all sorts of different customers to get more direct feedback. We use survey instruments. We’re also a large company, just all on our own, you know, hundreds of thousands of employees who are working and had to shift from in-person work to remote work. So last March, when COVID hit, all of these sensors shifted to focus on remote work.

Researchers from across the company came together in what we believe is the largest research effort to happen to understand changing work practices. And the cool thing about that is sort of all of the different non-traditional ways that people are coming together, too. We have all of these converging methods, quantitative and qualitative, all sorts of different approaches. We even have EEG studies of people’s brains, and we look at a number of different populations.

Noshir Contractor: In your report, you talk about a lot of different areas, and I want to just pick on a few of them. The first one was the impact of this sudden switch to collaboration and meetings. Can you talk a little bit about what these findings are in terms of how we change the nature of collaboration? To what extent did it broaden our networks at work or deepen our networks at work?

Jaime Teevan: Now, it’s a great question, because a lot of the insights from web science actually apply here, where we have seen really interesting evolutions of people’s networks as a result of the shift to remote work. When we moved remote, we had a lot of social capital that we had built up from interacting with people face to face, and we’ve spent the past year spending down that social capital. So when you look at the networks, and the way that people work together, what you see is actually our strong ties, or the people that we are close to and work with well, have stayed relatively strong, and we continue to meet one on one with our managers, our close collaborators. But our weak ties, or the people we don’t know as well, those are atrophying. So, like, collaboration trends in Microsoft Teams and Outlook show that communications with those outside of our immediate teams have diminished with the move to remote. You can see large group chats have decreased nominally by 5%, whereas the one-on-one chats, those have increased by 87%. So we’re doing increased communication and interaction with people we know well, we’re doing decreased communication with the people we don’t know well, and that’s gonna have a lot of consequences for, like, how work gets done moving forward.

Noshir Contractor: Absolutely. So we see that the technology is being used to deepen our networks, rather than to broaden the networks. And as you pointed out, the number of weak ties falling has consequences. Because again, we know from prior research that weak ties are very important in terms of engendering and fostering innovation and new ideas. Which is surprising though, Jaime, because in some ways, you could say that the technology now enables us to reach out more easily to people, when we are unfettered by geographic boundaries, etc. But even though technologically, it’s possible, what your research finds is that that’s not what people are doing.

Jaime Teevan: You probably remember at the start of the pandemic, like, virtual happy hours were a thing. I feel like we’ve all gotten too tired. But like, at the start of the pandemic, it was amazing, I was like, Oooh, I’m hanging out with my uncles and my friends on the East Coast and all these people, I was like, Ooh, I’m doing regular time with them. And it was amazing. And I don’t do it anymore. I think it just gets, it’s just work to sort of maintain that broad network.

Noshir Contractor: One of the things that you also talk about in the report is the impact not just on collaboration and meetings, but also on personal productivity and well-being, including the ways in which working from home, or living at work, take your pick, means that you’re breaking down boundaries, both of space and of time. Can you talk a little bit about what you found in that context?

Jaime Teevan: There’s a lot packed up in that question. We were using space as a technology to get work done, right? Space was delineating the start of the work day and the end of the work day, it was creating natural boundaries between home and work. It was creating opportunities for serendipitous conversations and spontaneous interaction. It was actually a useful limiting function for meetings, because you could only have as many people as could fit into a meeting room, and now everybody can join any meeting they ever want to. So there were all sorts of values in how we were using space as a technology. And that went away.

It stopped providing useful temporal boundaries for us. So we saw that people were sending a lot more messages on the weekends or after hours. I think the number of IMs that people send between 6pm and midnight went up 52%. And people who didn’t normally work on weekends saw their weekend collaboration triple. So the kind of time boundaries that we were used to went away.

It’s nice, because like, now, I can be like, Oh, it’s lunchtime, I’m going to take a walk and hang out with my kids. Or I can say, I’m a morning person. So I like to wake up early, and I like to go to bed at eight. And that works. But it does make the coordination of work practices very challenging, because your personal decisions are never decisions that just impact you. They impact other people.

People are working from different states or different countries, and are really rethinking where they live. Our mutual colleague, Brent Hecht, had a really interesting point, where a lot of the movement that used to happen was along latitudinal lines, because actually, at similar latitudes, you have similar environmental factors, like the same crops that grow at a certain latitude will do so elsewhere. And now what you’re actually starting to see is movement around longitudinal lines instead, because that’s when you’re in the same time zone. And we haven’t really figured out how to solve the time zone issue.

Noshir Contractor: You touched on the idea of giving you autonomy to be able to go for a walk in the middle of the day, etc. And that brings up of course, issues of those of us who have the economic resources to have a life that allows us to do that, as compared to those who might not have the opportunity.

Jaime Teevan: There’s several things embedded in that as well. So the report that we’re discussing right now focuses primarily on information workers, who represent a sizable portion of the working population in the country, but obviously not everybody. So after the pandemic hit, you saw about a third of people stay working in the workplace as essential workers. You saw about a third of folks furloughed because their physical presence was required to work, and they couldn’t, and they weren’t deemed necessary to go into work. And then you saw about a third of people move from in-person work to remote work, and that primarily is the information worker population that we’re looking at. These populations are quite different demographically as well.

A lot of the burden of having to either return to the workplace or being furloughed falls on people of color, falls on women, falls disproportionately on different people. Even when you look within the third that transitioned from face-to-face work to remote work, you see pretty significant differential impact there as well. Business leaders tend to be weathering the storm more successfully than others. Caregivers, and in particular mothers, have had real challenges, particularly with children being out of school and having to pick up child care.

Noshir Contractor: Your report does talk about some of these societal effects that are both negative and positive. Some argue it’s the K effect: some people are doing better in the situation, others are doing worse. There is the overrepresentation of BIPOC workers among firstline and other on-site workers. Your report also notes that the layoffs resulting from the work from home have disproportionately affected women, African Americans and Hispanics.

Jaime Teevan: The K effect is a good description of it. And my background is personalization, I think about how there’s individual variation across people. So there’s a piece of me that almost rolled my eyes as I’m like, oh, there’s a lot of variation within the impact of COVID. Like, yeah, tell me something new. But when you dig into the data, it’s just an order of magnitude different than what we’re seeing anywhere else. It’s hitting in interesting ways, even in, like, the work setups.

So even when both parents are able to work, women are much more likely to have their workspace be set up in a shared space, more likely to be interrupted. There’s variation by job role, as well. You’re seeing a particular challenge around well-being with managers and a particular bump in numbers of hours worked among them. New employees, anybody who’s changed roles, are having challenges as well. So you’re seeing a bunch of challenges showing up along a number of dimensions.

Noshir Contractor: Being a company that is involved in software and software engineering, your report also points out that software engineering got slightly more productive, actually, but also came with accompanying burnout.

Jaime Teevan: One of the things that I get asked about this report a lot is “Ooh, what surprised you about the findings?” And we forget how surprising it is that people were able to work remotely at all. It is true, we’re seeing developers are productive, we’re seeing information workers in general are being surprisingly productive by the standard metrics. But it’s coming at a huge cost, I think we can all feel that. There’s real hits to our well-being from working in shared spaces, working longer hours.

People are being productive, but it has really driven a significant shift in the way that business leaders are thinking about work, to recognize that work isn’t just about the stuff we’re doing. It is about the person we bring to the task. It’s about the networks we have, you’d mentioned well-being, it’s about our ability to respond well and be thoughtful and make the connections we need and not sort of be living in our panicked mind. When you talk about taking a crisis and trying to make it positive, I think that’s one of the potential positive outcomes of this, that holistic view we’re increasingly taking towards work.

Noshir Contractor: As we begin to see our way out of this pandemic, hopefully in the near future, people are talking about moving from the new normal to the next normal, and no one expects that next normal to be remotely similar to the old normal. One of the things that I found quite interesting about the report was your efforts at trying to see which of these learnings and insights are going to stay post-COVID. People talk about the hybrid model. What are your thoughts about what that hybrid model might look like moving to the stage of post-COVID?

Jaime Teevan: It is hard to imagine. And as researchers, we like to make good, thoughtful, data-driven decisions, and we don’t like to get ahead of our skis. And yet, everybody right now is having to make important, big, long-term decisions based on very little short-term data. And that’s hard.

We do have some places we can look to make this easier. Microsoft has offices in China, and China has actually opened up and we can start looking at what hybrid work looks like there and the kinds of decisions that people are making. The truth is even though we don’t really know the answers, we have a pretty good sense. And being able to make a decision from some data is better than being able to make it with none.

So we do have some recommendations that we’re making. When you start thinking about returning to the workplace, you can look at what we lose when we move remote, and what we gain. Let’s do the stuff that’s better remote, remote, and do the stuff that’s worse remote back in person. That suggests large group meetings, we can probably keep remote first. There’s some pretty cool things about being able to share the slides, have in-meeting parallel chat, or see people’s names and know who everybody is, like, those are actually benefits. On the other hand, meeting new people is something that you should really do face to face.

Noshir Contractor: One of the studies that you have recently been involved in has been doing a large scale analysis trying to isolate the effects of working from home on collaboration activities, but removing or controlling for all the other effects of COVID-19. Can you tell us about how that led you to some results that might be counterintuitive?

Jaime Teevan: What we’ve been doing is essentially, yes, trying to partition out what is going on right now because we’re in the middle of a global health crisis, and what’s going on because we’re working remotely, because it’s not the same. (Laughs) One of the ways that we do that is by looking at people who were working remotely prior to the pandemic, and so then you can see for those people how their behavior changed, before and after March, as compared to other people who moved from working in person. It’s hard to control for absolutely everything, but it actually shows the folks who were working remotely beforehand, they didn’t have such a significant increase in meetings as the rest of us. A lot of different sources of evidence that we have looked at suggest there is an expertise that comes with remote work, and as we figure that out, then it becomes easier. We’ve all had this crash course in remote work now. So as we return to the workplace, we’re going to be able to carry that over and still use it a little bit.

Noshir Contractor: I thought it was interesting that the results of this study point to the fact that working from home, after taking into account the partitioning of the COVID issues, actually resulted in less time on collaboration and more focus time.

Jaime Teevan: In general, one of the benefits of working from home is focus. People report some additional distraction from children and kids and pets and leaf blowers. I feel like there’s a leaf blower in every meeting. (Laughs) But working from home is quite good for focused work.

Noshir Contractor: Speaking of focus, one of the things that you have also been looking at in a study is the role of multitasking. Until now, Jaime, when people said multitasking, it seemed to be a four-letter word. But your research finds that multitasking in meetings has both positive and negative effects.

Jaime Teevan: First of all, it’s kind of cool that we can actually measure multitasking better than we could before. Every conversation you have is digitally mediated, everything you do is, and that provides us so much more information and so much more insight. And so you can look at exactly how often somebody emails during a meeting. And you can say, Oh, I actually know the answer now. 30% of people email during a meeting, and then you can start looking at which meetings people email in and which ones they don’t. It becomes interesting to start thinking about how you can sort of peripherally pay attention to a meeting, to jump in when it’s relevant. And think of all these long meetings that you go to, where there’s a lot of stuff you don’t care about. What if you were able to focus on it right when it was relevant?

The other thing we’re seeing a lot of, and I wouldn’t call it multitasking, sometimes we’re calling it deliberative tasking, is actually doing multiple things on a single task. So you can see during a meeting, there’s a parallel chat going on. And often the parallel chat’s quite rich and has a lot happening in it. And there’s a conversation, and maybe you’ve got the deck open, and you’re going through the deck as well. And doing all of those different things on the same task makes for a really rich, deep interaction. It’s exhausting. It’s part of what makes meetings even more exhausting, but it’s a really intense and deep way to engage on a task.

Noshir Contractor: And you’re absolutely right that the reason we are talking about so many of the insights that we are able to glean is because we are working on the web. And that’s why this is such an important area of work, in terms of web science, and being able to leverage these various forms of data, telemetry, as you call it. One of the things that Microsoft has invested in over the last few years is what used to be called Workplace Analytics and has recently been renamed Viva Insights. And the idea here, if I understand that correctly, is that if we are able to get all this insight about the way we work, and how we work, what if we could provide that information back to the workers and back to the organizations?

Jaime Teevan: So in recent years, we’ve really seen the value of data and behavioral data in particular, and I absolutely credit the web for that, as well. The real insight with Viva Insights is actually that that same data, when you start aggregating and looking at data over time, actually allows you to understand, introspect and respond to things. And particularly during a disruption like COVID, the ability to understand what’s happening and make good decisions to be resilient to disruptions is really important.

Noshir Contractor: And at the same time, there are many who are also extremely concerned about the privacy implications of these data, who gets to see these data, what if these data are in some ways corrupted, and somebody is making decisions about you, including your job based on some kind of flawed data? So how are you and your colleagues thinking about the quality assurance issues associated with these data?

Jaime Teevan: Those are all such important questions. And I love that you include bad science in there too, because it really highlights the importance of our job as scientists. There are all sorts of challenges with aggregating, understanding, and making decisions based on data. And those challenges show up at the individual level with things like, you know, workplace surveillance and privacy concerns, they show up at the organizational level, when you start thinking about security issues, or data leakage from models that you might learn, and they show up at the societal level.

And you can see that in the conversations that are happening recently around responsible AI and ethics and computing, and just our ability to make reasonable inferences. We’re also seeing an increase in interest from countries, sort of thinking about their national interests and the data that they have within their boundaries. And so all of those are super important. And it’s a place where research really comes in to help, not just to help us figure out how to do good science, make good inferences. We’re investing a lot in thinking about research related to responsible AI, research related to privacy-preserving machine learning, homomorphic encryption, differential privacy. I think there’s a lot that we need to do here. One of the things that I do like, that is important not to forget, though, is this allows us to make explicit what is happening, and there’s a value to that. Then you can introspect and understand these things and make decisions about what we think is correct and what we think isn’t correct. Like, the bare minimum is we should try to not build biases into our system. The opportunity is we can understand the biases that are there and start correcting them and building systems that actually behave in a way that we would want things to happen in the world.

Noshir Contractor: One of the things that I think is important is that products like Viva Insights make the possible visible, and then invite the debates that have to be had as a society about the issues of privacy, and the positive and negative impacts of it. And so I see this as being the first step, to say, this is possible. Now let’s talk about how, why, and when we should be using these data and insights.

Jaime Teevan: I talked about how the web in many ways has created the current AI revolution with these feedback engagement loops, right? Where, you know, you engage with the system, data gets collected, that gets fed back in the system, the system gets better. We’ve seen that there’s problems that come out of them, as well.

But there’s an opportunity there to think about, like, how do we drive these feedback loops towards our goals and towards things that matter? So thinking of Viva Insights, and this opportunity organizations have to start thinking about their organizational goals. You can start thinking about the recommendations that happen in the context of an enterprise. So we talked about how weak ties are atrophying. Maybe we want people to have stronger weak ties, maybe we want Team A and Team B to be closer. And so instead of developing feedback loops that make recommendations within that context towards engagement, we can say, oh, let’s make recommendations that help Team A and Team B be closer.

Noshir Contractor: And so in closing, as you think about the scholarship that you have done in the area of web science, and that you think needs to be done, can you give us, from your point of view, what web science might have accomplished so far, and what it really needs to focus on moving forward?

Jaime Teevan: The big thing that it has accomplished is allowed us to see behavior at scale, and make decisions related to that. And then it has allowed us to see the influence of the technology we build on society at scale, as well, and start being able to quantify and understand that, and be thoughtful. And then that obviously creates a need for us to, to do that in a responsible way in a way that is thoughtful. Another thing that I’ve found interesting about the web is how dynamic it is. The web is constantly changing and there’s such an opportunity to learn from those dynamics and grow from them. I even think of just like, how much better you can understand a web page or piece of content if you don’t just see it right now, but you see its entire history. And I think we’re increasingly able to capture and understand the entire evolution of content in a way that’s really interesting. And then of course, raises all sorts of challenging problems that are associated with that.

Noshir Contractor: Well, what I will say is that we as a web science community are grateful that organizations such as Microsoft Research, and Microsoft more generally, are engaging with these issues in a way that you are uniquely qualified to do, because you have access to these data, and are also making those insights available to the larger scientific community.

Jaime Teevan: As a company we believe strongly that our success is other people’s success. We are a company that is designed to help other people accomplish their goals, other people get things done. And that requires a strong community, a strong academic research community and a strong business community. It’s fundamental to our mission in the world to support that.

Noshir Contractor: Jaime, thank you so much, both for your leadership in this area and your own scholarship, and also for helping to steer this incredible report that I recommend strongly to anyone who’s interested in learning about how the new future of work is going to be shaped. So thank you again for joining us today.

Jaime Teevan: Thank you, my pleasure.

Episode 12 Transcript

Danny Weitzner: What we realized with the web is that for better or for worse, we in fact, have entered what might look like a panoptic world, a world in which maybe not everything that we do, but so much of what we do is recorded. The idea of somehow putting that genie back in the bottle, or somehow wrapping up all that data in a confidentiality framework, became obviously impractical, if not impossible, so what it pushed us to think about was a different approach to privacy, which I would actually argue is grounded in some early areas of law. But it’s an approach to privacy that emphasizes accountability.

Noshir Contractor: Welcome to this episode of Untangling the Web, a podcast of the Web Science Trust. I am Noshir Contractor and I will be your host today. On this podcast we bring in thought leaders to explore how the web is shaping society and how society in turn is shaping the web.

Our guest today is Danny Weitzner, who you just heard talking about privacy on the Web. Danny is a 3Com Founders Principal Research Scientist at MIT Computer Science and Artificial Intelligence Laboratory. That’s CSAIL for short. He’s also the founding director of the MIT Internet Policy Research Initiative. His research interests include accountable systems, privacy, cybersecurity and online freedom of expression. Danny was the U.S. Deputy Chief Technology Officer for Internet Policy in the White House under former President Obama and also led the World Wide Web Consortium’s public policy activities. He is a recipient of the International Association of Privacy Professionals Leadership Award (in 2013), the Electronic Frontier Foundation Pioneer Award (in 2016), and was named a Fellow of the National Academy of Public Administration (in 2019). Danny is a proud founding member of the Web Science Trust and will be a keynote speaker at the upcoming 2021 ACM Web Science virtual conference. Welcome, Danny.

Danny Weitzner: Thank you, Nosh. It’s great to be with you.

Noshir Contractor: Thank you so much for taking time. I want to start, of course, by remembering and going back in history. You were one of those who were there at the start of the entire web science movement. And as I just mentioned, you were a founding member of the Web Science Trust. And so take us back to how this all began.

Danny Weitzner: Thank you. And this really takes us back to 2004 and 2005, when Professor Wendy Hall, Professor Nigel Shadbolt, and Professor Tim Berners-Lee and I, shortly thereafter joined by Professor Jim Hendler, got together, and frankly realized that the web didn’t quite have a place in computer science academia, and even more so didn’t quite have a place in the larger social science and humanities research communities. So the web by this time, of course, had an enormous impact on society. Much more was yet to come. But we knew, because of really both Tim’s invention and because so many people gathered around the web so quickly, that this was a world-changing technology. But what we realized was that, oddly enough, computer science didn’t think the web was all that interesting at the time, because it was actually quite simple technology, elegantly designed, of course, but not pressing the state of the art of any established field in computer science.

And at the same time, we knew that the way that the web was designed, and the way it was being adopted in societies all around the world, were creating enormous questions of privacy and cybersecurity and equity of access and the nature of democracy and the future of work and on and on and on. Coming from a computer science and a law background, we didn’t feel we had the tools to really wrestle with those questions. So the founding of web science was in some sense, a simultaneous play to both the social science community to help us understand the impact of this extraordinary invention, and the computer science community to focus its attention on how we should be designing the web and related technologies going forward in order to meet society’s most important goals.

Noshir Contractor: Let’s start by talking about what your focus has been, largely in the area of combining web science, and the computer science aspects of it, with the law, and in particular with privacy issues. When we think about technology and privacy, one of the things that has gone into the public sensibility is Michel Foucault’s notion of the panopticon, where you now have the ability to watch everyone everywhere, all the time. But I also wanted to note that you have spent a lot of time on another concept that Michel Foucault brought up, that doesn’t get quite as much attention. And that’s the concept of countervailance. So tell us what are the differences between panopticon and countervailance and how both of these are important for web science?

Danny Weitzner: Sure. You know, I think that what the web crystallized, for us, is a recognition that to a first approximation, almost every action of significance is going to end up being recorded digitally somewhere available for access, available for analysis, and available for reuse. What we realized with the web is that for better or for worse, we in fact, have entered what might look like a panoptic world, a world in which maybe not everything that we do, but so much of what we do is recorded. And it’s often recorded for good reasons, because we get some value out of it. We write messages in email, because we want to communicate, we record our fitness data because we want to monitor our health. The idea of somehow putting that genie back in the bottle, or somehow wrapping up all that data in a confidentiality framework, became obviously impractical, if not impossible. And so what it pushed us to think about was a different approach to privacy, which I would actually argue is grounded in some early areas of law. But it’s an approach to privacy that emphasizes accountability.

This is where we come to focus on the idea of countervailance or surveillance. Because we can’t lock up all of our personal data perfectly to prevent misuse, what we instead have to do is actually surveil how that data is used, actually monitor how that data is used and by whom, and make sure that the rules that we want to live by are still respected. Our challenge now is to understand how to enforce those rules. The other side of web science has made, I think, a very important contribution to our work here as well. That’s really the social science methods that we’ve been able to deploy to understand the impact of different kinds of privacy environments on people’s behavior. One of the critical privacy harms that we have to insure against in this increasingly transparent world is the risk of creating chilling effects. The very worst thing we could do on the web in our society is chill people’s behavior and make them feel that if they interact too much in public, if they share their thoughts too much, that somehow there’ll be negative repercussions for that. The ultimate irony of the web, which is all about opening up information, would be if the result was that everyone went off and hid in their corners. We’ve taken this approach of trying to protect privacy through greater accountability, number one, and number two, to try to really understand the nuances of how people’s behavior is affected, one way or another, by privacy.

Noshir Contractor: So on the one hand, where panopticon was saying pay attention to the prisoners, countervailance is saying pay attention to the prison guards.

Danny Weitzner: That is exactly right. We know that, particularly in the world today, where so much of our data is held by really a relatively small number of very powerful private organizations. I’d say still, we have not yet cracked this code of providing adequate accountability, either from a technical perspective or from a legal perspective. You know, it’s one of the things that I think we didn’t anticipate even in the mid 2000s, and certainly not, I think, when the web was initially designed. The web was meant as a great decentralizing force. And, of course, what we now have, at least in some arenas, is an extraordinarily centralized set of forces. That I think remains one of the great challenges really for web science.

Noshir Contractor: I’m reminded that, at least at one point, you were the head of a group at MIT called the Decentralized Information Group, or DIG for short.

Danny Weitzner: That’s right. I’ll say with a bit of humility that we probably placed too much emphasis on the power of technical architecture to control social outcomes. Many in the internet community and the personal computer and computer community generally thought that if individuals had information power in their hands, we would end up in almost a kind of decentralized nirvana where power was radically distributed, and large institutions wouldn’t be the kind of threat that we see them as often. And I think we just, candidly, got that wrong. And in a way it took the attitude of web science to recognize why that was the wrong perspective, why that was an overestimation of the power of technology and an underestimation of all the social forces that determined the context in which that technology was really used.

Noshir Contractor: You mentioned, Danny, that countervailance is something that should be embraced by these few private organizations that are controlling so much of our data. Can you talk a little about what active transparency would look like?

Danny Weitzner: So really, accountability, through countervailance requires a very thorough record, a log, an audit trail, you might say, of all the uses of personal information. And it requires the ability for independent third parties, which might include individuals, but probably needs to be more robust organizations, to actually assess how that information is being used.

As an example, we did a piece of research with some colleagues of mine who work in cryptography and theory of computation. And the case study that we took was actually the example of electronic surveillance conducted by governments for good reasons. When governments do a wiretap or surveil criminals’ email or something, we want that activity to be secret, at least from the criminal, otherwise the surveillance doesn’t work very well, and law enforcement aims are thwarted. But at the same time, we want to make sure that the police who are conducting those surveillance activities are actually following the rules, that they’re only getting the information they’re entitled to get and are only using it for purposes related to the actual criminal investigation. And so we developed a cryptographic technique using zero-knowledge proofs and certain kinds of public ledger technologies in order to keep the computational processes secret, but at the same time record enough of them in a publicly visible way that you could prove to the public that the police were actually conducting this activity, using this very intrusive power, in a way that was accountable. The same kind of thing you would want to see in lots of private sector contexts.
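Editor's sketch: the audit-trail idea in this answer can be illustrated with a tamper-evident log. This is not the zero-knowledge-proof system described in the interview. It is a much simpler, swapped-in technique (a hash chain) that only shows how a published value can let an outside party detect whether a record of data uses was altered. The class `AuditLog`, the function `verify`, and the field names `who`, `purpose`, and `data_ref` are all hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash a record together with the hash of the entry before it."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + payload).hexdigest()


class AuditLog:
    """Append-only log in which each entry commits to all earlier entries."""

    def __init__(self):
        self.entries = []  # list of (record, hash) pairs

    def head(self) -> str:
        """Hash of the latest entry; this is the value one would publish."""
        return self.entries[-1][1] if self.entries else GENESIS

    def append(self, who: str, purpose: str, data_ref: str) -> str:
        record = {"who": who, "purpose": purpose, "data_ref": data_ref}
        h = entry_hash(self.head(), record)
        self.entries.append((record, h))
        return h


def verify(entries, published_head: str) -> bool:
    """Recompute the chain and compare it with the published head."""
    prev = GENESIS
    for record, h in entries:
        if entry_hash(prev, record) != h:
            return False
        prev = h
    return prev == published_head


log = AuditLog()
log.append("officer_1", "case_42", "mailbox_a")
log.append("officer_2", "case_42", "mailbox_b")
print(verify(log.entries, log.head()))
```

In this sketch an auditor who holds the published head can tell if an entry was changed or dropped afterwards. Unlike the approach described above, the auditor here sees the records in the clear, so it does not keep the surveillance itself secret.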

Noshir Contractor: Well, this is going to be a good challenge because it’s almost like we see an adversarial arms race between the private companies that own a lot of this data, and privacy activists who are trying to challenge them.

Danny Weitzner: That’s right. It’s again where the two sides of web science, the ingenuity in system design and the insight from social and behavioral science, really need to come together. Because we have to understand the larger picture of accountability that we’re trying to produce, the kind of human behavior, the kind of institutional behavior we’re trying to incentivize, and then figure out how to build the right systems to help that happen.

Noshir Contractor: Absolutely. That is very exciting for a lot of us in social and behavioral sciences to be thinking about those issues. I also want to take us back, as you have in your writings, to Ben Franklin, who was the first postmaster of the US and a newspaper publisher, and had a key role in getting a provision for our communication infrastructure written into the Constitution. Can you take us a little bit from there, and how it then led to all of the recent regulation and policies associated with what we know about the web today?

Danny Weitzner: Yeah, Ben Franklin, as the American Revolution was happening, recognized that we had to figure out how to knit together these 13 colonies, which had quite a bit of diversity to them, and quite a bit of distance separating them. It was for that reason that he persuaded his colleagues, the other founders, that creating a system of public post roads was an absolute priority. And this was in marked contrast to the European postal systems at the time, which were his reference point. These were mostly available to the royalty and privileged members of society. And Ben Franklin said that, oh, this has got to be a public service available to everyone, and most importantly, without discrimination as to content. So he really was, in some ways, the first American network architect. And secondly, he envisioned how this network of post roads could actually create a vibrant free press. Because of the intricacies of the way the postal charges worked, he realized that he had a way to give postmasters a particular advantage in being recipients of information. Even the newspaper names that we’re used to: the New York Times, well, that was the times at which the ships arrived, and the Washington Post, well, the reason that postmasters became newspaper publishers is that they got their mail for free. So it meant this network of postmasters could exchange information all around the country, and then publish it in their newspapers. And what they actually did is they sent each other their newspapers back and forth across these post roads. Franklin, in this way, built a network that not only enabled the movement of physical goods, but enabled the movement of ideas.

Noshir Contractor: And so we come from there to also the provision where the information providers, or the platforms if you will, were not necessarily liable for what kind of content was put on some of these platforms. If you come to the 1996 Communications Decency Act, and Section 230 in particular, tell us about what was the original thinking behind that, and how that may or may not have changed in today’s world.

Danny Weitzner: The carriers, the internet platforms, that included everyone from internet service providers to web hosting providers to the current internet platforms that we know of today: social networks, Twitter, Facebook, Google, etc. Section 230 provided that these platforms would, number one, not be liable for the speech of their users. So if I get on Facebook and I insult someone, that person might be able to sue me if my insult was sufficiently harmful, but that person cannot sue Facebook.

I was at the Center for Democracy and Technology at the time, very much in the middle of Congress’s debate about how to approach the internet. What a number of us realized was that if we really wanted these platforms to enable robust free speech, we couldn’t put the platforms in the position of having to monitor and assess their potential liability for every single piece of information that flowed across the network, exactly in the same way, as you suggested, Nosh, that if we told postmasters they were liable for the contents of every message, every letter that they carried, then the mail would come to a halt.

But Section 230 also did something else, which is very important. It also said that if platforms take steps to remove content that might be considered by their users to be offensive or harmful, for whatever reason, they would not be liable for those actions. And that was a very intentional design, to encourage platforms to create environments that would suit the communities that were using them. We knew we couldn’t possibly have one single content standard to govern all the information on the internet; it just would be too complicated. What we envisioned was that there would be many different platforms, each of which would perform this kind of function and create environments that were targeted to different audiences.

Now, what we didn’t envision, as we talked about before, is that we would only have three or four platforms at any one time. In the early 2000s, not long after Section 230 was enacted by Congress, there were over 8000 Internet Service Providers just in the United States, and literally hundreds and hundreds of web hosts. Again, the early days of the internet were a much more decentralized, less concentrated environment. Today, it’s reasonable to ask what we should be expecting of these dominant platforms when it comes to speech, but I still think the underlying goal of Section 230 as a free speech protecting rule remains every bit as important now.

Noshir Contractor: Indeed, I want to turn our attention now to some more recent work that you’ve been doing in the context of the public health pandemic, trying to understand how we reach a balance between privacy and public health. Your project is titled Private Automated Contact Tracing, or PACT for short. Can you tell us a little bit about how this would work in terms of balancing the public health good and the privacy of the individual?

Danny Weitzner: To begin with, just to set the context, we’re now a little bit more than a year into the pandemic, and just a little more than a year ago, my colleague at MIT, Ron Rivest, who’s really one of the world’s leading cryptographers and an extraordinary computer scientist, came to me and said, we need to figure out whether it’s possible to do privacy-preserving contact tracing. It was understood even then, in early March of 2020, that because COVID was an infectious disease, deploying the traditional public health approach of contact tracing was critical: once you find a case, you have to very quickly figure out who else might have been exposed to that individual and make sure that they quarantine or take appropriate steps both to protect themselves and to limit the spread of the disease.

What we saw at the time was that the countries that were hit with COVID first just happened to be in Asia: China, Taiwan, South Korea, and others. Some of these countries very quickly adopted very innovative, but very intrusive, smartphone-based surveillance techniques. After the pandemic started in China, in order to travel around at all, you had to have a kind of a COVID pass, which showed that your risk of exposure was limited. And this was all based on a highly centralized system that Chinese public health authorities built in order to detect who might have been in close proximity to someone else who had tested positive for COVID. And these were all systems that were developed using the GPS capabilities of our smartphones. That is, they were location-based systems.

We realized that it was simply going to be unacceptable in the United States or other democratic countries, to have that kind of intrusive surveillance, whether or not it was going to work. And so we realized that, from talking with colleagues in public health, that really all that mattered, in assessing exposure risk for COVID was proximity to another individual, not your absolute location in the world. We realized that really what public health authorities needed was a way to detect who had been in close proximity for a sufficiently long time, to someone who tested positive, and that we needed a way to get notifications to those individuals who were potentially exposed. We at MIT with some colleagues at BU and Carnegie Mellon and other universities, worked very quickly to develop a protocol that would provide this kind of exposure notification based on proximity, not location in order to protect privacy. And we had colleagues also based in Switzerland, mostly EPFL, who also were developing very similar protocols.

Once we released our protocol, Apple and Google announced that they were going to work together to adopt a very similar design. And they have, I think, to their great credit, worked extraordinarily hard together to develop a single system, which is deployed on both the Android and the iOS systems to enable exposure notification. The critical property of this system, from a privacy perspective, is that it doesn’t collect any information about your location, and it keeps any personal information about your proximity to an infected person, and any personal information about your infection status, entirely private to you as the individual user.

If I’m an infected person, I don’t know who I infected. If I’m a contact of an infected person, I don’t know who infected me. And probably most significantly, and perhaps most controversially, the public health authorities don’t know any of this. Our system relies on notification to individuals, who then are instructed to contact public health authorities to take further action. This was a, you know, a decision that we took quickly, but not thoughtlessly, because we felt that if we built a system that required that everyone trust their governments to hold this information securely and without any risk of adverse consequences, that would just discourage too many people from using the system.
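Editor's sketch: the privacy properties described here (proximity rather than location, matching on the user's own device, no identities exchanged) can be illustrated with a toy model. This is not the PACT protocol or the Apple and Google system. The class `Phone` and its methods are hypothetical, and the sketch leaves out real-world details such as how random values are derived and rotated, how contact duration and signal strength are measured, and how a positive test is confirmed before anything is published.

```python
import secrets


class Phone:
    """Toy model of a phone that exchanges random tokens with nearby phones."""

    def __init__(self):
        self.sent = []      # random tokens this phone has broadcast
        self.heard = set()  # tokens received from phones that were nearby

    def broadcast(self) -> bytes:
        # A fresh random token says nothing about who or where the sender is.
        token = secrets.token_bytes(16)
        self.sent.append(token)
        return token

    def hear(self, token: bytes) -> None:
        # Only the token is stored: no location and no sender identity.
        self.heard.add(token)

    def tokens_to_publish(self) -> list:
        # A user who tests positive chooses to publish their own tokens.
        return list(self.sent)

    def check_exposure(self, published) -> bool:
        # Matching happens on the device; the result is not reported anywhere.
        return any(token in self.heard for token in published)


alice, bob, carol = Phone(), Phone(), Phone()

# Alice and Bob are near each other; Carol is somewhere else.
bob.hear(alice.broadcast())
alice.hear(bob.broadcast())
carol.broadcast()

# Alice tests positive and publishes the tokens she sent.
published = alice.tokens_to_publish()
print(bob.check_exposure(published), carol.check_exposure(published))
```

In this toy version Bob learns only that he was near someone who later tested positive, Alice never learns that Bob was notified, and the published list contains random values rather than names or places.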

What we see around the world now is that in states and countries that have deployed the system, anywhere between 20 and 50% of the population uses it. That’s a big number, but it also reflects some substantial hesitation. This is another web science challenge. We are right now trying to learn how people have made decisions about whether or not to deploy this service, whether or not to turn this app on, on their smartphones. And we don’t have what I would regard as statistically relevant data yet, and we certainly haven’t published anything yet on it. But very early indications suggest that people are making pretty complex decisions that have to do with a sense of trade-offs. It’s not just a question of whether I’m giving up too much privacy or not, it’s a question of what am I getting for it? Am I getting a system that’s actually going to protect me? Is it actually going to protect my community, my family, or not? There’s a huge amount to learn about how people make those decisions and what’s the right way to communicate about them. Even though we’re well into the pandemic, and even though we designed the system really in a matter of weeks, I think it’s going to take us longer to really understand some of the details of how it’s actually being used and understood.

Noshir Contractor: Well, we’ve done quite a tour de force today, talking about all the thinking and research that you’ve done in helping with policy in the area of privacy, as applied to platform providers and as applied to public health. Talk a little bit about training the next generation of web science scholars, and about one of the pioneering courses that you have been teaching in collaboration with colleagues at Georgetown Law School, between computer science and law students.

Danny Weitzner: When I left the government in 2012, my colleague David Vladeck, who is a law professor at Georgetown and was the head of consumer protection at the Federal Trade Commission, where he did a lot of very, very important work on privacy investigations, and I wanted to keep working together. And we had both spent a lot of time working on privacy legislation together that had been proposed by the Obama administration. So we said, well, let’s see if we can help law students and computer science students to work together on developing privacy legislation. This has developed into a course that is now in its sixth year, and we bring together 15 law students and 15 computer science students, and do a kind of a crash course on privacy law and a crash course on relevant computer science concepts. And then we give teams of students, made up of two law students and two computer science students, a particular privacy challenge, and we say, understand the challenge technically, and come up with a legislative response to it. You know, we thought when we started this course that what we were teaching about was privacy. And of course we do that. What we learned is that what we actually are teaching is how lawyers and computer scientists can work together.

And I think what we’re really learning is that addressing public policy challenges that web science brings to the fore, is really a team sport, that it really isn’t something that any one discipline could go off by itself, and either study alone, or act on alone.

Noshir Contractor: In closing here, your research and your teaching are absolutely stellar examples of how web science has to live up to the spirit of serving at the intersection of these different disciplines. So I am really grateful that you took the time to talk with us about some of these issues, and for your thought leadership over the decades on this particular topic. And as a shameless plug for those who would like to hear more wonderful words of wisdom from Danny, I would encourage you to think about attending the virtual 2021 ACM Web Science Conference from June 21 to June 25, where Danny has graciously agreed to be a keynote speaker. So thank you again, Danny, for joining us today.

Danny Weitzner: Nosh, thank you so much, I really appreciate you having me. This was wonderful.

Episode 11 Transcript

Ravindran Balaraman: In a country like India, the number of people who are active on the web far exceeds populations of most countries. But then there’s a significant fraction of our population that still doesn’t have access to the web and access to the services that are being provided on the web. So there is this digital divide.

Noshir Contractor: Welcome to this episode of Untangling the Web, a podcast of the Web Science Trust. I am Noshir Contractor and I will be your host today. On this podcast we bring in thought leaders to explore how the web is shaping society and how society in turn is shaping the web.

Today I have the pleasure to welcome Ravindran Balaraman, who you just heard discussing the unique challenges people in India face accessing the web. He is the Mindtree faculty fellow and a professor in the Department of Computer Science and Engineering at the Indian Institute of Technology Madras. He also heads the Robert Bosch Centre for Data Science and Artificial Intelligence at IIT Madras, which is the leading interdisciplinary AI research center in India and India’s first lab to join the Web Science Trust Network of laboratories from around the world. He co-founded the India chapter of the Association for Computing Machinery’s Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD for short), and he is currently the president of that chapter. His research is pushing the boundaries of reinforcement learning, social network analysis, and data and text mining. And his work bridges the gap between theory and practice in machine learning. In 2019 he was instrumental in hosting the first Web Science Symposium in India. He was recognized in 2020 as a Senior Member of AAAI (the Association for the Advancement of Artificial Intelligence) for his significant accomplishments within the field of artificial intelligence. Welcome, Ravi.

Ravindran Balaraman: Noshir, thanks for having me on the podcast.

Noshir Contractor: It’s my pleasure. Thank you for joining us. I must say that I’m especially thrilled to have you as the first guest on this particular series who is joining us from the global south. I’m absolutely delighted to have your insights about what web science means and can do or can’t do in the emerging economies of the world. You have mentioned, for example, that there are many India-specific challenges that need to be addressed by web science. What do you think web science means in the context of countries like India?

Ravindran Balaraman: In a country like India, the number of people who are active on the web far exceeds populations of most countries. But then there’s a significant fraction of our population that still doesn’t have access to the web and access to the services that are being provided on the web, right. So there is this digital divide, which people talked about when the IT services became more popular. Now, with the growth of the web, as society’s interactions are happening on the web, this kind of digital divide is getting exacerbated, it is getting much worse. Recently, with a colleague of mine from our social sciences department, we have been looking at the impact of this worsening digital divide on the migrant population and, in particular, their access to digital banking. With the enablement of digital banking, so much more of our commerce now, even in India, happens online. And there is a significant fraction of the society that is getting excluded from that. Take the migrant population: they have now actually been, you know, transplanted into what is a slightly alien culture for them within the country, and they don’t want to use these web services that are available for the rest of the country. They don’t want to use them because they’re feeling even more alien. This is not within their realm of experience. One of the theories that we are posing now is that maybe we should use techniques from AI to make sure that these interfaces on the web that these people are getting access to remind them of home, as opposed to having an impersonal voice that’s going to talk to them about, okay, you want to do banking, so press this number or press that and then enter something here. So can we have somebody talk to them in their local dialect?

Their portal to the web now becomes more like a slice of home. Given that very few countries have this kind of large internal migration of migrant population like India, it’s a problem that literally, we have to buckle down and start solving.

Noshir Contractor: This is really intriguing. Can you give a concrete example of what kind of migrant population you’re talking about, and what can be done to help them feel less alienated and more at home?

Ravindran Balaraman: So let’s take one concrete example. Almost 80 to 90% of the construction workers in India are people who are displaced internally. These are people who move in from a particular state in the north of the country called Bihar. And most of the construction workers in my state, my home state, which is the southernmost state in the country, come from Bihar. And it’s a completely different culture, not just the language: the way we dress, the climate, the kind of festivals that we celebrate here, the food that is available to them, everything is different. So this is really alien country for them, and they tend to stick close to one another, right? And then you tell them that, okay, the government is offering you, you know, schemes, and all you have to do is go online and click a few buttons on your smartphone. All of them have smartphones, this is a surprise, all of them have smartphones, and they use that to connect with their families back home, right, and just call them or link with them on WhatsApp. So they are happy to do that. It’s not that they can’t get online, but they can’t integrate with a larger web community, mainly because they just want to use it as a conduit for connecting back home. My colleague in the social sciences department has been doing a lot of study on migrant populations within India. And so we are drawing on the insights that he has gathered from their assimilation into local society, and then trying to look at how that affects their assimilation into the web. And the insight that we have arrived at is that the web actually gives us an opportunity to give them a slice of home, if you can tell them, okay, all your interactions with the web can happen in the local language, and then you will log into a portal, and then it starts greeting you with local functions, local festival chat, asking you about your parents and stuff like that. So that’s the kind of idea that we are looking at. But that’s a solution that has to come from India.

Noshir Contractor: You touched on something at the start that I want to go back to. You mentioned that as a result of the digital divide, the difference in access has been exacerbated recently. And I want you to tell us a little bit more about the extent to which you think the presence of the web has contributed to this digital access divide, and/or the extent to which AI, which is now becoming so pervasive on the web, is either mitigating or exacerbating these digital divide issues that you touched on.

Ravindran Balaraman: Take almost every web service that you see online. Whether it is online communities like Facebook, or professional communities like LinkedIn, or whether you’re looking at services like Amazon or Google, everything is strongly infused with AI. This enablement of AI is essentially making, you know, people rely more and more on the services because they are so much more convenient. Companies, because they are looking at where the bulk of their revenues are coming from, are tending to move more online. And so it makes it harder and harder for people to get services locally. So those who are not online are actually getting fewer and fewer services.

So it is certainly exacerbating the divide. Really, right now we are looking at how to make you know, the online access easier for people.

Noshir Contractor: It’s depressing in some ways to hear you say that AI might actually be exacerbating the divide, but you’re also looking at and exploring ways AI can be deployed to mitigate some of these access and divide issues. Can you give a specific example of something that is happening in India that gives you hope?

Ravindran Balaraman: Language is an important thing, right. Most of the interfaces now have improved tremendously in India. A lot of companies are actually investing money in India to build local language interfaces. My mother tongue is Tamil, and I can talk to my phone in Tamil, and it does a perfectly fine job of transcribing it. Even if I give you a Tamil keyboard, right, anyone who has tried using a Tamil keyboard knows that it is much, much harder to use than the English keyboard. I would prefer typing in English to typing in Tamil, but I would rather talk in Tamil than in English. I can see more people getting integrated because of that.

I still think AI is, at the end of the day, a technology, right, and it’s up to us to figure out how to use it. And there is a stronger awareness among the government as well as among some of the bigger, you know, enterprises.

We have to actually start providing all the services in a more accessible manner. And that realization is now gaining ground.

Noshir Contractor: You point out a really important issue that we tend to take for granted in many of the developed countries. In India alone, according to the census of India in 2001, there were 122 major languages in one country, and 30 of these languages were spoken by more than a million native speakers. So what you just described, the technology of using ways of translating these across languages, really helps connectivity on the web in a way that we take for granted when we speak one dominant language in the West. Can you talk a little bit about the ways in which the study of artificial intelligence has in and of itself changed as a result of the web? I remember going back several years, when the initial Dartmouth studies were coming together to coin the term artificial intelligence, AI was mostly seen as a rule-based system where you would provide certain rules and certain kinds of reasoning systems. And today, that seems somewhat antiquated, or is it?

Ravindran Balaraman: Oh, yeah, that’s a huge debate. AI seems to go through these phases, right? At one point of time, they say it is all about logic, reasoning, rules, and inferencing on it. And then the next point of time, we say, oh no, throw out everything, you have to learn everything from data, learn from scratch, it’s all about statistics. We are seeing a strong swing towards the data-driven statistical approach to AI. And part of the reason is the web. So it has been something that really helped AI grow, because it’s giving you huge volumes of data. It’s not only just giving you data, right, but it’s also giving you data with tags on it, because people are so good at labeling what they are putting out on the web. Because everything is becoming more and more digital, that data is getting readily digitized.

Some of the techniques that AI is using now have been around for a couple of decades, if not longer. They couldn’t succeed because they didn’t have this kind of volume of data that the web has enabled us to gather, right, and so in that way the web has had significant influence on the growth of AI.

At the same time, I also have to say that the web has caused us to kind of topple over and do things in a not so careful manner. Because if you look at some of the latest AI systems built completely on web data, you kind of see that they also tend to mimic the significant biases and prejudices that people bring to their writing, the things that they post on the web. And if you don’t do a careful curation of the data that you’re getting from the web, you’re going to systematize the biases by putting them into a machine. And it actually makes it easier for people to make the argument that humans can be biased but machines can’t be. But then what they fail to see is that the machine is going to be biased because it’s digesting the biased data that people are putting out on the web.

So in some sense, it was great all that the web did. And it really gave a quantum boost to what AI was doing. But we are coming to a point where we have to start thinking very carefully about how we are going to take advantage of the data on the web.

Noshir Contractor: So Ravi, one of the examples that got a lot of attention in the US at least, was the fact that exactly as you said, if you use AI to screen job candidates, then these AI systems will reproduce the same biases in terms of gender, and underrepresented minorities in terms of interviewing and screening for job opportunities, etc. And one of the issues that this raised was that very often these kinds of AI techniques give you a result, but don’t necessarily explain how and why they got those results. Some of my friends joke that what AI lacks is a “why” button. And that if AI gives a result, you should be able to press a button that says why, and this raised the whole issue of explainable AI. Can you talk a little bit about whether you see that as helping address the issue and the concern that you just raised? But also, how far are we from being able to have explainable AI?

Ravindran Balaraman: So I strongly believe that before AI can be truly, you know, let out free in the wild, we need to solve the explainable AI question. So in fact, the job screening thing was something that was pretty obviously AI going wrong, right? But then there are a lot of subtle ways in which AI is influencing our behavior, right? In fact, if I go online, right, so my phone starts recommending these stories for me, then it’s going to start coloring my view of what kind of stories I’m going to see. That’s just because the AI system is learning this, and then it’s putting those out. So it very quickly customizes it for your preferences.

We need to have something similar to the “why” button. So what people do nowadays as explainable AI is to say that, oh, you asked me why I said that particular image is appropriate for your search, right? I want to see a football match. And then it shows me a picture of a football match. And it might say something like, oh, here at this top right corner, there is something that, you know, caused me to make it into a football match. So it can’t even tell you that, okay, well, I think it’s a football match because there are like 10 people here and there is one guy carrying a ball. It basically says, okay, it is this part of the image which makes me think that it is a football match. That’s certainly not a satisfying notion of explanation for people. So we are quite a way away from getting to explainability as humans understand explainability. I’m not even sure how soon we will be able to get there.

But this is what I always tell people. You don’t know how a motor vehicle runs, you don’t know the details of an internal combustion engine, but you’re happy to drive a car. Right? The reason you’re happy to drive a car around is because you know that there is somebody at least back there who understands and has done all the testing and everything for you. If you can come to a point where I can say that I know why AI does this, me being an AI expert, right? As long as I can say that, okay, I understand the explanations AI is putting out and I’m happy to certify that AI is doing the right thing. The general public just has to accept it: okay, it’s come with a certification from an AI expert, that they understood what it is doing. But if you’re going to say that it has to go to a point where the general lay public, the end user, is going to understand completely what the AI is doing, I think we’re still a ways off from that.

Noshir Contractor: To what extent do you think that AI is enhancing trust on the web, or undermining it for the lay public?

Ravindran Balaraman: I mean, I can tell you what I see around me, like at least in a large fraction of the Indian society, right? So we, unfortunately, tend to trust the web too much. The latest WhatsApp forward is taken as gospel. That’s mainly because the forward comes from a person that they know. And therefore they transfer the trust that they have of the person to the message that’s been sent through them as well.

So the web in some sense really worsened the impact of rumors and things like that, because you have a verifiable media source that sent you the information. And we tend to kind of ascribe the same trust to that piece of information as well, right? Even though the person who forwarded it to you might not have known where the message came from. When we did get news from newspapers and things like that, I mean, there is at least the hope that appropriate research has been done before things are put in print.

And I’m not sure whether AI is still playing a role here in terms of making this worse or better. But I think AI can play a role in making things much, much better in terms of attaching provenance, or at least doing a very, very quick analysis of, you know, the consistency of information on the web. The biggest challenge in fact-checking all the information that floats out on the web is you don’t really know what the ground truth is. And at the rate at which information is generated on the web, you can’t also go after the ground truth, right? So at least AI systems operating at scale can verify the consistency of the information that’s out there. If there are like 10 people saying one thing and 10 people saying something completely different, then at least you can say that, hey, look, I don’t think this is right, because there’s just too many different opinions about this. And everybody is also working on this kind of fake news detector, and so on and so forth. But who’s to say your news is fake?
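The consistency idea described here (many sources agreeing versus sources split evenly) can be illustrated with a very small sketch. This is not a method described in the episode: the function names, the stance labels, and the 0.7 threshold are all assumptions chosen for illustration, and a real system would first have to extract stances from text.

```python
from collections import Counter

def consistency_score(stances):
    """Share of reports that agree with the most common stance (0.0 if empty)."""
    counts = Counter(stances)
    if not counts:
        return 0.0
    top_count = counts.most_common(1)[0][1]
    return top_count / len(stances)

def is_contested(stances, threshold=0.7):
    """Flag a claim when no single stance reaches the (assumed) threshold."""
    return consistency_score(stances) < threshold

# Ten sources say one thing, ten say the opposite: the claim is flagged.
split_reports = ["supports"] * 10 + ["contradicts"] * 10
print(is_contested(split_reports))
```

A flag of this kind says only that sources disagree. It does not establish which side is true, which is the limit the speaker points to.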

Noshir Contractor: In the past year, in particular, with the pandemic and the other global reckonings, there has been heightened focus on social justice issues. But I want you to talk a little bit about what social justice means, specifically within the Indian context. And what does the societal interplay and impact of AI and the web mean for social good in India?

Ravindran Balaraman: Throughout the country, the whole notion of social justice is very strongly embedded, right? In terms of opportunities, and jobs, and in academics and everywhere. It’s significantly different from state to state. There are places where these kinds of social inequities are much more pronounced. It could be that the same community in the society is discriminated against in one state, but not in another state. It’s a very, very, very complex dynamic within the country. So it’s not clear how we would, you know, build AI systems that are uniformly fair across the entire country.

And again, as far as social good is concerned, right? So there are various issues that people have looked at, for which they build solutions in the West or in other countries. We kind of struggle with implementing them in India, because just implementing a system that would work for a million users alone, even though it will help the million users, is grossly unfair to the Indian population.

Noshir Contractor: Can you explain more of why it’s unfair?

Ravindran Balaraman: It’s unfair in the sense that which million are you going to deploy to? Right, so who do you choose? Of course, there are a whole bunch of other factors that are going to come into play in terms of which fraction of the population you have access to, that you are able to deploy your system to.

We really need to figure out a way to scale it much, much larger, a couple of orders of magnitude larger than what we can do right now with our systems, in order to make it truly countrywide deployable.

Noshir Contractor: It sounds like what you’re describing is a scaling problem. Help me understand why scaling is such a challenge.

Ravindran Balaraman: Well, let’s say that I’ve developed a system that tells me that, okay, here are people with certain medical conditions and, you know, they are having difficulty keeping to the drug regimen, and you have to do some intervention to help them. Now I come up with a system that can look at and analyze a million people, and then filter out 1,000 people who need this kind of intervention, and then I can actually put, you know, healthcare workers who can go help these 1,000 people, right? Now I scale it. But now suddenly, I’m looking at 100,000 people who need this kind of intervention. It’s not just a question of computation being hard, it’s a question of actual deployment in the field that makes it much harder.

Noshir Contractor: So the challenge is not just in the technology, and the web might help us identify those who are in need. But that still begs the question of how we are going to reach all those people in any physical, tangible way to provide what the technology has helped identify that they need. In closing here, one of the questions that we’ve been asking our guests is that as we have been going through 2020, and now into 2021, we’ve been dealing with, obviously, the pandemic as well as many global reckonings, sociocultural in nature, political in nature. And I was curious to get your take, specifically, from an Indian vantage point, on how you think this period 2020 and 2021 would have been different, for better or for worse, without the web.

Ravindran Balaraman: I can’t imagine 2020 without the web. So we literally lived off the web. Not only was I working on the web, I was meeting friends, having, I mean, everything, right? So I just can’t imagine how we would have survived 2020 without the kind of online work and online meetings that are happening. I strongly feel that things would have been for the worse in the last year without the web. You might as well have thought of how we would have gone through 2020 without electricity.

Noshir Contractor: Yes, indeed, yes, it’s become a utility that we take for granted in most cases now. Well, I want to thank you again very much, Ravi, for taking time to talk with us and specifically for giving us insights into how web science has a different lens when seen from the context of the developing world, in this case, particularly from India, etc. And we’re just delighted that IIT Madras, the Indian Institute of Technology, Madras, became the first member of the Web Science Trust Network of Laboratories from India and you were certainly instrumental in making that happen.

And I wish you and your colleagues the very best in helping advance the notion of web science in developing countries etc. And we will be looking forward to hearing more about those insights in the years to come. So thank you very much again.

Episode 9 Transcript

Deen Freelon: Identity factors, which include, you know, not only race, gender, in some cases, sexual identity, national origin, also in some cases religion, really help to get a fuller picture of what’s going on on the web and in various digital domains. So that’s something I’d encourage every web science practitioner to do, first of all, to read up on it, to figure out how to integrate that into the work they’re already doing, and then secondly of course, to implement that knowledge.

Noshir Contractor: Welcome to this episode of Untangling The Web, a podcast of the Web Science Trust. I am Noshir Contractor and I will be your host today. On this podcast we bring thought leaders to explore how the web is shaping society and how society in turn is shaping the web.

You just heard our guest today, Deen Freelon, talking about why identity is key to understanding the complex interplay between the Web and society. Deen is an associate professor in the School of Media and Journalism at the University of North Carolina in Chapel Hill. His research covers two major areas of scholarship: political expression through digital media, as well as data science and computational methods for analyzing large digital datasets. He has authored or co-authored more than 30 journal articles, book chapters and public reports, in addition to editing a scholarly book. He has also served as principal investigator on grants from the Knight Foundation, the Spencer Foundation and the US Institute of Peace. Professor Freelon has been at the forefront of research into misinformation, disinformation, hyperpartisan content, ideological asymmetry, identity politics, and personalized information environments. And as a member of the web science community, Deen writes lots of software to analyze data, some of which he releases in open source spaces. Welcome, Deen.

Deen Freelon: Thanks.

Noshir Contractor: I’m so glad that you’re able to join us here today, I have been a huge fan of your work for a long time. Let me first begin by asking you, how did you first get interested in studying the web?

Deen Freelon: I’ve always been a bit of a nerd. My dad was an early adopter of computers, I learned how to do web pages when I was in high school, this is mid-90s. I went to college thinking that I was going to be a computer science major, but I was at Stanford at the time. And I found that the way they taught it wasn’t quite my speed. So I sort of pulled back and I majored in psychology. Later, I taught myself how to do PHP in my first job, which was as a technology trainer at Duke University, which is in my hometown. And at the same time I was teaching myself how to code, I was also becoming more politically aware, right? So this is around the time, 2000, 2003, start of the Iraq war, and all that. So the code piece and political piece were happening right around the same time. And so it was only later that I realized, wow, I kind of had these two pieces of my eventual scholarly identity that were percolating and evolving at the same time. And this is actually before the field of Communication Studies, and probably web science as well, starts to become aware of computational methods and data science as a key component of both of those. And so really, it was kind of serendipity that I ended up having those skills and those interests at a time when those fields were starting to value those and starting to promote them.

Noshir Contractor: Well, I think we’re all very lucky for that serendipity, because you really were the right person at the right time. And one of the things that I really admire about your work, Deen, over the years is that you’ve taken issues and been able to capture it in a way that advances intellectual insights, but also speaks to a larger public. And you’ve done this in an amazing way in your scholarship, as well as your public engagement. Talk a little bit about how you began to think about these issues. I’ll throw a couple of recent papers that you’ve written. You have a paper called “False equivalencies: Online activism from left to right.” Tell us a little bit about what this false equivalence is, and why it might be going against the grain of some conventional wisdom that we might be listening to in this area.

Deen Freelon: That paper is really the culmination of a lot of thoughts that I’ve had over the past, I don’t know, probably half a decade at least. And the false equivalency is between the left and right. So you have a lot of work on the left that has really come from, and we talk about this in the paper, kind of the hashtag activism school, right? So, you know, there’s a lot of work on Black Lives Matter, there’s a lot of work on the climate change movement in terms of their use of hashtags. And so there is one view of the left, and actually, that connects to prior work that’s not computational or web science in nature, primarily in sociology and communication, in which the left has overwhelmingly been focused on when you’re talking about social movements, social activism.

And you’ve got work on the right that really comes out of the tradition of sort of the right wing media ecosystem, which of course long predates the web, right, going all the way back to the 30s. But, you know, it really intensifies in the 1980s, and the sort of mistrust of the mainstream media that dates back decades as well. And so those sort of very divergent research traditions, I thought, were really interesting and important to look at in contrast in that piece. And so that’s really what it does, it tries to figure out, you know, how the left does business as far as activism goes, how the right does business. What similarities are there? They’re both online, they both use many of the same social media platforms. What differences are there? The literature tells us that, for example, disinformation is a much bigger problem on the right than it is on the left. The issue that we identify in the piece, or one of the issues we identify, is that there hasn’t been that much research on disinformation on the left. So there’s a couple possibilities. One possibility is the research record reflects reality, right? Disinformation is a bigger problem on the right than it is on the left. Another possibility is that, because there hasn’t been quite as much research done on disinformation on the left, we simply don’t know.

What we call for in that piece is to try to figure out exactly what is going on as far as disinformation on the left goes. Searching through the literature, we didn’t really find that there were that many attempts to even answer the question. So what we’re advocating for is an affirmative answer to this question of how much disinformation there really is that is left-wing, left-leaning, or left-oriented, so that we can characterize it against the disinformation that we know is rampant on the right.

Noshir Contractor: Deen, why do you think there haven’t been more studies that have tried to examine disinformation on the left?

Deen Freelon: That’s a good question. I think some of the disinformation may not be quite as out there. I think, as we saw in terms of the events of January 6, there is a very strong argument to be made that the disinformation on the right, apart from how much of it there is, I think that the character of it is a lot more virulent and more likely to result in injury and harm to bodies, specifically, as well as to democratic norms. And so I think there’s a greater urgency there simply because of that. However, I do think it’s more than a mere scholarly curiosity in terms of characterizing the nature of disinformation that may appeal to the left as compared to that which appeals to the right. We simply haven’t done that work. I think it’s analytically important. I think it has public importance as well.

Some of it may have to do with the political commitments of the people who do the research. I’m certainly not going to cast aspersions on anyone who does that kind of work, and I certainly don’t know enough about their political commitments to be able to say definitively. That’s just how, you know, confirmation bias and sort of, you know, motivated reasoning tend to work.

This is something that, again, extends from a research tradition that goes back at least to the 60s, you know, the studies of the civil rights movement being kind of the paradigmatic social movement. And even if you look at some of the definitions of social movement, some of them almost seem to have left wing politics built into them. And so I don’t think that’s a great idea. But I do think that some of the analytical pieces of this also play a role in determining what gets categorized as a quote unquote social movement, and what is studied as, you know, reactionary politics or mainstream politics, because they’re practiced by people of different ideological commitments.

Noshir Contractor: So you’re not making a conspiracy argument, you’re just saying that this is a scientific curiosity that needs to be balanced across the left and the right.

Deen Freelon: Yeah, I really try not, I really try to stay away from any and all conspiracies. I do think that, you know, in that review, I think we’re doing what good reviews do, which is to point out, you know, gaps in the literature to say, we’ve done a really good job over here, we haven’t done quite as much work over here. So let’s, you know, balance the scales a little bit.

Noshir Contractor: One of the things that, obviously, is front and center on many of our minds these days, especially in the United States, is the Black Lives Matter movement. And I want you to talk a little bit about your piece that was titled “Black trolls matter: Racial and ideological asymmetries in social media disinformation.”

Deen Freelon: Sure. Well, I want to give credit for that title to the wonderful Jeff Hancock of Stanford University. That piece really grew out of my work on Black Lives Matter. I did a report, a public report that came out in 2016, and a follow-up empirical article a couple of years after that. And so that actually was one of my big entrées into the world of online disinformation, because I had this big Black Lives Matter dataset. And when the Internet Research Agency Russian troll list of handles came out at the end of 2017, I basically just looked into my Black Lives Matter dataset and said, Wow, there’s like 300, you know, some names from this list represented in my Black Lives Matter dataset. So I said, Okay, well, this is definitely something I have to study, because they seem to have some interest in activism, specifically Black activism. And that piece of research that you mentioned is really the culmination of that investigation.

What we found was that Black-presenting Russian trolls were actually more likely than any other of the categories that we looked at, which included right wing trolls, non-Black left wing trolls, and a couple of other ones, to pull in retweets, replies and likes on a per tweet basis. And we thought that was quite remarkable, especially because the study design allowed us to disaggregate the influence of ideology from race.

Noshir Contractor: Can you talk a little bit more about that? What does it mean to be able to disambiguate race from ideology? And also, if you could just recap again, what exactly was the asymmetry in the social media disinformation that you found?

Deen Freelon: So we rely in that study on categories that came from a couple of researchers out of Clemson University. They came up with a really great initial typology, but they lumped together Black left wingers and non-Black left wingers, and so based on some theory that we detail in the piece, we made the theoretical argument for disaggregating those. We found out that a substantial amount of the effect for likes, retweets and replies that was attributed initially to left-leaning was actually explained by Black-presenting, right? What we found was a very, very strong indicator that the Black presentation was actually driving a significant portion of the effect. That’s where the asymmetry comes from, the asymmetry between left and right being more effectively explained by race than by ideology. And also the asymmetry between being sort of non-Black left wing and being Black left-leaning.

Noshir Contractor: That is incredibly interesting, because it’s so easy for us to conflate some of these in our stereotypes. I’m going to ask you a more general question, do you make a distinction between disinformation and misinformation?

Deen Freelon: If you look at our piece that ran in Political Communication last year, “Disinformation as political communication,” we talk about disinformation as being false or misleading content that is intentionally spread to damage a third party. So that is where the person spreading it is aware of the deceptive nature of what they’re spreading. And they’re doing it with a specific goal of damaging some enemy. Misinformation is where content is spread without knowledge on the part of the spreader that it’s false, or that there is some deceptive element to it. And so what that actually implies is that dis- and misinformation are not necessarily inherent qualities of the content itself, but rather, they are relations between the people who spread them and the content.

Noshir Contractor: And so by that definition, then, there are the two pieces that you wrote about Russia, one titled “The Russian Disinformation Campaign on Twitter” and the other about Russia’s Internet Research Agency appearing in US news as vox populi. Tell us a little bit about how you got interested in this particular issue. And what were some of the key takeaways for you?

Deen Freelon: I feel that my interest in disinformation is sort of, you know, charitably achieved through my interest in social movements, and in the way that a lot of the most prominent disinformation, including the IRA and others, have really tried to glom on to existing social movements to be able to spread their falsehoods. And so I think that is something that is a logical outgrowth of the work that I’ve done. A lot of the work that I have done in this has really stuck close to the relationship between disinformation and social movements, because that’s something I’ve been interested in since I was a grad student.

Noshir Contractor: And you find that in the case of the “Russian Disinformation Campaign,” one of the things that you argue, which again, is counter to the conventional wisdom, is that the disinformation campaign on Twitter targeted political communities from across the spectrum, not just from the left, as some in the media would have us believe.

Deen Freelon: The Internet Research Agency, which was a very specific group of Russian trolls that were paid by the Russian government, targeted not only, you know, folks in the Black community or on the left, they also targeted folks on the right. And in one of the studies, the study that was published in the Misinformation Review, my colleague Tetyana Lokot and I point out that the specific identity that the IRA agents took on was the same identity as the people that they actually wanted to reach. So conservative-presenting trolls wanted to reach conservatives, Black-presenting trolls mostly reached Black individuals, left-presenting trolls reached out to and actually ultimately reached left-leaning individuals. So in some ways, that’s actually helpful analytically to understand exactly what they’re doing. They’re playing on this idea that most of us who study social media and web science understand, which is like follows like, right, you know, birds of a feather flock together. And so they’re really taking advantage of that specific tendency on the internet and social media, to be able to reach out to folks and have the real individuals who share those political identities carry forth their disinformation for them. And that’s one of the main ways that they’re able to get traction, to have real people sharing, retweeting and engaging with their content, which gives it that imprimatur of reality.

Noshir Contractor: And what was interesting is that you suggest that the best way to counter that, or at least one way to counter the Russian disinformation campaign, would be for people across the political spectrum to collaborate against it. Tell us more about that.

Deen Freelon: Now, that was, in all honesty, a bit of a pipe dream, right? I mean, we’re pretty polarized, I think, in our country right now. But I think if there are opportunities to do that, it would be a great thing. I don’t know anybody who openly proclaims that having, you know, foreign agents infiltrating our political conversations is a good thing. So it seems, at least in principle, to be something where people from differing sections of the political spectrum could come together and agree, at least, that this is a bad thing, and we should find ways to combat it. So now, in terms of how likely that is, I don’t really know.

Noshir Contractor: Yes, we are living in rather hyperpolarized times, as you might put it. You did a project that has been going on for a period of time called the filter map. Tell us more about where that project started and where it is now.

Deen Freelon: “The filter map” is the name of a piece that came out, I was commissioned to write this piece by the Knight Foundation, and it came out in 2018. And in that piece, I sort of take issue with some of the conversations that were occurring around ideas of the echo chamber and the filter bubble. The idea at the time was, Oh, well, people really need to engage with content that lies outside of their own bubble, right? So it’s content that is produced by people who disagree with them, they need to engage across ideological lines. And my contribution to the conversation is, there are certain ideas that it is not fruitful for us to engage with, right? So if you’re talking about, you know, open racism, open sexism, you know, Nazism, things of this nature, these aren’t ideas that we should give the time of day to, so to speak. And so what I tried to do in the piece is to articulate the kinds of ideas that we disagree with that we may want to give the time of day to, and those kinds of ideas that we may not want to, right?

So the idea behind the filter map is to say, whether you agree with something is sort of one aspect of your relationship to an idea. A second aspect is, if you’ve decided you disagree with something, whether it lies beyond the pale of things that you would at least consider. And so that general set of ideas kind of sat on the shelf for a little bit, until I was lucky enough, with three of my colleagues, to be able to receive one of the big Knight Foundation center-endowing grants in 2019. And at that time, I realized that I had an opportunity to put the ideas in this filter map into practice.

So I’ve collapsed it into two dimensions. And so one dimension is, if you think about this horizontally, left versus right. So there’s been a lot of progress in the past few years in terms of ideologically scaling media personalities, media outlets, and Twitter handles, things like that. So you can think about that as being scaled on a horizontal axis, as well as on a vertical axis that would look at things like the total number of, you know, ratings that you get on PolitiFact, right? So if you’re high truth, you’re up here, you’re low truth, you’re down here, right? So now you’ve got two axes: one that shows left, right, and one that shows high truth, low truth, up and down. And you can actually look at your own social media feed and see how much of each quadrant you actually get. So if you think about above board, where you’re high truth, that’s where you’re seeing the kind of content that you want to engage with. Oh, here’s the high truth, right wing stuff, okay? Let’s think about that. Let’s engage with that. And if it’s low truth, well, it’s low truth, and it’s on my side, that’s maybe disinformation that’s trying to target me, that’s where I’m most vulnerable. And that’s what I want to keep out of my information stream. My hope is that that will help people understand their social media feeds better. And it’ll help put some of this heady, you know, theoretical stuff into practice in a way that ideally makes people’s lives a little bit better.
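The two-axis idea described here can be sketched in a few lines. This is only an illustration under stated assumptions, not the filter map project's actual code: the score ranges (ideology from -1 for left to +1 for right, truth from 0 to 1), the 0.5 cutoff, and the function names are all hypothetical.

```python
from collections import Counter

def quadrant(ideology, truth, truth_cutoff=0.5):
    """Place one feed item in a quadrant.

    ideology: assumed scale, negative = left, zero or positive = right.
    truth: assumed scale from 0 (low truth) to 1 (high truth).
    """
    side = "left" if ideology < 0 else "right"
    level = "high-truth" if truth >= truth_cutoff else "low-truth"
    return f"{level} {side}"

def feed_breakdown(items):
    """Return the share of a feed that falls into each quadrant."""
    counts = Counter(quadrant(ideology, truth) for ideology, truth in items)
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}

# Hypothetical feed: (ideology score, truth score) for each source seen.
feed = [(-0.8, 0.9), (-0.5, 0.2), (0.6, 0.8), (0.7, 0.1)]
print(feed_breakdown(feed))
```

For a left-leaning reader, the "low-truth left" share is the part of the feed the speaker describes as the most vulnerable spot.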

Noshir Contractor: This is an example of how you make your scholarship very actionable or potentially actionable by individuals in terms of giving them something to look at. You’ve also contributed by way of sharing code and the software tools that you’ve developed, etc. Tell us about why you chose to do that. And what do you see as the challenges and opportunities for people in the web science community to be sharing their code?

Deen Freelon: Sure. Well, I first started sharing my code when I was a grad student. And actually, the very first thing I shared is by far the most popular thing I’ve ever shared. And that is ReCal, which is an online intercoder reliability calculator for content analysis. So that’s kind of fun for me, and I think, useful for many people.

In many ways the success of that project, which was really just an offshoot of my master’s thesis, I mean, the short version of the story was that before ReCal, the primary intercoder reliability program was something called PRAM, and it only ran on Windows, and I had a Mac. And you know, I did my grad work in Seattle. And, if you know Seattle, it’s very rainy. And so when I was doing my intercoder reliability tests for my master’s thesis, I didn’t want to walk from my apartment all the way to the lab in the University of Washington Communication Department. So I said, well, I’ll just make one myself and program this thing, literally do the math myself to do this. I sort of prettied it up and made it usable for others when I put it on my website. So the success of that really led to other, you know, sort of forays into writing software for the research community, which I think is incredibly important. I think of how I personally have benefited from other people’s software that they’ve created that’s been on an open source basis. And I just want to give back a little bit to that. One issue that I think hampers people from doing this is that, especially outside of computer science, and perhaps information science, the production of open source software for the research community is often not seen as much of a contribution as it should be.

Noshir Contractor: We’ve talked about sharing code. What about sharing data? You were involved as part of the beta test that Twitter has offered to make all of its data available for free for researchers who apply for it. So tell us about moving from sharing code to sharing data in the web science community.

Deen Freelon: This is a really big topic. Our access to data, especially that which is owned by or stewarded by for-profit corporations, is fundamentally tenuous. We’ve seen, you know, the rise of Social Science One, which provides application-based access and also money to work with Facebook data. We’ve seen this more recent initiative by Twitter, which allows access all the way back to the first tweet in 2006 to researchers who apply. But again, even though I applaud that particular move by Twitter, ultimately, they have a say over who they accept in terms of this program. That still puts a lot of power in their hands in terms of deciding who gets to access this kind of data, and who gets to do this kind of research. I think that any researcher in the web science area should really have what I consider to be a diversified portfolio in terms of the data streams that they’re working with. So don’t become over-reliant on one type of data to be able to get your work done. A lot has been written and said about our field’s over-reliance on Twitter data. And so you know, if Twitter data is your only game in town, well, if Twitter decides, you know, that giving this kind of data access is not in their best interest, or if they decide to reject your application for access to this wonderful, you know, time-unlimited stream, then you’re not going to be in a very good position. So having a number of different data sets that can speak to the kind of questions that you’re interested in, whatever they may be, I think is critical for being a web science researcher in 2021.

Noshir Contractor: We talked earlier about polarization, and I’m going to use that as a pivot to a very polarizing concept that I would love to get your take on. And that is, the notion of being able to infer individual level characteristics from digital trace data. You get people on one end of it, who think that that’s the most incredibly powerful way of being able to get to things and others who think that is the scariest idea on the web.

Deen Freelon: Well, this is something I’ve been thinking about for a very long time, and it’s something that I feel like I wish more people paid attention to. Because there are certain norms in certain fields, that don’t really think or are not thoughtful enough about what those traces really mean. For some researchers, it seems that simply studying the trace itself is enough. And there’s not really a whole lot of discussion about what theories this may apply to, and what those traces actually mean.

So I think that under certain circumstances, certain digital traces are really, really great proxies for things that we really care about. In other cases, the fit may not be so great, but what I really want the scholarly community to do, web science and other social sciences, is to really consider carefully the fit between the theoretical concepts and research questions of interest and the data to which they have access.

Noshir Contractor: What do you see today, based, either on your own work, or more generally, what do you see as important issues that web science should be addressing moving forward?

Deen Freelon: Again, another really big question. I’ll just sort of beat a drum that I’ve been talking about for a while now. I think that, you know, the web science community is in many ways not unique among social sciences in underestimating the importance of identity more broadly, and race specifically. So, when you’re thinking about any topic that you deal with, whether it’s virality or some of the more policy-oriented aspects of this, keeping an identity-focused aspect of this firmly in mind is really important. Identity factors, which include, you know, not only race, gender, in some cases sexual identity, national origin, also in some cases religion, really help to get a fuller picture of what’s going on on the web and in various digital domains. So that’s something I’d encourage every web science practitioner to do: first of all, to read up on it, to figure out how to integrate that into the work they’re already doing, and then secondly, of course, to implement that knowledge.

Noshir Contractor: Now that’s extremely important, especially because in some ways, one can argue that the web conceals some of the normal visual, surface-level characteristics that we would look at closely for many of these identity issues, not all, but some of these identity issues.

Deen Freelon: Yeah, and that actually ties back into the trace data issue. So one of the examples has to do with the underlying concept of gender versus race. So you’ve got a situation in which gender, generally, at least for anglicized names, can be inferred with high levels of accuracy from someone’s first name. Then the question is, to what extent does the system support the use of a real, quote unquote, first name? Facebook has it in its terms of service that you use your real name, so that’s a terms-of-service-level issue. You can use something else, but you risk your account being kicked off. Twitter does not require you to in its Terms of Service, and lots of people don’t. So, you would assume that any study that had a bunch of names of individuals on Facebook would have a lot easier time determining gender than would an equivalent study on Twitter. Now shift to the idea of race. Race is a lot harder, especially in the United States, to infer on the basis of someone’s first name. In some cases you might be able to do it; in other cases, you may not be able to do it.

And so that becomes a lot harder to be able to get. Actually, a better example is that Facebook allows you to indicate your gender, so the difference in terms of the identity characteristics that you’re able to get out of those systems is baked into the design of the system. So, that means that some identity characteristics are easier to integrate into a research study than others. But I think that the effort is well worth it when you’re trying to figure out how, for example, different sociotechnical systems are used by different people, how they impact different kinds of people, and how different kinds of people see them.

Noshir Contractor: In closing, in 2020, we spent almost a year in isolated, confined environments, dealing with all kinds of reckonings: cultural, racial, health-related, etc. Can you talk a little bit about how this entire experience might have been different, for better and/or for worse, if we didn’t have the web?

Deen Freelon: The image that popped into my mind was, how would a skyscraper be different if you removed the second floor? The second floor goes away, and what happens is, floors three through n crash down, and they crush floor one. So, I think that, you know, taking out the web, which is so, you know, deeply enmeshed into everything we do, would render our society completely unrecognizable. So it’s not like, okay, you take the web out and you go back to the 1980s. It’s everything that relied on that, everything from banking to getting your takeout with a couple of clicks of an app, to your health, to how you relate to others, the fact we can have this conversation remotely. I just don’t think that would really be something that we could imagine. We can’t really put the genie back in the bottle. We have to live with this as it is. I think there are certainly ways that people can use the web better, and there are choices that I wish people hadn’t made. I think it’s extremely difficult to imagine our society without the web.

Noshir Contractor: I love the metaphor of the second floor of a skyscraper falling apart, I think that is an extremely evocative way of capturing our dependence, if you may, in a very foundational way. Deen, thank you again so much for taking time to talk with us today. I think that your work is extremely important in part because it challenges some conventional wisdoms and does so in a way that really is provocative and advances our understanding and sensibility about many issues related to web science. And I look forward to seeing continued research and insights from you in the years and decades ahead so thank you again.

Deen Freelon: It was really great to be here.

Noshir Contractor: Untangling the Web is a production of the Web Science Trust. This episode was edited by Molly Lubbers. I am Noshir Contractor. You can find out more about our conversation today in the show notes. Thanks for listening.

Episode 7 Transcript

Gina Neff: We see a moment that we’re in right now of being somewhat trapped, I think, between the necessity of contributing to a public good, but also needing to understand where we fit personally in these. So I’ve spoken out quite publicly about back-to-work solutions that don’t protect workers’ privacy, right? What we know about organizations and workplaces, is that we absolutely have an imbalance of power between employees who need work and employers who might have other interests or demands in the workplace.

Noshir Contractor: Welcome to this episode of Untangling The Web, a podcast of the Web Science Trust. I am Noshir Contractor and I will be your host today. On this podcast we bring thought leaders to explore how the web is shaping society and how society in turn is shaping the web.

That was Gina Neff earlier, talking about how the Web and the workplace are influencing one another. Gina is a professor of Technology & Society at the Oxford Internet Institute and the Department of Sociology at the University of Oxford. She’s a sociologist who studies how web-based technologies are shaping the future of work, and has published three well-acclaimed books and over four dozen research articles on innovation and the impact of digital transformation. Her writing for the general public has also appeared in Wired, Slate and The Atlantic, among other outlets. Given her thought leadership in this space, Gina was invited to deliver a Keynote at the 2020 ACM Web Science Conference. Welcome, Gina.

Gina Neff: It’s really good to be here.

Noshir Contractor: Well, I thank you very much for taking time to talk with us today. And I’m very excited to hear your opinions and your insights about this very important new discipline that has been emerging over the last decade, called web science. So first I want to talk about what that term, web science, means to you.

Gina Neff: As a social scientist who studies work and technology, I really can’t do what I do without thinking about web science, so when I first think of the term, I think of mapping the web. What I do in my work is to really dive deep into those ties and think about what people are doing in those connections. How are they linking at work, what do those links mean for them, and how are those collaborations playing out, both at workplaces where people are trying to work on tasks together, but also in terms of making social structure, in terms of making the new rules of society that come out of the links that they’ve built.

Noshir Contractor: And I wonder if you can give us some key insights and contributions that have been made by web science to better understand not just the way we do work now, but the changing nature of work and what some would argue is the future of work?

Gina Neff: Without understanding how our networks and relationships are changing with technology, we simply can’t understand how people accomplish the tasks and goals they have in the workplace. So this is going to seem like a tangent, but bear with me.

I’ve been studying large scale construction projects, skyscrapers, and, you know, on the one hand, it’s one of the last industries that we would think of as high tech, and yet, a decade ago they were trying to figure out how to do remote work meaningfully.

The first web meetings (WebEx meetings) I was ever in were on construction sites. Why? Because people who come together working on a construction site often have to travel from a two hour radius to get to the construction site. If you bring all of those different companies, people from all of those different jobs, to the job site, it’s costly. It takes time. And much of the work that they were trying to do is coordinating in a digital space. And so they tried. They’re like, okay, let’s just have video calls. And yet, even though we have a very tight closed project group, where everybody understands their roles and tasks, really highly structured, they struggled with figuring out how to come up with collaborative decision-making in these online virtual meetings.

That’s a neat problem, right, and it’s a problem that we’re all facing right now. So I think when I think about web science, I think about the ways in which we’re mapping new kinds of communication ties that end up structuring our everyday life. Whether that’s from social media, whether that’s from our news environment, whether that’s from our workplace. And so when we look at how this becomes part of our daily way of working — part of our daily rules and ways of being — what I think as a social scientist, what we start to see is some really exciting things about how the fundamental rules of social life are formed. That’s what I think web science can do.

Noshir Contractor: That is incredibly important and significant, and I was wondering also how this would tie into some of the earlier work that you did, the book that you wrote titled “Venture Labor: Work and the Burden of Risk in Innovative Industries,” in the Acting with Technology series. You were one of the early scholars looking at this issue clearly from a web science point of view.

Gina Neff: Yeah, so that project really asked the question, why on earth would an internet industry form in New York City. If we have the capacity with the new commercial worldwide web, to have these links that allow us to work remotely, why would a thriving industry form in some of the most expensive real estate in North America, right in the center of Manhattan?

And the answer is kind of twofold. One is there was a supply of creative individuals who worked in adjacent industries, in advertising and in film and writing and in magazines and that they could come together and basically create new kinds of content to fuel the first wave of commercial web activity.

Now, that’s part of the story. But the other is that we know that innovative industries really thrive and prosper on these close links that people have, and that that kind of information becomes the way that industries can understand the event horizon that they face, right? They have all of these people fairly closely together who share information, share new technologies, share ways of doing things, share new kinds of companies. You know, this is the hope that everyone has of the new Silicon Valley, right, what makes it an innovative industry.

And so you put these two pieces together, you have a bunch of creative people and they’re all trying to figure out how to make a new industry.

It becomes, I argued in the book, a way that new kinds of risk get shared and dispersed and spread across a new industry. And that I think is really interesting for us to think about in this particular moment, because people learn to adapt and people learn to take on that risk in these new kinds of environments, and they welcomed it in the first wave of the web.

Noshir Contractor: That’s a good point in terms of the first wave and the second wave. I think initially, a lot of people in the first wave were focused on broadening our networks to the point where we could be anywhere in the world and have virtual organizations and virtual teams. But what you’re pointing out is that the web has at least an equally important role in helping people who may be co-located also augment their interactions and communication and collaboration by leveraging other aspects of the web that we might not have looked at initially.

Gina Neff: And I think that, you know, in this moment right now we’re having this conversation. You’re sitting in Evanston, I’m sitting in Oxford. Most of England and much of the United States are sitting at home because we’re fighting the coronavirus pandemic.

So what we don’t know yet is how these initial stores of our social networks, our social capital, get translated in this incredibly highly stressed moment, so that people can become useful and do their jobs and make the connections in a moment when we can’t travel, when we can’t see each other face to face, right? It is incredible to me. Can you imagine doing this 15 years ago, right? If I go back 20 years ago, if I go back to this moment in Silicon Alley, in New York City, where content creators were so excited about the possibility of what the web could be, they were doing serial small video segments. You know, there was a company, Pseudo.com, that had the hubris to turn to 60 Minutes and say, “I will put you out of business.” No one’s heard of Pseudo, right? Pseudo is gone, it’s long gone. But the idea of a web company doing streaming video was so laughable in the year 2000, right? It was so inconceivable that we could have the bandwidth to do this thing that we’re doing right now: the high quality video, high quality streaming, high quality interaction. And so, we’re still in many ways in the early days of figuring out how these kinds of links and ties are going to be intensified in our face-to-face interaction, and then nourished through the other kind of digital-mediated ways that we interact.

Noshir Contractor: And this is a question and a challenge that web scientists like yourself have taken on. I was also struck by how, with all the technology that we now have around us, these technologies have the ability to be instruments tracking us digitally. And you talked about that in the book that you co-authored with Dawn Nafus, titled “Self-Tracking.” And while that book was focusing more on the quantified self and instrumenting yourself in personal contexts, I would love to get your take on what might happen as technologies in the workplace are being instrumented to capture our actions, interactions and transactions. Where do you think self-tracking is headed in the workplace? And what do you see as the promises and perils of that?

Gina Neff: One of the things that I really learned in the self-tracking project is that for many people, their personal data is something they’re very willing to share in an altruistic way. There are all these wonderful communities that I studied in the self-tracking project, patient communities where people share really intimate data, genetic data, medical histories, things that you and I might see as much too risky to our own sense of personal privacy and protection. And yet, these incredible people were driven to do this because they saw that their data held the possibility for the cures to their illnesses. And their data held the possibility of these incredible connections and ties to other people who were going through the same thing they were.

We see a moment that we’re in right now of being somewhat trapped, I think, between the necessity of contributing to a public good, but also needing to understand where we fit personally in these. So I’ve spoken out quite publicly about back-to-work solutions that don’t protect workers’ privacy, right? What we know about organizations and workplaces is that we absolutely have an imbalance of power between employees who need work and employers who might have other interests or demands in the workplace in terms of HR, management or legal exposure.

And so any kind of app that gets us safely back to work absolutely has to keep these two kinds of tensions in mind, right? It has to be designed from the ground up, first, to tap into people’s altruism. People want to solve problems and they certainly want to solve the global pandemic that we’re in right now. But they don’t want to do it at the risk of their own livelihood, or their own ability to continue working. So you’ve got to think about that kind of data as control, and data as power. You know, we kind of went off from your basic question. Now, suddenly everything we do at work is somehow also traceable and trackable. It’s a huge opportunity for those of us who study workplaces to think about how those networks at work might be changing. How might the networks change for people who are, say, women at work? Are we seeing how remote working changes women’s ability to navigate networks? That’s an open question and one that we’re going to have trace data to study. But at the same time, we absolutely need to be thinking about how we create safe workplaces and how we create better and more stable workplaces, given the fact that now everyone’s exposed in these new ways with their data.

Noshir Contractor: You hit the nail on the head. It’s really a dilemma in some ways, but also an opportunity to be able to understand how networks are changing today when they go virtual. It’s one thing when we are connecting with people that we already knew and we may be in a position where we are deepening those ties, but it’s an open question how this environment will work when we have to deal with people we had not met previously. How will the web environment accommodate the levels of social presence that we are used to in which we have a prior face-to-face interaction?

Now, you already made reference to the pandemic, but I wanted to just give you another opportunity and invite you to talk about what you think are the one or two most significant things that would have been different for us if we had to navigate the pandemic and global cultural reckonings without the web.

Gina Neff: Can you imagine doing this without the web? Seriously. I mean, evidently there have been global pandemics before the web, but I can’t think of them. Between how we have organized our shopping, how we have organized our home life, how we have organized our schooling, and how quickly, within just a matter of weeks, we transitioned from face-to-face to being at home around the world, I find it literally inconceivable. And I can remember a world before the commercial internet. So what does that mean?

I think that one of the challenges that we need to remember about the web is that the beautiful, wonderful decentralized structure that is stable enough to permit this thing to keep on going and perpetuating without centers of control is the exact same protocol that allows us individually to navigate in very different ways. And so while my ability to reach out and regenerate my networks may not be damaged as much, I, as a manager, as a supervisor, as a leader, really need to remember that others I work with might not be able to do that as well. And so I think that that’s the kind of catch-22, right? There was a wonderful example, a terrible example really, here in the UK when shops opened up and there were very long lines at one of the discount retail stores. And there were the chattering classes in the media, talking about, you know, how dare these people wait in line going to these discount shops in order to buy clothes, that seems so risky, why are they doing that, and forgetting the number of people who don’t navigate the world through Amazon, who don’t have credit cards at their disposal, who, you know, navigate the web from their smartphone and therefore have a very different kind of experience of how you might buy and shop and deal. And by the way, this retailer has no online presence, right? So the retailer that allows the best discount on clothing in the entire country is not one selling online. And yet, this very large disconnect. So I think that one of the things that has been made visible is that the ways in which we navigate communities through the web are quite distinctive, and we need to remember that others are following their own distinctive paths as well.

Noshir Contractor: I think these are really, really important points because you mentioned that you have memories going back before the commercial internet and so do I. And I don’t remember an overwhelming amount of discussion about designing the web to help deal with a pandemic. And yet, for some reason, it seems that many aspects of the web were designed perfectly to deal with it. On the other hand, as you have also been mentioning, there has been the downside of the web in terms of the ways in which at this particular point in time, it might be influencing certain communities. And I’ve heard you talk about the term infodemic and I wonder if you want to talk a little bit more and share your thoughts about that in the present situation.

Gina Neff: So when the head of the World Health Organization says very early on in the COVID-19 crisis that we have an infodemic, right, that is, as they say on Twitter, you know, not swimming in their lane, right? Come, come swim in our lane, right, the lane of people who understand online communication. And I want to use the term meaningfully.

Because, you know, many of the ills that we talk about in terms of disinformation, misinformation, the infodemic, they are reflections of a moment where there is decreasing trust in our social institutions. This is not coming from the web and the way in which people connect; this is really about how people feel a part of society. And that’s the challenge we’re having right now. We’re having an enormous challenge in Western democracies with this relationship between individuals and the state, and individuals and their communities, and something will shift and change. We just don’t quite yet, I think, know what. When we bring that back to infodemic, right, when we bring that back to this idea that high-quality, good scientific information is hard to come by, I’m sure you’ve seen it in your social media feeds, and in mine I’ve had to deal over this crisis with people who don’t believe in science. They don’t believe in vaccines. They don’t believe the same thing I do. And I can take the approach that says, if I only convince them, you know, if I only argue hard enough. Or I can begin to say, part of what we need to do is not simply about the kinds of information, getting better information and better quality information. We need to do that. But we need to get out the kinds of stories that we know have always connected us, and make us feel a part of something bigger than what we are individually. And so that’s what I think, you know, we see in this moment, right? It’s just a simple amplification of a social trend, a very large social trend that’s predated the web, it’s predated COVID-19, and this is coming together at this particular moment. It’s interesting times.

Noshir Contractor: Well, thank you again, Gina, for taking time to join us today to share your insights with us, and more importantly for your thought leadership in web science. Thank you.

Gina Neff: Thank you. It’s been a real pleasure.

Untangling the Web

a podcast of the Web Science Trust, hosted by the Sonic Research Group at Northwestern University


Episode 13 Transcript

Noshir Contractor: Being a company that is involved in software and software engineering, your report also points out that software engineering got slightly more productive, actually, but also came with accompanying burnout.

Noshir Contractor: I thought it was interesting that the results of this study point to the fact that working from home, after taking into account the partitioning of the COVID issues, actually resulted in less time on collaboration and more focus time.

Episode 12 Transcript

Noshir Contractor: Welcome to this episode of Untangling the Web, a podcast of the Web Science Trust. I am Noshir Contractor and I will be your host today. On this podcast we bring in thought leaders to explore how the web is shaping society and how society in turn is shaping the web.

Noshir Contractor: So on the one hand, where panopticon was saying pay attention to the prisoners, countervailance is saying pay attention to the prison guards.

Noshir Contractor: I’m reminded that, at least at one point, you were the head of a group at MIT called the Decentralized Information Group, or DIG for short.

Noshir Contractor: You mentioned, Danny, that countervailance is something that should be embraced by these few private organizations that are controlling so much of our data. Can you talk a little about what active transparency would look like?

Noshir Contractor: Well, this is going to be a good challenge because it’s almost like we see an adversarial arms race between the private companies that own a lot of this data, and privacy activists who are trying to challenge them.

Episode 11 Transcript

Noshir Contractor: Welcome to this episode of Untangling the Web, a podcast of the Web Science Trust. I am Noshir Contractor and I will be your host today. On this podcast we bring in thought leaders to explore how the web is shaping society and how society in turn is shaping the web.

Noshir Contractor: This is really intriguing. Can you give a concrete example of what kind of migrant population you’re talking about, and what can be done to help them feel less alienated and more at home?

Noshir Contractor: To what extent do you think that AI is enhancing trust on the web, or undermining it for the lay public?

Noshir Contractor: Can you explain more of why it’s unfair?

Noshir Contractor: It sounds like what you’re describing is a scaling problem. Help me understand why scaling is such a challenge.

And I wish you and your colleagues the very best in helping advance the notion of web science in developing countries etc. And we will be looking forward to hearing more about those insights in the years to come. So thank you very much again.

Episode 9 Transcript

Noshir Contractor: Welcome to this episode of Untangling The Web, a podcast of the Web Science Trust. I am Noshir Contractor and I will be your host today. On this podcast we bring thought leaders to explore how the web is shaping society and how society in turn is shaping the web.

Noshir Contractor: I’m so glad that you’re able to join us here today, I have been a huge fan of your work for a long time. Let me first begin by asking you, how did you first get interested in studying the web?

Noshir Contractor: Deen, why do you think there haven’t been more studies that have tried to examine disinformation on the left?

Noshir Contractor: So you’re not making a conspiracy argument, you’re just saying that this is a scientific curiosity that needs to be balanced across the left and the right.

Noshir Contractor: Can you talk a little bit more about that? What does it mean to be able to disambiguate race from ideology? And also, if you could just recap again, what exactly was the asymmetry in the social media disinformation that you found?

Noshir Contractor: That is incredibly interesting, because it’s so easy for us to conflate some of these in our stereotypes. I’m going to ask you a more general question: do you make a distinction between disinformation and misinformation?

Noshir Contractor: And what was interesting is that you suggest that the best way to counter that, or at least one way to counter the Russian disinformation campaign, would be for people across the political spectrum to collaborate against it. Tell us more about that.

Noshir Contractor: Yes, we are living in rather hyperpolarized times, as you might put it. You did a project that has been going on for a period of time called the filter map. Tell us more about where that project started and where it is now.

Noshir Contractor: Based either on your own work or more generally, what do you see today as important issues that web science should be addressing moving forward?

Noshir Contractor: Now, that’s extremely important, especially because, in some ways, one can argue that the web conceals some of the normal visual, surface-level characteristics that we would look at closely for many of these identity issues: not all, but some of these identity issues.

Noshir Contractor: Untangling the Web is a production of the Web Science Trust. This episode was edited by Molly Lubbers. I am Noshir Contractor. You can find out more about our conversation today in the show notes. Thanks for listening.

Episode 7 Transcript

Noshir Contractor: Welcome to this episode of Untangling The Web, a podcast of the Web Science Trust. I am Noshir Contractor and I will be your host today. On this podcast we bring thought leaders to explore how the web is shaping society and how society in turn is shaping the web.

Noshir Contractor: And I wonder if you can give us some key insights and contributions that have been made by web science to better understand not just the way we do work now, but the changing nature of work and what some would argue is the future of work?

Now, you already made reference to the pandemic, but I wanted to just give you another opportunity and invite you to talk about what you think are the one or two most significant things that would have been different for us had we navigated the pandemic and global cultural reckonings without the web.

Noshir Contractor: Well, thank you again, Gina, for taking time to join us today to share your insights with us, and more importantly for your thought leadership in web science. Thank you.