Episode 3 Transcript

 

Wendy Hall: I always say there’s two things I don’t like about web science. One is web and the other is science. But the idea there was that it wasn’t just about the technologies, HTML, HTTP. It was about the web of people actually, it was about interconnectivity, and science in the sense of study of it. And nowadays, I say it’s, we’re studying, you know, our lives online, basically finding ways to do that.

Noshir Contractor: Welcome to this episode of Untangling the Web, a podcast of the Web Science Trust. I am Noshir Contractor and I will be your host today. On this podcast, we bring in thought leaders to explore how the web is shaping society, and how society in turn is shaping the web. 

Our guest today, Dame Wendy Hall, was involved in naming the very field we’re talking about. In a different world, we could’ve sat down to chat about the philosophical engineering movement, or psychohistory. But Wendy and other co-founders decided to call it web science.

Wendy was a Founding Director of the Web Science Research Initiative and is the Managing Director of the Web Science Trust. She became a Dame Commander of the British Empire in 2009, and was elected a Fellow of the Royal Society in the same year. She was elected President of the Association for Computing Machinery in July 2008, and was the first person from outside North America to hold that position. And now, Wendy is the Regius Professor of Computer Science at the University of Southampton, and the Executive Director of the Web Science Institute there. In 2020, Wendy was appointed as Chair of the Ada Lovelace Institute by the Nuffield Foundation. Welcome, Wendy.

Wendy Hall: Hi. It’s lovely to be here.

Noshir Contractor: Thank you so much for joining us, Wendy. It’s a special privilege to be able to talk with you today about web science, because in many contexts, I would consider you as the matriarch of web science. And I would like to begin by asking you to take us through what motivated you and your colleagues to come up with the idea of creating this entity called web science.

Wendy Hall: Well, thank you very much for asking me. We, being myself, Tim (Berners-Lee), Nigel (Shadbolt), and Danny (Weitzner), started meeting in Tim’s office at MIT, to talk about why the Semantic Web was not being taken up more than it had been: Why weren’t people interested in linking data? I’ve known Tim since before he launched the web, and had been around the evolution of the web. And he talked about the Semantic Web in his very first keynote at the first Web Conference in 1994. But then everyone was focused on getting the web up and running. And that was thought of as a web of linked documents. And we didn’t have social networks yet. The Semantic Web was always part of Tim’s big vision: that machines could help you link data, and when you could link data, then you could infer knowledge from the documents that you were linking or from whatever you were linking, if you could describe it with data.

And people just didn’t get it. The web consortium, W3C, had developed the Resource Description Framework. He published his paper with Jim Hendler and others, all about what the Semantic Web would mean. And he just couldn’t get people to think about linking data. And so when we started talking about this, and this was 2004, 2005, so 15 years after the web was launched, it was clear that we had to look back in order to look forward.

So we started to look at how the web had evolved. What had been the tipping points for the web to take off? Why did people start adopting the standards that Tim eventually made completely universal? And we started drawing pictures, and we realized that it was actually to do with people and not just the technology. It was what people did with it, and how companies used it to create new businesses.

People started having computers at home, and then the smartphone appeared. And that was all happening as we were talking. So you could see it was taking off, but it was so interesting to think about what had happened. It was clearly a sociotechnical story. We have to study the web as an ecosystem. And it has to be interdisciplinary study; it had to bring in people from social science and law and economics and geography and physics and maths and history and education and politics and business studies and anything.

We called it the Web Science Research Initiative, between Southampton and MIT because that’s where we were based. Jim came on later, and then you came on later, while we were still the Web Science Research Initiative. 

But we didn’t really know what to call it. Tim wanted to call it philosophical engineering. He studied physics at Oxford, and that was when it was called natural philosophy, so he wanted to call it philosophical engineering. We all wanted to call it psychohistory, from the Foundation series by (Isaac) Asimov. Because you can’t predict what a person is going to do. But by looking back at the history of development, can you forecast, not predict, what the mass of people will do? What society will do? And that was really the founding idea, but we felt people wouldn’t understand psychohistory. I think when we make the film or write the book, they will, but we called it Web Science, for good or bad. And I always say there’s two things I don’t like about web science. One is web and the other is science. But the idea there was that it wasn’t just about the technologies, HTML, HTTP, it was about the web of people actually, it was about interconnectivity, and science in the sense of study of it. And nowadays, I say we’re studying, you know, our lives online, basically finding ways to do that.

For web science, the other thing that was so important was the timing. 2005 was when we did our real thinking about this. And we thought about the name. And we actually launched it on the world in 2006. That’s when we did the press release from MIT, we had the piece in Science. And the amazing thing was that Facebook didn’t start till 2004, and Twitter hadn’t started then. So we were doing all this thinking before the social media networks existed, but we could see that they were coming and that the big issues for the future were going to be issues like privacy and security and trust. I remember writing them; they were like the three-term mantra we had, because we could see that they were going to be the big issues for the future, as this opened up to everybody. And the trouble is, everybody includes the good and the bad of us.

When you think about Vint Cerf and Bob Kahn, when they founded the internet, it was a league of gentlemen who all trusted each other, they were all friends. If somebody did something wrong, you would just tell them to stop doing it. But once the web opened up to the planet effectively, then you can’t stop people. It’s very hard to actually stop people doing bad things on it. We think now about what would we do if we started again now, what would we do differently? Because that was what made it work, was this openness and its accessibility and the fact that anybody could set up a web page and a server. That’s what Tim gave to the world. But that meant anybody, including people that want to steal and harm and do all the nasty things that exist in the real world, do them online at scale. And that’s what we’re living with today.

Noshir Contractor: You’ve touched upon the issue that the web can be used for good and bad. And I want to ask you, to what extent do you see the mission of web science to be focused on the cautionary tale of things that could go wrong, as compared to the opportunities that it creates for novel ways of organizing, for example. How do you reconcile these two aspects of the vision of web science?

Wendy Hall: Well, when we started, certainly, in my mind, it was definitely the former idea. It was, how could we forecast what will happen if we do this with the web, if we create this ability, if we develop this standard, if we allow people to put videos on the web, right?

When the web first started, you couldn’t get a picture or a video on it. It was a dream. And as those standards emerged, so that you could, you then of course have to think about what people will do with that. Now we know. Tim will often say he would have put more security protocols in the standards if he’d realized, you know, what people were going to do with his invention. So it was the idea that this is never going to be a predictable science, but you can forecast how people will behave and what bad things could happen. And I think of it like scenario planning, and you sort of ask, well, how can we make it shift? What can we do to make sure it goes more in the good direction than in the harmful direction? How can we mitigate against harm? And the problem with that, of course, is you’ve got to then observe what’s going on, you’ve got to have a way of observing and analyzing what has happened in order to look forward. And then if you are observing what people are doing, you potentially change what they’re doing, you know. So it’s quite a difficult science to evolve in that sense.

Noshir Contractor: You spoke about the invention of the Internet, and how it was different in some ways from when the web was invented. In your own work, you’ve spent a lot of time thinking about today’s fragmentation of the internet and of the web more generally. I’d like you to share a little bit about the concerns and issues that you see in terms of the fragmentation.

Wendy Hall: What has happened over the last 30 or so years is that the internet has evolved in different ways in different regions of the world. And the geopolitical nature of that is fragmenting the internet, and people often talk about the internet becoming bimodal between the US and China, the bifurcation of the internet. But actually, we think it’s more nuanced than that.

My colleague Kieron O’Hara and I, who have developed this idea, have just written a book about it. Think of it as four internets. So the first internet is the open, free, universal one that we think of as coming from Silicon Valley. All the big companies are there, and then you’ve got their mirror images in China. But actually, the different regions of the world act culturally differently.

So in the US, the internet is very market driven. The big companies there lobby Washington for the regulations and tax reliefs that will help them grow their companies and bring value to their shareholders, which is fair enough.

Europe has taken a very different attitude and put data protection first. We don’t have any of the big social media platforms; they’re all based in the States. So the civil libertarian views from Europe have moved in a data protection way. So it’s culminated in GDPR, the General Data Protection Regulation. And if you want to be on the internet in Europe, you have to abide by those rules. If other countries want to sell their digital services in Europe, they have to abide by GDPR. So that’s the sort of regulation-before-innovation idea.

And then of course, moving to the east, you’ve got China, with 1.4 billion people. From the very get-go, China realized the power of the communication medium that the internet was, and is. And so the basic rule in China is the government can look at anything. And so if you’re a company and you want to have a digital business in China, you have to abide by Chinese laws. And this is really beginning to fragment the internet.

And I’ll say two other things. We talk in the book about Russia being the spoiler. It’s not trying to create a new type of internet, it’s just trying to use the internet to interfere with other countries in various different ways. There are other regions that do that as well.

And then the other big point is that last year, 2019, we reached the 50/50 point on the internet, which means that 50% of the planet have access to the internet. That’s awesome. In 30 years, that’s happened. But it also means there’s 50% still to come. And that 50% is largely in rural China, rural India, and rural Africa. The way India goes in terms of internet governance is really important for the future. And Africa will probably go the way of China because of the Chinese investment in Africa.

And if you look at the population numbers, we can end up with a very small part of the internet that is run by democratic governments unless India sticks with the open and free type of access that we have. And I can say a lot more, but it’s that sort of geopolitical analysis of what’s going on. It’s really important. I think our message in the book is: keep the technical standards open. Because if that goes and people start to create alternative views of the internet, which means the web can’t run globally, freely, then, you know, all bets are off. And the key thing is nobody owns any of this. Right? The web and the internet are not owned; there’s no one company, no one government. It is us, they are ours. So we have, I think, as much duty to look after them as we do to look after the physical planet.

Noshir Contractor: Yes, I think what you point out is that having these common technical standards does provide a prerequisite for creating a public good that would be global. And yet, that may not be enough in this context, that in some cases, you can still see fragmentation based on geopolitical forces.

Wendy Hall: And it all comes down to how countries govern data.

Noshir Contractor: Given that 50% of the planet is still not on the internet, a lot of places that you referenced were what we might call the global south. Do you see that as creating a fifth internet? Or do you forecast, to use your term, that it is going to fold into one of the existing four internets?

Wendy Hall: Well, our forecast is it will fall into one of the existing four. But we do talk about a fifth internet in the book. But I’m not going to tell you; you have to buy the book.

Noshir Contractor: That’s a wonderful teaser. Wendy, in addition to playing a prominent role in research, you’ve also helped shape science and engineering policy and education. You co-chaired the UK government’s AI review, which was published in 2017. And the UK Government announced you as the first Skills Champion for AI in the UK.

One topic that is particularly near and dear to your heart is the role of women in computing, and more generally, in science and engineering. Can you talk a little bit about where you think that is headed?

Wendy Hall: (Sighs) It is a tale of two cities in a way. When I was young, you know, I was born in the 50s. And the world was very different. And no one in my family had been to university before. And the reason I would be expected to go to university would be to find a better husband, and get married and have kids. That was the expectation. My parents wanted more than that for me. But the expectation was, my future was marriage and kids. And we didn’t have the equality laws in the UK then. I was a mathematician originally, and for my very first job interview I went for a job as a lecturer in maths at a university in the UK that I won’t name here (it wasn’t Southampton). At the end of the interview, when they had decided who would get the job, the head of department said to me, “I’m afraid, Wendy, you didn’t get the job because you’re a woman.” And he told me that on the day. I was young; they didn’t think I’d be able to control classes of engineers and computer scientists. And anyway, the very next week, I got a job doing the same thing. But that was my first sort of realization that things were different for women. Now, they couldn’t do that today, but they might still think it.

Then when I went to Southampton, we realized in 1987 that we had three years of computer science undergraduates with no women on them at all. Women had just turned away, and there had been women before. This was the time of the personal computers, the Spectrums and the BBC Bs and the Commodore PETs, and suddenly computers had become, overnight, toys for the boys. That really switched a whole generation of women off almost overnight. In the West, we have never really recovered from that situation. Countries that came later to the world of computers, like your home country, India, didn’t have quite that issue. And so if I go to a computer science class in India, more than 50% of the students will be women, right? So it isn’t genetic, it is deeply cultural.

I’ve tried all my life to turn that round and get girls interested in computing, and I’m really failing quite miserably, because the stats are just so bad. But the world around us has changed dramatically, of course, and, you know, women now aspire to much, much more. I think they’re still under pressure; they still have this problem of “you can do everything.” And so you try and be a mother and a career woman. But, you know, it is possible.

The culture of computing in some parts of the industry is just toxic for women. In particular, Silicon Valley is well known for really, really being toxic for women. It’s so sad for me that we still have this problem. I meet more and more women. So in the world of AI, in some ways, this gets even worse, because when you take a master’s degree in machine learning, you can’t really take those degrees unless you have a computer science or maths undergraduate degree. So you’ve already got a much, much shorter pipeline for those. So you’re going to increase the stereotyping.

And we were so worried about this when we wrote the AI review, about how we would get more women coming into AI. But I do see lots of women involved in AI in the areas of ethics, thinking about how AI is going to be used in society; that is attracting a much more diverse pool of talent. And so I think we need to capture that. That’s what we’ve been trying to do with the skills program in the UK. There will be lots of new jobs that are not to do with programming, but to do with auditing AI, looking at bias in AI, and the design of AI to make sure it’s for the good, not going to be harmful to people or get the wrong results or, you know, be biased.

And I always make sure diversity is firmly in that ethical framework. My argument is, if your workforce is not diverse, and I mean diversity in its broadest sense here: gender, race, culture, age, disability, and everything you can think of in that broadest range, then there’s more chance that your AI systems are going to be unethical or biased in some way. So my mantra is, if it’s not diverse, then it’s not ethical.

I still want to get more women feeding into the pipeline. I want to get more women interested at school, so they do the qualifications and want to study computer science, but at least we see more women in the workforce.

Noshir Contractor: I think that’s a very fair point. I think there was a temptation in the past to see the need for diversity and the need for excellence as things to be balanced against each other. And what you’re pointing out is that in fact the two are not opposed to each other, that they are actually symbiotically related.

You have been such a wonderful role model in all of these respects, and so we salute you for that. You’ve given us a really good story about how the web got started, and how web science in particular got started. Based on your perspective, what do you consider as some of the most significant issues that need to be addressed by web science moving forward?

Wendy Hall: Well, I will have to answer that in terms of data. As you know, I’ve been passionate about the idea of building observatories for web science, using the term in the way that physicists observe the stars and the planets and use all that data to, you know, work out where we came from and where we’re going.

And part of that also is how you visualize, how you analyze the data that’s in there. But for me, the really difficult thing is getting the evidence we need. 

And to me, it was all about how we can share the data that we collect, because it takes so much effort to collect that data. And then when the person who’s collected it retires, or leaves or moves to another job, it just all evaporates.

And we need some way of being able to share data with other people in ways that are legal and ethical, where, you know, people are not abusing it, you get cited if your data is used, or you get some money for it if people make money out of it. This is important for companies, but it’s important for research scientists too. And we’re still struggling to find a way to do that. The hardest thing is actually curating the data and making it available. And then you’ve got the issue of, well, what if it’s, you know, data about people, or data that’s confidential to companies? So I think our biggest challenge is how, as a community, we can crack that one.

Noshir Contractor: That’s a really important issue. You also spent some time working with the Library of Congress on some of these issues, haven’t you?

Wendy Hall: Can we tell the Twitter story? I mean, one of the reasons I went there was we all knew that the Library of Congress was getting the Twitter feed and all the data. There was a server down in the basement of the Library of Congress, which was getting a Twitter feed every day. When that deal was done, Twitter wasn’t the company that it is today. And that data represents how the company is doing, so it’s very confidential. And of course, even though Twitter is open, you know, you tweet to the world, Twitter allows people to delete. So, you know, there’s very confidential information in there. So they were collecting all this stuff and had nobody using it. And so they turned it off as a project. And I understand why. But my worry is, who is the custodian of all this data when, in 100, 200 years’ time, people want to know, what were we saying on Twitter? What was on Facebook? Well, this is the record of our society. Right? And we are collecting snapshots of it in the libraries. They have, you know, the British Library, the Library of Congress, they have Web Archiving projects, but it’s snapshots. And the Internet Archive takes what it can, but they’re snapshots. And I don’t know if the companies are storing it for the future historians. I think that’s another challenge for us as a community: how do we retain our memories, our digital memories?

Noshir Contractor: In fact, in some cases, I believe that there’s regulation that does not allow the company to hold on to data beyond a certain number of years. I wanted to ask you, as we consider this moment of social reckoning that we are experiencing alongside the pandemic, what are the one or maybe two significant things that would have been different, for better or for worse, if we were going through this period without the web?

Wendy Hall: Well, can you imagine, I say this to people, if the pandemic had happened anytime before 2000, 2010? We would not have been able to deal with it as we have. Not only has it kept our communities going, it’s enabled us to see friends and family and talk to them, to actually have lockdowns in a way that saves lives, and to keep our spirits up and to communicate. Without it, life would have been really difficult.

And also the international work to share how to deal with the virus, right? Work about vaccines and antibodies, and what treatments to use and when to lock down and when to ease up. So we’ve rediscovered our love of the web and the internet in COVID: the TikTok videos and the Zoom cocktail evenings. It will change our lives, it will change our world. It has taught us we don’t have to travel halfway around the world to go to a conference to give a single paper. I want to get back on an airplane. I’m sure you do too. But, you know, we’re beginning to understand that there is a world other than jetting around all the time. And before the pandemic, in the West certainly, we were worried about the harmful things that were happening on the web and the internet. We were worried about how to deal with that. We still are, but as I say, we have learned to love it again. We’ve remembered why it was invented in the first place. And I think that’s hugely important.

Noshir Contractor: Well, thank you very much again, Wendy, for taking time to take us through a journey of Web Science from where it started to where it’s headed. It was really a pleasure to speak with you. And again, thank you for all your efforts and leadership in terms of developing this field, but also in terms of the work you’ve done in related areas of policy and education. Thank you again.

Wendy Hall: Thank you Nosh, thank you for doing this series.

Noshir Contractor: Untangling the Web is a production of the Web Science Trust. This episode was edited by Molly Lubbers. I am Noshir Contractor. You can find out more about our conversation today in the show notes. Thanks for listening.