Episode 4 Transcript | Untangling the Web

Jen Golbeck: You know, I kind of jokingly said to someone at some point that I want to be the world’s expert on dogs on the internet. And I might be at this point, or at least up there with kind of pets on social networks.

Noshir Contractor: That was Jen Golbeck. Jen is not only an expert on internet pets, but a leading voice in web science. She’s a professor in the College of Information Studies at the University of Maryland at College Park. You may know her from her TEDx talks or podcasts about web science and pets. She has been a research fellow of the Web Science Research Initiative, and gave a keynote address at the 2017 ACM Web Science Conference.

Jen is also known for her work on computational social network analysis. Her models for computing trust between people in social networks were amongst the first in the field. And Jen’s also received a lot of attention for her work on computing personality traits and political preferences of individuals based upon their activity on online social networks.

Welcome, Jen.

Jen Golbeck: Thanks. I’m glad to be here.

Noshir Contractor: Jen, let’s start by learning about how you got interested in studying what you do now.

Jen Golbeck: I’m really lucky that the time of what was going on in technology and the time of my life intersected in a kind of fortuitous way. So the web came about, I think I was probably in middle school. And then when I was in high school, the early to mid 90s, I started designing web pages professionally, which you could do as a 15-year-old at that point. I did that throughout undergrad, and you know, through my master’s degree. Sometimes it was my entire income, sometimes it was a side income. But I was also on the path to get a PhD the whole time. And so when I came to the University of Maryland to get my PhD in 2001 in computer science, I met with Jim Hendler, who was my advisor. And I had actually started as an economics major at the University of Chicago, changed to computer science. But Chicago has the guys who did Freakonomics, you know, this behavioral stuff that wasn’t just, you know, markets and finance. And I loved that.

And so I was like, Jim, how can I take this sort of stuff about how people behave, and things emerge out of that, and then cross it with the web, which is something that I’ve just been immersed in, you know, since I was kind of a thinking pseudo-adult? I said, “Can we maybe do like a social network and put that on the web,” and he was like, I mean, “That sounds interesting. Go ahead and try it and see what happens.”

That was 2001. So pre-Facebook, Myspace was just kind of getting started. And I was like, alright, I’m gonna study social networks on the web. I built some, you know, I studied some of the early ones that were out there. And so it got me into doing research, right as the entire universe of the web shifted into this place where humans were creating tons of content, people were spending a lot of time. And so it was just kind of natural, then, to flow into web science, you know, working in a lab that was looking at knowledge representation and putting information online, and then pulling my own interest of people online and what they’re doing and how to merge that with AI.

Noshir Contractor: It looks like you had the right time, and the right people to be working with, in addition to having the right skills for doing all of that stuff. One of the things that I remember reading early on about your work was this work in the area of trust-based recommender systems, and you developed a platform called FilmTrust. Can you tell us a little bit about what you learned from that experience? And what do you think of the future of those kinds of recommender systems?

Jen Golbeck: That’s my dissertation work that you’re talking about. So I basically built my own social network because there was no data like this in existing social networks. And in there, you could go in and like, rate your favorite movies. Like you do now with Netflix or Amazon, whatever. And then you also could add friends like on any social network. But I added this system where you could rate how much do you trust this person to recommend a good movie to you, basically.

And the question was, we had recommender systems at the time, like we have now with Amazon and Netflix, that say, here are some movies that you might want to watch. Those generally worked by finding people with similar tastes to you and suggesting stuff they like, essentially. And so I was interested: Could we use trust that people express about their actual friends, and do a bunch of interesting AI with that and use that in place of similarity? So if I say I trust you about movies, even if it looks like we’re statistically different, can I maybe get some good information? And it turned out from that, that it does work. And it works really well in cases where I’m just very different than everyone else.

So an example I used to give all the time was, I’m a real film buff. I used to be a projectionist in a theater. I hated A Clockwork Orange, which is like a classic piece of cinema. I wish I had those hours back in my life. No one who is a film buff hates that movie, but I hated everything about it. And so recommender systems would see, okay, well, she loves all this classic cinema, of course, she’s gonna like that movie. And I’m like, any system that tells me to watch A Clockwork Orange is not one I want to use. Like, it doesn’t understand me. And trust is great at capturing those really extreme preferences on either end.

And so it was this really interesting lesson that our social relationships and our understanding of how we relate to people have a power that statistics alone don’t capture. But we can put those things together with AI and some statistical analysis and all this data on the web, to kind of get the best of both worlds. And that’s now something that in all these personalization algorithms, like you see on Facebook, sorting your timeline, like you see in, you know, a lot of recommender systems, they’re incorporating those elements of social relationships. That was one of the things that I first investigated in that dissertation work.
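
The idea Jen describes, using stated trust in place of statistical similarity, can be sketched in a few lines. This is a minimal illustration only, not the FilmTrust implementation: the function name, the data layout, the 1-to-10 trust scale, and the sample values are all assumptions made for the example, and it uses only direct trust ratings rather than inferring trust across the wider network.

```python
from typing import Dict, Optional

# Hypothetical data layout (an assumption for this sketch):
#   trust[user][friend]   -> how much `user` trusts `friend` about movies (1-10)
#   ratings[person][movie] -> that person's star rating for the movie
Trust = Dict[str, Dict[str, float]]
Ratings = Dict[str, Dict[str, float]]


def predict_rating(user: str, movie: str, trust: Trust, ratings: Ratings) -> Optional[float]:
    """Predict a rating as the trust-weighted average of friends' ratings.

    A similarity-based recommender would weight each neighbor by how closely
    their past ratings match the user's. Here the weight is the trust value
    the user stated explicitly.
    """
    weighted_sum = 0.0
    total_trust = 0.0
    for friend, trust_value in trust.get(user, {}).items():
        friend_rating = ratings.get(friend, {}).get(movie)
        if friend_rating is None or trust_value <= 0:
            continue  # friend has not rated the movie, or carries no weight
        weighted_sum += trust_value * friend_rating
        total_trust += trust_value
    if total_trust == 0:
        return None  # no trusted friend has rated this movie
    return weighted_sum / total_trust


# Made-up example: Jen trusts Alice highly and Bob only a little.
trust = {"jen": {"alice": 9, "bob": 1}}
ratings = {
    "alice": {"A Clockwork Orange": 1.0},
    "bob": {"A Clockwork Orange": 4.0},
}
print(predict_rating("jen", "A Clockwork Orange", trust, ratings))  # 1.3
```

Because the highly trusted friend disliked the film, the prediction stays low even if most viewers with statistically similar histories rated it highly, which is the kind of extreme preference Jen says trust captures well.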

Noshir Contractor: And that was very influential at the time. I remember looking at it, and people were beginning to understand whether trust-based recommender systems may be different, or augment purely algorithmic-based recommender systems.

Netflix, for the most part, is making its recommendations based only on its internal algorithms, certainly improved by the Netflix challenge. Is there a difference in the kinds of recommendations? Do you see some day when algorithmic recommendations like the ones that Netflix is making will just get so good that you don’t need to rely on trust-based recommender systems or social network based recommender systems? And at the end of the day, I guess, do you trust your ability to report accurately who you trust?

Jen Golbeck: That’s a really good question. So I think it depends on the domain, you know, Netflix, I don’t think they really need to use a lot of social network data. Because for movies, you can get a lot about people’s preferences with the genre, the actors, like all this really detailed information we have. The same thing goes for like music recommenders, and all these kind of streaming music services that will make a channel for you. You don’t need a lot of social data for that; it may help in little instances.

But there are a lot of cases like, what do you want to look at on your social network feed that are much more social, and not just news stories, right? But like, whose friends’ kids’ updates do you want to see, you know, if it’s the person that you’ve been friends with since elementary school, you may totally want to see that. If it’s, you know, some guy you met at a professional conference that you happen to friend on Facebook, you may not care at all about that. And your social network and your friends’ preferences can shape those sorts of personalizations in a way that I don’t think we’ll ever really capture with a purely statistical algorithmic recommender system. So I think depending on the context, the more social that context is, the more important it is to have social input to it.

Noshir Contractor: It also seems that in some situations, if you really trust someone, and they tell you something different than what you’ve seen previously, you might be more open to looking at it and considering it. While if a computer is basing its algorithm exactly on what you like, it may be less likely to provide you enough variety. One of the things that recommender systems have been criticized for is that they make you live more and more in an echo chamber. While in a social network, somebody that you trust might actually tell you something a little different that you may or may not like. If you don’t like it, you may not trust that person anymore, but you might like it as well.

Jen Golbeck: Yeah, it’s an interesting combination. I’ve had PhD students, I have one who just graduated a year ago who was in journalism, looking at, you know, how do we figure out the news that people trust? You know, are they more likely to believe conspiracy theories or fake news or kind of legitimate, real journalistic standard kind of news? And how does that relate to the people who are sharing it with them? And I think that’s really important.

If I have a really trustworthy source, who’s coming to me with something, that’s not what I might normally believe, it may make me more likely to consider that and understand that information. And so that’s one of those interesting ways where you could merge something like, here’s Facebook or Twitter with a really good model of what I’m going to like or click on or comment on, very good at that. Here’s stuff that’s gonna keep me engaged. But let’s broaden that out with a more diverse perspective.

And something that they can see is different than what I might normally click, but coming from someone I trust, that’s a way to sort of say, okay, like, let’s expand to this viewpoint, and maybe look at optimizing things like the social good, or how informed someone is or how much they’ve considered a breadth of perspectives. It’s something that we need to get more into in the research, but I think social connections will be really critical if we start expanding recommendations like that.

Noshir Contractor: And you continued to work beyond predicting which movies you might like, and you spent a lot of time looking at data from the web to understand individuals’ activities, attitudes, and behaviors. One in particular was your work on trying to predict the extent to which a person might be able to stay within the Alcoholics Anonymous program or not. Can you tell us a little bit more about what you found there?

Jen Golbeck: Yeah, so that project we originally had started wanting to look at DUI recidivism, so someone gets a DUI, how likely are they to get another DUI, and we really dug for data on social media about that, but not surprisingly, people aren’t posting a lot about their DUIs. But what we did find in the process of looking for that is a lot of people talking about their problems with alcohol, going to Alcoholics Anonymous, if they were drinking. And so we did this study where we basically looked for everyone who announced on Twitter that they were going to their first AA meeting. And then we followed what they tweeted after that, you know, after filtering it out for jokes, or whatever, people who legitimately had drinking problems. And after they said that we looked at, did they stay sober for 90 days? Or did they go back to drinking, and we made sure they said so. It could be, you know, two weeks later, they complained they were hungover at work, so we knew they were drinking. It could be six months later, they were celebrating their six months of sobriety, so we knew they’d made it those 90 days.

And then we just took all the data that we could model from their Twitter feeds, to try to see if we could predict that, you know, on the day you announced you’re going to AA, can we predict if you’ll be sober? And so we looked at things like: Who are the people that you follow on Twitter? And how much do they talk about booze? How much do they use words about alcohol? Are you over 21 or under 21? How do you cope with stress, kind of using other AI as input. And these are all things that addiction researchers might consider. And so we used that as an input to our model. And we can predict with astonishingly high accuracy, 80% accuracy, if someone is going to stay sober or not, on the day they decide to go into treatment.

Is it good or bad, I really struggle with it. We have not made that tool available to the public, because I can see a lot of dangerous ways for it to be used. But it is also explainable. So if you say I’m going to go to AA and my algorithm says I don’t think it’s going to work, it can tell you, this kind of therapy might be helpful or changing up your social circle might be helpful, which I think could be really useful. And it’s one of those things where I am impressed as a scientist with the computational power of what we can predict from this web data. I am also very concerned as someone who plays in the social science space about the implications of that algorithm. There are good and there are bad and I think we just need a lot more work in the kind of policy space, the regulation space before a tool like that is brought out to the world.

Noshir Contractor: And this is exactly why web science is trying to navigate this balance between what can be accomplished technologically and what should be accomplished, or how should it be accomplished, from a social standpoint or a policy standpoint. And your work is just a really excellent illustration of how one tries to navigate through that dilemma. One of the things that this work shows is that if it goes in the wrong hands, for example, I can imagine somebody who is being pulled over by a cop, for example, right? And then the cop could potentially be using this algorithm in different ways to determine what kind of response in that particular situation.

Jen Golbeck: One thing that we’ve seen with AI is that it’s used in sentencing guidelines now. And it’s used in ways that we know are already profoundly unfair. But you can imagine this algorithm being included in the decision about whether to send someone to jail for a DUI or to send them into treatment, you know, if the algorithm says treatment will work, they can go to treatment, if not, they can go to jail. But the algorithm is wrong 20% of the time. It’s also pessimistic, so when it’s wrong, it tends to say you won’t recover when you will. Who knows what other biases are in there? We can’t really tell, but there certainly are some. So yeah, it’s really worrying to think about people who may very well mean well and want to make the right decision using this technology. Because AI has this veneer of objectivity, right, it’s math, it’s totally objective, it can’t be racist, or sexist or biased. But of course, it totally is, it just reflects human society. And people who don’t understand that and the errors and the pitfalls, may try to use it in ways that just echo all the problems that we already have, that we’re kind of trying to fix with technology. And that, you know, is worrying not just in this case, but in all these applications of AI and web data together that get out into the world.

Noshir Contractor: Well, one of the things that you highlighted in your own work, is that when you have these predictions, if it goes into the wrong hands, for all the reasons you’ve been describing, could be something that could have unintended negative consequences. Do you believe that these predictions should always be given to the person involved?

Jen Golbeck: I mean, generally, I think they should always be made available. Right? If I ask, I should be told 100% of the time exactly what I’ve asked for, I think I have a right to know that. I may not have that legal right right now, especially in the US, but it’s something that I am working hard for us to get, I think it’s important. Will everyone want to know, you know, not necessarily. A lot of this is benign — personality traits, what are your political preferences, stuff we already know about ourselves. But this alcoholism example, as one, you may not want to know when you go into AA if an algorithm says it’s going to work, you know, if the algorithm says AA won’t work for you, when you’re going and you really want to solve your problem, it may be discouraging to the point that you are so fragile in that recovery that you decide not to continue. So you may say I don’t want to know what the algorithm says, because it may tell me something that won’t help. So I think they should have a right to know if they want to, but it doesn’t necessarily need to be automatically shared.

Noshir Contractor: I want to move us from social networking to social petworking. And I wanted you to tell us a little bit about the research that you’ve done on looking at social network sites for dogs versus cats. And why, it turns out, they do different things on these websites.

Jen Golbeck: I kind of jokingly said to someone at some point that I want to be the world’s expert on dogs on the internet, and I might be at this point, or at least up there with kind of pets on social networks. So I’ve always been fascinated by how people put their pets on social networks, as I’ve followed the development of these. And the work you’re talking about is some work that was looking at some of these early pet social networks, Dogster and Catster, there is a Hamsterster, where you could create a profile for your pet, make them friends with other pets just like you would do on Facebook or any other network.

And what we saw when we studied this is that people use them quite differently. Cat people tended to participate in these kind of community forum discussions, they would do these role playing kind of games and exercises from their cats’ perspective. So there would be these like, cat weddings were very common. People would pick their cats to get married, everyone would come like they’d send invitations, they’d talk through the reception and everything at a particular time. Dog people tended not to do that kind of thing. Now, this was kind of early days of social networking, you know, mid 2000s. And we’ve seen that kind of behavior shift on to things like Twitter and Instagram now, where I have very popular social media pages for my dogs. I don’t post in their voice, but some people do that.

It’s a really interesting way, where you can see like cat videos were the thing on social media for a long time. Dogs maybe have eclipsed that a little bit recently. But people interact in really different ways through that. One of the most popular cat social media accounts is, I think, “black metal cats.” And it’s like, death metal lyrics with pictures of cats. And they tend to kind of embrace that stereotype of cats, where dogs’ social media tends to be very wholesome and encouraging and supportive. It’s interesting that, you know, all the research bears out that dog and cat people tend to have different ways of approaching life. As that moves on to the internet, we’ve seen that consistently, that they tend to behave in different ways, which I think is kind of fun and wholesome, interesting kind of research in the space.

Noshir Contractor: One of the things that I found really interesting is your explanation as to why cat people are more likely to organize virtual playdates on the web, as compared to dog people. And you mentioned that one of the reasons that might be the case is that dog people take their dogs on walks. And that might make them more sociable than most people, who don’t take their cats on walks.

Jen Golbeck: I am doing some research on this topic right now, sort of the benefits of having a dog, and what a lot of research shows about the benefits of having a dog, not just online but offline, is that it absolutely makes you more social, because you’re out there walking the dog. Even if you don’t go to a dog park, you tend to encounter other dog people, you can talk about your dog, if you’re a dog person, you know, if there’s people you see regularly on your walks, you may not even know the humans’ names, but you know the dogs’ names, you recognize those people. So you get to have these social interactions around your dogs, that you really don’t get to have as much as a cat person, because you’re not out in the world encountering it. So virtual spaces provide an opportunity to socialize around those pets. And I think that’s one of the like, really good things that we’ve seen in general on the web, is that it’s created these spaces for people to connect socially, whether it’s around a rare disease that they have, or you know, a life struggle they’re going through, or their pets or their hobbies, where it was hard to do that in-person, because there just wasn’t a lot of density or opportunities. The web has created those spaces, and so even though it can look a little weird looking in on these online cat communities, I think it’s great that it provides those opportunities to socialize.

Noshir Contractor: And unfortunately, in many cases now, virtual playdates have become much more prevalent in general today because of the pandemic, in addition, of course, to the social and cultural movements that we are experiencing right now as we have this conversation. I want to close here by asking you if you could reflect on one thing about what we are experiencing in either of these two areas, and how it would be different, for better or worse, if we didn’t have the web today?

Jen Golbeck: So I tend to be the pessimist painting the picture of our dystopian future and warning people of the bad things that are gonna happen. I’m not going to do that here, I will give you a positive view of what’s going on right now. You know, you especially look at the Black Lives Matter movement, everything that’s going on with the protests against police brutality, and then also against the administration’s handling of COVID. Social media has been a really powerful place for that.

And I think an interesting way to think about these movements, not just right now, but going back to like Ferguson, looking at the Me Too movement, these social movements that have come, is that if we look in pre-web times, our media was very much controlled and gatekept. We had a few major networks, they decided what was going to be shown, those were the voices that we saw. And they tended to be white voices, and male voices. I was born basically, you know, a little before 1980. So I remember as a kid, constantly being irritated in a way that I couldn’t describe, at the way women were portrayed in commercials and on TV. It was not how I was, and it’s not how I was raised, but it was so frustrating that the dominant view of women was like, “Oh, we need to be helped.” And you know, “I’m sort of ditzy and whatever.” And, you know, I can only imagine the experiences that people of color had with that.

Social media and the web have given voices to every community that cannot be suppressed in the way that they were when there were these gatekeepers. And we have seen these movements. We saw Ferguson, which is something that I think would not have been covered by the media in the same way, if everyone was not on Twitter, when that was happening. We have the Me Too movement that gives people a voice to sort of challenge these large voices in ways that they couldn’t before that.

And I think if we look right now at what’s happening, especially with the introduction of mobile and everyone having access to these platforms, from their mobile devices, posting videos, posting pictures, challenging what we would always see the police say. You know, there was a video that came out of police arresting, taking a teenager off the streets of New York City, throwing her into an unmarked van and driving her away. There’s tons of videos of this, and the police say she was wanted on a bunch of other charges. And then the police were attacked with rocks and bottles. The video shows that they were not attacked with rocks and bottles. None of that was happening. And before social media, we would have gone “Well, the police said this. So that’s probably what actually happened.” And now everyone has the power to challenge these dominant voices.

I think that shift of power away from institutions and people that have traditionally had it is very uncomfortable for the people and institutions that have traditionally had it. But it’s incredibly powerful in shaping and pushing for the change that we have desperately needed for a long time. So I think the web has facilitated that shift of power in a way that is so good for society. Even though we’re very disrupted right now, for lots of reasons. I think the web is playing a net good role in that. And it’s, it’s one of the, you know, most powerful influences that we’ve seen in the last 20 years. And I think it’s going to continue playing that role.

Noshir Contractor: I’m glad to hear that you are so optimistic about these things. And frankly, I’m optimistic knowing that scholars like you are leading and pushing the frontier in the area of Web Science. As you mentioned, I have been following your work since your dissertation days and have been really impressed with the ways in which you’ve been doing high-quality work, socially responsible web science, and being able to translate it well. And I definitely recommend that our listeners follow you on Twitter, on one of your many Twitter accounts, as well as listen to your talks on TEDx because they are really, really compelling. Thank you so much again, Jen, for taking time to talk with us.

Jen Golbeck: Thank you. It was a real pleasure as always.

Noshir Contractor: Untangling the Web is a production of the Web Science Trust. This episode was edited by Molly Lubbers. I am Noshir Contractor. You can find out more about our conversation today in the show notes. Thanks for listening.