Beki Grinter

Archive for the ‘computer science’ Category

The Right to be Forgotten and the Right to be Equal

In computer science, empirical, European Union, social media on July 16, 2014 at 4:48 am

I’ve said this before: the Internet can be a mean, misogynistic place. Could the Right to be Forgotten help with this?

The Right to be Forgotten is an EU ruling that gives people the means to ask search engine companies to remove data from their searches if it is irrelevant. It’s sparked a lot of controversy as well as questions.

The controversy could be characterized as pitting freedom of expression and information against individual privacy rights. Additionally, people have argued that it creates an unfair burden on intermediaries such as Google.

While I am open to these arguments, I find myself thinking about how freedom of expression and misogyny interact. Some of the things that are written about women on the Internet are vile, abusive, full of bile and hatred. Freedom of expression has always had limitations: libel (making false and damaging statements) and obscenity. Freedom of expression on the Internet seems never to have had these limitations, and so obscene, libelous statements directed at women exist in perpetuity on the Internet. Perhaps some might argue that it’s the responsibility of the person they are targeted at to take it up through the courts. But how, when the authors of these remarks are hidden? Which makes me think there is a role for corporations. Or at least a responsibility.

Some advocates for the right to be forgotten have argued that it reflects a social value of forgiveness. We all have the right to make mistakes and then over time have those mistakes disappear into a forgotten history. I agree.

But what I am asking and suggesting here is that the Right to be Forgotten may be a means to finally have an Internet that is fair to all. For a long time visions of the Internet have championed it as a platform welcoming anyone and everyone. The right to be forgotten may have a role in actually ensuring that it welcomes minorities by proving once and for all that it will not tolerate discrimination.

538, the World Cup, and Facebook: Telling Stories about Data

In computer science, discipline, empirical, research, social media on July 15, 2014 at 6:49 am

As many of you already know, I’ve been following the World Cup. My team, Germany, won. Watching the World Cup has always involved reading news reports and commentary about the matches. This year I decided to include 538 in my reading.

538 is Nate Silver’s website. Nate Silver became famous predicting US elections. He is a master of analyzing big data to make predictions. It works well for elections. But it doesn’t work so well for the World Cup, at least not for me. First, the site predicted Brazil to win for a long time.

But it’s not just that 538 did not accurately predict the winners. I think that 538 misses the point of a World Cup. Crunching data about the teams doesn’t tell the whole story. And the World Cup is stories. Many stories. As a fan you learn the stories of your team and its history. You might start with world history (this is very salient as a Germany fan). England versus Argentina similarly (1984). It also involves stories about the teams’ previous encounters. Germany versus Argentina has happened before, even in Finals. And those stories are recounted, and reflected on, in the build-up to a game. You might tell stories about strategy. Certainly the Germans have been telling those, about a decade-long commitment to raising German players. How you structure a league to encourage more domestic players that can also play for the national side. How you balance the demands of a national league and a national team.

In a nutshell, context matters. These stories of world politics, former World Cups, and the arc of time turn statistics about the players into something richer. 538 tells none of those stories. And I suppose that’s exactly what it wants to be, a “science” of the World Cup. But my World Cup isn’t statistics, it’s larger, more discursive and has a multi-decade narrative arc.

Reflecting on this caused me to revisit the Facebook study. Yes, that Facebook study. The study reported data. But it was data about people. At the same time, I think some of the response could be interpreted as people feeling that there was more to the story than just statistical reporting of the outcomes. Is it a similar type of human dimension, an infusion of humanity? This is the question I’ve kept wondering about since reflecting on the problems of both of these data-driven reports. 538 reduces football to data. In so doing it loses the human dimension. The Facebook study started as data and the public raised human concerns and considerations. If I have a takeaway, it is that fields like social computing, or any data science of humans, need to pay serious attention to the stories that we tell about people. How we frame or potentially reduce people is something that the public will care about, for it is their humanity, their stories that we seek to tell.

That Facebook Study

In academia, computer science, discipline, empirical, European Union, research, social media on July 8, 2014 at 8:07 am

Following Michael Bernstein’s suggestion that Social Computing researchers join the conversation.

Facebook and colleagues at Cornell and the University of California, San Francisco published a study in which it was revealed that ~600,000 people had their Newsfeed curated to see either positive or negative posts. The goal was to see how seeing happy or sad posts influenced the users. Unless you’ve been without Internet connectivity, you have likely heard about the uproar it’s generated.

Much has been said; Michael links to a list and some more essays that he’s found. Some people have expressed concerns about the role that corporations play in shaping our views of the world (via their online curation of it). Of course they do that every day, but this study focused attention on that curation process by telling us, at least for a week, how it was done for the subjects of the study. Others have expressed concern about the ethics of this study.

What do I think?

I’ve been dwelling on the ethical concerns. It helps that I’m teaching a course on Ethics and Computing. And that I’m doing it in Oxford, England. So I’m going to start from here.

First, this study has caused me to reflect on the peculiar situation that exists in the United States with regard to ethical review of science, and the lack of protection for individuals who participate in it.

In the United States, only institutions that take Federal Government research dollars are required to have Institutional Review Boards (IRBs). The purpose of an IRB is to review any study involving human subjects to ensure that it meets certain ethical standards. The IRB process has its origin in the appalling abuses conducted in the name of science like the Tuskegee Experiment. Facebook does not take Federal research money, and is therefore not required to have an IRB. The institutions by which research gets published are also not required to perform ethical reviews of work that they receive.

I find myself asking whether individuals who participate in a research study, irrespective of who funds that work, have the right to be protected. Currently there’s an inconsistency: in some research the answer is yes, and in others it is no. It seems very peculiar to me that who funds the work determines whether the research is subject to ethical review and whether the people who participate have protection.

Second, most of the responses I’ve read have been framed in American terms. But social computing, including this study, aspires to be a global science. What I mean is that nowhere did I read that these results only apply to a particular group of people from a particular place. And with the implication of being global comes a deeper and broader responsibility: to respect the values of the citizens that it touches in its research.

The focus on the IRB is uniquely American. Meanwhile I am in Europe. I’ve been learning more about European privacy laws, and my understanding is that they provide a broader protection for individuals (for example, not distinguishing based on who pays for the research), and also place a greater burden on those who collect data about people to inform them, and to explicitly seek consent in many cases. I interpret these laws as reflecting the values that the 505 million European Union citizens have about their rights.

I’ve not been able to tell whether European citizens were a part of the 600,000 people in the study. The PNAS report said that it was focused on English speakers, which perhaps explains why the UK was the first country to launch an inquiry. If European citizens were involved we might get more insight into how the EU and its member nations view ethical conduct in research. If they were not, there is still some possibility that we will learn more about what the EU means when it asks “data controllers” (i.e. those collecting, holding, and manipulating data about individuals) to be transparent in their processes.

I’ve read a number of pieces that express concern about what it means to ask people to consent to a research study. Will we lose enough people that we can’t study network effects? How do we embed it into systems? These are really good questions. But, at the same time, I don’t think we can or should ignore citizens’ rights, and this will mean being knowledgeable about systems that do not just begin and end with the IRB. It’s not just because it’s the law, but because without it I think we demonstrate a lack of respect for others’ values. And I often think that’s quite the point of an ethical review, to get beyond our own perspective and think about those we are studying.

Lack of a Critical Education: An Explanation for IT problems?

In academia, computer science, discipline on April 7, 2014 at 9:41 am

A New York Times article about sexism in the tech industry has been making the rounds on Facebook. One explanation that some of my friends have used to address why such rampant and explicit misogyny exists is the lack of education. Not engineering/computing education, but a well-rounded one in which people would come to understand why it’s inappropriate and why having a diverse workforce actually matters.

I was making the same argument the other day about a different topic. When Snowden, Assange, and Manning decided to leak intelligence secrets, all of them claimed they had done so because to do otherwise would be ethically wrong. I/You/the NSA may disagree, but they all agree that they had a moral/ethical/civic duty to do so. As I said to a colleague, what drives this moral/ethical/civic sensibility? I shared the thought with my colleague that perhaps a lack of a well-rounded education might play a role here.

For decades we’ve shortchanged all education. It cost us too much. Further, we’ve long prioritized the sciences over the social sciences and the humanities. (We now find it alarming that Congress ridicules the sciences, but as another colleague of mine pointed out, that’s how many/some in the sciences have long treated the social sciences/humanities.) But it is just these maligned disciplines that would have gone some way to create the critical thinkers that seem to have vanished from the tech sector. And now we have an industry that’s unabashed in its misogyny. We have “rogue” technologists who now have the power to decide when to leak secrets, and who decide to do so based on moral principles that, at least to some, are questionable. I wonder whether we did it to ourselves and if there is worse to come.

p.s. if you want to be even more depressed here’s a timeline of sexist incidents (thanks to the friend of another colleague) in the Tech Sector.

A Future for Academia Driven by Metrics

In academia, academic management, computer science, discipline on January 2, 2014 at 5:08 pm

By now, anyone who knows me knows that I am a *huge* fan of metrics. Particularly when they are used uncritically. So perhaps it was inevitable that I would end up in an environment where metrics play an increasingly ubiquitous role: academia.

I want to introduce three metrics.

Student credit hours: a number that measures by class/faculty the number of students a person has taught. You will have a larger number if you teach larger classes. It’s also the number that is at the beginning of a formula that computes the portion of the Institute’s state budget (and presumably how that is divided, although that part of the budgeting process is a complete mystery to me). Higher is better, and in fairness I can imagine that larger classes can create their own organizational structures that need managing and more potential problem cases.

What’s missing in this metric are some other fundamentals about class.

  1. Smaller might be better for the student experience including but not limited to mentoring, one-on-one time with individuals, managing different learning styles… and that this might be exactly what distinguishes a University education at a bricks and mortar institution from an online experience.
  2. Class preparation time: do classes with more students involve more course preparation time? I taught a class recently that was about 1000 pages of reading for 12 people, but it would have still been 1000 pages if it had been 120 people.
  3. The lack of institutional support for say, grading, that larger classes receive.

Research expenditure. This metric measures the amount of money that the Institute receives when a faculty member spends their grant. Again, bigger is better. But this metric assumes that all research costs the same. Not all research costs the same amount to achieve, and funding agencies know that. It does not account for how much it costs to do research.

H-index. I’ve already written about this.

Imagine my joy when someone suggested that we plot all three against each other for an individual. What would that mean? Someone with a larger class, in an area of research that was more expensive to do, and with a high index does well. So, should we optimize (which is the purpose of metrics, to drive behaviour) for large classes at the expense of giving students the opportunities that come from small ones? Should we optimize for expensive and popular research, and ignore the intellectual, social and political good that might come from less expensive research areas? Should we give even more legitimacy to the papers of an h-index and not ask about the papers that were potentially unpopular but changed a person’s thinking, deepened their intellect…?

Needless to say this epitomizes all that worries me about metrics. The desire to rank and compare, and use numbers to support that is to think uncritically. Sadly, it’s all too common in academia.


In computer science on September 27, 2013 at 10:27 am

I’m just finishing up Tubes. The book’s title reflects Ted Stevens’s explanation of the Internet as being a series of Tubes. The author, Andrew Blum, decides to find the physical Internet. Where does it exist? The book comprises that journey, to places on the East and West Coasts of the U.S. and in Germany, where the network is visible. As he puts it:

“—the disconnection that comes as a consequence of connections, as if in a zero-sum game. And yet that isn’t the only truth about the network—especially not in Silicon Valley. Undergirding our ability to be everywhere is a more permanent thicket of connections, both social and technical. We can only talk about being connected as a state of mind, because we take the physical connection that allow it as a given.”

He also attends a NANOG meeting (NANOG, North American Network Operators Group) and describes the process of network peering, the social/business relationships and process by which different Internetworks are physically connected together to create more Internetwork paths.

Prior to Tubes, I’d read Underground, Overground, a history of the London Underground and its curious construction. I was struck by some of the parallels, how decisions about physical entities have reach far beyond their physical form. How the physicality of even the Underground disappears when it works. (The book taught me to look a lot more closely at things in various stations, to see pieces of the infrastructure previously hidden to me, and reminded me of how I have a map of London that is partially shaped by the Tube: there are places I can only navigate without a map below ground, through the Tube, that I cannot map to the surface.)

Which is of course the way I experience the Internet. I largely do not see it. Indeed, the only pictures of the Internet in the U.S. that I can recall are the historical ones that show the first few nodes and its initial development e.g. here. In Tubes, Blum says that there are more contemporary “maps” available for sale at $5.5K!

But I can think of one place where the Internet has a more physically mapped presence, on the African continent. I see, for one reason or another, these images of the connections coming onshore in various African nations every other month, and this book makes me wonder why I see it more. Perhaps it’s because the infrastructure is not as dense and is therefore easier to map. Perhaps it’s because making the map is an argument for continuing to develop the infrastructure out and is therefore worth mapping. I don’t know. I did now think about how this map says nothing about a within-continent set of networked connections. It’s just the ones at the edge and that come from other continents. I wonder what that tells me, and what the invisibility of the rest hides.

Photocopier: Physical, Digital, Organizational and a Craft too!

In academia, computer science, discipline, empirical on June 13, 2013 at 12:55 pm

A couple of days ago I finally learnt the username/password combination and the network name for the third floor mopier (scanner, photocopier, printer). Perhaps it’s because I worked at Xerox for some years, but it always frustrates me when there’s a device I can’t print or photocopy on. This one took me some time to figure out how to operate for a variety of reasons.

I stood next to it several times. Nothing about its physical self revealed its digital self to me. Sometimes you can get a printer to print out its network configuration. But this machine did not allow you to touch any buttons without being logged in first. And so standing there next to it in the physical world changed nothing about my ability to print to it in the digital world. I was having the reverse experience of the one in which your computer “discovers” a printer but you can’t discover it in the physical world. (Vague embarrassment recalled: I once spent some time printing to a machine which I thought was just outside my office. It said Gutenberg on the front of the machine, which I read as its network name, but that’s actually the name of a machine located in a different building on campus. Luckily I printed out an email, so the person receiving the printout was able to email me to let me know that I was mistaken about the name of that machine.)

The key to discovering its online name was to find out what username/password combination worked. Who should I ask? The printer’s physical existence is in a space that I don’t understand organizationally. Does it belong to the School of Interactive Computing? Does it belong to the School of Language, Media and Culture? Does it belong to IMTC? Not clear to me because the physical location (which for many other parts of the third floor I can easily read and interpret) was ambiguous. I wondered who to ask.

Quite by chance someone tells me what the username/password combination is, and I log on to the photocopier. I have some photocopying to do. I first learnt to photocopy in graduate school. Need several chapters of a book? No Google search facility back then (WAIS and Gopher if I recall correctly) that would likely yield a probably illegal copy of what you were looking for. No, it was off to the library and then over to the photocopy room. It was a time when people would say that part of learning to be a graduate student was learning how to photocopy, smiling, but acknowledging a truth about the importance of being able to master that skill.

The Department of Information and Computer Science at the University of California, Irvine had a dedicated staff member in the photocopy room. He took care of the several machines that were in the room (no doubt he did other things, but this was the primary place I encountered him). ICS had a system of usernames and passwords associated with individuals and caps. So the book chapter copying was always a dilemma of balancing the desire to have the reference material against the annual cap. That was until I got the username and password combination for a project that was very rich. DARPA funding meant that the project’s cap was infinite. Now, all that stood between me and the book was the ability to photocopy it. I taught myself a variety of useful skills, such as efficient double-sided, two-pages-per-side, shrink-to-fit copying. I prefer short edge binding over long edge. After a while I was able to size up a book and pretty much get the exact amount of shrinkage right first time.

Having mastered the art of photocopying, ICS provided further opportunities. For a while I was spiral binding most of my photocopies using the machine that cuts rectangular holes down one side of the photocopy stack and the other machine that inserts the spiral binding. I put front and back covers on some of my efforts. I still have one of those to this day, the photocopied proceedings of the first conference on Software Engineering held in Garmisch. And then there was the experiment with the glued binding. There were binders that had glue on the inside of the spine and a machine that would heat it up, you would then stick the paper to be bound in, let the glue run over them and then take the entire thing out of the machine. The trick with this machine was in heating but not overheating the glue. And I have to admit the machine made me nervous, I worried about the potential for fire. I’m actually not sure whether that was a valid concern, but I worried about it and consequently I decided to return to spiral binding even though glue bound photocopies made for a flush on shelf filing.

Of course, I couldn’t experiment while the staff member was there. I was not using my correct code. Perhaps I was photocopying more than I should. I had no idea whether graduate students were “allowed” to use these other machines. So most of these skills were developed in the small hours of the night. Walking home with my latest creation afterwards, I primarily feared the roving packs of raccoons that wandered around campus, generally annoyed by the presence of humans out during the time in which they occupied campus. Sometimes I hid from them so as not to provoke their ire. After all, I had something to read in hand.

I’m scanning a book chapter on the third floor mopier. I’m going to send it to myself so that I can read it on my iPad. I like reading academic papers and books on my iPad. I decide to add my email address to the list of frequently used emails so that I don’t have to type it all in each time I do this. I look through the list of emails already there, and now I’m even more curious about the organizational history of the machine. There are various addresses in there. Some are graduate students who have since graduated. I’m surprised to learn that this machine has been on the third floor for longer than I thought. But there are some addresses for people who’ve never worked proximate to this machine while it’s been in this location. I wonder why they are there. I wonder whether the machine lived somewhere else in a former life, proximate to those users. I’ve never really thought about reading an organizational history from a photocopier, but I see at least two departmental identities as well as some longevity of history represented in the collections of emails that make up the frequent users of the machine.

The TSRB 3rd floor photocopier is now something I can print to. But it’s given me far more than that: an opportunity to reflect on how this machine lives in the physical and digital worlds, a recollection of learning how to photocopy, and a sense of the institutional elements of the machine.

MOOC Participation: Diversity and Assumptions of Development

In computer science, discipline, empirical, research, social media on February 12, 2013 at 11:30 am

Continuing my series of posts about MOOCs. Today’s is about a type of open/development rhetoric I keep hearing associated with MOOCs. It’s well meant I am quite sure, but I’ve heard the following sentiment: MOOCs will allow anyone from any continent to access content. And that in turn leads to increased education, skills for all.

I have a number of problems with this argument.

Starting with the obvious, this sentiment makes important assumptions about access: that access to the Internet and its content is uniform across the world. But it’s not. The Internet is a very different experience if you have a smartphone as your only means of access, versus if you have a laptop. Behind the hardware, there are questions of corporate policies and pricing mechanisms that influence access. Bandwidth caps and bandwidth pricing can influence how people use their phones, and in many parts of the world also how they use the wired network.

Behind these crucial practical questions of access lurk other assumptions, which warrant questioning. Is the content we create relevant or useful for everyone? What assumptions do the producers of content make about, say, what has been previously taught? What assumptions are made about the types of hardware and software the students have access to? And most critically, what assumptions get made about why the person is taking the course and whether that content will ultimately be most useful?

Although it’s not used too much, I have heard the word “Africa” used to describe diversity. I do think it’s well meant, but it risks collapsing all of these questions into a stereotype of a person. Africa is not a person, nor is it a country; it’s a continent of great diversity in all senses. A person from Africa may well contribute to diversity in a MOOC setting, but so might a person from America.

Like others, I see this as being part of understanding the participation divide that shapes the Internet today. Some of that divide is the question of access, its costs, modalities, and so forth. But that’s not all that shapes the participation divide. When we overly simplify an entire continent we close down the question of what shapes participation in very problematic ways. If we are really committed to understanding how online education might help more people learn, the participation divide is precisely the question we ought to open up, to really take account of the highly diverse population of people that have some reach to the Internet. Because it’s only when we actually take diversity seriously that we have any shot at getting to something better than more education for the already well educated.

MOOC Diversity

In academia, computer science on February 5, 2013 at 8:43 am

My colleagues, perhaps like yours, are discussing MOOCs a lot. I’ve got my own set of reservations about them, but today I want to focus on a question. How diverse are the instructors of MOOCs and what implications does that have for increasing diversity in STEM fields?

Recently my colleague, Mark Guzdial, argued that we should do no harm via a MOOC. His point was simple, that MOOCs could reverse the decades of hard-won efforts to diversify Computer Science. I know from experience, every single time I teach Computer Science classes just how non-diverse Computing remains. I’ve been in the situation of doubling the number of women in the class more than once (especially when I have a female TA). It would be nice to get away from that.

And then I saw my colleague Tucker Balch’s demographics from his MOOC. Wow! Highly educated men dominated the people who completed his course. As Mark points out in his analysis of Tucker’s demographics, some of this is likely due to the nature of the course, particularly it being an elective (Computational Finance).

This led me to my question. I wonder what diversity is like on the other side, among the faculty who offer MOOCs? And I offer my story of how I stayed in STEM as an example of why I think it matters. My first Computer Science teacher was a woman. She watched out for me and the other one or two women in the class. I remember that she encouraged me, took time to talk to me beyond the content of the classroom… and so I stayed for two years in a classroom with over 25 14-16-year-old boys. (I think this deserves an OBE.)

This continued with my second teacher, a man. The class was very small, seven of us, and I was the only woman. The advantage of the small class was that we all got to know each other well, perhaps too well. Some boys of 16-18, well, boasting, talking about sex and women in ways that weren’t exactly flattering… My teacher recognized that this was hard for me, and spent time talking with me about why I should persist despite it. He taught me that developing trust, taking time with an individual student outside of the academic content, could be crucial to inspiring the type of trust that would lead to confidence.

I needed those two teachers. I needed them very much. They are without doubt the reason I am still here. Especially since while I had a couple of faculty at Leeds who really encouraged me (thank you), I didn’t find the part of the discipline that I was passionate about until I reached UC Irvine and my Ph.D.

Given the lack of women in academia, particularly in STEM, I wonder whether the pattern of male dominance repeats itself in who offers the MOOC, and I wonder what in turn that does to the student population. Perhaps some would say, offer a MOOC, redress it. But my route into the field was not about volume encounters, but about those that were very personal. It’s only maybe four people who made enough of a difference that I got through, but how can any person be that when they have 50,000 students? Also, how can you achieve these intimacies at a distance, across the network as opposed to face-to-face?

Diversity and Service

In academia, academic management, computer science, discipline, women on February 4, 2013 at 8:15 am

As I mentioned in a previous post, I recently read this article about the advantages of being married for male academics versus the disadvantages of being married for women academics. It’s left me with a lot of questions. And being inspired by Female Science Professor’s question “why don’t more senior women in STEM blog?” I want to continue.

In addition to teaching, research, and publishing responsibilities, service constitutes a major part of a professor’s career. … The gender breakdown within a department plays a significant role. Typically, there are more men than women within a discipline, and yet committees seek as much diversity as possible. Women, then, are often asked to do double the amount of service as men, a number that increases for women of color. While service is certainly considered when promoting, publications play a much larger role.

I understand the logic, to have a diversity of representation/voices at the table and so forth. But this is clearly the flip side of it, that women and minorities can get over-serviced. And since time is limited, service will eat into other important activities like research and teaching. This is a serious problem. But I don’t know what to do to change it. In the long-term we do need to recruit and retain women and minorities in STEM, but what do we do in the short-term? There seems to be a conflict here: we want to hear from diverse voices but in so doing we ask them to participate in things that compete for their precious research time.

One short-term piece of advice I would offer to anyone who fits this potential category, is to be very aggressive about saying no. Benchmark your service against a non-minority in your department at your rank. Do no more. (Read studies such as Link et al. “A time allocation study of university faculty” to see broad trends and uneven distributions as a reminder to do no more.)