An experience in reporting online plagiarism
Over the past couple of days, I've been catching up with a bit of research, concentrating predominantly on current support for WAI-ARIA amongst popular screen readers. After I'd exhausted what I felt to be the key resources (I may blog about these later) I decided to do something I think most of us end up doing, and that is to type appropriate terms into Google in order to identify any articles or papers I hadn't originally picked up on.
Scrolling through the list of search results, I spotted a link to an article that dealt with a particular aspect of WAI-ARIA I was interested in, on the web site of a web design agency I'd never heard of. (Before I go on, to protect the innocent - and the guilty - I have decided not to name any of the parties involved, nor the title/details of the article in question.) So, I clicked on the link, and began reading. Almost immediately, something didn't feel quite right.
The web site itself was reasonably "up to date" - different backgrounds for particular sections of the page, rolling images, megamenu, fat footer - if not spectacular. However, as I scanned through the article, I was experiencing a growing sense of déjà vu. I was convinced I knew the artwork, peppered through the article, from somewhere else. But it was the text itself that irritated me more. I found myself "completing" sentences in my mind, in the same way that you find yourself singing along to a song you haven't heard for years.
I'd read this article before, hadn't I? No, obviously not. It was the web design agency's blog, and the text at the top of the page read "Written by author". It was posted only a couple of weeks ago. So, it must be new, and I must have mistaken it for a similar article. Maybe they'd simply read a similar article online, and re-written it in their own words, as I used to do as a first year undergraduate (come on, we all did that? Didn't we...?). Not exactly a crime, then. So, giving them the benefit of the doubt, I read the article in full.
And then it appeared.
At the very bottom of the page, just above the footer, but below the references (ha!), was a link - "Article source: http://www...". Huh? "Article source"? I'm reading the article, surely there isn't a "source"? So, I clicked on the link. I recognised the resultant site straight away.
I had read this article before. It was published back in late 2010, which was around the time I had first read it. It was published on a very well-respected web site for web designers. I didn't recognise the author's name, but I remember the article being well structured, very well referenced, and obviously written by someone who knew what he was talking about. It contained details of a study the author and his team had carried out (the duplicated article still said "we", so someone scanning it may have been under the impression that it was the web design agency who had carried out the study). The article itself was obviously the product of many months' hard work. Yet, there was no mention of his name on the other site, nor that of the artist who had put together the rather beautiful illustrations.
So, I figured, there must be an innocent explanation. As an occasional developer, I come across vast amounts of plagiarism, in which someone blogs a particular JavaScript or jQuery technique for (say) a nifty lightbox effect, and suddenly the source code appears everywhere as wannabe coders try to show how clever they are, maybe changing one or two lines at most. Incidentally, if you're a developer and you want to find out if code has been stolen from elsewhere, take a look at the comments, particularly if they are to do with how the code is structured, or what particular routines mean. Is the author answering them? Are comments disabled? If the author is making an attempt at answering them, the replies should give an indication as to how much he or she knows about the code just posted. An author who is ignoring them might be "too busy to answer just now", but presumably should know the code well enough to spare the five minutes it requires to clarify or answer any queries, particularly having been willing to advertise it to the world in the first place.
Anyway, back on topic. This was a web design agency, providing a commercial service, not an individual programmer in his bedroom writing functions. Maybe there was a link between the site and the original publisher, and some reciprocal agreement between the two parties to publish each other's work. That said, I had my suspicions.
So, I decided to pose the following question on Twitter:
I'm reading a blog entry on a web design company's website which is directly lifted from elsewhere - should I inform the original author?
I received three responses, each of which suggested that, yes, I should. So, with a little concern for making a fool of myself, I got in touch with both the author and the publisher. I wasn't really quite sure whether I was doing the right thing - there had to have been a simple explanation, and I was going to make myself look an idiot. "Of course we know about it", they were going to reply, "You are an idiot for bringing it to my attention. I run both sites, and I am very busy. Go away!". So, it was with a little trepidation that I hit the "Send" button to both parties.
A couple of hours later, I received the following (edited) response from the original publisher (at the time of writing, I haven't heard from the author):
Thank you for contacting us about this. I have written to the site owner and expect them to remove the article until permission can be gained from [XXX]. Indeed [YYY] is in violation of copyright. They did not request permission from us. They do not have appropriate attribution, and they're not allowed under any circumstances to reproduce [ZZZ]'s artwork.
Woohoo! I had done my good deed for the day. I felt good. I then revisited the web site to find that this was only one of several articles they had stolen (yes, I'm going to use that word) from the original publisher, even going as far as including the original source within a submenu of the megamenu. The article was exactly the same, in its entirety - all of the images had been reproduced, as had the list of references. Bizarrely, the links to the article translations linked to translations on the original website. Maybe they didn't want to deal with the different language codes.
After I'd congratulated myself several times over (yeah, OK, I'll admit to doing so), I started to feel angry. This was not some first year undergraduate hoping to gain a few extra marks for his essay. This wasn't even a very amateur programmer passing off someone else's code as his own. This was a firm carrying out commercial web design work. To the uninitiated, they seemed to know a lot about WAI-ARIA. Hey, if I worked in a small business looking for a new web site, I think I could trust these guys - look, Mr Boss, they know about accessibility, let's give them a significant amount of our budget to build an accessible site. On the other hand, maybe the "author" was pressed for time - he knew there was the potential for a few big jobs to come in, and wanted to impress his clients, so he thought he would write something on one of these newfangled technologies, but just couldn't find the words (or didn't know them in the first place).
This is no excuse though. While I'm still relatively new to web accessibility, I've worked late nights reading up on the latest developments, playing around with NVDA and the Web Accessibility Toolbar into the wee hours, reading paper upon paper, article upon article and so on, so I can understand what it's like to navigate through a site under extraordinary circumstances. I've spent years catching up, to achieve a moderate level of understanding of the issues. It's been hard work.
Yet, I very rarely blog my thoughts. There are many other more experienced bloggers working in both the commercial and academic realms. I certainly never copy anyone else's work and pass it off as my own. When I do have the time and inclination to write something interesting, I don't copy and paste - instead, I use that tried and tested routine of linking to the original. There's nothing wrong with links. Linking to someone who knows more than you do doesn't show that you are ignorant, or that you cannot be trusted. Rather, choosing your links carefully (and assuming you actually read the target resources in detail!) shows that you are aware of what others are doing in the field. It says, "hey, I know about this stuff, but this guy over here writes about it in more detail". It's certainly much better than lifting the content in its entirety and passing it off as your own, which makes you and your firm look bad at best, and like clueless incompetents who are not to be trusted at worst.
So, in future, I've decided that I'll always report suspicious content. I would encourage you to do so as well.
UPDATE 28 February 2011: After a quick check this morning, it appears the copied articles have since been removed from the web design company's website. Hopefully that's the end of the matter - and a reason why it's always good to speak up!
UPDATE 23 February 2011: I've since heard from the author of the original piece, who was also completely unaware that his article had been used elsewhere.


