Plagiarism of code by other PhD student

This is a tricky one. It probably sounds like academic plagiarism, but your licensing concerns are probably not going to resolve the problem at the core of this. There are two sides to this: the academic (plagiarism) side and the commercial (licensing) side. They're almost entirely separate, so I'm going to break them up.


Academic/Plagiarism Claim

The code in the repo is over 97% direct copy of code that I had produced while working in my previous employment...

Without attribution, this is plagiarism, and may be grounds to have the PhD rescinded, but this would be a serious procedure as it's likely to have life-altering effects on the PhD student in question. You'd need to be 100% sure of what you're doing and of the validity of your claims going into this. Even if the case were black and white (which I don't think this one necessarily is), the student's university isn't going to take rescinding a PhD lightly, as it reflects badly on them.

Additionally, it sounds like there is some attribution within the work:

The indirect mention of my name is as a maintainer not the sole developer

...which might well be sufficient for the University to write this off as a referencing error, maybe requiring the student to make a small addendum to their thesis.

The code is just part of the story, since the person also claimed that in addition to writing the code they used the code to produce the results in that chapter. They didn't - since I know that I ran the batches in question (this was around 2 years of work).

If you can prove it, the experiments that they claim to have run, but that you performed while employed by their department, are probably your strongest leg to stand on here. But you'd need good evidence, and you'd need to be able to show that they haven't run the experiments themselves. If your results were not previously published, and they've generated the data themselves using your code, this claim may well be moot.


Commercial/Licensing Claim

The licensing perspective of this is completely separate from the plagiarism side. If the work was published as GPL, they can basically do what they like with it, without citing you in the academic sense, provided that any code based upon it remains GPL and keeps the existing license and copyright notices intact. This bit is important, as it's probably where you have a leg to stand on, based on this comment:

although the person has also removed the GPL from the repo, which I thought was against the license terms

Removing the license is absolutely in breach of the terms. (The requirement that derived code stay under the GPL is also why many industrial entities won't touch GPL code with a bargepole, a problem commonly called "license bleed".)

Based on this, you might have a valid claim to get their repo pulled, but that's not going to solve your actual problem.

Again, from the comments:

...I feel that the supervisor (who used to be my boss)...

I understand this to mean that the PhD student in question is supervised by your old boss, which means you had a contract with their institution. Depending on that contract, the institution could own all the rights to the code regardless, in which case releasing it under the GPL without their prior agreement would itself have been a breach of your contract. This wouldn't affect the legitimacy of your plagiarism complaint, as that's a genuine academic concern, but it might affect how the department deals with your request.


You want to accuse a peer of plagiarism on the basis of the following (emphasis added):

I came across a recent PhD thesis that contained a reference to a github account, which contained code I had written...There is no direct reference to my own Github repo which contains the original code (with GNU public license) and no acknowledgment of my authorship (I am mentioned indirectly as a maintainer of the code elsewhere in the thesis).

You've stated that the accused has acknowledged you and, as stated in a comment, "GPL(v2,v3) does not require attribution," so the accused was not required to reference your GitHub repository from their own.

This doesn't seem like plagiarism.

Nonetheless, as noted by Abion47, I appreciate that the OP feels they have been wronged and wants to understand what happened. This could perhaps be achieved with a little digging, e.g., by emailing the accused and asking questions, or by sitting down with the accused. For such a strategy to work, the OP must enter the dialogue without any presumption of guilt: listen to the accused and hear their story.


Response to comments by the OP:

The indirect mention of my name is as a maintainer not the sole developer

This seems like a minor quibble over the accused's word choice.

Please note my emphasis on sole developer of the original code

The accused has not claimed to be the developer of the code (at least, that's not mentioned in the original question).

The mention is in a different chapter and is not in the Github repo

If the mention is in an earlier chapter, then that surely suffices (the code has been attributed to you, the owner). If it comes later, then it should have come earlier, but that's easily explained away (e.g., by the order of chapters having changed). Regarding GitHub, we've established that you didn't require a mention.


Response to comments regarding maintainer vs. developer:

I'm shocked that this answer is [highly rated]. Being mentioned as a "maintainer" is nowhere near the same thing as being the sole developer. We can talk about technicalities all day, but the other student is clearly being deceptive.

and

I agree with the others in the comments here complaining about it--this person is certainly being dishonest by referring to the actual author as the "maintainer".

Wikipedia offers the following definitions:

  • A software developer is a person concerned with facets of the software development process, including the research, design, programming, and testing of computer software.

  • A software maintainer...is usually one or more people who build source code into a binary package for distribution, commit patches, or organize code in a source repository

I appreciate that "software developer" is the more appropriate term. However, the accused's first language might not be English, and the accused (presumably) isn't an expert in software engineering (they work in clinical sequencing).

I really do not think that using maintainer as opposed to developer is a big deal. I certainly would not make a plagiarism case on the basis of a misused term.


Talk to your advisor.

Talk to the other student's advisor, with the support of your own. Or even have your own advisor make the complaint to the other.

Complain to GitHub.

But, most important, make sure that your own advisor will agree that this other, seemingly prior, work doesn't prejudice your own degree.

As to publishing, I'm pretty sure that the code supports your work, rather than being the essence of your work. If that is the case, as is normal, then the issue of plagiarism shouldn't affect your own ability to publish your own results.
