Is there an open-source license that enforces citations?

Before thinking about citation of software, consider the related situation for scientific articles. Have you ever read an article with a comment such as "If you use the results of our research, it is mandatory to cite this paper"? Probably not, because citing relevant material is already the established practice of working scientists. (To be clear: I do not claim that the citation process is always used well or fairly. But in general, this is how scientists are supposed to work.)

Because software has only relatively recently been recognized as an academic artifact (in comparison to books and articles), citation practice for software is not as well established as it could be.

Several initiatives have addressed this issue, proposing solutions that are considered fair and encouraging a citation practice for software and data that is as good as the one for articles.

  1. Smith et al., "Software citation principles", PeerJ Computer Science 2:e86, https://doi.org/10.7717/peerj-cs.86, collects the recommendations of the FORCE11 working group on software citation. It is aimed at both developers and users.
  2. "Encouraging citation of software – introducing CITATION files" is a blog post by Robin Wilson, written for the Software Sustainability Institute. It recommends including a "CITATION" file with the information needed to cite the software.
  3. The Journal of Open Source Software is a journal whose purpose is to publish concise articles about software. The papers have a DOI and can be cited, which enables you to include your software in the "traditional" publishing and citation practice of scientists.
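
As a rough illustration of the second point, here is a minimal Python sketch that writes a plain-text "CITATION" file into a project directory. The file layout, the function names (`render_citation`, `write_citation_file`), and all field names are my own assumptions for illustration, not a format prescribed by the blog post; a hand-written file works just as well.

```python
from pathlib import Path

# Hypothetical template: a human-readable request followed by a BibTeX entry.
# Doubled braces are literal braces in str.format().
CITATION_TEMPLATE = """\
If you use {name} in your research, please cite:

  {authors} ({year}). {title}. {venue}. {doi_url}

BibTeX entry:

@article{{{key},
  author  = {{{authors}}},
  title   = {{{title}}},
  journal = {{{venue}}},
  year    = {{{year}}},
  url     = {{{doi_url}}}
}}
"""


def render_citation(**fields) -> str:
    """Fill the template with the project's citation details."""
    return CITATION_TEMPLATE.format(**fields)


def write_citation_file(directory, **fields) -> Path:
    """Write the rendered text to <directory>/CITATION and return its path."""
    path = Path(directory) / "CITATION"
    path.write_text(render_citation(**fields), encoding="utf-8")
    return path
```

You would call `write_citation_file(".", name=..., key=..., authors=..., year=..., title=..., venue=..., doi_url=...)` once with your own details and commit the resulting file next to the README.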

There are probably other initiatives that I can't think of right now. In any case, there seems to be a consensus to stick to a well-known license recognized by the Open Source Initiative (OSI): https://opensource.org/licenses

To conclude, the context of your software also matters. If one of your aims is for your software to be re-used by others, either standalone or in combination with other tools, this influences the choice of license. Looking further, if you want your software to be distributed in larger packages or in Linux distributions, a well-established license is critical.


A license requiring some form of citation is certainly possible, but there would be practical problems which, I believe, far outweigh the benefits.

To get an idea of what such problems might be, consider the original BSD license with its "advertising clause", which required "all advertising materials mentioning features or use of [the] software" to display an acknowledgement of the original authorship. This seems innocent enough, but it led to an unreasonable accumulation of acknowledgements as more and more authors added their own name or organization to the must-be-acknowledged list. It also made the license incompatible with other licenses such as GNU's General Public License, which prohibits adding further restrictions on redistribution. Because of practical problems of this kind, the clause was dropped from later versions of the BSD license.

So, if you are the sole author of the software, you can write your own license which sets whatever conditions you wish for redistribution (within the limits of what copyright law lets you forbid; it cannot forbid, e.g., "fair use"). But:

  • this will probably make it impossible to combine (or even link, in the case of libraries) parts of your software with code under certain major open source licenses such as GNU's GPL;

  • it might not pass as "open source" or "free software" according to certain definitions of the term (e.g., the Debian Free Software Guidelines, which are interpreted in a very conservative way and don't consider the GNU Free Documentation License to be "free"; this is a long-standing controversy), and this can cause additional practical problems;

  • when writing such a clause, you should carefully consider what happens if someone wants to reuse parts of your code in their own software (possibly having completely different goals and being used in a context that you didn't even imagine);

  • and a legal license is only ever useful if you are seriously considering taking violators to court (or at least threatening to do so in certain cases).

For reasons such as these, I submit that a request that is not legally binding is more appropriate. Keep in mind that something that is not legally binding can still be morally (i.e., ethically) binding, just as citing previous work is morally required in academia even though it is not legally required. So you can phrase your request for citation in a way that makes it clear that, while it is not a legal obligation, it is still much more than a friendly reminder.


Unless you use some third-party code that is copyleft-licensed and forces you to use the same license, you, as the author/copyright holder, are entirely free to choose your own licensing terms. (There are some legal limits, but they are very broad. You can't ask for someone's firstborn.) You could write your own license that makes it a condition of using your code that any paper building on it must cite you.

However, in practical terms this is unlikely to have more effect than a friendly reminder in the README. It would enable you to take the authors of the paper to court, but this would be far worse for your reputation (and your wallet) than a missing citation is ever likely to be. Custom-made licenses can also come with their own pitfalls (it takes a lawyer to draft a good one).
