Does building a Python library fall in the category of research?

The point of research is the production of knowledge. If, after you take away all the code you’ve written and the datasets you’ve collected, there’s nothing left, then you’re not doing research, you’re doing development.

So, if you want to do artefact-oriented research like this, the question you have to ask yourself is this: what is the new knowledge you’re producing? Are you testing a new, better algorithm? Are you applying an existing algorithm in a new context and determining if it works there? Are you investigating how a complex system functions, or how it interacts with humans and/or the environment? What is the research question that you’re answering?

If you do determine that you’re doing research rather than development, there are a number of artefact-oriented research methodologies you can follow, such as the Design Science Research Methodology. You should be able to find more information on it through Google Scholar, if you’re a member of an institution that gives you access to its journal subscriptions.


It can be done, and it has been.

Consider PonyGE2, a Python framework for Grammatical Evolution. It was presented at GECCO '17, a leading conference on genetic and evolutionary computation.

It's not strange that it was accepted: it was devised as a research tool that lets other researchers experiment with GE using a common framework. So while it's perhaps not entirely novel research, it's more of a "utility publication", similar to a publication describing an interesting dataset or benchmark set.

Note that the GitHub page tells you how to cite PonyGE2 if you use it in research. A utility publication that accompanies a tool that gains wide adoption can actually result in a lot of citations.


There might be research that leads you to build such a library, but coding itself isn't research. Compilers, for example, were built on a ton of research done before any coding beyond the experimental.

In CS, one often does some research first and then builds something to validate the conclusions of that research. But the paper produced is about the research findings, not the code. Code optimizers fall into this category, as do many aspects of operating systems.

But note that it starts with the research, not the code.