How to count lines of code in a Jupyter notebook

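The answer from @Jessime Kirk is really good, but it seems to fail when the ipynb file contains Chinese characters, presumably because open() then falls back to the platform's default encoding rather than UTF-8. So I modified the code to open the file with encoding='utf-8', as below.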

#!/usr/bin/env python

from json import load
from sys import argv

def loc(nb):
    with open(nb, encoding='utf-8') as data_file:
        cells = load(data_file)['cells']
        return sum(len(c['source']) for c in cells if c['cell_type'] == 'code')

def run(ipynb_files):
    return sum(loc(nb) for nb in ipynb_files)

if __name__ == '__main__':
    print(r"Counts the lines of code in .ipynb files.")
    print(r"usage: python countIpynbLine.py xxx.ipynb")
    print(r"example: python countIpynbLine.py .\test_folder\test.ipynb")
    print(r"It can also count the lines of several notebooks at once.")
    print(r"usage: python countIpynbLine.py code_1.ipynb code_2.ipynb")
    print(r"Counting lines...")
    print(run(argv[1:]))

This will give you the total number of lines of code (LOC) in one or more notebooks that you pass to the script on the command line:

#!/usr/bin/env python

from json import load
from sys import argv

def loc(nb):
    # Use a context manager so the file is closed after parsing.
    with open(nb) as data_file:
        cells = load(data_file)['cells']
    return sum(len(c['source']) for c in cells if c['cell_type'] == 'code')

def run(ipynb_files):
    return sum(loc(nb) for nb in ipynb_files)

if __name__ == '__main__':
    print(run(argv[1:]))

So you could do something like $ ./loc.py nb1.ipynb nb2.ipynb to get results.
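
One caveat: the notebook format allows a cell's source to be stored either as a list of lines or as a single string, and len() on a string would count characters instead of lines. Here is a minimal sketch that handles both cases. The helper names source_lines and count_code_lines are made up for illustration and are not part of the scripts above; the function takes an already-parsed notebook (e.g. the result of json.load).

```python
def source_lines(source):
    """Return a cell's source as a list of lines, whether it is a str or a list."""
    if isinstance(source, str):
        return source.splitlines()
    return list(source)


def count_code_lines(notebook):
    """Count source lines in the code cells of an already-parsed notebook dict."""
    return sum(
        len(source_lines(cell.get('source', [])))
        for cell in notebook.get('cells', [])
        if cell.get('cell_type') == 'code'
    )


if __name__ == '__main__':
    # Tiny in-memory notebook, for illustration only.
    demo = {
        'cells': [
            {'cell_type': 'code', 'source': ['import os\n', 'print(os.getcwd())']},
            {'cell_type': 'markdown', 'source': ['# Title']},
            {'cell_type': 'code', 'source': 'a = 1\nb = 2\nc = 3'},
        ]
    }
    print(count_code_lines(demo))
```

The markdown cell is skipped, and the string-valued source is split into lines before counting.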


The same can be done from the shell if you have the jq utility installed:

jq '.cells[] | select(.cell_type == "code") .source[]' nb1.ipynb nb2.ipynb | wc -l

You can also use grep to filter the lines further. For example, to exclude blank lines, invert the match with -v before counting: | grep -v -e ^\"\\\\n\"$ | wc -l
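
For comparison, the blank-line filtering can also be done in Python. This is only a sketch: the function name count_nonblank_code_lines is hypothetical, it assumes each cell's source is a list of lines, and unlike the grep pattern (which matches only lines that are exactly a newline) it also skips lines containing nothing but whitespace.

```python
import json
import sys


def count_nonblank_code_lines(path):
    """Count code-cell lines that contain something other than whitespace."""
    with open(path, encoding='utf-8') as f:
        cells = json.load(f)['cells']
    return sum(
        1
        for cell in cells
        if cell['cell_type'] == 'code'
        for line in cell['source']
        if line.strip()  # empty and whitespace-only lines are falsy here
    )


if __name__ == '__main__':
    # Sum over every notebook path given on the command line.
    print(sum(count_nonblank_code_lines(p) for p in sys.argv[1:]))
```

It is invoked the same way as the scripts above, with one or more notebook paths as arguments.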