Group files in some folders

The Python script below does the job. Hidden files are stored separately in their own folder (hidden_files), as are files without an extension (without_extension).

Since it might be used for a wider range of purposes, I added a few options:

  • You can set extensions you'd like to exclude from the reorganization. To move everything, set exclude = ()
  • You can choose what to do with empty folders (remove_emptyfolders = True or False)
  • In case you would like to copy the files instead of moving them, replace the line:
shutil.move(subject, new_dir+"/"+name)

by:

shutil.copy(subject, new_dir+"/"+name) 
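A note on the exclude option: the script tests extensions with "in", so exclude is meant to be a tuple. A minimal sketch of why a single excluded extension needs a trailing comma (the variable names as_string and as_tuple are only for illustration):

```python
# Parentheses alone do not make a tuple; the trailing comma does.
as_string = (".jpg")   # this is just the string ".jpg"
as_tuple = (".jpg",)   # this is a one-element tuple

# With a string, "in" is a substring test, so partial matches slip through.
print(".jp" in as_string)   # True
print(".jp" in as_tuple)    # False
print(".jpg" in as_tuple)   # True
```

With the plain string, a file ending in .jp or .j would be excluded as well, which is probably not what you want.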

The script:

#!/usr/bin/env python3

import os
import shutil

# --------------------------------------------------------
reorg_dir = "/path/to/directory_to_reorganize"
exclude = (".jpg",) # for example; the trailing comma makes this a tuple
remove_emptyfolders = True
# ---------------------------------------------------------

for root, dirs, files in os.walk(reorg_dir):
    for name in files:
        subject = root+"/"+name
        if name.startswith("."):
            extension = ".hidden_files"
        elif not "." in name:
            extension = ".without_extension"
        else:
            extension = name[name.rfind("."):]
        if not extension in exclude:
            new_dir = reorg_dir+"/"+extension[1:]
            if not os.path.exists(new_dir):
                os.mkdir(new_dir)
            shutil.move(subject, new_dir+"/"+name)

def cleanup():
    filelist = []
    for root, dirs, files in os.walk(reorg_dir):
        for name in files:
            filelist.append(root+"/"+name)
    directories = [item[0] for item in os.walk(reorg_dir)]
    for dr in directories:
        # only count files that are really inside dr, not paths that merely contain its name
        matches = [item for item in filelist if item.startswith(dr+"/")]
        if len(matches) == 0:
            try:
                shutil.rmtree(dr)
            except FileNotFoundError:
                pass

if remove_emptyfolders:
    cleanup()
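To see in advance where a given file would end up, the naming rule from the script can be tried in isolation. This is only a sketch of that rule; the helper name target_folder is made up for the example and is not part of the script:

```python
def target_folder(name):
    """Return the folder name the script would use for a file name."""
    if name.startswith("."):
        return "hidden_files"
    if "." not in name:
        return "without_extension"
    # Everything after the last dot; "archive.tar.gz" goes to "gz".
    return name[name.rfind(".") + 1:]

for sample in (".bashrc", "README", "photo.JPG", "archive.tar.gz"):
    print(sample, "->", target_folder(sample))
```

Note that the rule is case sensitive, so photo.JPG and photo.jpg land in different folders (JPG and jpg).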

If there is a risk of overwriting duplicate files

At the expense of a few extra lines, we can prevent overwriting possible duplicates. With the code below, duplicates will be renamed as:

duplicate_1_filename, duplicate_2_filename 

etc.
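The renaming rule on its own can be sketched as a small function. The name unique_name is hypothetical, and the demonstration runs in a throwaway temporary directory rather than on real files:

```python
import os
import tempfile

def unique_name(folder, name):
    """Return a name not yet present in folder, prefixing duplicate_N_ if needed."""
    candidate, n = name, 1
    while os.path.exists(os.path.join(folder, candidate)):
        candidate = "duplicate_" + str(n) + "_" + name
        n += 1
    return candidate

# Demonstration: ask three times for the same name, creating the file each time.
with tempfile.TemporaryDirectory() as folder:
    first = unique_name(folder, "notes.txt")
    open(os.path.join(folder, first), "w").close()
    second = unique_name(folder, "notes.txt")
    open(os.path.join(folder, second), "w").close()
    third = unique_name(folder, "notes.txt")
    print(first, second, third)
    # notes.txt duplicate_1_notes.txt duplicate_2_notes.txt
```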

The script:

#!/usr/bin/env python3

import os
import shutil

# --------------------------------------------------------
reorg_dir = "/path/to/directory_to_reorganize"
exclude = (".jpg",) # for example; the trailing comma makes this a tuple
remove_emptyfolders = True
# ---------------------------------------------------------

for root, dirs, files in os.walk(reorg_dir):
    for name in files:
        subject = root+"/"+name
        if name.startswith("."):
            extension = ".hidden_files"
        elif not "." in name:
            extension = ".without_extension"
        else:
            extension = name[name.rfind("."):]
        if not extension in exclude:
            new_dir = reorg_dir+"/"+extension[1:]
            if not os.path.exists(new_dir):
                os.mkdir(new_dir)
            n = 1
            name_orig = name
            while os.path.exists(new_dir+"/"+name):
                name = "duplicate_"+str(n)+"_"+name_orig
                n = n+1
            newfile = new_dir+"/"+name
            shutil.move(subject, newfile)

def cleanup():
    filelist = []
    for root, dirs, files in os.walk(reorg_dir):
        for name in files:
            filelist.append(root+"/"+name)
    directories = [item[0] for item in os.walk(reorg_dir)]
    for dr in directories:
        # only count files that are really inside dr, not paths that merely contain its name
        matches = [item for item in filelist if item.startswith(dr+"/")]
        if len(matches) == 0:
            try:
                shutil.rmtree(dr)
            except FileNotFoundError:
                pass

if remove_emptyfolders:
    cleanup()
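As an aside, empty folders can also be removed without building a file list first. The sketch below is a different technique from cleanup() above: it walks the tree bottom-up and relies on os.rmdir refusing to delete anything that is not empty. The function name remove_empty_dirs is made up for the example, and unlike cleanup() it always keeps the top directory itself:

```python
import os
import tempfile

def remove_empty_dirs(top):
    """Remove empty directories below top, deepest first; top itself is kept."""
    removed = []
    for root, dirs, files in os.walk(top, topdown=False):
        if root == top:
            continue
        try:
            os.rmdir(root)  # raises OSError if the directory is not empty
            removed.append(root)
        except OSError:
            pass
    return removed

# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "a", "b"))   # an empty chain a/b
    os.makedirs(os.path.join(top, "keep"))
    open(os.path.join(top, "keep", "file.txt"), "w").close()
    remove_empty_dirs(top)
    remaining = sorted(os.listdir(top))
    print(remaining)  # ['keep']
```

Because the walk is bottom-up, a folder that only contained empty folders is itself empty by the time it is visited, so it gets removed as well.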

EDIT

With the OP in mind, we all forgot to add instructions on how to use the scripts. Since duplicate questions might (and do) appear, they may be useful nevertheless.

How to use

  1. Copy either one of the scripts into an empty file, save it as reorganize.py
  2. In the head section of the script, set the targeted directory (with the files to reorganize):

    reorg_dir = "/path/to/directory_to_reorganize" 
    

    (keep the quotes: the path is a Python string, so spaces in it need no escaping)

    possible extensions you'd like to exclude (probably none, like below):

    exclude = ()
    

    and if you'd like to remove empty folders afterwards:

    remove_emptyfolders = True
    
  3. Run the script with the command:

    python3 /path/to/reorganize.py
    

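NB if you'd like to copy the files instead of moving them, replace the line below (in the second script, the destination argument is newfile instead of new_dir+"/"+name):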

shutil.move(subject, new_dir+"/"+name)

by:

shutil.copy(subject, new_dir+"/"+name)

Please try first on a small sample.
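One way to get such a sample is to generate a throwaway directory with a handful of empty files. This is just a sketch: the helper name make_sample, the reorg_sample_ prefix and the file names are all made up for the example.

```python
import os
import tempfile

def make_sample(names):
    """Create empty files with the given relative names in a new temp directory."""
    sample_dir = tempfile.mkdtemp(prefix="reorg_sample_")
    for rel in names:
        path = os.path.join(sample_dir, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()
    return sample_dir

sample_dir = make_sample(["a.txt", "sub/b.txt", "sub/c.jpg", ".hidden", "README"])
print(sample_dir)  # use this path as reorg_dir for a trial run
```

Set reorg_dir to the printed path, run the script, and check the result before pointing it at real data.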


You can use find with a somewhat complex exec command:

find . -iname '*?.?*' -type f -exec bash -c 'EXT="${0##*.}"; mkdir -p "$PWD/${EXT}_dir"; cp --target-directory="$PWD/${EXT}_dir" "$0"' {} \;

# '*?.?*' requires at least one character before and after the '.', 
# so that files like .bashrc and blah. are avoided.
# EXT="${0##*.}" - get the extension
# mkdir -p $PWD/${EXT}_dir - make the folder, ignore if it exists

Replace cp with echo for a dry run (mkdir still runs, so the target folders are created anyway).
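For comparison, roughly the same idea in Python, with a real dry-run switch. This is a sketch, not a drop-in replacement for the find command: the function name copy_by_extension and the dry_run parameter are made up here, and names with a trailing dot are simply skipped.

```python
import shutil
import tempfile
from pathlib import Path

def copy_by_extension(top, dry_run=False):
    """Copy files named like 'x.ext' under top into top/<ext>_dir; return planned pairs."""
    top = Path(top)
    # Collect first, so folders created below are not walked again.
    files = [p for p in top.rglob("*") if p.is_file()]
    planned = []
    for path in files:
        stem, dot, ext = path.name.rpartition(".")
        if not (stem and dot and ext):
            continue  # skips names like ".bashrc", "blah." and "README"
        target = top / (ext + "_dir") / path.name
        if path == target:
            continue  # already in place from an earlier run
        planned.append((path, target))
        if not dry_run:
            target.parent.mkdir(exist_ok=True)
            shutil.copy(path, target)
    return planned

# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as top:
    (Path(top) / "sub").mkdir()
    for rel in ("a.txt", "sub/b.txt", ".bashrc", "README"):
        (Path(top) / rel).touch()
    copy_by_extension(top)
    copied = sorted(p.name for p in (Path(top) / "txt_dir").iterdir())
    print(copied)  # ['a.txt', 'b.txt']
```

With dry_run=True nothing is created or copied; the returned list shows what would happen. As with cp, files that share a name overwrite each other in the target folder.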


More efficient and tidier would be to save the bash command in a script (say, at /path/to/the/script.sh):

#! /bin/bash

for i
do
    EXT="${i##*.}" 
    mkdir -p "$PWD/${EXT}_dir"
    mv --target-directory="$PWD/${EXT}_dir" "$i" 
done

And then run find:

find . -iname '*?.?*' -type f -exec /path/to/the/script.sh {} +

This approach is pretty flexible. For example, to use the filename instead of the extension (filename.ext), we'd use this for EXT:

NAME="${i##*/}"
EXT="${NAME%.*}"
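To check what those two parameter expansions produce for a given path, here is a small Python sketch that mirrors them. The function name folder_key is made up for the example:

```python
import os

def folder_key(path):
    """Mirror NAME="${i##*/}"; EXT="${NAME%.*}": base name minus its last extension."""
    name = os.path.basename(path)
    stem, dot, _ext = name.rpartition(".")
    # "${NAME%.*}" leaves NAME unchanged when it contains no dot.
    return stem if dot else name

print(folder_key("./docs/report.pdf"))   # report
print(folder_key("./archive.tar.gz"))    # archive.tar
print(folder_key("./README"))            # README
```

Only the last extension is stripped, so archive.tar.gz ends up under archive.tar.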

ls | gawk -F. 'NF>1 {f = $NF "-DIR"; system("mkdir -p \"" f "\"; mv \"" $0 "\" \"" f "\"")}'

Listing the extensions (after moving):

ls -d *-DIR

Listing the extensions (before moving):

ls -X | grep -Po '(?<=\.)(\w+)$'| uniq -c | sort -n

(in this last example, we count the number of files for each extension and sort the result by that count)
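The same count can be sketched in Python with collections.Counter. This is a rough equivalent, not an exact one: the function name extension_counts is made up, and unlike the ls pipeline it looks at regular files only and skips hidden ones explicitly.

```python
from collections import Counter
import os
import tempfile

def extension_counts(folder):
    """Count files per extension in folder (non-recursive), least common first."""
    counts = Counter()
    for name in os.listdir(folder):
        if name.startswith("."):
            continue  # ls does not list hidden files either
        stem, dot, ext = name.rpartition(".")
        if dot and ext and os.path.isfile(os.path.join(folder, name)):
            counts[ext] += 1
    # Sort by count, then by extension, similar to "sort -n" on "uniq -c" output.
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))

# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as folder:
    for name in ("a.txt", "b.txt", "c.jpg", "README"):
        open(os.path.join(folder, name), "w").close()
    result = extension_counts(folder)
    print(result)  # [('jpg', 1), ('txt', 2)]
```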