Python shutil: A Comprehensive Guide

In today’s data-driven world, managing and manipulating filesystems is a task that programmers often encounter. Whether it’s moving files, copying directories, or simply organizing your file structure, having an efficient way to handle these tasks is crucial. This is where Python’s built-in shutil module steps in, acting as a powerful tool for high-level file operations.

What is the shutil module?

shutil (short for shell utilities) is a built-in Python module that was integrated into Python’s standard library to make it easier for developers to automate and handle various file system tasks that would otherwise require shell scripts or manual operations. The shutil module, as of Python 3.10, contains around 22 functions that you can use for various operations such as copying, moving, deleting files and directories, finding files, and more.

chown()

The chown() function in the shutil module changes the owner and/or group of a file or directory.

Syntax:

shutil.chown(path, user=None, group=None)

path is the path to the file or directory.
user is the name or id of the new owner. If not provided, the owner will not be changed.
group is the name or id of the new group. If not provided, the group will not be changed.

This function does not return anything.

Please note that this function requires appropriate privileges and is available only on Unix platforms.

Code Example:

import shutil

try:
    # change the owner and group of a file
    shutil.chown('file.txt', user='new_owner', group='new_group')
    print(f"Owner and group changed")
except Exception as e:
    print(f"An error occurred: {e}")

In this code example, we first import the shutil module. We then call shutil.chown() to change the owner and group of a file named ‘file.txt’ to ‘new_owner’ and ‘new_group’. If an error occurs during the operation (for instance, if the file does not exist, the program does not have the required permissions, or the specified user or group does not exist), the code in the except block will be executed and the error message will be printed.

copy()

The copy() function in Python’s shutil module copies the file from source to destination. It returns a string that is the path to the copied file.

Syntax:

shutil.copy(src, dst, *, follow_symlinks=True)

Here, src is the source file path, dst can be a directory path or another file path. If dst specifies a directory, the file will be copied into dst using the base filename from src. If follow_symlinks is false and src is a symbolic link, dst will be created as a symbolic link.

Code Example:

import shutil

# define the source and destination paths
src = "/path/to/source/file.txt"
dst = "/path/to/destination/directory"

# use shutil.copy() to copy the file
shutil.copy(src, dst)

# print confirmation message
print("File copied successfully.")

In this example, the file at the location /path/to/source/file.txt is copied to the directory at /path/to/destination/directory. If the operation is successful, it prints “File copied successfully.”

copy2()

The copy2() function is similar to copy(), but it also attempts to preserve file metadata, such as the timestamp information. It returns a string that is the path to the copied file.

Syntax:

shutil.copy2(src, dst, *, follow_symlinks=True)

Code Example:

import shutil

# define the source and destination paths
src = "/path/to/source/file.txt"
dst = "/path/to/destination/directory"

# use shutil.copy2() to copy the file and metadata
shutil.copy2(src, dst)

# print confirmation message
print("File and metadata copied successfully.")

In this example, the file at the location /path/to/source/file.txt along with its metadata is copied to the directory at /path/to/destination/directory. If the operation is successful, it prints “File and metadata copied successfully.”

copyfile()

copyfile() in the shutil module copies the content of the source file to the destination file and returns the path to the destination file. If the destination is a directory, a IsADirectoryError will be raised.

Syntax:

shutil.copyfile(src, dst, *, follow_symlinks=True)

Here, src is the source file path, dst is the destination file path. If follow_symlinks is false and src is a symbolic link, dst will be created as a symbolic link.

Code Example:

import shutil

# define the source and destination paths
src = "/path/to/source/file.txt"
dst = "/path/to/destination/file.txt"

# use shutil.copyfile() to copy the file content
shutil.copyfile(src, dst)

# print confirmation message
print("File content copied successfully.")

In this example, the content of the file at the location /path/to/source/file.txt is copied to the file at /path/to/destination/file.txt. If the operation is successful, it prints “File content copied successfully.”

copyfileobj()

The copyfileobj() function is a method in Python’s shutil module. It is used to copy the contents of one file-like object to another file-like object.

This function is particularly useful when you’re working with file-like objects (like files opened in binary mode or io.BytesIO objects), not just actual files.

Syntax:

shutil.copyfileobj(fsrc, fdst, length=16*1024)

This function copies the contents of a file-like object fsrc to a file-like object fdst. The integer length is the buffer size. In case of buffered I/O, a sensible default is used. The actual number of bytes read and written can be less than length.

Code Example:

import shutil

# open source and destination files
with open('/path/to/source/file.txt', 'rb') as fsrc, open('/path/to/destination/file.txt', 'wb') as fdst:

    # use shutil.copyfileobj() to copy the file content
    shutil.copyfileobj(fsrc, fdst)

print("File content copied successfully.")

In this example, the content of the file at the location /path/to/source/file.txt is copied to the file at /path/to/destination/file.txt using file-like objects. If the operation is successful, it prints “File content copied successfully.”

copymode()

The copymode() function is a method in Python’s shutil module. It is used to copy the permission bits from the source file to the destination file.

This function does not copy ownership, timestamps or other metadata, only the file’s permission bits (read, write, execute).

Syntax:

shutil.copymode(src, dst, *, follow_symlinks=True)

This function in the shutil module copies the permission bits from the source file to the destination file. The follow_symlinks is optional and it defaults to True. If follow_symlinks is false and src is a symbolic link, this method will attempt the operation on the symbolic link itself instead of the file the link points to.

Code Example:

import shutil

# define the source and destination paths
src = "/path/to/source/file.txt"
dst = "/path/to/destination/file.txt"

# use shutil.copymode() to copy the file permissions
shutil.copymode(src, dst)

# print confirmation message
print("File permissions copied successfully.")

In this example, the permissions of the file at the location /path/to/source/file.txt are copied to the file at /path/to/destination/file.txt. If the operation is successful, it prints “File permissions copied successfully.”

copystat()

copystat() is a function in the shutil module that copies the permission bits, last access time, last modification time, and flags from src to dst. The file contents, owner, and group are unaffected.

Syntax:

shutil.copystat(src, dst, *, follow_symlinks=True)

src and dst are path-like objects or path names given as strings. The follow_symlinks argument is optional and defaults to True. If follow_symlinks is false and src is a symbolic link, this method will attempt the operation on the symbolic link itself instead of the file the link points to.

Code Example:

import shutil

# define the source and destination paths
src = "/path/to/source/file.txt"
dst = "/path/to/destination/file.txt"

# use shutil.copystat() to copy the file's metadata
shutil.copystat(src, dst)

# print confirmation message
print("File metadata copied successfully.")

In this code example, we first import the shutil module. We then define the source (src) and destination (dst) file paths. The shutil.copystat() function is used to copy the metadata from the source file to the destination file. If the operation is successful, a message “File metadata copied successfully.” is printed to the console.

copytree()

The copytree() function in the shutil module recursively copies an entire directory tree rooted at src, and returns the destination directory. It attempts to preserve symbolic links, file metadata, and permissions.

Syntax:

shutil.copytree(src, dst, symlinks=False, ignore=None, copy_function=copy2, ignore_dangling_symlinks=False)

src and dst are path-like objects or path names given as strings. Other parameters like symlinks, ignore, copy_function, and ignore_dangling_symlinks are optional.

Code Example:

import shutil

# define source directory and destination directory
src = '/path/to/source/directory'
dst = '/path/to/destination/directory'

try:
    # use shutil.copytree() to copy the directory
    shutil.copytree(src, dst)
    print("Directory copied successfully.")
except Exception as e:
    print(f"An error occurred: {e}")

In this code example, we first import the shutil module. We then define the source directory (src) and destination directory (dst). The shutil.copytree() function is used to copy the entire directory from the source to the destination. If the operation is successful, a message “Directory copied successfully.” is printed to the console. If an error occurs during the copy operation (for instance, if the source directory does not exist), the code in the except block will be executed and the error message will be printed.

disk_usage()

The disk_usage() function in the shutil module returns disk usage statistics about the given path as a named tuple with the attributes total, used and free, which are the amount of total, used and free space, in bytes.

Syntax:

shutil.disk_usage(path)

path can be a string or bytes object representing a file or directory.

Code Example:

import shutil

# define path
path = '/'

try:
    # use shutil.disk_usage() to get disk usage
    disk_usage = shutil.disk_usage(path)
    print(f"Total: {disk_usage.total / (1024**3)} GB")
    print(f"Used: {disk_usage.used / (1024**3)} GB")
    print(f"Free: {disk_usage.free / (1024**3)} GB")
except Exception as e:
    print(f"An error occurred: {e}")

In this code example, we first import the shutil module. We then define the path (path) – here it’s the root directory. The shutil.disk_usage() function is used to get the disk usage. If the operation is successful, the total, used, and free disk space in GB is printed to the console. If an error occurs during the operation (for instance, if the path does not exist), the code in the except block will be executed and the error message will be printed.

get_archive_formats()

The get_archive_formats() function in the shutil module returns a list of supported formats for creating archives.

Syntax:

shutil.get_archive_formats()

This function does not take any arguments and returns a list of tuples. Each tuple contains two elements: the name of the format and the description of the format.

Code Example:

import shutil

try:
    # Get a list of supported archive formats
    formats = shutil.get_archive_formats()
    print(f"Supported archive formats: {formats}")
except Exception as e:
    print(f"An error occurred: {e}")

In this code example, we first import the shutil module. We then call shutil.get_archive_formats() to get a list of supported archive formats. The list is then printed. If an error occurs during the operation, the code in the except block will be executed and the error message will be printed.

get_terminal_size()

The get_terminal_size() function in the shutil module returns the size of the terminal as a named tuple of type os.terminal_size containing the attributes columns and lines.

Syntax:

shutil.get_terminal_size(fallback=(columns, lines))

The fallback parameter is optional and should be a tuple containing two integer values. This will be used if the size of the terminal can’t be determined automatically.

Code Example:

import shutil

try:
    # use shutil.get_terminal_size() to get the size of the terminal
    terminal_size = shutil.get_terminal_size()
    print(f"Terminal size: {terminal_size.columns} columns x {terminal_size.lines} lines")
except Exception as e:
    print(f"An error occurred: {e}")

In this code example, we first import the shutil module. The shutil.get_terminal_size() function is used to get the size of the terminal. If the operation is successful, the size of the terminal (in columns and lines) is printed to the console. If an error occurs during the operation, the code in the except block will be executed and the error message will be printed.

get_unpack_formats()

The get_unpack_formats() function in the shutil module returns a list of all registered archive formats that can be used to unpack archives.

Syntax:

shutil.get_unpack_formats()

This function takes no arguments.

The return value is a list of tuples. Each tuple contains three strings:

The name of an archive format.
Several extensions commonly used for this format.
A description of this format.

Code Example:

import shutil

# get a list of all registered unpack formats
formats = shutil.get_unpack_formats()

# print each format
for name, extensions, description in formats:
    print(f"Name: {name}")
    print(f"Extensions: {', '.join(extensions)}")
    print(f"Description: {description}")
    print()

In this code example, we first import the shutil module. We then call shutil.get_unpack_formats() to get a list of all registered unpack formats. We iterate over this list and print the name, extensions, and description of each format.

ignore_patterns()

The ignore_patterns() function in the shutil module returns a function that can be used as the ignore argument to functions like shutil.copytree(). Patterns is a sequence of glob-style patterns that are used to specify what files and directories to ignore.

Syntax:

shutil.ignore_patterns(*patterns)

*patterns is one or more strings representing the glob-style patterns.

Code Example:

import shutil
import os

# define source and destination paths
src = '/path/to/source/directory'
dst = '/path/to/destination/directory'

try:
    # use shutil.copytree() with shutil.ignore_patterns()
    shutil.copytree(src, dst, ignore=shutil.ignore_patterns('*~', '*.pyc'))
    print(f"Copied all files from {src} to {dst} ignoring files ending with '~' and '.pyc'")
except Exception as e:
    print(f"An error occurred: {e}")

In this code example, we first import the shutil and os modules. We then define the source (src) and destination (dst) paths. The shutil.copytree() function is used to copy all files and directories from src to dst, but ignores any files ending with ‘~’ or ‘.pyc’ due to the shutil.ignore_patterns('*~', '*.pyc') function. If the operation is successful, a success message is printed to the console. If an error occurs during the operation (for instance, if the source or destination path does not exist), the code in the except block will be executed and the error message will be printed.

make_archive()

The make_archive() function in the shutil module creates an archive file (such as zip or tar) and returns its name.

Syntax:

shutil.make_archive(base_name, format[, root_dir[, base_dir[, verbose[, dry_run[, owner[, group[, logger]]]]]]])

base_name is the name of the file to create, including the path, minus any format-specific extension.
format is the archive format: one of “zip”, “tar”, “gztar”, “bztar”, or “xztar”.

Other parameters like root_dir and base_dir are optional.

Code Example:

import shutil

# define base name and format for archive
base_name = '/path/to/archive_name'
format = 'zip'

try:
    # use shutil.make_archive() to create an archive
    archive_name = shutil.make_archive(base_name, format)
    print(f"Archive created: {archive_name}")
except Exception as e:
    print(f"An error occurred: {e}")

In this code example, we first import the shutil module. We then define the base name (base_name) and format (format) for the archive. The shutil.make_archive() function is used to create an archive with the specified base name and format. If the operation is successful, the name of the archive file is printed to the console. If an error occurs during the operation (for instance, if the base name or format is not valid), the code in the except block will be executed and the error message will be printed.

move()

The move() function in the shutil module recursively moves a file or directory (src) to another location (dst) and return the destination.

Syntax:

shutil.move(src, dst, copy_function=copy2)

src and dst should be path-like objects or strings. The copy_function is optional and represent the copy function to use.

Code Example:

import shutil

# define source file/directory and destination file/directory
src = '/path/to/source/file_or_directory'
dst = '/path/to/destination/file_or_directory'

try:
    # use shutil.move() to move the file/directory
    shutil.move(src, dst)
    print("File/Directory moved successfully.")
except Exception as e:
    print(f"An error occurred: {e}")

In this code example, we first import the shutil module. We then define the source file/directory (src) and destination file/directory (dst). The shutil.move() function is used to move the file or directory from the source to the destination. If the operation is successful, a message “File/Directory moved successfully.” is printed to the console. If an error occurs during the move operation (for instance, if the source file or directory does not exist), the code in the except block will be executed and the error message will be printed.

register_archive_format()

The register_archive_format() function in the shutil module registers an archive format.

Syntax:

shutil.register_archive_format(name, function[, extra_args[, description]])

name is the name of the format.
function is the callable that will be used to create archives using this format.
extra_args is a sequence of (name, value) tuples that will be passed as arguments to the callable.
description is an optional text description of the format.

Once a format is registered, it can be used as the format argument to shutil.make_archive().

Code Example:

import shutil

def create_zip(base_name, base_dir, **kwargs):
    print(f"Creating zip archive: {base_name}.zip")

# register the archive format
shutil.register_archive_format('custom_zip', create_zip, description='Custom zip archive')

# now you can use 'custom_zip' as the format in make_archive()
try:
    # use shutil.make_archive() with the new format
    archive_name = shutil.make_archive('/path/to/archive_name', 'custom_zip')
    print(f"Archive created: {archive_name}")
except Exception as e:
    print(f"An error occurred: {e}")

In this code example, we first import the shutil module. We then define a custom function create_zip() that prints a message about creating a zip archive. This function is registered as an archive format with the name ‘custom_zip’ using shutil.register_archive_format(). We can now use ‘custom_zip’ as the format in shutil.make_archive(). If an error occurs during the operation (for instance, if the function is not valid), the code in the except block will be executed and the error message will be printed.

register_unpack_format()

The register_unpack_format() function in the shutil module registers an unpack format.

Syntax:

shutil.register_unpack_format(name, extensions, function[, extra_args[, description]])

name is the name of the format.
extensions is a list of file extensions that are used by this format.
function is the callable that will be used to unpack archives in this format.
extra_args is a sequence of (name, value) tuples that will be passed as arguments to the function.
description is an optional text description of the format.

Once a format is registered, it can be used as the format argument to shutil.unpack_archive().

Code Example:

import shutil

def unpack_zip(filename, extract_dir, **kwargs):
    print(f"Unpacking zip archive: {filename} to {extract_dir}")

# register the unpack format
shutil.register_unpack_format('custom_zip', ['.cz', '.czp'], unpack_zip, description='Custom zip format')

# now you can use 'custom_zip' as the format in unpack_archive()
try:
    # use shutil.unpack_archive() with the new format
    shutil.unpack_archive('/path/to/archive_name.cz', '/path/to/extract_dir', 'custom_zip')
    print(f"Archive unpacked")
except Exception as e:
    print(f"An error occurred: {e}")

In this code example, we first import the shutil module. We then define a custom function unpack_zip() that prints a message about unpacking a zip archive. This function is registered as an unpack format with the name ‘custom_zip’ using shutil.register_unpack_format(). We can now use ‘custom_zip’ as the format in shutil.unpack_archive(). If an error occurs during the operation (for instance, if the function is not valid), the code in the except block will be executed and the error message will be printed.

rmtree()

The rmtree() function in the shutil module recursively deletes an entire directory tree; a directory tree is a directory and all its contents, subdirectories, and their contents, etc.

Syntax:

shutil.rmtree(path, ignore_errors=False, onerror=None)

path can be a string or bytes object representing a file or directory. ignore_errors and onerror are optional parameters. If ignore_errors is true, errors resulting from failed removals will be ignored; if false or omitted, such errors are handled by calling a handler specified by onerror or, if onerror is omitted, they raise an exception.

Code Example:

import shutil

# define directory to be removed
dir_path = '/path/to/directory'

try:
    # use shutil.rmtree() to remove the directory
    shutil.rmtree(dir_path)
    print("Directory removed successfully.")
except Exception as e:
    print(f"An error occurred: {e}")

In this code example, we first import the shutil module. We then define the directory to be removed (dir_path). The shutil.rmtree() function is used to remove the entire directory. If the operation is successful, a message “Directory removed successfully.” is printed to the console. If an error occurs during the removal operation (for instance, if the directory does not exist), the code in the except block will be executed and the error message will be printed.

unregister_archive_format()

The unregister_archive_format() function in the shutil module removes an archive format from the registry.

Syntax:

shutil.unregister_archive_format(name)

name is the name of the format to remove.

Once a format is unregistered, it can no longer be used as the format argument to shutil.make_archive().

Code Example:

import shutil

def create_zip(base_name, base_dir, **kwargs):
    print(f"Creating zip archive: {base_name}.zip")

# register the archive format
shutil.register_archive_format('custom_zip', create_zip, description='Custom zip archive')

# unregister the archive format
shutil.unregister_archive_format('custom_zip')

try:
    # try to use 'custom_zip' as the format in make_archive()
    archive_name = shutil.make_archive('/path/to/archive_name', 'custom_zip')
    print(f"Archive created: {archive_name}")
except Exception as e:
    print(f"An error occurred: {e}")

In this code example, we first import the shutil module. We then define a custom function create_zip() that prints a message about creating a zip archive. This function is registered as an archive format with the name ‘custom_zip’ using shutil.register_archive_format(). Then, ‘custom_zip’ is unregistered using shutil.unregister_archive_format(). When we try to use ‘custom_zip’ as the format in shutil.make_archive(), an error occurs because ‘custom_zip’ is no longer a registered format. The code in the except block is executed and the error message is printed.

unregister_unpack_format()

The unregister_unpack_format() function in the shutil module removes an unpack format from the registry.

Syntax:

shutil.unregister_unpack_format(name)

name is the name of the format to remove.

Once a format is unregistered, it can no longer be used as the format argument to shutil.unpack_archive().

Code Example:

import shutil

def unpack_zip(filename, extract_dir, **kwargs):
    print(f"Unpacking zip archive: {filename} to {extract_dir}")

# register the unpack format
shutil.register_unpack_format('custom_zip', ['.cz', '.czp'], unpack_zip, description='Custom zip format')

# unregister the unpack format
shutil.unregister_unpack_format('custom_zip')

try:
    # try to use 'custom_zip' as the format in unpack_archive()
    shutil.unpack_archive('/path/to/archive_name.cz', '/path/to/extract_dir', 'custom_zip')
    print(f"Archive unpacked")
except Exception as e:
    print(f"An error occurred: {e}")

In this code example, we first import the shutil module. We then define a custom function unpack_zip() that prints a message about unpacking a zip archive. This function is registered as an unpack format with the name ‘custom_zip’ using shutil.register_unpack_format(). Then, ‘custom_zip’ is unregistered using shutil.unregister_unpack_format(). When we try to use ‘custom_zip’ as the format in shutil.unpack_archive(), an error occurs because ‘custom_zip’ is no longer a registered format. The code in the except block is executed and the error message is printed.

unpack_archive()

The unpack_archive() function in the shutil module extracts an archive file.

Syntax:

shutil.unpack_archive(filename[, extract_dir[, format]])

filename is the name of the archive.
extract_dir is the name of the target directory where the archive is to be extracted. If not provided, the current working directory is used.
format is the archive format: one of “zip”, “tar”, “gztar”, “bztar”, or “xztar”. Or any other registered format. If not provided, the format is inferred from the filename extension.

Code Example:

import shutil

try:
    # unpack a zip archive
    shutil.unpack_archive('archive.zip', 'extract_dir')
    print(f"Archive unpacked")
except Exception as e:
    print(f"An error occurred: {e}")

In this code example, we first import the shutil module. We then call shutil.unpack_archive() to unpack a zip archive named ‘archive.zip’ into a directory named ‘extract_dir’. If an error occurs during the operation (for instance, if the archive does not exist), the code in the except block will be executed and the error message will be printed.

which()

The which() function in the shutil module returns the path to an executable which would be run if the given cmd was called. If no cmd would be called, return None.

Syntax:

shutil.which(cmd, mode=os.F_OK | os.X_OK, path=None)

cmd is a string representing the command to be executed. The mode and path parameters are optional.

Code Example:

import shutil

# define command
cmd = 'python'

try:
    # use shutil.which() to find the path to the executable
    path_to_executable = shutil.which(cmd)
    print(f"Path to executable: {path_to_executable}")
except Exception as e:
    print(f"An error occurred: {e}")

In this code example, we first import the shutil module. We then define the command (cmd) – here it’s ‘python’. The shutil.which() function is used to find the path to the executable. If the operation is successful, the path to the python executable is printed to the console. If an error occurs during the operation (for instance, if the command does not exist), the code in the except block will be executed and the error message will be printed.

Conclusion

Python’s shutil module is an incredibly useful tool for automating high-level file operations. Its capabilities range from simple tasks like copying and moving files or directories, to more complex ones like preserving file metadata, handling directory trees, and managing file permissions. The functionality it offers is robust and flexible, making it a go-to choice for developers dealing with file manipulation tasks.
So, next time you find yourself needing to work with files or directories in Python, don’t reinvent the wheel – consider leveraging the power of shutil!

Python shutil: A Comprehensive Guide

What is the shutil module?

chown()

copy()

copy2()

copyfile()

copyfileobj()

copymode()

copystat()

copytree()

disk_usage()

get_archive_formats()

get_terminal_size()

get_unpack_formats()

ignore_patterns()

make_archive()

move()

register_archive_format()

register_unpack_format()

rmtree()

unregister_archive_format()

unregister_unpack_format()

unpack_archive()

which()

Conclusion

Python os.path.join() method: A Detailed Guide

Python BytesIO: What is it & How to use it?

Python: How to Get String Length

Python Nested Dictionary: Complete Guide

Python: os.walk() Explained

Python: How to Cast to a String (toString equivalent)

What is the shutil module?

chown()

copy()

copy2()

copyfile()

copyfileobj()

copymode()

copystat()

copytree()

disk_usage()

get_archive_formats()

get_terminal_size()

get_unpack_formats()

ignore_patterns()

make_archive()

move()

register_archive_format()

register_unpack_format()

rmtree()

unregister_archive_format()

unregister_unpack_format()

unpack_archive()

which()

Conclusion

Similar Posts