Archive Unused AsciiDoc Files
⚠️ WARNING: This Tool Can Delete Files
ALWAYS:
- Create a git branch first:
git checkout -b cleanup-unused-files
- Run without
--archive
first to preview what will be affected- Review the list carefully - ensure no files are incorrectly marked as unused
- Check your documentation build after archiving to ensure nothing broke
- Keep the archive files until you’re certain the removal was correct
This tool automatically discovers and scans all modules
and assemblies
directories in your repository for AsciiDoc files not referenced by any other AsciiDoc file in the project. Optionally archives and deletes them.
Auto-Discovery: The tool recursively searches for all directories named modules
or assemblies
that contain .adoc
files, regardless of their location in your repository structure. This works with:
- Standard structures (
./modules
,./assemblies
) - Nested structures (
./downstream/modules
,./content/assemblies
) - Multiple module directories in different locations
The tool automatically detects your repository type:
- OpenShift-docs style: Uses
_topic_maps/*.yml
files to determine file references - Traditional AsciiDoc: Looks for
master.adoc
files andinclude::
directives - Both: The tool always scans for
include::
directives in addition to topic maps
Installation
After installing the package from PyPI:
pip install rolfedh-doc-utils
You can run the tool from anywhere using:
archive-unused-files [options]
Or, if running from source:
python3 archive_unused_files.py [options]
Usage
See the script’s --help
output or the docstring for all options. Common options include:
--archive
— Move the files to a dated zip in the archive directory.--scan-dir
— Specify a specific directory to scan (can be used multiple times). If not specified, auto-discovers all modules and assemblies directories.--exclude-dir
— Directory to exclude (can be used multiple times).--exclude-file
— File to exclude (can be used multiple times).--exclude-list
— Path to a file containing directories or files to exclude, one per line.
Examples
Basic Usage (Auto-Discovery)
Preview what would be archived (dry run):
archive-unused-files
# Automatically discovers all modules and assemblies directories
# Shows: Auto-discovered directories to scan:
# - ./downstream/modules
# - ./downstream/assemblies
Archive unused files:
archive-unused-files --archive
Specifying Directories
Scan specific directories only:
archive-unused-files --scan-dir ./content/modules --scan-dir ./content/assemblies
Exclusions
Exclude specific directories:
archive-unused-files --archive --exclude-dir ./modules/legacy --exclude-dir ./modules/wip
Use an exclusion list file:
archive-unused-files --archive --exclude-list .docutils-ignore
Example exclusion list file (.docutils-ignore
):
# Directories to exclude
./modules/archived/
./modules/wip/
# Specific files to exclude
./assemblies/draft.adoc
Output
- Prints unused files to the terminal.
- Creates a manifest file in the
archive/
directory. - Optionally creates a zip archive of unused files.
See the main README.md for more details on installation and usage as a package.