Archive Unused AsciiDoc Files
⚠️ WARNING: This Tool Can Delete Files
ALWAYS:
- Create a git branch first:
git checkout -b cleanup-unused-files
- Run without --archive first to preview what will be affected
- Review the list carefully - ensure no files are incorrectly marked as unused
- Check your documentation build after archiving to ensure nothing broke
- Keep the archive files until you’re certain the removal was correct
This tool automatically discovers and scans all modules and assemblies directories in your repository for AsciiDoc files that are not referenced by any other AsciiDoc file in the project. It can optionally archive and delete them.
Auto-Discovery: The tool recursively searches for all directories named modules or assemblies that contain .adoc files, regardless of their location in your repository structure. This works with:
- Standard structures (./modules, ./assemblies)
- Nested structures (./downstream/modules, ./content/assemblies)
- Multiple module directories in different locations
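The discovery step described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the function name discover_scan_dirs is hypothetical, and two behaviors are assumptions of the sketch (hidden directories such as .git are skipped, and a directory qualifies if it has .adoc files anywhere beneath it).

```python
import os
from pathlib import Path

# Directory names taken from the description above.
TARGET_NAMES = {"modules", "assemblies"}


def discover_scan_dirs(root):
    """Return directories named 'modules' or 'assemblies' that hold .adoc files."""
    found = []
    for dirpath, dirnames, _filenames in os.walk(root):
        # Assumption: prune hidden directories (e.g. .git) from the walk.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        path = Path(dirpath)
        # Assumption: .adoc files may sit at any depth below the directory.
        if path.name in TARGET_NAMES and any(path.rglob("*.adoc")):
            found.append(path)
    return sorted(found)
```

Calling discover_scan_dirs(".") from the repository root would return the matching directories as sorted Path objects.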
The tool automatically detects your repository type:
- OpenShift-docs style: Uses _topic_maps/*.yml files to determine file references
- Traditional AsciiDoc: Looks for master.adoc files and include:: directives
- Both: The tool always scans for include:: directives in addition to topic maps
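To illustrate the include:: side of this scan, here is a small sketch of how include targets might be collected from AsciiDoc text. It is not the tool's own code: the name included_basenames is hypothetical, matching by file name rather than full path is an assumption, and topic map parsing is not covered. Only directives that start a line are matched, so commented-out includes are ignored here.

```python
import re

# Matches an include directive at the start of a line, for example:
#   include::modules/con-example.adoc[leveloffset=+1]
INCLUDE_RE = re.compile(r"^include::(?P<target>[^\[\s]+)\[", re.MULTILINE)


def included_basenames(text):
    """Return the set of file names referenced by include:: directives in text."""
    # Assumption: references are compared by file name, not by full path.
    return {m.group("target").rsplit("/", 1)[-1] for m in INCLUDE_RE.finditer(text)}
```

Any .adoc file whose name never appears in the combined result for all scanned files would be a candidate for the unused list.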
Installation
After installing the package from PyPI:
pip install rolfedh-doc-utils
You can run the tool from anywhere using:
archive-unused-files [options]
Or, if running from source:
python3 archive_unused_files.py [options]
Usage
See the script’s --help output or the docstring for all options. Common options include:
- --archive: Move the files to a dated zip in the archive directory.
- --commented: Include files that are referenced only in commented lines in the archive operation.
- --scan-dir: Directory to scan (can be used multiple times). If not specified, the tool auto-discovers all modules and assemblies directories.
- --exclude-dir: Directory to exclude (can be used multiple times).
- --exclude-file: File to exclude (can be used multiple times).
- --exclude-list: Path to a file containing directories or files to exclude, one per line.
Commented References Behavior
Default behavior (no flag):
- Always scans for both uncommented and commented includes when scanning for references
- Files referenced only in commented lines are considered “used” and will NOT be archived
- Generates a detailed report of files referenced only in commented lines
- Report location:
./archive/commented-references-report.txt
With --commented flag:
- Includes files that are referenced only in commented lines in the archive operation
- These files will be treated as unused and archived along with other unused files
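The two modes above can be sketched as a small classification step. This is an illustrative approximation rather than the tool's actual logic: the function classify and its archive_commented parameter are hypothetical names, only // line comments are recognized (AsciiDoc //// comment blocks are not handled), and references are compared by file name.

```python
import re

# An optional leading "//" marks an AsciiDoc line comment.
REF_RE = re.compile(r"^\s*(?P<comment>//)?\s*include::(?P<target>[^\[\s]+)\[")


def classify(candidates, lines, archive_commented=False):
    """Return (files_to_archive, commented_only_files), both sorted."""
    live, commented = set(), set()
    for line in lines:
        m = REF_RE.match(line)
        if not m:
            continue
        name = m.group("target").rsplit("/", 1)[-1]
        (commented if m.group("comment") else live).add(name)
    candidates = set(candidates)
    # Referenced in comments but never in a live include.
    commented_only = (commented - live) & candidates
    # Not referenced anywhere, commented or not.
    unused = candidates - live - commented
    to_archive = (unused | commented_only) if archive_commented else unused
    return sorted(to_archive), sorted(commented_only)
```

With archive_commented left at False, commented-only files stay out of the archive list and appear only in the second return value, which corresponds to the report described above.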
Examples
Basic Usage (Auto-Discovery)
Preview what would be archived (dry run):
archive-unused-files
# Automatically discovers all modules and assemblies directories
# Shows: Auto-discovered directories to scan:
# - ./downstream/modules
# - ./downstream/assemblies
Archive unused files:
archive-unused-files --archive
Specifying Directories
Scan specific directories only:
archive-unused-files --scan-dir ./content/modules --scan-dir ./content/assemblies
Exclusions
Exclude specific directories:
archive-unused-files --archive --exclude-dir ./modules/legacy --exclude-dir ./modules/wip
Use an exclusion list file:
archive-unused-files --archive --exclude-list .docutils-ignore
Example exclusion list file (.docutils-ignore):
# Directories to exclude
./modules/archived/
./modules/wip/
# Specific files to exclude
./assemblies/draft.adoc
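For readers curious how such a file might be interpreted, the following sketch parses exclusion-list text into directories and files. It is a guess at reasonable behavior, not the tool's real parser: parse_exclusion_list is a hypothetical name, and treating a trailing slash as the marker for a directory is an assumption (the tool may check the filesystem instead).

```python
def parse_exclusion_list(text):
    """Split exclusion-list text into (directories, files)."""
    dirs, files = [], []
    for raw in text.splitlines():
        entry = raw.strip()
        if not entry or entry.startswith("#"):
            continue  # skip blank lines and comment lines
        # Assumption: a trailing slash marks a directory entry.
        if entry.endswith("/"):
            dirs.append(entry.rstrip("/"))
        else:
            files.append(entry)
    return dirs, files
```

Applied to the example file above, this would yield two directory entries and one file entry.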
Working with Commented References
Preview and generate report on commented-only references:
archive-unused-files
# This will:
# 1. Find unused files (files referenced only in commented includes count as "used")
# 2. Generate a report of files referenced only in commented lines
# 3. Save report to ./archive/commented-references-report.txt
Archive files including those with commented-only references:
archive-unused-files --archive --commented
# This will archive both:
# 1. Files with no references at all
# 2. Files referenced only in commented lines
Output
- Prints unused files to the terminal.
- Creates a manifest file in the archive/ directory.
- Optionally creates a zip archive of unused files.
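As a rough picture of the archive step, the sketch below writes a manifest and a dated zip, then deletes the originals. It is an assumption-laden illustration: the function archive_files, the unused-files-DATE file names, and the one-path-per-line manifest format are all hypothetical and may not match what the tool actually produces.

```python
import zipfile
from datetime import date
from pathlib import Path


def archive_files(paths, archive_dir="archive", today=None):
    """Write a manifest and a dated zip of paths, then delete the originals."""
    stamp = (today or date.today()).isoformat()
    out = Path(archive_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Hypothetical file names; the real tool's naming may differ.
    manifest = out / f"unused-files-{stamp}.txt"
    zip_path = out / f"unused-files-{stamp}.zip"
    manifest.write_text("".join(f"{p}\n" for p in paths))
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for p in paths:
            zf.write(p)
    # Delete only after the zip has been fully written and closed.
    for p in paths:
        Path(p).unlink()
    return manifest, zip_path
```

Deleting only after the zip is closed means a failure while writing the archive leaves the source files in place.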
See the main README.md for more details on installation and usage as a package.