PDF Merging and Splitting Reference
This reference covers merging and splitting operations. Read it when the user asks to combine PDFs, merge files, reorder pages, split a PDF into parts, or extract specific pages as a new PDF.
Basic merge
Combine multiple PDFs into one, in the order provided:
```bash
python3 ${CLAUDE_SKILL_DIR}/scripts/merge.py \
  combined.pdf \
  chapter1.pdf \
  chapter2.pdf \
  chapter3.pdf
```
The output file is the first argument. Input files follow in the desired order. The script handles any number of input files.
Page ranges within a merge
To include only specific pages from an input file, append a colon and a range:
```bash
python3 ${CLAUDE_SKILL_DIR}/scripts/merge.py \
  report.pdf \
  cover.pdf \
  "body.pdf:2-15" \
  appendix.pdf
```
Range formats:
| Format | Meaning |
|---|---|
| file.pdf:3 | Page 3 only |
| file.pdf:2-8 | Pages 2 through 8 inclusive |
| file.pdf:1,4,7 | Pages 1, 4, and 7 |
| file.pdf:5- | Page 5 through the last page |

Pages are 1-indexed.
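The range formats above map onto pypdf's 0-based page indices as in the following sketch. The function name `parse_page_range` is hypothetical and this is not the parser used by merge.py; it only illustrates how 1-indexed, inclusive ranges translate to indices. Accepting mixed forms such as "1,4-6" is an assumption of the sketch, not a documented feature of the script.

```python
def parse_page_range(spec: str, page_count: int) -> list[int]:
    """Turn a range such as "3", "2-8", "1,4,7", or "5-" into 0-based page indices."""
    indices: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = int(start_text)
            # An open-ended range such as "5-" runs through the last page.
            end = int(end_text) if end_text else page_count
        else:
            start = end = int(part)
        if not 1 <= start <= end <= page_count:
            raise ValueError(f"range {part!r} is outside 1..{page_count}")
        # Ranges are 1-indexed and inclusive; pypdf page lists are 0-indexed.
        indices.extend(range(start - 1, end))
    return indices
```

For example, `parse_page_range("2-8", 10)` yields the indices 1 through 7, which can be used directly with `reader.pages[i]`.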
Reordering pages within a single PDF
To reorder pages in an existing PDF, treat it as a merge with page range selections:
```bash
# Reverse all pages of a 10-page PDF
python3 ${CLAUDE_SKILL_DIR}/scripts/merge.py \
  reversed.pdf \
  "original.pdf:10" \
  "original.pdf:9" \
  "original.pdf:8" \
  "original.pdf:7" \
  "original.pdf:6" \
  "original.pdf:5" \
  "original.pdf:4" \
  "original.pdf:3" \
  "original.pdf:2" \
  "original.pdf:1"
```
For large reordering operations, use pypdf directly:
```python
import pypdf

reader = pypdf.PdfReader("original.pdf")
writer = pypdf.PdfWriter()

# Example: move the last page to the front
order = [len(reader.pages) - 1] + list(range(len(reader.pages) - 1))
for i in order:
    writer.add_page(reader.pages[i])

with open("reordered.pdf", "wb") as f:
    writer.write(f)
```
Adding bookmarks (outlines)
Bookmarks help readers navigate a merged document. After merging, add top-level bookmarks for each section:
```python
import pypdf

reader = pypdf.PdfReader("merged.pdf")
writer = pypdf.PdfWriter()
writer.append(reader)

# Add bookmarks (page indices are 0-based here)
writer.add_outline_item("Introduction", 0)
writer.add_outline_item("Chapter 1", 2)
writer.add_outline_item("Chapter 2", 15)
writer.add_outline_item("Appendix", 42)

with open("merged-with-bookmarks.pdf", "wb") as f:
    writer.write(f)
```
Nested bookmarks (sub-sections) use the parent parameter:
```python
# Keep the returned item so it can serve as the parent; use this in place of
# the flat "Chapter 1" call above to avoid a duplicate entry.
ch1 = writer.add_outline_item("Chapter 1", 2)
writer.add_outline_item("1.1 Background", 3, parent=ch1)
writer.add_outline_item("1.2 Methods", 7, parent=ch1)
```
Splitting a PDF into individual pages
To extract every page as its own file:
```bash
python3 ${CLAUDE_SKILL_DIR}/scripts/merge.py --split original.pdf
```
This creates original-page-001.pdf, original-page-002.pdf, etc., in the current directory. The zero-padded numbering ensures correct alphabetical sort order for up to 999 pages.
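Three-digit padding stops sorting correctly once a document reaches 1000 pages. When splitting with your own code, one option is to derive the padding width from the page count, as in this sketch. The helper name `page_filename` is hypothetical and the widening behavior is an assumption for illustration; it does not describe what merge.py does beyond 999 pages.

```python
def page_filename(stem: str, page_number: int, page_count: int) -> str:
    """Build a per-page filename whose padding is wide enough for page_count."""
    # Keep at least three digits to match the original-page-001.pdf pattern,
    # and widen only when the document has 1000 pages or more.
    width = max(3, len(str(page_count)))
    return f"{stem}-page-{page_number:0{width}d}.pdf"
```

With every name padded to the same width, alphabetical order and page order agree regardless of document length.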
For custom naming or a specific output directory, use pypdf directly:
```python
import pypdf
from pathlib import Path

reader = pypdf.PdfReader("original.pdf")
out_dir = Path("pages")
out_dir.mkdir(exist_ok=True)

# Write each page to its own file, numbering from 1
for i, page in enumerate(reader.pages, 1):
    writer = pypdf.PdfWriter()
    writer.add_page(page)
    out_path = out_dir / f"page-{i:03d}.pdf"
    with open(out_path, "wb") as f:
        writer.write(f)
    print(f"Wrote {out_path}")
```
Splitting at a page boundary
To split a 50-page PDF at page 20 (pages 1-20 in part A, 21-50 in part B):
```bash
python3 ${CLAUDE_SKILL_DIR}/scripts/merge.py \
  part-a.pdf \
  "document.pdf:1-20"

python3 ${CLAUDE_SKILL_DIR}/scripts/merge.py \
  part-b.pdf \
  "document.pdf:21-50"
```
Handling encrypted PDFs
Encrypted PDFs require the user password to open. pypdf needs it before merging:
```python
import pypdf

reader = pypdf.PdfReader("locked.pdf", password="the-password")
writer = pypdf.PdfWriter()
writer.append(reader)

# The output will NOT be encrypted by default
with open("unlocked.pdf", "wb") as f:
    writer.write(f)
```
If the user needs the merged output to keep the original encryption, pypdf alone does not reliably support this; inform the user.
Preserving metadata
By default pypdf does not copy metadata from the input files to the merged output. To copy metadata from the first input file:
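```python
import pypdf

reader = pypdf.PdfReader("first.pdf")
writer = pypdf.PdfWriter()
# ... append pages ...

# reader.metadata is None when the input has no document information
if reader.metadata:
    writer.add_metadata(reader.metadata)

with open("merged.pdf", "wb") as f:
    writer.write(f)
```
Common errors and fixes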
| Error | Cause | Fix |
|---|---|---|
| ModuleNotFoundError: No module named 'pypdf' | Library not installed | pip install pypdf |
| FileNotFoundError | Input file path wrong | Verify each path with ls |
| pypdf.errors.FileNotDecryptedError | Encrypted input | Provide password as shown above |
| Output PDF is larger than expected | Duplicate embedded fonts/images | Normal for merging unrelated PDFs; not fixable without re-distilling |
| Bookmarks missing in merged output | Source bookmarks not copied | Copy them explicitly with writer.add_outline_item |
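
The FileNotFoundError row can also be handled before running the merge by checking every input path up front. The sketch below is one way to do that, assuming inputs are written as in this reference (a path optionally followed by a colon and a page range). The name `missing_inputs` and the suffix pattern are hypothetical and are not part of merge.py.

```python
import re
from pathlib import Path

# Matches a trailing page-range suffix such as ":3", ":2-8", ":1,4,7", or ":5-".
_RANGE_SUFFIX = re.compile(r":[\d,\-]+$")


def missing_inputs(specs: list[str]) -> list[str]:
    """Return the input paths (range suffix removed) that are not existing files."""
    missing = []
    for spec in specs:
        path = _RANGE_SUFFIX.sub("", spec)
        if not Path(path).is_file():
            missing.append(path)
    return missing
```

An empty result means every input exists; otherwise the returned paths are the ones to correct before merging.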