Commit 03ea87b

docs: Add ImageMagick policy error troubleshooting guide
Add troubleshooting section for PDF conversion failures with embedded images due to ImageMagick security policy restrictions. Fixes #1604
1 parent c6308dc commit 03ea87b

1 file changed, +32 -0 lines changed

README.md

Lines changed: 32 additions & 0 deletions
@@ -74,6 +74,38 @@ cd markitdown
pip install -e 'packages/markitdown[all]'
```

## Troubleshooting

### ImageMagick Policy Error when Converting PDFs with Images

If you encounter an error like `PolicyError: not authorized PDF at error/constitute.c/ReadImage/1243` when converting PDFs that contain embedded images, the cause is ImageMagick's security policy. Some distribution packages (for example on Debian and Ubuntu) ship a `policy.xml` that disables PDF processing by default.
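
To confirm that the policy file is the culprit before editing anything, the check can be sketched in Python. This is a minimal illustration, not part of markitdown: the function name `pdf_is_blocked` and the sample strings are hypothetical, and the check only looks for a `coder` entry that denies all rights to PDF. It does not model ImageMagick's full policy resolution.

```python
import xml.etree.ElementTree as ET


def pdf_is_blocked(policy_xml: str) -> bool:
    """Return True if any coder policy denies all rights for PDF.

    Hypothetical helper for illustration; entries that are commented
    out in the XML are ignored by the parser, as ImageMagick ignores them.
    """
    root = ET.fromstring(policy_xml)
    for policy in root.iter("policy"):
        if policy.get("domain") != "coder":
            continue
        # A pattern may be a single name ("PDF") or a brace list ("{PS,PDF}").
        names = policy.get("pattern", "").strip("{}").split(",")
        is_pdf = "PDF" in (name.strip().upper() for name in names)
        if is_pdf and policy.get("rights") == "none":
            return True
    return False


# Assumed sample policies, shaped like the lines shown in this guide.
SAMPLE_BLOCKED = """<policymap>
  <policy domain="coder" rights="none" pattern="PDF" />
</policymap>"""

SAMPLE_ALLOWED = """<policymap>
  <policy domain="coder" rights="read|write" pattern="PDF" />
</policymap>"""

print(pdf_is_blocked(SAMPLE_BLOCKED))  # True
print(pdf_is_blocked(SAMPLE_ALLOWED))  # False
```

To run it against a real system, read the text of your own policy file (for example one of the paths listed in the solution) and pass it to the function.
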

**Solution:**

1. **Edit ImageMagick's policy file** (typically at `/etc/ImageMagick-6/policy.xml` or `/etc/ImageMagick/policy.xml`):

```bash
# Comment out the PDF restriction line (adjust the path if your policy file lives elsewhere)
sudo sed -i 's/<policy domain="coder" rights="none" pattern="PDF" \/>/<!-- <policy domain="coder" rights="none" pattern="PDF" \/> -->/' /etc/ImageMagick-6/policy.xml
```

Or edit manually to change rights from "none" to "read|write":

```xml
<policy domain="coder" rights="read|write" pattern="PDF" />
```
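
2. **Verify the change** - ImageMagick is a library and a set of command-line tools, not a system service, so there is no `imagemagick` unit to restart. New `markitdown` runs pick up the edited policy automatically; only long-running applications that already loaded ImageMagick need to be restarted. To confirm the active policy:

```bash
# Print the policy ImageMagick actually loads, to confirm the PDF restriction is gone
identify -list policy
```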

3. **Alternative: Use PyMuPDF directly** - If you can't modify system policies, the `markitdown-ocr` plugin can use PyMuPDF directly to extract images without ImageMagick.

For more details, see ImageMagick's [Security Policy](https://imagemagick.org/script/security-policy.php) documentation.
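
The manual edit from step 1 (changing rights from "none" to "read|write") can also be sketched in Python, which tolerates extra whitespace that the exact-match `sed` pattern would miss. This is an illustrative sketch under stated assumptions: `allow_pdf` is a hypothetical helper, and it assumes the attributes appear in the order `domain`, `rights`, `pattern`, as in the line shown above. It only transforms text; writing the result back to the policy file (after making a backup) is left to the reader.

```python
import re

# Matches a coder policy for PDF whose rights are "none".
# Assumption: attributes appear in the order domain, rights, pattern.
_PDF_DENY = re.compile(
    r'(<policy\s+domain="coder"\s+rights=")none("\s+pattern="PDF"\s*/>)'
)


def allow_pdf(policy_xml: str) -> str:
    """Return the policy text with PDF coder rights set to read|write."""
    return _PDF_DENY.sub(r"\1read|write\2", policy_xml)


before = '<policy domain="coder" rights="none" pattern="PDF" />'
print(allow_pdf(before))
# <policy domain="coder" rights="read|write" pattern="PDF" />
```

Lines for other coders are left untouched, so restrictions you did not intend to lift stay in place.
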

## Usage

### Command-Line
