
关于
使用 Nutrient DWS API 处理、转换、OCR、提取、脱敏、签名和填写文档。支持 PDF、DOCX、XLSX、PPTX、HTML 和图像
name: nutrient-document-processing description: 使用 Nutrient DWS API 处理、转换、OCR、提取、脱敏、签名和填写文档。支持 PDF、DOCX、XLSX、PPTX、HTML 和图片。 origin: ECC
Nutrient 文档处理
注意: 此技能集成了 Nutrient 商业 API。使用前请查阅其条款。
使用 Nutrient DWS Processor API 处理文档。转换格式、提取文本和表格、OCR 扫描文档、脱敏 PII、添加水印、数字签名和填写 PDF 表单。
设置
在 nutrient.io 获取免费 API 密钥
export NUTRIENT_API_KEY="pdf_live_..."
所有请求以 multipart POST 形式发送到 https://api.nutrient.io/build,包含一个 instructions JSON 字段。
操作
转换文档
# DOCX to PDF
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "document.docx=@document.docx" \
-F 'instructions={"parts":[{"file":"document.docx"}]}' \
-o output.pdf
# PDF to DOCX
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "document.pdf=@document.pdf" \
-F 'instructions={"parts":[{"file":"document.pdf"}],"output":{"type":"docx"}}' \
-o output.docx
# HTML to PDF
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "index.html=@index.html" \
-F 'instructions={"parts":[{"html":"index.html"}]}' \
-o output.pdf
支持的输入格式:PDF、DOCX、XLSX、PPTX、DOC、XLS、PPT、PPS、PPSX、ODT、RTF、HTML、JPG、PNG、TIFF、HEIC、GIF、WebP、SVG、TGA、EPS。
提取文本和数据
# Extract plain text
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "document.pdf=@document.pdf" \
-F 'instructions={"parts":[{"file":"document.pdf"}],"output":{"type":"text"}}' \
-o output.txt
# Extract tables as Excel
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "document.pdf=@document.pdf" \
-F 'instructions={"parts":[{"file":"document.pdf"}],"output":{"type":"xlsx"}}' \
-o tables.xlsx
OCR 扫描文档
# OCR to searchable PDF (supports 100+ languages)
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "scanned.pdf=@scanned.pdf" \
-F 'instructions={"parts":[{"file":"scanned.pdf"}],"actions":[{"type":"ocr","language":"english"}]}' \
-o searchable.pdf
语言支持:通过 ISO 639-2 代码支持 100+ 种语言(如 eng、deu、fra、spa、jpn、kor、chi_sim、chi_tra、ara、hin、rus)。也支持完整语言名称如 english 或 german。查看完整 OCR 语言支持表了解所有支持的代码。
脱敏敏感信息
# Pattern-based (SSN, email)
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "document.pdf=@document.pdf" \
-F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"redaction","strategy":"preset","strategyOptions":{"preset":"social-security-number"}},{"type":"redaction","strategy":"preset","strategyOptions":{"preset":"email-address"}}]}' \
-o redacted.pdf
# Regex-based
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "document.pdf=@document.pdf" \
-F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"redaction","strategy":"regex","strategyOptions":{"regex":"\\b[A-Z]{2}\\d{6}\\b"}}]}' \
-o redacted.pdf
预设模式:social-security-number、email-address、credit-card-number、international-phone-number、north-american-phone-number、date、time、url、ipv4、ipv6、mac-address、us-zip-code、vin。
添加水印
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "document.pdf=@document.pdf" \
-F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"watermark","text":"CONFIDENTIAL","fontSize":72,"opacity":0.3,"rotation":-45}]}' \
-o watermarked.pdf
数字签名
# Self-signed CMS signature
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "document.pdf=@document.pdf" \
-F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"sign","signatureType":"cms"}]}' \
-o signed.pdf
填写 PDF 表单
curl -X POST https://api.nutrient.io/build \
-H "Authorization: Bearer $NUTRIENT_API_KEY" \
-F "form.pdf=@form.pdf" \
-F 'instructions={"parts":[{"file":"form.pdf"}],"actions":[{"type":"fillForm","formFields":{"name":"Jane Smith","email":"jane@example.com","date":"2026-02-06"}}]}' \
-o filled.pdf
MCP 服务器(替代方案)
如需原生工具集成,请使用 MCP
兼容工具
Claude CodeCursor
标签
前端开发
