Developer Tools

PDF Document Processor

PDF Document Processing

Parses PDF documents and extracts text, tables, and images. Supports batch processing and includes templates for energy industry reports.

PDF, documents, extraction, tables, energy, reports
Install Command: npx openclaw skills install pdf-processor
Version: 1.0.0
Author: shuzhihui
Updated: 2026-04-11

Features

  • Text extraction with layout preservation
  • Table detection and structured output (CSV/Excel/JSON)
  • Image extraction from PDF pages
  • Batch processing for multiple files
  • Energy report-specific templates

Use Cases

  • **Research Team**: Extract data from industry research reports
  • **Archive Management**: Convert paper documents to a searchable digital format
  • **Data Analysis**: Pull tables from financial/technical reports

Installation

    npx openclaw skills install pdf-processor

Usage Examples

    # Extract text from PDF
    pdf-processor extract --input report.pdf --format text

    # Extract tables to CSV
    pdf-processor extract --input report.pdf --format table --output data.csv

    # Batch process directory
    pdf-processor batch --input ./reports/ --format text

    # Energy report specific extraction
    pdf-processor extract --input energy-report.pdf --template energy --include charts
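
If a separate CSV per report is needed, one option is to loop over the `extract` command shown above instead of relying on `batch`, whose output options are not documented here. The following is a minimal sketch under that assumption. The function name `extract_tables_dir` and the `PDF_PROCESSOR` override variable are hypothetical helpers, not part of the tool.

```shell
# Sketch: write one CSV per PDF by looping over the documented `extract` command.
# PDF_PROCESSOR is a hypothetical override (not a tool feature) so the command
# can be swapped for a wrapper or a stub.
extract_tables_dir() {
  local in_dir="$1" out_dir="$2"
  local cmd="${PDF_PROCESSOR:-pdf-processor}"
  local pdf name failed=0

  mkdir -p "$out_dir" || return 1

  for pdf in "$in_dir"/*.pdf; do
    # An unmatched glob stays literal; skip it when the directory has no PDFs.
    [ -e "$pdf" ] || continue
    name="$(basename "$pdf" .pdf)"
    # Only flags that appear in the usage examples are used here.
    if ! "$cmd" extract --input "$pdf" --format table --output "$out_dir/$name.csv"; then
      echo "failed: $pdf" >&2
      failed=1
    fi
  done

  # Non-zero if any file failed, so callers can detect partial runs.
  return "$failed"
}
```

Calling `extract_tables_dir ./reports ./tables` would then leave `./tables/<name>.csv` next to each processed report name, and a failure on one file does not stop the remaining files from being processed.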

Supported Formats

| Output | Description |
|--------|-------------|
| text | Plain text with paragraph structure |
| markdown | Markdown-formatted output |
| csv | Structured table data |
| json | JSON with metadata and content |
| excel | Multi-sheet Excel workbook |
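
After extracting a table to CSV, a quick check that every row has the same number of fields can catch a misdetected table early. The helper below is a rough sketch, not part of pdf-processor. The name `csv_field_counts` is hypothetical, and the check splits on every comma, so quoted fields that contain commas will be miscounted.

```shell
# Sketch: print "<field count> <number of rows>" for each distinct field count.
# Naive comma split; does not understand quoted fields.
csv_field_counts() {
  awk -F',' '{ counts[NF]++ } END { for (n in counts) print n, counts[n] }' "$1" | sort -n
}
```

Running `csv_field_counts data.csv` on a clean table should print a single line. More than one line suggests ragged rows worth inspecting by hand.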