Wordpress Content Extractor

Pricing

from $3.00 / 1,000 results

📝 Extract complete content from WordPress sites including posts, categories, and metadata. Perfect for content migration, blog aggregation, and CMS integration.

Rating

0.0

(0)

Developer

SimplifySME Toolbox

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 13
  • Monthly active users: 1
  • Last modified: 4 months ago

📺 What It Extracts

  • Site Metadata: Site name, description, and URL
  • Posts: All posts with full content, metadata, and media
  • Categories: All post categories with metadata
  • Statistics: Total post and category counts

🚀 Key Features

| Feature | Description |
| --- | --- |
| 📝 Complete Content | Extracts all posts with full content and metadata |
| 🏷️ Category Support | Extracts all categories and their relationships |
| 🖼️ Media Extraction | Includes featured images and media URLs |
| 📊 Structured Output | Clean JSON format with nested post and category data |
| Fast Performance | Direct API access for quick data retrieval |
| 🔄 Pagination Support | Handles large sites with configurable post limits |

📥 Input

Required

  • siteUrl (string): The WordPress site URL
    • Example: "https://example.com"
    • Supports any WordPress site with the REST API enabled

Optional

  • maxPosts (integer, default: 100): Maximum number of posts to extract
    • Example: 200
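
The input fields above can be checked before a run is started. The following is a minimal sketch, assuming a hypothetical helper named `build_input`. The field names `siteUrl` and `maxPosts` and the default of 100 come from the input description; the validation rules are assumptions for illustration, not the actor's own checks.

```python
from urllib.parse import urlparse


def build_input(site_url: str, max_posts: int = 100) -> dict:
    """Build an input object for the actor (hypothetical helper, not part of the actor)."""
    parsed = urlparse(site_url)
    # Assumption: only absolute http(s) URLs are meaningful as siteUrl.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"siteUrl must be an absolute http(s) URL, got {site_url!r}")
    # Assumption: maxPosts must be a positive integer (bool is rejected explicitly).
    if isinstance(max_posts, bool) or not isinstance(max_posts, int) or max_posts < 1:
        raise ValueError("maxPosts must be a positive integer")
    # Drop a trailing slash so the value has one canonical form.
    return {"siteUrl": site_url.rstrip("/"), "maxPosts": max_posts}


print(build_input("https://example.com", 200))
```

Rejecting a bad URL locally avoids paying for a run that cannot succeed.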

📤 Output

Returns comprehensive WordPress content data:

Site Metadata

{
  "site": {
    "name": "Site Name",
    "description": "Site description...",
    "url": "https://example.com"
  }
}

Posts Array

{
  "posts": [
    {
      "id": 123,
      "title": "Post Title",
      "content": "Post content...",
      "excerpt": "Post excerpt...",
      "author": "Author Name",
      "date": "2024-01-01T00:00:00Z",
      "modified": "2024-01-15T00:00:00Z",
      "slug": "post-slug",
      "status": "publish",
      "link": "https://example.com/post-slug",
      "categories": [1, 2],
      "tags": [3, 4],
      "featuredMedia": 456,
      "featuredImageUrl": "https://example.com/image.jpg"
    }
  ],
  "totalPosts": 100
}

Categories Array

{
  "categories": [
    {
      "id": 1,
      "name": "Category Name",
      "slug": "category-slug",
      "description": "Category description...",
      "count": 25
    }
  ],
  "totalCategories": 10
}
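
Posts reference categories by numeric ID, so a consumer usually needs to join the two arrays to get readable category names. The sketch below shows one way to do that. It assumes `posts` and `categories` arrive as keys of a single result object, which the samples above suggest but do not confirm. The function name `attach_category_names` and the added `categoryNames` field are hypothetical.

```python
def attach_category_names(output: dict) -> list[dict]:
    """Return a copy of each post with a 'categoryNames' list added (hypothetical field)."""
    # Build an ID -> name lookup from the categories array.
    names_by_id = {c["id"]: c["name"] for c in output.get("categories", [])}
    enriched = []
    for post in output.get("posts", []):
        # IDs with no matching category entry are skipped instead of raising.
        names = [names_by_id[cid] for cid in post.get("categories", []) if cid in names_by_id]
        enriched.append({**post, "categoryNames": names})
    return enriched


sample = {
    "posts": [{"id": 123, "title": "Post Title", "categories": [1, 2]}],
    "categories": [{"id": 1, "name": "Category Name"}],
}
for post in attach_category_names(sample):
    print(post["title"], post["categoryNames"])
```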

💡 Use Cases

  • Blog Aggregation - Collect content from multiple WordPress sites
  • Content Migration - Extract content for platform migration
  • Content Research - Analyze blog content and topics
  • CMS Integration - Import WordPress content into other systems
  • Content Analysis - Study content patterns and categories
  • Backup & Archive - Create backups of WordPress content

⚙️ Technical Details

  • Extraction Method: Direct requests to the WordPress REST API
  • REST API: Endpoints are usually available under /wp-json/wp/v2/
  • Pagination: Handles pagination for large sites with configurable limits
  • Error Handling: Validates responses and handles missing data gracefully
  • Performance: Fast API-based extraction without browser overhead
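
To illustrate how paginated extraction with a post limit can work against the REST API, here is a minimal sketch. It is not the actor's actual implementation. The `per_page` and `page` query parameters and the cap of 100 items per page are standard WordPress REST API behavior. The function names and the caller-supplied `fetch_json` callable are hypothetical.

```python
from typing import Callable
from urllib.parse import urlencode

PER_PAGE = 100  # WordPress caps per_page at 100 on collection endpoints


def posts_page_url(site_url: str, page: int, per_page: int = PER_PAGE) -> str:
    """Build the URL for one page of the posts collection."""
    query = urlencode({"per_page": per_page, "page": page})
    return f"{site_url.rstrip('/')}/wp-json/wp/v2/posts?{query}"


def collect_posts(site_url: str, max_posts: int, fetch_json: Callable[[str], list]) -> list:
    """Fetch pages until max_posts is reached or the site runs out of posts.

    fetch_json is a hypothetical callable that returns the parsed JSON array
    for a URL, and an empty list when the page does not exist.
    """
    posts: list = []
    page = 1
    while len(posts) < max_posts:
        batch = fetch_json(posts_page_url(site_url, page))
        if not batch:
            break
        posts.extend(batch)
        if len(batch) < PER_PAGE:
            break  # a short page means this was the last one
        page += 1
    # The page size stays constant so offsets line up; trim the overshoot here.
    return posts[:max_posts]
```

A production client could instead read the `X-WP-TotalPages` response header to learn the page count up front, since WordPress answers a request past the last page with an error rather than an empty array.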

📝 Example Usage

Basic Extraction

{
  "siteUrl": "https://example.com"
}

With Post Limit

{
  "siteUrl": "https://example.com",
  "maxPosts": 200
}
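
An input like the ones above can also be submitted programmatically. The sketch below only builds the HTTP request and does not send it. It assumes the Apify API's synchronous `run-sync-get-dataset-items` endpoint. The helper name `build_run_request` is hypothetical, and the actor ID is a parameter because this page does not state it.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Prepare a POST that runs an actor and returns its dataset items (hypothetical helper)."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# "username~actor-name" is a placeholder, not this actor's real ID.
request = build_run_request(
    "username~actor-name",
    "YOUR_APIFY_TOKEN",
    {"siteUrl": "https://example.com", "maxPosts": 200},
)
print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` and decoding the response body as JSON would yield the extracted items.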

⚠️ Important Notes

  • This actor uses WordPress REST API endpoints
  • The REST API must be enabled on the WordPress site (it usually is by default)
  • Some WordPress sites have the REST API disabled or restricted, in which case extraction may fail or return incomplete data
  • Featured images are extracted when available
  • Categories and tags are included with their IDs and metadata
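
Because everything depends on the REST API being reachable, it can help to inspect a site's API index (the JSON served at /wp-json/) before starting a run. The sketch below takes an already-parsed index response and maps it to the `site` shape shown in the Output section. The index fields `name`, `description`, `url`, and `namespaces` are standard WordPress. The function name is hypothetical, and whether the actor derives site metadata this way is an assumption.

```python
def site_metadata_from_index(index: dict) -> dict:
    """Map a parsed /wp-json/ index response to the 'site' output shape (hypothetical helper)."""
    # The core content endpoints live in the wp/v2 namespace.
    if "wp/v2" not in index.get("namespaces", []):
        raise RuntimeError("wp/v2 namespace not advertised; the REST API may be disabled or restricted")
    return {
        "name": index.get("name", ""),
        "description": index.get("description", ""),
        "url": index.get("url", ""),
    }


index = {
    "name": "Site Name",
    "description": "Site description...",
    "url": "https://example.com",
    "namespaces": ["oembed/1.0", "wp/v2"],
}
print(site_metadata_from_index(index))
```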