# Appsumo Scraper (`scraper-mind/appsumo-scraper`) Actor

This AppSumo scraper efficiently handles bulk extraction of AppSumo product data including AppSumo prices, reviews, ratings, product descriptions, and comprehensive AppSumo deal information.

- **URL**: https://apify.com/scraper-mind/appsumo-scraper.md
- **Developed by:** [Scraper Mind](https://apify.com/scraper-mind) (community)
- **Categories:** Automation, Developer tools, Other
- **Stats:** 14 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

$5.00/month + usage

To use this Actor, you pay a monthly rental fee to the developer. The rent is subtracted from your prepaid usage every month after the free trial period. You also pay for Apify platform usage, which gets cheaper the higher your Apify subscription plan.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#rental-actors

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

### AppSumo Products Scraper

A robust, production-grade Apify Actor that extracts detailed product information from AppSumo product pages. This scraper efficiently handles bulk extraction of product data including prices, reviews, ratings, descriptions, and more.

### Why Choose Our AppSumo Products Scraper?

🚀 **Production-Ready**: Built with enterprise-grade reliability and error handling  
🔄 **Smart Proxy Management**: Automatic fallback to residential proxies when blocked  
⚡ **High Performance**: Concurrent processing with intelligent rate limiting  
💾 **Live Data Saving**: Data is saved immediately as it's processed (crash-safe)  
📊 **Comprehensive Data**: Extracts all key product metrics and details  
🛡️ **Anti-Detection**: Random delays and residential proxy support  
🔧 **Flexible Configuration**: Customizable retry logic and request timing  

### Key Features

- **Bulk URL Processing**: Process multiple AppSumo product URLs simultaneously
- **Smart Proxy Handling**: Starts with direct requests, automatically falls back to residential proxies when blocked
- **Robust Error Handling**: 3-tier retry system with exponential backoff
- **Real-time Data Saving**: Products are saved to dataset as soon as they're processed
- **Comprehensive Product Data**: Extracts names, prices, reviews, ratings, descriptions, and metadata
- **Rate Limiting Protection**: Built-in delays and request throttling to avoid blocks
- **Detailed Logging**: Real-time progress updates and comprehensive error reporting
- **High Success Rate**: Designed to handle AppSumo's anti-bot measures effectively

### Input

The Actor accepts the following input parameters:

```json
{
  "startUrls": [
    { "url": "https://appsumo.com/products/pinreel-animated-video-maker/" },
    { "url": "https://appsumo.com/products/kopify-ai/" }
  ],
  "proxyConfiguration": {
    "useApifyProxy": false
  },
  "maxRetries": 3,
  "requestDelay": 1
}
````

#### Input Parameters

- **startUrls** (required): Array of AppSumo product URLs to scrape
- **proxyConfiguration**: Proxy settings (defaults to no proxy with automatic fallback)
- **maxRetries**: Number of retry attempts for failed requests (1-10, default: 3)
- **requestDelay**: Base delay between requests in seconds (0-10, default: 1)
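The ranges above can be checked on the client side before a run is started. The following is a minimal sketch, not the Actor's own validation code: `validate_input` is a hypothetical helper, and the host check against `appsumo.com` is an assumption based on the example URLs.

```python
from urllib.parse import urlparse

# Hosts accepted by this sketch (assumption based on the example URLs).
ALLOWED_HOSTS = {"appsumo.com", "www.appsumo.com"}


def validate_input(actor_input: dict) -> dict:
    """Return a normalized copy of the input, or raise ValueError."""
    start_urls = actor_input.get("startUrls")
    if not start_urls:
        raise ValueError("startUrls is required and must not be empty")

    for entry in start_urls:
        parsed = urlparse(entry.get("url", ""))
        if parsed.scheme not in ("http", "https") or parsed.hostname not in ALLOWED_HOSTS:
            raise ValueError(f"not an AppSumo URL: {entry!r}")

    # Defaults and ranges as documented in the parameter list.
    max_retries = actor_input.get("maxRetries", 3)
    request_delay = actor_input.get("requestDelay", 1)
    if not 1 <= max_retries <= 10:
        raise ValueError("maxRetries must be between 1 and 10")
    if not 0 <= request_delay <= 10:
        raise ValueError("requestDelay must be between 0 and 10")

    return {
        "startUrls": start_urls,
        "proxyConfiguration": actor_input.get("proxyConfiguration", {"useApifyProxy": False}),
        "maxRetries": max_retries,
        "requestDelay": request_delay,
    }
```

The returned dictionary can be passed as `run_input` to the Python client call shown in the API section.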

### Output

The Actor outputs structured JSON data for each product:

```json
{
  "url": "https://appsumo.com/products/pinreel-animated-video-maker/",
  "public_name": "PinReel Animated Video Maker",
  "price": "$59",
  "review_count": 124,
  "average_rating": 4.5,
  "comment_count": 89,
  "story_subheader": "Create engaging animated videos in minutes",
  "story_snippet": "Transform your ideas into professional animated videos...",
  "scraped_at": "2025-01-20T10:30:00.000Z",
  "success": true
}
```

#### Output Fields

- **url**: The original AppSumo product URL
- **public\_name**: Product title/name
- **price**: Product price (with currency)
- **review\_count**: Number of product reviews
- **average\_rating**: Average user rating (out of 5)
- **comment\_count**: Number of comments on the product
- **story\_subheader**: Product tagline/subheader
- **story\_snippet**: Product description snippet
- **scraped\_at**: Timestamp when the data was extracted
- **success**: Boolean indicating if the scrape was successful
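Because `price` arrives as a string with a currency symbol and `scraped_at` as an ISO 8601 string, downstream code usually needs to convert both. The sketch below shows one way to do that; `parse_price` and `parse_timestamp` are hypothetical helpers, and the assumption that prices look like `"$59"` comes only from the sample output above.

```python
import re
from datetime import datetime
from typing import Optional


def parse_price(price: Optional[str]) -> Optional[float]:
    """Convert a price string such as "$59" into a float, or None if absent."""
    if not price:
        return None
    # Drop thousands separators, then take the first number in the string.
    match = re.search(r"\d+(?:\.\d+)?", price.replace(",", ""))
    return float(match.group()) if match else None


def parse_timestamp(value: str) -> datetime:
    """Parse the `scraped_at` value into a timezone-aware datetime."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat(),
    # so it is replaced with an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))
```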

### 🚀 How to Use the Actor (via Apify Console)

1. **Log in** at https://console.apify.com and go to **Actors**
2. **Search** for "appsumo-scraper" or navigate to the Actor page
3. **Configure inputs**:
   - Add your AppSumo product URLs in the "startUrls" field
   - Configure proxy settings (recommended: keep default with automatic fallback)
   - Adjust retry attempts and delays if needed
4. **Run the Actor** by clicking the "Start" button
5. **Monitor progress** in real-time through the log output
6. **Access results** in the "Dataset" tab as data is being scraped
7. **Export data** to JSON, CSV, or Excel format when complete

### Best Use Cases

- **Market Research**: Analyze AppSumo product offerings and pricing trends
- **Competitive Intelligence**: Monitor competitor products and pricing strategies
- **Deal Tracking**: Track lifetime deals and their performance metrics
- **Content Creation**: Gather product information for reviews or comparisons
- **Price Monitoring**: Regular monitoring of AppSumo deals and pricing
- **Lead Generation**: Identify popular products and market trends
- **Academic Research**: Study marketplace dynamics and consumer behavior

### Proxy Strategy

The Actor implements a smart proxy strategy:

1. **Default Mode**: Starts with direct requests (no proxy) for fastest speeds
2. **Automatic Fallback**: Switches to residential proxies if requests are blocked
3. **Persistent Proxy**: Once fallback occurs, continues using residential proxy
4. **Retry Logic**: Up to `maxRetries` attempts (3 by default) with exponential backoff
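The four steps above can be sketched as a small retry loop. This is an illustration under stated assumptions, not the Actor's source: `fetch_with_fallback` and its `fetch` callback are hypothetical, the blocked status codes are taken from the Error Handling section, and the doubling backoff schedule is an assumption about what "exponential backoff" means here.

```python
import time
from typing import Callable

# Status codes treated as "blocked" (listed in the Error Handling section).
BLOCKED_STATUSES = {403, 429, 503}


def fetch_with_fallback(
    url: str,
    fetch: Callable[[str, bool], int],
    max_retries: int = 3,
    base_delay: float = 1.0,
    use_proxy: bool = False,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[int, bool]:
    """Try `url` up to `max_retries` times and return (last_status, use_proxy).

    `fetch(url, use_proxy)` is a caller-supplied function returning an HTTP
    status code. The returned flag lets the caller keep proxy mode switched
    on for the remaining URLs (the "persistent proxy" step).
    """
    status = 0
    for attempt in range(max_retries):
        status = fetch(url, use_proxy)
        if status not in BLOCKED_STATUSES:
            return status, use_proxy
        # Blocked: switch to the proxy and keep using it from now on.
        use_proxy = True
        if attempt < max_retries - 1:
            # Assumed backoff schedule: base, 2x base, 4x base, ...
            sleep(base_delay * 2 ** attempt)
    return status, use_proxy
```

Passing `fetch` and `sleep` in as arguments keeps the loop testable without network access or real waiting.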

### Rate Limiting & Performance

- **Concurrent Processing**: Processes multiple URLs simultaneously
- **Smart Rate Limiting**: 5 requests per second with randomized delays
- **Anti-Detection**: Random 0-0.5 second delays added to base request timing
- **Efficient Resource Usage**: Optimized memory and CPU usage for large datasets
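The randomized delay described above amounts to adding a uniform random value between 0 and 0.5 seconds to the `requestDelay` base. A minimal sketch follows; `jittered_delay` is a hypothetical name and the uniform distribution is an assumption, since the README does not say how the random part is drawn.

```python
import random


def jittered_delay(base_delay: float, rng: random.Random) -> float:
    """Return the pause before the next request: base plus 0-0.5 s of jitter."""
    return base_delay + rng.uniform(0.0, 0.5)
```

With the default `requestDelay` of 1, every pause falls between 1.0 and 1.5 seconds.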

### Technical Specifications

- **Runtime**: Python 3.9+ with asyncio for concurrent processing
- **Dependencies**: Apify SDK, HTTPX for async requests, Parsel for HTML parsing
- **Data Extraction**: Advanced JSON parsing from Next.js `__NEXT_DATA__` objects
- **Error Handling**: Comprehensive exception handling with detailed logging
- **Memory Management**: Streaming data processing for large URL lists
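Next.js pages embed their page data as JSON inside a `<script id="__NEXT_DATA__">` tag, which is what the data extraction step relies on. The sketch below pulls that JSON out with the standard library only; it uses a regular expression instead of Parsel to stay dependency-free, and `extract_next_data` is a hypothetical helper. Where the product fields sit inside the parsed payload is not documented here, so the sketch stops at returning the parsed object.

```python
import json
import re
from typing import Optional

# Matches the script tag that Next.js uses to embed page data.
NEXT_DATA_RE = re.compile(
    r'<script[^>]*id="__NEXT_DATA__"[^>]*>(.*?)</script>',
    re.DOTALL,
)


def extract_next_data(html: str) -> Optional[dict]:
    """Return the parsed __NEXT_DATA__ payload, or None if missing or invalid."""
    match = NEXT_DATA_RE.search(html)
    if match is None:
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        return None
```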

### Frequently Asked Questions

**Q: How many URLs can I process at once?**
A: The Actor can handle hundreds of URLs efficiently. For very large lists (1000+), consider splitting them into smaller batches for optimal performance.

**Q: What happens if some URLs fail?**
A: Failed URLs are retried up to `maxRetries` times (3 by default) with different strategies (proxy fallback, delays). All successfully scraped data is saved even if some URLs fail.

**Q: Do I need to configure proxies?**
A: No configuration needed! The Actor automatically handles proxy management. It starts without proxies for speed, then falls back to residential proxies if blocked.

**Q: How fresh is the scraped data?**
A: Data is scraped in real-time from live AppSumo pages. Each record includes a timestamp showing exactly when it was extracted.

**Q: Can I scrape the same URLs multiple times?**
A: Yes! This is useful for monitoring price changes, review updates, or tracking product performance over time.

**Q: What if AppSumo blocks the scraper?**
A: The Actor includes advanced anti-blocking measures: automatic residential proxy fallback, randomized delays, realistic browser headers, and retry logic.

**Q: How do I know which URLs failed?**
A: The Actor logs all failed URLs and provides detailed statistics at the end. Failed entries are also saved to the dataset with `success: false`.

**Q: Can I customize the scraping speed?**
A: Yes! Use the `requestDelay` parameter to adjust base delays. Higher values = slower but more reliable scraping.
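For the batching advice in the first answer, a very large URL list can be split into several `startUrls` arrays, one per run. A minimal sketch, assuming a batch size of 500 (the README gives no specific number) and a hypothetical helper named `batched_start_urls`:

```python
from typing import Iterator


def batched_start_urls(urls: list[str], batch_size: int = 500) -> Iterator[list[dict]]:
    """Yield `startUrls`-shaped lists of at most `batch_size` entries each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(urls), batch_size):
        # Each entry uses the {"url": ...} shape expected by the input schema.
        yield [{"url": url} for url in urls[start:start + batch_size]]
```

Each yielded list can be used as the `startUrls` value of a separate run.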

### Error Handling

The Actor handles various error scenarios gracefully:

- **Network Timeouts**: Automatic retries with exponential backoff
- **HTTP Errors**: Smart handling of 403, 429, 503 status codes
- **Parsing Failures**: Continues processing other URLs if one fails
- **Proxy Issues**: Automatic fallback and proxy switching
- **Rate Limiting**: Built-in delays and request throttling

### Data Quality Assurance

- **Schema Validation**: All output data follows a consistent structure
- **Data Completeness**: Flags incomplete extractions with `success: false`
- **Timestamp Tracking**: Every record includes extraction timestamp
- **URL Validation**: Ensures all input URLs are properly formatted
- **Duplicate Prevention**: Handles duplicate URLs gracefully
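On the consuming side, the `success` flag makes it straightforward to separate usable records from failed ones, and duplicate URLs can be removed before a run is started. The sketch below shows both steps; `dedupe_urls` and `split_results` are hypothetical client-side helpers and say nothing about how the Actor itself treats duplicates.

```python
def dedupe_urls(urls: list[str]) -> list[str]:
    """Remove duplicate URLs while keeping the original order."""
    # dict.fromkeys preserves insertion order, so the first occurrence wins.
    return list(dict.fromkeys(url.strip() for url in urls))


def split_results(items: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split dataset items into (successful, failed) by the `success` flag."""
    successful = [item for item in items if item.get("success")]
    failed = [item for item in items if not item.get("success")]
    return successful, failed
```

The `url` field of each failed item gives the list of pages to retry in a later run.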

### Legal & Ethical Compliance

⚖️ **Legal Notice**: This Actor extracts data only from publicly available AppSumo product pages. It respects robots.txt and implements ethical scraping practices.

🔒 **Privacy**: No personal data or private information is collected. Only public product information is extracted.

📋 **Responsibility**: Users are responsible for ensuring their use complies with AppSumo's Terms of Service and applicable laws.

🌍 **Rate Limiting**: The Actor implements reasonable delays to avoid overloading AppSumo's servers.

### Support and Feedback

For technical support, feature requests, or bug reports:

1. **Actor Issues**: Use the feedback button in the Apify Console
2. **Feature Requests**: Contact through the Apify platform messaging
3. **Custom Requirements**: Available for custom modifications and enterprise features

### Version History

- **v1.0.0**: Initial release with full AppSumo product scraping capabilities
- **Features**: Bulk processing, proxy fallback, real-time saving, comprehensive error handling

***

**Built with ❤️ using the Apify platform**

*This Actor is optimized for AppSumo's current website structure (as of 2025). Regular updates ensure compatibility with any site changes.*

# Actor input Schema

## `startUrls` (type: `array`):

List of AppSumo product URLs to scrape (e.g., https://appsumo.com/products/product-name/). Supports bulk input for multiple products.

## `proxyConfiguration` (type: `object`):

Choose proxy settings. By default, no proxy is used. If requests are blocked, the actor automatically falls back to residential proxies.

## `maxRetries` (type: `integer`):

Number of retry attempts for failed requests (1-10).

## `requestDelay` (type: `integer`):

Base delay between requests. Random 0-0.5s will be added automatically to avoid rate limiting.

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://appsumo.com/products/pinreel-animated-video-maker/"
    },
    {
      "url": "https://appsumo.com/products/kopify-ai/"
    }
  ],
  "proxyConfiguration": {
    "useApifyProxy": false
  },
  "maxRetries": 3,
  "requestDelay": 1
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://appsumo.com/products/pinreel-animated-video-maker/"
        },
        {
            "url": "https://appsumo.com/products/kopify-ai/"
        }
    ],
    "proxyConfiguration": {
        "useApifyProxy": false
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("scraper-mind/appsumo-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [
        { "url": "https://appsumo.com/products/pinreel-animated-video-maker/" },
        { "url": "https://appsumo.com/products/kopify-ai/" },
    ],
    "proxyConfiguration": { "useApifyProxy": False },
}

# Run the Actor and wait for it to finish
run = client.actor("scraper-mind/appsumo-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    {
      "url": "https://appsumo.com/products/pinreel-animated-video-maker/"
    },
    {
      "url": "https://appsumo.com/products/kopify-ai/"
    }
  ],
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}' |
apify call scraper-mind/appsumo-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=scraper-mind/appsumo-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Appsumo Scraper",
        "description": "This AppSumo scraper efficiently handles bulk extraction of AppSumo product data including AppSumo prices, reviews, ratings, product descriptions, and comprehensive AppSumo deal information.",
        "version": "0.0",
        "x-build-id": "azxLzQYLB4VwYIGXN"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/scraper-mind~appsumo-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-scraper-mind-appsumo-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/scraper-mind~appsumo-scraper/runs": {
            "post": {
                "operationId": "runs-sync-scraper-mind-appsumo-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/scraper-mind~appsumo-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-scraper-mind-appsumo-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "startUrls"
                ],
                "properties": {
                    "startUrls": {
                        "title": "AppSumo Product URLs",
                        "type": "array",
                        "description": "List of AppSumo product URLs to scrape (e.g., https://appsumo.com/products/product-name/). Supports bulk input for multiple products.",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "proxyConfiguration": {
                        "title": "Proxy Configuration",
                        "type": "object",
                        "description": "Choose proxy settings. By default, no proxy is used. If requests are blocked, the actor automatically falls back to residential proxies."
                    },
                    "maxRetries": {
                        "title": "Maximum Retries",
                        "minimum": 1,
                        "maximum": 10,
                        "type": "integer",
                        "description": "Number of retry attempts for failed requests (1-10).",
                        "default": 3
                    },
                    "requestDelay": {
                        "title": "Request Delay (seconds)",
                        "minimum": 0,
                        "maximum": 10,
                        "type": "integer",
                        "description": "Base delay between requests. Random 0-0.5s will be added automatically to avoid rate limiting.",
                        "default": 1
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
