# Aussie Parser Spider (`getdataforme/aussie-parser-spider`) Actor

The Aussie Parser Spider scrapes detailed product data from Aussie.com, extracting names, descriptions, ingredients, and usage instructions....

- **URL:** https://apify.com/getdataforme/aussie-parser-spider.md
- **Developed by:** [GetDataForMe](https://apify.com/getdataforme) (community)
- **Categories:** AI, E-commerce, Automation
- **Stats:** 3 total users, 0 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating:** No ratings yet

## Pricing

from $8.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
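
As a rough illustration of the pay-per-event arithmetic, the sketch below estimates the cost of a run from the listed starting rate of $8.00 per 1,000 results. The function name `estimate_cost_usd` is hypothetical, and the sketch assumes that each result is one billable event at exactly the starting rate; actual charges depend on the Actor's event definitions.

```python
from decimal import Decimal

# Starting rate listed above: $8.00 per 1,000 results.
# Assumption: this is the effective rate and each result is one billable event.
PRICE_PER_1000_RESULTS_USD = Decimal("8.00")


def estimate_cost_usd(result_count: int) -> Decimal:
    """Estimate the charge for `result_count` results at the listed starting rate."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    cost = PRICE_PER_1000_RESULTS_USD * result_count / Decimal(1000)
    # Round to whole cents for display.
    return cost.quantize(Decimal("0.01"))


print(estimate_cost_usd(250))  # 250 results at $8.00 / 1,000 -> 2.00
```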

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Aussie Parser Spider

### Introduction

The Aussie Parser Spider is an Apify Actor that scrapes detailed product information from Aussie.com, the website of the Aussie hair care brand. It extracts product names, descriptions, ingredients, and usage instructions as structured data for market research, competitive analysis, and e-commerce applications. Setup is minimal: provide a list of product page URLs and run the Actor.

### Features

- **Comprehensive Data Extraction**: Scrapes key product details including names, taglines, descriptions, ingredients, sizes, and usage instructions from Aussie.com product pages.
- **High Accuracy and Reliability**: Utilizes robust parsing techniques to ensure data integrity and minimize errors in extraction.
- **Fast and Efficient Performance**: Optimized for quick scraping with support for multiple URLs, reducing processing time for large datasets.
- **Structured JSON Output**: Delivers clean, machine-readable data in JSON format, ideal for integration with databases or analytics tools.
- **Flexible Input Configuration**: Accepts a list of URLs for targeted scraping, allowing customization based on specific needs.
- **Error Handling and Logging**: Includes built-in mechanisms to handle common scraping issues, with detailed logs for monitoring.
- **Scalable for Business Use**: Suitable for batch processing, making it perfect for ongoing monitoring and automation tasks.

### Input Parameters

| Parameter | Type   | Required | Description | Example |
|-----------|--------|----------|-------------|---------|
| `Urls`    | array  | Yes      | A list of URLs from Aussie.com to scrape product data from. Each URL must be a valid HTTP or HTTPS link pointing to a product page. | `["https://aussie.com/en-us/hair-insurance-leave-in-conditioner", "https://aussie.com/en-us/another-product"]` |

### Example Usage

To use the Aussie Parser Spider, configure the input with a list of URLs and run the Actor. Below is an example input JSON:

```json
{
  "Urls": [
    "https://aussie.com/en-us/hair-insurance-leave-in-conditioner"
  ]
}
````

This will produce output similar to the following JSON array:

```json
[
  {
    "URL": "https://aussie.com/en-us/hair-insurance-leave-in-conditioner",
    "Product_Name": "Hair Insurance Leave-In Conditioner",
    "Tagline": "Perfect for all hair types",
    "Description": "TAME FRIZZ & SOFTEN HAIR. This hydrating leave-in hair conditioner is the assurance your hair needs to look and feel its best. Aussie Hair Insurance instantly tames frizz and softens hair. It\u2019ll leave your hair looking polished, feeling more manageable, and, of course, super soft and smooth. This hair mist can work as a leave-in conditioner for curly hair as well as any other hair type. Aussie loves jojoba oil for hair\u2014that\u2019s why it\u2019s infused in this leave-in conditioner spray formula. The juicy citrus scent is a bonus.",
    "BV_Product_Id": "80707137",
    "Product_Size": "8.0 FL OZ",
    "Ingredients": "Water, Simmondsia Chinensis (Jojoba) Seed Oil, Fragrance, Phenoxyethanol, Amodimethicone, PEG-40 Hydrogenated Castor Oil, PPG-2 Methyl Ether, Benzyl Alcohol, Polyquaternium-11, Disodium EDTA, Polysorbate 80, Ethylhexylglycerin, Aminomethyl Propanol, Citric Acid, Trideceth-12, Cetrimonium Chloride",
    "SmartLabel_URL": "https://aussie.com/en-us/shop/kids",
    "How_To_Use_Title": "How To Use It",
    "How_To_Use_Text": "Simply spray on damp hair and go! Indulge in the yummy fragrance without rinsing out.",
    "Crawled_Date": "01-14-2026",
    "actor_id": "xXcZQaOsl2V4HoQhJ",
    "run_id": "xoo5piVwnPE4mMk47"
  }
]
```

### Use Cases

- **Market Research and Analysis**: Gather detailed product data to analyze trends in hair care products and consumer preferences.
- **Competitive Intelligence**: Compare Aussie products with competitors by extracting ingredients, descriptions, and product sizes.
- **Price Monitoring**: Track product sizes and listings over time to support pricing strategies in e-commerce. The output does not include price or availability fields, so pair it with a separate price source.
- **Content Aggregation**: Build databases of product information for blogs, reviews, or educational content.
- **Academic Research**: Collect data on ingredients and usage for studies in cosmetology or consumer behavior.
- **Business Automation**: Automate data collection for inventory management or supply chain optimization in retail.

### Installation and Usage

1. Search for "Aussie Parser Spider" in the Apify Store.
2. Click "Try for free" or "Run".
3. Configure input parameters by providing a list of URLs.
4. Click "Start" to begin extraction.
5. Monitor progress in the log.
6. Export results in your preferred format (JSON, CSV, Excel).
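
The export in step 6 can also be done locally once the items have been downloaded as JSON. The following is a minimal sketch that uses only the Python standard library; the helper name `items_to_csv` is hypothetical, and it assumes every item is a flat dictionary like the one shown in the example output.

```python
import csv
import io


def items_to_csv(items: list[dict]) -> str:
    """Serialize dataset items (a list of flat dicts) to CSV text."""
    if not items:
        return ""
    # Collect column names in first-seen order so items with extra keys still fit.
    fieldnames: list[str] = []
    for item in items:
        for key in item:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    # Missing keys are written as empty cells.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(items)
    return buffer.getvalue()


sample_items = [
    {
        "URL": "https://aussie.com/en-us/hair-insurance-leave-in-conditioner",
        "Product_Name": "Hair Insurance Leave-In Conditioner",
        "Product_Size": "8.0 FL OZ",
    }
]
print(items_to_csv(sample_items))
```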

### Output Format

The Actor outputs data in a JSON array, where each object represents a scraped product. Key fields include:

- **URL**: The source URL of the product page.
- **Product\_Name**: The name of the product.
- **Tagline**: A short promotional phrase.
- **Description**: Detailed product description.
- **BV\_Product\_Id**: Unique product identifier.
- **Product\_Size**: Size of the product (e.g., volume).
- **Ingredients**: List of ingredients.
- **SmartLabel\_URL**: Link to additional product info.
- **How\_To\_Use\_Title** and **How\_To\_Use\_Text**: Usage instructions.
- **Crawled\_Date**: Date of scraping, in `MM-DD-YYYY` format (e.g., `01-14-2026`).
- **actor\_id** and **run\_id**: Apify-specific identifiers for tracking.

This structured format ensures easy parsing and integration.
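
All fields arrive as strings, so some post-processing is usually needed before analysis. The sketch below splits `Ingredients` into a list and parses `Crawled_Date` into a date object. The helper name `normalize_item` is hypothetical; the sketch assumes the `MM-DD-YYYY` date format seen in the sample output and that no ingredient name contains a comma.

```python
from datetime import datetime


def normalize_item(item: dict) -> dict:
    """Return a copy of a dataset item with two fields converted to richer types."""
    normalized = dict(item)
    # Naive split: assumes no ingredient name contains a comma.
    raw_ingredients = item.get("Ingredients") or ""
    normalized["Ingredients"] = [
        part.strip() for part in raw_ingredients.split(",") if part.strip()
    ]
    # Assumes MM-DD-YYYY, as in the sample output ("01-14-2026").
    raw_date = item.get("Crawled_Date")
    normalized["Crawled_Date"] = (
        datetime.strptime(raw_date, "%m-%d-%Y").date() if raw_date else None
    )
    return normalized


sample = {
    "Product_Name": "Hair Insurance Leave-In Conditioner",
    "Ingredients": "Water, Simmondsia Chinensis (Jojoba) Seed Oil, Fragrance",
    "Crawled_Date": "01-14-2026",
}
result = normalize_item(sample)
print(result["Ingredients"])
print(result["Crawled_Date"].isoformat())
```

The original item is left unmodified, so the raw strings remain available if the assumptions about the formats turn out not to hold.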

### Support

For custom/simplified outputs or bug reports, please contact:

- Email: support@getdataforme.com
- Subject line: "custom support"
- Contact form: https://getdataforme.com/contact/

We're here to help you get the most out of this Actor!

# Actor input Schema

## `Urls` (type: `array`):

The urls for the spider.

## Actor input object example

```json
{
  "Urls": [
    "https://aussie.com/en-us/hair-insurance-leave-in-conditioner"
  ]
}
```
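
Checking the input before starting a run can help avoid paying for runs that cannot succeed. The following sketch builds the input object and rejects an empty list or anything that is not an HTTP(S) URL, following the README's description of `Urls`. The helper name `build_input` is hypothetical, and the domain check assumes that product pages live on `aussie.com` or one of its subdomains.

```python
from urllib.parse import urlparse


def build_input(urls: list[str]) -> dict:
    """Build the Actor input, rejecting URLs that are clearly invalid."""
    if not urls:
        raise ValueError("At least one URL is required")
    for url in urls:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"Not a valid HTTP(S) URL: {url!r}")
        # Assumption: product pages live on aussie.com or a subdomain of it.
        host = (parsed.hostname or "").lower()
        if host != "aussie.com" and not host.endswith(".aussie.com"):
            raise ValueError(f"Not an aussie.com URL: {url!r}")
    return {"Urls": list(urls)}


actor_input = build_input(
    ["https://aussie.com/en-us/hair-insurance-leave-in-conditioner"]
)
print(actor_input)
```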

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

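// Prepare Actor input ("Urls" takes at least one Aussie.com product page URL)
const input = {
    Urls: ['https://aussie.com/en-us/hair-insurance-leave-in-conditioner'],
};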

// Run the Actor and wait for it to finish
const run = await client.actor("getdataforme/aussie-parser-spider").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

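# Prepare the Actor input ("Urls" takes at least one Aussie.com product page URL)
run_input = {
    "Urls": ["https://aussie.com/en-us/hair-insurance-leave-in-conditioner"],
}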

# Run the Actor and wait for it to finish
run = client.actor("getdataforme/aussie-parser-spider").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
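
The example above reads the dataset without checking how the run ended. A minimal sketch of a guard follows, assuming the run object is a dictionary with the `id`, `status`, and `defaultDatasetId` fields listed in the OpenAPI response schema, and that a successful run ends with the status `SUCCEEDED`. The helper name `dataset_id_if_succeeded` is hypothetical, and the `None` check is purely defensive.

```python
from typing import Optional


def dataset_id_if_succeeded(run: Optional[dict]) -> str:
    """Return the run's default dataset ID, or raise if the run did not succeed."""
    if run is None:
        raise RuntimeError("No run object was returned")
    status = run.get("status")
    if status != "SUCCEEDED":
        raise RuntimeError(f"Run {run.get('id')} ended with status {status!r}")
    return run["defaultDatasetId"]


# Stand-in for the dict returned by `.call(...)`; the IDs are placeholders.
fake_run = {"id": "RUN_ID", "status": "SUCCEEDED", "defaultDatasetId": "DATASET_ID"}
print(dataset_id_if_succeeded(fake_run))
```

In the Python example, the result of this helper would replace the direct `run["defaultDatasetId"]` lookups.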

## CLI example

```bash
echo '{"Urls": ["https://aussie.com/en-us/hair-insurance-leave-in-conditioner"]}' |
apify call getdataforme/aussie-parser-spider --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=getdataforme/aussie-parser-spider",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Aussie Parser Spider",
        "description": "The Aussie Parser Spider scrapes detailed product data from Aussie.com, extracting names, descriptions, ingredients, and usage instructions....",
        "version": "0.0",
        "x-build-id": "eUhTx0U8DqhWm5hmK"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/getdataforme~aussie-parser-spider/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-getdataforme-aussie-parser-spider",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/getdataforme~aussie-parser-spider/runs": {
            "post": {
                "operationId": "runs-sync-getdataforme-aussie-parser-spider",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/getdataforme~aussie-parser-spider/run-sync": {
            "post": {
                "operationId": "run-sync-getdataforme-aussie-parser-spider",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "Urls": {
                        "title": "Urls",
                        "minItems": 1,
                        "type": "array",
                        "description": "The urls for the spider.",
                        "items": {
                            "type": "string"
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
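
For projects that call the REST API directly, the `run-sync-get-dataset-items` path from the specification above can be turned into a request as sketched below. The example uses only the Python standard library and builds the request without sending it; the helper name `build_request` is hypothetical, and the token value is a placeholder.

```python
import json
import urllib.parse
import urllib.request

# Server URL and path taken from the OpenAPI specification above.
API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/getdataforme~aussie-parser-spider/run-sync-get-dataset-items"


def build_request(token: str, urls: list[str]) -> urllib.request.Request:
    """Build (but do not send) a POST request that runs the Actor synchronously."""
    # The specification passes the token as a required query parameter.
    query = urllib.parse.urlencode({"token": token})
    body = json.dumps({"Urls": urls}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(
    "<YOUR_API_TOKEN>",
    ["https://aussie.com/en-us/hair-insurance-leave-in-conditioner"],
)
print(request.get_method(), request.full_url)

# To actually run the Actor (billable, requires a real token), send the request:
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
```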
