crawlab

package module
v0.7.0 Latest
Warning

This package is not in the latest version of its module.
Published: Jan 6, 2025 License: BSD-3-Clause Imports: 5 Imported by: 0

README

Crawlab Go SDK

The Crawlab Go SDK integrates Go-based spiders with Crawlab. It provides APIs for saving crawled items to several data sources, including MongoDB, MySQL, Postgres, Elasticsearch, and Kafka.

Basic Usage

package main

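import (
	"log"

	crawlab "github.com/crawlab-team/crawlab-go-sdk"
)

func main() {
	// Build one crawled item as a plain key/value map.
	item := make(map[string]interface{})
	item["url"] = "http://example.com"
	item["title"] = "hello world"

	// Save the item and stop on failure instead of discarding the error.
	if err := crawlab.SaveItem(item); err != nil {
		log.Fatal(err)
	}
}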

Example Using Colly

package main

import (
	"fmt"

	crawlab "github.com/crawlab-team/crawlab-go-sdk"
	"github.com/gocolly/colly/v2"
)

func main() {
	// Instantiate default collector
	c := colly.NewCollector(
		// Restrict crawling to quotes.toscrape.com
		colly.AllowedDomains("quotes.toscrape.com"),
	)

	// For every <a> element with an href attribute, return its text and link as an item
	crawlab.CollyOnHTMLMany(c, "a[href]", func(e *colly.HTMLElement) []map[string]any {
		return []map[string]any{
			{
				"text": e.Text,
				"link": e.Attr("href"),
			},
		}
	})

	// Before making a request print "Visiting ..."
	c.OnRequest(func(r *colly.Request) {
		fmt.Println("Visiting", r.URL.String())
	})

	// Start scraping at https://quotes.toscrape.com and report a failed visit
	if err := c.Visit("https://quotes.toscrape.com"); err != nil {
		fmt.Println("visit failed:", err)
	}
}

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func CollyOnHTMLMany added in v0.7.0

func CollyOnHTMLMany(c *colly.Collector, goqueryString string, getItems func(element *colly.HTMLElement) []map[string]any)

func CollyOnHTMLOne added in v0.7.0

func CollyOnHTMLOne(c *colly.Collector, goqueryString string, getItems func(element *colly.HTMLElement) map[string]any)
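
Judging by the signatures alone, the two Colly helpers differ in what the callback returns: CollyOnHTMLOne expects a single item map, CollyOnHTMLMany a slice of them. The following is a minimal stdlib-only sketch of those two callback shapes and of adapting one to the other. The names element, oneFn, manyFn, asMany, and linkItem are hypothetical stand-ins introduced for illustration; element replaces *colly.HTMLElement so the sketch runs without Colly or the SDK, and nothing here reflects the SDK's internals.

```go
package main

import "fmt"

// element is a hypothetical stand-in for *colly.HTMLElement.
type element struct {
	Text string
	Href string
}

// oneFn and manyFn mirror the callback shapes in the two signatures above.
type oneFn func(e *element) map[string]any
type manyFn func(e *element) []map[string]any

// asMany adapts a single-item callback to the slice-returning shape.
// A nil item yields a nil slice, so "nothing found" stays empty.
func asMany(f oneFn) manyFn {
	return func(e *element) []map[string]any {
		item := f(e)
		if item == nil {
			return nil
		}
		return []map[string]any{item}
	}
}

// linkItem is a hypothetical single-item callback.
func linkItem(e *element) map[string]any {
	return map[string]any{"text": e.Text, "link": e.Href}
}

func main() {
	items := asMany(linkItem)(&element{Text: "Next", Href: "/page/2/"})
	fmt.Println(len(items), items[0]["link"]) // prints: 1 /page/2/
}
```

In a real spider, a callback that yields exactly one item per matched element fits the CollyOnHTMLOne shape, while one that can yield zero or several items fits CollyOnHTMLMany.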

func SaveItem

func SaveItem(items ...map[string]any) (err error)

func SaveItems added in v0.7.0

func SaveItems(items []map[string]any) (err error)
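
SaveItem and SaveItems differ only in how the items are passed: SaveItem is variadic, SaveItems takes a slice. The sketch below shows the resulting call shapes using hypothetical local stand-ins (saveItem, saveItems, and the in-memory stored slice) that mirror the signatures above. They are assumptions made so the example runs without the SDK, and they say nothing about where the real functions write the items.

```go
package main

import (
	"fmt"
	"log"
)

// stored is a hypothetical in-memory sink used only for this sketch.
var stored []map[string]any

// saveItem mirrors the variadic shape of SaveItem (stand-in, not the SDK).
func saveItem(items ...map[string]any) (err error) {
	stored = append(stored, items...)
	return nil
}

// saveItems mirrors the slice shape of SaveItems (stand-in, not the SDK).
func saveItems(items []map[string]any) (err error) {
	return saveItem(items...)
}

func main() {
	a := map[string]any{"url": "http://example.com/a"}
	b := map[string]any{"url": "http://example.com/b"}
	batch := []map[string]any{a, b}

	// Variadic: pass a single item directly.
	if err := saveItem(a); err != nil {
		log.Fatal(err)
	}
	// Variadic: spread an existing slice with "...".
	if err := saveItem(batch...); err != nil {
		log.Fatal(err)
	}
	// Slice: pass the batch as-is.
	if err := saveItems(batch); err != nil {
		log.Fatal(err)
	}

	fmt.Println(len(stored)) // prints: 5
}
```

With the variadic form, a slice must be spread with `...` at the call site; the slice form avoids that when items are already collected in a batch.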

Types

This section is empty.

Directories

Path                Synopsis
_examples/basic     command
_examples/colly     command
