sitemap

Version: v0.4.0
Published: Sep 2, 2024 License: MIT Imports: 8 Imported by: 17

README

go-sitemap

go-sitemap fetches a sitemap.xml (or sitemapindex.xml) and parses it into a Sitemap object.

Installation

go get github.com/yterajima/go-sitemap

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func SetFetch

func SetFetch(f func(URL string, options interface{}) ([]byte, error))

SetFetch changes the fetch closure used to retrieve sitemap data.

func SetInterval

func SetInterval(time time.Duration)

SetInterval changes the time interval used in Index.get.

Types

type Index

type Index struct {
	XMLName xml.Name `xml:"sitemapindex"`
	Sitemap []parts  `xml:"sitemap"`
}

Index is a structure of <sitemapindex>
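To make the struct tags on Index concrete, here is a minimal sketch of how a <sitemapindex> document decodes with the standard encoding/xml package. The types index and entry are local stand-ins, not the package's own: entry is a hypothetical substitute for the unexported parts type, and its Loc and LastMod fields are assumptions based on the sitemap protocol. The example.com URLs are placeholders.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// entry is a hypothetical stand-in for the package's unexported parts type.
// Its fields are assumed from the sitemap protocol, not taken from the package.
type entry struct {
	Loc     string `xml:"loc"`
	LastMod string `xml:"lastmod"`
}

// index mirrors the documented Index struct and its XML tags.
type index struct {
	XMLName xml.Name `xml:"sitemapindex"`
	Sitemap []entry  `xml:"sitemap"`
}

// decodeIndex unmarshals a <sitemapindex> document into an index value.
// It returns an error if the root element is not <sitemapindex>.
func decodeIndex(data []byte) (index, error) {
	var idx index
	err := xml.Unmarshal(data, &idx)
	return idx, err
}

// sampleIndex is a placeholder document with two child sitemaps.
const sampleIndex = `<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/sitemap-1.xml</loc></sitemap>
  <sitemap><loc>https://example.com/sitemap-2.xml</loc></sitemap>
</sitemapindex>`

func main() {
	idx, err := decodeIndex([]byte(sampleIndex))
	if err != nil {
		fmt.Println(err)
		return
	}
	// Each entry points at one child sitemap.
	for _, s := range idx.Sitemap {
		fmt.Println(s.Loc)
	}
}
```

Because the XMLName tag carries no namespace, the root element is matched by local name only, so documents with or without the sitemaps.org namespace decode the same way.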

func ParseIndex

func ParseIndex(data []byte) (Index, error)

ParseIndex creates Index data from text.

func ReadSitemapIndex added in v0.4.0

func ReadSitemapIndex(path string) (Index, error)

ReadSitemapIndex is a function that reads a file and returns an Index structure.

Example
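index, err := ReadSitemapIndex("./testdata/sitemapindex.xml")
if err != nil {
	fmt.Println(err)
}

// Each entry in index.Sitemap refers to one child sitemap
// (assumes the unexported parts type exposes a Loc field).
for _, s := range index.Sitemap {
	fmt.Println(s.Loc)
}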

type Sitemap

type Sitemap struct {
	XMLName xml.Name `xml:"urlset"`
	URL     []URL    `xml:"url"`
}

Sitemap is a structure of <urlset>, the root element of a sitemap.xml.

func ForceGet added in v0.3.0

func ForceGet(URL string, options interface{}) (Sitemap, error)

ForceGet fetches and parses sitemap.xml/sitemapindex.xml. Unlike the Get function, it ignores some errors.

Errors that are ignored:

・When sitemapindex.xml contains a sitemap.xml URL that cannot be retrieved.
・When sitemapindex.xml contains a sitemap.xml that is empty.
・When sitemapindex.xml contains a sitemap.xml that has format problems.

Errors that are not ignored:

・When sitemap.xml/sitemapindex.xml could not be retrieved.
・When sitemap.xml/sitemapindex.xml is empty.
・When sitemap.xml/sitemapindex.xml has format problems.

If you do not want to ignore these errors, use the Get function.
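The difference between the strict and lenient handling of child sitemaps can be sketched as follows. This is not the package's implementation, only an illustration of the policy described above: collect, fetchFunc, and fakeFetch are hypothetical names, and the example.com URLs are placeholders.

```go
package main

import (
	"errors"
	"fmt"
)

// fetchFunc is a hypothetical fetcher that returns the page URLs listed in
// one child sitemap, or an error if it cannot be retrieved or parsed.
type fetchFunc func(loc string) ([]string, error)

// collect walks the child sitemaps of an index. With lenient set to false it
// stops at the first failing child (Get-like behaviour); with lenient set to
// true it skips failing children and continues (ForceGet-like behaviour).
func collect(children []string, fetch fetchFunc, lenient bool) ([]string, error) {
	var urls []string
	for _, loc := range children {
		got, err := fetch(loc)
		if err != nil {
			if lenient {
				continue // ignore this child and move on
			}
			return nil, fmt.Errorf("child sitemap %s: %w", loc, err)
		}
		urls = append(urls, got...)
	}
	return urls, nil
}

// fakeFetch simulates one broken child sitemap among healthy ones.
func fakeFetch(loc string) ([]string, error) {
	if loc == "https://example.com/broken.xml" {
		return nil, errors.New("could not be retrieved")
	}
	return []string{loc + "#page"}, nil
}

func main() {
	children := []string{
		"https://example.com/a.xml",
		"https://example.com/broken.xml",
		"https://example.com/b.xml",
	}

	// Strict mode reports the broken child as an error.
	_, err := collect(children, fakeFetch, false)
	fmt.Println("strict returned error:", err != nil)

	// Lenient mode skips the broken child and keeps the rest.
	urls, err := collect(children, fakeFetch, true)
	fmt.Println("lenient URLs:", len(urls), "error:", err != nil)
}
```

In both modes a failure of the top-level document itself would still be reported; only failures of child sitemaps listed in an index are treated differently.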

func Get

func Get(URL string, options interface{}) (Sitemap, error)

Get fetches and parses sitemap.xml/sitemapindex.xml.

If sitemap.xml or sitemapindex.xml has a problem, this function returns an error. This covers the following cases:

・When sitemap.xml/sitemapindex.xml could not be retrieved.
・When sitemap.xml/sitemapindex.xml is empty.
・When sitemap.xml/sitemapindex.xml has format problems.
・When sitemapindex.xml contains a sitemap.xml URL that cannot be retrieved.
・When sitemapindex.xml contains a sitemap.xml that is empty.
・When sitemapindex.xml contains a sitemap.xml that has format problems.

If you want to ignore the errors caused by sitemaps listed in a sitemapindex.xml, use the ForceGet function.

Example
smap, err := Get("https://issueoverflow.com/sitemap.xml", nil)
if err != nil {
	fmt.Println(err)
}

for _, URL := range smap.URL {
	fmt.Println(URL.Loc)
}

Example (ChangeFetch)
SetFetch(func(URL string, options interface{}) ([]byte, error) {
	req, err := http.NewRequest("GET", URL, nil)
	if err != nil {
		return []byte{}, err
	}

	// Set User-Agent
	req.Header.Set("User-Agent", "MyBot")

	// Set timeout
	client := http.Client{
		Timeout: 10 * time.Second,
	}

	// Fetch data
	res, err := client.Do(req)
	if err != nil {
		return []byte{}, err
	}
	defer res.Body.Close()

	body, err := io.ReadAll(res.Body)
	if err != nil {
		return []byte{}, err
	}

	return body, nil
})

smap, err := Get("https://issueoverflow.com/sitemap.xml", nil)
if err != nil {
	fmt.Println(err)
}

for _, URL := range smap.URL {
	fmt.Println(URL.Loc)
}

func Parse

func Parse(data []byte) (Sitemap, error)

Parse creates Sitemap data from text.

func ReadSitemap added in v0.4.0

func ReadSitemap(path string) (Sitemap, error)

ReadSitemap is a function that reads a file and returns a Sitemap structure.

Example
smap, err := ReadSitemap("./testdata/sitemap.xml")
if err != nil {
	fmt.Println(err)
}

for _, URL := range smap.URL {
	fmt.Println(URL.Loc)
}

type URL

type URL struct {
	Loc        string  `xml:"loc"`
	LastMod    string  `xml:"lastmod"`
	ChangeFreq string  `xml:"changefreq"`
	Priority   float32 `xml:"priority"`
}

URL is a structure of <url> in <urlset>.
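As an illustration of how the URL fields are filled, the following sketch decodes a small <urlset> document with encoding/xml. The types urlSet and urlEntry are local copies of the documented Sitemap and URL structs, introduced here only so the example is self-contained, and the sample document is made up.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// urlEntry is a local copy of the documented URL struct.
type urlEntry struct {
	Loc        string  `xml:"loc"`
	LastMod    string  `xml:"lastmod"`
	ChangeFreq string  `xml:"changefreq"`
	Priority   float32 `xml:"priority"`
}

// urlSet is a local copy of the documented Sitemap struct.
type urlSet struct {
	XMLName xml.Name   `xml:"urlset"`
	URL     []urlEntry `xml:"url"`
}

// decodeURLSet unmarshals a <urlset> document into a urlSet value.
func decodeURLSet(data []byte) (urlSet, error) {
	var s urlSet
	err := xml.Unmarshal(data, &s)
	return s, err
}

// sampleURLSet is a placeholder document; the second entry omits the
// optional elements to show that they decode to zero values.
const sampleURLSet = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2024-09-02</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>https://example.com/about</loc>
  </url>
</urlset>`

func main() {
	s, err := decodeURLSet([]byte(sampleURLSet))
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, u := range s.URL {
		fmt.Println(u.Loc, u.LastMod, u.ChangeFreq, u.Priority)
	}
}
```

Note that LastMod stays a plain string, so callers who need a time value have to parse it themselves, and a missing <priority> is indistinguishable from an explicit 0 with this struct layout.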
