goquery v1.0.0, an HTML Parsing Library for Go, Officially Released

2016-8-11 21:24 | Posted by: joejoe0332 | Views: 1070 | Comments: 0 | Original author: oschina | Source: oschina

goquery is an HTML parsing library written in Go that lets you work with a DOM document in a jQuery-like way.

Here is an example:

package main

import (
    "fmt"
    "log"
    "github.com/PuerkitoBio/goquery"
)

func ExampleScrape() {
    doc, err := goquery.NewDocument("http://metalsucks.net")
    if err != nil {
        log.Fatal(err)
    }
    // Find the review items
    doc.Find(".sidebar-reviews article .content-block").Each(func(i int, s *goquery.Selection) {
        // For each item found, get the band and title
        band := s.Find("a").Text()
        title := s.Find("i").Text()
        fmt.Printf("Review %d: %s - %s\n", i, band, title)
    })
}

func main() {
    ExampleScrape()
}

Changelog:

  • 2016-07-27 (v1.0.0) : Tag version 1.0.0.

  • 2016-06-15 : Invalid selector strings internally compile to a Matcher implementation that never matches any node (instead of a panic). So for example, doc.Find("~") returns an empty *Selection object.

  • 2016-02-02 : Add NodeName utility function similar to the DOM's nodeName property. It returns the tag name of the first element in a selection, and other relevant values of non-element nodes (see godoc for details). Add OuterHtml utility function similar to the DOM's outerHTML property (named OuterHtml in small caps for consistency with the existing Html method on the Selection).

  • 2015-04-20 : Add AttrOr helper method to return the attribute's value or a default value if absent. Thanks to piotrkowalczuk.

  • 2015-02-04 : Add more manipulation functions - Prepend* - thanks again to Andrew Stone.

  • 2014-11-28 : Add more manipulation functions - ReplaceWith, Wrap and Unwrap - thanks again to Andrew Stone.

  • 2014-11-07 : Add manipulation functions (thanks to Andrew Stone) and *Matcher functions, that receive compiled cascadia selectors instead of selector strings, thus avoiding potential panics thrown by goquery via cascadia.MustCompile calls. This results in better performance (selectors can be compiled once and reused) and more idiomatic error handling (you can handle cascadia's compilation errors, instead of recovering from panics, which had been bugging me for a long time). Note that the actual type expected is a Matcher interface, that cascadia.Selector implements. Other matcher implementations could be used.

  • 2014-11-06 : Change import paths of net/html to golang.org/x/net/html (see https://groups.google.com/forum/#!topic/golang-nuts/eD8dh3T9yyA). Make sure to update your code to use the new import path too when you call goquery with html.Nodes.

  • v0.3.2 : Add NewDocumentFromReader() (thanks jweir) which allows creating a goquery document from an io.Reader.

  • v0.3.1 : Add NewDocumentFromResponse() (thanks assassingj) which allows creating a goquery document from an http response.

  • v0.3.0 : Add EachWithBreak() which allows breaking out of an Each() loop by returning false. This function was added instead of changing the existing Each() to avoid breaking compatibility.

  • v0.2.1 : Make go-getable, now that go.net/html is Go1.0-compatible (thanks to @matrixik for pointing this out).

  • v0.2.0 : Add support for negative indices in Slice(). BREAKING CHANGE Document.Root is removed, Document is now a Selection itself (a selection of one, the root element, just like Document.Root was before). Add jQuery's Closest() method.

  • v0.1.1 : Add benchmarks to use as baseline for refactorings, refactor Next...() and Prev...() methods to use the new html package's linked list features (Next/PrevSibling, FirstChild). Good performance boost (40+% in some cases).

  • v0.1.0 : Initial release.

