jsoup 1.10.3 has been released. This version brings faster CSS selectors, improvements to Jsoup.Connection, and a number of other bug fixes. Details:
Improvements
Added Elements.eachText() and Elements.eachAttr(), which return a list of each element's text or attribute values, respectively. This makes it simpler to, for example, get a list of every URL on a page: List<String> urls = doc.select("a").eachAttr("abs:href");
Improved selector validation for :contains(...) with unbalanced quotes.
Improved the speed of index-based CSS selectors and other methods that use elementSiblingIndex, by a factor of 34.
Added Node.clearAttributes(), to simplify removing all attributes of a Node / Element.
Fixes
Bugfix: if an attribute name started or ended with a control character, the parse would fail with a validation exception.
Bugfix: Element.hasClass() and the .classname selector would not find the class attribute case-insensitively.
Bugfix: In Jsoup.Connection, if a redirect contained a query string with %xx escapes, they would be double escaped before the redirect was followed, leading to fetching an incorrect location.
Bugfix: In Jsoup.Connection, if a request body was set and the connection was redirected, the body would incorrectly still be sent.
Bugfix: In DataUtil, when detecting the character set from meta data, and there are two Content-Types defined, use the one that defines a character set.
Bugfix: when parsing unknown tags in case-sensitive HTML mode, end tags would not close scope correctly.
In Jsoup.Connection, ensure there is no Content-Type set when being redirected to a GET.
Bugfix: in certain locales (Turkish specifically), lowercasing and case insensitivity could fail for specific items.
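The Turkish-locale item above can be reproduced with the JDK alone. The following is a minimal sketch, not jsoup's actual code: the class name TurkishLocaleSketch and its helper methods are hypothetical, and the use of Locale.ROOT is an assumption about one common remedy, since the release notes do not say how jsoup fixed the problem. It shows why locale-sensitive lowercasing can break case-insensitive matching of names such as "TITLE".

```java
import java.util.Locale;

// Hypothetical illustration class; not part of jsoup.
class TurkishLocaleSketch {

    // Locale-sensitive lowercasing. In a Turkish locale, uppercase 'I'
    // maps to dotless 'ı' (U+0131) rather than to 'i'.
    public static String lowerInLocale(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    // Locale-independent lowercasing (assumed remedy), suitable for
    // identifiers such as tag and attribute names.
    public static String lowerRoot(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");

        String viaTurkish = lowerInLocale("TITLE", turkish);
        String viaRoot = lowerRoot("TITLE");

        // The Turkish result contains U+0131, so it no longer equals "title".
        System.out.println("title".equals(viaTurkish)); // false
        System.out.println("title".equals(viaRoot));    // true
    }
}
```

Code that calls String.toLowerCase() with no argument uses the JVM's default locale, so the same mismatch would appear on any machine configured for Turkish.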
Download: https://jsoup.org/download