Jsoup 1.10.1 Released: An HTML Parser for Java

2016-10-24 22:30 | Posted by: joejoe0332 | Views: 551 | Comments: 0 | Original author: oschina | Source: oschina

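Jsoup 1.10.1 has been released. Jsoup is an HTML parser for Java that can parse a URL or a string of HTML directly. It provides a convenient API for extracting and manipulating data through DOM traversal, CSS selectors, and jQuery-like methods. The changes in this release are as follows: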

Improvements

  • Improved support for extended HTML entities, including supplemental characters and multiple character references. Also reduced memory consumption of the entity tables.

  • Added support for *|E wildcard namespace selectors.

  • Added support for setting multiple connection headers in Jsoup.connect at once with Connection.headers(Map)

  • Added support for setting/overriding the response character set in Connection.Response, for cases where the charset is not defined by the server, or is defined incorrectly.

  • Improved the performance of class selectors by reducing memory allocation and garbage collection.

  • Improved performance of HTML output by reducing the creation of temporary attribute list iterators.
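The first item above concerns entities that do not fit in a single Java `char`. The sketch below is not jsoup code and does not use jsoup's entity tables; it relies only on the JDK to illustrate the two cases, using `&Aopf;` (U+1D538) and `&NotEqualTilde;` (U+2242 followed by U+0338) from the HTML named character reference list as examples. The class and method names are hypothetical.

```java
// Illustrative only: uses the JDK, not jsoup's internal entity tables.
final class EntityCodePoints {

    /** Builds a String from one or more Unicode code points. */
    static String fromCodePoints(int... codePoints) {
        return new String(codePoints, 0, codePoints.length);
    }

    public static void main(String[] args) {
        // &Aopf; names U+1D538, a supplemental character outside the BMP,
        // so it occupies two UTF-16 code units (a surrogate pair).
        String aopf = fromCodePoints(0x1D538);

        // &NotEqualTilde; names two code points: U+2242 followed by U+0338.
        String notEqualTilde = fromCodePoints(0x2242, 0x0338);

        System.out.println("Aopf code units:  " + aopf.length());
        System.out.println("Aopf code points: " + aopf.codePointCount(0, aopf.length()));
        System.out.println("NotEqualTilde code points: "
                + notEqualTilde.codePointCount(0, notEqualTilde.length()));
    }
}
```

Both cases mean an entity table cannot simply map a name to one `char`, which is presumably why the release notes mention them together.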

Bug fixes

  • Fixed an issue when converting to the W3CDom XML, where valid (but ugly) HTML attribute names containing characters like " could not be converted into valid XML attribute names. These attribute names are now normalized if possible, or not added to the XML DOM.

  • Fixed an OOB exception when loading an empty-body URL and parsing with the XML parser.

  • Fixed an issue where attribute names starting with a slash would be parsed incorrectly.

  • Don't reuse charset encoders from OutputSettings, to make threadsafe.

  • Fixed an issue in connections with a requestBody where a custom content-type header could be ignored.
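The thread-safety item above reflects a general JDK constraint: a `CharsetEncoder` is stateful and is documented as unsafe for concurrent use. The following is a minimal sketch of the "create a fresh encoder instead of sharing one" pattern, using only the JDK. It is an assumption about the general approach, not a copy of jsoup's `OutputSettings` code, and the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Illustrative only: a JDK-level sketch, not jsoup's OutputSettings code.
final class EncoderPerCall {

    /** Returns true if the whole text can be encoded in the given charset. */
    static boolean canEncode(Charset charset, String text) {
        // CharsetEncoder instances are not safe for use by concurrent threads,
        // so create a fresh one per call rather than caching a shared instance.
        CharsetEncoder encoder = charset.newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        System.out.println(canEncode(StandardCharsets.US_ASCII, "jsoup"));
        System.out.println(canEncode(StandardCharsets.US_ASCII, "caf\u00e9"));
        System.out.println(canEncode(StandardCharsets.UTF_8, "caf\u00e9"));
    }
}
```

Creating an encoder per call costs a small allocation; sharing a `Charset` (which is immutable and thread-safe) while never sharing the encoder is the usual trade-off.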

For the full list of changes, see the release notes.

Download link:

