Problem

You have HTML in a Java String, and you want to parse that HTML to get at its contents, or to make sure it's well formed, or to modify it. The String may have come from user input, a file, or from the web.

Solution

Use the static Jsoup.parse(String html) method, or Jsoup.parse(String html, String baseUri) if the page came from the web and you want to get at absolute URLs (see [working-with-urls]).

String html = "<html><head><title>First parse</title></head>"
    + "<body><p>Parsed HTML into a doc.</p></body></html>";
Document doc = Jsoup.parse(html);

Description

The parse(String html, String baseUri) method parses the input HTML into a new Document. The base URI argument is used to resolve relative URLs into absolute URLs, and should be set to the URL the document was fetched from. If that's not applicable, or if you know the HTML has a base element, you can use the parse(String html) method.
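To make "resolving relative URLs into absolute URLs" concrete, here is a minimal sketch of the resolution rule itself. It uses java.net.URI from the standard library rather than jsoup, so it illustrates the idea and is not jsoup's own implementation. The class name BaseUriSketch, the helper resolve, and the example.com addresses are all hypothetical.

```java
import java.net.URI;

public class BaseUriSketch {

    // Resolve a possibly relative href against the URL the page was fetched from.
    // Assumption: both arguments are syntactically valid URIs.
    static String resolve(String baseUri, String href) {
        return URI.create(baseUri).resolve(href).toString();
    }

    public static void main(String[] args) {
        // Hypothetical address the HTML was fetched from.
        String baseUri = "https://example.com/blog/post.html";

        // Path-relative link: resolved against the base's directory.
        System.out.println(resolve(baseUri, "images/logo.png"));
        // prints https://example.com/blog/images/logo.png

        // Root-relative link: resolved against the base's host.
        System.out.println(resolve(baseUri, "/docs/intro.html"));
        // prints https://example.com/docs/intro.html

        // Already absolute link: returned unchanged.
        System.out.println(resolve(baseUri, "https://other.example/page"));
        // prints https://other.example/page
    }
}
```

In jsoup, the resolved form of an attribute is read from an element with absUrl, for example absUrl("href"). If the document was parsed without a base URI, a relative href cannot be made absolute and absUrl returns an empty string.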

As long as you pass in a non-null string, you're guaranteed to have a successful, sensible parse, with a Document containing (at least) a head and a body element. (BETA: if you do get an exception raised, or a bad parse-tree, please file a bug.)

Once you have a Document, you can get at the data using the appropriate methods in Document and its superclasses Element and Node.
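As a rough illustration of the two points above, the sketch below parses both a full page and a bare fragment, then reads data back through one method from each of Document, Element, and Node. It assumes the jsoup library is on the classpath. The class name ParseStringSketch and its helper methods are hypothetical.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ParseStringSketch {

    // Document.title() returns the text of the <title> element ("" if absent).
    static String titleOf(String html) {
        return Jsoup.parse(html).title();
    }

    // Element.select(...) runs a CSS query; Element.text() returns combined text.
    // Returns null when the HTML contains no <p> element.
    static String firstParagraphText(String html) {
        Element p = Jsoup.parse(html).select("p").first();
        return p == null ? null : p.text();
    }

    // Node.childNodeSize() counts the direct child nodes of <body>.
    static int bodyChildCount(String html) {
        return Jsoup.parse(html).body().childNodeSize();
    }

    public static void main(String[] args) {
        String html = "<html><head><title>First parse</title></head>"
                + "<body><p>Parsed HTML into a doc.</p></body></html>";

        System.out.println(titleOf(html));            // prints First parse
        System.out.println(firstParagraphText(html)); // prints Parsed HTML into a doc.
        System.out.println(bodyChildCount(html));     // prints 1

        // A fragment with no html, head, or body tags, and an unclosed <p>:
        // the parser still builds a Document with head and body elements.
        Document partial = Jsoup.parse("<p>No wrapper elements here");
        System.out.println(partial.head() != null);   // prints true
        System.out.println(partial.body().text());    // prints No wrapper elements here
    }
}
```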

Reposted from: https://my.oschina.net/u/553266/blog/296059
