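I installed the XPath plugin in the Edge browser. If you need to install it too, see this blog post: "Installing the XPath plugin in the latest version of Edge".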
I. Using XPath
- Install lxml

          pip install lxml -i https://pypi.douban.com/simple

- Import lxml.etree

          from lxml import etree

- etree.parse() parses a local file

          html_tree = etree.parse('XX.html')

- etree.HTML() parses the data returned in a server response

          html_tree = etree.HTML(response.read().decode('utf-8'))

- html_tree.xpath(XPath expression)
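
The steps in the list above can be sketched end to end as follows. This is only a minimal illustration: instead of a real server response, an assumed bytes literal (`raw_bytes`, a made-up name) stands in for what `response.read()` would return.

```python
from lxml import etree

# Stand-in for response.read(): raw UTF-8 bytes as a server might return them (assumed sample).
raw_bytes = (
    '<html><body><ul>'
    '<li id="l1">北京</li><li id="l2">上海</li>'
    '</ul></body></html>'
).encode('utf-8')

# Decode to str, then build the tree with the lenient HTML parser.
html_tree = etree.HTML(raw_bytes.decode('utf-8'))

# For a path that selects nodes, xpath() returns a list, even for a single match.
cities = html_tree.xpath('//ul/li/text()')
print(cities)  # ['北京', '上海']
```

Because the result is a list, a single value is usually taken with an index such as `cities[0]` after checking that the list is not empty.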

            
            
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8"/>
    <title>Title</title>
</head>
<body>
    <ul>
        <li id="l1" class="c1">北京</li>
        <li id="l2">上海</li>
        <li id="c3">深圳</li>
        <li id="c4">武汉</li>
    </ul>
<!--    <ul>-->
<!--        <li>大连</li>-->
<!--        <li>锦州</li>-->
<!--        <li>沈阳</li>-->
<!--    </ul>-->
</body>
</html>
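
Note that the sample above closes its `<meta charset="UTF-8"/>` tag, which matters because `etree.parse()` uses the XML parser by default and rejects markup that is not well-formed. Real-world pages often leave such tags unclosed. The sketch below, which uses a hypothetical fragment named `loose_html` and an in-memory `io.BytesIO` instead of a file on disk, shows one way to handle that case by passing an `etree.HTMLParser` to `etree.parse()`.

```python
import io
from lxml import etree

# Hypothetical fragment with an unclosed <meta> tag, as is common in real-world HTML.
loose_html = (
    '<html><head><meta charset="UTF-8"><title>Title</title></head>'
    '<body><ul><li id="l1">北京</li></ul></body></html>'
).encode('utf-8')

# The default XML parser rejects the unclosed tag.
try:
    etree.parse(io.BytesIO(loose_html))
    strict_ok = True
except etree.XMLSyntaxError:
    strict_ok = False

# Passing an HTMLParser makes etree.parse() tolerant of such markup.
parser = etree.HTMLParser(encoding='utf-8')
loose_tree = etree.parse(io.BytesIO(loose_html), parser)
loose_result = loose_tree.xpath('//li[@id="l1"]/text()')

print(strict_ok)     # False
print(loose_result)  # ['北京']
```

The same `parser` argument works when the first argument is a file path, so a local file saved from a browser can be parsed this way as well.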
            
            
from lxml import etree
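# Two ways to build the tree for XPath parsing:
# (1) a local file: etree.parse()
# (2) data from a server response, response.read().decode('utf-8'): etree.HTML()  (the important case)
# Parse a local file with XPath
tree = etree.parse('path.html')
# tree.xpath('XPath expression')
# Find the li elements under ul
# li_list = tree.xpath('//body/ul/li')
# Find all li elements that have an id attribute
# text() returns the text content of the element
# li_list = tree.xpath('//ul/li[@id]/text()')
# Find the li element whose id is l1; mind the quotes: the Python string uses
# single quotes, so the attribute value inside it uses double quotes
# li_list = tree.xpath('//ul/li[@id="l1"]/text()')
# Get the value of the class attribute of the li element whose id is l1
# li = tree.xpath('//ul/li[@id="l1"]/@class')
# Find the li elements whose id contains "l"
# li_list = tree.xpath('//ul/li[contains(@id,"l")]/text()')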
# Find the li elements whose id starts with "c"
# li_list = tree.xpath('//ul/li[starts-with(@id,"c")]/text()')
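# Find the li element whose id is l1 and whose class is c1
# li_list = tree.xpath('//ul/li[@id="l1" and @class="c1"]/text()')
# Combine two paths with the union operator |: the li with id l1 or id l2
li_list = tree.xpath('//ul/li[@id="l1"]/text() | //ul/li[@id="l2"]/text()')
# Print the result list and its length
print(li_list)
print(len(li_list))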
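
To try the queries above without a file on disk, the sketch below inlines the `<ul>` from the sample HTML and runs each expression through `etree.HTML()`. The variable names are chosen for illustration only; the expressions themselves are the ones from the script above.

```python
from lxml import etree

# The list from the sample HTML, inlined so the snippet runs on its own.
sample = '''
<ul>
    <li id="l1" class="c1">北京</li>
    <li id="l2">上海</li>
    <li id="c3">深圳</li>
    <li id="c4">武汉</li>
</ul>
'''
tree = etree.HTML(sample)

# All li elements that have an id attribute.
with_id = tree.xpath('//ul/li[@id]/text()')
# Exact attribute match.
by_id = tree.xpath('//ul/li[@id="l1"]/text()')
# Attribute value instead of text.
class_value = tree.xpath('//ul/li[@id="l1"]/@class')
# Substring and prefix matches on the id.
contains_l = tree.xpath('//ul/li[contains(@id,"l")]/text()')
starts_with_c = tree.xpath('//ul/li[starts-with(@id,"c")]/text()')
# Logical and inside one predicate.
both = tree.xpath('//ul/li[@id="l1" and @class="c1"]/text()')
# Union of two paths.
union = tree.xpath('//ul/li[@id="l1"]/text() | //ul/li[@id="l2"]/text()')

print(with_id)        # ['北京', '上海', '深圳', '武汉']
print(by_id)          # ['北京']
print(class_value)    # ['c1']
print(contains_l)     # ['北京', '上海']
print(starts_with_c)  # ['深圳', '武汉']
print(both)           # ['北京']
print(union)          # ['北京', '上海']
```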