Java form-based crawler

See the attached code.

Approach 1

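import java.io.IOException;
import java.nio.charset.StandardCharsets;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

    public static void Posts() throws IOException {

        // Create the HTTP POST request
        HttpPost httpPost = new HttpPost("http://www.cninfo.com.cn/new/disclosure");

        // Set the request header
        httpPost.setHeader("Content-Type", "application/x-www-form-urlencoded");

        // Build the form body as a string, substituting the pageNum parameter dynamically
        String pageNum = "1"; // page number to request
        String postData = String.format("column=szse_gem_latest&pageNum=%s&pageSize=30&sortName=&sortType=&clusterFlag=true", pageNum);

        // Use the form body as the request entity, encoded as UTF-8
        httpPost.setEntity(new StringEntity(postData, StandardCharsets.UTF_8));

        // Create the HttpClient and send the request; try-with-resources closes both client and response
        try (CloseableHttpClient httpClient = HttpClients.createDefault();
             CloseableHttpResponse response = httpClient.execute(httpPost)) {
            // Handle the response
            int statusCode = response.getStatusLine().getStatusCode();
            if (statusCode == 200) {
                // Read the response body
                String responseBody = EntityUtils.toString(response.getEntity(), StandardCharsets.UTF_8);
                System.out.println("Response body: " + responseBody);
            } else {
                System.err.println("Request failed, status code: " + statusCode);
            }
        }

    }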
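
Since approach 1 substitutes pageNum into the body string, crawling several pages mostly comes down to generating one body per page. Below is a minimal sketch of that idea using only the standard library. The class name PageBodies and the helpers buildPostData and buildBodies are hypothetical names introduced for illustration. The field names and the page size of 30 are copied from the request above. Sending each body is left to the Posts() logic.

```java
import java.util.ArrayList;
import java.util.List;

public class PageBodies {
    // Hypothetical helper: builds the form body for a single page.
    // Field names and pageSize=30 are taken from the request in approach 1.
    static String buildPostData(int pageNum) {
        return "column=szse_gem_latest&pageNum=" + pageNum
                + "&pageSize=30&sortName=&sortType=&clusterFlag=true";
    }

    // Hypothetical helper: builds one body per page, firstPage..lastPage inclusive.
    static List<String> buildBodies(int firstPage, int lastPage) {
        List<String> bodies = new ArrayList<>();
        for (int page = firstPage; page <= lastPage; page++) {
            bodies.add(buildPostData(page));
        }
        return bodies;
    }

    public static void main(String[] args) {
        // Print the bodies for pages 1 to 3; each could be passed to a StringEntity.
        for (String body : buildBodies(1, 3)) {
            System.out.println(body);
        }
    }
}
```

Plain concatenation is enough here because every value is a fixed ASCII token or an integer. Values containing spaces, `&`, `=` or non-ASCII characters would need URL encoding first, which is what approach 2 takes care of.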

Approach 2

Set the parameters with a list and encode them with UrlEncodedFormEntity.

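import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.http.Consts;
import org.apache.http.NameValuePair;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.util.EntityUtils;

public static void PostForm() throws IOException {
        // 1. Create the HttpClient instance
        CloseableHttpClient httpClient = HttpClients.createDefault();

        // 2. Set the form parameters
        List<NameValuePair> kv = new ArrayList<>();
        kv.add(new BasicNameValuePair("column", "szse_gem_latest"));
        kv.add(new BasicNameValuePair("pageNum", "1"));
        kv.add(new BasicNameValuePair("pageSize", "30"));
        kv.add(new BasicNameValuePair("sortName", ""));
        kv.add(new BasicNameValuePair("sortType", ""));
        kv.add(new BasicNameValuePair("clusterFlag", "true"));

        // 3. Create the HttpPost instance
        HttpPost httpPost = new HttpPost("http://www.cninfo.com.cn/new/disclosure");
        httpPost.setHeader("Content-Type", "application/x-www-form-urlencoded; charset=UTF-8");
        httpPost.addHeader("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36");

        // 4. Attach the form parameters to the POST request
        httpPost.setEntity(new UrlEncodedFormEntity(kv, Consts.UTF_8));

        // 5. Execute the request and get the response
        CloseableHttpResponse response = httpClient.execute(httpPost);

        // 6. Read the response body
        System.out.println(EntityUtils.toString(response.getEntity(), Consts.UTF_8));

        // 7. Release resources
        response.close();
        httpClient.close();

    }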
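
To make the encoding step in approach 2 more concrete, here is a rough stand-alone sketch of application/x-www-form-urlencoded encoding built on the JDK's URLEncoder (the Charset overload needs Java 10 or later). It is an illustration of the general technique, not the internals of UrlEncodedFormEntity. The class name FormEncoder and the sample parameter `q` are hypothetical and do not belong to the cninfo request.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class FormEncoder {
    // Hypothetical helper: joins name=value pairs with '&',
    // URL-encoding each name and value as UTF-8.
    static String encode(Map<String, String> params) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            body.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "="
                    + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the body is reproducible.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("column", "szse_gem_latest");
        params.put("q", "a b&c"); // hypothetical value with characters that need escaping
        System.out.println(encode(params)); // prints column=szse_gem_latest&q=a+b%26c
    }
}
```

The second parameter shows why a hand-built string like the one in approach 1 can be fragile: an unescaped `&` inside a value would be read as a separator between fields.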