A Rust program, written with DeepSeek's help, that converts XML to CSV using quick_xml

Prompt

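Please use the Rust quickxml library to read the row and c tag information from an XML file and output it in CSV format. Requirements: for a numeric c, output the content of its <v> tag; for a string c (t = "inlineStr"), output the content of its <t> tag; the r attribute of row is the row number; the r attribute of c is the column letters followed by the row number; if the columns are not contiguous, fill in empty columns in between; separate columns with commas; start a new line once a row has been fully output; if a row is entirely empty, do not output it; to make checking easier, start each line with the row number from the r attribute of row; implement this with a memory-mapped file. Output only the code and do nothing else.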
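The gap-filling and empty-row rules in the prompt can be pictured with a small standalone sketch, separate from the generated program. The helper name `render_row` and the `(column index, value)` input shape are hypothetical, chosen only for illustration. Doubling embedded double quotes is an assumption based on common CSV practice; the prompt does not ask for it.

```rust
/// Render one row as a CSV line: the row number first, then every column.
/// `cells` holds (zero-based column index, value) pairs; columns that are
/// missing become empty fields. Returns None for a row without any cell,
/// so the caller can skip it.
fn render_row(row_num: u32, cells: &[(usize, &str)]) -> Option<String> {
    // Width of the row = highest column index + 1; None if there are no cells.
    let width = cells.iter().map(|(col, _)| col + 1).max()?;
    let mut slots: Vec<Option<&str>> = vec![None; width];
    for &(col, val) in cells {
        slots[col] = Some(val);
    }
    let mut line = row_num.to_string();
    for slot in slots {
        line.push(',');
        if let Some(val) = slot {
            // Assumption: quote every value and double any embedded quote.
            line.push('"');
            line.push_str(&val.replace('"', "\"\""));
            line.push('"');
        }
    }
    Some(line)
}

fn main() {
    // Columns A and C are present, B is missing, so one empty field is inserted.
    if let Some(line) = render_row(3, &[(0, "17.00"), (2, "TRUCK")]) {
        println!("{}", line); // prints: 3,"17.00",,"TRUCK"
    }
}
```

A row with no cells yields `None`, which corresponds to the "do not output an entirely empty row" rule.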

rust
use quick_xml::events::Event;
use quick_xml::Reader;
use memmap2::Mmap;
use std::fs::File;
use std::io::{self, Write};
use std::path::Path;

fn main() -> io::Result<()> {
    let file_path = "input.xml";
    let file = File::open(file_path)?;
    let mmap = unsafe { Mmap::map(&file)? };
    
    let mut reader = Reader::from_reader(&mmap[..]);
    reader.trim_text(true);
    
    let mut buf = Vec::new();
    let mut current_row: Option<u32> = None;
    let mut current_cells: Vec<Option<String>> = Vec::new();
    let mut output = io::stdout();
    
    loop {
        match reader.read_event_into(&mut buf) {
            Ok(Event::Start(ref e)) => {
                match e.name().as_ref() {
                    b"row" => {
                        if let Some(attrs) = e.attributes().filter_map(Result::ok).find(|a| a.key.as_ref() == b"r") {
                            if let Ok(r_val) = String::from_utf8(attrs.value.to_vec()) {
                                current_row = r_val.parse().ok();
                            }
                        }
                        current_cells.clear();
                    }
                    b"c" => {
                        let mut cell_type = None;
                        let mut cell_ref = None;
                        
                        for attr in e.attributes().filter_map(Result::ok) {
                            match attr.key.as_ref() {
                                b"t" => {
                                    cell_type = String::from_utf8(attr.value.to_vec()).ok();
                                }
                                b"r" => {
                                    cell_ref = String::from_utf8(attr.value.to_vec()).ok();
                                }
                                _ => {}
                            }
                        }
                        
                        let mut cell_value = None;
                        let mut in_v = false;
                        let mut in_t = false;
                        
                        loop {
                            match reader.read_event_into(&mut buf) {
                                Ok(Event::Start(ref e)) => {
                                    match e.name().as_ref() {
                                        b"v" => in_v = true,
                                        b"t" => in_t = true,
                                        _ => {}
                                    }
                                }
                                Ok(Event::Text(e)) => {
                                    if in_v || in_t {
                                        if let Ok(text) = e.unescape() {
                                            cell_value = Some(text.to_string());
                                        }
                                    }
                                }
                                Ok(Event::End(ref e)) => {
                                    if e.name().as_ref() == b"c" {
                                        break;
                                    }
                                    if e.name().as_ref() == b"v" {
                                        in_v = false;
                                    }
                                    if e.name().as_ref() == b"t" {
                                        in_t = false;
                                    }
                                }
                                Ok(Event::Eof) => break,
                                _ => {}
                            }
                        }
                        
                        if let Some(c_ref) = cell_ref {
                            if let Some(col) = parse_column(&c_ref) {
                                while current_cells.len() <= col {
                                    current_cells.push(None);
                                }
                                current_cells[col] = cell_value;
                            }
                        }
                    }
                    _ => {}
                }
            }
            Ok(Event::End(ref e)) => {
                if e.name().as_ref() == b"row" {
                    if let Some(row_num) = current_row {
                        if current_cells.iter().any(Option::is_some) {
                            write!(output, "{}", row_num)?;
                            for cell in &current_cells {
                                write!(output, ",")?;
                                if let Some(val) = cell {
                                    // Double embedded quotes so the quoted field stays valid CSV.
                                    write!(output, "\"{}\"", val.replace('"', "\"\""))?;
                                }
                            }
                            writeln!(output)?;
                        }
                    }
                    current_row = None;
                    current_cells.clear();
                }
            }
            Ok(Event::Eof) => break,
            Err(e) => {
                eprintln!("Error reading XML: {}", e);
                break;
            }
            _ => {}
        }
        buf.clear();
    }
    
    Ok(())
}

fn parse_column(ref_str: &str) -> Option<usize> {
    let mut col_str = String::new();
    for c in ref_str.chars() {
        if c.is_ascii_alphabetic() {
            col_str.push(c);
        } else {
            break;
        }
    }
    
    if col_str.is_empty() {
        return None;
    }
    
    let mut col = 0;
    for c in col_str.chars() {
        col = col * 26 + (c.to_ascii_uppercase() as usize - 'A' as usize + 1);
    }
    Some(col - 1)
}
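
As a quick check of the column-letter arithmetic, here is a standalone sketch of the same idea: the letters are read as a base-26 number whose digits run from 1 to 26 (A=1 ... Z=26), and the result is shifted to a zero-based index. The name `column_index` is hypothetical, and the overflow checks are an addition that `parse_column` above does not have.

```rust
/// Convert the letter prefix of a cell reference such as "AB12"
/// into a zero-based column index (A -> 0, Z -> 25, AA -> 26).
/// Returns None if the reference does not start with a letter
/// or the column number would overflow usize.
fn column_index(cell_ref: &str) -> Option<usize> {
    let mut col = 0usize;
    let mut seen_letter = false;
    for b in cell_ref.bytes().take_while(|b| b.is_ascii_alphabetic()) {
        let digit = (b.to_ascii_uppercase() - b'A') as usize + 1;
        col = col.checked_mul(26)?.checked_add(digit)?;
        seen_letter = true;
    }
    if seen_letter { Some(col - 1) } else { None }
}

fn main() {
    for r in ["A1", "Z9", "AA3", "c7", "12"] {
        println!("{} -> {:?}", r, column_index(r));
    }
}
```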

The corresponding Cargo.toml

toml
[package]
name = "xml-to-csv"
version = "0.1.0"
edition = "2024"

[dependencies]
quick-xml = "0.31.0"
memmap2 = "0.9.0"

Build and run

bash
export CARGO_INCREMENTAL=1
export PATH=/par:/par/mold240/bin:$PATH
cargo build --release

/par/xmlcsv/target/release# ./xml-to-csv
1,"1","15519","785","1","17.00","24386.67","0.04","0.02","N","O","35137.0","35107.0","35146.0","DELIVER IN PERSON","TRUCK","to beans x-ray carefull"
2,"1","6731","732","2","36.00","58958.28","0.09","0.06","N","O","35167.0","35123.0","35175.0","TAKE BACK RETURN","MAIL","according to the final foxes. qui"

/par# time xmlcsv/target/release/xml-to-csv >quickxml.csv

real    1m28.133s
user    0m5.104s
sys     0m5.273s
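
In the timing above, wall-clock time is far larger than user plus sys CPU time, which suggests the run spends most of its time waiting rather than computing; the post does not determine where. One inexpensive experiment, offered as an assumption rather than a diagnosis, is to lock stdout once and batch the many small writes through a `BufWriter`. The helper `write_lines` and the 64 KiB buffer size are hypothetical choices for this sketch.

```rust
use std::io::{self, BufWriter, Write};

/// Write pre-rendered CSV lines to any writer, one per line.
fn write_lines<W: Write>(out: &mut W, lines: &[&str]) -> io::Result<()> {
    for line in lines {
        writeln!(out, "{}", line)?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Lock stdout once and collect small writes in a 64 KiB buffer
    // (the size is an arbitrary choice for this sketch).
    let mut out = BufWriter::with_capacity(64 * 1024, io::stdout().lock());
    write_lines(&mut out, &["1,\"a\",,\"b\"", "2,\"c\""])?;
    // Flush explicitly so that a write error is reported instead of being
    // silently dropped when the BufWriter goes out of scope.
    out.flush()
}
```

Because `write_lines` is generic over `Write`, the same code can be pointed at a `Vec<u8>` or a file to compare timings.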