1. Software Introduction
Program and source code download links are provided at the end of this article.
Z-Ant (Zig-Ant) is a comprehensive open-source neural network framework for deploying optimized AI models on microcontrollers and edge devices. Built with Zig, Z-Ant provides end-to-end tooling for model optimization, code generation, and real-time inference on resource-constrained hardware.
2. Comprehensive Model Deployment
- ONNX Model Support: Full compatibility with ONNX-format models
- Cross-platform Compilation: ARM Cortex-M, RISC-V, x86, and more
- Static Library Generation: Generate optimized static libraries for any target architecture
- Real-time Inference: Microsecond-level prediction times on microcontrollers
3. Advanced Optimization Engine
- Quantization: Automatic model quantization with dynamic and static options (a small example follows this list)
- Pruning: Neural network pruning for reduced model size
- Buffer Optimization: Memory-efficient tensor operations
- Flash vs RAM Execution: Configurable execution strategies
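As a concrete illustration of the static option, the sketch below implements textbook affine (scale/zero-point) quantization in C; the type and helper names are hypothetical illustrations, not part of Z-Ant's generated API.

```c
#include <stdint.h>
#include <math.h>

/* Affine quantization: real_value ≈ scale * (quantized_value - zero_point). */
typedef struct {
    float scale;
    int32_t zero_point;
} QuantParams;

/* Derive scale/zero-point once from a calibration range [min, max]. */
static QuantParams quant_params_from_range(float min, float max) {
    QuantParams p;
    p.scale = (max - min) / 255.0f;  /* int8 spans 256 levels */
    p.zero_point = (int32_t)lroundf(-128.0f - min / p.scale);
    return p;
}

static int8_t quantize(float x, QuantParams p) {
    int32_t q = (int32_t)lroundf(x / p.scale) + p.zero_point;
    if (q < -128) q = -128;  /* clamp to int8 range */
    if (q > 127) q = 127;
    return (int8_t)q;
}

static float dequantize(int8_t q, QuantParams p) {
    return p.scale * (float)(q - p.zero_point);
}
```

With static quantization, `scale` and `zero_point` are computed once from calibration data, so at inference time each weight or activation maps to int8 with a single multiply and round.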
4. GUI Interface 🖥
Z-Ant includes an experimental cross-platform GUI built with SDL for basic model selection and code generation. Note that the GUI is currently unstable and under active development; we recommend using the command-line interface for production workflows.
📷 ImageToTensor Processing
- JPEG Decoding: Complete JPEG image processing pipeline
- Multiple Color Spaces: RGB, YUV, and grayscale support
- Hardware Optimization: SIMD and platform-specific optimizations
- Preprocessing Pipeline: Normalization, resizing, and format conversion (illustrated after this list)
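As an example of the normalization and format-conversion stage, here is a minimal C sketch that converts an interleaved 8-bit RGB image into a planar float32 tensor scaled to [0, 1]; the function name and the CHW layout choice are illustrative assumptions, not Z-Ant's actual pipeline.

```c
#include <stdint.h>
#include <stddef.h>

/* Interleaved 8-bit RGB (HWC) -> planar float32 (CHW), scaled to [0, 1]. */
static void rgb8_to_chw_f32(const uint8_t *rgb, float *out,
                            size_t height, size_t width) {
    size_t plane = height * width;  /* elements per channel plane */
    for (size_t y = 0; y < height; ++y) {
        for (size_t x = 0; x < width; ++x) {
            size_t px = y * width + x;
            for (size_t c = 0; c < 3; ++c) {
                out[c * plane + px] = (float)rgb[px * 3 + c] / 255.0f;
            }
        }
    }
}
```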
🔧 Extensive ONNX Support
- 30+ Operators: Comprehensive coverage of neural network operations
- Multiple Data Types: Float32, Int64, Bool, and more
- Dynamic Shapes: Support for variable input dimensions
- Custom Operators: Extensible operator framework
Why Z-Ant?
- 🚫 Lack of DL Support: Devices like the TI Sitara, Raspberry Pi Pico, or ARM Cortex-M lack comprehensive deep learning libraries
- 🌍 Open Source: A complete end-to-end NN deployment and optimization solution
- 🎓 Research-Inspired: Implements cutting-edge optimization techniques inspired by MIT's Han Lab research
- 🏛 Academic Collaboration: Developed in collaboration with institutions such as Politecnico di Milano
- ⚡ Performance First: Designed for real-time inference with minimal resource usage
- 🔧 Developer Friendly: Clear APIs, extensive documentation, and practical examples
Use Cases
- 🏭 Edge AI: Real-time anomaly detection and predictive maintenance
- 🤖 IoT & Autonomous Systems: Lightweight AI models for drones, robots, vehicles, and IoT devices
- 📱 Mobile Applications: On-device inference for privacy-preserving AI
- 🏥 Medical Devices: Real-time health monitoring and diagnostics
- 🎮 Gaming: AI-powered gameplay enhancement on embedded systems
Roadmap to a Best-in-Class TinyML Engine
To establish Z-Ant as the premier tinyML inference engine, we are pursuing several key improvements:
🔥 Performance Optimizations
**Ultra-Low Latency Inference**
- Custom Memory Allocators: Zero-allocation inference with pre-allocated memory pools (see the sketch after this list)
- In-Place Operations: Minimize memory copies through tensor operation fusion
- SIMD Vectorization: ARM NEON, RISC-V Vector extensions, and x86 AVX optimizations
- Assembly Kernels: Hand-optimized assembly for critical operations (matrix multiplication, convolution)
- Cache-Aware Algorithms: Memory access patterns optimized for L1/L2 cache efficiency
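A minimal C sketch of the pre-allocated pool idea, assuming one static arena that is reset between inferences; the names are illustrative rather than Z-Ant's allocator API.

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed-pool (arena) allocator: all inference scratch memory is carved
 * from one static buffer, so no heap allocation happens at runtime. */
#define ARENA_SIZE (32 * 1024)

typedef struct {
    uint8_t buf[ARENA_SIZE];
    size_t used;
} Arena;

static void *arena_alloc(Arena *a, size_t size) {
    size_t aligned = (a->used + 7u) & ~((size_t)7u);  /* 8-byte alignment */
    if (aligned + size > ARENA_SIZE) return NULL;     /* pool exhausted */
    a->used = aligned + size;
    return &a->buf[aligned];
}

/* Reset between inferences: O(1), releases everything at once. */
static void arena_reset(Arena *a) { a->used = 0; }
```

Because `arena_reset` just rewinds an offset, freeing all scratch tensors costs O(1), and the worst-case memory footprint is known at build time.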
**Advanced Model Optimization**
- Dynamic Quantization: Runtime precision adjustment based on input characteristics
- Structured Pruning: Channel- and block-level pruning for hardware-friendly sparsity
- Knowledge Distillation: Automatic teacher-student model compression pipeline
- Neural Architecture Search (NAS): Hardware-aware model architecture optimization
- Binary/Ternary Networks: Extreme quantization for ultra-low-power inference
⚡ Hardware Acceleration
**Microcontroller-Specific Optimizations**
- DSP Instruction Utilization: Leverage ARM Cortex-M DSP instructions and RISC-V packed SIMD
- DMA-Accelerated Operations: Offload data movement to DMA controllers
- Flash Execution Strategies: XIP (Execute-in-Place) optimization for flash-resident models
- Low-Power Modes: Dynamic frequency scaling and sleep-mode integration
- Hardware Security Modules: Secure model storage and execution
**Emerging Hardware Support**
- NPU Integration: Support for dedicated neural processing units (e.g., Arm Ethos, Intel Movidius)
- FPGA Acceleration: Custom hardware generation for ultra-performance inference
- GPU Compute: OpenCL/CUDA kernels for edge GPU acceleration
- Neuromorphic Computing: Spike-based neural network execution
🧠 Advanced AI Capabilities
**Model Compression & Acceleration**
- Lottery Ticket Hypothesis: Sparse subnetwork discovery and training
- Progressive Quantization: Gradual precision reduction during training/deployment
- Magnitude-Based Pruning: Automatic weight-importance analysis
- Channel Shuffling: Network reorganization for efficient inference
- Tensor Decomposition: Low-rank approximation for parameter reduction (a worked example follows this list)
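To see why low-rank approximation shrinks a model, consider the standard factorization argument (general linear-algebra background, not a Z-Ant-specific result):

```latex
W \in \mathbb{R}^{m \times n} \approx A B, \qquad
A \in \mathbb{R}^{m \times r},\; B \in \mathbb{R}^{r \times n},\; r \ll \min(m, n)
```

Storage falls from mn to r(m + n) values: a 512x512 layer (262,144 weights) factored at rank r = 32 needs only 32 x (512 + 512) = 32,768 weights, an 8x reduction.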
Adaptive Inference
- Early Exit Networks: Conditional computation based on input complexity (see the sketch after this list)
- Dynamic Model Selection: Runtime model switching based on resource availability
- Cascaded Inference: Multi-stage models with progressive complexity
- Attention Mechanism Optimization: Efficient transformer and attention implementations
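A conceptual C sketch of the early-exit pattern: run a cheap first stage and skip deeper stages when its confidence already clears a threshold. The stage functions here are hypothetical placeholders, not Z-Ant APIs.

```c
/* Hypothetical stage functions: each fills `probs` with class
 * probabilities, writes the top probability to `confidence`, and
 * returns the predicted label. */
int run_stage1(const float *input, float *probs, float *confidence);
int run_stage2(const float *input, float *probs, float *confidence);

#define EXIT_THRESHOLD 0.9f

/* Early-exit inference: stop after stage 1 on easy inputs. */
int classify(const float *input, float *probs) {
    float confidence;
    int label = run_stage1(input, probs, &confidence);
    if (confidence >= EXIT_THRESHOLD) {
        return label;  /* easy input: skip the expensive stages */
    }
    return run_stage2(input, probs, &confidence);
}
```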
🔧 Developer Experience & Tooling
**Advanced Profiling & Analysis**
- Hardware Performance Counters: Cycle-accurate performance measurement
- Energy Profiling: Per-operation power consumption analysis
- Memory Footprint Analysis: Detailed RAM/Flash usage breakdown
- Thermal Analysis: Temperature impact on inference performance
- Real-Time Visualization: Live performance monitoring dashboards
**Automated Optimization Pipeline**
- AutoML Integration: Automated hyperparameter tuning for target hardware
- Benchmark-Driven Optimization: Continuous performance regression testing
- Hardware-in-the-Loop Testing: Automated testing on real hardware platforms
- Model Validation: Accuracy-preservation verification throughout optimization
- Deploy-to-Production Pipeline: One-click deployment to embedded systems
🌐 Ecosystem & Integration
**Framework Interoperability**
- TensorFlow Lite Compatibility: Seamless migration from TFLite models
- PyTorch Mobile Integration: Direct PyTorch model deployment pipeline
- ONNX Runtime Parity: Feature-complete ONNX Runtime alternative
- MLflow Integration: Model versioning and experiment tracking
- Edge Impulse Compatibility: Integration with popular edge ML platforms
Production Deployment
- OTA Model Updates: Over-the-air model deployment and versioning
- A/B Testing Framework: Safe model rollout with performance comparison
- Federated Learning Support: Distributed training on edge devices
- Model Encryption: Secure model storage and execution
- Compliance Tools: GDPR, HIPAA, and safety-critical certifications
📊 Benchmarking & Validation
**Industry-Standard Benchmarks**
- MLPerf Tiny: Competitive performance on standard benchmarks
- EEMBC MLMark: Energy-efficiency measurements
- Custom TinyML Benchmarks: Domain-specific performance evaluation
- Real-World Workload Testing: Production-representative model validation
- Cross-Platform Consistency: Identical results across all supported hardware
Quality Assurance
- Fuzzing Infrastructure: Automated testing with random inputs
- Formal Verification: Mathematical proof of correctness for critical operations
- Hardware Stress Testing: Extended operation under extreme conditions
- Regression Test Suite: Comprehensive backward-compatibility testing
- Performance Monitoring: Continuous integration with performance tracking
🚀 Getting Started for Contributors
Prerequisites
- Zig Compiler: Install the latest Zig compiler
- Git: For version control and collaboration
- Basic Zig Knowledge: Improve Zig proficiency via Ziglings
Quick Setup
Clone the repository:
```bash
git clone https://github.com/ZIGTinyBook/Z-Ant.git
cd Z-Ant
```
Run tests to verify the setup:
```bash
zig build test --summary all
```
Generate code for a model:
```bash
zig build codegen -Dmodel=mnist-1
```
First-Time Contributors
Start here if you're new to Z-Ant:
- Run existing tests: Use `zig build test --summary all` to understand the codebase
- Try code generation: Use `zig build codegen -Dmodel=mnist-1` to see the workflow
- Read the documentation: Check the `/docs/` folder for detailed guides
- Review the Hackathon Guide: For specific guidance on the rendering and lowering pipeline, refer to HackathonGuide.md
Project Architecture
```text
Z-Ant/
├── src/            # Core source code
│   ├── Core/          # Neural network core functionality
│   ├── CodeGen/       # Code generation engine
│   ├── ImageToTensor/ # Image preprocessing pipeline
│   ├── onnx/          # ONNX model parsing
│   └── Utils/         # Utilities and helpers
├── tests/          # Comprehensive test suite
├── datasets/       # Sample models and test data
├── generated/      # Generated code output
├── examples/       # Arduino and microcontroller examples
└── docs/           # Documentation and guides
```
🛠️ Development Workflow
Quick Start Commands
```bash
# Run comprehensive tests
zig build test --summary all

# Generate code for a specific model
zig build codegen -Dmodel=mnist-1

# Test generated code
zig build test-codegen -Dmodel=mnist-1

# Compile static library for deployment
zig build lib -Dmodel=mnist-1 -Dtarget=thumb-freestanding -Dcpu=cortex_m33
```
Git Branching Strategy
We follow a structured branching strategy to ensure code quality and smooth collaboration:
Branch Types
- `main`: Stable, production-ready code for releases
- `feature/<feature-name>`: New features under development
- `fix/<issue-description>`: Bug fixes and patches
- `docs/<documentation-topic>`: Documentation improvements
- `test/<test-improvements>`: Test suite enhancements
Best Practices for Contributors
- Test Before Committing: Run `zig build test --summary all` before every commit
- Document Your Code: Follow Zig's doc-comments standard
- Small, Focused PRs: Keep pull requests small and focused on a single feature/fix
- Use Conventional Commits: Follow commit message conventions (feat:, fix:, docs:, etc.)
🔧 Using Z-Ant
Development Requirements
- Install the latest Zig compiler
- Improve Zig proficiency via Ziglings
Running Tests
Add tests to `build.zig/test_list`.
- Regular tests: `zig build test --summary all`
- Heavy computational tests: `zig build test -Dheavy --summary all`
Generating Code for Models
```bash
zig build codegen -Dmodel=model_name [-Dlog -Duser_tests=user_tests.json]
```
Generated code will be placed in:
```text
generated/model_name/
├── lib_{model_name}.zig
├── test_{model_name}.zig
└── user_tests.json
```
Testing Generated Models
```bash
zig build test-codegen -Dmodel=model_name
```
Integrating into Your Project
Build the static library:
```bash
zig build lib -Dmodel=model_name -Dtarget={arch} -Dcpu={cpu}
```
Linking with CMake:
```cmake
target_link_libraries(your_project PUBLIC path/to/libzant.a)
```
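Once the library is linked, your firmware calls the generated entry point from C. The exported symbol and its exact signature are defined by the generated `lib_{model_name}.zig`, so treat the `predict`-style declaration below as a hypothetical sketch and check the generated code for the real interface.

```c
#include <stdint.h>

/* Hypothetical entry point; the real exported symbol and signature come
 * from the generated lib_{model_name}.zig. */
extern void predict(float *input, uint32_t *input_shape,
                    uint32_t shape_len, float **result);

int main(void) {
    /* Example for a 28x28 grayscale MNIST-style input. */
    static float input[1 * 1 * 28 * 28] = {0};
    uint32_t shape[4] = {1, 1, 28, 28};
    float *output = 0;

    predict(input, shape, 4, &output);  /* run one inference */

    /* `output` now points at the model's output tensor (e.g., 10 logits). */
    return 0;
}
```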
Logging (Optional)
To set a custom log function from your C code:
```c
extern void setLogFunction(void (*log_function)(uint8_t *string));
```
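For example, a firmware project might forward Z-Ant's log messages to its own output routine; everything here besides the `setLogFunction` declaration above is illustrative (`printf` would typically be replaced by a UART write on bare metal):

```c
#include <stdint.h>
#include <stdio.h>

extern void setLogFunction(void (*log_function)(uint8_t *string));

/* Forward framework log messages to the host/serial console. */
static void my_logger(uint8_t *string) {
    printf("[zant] %s", (char *)string);
}

void init_logging(void) {
    setLogFunction(my_logger);
}
```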
🏗️ Build System (`build.zig`)
Available Build Commands
Core Commands
- Standard build (all targets): `zig build`
- Run all unit tests: `zig build test --summary all`
- Code generation for a specified model: `zig build codegen -Dmodel=model_name`
- Static library compilation for deployment: `zig build lib -Dmodel=model_name`
Testing Commands
- Test a specific generated model library: `zig build test-generated-lib -Dmodel=model_name`
- OneOp model testing: `zig build test-codegen-gen` to generate oneOperation test models, then `zig build test-codegen` to test all generated oneOperation models
- ONNX parser testing: `zig build onnx-parser`
Profiling & Performance
- Build the main executable for profiling: `zig build build-main -Dmodel=model_name`
Command-Line Options
**Target & Architecture Options**
- `-Dtarget=<arch>`: Target architecture (e.g., `thumb-freestanding`, `native`)
- `-Dcpu=<cpu>`: CPU model (e.g., `cortex_m33`, `cortex_m4`)
Model & Path Options
- `-Dmodel=<name>`: Model name (default: `mnist-8`)
- `-Dmodel_path=<path>`: Custom ONNX model path
- `-Dgenerated_path=<path>`: Output directory for generated code
- `-Doutput_path=<path>`: Output directory for the compiled library
Code Generation Options
- `-Dlog=true|false`: Enable detailed logging during code generation
- `-Duser_tests=<path>`: Specify a custom user-tests JSON file
- `-Dshape=<shape>`: Input tensor shape
- `-Dtype=<type>`: Input data type (default: `f32`)
- `-Dcomm=true|false`: Generate code with comments
- `-Ddynamic=true|false`: Enable dynamic memory allocation
Testing Options
- `-Dheavy=true|false`: Run heavy computational tests
- `-Dtest_name=<name>`: Run a specific test by name
**Debug & Profiling Options**
- `-Dtrace_allocator=true|false`: Use a tracing allocator for debugging (default: `true`)
- `-Dallocator=<type>`: Allocator type to use (default: `raw_c_allocator`)
Common Usage Examples
```bash
# Generate code for the MNIST model with logging
zig build codegen -Dmodel=mnist-1 -Dlog=true

# Build a static library for ARM Cortex-M33
zig build lib -Dmodel=mnist-1 -Dtarget=thumb-freestanding -Dcpu=cortex_m33

# Run tests with heavy computational tests enabled
zig build test -Dheavy=true --summary all

# Generate code with custom paths and comments
zig build codegen -Dmodel=custom_model -Dmodel_path=my_models/custom.onnx -Dgenerated_path=output/ -Dcomm=true

# Build a library with a custom output location
zig build lib -Dmodel=mnist-1 -Doutput_path=/path/to/deployment/

# Run a specific test
zig build test -Dtest_name=tensor_math_test

# Build a profiling executable for performance analysis
zig build build-main -Dmodel=mnist-1 -Dtarget=native
```
5. Software Download
The information in this article comes from the author's GitHub repository: https://github.com/ZantFoundation/Z-Ant