A Free, Open-Source ETL Tool Implemented in Python

I built this to quickly run aggregation calculations over file-based data such as CSV.

https://github.com/hebian1994/etl_react_flow

🛠️ ETL Flow Builder

A powerful, visual ETL (Extract-Transform-Load) tool built with React, React Flow, Material-UI, and a Python Flask backend. Design and manage complex data pipelines with a user-friendly interface and DAG-based execution.


📸 Preview

📸 Structure


🚀 Features

🔄 Flow Management

  • Create, edit, and delete ETL flows
  • Flow version history, with the ability to restore previous versions
  • Real-time configuration and validation

🧩 Node System

  • Modular nodes for various ETL operations:
    • File Input
    • Data Viewer
    • Filter
    • Left Join
  • Custom node configuration panels
  • Node connection validation and schema enforcement
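
To make the Filter and Left Join nodes concrete, here is a minimal sketch of their semantics. It is not the project's code: the project lists Polars as its engine, while this sketch uses plain Python rows (lists of dicts) so it runs with the standard library only. The function names `filter_rows` and `left_join` and the sample data are hypothetical.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[object]]


def filter_rows(rows: List[Row], predicate: Callable[[Row], bool]) -> List[Row]:
    """Filter node: keep only the rows for which the predicate holds."""
    return [row for row in rows if predicate(row)]


def left_join(left: List[Row], right: List[Row], key: str) -> List[Row]:
    """Left Join node: keep every left row; unmatched rows get None for right columns."""
    # Index the right side by key so each lookup is a dictionary access.
    index: Dict[object, List[Row]] = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    right_columns = [c for c in (right[0] if right else {}) if c != key]
    joined: List[Row] = []
    for row in left:
        matches = index.get(row[key])
        if matches:
            for match in matches:
                joined.append({**row, **{c: match[c] for c in right_columns}})
        else:
            joined.append({**row, **{c: None for c in right_columns}})
    return joined


# Hypothetical sample data standing in for two File Input nodes.
orders = [
    {"order_id": 1, "customer_id": 10, "amount": 250},
    {"order_id": 2, "customer_id": 11, "amount": 40},
    {"order_id": 3, "customer_id": 12, "amount": 900},
]
customers = [
    {"customer_id": 10, "name": "Ada"},
    {"customer_id": 11, "name": "Linus"},
]

large_orders = filter_rows(orders, lambda r: r["amount"] > 100)
result = left_join(large_orders, customers, key="customer_id")
```

Order 3 has no matching customer, so it stays in `result` with `name` set to `None`. That is the behavior that distinguishes a left join from an inner join.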

⚙️ Data Processing

  • DAG-based flow execution
  • Schema propagation and management
  • Preview intermediate data at any node
  • Configuration validation before execution
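
DAG-based execution means each node runs only after all of its upstream nodes have finished. One common way to derive such an order is Kahn's algorithm, sketched below. The README does not say which algorithm the project uses, so treat this as an illustration. The node names and the `execution_order` function are hypothetical.

```python
from collections import deque
from typing import Dict, List, Tuple


def execution_order(nodes: List[str], edges: List[Tuple[str, str]]) -> List[str]:
    """Return nodes in an order where every node comes after its upstream nodes."""
    indegree: Dict[str, int] = {n: 0 for n in nodes}
    downstream: Dict[str, List[str]] = {n: [] for n in nodes}
    for source, target in edges:
        downstream[source].append(target)
        indegree[target] += 1

    # Start with the nodes that have no inputs (e.g. File Input nodes).
    ready = deque(n for n in nodes if indegree[n] == 0)
    order: List[str] = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in downstream[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    # Any node left unscheduled is part of a cycle.
    if len(order) != len(nodes):
        raise ValueError("flow contains a cycle and cannot be executed as a DAG")
    return order


nodes = ["orders_csv", "customers_csv", "filter", "left_join", "viewer"]
edges = [
    ("orders_csv", "filter"),
    ("filter", "left_join"),
    ("customers_csv", "left_join"),
    ("left_join", "viewer"),
]
order = execution_order(nodes, edges)
```

Previewing intermediate data at a node fits the same model: only that node's upstream nodes need to run.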

🧱 System Architecture

🖥️ Frontend

  • Framework: React + TypeScript
  • Visualization: React Flow
  • UI Library: Material-UI (MUI)
  • Engine Library: Polars (a Python library; the Tech Stack table lists it under Backend)
  • Core Components:
    • FlowList: Dashboard for managing flows
    • Designer: Drag-and-drop interface for building flows
    • History: View and restore previous versions
    • Custom Node UI and Config Panels

🔧 Backend

  • Framework: Python + Flask
  • Database: SQLite (via SQLAlchemy ORM)
  • API: RESTful endpoints for flow and node operations
  • Core Services:
    • FlowService: Handles flow CRUD and metadata
    • NodeService: Manages node lifecycle and configurations
    • ETLService: Executes DAGs and manages schema propagation
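
Schema propagation, which the list above assigns to ETLService, can be pictured as computing each node's output columns from its inputs before any data is read. The sketch below is an assumption about how that could work, not the project's implementation. The `output_schema` function, the node type strings, and the config keys `columns` and `on` are all hypothetical.

```python
from typing import Dict, List

Schema = Dict[str, str]  # column name -> type name


def output_schema(node_type: str, inputs: List[Schema], config: dict) -> Schema:
    """Derive a node's output schema from its input schemas and configuration."""
    if node_type == "file_input":
        # A source node declares its own schema (hypothetical "columns" key).
        return dict(config["columns"])
    if node_type in ("filter", "data_viewer"):
        # Row-level nodes pass their input columns through unchanged.
        return dict(inputs[0])
    if node_type == "left_join":
        left, right = inputs
        key = config["on"]
        if key not in left or key not in right:
            raise ValueError(f"join key {key!r} missing from an input schema")
        merged = dict(left)
        for column, dtype in right.items():
            if column != key:
                # Simplification: the left side wins on a name clash.
                # A real engine would more likely rename the clashing column.
                merged.setdefault(column, dtype)
        return merged
    raise ValueError(f"unknown node type: {node_type}")


orders = output_schema(
    "file_input", [], {"columns": {"order_id": "int", "customer_id": "int", "amount": "float"}}
)
customers = output_schema(
    "file_input", [], {"columns": {"customer_id": "int", "name": "str"}}
)
filtered = output_schema("filter", [orders], {})
joined = output_schema("left_join", [filtered, customers], {"on": "customer_id"})
```

Because the schema is known without running the data, a configuration panel can offer the right column names and a bad join key can be rejected early.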

🧬 Data Models

Flow    ──>  stores flow structure and metadata
Node    ──>  represents each ETL operation
Config  ──>  holds node-specific configuration
Schema  ──>  manages and validates data schema
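
The four models above could be sketched as follows. The project stores them in SQLite through SQLAlchemy. This sketch uses standard-library dataclasses instead to show only the relationships, and every field name in it is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class Schema:
    """Column names mapped to type names."""
    columns: Dict[str, str] = field(default_factory=dict)

    def validate(self, required: List[str]) -> List[str]:
        """Return the required columns that are missing from this schema."""
        return [name for name in required if name not in self.columns]


@dataclass
class Config:
    """Node-specific settings, e.g. a file path or a filter condition."""
    values: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Node:
    """One ETL operation in a flow."""
    id: str
    type: str
    config: Config = field(default_factory=Config)
    schema: Schema = field(default_factory=Schema)


@dataclass
class Flow:
    """Flow structure (nodes and edges) plus metadata."""
    name: str
    nodes: List[Node] = field(default_factory=list)
    edges: List[Tuple[str, str]] = field(default_factory=list)


flow = Flow(
    name="orders_report",
    nodes=[
        Node("n1", "file_input", Config({"path": "orders.csv"}),
             Schema({"order_id": "int", "amount": "float"})),
        Node("n2", "filter", Config({"condition": "amount > 100"})),
    ],
    edges=[("n1", "n2")],
)
missing = flow.nodes[0].schema.validate(["order_id", "customer_id"])
```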

🔒 State & Validation

  • Node configuration status tracking
  • Flow validation before execution
  • Schema-aware transformations and previews
  • UI-managed configuration state
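
Flow validation before execution can be sketched as a function that collects every problem instead of stopping at the first one, so the UI can flag all unconfigured nodes at once. The required keys per node type, the dict layout, and `validate_flow` itself are hypothetical, chosen for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical required configuration keys per node type.
REQUIRED_KEYS: Dict[str, List[str]] = {
    "file_input": ["path"],
    "filter": ["condition"],
    "left_join": ["on"],
    "data_viewer": [],
}


def validate_flow(nodes: Dict[str, dict], edges: List[Tuple[str, str]]) -> List[str]:
    """Return human-readable problems; an empty list means the flow may run."""
    problems: List[str] = []

    # 1. Every node must be of a known type and fully configured.
    for node_id, node in nodes.items():
        required = REQUIRED_KEYS.get(node["type"])
        if required is None:
            problems.append(f"{node_id}: unknown node type {node['type']!r}")
            continue
        for key in required:
            if key not in node.get("config", {}):
                problems.append(f"{node_id}: missing config key {key!r}")

    # 2. Every edge must connect two existing nodes.
    for source, target in edges:
        for end in (source, target):
            if end not in nodes:
                problems.append(f"edge {source}->{target}: unknown node {end!r}")

    # 3. A left join needs exactly two incoming connections.
    inputs = {node_id: 0 for node_id in nodes}
    for _, target in edges:
        if target in inputs:
            inputs[target] += 1
    for node_id, node in nodes.items():
        if node["type"] == "left_join" and inputs[node_id] != 2:
            problems.append(
                f"{node_id}: left_join needs exactly 2 inputs, got {inputs[node_id]}"
            )
    return problems


nodes = {
    "n1": {"type": "file_input", "config": {"path": "orders.csv"}},
    "n2": {"type": "filter", "config": {}},
    "n3": {"type": "left_join", "config": {"on": "customer_id"}},
}
edges = [("n1", "n2"), ("n2", "n3")]
problems = validate_flow(nodes, edges)
```

Here `problems` reports the unconfigured filter and the join that has only one input, so execution would be blocked until both are fixed.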

🛠️ Tech Stack

Layer         Technology
Frontend      React, TypeScript, React Flow, MUI
Backend       Polars, Python, Flask, SQLAlchemy
Database      SQLite
Architecture  REST API + DAG Executor

📦 Getting Started

🔧 Prerequisites

  • Node.js (v16+)
  • Python (v3.8+)
  • pipenv or virtualenv

🖥️ Frontend Setup

cd frontend
npm install
npm run dev

🐍 Backend Setup

cd backend
pipenv install
pipenv run flask run