开源免费ETL工具==PYTHON实现

方便自己快速处理一些基于文件的聚合计算,比如CSV。

https://github.com/hebian1994/etl_react_flow

🛠️ ETL Flow Builder

A powerful, visual ETL (Extract-Transform-Load) tool built with React , React Flow , Material-UI , and a Python Flask backend. Design and manage complex data pipelines with a user-friendly interface and DAG-based execution.


📸 Preview

📸 Structure


🚀 Features

🔄 Flow Management

  • Create, edit, and delete ETL flows
  • Flow version history and versioning
  • Real-time configuration and validation

🧩 Node System

  • Modular nodes for various ETL operations:
    • File Input
    • Data Viewer
    • Filter
    • Left Join
  • Custom node configuration panels
  • Node connection validation and schema enforcement

⚙️ Data Processing

  • DAG-based flow execution
  • Schema propagation and management
  • Preview intermediate data at any node
  • Configuration validation before execution

🧱 System Architecture

🖥️ Frontend

  • Framework: React + TypeScript
  • Visualization: React Flow
  • UI Library: Material-UI (MUI)
  • Engine Library: Polars
  • Core Components:
    • FlowList: Dashboard for managing flows
    • Designer: Drag-and-drop interface for building flows
    • History: View and restore previous versions
    • Custom Node UI and Config Panels

🔧 Backend

  • Framework: Python + Flask
  • Database: SQLite (via SQLAlchemy ORM)
  • API: RESTful endpoints for flow and node operations
  • Core Services:
    • FlowService: Handles flow CRUD and metadata
    • NodeService: Manages node lifecycle and configurations
    • ETLService: Executes DAGs and manages schema propagation

🧬 Data Models

复制代码
Flow    ──>  stores flow structure and metadata
Node    ──>  represents each ETL operation
Config  ──>  holds node-specific configuration
Schema  ──>  manages and validates data schema

🔒 State & Validation

  • Node configuration status tracking
  • Flow validation before execution
  • Schema-aware transformations and previews
  • UI-managed configuration state

🛠️ Tech Stack

Layer Technology
Frontend React, TypeScript, React Flow, MUI
Backend Polars, Python, Flask, SQLAlchemy
Database SQLite
Architecture REST API + DAG Executor

📦 Getting Started

🔧 Prerequisites

  • Node.js (v16+)
  • Python (v3.8+)
  • pipenv or virtualenv

🖥️ Frontend Setup

复制代码
cd frontend
npm install
npm run dev

🐍 Backend Setup

复制代码
cd backend
pipenv install
pipenv run flask run
相关推荐
AI生存日记2 小时前
百度文心大模型 4.5 系列全面开源 英特尔同步支持端侧部署
人工智能·百度·开源·open ai大模型
步、步、为营3 小时前
.net开源库SignalR
开源·.net
烛阴3 小时前
简单入门Python装饰器
前端·python
好开心啊没烦恼4 小时前
Python 数据分析:numpy,说人话,说说数组维度。听故事学知识点怎么这么容易?
开发语言·人工智能·python·数据挖掘·数据分析·numpy
面朝大海,春不暖,花不开4 小时前
使用 Python 实现 ETL 流程:从文本文件提取到数据处理的全面指南
python·etl·原型模式
2301_805054565 小时前
Python训练营打卡Day59(2025.7.3)
开发语言·python
万千思绪6 小时前
【PyCharm 2025.1.2配置debug】
ide·python·pycharm
国服第二切图仔6 小时前
文心开源大模型ERNIE-4.5-0.3B-Paddle私有化部署保姆级教程及技术架构探索
百度·架构·开源·文心大模型·paddle·gitcode
钱彬 (Qian Bin)6 小时前
一文掌握Qt Quick数字图像处理项目开发(基于Qt 6.9 C++和QML,代码开源)
c++·开源·qml·qt quick·qt6.9·数字图像处理项目·美观界面
步、步、为营7 小时前
.net开源物联网项目IoTSharp
物联网·开源·.net