Superset: A Data Visualization and Analytics Platform

Superset

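Overview

Apache Superset is a modern data exploration and visualization platform. It is powerful yet easy to use, connects to a wide range of data sources (including many modern big-data analytics engines), offers a rich set of chart types, and supports custom dashboards.

Official site: https://superset.apache.org/

Documentation: https://superset.apache.org/docs/intro

GitHub: https://github.com/apache/superset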

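Installing the Python Environment

Superset is a web application written in Python and requires a Python 3 environment (3.7 or later for older releases; newer releases require a newer Python, so check the documentation for the version being installed).
Linux servers usually ship with Python 2.x. Many system services depend on Python 2.x, and Python 2 and Python 3 are incompatible, so Python 3 should be installed alongside the system Python rather than replacing it.

Note:

Removing or upgrading the system python2 by mistake can have unpredictable consequences. For a fix, see the article "Fixing an unusable yum command caused by deleting the bundled python2 or by a yum exception".

Conda is used here to manage Python virtual environments. For details, see the article "Anaconda/Conda installation, configuration, and Python virtual environment management". In short, an environment is created and then activated:
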
conda create -n superset
[root@master ~]# conda activate superset
(superset) [root@master ~]#

To pin the Python version, create the superset environment as follows:

conda create --name superset python=3.10.9

Activate the environment and check the Python version:

[root@node01 ~]# conda activate superset
(superset) [root@node01 ~]# python -V
Python 3.10.9

Deploying Superset

Installing Dependencies

Before installing Superset, install the following dependencies:

yum install -y gcc gcc-c++ libffi-devel python-devel python-pip python-wheel python-setuptools openssl-devel cyrus-sasl-devel openldap-devel

Installing Superset

pip and setuptools may need to be upgraded for the installation to work:

pip install --upgrade pip -i https://pypi.douban.com/simple/

pip install --upgrade setuptools pip -i https://pypi.douban.com/simple/

Install Superset:

pip install apache-superset -i https://pypi.douban.com/simple/

Install Superset from a different mirror (note that --trusted-host takes a host name, not a URL):

pip install apache-superset --trusted-host repo.huaweicloud.com -i https://repo.huaweicloud.com/repository/pypi/simple

Install a specific version:

pip install apache-superset==2.1.0 -i https://pypi.douban.com/simple/

The installation may fail with an error such as:

  ERROR: HTTP error 404 while getting https://repo.huaweicloud.com/repository/pypi/packages/18/b9/cb8d519ea0094b9b8fe7480225c14937517729f8ec927643dc7379904f64/celery-5.3.1-py3-none-any.whl.metadata
ERROR: 404 Client Error: Not Found for url: https://repo.huaweicloud.com/repository/pypi/packages/18/b9/cb8d519ea0094b9b8fe7480225c14937517729f8ec927643dc7379904f64/celery-5.3.1-py3-none-any.whl.metadata

In that case, install from the Tsinghua University mirror instead:

pip install apache-superset==2.1.0 -i https://pypi.tuna.tsinghua.edu.cn/simple

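Configuring the Superset Metadata Database

Superset can store its metadata in MySQL or PostgreSQL; MySQL is used here.

Create the superset metadata database (run in the MySQL client):
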
CREATE DATABASE superset DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

Create the superset user:

create user superset@'%' identified WITH mysql_native_password BY 'superset';

grant all privileges on *.* to superset@'%' with grant option;

flush privileges;

Edit the Superset configuration file:

vim  /usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/superset/config.py

Turn on line numbers in vim:

: set nu

Go to around line 197:

 197 # The SQLAlchemy connection string.
 198 SQLALCHEMY_DATABASE_URI = "sqlite:///" + os.path.join(DATA_DIR, "superset.db")
 199 # SQLALCHEMY_DATABASE_URI = 'mysql://myapp@localhost/myapp'
 200 # SQLALCHEMY_DATABASE_URI = 'postgresql://root:password@localhost/myapp'

Set the connection string to point at the MySQL metadata database:

SQLALCHEMY_DATABASE_URI = 'mysql://superset:superset@node01:3306/superset?charset=utf8'
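
Editing config.py inside site-packages works, but the change is lost whenever the package is reinstalled or upgraded. Superset can also read overrides from a module named superset_config found on PYTHONPATH, or from the file named by the SUPERSET_CONFIG_PATH environment variable. The following is a minimal sketch of such a file, assuming the same MySQL host and credentials as above. The SUPERSET_DB_PASSWORD variable name is a hypothetical choice for this example and is not defined by Superset.

```python
# superset_config.py: a sketch of a local override file, not the author's setup.
import os

# Hypothetical environment variable (an assumption of this example); falls
# back to the password used in this walkthrough when it is not set.
_db_password = os.environ.get("SUPERSET_DB_PASSWORD", "superset")

# Same connection string as above, defined without touching config.py.
SQLALCHEMY_DATABASE_URI = (
    f"mysql://superset:{_db_password}@node01:3306/superset?charset=utf8"
)
```

If the file were saved as, say, /root/superset_config.py (a hypothetical path), exporting SUPERSET_CONFIG_PATH=/root/superset_config.py before running the superset commands should make Superset pick it up.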

Install the Python MySQL driver:

conda install mysqlclient

Initialize the Superset metadata:

export FLASK_APP=superset

superset db upgrade

The following exception may occur:

(superset) [root@master ~]# superset db upgrade
Traceback (most recent call last):
  File "/usr/local/program/miniconda3/envs/superset/bin/superset", line 5, in <module>
    from superset.cli.main import superset
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/superset/__init__.py", line 21, in <module>
    from superset.app import create_app
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/superset/app.py", line 23, in <module>
    from superset.initialization import SupersetAppInitializer
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/superset/initialization/__init__.py", line 33, in <module>
    from superset.extensions import (
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/superset/extensions/__init__.py", line 32, in <module>
    from superset.utils.async_query_manager import AsyncQueryManager
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/superset/utils/async_query_manager.py", line 26, in <module>
    from superset.utils.core import get_user_id
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/superset/utils/core.py", line 106, in <module>
    from superset.sql_parse import sanitize_clause
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/superset/sql_parse.py", line 67, in <module>
    re.compile(r"'(''|\\\\|\\|[^'])*'", sqlparse.keywords.FLAGS).match,
AttributeError: module 'sqlparse.keywords' has no attribute 'FLAGS'

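Cause:

The installed sqlparse version is incompatible with this Superset release. The FLAGS attribute of sqlparse.keywords was removed in newer sqlparse versions, while this Superset release still references it.

Possible solutions:

Upgrade Superset: a newer release may already have fixed the problem.

Downgrade sqlparse: install an older, compatible version of the library with pip.

Patch the Superset source code.

Here the sqlparse version is downgraded:
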
(superset) [root@master ~]# conda list | grep sqlparse
sqlparse                  0.4.4                    pypi_0    pypi


(superset) [root@master ~]# pip install sqlparse==0.4.1


(superset) [root@master ~]# conda list | grep sqlparse
sqlparse                  0.4.1                    pypi_0    pypi
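
Whether the installed sqlparse still provides the attribute can also be checked directly from Python before re-running the initialization. This is a small sketch using only the standard library; the function name sqlparse_status is made up for this example.

```python
from importlib import import_module
from importlib.metadata import PackageNotFoundError, version


def sqlparse_status():
    """Return (installed_version, has_flags); version is None if sqlparse is absent."""
    try:
        installed = version("sqlparse")
    except PackageNotFoundError:
        return None, False
    # Superset 2.1.0 reads sqlparse.keywords.FLAGS at import time.
    keywords = import_module("sqlparse.keywords")
    return installed, hasattr(keywords, "FLAGS")


if __name__ == "__main__":
    installed, has_flags = sqlparse_status()
    print(f"sqlparse version: {installed}, keywords.FLAGS present: {has_flags}")
```

Run inside the superset environment, the second value should be True once a compatible version is installed.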

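Run the initialization again. The exception is gone, but Superset now prints a warning and refuses to start.

The warning concerns Superset's default SECRET_KEY. SECRET_KEY is used for cryptographic signing and encryption (for example, signing session cookies and encrypting sensitive values), which protects the application. Out of the box Superset falls back to a default SECRET_KEY, which is insecure because it is published in the public source repository and could be abused.
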
(superset) [root@master ~]# superset db upgrade
--------------------------------------------------------------------------------
                                    WARNING
--------------------------------------------------------------------------------
A Default SECRET_KEY was detected, please use superset_config.py to override it.
Use a strong complex alphanumeric string and use a tool to help you generate
a sufficiently random sequence, ex: openssl rand -base64 42
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Refusing to start due to insecure SECRET_KEY

To resolve this, override the default with a strong, random SECRET_KEY.

Change to the Superset installation directory:

cd /usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/superset

Use a tool to generate a sufficiently random sequence:

(superset) [root@master lib]# openssl rand -base64 42
m9y2X0JSOhZBPafQE8JVJtqtzESXXIeFg8opUOLom04k7EucpYCEb4Ts
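
Where openssl is not available, a key of the same shape can be produced with Python's standard library. This is a sketch of an equivalent of openssl rand -base64 42 (42 random bytes, Base64-encoded); the function name generate_secret_key is made up for this example.

```python
import base64
import secrets


def generate_secret_key(num_bytes: int = 42) -> str:
    """Return num_bytes of OS-level randomness, Base64-encoded as text."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


if __name__ == "__main__":
    # Prints a fresh 56-character key on every run.
    print(generate_secret_key())
```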

Edit the Superset configuration file:

vim  /usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/superset/config.py

Set SECRET_KEY to the generated value:

 191 # Your App secret key. Make sure you override it on superset_config.py
 192 # or use `SUPERSET_SECRET_KEY` environment variable.
 193 # Use a strong complex alphanumeric string and use a tool to help you generate
 194 # a sufficiently random sequence, ex: openssl rand -base64 42"
 195 #SECRET_KEY = os.environ.get("SUPERSET_SECRET_KEY") or CHANGE_ME_SECRET_KEY
 196 SECRET_KEY ='m9y2X0JSOhZBPafQE8JVJtqtzESXXIeFg8opUOLom04k7EucpYCEb4Ts'

Running the initialization again raises the following exception:

Traceback (most recent call last):
  File "/usr/local/program/miniconda3/envs/superset/bin/superset", line 8, in <module>
    sys.exit(superset())
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/flask/cli.py", line 567, in main
    return super().main(*args, **kwargs)
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/click/core.py", line 1685, in invoke
    super().invoke(ctx)
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/flask/cli.py", line 406, in decorator
    with __ctx.ensure_object(ScriptInfo).load_app().app_context():
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/flask/cli.py", line 369, in load_app
    app = locate_app(import_name, name)
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/flask/cli.py", line 231, in locate_app
    return find_best_app(module)
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/flask/cli.py", line 57, in find_best_app
    app = app_factory()
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/superset/app.py", line 44, in create_app
    raise ex
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/superset/app.py", line 37, in create_app
    app_initializer.init_app()
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/superset/initialization/__init__.py", line 493, in init_app
    self.init_app_in_ctx()
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/superset/initialization/__init__.py", line 425, in init_app_in_ctx
    self.configure_data_sources()
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/superset/initialization/__init__.py", line 519, in configure_data_sources
    __import__(module_name, fromlist=class_names)
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/superset/connectors/sqla/__init__.py", line 17, in <module>
    from . import models, views
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/superset/connectors/sqla/views.py", line 32, in <module>
    from superset.connectors.base.views import DatasourceModelView
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/superset/connectors/base/views.py", line 24, in <module>
    from superset.views.base import SupersetModelView
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/superset/views/__init__.py", line 17, in <module>
    from . import (
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/superset/views/access_requests.py", line 24, in <module>
    from superset.views.base import DeleteMixin, SupersetModelView
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/superset/views/base.py", line 67, in <module>
    from superset.db_engine_specs.gsheets import GSheetsEngineSpec
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/superset/db_engine_specs/gsheets.py", line 33, in <module>
    from superset.databases.schemas import encrypted_field_properties, EncryptedString
  File "/usr/local/program/miniconda3/envs/superset/lib/python3.10/site-packages/superset/databases/schemas.py", line 28, in <module>
    from marshmallow_enum import EnumField
ModuleNotFoundError: No module named 'marshmallow_enum'

Solution: install marshmallow_enum:

pip install marshmallow_enum

Run the initialization again:

(superset) [root@master ~]# superset db upgrade
logging was configured successfully
2023-08-23 10:43:23,481:INFO:superset.utils.logging_configurator:logging was configured successfully
2023-08-23 10:43:23,494:INFO:root:Configured event logger of type <class 'superset.utils.log.DBEventLogger'>

INFO  [alembic.runtime.migration] Running upgrade a39867932713 -> 409c7b420ab0, add created_by_fk as owner
INFO  [alembic.runtime.migration] Running upgrade 409c7b420ab0 -> ffa79af61a56, rename report_schedule.extra to extra_json
INFO  [alembic.runtime.migration] Running upgrade ffa79af61a56 -> 6d3c6f9d665d, fix_table_chart_conditional_formatting_colors
INFO  [alembic.runtime.migration] Running upgrade 6d3c6f9d665d -> 291f024254b5, drop_column_allow_multi_schema_metadata_fetch
INFO  [alembic.runtime.migration] Running upgrade 291f024254b5 -> deb4c9d4a4ef, parameters in saved queries
INFO  [alembic.runtime.migration] Running upgrade deb4c9d4a4ef -> 4ce1d9b25135, remove_filter_bar_orientation
INFO  [alembic.runtime.migration] Running upgrade 4ce1d9b25135 -> f3c2d8ec8595, create_ssh_tunnel_credentials_tbl

The initialization succeeds, and the database now contains the generated tables:

mysql> use superset
Database changed
mysql> show tables;
+----------------------------+
| Tables_in_superset         |
+----------------------------+
| ab_permission              |
| ab_permission_view         |
| ab_permission_view_role    |
| ab_register_user           |
| ab_role                    |
| ab_user                    |
| ab_user_role               |
| ab_view_menu               |
| access_request             |
| alembic_version            |
| alert_logs                 |
| alert_owner                |
| alerts                     |
| annotation                 |

Initializing Superset

Create an administrator user:

superset fab create-admin
  for prop in class_mapper(obj).iterate_properties:
Username [admin]: // press Enter to accept the default user admin, the account used to log in to the management page
User first name [admin]: // press Enter
User last name [user]: // press Enter
Email [admin@fab.org]: // press Enter
Password: // set the password for the admin account
Repeat for confirmation: // confirm the password
Recognized Database Authentications. 
Admin User admin created.

Initialize Superset (this sets up the default roles and permissions):

superset init

Starting Superset

Install gunicorn, a Python web server that is loosely comparable to Tomcat in the Java world:

pip install gunicorn -i https://pypi.douban.com/simple/

Start Superset:

gunicorn --workers 5 --timeout 120 --bind node01:8787  "superset.app:create_app()" --daemon 
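
Option meanings:

--workers: number of worker processes

--timeout: worker timeout in seconds; a worker that exceeds it is restarted automatically

--bind: local address and port to bind to, which is also the address used to access Superset

--daemon: run in the background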
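
gunicorn can also read these settings from a Python configuration file, which keeps the start command short. The sketch below mirrors the flags above; the file name gunicorn.conf.py and its location are assumptions for this example.

```python
# gunicorn.conf.py: a sketch mirroring the command-line flags used above.
bind = "node01:8787"  # same as --bind
workers = 5           # same as --workers
timeout = 120         # same as --timeout, in seconds
daemon = True         # same as --daemon
```

With this file in the current directory, the start command would become gunicorn -c gunicorn.conf.py "superset.app:create_app()".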

Logging in to Superset

Visit http://IP:8787 and log in with the administrator account and password created earlier.
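
Because gunicorn runs as a daemon, a quick reachability check from a script can be handy before opening the browser. The sketch below assumes Superset's /health endpoint is reachable at the bound address; the function name superset_is_up is made up for this example.

```python
from urllib.request import urlopen


def superset_is_up(base_url: str, timeout: float = 3.0) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200."""
    try:
        with urlopen(f"{base_url.rstrip('/')}/health", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


if __name__ == "__main__":
    # node01:8787 is the bind address used in this walkthrough.
    print(superset_is_up("http://node01:8787"))
```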

Stop the gunicorn processes:

ps -ef | awk '/superset/ && !/awk/{print $2}' | xargs kill -9
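
kill -9 ends the processes immediately without giving gunicorn a chance to shut down its workers cleanly. As a gentler alternative, the same lookup can be sketched in Python and combined with SIGTERM. The function names and the default match pattern here are assumptions for this example, not part of Superset or gunicorn.

```python
import os
import signal
import subprocess


def matching_pids(ps_output: str, pattern: str) -> list:
    """Extract PIDs (second column of `ps -ef`) from lines containing pattern."""
    pids = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) > 1 and fields[1].isdigit() and pattern in line:
            pids.append(int(fields[1]))
    return pids


def stop_superset(pattern: str = "superset.app:create_app") -> list:
    """Send SIGTERM to every matching process except this script itself."""
    output = subprocess.run(
        ["ps", "-ef"], capture_output=True, text=True, check=True
    ).stdout
    pids = [pid for pid in matching_pids(output, pattern) if pid != os.getpid()]
    for pid in pids:
        os.kill(pid, signal.SIGTERM)
    return pids
```

Calling stop_superset() returns the list of PIDs that were signalled, which is empty when nothing matched.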

Leave the superset environment:

conda deactivate

Superset Start/Stop Script

Create the file superset.sh (for example with vim superset.sh):

#!/bin/bash

# Returns 0 when no gunicorn process is found (not running), 1 otherwise.
superset_status(){
    result=`ps -ef | awk '/gunicorn/ && !/awk/{print $2}' | wc -l`
    if [[ $result -eq 0 ]]; then
        return 0
    else
        return 1
    fi
}

# Starts gunicorn in the superset conda environment unless it is already running.
superset_start(){
        # Load the shell setup so that "conda activate" is available in the script
        source ~/.bashrc
        superset_status >/dev/null 2>&1
        if [[ $? -eq 0 ]]; then
            conda activate superset ; gunicorn --workers 5 --timeout 120 --bind node01:8787 --daemon 'superset.app:create_app()'
        else
            echo "superset is already running"
        fi

}

# Kills all gunicorn processes if any are running.
superset_stop(){
    superset_status >/dev/null 2>&1
    if [[ $? -eq 0 ]]; then
        echo "superset is not running"
    else
        ps -ef | awk '/gunicorn/ && !/awk/{print $2}' | xargs kill -9
    fi
}


case $1 in
    start )
        echo "Starting Superset"
        superset_start
    ;;
    stop )
        echo "Stopping Superset"
        superset_stop
    ;;
    restart )
        echo "Restarting Superset"
        superset_stop
        superset_start
    ;;
    status )
        superset_status >/dev/null 2>&1
        if [[ $? -eq 0 ]]; then
            echo "superset is not running"
        else
            echo "superset is running"
        fi
    ;;
esac

Make the script executable:

chmod +x superset.sh

Start Superset (run from the directory that contains the script):

./superset.sh start

Stop Superset:

./superset.sh stop

Using Superset

Connecting Superset to a MySQL Data Source

Install the dependency:

conda install mysqlclient

Note: each type of data source requires its own driver.

Official documentation:

https://superset.apache.org/docs/databases/installing-database-drivers/

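Database Configuration

Click Database Connections > click DATABASE > select the database to connect to

Option 1: enter the credentials field by field. Option 2: connect with a URL.

Note:

SQLAlchemy URI format: mysql://username:password@hostname:port/database_name

Here, enter mysql://superset:superset@master:3306/demo?charset=utf8 and click Test Connection. The message "Connection looks good" indicates that the connection succeeded.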
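
A password that contains characters such as @, : or / has to be URL-encoded before it is placed in the URI, otherwise the URI is parsed incorrectly. The sketch below assembles a URI in the format described above with the standard library; the function name mysql_uri and the second sample password are made up for this example.

```python
from urllib.parse import quote_plus


def mysql_uri(user, password, host, port, database, charset="utf8"):
    """Assemble mysql://user:password@host:port/database?charset=..."""
    return (
        f"mysql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}?charset={charset}"
    )


# The values used in this walkthrough:
print(mysql_uri("superset", "superset", "master", 3306, "demo"))
# prints mysql://superset:superset@master:3306/demo?charset=utf8

# A hypothetical password with reserved characters:
print(mysql_uri("superset", "p@ss:word/1", "master", 3306, "demo"))
# prints mysql://superset:p%40ss%3Aword%2F1@master:3306/demo?charset=utf8
```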

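Table Configuration

Click Datasets

Click DATASET > configure the table > click Create Dataset and Create Chart > return to Datasets

Creating a Blank Dashboard

Click Dashboards

Name the dashboard and save it

Creating a Chart

Click Charts

Select the data source and chart type, then create the chart

Configure the chart as instructed and create it

If the configuration is correct, the chart is displayed; save it to the dashboard

Editing the Dashboard

Open the dashboard and click the edit button

Adjust the chart sizes and the dashboard layout

Set the dashboard auto-refresh interval, then save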
