Automating batch creation of DataWorks data integration tasks with Selenium

1. Background

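I needed to create more than 800 DataWorks data ingestion tasks. Doing them by hand one at a time would be unkind to my hands, and even more unkind to my brain.

Scenario: sync data from ODPS to a MySQL database.

The database name is the same for every task; only the table name differs.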

2. Approach

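Use Selenium to simulate the path a human operator would follow:

  1. Open the DataWorks URL and log in
  2. Click Business Flow and open the folder that holds the tasks
  3. Right-click the folder
  4. Iterate over all the tasks that need to be created
  5. Create the tasks one by one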
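The "iterate and create one by one" part of this flow can be sketched without any browser at all. The snippet below is a minimal sketch, not the script used in this post: `create_all`, `create_one`, `max_attempts`, `delay_seconds` and `retry_on` are hypothetical names. In the real script, `create_one` would drive Selenium and `retry_on` would presumably be narrowed to Selenium's `NoSuchElementException` and `TimeoutException`.

```python
import time
from typing import Callable, Iterable, List, Tuple, Type


def create_all(
    tasks: Iterable[str],
    create_one: Callable[[str], None],
    max_attempts: int = 3,
    delay_seconds: float = 0.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
) -> List[Tuple[str, bool]]:
    """Create every task in order, retrying each failure a bounded number of times.

    Returns a list of (task, succeeded) pairs so failed tasks can be re-run later.
    """
    outcomes: List[Tuple[str, bool]] = []
    for task in tasks:
        succeeded = False
        for attempt in range(1, max_attempts + 1):
            try:
                create_one(task)  # hypothetical callback that creates one task
                succeeded = True
                break
            except retry_on:
                # Pause before the next attempt, but not after the last one
                if attempt < max_attempts:
                    time.sleep(delay_seconds)
        outcomes.append((task, succeeded))
    return outcomes
```

Bounding the number of attempts keeps one permanently broken table from blocking the remaining ones, and the returned list doubles as a report of what still needs attention.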

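Difficulties encountered:

  1. Pages load in different ways. Sometimes neither implicit nor explicit waits work, and the only option left is sleep.
  2. (Unsolved) Selenium cannot operate some controls (the time picker, for example); they have to be driven through JavaScript.
  3. After switching tabs, elements can no longer be found until the driver is switched to the new window handle.
  4. (Unsolved) After many operations the browser slows down and requests start timing out.
    1. Option 1: clear the cache
    2. Option 2: restart the browser
    3. Option 3: close the current tab and reopen it
  5. (Unsolved) When DataWorks runs on a network without internet access there is no DNS resolution, so every page loads slowly and timeout exceptions are very common. This makes unattended 24-hour runs impossible.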
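Two of these points, switching window handles (item 3) and restarting the browser (item 4, option 2), can be sketched as small helpers. This is only a sketch under assumptions: `switch_to_new_window`, `run_with_restarts`, `make_driver`, `create_one` and `restart_every` are hypothetical names, and `make_driver` is assumed to return a logged-in driver that has already navigated to the target folder. Only `window_handles`, `switch_to.window()` and `quit()` are used from the driver, all of which exist on Selenium's WebDriver.

```python
from typing import Any, Callable, Iterable, List, Set, Tuple


def switch_to_new_window(driver: Any, known_handles: Set[str]) -> str:
    """Switch to the first window handle that was not open before and return it.

    Comparing against the handles seen earlier avoids relying on the order of
    `window_handles`, which is not guaranteed to match the order of opening.
    """
    for handle in driver.window_handles:
        if handle not in known_handles:
            driver.switch_to.window(handle)
            return handle
    raise RuntimeError("no new window was opened")


def run_with_restarts(
    tasks: Iterable[Any],
    make_driver: Callable[[], Any],
    create_one: Callable[[Any, Any], bool],
    restart_every: int = 20,
) -> List[Tuple[Any, bool]]:
    """Create tasks, starting a fresh browser session every `restart_every` tasks."""
    results: List[Tuple[Any, bool]] = []
    driver = None
    try:
        for index, task in enumerate(tasks):
            if index % restart_every == 0:
                if driver is not None:
                    driver.quit()  # discard the slowed-down session
                driver = make_driver()  # hypothetical factory: log in and navigate again
            results.append((task, create_one(driver, task)))
    finally:
        if driver is not None:
            driver.quit()
    return results
```

Whether a periodic restart actually cures the slowdown was not verified here; the sketch only shows how the restart could be isolated from the per-task logic.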

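Takeaway: any large volume of repetitive work can be handed to a program. Nearly any application on any system, whether it runs in a browser, on a PC, or on Android, iOS or Linux, can in theory be automated. The more complex the software, the higher the cost of automating it, so for complex software weigh the development time of the automation against the time it would take to do the work by hand.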

3. Implementation

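Versions

Windows 11

Alibaba Cloud private cloud (Apsara Stack)

Python 3.8.x

Selenium 4.17.2

Prepare the Excel file in advance

The first column holds the table name and the second holds the schedule cycle. Add more columns if you need further customization.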
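Before driving the browser, it can help to clean the rows read from this sheet. The sketch below assumes the rows have already been loaded as (table name, schedule cycle) pairs; `clean_rows` and `VALID_CYCLES` are hypothetical names. Skipping empty cells and rows marked '无' mirrors what the script below does, while rejecting unknown cycles is an added assumption (the script silently ignores them).

```python
from typing import Iterable, List, Optional, Tuple

# The three cycle values the script below distinguishes: daily, weekly, monthly
VALID_CYCLES = {"日", "周", "月"}


def clean_rows(
    rows: Iterable[Tuple[Optional[str], Optional[str]]],
) -> List[Tuple[str, str]]:
    """Drop unusable rows and strip whitespace from the rest."""
    cleaned: List[Tuple[str, str]] = []
    for name, cycle in rows:
        # Skip rows where either cell is empty
        if name is None or cycle is None:
            continue
        name, cycle = str(name).strip(), str(cycle).strip()
        # '无' ("none") marks a row that should not produce a task
        if not name or name == "无":
            continue
        # Assumption: fail loudly on a cycle the script would not handle
        if cycle not in VALID_CYCLES:
            raise ValueError("unknown schedule cycle {0!r} for table {1}".format(cycle, name))
        cleaned.append((name, cycle))
    return cleaned
```

Validating up front means a typo in the sheet surfaces before the first browser window opens, rather than halfway through a long run.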

Python code:
import time

import xlwings as xw
from selenium import webdriver
from selenium.common.exceptions import (
    ElementNotInteractableException,
    NoSuchElementException,
    TimeoutException,
)
from selenium.webdriver import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait


def jump_login_links(driver_instense):
    # Click through the browser's warning page (the details-button and
    # proceed-link elements) until it no longer appears, i.e. until the
    # login page has been reached.
    flag = True
    while flag:
        try:
            button1 = driver_instense.find_element(By.XPATH, '//*[@id="details-button"]')
            link1 = driver_instense.find_element(By.XPATH, '//*[@id="proceed-link"]')
            ActionChains(driver_instense).click(button1).click(link1).perform()
        except NoSuchElementException:
            print("Reached the login page")
            flag = False


def jump_dataworks_links(driver_instense):
    # Switch to the newly opened tab, then click through the browser's
    # warning page until it no longer appears, i.e. until the DataWorks
    # page has been reached.
    change_handles(driver_instense)
    flag = True
    time.sleep(3)
    while flag:
        try:
            button1 = driver_instense.find_element(By.XPATH, '//*[@id="details-button"]')
            time.sleep(1)
            link1 = driver_instense.find_element(By.XPATH, '//*[@id="proceed-link"]')
            ActionChains(driver_instense).click(button1).click(link1).perform()
        except NoSuchElementException:
            print("Reached the DataWorks page")
            flag = False


def login(driver_instense):
    # Skip the warning page
    jump_login_links(driver_instense)
    # Log in
    do_login(driver_instense)


def do_login(driver_instense):
    # Enter the username and password, then click the login button
    user_input = driver_instense.find_element(By.XPATH, '//*[@id="user_tag"]')
    pwd_input = driver_instense.find_element(By.XPATH, '//*[@id="user_pd"]')
    login_butten = driver_instense.find_element(By.XPATH, '//*[@id="fm1"]/button')
    user_input.send_keys('user')
    pwd_input.send_keys('password')
    login_butten.click()


def open_dataworks(driver_instense):
    product_butten = WebDriverWait(driver_instense, 15, 1).until(EC.presence_of_element_located((By.XPATH,
                                                                                                 '/html/body/div[2]/div[1]/div[2]/div[1]/div/span[1]')))
    ActionChains(driver_instense).move_to_element(product_butten).perform()

    time.sleep(3)

    dataworks_butten = WebDriverWait(driver_instense, 15, 1).until(EC.presence_of_element_located((By.XPATH,
                                                                                                   '/html/body/div[2]/div[2]/div/div/div/div/div/div[2]/div/div/div[1]/div[2]/a[3]')))
    ActionChains(driver_instense).move_to_element(dataworks_butten).click(dataworks_butten).perform()

    dataworks_pre_page = driver_instense.current_window_handle

    change_handles(driver_instense)

    dataworks = driver_instense.find_element(By.XPATH,
                                             '//*[@id="teamix-container"]/section/div/div[4]/div/button')
    ActionChains(driver_instense).click(dataworks).perform()

    jump_dataworks_links(driver_instense)
    return dataworks_pre_page


def open_dataworks_no_jump(driver_instense, handle):

    change_handles(driver_instense)
    dataworks = WebDriverWait(driver_instense, 15, 1).until(EC.presence_of_element_located((By.XPATH,
                                                                                            '//*[@id="teamix-container"]/section/div/div[4]/div/button')))
    ActionChains(driver_instense).click(dataworks).perform()


def open_produce_path(driver_instense):
    change_handles(driver_instense)

    # Click Data Development (数据开发)
    develop = WebDriverWait(driver_instense, 15, 1).until(EC.presence_of_element_located((By.XPATH,
                                                                                          '//*[@id="app"]/div/div/div/div/div[1]/div[1]/div[1]/div/div[2]/div[1]/div/div/div[1]/div/ul[1]/li[1]/div/span/div/div')))
    ActionChains(driver_instense).click(develop).perform()
    # Expand Business Flow (业务流程)
    biz_process = WebDriverWait(driver_instense, 15, 1).until(EC.presence_of_element_located((By.XPATH,
                                                                                              '//*[@id="10012-workflow-hiddenRegion"]/div/div/div[1]/div/div/div[2]/div/div/div[2]/div/span[1]')))
    ActionChains(driver_instense).click(biz_process).perform()
    # Production domain (生产域)
    produce = driver_instense.find_element(By.XPATH,
                                           '//*[@id="10012-workflow-hiddenRegion"]/div/div/div[1]/div/div/div[9]/div/div/div[2]/div/span[1]')
    ActionChains(driver_instense).click(produce).perform()
    # Data Integration (数据集成)
    extract = driver_instense.find_element(By.XPATH,
                                           '//*[@id="10012-workflow-hiddenRegion"]/div/div/div[1]/div/div/div[10]/div/div/div[2]/div/span[1]')
    ActionChains(driver_instense).click(extract).perform()
    # Data push (数据推送)
    date_send = driver_instense.find_element(By.XPATH,
                                             '//*[@id="10012-workflow-hiddenRegion"]/div/div/div[1]/div/div/div[12]/div/div/div[2]/div/span[1]')
    ActionChains(driver_instense).click(date_send).perform()
    # Production-domain master data push (生产域主数据推送)
    produce_main_date_send = driver_instense.find_element(By.XPATH,
                                                          '//*[@id="10012-workflow-hiddenRegion"]/div/div/div[1]/div/div/div[15]/div/div/div[2]/div/span[1]')
    ActionChains(driver_instense).click(produce_main_date_send).perform()


def create_di_file(driver_instense, table_info):
    # Parameters
    table_name = table_info[0].strip()
    file_name = 'di_out_' + table_name
    schedule_cycle = table_info[1].strip()

    # Right-click the target folder (the text must match the folder name in the UI)
    right_click = driver_instense.find_element(By.XPATH, "//span[text()='生产域主数据推送']")
    ActionChains(driver_instense).move_to_element(right_click).click(right_click).context_click(right_click).perform()

    # Click New (新建)
    new_button = WebDriverWait(driver_instense, 15, 1).until(EC.presence_of_element_located((By.XPATH,
                                                                                             "//span[text()='新建']")))
    ActionChains(driver_instense).move_to_element(new_button).click(new_button).perform()

    # Click Offline Sync (离线同步); absolute XPath for reference: /html/body/div[6]/ul/li/div/span/span[2]
    offline_sync = driver_instense.find_element(By.XPATH, "//span[text()='离线同步']")
    ActionChains(driver_instense).click(offline_sync).perform()

    # Fill in the file name
    task_name_input = driver_instense.find_element(By.XPATH, '//input[@id="name"]')
    task_name_input.clear()
    task_name_input.send_keys(file_name)
    # Click Submit (提交)
    submmit_button = driver_instense.find_element(By.XPATH, "//button[text()='提交']")
    ActionChains(driver_instense).click(submmit_button).perform()

    time.sleep(2)

    try:
        # Error message shown in the pop-up when the file already exists;
        # it stays in Chinese because it must match the UI text
        err_info = '创建文件失败!当前路径下已存在名为"{0}"的文件'.format(file_name)
        # XPath of the pop-up
        err_alert_xpath = "//*[contains(text(),'{0}')]".format(err_info)
        # The error message element
        alert_div = driver_instense.find_element(By.XPATH, err_alert_xpath)
        # If the error message appears, click Cancel (取消) and return False
        cancel_button = driver_instense.find_element(By.XPATH, "//button[text()='取消']")
        ActionChains(driver_instense).click(cancel_button).perform()
        print("File already exists: {0}; cancelled".format(file_name))
        return False
    except NoSuchElementException:
        # No error pop-up was found, so carry on as normal
        pass
    print("File does not exist yet, creating {0}".format(file_name))

    # Configure the data source
    set_source(driver_instense, table_name)

    print("Source configured")
    time.sleep(3)
    # Configure the data destination
    set_target(driver_instense, table_name)

    print("Target configured")

    # Scheduling configuration
    set_sechedule(driver_instense, schedule_cycle, table_name)

    print("Schedule configured")

    # Data integration resource group configuration
    set_resource_group(driver_instense)
    print("Data integration resource group configured")

    # 6 Submit
    submit_task(driver_instense)
    print("Submitted")

    # TODO: fix the publish button not being found on the second publish
    # 7 Publish
    # publish_task(driver_instense)

    # close_sub_tag(driver_instense)

    return True


def close_sub_tag(driver_instense):
    # Close the sub-tab
    close_button = driver_instense.find_element(By.XPATH,
                                                '/html/body/div[2]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[1]/div[2]/div/div/ul/li[1]/div/i')
    ActionChains(driver_instense).click(close_button).perform()


def set_resource_group(driver_instense):
    # Click the integration resource group
    # data_ins_conf = driver_instense.find_element(By.XPATH,
    #                                              '//*[@id="app"]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div[1]/div/div/div/div/div/div[2]/div/div/div[1]/div/div/div[2]/span/div/div[1]/div[3]/div/span/span')
    data_ins_conf = WebDriverWait(driver_instense, 10, 1).until(EC.presence_of_element_located((By.XPATH,
                                                                                                '//*[@id="app"]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div[1]/div/div/div/div/div/div[2]/div/div/div[1]/div/div/div[2]/span/div/div[1]/div[3]/div/span/span')))
    ActionChains(driver_instense).click(data_ins_conf).perform()
    # Click the public resource group
    data_ins_conf = WebDriverWait(driver_instense, 10, 1).until(EC.presence_of_element_located((By.XPATH,
                                                                                                '//*[@id="resourceGroupType"]/label[1]/span[1]/input')))
    ActionChains(driver_instense).click(data_ins_conf).perform()


def close_new_tag(driver_instense):
    # Close the newly opened tab
    driver_instense.close()
    # Switch the window handle
    all_handles = driver_instense.window_handles
    driver_instense.switch_to.window(all_handles[-1])


def change_handles(driver_instense):
    time.sleep(1)
    # Switch to the last window handle in the list
    all_handles = driver_instense.window_handles
    driver_instense.switch_to.window(all_handles[-1])


def publish_task(driver_instense):
    # 7.1 Click Publish
    # click_publish = driver_instense.find_element(By.XPATH,
    #                                               "/html/body/div[2]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div[1]/div/div/div/div/div/div[1]/div/div/div/span[1]/button")
    click_publish = WebDriverWait(driver_instense, 10, 1).until(EC.presence_of_element_located((By.XPATH,
                                                                                                "/html/body/div[2]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div[1]/div/div/div/div/div/div[1]/div/div/div/span[1]/button")))
    ActionChains(driver_instense).click(click_publish).perform()

    # 7.2 Publish the task

    publish_one(driver_instense)

    # Close the last tab
    close_current_page(driver_instense)


def publish_one(driver_instense):
    driver_instense.implicitly_wait(30)

    # Switch the window handle
    change_handles(driver_instense)

    click_publish_button = driver_instense.find_element(By.XPATH, '//span[text()="发布"]')
    ActionChains(driver_instense).click(click_publish_button).perform()
    time.sleep(2)
    publish_confirm_button = driver_instense.find_element(By.XPATH,
                                                          "//button[text()='发布']")
    ActionChains(driver_instense).click(publish_confirm_button).perform()


def publish_two(driver_instense):
    click_publish_button = driver_instense.find_element(By.XPATH,
                                                        '/html/body/div[3]/div/div/div/div/div[2]/div[3]/div/div/div[2]/table/tbody/tr/td[1]/div/label/span/input')
    ActionChains(driver_instense).click(click_publish_button).perform()
    time.sleep(2)
    publish_pre_confirm_button = driver_instense.find_element(By.XPATH,
                                                              "//button[text()='发布选中项']")
    ActionChains(driver_instense).click(publish_pre_confirm_button).perform()

    time.sleep(1)

    publish_confirm_button = driver_instense.find_element(By.XPATH,
                                                          "//button[text()='发布']")
    ActionChains(driver_instense).click(publish_confirm_button).perform()


def close_current_page(driver_instense):
    driver_instense.close()
    time.sleep(1)
    change_handles(driver_instense)


def submit_task(driver_instense):
    # 6.1 Click Submit
    submit_button = WebDriverWait(driver_instense, 10, 1).until(EC.presence_of_element_located((By.XPATH,
                                                                                                '//*[@id="app"]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div[1]/div/div/div/div/div/div[1]/div/span[4]/button')))
    ActionChains(driver_instense).click(submit_button).perform()
    time.sleep(3)
    # 6.2 Click Confirm (确认)
    confirm_button = WebDriverWait(driver_instense, 10, 1).until(EC.presence_of_element_located((By.XPATH,
                                                                                                 "//button[text()='确认']")))
    ActionChains(driver_instense).click(confirm_button).perform()
    # 6.3 Confirm the version
    time.sleep(3)
    confirm_version = WebDriverWait(driver_instense, 10, 1).until(EC.presence_of_element_located((By.XPATH,
                                                                                                  "//button[text()='确认']")))
    ActionChains(driver_instense).click(confirm_version).perform()


def set_sechedule(driver_instense, schedule_cycle, table_name):
    # 1 Click Scheduling Configuration
    scheduler_conf = WebDriverWait(driver_instense, 15, 1).until(EC.presence_of_element_located((By.XPATH,
                                                                                                 '//*[@id="app"]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div[1]/div/div/div/div/div/div[2]/div/div/div[1]/div/div/div[2]/span/div/div[1]/div[1]/div')))

    ActionChains(driver_instense).click(scheduler_conf).perform()

    # 2 Set the rerun property
    # 2.1 Click the rerun input box
    rerun_able = driver_instense.find_element(By.XPATH,
                                              '//*[@id="reRunAble"]')
    ActionChains(driver_instense).click(rerun_able).perform()
    time.sleep(1)
    # 2.2 Pick the rerun option
    rerun_able_item = WebDriverWait(driver_instense, 15, 1).until(EC.presence_of_element_located((By.XPATH,
                                                                                                  '/html/body/div[2]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div[1]/div/div/div/div/div/div[2]/div/div/div[1]/div/div/div[2]/span/div/div[2]/div[1]/div/div[2]/div/div/div[2]/div[2]/form/span/div/div[2]/div/div[2]/div/div/div/div/ul/li[1]/div')))
    ActionChains(driver_instense).click(rerun_able_item).perform()
    # 3 Set the schedule cycle
    # 3.1 Click the schedule cycle drop-down
    rerun_able_item = WebDriverWait(driver_instense, 15, 1).until(EC.presence_of_element_located((By.XPATH,
                                                                                                  '/html/body/div[2]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div/div/div/div/div/div/div[2]/div/div/div[1]/div/div/div[2]/span/div/div[2]/div[1]/div/div[2]/div/div/div[2]/div[2]/form/span/div/div[6]/div/div[2]/div/div/div/span/span[1]/span[1]')))
    ActionChains(driver_instense).click(rerun_able_item).perform()
    # 3.2 Click the schedule cycle option ('日' daily needs no click, '周' weekly, '月' monthly)
    if '日' == schedule_cycle:
        pass
    elif '周' == schedule_cycle:
        week = driver_instense.find_element(By.XPATH,
                                            '/html/body/div[2]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div/div/div/div/div/div/div[2]/div/div/div[1]/div/div/div[2]/span/div/div[2]/div[1]/div/div[2]/div/div/div[2]/div[2]/form/span/div/div[6]/div/div[2]/div/div/div/div/ul/li[4]/div')
        ActionChains(driver_instense).click(week).perform()
    elif '月' == schedule_cycle:
        month = driver_instense.find_element(By.XPATH,
                                             '/html/body/div[2]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div/div/div/div/div/div/div[2]/div/div/div[1]/div/div/div[2]/span/div/div[2]/div[1]/div/div[2]/div/div/div[2]/div[2]/form/span/div/div[6]/div/div[2]/div/div/div/div/ul/li[5]/div')
        ActionChains(driver_instense).click(month).perform()

    # TODO: change the scheduled time
    # 4 Set the scheduled run time
    # 4.1 Click the drop-down
    # schedule_time = driver_instense.find_element(By.XPATH,
    #                                              '//*[@id="oneHourMinute"]/div/span/input')
    # ActionChains(driver_instense).click(schedule_time).perform()
    #

    # 4.2 Choose 3 o'clock

    # js = 'document.evaluate(\'/html/body/div[2]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div/div/div/div/div/div/div[2]/div/div/div[1]/div/div/div[2]/span/div/div[2]/div[1]/div/div[2]/div/div/div[2]/div[2]/form/span/div/div[7]/div/div[2]/div/div/div/div/div/span/input\', document).iterateNext().removeAttribute(\'readonly\')'
    # driver_instense.execute_script(js)
    # time.sleep(3)
    #
    # three_input = driver_instense.find_element(By.XPATH,
    #                                            '/html/body/div[2]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div/div/div/div/div/div/div[2]/div/div/div[1]/div/div/div[2]/span/div/div[2]/div[1]/div/div[2]/div/div/div[2]/div[2]/form/span/div/div[7]/div/div[2]/div/div/div/div/div/span/input')
    # three_input.send_keys('03:25')
    # three_input.submit()
    # time.sleep(1)

    # 5 Upstream nodes this task depends on
    # 5.1 Click the input box
    # pre_node_input = driver_instense.find_element(By.XPATH,
    #                                               '/html/body/div[2]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div[1]/div/div/div/div/div/div[2]/div/div/div[1]/div/div/div[2]/span/div/div[2]/div[1]/div/div[2]/div/div/div[4]/div[2]/div[1]/div[1]/span/div/div/div/div[2]/div/div/div/span[1]/span[1]/span[1]/span/input')
    pre_node_input = WebDriverWait(driver_instense, 15, 1).until(EC.presence_of_element_located((By.XPATH,
                                                                                                 '/html/body/div[2]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div[1]/div/div/div/div/div/div[2]/div/div/div[1]/div/div/div[2]/span/div/div[2]/div[1]/div/div[2]/div/div/div[4]/div[2]/div[1]/div[1]/span/div/div/div/div[2]/div/div/div/span[1]/span[1]/span[1]/span/input')))
    ActionChains(driver_instense).click(pre_node_input).perform()
    # 5.2 Enter the name of the virtual node
    pre_node_input.send_keys('TEST.virtual_node')

    time.sleep(1)

    # 5.3 Click that node
    pre_node_item = WebDriverWait(driver_instense, 15, 1).until(EC.presence_of_element_located((By.XPATH,
                                                                                                '/html/body/div[2]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div/div/div/div/div/div/div[2]/div/div/div[1]/div/div/div[2]/span/div/div[2]/div[1]/div/div[2]/div/div/div[4]/div[2]/div[1]/div[1]/span/div/div/div/div[2]/div/div/div/div/ul/li/div/span')))
    ActionChains(driver_instense).click(pre_node_item).perform()
    # 5.4 Click the + button
    plus_button = WebDriverWait(driver_instense, 15, 1).until(EC.presence_of_element_located((By.XPATH,
                                                                                              '/html/body/div[2]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div/div/div/div/div/div/div[2]/div/div/div[1]/div/div/div[2]/span/div/div[2]/div[1]/div/div[2]/div/div/div[4]/div[2]/div[1]/div[1]/span/div/div/div/div[2]/div/div/div/button[1]')))
    ActionChains(driver_instense).click(plus_button).perform()

    time.sleep(2)

    # 5.5 Enter the ODS table name, e.g. TEST.ods_test_table_name_df
    pre_node_input.send_keys('TEST.' + table_name)

    time.sleep(5)
    # 5.6 Click the dependency item
    pre_node_item = WebDriverWait(driver_instense, 15, 3).until(EC.presence_of_element_located((By.XPATH,
                                                                                                '/html/body/div[2]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div/div/div/div/div/div/div[2]/div/div/div[1]/div/div/div[2]/span/div/div[2]/div[1]/div/div[2]/div/div/div[4]/div[2]/div[1]/div[1]/span/div/div/div/div[2]/div/div/div/div/ul/li/div')))
    ActionChains(driver_instense).click(pre_node_item).perform()
    # 5.7 Click the + button
    time.sleep(1)
    plus_button = WebDriverWait(driver_instense, 15, 1).until(EC.presence_of_element_located((By.XPATH,
                                                                                              '/html/body/div[2]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div/div/div/div/div/div/div[2]/div/div/div[1]/div/div/div[2]/span/div/div[2]/div[1]/div/div[2]/div/div/div[4]/div[2]/div[1]/div[1]/span/div/div/div/div[2]/div/div/div/button[1]')))
    ActionChains(driver_instense).click(plus_button).perform()


def set_target(driver_instense, table_name):
    time.sleep(1)
    # 1 Click the data source type
    target_data_source = driver_instense.find_element(By.XPATH,
                                                      '/html/body/div[2]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div/div/div/div/div/div/div[2]/div/div/div[1]/div/div/div[1]/div/div/div/div/div/div/div/div[2]/div[2]/div/div[2]/div/div/form/div/div[2]/div/div[1]/div/div/span/span[1]/span[1]/span/input')
    ActionChains(driver_instense).move_to_element(target_data_source).click(target_data_source).perform()
    time.sleep(2)
    # 2 Click MySQL
    mysql_data_source = driver_instense.find_element(By.XPATH,
                                                     '/html/body/div[2]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div/div/div/div/div/div/div[2]/div/div/div[1]/div/div/div[1]/div/div/div/div/div/div/div/div[2]/div[2]/div/div[2]/div/div/form/div/div[2]/div/div[1]/div/div/div[2]/ul/li[2]')
    ActionChains(driver_instense).click(mysql_data_source).perform()

    # 3 Click the data source name input box
    target_mysql_data_source = driver_instense.find_element(By.XPATH,
                                                            '/html/body/div[2]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div/div/div/div/div/div/div[2]/div/div/div[1]/div/div/div[1]/div/div/div/div/div/div/div/div[2]/div[2]/div/div[2]/div/div/form/div/div[2]/div/div[2]/div/div')

    ActionChains(driver_instense).click(target_mysql_data_source).perform()
    time.sleep(2)
    # 4 Type the data source name
    target_data_source_item = driver_instense.find_element(By.XPATH,
                                                           '/html/body/div[2]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div/div/div/div/div/div/div[2]/div/div/div[1]/div/div/div[1]/div/div/div/div/div/div/div/div[2]/div[2]/div/div[2]/div/div/form/div/div[2]/div/div[2]/div/div/span/span[1]/span[1]/span/input')

    target_data_source_item.send_keys('mysql_database_name')

    time.sleep(3)

    # 5 Click to confirm the data source name
    target_mysql_database_name = driver_instense.find_element(By.XPATH,
                                                     '/html/body/div[2]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div/div/div/div/div/div/div[2]/div/div/div[1]/div/div/div[1]/div/div/div/div/div/div/div/div[2]/div[2]/div/div[2]/div/div/form/div/div[2]/div/div[2]/div/div/div/ul/li/div')

    ActionChains(driver_instense).click(target_mysql_database_name).perform()

    time.sleep(3)

    # 7 Enter the table name
    target_table_name_input = driver_instense.find_element(By.XPATH,
                                                           '/html/body/div[2]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div[1]/div/div/div/div/div/div[2]/div/div/div[1]/div/div/div[1]/div/div/div/div/div/div/div/div[2]/div[2]/div/div[2]/div/form/div[1]/div[2]/div[1]/div[1]/div/div/span/span[1]/span[1]/span/input')
    target_table_name_input.send_keys(table_name)
    target_table_name_input.submit()

    time.sleep(6)
    # 8 Confirm the target table name
    try:
        confirm_target_table_input = driver_instense.find_element(By.XPATH,
                                                                  '/html/body/div[2]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div/div/div/div/div/div/div[2]/div/div/div[1]/div/div/div[1]/div/div/div/div/div/div/div/div[2]/div[2]/div/div[2]/div/form/div[1]/div[2]/div[1]/div[1]/div/div/div/ul/li/div')

        ActionChains(driver_instense).click(confirm_target_table_input).perform()
    except ElementNotInteractableException:
        print("Target table does not exist or timed out: {0}".format(table_name))
    time.sleep(1)
    # 9 Enter the pre-import SQL statement
    pre_sql_input = driver_instense.find_element(By.XPATH,
                                                 '//*[@id="preSql"]')
    pre_sql_input.send_keys('truncate table ' + table_name)


def set_source(driver_instense, table_name):
    source_data_source_type = WebDriverWait(driver_instense, 20, 1).until(EC.presence_of_element_located((By.XPATH,
                                                                                                          '/html/body/div[2]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div/div/div/div/div/div/div[2]/div/div/div[1]/div/div/div[1]/div/div/div/div/div/div/div/div[2]/div[2]/div/div[1]/div/div/form/div/div[2]/div/div[1]/div/div/span/span[1]/span[1]/span/input')))
    ActionChains(driver_instense).click(source_data_source_type).perform()

    source_odps_type = WebDriverWait(driver_instense, 15, 1).until(EC.presence_of_element_located((By.XPATH,
                                                                                                   '//*[@id="app"]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div[1]/div/div/div/div/div/div[2]/div/div/div[1]/div/div/div[1]/div/div/div/div/div/div/div/div[2]/div[2]/div/div[1]/div/div/form/div/div[2]/div/div[1]/div/div/div[2]/ul/li[1]/div')))

    ActionChains(driver_instense).click(source_odps_type).perform()

    time.sleep(2)
    source_table_input = driver_instense.find_element(By.XPATH,
                                                      '//*[@id="tableItem"]')
    source_table_input.send_keys(table_name)
    source_table_input.submit()

    time.sleep(2)

    source_confirm_table = driver_instense.find_element(By.XPATH,
                                                        '//*[@id="app"]/div/div/div/div/div[1]/div[1]/div[2]/div/div[1]/div/div/div[2]/div[1]/div/div/div/div/div/div[2]/div/div/div[1]/div/div/div[1]/div/div/div/div/div/div/div/div[2]/div[2]/div/div[1]/div/form/div[3]/div[2]/div[1]/div[1]/div/div/div/ul/li/div/span')
    ActionChains(driver_instense).click(source_confirm_table).perform()


def read_excel_file(file_path):
    wb = xw.Book(file_path)
    # Use this workbook's sheets rather than whichever book happens to be active
    wks = wb.sheets
    # Table names
    table_name = wks[0].range("A1:A472")
    # Schedule cycles
    schedule = wks[0].range("B1:B472")
    table_schedule = []
    for index, row in enumerate(table_name):
        # Skip empty cells and rows marked '无' (none)
        if row.value is not None and row.value != '无' and schedule[index].value is not None:
            table_schedule.append((row.value, schedule[index].value))
    wb.close()
    return table_schedule


# Switch to a specific window
def change_target_handle(driver_instense, handle):
    # Switch the window handle
    time.sleep(1)

    all_handles = driver_instense.window_handles
    for h in all_handles:
        if h == handle:
            driver_instense.switch_to.window(handle)


def re_do_task(driver_instense, last_table_info, table_info, datawork_pre_page_handle):
    # Type the last successfully created file name into the search box as the filter
    file_name_input = WebDriverWait(driver_instense, 15, 1).until(EC.presence_of_element_located((By.XPATH,
                                                                                                  '/html/body/div[2]/div/div/div/div/div[1]/div[1]/div[1]/div/div[2]/div[2]/div/div/div/div[1]/div/div[2]/span/input')))
    ActionChains(driver_instense).click(file_name_input).perform()
    file_name = 'di_out_' + last_table_info[0]
    file_name_input.clear()
    time.sleep(6)
    develop = WebDriverWait(driver_instense, 15, 1).until(EC.presence_of_element_located((By.XPATH,
                                                                                          '//*[@id="app"]/div/div/div/div/div[1]/div[1]/div[1]/div/div[2]/div[1]/div/div/div[1]/div/ul[1]/li[1]/div/span/div/div')))
    ActionChains(driver_instense).click(develop).perform()
    time.sleep(3)
    file_name_input.send_keys(file_name)
    # Run the file-creation steps
    time.sleep(3)
    create_di_file(driver_instense, table_info)


def create_di_files(driver_instense, datawork_pre_page_handle):
    excel_path = 'D:\\tmp\\2024129.xlsx'
    table_infos = read_excel_file(excel_path)
    # table_infos = [('test_table_name1', '日'),
    #                ('test_table_name2', '月'),
    #                ('test_table_name3', '月')]
    result_flag = False
    while not result_flag:
        last_table_info = None
        for index, table_info in enumerate(table_infos):
            try:
                # Switch the filter condition every 20 tasks
                if index % 20 == 19:
                    last_table_info = table_infos[index - 1]
                    re_do_task(driver_instense, last_table_info, table_info, datawork_pre_page_handle)
                else:
                    result_flag = create_di_file(driver_instense, table_info)
            except (NoSuchElementException, TimeoutException):
                print("Creation failed, retrying the task")
                if index > 0:
                    last_table_info = table_infos[index - 1]
                re_do_task(driver_instense, last_table_info, table_info, datawork_pre_page_handle)
            finally:
                print("{0} {1} {2}".format(table_info[0], table_info[1], result_flag))


def create_produce_di_task():
    # Read all tasks from Excel

    driver = webdriver.Edge()
    driver.maximize_window()
    driver.get(url="https://ide.res.cloud.impc.com.cn/")
    driver.implicitly_wait(15)
    login(driver_instense=driver)
    dataworks_pre_page_handle = open_dataworks(driver_instense=driver)
    # Handle the xx domain
    open_produce_path(driver_instense=driver)
    create_di_files(driver, dataworks_pre_page_handle)
    # Iterate over all tasks

    # quit() ends the whole session; close() would only close the current window
    driver.quit()


if __name__ == "__main__":
    # Create the tasks for the xx domain
    create_produce_di_task()
    # TODO: other domains