Android Dependency Analysis Automation in Practice

Requirements

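An Android project usually has many dependencies. In a componentized project these may be dependencies on other business modules and middleware, and they may also be dependencies on third-party SDKs. Here is why each of these two cases calls for an automated tool that analyzes dependency versions: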

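Second-party component dependencies:
Each integration release may publish many business components. An automated analysis tool can detect when a business component that is still in development has been integrated, so the risk is caught before release.
Third-party SDK dependencies:
A healthy project is very cautious about upgrading third-party SDKs. But SDK dependency trees are now very deep (androidx included), and one dependency often drags in many others, so an SDK we did not intend to upgrade can be upgraded without anyone noticing. Automated analysis matters a great deal here.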

From the analysis above, the tool should look roughly like this:

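It should analyze dependency versions automatically at key build stages (such as integration and release) and produce a report, which can be pushed directly through a chat robot or by email.
The notification does not need to be elaborate. It only has to show clearly which libraries were upgraded relative to the previous build. Further analysis is still left to a person, who can use gradle dependencies for a second round of investigation.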

Next, let's look at how to quickly build such an automated dependency version analysis tool. First, here is what the result looks like:

Dependency changes on integration branch (xxx) relative to the previous build (xxx):

Version changed com.squareup.okhttp3:okhttp 3.11.0===>3.12.6

Collecting dependencies

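To analyze dependencies, we first need to collect all of the project's dependencies. We do not need the complex tree analysis that the dependencies task performs. It is enough to flatten all dependencies into a single file, where each line looks like: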

com.squareup.okhttp3:okhttp:3.11.0

First, define the data bean for a dependency:

class Dependency{
    String group = ""
    String name = ""
    String version = ""

    @Override
    String toString() {
        return "${group}:${name}:${version}"
    }
}

Following the common approach found online, obtain the dependencies through the Gradle API:

project.afterEvaluate {
    project.android.applicationVariants.all { variant ->
        // One task per variant; this closure runs while the task is being configured
        tasks.create(name: "showDependencies${variant.name.capitalize()}",
                description: "Show all dependencies") {
            // Output file in the module directory, e.g. DependenciesRelease
            File file = new File("${getProjectDir()}","Dependencies${variant.name.capitalize()}")
            if (file.exists()){
                file.delete()
            }
            List<Dependency> list = new ArrayList<>()
            // The set removes duplicate "group:name:version" entries
            HashSet<String> set = new HashSet<>()
            Configuration configuration
            try {
                // Android Gradle Plugin 3.x and later
                configuration = project.configurations."${variant.name}CompileClasspath"
            } catch (Exception e) {
                // Android Gradle Plugin 2.x
                configuration = project.configurations."_${variant.name}Compile"
            }
            // Flatten the resolved graph: direct and transitive module dependencies
            configuration.resolvedConfiguration.lenientConfiguration.allModuleDependencies.each {
                list.add(collectDependency(it))
            }
            file.withWriter { writer->
                list.forEach{ data->
                    addDependency(writer,data,0,set)
                }
                // Write one deduplicated dependency per line
                set.forEach{ item->
                    writer.writeLine item
                }
            }
        }
    }
}

Dependency collectDependency(ResolvedDependency dependency){
    def identifier = dependency.module.id
    def item = new Dependency()
    item.group = identifier.group
    item.name = identifier.name
    item.version = identifier.version

    return item
}

def addDependency(BufferedWriter writer, Dependency dependency, int index, HashSet<String>set){
    set.add(dependency.toString())
}

At this point we have the dependencies in the desired format, deduplicated and written to a "Dependencies<Variant>" file.

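The script runs on every build of the project, at the end of Gradle's configuration phase (afterEvaluate). You can use a few variables to restrict it to the CI/CD integration stage, since there is little need to run it on every build during development. The details are not covered here.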

Generating the dependency diff

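With the dependency files generated, we can do the actual comparison. The scenario is simplified here: to analyze dependency changes between two integrations, we only need to compare the previous dependency analysis file with the current one.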

Call the previous dependency analysis file DA and the current one DB.

Added: absent from DA, present in DB.
Changed: present in both DA and DB, but with different versions.
Removed: present in DA, absent from DB.
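
The three rules can be sketched on two small in-memory maps. This is only an illustration: the function name classify_changes and the sample data are made up for the example, and each map is assumed to use "group:name" as the key and the version as the value.

```python
def classify_changes(previous, current):
    """Split two {"group:name": version} maps into added, changed and removed."""
    added = {lib: ver for lib, ver in current.items() if lib not in previous}
    removed = {lib: ver for lib, ver in previous.items() if lib not in current}
    changed = {
        lib: (previous[lib], ver)
        for lib, ver in current.items()
        if lib in previous and previous[lib] != ver
    }
    return added, changed, removed


# Hypothetical sample data: da is the previous build (DA), db the current one (DB).
da = {"com.squareup.okhttp3:okhttp": "3.11.0", "com.google.code.gson:gson": "2.8.5"}
db = {"com.squareup.okhttp3:okhttp": "3.12.6", "com.squareup.okio:okio": "1.15.0"}

added, changed, removed = classify_changes(da, db)
print("added:", added)
print("changed:", changed)
print("removed:", removed)
```

Here okio counts as added, gson as removed, and okhttp as changed from 3.11.0 to 3.12.6.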

Once the current analysis is finished, the file is archived and becomes the previous dependency analysis file for the next run.

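Note that the dependency analysis file is generated inside the project workspace of the current CI job. With Jenkins, for example, that is the workspace of the integration job on the slave. Archiving means moving the file to stable storage, because the Jenkins workspace is fresh on every run.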
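
As a rough sketch of the archiving step, the snippet below copies a dependency file out of the workspace into a stable directory. It assumes the stable storage is a locally mounted directory, which may not match your setup (the article's own scripts talk to a remote host). The function name archive_dependency_file and the temporary directories are hypothetical stand-ins.

```python
import os
import shutil
import tempfile


def archive_dependency_file(source_path, archive_dir):
    """Copy a dependency file out of the CI workspace into a stable directory."""
    os.makedirs(archive_dir, exist_ok=True)
    target_path = os.path.join(archive_dir, os.path.basename(source_path))
    # copy2 overwrites an existing archive, so the newest run always wins
    shutil.copy2(source_path, target_path)
    return target_path


# Demo: temporary directories stand in for the workspace and the archive.
workspace_dir = tempfile.mkdtemp()
archive_dir = os.path.join(tempfile.mkdtemp(), "RecordDependencies")
source = os.path.join(workspace_dir, "DependenciesRelease")
with open(source, "w") as f:
    f.write("com.squareup.okhttp3:okhttp:3.12.6\n")

archived = archive_dependency_file(source, archive_dir)
print(archived)
```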

With both dependency analysis files available, here are the details of the comparison:

import os

workspace = os.getenv("WORKSPACE")
# Last path component of the workspace, i.e. the job directory name
work_space_name = workspace.split("/")[-1]
REMOTE_SAVE_DIR = "xxxx"
domain = "xxxx"
# Walk the app module and process every generated "Dependencies<Variant>" file
for roots, dirs, files in os.walk(workspace + "/app", topdown=True):
    for name in files:
        if name.find("Dependencies") != -1:
            file_name = os.path.join(roots, name)
            record_map = {}
            # Close the file as soon as it has been parsed
            with open(file_name) as dependency_file:
                init_version_map(dependency_file, record_map)
            find_last_commit_version_diff(name, record_map)
            write_file(record_map, name)

The current dependency file is read from the current job's workspace. Its contents are deserialized into record_map. The logic is in the init_version_map function.

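The previous dependency file has to be fetched from the archive storage, for example by building the download address from domain and REMOTE_SAVE_DIR. After it is fetched, it is loaded into last_record_map. The comparison logic is in the find_last_commit_version_diff function.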

init_version_map

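def init_version_map(file, version_map):
    for line in file.readlines():
        if len(line.strip()) != 0:
            item_list = line.split(":")
            # Groups containing the workspace name (presumably local modules) are normalized to "local"
            if item_list[0].strip().find(work_space_name) != -1:
                item_list[0] = item_list[0].strip().replace(work_space_name, "local")
            if len(item_list) == 3:
                # "group:name:version" -> key "group:name", value "version"
                version_map[item_list[0].strip() + ":" + item_list[1].strip()] = item_list[2].strip()
            else:
                # Unexpected format: key is the concatenated parts, version is a placeholder
                key = ""
                for item in item_list:
                    key += item.strip()
                version_map[key] = "NO_VERSION"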

Because each line in the dependency analysis file has the format:

 "${group}:${name}:${version}"

the map uses ${group}:${name} as the key and the version as the value.
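
To make the key/value mapping concrete, here is a minimal sketch that parses a single line. The helper parse_dependency_line is hypothetical, written only for this illustration, and it returns None for lines that do not have exactly three parts.

```python
def parse_dependency_line(line):
    """Return ("group:name", version) for a "group:name:version" line, else None."""
    parts = [part.strip() for part in line.split(":")]
    if len(parts) != 3:
        return None
    group, name, version = parts
    return group + ":" + name, version


key, version = parse_dependency_line("com.squareup.okhttp3:okhttp:3.11.0\n")
print(key, version)
```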

find_last_commit_version_diff

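import os
import paramiko

SSH_PORT = 22  # assumption: default SSH port; the original leaves `port` undefined


def find_cloud_dependencies(target_name):
    transport = None
    try:
        # Open an SFTP session over a raw transport (no host key check is done here)
        transport = paramiko.Transport((domain, SSH_PORT))
        transport.connect(username='xxx', password='xxx')
        sftp = paramiko.SFTPClient.from_transport(transport)
        # Assumption: the archived file sits under REMOTE_SAVE_DIR with the same name
        src = REMOTE_SAVE_DIR + "/" + target_name
        local = "/home/dev/workspace/RecordDependencies/cloud" + target_name
        sftp.get(src, local)
        return os.path.exists(local)
    except Exception:
        # Any failure (network, auth, missing file) means "no previous record"
        return False
    finally:
        if transport is not None:
            transport.close()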

This step pulls the previous dependency analysis file back from the remote host and saves it as the "cloudDependencies<Variant>" file.

def find_last_commit_version_diff(name, current_record_map):
    if find_cloud_dependencies(name):
        last_record_map = {}
        last_path = "/home/dev/workspace/RecordDependencies/cloud" + name
        with open(last_path) as last_file:
            init_version_map(last_file, last_record_map)
        add_version_map = {}
        remove_version_map = {}
        changed_version_map = {}
        # Added: only in the current map. Changed: in both, with different versions.
        for lib, version in current_record_map.items():
            if lib not in last_record_map:
                add_version_map[lib] = version
            elif last_record_map[lib] != version:
                changed_version_map[lib] = last_record_map[lib] + "===>" + version
        # Removed: only in the previous map.
        for lib, version in last_record_map.items():
            if lib not in current_record_map:
                remove_version_map[lib] = version
        version_diff_file_path = "/home/dev/workspace/RecordDependencies/lastVersionDiff" + name

        # Opening in 'w' mode truncates any diff file left over from an earlier run
        with open(version_diff_file_path, 'w') as diff_file:
            if add_version_map or remove_version_map or changed_version_map:
                for lib, version in add_version_map.items():
                    diff_file.write("Added  " + lib + "  version: " + version + "  \n  ")
                if add_version_map:
                    diff_file.write("**------------------------**  \n  ")

                for lib, version in remove_version_map.items():
                    diff_file.write("Removed  " + lib + "  version: " + version + "  \n  ")
                if remove_version_map:
                    diff_file.write("**------------------------**  \n  ")

                for lib, version in changed_version_map.items():
                    diff_file.write("Version changed  " + lib + "  " + version + "  \n  ")
            else:
                diff_file.write("No changes relative to the previous build  \n  ")
        upload_file(version_diff_file_path, "lastVersionDiff" + name)

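The previous analysis map is last_record_map and the current one is current_record_map. They are compared according to the rules described earlier, and the corresponding text description is generated.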

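The resulting comparison text is written to the "lastVersionDiff<Variant>" file for later use. Its contents can be sent to a robot or by email, which produces the result shown at the beginning.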
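
The article does not show the notification step, so the following is only a sketch of one way to do it with the standard library. The function names, the branch name, and above all the JSON payload shape ({"text": ...}) are assumptions; a real robot webhook will define its own format, and send_report is defined but deliberately not called here.

```python
import json
import urllib.request


def build_report(branch, diff_text):
    """Prefix the diff text with a one-line header naming the branch."""
    header = "Dependency changes on integration branch (" + branch + "):"
    return header + "\n\n" + diff_text.strip()


def send_report(webhook_url, report):
    """POST the report as JSON to a webhook. The payload shape is hypothetical."""
    body = json.dumps({"text": report}).encode("utf-8")
    request = urllib.request.Request(
        webhook_url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Sample diff text in the same shape as a line of the lastVersionDiff file.
sample_diff = "Version changed  com.squareup.okhttp3:okhttp  3.11.0===>3.12.6  \n  "
report = build_report("release/1.0", sample_diff)
print(report)
```

In a CI job, the diff text would be read from the "lastVersionDiff<Variant>" file and the webhook URL would come from the job's configuration.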