03_apollo_scripts Submodule: In-Depth Analysis of the Overall Software Architecture

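1. Overview

apollo_scripts is the script management module of the Apollo autonomous driving platform. It automates building, deployment, running, and testing. The module consists of shell scripts and Python tools that handle environment configuration, module startup, device initialization, and code quality checks, covering the full lifecycle from development through deployment and maintenance.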

2. Software Architecture Diagram

graph TB
  subgraph "User Interface Layer"
    A1[Command-line interface CLI]
    A2[Docker container interface]
    A3[IDE integration interface]
    A4[Web management interface]
  end
  subgraph "Script Management Layer"
    A[Apollo Scripts Manager]
    subgraph "Build Script Group"
      B[Build Scripts]
      B1[apollo_buildify.sh - code formatting]
      B2[apollo_action.sh - build action management]
      B3[apollo_clean.sh - clean build artifacts]
      B4[apollo_format.sh - code format check]
      B5[buildifier.sh - BUILD file formatting]
      B6[clang_format.sh - C++ code formatting]
      B7[yapf.sh - Python code formatting]
      B8[apollo_lint.sh - code style check]
      B9[apollo_ci.sh - continuous integration]
    end
    subgraph "Runtime Script Group"
      C[Runtime Scripts]
      C1[env.sh - environment initialization]
      C2[apollo_base.sh - base function library]
      C3[bootstrap.sh - system bootstrap]
      C4[cyber_launch.sh - Cyber component launch]
      C5[Module launch scripts]
      C51[dreamview.sh - visualization UI]
      C52[perception.sh - perception module]
      C53[localization.sh - localization module]
      C54[planning.sh - planning module]
      C55[control.sh - control module]
      C56[canbus.sh - CAN bus communication]
      C57[routing.sh - routing module]
      C58[prediction.sh - prediction module]
    end
    subgraph "Test Script Group"
      D[Test Scripts]
      D1[replay.sh - data replay]
      D2[record_bag.sh - data recording]
      D3[performance_test.sh - performance tests]
      D4[unit_test_runner.sh - unit tests]
      D5[integration_test.sh - integration tests]
      D6[functional_test.sh - functional tests]
    end
    subgraph "Utility Script Group"
      E[Utility Scripts]
      E1[device_setup.sh - hardware device setup]
      E2[data_management.sh - data management]
      E3[configuration_tools.sh - configuration management]
      E4[model_download.sh - model download]
      E5[map_tools.sh - map tools]
      E6[common_functions.sh - common function library]
      E7[log_analyzer.sh - log analysis]
      E8[diagnostic_tool.sh - diagnostics]
    end
    subgraph "Configuration Management Group"
      F[Configuration Scripts]
      F1[apollo_config.sh - system configuration]
      F2[switch_vehicle.sh - vehicle switching]
      F3[install_scripts.sh - installation]
      F4[environment_setup.sh - environment setup]
      F5[vehicle_calibrations.sh - vehicle calibration]
      F6[security_config.sh - security configuration]
    end
    subgraph "Maintenance Script Group"
      G[Maintenance Scripts]
      G1[monitor.sh - system monitoring]
      G2[log_management.sh - log management]
      G3[data_cleaner.sh - data cleanup]
      G4[ota.sh - over-the-air updates]
      G5[health_check.sh - health check]
      G6[backup_restore.sh - backup and restore]
    end
    subgraph "Deployment Script Group"
      H[Deployment Scripts]
      H1[apollo_deploy.sh - deployment]
      H2[remote_deploy.sh - remote deployment]
      H3[package_builder.sh - package building]
      H4[image_creator.sh - image creation]
      H5[container_manager.sh - container management]
    end
  end
  subgraph "Underlying Support Layer"
    J[Operating system - Linux Ubuntu]
    K[Docker container engine]
    L[Bazel build system]
    M[Python runtime]
    N[Bash shell]
    O[Protobuf compiler]
    P[CMake build tool]
  end
  subgraph "External Dependencies"
    Q[Hardware drivers]
    R[Sensor devices]
    S[Network services]
    T[Cloud service platform]
  end
  A1 --> A
  A2 --> A
  A3 --> A
  A4 --> A
  A --> B
  A --> C
  A --> D
  A --> E
  A --> F
  A --> G
  A --> H
  B --> L
  B --> P
  B --> O
  C --> N
  C --> M
  C --> K
  D --> K
  D --> J
  E --> J
  E --> Q
  F --> J
  F --> S
  G --> J
  G --> T
  H --> K
  H --> S
  H --> T
  J -.-> A
  K -.-> A
  L -.-> A
  M -.-> A
  N -.-> A
  O -.-> A
  P -.-> A
  Q -.-> E
  R -.-> E
  S -.-> F
  T -.-> H

3. Call Flow Diagram

flowchart TD
  Start([User starts Apollo]) --> PreCheck{Check runtime environment}
  PreCheck -->|Environment OK| InitEnv[Initialize environment variables]
  PreCheck -->|Environment error| ErrorEnv[Report environment error]
  ErrorEnv --> End([End])
  InitEnv --> LoadConfig[Load system configuration]
  LoadConfig --> CheckDocker{Check Docker environment}
  CheckDocker -->|Inside Docker| InDockerOps[In-container operations]
  CheckDocker -->|Outside Docker| OutDockerOps[Host operations]
  InDockerOps --> DetectArch[Detect system architecture]
  OutDockerOps --> DetectArch
  DetectArch -->|x86_64| SetupX86[Configure x86_64 environment]
  DetectArch -->|aarch64| SetupARM[Configure ARM environment]
  SetupX86 --> DeviceSetup[Initialize hardware devices]
  SetupARM --> DeviceSetup
  DeviceSetup --> CreateDirs[Create required directories]
  CreateDirs --> SetupDevices[Configure CAN/GPU devices]
  SetupDevices --> VerifySetup{Verify device configuration}
  VerifySetup -->|Succeeded| SystemReady[System ready]
  VerifySetup -->|Failed| RetrySetup[Retry configuration]
  RetrySetup --> VerifySetup
  SystemReady --> WaitForCmd{Wait for user command}
  WaitForCmd -->|Start module| StartModule[Start the given module]
  WaitForCmd -->|Stop module| StopModule[Stop the given module]
  WaitForCmd -->|Build| BuildSystem[Build the Apollo system]
  WaitForCmd -->|Run tests| RunTests[Run the test suites]
  WaitForCmd -->|Deploy| DeploySystem[Deploy to target]
  WaitForCmd -->|Monitor| MonitorSystem[Monitor system status]
  WaitForCmd -->|Record data| RecordData[Record data]
  WaitForCmd -->|Clean data| CleanData[Clean historical data]
  %% Module start flow
  StartModule --> ParseModule[Parse module arguments]
  ParseModule --> CheckModule{Check module status}
  CheckModule -->|Not running| LaunchModule[Launch module]
  CheckModule -->|Already running| NotifyRunning[Report that the module is already running]
  NotifyRunning --> ReturnReady[Return to system ready]
  LaunchModule --> FindLaunchFile[Locate launch file]
  FindLaunchFile --> ExecuteLaunch[Run cyber_launch]
  ExecuteLaunch --> VerifyLaunch{Verify launch status}
  VerifyLaunch -->|Launch succeeded| LogSuccess[Log success]
  VerifyLaunch -->|Launch failed| LogFailure[Log failure]
  LogSuccess --> ReturnReady
  LogFailure --> ReturnReady
  %% Module stop flow
  StopModule --> IdentifyProcess[Identify module process]
  IdentifyProcess --> KillProcess[Terminate module process]
  KillProcess --> VerifyStop{Verify stop status}
  VerifyStop -->|Stopped| LogStop[Log stop]
  VerifyStop -->|Not stopped| ForceKill[Force kill]
  ForceKill --> VerifyStop
  LogStop --> ReturnReady
  %% Build flow
  BuildSystem --> ParseBuildArgs[Parse build arguments]
  ParseBuildArgs --> CheckBuildEnv{Check build environment}
  CheckBuildEnv -->|Environment OK| CleanBuild[Clean build cache]
  CheckBuildEnv -->|Environment error| SetupBuildEnv[Set up build environment]
  SetupBuildEnv --> CleanBuild
  CleanBuild --> DetermineTargets[Determine build targets]
  DetermineTargets --> ExecuteBuild[Run Bazel build]
  ExecuteBuild --> VerifyBuild{Verify build result}
  VerifyBuild -->|Build succeeded| PostBuild[Post-build processing]
  VerifyBuild -->|Build failed| ReportBuildErr[Report build error]
  PostBuild --> ReturnReady
  ReportBuildErr --> ReturnReady
  %% Test flow
  RunTests --> SetupTestEnv[Set up test environment]
  SetupTestEnv --> RunUnitTest[Run unit tests]
  RunUnitTest --> RunIntegrationTest[Run integration tests]
  RunIntegrationTest --> RunFunctionalTest[Run functional tests]
  RunFunctionalTest --> GenTestReport[Generate test report]
  GenTestReport --> ReturnReady
  %% Deployment flow
  DeploySystem --> ValidateTarget[Validate deployment target]
  ValidateTarget --> PreparePackage[Prepare deployment package]
  PreparePackage --> TransferPackage[Transfer deployment package]
  TransferPackage --> InstallPackage[Install deployment package]
  InstallPackage --> ConfigureDeploy[Configure deployment environment]
  ConfigureDeploy --> VerifyDeploy{Verify deployment result}
  VerifyDeploy -->|Deployment succeeded| LogDeploy[Log deployment success]
  VerifyDeploy -->|Deployment failed| Rollback[Roll back deployment]
  LogDeploy --> ReturnReady
  Rollback --> ReturnReady
  %% Monitoring flow
  MonitorSystem --> CollectMetrics[Collect system metrics]
  CollectMetrics --> AnalyzeData[Analyze metric data]
  AnalyzeData --> CheckThresholds{Check thresholds}
  CheckThresholds -->|Normal| ContinueMonitor[Continue monitoring]
  CheckThresholds -->|Abnormal| RaiseAlert[Raise alert]
  ContinueMonitor --> CollectMetrics
  RaiseAlert --> NotifyAdmin[Notify administrator]
  NotifyAdmin --> ContinueMonitor
  %% Data recording flow
  RecordData --> DecideStorage[Decide storage location]
  DecideStorage --> CreateTaskDir[Create task directory]
  CreateTaskDir --> SelectChannels[Select channels to record]
  SelectChannels --> StartRecording[Start recording]
  StartRecording --> MonitorDisk{Monitor disk space}
  MonitorDisk -->|Enough space| ContinueRecord[Continue recording]
  MonitorDisk -->|Low space| StopRecord[Stop recording]
  ContinueRecord --> MonitorDisk
  StopRecord --> CompressData[Compress data]
  CompressData --> ReturnReady
  %% Data cleanup flow
  CleanData --> ScanOldData[Scan old data]
  ScanOldData --> FilterData[Select data to clean]
  FilterData --> ConfirmClean[Confirm cleanup]
  ConfirmClean --> ExecuteClean[Run cleanup]
  ExecuteClean --> VerifyClean{Verify cleanup result}
  VerifyClean -->|Cleanup succeeded| LogClean[Log cleanup]
  VerifyClean -->|Cleanup failed| ReportCleanErr[Report cleanup error]
  LogClean --> ReturnReady
  ReportCleanErr --> ReturnReady
  ReturnReady --> WaitForCmd
  subgraph "Core Flow"
    CoreFlow[SystemReady]
  end
  subgraph "Error Handling Flow"
    ErrFlow[Error Handling]
    ErrFlow --> LogError[Log error]
    ErrFlow --> AttemptRecovery[Attempt recovery]
    ErrFlow --> NotifyUser[Notify user]
  end
  LogFailure -.-> ErrFlow
  ReportBuildErr -.-> ErrFlow
  ReportCleanErr -.-> ErrFlow

4. UML Class Diagram

classDiagram
  %% Base abstraction layer
  class ApolloScriptBase {
    <>
    +String TOP_DIR
    +String APOLLO_ROOT_DIR
    +String ARCH
    +Boolean APOLLO_IN_DOCKER
    +int APOLLO_OUTSIDE_DOCKER
    +String CMDLINE_OPTIONS
    +Boolean ENABLE_PROFILER
    +String APOLLO_BIN_PREFIX
    +Map env_vars
    +
    +initialize_environment()
    +set_lib_path()
    +create_data_dir()
    +determine_bin_prefix()
    +setup_device()
    +decide_task_dir()
    +check_in_docker()
    +pathprepend(String var, String value)
    +pathappend(String var, String value)
    +info(String msg)
    +warning(String msg)
    +error(String msg)
    +ok(String msg)
    +fatal(String msg)
    +check_function_exists(String func_name)
    +is_stopped_customized_path(String module_path, String module)
  }
  %% Build system layer
  class BuildSystem {
    +String DISABLED_TARGETS
    +String SHORTHAND_TARGETS
    +int USE_GPU
    +Boolean USE_ESD_CAN
    +int USE_OPT
    +String BUILD_TYPE
    +
    +determine_build_targets(String... components)
    +determine_disabled_targets(String... components)
    +_chk_n_set_gpu_arg(String arg)
    +_determine_perception_disabled()
    +build(String... targets)
    +clean()
    +verify_build()
    +setup_build_environment()
    +configure_build_options()
  }
  class BuildOptimizer {
    +int MAX_JOBS
    +String BUILD_CACHE_DIR
    +Boolean USE_INCREMENTAL_BUILD
    +
    +optimize_build_performance()
    +enable_cache_mechanism()
    +limit_concurrent_jobs()
  }
  %% Module management layer
  class ModuleLauncher {
    +String LAUNCH_FILE_PATH
    +String MODULE_STATUS
    +
    +start(String module, String... args)
    +start_customized_path(String module_path, String module, String... args)
    +stop(String module)
    +check_module_status(String module)
    +wait_for_exit(String module)
    +list_running_modules()
  }
  class ModuleRegistry {
    +Map registered_modules
    +List essential_modules
    +
    +register_module(ModuleInfo info)
    +unregister_module(String module_name)
    +get_module_info(String module_name)
    +get_essential_modules()
    +validate_module_dependencies()
  }
  class ModuleInfo {
    +String name
    +String path
    +String launch_file
    +List dependencies
    +Boolean is_essential
    +String description
    +
    +ModuleInfo(String name, String path, String launch_file)
    +getName()
    +getPath()
    +getLaunchFile()
    +getDependencies()
  }
  %% Device management layer
  class DeviceSetup {
    +String CAN_DEVICE_PATTERN
    +String GPU_DEVICE_PATTERN
    +int NUM_CAN_PORTS
    +
    +setup_device_for_amd64()
    +setup_device_for_aarch64()
    +setup_can_devices()
    +check_gpu_devices()
    +setup_shared_mem()
    +initialize_hardware()
    +validate_device_access()
  }
  class HardwareValidator {
    +List required_devices
    +Map device_paths
    +
    +validate_required_hardware()
    +check_device_permissions()
    +test_device_functionality()
    +generate_hardware_report()
  }
  %% Configuration management layer
  class ConfigManager {
    +String VEHICLE_NAME
    +String BRIDGE_PORT
    +String DASHBOARD_PORT
    +String CONFIG_DIR
    +
    +load_config()
    +validate_config()
    +apply_config()
    +save_config()
    +switch_vehicle(String vehicle_id)
    +validate_vehicle_config(String vehicle_id)
  }
  class VehicleConfig {
    +String vehicle_id
    +String model
    +String calibration_file
    +Map parameters
    +
    +VehicleConfig(String vehicle_id)
    +getCalibrationFile()
    +getParameter(String key)
    +setParameter(String key, Object value)
    +validate()
  }
  %% Data management layer
  class DataManager {
    +String BAG_PATH
    +String LOG_PATH
    +String TASK_DIR
    +String DATA_RETENTION_DAYS
    +
    +manage_logs()
    +clean_data()
    +backup_data()
    +record_bag(List channels)
    +stop_record()
    +rotate_logs()
    +compress_old_data()
  }
  class DataRecorder {
    +String RECORDING_TASK_ID
    +String CURRENT_BAG_FILE
    +Boolean is_recording
    +
    +start_recording(List channels)
    +stop_recording()
    +pause_recording()
    +resume_recording()
    +get_recording_status()
  }
  %% Test management layer
  class TestRunner {
    +String TEST_FILTER
    +String TEST_TIMEOUT
    +String TEST_REPORT_DIR
    +
    +run_unit_tests()
    +run_integration_tests()
    +run_functional_tests()
    +generate_test_report()
    +analyze_coverage()
    +validate_test_results()
  }
  class TestCaseManager {
    +List test_cases
    +TestResultAggregator aggregator
    +
    +add_test_case(TestCase tc)
    +run_all_tests()
    +get_test_results()
    +generate_coverage_report()
  }
  class TestCase {
    +String name
    +String description
    +String command
    +int timeout
    +
    +TestCase(String name, String command)
    +execute()
    +getName()
    +getTimeout()
  }
  %% Deployment management layer
  class DeploymentManager {
    +String TARGET_HOST
    +String DEPLOY_PATH
    +String DEPLOY_PACKAGE
    +
    +deploy_to_remote()
    +rollback_version()
    +verify_deployment()
    +update_config()
    +check_target_compatibility()
  }
  class PackageBuilder {
    +String PACKAGE_FORMAT
    +List components
    +String OUTPUT_DIR
    +
    +create_package(List components)
    +extract_package(String path)
    +verify_package_integrity()
    +install_package(String package_path)
    +calculate_checksum(String file_path)
  }
  %% Monitoring layer
  class MonitorSystem {
    +String METRICS_INTERVAL
    +Map system_metrics
    +List alert_handlers
    +
    +collect_cpu_usage()
    +collect_memory_usage()
    +collect_disk_usage()
    +collect_network_stats()
    +send_alert(String message)
    +log_event(String event)
    +start_monitoring()
    +stop_monitoring()
  }
  class MetricsCollector {
    +SystemMetrics current_metrics
    +List sources
    +
    +collect_system_metrics()
    +collect_process_metrics()
    +collect_network_metrics()
    +aggregate_metrics()
  }
  class AlertHandler {
    +String handler_type
    +String destination
    +
    +handle_alert(Alert alert)
    +send_notification(String message)
    +log_alert(Alert alert)
  }
  %% Execution management layer
  class ScriptExecutor {
    +String current_command
    +ExecutionResult last_result
    +
    +execute_command(String cmd)
    +handle_error(Error error)
    +log_operation(String operation)
    +validate_execution_env()
  }
  class ExecutionResult {
    +int exit_code
    +String stdout
    +String stderr
    +long execution_time
    +
    +ExecutionResult(int code, String out, String err)
    +isSuccessful()
    +getExitCode()
    +getStdout()
    +getStderr()
  }
  %% Main controller
  class MainController {
    +BuildSystem build_system
    +ModuleLauncher module_launcher
    +ConfigManager config_manager
    +DataManager data_manager
    +TestRunner test_runner
    +DeploymentManager deployment_manager
    +MonitorSystem monitor_system
    +PackageBuilder package_builder
    +
    +initialize_system()
    +process_command(String[] args)
    +manage_lifecycle()
    +handle_shutdown()
  }
  %% Inheritance
  ApolloScriptBase <|-- BuildSystem
  ApolloScriptBase <|-- ModuleLauncher
  ApolloScriptBase <|-- DeviceSetup
  ApolloScriptBase <|-- ConfigManager
  ApolloScriptBase <|-- DataManager
  ApolloScriptBase <|-- TestRunner
  ApolloScriptBase <|-- DeploymentManager
  ApolloScriptBase <|-- MonitorSystem
  %% Associations
  BuildSystem --> BuildOptimizer : uses
  ModuleLauncher --> ModuleRegistry : manages
  ModuleRegistry --> ModuleInfo : contains
  DeviceSetup --> HardwareValidator : uses
  ConfigManager --> VehicleConfig : manages
  DataManager --> DataRecorder : uses
  TestRunner --> TestCaseManager : uses
  TestCaseManager --> TestCase : contains
  DeploymentManager --> PackageBuilder : uses
  MonitorSystem --> MetricsCollector : uses
  MonitorSystem --> AlertHandler : uses
  ScriptExecutor --> ExecutionResult : creates
  %% Main controller associations
  MainController --> BuildSystem : orchestrates
  MainController --> ModuleLauncher : orchestrates
  MainController --> DeviceSetup : orchestrates
  MainController --> ConfigManager : orchestrates
  MainController --> DataManager : orchestrates
  MainController --> TestRunner : orchestrates
  MainController --> DeploymentManager : orchestrates
  MainController --> MonitorSystem : orchestrates
  MainController --> PackageBuilder : orchestrates
  MainController --> ScriptExecutor : uses

5. State Machine

stateDiagram-v2
  [*] --> SystemInit : launch script
  SystemInit --> EnvSetup : initialize environment variables
  EnvSetup --> CheckDocker : check Docker environment
  CheckDocker --> InDockerState : inside container / set up container environment
  CheckDocker --> OutDockerState : outside container / set up host environment
  InDockerState --> DetectPlatform : detect platform architecture
  OutDockerState --> DetectPlatform
  DetectPlatform --> SetupAMD64 : x86_64 / configure x86_64 environment
  DetectPlatform --> SetupARM64 : aarch64 / configure ARM64 environment
  SetupAMD64 --> DeviceInitialization : initialize devices
  SetupARM64 --> DeviceInitialization
  DeviceInitialization --> CreateDataDirs : create data directories
  CreateDataDirs --> SetupHardware : configure hardware devices
  SetupHardware --> SystemReady : system ready
  SystemReady --> WaitForCommand : wait for user command
  WaitForCommand --> BuildProcess : build command / start build
  WaitForCommand --> ModuleStart : start module / start module
  WaitForCommand --> ModuleStop : stop module / stop module
  WaitForCommand --> TestProcess : run tests / run tests
  WaitForCommand --> DeployProcess : deploy command / run deployment
  WaitForCommand --> MonitorProcess : monitor command / start monitoring
  WaitForCommand --> RecordProcess : record data / start recording
  WaitForCommand --> CleanProcess : clean command / run cleanup
  %% Build process states
  state BuildProcess {
    [*] --> ParseArgs : parse arguments
    ParseArgs --> ValidateEnv : validate environment
    ValidateEnv --> PrepareBuild : environment valid / prepare build
    ValidateEnv --> BuildError : environment invalid / environment error
    PrepareBuild --> SelectTargets : select build targets
    SelectTargets --> ExecuteBazel : run Bazel build
    ExecuteBazel --> PostBuild : build succeeded / post-build processing
    ExecuteBazel --> BuildError : build failed / build error
    PostBuild --> BuildComplete : build complete
    BuildError --> [*]
    BuildComplete --> [*]
  }
  %% Module start states
  state ModuleStart {
    [*] --> ParseModuleArgs : parse module arguments
    ParseModuleArgs --> CheckModuleStatus : check module status
    CheckModuleStatus --> ModuleRunning : already running / module is running
    CheckModuleStatus --> LocateLaunchFile : not running / locate launch file
    ModuleRunning --> [*]
    LocateLaunchFile --> LaunchViaCyber : launch via cyber_launch
    LaunchViaCyber --> WaitLaunchResult : wait for launch result
    WaitLaunchResult --> VerifyModule : launch succeeded / verify module status
    WaitLaunchResult --> ModuleStartError : launch failed / launch error
    VerifyModule --> ModuleStarted : verified / module started
    VerifyModule --> ModuleStartError : verification failed / verification failed
    ModuleStartError --> [*]
    ModuleStarted --> [*]
  }
  %% Module stop states
  state ModuleStop {
    [*] --> IdentifyModule : identify module
    IdentifyModule --> FindProcess : find process
    FindProcess --> KillProcess : process found / terminate process
    FindProcess --> ModuleNotRunning : process not found / module not running
    KillProcess --> VerifyStop : verify stop status
    VerifyStop --> ModuleStopped : stopped / module stopped
    VerifyStop --> ForceKill : not stopped / force kill
    ForceKill --> VerifyStop
    ModuleNotRunning --> [*]
    ModuleStopped --> [*]
  }
  %% Test process states
  state TestProcess {
    [*] --> SetupTestEnv : set up test environment
    SetupTestEnv --> RunUnitTests : run unit tests
    RunUnitTests --> RunIntegrationTests : passed / run integration tests
    RunUnitTests --> TestsFailed : failed / tests failed
    RunIntegrationTests --> RunFunctionalTests : passed / run functional tests
    RunIntegrationTests --> TestsFailed : failed / tests failed
    RunFunctionalTests --> GenerateReports : passed / generate reports
    RunFunctionalTests --> TestsFailed : failed / tests failed
    GenerateReports --> TestsComplete : tests complete
    TestsFailed --> [*]
    TestsComplete --> [*]
  }
  %% Deployment process states
  state DeployProcess {
    [*] --> ValidateTarget : validate deployment target
    ValidateTarget --> PreparePackage : valid / prepare deployment package
    ValidateTarget --> DeployError : invalid / deployment target error
    PreparePackage --> TransferPackage : transfer deployment package
    TransferPackage --> InstallPackage : succeeded / install deployment package
    TransferPackage --> DeployError : failed / transfer failed
    InstallPackage --> ConfigureSystem : succeeded / configure system
    InstallPackage --> DeployError : failed / installation failed
    ConfigureSystem --> VerifyDeploy : succeeded / verify deployment
    ConfigureSystem --> DeployError : failed / configuration failed
    VerifyDeploy --> DeploySuccess : succeeded / deployment succeeded
    VerifyDeploy --> DeployError : failed / verification failed
    DeployError --> [*]
    DeploySuccess --> [*]
  }
  %% Monitoring process states
  state MonitorProcess {
    [*] --> InitializeMonitors : initialize monitors
    InitializeMonitors --> CollectMetrics : collect metrics
    CollectMetrics --> AnalyzeData : analyze data
    AnalyzeData --> CheckThresholds : check thresholds
    CheckThresholds --> ContinueMonitor : normal / continue monitoring
    CheckThresholds --> TriggerAlert : abnormal / trigger alert
    ContinueMonitor --> CollectMetrics : collect again
    TriggerAlert --> NotifyAdmin : notify administrator
    NotifyAdmin --> ContinueMonitor
  }
  %% Recording process states
  state RecordProcess {
    [*] --> SelectChannels : select channels to record
    SelectChannels --> CreateTaskDir : create task directory
    CreateTaskDir --> StartBagRecord : start bag recording
    StartBagRecord --> MonitorDiskUsage : monitor disk usage
    MonitorDiskUsage --> ContinueRecord : enough space / continue recording
    MonitorDiskUsage --> StopAndAlert : low space / stop and alert
    ContinueRecord --> MonitorDiskUsage : monitor again
    StopAndAlert --> CompressData : compress data
    CompressData --> RecordComplete : recording complete
    RecordComplete --> [*]
  }
  %% Cleanup process states
  state CleanProcess {
    [*] --> ScanData : scan data
    ScanData --> IdentifyOldData : identify old data
    IdentifyOldData --> ConfirmClean : confirm cleanup
    ConfirmClean --> ExecuteClean : run cleanup
    ExecuteClean --> VerifyClean : verify cleanup
    VerifyClean --> CleanComplete : succeeded / cleanup complete
    VerifyClean --> CleanError : failed / cleanup error
    CleanError --> [*]
    CleanComplete --> [*]
  }
  %% Error handling states
  state ErrorHandling {
    [*] --> LogError : log error
    LogError --> AssessSeverity : assess severity
    AssessSeverity --> SystemShutdown : fatal error / shut down system
    AssessSeverity --> AttemptRecovery : non-fatal error / attempt recovery
    AttemptRecovery --> ReturnToReady : recovery succeeded / return to ready
    AttemptRecovery --> SystemShutdown : recovery failed / shut down system
    SystemShutdown --> [*]
    ReturnToReady --> [*]
  }
  %% Links into error handling
  BuildProcess --> ErrorHandling : build error
  ModuleStart --> ErrorHandling : start error
  ModuleStop --> ErrorHandling : stop error
  TestProcess --> ErrorHandling : test error
  DeployProcess --> ErrorHandling : deployment error
  RecordProcess --> ErrorHandling : recording error
  CleanProcess --> ErrorHandling : cleanup error
  %% Return to system ready
  BuildComplete --> SystemReady
  ModuleStarted --> SystemReady
  ModuleStopped --> SystemReady
  TestsComplete --> SystemReady
  DeploySuccess --> SystemReady
  RecordComplete --> SystemReady
  CleanComplete --> SystemReady
  ReturnToReady --> SystemReady
  ModuleRunning --> SystemReady
  ModuleNotRunning --> SystemReady
  %% System shutdown
  SystemReady --> SystemShutdown : shutdown signal received
  ErrorHandling --> SystemShutdown : shutdown on system error
  SystemShutdown --> [*]

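6. Source Code Analysis

6.1. Core Initialization Scripts

6.1.1. apollo_base.sh Initialization Flow

apollo_base.sh is the foundation of all Apollo scripts. It initializes environment variables and defines the shared helper functions.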

#!/usr/bin/env bash
TOP_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd -P)"
source ${TOP_DIR}/scripts/apollo.bashrc

ARCH="$(uname -m)"

APOLLO_OUTSIDE_DOCKER=0
CMDLINE_OPTIONS=
SHORTHAND_TARGETS=
DISABLED_TARGETS=

: ${CROSSTOOL_VERBOSE:=0}
: ${NVCC_VERBOSE:=0}
: ${HIPCC_VERBOSE:=0}

: ${USE_ESD_CAN:=false}
USE_GPU=-1

use_cpu=-1
use_gpu=-1
use_nvidia=-1
use_amd=-1

ENABLE_PROFILER=true

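The initialization flow consists of the following steps:

  1. Set the top-level directory path (TOP_DIR)
  2. Load the base bash configuration (apollo.bashrc)
  3. Detect the system architecture (ARCH)
  4. Initialize the flags and configuration defaults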
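
Step 4 relies on the ": ${VAR:=default}" idiom, which assigns a default only when the variable is unset or empty, so a value exported by the caller takes precedence. A minimal sketch of that behavior follows; apply_defaults, DEMO_VERBOSE, and DEMO_USE_CAN are hypothetical names used only for illustration.

```shell
# Hypothetical helper illustrating the ": ${VAR:=default}" idiom.
apply_defaults() {
    # ":" is the no-op builtin; the expansion assigns the default only
    # when the variable is unset or empty.
    : "${DEMO_VERBOSE:=0}"
    : "${DEMO_USE_CAN:=false}"
}

unset DEMO_VERBOSE DEMO_USE_CAN
apply_defaults
echo "verbose=${DEMO_VERBOSE} can=${DEMO_USE_CAN}"

DEMO_VERBOSE=1      # a value set by the caller is preserved
apply_defaults
echo "verbose=${DEMO_VERBOSE} can=${DEMO_USE_CAN}"
```

Plain assignments such as USE_GPU=-1, by contrast, always overwrite whatever the caller had set.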

6.1.2. Environment Path Configuration

function set_lib_path() {
  local CYBER_SETUP="${APOLLO_ROOT_DIR}/cyber/setup.bash"
  [ -e "${CYBER_SETUP}" ] && . "${CYBER_SETUP}"
  pathprepend ${APOLLO_ROOT_DIR}/modules/tools PYTHONPATH
  pathprepend ${APOLLO_ROOT_DIR}/modules/teleop/common PYTHONPATH
  pathprepend /apollo/modules/teleop/common/scripts
}

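This function configures search paths so that modules can be imported correctly. It first checks whether the CyberRT environment file (cyber/setup.bash) exists and sources it if so, then prepends two module directories to PYTHONPATH. The last call names no variable; judging by the call signature, pathprepend then presumably falls back to PATH, which would put the teleop scripts directory on the executable search path.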
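
To make the call shape concrete, here is a minimal sketch of a prepend helper. It is not the code from scripts/apollo.bashrc: demo_pathprepend is a hypothetical name, and both the argument order (directory first, optional variable name second) and the PATH default are assumptions inferred from the calls above.

```shell
# Hypothetical re-implementation for illustration; the real helper lives in
# scripts/apollo.bashrc and may behave differently.
demo_pathprepend() {
    local dir var current
    dir="$1"
    var="${2:-PATH}"                 # assumed default when no variable is named
    eval "current=\${${var}:-}"      # read the variable whose name is in $var
    case ":${current}:" in
        *":${dir}:"*) return 0 ;;    # already present, leave the value alone
    esac
    if [ -n "${current}" ]; then
        eval "export ${var}=\"\${dir}:\${current}\""
    else
        eval "export ${var}=\"\${dir}\""
    fi
}

DEMO_PYPATH="/opt/lib"
demo_pathprepend "/apollo/modules/tools" DEMO_PYPATH
demo_pathprepend "/apollo/modules/tools" DEMO_PYPATH   # second call is a no-op
echo "${DEMO_PYPATH}"
```

The duplicate check keeps repeated sourcing of the base script from growing the variable without bound.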

6.1.3. Data Directory Creation

function create_data_dir() {
    local DATA_DIR="${APOLLO_ROOT_DIR}/data"
    mkdir -p "${DATA_DIR}/log"
    mkdir -p "${DATA_DIR}/bag"
    mkdir -p "${DATA_DIR}/core"
}

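Creates the required data directories for logs (log), recorded data bags (bag), and core dumps (core).

6.2. Device Initialization Mechanism

6.2.1. Device Initialization Flow

Depending on the system architecture, the script calls a different device initialization function: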

function setup_device() {
    if [ "$(uname -s)" != "Linux" ]; then
        info "Not on Linux, skip mapping devices."
        return
    fi
    if [[ "${ARCH}" == "x86_64" ]]; then
        setup_device_for_amd64
    else
        setup_device_for_aarch64
    fi
}

6.2.2. x86_64 Device Initialization

function setup_device_for_amd64() {
    # setup CAN device
    local NUM_PORTS=8
    for i in $(seq 0 $((${NUM_PORTS} - 1))); do
        if [[ -e /dev/can${i} ]]; then
            continue
        elif [[ -e /dev/zynq_can${i} ]]; then
            # soft link if sensorbox exist
            sudo ln -s /dev/zynq_can${i} /dev/can${i}
        else
            break
            # sudo mknod --mode=a+rw /dev/can${i} c 52 ${i}
        fi
    done

    # Check Nvidia device
    if [[ ! -e /dev/nvidia0 ]]; then
        warning "No device named /dev/nvidia0"
    fi
    if [[ ! -e /dev/nvidiactl ]]; then
        warning "No device named /dev/nvidiactl"
    fi
    if [[ ! -e /dev/nvidia-uvm ]]; then
        warning "No device named /dev/nvidia-uvm"
    fi
    if [[ ! -e /dev/nvidia-uvm-tools ]]; then
        warning "No device named /dev/nvidia-uvm-tools"
    fi
    if [[ ! -e /dev/nvidia-modeset ]]; then
        warning "No device named /dev/nvidia-modeset"
    fi
}

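For up to eight CAN ports, this function leaves an existing /dev/canN node alone, creates a symbolic link from /dev/zynq_canN when only that node exists, and stops scanning at the first port that has neither. It then prints a warning for each NVIDIA GPU device node that is missing.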
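
The five NVIDIA checks share one shape, so they can also be expressed as a loop over device paths. The sketch below is an illustration rather than Apollo code; report_missing_devices is a hypothetical helper, and it prints with echo instead of the warning function so that it runs on its own.

```shell
# Hypothetical helper: print each path that does not exist and return
# non-zero if at least one path is missing.
report_missing_devices() {
    local missing=0 dev
    for dev in "$@"; do
        if [ ! -e "${dev}" ]; then
            echo "No device named ${dev}"
            missing=1
        fi
    done
    return "${missing}"
}

# "|| true" keeps a missing GPU from aborting a caller that uses "set -e".
report_missing_devices /dev/nvidia0 /dev/nvidiactl /dev/nvidia-uvm \
    /dev/nvidia-uvm-tools /dev/nvidia-modeset || true
```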

6.2.3. aarch64 Device Initialization

function setup_device_for_aarch64() {
    local can_dev="/dev/can0"
    local socket_can_dev="can0"
    if [ ! -e "${can_dev}" ]; then
        warning "No CAN device named ${can_dev}. "
    fi

    if [[ -x "$(command -v ip)" ]]; then
        if ! ip link show type can | grep "${socket_can_dev}" &> /dev/null; then
            warning "No SocketCAN device named ${socket_can_dev}."
        else
            sudo modprobe can
            sudo modprobe can_raw
            sudo modprobe mttcan
            sudo ip link set "${socket_can_dev}" type can bitrate 500000 sjw 4 berr-reporting on loopback off
            sudo ip link set up "${socket_can_dev}"
        fi
    else
        warning "ip command not found."
    fi
}

6.3. Module Management Mechanism

6.3.1. Module Startup Flow

The core function for starting a module:

function start_customized_path() {
    MODULE_PATH=$1
    MODULE=$2
    shift 2

    is_stopped_customized_path "${MODULE_PATH}" "${MODULE}"
    if [ $? -eq 1 ]; then
        # todo(zero): Better to move nohup.out to data/log/nohup.out
        eval "nohup cyber_launch start ${APOLLO_ROOT_DIR}/modules/${MODULE_PATH}/launch/${MODULE}.launch &"
        sleep 0.5
        is_stopped_customized_path "${MODULE_PATH}" "${MODULE}"
        if [ $? -eq 0 ]; then
            ok "Launched module ${MODULE}."
            return 0
        else
            error "Could not launch module ${MODULE}. Is it already built?"
            return 1
        fi
    else
        info "Module ${MODULE} is already running - skipping."
        return 2
    fi
}

6.3.2. Module Status Check

function is_stopped_customized_path() {
    MODULE_PATH=$1
    MODULE=$2
    NUM_PROCESSES="$(pgrep -f "modules/${MODULE_PATH}/launch/${MODULE}.launch" | grep -cv '^1$')"
    if [ "${NUM_PROCESSES}" -eq 0 ]; then
        return 1
    else
        return 0
    fi
}

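This function checks whether a module is stopped. Note the inverted return convention: it returns 1 when no matching launch process is found (the module is stopped) and 0 when at least one is running, which is why start_customized_path tests for a return value of 1 before launching.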
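
Because that convention reads backwards inside an if statement, a caller may prefer a wrapper whose exit status follows the usual shell meaning. The sketch below assumes pgrep is available; count_launch_processes and is_module_running are hypothetical names, not functions from apollo_base.sh.

```shell
# Hypothetical helper: count processes whose command line references the
# module's launch file (same pgrep pattern as in the function above).
count_launch_processes() {
    pgrep -f "modules/$1/launch/$2.launch" | grep -cv '^1$' || true
}

# Hypothetical wrapper with conventional shell semantics:
# exit status 0 means "running", non-zero means "not running".
is_module_running() {
    local count
    count="$(count_launch_processes "$1" "$2")"
    [ "${count:-0}" -gt 0 ]
}

if is_module_running planning planning; then
    echo "planning is running"
else
    echo "planning is not running"
fi
```

Separating the counting step from the decision also makes the decision testable with a stubbed counter.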

6.4. Build System Implementation

6.4.1. Determining Build Targets

function determine_build_targets() {
    local targets_all
    if [[ "$#" -eq 0 ]]; then
        targets_all="$(python3 ${TOP_DIR}/scripts/find_all_package.py)"
        echo "${targets_all}"
        return
    fi

    for component in $@; do
        local build_targets
        if [ "${component}" = "cyber" ]; then
            build_targets="cyber"
        elif [[ -d "${TOP_DIR}/modules/${component}" ]]; then
            build_targets="modules/${component}"
        else
            error "Directory ${TOP_DIR}/modules/${component} not found. Exiting ..."
            exit 1
        fi
        if [ -z "${targets_all}" ]; then
            targets_all="${build_targets}"
        else
            targets_all="${targets_all} ${build_targets}"
        fi
    done
    echo "${targets_all}"
}

6.4.2. Build Argument Handling

The script uses getopts to process build arguments:

while getopts ":cdef:g:hij:mn:pt:uv" opt; do
  case $opt in
    c)
      ACTION=clean
      ;;
    d)
      if [ -z "${SHORTHAND_TARGETS}" ]; then
        SHORTHAND_TARGETS="all"
      fi
      USE_DBG=1
      ;;
    e)
      ENABLE_PROFILER=false
      ;;
    f)
      ADDTIONAL_OPTIONS="${ADDTIONAL_OPTIONS} --compilation_mode=${OPTARG}"
      ;;
    g)
      ADDTIONAL_OPTIONS="${ADDTIONAL_OPTIONS} --cxxopt=-g${OPTARG}"
      ;;
    h)
      usage
      exit 0
      ;;
    i)
      USE_OPT=1
      ;;
    j)
      ADDTIONAL_OPTIONS="${ADDTIONAL_OPTIONS} -j${OPTARG}"
      ;;
    m)
      USE_GPU=0
      ;;
    n)
      ADDTIONAL_OPTIONS="${ADDTIONAL_OPTIONS} --jobs=${OPTARG}"
      ;;
    p)
      ACTION=build
      ;;
    t)
      if [ -z "${SHORTHAND_TARGETS}" ]; then
        SHORTHAND_TARGETS="all"
      fi
      ADDTIONAL_OPTIONS="${ADDTIONAL_OPTIONS} --test_timeout=${OPTARG}"
      ;;
    u)
      USE_GPU=1
      ;;
    v)
      set -x
      ;;
    \?)
      echo "Invalid option: -$OPTARG" >&2
      exit 1
      ;;
    :)
      echo "Option -$OPTARG requires an argument." >&2
      exit 1
      ;;
  esac
done
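
The ":)" and "\?)" branches in the loop above behave as written only when getopts runs in silent mode, which is selected by a leading colon in the option string. The sketch below shows that mode on a reduced option set; parse_build_args, JOBS, and the two-option string are hypothetical and chosen only for illustration.

```shell
# Hypothetical parser for illustration; only -c and -j (takes a value) exist.
parse_build_args() {
    local opt
    OPTIND=1            # reset so the function can be called repeatedly
    JOBS=""
    ACTION="build"
    # The leading ':' selects silent error reporting: a missing argument
    # yields opt=':' and an unknown option yields opt='?', both with the
    # offending option letter in OPTARG.
    while getopts ":cj:" opt; do
        case "${opt}" in
            c) ACTION="clean" ;;
            j) JOBS="${OPTARG}" ;;
            :) echo "Option -${OPTARG} requires an argument." >&2; return 1 ;;
            \?) echo "Invalid option: -${OPTARG}" >&2; return 1 ;;
        esac
    done
}

parse_build_args -c -j 8 && echo "action=${ACTION} jobs=${JOBS}"
```

Without the leading colon, getopts prints its own diagnostic, reports both error kinds as '?', and leaves OPTARG unset.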

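6.5. Configuration Management Scripts

6.5.1. Switching Vehicle Configuration

The following function switches the active vehicle configuration at runtime: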

function switch_vehicle() {
    local vehicle_id="$1"
    local calibration_root="${APOLLO_ROOT_DIR}/modules/calibration/data"
    local vehicle_dir="${calibration_root}/${vehicle_id}"

    # Reject an empty id as well: it would resolve to the data root itself
    if [ -z "${vehicle_id}" ] || [ ! -d "${vehicle_dir}" ]; then
        error "Invalid vehicle id: ${vehicle_id}. Directory does not exist: ${vehicle_dir}"
        usage
        # Stop here so an invalid id never replaces the current link
        return 1
    fi

    # Point the "current" symbolic link at the selected calibration data
    rm -rf "${calibration_root}/current"
    ln -s "${vehicle_dir}" "${calibration_root}/current"

    ok "Successfully switched to vehicle: ${vehicle_id}"
}

7. Design Patterns

7.1. Template Method Pattern

function start_customized_path() {
    MODULE_PATH=$1
    MODULE=$2
    shift 2

    is_stopped_customized_path "${MODULE_PATH}" "${MODULE}"  # check status
    if [ $? -eq 1 ]; then                                   # algorithm skeleton
        eval "nohup cyber_launch start ${APOLLO_ROOT_DIR}/modules/${MODULE_PATH}/launch/${MODULE}.launch &"
        sleep 0.5
        is_stopped_customized_path "${MODULE_PATH}" "${MODULE}"
        if [ $? -eq 0 ]; then
            ok "Launched module ${MODULE}."
            return 0
        else
            error "Could not launch module ${MODULE}. Is it already built?"
            return 1
        fi
    else
        info "Module ${MODULE} is already running - skipping."
        return 2
    fi
}

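This function fixes the general launch sequence (check status, launch, verify), while the module name and path are supplied by the caller, that is, by each module's own launch script. Shell has no subclasses, so the per-module scripts play the role that subclasses play in an object-oriented template method.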
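
A closer shell analogue of the template method keeps the sequence in one function and lets callers redefine individual steps. The following is a sketch under that assumption; launch_with_hooks and the three hook names are hypothetical and do not exist in the Apollo scripts.

```shell
# Hypothetical skeleton: the fixed sequence lives in one function, while the
# individual steps are functions that a module script may redefine.
pre_launch()  { :; }                         # default hook: nothing to do
do_launch()   { echo "launching $1"; }       # default launch step
post_launch() { :; }                         # default hook: nothing to do

launch_with_hooks() {
    pre_launch "$1" || return 1
    do_launch "$1" || return 1
    post_launch "$1"
}

# A module-specific script overrides only the step it cares about.
pre_launch() { echo "checking calibration for $1"; }
launch_with_hooks planning
```

Redefining a function after the skeleton is sourced works because shell resolves function names at call time.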

7.2. Strategy Pattern

For device initialization, Apollo Scripts uses the strategy pattern to handle different architectures:

function setup_device() {
    if [ "$(uname -s)" != "Linux" ]; then
        info "Not on Linux, skip mapping devices."
        return
    fi
    if [[ "${ARCH}" == "x86_64" ]]; then
        setup_device_for_amd64  # x86_64 strategy
    else
        setup_device_for_aarch64  # aarch64 strategy
    fi
}

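Here, setup_device_for_amd64 and setup_device_for_aarch64 are two device setup strategies, and the script picks one based on the current architecture. Any architecture other than x86_64 falls through to the aarch64 strategy.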
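
The selection step can be separated from the strategies themselves by mapping the architecture string to a function name and calling it indirectly. In the sketch below every function name is hypothetical, and the strategies are stubs that only print a line.

```shell
# Hypothetical selector: map an architecture string to the name of the
# function that handles it; unknown architectures are rejected.
select_device_strategy() {
    case "$1" in
        x86_64)  echo "demo_setup_amd64" ;;
        aarch64) echo "demo_setup_aarch64" ;;
        *)       return 1 ;;
    esac
}

demo_setup_amd64()   { echo "amd64 device setup"; }
demo_setup_aarch64() { echo "aarch64 device setup"; }

if strategy="$(select_device_strategy "$(uname -m)")"; then
    "${strategy}"        # call the selected strategy by name
else
    echo "unsupported architecture: $(uname -m)" >&2
fi
```

Unlike the if/else above, this variant reports an unknown architecture instead of treating it as aarch64, which is a deliberate difference in the sketch.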

7.3. Factory Pattern

The build system uses a factory-style function to produce build targets:

function determine_build_targets() {
    # ...
    for component in $@; do
        local build_targets
        if [ "${component}" = "cyber" ]; then
            build_targets="cyber"
        elif [[ -d "${TOP_DIR}/modules/${component}" ]]; then
            build_targets="modules/${component}"
        else
            error "Directory ${TOP_DIR}/modules/${component} not found. Exiting ..."
            exit 1
        fi
        # ...
    done
    # ...
}

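The function maps each input component to a build target (cyber, or a directory under modules/), which is the factory pattern in spirit.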
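
The per-component mapping can be tried in isolation against a temporary directory tree. The helper below mirrors the branch logic but is a sketch: resolve_build_target is a hypothetical name, and it takes the source root as a parameter instead of reading TOP_DIR.

```shell
# Hypothetical helper mirroring the per-component branch: map one component
# name to a build target path under a given source root.
resolve_build_target() {
    local root="$1" component="$2"
    if [ "${component}" = "cyber" ]; then
        echo "cyber"
    elif [ -d "${root}/modules/${component}" ]; then
        echo "modules/${component}"
    else
        echo "Directory ${root}/modules/${component} not found." >&2
        return 1
    fi
}

demo_root="$(mktemp -d)"
mkdir -p "${demo_root}/modules/planning"
resolve_build_target "${demo_root}" cyber
resolve_build_target "${demo_root}" planning
rm -rf "${demo_root}"
```

Returning a status code rather than calling exit leaves the decision to abort with the caller.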

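7.4. Singleton Pattern

Environment variables and global configuration are initialized once for the whole script system and then used directly by later scripts, which mirrors the singleton pattern: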

TOP_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd -P)"

This keeps global state consistent across scripts.
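
One common way to enforce "initialize once" in shell is a guard variable that short-circuits repeated initialization. Whether apollo_base.sh uses such a guard is not shown in the excerpts here, so the sketch below is an assumption; all names in it are hypothetical.

```shell
# Hypothetical guard variable and counter; not part of the scripts shown above.
DEMO_INIT_COUNT=0

demo_init_environment() {
    # Return early if initialization already ran in this shell.
    if [ -n "${_DEMO_ENV_INITIALIZED:-}" ]; then
        return 0
    fi
    _DEMO_ENV_INITIALIZED=1
    DEMO_INIT_COUNT=$((DEMO_INIT_COUNT + 1))
    DEMO_TOP_DIR="$(pwd -P)"
}

demo_init_environment
demo_init_environment      # second call is a no-op
echo "initialized ${DEMO_INIT_COUNT} time(s)"
```

The guard only holds within one shell process; a child process starts fresh unless the variable is exported.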

7.5. Adapter Pattern

if [ -f /.dockerenv ]; then
    APOLLO_IN_DOCKER=true
else
    APOLLO_IN_DOCKER=false
fi

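The script detects whether it is running inside or outside a Docker container and exposes the result through a single environment variable, APOLLO_IN_DOCKER.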
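
Wrapping the check in a function lets callers branch on the flag rather than on the filesystem. In the sketch below, detect_docker and the DOCKER_MARKER_FILE override are hypothetical additions made so that the check can be exercised outside a container.

```shell
# Hypothetical wrapper around the /.dockerenv check. The marker path can be
# overridden through DOCKER_MARKER_FILE (an assumption made for testability).
detect_docker() {
    if [ -f "${DOCKER_MARKER_FILE:-/.dockerenv}" ]; then
        APOLLO_IN_DOCKER=true
    else
        APOLLO_IN_DOCKER=false
    fi
}

# Callers branch on the single flag instead of probing the filesystem.
detect_docker
if [ "${APOLLO_IN_DOCKER}" = "true" ]; then
    echo "running inside a container"
else
    echo "running on the host"
fi
```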

7.6. Command Pattern

Module management follows the command pattern by wrapping each operation in a named function:

function start() {
    MODULE=$1
    shift

    start_customized_path $MODULE $MODULE "$@"
}

function stop() {
    MODULE=$1

    pkill -f "modules/${MODULE}/launch/${MODULE}.launch" || true
    sleep 1
}
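
With operations wrapped as functions, a small dispatcher can turn a sub-command name into a function call. The sketch below uses stand-in functions so it runs without cyber_launch; run_command, cmd_start, and cmd_stop are hypothetical names.

```shell
# Hypothetical stand-ins for the real start/stop functions.
cmd_start() { echo "start $1"; }
cmd_stop()  { echo "stop $1"; }

# Hypothetical dispatcher: translate a sub-command name into a function call.
run_command() {
    local action="$1"
    shift
    case "${action}" in
        start|stop) "cmd_${action}" "$@" ;;
        *) echo "Unknown command: ${action}" >&2; return 1 ;;
    esac
}

run_command start planning
run_command stop planning
```

Listing the accepted actions in the case pattern keeps arbitrary input from being turned into a function name.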

7.7. Observer Pattern

The monitoring script plays the observer role, watching for system events and reacting to them:

while true; do
    check_system_status
    check_module_health
    sleep $MONITOR_INTERVAL
done

The monitor acts as the observer by polling: it checks system status and module health once every MONITOR_INTERVAL seconds.
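
A version closer to the classic observer keeps a list of handlers and notifies each of them when a check fails. The sketch below bounds the loop so that it terminates; every function and variable name in it is hypothetical.

```shell
# Hypothetical registry: a space-separated list of handler function names.
STATUS_HANDLERS=""

register_handler() {
    STATUS_HANDLERS="${STATUS_HANDLERS} $1"
}

# Call every registered handler with the event text.
notify_handlers() {
    local handler
    for handler in ${STATUS_HANDLERS}; do
        "${handler}" "$1"
    done
}

# Poll a check function a fixed number of times and notify on failure.
# Arguments: check function name, number of rounds, seconds between rounds.
poll_status() {
    local check="$1" rounds="$2" interval="$3" i=0
    while [ "${i}" -lt "${rounds}" ]; do
        "${check}" || notify_handlers "${check} failed in round ${i}"
        i=$((i + 1))
        sleep "${interval}"
    done
}

log_alert()  { echo "ALERT: $1"; }
disk_check() { return 1; }       # stub that always reports a problem
register_handler log_alert
poll_status disk_check 2 0
```

A production monitor would loop indefinitely, as in the excerpt above; the round limit here exists only to keep the example finite.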

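Together, these patterns give Apollo Scripts good extensibility, maintainability, and flexibility, and provide dependable script support for the Apollo autonomous driving platform.

8. Summary

Through a set of carefully designed shell scripts, the apollo_scripts module automates building, deploying, running, and testing the Apollo system. The design is modular: base scripts provide shared functionality and specialized scripts handle specific tasks, which together form a complete script ecosystem.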
