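Kotlin: Coroutines provide powerful tools for managing asynchronous operations, and Job and SupervisorJob are at the core of that system. Understanding them is essential for building robust, fault-tolerant applications. This article walks through how they differ, when to use each, and which practices work best.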
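I. Concepts
What is a Job?
A Job is a cancellable unit of work with a lifecycle. Every coroutine has an associated Job that controls its execution.
You can think of a Job as a handle to the coroutine. It lets you:
- Cancel the coroutine
- Wait for it to complete
- Check its state (active, completed, cancelled)
- Establish parent-child relationships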
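The four handle operations listed above can be sketched in one small function. This is only an illustration: jobHandleDemo is a hypothetical helper name, and the 10-second delay stands in for real work.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: drives one coroutine through its Job handle.
suspend fun jobHandleDemo(): List<Boolean> = coroutineScope {
    val job: Job = launch { delay(10_000) }               // launch returns the handle
    val isChild = job in coroutineContext[Job]!!.children // parent-child link to this scope
    val activeBefore = job.isActive                       // check state
    job.cancel()                                          // cancel the coroutine
    job.join()                                            // wait for it to finish
    listOf(isChild, activeBefore, job.isCancelled, job.isCompleted)
}
```

Called from runBlocking, the function should return four true values: the job is a child of the enclosing scope, it was active before the cancel, and afterwards it is both cancelled and completed.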
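What is a SupervisorJob?
A SupervisorJob is a special kind of Job that changes how failures and cancellation propagate.
With a SupervisorJob:
- Child failures do not propagate upward: a failed child does not cancel the parent
- Downward propagation still works: cancelling the parent still cancels its children
- Siblings are independent: one child's failure does not affect its siblings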
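A minimal sketch of these three rules, under a few assumptions: supervisorRulesDemo is a hypothetical name, the scope runs on Dispatchers.Default, and the empty CoroutineExceptionHandler is there only so the failure is not reported through the default uncaught-exception path.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: one failing child, one healthy sibling, one SupervisorJob parent.
suspend fun supervisorRulesDemo(): List<Boolean> {
    val supervisor = SupervisorJob()
    val quiet = CoroutineExceptionHandler { _, _ -> } // swallow the report for this demo only
    val scope = CoroutineScope(Dispatchers.Default + supervisor + quiet)

    val failing = scope.launch { throw IllegalStateException("boom") }
    val sibling = scope.launch { delay(10_000) }
    failing.join()

    // Upward and sideways: the parent and the sibling are unaffected by the failure
    val afterFailure = listOf(supervisor.isActive, sibling.isActive)

    // Downward: cancelling the parent still cancels the remaining child
    supervisor.cancelAndJoin()
    return afterFailure + sibling.isCancelled
}
```

The expected result is three true values: the supervisor is still active, the sibling is still active, and the sibling ends up cancelled once the supervisor is cancelled.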
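II. Job Lifecycle States
- New: the Job has been created but not yet started (lazily started coroutines only)
- Active: the Job is running
- Completing: the body has finished and the Job is waiting for its children
- Completed: the Job finished successfully
- Cancelling: the Job is being cancelled
- Cancelled: the Job has been cancelled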
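To make the normal path through these states concrete, here is a small sketch that samples the three public flags of a lazily started coroutine as it moves from New to Active to Completed. The names lazyLifecycle and flags are made up for the example.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: records isActive/isCompleted/isCancelled at three points in time.
suspend fun lazyLifecycle(): List<String> = coroutineScope {
    val job = launch(start = CoroutineStart.LAZY) { delay(50) }
    fun flags() = "${job.isActive}/${job.isCompleted}/${job.isCancelled}"

    val created = flags()   // New: nothing has started yet
    job.start()
    val started = flags()   // Active: start() switches the state immediately
    job.join()
    listOf(created, started, flags()) // last entry is sampled in Completed
}
```

The three entries should read false/false/false, true/false/false and false/true/false.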

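III. Job States in Detail, with Examples
1. New
- A Job that has been created but not yet started
- Only a coroutine created with start = CoroutineStart.LAZY begins in this state; start it with start() or join()
```kotlin
// Job() is created already Active; only a lazily started coroutine begins in New
val job = launch(start = CoroutineStart.LAZY) { println("Running") } // New
job.start() // New -> Active
```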
2. Active
- The Job is running
- It can create child coroutines
- Property values: isActive = true, isCompleted = false, isCancelled = false
```kotlin
val job = launch {
    // The body is executing: the Job is Active
    delay(1000)
    println("Done")
}
```
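3. Completing
- The coroutine body has finished executing
- The Job is waiting for all of its children to complete
- Property values: isActive = true, isCompleted = false (children are still running). Completing is an internal state, so to an outside observer the Job still looks Active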
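The Completing state can be observed indirectly by letting a child outlive its parent's body. The sketch below assumes made-up delays of 100 ms and 300 ms, and completingDemo is a hypothetical name.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: samples the parent's flags while it waits for a child.
suspend fun completingDemo(): Pair<Boolean, Boolean> = coroutineScope {
    val parent = launch {
        launch { delay(300) } // the child outlives the parent's body
        // The parent's body ends here, so the parent is now Completing
    }
    delay(100)
    // Body finished, child still running: sampled as isActive to isCompleted
    val whileCompleting = parent.isActive to parent.isCompleted
    parent.join()
    whileCompleting
}
```

The sampled pair should be (true, false): the parent reports itself as active and not yet completed until the child finishes.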
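4. Completed
- The coroutine and all of its children have completed normally
- Property values: isActive = false, isCompleted = true, isCancelled = false
```kotlin
val job = launch {
    // Runs to completion normally
}
job.join() // Suspends until the Job reaches Completed
```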
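5. Cancelling
- A cancellation request has been received and cancellation is in progress
- The coroutine may still be running a finally block; suspending calls made there must be wrapped in withContext(NonCancellable)
- Property values: isActive = false, isCompleted = false, isCancelled = true
```kotlin
val job = launch {
    try {
        delay(10000)
    } finally {
        // Runs while the Job is Cancelling. A plain delay() here would throw
        // CancellationException immediately, so suspend inside NonCancellable.
        withContext(NonCancellable) {
            delay(1000)
        }
    }
}
delay(100)
job.cancel() // Active -> Cancelling
```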
6. Cancelled
- Cancellation has finished
- Property values: isActive = false, isCompleted = true (note: a cancelled Job also counts as completed), isCancelled = true
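A short sketch comparing the two terminal states side by side. terminalFlags is a hypothetical helper name, and each pair in the result is isCompleted followed by isCancelled.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: one job finishes normally, the other is cancelled.
suspend fun terminalFlags(): Map<String, Pair<Boolean, Boolean>> = coroutineScope {
    val finished = launch { delay(50) }
    val cancelled = launch { delay(10_000) }
    finished.join()
    cancelled.cancelAndJoin()
    mapOf(
        "finished" to (finished.isCompleted to finished.isCancelled),
        "cancelled" to (cancelled.isCompleted to cancelled.isCancelled),
    )
}
```

Both jobs report isCompleted = true; only isCancelled tells them apart.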
7. Checking State
```kotlin
val job = launch {
    // Coroutine body
}
// Inspect the state
println("isActive: ${job.isActive}")       // is it active?
println("isCompleted: ${job.isCompleted}") // has it finished?
println("isCancelled: ${job.isCancelled}") // was it cancelled?
// Wait for completion
job.join()
// Cancel the Job
job.cancel()
// Cancel and wait for completion
job.cancelAndJoin()
```
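8. State Transition Example
```kotlin
import kotlinx.coroutines.*

fun main() = runBlocking {
    val job = launch {
        println("1. Job started - Active")
        try {
            delay(2000)
            // Not reached in this run: the Job is cancelled after 500 ms
            println("2. Normal completion - Completing → Completed")
        } finally {
            println("3. Finally block - Cancelling")
            // A cancelled coroutine can only suspend inside NonCancellable
            withContext(NonCancellable) { delay(500) }
        }
    }
    delay(500)
    job.cancel() // Active → Cancelling
    job.join()   // Wait until the Job reaches Cancelled
    println("4. Final state: isCancelled=${job.isCancelled}")
}
```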
IV. Important Notes on Job
1. Completed and Cancelled are both terminal states
- isCompleted = true holds for both
- Use isCancelled to tell them apart
2. State checks are snapshots
- The state can change immediately after you read it
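A tiny sketch of why a state flag should not be cached and reused later. staleSnapshot is a hypothetical name and the delays are arbitrary.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: compares an earlier isActive reading with a later one.
suspend fun staleSnapshot(): Pair<Boolean, Boolean> = coroutineScope {
    val job = launch { delay(50) }
    val snapshot = job.isActive // true at this instant
    delay(200)                  // the job finishes in the meantime
    snapshot to job.isActive    // the earlier answer no longer holds
}
```

The result should be (true, false). When the code needs to react to completion, waiting with join() is usually more reliable than polling a flag.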
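3. Parent-child relationships affect state
- Cancelling a parent Job cancels all of its child Jobs
- A parent Job reaches Completed only after all of its child Jobs have completed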
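The downward direction can be checked with a sketch like the following, where parentCancelsChildren is a hypothetical helper that starts two children and then cancels their parent.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: returns how many children existed and whether all were cancelled.
suspend fun parentCancelsChildren(): Pair<Int, Boolean> = coroutineScope {
    val parent = launch {
        repeat(2) { launch { delay(10_000) } }
    }
    delay(100)                              // give the parent time to start its children
    val children = parent.children.toList() // capture them before they are detached
    parent.cancelAndJoin()                  // returns only after the children have finished
    children.size to children.all { it.isCancelled }
}
```

The expected result is (2, true): both children were cancelled together with their parent.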
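4. Structured concurrency
- Cancelling a parent coroutine's scope propagates to all of its child coroutines
- A parent coroutine waits for all of its child coroutines to complete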
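The waiting half of structured concurrency can be seen by logging the order of events around a coroutineScope call. This is a sketch only: waitsForChildren and the log messages are invented for the example.

```kotlin
import kotlinx.coroutines.*
import java.util.Collections

// Hypothetical helper: records the order in which the scope body and its children finish.
suspend fun waitsForChildren(): List<String> {
    val log = Collections.synchronizedList(mutableListOf<String>())
    coroutineScope {
        launch { delay(200); log += "slow child" }
        launch { delay(50); log += "fast child" }
        log += "scope body done" // the body ends long before the children do
    }
    // coroutineScope returns only after both children have completed
    log += "scope returned"
    return log.toList()
}
```

The log should read: scope body done, fast child, slow child, scope returned.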
V. SupervisorJob Basics
1. Failures do not propagate
```kotlin
import kotlinx.coroutines.*

fun main() = runBlocking {
    val supervisorJob = SupervisorJob() // Parent whose children fail independently
    launch(supervisorJob) {
        println("Child 1 starts")
        delay(100)
        // Uncaught: reported through the thread's uncaught-exception handler
        throw RuntimeException("Child 1 failed!")
    }
    launch(supervisorJob) {
        println("Child 2 starts")
        delay(200)
        println("Child 2 completed") // Still runs despite child 1's failure
    }
    delay(300)
    println("Supervisor is alive: ${supervisorJob.isActive}") // true
    println("Supervisor children active: ${supervisorJob.children.count { it.isActive }}")
}
```
2. Prefer the supervisorScope builder
```kotlin
import kotlinx.coroutines.*

fun main() = runBlocking {
    supervisorScope {
        val child1 = launch {
            println("Task 1 started")
            delay(500)
            throw ArithmeticException("Task 1 calculation error!")
        }
        val child2 = launch {
            println("Task 2 started")
            delay(1000)
            println("Task 2 completed successfully")
        }
        val child3 = launch {
            println("Task 3 started")
            delay(1500)
            println("Task 3 completed successfully")
        }
        // Wait for each child in turn
        child1.join() // join() only waits; it does not rethrow child1's exception
        println("Child1 joined, isCancelled: ${child1.isCancelled}")
        child2.join() // Completes normally
        println("Child2 joined, isCompleted: ${child2.isCompleted}")
        child3.join() // Completes normally
        println("All tasks processed")
    }
    println("SupervisorScope completed")
}
```
VI. Practical SupervisorJob Use Cases
1. Downloading multiple files in parallel
```kotlin
import kotlinx.coroutines.*

class FileDownloader {
    suspend fun downloadFiles(urls: List<String>) = supervisorScope {
        val downloadJobs = urls.map { url ->
            async {
                try {
                    println("Downloading $url")
                    delay((100..500).random().toLong()) // Simulated download
                    if (url.contains("error")) {
                        throw Exception("Failed to download $url")
                    }
                    "Content of $url"
                } catch (e: CancellationException) {
                    throw e // Never swallow cancellation
                } catch (e: Exception) {
                    println("Error downloading $url: ${e.message}")
                    null // null marks a failed download
                }
            }
        }
        // Collect every result, including the failed ones
        val results = downloadJobs.map { it.await() }
        results.filterNotNull() // Keep only the successful downloads
    }
}

fun main() = runBlocking {
    val downloader = FileDownloader()
    val urls = listOf(
        "http://example.com/file1.txt",
        "http://example.com/file2-error.txt", // This one fails
        "http://example.com/file3.txt",
        "http://example.com/file4-error.txt"  // This one fails too
    )
    val successfulDownloads = downloader.downloadFiles(urls)
    println("Successfully downloaded ${successfulDownloads.size} files")
}
```
2. Microservice health checks
```kotlin
import kotlinx.coroutines.*

class HealthChecker {
    data class ServiceStatus(
        val name: String,
        val isHealthy: Boolean,
        val responseTime: Long,
        val error: String? = null
    )

    suspend fun checkAllServices() = supervisorScope {
        val services = listOf(
            "auth-service" to "http://auth:8080/health",
            "payment-service" to "http://payment:8080/health",
            "notification-service" to "http://notification:8080/health",
            "database" to "http://db:5432/health"
        )
        services.map { (name, url) ->
            async {
                try {
                    val startTime = System.currentTimeMillis()
                    // Simulated health-check request (url is unused in this simulation)
                    delay((50..500).random().toLong())
                    // Simulated random failure
                    if (Math.random() < 0.3) {
                        throw Exception("$name timeout")
                    }
                    val responseTime = System.currentTimeMillis() - startTime
                    ServiceStatus(name, isHealthy = true, responseTime)
                } catch (e: CancellationException) {
                    throw e // Never swallow cancellation
                } catch (e: Exception) {
                    ServiceStatus(name, isHealthy = false, 0, e.message)
                }
            }
        }.map { it.await() }
    }
}

fun main() = runBlocking {
    val checker = HealthChecker()
    repeat(3) { checkNumber ->
        println("\n=== Health Check #${checkNumber + 1} ===")
        val results = checker.checkAllServices()
        results.forEach { status ->
            val statusIcon = if (status.isHealthy) "✅" else "❌"
            println("$statusIcon ${status.name}: " +
                if (status.isHealthy) "${status.responseTime}ms"
                else "ERROR: ${status.error}")
        }
        val healthyCount = results.count { it.isHealthy }
        println("\nSummary: $healthyCount/${results.size} services healthy")
        delay(2000)
    }
}
```
3. Parallel tasks in a UI application
```kotlin
import kotlinx.coroutines.*

class UserDashboardViewModel {
    suspend fun loadDashboardData(userId: String): DashboardData = supervisorScope {
        // Load each data source in parallel
        val userProfile = async {
            loadUserProfile(userId)
        }
        val recentOrders = async {
            loadRecentOrders(userId)
        }
        val notifications = async {
            loadNotifications(userId)
        }
        val recommendations = async {
            try {
                loadRecommendations(userId)
            } catch (e: Exception) {
                // A failed recommendation call must not affect the other data
                emptyList<Recommendation>()
            }
        }
        // Wait for every request; await() rethrows if a required source failed
        DashboardData(
            profile = userProfile.await(),
            orders = recentOrders.await(),
            notifications = notifications.await(),
            recommendations = recommendations.await()
        )
    }

    private suspend fun loadUserProfile(userId: String): UserProfile {
        delay(300)
        return UserProfile("User $userId")
    }

    private suspend fun loadRecentOrders(userId: String): List<Order> {
        delay(400)
        return listOf(Order("Order1"), Order("Order2"))
    }

    private suspend fun loadNotifications(userId: String): List<Notification> {
        delay(200)
        if (Math.random() < 0.2) throw Exception("Notification service down")
        return listOf(Notification("New message"))
    }

    private suspend fun loadRecommendations(userId: String): List<Recommendation> {
        delay(500)
        if (Math.random() < 0.3) throw Exception("Recommendation engine error")
        return listOf(Recommendation("Item 1"), Recommendation("Item 2"))
    }

    // Data classes
    data class DashboardData(
        val profile: UserProfile,
        val orders: List<Order>,
        val notifications: List<Notification>,
        val recommendations: List<Recommendation>
    )
    data class UserProfile(val name: String)
    data class Order(val id: String)
    data class Notification(val message: String)
    data class Recommendation(val item: String)
}

fun main() = runBlocking {
    val viewModel = UserDashboardViewModel()
    repeat(5) {
        try {
            println("\nLoading dashboard...")
            val dashboard = viewModel.loadDashboardData("user123")
            println("Dashboard loaded successfully:")
            println("- Profile: ${dashboard.profile.name}")
            println("- Orders: ${dashboard.orders.size}")
            println("- Notifications: ${dashboard.notifications.size}")
            println("- Recommendations: ${dashboard.recommendations.size}")
        } catch (e: Exception) {
            println("Critical error loading dashboard: ${e.message}")
        }
        delay(1000)
    }
}
```
VII. Advanced SupervisorJob Usage: Custom Error Handling
```kotlin
import kotlinx.coroutines.*

class ResilientTaskManager {
    private val supervisorJob = SupervisorJob()
    private val scope = CoroutineScope(Dispatchers.IO + supervisorJob)

    data class TaskResult(val id: String, val success: Boolean, val value: Any? = null, val error: String? = null)

    fun submitTask(taskId: String, task: suspend () -> Any) {
        scope.launch {
            try {
                println("[$taskId] Starting task")
                val result = task()
                println("[$taskId] Task completed successfully")
                onTaskCompleted(TaskResult(taskId, true, result))
            } catch (e: CancellationException) {
                throw e // Cancellation is not a failure: do not report or retry it
            } catch (e: Exception) {
                println("[$taskId] Task failed: ${e.message}")
                onTaskCompleted(TaskResult(taskId, false, error = e.message))
                // A failed task can be retried here
                if (shouldRetry(taskId)) {
                    delay(1000)
                    println("[$taskId] Retrying...")
                    submitTask("$taskId-retry", task)
                }
            }
        }
    }

    suspend fun shutdown(timeoutMillis: Long = 5000) {
        scope.cancel() // Cancels every task that is still running
        try {
            withTimeout(timeoutMillis) {
                supervisorJob.join()
            }
            println("All tasks stopped")
        } catch (e: TimeoutCancellationException) {
            println("Some tasks did not stop in time")
        }
    }

    private fun onTaskCompleted(result: TaskResult) {
        // Handle the task-completed event
        println("Task ${result.id}: ${if (result.success) "SUCCESS" else "FAILED"}")
    }

    private fun shouldRetry(taskId: String): Boolean {
        return !taskId.contains("no-retry")
    }
}

fun main() = runBlocking {
    val manager = ResilientTaskManager()
    // Submit several tasks
    manager.submitTask("task-1") {
        delay(1000)
        "Result 1"
    }
    manager.submitTask("task-2-error") {
        delay(500)
        throw RuntimeException("Simulated error in task 2")
    }
    manager.submitTask("task-3-no-retry") {
        delay(800)
        throw RuntimeException("Fatal error - no retry")
    }
    manager.submitTask("task-4") {
        delay(1200)
        "Result 4"
    }
    // Give the tasks time to run, then shut down
    delay(3000)
    manager.shutdown()
}
```
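VIII. Important Notes on SupervisorJob
1. Error handling
```kotlin
supervisorScope {
    val child = async<Unit> {
        throw Exception("Child failed")
    }
    // A child's exception must be handled explicitly. await() rethrows it;
    // join() on a launch-ed child would only wait and never throw.
    try {
        child.await()
    } catch (e: Exception) {
        println("Caught child exception: ${e.message}")
    }
}
```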
2. Combining with CoroutineExceptionHandler
```kotlin
import kotlinx.coroutines.*

val handler = CoroutineExceptionHandler { _, exception ->
    println("Uncaught exception: ${exception.message}")
}

fun main() = runBlocking {
    val supervisor = SupervisorJob()
    val scope = CoroutineScope(Dispatchers.Default + supervisor + handler)
    scope.launch {
        throw Exception("This will be caught by handler")
    }
    scope.launch {
        delay(1000)
        println("I'm still running!")
    }
    delay(2000)
    supervisor.cancel()
}
```
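3. Limiting concurrency
```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.sync.Semaphore
import kotlinx.coroutines.sync.withPermit

suspend fun limitedParallelTasks(tasks: List<suspend () -> Unit>, maxConcurrent: Int = 3) {
    val semaphore = Semaphore(maxConcurrent)
    supervisorScope {
        tasks.map { task ->
            launch {
                // At most maxConcurrent tasks hold a permit at the same time
                semaphore.withPermit {
                    task()
                }
            }
        }.forEach { it.join() }
    }
}
```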
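IX. SupervisorJob Summary
Typical use cases for SupervisorJob:
- Several tasks need to run independently
- One task's failure should not affect the others
- A UI application loads multiple data sources in parallel
- Batch processing where partial failure is acceptable
- Health checks, monitoring, and similar jobs in a microservice architecture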
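Key points:
- Create one with supervisorScope or SupervisorJob()
- Handle child coroutine exceptions explicitly
- Combine it with CoroutineExceptionHandler for global error handling
- Pay attention to resource cleanup and timeouts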
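For the timeout point, one possible sketch gives each task in a supervisorScope its own time limit with withTimeoutOrNull. The function name fetchAll, its parameters, and the delay used as a stand-in for real work are all assumptions made for illustration.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: runs one simulated task per delay value, each with its own timeout.
suspend fun fetchAll(delaysMs: List<Long>, timeoutMs: Long): List<String?> = supervisorScope {
    delaysMs.map { d ->
        async {
            // Returns null instead of throwing when this task exceeds the limit
            withTimeoutOrNull(timeoutMs) {
                delay(d) // stand-in for real work
                "done in $d ms"
            }
        }
    }.awaitAll()
}
```

With delays of 10 ms and 500 ms and a 100 ms limit, the first task should return its message and the second should return null, without one affecting the other.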
With SupervisorJob, you can build concurrent applications that are more robust and more fault-tolerant.