Machine Learning (6): Evaluating the Model

Evaluating a model

1 test set

  1. split the available data into a training set and a test set
  2. the model is fit on the training set, and the test set is used to evaluate it
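
The split itself can be sketched in a few lines. This is a minimal illustration, not a prescribed procedure: the helper name `train_test_split`, the 30% test fraction, and the fixed seed are assumptions made for the example.

```python
import random

def train_test_split(examples, test_fraction=0.3, seed=0):
    """Shuffle a copy of the examples and return (train, test)."""
    shuffled = list(examples)              # copy, so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Toy data set of (x, y) pairs, made up for illustration.
data = [(float(x), 2.0 * x + 1.0) for x in range(10)]
train, test = train_test_split(data)
print(len(train), len(test))
```

Shuffling before splitting matters when the data is stored in some order (for example sorted by target value); otherwise the test set would not resemble the training set.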

1. linear regression

compute test error

$$J_{test}(\vec w, b) = \frac{1}{2m_{test}}\sum_{i=1}^{m_{test}} \left( f(x_{test}^{(i)}) - y_{test}^{(i)} \right)^2$$
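
As a rough sketch, the test error above is half the mean squared error over the test examples. The model `f` and the three test points below are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def squared_error_cost(predict, examples):
    """J = 1/(2m) * sum((f(x) - y)^2) over the given (x, y) examples."""
    m = len(examples)
    return sum((predict(x) - y) ** 2 for x, y in examples) / (2 * m)

def f(x):
    # Hypothetical model, assumed to be already fitted on the training set.
    return 2.0 * x + 1.0

test_set = [(1.0, 3.0), (2.0, 6.0), (3.0, 7.0)]
j_test = squared_error_cost(f, test_set)
print(j_test)
```

Only the second point is mispredicted (by 1), so the sum of squared errors is 1 and the cost is 1/(2·3). The same function gives $J_{train}$ when called with the training examples.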

2. classification (logistic regression)

compute test error

$$J_{test}(\vec w, b) = -\frac{1}{m_{test}}\sum_{i=1}^{m_{test}} \left[ y_{test}^{(i)}\log(f(x_{test}^{(i)})) + (1 - y_{test}^{(i)})\log(1 - f(x_{test}^{(i)})) \right]$$
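
A minimal sketch of this log loss follows, assuming a model that outputs a probability for class 1. The sigmoid model `f`, its weights, and the test points are hypothetical, and the small clipping constant `eps` is an addition of this example to avoid `log(0)`; it is not part of the formula.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_loss_cost(predict_proba, examples, eps=1e-12):
    """J = -1/m * sum(y*log(p) + (1-y)*log(1-p)), where p = f(x)."""
    m = len(examples)
    total = 0.0
    for x, y in examples:
        # Clip p away from 0 and 1 (assumption of this sketch, see lead-in).
        p = min(max(predict_proba(x), eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / m

def f(x):
    # Hypothetical fitted logistic model with w = 2, b = -1.
    return sigmoid(2.0 * x - 1.0)

test_set = [(-1.0, 0), (0.0, 0), (1.0, 1), (0.2, 1)]
j_test = log_loss_cost(f, test_set)
print(j_test)
```

The last example has label 1 but a predicted probability below 0.5, so it contributes the largest term to the sum.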

2 cross-validation set

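  1. split the available data into a training set, a cross-validation set and a test set
  2. the cross-validation set is used to choose the best model among the candidates, and the test set is used to evaluate the model that was chosen; because the test set played no part in the choice, its error remains a fair estimate of how the model generalizes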
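
The selection step can be sketched as follows. The three candidate models are hypothetical and assumed to be already fitted on the training set; the data points are made up for illustration.

```python
def squared_error_cost(predict, examples):
    """J = 1/(2m) * sum((f(x) - y)^2) over the given (x, y) examples."""
    m = len(examples)
    return sum((predict(x) - y) ** 2 for x, y in examples) / (2 * m)

# Hypothetical candidates of increasing polynomial degree.
candidates = {
    "degree 1": lambda x: 1.0 + 2.0 * x,
    "degree 2": lambda x: 1.0 + 2.0 * x + 0.5 * x ** 2,
    "degree 3": lambda x: 1.0 + 2.0 * x + 0.5 * x ** 2 + 0.4 * x ** 3,
}
cv_set = [(0.0, 1.0), (1.0, 3.6), (2.0, 7.1)]
test_set = [(3.0, 11.4), (-1.0, -0.4)]

# Step 1: pick the candidate with the lowest cross-validation error.
cv_errors = {name: squared_error_cost(f, cv_set) for name, f in candidates.items()}
best_name = min(cv_errors, key=cv_errors.get)

# Step 2: report the error of the chosen model on the untouched test set.
j_test = squared_error_cost(candidates[best_name], test_set)
print(best_name, cv_errors[best_name], j_test)
```

The test set is read exactly once, after the choice is made. Reusing it to compare candidates would make the reported error optimistic.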

3 bias and variance

  1. high bias: $J_{train}$ and $J_{cv}$ are both high
  2. high variance: $J_{train}$ is low, but $J_{cv}$ is high
  1. if high bias: getting more training data does not help much
  2. if high variance: getting more training data is likely to help
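
The rules above can be turned into a rough check. "High" needs a reference point, so this sketch assumes a `baseline` error (for example, the error level considered acceptable for the task) and a `gap_tolerance`; both parameters and the function name `diagnose` are assumptions of the example, not part of the notes.

```python
def diagnose(j_train, j_cv, baseline, gap_tolerance):
    """Label a model from its training and cross-validation errors.

    high bias:     J_train is well above the baseline
    high variance: J_cv is well above J_train
    """
    high_bias = (j_train - baseline) > gap_tolerance
    high_variance = (j_cv - j_train) > gap_tolerance
    if high_bias and high_variance:
        return "high bias and high variance"
    if high_bias:
        return "high bias"
    if high_variance:
        return "high variance"
    return "ok"

print(diagnose(j_train=0.30, j_cv=0.32, baseline=0.05, gap_tolerance=0.1))
print(diagnose(j_train=0.06, j_cv=0.40, baseline=0.05, gap_tolerance=0.1))
```

The first call reports high bias (both errors are far above the baseline and close to each other); the second reports high variance (training error is near the baseline, but the cross-validation error is much larger).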

4 regularization

  1. if $\lambda$ is too small, it will lead to overfitting (high variance)
  2. if $\lambda$ is too large, it will lead to underfitting (high bias)
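
The underfitting side of this can be seen in a toy case. The sketch assumes a single-feature model $f(x) = wx$ with no intercept and a regularized cost of the form $\frac{1}{2m}\sum (wx - y)^2 + \frac{\lambda}{2m}w^2$, whose minimizer has the closed form $w = \sum xy / (\sum x^2 + \lambda)$. The data and the function name `fit_ridge_1d` are made up for illustration.

```python
def fit_ridge_1d(examples, lam):
    """Closed-form minimizer of 1/(2m)*sum((w*x - y)^2) + lam/(2m)*w^2."""
    sxy = sum(x * y for x, y in examples)
    sxx = sum(x * x for x, y in examples)
    return sxy / (sxx + lam)

def squared_error_cost(w, examples):
    """Unregularized J = 1/(2m) * sum((w*x - y)^2)."""
    m = len(examples)
    return sum((w * x - y) ** 2 for x, y in examples) / (2 * m)

train = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]
lambdas = [0.0, 1.0, 100.0, 10000.0]
weights = [fit_ridge_1d(train, lam) for lam in lambdas]
costs = [squared_error_cost(w, train) for w in weights]

for lam, w, j in zip(lambdas, weights, costs):
    print(f"lambda={lam:>8}  w={w:.4f}  J_train={j:.4f}")
```

As $\lambda$ grows, $w$ shrinks toward 0 and $J_{train}$ rises: the model can no longer follow the data. A one-parameter model is too simple to overfit, so this toy case does not show the small-$\lambda$ side.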

5 method

  1. fix high variance:
    • get more training examples
    • try a smaller set of features
    • remove some of the higher-order terms
    • increase $\lambda$
  2. fix high bias:
    • get additional features
    • add polynomial features
    • decrease $\lambda$

6 neural networks and bias/variance

  1. a bigger network means a more complex model, so it helps reduce high bias
  2. more data helps reduce high variance
  1. it turns out that a bigger network (which may overfit on its own) with well-chosen regularization usually does as well as or better than a smaller one