Machine Learning (6): Evaluating Models

Evaluate model

1 test set

  1. split the available data into a training set and a test set
  2. the test set is used to evaluate the model

1. linear regression

compute test error

$$J_{test}(\vec w, b) = \frac{1}{2m_{test}}\sum_{i=1}^{m_{test}} \left[ \left( f(x_{test}^{(i)}) - y_{test}^{(i)} \right)^2 \right]$$
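As a minimal sketch of the test error above, assuming a linear model $f(x) = \vec w \cdot x + b$ and NumPy arrays for the data: the function name `linear_test_error` and the toy numbers are made up for illustration, not taken from the original notes.

```python
import numpy as np

def linear_test_error(w, b, X_test, y_test):
    """Squared-error test cost J_test(w, b), using the 1/(2 * m_test) convention."""
    m_test = X_test.shape[0]
    predictions = X_test @ w + b          # f(x) for every test example
    residuals = predictions - y_test      # f(x) - y
    return float(np.sum(residuals ** 2) / (2 * m_test))

# Toy test set (hypothetical values, for illustration only)
X_test = np.array([[1.0], [2.0], [3.0]])
y_test = np.array([2.0, 4.0, 7.0])
w = np.array([2.0])
b = 0.0

j_test = linear_test_error(w, b, X_test, y_test)
print(j_test)  # one residual of -1 over three examples: 1 / 6, about 0.1667
```

No regularization term appears here: the test error measures only how well the fitted parameters predict unseen examples.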

2. classification (logistic regression)

compute test error

$$J_{test}(\vec w, b) = -\frac{1}{m_{test}}\sum_{i=1}^{m_{test}} \left[ y_{test}^{(i)} \log\left(f(x_{test}^{(i)})\right) + \left(1 - y_{test}^{(i)}\right) \log\left(1 - f(x_{test}^{(i)})\right) \right]$$
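The logistic test error can be sketched the same way, assuming $f(x)$ is the sigmoid of $\vec w \cdot x + b$. The names `sigmoid` and `logistic_test_error` and the small clipping constant `eps` are assumptions added here; the clipping is only there to avoid `log(0)`.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_test_error(w, b, X_test, y_test, eps=1e-12):
    """Average cross-entropy loss J_test(w, b) over the test set."""
    m_test = X_test.shape[0]
    f = sigmoid(X_test @ w + b)
    f = np.clip(f, eps, 1.0 - eps)        # numerical guard, not part of the formula
    per_example = y_test * np.log(f) + (1.0 - y_test) * np.log(1.0 - f)
    return float(-np.sum(per_example) / m_test)

# Toy test set (hypothetical values, for illustration only)
X_test = np.array([[0.5], [1.5], [-1.0], [2.0]])
y_test = np.array([0.0, 1.0, 0.0, 1.0])

# With all-zero parameters every prediction is 0.5, so the loss is log(2)
j_untrained = logistic_test_error(np.array([0.0]), 0.0, X_test, y_test)
print(j_untrained)  # about 0.6931
```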

2 cross-validation set

  1. split the available data into a training set, a cross-validation set and a test set
  2. the cross-validation set is used to choose the best model among the candidates, and the test set is used to evaluate the model that was chosen
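The selection procedure can be sketched as follows. This is an illustration under assumptions that are not in the original notes: synthetic quadratic data, a 60/20/20 split, candidate models that differ only in polynomial degree, and `np.polyfit` as the fitting routine.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (hypothetical): a noisy quadratic
x = rng.uniform(-3, 3, size=200)
y = 0.5 * x**2 - x + 2 + rng.normal(0, 0.5, size=200)

# 60% training, 20% cross-validation, 20% test
idx = rng.permutation(len(x))
train_idx, cv_idx, test_idx = idx[:120], idx[120:160], idx[160:]

def half_mse(coeffs, x_part, y_part):
    """Squared-error cost with the 1/(2m) convention."""
    pred = np.polyval(coeffs, x_part)
    return float(np.mean((pred - y_part) ** 2) / 2)

degrees = [1, 2, 3, 4, 5]
models, cv_errors = {}, {}
for d in degrees:
    # Fit each candidate on the training set only
    models[d] = np.polyfit(x[train_idx], y[train_idx], deg=d)
    # Score each candidate on the cross-validation set
    cv_errors[d] = half_mse(models[d], x[cv_idx], y[cv_idx])

# Pick the candidate with the lowest cross-validation error
best_degree = min(cv_errors, key=cv_errors.get)

# The test set is touched once, only for the chosen model
j_test = half_mse(models[best_degree], x[test_idx], y[test_idx])
print(best_degree, cv_errors[best_degree], j_test)
```

Because the degree is picked using the cross-validation error, that error is an optimistic estimate for the chosen model; the test error, which played no part in the choice, is the fairer estimate.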

3 bias and variance

  1. high bias (underfitting): $J_{train}$ and $J_{cv}$ are both high
  2. high variance (overfitting): $J_{train}$ is low, but $J_{cv}$ is much higher
  3. with high bias, collecting more training data does little to help
  4. with high variance, collecting more training data is likely to help
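One way to turn these rules into a quick check is sketched below. The notes do not say what counts as "high", so the `baseline` error (for example, human-level performance) and the `tolerance` threshold are assumptions introduced here, and `diagnose` is a hypothetical helper.

```python
def diagnose(j_train, j_cv, baseline, tolerance):
    """Label a model from its training and cross-validation errors.

    baseline and tolerance are assumed inputs: the error level considered
    acceptable, and the gap size considered significant.
    """
    high_bias = (j_train - baseline) > tolerance      # training error itself is high
    high_variance = (j_cv - j_train) > tolerance      # large gap between cv and training
    if high_bias and high_variance:
        return "high bias and high variance"
    if high_bias:
        return "high bias"
    if high_variance:
        return "high variance"
    return "ok"

print(diagnose(j_train=2.0, j_cv=2.1, baseline=0.1, tolerance=0.5))  # high bias
print(diagnose(j_train=0.1, j_cv=2.0, baseline=0.1, tolerance=0.5))  # high variance
```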

4 regularization

  1. if $\lambda$ is too small, the model tends to overfit (high variance)
  2. if $\lambda$ is too large, the model tends to underfit (high bias)
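The effect of the regularization strength can be seen by sweeping it and comparing training and cross-validation errors. The sketch below assumes L2-regularized linear regression on degree-8 polynomial features, solved in closed form; the synthetic data, the candidate list `lambdas`, and the helper names are illustrative choices, not part of the original notes.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data (hypothetical): a noisy sine curve
x = rng.uniform(-1, 1, size=60)
y = np.sin(3 * x) + rng.normal(0, 0.2, size=60)
x_train, y_train = x[:40], y[:40]
x_cv, y_cv = x[40:], y[40:]

def poly_features(x_part, degree=8):
    return np.column_stack([x_part ** p for p in range(1, degree + 1)])

# Standardize features using training statistics only
X_train = poly_features(x_train)
mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
X_train_s = (X_train - mu) / sigma
X_cv_s = (poly_features(x_cv) - mu) / sigma

# With zero-mean features, the unregularized intercept is the mean target
b = y_train.mean()

def fit_ridge(X, y_centered, lam):
    """Minimize (1/2m)||Xw - y||^2 + (lam/2m)||w||^2 in closed form."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y_centered)

def half_mse(w, X, y_part):
    return float(np.mean((X @ w + b - y_part) ** 2) / 2)

lambdas = [1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0, 1000.0]
train_errors, cv_errors = [], []
for lam in lambdas:
    w = fit_ridge(X_train_s, y_train - b, lam)
    train_errors.append(half_mse(w, X_train_s, y_train))
    cv_errors.append(half_mse(w, X_cv_s, y_cv))

# Choose lambda by cross-validation error, not training error
best_lam = lambdas[int(np.argmin(cv_errors))]
for lam, jt, jc in zip(lambdas, train_errors, cv_errors):
    print(f"lambda={lam:g}  J_train={jt:.4f}  J_cv={jc:.4f}")
print("chosen lambda:", best_lam)
```

The training error can only grow as $\lambda$ increases, which is why $\lambda$ has to be chosen on the cross-validation set: judged by $J_{train}$ alone, the smallest $\lambda$ would always win.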

5 method

  1. fix high variance:
    • get more training examples
    • try a smaller set of features
    • remove some of the higher-order terms
    • increase $\lambda$
  2. fix high bias:
    • get additional features
    • add polynomial features
    • decrease $\lambda$

6 neural network and bias variance

  1. a bigger network is a more complex model, so it tends to reduce high bias
  2. more data helps to reduce high variance
  3. it turns out that a larger network (which might otherwise overfit), when well regularized, usually does as well as or better than a smaller one