Datasets
Batch iterator for a dataset
Deep-ML | Batch Iterator for Dataset
Implement a batch-iterable function that samples from a numpy array X and an optional numpy array y. The function should produce batches of a specified size. If y is provided, the function should yield batches of (X, y) pairs; otherwise, it should yield batches of X only.
Example:
Input:
    X = np.array([[1, 2],
                  [3, 4],
                  [5, 6],
                  [7, 8],
                  [9, 10]])
    y = np.array([1, 2, 3, 4, 5])
    batch_size = 2
    batch_iterator(X, y, batch_size)
Output:
    [[[[1, 2], [3, 4]], [1, 2]],
     [[[5, 6], [7, 8]], [3, 4]],
     [[[9, 10]], [5]]]
```python
import numpy as np

def batch_iterator(X, y=None, batch_size=64):
    n_samples = X.shape[0]
    batches = []
    # Walk through the samples in strides of batch_size; the last batch
    # may be smaller when n_samples is not divisible by batch_size
    for i in range(0, n_samples, batch_size):
        begin, end = i, min(i + batch_size, n_samples)
        if y is not None:
            batches.append([X[begin:end], y[begin:end]])
        else:
            batches.append(X[begin:end])
    return batches
```
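As a quick sanity check against the example above, the snippet below (illustrative only) runs batch_iterator on the same data and prints each batch:

```python
import numpy as np

X = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]])
y = np.array([1, 2, 3, 4, 5])

# Expect two full batches of size 2 followed by a final batch of size 1
for X_batch, y_batch in batch_iterator(X, y, batch_size=2):
    print(X_batch.tolist(), y_batch.tolist())
# [[1, 2], [3, 4]] [1, 2]
# [[5, 6], [7, 8]] [3, 4]
# [[9, 10]] [5]
```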
Activation functions
sigmoid
Solution
```python
import math

def sigmoid(z: float) -> float:
    # sigmoid(z) = 1 / (1 + e^(-z)), rounded to 4 decimal places
    result = 1 / (1 + math.exp(-z))
    return round(result, 4)
```
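A few spot checks of the solution (values rounded to four decimal places, as the function does):

```python
print(sigmoid(0))   # 0.5
print(sigmoid(1))   # 0.7311
print(sigmoid(-1))  # 0.2689
```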
Gradient descent
Linear regression with gradient descent (MSE)
Solution
```python
import numpy as np

def linear_regression_gradient_descent(X: np.ndarray, y: np.ndarray, alpha: float, iterations: int) -> np.ndarray:
    m, n = X.shape
    theta = np.zeros((n, 1))
    for _ in range(iterations):
        predictions = X @ theta
        errors = predictions - y.reshape(-1, 1)
        # Gradient of the MSE loss: (1/m) * X^T (X @ theta - y)
        updates = X.T @ errors / m
        theta -= alpha * updates
    return np.round(theta.flatten(), 4)
```
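A minimal sketch of how the function might be called, assuming the first column of X is the bias term; the data, alpha, and iteration count below are made up for illustration:

```python
import numpy as np

# Toy data with a bias column; the underlying relationship is exactly y = x
X = np.array([[1, 1], [1, 2], [1, 3]], dtype=float)
y = np.array([1, 2, 3], dtype=float)

theta = linear_regression_gradient_descent(X, y, alpha=0.01, iterations=5000)
print(theta)  # approaches [0., 1.] as the number of iterations grows
```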
Gradient descent variants for MSE loss
Solution
```python
import numpy as np

def gradient_descent(X, y, weights, learning_rate, n_iterations, batch_size=1, method='batch'):
    m = len(y)
    for _ in range(n_iterations):
        if method == 'batch':
            # Calculate the gradient using all data points
            predictions = X.dot(weights)
            errors = predictions - y
            gradient = 2 * X.T.dot(errors) / m
            weights = weights - learning_rate * gradient
        elif method == 'stochastic':
            # Update weights for each data point individually
            for i in range(m):
                prediction = X[i].dot(weights)
                error = prediction - y[i]
                gradient = 2 * X[i].T.dot(error)
                weights = weights - learning_rate * gradient
        elif method == 'mini_batch':
            # Update weights using sequential batches of data points without shuffling
            for i in range(0, m, batch_size):
                X_batch = X[i:i+batch_size]
                y_batch = y[i:i+batch_size]
                predictions = X_batch.dot(weights)
                errors = predictions - y_batch
                gradient = 2 * X_batch.T.dot(errors) / batch_size
                weights = weights - learning_rate * gradient
    return weights
```
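A hypothetical driver that runs all three variants on the same toy data (the dataset, learning_rate, n_iterations, and batch_size are chosen only for illustration); since the target is exactly linear in the features, each variant should end up near the true weights [1, 1]:

```python
import numpy as np

# Toy data: a bias column plus one feature, with target y = 1 + x
X = np.array([[1, 1], [1, 2], [1, 3], [1, 4]], dtype=float)
y = np.array([2, 3, 4, 5], dtype=float)
initial_weights = np.zeros(X.shape[1])

for method in ('batch', 'stochastic', 'mini_batch'):
    w = gradient_descent(X, y, initial_weights.copy(), learning_rate=0.01,
                         n_iterations=1000, batch_size=2, method=method)
    print(method, w)  # each variant should land close to [1., 1.]
```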