Machine Learning: Supervised Learning 2021-11-20

AI Fundamentals: Master Index

Supervised Learning

- 1. Overview
- 2. Linear Regression
  - Multi-class classification: the Softmax function
  - Multi-class cross-entropy (cross_entropy)
- 3. Model Evaluation Metrics

1. Overview

Linear regression
Logistic regression
Softmax and cross-entropy
Model evaluation metrics

2. Linear Regression

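Logistic regression is a classification model in machine learning. Despite the word "regression" in its name, it is essentially a classification algorithm. It does not solve regression problems the way linear regression does because it applies the sigmoid activation function to the linear regression output. The sigmoid maps that output into the interval (0, 1), and a threshold then turns it into 0 or 1; for example, with a threshold of 0.5 an output of 0.8 is treated as 1.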
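Define a loss function, take the partial derivative of the loss with respect to w, and use gradient descent to solve for w. With the original squared-error function (x1 - x2)^2, the partial derivative is small when the values lie between 0 and 1. When the gradient is small, updates are slow; when it is too large, the updates oscillate. In neither case does the search reach the target value well.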
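
The effect of update size can be illustrated with a small sketch. It is not taken from the original post: the one-dimensional loss (w - 3)^2, the function name `gradient_descent`, and the three learning rates are all assumptions chosen for illustration. Because each update is the learning rate times the gradient, varying the learning rate here stands in for small versus large gradients.

```python
def gradient_descent(lr, steps=20, w0=0.0, target=3.0):
    """Minimise the toy loss (w - target)^2 with plain gradient descent."""
    w = w0
    history = [w]
    for _ in range(steps):
        grad = 2.0 * (w - target)   # d/dw of (w - target)^2
        w -= lr * grad              # update = learning rate * gradient
        history.append(w)
    return history


slow = gradient_descent(lr=0.01)    # tiny updates: still far from 3 after 20 steps
good = gradient_descent(lr=0.3)     # moderate updates: converges quickly
jumpy = gradient_descent(lr=0.95)   # large updates: overshoots and oscillates around 3

print("slow :", slow[-1])
print("good :", good[-1])
print("jumpy:", jumpy[:4])
```

With the largest learning rate the iterates jump from one side of the minimum to the other on every step, which is the oscillation described above.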

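The linear regression output is fed into the sigmoid activation function, which returns a probability between 0 and 1. The default threshold is 0.5: logistic regression decides whether a sample belongs to a class from the predicted probability of belonging to that class. That class is labeled 1 by default (1 denotes a positive example), and the other class is labeled 0 (0 denotes a negative example).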
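
A minimal sketch of this sigmoid-then-threshold step follows. The scores in `z` are hypothetical values standing in for the linear regression output, not data from the post.

```python
import numpy as np


def sigmoid(z):
    """Map real-valued scores to probabilities strictly between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))


# Hypothetical linear-regression outputs for four samples.
z = np.array([-2.0, 0.0, 1.4, 3.0])

probs = sigmoid(z)                   # probabilities of the positive class
labels = (probs > 0.5).astype(int)   # 1 = positive example, 0 = negative example

print(probs)
print(labels)
```

The third score gives a probability of about 0.8, which the 0.5 threshold turns into the label 1.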

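In every case we want the loss to be as small as possible. Looking at the two cases separately: when y = 1 we want hθ(x) to be as large as possible, and when y = 0 we want hθ(x) to be as small as possible.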
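
The figure that followed this paragraph is not recoverable. The sketch below assumes it showed the standard logistic (log) loss, -[y log h + (1 - y) log(1 - h)]; the function name `log_loss` is chosen here for illustration.

```python
import math


def log_loss(y, h):
    """Per-sample log loss for a label y in {0, 1} and a prediction h in (0, 1)."""
    return -(y * math.log(h) + (1 - y) * math.log(1 - h))


for h in (0.1, 0.5, 0.9):
    print(f"h={h}: loss if y=1 is {log_loss(1, h):.3f}, loss if y=0 is {log_loss(0, h):.3f}")
```

For y = 1 the loss shrinks as h grows, and for y = 0 it shrinks as h falls, matching the two cases described above.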


Multi-class classification: the Softmax function

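Suppose the input x has shape (rows × columns) 1 × 10, W has shape 10 × 5, and b is the bias. Then Z = xW + b has shape 1 × 5. The softmax function converts these multi-class outputs into a probability distribution whose entries lie in [0, 1] and sum to 1.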
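
The shapes in this example can be checked with a short sketch. The random values and the fixed seed are assumptions for illustration only; the shapes are the ones given in the text.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax; subtracting the row maximum avoids overflow in exp."""
    shifted = z - np.max(z, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)


rng = np.random.default_rng(0)   # fixed seed, chosen only for reproducibility
x = rng.normal(size=(1, 10))     # one sample with 10 features
W = rng.normal(size=(10, 5))     # weights mapping 10 features to 5 classes
b = np.zeros(5)                  # one bias entry per class

z = x @ W + b                    # shape (1, 5)
p = softmax(z)                   # shape (1, 5), entries sum to 1

print(z.shape, p.shape)
print(p, p.sum())
```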

Multi-class cross-entropy (cross_entropy)

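Among the multiple outputs, S(y) plays the role of the estimate of y, and L is the actual value. When the softmax function is used as the activation function of the output nodes, cross-entropy is generally used as the loss function.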
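
As a rough sketch, assuming the usual definition D(S, L) = -Σ L_i log(S_i) with L a one-hot label vector, the loss can be computed as below. The two predicted distributions are hypothetical, and the sum is used here where the post's own listing takes a mean.

```python
import numpy as np


def cross_entropy(S, L, eps=1e-12):
    """Cross-entropy between a predicted distribution S and a one-hot label L."""
    # eps guards against log(0) when a predicted probability is exactly zero.
    return -np.sum(L * np.log(S + eps))


L = np.array([0.0, 1.0, 0.0])        # actual value: the sample belongs to class 1
S_good = np.array([0.1, 0.8, 0.1])   # hypothetical prediction, mostly right
S_bad = np.array([0.7, 0.2, 0.1])    # hypothetical prediction, mostly wrong

print(cross_entropy(S_good, L))
print(cross_entropy(S_bad, L))
```

The prediction that puts more probability on the true class receives the smaller loss.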

The code below implements logistic regression: it computes w by gradient descent and prints the predictions.

import numpy as np
from icecream import ic

X = np.random.normal(size=(10, 7))
y = np.array([
    [1],
    [0],
    [0],
    [0],
    [1],
    [0],
    [0],
    [1],
    [0],
    [0],
])

weights = np.random.normal(size=(1, 7))
bias = 0


def loss(yhats, y):
    return np.mean( (yhats - y) ** 2 )


def partial_w(yhats, y, train_x):
    return 2 * np.mean((yhats - y) * train_x, axis=0)


def partial_b(yhats, y):
    return 2 * np.mean(yhats - y)


def logistic(x):
    return 1 / (1 + np.exp(-x))


def softmax(x):
    # Subtract the maximum for numerical stability, without modifying the caller's array.
    shifted = x - np.max(x)
    exps = np.exp(shifted)

    return exps / np.sum(exps)


def cross_entropy(yhats, y):
    return - np.mean( y * np.log(yhats))


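def train_logistic_regression(X, weights, bias, y, learning_rate=0.1, epochs=10):
    # Gradient descent on the model logistic(X @ weights.T + bias).
    eps = 1e-12  # guards against log(0)
    for i in range(epochs):
        yhats = logistic(X @ weights.T + bias)
        # Binary cross-entropy loss for the current parameters.
        loss_value = -np.mean(y * np.log(yhats + eps) + (1 - y) * np.log(1 - yhats + eps))
        ic(loss_value)
        # For sigmoid + cross-entropy the gradient is mean((yhats - y) * x);
        # partial_w and partial_b return twice that, which only rescales the step.
        weights = weights - learning_rate * partial_w(yhats, y, X)
        bias = bias - learning_rate * partial_b(yhats, y)

    # Predict with the trained parameters: probabilities above the threshold become 1.
    yhats = logistic(X @ weights.T + bias)
    threshold = 0.5
    predictions = (yhats > threshold).astype(int)  # np.int was removed from NumPy
    ic(yhats)
    ic(predictions)
    return weights, bias


if __name__ == '__main__':
    train_logistic_regression(X, weights, bias, y)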

3. Model Evaluation Metrics

Baseline
Accuracy
Precision
Recall
ROC/AUC (AUC is the area under the ROC curve)
F1_score, F2_score
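Precision and recall are both defined with respect to the positive class. Precision is the fraction of predicted positives that are actually positive; recall is the fraction of actual positives that the model finds.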
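
These definitions can be made concrete with a small sketch. The labels reuse the y vector from the listing above, while `y_pred` is a hypothetical set of predictions; `confusion_counts` and `f_beta` are helper names introduced here, and the F-score uses the standard F-beta formula (beta = 1 for F1, beta = 2 for F2).

```python
def confusion_counts(y_true, y_pred):
    """Count true positives, false positives, false negatives, true negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def f_beta(precision, recall, beta):
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    denom = beta ** 2 * precision + recall
    return (1 + beta ** 2) * precision * recall / denom if denom > 0 else 0.0


y_true = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0]   # labels from the example listing
y_pred = [1, 0, 1, 0, 0, 0, 1, 1, 0, 0]   # hypothetical predictions

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)   # of the predicted positives, how many are correct
recall = tp / (tp + fn)      # of the actual positives, how many were found
f1 = f_beta(precision, recall, beta=1)
f2 = f_beta(precision, recall, beta=2)

print(accuracy, precision, recall, f1, f2)
```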