Logistic regression is an easy-to-implement classification model with strong performance on linearly separable problems. We will implement the training and prediction process of a logistic regression model twice, first with NumPy and then with TensorFlow.

Building from Scratch

First, we build a logistic regression model with NumPy.
We define the shapes as follows:
\(X\): (n, m)
\(Y\): (1, m)
\(w\): (n, 1)
\(b\): (1)
where \(n\) is the number of features and \(m\) is the number of samples.
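As a quick sanity check of these conventions, the following sketch (the values of n and m here are arbitrary toy choices, not from the model below) confirms that the forward pass \(w^TX+b\) yields one score per sample:

import numpy as np

n, m = 3, 5                    # hypothetical feature and sample counts
X = np.random.random((n, m))   # features: (n, m)
w = np.zeros((n, 1))           # weights: (n, 1)
b = 0                          # bias: scalar, broadcast across samples
Z = np.dot(w.T, X) + b         # linear scores
print(Z.shape)                 # (1, m): one score per sample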
For binary logistic regression, the loss function is:
\[J(\theta)=-\frac{1}{m}\sum_{i=1}^m{[y_i\log{h_\theta(x_i)}+(1-y_i)\log{(1-h_\theta(x_i))}]}\]
Using \(h_\theta(x)=\sigma(\theta^Tx)\) and \(\sigma'(z)=\sigma(z)(1-\sigma(z))\), differentiating with respect to \(\theta_j\) gives the update rule:
\[ \theta_j:= \theta_j-\alpha \frac{1}{m}\sum_{i=1}^m (h_\theta(x_i)-y_i)x_i^j\]
Accordingly, in code the gradients are computed as:

dw = np.dot(X,(A-Y).T)/m
db = np.sum(A-Y)/m
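To verify that this vectorized gradient matches the calculus, a finite-difference check is a handy sketch (the toy sizes, seed, and tolerance below are our own arbitrary choices):

import numpy as np

def sigmoid(x):
    return 1/(1+np.exp(-x))

np.random.seed(0)
n, m = 4, 8
X = np.random.random((n, m))
Y = np.random.randint(0, 2, size=(1, m))
w, b = np.random.random((n, 1)), 0.3

def cost_fn(w, b):
    A = sigmoid(np.dot(w.T, X) + b)
    return -np.sum(Y*np.log(A) + (1-Y)*np.log(1-A))/m

A = sigmoid(np.dot(w.T, X) + b)
dw = np.dot(X, (A-Y).T)/m            # analytic gradient
eps = 1e-6
num_dw = np.zeros_like(w)
for j in range(n):                   # numerical gradient, one coordinate at a time
    w_plus, w_minus = w.copy(), w.copy()
    w_plus[j] += eps
    w_minus[j] -= eps
    num_dw[j] = (cost_fn(w_plus, b) - cost_fn(w_minus, b)) / (2*eps)
print(np.allclose(dw, num_dw, atol=1e-6))  # expect True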

The functions serve the following roles:

  • sigmoid(x): the activation function
  • initialization(dim): zero-initializes w and b
  • propagate(w,b,X,Y): forward pass computing the gradients and the cost
  • optimize(w,b,X,Y,learning_rate,epochs,print_cost=False): backpropagation that updates the parameters
  • predict(w,b,X): computes predictions
  • model(X_train,Y_train,X_test,Y_test,epochs=200,learning_rate=0.01,print_cost=False): ties the pieces together into a full model
import numpy as np
def sigmoid(x):
    return 1/(1+np.exp(-x))

def initialization(dim):
    w = np.zeros((dim,1))
    b = 0
    return w,b

def propagate(w,b,X,Y):
    m = X.shape[1]                        # number of samples
    A = sigmoid(np.dot(w.T,X)+b)          # forward pass: activations, shape (1,m)
    cost = -1*np.sum(Y*np.log(A)+(1-Y)*np.log(1-A))/m  # cross-entropy cost
    dw = np.dot(X,(A-Y).T)/m              # gradient w.r.t. w, shape (n,1)
    db = np.sum(A-Y)/m                    # gradient w.r.t. b, scalar
    grads = {'dw':dw,'db':db}
    return grads,cost

def optimize(w,b,X,Y,learning_rate,epochs,print_cost=False):
    costs = []
    for epoch in range(epochs):
        grads,cost = propagate(w,b,X,Y)
        dw = grads['dw']
        db = grads['db']
        w -= learning_rate * dw           # gradient descent step
        b -= learning_rate * db
        if epoch % 100 == 0:              # record the cost every 100 epochs
            costs.append(cost)
            if print_cost:
                print('epoch:%i;cost:%f'%(epoch,cost))
    params = {'w':w,'b':b}
    return params,costs

def predict(w,b,X):
    predictions = sigmoid(np.dot(w.T,X)+b)
    return (predictions>0.5).astype(int)  # threshold probabilities at 0.5

def model(X_train,Y_train,X_test,Y_test,epochs=200,learning_rate=0.01,print_cost=False):
    dim = X_train.shape[0]
    w,b = initialization(dim)
    params, costs = optimize(w,b,X_train,Y_train,learning_rate,epochs,print_cost)
    w,b = params['w'],params['b']
    Y_predictions = predict(w,b,X_test)
    print('Test Acc:{}%'.format(100-np.mean(abs(Y_predictions-Y_test))*100))

if __name__ == '__main__':
    n = 20   # feature dimension
    m = 200  # number of samples
    X_train = np.random.random((n,m))
    Y_train = np.random.randint(0,2,size=(1,m))
    X_test = np.random.random((n,10))
    Y_test = np.random.randint(0,2,size=(1,10))

    model(X_train,Y_train,X_test,Y_test,epochs=200,learning_rate=0.01,print_cost=False)
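Because the demo above trains on random labels, the reported test accuracy will hover around chance (about 50%). A more meaningful sanity check is to train on linearly separable data; in the sketch below, the labeling rule (Y = 1 when a sample's feature sum exceeds its expected value n/2) is our own arbitrary choice, and the model should reach high accuracy:

np.random.seed(1)
n, m = 20, 200
X_train = np.random.random((n, m))
Y_train = (X_train.sum(axis=0, keepdims=True) > n/2).astype(int)  # separable labels
X_test = np.random.random((n, 100))
Y_test = (X_test.sum(axis=0, keepdims=True) > n/2).astype(int)

model(X_train, Y_train, X_test, Y_test, epochs=2000, learning_rate=0.1)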

TensorFlow Version

Next, let's implement the TensorFlow version of the logistic regression model.
Here we use the MNIST dataset, treating the \(28\times28\) pixels of each image as features and softmax as the activation function (i.e., multinomial logistic regression over the 10 digit classes), so the loss function takes just one line of code:

cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred), axis=1))
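This one-liner is the batch-averaged multi-class cross-entropy; with \(\hat{y}_{ik}\) denoting the softmax output for class \(k\) of sample \(i\), it reads:
\[J = -\frac{1}{m}\sum_{i=1}^m\sum_{k=1}^{10}{y_{ik}\log{\hat{y}_{ik}}}\]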

Below is the full construction process:

import tensorflow as tf

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('/tmp/data',one_hot=True)

learning_rate = 0.1
training_steps = 100
display_steps = training_steps//10
batch_size = 64

X = tf.placeholder(tf.float32,[None,28*28])  # flattened image pixels
y = tf.placeholder(tf.float32,[None,10])     # one-hot labels

W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))

pred = tf.nn.softmax(tf.add(tf.matmul(X,W),b))  # class probabilities, shape (batch,10)

cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred), axis=1))  # mean cross-entropy

optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    for epoch in range(training_steps):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)
        for i in range(total_batch):
            batch_x,batch_y = mnist.train.next_batch(batch_size)
            _,c = sess.run([optimizer,cost],feed_dict={X:batch_x,y:batch_y})
            avg_cost += c/total_batch  # running average cost over the epoch
        if (epoch+1) % display_steps == 0:
            print('epoch:%i, cost:%9f'%(epoch+1,avg_cost))
    print('Finished')
    correct_prediction = tf.equal(tf.argmax(pred,1),tf.argmax(y,1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
    print('acc:',sess.run(accuracy,feed_dict={X: mnist.test.images[:3000], y: mnist.test.labels[:3000]}))
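One caveat: this script uses the TensorFlow 1.x graph API (tf.placeholder, tf.Session) and the tensorflow.examples.tutorials module, which was removed in TensorFlow 2.x. Under a TF 2 installation, the graph-mode parts can usually be run through the compatibility layer, as in the sketch below (whether the tutorials MNIST loader is available still depends on your build, so you may need to substitute another data source):

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()  # restore graph mode so placeholder/Session code works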