Of the 400-plus algorithms proposed in papers published at major AI conferences over the past few years, only 6% released their code, roughly one third of the authors shared test data, and about 54% of those shares included only "pseudocode". This was the sobering finding of a report at this year's AAAI conference. The booming field of artificial intelligence is facing a reproducibility crisis, much like the one that has plagued psychology, medicine, and other fields over the past decade. The most fundamental problem is that researchers usually do not share their source code.
Verifiable knowledge is the foundation of science, and it is a matter of understanding. As the field of artificial intelligence develops, breaking out of irreproducibility will be necessary. To that end, PaperWeekly has joined forces with Baidu PaddlePaddle to launch this rewarded paper-reproduction effort. We hope researchers from academia and industry will carry it forward together and bring a virtuous cycle to the AI industry.
This project reproduces the InfoGAN model with PaddlePaddle. Thanks go to 黄涛 of the School of Mathematics, Sun Yat-sen University, for the reproduction work and the open-source contribution.
Download and installation commands
## CPU installation command
pip install -f https://paddlepaddle.org.cn/pip/oschina/cpu paddlepaddle
## GPU installation command
pip install -f https://paddlepaddle.org.cn/pip/oschina/gpu paddlepaddle-gpu
Dataset
This project trains on the MNIST dataset and, after training, inspects the quality of the generated samples.
Model
The authors split the random input to the generator into two parts: random noise z, and a latent code c formed by concatenating several latent variables. The code c has a known prior distribution, may be discrete or continuous, and is meant to represent different attributes of the generated data. For MNIST, for example, c contains a discrete part and a continuous part: the discrete part is a categorical variable taking values 0–9 (the digit class), and the continuous part consists of two continuous random variables (representing slant and stroke width, respectively).
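As a concrete illustration of this input layout, here is a minimal NumPy sketch of sampling one batch of generator inputs. The dimensions are assumptions for illustration (the noise length matches this project's setting; the two continuous variables are described in the text but are not part of this project's implementation):

```python
import numpy as np

batch_size, noise_dim = 4, 16
# discrete part of c: a one-hot vector over the 10 digit classes
digits = np.random.randint(0, 10, size=batch_size)
c_discrete = np.eye(10)[digits]
# continuous part of c: two variables (e.g. slant and stroke width)
c_continuous = np.random.uniform(-1.0, 1.0, size=(batch_size, 2))
# random noise z
z = np.random.uniform(-1.0, 1.0, size=(batch_size, noise_dim))
# the generator input is the concatenation [z, c]
latent = np.concatenate([z, c_discrete, c_continuous], axis=1)
print(latent.shape)  # (4, 28)
```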
To tie the latent code c to attributes of the generated data, the authors constrain c with mutual information: if c is to explain the generated data G(z, c), then c and G(z, c) should be strongly dependent, i.e., their mutual information should be large. Mutual information measures the degree of dependence between two random variables; the larger it is, the less information about c is lost when the generator produces data from c, i.e., the more of c's information the generated data retains. We therefore want the mutual information I(c; G(z, c)) to be as large as possible, and the model's objective function becomes:
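The formula image is missing here; reconstructed from the original InfoGAN paper, with V(D, G) the standard GAN minimax value function and λ a weighting hyperparameter:

```latex
\min_G \max_D \; V_I(D, G) = V(D, G) - \lambda \, I(c;\, G(z, c))
```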
However, computing the mutual information between c and G(z, c) requires the true posterior P(c|x), which is intractable. In the actual optimization, the authors therefore take a variational approach: they introduce a variational distribution Q(c|x) to approximate P(c|x), and alternately optimize a lower bound on the mutual information. The InfoGAN objective function then becomes:
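The formula image is missing here as well; reconstructed from the original InfoGAN paper, where L_I(G, Q) is the variational lower bound on the mutual information and H(c) is the (constant) entropy of the code prior:

```latex
\min_{G, Q} \max_D \; V_{\mathrm{InfoGAN}}(D, G, Q) = V(D, G) - \lambda \, L_I(G, Q),
\quad \text{where} \quad
L_I(G, Q) = \mathbb{E}_{c \sim P(c),\, x \sim G(z, c)}\!\left[\log Q(c \mid x)\right] + H(c)
\;\le\; I(c;\, G(z, c)).
```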
Basic structure of the model
Here the real data Real_data is only mixed with the generated Fake_data for the real/fake judgment, and the verdict is used to update the generator and the discriminator so that the generated data approaches the real data. The generated data not only takes part in the real/fake judgment but is also used, together with the latent code C_vector, to estimate the mutual information; that term updates the generator and the Q network, so that the generated images retain more of C_vector's information.
The basic structure of InfoGAN can therefore be decomposed as follows: the discriminator D and Q share all convolutional layers and differ only in the final fully connected layer. From another angle, the G–Q pair acts like an autoencoder: G plays the role of an encoder, Q the role of a decoder, and the generated Fake_data is the encoding of the input latent code C_vector.
The generator G takes input of shape (batch_size, noise_dim + discrete_dim + continuous_dim), where noise_dim is the dimension of the input noise, discrete_dim the dimension of the discrete latent code, and continuous_dim the dimension of the continuous latent code. G's output has shape (batch_size, channel, img_cols, img_rows).
The discriminator D takes input of shape (batch_size, channel, img_cols, img_rows) and outputs (batch_size, 1).
The Q network takes input of shape (batch_size, channel, img_cols, img_rows) and outputs (batch_size, discrete_dim + continuous_dim).
# Define and train the model, displaying generated samples during training
%matplotlib inline
# render matplotlib figures inline in the notebook
import paddle
import paddle.fluid as fluid
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import os
# batch size
mb_size = 32
# length of the noise vector
Z_dim = 16
# sample random noise z
def sample_Z(m, n):
    return np.random.uniform(-1., 1., size=[m, n])
# sample a one-hot discrete latent code c, uniform over 10 classes
def sample_c(m):
    return np.random.multinomial(1, 10*[0.1], size=m)
# generator model
# (two FC layers, due to the AI Studio environment limits)
def generator(inputs):
    G_h1 = fluid.layers.fc(input=inputs,
                           size=256,
                           act="relu",
                           param_attr=fluid.ParamAttr(name="GW1",
                               initializer=fluid.initializer.Xavier()),
                           bias_attr=fluid.ParamAttr(name="Gb1",
                               initializer=fluid.initializer.Constant()))
    G_prob = fluid.layers.fc(input=G_h1,
                             size=784,
                             act="sigmoid",
                             param_attr=fluid.ParamAttr(name="GW2",
                                 initializer=fluid.initializer.Xavier()),
                             bias_attr=fluid.ParamAttr(name="Gb2",
                                 initializer=fluid.initializer.Constant()))
    return G_prob
# discriminator model: outputs the probability that the input is real
def discriminator(x):
    D_h1 = fluid.layers.fc(input=x,
                           size=128,
                           act="relu",
                           param_attr=fluid.ParamAttr(name="DW1",
                               initializer=fluid.initializer.Xavier()),
                           bias_attr=fluid.ParamAttr(name="Db1",
                               initializer=fluid.initializer.Constant()))
    D_logit = fluid.layers.fc(input=D_h1,
                              size=1,
                              act="sigmoid",
                              param_attr=fluid.ParamAttr(name="DW2",
                                  initializer=fluid.initializer.Xavier()),
                              bias_attr=fluid.ParamAttr(name="Db2",
                                  initializer=fluid.initializer.Constant()))
    return D_logit
# Q network: the variational approximation Q(c|x) of the code posterior
def Q(x):
    Q_h1 = fluid.layers.fc(input=x,
                           size=128,
                           act="relu",
                           param_attr=fluid.ParamAttr(name="QW1",
                               initializer=fluid.initializer.Xavier()),
                           bias_attr=fluid.ParamAttr(name="Qb1",
                               initializer=fluid.initializer.Constant()))
    Q_prob = fluid.layers.fc(input=Q_h1,
                             size=10,
                             act="softmax",
                             param_attr=fluid.ParamAttr(name="QW2",
                                 initializer=fluid.initializer.Xavier()),
                             bias_attr=fluid.ParamAttr(name="Qb2",
                                 initializer=fluid.initializer.Constant()))
    return Q_prob
# G training program
G_program = fluid.Program()
with fluid.program_guard(G_program, fluid.default_startup_program()):
    Z = fluid.layers.data(name='Z', shape=[Z_dim], dtype='float32')
    c = fluid.layers.data(name='c', shape=[10], dtype='float32')
    # concatenate noise and latent code as the generator input
    inputs = fluid.layers.concat(input=[Z, c], axis=1)
    G_sample = generator(inputs)
    D_fake = discriminator(G_sample)
    # non-saturating generator loss: -log D(G(z, c))
    G_loss = 0.0 - fluid.layers.reduce_mean(fluid.layers.log(D_fake + 1e-8))
    theta_G = ["GW1", "Gb1", "GW2", "Gb2"]
    G_optimizer = fluid.optimizer.AdamOptimizer()
    G_optimizer.minimize(G_loss, parameter_list=theta_G)
# D training program
D_program = fluid.Program()
with fluid.program_guard(D_program, fluid.default_startup_program()):
    Z = fluid.layers.data(name='Z', shape=[Z_dim], dtype='float32')
    c = fluid.layers.data(name='c', shape=[10], dtype='float32')
    X = fluid.layers.data(name='X', shape=[784], dtype='float32')
    # map the reader's [-1, 1] pixel range to [0, 1] to match the sigmoid generator output
    X = X * 0.5 + 0.5
    inputs = fluid.layers.concat(input=[Z, c], axis=1)
    G_sample = generator(inputs)
    D_real = discriminator(X)
    D_fake = discriminator(G_sample)
    # standard GAN discriminator loss
    D_loss = 0.0 - fluid.layers.reduce_mean(fluid.layers.log(D_real + 1e-8)
                                            + fluid.layers.log(1.0 - D_fake + 1e-8))
    theta_D = ["DW1", "Db1", "DW2", "Db2"]
    D_optimizer = fluid.optimizer.AdamOptimizer()
    D_optimizer.minimize(D_loss, parameter_list=theta_D)
# Q training program
Q_program = fluid.Program()
with fluid.program_guard(Q_program, fluid.default_startup_program()):
    Z = fluid.layers.data(name='Z', shape=[Z_dim], dtype='float32')
    c = fluid.layers.data(name='c', shape=[10], dtype='float32')
    inputs = fluid.layers.concat(input=[Z, c], axis=1)
    G_sample = generator(inputs)
    Q_c_given_x = Q(G_sample)
    # cross-entropy between Q(c|x) and the true code c,
    # i.e. the negated variational lower bound on I(c; G(z, c))
    Q_loss = fluid.layers.reduce_mean(
        0.0 - fluid.layers.reduce_sum(
            fluid.layers.elementwise_mul(fluid.layers.log(Q_c_given_x + 1e-8), c), 1))
    # this term updates both the generator and the Q network
    theta_Q = ["GW1", "Gb1", "GW2", "Gb2",
               "QW1", "Qb1", "QW2", "Qb2"]
    Q_optimizer = fluid.optimizer.AdamOptimizer()
    Q_optimizer.minimize(Q_loss, parameter_list=theta_Q)
# inference program
Infer_program = fluid.Program()
with fluid.program_guard(Infer_program, fluid.default_startup_program()):
    Z = fluid.layers.data(name='Z', shape=[Z_dim], dtype='float32')
    c = fluid.layers.data(name='c', shape=[10], dtype='float32')
    inputs = fluid.layers.concat(input=[Z, c], axis=1)
    G_sample = generator(inputs)
# data reader: training set only
train_reader = paddle.batch(
    paddle.reader.shuffle(
        paddle.dataset.mnist.train(), buf_size=500),
    batch_size=mb_size)
#Executor
exe = fluid.Executor(fluid.CUDAPlace(0))
exe.run(program=fluid.default_startup_program())
it = 0
for _ in range(11):
    for data in train_reader():
        it += 1
        # fetch a batch of training images
        X_mb = [data[i][0] for i in range(mb_size)]
        # sample noise and a latent code
        Z_noise = sample_Z(mb_size, Z_dim)
        c_noise = sample_c(mb_size)
        feeding_withx = {"X": np.array(X_mb).astype('float32'),
                         "Z": np.array(Z_noise).astype('float32'),
                         "c": np.array(c_noise).astype('float32')}
        feeding = {"Z": np.array(Z_noise).astype('float32'),
                   "c": np.array(c_noise).astype('float32')}
        # three optimization steps per iteration: D, G, then Q
        D_loss_curr = exe.run(feed=feeding_withx, program=D_program, fetch_list=[D_loss])
        G_loss_curr = exe.run(feed=feeding, program=G_program, fetch_list=[G_loss])
        Q_loss_curr = exe.run(feed=feeding, program=Q_program, fetch_list=[Q_loss])
        if it % 1000 == 0:
            print(str(it) + ' | '
                  + str(D_loss_curr[0][0]) + ' | '
                  + str(G_loss_curr[0][0]) + ' | '
                  + str(Q_loss_curr[0][0]))
        if it % 10000 == 0:
            # display samples from the current model:
            # each row of 8 images shares one fixed digit code
            Z_noise_ = sample_Z(mb_size, Z_dim)
            idx1 = np.random.randint(0, 10)
            idx2 = np.random.randint(0, 10)
            idx3 = np.random.randint(0, 10)
            idx4 = np.random.randint(0, 10)
            c_noise_ = np.zeros([mb_size, 10])
            c_noise_[range(8), idx1] = 1.0
            c_noise_[range(8, 16), idx2] = 1.0
            c_noise_[range(16, 24), idx3] = 1.0
            c_noise_[range(24, 32), idx4] = 1.0
            feeding_ = {"Z": np.array(Z_noise_).astype('float32'),
                        "c": np.array(c_noise_).astype('float32')}
            samples = exe.run(feed=feeding_,
                              program=Infer_program,
                              fetch_list=[G_sample])
            # save a frozen inference model for later reuse
            fluid.io.save_inference_model(dirname='freeze_model', executor=exe,
                                          feeded_var_names=['Z', 'c'],
                                          target_vars=[G_sample],
                                          main_program=Infer_program)
            for i in range(32):
                ax = plt.subplot(4, 8, 1 + i)
                plt.axis('off')
                ax.set_xticklabels([])
                ax.set_yticklabels([])
                ax.set_aspect('equal')
                plt.imshow(np.reshape(samples[0][i], [28, 28]), cmap='Greys_r')
            plt.show()
it | D_loss | G_loss | Q_loss
1000 | 0.013494952 | 7.978466 | 0.0010943383
2000 | 0.030616779 | 4.97865 | 3.5265643e-05
3000 | 0.023494804 | 5.071703 | 2.2821092e-05
4000 | 0.17517455 | 4.284748 | 3.7797705e-05
5000 | 0.1888292 | 5.2216587 | 2.0867174e-05
6000 | 0.14710242 | 4.715311 | 0.0017330337
7000 | 0.15080842 | 4.5600624 | 1.3128748e-05
8000 | 0.060652077 | 5.233957 | 4.6640474e-05
9000 | 0.23902658 | 6.346929 | 0.00071203377
10000 | 0.23928824 | 4.280617 | 3.1697547e-07
11000 | 0.57027113 | 4.215673 | 6.4238243e-06
12000 | 0.08393675 | 5.377479 | 7.828099e-07
13000 | 0.3140963 | 4.2664027 | 0.00039404785
14000 | 0.6752943 | 3.544209 | 6.477377e-05
15000 | 0.36702886 | 3.6079073 | 7.1728846e-07
16000 | 0.51712215 | 2.1365876 | 9.12756e-06
17000 | 0.5277622 | 2.5976753 | 0.0029913513
18000 | 0.5808212 | 2.6313777 | 0.13058223
19000 | 0.7584077 | 2.1950169 | 1.8572642e-05
20000 | 0.6150839 | 1.896971 | 4.166835e-05
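The Q_loss used in training is the categorical cross-entropy between the one-hot code c and Q's softmax output, which (up to the constant entropy H(c)) is the negated variational lower bound on the mutual information. A small NumPy check of the same expression, using hypothetical values for Q's output:

```python
import numpy as np

# hypothetical softmax outputs of Q for a batch of 2 samples over 3 classes
q_c_given_x = np.array([[0.7, 0.2, 0.1],
                        [0.1, 0.8, 0.1]])
# the true one-hot latent codes
c = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
# same expression as Q_loss: batch mean of -sum(log(Q + 1e-8) * c)
q_loss = np.mean(-np.sum(np.log(q_c_given_x + 1e-8) * c, axis=1))
print(round(q_loss, 4))  # 0.2899, i.e. the mean of -log(0.7) and -log(0.8)
```

Only the probability assigned to the correct class contributes, so the loss drops toward zero as Q recovers c from the generated image.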
# generate random samples with the saved inference model
exe = fluid.Executor(fluid.CUDAPlace(0))
[infer_program, feed_list, fetch_list] = fluid.io.load_inference_model('freeze_model', exe)
print(feed_list)
Z_noise_ = sample_Z(mb_size, Z_dim)
idx1 = np.random.randint(0, 10)
idx2 = np.random.randint(0, 10)
idx3 = np.random.randint(0, 10)
idx4 = np.random.randint(0, 10)
c_noise_ = np.zeros([mb_size, 10])
c_noise_[range(8), idx1] = 1.0
c_noise_[range(8, 16), idx2] = 1.0
c_noise_[range(16, 24), idx3] = 1.0
c_noise_[range(24, 32), idx4] = 1.0
feeding_ = {feed_list[0]: np.array(Z_noise_).astype('float32'),
            feed_list[1]: np.array(c_noise_).astype('float32')}
# run the loaded program (not the training-scope Infer_program)
samples = exe.run(feed=feeding_,
                  program=infer_program,
                  fetch_list=fetch_list)
for i in range(32):
    ax = plt.subplot(4, 8, 1 + i)
    plt.axis('off')
    ax.set_xticklabels([])
    ax.set_yticklabels([])
    ax.set_aspect('equal')
    plt.imshow(np.reshape(samples[0][i], [28, 28]), cmap='Greys_r')
plt.show()
['Z', 'c']
Click the link to try this project hands-on in AI Studio: https://aistudio.baidu.com/aistudio/projectdetail/169476
>> Visit the PaddlePaddle website to learn more.