TensorFlow 2.0 Design Philosophy
ref: https://blog.tensorflow.org/2019/01/what-are-symbolic-and-imperative-apis.html
Symbolic (or Declarative) APIs
Sequential APIs
```python
import tensorflow as tf

mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test, verbose=2)
```
Functional APIs
ref: https://www.tensorflow.org/guide/keras/functional#all_models_are_callable_just_like_layers
```python
from tensorflow import keras
from tensorflow.keras import layers

encoder_input = keras.Input(shape=(28, 28, 1), name='original_img')
x = layers.Conv2D(16, 3, activation='relu')(encoder_input)
x = layers.Conv2D(32, 3, activation='relu')(x)
x = layers.MaxPooling2D(3)(x)
x = layers.Conv2D(32, 3, activation='relu')(x)
x = layers.Conv2D(16, 3, activation='relu')(x)
encoder_output = layers.GlobalMaxPooling2D()(x)
encoder = keras.Model(encoder_input, encoder_output, name='encoder')
encoder.summary()

decoder_input = keras.Input(shape=(16,), name='encoded_img')
x = layers.Reshape((4, 4, 1))(decoder_input)
x = layers.Conv2DTranspose(16, 3, activation='relu')(x)
x = layers.Conv2DTranspose(32, 3, activation='relu')(x)
x = layers.UpSampling2D(3)(x)
x = layers.Conv2DTranspose(16, 3, activation='relu')(x)
decoder_output = layers.Conv2DTranspose(1, 3, activation='relu')(x)
decoder = keras.Model(decoder_input, decoder_output, name='decoder')
decoder.summary()

autoencoder_input = keras.Input(shape=(28, 28, 1), name='img')
encoded_img = encoder(autoencoder_input)
decoded_img = decoder(encoded_img)
autoencoder = keras.Model(autoencoder_input, decoded_img, name='autoencoder')
autoencoder.summary()
```
Imperative (or Model Subclassing) APIs
See the custom Layer and custom Model (subclassing) sections below for details.
Custom Layers
General steps
- If the layer uses other Layer or Sequential structures, assign them in __init__().
- Create the weight parameters in build(), and give every weight a name.
- If a weight has no name, training fails at the second epoch with: AttributeError: 'NoneType' object has no attribute 'replace'
- Write the computation logic in call().
e.g.:
```python
import tensorflow as tf
from tensorflow.keras import layers

class Linear(layers.Layer):
    def __init__(self, units=32):
        super(Linear, self).__init__()
        self.units = units

    def build(self, input_shape):
        # Give each weight a name, as noted in the steps above;
        # unnamed weights can break checkpointing during training.
        self.w = self.add_weight(name='w',
                                 shape=(input_shape[-1], self.units),
                                 initializer='random_normal',
                                 trainable=True)
        self.b = self.add_weight(name='b',
                                 shape=(self.units,),
                                 initializer='random_normal',
                                 trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b
```
Adding losses in a Layer
Layers recursively collect losses created during the forward pass
If a layer creates a loss during the forward pass, register it by calling self.add_loss().
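A minimal sketch of such a layer, following the activity-regularization example in the Keras subclassing guide (the rate value is illustrative):
```python
import tensorflow as tf
from tensorflow.keras import layers

# A layer that creates an activity regularization loss
class ActivityRegularizationLayer(layers.Layer):
    def __init__(self, rate=1e-2):
        super(ActivityRegularizationLayer, self).__init__()
        self.rate = rate

    def call(self, inputs):
        # Register a loss proportional to the activations; Keras collects
        # it into self.losses during the forward pass.
        self.add_loss(self.rate * tf.reduce_sum(inputs))
        return inputs
```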
If a layer has a regularizer, the regularization loss is added automatically:
```python
import tensorflow as tf
from tensorflow.keras import layers

class OuterLayer(layers.Layer):
    def __init__(self):
        super(OuterLayer, self).__init__()
        self.dense = layers.Dense(32, kernel_regularizer=tf.keras.regularizers.l2(1e-3))

    def call(self, inputs):
        return self.dense(inputs)

layer = OuterLayer()
_ = layer(tf.zeros((1, 1)))

# This is `1e-3 * sum(layer.dense.kernel ** 2)`,
# created by the `kernel_regularizer` above.
print(layer.losses)
# >> [<tf.Tensor: shape=(), dtype=float32, numpy=0.0018715981>]
```
These losses should be taken into account when writing a training loop, as shown below:
```python
import tensorflow as tf
from tensorflow import keras

# `model` and `train_dataset` are assumed to be defined already.
# Instantiate an optimizer.
optimizer = tf.keras.optimizers.SGD(learning_rate=1e-3)
loss_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# Iterate over the batches of a dataset.
for x_batch_train, y_batch_train in train_dataset:
    with tf.GradientTape() as tape:
        logits = model(x_batch_train)  # Logits for this minibatch
        # Loss value for this minibatch
        loss_value = loss_fn(y_batch_train, logits)
        # Add extra losses created during this forward pass:
        loss_value += sum(model.losses)

    grads = tape.gradient(loss_value, model.trainable_weights)
    optimizer.apply_gradients(zip(grads, model.trainable_weights))
```
Choosing training behavior in a Layer
Some layers, such as dropout, need to behave differently during training and inference; call() accepts a training argument for this, as in the sketch below.
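A minimal sketch, following the CustomDropout example in the Keras subclassing guide:
```python
import tensorflow as tf
from tensorflow.keras import layers

class CustomDropout(layers.Layer):
    def __init__(self, rate, **kwargs):
        super(CustomDropout, self).__init__(**kwargs)
        self.rate = rate

    def call(self, inputs, training=None):
        # Drop units only during training; pass inputs through
        # untouched at inference time.
        if training:
            return tf.nn.dropout(inputs, rate=self.rate)
        return inputs
```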
Custom Models
General steps
- If the model uses other Layer or Sequential structures, assign them in __init__().
- To have summary() show per-layer shapes before the model has been fit, you need a helper function that builds the model, like the build_graph function in the sketch below.
- summary() lists shapes in the order the layers were assigned in __init__().
For example:
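A minimal sketch of this pattern (the specific layers and shapes are illustrative assumptions; build_graph is the helper described in the steps above):
```python
# -*- coding: utf-8 -*-
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

class MyModel(keras.Model):
    def __init__(self):
        super(MyModel, self).__init__()
        # Sub-layers must be assigned in __init__ so Keras tracks them;
        # summary() will list them in this order.
        self.dense1 = layers.Dense(64, activation='relu')
        self.dense2 = layers.Dense(10, activation='softmax')

    def call(self, inputs):
        x = self.dense1(inputs)
        return self.dense2(x)

    def build_graph(self, input_shape):
        # Wrap the subclassed model in a functional Model so summary()
        # can show per-layer output shapes before fit() is called.
        x = keras.Input(shape=input_shape)
        return keras.Model(inputs=x, outputs=self.call(x))

model = MyModel()
model.build_graph((28,)).summary()
```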
About the training argument
Both custom layers and custom models define the method `call(self, inputs, training=None, mask=None)`. When training is not explicitly specified, Keras's fit() sets it to True and predict() sets it to False. But once a custom layer or model sets it explicitly, any downstream layer it calls automatically receives that explicitly specified training value, as the sketch below illustrates.
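A minimal sketch of forwarding the flag explicitly (the Block layer is a hypothetical example):
```python
import tensorflow as tf
from tensorflow.keras import layers

class Block(layers.Layer):
    def __init__(self):
        super(Block, self).__init__()
        self.dropout = layers.Dropout(0.5)

    def call(self, inputs, training=None):
        # Left as None, `training` is filled in by Keras: True under
        # fit(), False under predict(). Forwarding it explicitly pins
        # the same value for the sub-layer.
        return self.dropout(inputs, training=training)
```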
About masks
Supporting masks in a custom layer means setting self.supports_masking = True. There are still many unclear corners here, so use it with caution; passing the mask manually is recommended.
```python
from tensorflow.keras.layers import Layer

class Attention(Layer):
    def __init__(self):
        super(Attention, self).__init__()
        # If the previous layer produces a mask (typically an Embedding
        # layer with mask_zero=True), this layer must set
        # self.supports_masking = True to receive it.
        self.supports_masking = True

    def build(self, input_shape):
        '''
        :param input_shape:
        :return:
        '''
        # Define the layer's weights here.
        self.built = True

    def compute_mask(self, inputs, mask=None):
        ''' This matters: when self.supports_masking = True and the
        following layers do not need the mask, override compute_mask
        and return None.
        '''
        return None

    def call(self, inputs, mask=None):
        # A mask can be passed in manually here; if none is passed and
        # self.supports_masking = True, the mask comes from the
        # previous layer.
        c = inputs  # placeholder: the actual attention computation goes here
        return c
```
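A hypothetical usage sketch of the skeleton above, showing where the mask originates (the vocabulary and embedding sizes are illustrative):
```python
import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(None,), dtype='int32')
# mask_zero=True makes the Embedding layer emit a mask for padding ids.
x = layers.Embedding(input_dim=1000, output_dim=16, mask_zero=True)(inputs)
# Attention receives that mask because supports_masking = True; since
# its compute_mask returns None, layers after it see no mask.
x = Attention()(x)
```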
About checkpoints
An error like the following may appear:
```
WARNING: Logging before flag parsing goes to stderr.
W1008 09:57:52.766877 4594230720 util.py:244] Unresolved object in checkpoint: (root).optimizer.iter
```
Solution link: https://code5.cn/so/python-3.x/2479816
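This warning means the checkpoint contains objects (here, optimizer state) that the restore did not consume. A minimal sketch of the common fix, declaring the partial restore intentional with expect_partial() (the checkpoint path is illustrative, and whether the linked answer uses exactly this approach is an assumption):
```python
import tensorflow as tf

checkpoint = tf.train.Checkpoint(model=model)
# expect_partial() tells TensorFlow that restoring only part of the
# checkpoint (e.g. weights but not optimizer state) is intentional,
# which silences the "Unresolved object in checkpoint" warnings.
checkpoint.restore(tf.train.latest_checkpoint('./ckpts')).expect_partial()
```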