Custom differentiation in Python with TensorFlow (tf_diffs)
plum_blossom
Custom differentiation (a method for approximating the derivative)
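Shift x left by eps to get one point and right by eps to get another. These two points define a line, and the slope of that line approximates the derivative at x.
The smaller eps is, the closer the approximation gets to the true derivative (until floating-point round-off starts to dominate for extremely small eps).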
def f(x):
    return 3. * x ** 2 + 2. * x - 1

def approximate_derivative(f, x, eps=1e-3):
    # Central difference: slope of the line through (x - eps) and (x + eps)
    return (f(x + eps) - f(x - eps)) / (2. * eps)

print(approximate_derivative(f, 1.))
Output:
7.999999999999119
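To make the role of eps concrete, here is a minimal sketch that compares the central difference above with a one-sided (forward) difference, measuring both against the exact derivative f'(x) = 6x + 2. The names `central_derivative`, `forward_derivative` and `exact_derivative` are illustrative additions and do not appear in the original code.

```python
def f(x):
    return 3. * x ** 2 + 2. * x - 1

def central_derivative(f, x, eps=1e-3):
    # Two-sided slope, as in approximate_derivative above
    return (f(x + eps) - f(x - eps)) / (2. * eps)

def forward_derivative(f, x, eps=1e-3):
    # One-sided slope (hypothetical helper, for comparison only)
    return (f(x + eps) - f(x)) / eps

def exact_derivative(x):
    # Derivative of 3x^2 + 2x - 1 worked out by hand
    return 6. * x + 2.

errors = {}
for eps in (1e-1, 1e-2, 1e-3):
    central_err = abs(central_derivative(f, 1., eps) - exact_derivative(1.))
    forward_err = abs(forward_derivative(f, 1., eps) - exact_derivative(1.))
    errors[eps] = (central_err, forward_err)
    print(eps, central_err, forward_err)
```

For this quadratic f the forward-difference error shrinks in proportion to eps (it equals 3 * eps), while the central difference is exact apart from floating-point rounding. That is why the result above is so close to 8 even with eps=1e-3.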
Differentiating a multivariate function
def g(x1, x2):
    return (x1 + 5) * (x2 ** 2)

def approximate_gradient(g, x1, x2, eps=1e-3):
    # Hold one argument fixed and differentiate with respect to the other
    dg_x1 = approximate_derivative(lambda x: g(x, x2), x1, eps)
    dg_x2 = approximate_derivative(lambda x: g(x1, x), x2, eps)
    return dg_x1, dg_x2

print(approximate_gradient(g, 2., 3.))
Output:
(8.999999999993236, 41.999999999994486)
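The same idea extends to any number of variables by perturbing one coordinate at a time. The sketch below is one possible generalization; `approximate_gradient_nd` is a hypothetical helper name, and the analytic values come from differentiating g by hand (dg/dx1 = x2**2, dg/dx2 = 2 * (x1 + 5) * x2).

```python
def g(x1, x2):
    return (x1 + 5) * (x2 ** 2)

def approximate_gradient_nd(func, point, eps=1e-3):
    # Hypothetical helper: central difference along each coordinate in turn
    grads = []
    for i in range(len(point)):
        plus = list(point)
        minus = list(point)
        plus[i] += eps
        minus[i] -= eps
        grads.append((func(*plus) - func(*minus)) / (2. * eps))
    return grads

numeric = approximate_gradient_nd(g, [2., 3.])
# Hand-derived partial derivatives evaluated at (2, 3)
analytic = [3. ** 2, 2. * (2. + 5.) * 3.]
print(numeric, analytic)
```

The numeric values should agree with the analytic ones (9 and 42) up to floating-point rounding.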
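Differentiation in TensorFlow
import tensorflow as tf

x1 = tf.Variable(2.0)
x2 = tf.Variable(3.0)
# Operations on variables inside the context are recorded on the tape
with tf.GradientTape() as tape:
    z = g(x1, x2)

dz_x1 = tape.gradient(z, x1)
print(dz_x1)
Output: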
tf.Tensor(9.0, shape=(), dtype=float32)
However, by default a tf.GradientTape() can be used for only one gradient() call; its resources are released after that call.
try:
    dz_x2 = tape.gradient(z, x2)
except RuntimeError as ex:
    print(ex)
Output:
A non-persistent GradientTape can only be used to compute one set of gradients (or jacobians)
Fix: set persistent=True, and remember to delete the tape once you are done with it.
x1 = tf.Variable(2.0)
x2 = tf.Variable(3.0)
with tf.GradientTape(persistent=True) as tape:
    z = g(x1, x2)

dz_x1 = tape.gradient(z, x1)
dz_x2 = tape.gradient(z, x2)
print(dz_x1, dz_x2)

# A persistent tape holds its resources until it is deleted
del tape
Output:
tf.Tensor(9.0, shape=(), dtype=float32) tf.Tensor(42.0, shape=(), dtype=float32)
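Using tf.GradientTape() to compute the partial derivatives with respect to x1 and x2 at the same time
x1 = tf.Variable(2.0)
x2 = tf.Variable(3.0)
with tf.GradientTape() as tape:
    z = g(x1, x2)

# Passing a list of sources returns a list of gradients from a single call
dz_x1x2 = tape.gradient(z, [x1, x2])
print(dz_x1x2)
Output: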
[<tf.Tensor: shape=(), dtype=float32, numpy=9.0>, <tf.Tensor: shape=(), dtype=float32, numpy=42.0>]
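Partial derivatives with respect to constants
x1 = tf.constant(2.0)
x2 = tf.constant(3.0)
# Constants are not watched automatically, so nothing is recorded for them
with tf.GradientTape() as tape:
    z = g(x1, x2)

dz_x1x2 = tape.gradient(z, [x1, x2])
print(dz_x1x2)
Output: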
[None, None]
The watch function tells the tape to track a constant so that gradients with respect to it can be computed
x1 = tf.constant(2.0)
x2 = tf.constant(3.0)
with tf.GradientTape() as tape:
    tape.watch(x1)
    tape.watch(x2)
    z = g(x1, x2)

dz_x1x2 = tape.gradient(z, [x1, x2])
print(dz_x1x2)
Output:
[<tf.Tensor: shape=(), dtype=float32, numpy=9.0>, <tf.Tensor: shape=(), dtype=float32, numpy=42.0>]
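You can also differentiate two target functions with respect to one variable:
x = tf.Variable(5.0)
with tf.GradientTape() as tape:
    z1 = 3 * x
    z2 = x ** 2

# In a notebook, the value of this last expression is displayed
tape.gradient([z1, z2], x)
Output: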
<tf.Tensor: shape=(), dtype=float32, numpy=13.0>
The result 13 is the derivative of z1 with respect to x plus the derivative of z2 with respect to x: 3 + 2 * 5 = 13.
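As a quick cross-check that does not depend on TensorFlow, the sketch below reuses the central-difference approximation from the start of the article to confirm that the derivative of z1 + z2 at x = 5 is the sum of the individual derivatives. Writing `z1` and `z2` as plain Python functions is an assumption made only for this check.

```python
def approximate_derivative(f, x, eps=1e-3):
    return (f(x + eps) - f(x - eps)) / (2. * eps)

def z1(x):
    return 3 * x

def z2(x):
    return x ** 2

dz1 = approximate_derivative(z1, 5.)    # close to 3
dz2 = approximate_derivative(z2, 5.)    # close to 2 * 5 = 10
# Differentiating the sum gives the sum of the derivatives
d_sum = approximate_derivative(lambda x: z1(x) + z2(x), 5.)
print(dz1, dz2, d_sum)
```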
Computing second derivatives
x1 = tf.Variable(2.0)
x2 = tf.Variable(3.0)
with tf.GradientTape(persistent=True) as outer_tape:
    with tf.GradientTape(persistent=True) as inner_tape:
        z = g(x1, x2)
    # First derivatives are computed inside outer_tape so it can record them
    inner_grads = inner_tape.gradient(z, [x1, x2])

# Differentiate each first derivative again with respect to x1 and x2
outer_grads = [outer_tape.gradient(inner_grad, [x1, x2])
               for inner_grad in inner_grads]
print(outer_grads)

del inner_tape
del outer_tape
Output:
[[None, <tf.Tensor: shape=(), dtype=float32, numpy=6.0>], [<tf.Tensor: shape=(), dtype=float32, numpy=6.0>, <tf.Tensor: shape=(), dtype=float32, numpy=14.0>]]
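The result is a 2x2 matrix. The top-left entry is the second derivative of z with respect to x1; the top-right entry is z differentiated first with respect to x1 and then with respect to x2.
The bottom-left entry is z differentiated first with respect to x2 and then with respect to x1; the bottom-right entry is the second derivative of z with respect to x2. The top-left entry is None because dz/dx1 = x2 ** 2 does not depend on x1, so there is no gradient path to x1; mathematically that second derivative is 0.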
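The same 2x2 matrix can be approximated without TensorFlow by applying the central-difference gradient to itself. This is only a sketch: `approximate_hessian` is a hypothetical helper, and nesting finite differences amplifies rounding error, so the values are approximate.

```python
def g(x1, x2):
    return (x1 + 5) * (x2 ** 2)

def approximate_derivative(f, x, eps=1e-3):
    return (f(x + eps) - f(x - eps)) / (2. * eps)

def approximate_gradient(func, x1, x2, eps=1e-3):
    d1 = approximate_derivative(lambda x: func(x, x2), x1, eps)
    d2 = approximate_derivative(lambda x: func(x1, x), x2, eps)
    return d1, d2

def approximate_hessian(func, x1, x2, eps=1e-3):
    # Row i is the gradient of the i-th first-order partial derivative
    rows = []
    for i in range(2):
        partial_i = lambda a, b, i=i: approximate_gradient(func, a, b, eps)[i]
        rows.append(list(approximate_gradient(partial_i, x1, x2, eps)))
    return rows

hessian = approximate_hessian(g, 2., 3.)
print(hessian)
```

Up to rounding error the result is close to [[0, 6], [6, 14]]. The top-left entry comes out as a number near 0 here, where TensorFlow reports None.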
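Once you can compute derivatives yourself, you can simulate gradient descent: compute the derivative, then take a small step in the direction opposite to it.
Simulating gradient descent:
learning_rate = 0.1
x = tf.Variable(0.0)

for _ in range(100):
    with tf.GradientTape() as tape:
        z = f(x)
    dz_dx = tape.gradient(z, x)
    # Step against the gradient: x <- x - learning_rate * dz/dx
    x.assign_sub(learning_rate * dz_dx)
print(x)
Output: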
<tf.Variable 'Variable:0' shape=() dtype=float32, numpy=-0.3333333>
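To see why the loop settles at -1/3, here is a plain-Python sketch of the same update using the hand-derived derivative f'(x) = 6x + 2 in place of the tape. The names `f_prime` and `history` are illustrative additions.

```python
def f_prime(x):
    # Derivative of f(x) = 3x^2 + 2x - 1 worked out by hand
    return 6. * x + 2.

learning_rate = 0.1
x = 0.0
history = []
for _ in range(100):
    # Same rule as x.assign_sub(learning_rate * dz_dx)
    x -= learning_rate * f_prime(x)
    history.append(x)

print(history[:3], x)
```

With learning_rate = 0.1 the update simplifies to x <- 0.4 * x - 0.2, so each step multiplies the distance to -1/3 by 0.4. The minimum of f is at x = -1/3, where f'(x) = 0, which matches the float32 value printed above.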
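Gradient descent with optimizers
learning_rate = 0.1
x = tf.Variable(0.0)

# Use learning_rate; the older lr alias is deprecated
optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate)

for _ in range(100):
    with tf.GradientTape() as tape:
        z = f(x)
    dz_dx = tape.gradient(z, x)
    # apply_gradients takes a list of (gradient, variable) pairs
    optimizer.apply_gradients([(dz_dx, x)])
print(x)
Output: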
<tf.Variable 'Variable:0' shape=() dtype=float32, numpy=-0.3333333>