YeeKal

note3_value_fu_approx

Estimate the value function with function approximation:

Three kinds of input-to-output mappings (figure: network_kinds.png)

Some function approximators (gradient-based updates need a differentiable one, e.g. a linear combination of features or a neural network):

- linear combinations of features
- neural network
- decision tree
- nearest neighbour
- Fourier/wavelet bases
- ...

Objective: find the parameter vector $w$ that minimises the mean-squared error between the approximate value function $\hat v(s,w)$ and the true value function $v_\pi(s)$.

gradient descent:
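
The note does not show the equations for this step, so what follows is a sketch of the standard form, not the author's own derivation. The objective name $J(w)$ and the step size $\alpha$ are notation assumed here. With the mean-squared error written as

$$
J(w) = \mathbb{E}_\pi\left[\left(v_\pi(S) - \hat v(S,w)\right)^2\right]
$$

gradient descent moves $w$ against the gradient (the factor $\tfrac{1}{2}$ only cancels the 2 from differentiating the square):

$$
\Delta w = -\tfrac{1}{2}\,\alpha\,\nabla_w J(w) = \alpha\,\mathbb{E}_\pi\left[\left(v_\pi(S) - \hat v(S,w)\right)\nabla_w \hat v(S,w)\right]
$$

Stochastic gradient descent replaces the expectation with a single sampled state:

$$
\Delta w = \alpha\left(v_\pi(S) - \hat v(S,w)\right)\nabla_w \hat v(S,w)
$$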

feature vector
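
To make the role of the feature vector concrete, here is a minimal sketch of a linear approximator, $\hat v(s,w) = x(s)^\top w$, trained with stochastic gradient steps. Everything specific in it is an assumption made for illustration: the five-state chain, the two features (a bias and a normalised state index), the made-up target values, and the function names. In practice $v_\pi(s)$ is unknown and a sampled target stands in for it.

```python
# Hypothetical toy problem: 5 states whose "true" values are known,
# used only to illustrate the linear update with a feature vector.

def features(state, n_states=5):
    """Hypothetical feature vector x(s): a bias term plus the normalised state index."""
    return [1.0, state / (n_states - 1)]

def v_hat(state, w):
    """Linear approximation: v_hat(s, w) = x(s)^T w."""
    return sum(xi * wi for xi, wi in zip(features(state), w))

def sgd_step(state, target, w, alpha=0.1):
    """One stochastic gradient step.

    For a linear model the gradient of v_hat with respect to w is x(s),
    so the update is w <- w + alpha * (target - v_hat) * x(s).
    """
    x = features(state)
    error = target - v_hat(state, w)
    return [wi + alpha * error * xi for wi, xi in zip(w, x)]

def fit(true_values, sweeps=2000, alpha=0.1):
    """Sweep over all states repeatedly, applying one SGD step per state."""
    w = [0.0, 0.0]
    for _ in range(sweeps):
        for state, value in enumerate(true_values):
            w = sgd_step(state, value, w, alpha)
    return w

# Made-up targets that happen to be exactly linear in the state index.
true_values = [0.0, 0.25, 0.5, 0.75, 1.0]
w = fit(true_values)
```

Because these targets are exactly representable by the two features, the weights settle near a bias of 0 and a slope of 1. With targets that the features cannot represent exactly, the same loop would only find the best linear fit.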