note3_value_fu_approx
YeeKal
Estimate the value function with function approximation, i.e. choose a parameter vector $w$ so that $\hat v(s,w) \approx v_\pi(s)$.
Three kinds of input/output mappings:
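A minimal sketch of three such mappings, assuming they are the usual ones: state in and state value out, state and action in and action value out, and state in with one action value per action out. The feature map `x`, the per-action weight layout `W`, and all function names are hypothetical and chosen only for illustration.

```python
N_ACTIONS = 3  # assumed number of discrete actions


def x(s):
    """Hypothetical state features for a scalar state s."""
    return [1.0, s, s * s]


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def v_hat(s, w):
    """Type 1: state in, one state value out."""
    return dot(x(s), w)


def q_hat(s, a, W):
    """Type 2: state and action in, one action value out.

    W holds one weight vector per action (an assumed layout).
    """
    return dot(x(s), W[a])


def q_hat_all(s, W):
    """Type 3: state in, one action value per action out."""
    return [dot(x(s), W[a]) for a in range(N_ACTIONS)]
```

The third form evaluates every action in one call, which is convenient when picking a greedy action; the second form is the natural choice when a single action's value is needed.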
Some function approximators (gradient-based methods need a differentiable one, such as a linear combination of features or a neural network):
- linear combinations of features
- neural network
- decision tree
- nearest neighbour
- Fourier/wavelet bases
- ...
Objective: find the parameter vector $w$ that minimises the mean-squared error between the approximate value function $\hat v(s,w)$ and the true value function $v_\pi(s)$.
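Written as a formula, this objective would typically read as follows. The name $J(w)$ and the expectation over states $S$ visited under $\pi$ are notational assumptions, not taken from the note.

$$
J(w) = \mathbb{E}_\pi\!\left[\big(v_\pi(S) - \hat v(S,w)\big)^2\right]
$$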
gradient descent:
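A sketch of the update this heading refers to, assuming a step size $\alpha$ (a symbol introduced here) and a differentiable $\hat v$. Gradient descent moves $w$ against the gradient of the mean-squared error:

$$
\Delta w = -\tfrac{1}{2}\,\alpha\,\nabla_w\,\mathbb{E}_\pi\!\left[\big(v_\pi(S) - \hat v(S,w)\big)^2\right]
= \alpha\,\mathbb{E}_\pi\!\left[\big(v_\pi(S) - \hat v(S,w)\big)\,\nabla_w \hat v(S,w)\right]
$$

Stochastic gradient descent replaces the expectation with a single sampled state:

$$
\Delta w = \alpha\,\big(v_\pi(S) - \hat v(S,w)\big)\,\nabla_w \hat v(S,w)
$$

The factor $\tfrac{1}{2}$ is a convention that cancels the 2 produced by differentiating the square.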
feature vector
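A minimal sketch of how a feature vector can be used, assuming a linear approximator $\hat v(s,w) = x(s)^\top w$, whose gradient with respect to $w$ is simply $x(s)$. The feature map `x`, the target function `v_true` (a made-up stand-in for $v_\pi$), the step size, and the state set are all hypothetical.

```python
import random


def x(s):
    """Hypothetical feature vector for a scalar state s: bias plus the state."""
    return [1.0, s]


def v_hat(s, w):
    """Linear value estimate: inner product of features and weights."""
    return sum(xi * wi for xi, wi in zip(x(s), w))


def sgd_step(w, s, target, alpha):
    """One stochastic gradient step on the squared error for state s.

    For a linear approximator the gradient of v_hat w.r.t. w is x(s),
    so the update is alpha * (target - v_hat(s, w)) * x(s).
    """
    error = target - v_hat(s, w)
    return [wi + alpha * error * xi for wi, xi in zip(w, x(s))]


def v_true(s):
    """Made-up stand-in for v_pi, chosen to be exactly linear in x(s)."""
    return 1.0 + 2.0 * s


rng = random.Random(0)  # fixed seed so the run is reproducible
w = [0.0, 0.0]
for _ in range(5000):
    s = rng.choice([0.0, 0.5, 1.0])  # assumed small state set
    w = sgd_step(w, s, v_true(s), alpha=0.1)
```

Because the stand-in target is exactly representable by these features, the weights should approach `[1.0, 2.0]`. In practice $v_\pi$ is unknown and the target has to be replaced by an estimate.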