Is AI Gym's action and state data normalized?

I am trying to implement a DDPG agent to control Gym's Pendulum environment. Since I am new to Gym, I was wondering whether the state data collected via `env.step(action)` is already normalized, or whether I should normalize it manually. Also, should the action be normalized, or given in the [-2, 2] range?
`env.step(action)` returns the tuple `(observation, reward, done, info)`. If you're referring to the data in `observation`, the answer is no, it's not normalized. In accordance with the environment's observation space, it has three components: the first two (the cosine and sine of the pendulum angle) take values in [-1, 1], while the last one (the angular velocity) takes values in [-8, 8]. The `action` should be given in the [-2, 2] range, though it will additionally be clipped to this range by the environment.
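For reference, here is a minimal sketch of doing the rescaling manually, assuming the classic Gym API where `env.step` returns a 4-tuple (newer `gymnasium` releases return a 5-tuple and `env.reset` returns `(obs, info)`). The helpers `normalize_obs` and `scale_action` are hypothetical, not part of the Gym API; the bounds are read off the spaces themselves:

```python
import gym  # classic Gym API; adjust the reset/step unpacking for gymnasium
import numpy as np

env = gym.make("Pendulum-v1")  # use "Pendulum-v0" on older gym releases

# The bounds are exposed on the spaces, so no magic numbers are needed:
print(env.observation_space.low, env.observation_space.high)  # [-1 -1 -8], [1 1 8]
print(env.action_space.low, env.action_space.high)            # [-2], [2]

def normalize_obs(obs, space):
    # Rescale each observation component from [low, high] to [-1, 1].
    return 2.0 * (obs - space.low) / (space.high - space.low) - 1.0

def scale_action(a, space):
    # Map a policy output in [-1, 1] (e.g. from a tanh head) to [low, high].
    return space.low + 0.5 * (a + 1.0) * (space.high - space.low)

obs = env.reset()
for _ in range(5):
    raw_action = np.random.uniform(-1.0, 1.0, size=env.action_space.shape)
    obs, reward, done, info = env.step(scale_action(raw_action, env.action_space))
    norm_obs = normalize_obs(obs, env.observation_space)  # feed this to the DDPG networks
    if done:
        obs = env.reset()
```

A common DDPG setup is to keep the actor's output layer bounded with `tanh` and only rescale at the environment boundary, as `scale_action` does above, so the networks themselves always work in [-1, 1].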