Week 1: Research
- Abhijit Baruah
- Jun 1, 2022
- 2 min read
Updated: Jun 16, 2022
I spent my first week installing ML-Agents and all of its required dependencies. Then I had to think about my environment, the list of observations, the actions the agent could take, and my reward system.
An observation is a quantifiable parameter that the RL model knows about; the model adjusts its behaviour based on the state of these variables and the reward obtained.
All of the observations are floats or integers that make up the model's observation vector, i.e. a vector of all the values the RL model will observe and generate actions from.
Actions are numbers that the model generates during training. A continuous action can take any value between -1 and 1. A discrete action is an integer with a fixed set of values; if a discrete action has a size of two, the model will generate either 0 or 1 for that action.
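To make the difference concrete, here is a minimal sketch (plain Python, not the ML-Agents API) of the kinds of values a policy can emit for each action type:

```python
import random

def sample_continuous_action():
    """A continuous action: any float in [-1, 1]."""
    return random.uniform(-1.0, 1.0)

def sample_discrete_action(size):
    """A discrete action of the given size: an integer in {0, ..., size - 1}."""
    return random.randrange(size)

move_x = sample_continuous_action()       # any float between -1 and 1
move_or_stay = sample_discrete_action(2)  # either 0 or 1

assert -1.0 <= move_x <= 1.0
assert move_or_stay in (0, 1)
```

During training the real values come from the neural network's policy rather than a uniform random draw, but the ranges are the same.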

The above system is still an untested and preliminary one.
As mentioned before, continuous actions correspond to values between -1 and 1 that the neural network generates; two such actions would enable the agent to move in any direction on the X-Z plane.
In order to give the agent the option of not moving when its health is above a certain threshold, I decided to add a discrete action of size 2. This action generates an integer, 0 or 1, which lets the agent decide whether it should move or not.
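This combination could look something like the sketch below, where the discrete action gates the two continuous ones. The function name and the speed parameter are my own placeholders, not anything from ML-Agents:

```python
def apply_actions(position, move_flag, dx, dz, speed=1.0):
    """Move on the X-Z plane only when the discrete action says to.

    position:  (x, z) tuple for the agent's current location
    move_flag: discrete action of size 2 -- 0 = stay, 1 = move
    dx, dz:    continuous actions, each in [-1, 1]
    """
    if move_flag == 0:
        return position          # the agent chose not to move
    x, z = position
    return (x + dx * speed, z + dz * speed)

assert apply_actions((0.0, 0.0), 0, 1.0, 1.0) == (0.0, 0.0)    # stays put
assert apply_actions((0.0, 0.0), 1, 0.5, -0.5) == (0.5, -0.5)  # moves
```

The idea is that the reward signal should eventually teach the network when generating 0 (stay) is cheaper than burning health by moving.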
On doing more research, I found that ML-Agents requires the neural network's parameters to be specified before runtime. This means the world cannot be fully dynamic, i.e. new entities cannot simply be spawned, since the agent's observation vector must have a fixed length.
For the next week, I am going to attempt to get my agent to collect food when its health is low.
