Week 4: Discovering Fire

Abhijit Baruah
Jun 22, 2022

This week I wanted to give the agent more "actions" to perform in order to keep itself alive. I decided to go ahead with the idea of a "campfire": a place where the bot has to deposit food in order to gain health. This simulates the idea of "cooking food" before you eat it.

To accomplish this, I added a constraint to the model that I trained in week 3 (https://baruahabhijit97.wixsite.com/website/post/week3-dynamic-world): the agent can only carry one food particle at a time, i.e. each food particle must first be deposited at the campfire before the agent can go and collect more food.
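As a rough illustration of the one-food-at-a-time rule, here is a minimal sketch written in plain Python rather than the project's actual Unity code. The class name FoodCarrier and its methods are hypothetical and only show the carry/deposit logic described above.

```python
class FoodCarrier:
    """Hypothetical stand-in for the agent's carry state (not the project's code)."""

    def __init__(self):
        # The agent starts out empty-handed.
        self.carrying = False

    def try_collect(self):
        # Collecting only succeeds while the agent is not already holding food.
        if self.carrying:
            return False
        self.carrying = True
        return True

    def try_deposit(self):
        # Depositing at the campfire only succeeds if there is food to hand over.
        if not self.carrying:
            return False
        self.carrying = False
        return True
```

With this state, a second try_collect() call is rejected until try_deposit() has been called, which is what forces the agent to visit the campfire between pickups.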

The changes made to the observation space were:

1) Since the campfire is always located at a fixed position in the world, I added an additional Vector3 giving the offset between the agent and the campfire (transform.position - campfire.transform.position)

2) The direction in which the campfire is located, i.e. (campfire.transform.position - transform.position).normalized
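The two extra observations can be sketched as follows. This is an illustrative Python version, not the Unity code used in the project: the function name campfire_observations and the tuple representation of positions are assumptions, and the small-length threshold is an assumed guard that mimics Vector3.normalized returning a zero vector when the input is too small to normalize.

```python
import math


def campfire_observations(agent_pos, campfire_pos):
    """Return the six extra observation values (hypothetical helper)."""
    # Observation 1: offset vector, agent position minus campfire position.
    offset = tuple(a - c for a, c in zip(agent_pos, campfire_pos))

    # Observation 2: unit vector pointing from the agent towards the campfire.
    to_fire = tuple(c - a for a, c in zip(agent_pos, campfire_pos))
    length = math.sqrt(sum(x * x for x in to_fire))
    if length > 1e-5:  # assumed threshold for "too small to normalize"
        direction = tuple(x / length for x in to_fire)
    else:
        direction = (0.0, 0.0, 0.0)

    return offset + direction
```

For example, an agent at (3, 0, 4) with the campfire at the origin gets the offset (3, 0, 4) and a direction of length 1 pointing back towards the origin.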

The reward system also had to be changed to encourage the agent to both collect and deposit food. The changes I made were:

1) +30.0f every time the agent hits the campfire WITH food collected

2) -30/(_health + 1f) every time the agent hits the campfire without food. This penalty becomes worse (more negative) the lower the agent's health is when it goes to the campfire without food.
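The two campfire rewards above can be written as one small function. This is a sketch in Python for illustration only: the name campfire_reward and its parameters are hypothetical, and it assumes health is never negative so the denominator stays positive.

```python
def campfire_reward(has_food, health):
    """Reward for touching the campfire (hypothetical helper).

    Assumes health >= 0, so health + 1 is always at least 1.
    """
    if has_food:
        # Depositing food at the campfire earns a fixed bonus.
        return 30.0
    # Arriving empty-handed is penalised, more heavily at low health.
    return -30.0 / (health + 1.0)
```

Under these assumptions, arriving without food at health 0 costs the full -30, while arriving without food at health 9 costs only -3, so the penalty shrinks as health grows.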

Because the observation space was now more complex and the agent had to perform more actions, I decided to decrease the learning rate, increase the number of hidden layers in the Python model from 2 to 4, and increase the number of training steps to 8M.

https://www.jeremyjordan.me/nn-learning-rate/

The changes made had positive results in training and helped my agent discover fire!!

The video below shows inference (running the trained model in the game):

https://video.wixstatic.com/video/01117d_456bd79ca76242a9a4ceac938a90d98a/1080p/mp4/file.mp4

The training graph shows that training peaked at around 5M iterations.

Next week I am going to focus on making this look better and adding ways for the player to influence the world at runtime.
