Posted by

Posted on

September 26, 2024

Posted under

Comments

AI Behavior In Unreal Engine

Either Unreal Engine 5’s AI has become more intuitive, or I am finally spending the right amount of time with it. Recently I made a zombie first-person shooter prototype with UE 5.4.4. The natively playable prototype binaries can be found at this GitHub page (available for Linux, Mac, and Windows). But, as with much software, behavior can shift between versions in ways that make things non-intuitive.

Disclaimer: This is not to imply that what follows, the concept of Observer Aborts, is implemented differently across versions of UE. The forums suggest it might be, but I am not sure.

The story starts with the following image

If you are new to Unreal’s behavior trees, might I refer you to here, here, and here. We are interested in the BT_ZombieBehavior window, which contains the behavior tree for the zombie AI. As per the decorator on the Sequence node (third level from ROOT), the Move To task shouldn’t be executed if Target Enemy (the query key) is not set, meaning it is NULL.

This wasn’t happening in the game in practice. The zombies kept moving towards the player (chase mode) even when Target Enemy was not set. Native logging verified that this was the case. Following my friend’s noble advice, I did some forum reading and stumbled upon https://forums.unrealengine.com/t/ai-blackboard-based-condition-does-not-abort/511343.

Based upon the observation in the forum thread, consider the following image

All that remains is to understand the purpose of setting the Observer Aborts field to Self in UE 5.4.4. For that we refer to the official UE documentation page. The page mentions that setting the field to Self aborts the node itself and any subtrees running under it, which is the Move To task in this context. Without this setting, even if Target Enemy becomes NULL, the Move To task keeps executing with whatever Target Enemy was set earlier and, as long as the task has not completed, the AI won’t halt the zombie, leading to an infinite pursuit even when the player’s pawn is out of sight.

Further forum reading, for instance this, suggests that this is an optimization (caching?): the decorator condition, here “is Target Enemy set”, does not seem to be re-evaluated while the subtree is running, and it updates appropriately only if Observer Aborts is set to Self.
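To make the difference concrete, here is a minimal sketch in plain C++ (not Unreal code). It is a toy model under my own assumptions: `ChaseBranch`, `countChaseTicks`, and the two-value `ObserverAborts` enum are hypothetical names invented for illustration, and the Move To task is assumed never to finish on its own.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for the decorator's Observer Aborts field (only two modes modeled).
enum class ObserverAborts { None, Self };

// Toy model of the guarded branch: a decorator "Target Enemy is set" over a
// Move To task that never completes by itself (the player stays out of reach).
struct ChaseBranch {
    ObserverAborts observerAborts = ObserverAborts::None;
    bool moveToRunning = false;

    // Advance one tick; returns true while Move To is running.
    bool tick(bool targetEnemySet) {
        if (!moveToRunning) {
            // The condition is checked when execution tries to enter the branch.
            moveToRunning = targetEnemySet;
        } else if (observerAborts == ObserverAborts::Self && !targetEnemySet) {
            // The observed key changed while running: abort the running subtree.
            moveToRunning = false;
        }
        return moveToRunning;
    }
};

// Count the ticks spent chasing for a given per-tick history of the key.
int countChaseTicks(ObserverAborts mode, const std::vector<bool>& targetEnemySetPerTick) {
    assert(!targetEnemySetPerTick.empty());
    ChaseBranch branch;
    branch.observerAborts = mode;
    int chasing = 0;
    for (bool targetEnemySet : targetEnemySetPerTick) {
        if (branch.tick(targetEnemySet)) ++chasing;
    }
    return chasing;
}
```

Feeding it a history where the key is set for two ticks and then cleared for two, the `None` mode keeps chasing for all four ticks, while `Self` stops after two, which mirrors the behavior I saw in the game.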

Extra Credit

If you have been observing pedantically enough, you will have noticed that in the second level from ROOT the Selector node has been replaced by a Sequence. This is only a mental exercise: in this context there is no practical difference between them. When using a Selector (at the second level), control won’t pass to the Sequence with the BTTask_Halt child until the Move To task completes, unless Target Enemy is set to NULL, in which case the Move To task is aborted and control moves to the parent of the halt task.

An even simpler version of the behaviour tree, which gives the same result in this context, is like so

Since the self abort stops the execution of the Move To task, BTTask_Halt, which essentially sets the MoveTo actor to NULL, is not needed. The self abort probably does that already.

Posted by

ravimohan

Posted on

August 3, 2019

Posted under

The holographic view

Comments

1 Comment

B for Behavior, Blackboard and Brain (3)

This is the final blog-post of the series (for previous parts see here and here). As promised, here we will see the implementation of Reinforcement Learning through a Behavior Tree (in the natural Unreal Engine environment). The method is fairly general and can be used to generate the BT corresponding to any RL-based algorithm.

The k-armed bandit algorithm (which in our case turns into shooting colored boxes) can be broken down into 5 tasks.

Select a box (based on probabilities and estimates)
Rotate towards the box (for more complex cases, locomotion may be involved)
Shoot the box
Wait (for reward assessment)
Update the estimates

The tree built with this categorization of the tasks is shown in figure rlbt

The Engine also provides a powerful interface for Blueprints to interact with C++. This is a good way to unify the artistic and programming development of a game. We will show that in action here.

Consider the task BTTask_FindAppBox node. The Blueprint implementation is shown in the figure

taskfindbox

In the Engine, every BT task starts with the node Event Receive Execute AI. So we start with that and connect it to a Cast node, which yields the object of class MAIPlayerController. Once that is done, we invoke the C++ method through the node Look for App Box, setting the target pin to the cast object. The C++ method is posted here.

Similarly, the remaining three tasks (all except the Engine’s default Wait task) are implemented through this C++-Blueprint interface. Another example is BTTask_UpdateEstimates

estimates

with the corresponding C++ code posted here.

Posted by

ravimohan

Posted on

July 23, 2019

Posted under

The holographic view

Comments

2 Comments

B for Behavior, Blackboard and Brain (2)

This post is a continuation of this blog-post. Here we will understand the algorithm behind the emergent behavior, called Reinforcement Learning (RL). The situation of shooting colored boxes is equivalent to the k-armed bandit problem, which can be stated as

You are faced repeatedly with a choice among k different options, or actions. After each choice you receive a numerical reward chosen from a stationary probability distribution that depends on the action you selected. Your objective is to maximize the expected total reward over some time period, for example, over 1000 action selections, or time steps.

The reward for this game is: 100 for shooting a red box and -10 for shooting a blue box. The AI agent can access the scoreboard and take actions accordingly. Based on the rewards, the agent generates the expected value associated with an action. This value is, in a way, the memory of all the rewards obtained when that particular action was performed. In more complicated systems, the rewards may even be probability based.

For RL, it is not only important to take those actions which have a high expected value associated with them (exploiting the knowledge-base), but, once in a while, it is also necessary to pick a random action irrespective of its expected value (exploration). This makes sure that the system doesn’t get stuck in a local optimum. For instance, in the case of correlated rewards, some actions performed in a certain sequence might yield better rewards. Later in this post it is shown why exploring leads to knowledge of the globally optimal configuration!

For the problem at hand, the rewards are deterministic and static. So the RL algorithm is as follows (this algorithm can be found in Sutton’s book, section 2.4 (working link July 23, 2019))

Initialize, for $a = 1$ to $k$:
$Q(a) \leftarrow 0$
$N(a) \leftarrow 0$

Repeat forever:
$A \leftarrow \arg\max_a Q(a)$ with probability $1-\epsilon$, or a random action with probability $\epsilon$
$R \leftarrow \text{bandit}(A)$
$N(A) \leftarrow N(A) + 1$
$Q(A) \leftarrow Q(A) + \frac{1}{N(A)}\left[R - Q(A)\right]$
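The steps above can be sketched in plain, engine-independent C++. This is a minimal sketch and not the code from my repository: the class `EpsilonGreedyAgent` and the reward function `shootBox` are hypothetical names, levers 0 and 1 are assumed to be the red boxes, and ties in the greedy step are broken by lowest index rather than at random, purely to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Sample-average epsilon-greedy agent for a k-armed bandit.
class EpsilonGreedyAgent {
public:
    EpsilonGreedyAgent(std::size_t k, double epsilon, unsigned seed)
        : q_(k, 0.0), n_(k, 0), epsilon_(epsilon), rng_(seed) {
        assert(k > 0);
    }

    // Explore with probability epsilon, otherwise exploit the best estimate.
    std::size_t selectAction() {
        std::uniform_real_distribution<double> coin(0.0, 1.0);
        if (coin(rng_) < epsilon_) {
            std::uniform_int_distribution<std::size_t> any(0, q_.size() - 1);
            return any(rng_);
        }
        std::size_t best = 0;
        for (std::size_t a = 1; a < q_.size(); ++a) {
            if (q_[a] > q_[best]) best = a;  // ties go to the lowest index (simplification)
        }
        return best;
    }

    // Incremental mean: N(A) <- N(A) + 1, then Q(A) <- Q(A) + (R - Q(A)) / N(A).
    void update(std::size_t action, double reward) {
        n_[action] += 1;
        q_[action] += (reward - q_[action]) / static_cast<double>(n_[action]);
    }

    double estimate(std::size_t action) const { return q_[action]; }
    int count(std::size_t action) const { return n_[action]; }

private:
    std::vector<double> q_;  // Q(a): estimated value of each lever
    std::vector<int> n_;     // N(a): number of times each lever was pulled
    double epsilon_;
    std::mt19937 rng_;
};

// Hypothetical reward function: levers 0 and 1 are red (100), the rest blue (-10).
double shootBox(std::size_t lever) { return lever < 2 ? 100.0 : -10.0; }
```

One step of the loop is then `auto a = agent.selectAction(); agent.update(a, shootBox(a));`. Because the rewards here are deterministic, the estimate of any lever that has been tried equals its reward exactly.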

In Unreal Engine, the above algorithm can be easily implemented and is posted here (specifically, look at the member function UMAIBrainComponent::TickComponent). The game is played in the Unreal Editor, where the rendered arena image is

Arena

The first two (red) boxes on the left are levers 0 and 1, and the last two (blue) boxes on the right are levers 3 and 2.

When the game begins, control is passed over to the AI agent which, as per the logic flowing through the Tick function, starts shooting the boxes using the RL technique. The log of the first 22 iterations of the shootouts performed on July 23, 2019 can be found here.

Log Analysis:

We start with the 0th iteration. An action (lever) is chosen at random because all the estimates are set to 0. The reward obtained is 100 and the estimate of action 0 is updated. For the next 3 iterations, the agent chooses the action with the highest expected value, which results in the pawn firing at the first red box. But during the 5th iteration (marked with iteration number 4), the agent explores and randomly* chooses action 2 (the last blue box) in spite of its expected value being 0. As a result the reward is -10. It then goes back to exploiting the knowledge-base and obtains higher rewards. During iteration number 11 it explores again, chooses action 3 (the second-last blue box), and gets a negative reward. After that, it continues selecting the action with the highest estimate (based on the knowledge-base).

Finally, during iteration number 19, it chooses action 1 (the second red box) and gets reward 100. The estimates are updated (as can be seen from the next iteration’s maximum estimates) and the agent now has the globally highest estimates in its knowledge-base, which couldn’t have been registered if the agent weren’t exploring.
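As a quick sanity check, the update rule can be worked by hand for the rewards stated earlier (100 for red, -10 for blue). The numbers below follow from the formula alone and are not copied from the log.

```latex
\begin{aligned}
\text{first shot at lever 0:}\quad  & N(0) = 1,\quad Q(0) \leftarrow 0 + \tfrac{1}{1}\left[100 - 0\right] = 100 \\
\text{second shot at lever 0:}\quad & N(0) = 2,\quad Q(0) \leftarrow 100 + \tfrac{1}{2}\left[100 - 100\right] = 100 \\
\text{first shot at lever 2:}\quad  & N(2) = 1,\quad Q(2) \leftarrow 0 + \tfrac{1}{1}\left[-10 - 0\right] = -10
\end{aligned}
```

Since the rewards are deterministic, the estimate of a lever that has been tried stays equal to its reward, while an untried lever stays at 0. That is why the second red box, sitting at estimate 0, is never preferred by the greedy step over the first red box at 100, and only gets discovered through exploration.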

In the next blog-post of this series, we will see the visual implementation of this logic (natural to UE codebase).

* Ignore the 1 – epsilon text typo in the log text.

Posted by

ravimohan

Posted on

July 20, 2019

Posted under

The holographic view

Comments

3 Comments

B for Behavior, Blackboard and Brain (1)

Blackboard is a very efficient tool used by an AI agent for generating appropriate behavior. In this blog-post I shall demonstrate Unreal Engine code, in action, that uses the BlackBoard class.

First we initialize the BlackBoard object in the following fashion with line 8


void AMAIController::SetGameRep(AMAIGameState *GR)
{
    MAIGameRep = GR;

    // setup blackboard component (only if the game state does not have one yet)
    if(!MAIGameRep->GetBlackBoard())
    {
        UBlackboardComponent* BB = NewObject<UBlackboardComponent>(this, TEXT("BlackboardComponent"));
        if(BB)
        {
            BB->RegisterComponent();
            MAIGameRep->SetBlackBoard(BB);

            UBlackboardData* BlackboardAsset = NewObject<UBlackboardData>(BB);
            BlackboardAsset->UpdatePersistentKey<UBlackboardKeyType_Float>(FName("PlayerScore")); // float key, read back with GetValueAsFloat
            BB->InitializeBlackboard(*BlackboardAsset);
        }
    }

}

Then we register the component in line 11 and create the UBlackboardData in line 14. Next we add the key PlayerScore to the UBlackboardData and finally initialize the BlackBoard with the asset. This is the code equivalent of the editor state shown below

bbeq

The BlackBoard keys can be accessed by the code


GetGameRep()->GetBlackBoard()->GetValueAsFloat(FName("PlayerScore"));

The idea behind the BlackBoard is to provide a scratch workspace holding the data required for decision making. Furthermore, it can send notifications once data has been updated, and it caches calculations, saving redundant CPU processing time and power.

BlackBoard can be shared among several AI instances. It basically provides a common ground for all the AI instances to operate in a collaborative manner. This gives rise to new collective behavior, leading to more realistic gameplay and recreation.

Next, we want to focus our attention on Reinforcement Learning algorithms, which are more or less dormant in the game AI world (I really don’t have a clue why). But the good news is that people are now gearing up to implement them in game engines (Unity seems to be the first one to include that). For more information watch this video. There have been some white papers on RL in game AI, for instance Capture The Flag: the emergence of complex cooperative agents and Implementing Reinforcement Learning in Unreal Engine 4 with Blueprints.
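The scratch-space-plus-notification idea can be illustrated outside the Engine with a small toy class. This is a sketch of the concept only, not the Engine’s UBlackboardComponent: `ToyBlackboard`, its method names, and the choice to return 0 for a missing key are all assumptions made for illustration.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy blackboard: float keys plus change notifications, shareable between agents.
class ToyBlackboard {
public:
    using Observer = std::function<void(const std::string& key, float value)>;

    // Register a callback that fires whenever the given key changes.
    void observe(const std::string& key, Observer callback) {
        assert(callback && "observer must be callable");
        observers_[key].push_back(std::move(callback));
    }

    // Store a value and notify observers, but only if the value actually changed.
    void setValueAsFloat(const std::string& key, float value) {
        auto it = values_.find(key);
        if (it != values_.end() && it->second == value) return;
        values_[key] = value;
        auto obs = observers_.find(key);
        if (obs == observers_.end()) return;
        for (const Observer& callback : obs->second) callback(key, value);
    }

    // Read a value; a missing key reads as 0 in this toy model.
    float getValueAsFloat(const std::string& key) const {
        auto it = values_.find(key);
        return it == values_.end() ? 0.0f : it->second;
    }

private:
    std::map<std::string, float> values_;
    std::map<std::string, std::vector<Observer>> observers_;
};
```

Several agents can hold a reference to the same `ToyBlackboard`: one writes `PlayerScore`, and every agent that called `observe("PlayerScore", ...)` is told about the change instead of polling for it.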

Using the BlackBoard class, I applied Reinforcement Learning on Unreal Engine AI controller which is shown in the video below

The aim was to train the AI to shoot red boxes without actually coding that behavior. This is called emergent behavior: it was not included in the compiled code but, based on the rewards, the AI learnt to shoot red boxes! More information on how I implemented RL will follow in the sequel blog-post. Stay tuned.

Posted by

ravimohan

Posted on

July 18, 2019

Posted under

The holographic view

Comments

Behavior Trees in Game Artificial Intelligence

Due to circumstances beyond my present control, my career trajectory has taken a sharp turn (quantified by a delta function). I hope Dirac would be proud of that! In order to work with the same passion and rigor, I have channeled my energy into Artificial Intelligence (AI) and parted ways with Physics. Of course it was painful and depressing, but I found that even such feelings have utility in the catharsis. This blog-post is an attempt to show just that!

The gaming industry has played a pivotal role in reshaping modern networking architecture and graphics rendering (replication, realistic rendering and ray tracing). Therefore it is not unreal to expect the industry to push the AI realm forward. This can be estimated from the sheer number of players in games like Fortnite (250 million), Halo and GTA V, among others. Any breakthrough in the field of AI can be conveniently and collectively scrutinized by millions of players, facilitated by a streamlined workflow involving developers, players and academics.

A behavior tree (BT) is an important mathematical structure which generates an appropriate series of tasks in a modular fashion. For instance, it can drive a patrolling pawn in some evil fortress. Unreal Engine (UE) is one of the very first game engines to implement BT in a very natural way (given UE’s visual scripting structure, called Blueprints). I will demonstrate the BT in action using a UE project in this blog-post.

The BT can be pictured as

behaviortree

Here black nodes represent the “composites” (a form of flow control) and pink nodes represent “tasks”. I have used two categories of composites

Selector: Executes its children from left to right. It stops traversing the subtrees once a successful branch is found.
Sequence: Executes its children from left to right. It keeps executing subtrees until an unsuccessful branch is found.

The entire BT is executed top-down in a deterministic way. A next-level implementation could involve assigning probabilities to each edge leading to a particular node, but we won’t talk about that here.

If you were to patrol, what list of tasks would you make to execute? It would probably include
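The two composites can be sketched in a few lines of plain C++. This is a simplified model under stated assumptions: a task is just a callable returning success or failure, so the “in progress” state that real Engine tasks can report is left out, and `Task`, `runSelector`, and `runSequence` are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// A task reports success (true) or failure (false); "in progress" is not modeled.
using Task = std::function<bool()>;

// Selector: run children left to right and stop at the first one that succeeds.
bool runSelector(const std::vector<Task>& children) {
    for (const Task& child : children) {
        assert(child && "empty task");
        if (child()) return true;
    }
    return false;  // every child failed
}

// Sequence: run children left to right and stop at the first one that fails.
bool runSequence(const std::vector<Task>& children) {
    for (const Task& child : children) {
        assert(child && "empty task");
        if (!child()) return false;
    }
    return true;  // every child succeeded
}
```

A patrol-style tree then reads as a Selector over two branches: a Sequence of “spot enemy” followed by “chase enemy”, and a fallback “random patrol” task that only runs when the first branch fails.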

Spotting enemy
Chasing enemy if spotted
Else perform random patrol in arbitrary directions

The next step is to divide the tasks further into single elemental entities. For instance, the task of spotting the enemy includes checking the actors in the line of sight and spinning towards the appropriate actor if one is found. Thus the hierarchy and placement of composites and tasks should be as shown in the figure above.

The BT in action during the chase is shown below.

Chase

One can clearly see the chain of executing branches of the tree. Since Chase Player is a sequence node, we can deduce that the tasks “Rotate to face BB entry” and “BTT_ChasePlayer” have been executed and that the “Move To” task is now in progress, and indeed that is what is happening in the Editor.

Next, the BT simulation of patrolling with the task “Move To” is

petrol_moveto

and “Wait” is

petrol_wait

The complete information needed to set up the project is detailed at https://docs.unrealengine.com/en-US/Engine/ArtificialIntelligence/BehaviorTrees/BehaviorTreeQuickStart/index.html. I encourage you to try it!

Finally, here is a teaser for the upcoming UE project https://github.com/ravimohan1991/MAI

Ravi Mohan

If you haven't found something strange during a day, it hasn't been much of a day. – J. A. Wheeler
