ECS and Data Structures

This post is more for beginners or hobbyists interested in learning a bit more about common data structures used in games nowadays (circa 2021) in C/C++, and talks a little bit about Entity Component Systems (ECS).

In the past, linked lists and arrays were extremely common, with extra emphasis on linked lists. For example, back when StarCraft: Brood War was made, linked lists were written ad hoc all over the game to keep track of units with shared behaviors. Arrays were often used to statically pre-allocate large pools of objects, a practice that is becoming less popular as time goes on.

However, nowadays linked lists are used rather sparingly in favor of arrays and hash tables. One other data structure, which does not quite have a ubiquitous name, also gets used very often. Personally I call it a handle table. Let us cover each of these data structures briefly. A lot of online material exists for hash tables, arrays, and linked lists, so we will just link to some other high quality content. For the handle table, though, some more explanation and example code are given below.

Linked Lists

Linked lists in C are covered all over the internet, so I fetched a couple links to some other high quality resources. Peruse them as desired!

My take: Linked lists can be really nice for implementing memory allocators and free lists, but otherwise aren’t too commonly used. In the past they were super popular. The main reason, in my opinion, is that hardware in the past was bottlenecked mostly by processor speed. Nowadays processors are blazing fast in comparison, and the bottleneck for performance is mostly memory access, meaning how well your data plays with the CPU cache. Linked list nodes tend to be scattered around memory, so walking a list can miss the cache on node after node, while an array is walked contiguously. An old but still valid talk by Scott Meyers about modern cache concerns, which I learned from and enjoyed, is a good one to reference.

Please note that for hobbyists or smaller 2D games, cache concerns are not a top priority. It’s completely viable to make a game using linked lists as your primary data structure and potentially make a great game with huge success. Just take a game like Undertale as an example. Although this game was written with GameMaker, a hobbyist could have easily implemented Undertale in their own way using linked lists without worrying much about performance.
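
To make the free list use case a bit more concrete, here is a minimal sketch of a fixed-size pool whose unused slots are threaded together as a singly linked list. The names Pool, Slot, Alloc, and Free are made up for this illustration, and the sketch assumes T is a trivial type (like int or a plain struct), so treat it as one possible shape rather than a production allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical fixed-capacity pool of T. Unused slots form a singly linked
// free list stored inside the slots themselves, so no extra memory is needed.
template <typename T, std::size_t N>
struct Pool
{
	static_assert(std::is_trivial<T>::value, "this sketch only handles trivial types");

	union Slot
	{
		T value;    // Meaningful while the slot is handed out.
		Slot* next; // Meaningful while the slot sits on the free list.
	};

	Slot slots[N];
	Slot* free_head = nullptr;

	Pool()
	{
		// Thread every slot onto the free list.
		for (std::size_t i = 0; i < N; ++i) {
			slots[i].next = free_head;
			free_head = &slots[i];
		}
	}

	// Pops a slot off the free list. Returns nullptr when the pool is exhausted.
	T* Alloc()
	{
		if (!free_head) return nullptr;
		Slot* s = free_head;
		free_head = s->next;
		return &s->value;
	}

	// Pushes a slot back onto the free list. p must have come from Alloc.
	void Free(T* p)
	{
		assert(p != nullptr);
		Slot* s = reinterpret_cast<Slot*>(p);
		s->next = free_head;
		free_head = s;
	}
};
```

Both Alloc and Free touch only the head of the list, so they run in constant time. The most recently freed slot is the next one handed out.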

Hash Tables

I highly recommend learning about hash tables in depth. The hash table is a critical and commonly used data structure. Since hash tables are an ancient technology, the internet is riddled with great resources. Let me link a few nice ones here as reference.

The main points to learn about are listed here.

  • The hash function
  • Collision resolution
  • Chained resolution, or chaining
  • Open addressing, for example linear probing

These are good search terms to plug into your favorite search engine!
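
As a rough illustration of two of those terms, the hash function and open addressing with linear probing, here is a small sketch. ProbeMap and its members are hypothetical names, the capacity is fixed, and removal is left out on purpose, since deleting from an open-addressed table needs extra care (such as tombstones) that would bloat the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fixed-capacity map from uint32_t keys to int values.
struct ProbeMap
{
	struct Slot
	{
		uint32_t key = 0;
		int value = 0;
		bool used = false;
	};

	std::vector<Slot> slots;
	int count = 0;

	explicit ProbeMap(std::size_t capacity) : slots(capacity)
	{
		assert(capacity > 0);
	}

	// The hash function: a simple multiplicative hash to scatter the keys.
	static uint32_t Hash(uint32_t key) { return key * 2654435761u; }

	// Inserts or overwrites a key. Returns false when the table is full.
	bool Insert(uint32_t key, int value)
	{
		std::size_t n = slots.size();
		std::size_t i = Hash(key) % n;
		for (std::size_t probes = 0; probes < n; ++probes) {
			Slot& s = slots[i];
			if (!s.used) {
				s.key = key;
				s.value = value;
				s.used = true;
				++count;
				return true;
			}
			if (s.key == key) {
				s.value = value;
				return true;
			}
			i = (i + 1) % n; // Collision: step to the next slot (linear probing).
		}
		return false;
	}

	// Returns a pointer to the stored value, or nullptr if the key is absent.
	const int* Find(uint32_t key) const
	{
		std::size_t n = slots.size();
		std::size_t i = Hash(key) % n;
		for (std::size_t probes = 0; probes < n; ++probes) {
			const Slot& s = slots[i];
			if (!s.used) return nullptr; // An empty slot ends the probe sequence.
			if (s.key == key) return &s.value;
			i = (i + 1) % n;
		}
		return nullptr;
	}
};
```

A chained table would instead keep a small list of entries per bucket. Real implementations also grow and rehash once the table gets too full, which this sketch skips.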

Hash tables are extremely useful. To put this into perspective, let us take the Lua programming language as an example. What data structures are available as built-in features of the language? There is only one: the hash table, or in Lua parlance a table. Hash tables are so useful it’s possible to implement all other data structures as hash tables and do a decent job while at it! Of course, this might be a bit silly, but it is interesting nonetheless.

Even today (circa 2021) hash tables are still used frequently! Learn and love them.

Handle Table

There isn’t quite an official name for this kind of data structure, and it goes by various names depending on who you ask. I’ll just go with handle table. My friend Sean calls it a SlotMap. Noel Llopis called it a Handle Manager. The idea is to take a growable array data structure, such as C++’s vector, and add a little more onto it. Specifically, there are a few features I have in mind that a typical vector lacks.

  • Constant time removal of any element
  • Unique identifier for any element with respect to lifetime
  • Constant time lookup for any element given its unique identifier

Let us write down a sketch for this data structure’s API in C++ (click here for a full example implementation).

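#include <cstdint>

// Maps a Handle to one int of user data and tracks that data's lifetime.
struct HandleTable
{
	// Opaque 64-bit identifier handed out by Alloc.
	struct Handle
	{
		uint64_t h;
	};

	Handle Alloc(int data);          // Store data and return a new handle for it.
	void Free(Handle h);             // Release h; it stops validating afterwards.

	int Lookup(Handle h);            // Return the data currently stored for h.
	void Update(Handle h, int data); // Replace the data stored for h.
	bool Validate(Handle h);         // Report whether h still refers to live data.

private:
	// ... To be implemented. An example implementation is linked below.
};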

The HandleTable implements a mapping from Handle to an integer. The integer can be changed to anything really; it does not necessarily have to be an integer. In my own code I usually do use an integer as an index, but it could also be a pointer or perhaps something else. The purpose is to map a handle to some data. The handle is a unique identifier and tracks the lifetime of the data.

Implementing an Alloc function is quite trivial; something like vector.push_back will work just fine. The complexity rises a bit when implementing Free though. In order to support a constant-time Free operation, the element to be freed can be swapped with the last element in the array, and the array can then shrink by one. However, once swapped, the Lookup function might be broken, because the element that used to be last now lives at a different index.
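
To see the swap-with-last-element idea in isolation, here is a tiny sketch. SwapRemove is a hypothetical helper name, not something from the standard library.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Removes items[index] in constant time by moving the last element into its
// place and shrinking the vector by one. The order of elements is not preserved.
template <typename T>
void SwapRemove(std::vector<T>& items, int index)
{
	assert(index >= 0 && index < (int)items.size());
	int last = (int)items.size() - 1;
	if (index != last) {
		items[index] = std::move(items[last]);
	}
	items.pop_back();
}
```

Removing index 0 from a vector holding 10, 20, 30 leaves 30, 20. Anything that remembered "30 lives at index 2" is now holding a stale index, which is exactly the problem the handle has to solve.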

In order to support constant time lookups, but also support the swap-with-last-element strategy, there must exist a mapping from handle to internal index somehow. There are many ways to implement this feature, but my personal favorite is to store two 32-bit integers inside of the Handle. The first integer would be an index, and the second integer would be a counter. The lookup function can pull the index out of the handle. This index maps to the data through an intermediary array. The counter distinguishes different lifetimes that end up reusing the same index, which is what lets Validate reject a stale handle.

Here is my example implementation in C++ of the handle table. Once implemented this table can track the lifetime of some data. Let us say the data is an index into a vector. Whenever we want to delete an element from the vector, we can perform the swap-with-last-element strategy, and then call table.Update to update the associated handle with a new index!
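
For readers who want to see the index-plus-counter idea spelled out, below is one possible minimal sketch of the API from earlier. It is not the linked implementation. The Slot struct, the m_slots and m_free members, and the choice to put the counter in the upper 32 bits are all assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct HandleTable
{
	struct Handle
	{
		uint64_t h; // Upper 32 bits: counter. Lower 32 bits: slot index.
	};

	Handle Alloc(int data)
	{
		uint32_t index;
		if (!m_free.empty()) {
			// Reuse a previously freed slot.
			index = m_free.back();
			m_free.pop_back();
		} else {
			// No free slot available, so grow the intermediary array.
			index = (uint32_t)m_slots.size();
			m_slots.push_back(Slot{0, 0});
		}
		m_slots[index].data = data;
		return Pack(index, m_slots[index].counter);
	}

	void Free(Handle h)
	{
		assert(Validate(h));
		uint32_t index = Index(h);
		m_slots[index].counter++; // Every outstanding copy of h is now stale.
		m_free.push_back(index);
	}

	int Lookup(Handle h)
	{
		assert(Validate(h));
		return m_slots[Index(h)].data;
	}

	void Update(Handle h, int data)
	{
		assert(Validate(h));
		m_slots[Index(h)].data = data;
	}

	// A handle is valid while its counter matches the counter in its slot.
	bool Validate(Handle h)
	{
		uint32_t index = Index(h);
		return index < m_slots.size() && m_slots[index].counter == Counter(h);
	}

private:
	struct Slot
	{
		int data;         // The user data, for example an index into another array.
		uint32_t counter; // Bumped on every Free of this slot.
	};

	static uint32_t Index(Handle h) { return (uint32_t)(h.h & 0xFFFFFFFFu); }
	static uint32_t Counter(Handle h) { return (uint32_t)(h.h >> 32); }

	static Handle Pack(uint32_t index, uint32_t counter)
	{
		Handle h;
		h.h = ((uint64_t)counter << 32) | index;
		return h;
	}

	std::vector<Slot> m_slots;    // The intermediary array; slots never move.
	std::vector<uint32_t> m_free; // Indices of slots available for reuse.
};
```

Slots in the intermediary array never move, so a lookup is a single array access. In this sketch the 32-bit counter wraps around after about four billion frees of the same slot, at which point a very old handle could validate again; whether that matters depends on the game.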

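std::vector<Entity> entities;
HandleTable table;

void DestroyEntity(Entity e)
{
	// Find where the entity currently lives in the array.
	HandleTable::Handle dead = e.handle;
	int index = table.Lookup(dead);

	// Remove the old entity with the swap-with-last-element strategy.
	entities[index] = entities.back();
	entities.pop_back();

	// Release the destroyed entity's handle so it no longer validates.
	table.Free(dead);

	// If another entity was moved into the hole, update its handle with the new index.
	if (index < (int)entities.size())
	{
		table.Update(entities[index].handle, index);
	}
}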

The HandleTable implements a unique mapping from Handle to some data (an integer) and tracks lifetime, while also reusing memory efficiently in the presence of frequent calls to table.Alloc and table.Free. Great! Combined with a typical std::vector for storing objects, the handle table can be used to implement a contiguous array of uniquely tracked and identified elements, such as a std::vector of game entities.

Tying it Together with Entity Component Systems (ECS)

I’ve actually been asked in multiple interviews at large companies (think FANG companies) how to design an object or entity tracking system or pool. The topic is interesting enough that it gets used as a technical interview question for hiring purposes at certain places!

One way to answer this question is to combine a typical growable array (like std::vector) with another layer of indirection to track lifetimes and uniquely identify elements. This is where the HandleTable can come into play, as described in the previous section.

In an ECS the entities might be a small amount of data, perhaps just a Handle and maybe some extra typing information. This handle can be used inside of a HandleTable (or some similar data structure) to uniquely identify the entity and track its lifetime. The unique identifier can also map to the run-time storage of the actual entity’s components. Exactly how this mapping is implemented is up in the air and open to debate. The simplest model I can think of, as mentioned in the previous section, would be to store the entity’s components in a vector and update the handles to them whenever elements are deleted from the vector.

More complicated schemes are possible, of course, but the key takeaway here is to use two arrays: one array to implement the lifetime tracking and unique identifiers, and another array for actual run-time storage of game objects or game components.

Here is a great example by my friend roig on GitHub. He implemented his own ECS in plain C, called Destral ECS, using a very similar technique. They call their variation of the HandleTable a Sparse Set. As far as I am aware, the concept of a Sparse Set was primarily popularized by the EnTT library. Here’s some info about the Sparse Set. My colleague Anders wrote about sparse sets in the context of the Zig programming language here.
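
For a rough feel of what a sparse set looks like, here is a generic textbook-style sketch. It is not code from Destral ECS or EnTT, and the SparseSet name and its members are assumptions made for this illustration. It also follows the two-array takeaway: a sparse array indexed by id, and a dense array that stays tightly packed.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical sparse set of small unsigned ids (for example entity indices).
struct SparseSet
{
	std::vector<uint32_t> sparse; // id -> position in dense. May hold stale values.
	std::vector<uint32_t> dense;  // Packed list of the ids currently in the set.

	// An id is in the set only if sparse and dense agree with each other.
	bool Contains(uint32_t id) const
	{
		return id < sparse.size()
			&& sparse[id] < dense.size()
			&& dense[sparse[id]] == id;
	}

	void Insert(uint32_t id)
	{
		if (Contains(id)) return;
		if (id >= sparse.size()) sparse.resize(id + 1, 0);
		sparse[id] = (uint32_t)dense.size();
		dense.push_back(id);
	}

	// Constant time removal with the swap-with-last-element strategy.
	void Remove(uint32_t id)
	{
		assert(Contains(id));
		uint32_t pos = sparse[id];
		uint32_t last = dense.back();
		dense[pos] = last;  // Move the last id into the hole.
		sparse[last] = pos; // Fix up the moved id's position.
		dense.pop_back();
	}
};
```

Iterating over dense visits only live ids with no gaps, which is the cache friendly part. A parallel array of components, kept in the same order as dense, would give contiguous component storage. Note that this bare version has no counter, so on its own it cannot tell a reused id apart from the old one.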

Here’s a nice article by Nyklas talking about data structures and ECS. He has lots of pretty pictures and a good amount of sample code. There’s also a companion article about arrays and hash tables.
