
Green Power (GP2X) (+ source)

10/05/2012

Time to dig deep into memory management with C++.

The aim of this project was to create a memory manager that handles allocations so that the game can run on limited systems such as handheld machines. Because I had more important projects to undertake at the same time, this one was left aside for a while. In the end, the demo was created in one month and the memory manager in one week at most. That is why there is no gameplay.

Keywords

  • C++
  • Visual Studio 2010 (+ debugger & profiler)
  • Sprite Animation (Animation Manager)
  • Level Scrolling (camera follows player)
  • Memory Manager
  • Black-boxed
  • Interface to use any API (here SDL) without modifying the game engine. Very useful for multiplatform development.

User Guide

No installation required: on the PC, just launch GreenPower.exe. On the GP2X, the executable is GreenPower.gpe.
Controls: All the controls are shown in the two figures below. The Start button opens the Pause menu.

Controls for PC

Controls for GP2X


Known issues:

  • On the PC, if the console is closed while the game is running, the game crashes.
  • There are no goals, collisions or lives (the work was more about the memory manager).
  • It is not possible to create the states with the memory manager (it crashes otherwise).

Demo creation

The game design was changed several times to improve its effectiveness and reusability. Figure 2.1 shows the overall design of the game engine as a class diagram (UML 2.0).

Class diagram of the game

The main class of the engine is the GameStateManager, which holds pointers to the Visualisation, Input and Sound components. At the very beginning, these components were based on the Singleton pattern, and it worked pretty well, but one issue came up: it was not possible to put them behind an interface. I therefore changed the design and let the GameStateManager hold a pointer to each of them. In this way, I created an interface for each of these three components and, as shown on the class diagram, a GameState has a pointer to the GameStateManager so that it can reach them through a getter function. The black box is respected better than ever. For instance, if a developer wants to use the FMOD library instead of SDL, they only have to implement a class that respects the ISound interface, and the game will run with no further code modification.

However, the WorldModel component is still a Singleton, just like Play, Intro and Pause. In this way, it is possible to access them from anywhere without actually holding them.

Furthermore, each entity has an AnimationManager to manage its animations in a very simple way: I created a structure that defines the behaviour of an animation, so that once it is parameterised, all that is needed is to add the animation to the manager and let it do the job. The structure is as follows:

struct AnimationSpecs
{
    int nbFrames;              // Number of frames
    int nbSpritesRow;          // Number of sprites for a row
    int nbSpritesColumn;       // Number of sprites for a column
    int width;                 // Width of a frame
    int height;                // Height of a frame
    int delay;                 // Delay between each frame
    int indexFirstFrame;       // Which frame is the first (usually 0)
    int delayAfterFirstFrame;  // Delay between the first and the second frame
    bool hasToLoop;            // True if the animation can loop
    bool hasADirection;        // True if the animation has a direction (for the display)
};
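
As a rough illustration of how such a structure might be consumed, the sketch below computes which part of a sprite sheet to draw for a given frame and how long to wait before the next one. The FrameRect type, both function names and the row-major sheet layout are assumptions made for the example; the engine's AnimationManager may well work differently.

```cpp
struct AnimationSpecs
{
    int nbFrames;
    int nbSpritesRow;
    int nbSpritesColumn;
    int width;
    int height;
    int delay;
    int indexFirstFrame;
    int delayAfterFirstFrame;
    bool hasToLoop;
    bool hasADirection;
};

// Hypothetical rectangle type, used only by this sketch.
struct FrameRect
{
    int x, y, w, h;
};

// Source rectangle of the given frame, assuming the frames are laid out
// row by row on the sheet, nbSpritesRow frames per row.
FrameRect frameRect(const AnimationSpecs& specs, int frame)
{
    const int index  = specs.indexFirstFrame + frame;
    const int column = index % specs.nbSpritesRow;
    const int row    = index / specs.nbSpritesRow;
    return { column * specs.width, row * specs.height, specs.width, specs.height };
}

// Time to wait after showing the given frame: the first frame has its own
// delay, every other frame uses the common one.
int frameDelay(const AnimationSpecs& specs, int frame)
{
    return frame == 0 ? specs.delayAfterFirstFrame : specs.delay;
}
```

With four 32x48 frames per row, frame 5 would sit in the second row and second column, at pixel (32, 48).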

Finally, the engine uses its own Rectangle structure in place of the SDL_Rect structure I had been using from the beginning. The reason is the same as above: if we no longer want to use the SDL library, we do not have to rewrite the entire code.
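
The same reasoning drives the component interfaces described earlier. The sketch below shows the general shape of that design: a state reaches the sound component through the GameStateManager and only ever sees the ISound interface. Apart from the class names ISound, GameStateManager and GameState, everything here (method names, the RecordingSound stand-in) is hypothetical and not taken from the engine's source.

```cpp
#include <string>
#include <vector>

// Hypothetical interface: the method names are illustrative only.
class ISound
{
public:
    virtual ~ISound() {}
    virtual bool init() = 0;
    virtual void play(const std::string& name) = 0;
};

// Stand-in backend that only records what it was asked to play.
// A real backend would wrap SDL, FMOD or any other library.
class RecordingSound : public ISound
{
public:
    bool init() override { return true; }
    void play(const std::string& name) override { played.push_back(name); }

    std::vector<std::string> played;
};

// Holds a pointer to the component instead of relying on a Singleton.
class GameStateManager
{
public:
    explicit GameStateManager(ISound* sound) : m_sound(sound) {}
    ISound* getSound() const { return m_sound; }

private:
    ISound* m_sound;  // not owned by the manager in this sketch
};

// A state only knows the manager and the interface, never the backend.
class GameState
{
public:
    explicit GameState(GameStateManager* manager) : m_manager(manager) {}
    void onJump() { m_manager->getSound()->play("jump"); }

private:
    GameStateManager* m_manager;
};
```

Swapping the backend then means passing a different ISound implementation to the GameStateManager; GameState does not change.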

Profiling and Optimisation

This section is split into two sub-sections. The first discusses the optimisations and the resulting FPS gains; the second covers the memory manager and the time saved during allocations. Each test was performed on my computer (or a computer from the labs) and on the GP2X. Results should only be compared within the same test, because the same machine was not always used from one test to the next.

Optimisation

First, I played around with Visual Studio and changed some options to optimise the code. I won’t go into the programming tweaks, but since the GP2X has no hardware support for floating-point values, I only use integers. Afterwards, I tested different SDL configurations for loading the sprites. Every test performed on a computer was run in release mode.

Here are the optimisations applied from Visual Studio:

– C/C++ -> Optimization -> Whole Program Optimization: Yes (/GL).

– Linker -> Optimization -> Link Time Code Generation: Use Link Time Code Generation (/LTCG)
(not compatible with precompiled headers. Ensure the inline function).

– C/C++ -> Code Generation -> Floating Point Model: Fast (/fp:fast).

PC:

Without any optimisation: 1290 FPS (average).
With optimisation: 1295 FPS (average).

On a computer, the game already runs really fast and the options did not change that much.


GP2X:

Without any optimisation: 79 FPS (average).
With optimisation: 81 FPS (average).

On the GP2X, there is not much difference in terms of frames per second either.

One reason is that I am not using floating point, so setting the Floating Point Model to Fast does not change anything.

Then, I tried five configurations for SDL_SetVideoMode:

– SDL_SWSURFACE
– SDL_HWSURFACE
– SDL_SWSURFACE | SDL_DOUBLEBUF
– SDL_HWSURFACE | SDL_DOUBLEBUF
– A mix between SDL_SWSURFACE | SDL_DOUBLEBUF and SDL_HWSURFACE | SDL_DOUBLEBUF

PC:

SDL_SWSURFACE:  1288 FPS

SDL_HWSURFACE:  1289 FPS

SDL_SWSURFACE | SDL_DOUBLEBUF:  1292 FPS

SDL_HWSURFACE | SDL_DOUBLEBUF:  1297 FPS

Mix:  1310 FPS

The Mix consists of loading every sprite that is displayed into video memory and every sprite that is not displayed into system memory. Once again, the difference on the PC is barely noticeable, but the Mix mode is the fastest.

GP2X:

SDL_SWSURFACE:  70 FPS

SDL_HWSURFACE:  369 FPS, but the screen does not display the level properly.

SDL_SWSURFACE | SDL_DOUBLEBUF:  78 FPS

SDL_HWSURFACE | SDL_DOUBLEBUF:  79 FPS

Mix:  81 FPS

Once again the Mix mode won, and this time the difference between SDL_SWSURFACE and the Mix mode is more noticeable. These tests suggest that putting each sprite in the most appropriate memory is the best way to optimise access to this data.

Memory Manager

As we all know, a handheld machine has limited capabilities and therefore little memory to work with. It became important to manage this memory and to limit its size, accesses and partitioning.

My Memory Manager allocates a chunk of memory up front (say, 10 megabytes) and then, each time the game wants to create a new object, it has to ask the Memory Manager to do it and return a pointer. What the manager does is very simple: it applies a first-fit policy. It checks whether there is enough free space for the new object and, if so, the object is built in this space. As for deletion, the block allocated to the object is not really freed: a flag is set to true and the program continues. In my opinion, free memory is wasted memory, so it is not mandatory to free it each time (and doing so costs time).

If memory is running out, two options are available. First, the manager checks whether there is a block whose delete flag is true and which is big enough to store the new object. If so, the previous object is deleted and the new one is built in its place. If there is no such block, the second option is to enlarge the memory allocated at the start; in this case, its size is doubled. Finally, if it is still not possible to create the new object, a NULL pointer is returned and the game exits cleanly.

In conclusion, instead of frequently asking the system to allocate and free memory, there is just one big allocation at the beginning of the game and no deallocation. Just before exiting, the block is freed and so the memory of every object is released at once.
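
The policy described above can be sketched as follows. This is a minimal illustration, not the actual Memory Manager: the class name PoolSketch and its members are hypothetical, it hands out raw bytes rather than constructed objects, and the doubling step is deliberately left out because enlarging a single contiguous buffer can move it, which would invalidate pointers that were already handed out.

```cpp
#include <cstddef>
#include <vector>

class PoolSketch
{
public:
    // One big allocation, made once at start-up.
    explicit PoolSketch(std::size_t capacity)
        : m_buffer(capacity), m_used(0) {}

    // Returns a block of at least `size` bytes, or nullptr if nothing fits.
    void* allocate(std::size_t size)
    {
        if (size == 0)
            size = 1;
        size = roundUp(size);

        // 1. Use the untouched space at the end of the buffer if it is enough.
        if (m_used + size <= m_buffer.size()) {
            const Block block = { m_used, size, false };
            m_blocks.push_back(block);
            m_used += size;
            return m_buffer.data() + block.offset;
        }

        // 2. Otherwise reuse the first released block that is big enough.
        for (Block& block : m_blocks) {
            if (block.released && block.size >= size) {
                block.released = false;
                return m_buffer.data() + block.offset;
            }
        }
        return nullptr;
    }

    // Only sets the flag; nothing is given back to the system.
    bool release(void* pointer)
    {
        for (Block& block : m_blocks) {
            if (m_buffer.data() + block.offset == pointer && !block.released) {
                block.released = true;
                return true;
            }
        }
        return false;
    }

private:
    struct Block
    {
        std::size_t offset;    // position of the block inside the buffer
        std::size_t size;      // size in bytes, rounded up for alignment
        bool        released;  // true once the game no longer uses it
    };

    // Keeps every block suitably aligned for any fundamental type.
    static std::size_t roundUp(std::size_t n)
    {
        const std::size_t alignment = alignof(std::max_align_t);
        return (n + alignment - 1) / alignment * alignment;
    }

    std::vector<unsigned char> m_buffer;  // released in one go on destruction
    std::vector<Block>         m_blocks;
    std::size_t                m_used;
};
```

Note that release in this sketch scans the block list linearly; looking blocks up by address, as the real manager does with a map, avoids that scan.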

To test this algorithm, I wrote a small program that allocates a number of elements, puts them in a vector and then deletes everything. The results are shown in the two charts below; the first test was run on a PC and the second on the GP2X.
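
A test of this kind might look like the sketch below, shown here for the plain new/delete baseline only. The Element type, the function name and the use of std::chrono are assumptions made for the example; the original test program is not reproduced here and may have used the engine's own timer.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical small payload standing in for a game object.
struct Element
{
    int values[4];
};

// Allocates `count` elements with plain new, stores the pointers in a
// vector, deletes them all, and returns the elapsed time in milliseconds.
double timeNewDelete(std::size_t count)
{
    using Clock = std::chrono::steady_clock;
    const Clock::time_point start = Clock::now();

    std::vector<Element*> elements;
    elements.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
        elements.push_back(new Element());

    for (Element* element : elements)
        delete element;

    const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start;
    return elapsed.count();
}
```

Calling timeNewDelete(50000) gives the baseline figure; the same loop with the manager's own allocation and deletion calls substituted in gives the figure to compare against.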

Test allocation on PC

Test allocation on GP2X

 

Without any doubt, the memory manager saved a significant amount of time. Surprisingly, the saving is much greater on the computer than on the GP2X, because the time needed to allocate and delete 50 000 elements on the computer is far higher than the time needed by the GP2X. Powerful computers may struggle with very small objects in memory; otherwise such results are difficult to explain. In any case, the most important point is that the memory manager is a great time saver.

Last but not least, I noticed that I was losing a huge amount of time when deleting an element with the manager. The class has a std::map and a std::vector: the map stores the objects that have been built, identified by their memory address (when you want to delete an object, you pass its pointer to the memory manager, which can then find it directly by this address), and the vector stores the blocks no longer used by the game. So, when the game wanted to delete an object, the manager looked it up in the map and, if it was found, the entry was erased from the map and pushed back into the vector. The std::map::erase call appeared to be very slow: in a quick test allocating and deleting 50 000 elements, it took 7002 ms with the erase call and only 356 ms without. I therefore decided to remove it and use a flag to record whether the block in the map is still used or not.

When the program exits, it is possible to check for memory leaks by passing true to the clear function. In that case, the memory manager simply checks whether the map still contains blocks whose flag is false. The big block of memory is freed as well (so a real memory leak is not possible; the check is just a way of warning the programmer not to forget to delete every object).
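
The flag-based bookkeeping and the leak check from the two paragraphs above could be sketched like this. The class name BlockRegistry and its member functions are hypothetical; only the idea (a map keyed by address, a flag instead of an erase, a count of unflagged entries at exit) comes from the description.

```cpp
#include <cstddef>
#include <map>

class BlockRegistry
{
public:
    // Records a block handed out to the game, keyed by its address.
    void track(void* address, std::size_t size)
    {
        const Entry entry = { size, false };
        m_entries[address] = entry;
    }

    // Flags the block as no longer used instead of erasing its entry.
    // Returns false for an unknown address or a block already flagged.
    bool markDeleted(void* address)
    {
        auto it = m_entries.find(address);
        if (it == m_entries.end() || it->second.deleted)
            return false;
        it->second.deleted = true;
        return true;
    }

    // Number of blocks whose flag is still false, i.e. objects the game
    // never deleted explicitly.
    std::size_t countLeaks() const
    {
        std::size_t leaks = 0;
        for (const auto& pair : m_entries)
            if (!pair.second.deleted)
                ++leaks;
        return leaks;
    }

private:
    struct Entry
    {
        std::size_t size;
        bool        deleted;
    };

    std::map<void*, Entry> m_entries;
};
```

A clear function taking a boolean could call countLeaks and print a warning when the result is not zero, before releasing the big block.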

Once the memory manager was ready, I integrated it into my game and ran some tests to check whether it speeds up the initialisation process. The time measured is the time spent in the Play::Init function (where the whole game is loaded).

PC:

With memory manager: 27ms
Without memory manager: 29ms

On the computer, the time saving is very small, but the difference is still measurable on a powerful computer, which is a sign of its efficiency.

GP2X:

With memory manager: 751ms
Without memory manager: 752ms

Unfortunately, these tests do not demonstrate the efficiency of the memory manager. I cannot say why, since the allocations should be faster. There is probably a bottleneck somewhere in the memory manager, and finding it requires more tests. Since it was an exam, I preferred to leave it as it was submitted.

 

Conclusions

For sure, it is more challenging to work on a limited platform than on a PC, but since we are not developing a complete game, the limitations are not a big issue. However, we still had to create our own memory manager, and it was the first time I had tried to do so. The main issue I faced was finding a way to call a constructor at a specific memory address; I eventually learned about placement new, which is what I used. Apart from that, it was a great opportunity to improve my engine: I have been enhancing it since last year, and I always enjoy making it more and more efficient. I have already spoken about the GameState engine; this time the two centrepieces are the interfaces, which make the engine even more platform-independent and black-boxed, and the Memory Manager (the successor of my previous Memory Tracer). It is of course not yet perfect, and probably never will be, but it is there and it taught me some lessons about how C++ facilities such as new, delete and malloc manage memory.
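
For readers who have not met placement new, here is a minimal sketch of the technique. The Enemy type and the helper names constructAt and destroyAt are hypothetical and are not the memory manager's actual templates. The key points are that placement new runs a constructor in storage the caller already owns, and that the destructor must then be called explicitly, because delete must not be used on such an object.

```cpp
#include <new>      // placement new
#include <string>
#include <utility>  // std::forward

// Hypothetical game object used only for this example.
struct Enemy
{
    Enemy(const std::string& n, int hp) : name(n), health(hp) {}

    std::string name;
    int         health;
};

// Builds a T inside `storage`, which must be large enough and suitably
// aligned for T. No memory is allocated here.
template <typename T, typename... Args>
T* constructAt(void* storage, Args&&... args)
{
    return new (storage) T(std::forward<Args>(args)...);
}

// Runs the destructor without releasing the storage.
template <typename T>
void destroyAt(T* object)
{
    if (object)
        object->~T();
}
```

For instance, given a buffer declared as alignas(Enemy) unsigned char storage[sizeof(Enemy)], the call constructAt<Enemy>(storage, "slime", 3) returns a pointer to an Enemy living inside that buffer, and destroyAt on that pointer ends its lifetime while leaving the buffer available for reuse.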

Advanced Games Software Development (VIS 3016) Feedback

Final Module Mark: 60, Grade: B

Feedback

The demo runs fine on both platforms and generally conforms to the TRC requirements. The demo itself is a work in progress but there is some complexity for testing out the internal systems. It would have been good if you could have implemented a bit of gameplay so the performance could be more fully tested in a real game type situation. There is a large code base in place that is generally well structured, presented and commented. A memory manager has been implemented with a good use of templates to handle different type allocations. It does not appear that the memory manager has been fully implemented for the GP2X platform. I also notice that your delete macro will not use the correct delete form for an array. A timer is in place which could be used for profiling; however, there is little else supporting profiling. Also implemented is a state system, although I note this was not working correctly with the memory manager. I like the use of interfaces throughout the code base. Report layout is good with a good use of pictures in the user guide and a UML diagram in the design. There is a good discussion on alternatives and I like the approach to the singleton problem via interfaces and a common instance holder. Profiling is limited to looking at the frame rate and memory manager. The discussion on the memory manager allocation is good. The conclusion is a little short. A number of advanced implementations have been attempted and it is clear you have a good grasp of C++ and the game specific issues. Where the ICA falls down a bit is that it does not have much in the way of profiling data gathered and hence optimisations are limited to just a couple of areas. Overall though this is a good attempt.

            

Release date: May 2011
Download Green Power demo (.exe & .gpe)
Download Source Code (.zip)
