Nycterent: Outpost

A couple of months back I started a new project, and this time it’s not using XNA. I decided that the audience for an XBLIG project was so limited that my somewhat ambitious goals for quality, content and player experience would simply be wasted effort.

I decided to refocus my efforts on a much wider potential audience and a much more ambitious project. Enter: Nycterent: Outpost. It’s a space-based exploration and construction game. The word Nycterent roughly means “Night Hunter” which seems to fit the game given the eternal blackness of space.

More information can be found at the Nycterent project website http://www.nycterent.com


Categories: General | Gaming

0 Comments

September Progress Video

Just a small update with a video to show the recent progress on Galactic Ranger. The video mainly shows space combat, including music and sound effects. All the sound you can hear in the video is in-game; nothing was added after it was recorded. There are also some static objects (asteroids, space station) which use the same rendering method as the ships (diffuse/specular/normal/glow mapped). The asteroids use a single sphere for their collision volume and the space station uses multiple Oriented Bounding Boxes (OOBBs) so that the player can actually fly through hollow parts of the station.

So here’s the video, more soon!


Categories: Gaming | XNA | General

19 Comments

How to eliminate frame-by-frame Garbage Generation using CLR Profiler

When working with XNA on the Xbox 360, managing garbage generation and collection is extremely important for performance. I’ve already blogged on some code-level optimisations which help reduce the garbage generated, but here I’m going to show how to identify and fix allocation hot spots using the fantastically free CLR Profiler.

The way I see it, there are 4 steps.

1. Verify   2. Profile   3. Locate   4. Fix

1. Verify
First off, you need to verify that you are generating garbage each frame. You can do this easily by displaying the value of the GC’s total memory on screen using something like SpriteBatch.DrawString(). To get the total memory allocated, use:

GC.GetTotalMemory(false) / 1024

This will return the number of Kilobytes of allocated memory. So if you have 11 megabytes allocated it will return a value of 11264.

Now when you run your XNA app, this number will increase (quickly, slowly, or not at all) and then, on Windows, jump back down by a couple of megabytes. That jump means a Garbage Collection just occurred. If GC memory is increasing very slowly (4-10 bytes/second) then that’s a fairly good place to be. Ideally you don’t want that number to change at all after your level has started. In some cases, especially on new, un-optimised projects, the number may tick over so fast you can’t even read it.
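A minimal sketch of the verify step, assuming you sample the heap once per frame. The GcMonitor class and its member names are hypothetical, not something from XNA; the only framework call used is GC.GetTotalMemory. It counts every drop in the total as a collection.

```csharp
using System;

// Hypothetical helper (not part of XNA): samples the managed heap size
// once per frame and counts the drops that indicate a collection.
public sealed class GcMonitor
{
    private long lastBytes = GC.GetTotalMemory(false);

    public long CurrentKilobytes { get; private set; }
    public int CollectionsSeen { get; private set; }

    // Call once per frame, e.g. from your Update method.
    public void Sample()
    {
        Sample(GC.GetTotalMemory(false));
    }

    // Overload taking the byte count directly, so the logic can be
    // exercised without waiting for a real collection.
    public void Sample(long totalBytes)
    {
        if (totalBytes < lastBytes)
        {
            // The heap shrank since the last sample: a collection ran.
            CollectionsSeen++;
        }

        lastBytes = totalBytes;
        CurrentKilobytes = totalBytes / 1024;
    }
}
```

In an XNA game you would then draw CurrentKilobytes with something like spriteBatch.DrawString(font, text, position, Color.White). Bear in mind that converting the number to a string allocates too, so it is worth refreshing the text only a few times per second rather than every frame.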

2. Profile
Now, go and get CLR Profiler, free, from Microsoft and extract it somewhere locally. I usually run the 32-bit version of CLR Profiler even though I’m running 64-bit Windows. I figure the XNA process is 32-bit, so it makes sense. I’ve never had any issues in the past, but your mileage may vary. If you have problems, try running the 64-bit version, or run it as administrator if you’re using Vista or Windows 7.

image

Once you start it, select your XNA application executable and it will launch your app. Depending on how fast your allocations are occurring you may only need to play for a couple of seconds or a couple of minutes. Generally I like to profile around 3-4 collections before exiting.

When you exit, CLR Profiler will analyse the data collected from your app and then display a summary window.

image

From there, the first thing I look at is the Time Line. This shows allocations over time, and GCs appear as spikes made up of the types that were allocated.

image

You can safely ignore the initial section of the graph, as allocations are expected when you load resources (textures, models, initial strings etc.) and GCs during this period are not really an issue. You will need to identify the point in time where the section you want to profile begins; typically this is the part of the game where you displayed GC.GetTotalMemory() in step 1. In the above graph, it starts at approximately 13 seconds into the run.

You can use the Radio buttons at the top of the window to get more detail. Usually 20 or even 10 on both groups is a decent detail level. Use the scrollbars to slide right and down a bit to view one of the spikes at the top of the graph. In the below example, because I had so many collections the spikes were very narrow at horizontal setting 20, so I went down to 5 to make them wider and see more detail.

image

Now select and drag a section in which you can clearly see a 'step' or 'jump' up in the spike. This ‘step’ up represents a new object allocated. In the above picture I’ve selected a section between 15.183 seconds and 15.225 seconds, represented by the vertical bars.

On the right hand side, you will see a list of types and allocated sizes. When you select a small section of the graph (the 'step up' bit) it will only show allocations that occur within that selection. So in the above graph we can clearly see that over 42 milliseconds 94208 bytes of System.Single[] were allocated. 94208 bytes / 0.042 seconds is roughly 2.24 million bytes, or about 2.1MB, of garbage per SECOND, which is why there are so many collections occurring in my sample.

So, let’s find out where it’s being allocated, and fix it.

3. Locate

Right click on the highest allocated type line on the right, and choose “Show who allocated”. In the above picture, it’s the System.Single[] line. You should see something similar to:

image

Scroll all the way to the right and you’ll see your hotly allocated type:

image

Start to follow the chain to the immediate left and you’ll see the method hierarchy that ends up allocating that type. In the case above, it’s OrientedBoundingBox.Intersects. Armed with the method, and the type (System.Single[] aka float[]) we can take a quick look at the method and..

        public bool Intersects(OrientedBoundingBox oobb1, Matrix oobb1Matrix, Oriented…
        {
            seperating = Vector3.Zero;

            worldAMin = Vector3.Transform(oobb1.Min, oobb1Matrix);
            worldAMax = Vector3.Transform(oobb1.Max, oobb1Matrix);

            worldBMin = Vector3.Transform(oobb2.Min, oobb2Matrix);
            worldBMax = Vector3.Transform(oobb2.Max, oobb2Matrix);

            centerA = (worldAMax + worldAMin) * 0.5f;
            centerB = (worldBMax + worldBMin) * 0.5f;

            halfExtentA = (worldAMax - worldAMin) * 0.5f; //centerA - oobb1.Min
            halfExtentB = (worldBMax - worldBMin) * 0.5f;

            float[] overlappingExtents = new float[15];

            Vector3 centerAtoB = centerB - centerA;

It’s quickly evident that declaring a new array of 15 floats each time this method is called is the culprit, especially since this method is called potentially thousands of times per frame depending on how many active objects there are.

4. Fix

In this case it’s quite easily fixed: moving the overlappingExtents declaration from the method body to a private field on the OrientedBoundingBox class means the array is allocated only once, when the instance is created.

We can tell from logic further down in the method that all 15 values are replaced before the values are used, so we don’t have any issue with data hanging around between calls to Intersects. Running another profiling session after moving that one line of code results in a much happier graph:
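The shape of that fix can be sketched as follows. This is a simplified stand-in, not the real OrientedBoundingBox: the ExtentScratch class, the SmallestOverlap method and the placeholder maths are all hypothetical, and only the handling of the 15-element buffer mirrors the change described above.

```csharp
using System;

// Simplified stand-in for the real class; only the buffer handling matters.
public sealed class ExtentScratch
{
    // Allocated once per instance instead of once per call.
    private readonly float[] overlappingExtents = new float[15];

    // Placeholder for Intersects: writes all 15 slots, then reads them.
    public float SmallestOverlap(float scale)
    {
        // Every element is overwritten before it is read, so values left
        // over from the previous call cannot leak into this one.
        for (int i = 0; i < overlappingExtents.Length; i++)
        {
            overlappingExtents[i] = scale * (overlappingExtents.Length - i);
        }

        float smallest = overlappingExtents[0];
        for (int i = 1; i < overlappingExtents.Length; i++)
        {
            if (overlappingExtents[i] < smallest)
            {
                smallest = overlappingExtents[i];
            }
        }

        return smallest;
    }
}
```

One trade-off to keep in mind: a scratch buffer held in a field is shared by every call on that instance, so this pattern is only safe if the method is never run on two threads at once against the same object.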

image

This scenario was one that I identified and fixed during a long GC optimisation session, which resulted in the current version of my game only allocating around 3-5 bytes per second. That made my Xbox very happy. A lot came out of that big GC optimisation session, including my previously blogged ResourcePool class.

However, there are more involved scenarios relating to string usage especially that need more creative approaches to fixing, but hopefully this has explained enough to help you find where your collections are occurring and some steps to resolve them.


Categories: General | XNA

98 Comments

Change of direction

I’ve recently made a change in direction for my main personal project. The space-based 3D system that I’ve hinted at in earlier posts is more or less on the backburner for now. I did, however, find time to prototype a physics-based ship control method which I intend to integrate at a later date.

I’ve recently had an idea for a 2D oriented space game that can be applied to a couple of different platforms (Xbox, Windows, iPad). It’s a fairly straight-forward galactic conquest type of game where you set out to occupy planets/systems, generate income, build fleets and conquer other controlled planets/systems.

I’ll hold off on disclosing more details until I get the basic prototype working, as anything can change at this point :) I will say that where the platform allows, a persistent ever-changing galaxy is the goal.

I’ll be applying a lot of my previous work on planets and procedural content in order to speed up development of this project.

Some crude screens after about 10 hours of work. Try to see through the “programmer art” to the potential :) Note: The following screenshots are taken along one continuous zoom, there are no loading screens to get from the top screenshot to the bottom one. Also, these are systems and not planets, even though the picture used looks like a planet :)

Galaxy Overview

 Zoomed into lower-left section

 Zoomed within view of planet names

The zooming functionality will be especially useful on the iPad, where pinching should make it quick to zoom and move around the galaxy map.


Categories: General | XNA

19 Comments

GPU Noise and Normal generation

For the last several weeks I’ve been experimenting with GPU generation of height and normal maps in order to increase real-time performance of the patch generation. This is a bit of a progress update covering the different kinds of maps that I’m now generating and using to render the basic planet terrain.

Height Map

The height map GPU generation didn’t take long to sort out, starting with the sample at Ziggyware, which was recently hacked and is currently down so I can’t provide a link :(  I initially encountered precision issues when applying the GPU generated values to the vertex heights: the edges of patches at the same LOD came out quite different, resulting in very large gaps. I eventually worked out that I needed to set the render target surface format to SurfaceFormat.Single and then read the texture into memory as float[] after the GPU has finished rendering. I was then able to go back to using the GPU height map as the height values for the vertices and got much better precision. Not perfect, but very, very close. I can resolve the small cracks between equal LOD patches when I do seam-fixing, which is next on my todo list.


The basic concept is that I pass in the normalised spherical vertices of the patch, which the GPU Noise shader then interprets and creates noise values for on a per-pixel basis. I played with many, many, many different noise generation patterns, you could literally spend 6 months doing different combinations of noise, combiners, feedback etc without finding something that looks really good. I settled on a not-so-bad combination (at least from orbit) at the moment:

image (In HLSL)

It basically takes 3 ridged multifractal noise values at increasing radii and averages them. Then it does the same for a fractional Brownian motion (fBm) noise, using the value of c1 as a modifier to further increase the radius of the source positions. I also divide the fBm result by c1, which I’ve found gives interesting results. I also mixed in a 3 octave version of ridgedmf and fBm to give it a bit of a softer edge. Finally I apply c3^0.4 to scale down the results a bit, then subtract 0.5f to push the terrain lower, creating more sea area.
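For readers unfamiliar with the two noise types named above, here is a rough CPU-side sketch in C#. It is not the shader from this post: the base noise is a simple hash-based value noise chosen to keep the example self-contained, Ridged is a simplified ridged variant rather than a full ridged multifractal, and the radii in AverageRidged (1, 2 and 4) are made-up values.

```csharp
using System;

public static class NoiseSketch
{
    // Hash three lattice coordinates to a value in [-1, 1].
    private static float Lattice(int x, int y, int z)
    {
        unchecked
        {
            uint h = (uint)x * 374761393u + (uint)y * 668265263u + (uint)z * 2147483647u;
            h = (h ^ (h >> 13)) * 1274126177u;
            h ^= h >> 16;
            return (h & 0xFFFFFFu) / 8388607.5f - 1f;
        }
    }

    private static float Smooth(float t) { return t * t * (3f - 2f * t); }

    private static float Lerp(float a, float b, float t) { return a + (b - a) * t; }

    // Trilinearly interpolated value noise in [-1, 1].
    public static float Noise(float x, float y, float z)
    {
        int xi = (int)Math.Floor(x), yi = (int)Math.Floor(y), zi = (int)Math.Floor(z);
        float tx = Smooth(x - xi), ty = Smooth(y - yi), tz = Smooth(z - zi);

        float x00 = Lerp(Lattice(xi, yi, zi), Lattice(xi + 1, yi, zi), tx);
        float x10 = Lerp(Lattice(xi, yi + 1, zi), Lattice(xi + 1, yi + 1, zi), tx);
        float x01 = Lerp(Lattice(xi, yi, zi + 1), Lattice(xi + 1, yi, zi + 1), tx);
        float x11 = Lerp(Lattice(xi, yi + 1, zi + 1), Lattice(xi + 1, yi + 1, zi + 1), tx);

        return Lerp(Lerp(x00, x10, ty), Lerp(x01, x11, ty), tz);
    }

    // fBm: sum octaves, doubling frequency and halving amplitude each time,
    // then normalise back to [-1, 1]. octaves must be at least 1.
    public static float Fbm(float x, float y, float z, int octaves)
    {
        float sum = 0f, total = 0f, amplitude = 1f, frequency = 1f;
        for (int i = 0; i < octaves; i++)
        {
            sum += Noise(x * frequency, y * frequency, z * frequency) * amplitude;
            total += amplitude;
            amplitude *= 0.5f;
            frequency *= 2f;
        }
        return sum / total;
    }

    // Simplified ridged noise: fold each octave around zero so the zero
    // crossings become sharp ridges. Result is in [0, 1].
    public static float Ridged(float x, float y, float z, int octaves)
    {
        float sum = 0f, total = 0f, amplitude = 1f, frequency = 1f;
        for (int i = 0; i < octaves; i++)
        {
            float n = 1f - Math.Abs(Noise(x * frequency, y * frequency, z * frequency));
            sum += n * n * amplitude;
            total += amplitude;
            amplitude *= 0.5f;
            frequency *= 2f;
        }
        return sum / total;
    }

    // Average three ridged samples taken at increasing radii (hypothetical radii).
    public static float AverageRidged(float x, float y, float z, int octaves)
    {
        return (Ridged(x, y, z, octaves)
              + Ridged(x * 2f, y * 2f, z * 2f, octaves)
              + Ridged(x * 4f, y * 4f, z * 4f, octaves)) / 3f;
    }
}
```

The inputs would be positions on the unit sphere, so the same point always yields the same height no matter which patch or LOD level asks for it.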

You can see the results of the above function in the screenshots at the end of this post, but here’s the height map result of some mix-mashing of high-octave noise which proved to be too noisy at ground level (massive noise spikes at the highest LOD = bad for games):

patches-good

I really like the pattern though, it has a good distribution of complex features and flat planes. Maybe I’ll end up using it as the basis for a cloud layer or some other arbitrary noise source.

Normal Map

The normal map generation took a very long time to sort out. In the CPU method I was computing real positions for each texel required and calculating the true normal. The results were very accurate, but incredibly, mind numbingly, slow due to the massive amount of noise calls that had to be made.  My screenshots and YouTube Video in earlier posts use the CPU based method. GPU generation was the obvious answer to get some decently interactive frame rates going.

There are so many articles about various methods floating around the internet, and very few relating to spherical terrain that it makes things very difficult. The most common approach seems to be to compute slope from the height map and derive a normal using a fixed Z value. This works.. to a point, but different LOD patches give different results, so may look fine at orbit, but by the time you reach the surface the normals contribute so little to the lighting the effect is barely visible. A few experiments in the way of increasing/decreasing the static Z constant based on LOD level yielded little improvement. Also, because the generation assumed that the height map was planar, the normals at the edges of the faces didn’t match and there were obnoxious seams visible.
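For reference, the common planar approach described above looks roughly like this on the CPU. It is a sketch using System.Numerics in place of the XNA vector types; the SlopeNormal class and its parameter names are hypothetical, and z is the fixed constant that controls how strong the normals come out.

```csharp
using System;
using System.Numerics;

public static class SlopeNormal
{
    // heights: a planar height map indexed [x, y].
    // z: the fixed constant; larger values give flatter-looking normals.
    public static Vector3 At(float[,] heights, int x, int y, float z)
    {
        int width = heights.GetLength(0);
        int height = heights.GetLength(1);

        // Central differences, clamped at the map borders.
        float left = heights[Math.Max(x - 1, 0), y];
        float right = heights[Math.Min(x + 1, width - 1), y];
        float up = heights[x, Math.Max(y - 1, 0)];
        float down = heights[x, Math.Min(y + 1, height - 1)];

        // The normal leans away from the uphill direction.
        return Vector3.Normalize(new Vector3(left - right, up - down, z));
    }
}
```

The weakness is visible in the code itself: nothing in it knows about the sphere or about the world-space size of a texel, so the same height difference produces the same normal at every LOD level, and neighbouring cube faces never agree at their shared edge.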

After experimenting with various other methods I gave up for about a week. Then over the course of about 2 days I started to formulate the idea of generating positions per-texel on the GPU in the same way that I do on the CPU for patch vertices. This was made easier by the fact that the positions I pass into the height map shader lie on a unit sphere, so I would not be running into any precision issues since the normal map shader and the height map shader are operating on the same values.

All I needed to do was to pass in the normalised starting position (x,y) of the patch on the face of the unit-cube and the rotation matrix to transform the patch to the correct position (there are 6 total rotation matrices, up/down/left/right/forward/back). I also passed in the normalised width of the patch. The concept is that each texel will lie within the patch bounds on the unit cube, at which point I transform the position into the unit sphere equivalent and pass into the noise function to get the height of terrain at that point.

That probably sounds confusing. So how about some code.

Vertex shader: 
image

This calculates the location of the vertex on the unit cube using the texture coordinates already provided. Bounds(X|Y|Width|Height) are the normalised extents of the patch’s location on the planet’s cube face.

Pixel Shader:
image

Because the original positions are planar, I can easily add 1 texel unit to the x/y positions of the FacePosition and get the correct locations; the above shows this. It just gets the current position, and the positions one texel to the right, and one texel below. It also transforms the unit cube position to the correct face, then converts to the spherical equivalent.
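Since the shader listings are only available as images, here is a rough C# equivalent of the position maths, using System.Numerics in place of HLSL. The names FacePosition and ToSphere are hypothetical, and the conventions are assumptions: face coordinates run 0..1 and are remapped to -1..1, and the unrotated face sits at z = +1.

```csharp
using System.Numerics;

public static class CubeSphere
{
    // Map a texel's UV (0..1 across the patch) to its location on the cube
    // face, given the patch's normalised extents on that face.
    public static Vector2 FacePosition(Vector2 uv, float boundsX, float boundsY,
                                       float boundsWidth, float boundsHeight)
    {
        return new Vector2(boundsX + uv.X * boundsWidth,
                           boundsY + uv.Y * boundsHeight);
    }

    // Place the face position on a cube spanning -1..1, rotate it onto the
    // correct face, then project it onto the unit sphere.
    public static Vector3 ToSphere(Vector2 facePosition, Matrix4x4 faceMatrix)
    {
        var onCube = new Vector3(facePosition.X * 2f - 1f,
                                 facePosition.Y * 2f - 1f,
                                 1f);
        return Vector3.Normalize(Vector3.Transform(onCube, faceMatrix));
    }
}
```

Getting the neighbouring samples is then just a matter of adding one texel's worth of width or height to the face position before calling ToSphere, which is what makes the planar starting point convenient.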

image

Next, the 3 positions are scaled by the noise (terrain height). Scale is used to reduce the strength of the normals. Ideally the proportion between scale and the unit sphere radius would be the same as the proportion of the maximum terrain height and your planet’s radius, so that you get proportionally equivalent results. My normals are still coming out a bit strong, but that’ll be fixed later.

image

The last bit of the pixel shader computes the normal from the 3 scaled positions and packs it into the output pixel.

As a step of convenience, I decided to calculate the slope for this particular location as the dot product of the unit sphere normal and the calculated normal, and pack it into the W channel. This is convenient because I don’t have to generate a slope map in a separate step and pass it into the terrain shader as a separate texture. It wasn’t something I planned for, it just happened :)
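The normal and slope calculation can be sketched on the CPU like this, again with System.Numerics standing in for HLSL. The TexelNormal class is hypothetical, and displacing each position by (1 + height * scale) is my assumption about how the scaling is applied; the shader in the images may differ in detail.

```csharp
using System;
using System.Numerics;

public static class TexelNormal
{
    // centre, right and down are unit-sphere positions of a texel and its
    // two neighbours; the h values are their noise heights; scale damps
    // the relief. Returns the normal in XYZ and the slope in W.
    public static Vector4 Compute(Vector3 centre, Vector3 right, Vector3 down,
                                  float hCentre, float hRight, float hDown,
                                  float scale)
    {
        // Push each position out along its own direction by its height.
        Vector3 p0 = centre * (1f + hCentre * scale);
        Vector3 p1 = right * (1f + hRight * scale);
        Vector3 p2 = down * (1f + hDown * scale);

        Vector3 normal = Vector3.Normalize(Vector3.Cross(p1 - p0, p2 - p0));

        // Flip the normal if the winding made it point into the planet.
        if (Vector3.Dot(normal, centre) < 0f)
        {
            normal = -normal;
        }

        // Slope: 1 on flat ground, falling towards 0 as the surface steepens.
        float slope = Vector3.Dot(Vector3.Normalize(centre), normal);
        return new Vector4(normal, slope);
    }
}
```

Because all three points come from the same sphere mapping and the same noise function as the height map, the result does not depend on which patch or cube face the texel belongs to.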

The normal map is still not 100% accurate and there are some minor bugs. Mainly, positions which lie on the x=1 or y=1 edge of the unit cube face generate potentially invalid positions once one texel is added to FacePosition x and y. It’s my intention to pass in the correct FaceMatrix for the adjoining faces so that when the positions are x>1 or y>1, they are correctly placed on the adjoining unit cube face. Also, the normals only take into account the positions x+texel and y+texel, whereas to be most accurate they should take into account x-texel and y-texel as well, which at the moment would double the error on the edge cases where x-texel < 0 or y-texel < 0. Something to look into later.

Slope Map

I’ve already mentioned how the Slope map is being generated (W channel of the normal map). So I thought I’d give a quick rundown on what it’s used for. Basically, the Slope map indicates how steep a certain location on the terrain is. This is useful for adjusting the texture blending selection so that steep surfaces that would otherwise be showing green grass, could show brown dirt or rock instead. Just eye-candy really.

Diffuse Map

The diffuse map is currently generated each frame in the terrain shader. It takes the other maps (Normal, Height, Slope, Blend) and the result is a half-decent looking, per-pixel lit planet. It uses very basic 4-channel texture blending utilising a pre-made Blend Map:

image

So as the altitude increases, the texture blending shifts from water to sand, to grass, to rock. With the creation of the Slope Map just recently, I’ve made the slope value affect the height at which it thinks the terrain is. It is completely inaccurate but it adds a little flair for now. I’ll be doing a proper slope-based blend map later when I extend the system to the popular 16-texture blending method.
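As an illustration of the idea only, the altitude-driven blend could be computed like this instead of being looked up in a pre-made Blend Map. The TerrainBlend class, the evenly spaced transitions and the slopeInfluence parameter are all hypothetical; the real shader reads its weights from a texture.

```csharp
using System;
using System.Numerics;

public static class TerrainBlend
{
    // Returns blend weights as (X = water, Y = sand, Z = grass, W = rock).
    // altitude is normalised to 0..1; slope is 1 on flat ground and 0 on a
    // vertical face; slopeInfluence is how far steepness raises the
    // apparent altitude.
    public static Vector4 Weights(float altitude, float slope, float slopeInfluence)
    {
        // Steeper ground is treated as if it were higher up.
        float adjusted = altitude + (1f - slope) * slopeInfluence;
        float a = Math.Max(0f, Math.Min(1f, adjusted));

        // Three evenly spaced transitions: water/sand, sand/grass, grass/rock.
        float t = a * 3f;
        if (t < 1f)
        {
            return new Vector4(1f - t, t, 0f, 0f);
        }
        if (t < 2f)
        {
            return new Vector4(0f, 2f - t, t - 1f, 0f);
        }
        return new Vector4(0f, 0f, 3f - t, t - 2f);
    }
}
```

At most two weights are non-zero at any altitude and they always sum to 1, so the final colour is a straight linear mix of two neighbouring textures.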

Once I get the diffuse map shader to a point that I’m happy with, I’ll be generating it into a texture to reduce the pixel shader workload, and I’ll also be able to free up the height map memory which is currently hanging around.

Screenshots

So I took two sets of screenshots. One of a far-orbit view, and one much, much closer. The various maps are shown, as well as the result of combining them all. I’ve also included a wireframe view showing just how much work the pixel shader and noise functions do. The patch size is 16x16 vertices using 512x512 textures (height & normal). Planet radius is 6378100 game units, with 1 unit = 1 m, so it’s the radius of Earth.

Note: The images are 1600x1024 PNGs, averaging 600 KB each, and are set to open in a new window since a popup wouldn’t cut it ;)

Jump 2009-10-12 22-53-11-52 Jump 2009-10-12 22-53-15-54 Jump 2009-10-12 22-53-18-55 Jump 2009-10-12 22-53-27-81 Jump 2009-10-12 22-53-29-84 Jump 2009-10-12 22-53-09-73

Nearer to the planet. Note how bad the noise looks at this altitude. Overly strong normals are contributing to making it look bad. Tweaking to come.

Jump 2009-10-12 22-55-26-32 Jump 2009-10-12 22-55-18-22 Jump 2009-10-12 22-55-19-81 Jump 2009-10-12 22-55-22-00 Jump 2009-10-12 22-55-23-13 Jump 2009-10-12 22-55-16-42

Next on the list for me is to fix up the terrain stitching which shouldn’t be too hard. I already have a function to find the visible neighbouring patch, and edge vertex LOD adjustment is just a LOD level comparison with a quick calculation to determine how many verts need to be adjusted using midpoint displacement. After that I imagine I’ll start on some LOD patch blending, so that the ugly popping is replaced by a gradual morph from a lower LOD level into the higher one. And the 16-texture slope blending method at some point as well.


Categories: General

150 Comments

So….

It’s been a little while since I’ve made a post. I’m mainly spending my time getting the GPU generation looking good and performing as well as I need it to, then I’ll be posting some pics.

In the meantime, I’ve decided that I’m going to create an entry for Dream Build Play 2010. That means making what I have now work on the Xbox 360... and buying an Xbox 360 (yes, I’m a little slow with the uber consoles). So I bought an Xbox 360 Elite edition tonight on my way home from work, along with an XNA Creators Club licence, spent 20 minutes using preprocessor directives to strip out the Windows-specific bits in my code, and BLAM: it was running on the Xbox.

I’ve attached proof in the form of a grainy mobile phone photo of the Xbox (small green light on far left) displaying on my 24” Asus VW246H using HDMI. At 1080P my WIP code was doing around 30FPS in debug mode, roughly 240fps less than what my I7 920/GTX285 PC does with the same code.

22092009_001

As you can probably tell, I’m going to utilise the planetary system I’ve been developing within the DBP entry, some kind of (preferably) multiplayer flying/ship game which has an element of high-altitude and limited FTL concept. Considering the DBP 2010 starts accepting entries in July 2010, there is plenty of time to sort it all out :)

With a finite goal in mind development will be much more focused, which should result in quicker development because I won’t be thinking “will I need this, what about that, or that?” each time I add a piece of functionality. Weeeeeeee.


Categories: General | XNA

2 Comments

Blog up.. hmmm

So I've decided to start a blog to keep track of the things I'm working on in my spare time. Nothing big, just something that I can use to look back on and say "oh yea, that's right, I did that".

I'm currently working on a project which I put on the backburner many years ago. It’s about planet-to-planet space exploration with seamless surface->space transitions. Back when I first started I was using a C# port of Ogre3D, which is a very nice open source C++ 3D engine. These days, however, I've opted for the XNA route. Having done some experiments late last year to learn the basics, it seems more than capable of rendering planets at near-realistic scales.

The past couple of days I've mainly been setting up the foundation of the project. Using a sample off a site [insert reference link later] that I found, I've managed to get a rough sandbox working, enough that I can prototype various things. I've attached some screenshots of one of those 'various things': star field and star cluster generation. The general idea is that a high-quality star field with clusters of stars (galaxies, nebulae) is generated when entering the app for the first time, and 6 views from (0,0,0) are rendered to six textures (top, left, bottom, right, front, back) with a FOV of 45 degrees. These 6 textures are then used to construct a skybox within which everything else happens. The benefits of generating a skybox at load-time rather than pre-rendering and storing it on disk are clear: no storage space is required, and the skybox can change relative to where the user is in space (after a jump, that is; you simply won’t move far enough in a reasonable amount of time at sub-light speeds to see a difference).

I’ve attached some extremely rough drafts of the concepts I’ve been working with to get a feel for the correct generation routines. The shots below are using 450,000 textured point sprites. The shapes are largely random; as each point is added it is given the location of the last point plus a random vector in the range ([-1.5f, 1.5f],[-1.5f, 1.5f],[-1.5f, 1.5f]). Now if the entire system was generated like that, you would end up with a long worm-looking thing which isn’t necessarily that great, so I added the concept of arms. An arm essentially resets the source position to a random existing position within the system, and this happens every 15,000 particles. Also, when a new arm is started, a colour is randomly chosen from a pre-set list of 3, on which that arm’s points will be based. This results in some interesting shapes as you can see below (the one on the right looks like a scorpion, right?)
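The generation routine described above can be sketched as a random walk with periodic restarts. This is a minimal version under a few assumptions: it uses System.Numerics rather than the XNA vector types, the StarCluster class and its parameter names are hypothetical, and the per-arm colour selection is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public static class StarCluster
{
    // count: total points; pointsPerArm: how often a new arm starts;
    // step: maximum offset per axis between consecutive points.
    public static List<Vector3> Generate(int count, int pointsPerArm, float step, Random rng)
    {
        var points = new List<Vector3>(count);
        Vector3 current = Vector3.Zero;

        for (int i = 0; i < count; i++)
        {
            // Start a new arm from a random point that already exists.
            if (i > 0 && i % pointsPerArm == 0)
            {
                current = points[rng.Next(points.Count)];
            }

            // Otherwise continue the walk from the previous point.
            current += new Vector3(Offset(rng, step), Offset(rng, step), Offset(rng, step));
            points.Add(current);
        }

        return points;
    }

    // Uniform random value in [-step, step).
    private static float Offset(Random rng, float step)
    {
        return (float)(rng.NextDouble() * 2.0 - 1.0) * step;
    }
}
```

With the numbers from this post the call would be StarCluster.Generate(450000, 15000, 1.5f, new Random()), giving 30 arms. Passing a seeded Random makes a given cluster reproducible.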

shot3 shot2

This last shot is a slight extension of the previous algorithm, where each point is then linearly interpolated towards its normalised version using a constant of around 0.9998f. Essentially, it maps the points to a sphere with some tolerance, given by the difference between 0.9998f and 1f. (This is a similar approach to how the star field and planets will be generated using cube-to-sphere coordinate mapping.) The effect ended up being too dense for a star cluster of the kind you would expect to see in a star field, but it could be used for other things, like suns for example.
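That interpolation step is small enough to show in full. The sketch below uses System.Numerics, and the SphereSnap class and TowardsSphere method are hypothetical names.

```csharp
using System.Numerics;

public static class SphereSnap
{
    // Move a point most of the way towards its projection on the unit
    // sphere. amount = 1 lands exactly on the sphere; 0.9998 leaves a thin
    // shell whose thickness depends on how far out the point started.
    public static Vector3 TowardsSphere(Vector3 point, float amount)
    {
        // A zero-length vector has no direction to normalise, so leave it.
        if (point.LengthSquared() == 0f)
        {
            return point;
        }

        return Vector3.Lerp(point, Vector3.Normalize(point), amount);
    }
}
```

For example, a point at distance 10 from the origin ends up at distance 10 + (1 - 10) * 0.9998 = 1.0018, so points that wandered further out during the random walk sit slightly further off the surface.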

shot4

As you can see on all 3 images, they are quite ‘dotty’. This is due to the nature of their construction using point sprites. Also, all point sprites are the same size currently which doesn't give a good feeling of volume. I’m going to switch to using billboards for each point, and adjust the size of each point given the output of a noise function; colour will also most likely be affected to produce some solid-looking cores with some puffy, light arms hanging out.

There is one more approach, which I read about in an old dev journal post by Ysaneya of Infinity Engine fame: construct the star cluster (a nebula in his case) from a voxel grid, place some stars which emit gas particles within a few cells, then time-step forward until a reasonable volume has been generated. I think this approach is very interesting and I can’t wait to see what the result is.


Categories: General | XNA

248 Comments