
Creating a Very Simple GUI System for Small Games – Part III

Published: 2014-06-05

By Martin Prantl

In part one, we familiarized ourselves with the positioning and sizes of single GUI parts. Now it is time to render them on the screen. This part is shorter than the previous two, because there is not that much to tell.

You can look at the previous chapters:

Part I – Positioning

Part II – Control logic

Part III – Rendering

This time you will need some kind of API for the main rendering, but that is not part of the GUI design itself. In the end, you don't need to use any sophisticated graphics API at all; you can render your GUI using primitives and bitmaps in your favourite language (Java, C#, etc.). However, what I describe next assumes the use of some graphics API. My samples use OpenGL and GLSL, but a change to DirectX should be straightforward.

You have two choices when rendering your GUI. First, you can render everything as geometry on top of your scene in each frame. Second, you can render the GUI into a texture and then blend this texture with your scene. In most cases the second option will be slower because of the blending step. What is the same for both options is the rendering of your system's elements.
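If you choose the render-to-texture option, the setup is just an offscreen framebuffer that you later composite over the scene. The following is a minimal OpenGL ES 2.0 sketch of that idea, not the article's code; guiTex, guiFbo, screenW and screenH are placeholder names.

// One-time setup: an offscreen color texture that the GUI is rendered into
GLuint guiTex, guiFbo;
glGenTextures(1, &guiTex);
glBindTexture(GL_TEXTURE_2D, guiTex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, screenW, screenH, 0,
             GL_RGBA, GL_UNSIGNED_BYTE, NULL);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);

glGenFramebuffers(1, &guiFbo);
glBindFramebuffer(GL_FRAMEBUFFER, guiFbo);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                       GL_TEXTURE_2D, guiTex, 0);

// Per frame: render the GUI into the texture (ideally only when it changed)...
glBindFramebuffer(GL_FRAMEBUFFER, guiFbo);
glClearColor(0.0f, 0.0f, 0.0f, 0.0f); // transparent where no element is drawn
glClear(GL_COLOR_BUFFER_BIT);
// ... draw GUI elements here ...

// ...then blend the texture over the scene - this blending is the extra cost
glBindFramebuffer(GL_FRAMEBUFFER, 0);
glEnable(GL_BLEND);
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
// ... draw a fullscreen quad sampling guiTex ...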

Basic rendering

To keep things as easy as possible, we start with the simple approach where each element is rendered separately. This is not a very performance-friendly way if you have a lot of elements, but it will do for now. Plus, for a static GUI used in a main menu rather than in the actual game, it can be a sufficient solution. You may notice warnings from performance utilities that you are rendering too many small primitives. If your framerate is high enough and you don't care about things like power consumption, you can leave things as they are. Power consumption is more likely to be a problem on mobile devices, where battery lifetime is important. Fewer draw calls can be cheaper, put less strain on your battery, and your device won't run hot.

In modern APIs, the best way to render things is to use shaders. They offer great control – you can blend textures with colors, use mask textures to create patterns, etc. We use one shader that can handle every type of element.

The following shader samples are written in GLSL. They use an older notation for compatibility with OpenGL ES 2.0 (almost every device on the market supports this API). This vertex shader assumes that you have already converted your geometry into screen space (see the first part of the tutorial, where the [-1, 1] coordinate range was mentioned).

attribute vec3 POSITION;
attribute vec2 TEXCOORD0;

varying vec2 vTexCoord;

void main()
{
    gl_Position = vec4(POSITION.xyz, 1.0);
    vTexCoord = TEXCOORD0;
}

In the pixel (fragment) shader, I sample a texture and combine it with a color using a simple blending equation. This way, you can create differently colored elements and use a grayscale texture as a pattern mask.

uniform sampler2D guiElementTexture;
uniform vec4 guiElementColor;

varying vec2 vTexCoord;

void main()
{
    vec4 texColor = texture2D(guiElementTexture, vTexCoord);
    vec4 finalColor = (vec4(guiElementColor.rgb, 1.0) * guiElementColor.a);
    finalColor += (vec4(texColor.rgb, 1.0) * (1.0 - guiElementColor.a));
    finalColor.a = texColor.a;
    gl_FragColor = finalColor;
}

That is all you need for rendering the basic elements of your GUI.
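On the application side, drawing one element then boils down to setting the two uniforms and rendering a quad. Here is a sketch in the same wrapper style as the font-rendering code shown later; the element and quad classes are hypothetical, and only the uniform names guiElementTexture and guiElementColor come from the shader above.

// Render a single GUI element with the basic shader
// (one draw call per element)
void GuiRenderer::DrawElement(const GuiElement & e)
{
    MyEffect * effect = this->elementQuad->GetEffect();

    // alpha of guiElementColor acts as the color vs. texture blend factor
    effect->SetVector4("guiElementColor", e.color);

    // texture picked according to the element state (default / hovered / clicked)
    effect->SetTexture("guiElementTexture", e.GetTextureForCurrentState());

    // quad vertices are already in screen space for this element
    this->elementQuad->Render();
}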

Font rendering

For fonts I have chosen this basic renderer instead of an advanced one. If your texts are dynamic (changing very often – score, time), this solution may even be faster. The speed of rendering also depends on the text length. For small captions like "New Game", "Continue" or "Score: 0", it will be enough. Problems may (and probably will) occur with long texts like tutorials, credits, etc. If you issue more than 100 draw calls every frame, your frame rate will probably drop significantly. This cannot be stated exactly; it depends on your hardware, driver optimization and other factors. The best way is to try. From my experience there is a major frame drop when rendering 80+ letters, but on the other hand such a screen is often static, and the user probably won't notice the difference between 60 and 20 fps.

For classic GUI elements, you have used textures that change for every element. For fonts, that would be overkill and a major slowdown of your application. Of course, in some cases (debugging), this brute-force way may be fine.

Instead, we will use something called a texture atlas. That is nothing more than a single texture holding all possible textures (in our case letters). Look at the picture below if you don't know what I mean. Of course, having only this texture is useless without knowing where each letter is located. That information is usually stored in a separate file containing the coordinates of every letter. The second problem is resolution. Fonts provided and generated by FreeType are created from vector representations with respect to the font size, so they are sharp at any size. By using a font texture you may end up with good-looking fonts at small resolutions and blurry ones at high resolutions. You need to find a trade-off between the texture size and your font size. Also keep in mind that most GPUs (especially mobile ones) have a maximum texture size of 4096×4096. On the other hand, using that resolution for fonts is overkill. Most of the time I have used 512×512 or 256×256 for rendering fonts at size 20, and it looks good even on a Retina iPad.

Example of a font texture atlas (from gamedev)

I created this texture myself using the FreeType library and my own atlas creator. There is no built-in support for generating these textures, so you have to write it yourself. It may sound complicated, but it is not, and you can use the same code for packing other GUI textures as well. I will give some implementation details in part IV of the tutorial.
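The companion file with letter locations does not need anything fancy; one record per letter with its atlas rectangle and the FreeType metrics is enough. A possible layout (all names here are mine, used only for illustration):

#include <map>

// One record per letter, loaded from the file generated with the atlas
struct AtlasGlyph
{
    float u0, v0, u1, v1; // letter rectangle inside the atlas, in [0, 1]
    int bitmapLeft;       // horizontal offset from the pen position
    int bitmapTop;        // vertical offset above the baseline
    int width, rows;      // size of the letter bitmap in pixels
    int advanceX;         // pen advance to the next letter in pixels
};

// lookup: Unicode code point -> placement of the letter in the atlas
std::map<unsigned long, AtlasGlyph> atlasGlyphs;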

Every font letter is represented by a single quad without its own geometry. The quad is created only from its texture coordinates. The position and the "real texture coordinates" for the letter are passed from the main application and differ for each letter. I mentioned "real texture coordinates". What are they? You have a font texture atlas, and those are the coordinates of the letter within this atlas.

The following code sample shows the brute-force variant. There is some speed-up achieved by caching already generated letters. This can cause problems if you generate too many textures and exceed some API limits. For example, if you have a long text and render it with several font faces, you can easily generate hundreds of very small textures.

//calculate "scaling"
float sx = 2.0f / screen_width;
float sy = 2.0f / screen_height;

//Map texture coordinates from [0, 1] to screen space [-1, 1]
x = MyMathUtils::MapRange(0, 1, -1, 1, x);
y = -MyMathUtils::MapRange(0, 1, -1, 1, y); //-1 is to put origin to bottom left corner of the letter

//wText is UTF-8 since FreeType expects this
for (int i = 0; i < wText.GetLength(); i++)
{
    unsigned long c = FT_Get_Char_Index(this->fontFace, wText[i]);
    FT_Error error = FT_Load_Glyph(this->fontFace, c, FT_LOAD_RENDER);
    if (error)
    {
        Logger::LogWarning("Character %c not found.", wText.GetCharAt(i));
        continue;
    }
    FT_GlyphSlot glyph = this->fontFace->glyph;

    //build texture name according to letter
    MyStringAnsi textureName = "Font_Renderer_Texture_";
    textureName += this->fontFace->family_name;
    textureName += "_";
    textureName += wText.GetCharAt(i);
    if (!MyGraphics::G_TexturePool::GetInstance()->ExistTexture(textureName))
    {
        //upload new letter only if it doesn't exist yet
        //some kind of cache to improve performance
        MyGraphics::G_TexturePool::GetInstance()->AddTexture2D(textureName, //name of texture within pool
            glyph->bitmap.buffer, //buffer with raw texture data
            glyph->bitmap.width * glyph->bitmap.rows, //buffer byte size
            MyGraphics::A8, //only grayscaled texture
            glyph->bitmap.width, glyph->bitmap.rows); //width / height of texture
    }

    //calculate letter position within screen
    float x2 = x + glyph->bitmap_left * sx;
    float y2 = -y - glyph->bitmap_top * sy;

    //calculate letter size within screen
    float w = glyph->bitmap.width * sx;
    float h = glyph->bitmap.rows * sy;

    this->fontQuad->GetEffect()->SetVector4("cornersData", Vector4(x2, y2, w, h));
    this->fontQuad->GetEffect()->SetVector4("fontColor", fontColor);
    this->fontQuad->GetEffect()->SetTexture("fontTexture", textureName);
    this->fontQuad->Render();

    //advance start position to the next letter
    x += (glyph->advance.x >> 6) * sx;
    y += (glyph->advance.y >> 6) * sy;
}
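The fontQuad itself carries only the [0, 1] corner texture coordinates; the vertex shader expands cornersData into a screen-space quad. The article does not show this shader, so the following is my reconstruction of what it could look like under that assumption:

attribute vec2 TEXCOORD0; //quad corners in [0, 1]

uniform vec4 cornersData; //x, y (position) and z, w (width, height) in screen space
varying vec2 vTexCoord;

void main()
{
    //place the unit quad at the letter position, scaled to the letter size
    float px = cornersData.x + TEXCOORD0.x * cornersData.z;
    float py = cornersData.y + TEXCOORD0.y * cornersData.w;
    gl_Position = vec4(px, py, 0.0, 1.0);
    vTexCoord = TEXCOORD0;
}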

Changing this code to work with a texture atlas is quite easy. What you need to do is use an additional file with the coordinates of the letters within the atlas. For each letter, those coordinates are passed along with the letter's position and size. The texture is set only once and stays the same until you change the font type. The rest of the code, however, remains the same.

As you can see from the code, the texture bitmap (glyph->bitmap.buffer) is part of the glyph provided by FreeType. Even if you don't use it, it is still generated, and that takes some time. If your texts are static, you can "cache" them: store everything generated by FreeType during the first run (or in some init step) and then, at runtime, just use the precreated variables without calling any FreeType functions at all. I use this most of the time, and there are no performance impacts or problems with font rendering.
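Such a cache can be filled in a single pass over the text (or the whole alphabet) at startup. Below is a minimal sketch under my own naming; it simply copies out everything the render loop above reads from FreeType:

#include <ft2build.h>
#include FT_FREETYPE_H
#include <map>
#include <string>
#include <vector>

struct CachedGlyph
{
    int bitmapLeft, bitmapTop; // offsets from pen position / baseline
    int width, rows;           // bitmap size in pixels
    int advanceX;              // pen advance in pixels
    std::vector<unsigned char> pixels; // grayscale bitmap copy
};

class FontCache
{
public:
    // Run once per font: after this, rendering needs no FT_* calls
    void Preload(FT_Face face, const std::wstring & alphabet)
    {
        for (wchar_t ch : alphabet)
        {
            unsigned int index = FT_Get_Char_Index(face, ch);
            if (FT_Load_Glyph(face, index, FT_LOAD_RENDER) != 0)
            {
                continue; // letter not present in this face
            }
            FT_GlyphSlot g = face->glyph;

            CachedGlyph cg;
            cg.bitmapLeft = g->bitmap_left;
            cg.bitmapTop = g->bitmap_top;
            cg.width = g->bitmap.width;
            cg.rows = g->bitmap.rows;
            cg.advanceX = g->advance.x >> 6; // 26.6 fixed point -> pixels
            // copy the bitmap now, FreeType reuses this buffer
            // (assumes tightly packed rows, as the code above does)
            cg.pixels.assign(g->bitmap.buffer,
                             g->bitmap.buffer + cg.width * cg.rows);
            this->glyphs[ch] = cg;
        }
    }

    std::map<wchar_t, CachedGlyph> glyphs;
};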

Advanced rendering

So far only basic rendering has been presented. Many of you probably knew all that, and there was nothing surprising. Well, there will probably be no surprises in this section either.

If you have more elements and want them rendered as fast as possible, rendering each of them separately may not be enough. For this reason I have used a "baked" solution: I create a single geometry buffer that holds the geometry of all elements on the screen, and I can draw them with a single draw call. The problem is that you then need a single shader, while the elements may differ. For this purpose I use one shader that can handle "everything", and every element gets a unified geometry representation. This means that for some elements there will be unused parts; you may fill those with anything you like, usually zeros. This representation with unused parts ends up as "larger" geometry data. I used the word "larger", but think about it: it is not such a massive overhead, and your GUI should still be cheap on memory while drawing faster. That is the trade-off.

What we need to pass as geometry for every element:

POSITION – this will be divided into two parts: XYZ coordinates, and W for the element index.

TEXCOORD0 – two sets of texture coordinates

TEXCOORD1 – two sets of texture coordinates

TEXCOORD2 – color

TEXCOORD3 – an additional set of texture coordinates, plus reserved space to keep the padding to a full vec4

Why do we need different sets of texture coordinates? That is simple. We have baked the entire GUI into one geometry representation. We don't know which texture belongs to which element, and on top of that we have a limited set of textures accessible from a fragment shader. If you put two and two together, you end up with one solution: yes, we create another texture atlas, built from the separate textures of every "baked" element. From what we have already discovered about elements, we know they can have more than one texture. That is precisely why we have multiple texture coordinates baked into the geometry representation. The first set is used for the default texture, the second for the "hovered" texture, the next for the clicked one, etc. You may of course choose your own representation.

In the vertex shader we choose the correct texture coordinates according to the element's current state and send them to the fragment shader. The current element state is passed from the main application in an integer array, where each number corresponds to a certain state and -1 marks an invisible element (it won't be rendered). We don't pass this data every frame, only when the state of some element changes; only then do we update all states of the "baked" elements. I have limited the maximum number of elements to 64 per draw call, but you can decrease or increase this number (be careful when increasing it, since you may hit the GPU's uniform size limits). The index into this array is passed as the W component of POSITION.
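On the application side this means one small uniform upload, issued only on a state change. A sketch with plain OpenGL ES 2.0 calls; the uniform name stateIndex comes from the shader below, the rest is my naming:

// Called only when some element changed its state.
// states[i] holds the state of the element with index i
// (-1 = invisible, 0 = default, 1 = hovered, ...)
void BakedGui::UploadStates(GLuint program, const int * states, int count)
{
    glUseProgram(program);
    GLint loc = glGetUniformLocation(program, "stateIndex");
    glUniform1iv(loc, count, states); // count <= 64 in this setup
}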

The full vertex shader can be seen in the following code snippet:

//Vertex buffer content
attribute vec4 POSITION; //pos (xyz), index (w)
attribute vec4 TEXCOORD0; //T0 (xy), T1 (zw)
attribute vec4 TEXCOORD1; //T2 (xy), T3 (zw)
attribute vec4 TEXCOORD2; //color
attribute vec4 TEXCOORD3; //T4 (xy), unused (zw)

//User provided input
uniform int stateIndex[64]; //64 = max number of elements baked in one buffer

//Output
varying vec2 vTexCoord;
varying vec4 vColor;

void main()
{
    gl_Position = vec4(POSITION.xyz, 1.0);
    int index = stateIndex[int(POSITION.w)];
    if (index == -1) //not visible
    {
        gl_Position = vec4(0, 0, 0, 0);
        index = 0;
    }

    if (index == 0) vTexCoord = TEXCOORD0.xy;
    if (index == 1) vTexCoord = TEXCOORD0.zw;
    if (index == 2) vTexCoord = TEXCOORD1.xy;
    if (index == 3) vTexCoord = TEXCOORD1.zw;
    if (index == 4) vTexCoord = TEXCOORD3.xy;
    vColor = TEXCOORD2;
}
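The matching fragment shader is not reproduced here; a minimal version consistent with the vertex shader above would sample the shared atlas and apply the same color blend as the basic shader. The atlas uniform name is my assumption:

uniform sampler2D guiAtlasTexture; //atlas built from all baked element textures

varying vec2 vTexCoord;
varying vec4 vColor;

void main()
{
    vec4 texColor = texture2D(guiAtlasTexture, vTexCoord);
    //same blend as the basic shader: vColor.a mixes color against texture
    vec4 finalColor = vec4(vColor.rgb, 1.0) * vColor.a;
    finalColor += vec4(texColor.rgb, 1.0) * (1.0 - vColor.a);
    finalColor.a = texColor.a;
    gl_FragColor = finalColor;
}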

Note: In the vertex shader you can spot the "ugly" if sequence. If I replaced this code with an if-else, or even a switch, the GLSL optimizer for the ES version somehow stripped my code and it stopped working. This was the only solution that worked for me.

Conclusion

Rendering a GUI is not a complicated thing to do. If you are familiar with the basic concepts of rendering and you know how an API works, you will have no problem rendering everything. You do need to be careful with text rendering, since there can be significant bottlenecks if you choose the wrong approach.

Next time, in part IV, some tips & tricks will be presented: simple texture atlas creation, an example of a user-friendly GUI layout with XML, details regarding touch controls and maybe more. The catch is that I don't currently have much time, so there could be a longer delay before part IV sees the light of day. (source: gamedev)

This article was compiled by 游戏邦/gamerboom.com. Reposting without retaining this copyright notice is prohibited; for reposting, please contact 游戏邦.
