Fast GUI Rendering using Texture2DArray

Published: 2014-03-24 16:09:41

By Kristoffer Lindström

When working with custom game engines, the UI has always been a struggle, in terms of both usability and performance. To combat this, I have kept GUI rendering firmly in mind while structuring my engine, and in this article I want to share my results.

Important

The focus in this article is the rendering

We will be using Texture2DArray (a DirectX 10 feature; OpenGL has an equivalent in array textures, GL_TEXTURE_2D_ARRAY)

This article will not be a performance comparison with other UI libraries

It is a result of working on a game

No source code/download is available

Some of the code samples are in Lua(ish) code; because there is no Lua syntax formatter, the comments use // instead of --

Ideas and Different GUI Styles

Immediate and Traditional Mode GUI

Immediate Mode GUI has become quite common in realtime applications, and for all the right reasons. It is easy to set up and easy to modify, but it comes at a price.

// during render (or maybe update, but I have never seen that)
// the button is also drawn inside this function, so if we don't call
// it the button no longer exists

do_button("my label", x, y, function() print("I got clicked") end)

Pros

Easy to create, modify and remove without restarting, etc.

Really easy to make basic GUI elements

Less code to write

Cons

Harder to maintain good ordering, so controls behind other controls could get activated instead

Things that require some kind of state and a lot of input data get complicated to implement

Input is usually delivered during game updates and not when rendering, which can cause strange behavior when clicking things

You tend to pay with performance for better usability

Traditional Mode GUI takes longer to set up and is harder to change, but it tends to be more stable, and it handles advanced UI controls that can get tricky to implement in immediate mode.

// during some kind of init of a scene for example

local button = create_button("my label", x, y)
button:set_callback(function() print("I got clicked") end)

// later in render (this will draw all gui elements we have created)

draw_gui()

Pros

You know about all your controls before you start to draw/update them

Complicated controls with a lot of state and transformation/inheritance get easier

Input handling gets more natural and stable

Cons

A lot more code needed

Hard to modify

Annoying to write and maintain (personal opinion)

For a simple control like a button both methods look good, but for something like a scrollable listbox with a lot of items, things can get messy really quickly.

The reason I bring this up is that with the traditional method the draw_gui function knows about all the GUI elements that will be drawn, so it can apply optimizations such as choosing a better draw order and separating elements into groups depending on state changes (texture switches).

The immediate GUI kinda lacks in this department: when rendering a button with text on it, we cannot assume that we can render the button first and then the text later on in another batch.
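To make the difference concrete, here is a minimal C++ sketch of how a retained draw list can be regrouped by texture. It is not code from the engine described here: `DrawCmd`, `count_draw_calls` and `sort_for_batching` are hypothetical names, and the sketch assumes that a new draw call starts every time the bound texture changes.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical draw command: one textured rect that needs a given texture.
struct DrawCmd {
    int texture_id;  // which texture must be bound (assumed identifier)
    int layer;       // stacking order; lower layers are drawn first
};

// A new draw call starts whenever the bound texture changes.
inline std::size_t count_draw_calls(const std::vector<DrawCmd>& cmds) {
    std::size_t calls = 0;
    for (std::size_t i = 0; i < cmds.size(); ++i) {
        if (i == 0 || cmds[i].texture_id != cmds[i - 1].texture_id) ++calls;
    }
    return calls;
}

// A retained GUI knows every command up front, so it can group by texture
// inside each layer. stable_sort keeps submission order among equal keys.
inline void sort_for_batching(std::vector<DrawCmd>& cmds) {
    std::stable_sort(cmds.begin(), cmds.end(),
                     [](const DrawCmd& a, const DrawCmd& b) {
                         if (a.layer != b.layer) return a.layer < b.layer;
                         return a.texture_id < b.texture_id;
                     });
}
```

With three buttons submitted as background, label, background, label, and so on, the submission order costs six draw calls under this model, while the sorted list needs only two. An immediate mode GUI has to draw in submission order, which is why the rest of this article removes the texture switch itself.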

Mixing Immediate Mode with Traditional Mode

Since I like the immediate mode GUI but wanted the benefits of the non-immediate style as well, I created a mixed style that lets me build advanced listboxes, windows and inputboxes while still drawing them in immediate mode.

// here are two versions of the immediate mode call that do require an id
// but the id just needs to be unique per scene
ui_button("button-id", "my label", onclick):draw(x, y)
ui_button({id = "button-id", label = "my label", onclick = print}):draw(x, y)

// this is what you do if you want to create the button beforehand
// this becomes useful when dealing with listboxes and more advanced controls
local button = ui_button({label = "my_label", onclick = print})

// in both cases the control comes to life when calling the draw function
button:draw(x, y)

Doing the UI this way gives us all the functionality of immediate mode, with one caveat: if we stop rendering an element, we may be left with some state associated with it, but the element itself disappears and no longer receives any input. So basically we keep the per-element drawing and control of immediate mode, but we also have a state associated with each control, so we can build more advanced controls. This state allows us to poll input in the update loop instead of when rendering, and we can do this hidden update in reverse rendering order, which lets us ignore elements hidden under something.
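The idea of id-keyed state plus a reverse-order input pass can be sketched in C++ as follows. This is only an illustration under my own assumptions, not the engine's code: `UiContext`, `ControlState`, `draw` and `update` are hypothetical names, and the only input handled is a mouse position.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical per-control state that survives between frames.
struct ControlState {
    float x = 0, y = 0, w = 0, h = 0;  // rect from the last draw call
    bool hovered = false;
};

class UiContext {
public:
    // Called from the immediate-mode draw: the id finds (or creates) the
    // state, and the control is queued in draw order for this frame.
    ControlState& draw(const std::string& id, float x, float y, float w, float h) {
        ControlState& s = states_[id];
        s.x = x; s.y = y; s.w = w; s.h = h;
        drawn_.push_back(id);
        return s;
    }

    // Called from the update loop: walk the drawn controls in reverse draw
    // order (topmost first); the first hit consumes the mouse position.
    // Returns the id of the hit control, or an empty string.
    std::string update(float mx, float my) {
        std::string hit;
        for (auto it = drawn_.rbegin(); it != drawn_.rend(); ++it) {
            ControlState& s = states_[*it];
            bool inside = mx >= s.x && mx < s.x + s.w && my >= s.y && my < s.y + s.h;
            s.hovered = inside && hit.empty();
            if (s.hovered) hit = *it;
        }
        drawn_.clear();  // controls that are not drawn again get no input
        return hit;
    }

private:
    std::unordered_map<std::string, ControlState> states_;
    std::vector<std::string> drawn_;
};
```

If a "back" control and an overlapping "front" control are both drawn, a click in the overlap goes to "front" only. Once a control stops being drawn, its state stays in the map but it no longer takes part in the input pass.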

While this is all good, we still have the performance problem to tackle. We will do this by combining extensive vertex buffering with texture arrays and texture sheets of specific sizes.

Technical Goals for the Mixed GUI

To create the mixed UI system we need to achieve a few technical feats:

Being able to have good performance even when elements have a very strange draw order

We cannot modify the draw order

Text and sprites/textures must be rendered without switching shaders or adding new draw calls

To meet these requirements, we can conclude that we need a 'pretty' good draw_rect routine that can use different properties and textures without creating new draw calls.

Texture Arrays

This is a relatively new feature that allows us to use different textures in a shader depending on an input index that can come from a constant buffer (this could also be simulated with mega textures, for example 4096×4096).

The restriction with a texture array is that all textures in it must have the same width, height and format. To make this easier to manage, I created a texture pool that holds a separate texture array for each combination of width, height and format. I can then query the texture pool with any texture: if that texture has not been used before and does not fit in any of the existing texture arrays, the pool creates a new texture array, loads the texture into it, binds the new texture array object and returns id 0. If we then ask to bind a second texture with the same size, the pool leaves the current texture array active, adds the new texture to it and returns id 1.
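The pool behaviour described above could look roughly like this in C++. This is a simplified sketch and not the actual pool: `TexturePool`, its `bind` signature and the integer `id` standing in for pixel data are assumptions, no GPU resources are created, and the return value simply reports whether a different array became active (the case where the caller would have to flush).

```cpp
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical texture description; `id` stands in for the pixel data.
struct Texture {
    int id;
    std::uint32_t width, height, format;
};

class TexturePool {
public:
    // Finds (or creates) the array matching the texture's size and format,
    // appends the texture as a new slice if it is not there yet, and writes
    // the slice index to *index. Returns true when a different array had to
    // be made active, which is when the pending draw call must be flushed.
    bool bind(const Texture& tex, unsigned* index) {
        Key key{tex.width, tex.height, tex.format};
        std::vector<int>& slices = arrays_[key];
        unsigned i = 0;
        while (i < slices.size() && slices[i] != tex.id) ++i;
        if (i == slices.size()) slices.push_back(tex.id);  // "upload" a new slice
        *index = i;
        bool switched = !has_active_ || key != active_;
        active_ = key;
        has_active_ = true;
        return switched;
    }

private:
    using Key = std::tuple<std::uint32_t, std::uint32_t, std::uint32_t>;
    std::map<Key, std::vector<int>> arrays_;
    Key active_{};
    bool has_active_ = false;
};
```

Binding two 1024×1024 textures of the same format yields slice ids 0 and 1 without a switch in between, while binding a 512×512 texture afterwards reports a switch. A real Direct3D texture array has a fixed slice count at creation, so an actual pool would have to preallocate slices or recreate the array when it grows.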

You could improve a lot upon this by merging smaller textures into a bigger one and adding uv offsets, so you would end up with, let's say, mostly 1024×1024 textures in a long array.
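The uv offset part of that idea is a small piece of arithmetic. The following C++ sketch shows one way to do it; `AtlasRegion`, `make_region` and `remap` are hypothetical names, and the page is assumed to be square with uv coordinates normalized to [0, 1].

```cpp
// Hypothetical placement of a small image inside a larger atlas page,
// expressed in normalized [0, 1] texture coordinates of the page.
struct AtlasRegion {
    float u_offset, v_offset;  // top-left corner of the region
    float u_scale, v_scale;    // region size relative to the page
};

struct UV { float u, v; };

// Computes the region for a w*h image placed at pixel (x, y) on a square
// page that is page_size pixels wide and high.
inline AtlasRegion make_region(int x, int y, int w, int h, int page_size) {
    float inv = 1.0f / static_cast<float>(page_size);
    return {x * inv, y * inv, w * inv, h * inv};
}

// Maps a uv in the small image's own [0, 1] space onto the atlas page.
inline UV remap(const AtlasRegion& r, UV local) {
    return {r.u_offset + local.u * r.u_scale, r.v_offset + local.v * r.v_scale};
}
```

For a 256×256 image placed at pixel (512, 256) on a 1024×1024 page, the local corner (0, 0) maps to (0.5, 0.25) and (1, 1) maps to (0.75, 0.5). The remapped values would then be used as the u1, v1, u2 and v2 of a rect.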

Text Rendering

A specific font is stored on a 1024×1024 texture, which also contains all the different sizes packed together. So, for example, calibri normal, bold and italic would be three 1024×1024 textures filled with glyphs rendered at various sizes.

An example of a packed font with different sizes. This is far from optimal right now, since you could pack the bold and italic variants as well and use a better packing.

GUI Rendering

This works in the same way as font rendering: all the graphics are stored on a single 1024×1024 texture.

Putting it all together

Putting the font and UI rendering together, we get some number of 1024×1024 textures that we can put in one array. Then, when we select which texture we want to use, instead of switching textures and creating a new draw call, we just insert the texture index into a constant buffer, and with every vertex we supply the index of the constant buffer entry that holds the information about which texture to use.

Results of Using the Mixed GUI Implementation

Since this article is aimed at the rendering part of the GUI implementation, I will not put any focus on how the buttons, inputboxes, listboxes, sliders, etc. work. Maybe in another article.

This is an image I rendered from my engine that shows 100 buttons and 10 input boxes. The most interesting part is the number of draw calls made and the vertex count.

Attached Image: mixed_texture_with_text.png

draw calls for complete scene = 5
vertex count for complete scene = 8336

Here you can see that switching between different text sizes has no impact on draw calls either.

Attached Image: tex_renderingt.png

draw calls for complete scene = 5 (same as before)
vertex count for complete scene = 3374

This button & inputbox image was composed using code chunks like this one:

// push and pop form a stack of render states; in this case
// they keep the translation local to the code between them
push()

for i = 1, 10, 1 do

    // this is the only place that knows about this inputbox
    // it is not created in some init function, but we need the id
    // so it can keep track of itself the next time it gets drawn
    // after the first call the ui_inputbox function will return the same
    // object
    ui_inputbox({id = i, value = string.format("input #%i", i)}):draw()

    // this will move each element 40 units down from the last one
    add_translation(0, 40)

end

pop()

// the ui_inputbox draw function would then look something like this
function draw(self)

    local width = self.width
    local height = self.height

    set_blend_color(1, 1, 1, 1)

    // set texture for complete gui texture sheet
    set_texture(gui_texture_id)
    draw_rect(...) // here the uv data would go in to grab the right part

    // set font, and this will trigger another set_texture internally
    set_text_font("arial.ttf")
    set_text_size(16)
    set_text_align(0, 0.5)

    // this function is essentially just calling multiple
    // draw rects internally for each character to be drawn
    draw_text_area(text, 0, 0, width, height)
end
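The push/pop pair used above can be pictured as a small stack of render states. Here is a minimal C++ sketch that tracks only a 2D translation; `StateStack` and `Translation` are hypothetical names, and the real render state presumably holds more than this.

```cpp
#include <cassert>
#include <vector>

struct Translation { float x = 0, y = 0; };

// Minimal render-state stack holding only a 2D translation.
class StateStack {
public:
    // Duplicates the current state so changes stay local until pop().
    void push() {
        Translation top = stack_.back();
        stack_.push_back(top);
    }
    // Restores the state that was active before the matching push().
    void pop() {
        assert(stack_.size() > 1 && "pop() without matching push()");
        stack_.pop_back();
    }
    void add_translation(float dx, float dy) {
        stack_.back().x += dx;
        stack_.back().y += dy;
    }
    const Translation& current() const { return stack_.back(); }

private:
    std::vector<Translation> stack_{Translation{}};  // base state, never popped
};
```

Calling push(), then add_translation(0, 40) ten times, leaves the current translation at y = 400. After pop() it is back at the origin, so whatever is drawn next is not affected by the loop.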

Implementation in C++ Using HLSL Shaders

Here we bind a texture object to the renderer. It checks in the texture pool which texture array is currently active, and either flushes or just swaps the active texture index.

void IntermediateRenderer::bind_texture(Texture * texture)
{
    // this is a texture pool that contains several arrays of similar sized textures
    // let's say we want to bind texture A and that texture already exists in the pool
    // then if we have a different array bound we must flush, but otherwise we just use
    // another index for the next operations since texture A was already in the
    // currently active texture array
    auto mat = materials.get_active_state();

    if (texture == nullptr)
    {
        // we don't need to unbind anything, just reduce the impact of the texture to 0
        mat->texture_alpha = 0.0f;
    }
    else
    {
        unsigned int texture_index = 0;
        if (texture_pool.bind(texture, &texture_index, std::bind(&IntermediateRenderer::flush, this)))
        {
            // this means we flushed
            // this will start a new draw call

            // refresh the state, usually means we grab the first
            // material index again (0)
            mat = materials.get_active_state();
        }

        // just set the constant buffer values
        // and unless we flushed nothing will change
        // we will just continue to build our vertex buffer

        // the material stores the index as a float, so convert the value
        mat->texture_index = static_cast<float>(texture_index);
        mat->texture_alpha = 1.0f;
    }
}

Since we do use bitmap font rendering we can use the same rendering function when drawing a letter, as when we would draw any other textured rect. So the next step would be to create a function to render this textured rect efficiently.

Here is my implementation in C++ for rendering a simple rect. RECT_DESC just holds attributes like position, width, color and uv coordinates. It is also important to note that model_id and mat_id will be included in each vertex in the format DXGI_FORMAT_R8_UINT.

void IntermediateRenderer::draw_rect(const RECT_DESC & desc)
{
    // this will switch what buffers we are pushing data to
    // so even if we switch from trianglelist to linelist
    // we don't need to flush but the rendering order will be wrong
    set_draw_topology(D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST);

    // here we just get the currently active material and model states
    // model contains transformation data
    auto mat_id = materials.use_active_state();
    auto model_id = models.use_active_state();

    push_stream_ids(6, model_id, mat_id);

    // currently I am not using any index list, but might do in the future if I feel
    // I could benefit from it

    // it's important to keep these sizes known at compile time
    // so we don't need to allocate temporary space on the heap somewhere
    Vector3 position_data[6] =
    {
        Vector3(desc.x, desc.y, 0),
        Vector3(desc.x + desc.width, desc.y, 0),
        Vector3(desc.x, desc.y + desc.height, 0),
        Vector3(desc.x, desc.y + desc.height, 0),
        Vector3(desc.x + desc.width, desc.y, 0),
        Vector3(desc.x + desc.width, desc.y + desc.height, 0)
    };

    Vector2 texcoord_data[6] =
    {
        Vector2(desc.u1, desc.v1),
        Vector2(desc.u2, desc.v1),
        Vector2(desc.u1, desc.v2),
        Vector2(desc.u1, desc.v2),
        Vector2(desc.u2, desc.v1),
        Vector2(desc.u2, desc.v2)
    };

    // I will switch this from float4 to an unsigned int
    // in the future so each vertex becomes much smaller
    // the desc.color_top and desc.color_bottom are already
    // uint32 formats

    Vector4 ctop(desc.color_top);
    Vector4 cbottom(desc.color_bottom);

    Vector4 color_data[6] =
    {
        ctop,
        ctop,
        cbottom,
        cbottom,
        ctop,
        cbottom,
    };

    // this will just copy in our stack data to the vertex buffers
    position_stream->push(position_data);
    texcoord_stream->push(texcoord_data);
    color_stream->push(color_data);
}
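The comments above mention that the colors start out as uint32 values and are expanded to four floats per vertex. As a rough illustration of that conversion and of the size difference, here is a C++ sketch. The 0xRRGGBBAA byte order is my assumption (the listing does not state it), and `Color4`, `unpack_rgba8` and `pack_rgba8` are hypothetical names rather than the engine's Vector4 constructor.

```cpp
#include <cstdint>

struct Color4 { float r, g, b, a; };

// Expands a packed color to four floats in [0, 1].
// Assumed layout: 0xRRGGBBAA (the byte order is not stated in the article).
inline Color4 unpack_rgba8(std::uint32_t c) {
    return {((c >> 24) & 0xFFu) / 255.0f, ((c >> 16) & 0xFFu) / 255.0f,
            ((c >> 8) & 0xFFu) / 255.0f, (c & 0xFFu) / 255.0f};
}

// Converts one channel in [0, 1] back to 0-255, rounding to nearest.
inline std::uint32_t to_byte(float v) {
    return static_cast<std::uint32_t>(v * 255.0f + 0.5f);
}

// Packs four floats in [0, 1] back into the assumed 0xRRGGBBAA layout.
inline std::uint32_t pack_rgba8(Color4 c) {
    return (to_byte(c.r) << 24) | (to_byte(c.g) << 16) |
           (to_byte(c.b) << 8) | to_byte(c.a);
}
```

A Color4 takes 16 bytes while the packed value takes 4, so storing the packed color per vertex and expanding it in the shader would cut the color stream to a quarter of its size.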

Then later in the shader we use the material id stored in each vertex to look up the material in the Material constant buffer.

// instead of a normal array, we use an array of textures
Texture2DArray Texture : register(t0);

// assumed sampler declaration for the Sample call in main
SamplerState Sampler : register(s0);

// each material is 8 floats
struct Material
{
    float4 color;
    float texture_index;
    float texture_alpha;
    float a; // padding
    float b; // padding
};

// by having 256 different materials at the same time
// we can draw 256 different entities in only one draw call
cbuffer MaterialBuffer : register(b0)
{
    Material material[256];
};

struct Vertex
{
    float4 position : SV_Position;
    float3 vposition : Position0;

    float3 normal : Normal0;

    float2 uv : Texcoord0;
    float4 color : Color0;

    // this is how we control what material
    // to use for what vertex, it's only 1 byte in size
    // for a value range of 0-255
    uint material_id : Color1;
};

Result main(Vertex input)
{
    // lookup material
    Material mat = material[input.material_id];

    // read from the right texture
    float4 texel = Texture.Sample(Sampler, float3(input.uv, mat.texture_index));

    //… rest of shader
}
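On the CPU side, something has to hand out the per-vertex material ids and notice when all 256 slots are in use. The following C++ sketch shows one possible way to do that. It is an assumption about how such a table could work, not the implementation behind materials.use_active_state() in the listings above; `MaterialTable`, `edit_active_state` and the `flushed` out-parameter are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Mirrors the HLSL Material struct: 8 floats = 32 bytes per entry.
struct Material {
    float color[4] = {1, 1, 1, 1};
    float texture_index = 0;
    float texture_alpha = 0;
    float pad[2] = {0, 0};
};
static_assert(sizeof(Material) == 32, "must match the constant buffer layout");

class MaterialTable {
public:
    static constexpr std::size_t kCapacity = 256;

    // Editable pending state; any access marks it as changed.
    Material& edit_active_state() { dirty_ = true; return pending_; }

    // Returns the slot id to write into the next vertices. A changed state
    // takes a fresh slot; when the table is full the slot count restarts at
    // 0 and *flushed is set so the caller can issue the pending draw call.
    std::uint8_t use_active_state(bool* flushed) {
        *flushed = false;
        if (dirty_ || count_ == 0) {
            if (count_ == kCapacity) { count_ = 0; *flushed = true; }
            slots_[count_++] = pending_;
            dirty_ = false;
        }
        return static_cast<std::uint8_t>(count_ - 1);
    }

private:
    std::array<Material, kCapacity> slots_{};
    Material pending_{};
    std::size_t count_ = 0;
    bool dirty_ = false;
};
```

Rects drawn with an unchanged material share one slot, a texture or color change takes the next slot, and only the 257th distinct state in a batch forces a flush. That matches the comment in bind_texture about grabbing material index 0 again after a flush.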

Libraries used

Window and input were handled using just WINAPI

Font rendering was done using freetype2

Conclusion

I presented the theory and some code for my approach to rendering a realtime GUI. This is far from everyone's needs, but I think it could prove useful to other engine tinkerers. By using Texture2DArray we created a system that avoids creating new draw calls when switching textures, and by packing the fonts in the same manner as the GUI graphics we can draw text and art at the same time.

I am well aware that this article does not include everything about the mixed immediate UI, but if people are interested I might create an article about that as well. (Source: gamedev.net)

