<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress/2.2.1" -->
<rss version="2.0" 
	xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
	<title>Comments on: Order your graphics draw calls around!</title>
	<link>http://realtimecollisiondetection.net/blog/?p=86</link>
	<description>Coding wisdom and rants of Christer Ericson</description>
	<pubDate>Fri, 24 May 2013 13:42:15 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.2.1</generator>

	<item>
		<title>By: Post-Mortem &#8211; ColorIt &#124; Fatal Abstraction</title>
		<link>http://realtimecollisiondetection.net/blog/?p=86#comment-3601</link>
		<author>Post-Mortem &#8211; ColorIt &#124; Fatal Abstraction</author>
		<pubDate>Sun, 05 Feb 2012 15:46:07 +0000</pubDate>
		<guid>http://realtimecollisiondetection.net/blog/?p=86#comment-3601</guid>
		<description>[...] opportunities. I&#8217;ve put everything in place that would allow me to use clever stuff like render queue sorting but I never really got around to actually implement the damn thing. My list of neat engine [...]</description>
		<content:encoded><![CDATA[<p>[&#8230;] opportunities. I&#8217;ve put everything in place that would allow me to use clever stuff like render queue sorting but I never really got around to actually implement the damn thing. My list of neat engine [&#8230;]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: latimerius</title>
		<link>http://realtimecollisiondetection.net/blog/?p=86#comment-3578</link>
		<author>latimerius</author>
		<pubDate>Wed, 21 Dec 2011 14:32:51 +0000</pubDate>
		<guid>http://realtimecollisiondetection.net/blog/?p=86#comment-3578</guid>
		<description>Hi Christer, let me tell you I learnt a lot from your book, and your non-conformist blog is really cool too!

This might be a naive question as I'm not a graphics programmer; I just dabble in it once in a while.  Anyway, do you see a way to handle scoped commands within your system?  By "scoped" I mean things like glPushMatrix()/glPopMatrix() or glEnable(GL_BLEND)/glDisable(GL_BLEND) command pairs: anything that defines a scope of sorts and sets a state that affects everything rendered within that scope.

If I used the extension for commands you describe, I'd still have to make sure that the draw calls generated within a scope (and no others) are still between the scope-delimiting commands after sorting.  That could be done by introducing the "scope" into the sorting key somehow, probably close to its MSB.  However, some use cases I have in mind suggest there could be clashes when deciding the relative significance of the categories.</description>
		<content:encoded><![CDATA[<p>Hi Christer, let me tell you I learnt a lot from your book, and your non-conformist blog is really cool too!</p>
<p>This might be a naive question as I&#8217;m not a graphics programmer; I just dabble in it once in a while.  Anyway, do you see a way to handle scoped commands within your system?  By &#8220;scoped&#8221; I mean things like glPushMatrix()/glPopMatrix() or glEnable(GL_BLEND)/glDisable(GL_BLEND) command pairs: anything that defines a scope of sorts and sets a state that affects everything rendered within that scope.</p>
<p>If I used the extension for commands you describe, I&#8217;d still have to make sure that the draw calls generated within a scope (and no others) are still between the scope-delimiting commands after sorting.  That could be done by introducing the &#8220;scope&#8221; into the sorting key somehow, probably close to its MSB.  However, some use cases I have in mind suggest there could be clashes when deciding the relative significance of the categories.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: christer</title>
		<link>http://realtimecollisiondetection.net/blog/?p=86#comment-3536</link>
		<author>christer</author>
		<pubDate>Thu, 04 Aug 2011 05:53:39 +0000</pubDate>
		<guid>http://realtimecollisiondetection.net/blog/?p=86#comment-3536</guid>
		<description>Robin, things differ greatly between PCs (which you're asking about) and consoles (which I work on). On PCs you have driver overhead; on consoles we don't. On PS3 we pretty much don't worry about small draw calls at all. The VBO issues you ask about are therefore pretty alien to me, and I cannot really provide reliable recommendations.

For the material and texture sorting question, the same applies. On PC you'll have the driver doing things behind your back somewhat, whereas we explicitly upload textures, and vertex and fragment shaders. We also explicitly handle patching of parameters to the shaders etc. so we know exactly in what order things will be presented to the hardware and we can therefore sort things to account for that. There's no uploading of textures or shaders "behind our back."

Perhaps someone more familiar with the atrocities performed by PC drivers can comment directly on your questions!</description>
		<content:encoded><![CDATA[<p>Robin, things differ greatly between PCs (which you&#8217;re asking about) and consoles (which I work on). On PCs you have driver overhead; on consoles we don&#8217;t. On PS3 we pretty much don&#8217;t worry about small draw calls at all. The VBO issues you ask about are therefore pretty alien to me, and I cannot really provide reliable recommendations.</p>
<p>For the material and texture sorting question, the same applies. On PC you&#8217;ll have the driver doing things behind your back somewhat, whereas we explicitly upload textures, and vertex and fragment shaders. We also explicitly handle patching of parameters to the shaders etc. so we know exactly in what order things will be presented to the hardware and we can therefore sort things to account for that. There&#8217;s no uploading of textures or shaders &#8220;behind our back.&#8221;</p>
<p>Perhaps someone more familiar with the atrocities performed by PC drivers can comment directly on your questions!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: robindegen</title>
		<link>http://realtimecollisiondetection.net/blog/?p=86#comment-3534</link>
		<author>robindegen</author>
		<pubDate>Wed, 03 Aug 2011 13:06:52 +0000</pubDate>
		<guid>http://realtimecollisiondetection.net/blog/?p=86#comment-3534</guid>
		<description>Thanks for this post! It helped me get a good idea of where to begin with my render queue (I'm currently not sorting at all, just rendering).

One thing I'm wondering, though: at the lower level, do you optimize anything with your VBOs? For example, if you have a few small models (chairs, tables, whatever), do you have an equal number of VBOs, or do you combine them into a larger one somewhere and then just render parts? And if so, at what point do you do that? Or do you just have a whole bunch of smaller VBOs, accepting the state changes for those?

One idea I had was that whenever I load the world, it will determine the primitive count (usually triangles) of all the static objects and spread them over various VBOs (I have yet to find the "magic" number for VBO size). When a VBO is "full", it will add the next model to a new VBO, and so on. What is your view on this idea?

Another thing I was wondering: in most modern games you’d rarely see the same texture used twice, as about 90% of the rendered scene would use unique materials (except maybe for the shader). Would you then choose to only depth-sort and completely ignore materials if that’s the case? In a typical scene I’ve seen, there’s barely any texture used twice, and if one is, it’d be something used a lot, which would then be instanced anyway.</description>
		<content:encoded><![CDATA[<p>Thanks for this post! It helped me get a good idea of where to begin with my render queue (I&#8217;m currently not sorting at all, just rendering).</p>
<p>One thing I&#8217;m wondering, though: at the lower level, do you optimize anything with your VBOs? For example, if you have a few small models (chairs, tables, whatever), do you have an equal number of VBOs, or do you combine them into a larger one somewhere and then just render parts? And if so, at what point do you do that? Or do you just have a whole bunch of smaller VBOs, accepting the state changes for those?</p>
<p>One idea I had was that whenever I load the world, it will determine the primitive count (usually triangles) of all the static objects and spread them over various VBOs (I have yet to find the &#8220;magic&#8221; number for VBO size). When a VBO is &#8220;full&#8221;, it will add the next model to a new VBO, and so on. What is your view on this idea?</p>
<p>Another thing I was wondering: in most modern games you’d rarely see the same texture used twice, as about 90% of the rendered scene would use unique materials (except maybe for the shader). Would you then choose to only depth-sort and completely ignore materials if that’s the case? In a typical scene I’ve seen, there’s barely any texture used twice, and if one is, it’d be something used a lot, which would then be instanced anyway.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Growing Ginkgo Pt. 1: The Reading List &#187; AltDevBlogADay (Staging Site)</title>
		<link>http://realtimecollisiondetection.net/blog/?p=86#comment-3393</link>
		<author>Growing Ginkgo Pt. 1: The Reading List &#187; AltDevBlogADay (Staging Site)</author>
		<pubDate>Sun, 13 Feb 2011 01:07:45 +0000</pubDate>
		<guid>http://realtimecollisiondetection.net/blog/?p=86#comment-3393</guid>
		<description>[...] rendering subsystem is following the guidelines outlined in Christer Ericson&#8217;s &#8220;Order Your Drawcalls&#8221;. We have buckets of arrays of faces that are sorted according to a render mask. More about that in [...]</description>
		<content:encoded><![CDATA[<p>[&#8230;] rendering subsystem is following the guidelines outlined in Christer Ericson&#8217;s &#8220;Order Your Drawcalls&#8221;. We have buckets of arrays of faces that are sorted according to a render mask. More about that in [&#8230;]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: christer</title>
		<link>http://realtimecollisiondetection.net/blog/?p=86#comment-3370</link>
		<author>christer</author>
		<pubDate>Thu, 06 Jan 2011 04:17:55 +0000</pubDate>
		<guid>http://realtimecollisiondetection.net/blog/?p=86#comment-3370</guid>
		<description>CodeLord, I described commands in the way I did for two primary reasons.

First, I wanted to indicate that several fields of the key are irrelevant for a command key and can be reused for relevant data. This is nice because if all the data you need fits within the key, you don't even need to fetch the value field. Of course, you can use the value field to store data too, as you outlined. Whichever works best for you.

Second, I wanted to introduce the concept of one (or more) "sequence ID" fields within the key for a command, which would allow several commands to be sorted together (which is actually a likely need). This second reason is why you'd want to put these bits in the key instead of the value field.</description>
		<content:encoded><![CDATA[<p>CodeLord, I described commands in the way I did for two primary reasons.</p>
<p>First, I wanted to indicate that several fields of the key are irrelevant for a command key and can be reused for relevant data. This is nice because if all the data you need fits within the key, you don&#8217;t even need to fetch the value field. Of course, you can use the value field to store data too, as you outlined. Whichever works best for you.</p>
<p>Second, I wanted to introduce the concept of one (or more) &#8220;sequence ID&#8221; fields within the key for a command, which would allow several commands to be sorted together (which is actually a likely need). This second reason is why you&#8217;d want to put these bits in the key instead of the value field.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: CodeLord</title>
		<link>http://realtimecollisiondetection.net/blog/?p=86#comment-3369</link>
		<author>CodeLord</author>
		<pubDate>Thu, 06 Jan 2011 01:37:30 +0000</pubDate>
		<guid>http://realtimecollisiondetection.net/blog/?p=86#comment-3369</guid>
		<description>Hi Christer, first of all, great book, great blog and great games, nice job! You have to write a book about game engines; I would be the first to buy it.

Now, about the key/value scheme that you suggested here: why are you adding the command ID itself to the key (I'm referring to the second figure)? Doesn't it make more sense to have the command ID in the value part, perhaps along with some parameters for the command? We don't need to sort the commands based on their memory address (in the case of a function pointer), right?</description>
		<content:encoded><![CDATA[<p>Hi Christer, first of all, great book, great blog and great games, nice job! You have to write a book about game engines; I would be the first to buy it.</p>
<p>Now, about the key/value scheme that you suggested here: why are you adding the command ID itself to the key (I&#8217;m referring to the second figure)? Doesn&#8217;t it make more sense to have the command ID in the value part, perhaps along with some parameters for the command? We don&#8217;t need to sort the commands based on their memory address (in the case of a function pointer), right?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: nulstein v2 plog &#8211; rendering overview &#8211; Intel Software Network Blogs</title>
		<link>http://realtimecollisiondetection.net/blog/?p=86#comment-3358</link>
		<author>nulstein v2 plog &#8211; rendering overview &#8211; Intel Software Network Blogs</author>
		<pubDate>Fri, 26 Nov 2010 17:24:08 +0000</pubDate>
		<guid>http://realtimecollisiondetection.net/blog/?p=86#comment-3358</guid>
		<description>[...] all starts with Christer Ericson's solution to keeping things in order (Order your graphics draw calls around!) which really is about generating them in no particular order and sorting afterwards. Please go [...]</description>
		<content:encoded><![CDATA[<p>[&#8230;] all starts with Christer Ericson&#8217;s solution to keeping things in order (Order your graphics draw calls around!) which really is about generating them in no particular order and sorting afterwards. Please go [&#8230;]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Some more frustum culling notes &#171; The ryg blog</title>
		<link>http://realtimecollisiondetection.net/blog/?p=86#comment-3345</link>
		<author>Some more frustum culling notes &#171; The ryg blog</author>
		<pubDate>Wed, 20 Oct 2010 07:57:22 +0000</pubDate>
		<guid>http://realtimecollisiondetection.net/blog/?p=86#comment-3345</guid>
		<description>[...] before it ever gets submitted to the GPU. You don&#8217;t draw things immediately; you normally build a job list first that is then sorted. You may also do some sort of occlusion culling. For all survivors, you then need to set the state [...]</description>
		<content:encoded><![CDATA[<p>[&#8230;] before it ever gets submitted to the GPU. You don&#8217;t draw things immediately; you normally build a job list first that is then sorted. You may also do some sort of occlusion culling. For all survivors, you then need to set the state [&#8230;]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: christer</title>
		<link>http://realtimecollisiondetection.net/blog/?p=86#comment-3317</link>
		<author>christer</author>
		<pubDate>Tue, 10 Aug 2010 03:35:22 +0000</pubDate>
		<guid>http://realtimecollisiondetection.net/blog/?p=86#comment-3317</guid>
		<description>"Typically we want to sort opaque geometry and normal, additive, and subtractive translucent geometry into separate groups." can be parsed as either

{ opaque } and { normal, additive, and subtractive translucent }

or as

{ opaque }, { normal translucent }, { additive translucent }, and { subtractive translucent }

to be the separate groups. These two alternatives will give different results, obviously.

Visually, the former is seen as correct (as long as you sort the translucent group correctly), but for games in particular we have traditionally compromised on correctness and done the latter, to cut down on state changes (to make rendering faster).

The example above is just that, though: an example. You can adjust the mapping so that depth is more significant than translucency if that works better for your application. You can even split it, so there are some depth bits both above and below the translucency type. And that's the beauty of sort-based draw call ordering!</description>
		<content:encoded><![CDATA[<p>&#8220;Typically we want to sort opaque geometry and normal, additive, and subtractive translucent geometry into separate groups.&#8221; can be parsed as either</p>
<p>{ opaque } and { normal, additive, and subtractive translucent }</p>
<p>or as</p>
<p>{ opaque }, { normal translucent }, { additive translucent }, and { subtractive translucent }</p>
<p>to be the separate groups. These two alternatives will give different results, obviously.</p>
<p>Visually, the former is seen as correct (as long as you sort the translucent group correctly), but for games in particular we have traditionally compromised on correctness and done the latter, to cut down on state changes (to make rendering faster).</p>
<p>The example above is just that, though: an example. You can adjust the mapping so that depth is more significant than translucency if that works better for your application. You can even split it, so there are some depth bits both above and below the translucency type. And that&#8217;s the beauty of sort-based draw call ordering!</p>
]]></content:encoded>
	</item>
</channel>
</rss>
