Cross-Platform Shader Handling


When targeting all major mobile platforms and Windows with a tiny team, writing every shader separately for OpenGL ES 3.x, D3D11 and Metal is a time-consuming and error-prone task. A desirable goal is to write shaders once and cross-compile or translate them for all target platforms. Since Windows is the primary development platform, shaders are written in HLSL first. Unfortunately there’s more to a shader than pure HLSL code: there’s also quite a bit of state associated with it.

This article gives an overview of our current approach.

  • Shader libraries
  • Code generation
  • Translation and compilation
  • Tooling

Terminology

The term shader has a lot of different meanings in computer graphics. At the lowest level it refers to the shader binaries or functions running on the GPU for the vertex, pixel and geometry shader stages or for compute jobs. At the next level it refers to render (pipeline) state linked with shader functions, and so on.

In this context the following terminology is used:

stage
is the lowest building block. It’s the shader binary/function as used in the graphics APIs, with one function per pipeline stage (e.g. vertex, fragment, geometry, compute)
technique
links shader functions/stages with render state
shader
is a collection of techniques
shaderlib
is a file containing multiple shaders, i.e. a shader library
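The terminology above maps naturally onto a small data model. The following is only a sketch of that hierarchy in Python; the class and field names are assumptions made for illustration and are not taken from the actual shader compiler.

```python
from dataclasses import dataclass, field


@dataclass
class Stage:
    """Lowest building block: one shader function for one pipeline stage."""
    kind: str  # e.g. "vertex", "pixel", "geometry" or "compute"
    code: str  # source or bytecode, depending on the target


@dataclass
class Technique:
    """Links shader stages with render state."""
    name: str
    stages: dict[str, Stage] = field(default_factory=dict)
    render_state: dict[str, str] = field(default_factory=dict)


@dataclass
class Shader:
    """A named collection of techniques."""
    name: str
    techniques: dict[str, Technique] = field(default_factory=dict)


@dataclass
class ShaderLib:
    """A shader library: multiple shaders defined in one file."""
    shaders: dict[str, Shader] = field(default_factory=dict)


# Build a tiny library with one shader and one technique.
technique = Technique("default", render_state={"CullMode": "Back"})
technique.stages["vertex"] = Stage("vertex", "/* vertex code */")
technique.stages["pixel"] = Stage("pixel", "/* pixel code */")

lib = ShaderLib()
lib.shaders["mesh/default"] = Shader("mesh/default", {"default": technique})
```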

Shader Libraries

As mentioned above, shaders are not just pure HLSL/GLSL/MSL code but also require some additional render state. The purpose of shaderlib files is to define collections of shaders which are translated and/or compiled to platform-specific shader binaries and render state. The goal is to provide enough metadata for the render system to feed the shaders with data and to set up what is usually called render pipeline state in modern GPU APIs (DX12, Metal, Vulkan).

Shaderlibs are built from the following components:

graph TD;
    ShaderLib --> CB(Common Blocks)
    CB --> RSETS(Resource Sets)
    CB --> PSETS(Param Sets)
    CB --> CBLOCK(Code Blocks)
    CB --> SBLOCKS(Stage Blocks)
    ShaderLib --> B(Shader A)
    ShaderLib --> B1(Shader B)
    B --> C(Technique 1)
    B --> D(Technique 2)
    D --> F[Render State]
    D --> G[Vertex Layout]
    D --> H[Vertex Shader Stage]
    D --> I[Geometry Shader Stage]
    D --> J[Pixel Shader Stage]
    D --> K[Parameters]
    D --> L[Resources]

The full specification and syntax are beyond the scope of this article. The structure is heavily inspired by the one described on the excellent Our Machinery blog.

Shaderlibs are written in HJSON, a relaxed JSON variant with a few more convenience features, and look like this:

{

	VertexLayouts: {
		default_layout: [
			{ type:"float3", name:"position", slot:0, offset:0 },
			{ type:"float3", name:"normal", slot:1, offset: 12 }
		]
        ...
	}

	ParamBlocks: {
		material_params: [
			{ type:"float3", name:"materialColorDiffuse"}
			{ type:"float" , name:"materialOpacity"}
			...
		]

		view_params: [
			{ type:"float4x4", name:"viewMatrix" }
            ...
		]
	}

	ShaderStageBlocks: {
			mesh_vs: {
				Exports: [
					{type:"float4", name:"position", semantic:"Position"}
					{type:"float3", name:"normal"}
					{type:"float3", name:"view_dir"}
					{type:"float3", name:"wsposition"}
				]
				Code: '''
					float4 pos = float4(input.position.xyz, 1.0);
					...
					return output;
				'''
			}
	}


	Shaders: {
		
		"mesh/default": {
			RenderState: {
				CullMode: Back
			}

			Params: [
				{ slot: "VIEW", ParamBlock: "view_params" }
				{ slot: "MATERIAL", ParamBlock: "material_params" }
				{ slot: "OBJECT", ParamBlock: "object_params" }
			]

			VertexLayout: default_layout

			Techniques: {
				default: {
					ShaderLang: hlsl
					TargetProfile: [ "d3d_4_0","gles3"]

					// Reference a common vertex shader stage block; it could be used in multiple techniques/shaders.
					VertexShader: mesh_vs

					PixelShader: {
						Exports: [
							{type:"float4", name:"color", semantic:"RenderTarget0"}
						]
						Code: '''
							float3 nrm = -normalize(cross( ddx(input.wsposition),ddy(input.wsposition)));

							const float3 sunPos = float3(100.0,100.0,1000.0);
							float3 lightDir = normalize( sunPos);
							float lDotN = dot(lightDir, nrm)*0.5+0.5;

							float3 col = materialColorDiffuse.xyz * lDotN + float3(0.1,0.1,0.12);
							output.color = float4(col.x, col.y, col.z, 1.0f);
							return output;
						'''
					}
				}
			}
		
		}
	} 
	// END: Shaders

}

As outlined in the diagram above, shader libraries contain a collection of shaders, each of which contains a number of techniques. We generate a runtime render pipeline state for every technique of a shader.
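To illustrate the "one pipeline state per technique" rule, here is a minimal sketch that walks an already parsed shaderlib and lists the pipeline states that would be created. It assumes the file has been loaded into nested dictionaries by some HJSON parser; the function name and the extra `shadow` technique and `mesh/unlit` shader are hypothetical.

```python
def pipeline_keys(shaderlib: dict) -> list[tuple[str, str]]:
    """Return one (shader, technique) pair per pipeline state to create."""
    keys = []
    for shader_name, shader in shaderlib.get("Shaders", {}).items():
        for technique_name in shader.get("Techniques", {}):
            keys.append((shader_name, technique_name))
    return keys


# Stand-in for a parsed .shaderlib file (contents are made up).
parsed = {
    "Shaders": {
        "mesh/default": {"Techniques": {"default": {}, "shadow": {}}},
        "mesh/unlit": {"Techniques": {"default": {}}},
    }
}

for shader_name, technique_name in pipeline_keys(parsed):
    print(f"{shader_name} / {technique_name}")
```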

The constant/uniform buffer layout is not defined directly in shader code but explicitly via ParamBlocks. The code generator then generates the cbuffer/uniform buffer layouts from these, ensuring consistent padding and alignment. Another option would be to use shader reflection, but this would require separate implementations for every target bytecode format (at least DX bytecode and SPIR-V).

The same applies to VertexLayout and to shader stage inputs and outputs. Every stage defines Exports, which are then used as input for the next stage in the graphics pipeline.
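To make "consistent padding and alignment" more concrete, the sketch below computes byte offsets for a ParamBlock using the D3D constant buffer packing rule: members are packed into 16-byte registers and may not straddle a register boundary. It only handles a handful of types, and the function and table names are assumptions rather than the real generator's code.

```python
# Size in bytes of the types supported by this sketch.
SIZES = {"float": 4, "float2": 8, "float3": 12, "float4": 16, "float4x4": 64}


def cbuffer_offsets(params: list[dict]) -> tuple[dict[str, int], int]:
    """Return ({name: byte offset}, total size) for a list of parameters."""
    offsets = {}
    cursor = 0
    for param in params:
        size = SIZES[param["type"]]
        room = 16 - cursor % 16  # bytes left in the current register
        if cursor % 16 != 0 and size > room:
            cursor += room  # start a new 16-byte register
        offsets[param["name"]] = cursor
        cursor += size
    total = (cursor + 15) // 16 * 16  # buffer size is a multiple of 16
    return offsets, total


material_params = [
    {"type": "float3", "name": "materialColorDiffuse"},
    {"type": "float", "name": "materialOpacity"},
]
print(cbuffer_offsets(material_params))
```

Note that GLSL's std140 layout aligns a three-component vector to 16 bytes, so the same member list can end up with different offsets there. This is one reason to generate the layout, including any explicit padding, from a single description.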

Code Generation: Shaderlib to HLSL

Shaderlibs are processed using the Shader Compiler command-line tool, which generates the actual shaders in the target platform’s bytecode or source code format, along with all render state, in a proprietary binary format.

The shader compiler auto-generates the boilerplate code that shader stage functions need for parameter and resource blocks, vertex layouts and samplers.

To provide helpful error messages for generated target shader code, the generator injects

#line [lineno] [file_uri]

directives, referencing the corresponding parts in the source .shaderlib file as far as possible.
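A minimal sketch of this idea: given generated lines tagged with the shaderlib line they originate from, emit a #line directive whenever the source line number does not simply continue from the previous one. The function name and the input shape are assumptions for illustration.

```python
def emit_fragments(fragments: list[tuple[int, str]], file_uri: str) -> str:
    """Join (source line, text) pairs, inserting #line directives as needed."""
    out = []
    expected = None  # source line the downstream compiler currently assumes
    for line_no, text in fragments:
        if line_no != expected:
            out.append(f'#line {line_no} "{file_uri}"')
        out.append(text)
        expected = line_no + 1
    return "\n".join(out)


print(emit_fragments(
    [
        (99, "float4 pos = float4(input.position.xyz, 1.0);"),
        (100, "output.position = pos;"),
        (120, "return output;"),
    ],
    "/path/to/myshaders.shaderlib",
))
```

Consecutive source lines share one directive, so the output stays readable while errors reported by the target compiler still point at the right place in the .shaderlib file.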

Defines

The output starts with some compiler-specific preprocessor defines. These are always available and can be used in user shader code.

#define P_SC 1
#define P_SC_OPTIMIZED 0
#define P_SC_DEBUG 1
#define P_SC_TARGET_D3D_4_0_LEVEL_9_3 1
#define P_SC_TARGET_D3D 1
#define M_PI           3.14159265358979323846  /* pi */
#define M_PI_2         1.57079632679489661923  /* pi/2 */

...
#define USER_DEFINES....

User defines specified in the stage Defines property are appended right after the system defines (see ShaderStage subsection).
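As a rough sketch, such a preamble could be produced as shown below. The define names follow the listing above, but the rule for deriving them from a target profile string (upper-casing it and taking the part before the first underscore as the platform family) is an assumption, as are the function name and the `USE_FOG` user define.

```python
def system_defines(target_profile: str, optimized: bool, debug: bool,
                   user_defines: dict[str, str]) -> str:
    """Build the preprocessor preamble: system defines, then user defines."""
    family = target_profile.split("_")[0].upper()  # "d3d_4_0" -> "D3D"
    lines = [
        "#define P_SC 1",
        f"#define P_SC_OPTIMIZED {int(optimized)}",
        f"#define P_SC_DEBUG {int(debug)}",
        f"#define P_SC_TARGET_{target_profile.upper()} 1",
        f"#define P_SC_TARGET_{family} 1",
    ]
    # User defines are appended right after the system defines.
    lines += [f"#define {name} {value}" for name, value in user_defines.items()]
    return "\n".join(lines)


print(system_defines("d3d_4_0_level_9_3", optimized=False, debug=True,
                     user_defines={"USE_FOG": "1"}))
```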

Parameter Blocks

Parameter bindings specified by Params are mapped to constant buffer blocks. Registers are allocated per binding automatically, using the same allocation strategy as the render system. The names of parameters in a block are preserved.

// myshaders.shaderlib:194
ParamBlock: [ 
	{type:"float4x4", name:"u_modelViewProj"}	
	{type:"float4", name:"u_paintMat_0"} 
	{type:"float4", name:"u_paintMat_1"}
	{type:"float4", name:"u_paintMat_2"}
	{type:"float4", name:"u_extentRadiusFeather"}
	{type:"float4", name:"u_innerCol"}
	{type:"float4", name:"u_outerCol"}
]

This gets translated to the following output:

#line 194 "/path/to/myshaders.shaderlib"
cbuffer ParamBlock_1 : register(b1)
{
#line 195 "/path/to/myshaders.shaderlib"
	float4x4	u_modelViewProj;
#line 196 "/path/to/myshaders.shaderlib"
	float4	u_paintMat_0;
#line 197 "/path/to/myshaders.shaderlib"
	float4	u_paintMat_1;
#line 198 "/path/to/myshaders.shaderlib"
	float4	u_paintMat_2;
#line 199 "/path/to/myshaders.shaderlib"
	float4	u_extentRadiusFeather;
#line 200 "/path/to/myshaders.shaderlib"
	float4	u_innerCol;
#line 201 "/path/to/myshaders.shaderlib"
	float4	u_outerCol;
}

Stage Inputs & Exports

Each shader stage has inputs and outputs. A vertex shader’s input is specified by the vertex layout (see VertexLayout subsection). A pixel shader gets its input from the vertex shader stage exports (see ShaderStage subsection). The shader compiler auto-generates the input and output structures.

For a vertex shader RSInput is defined by the vertex layout.

struct RSInput
{
#line 185 "/path/to/myshaders.shaderlib"
	float2	a_position:RS_VTXATTR0;
	float4	a_color0:RS_VTXATTR1;
};

The output RSOutput is defined by the Exports property and is automatically used as input for the following shader stage.

struct RSOutput
{
	float4	v_position:SV_Position;
	float4	v_color0:RS_ATTRIB_1;
	float2	v_texcoord0:RS_ATTRIB_2;
};
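The generation of such a structure from an Exports list can be sketched as follows. The semantic naming (the position export becomes `SV_Position`, every other export gets `RS_ATTRIB_<index>`) is inferred from the listing above and may not match the real generator in all cases; the function name is hypothetical.

```python
def export_struct(struct_name: str, exports: list[dict]) -> str:
    """Generate an HLSL struct from a stage's Exports list."""
    lines = [f"struct {struct_name}", "{"]
    for index, export in enumerate(exports):
        if export.get("semantic") == "Position":
            semantic = "SV_Position"
        else:
            semantic = f"RS_ATTRIB_{index}"  # assumed: index within Exports
        lines.append(f"\t{export['type']}\t{export['name']}:{semantic};")
    lines.append("};")
    return "\n".join(lines)


exports = [
    {"type": "float4", "name": "v_position", "semantic": "Position"},
    {"type": "float4", "name": "v_color0"},
    {"type": "float2", "name": "v_texcoord0"},
]
print(export_struct("RSOutput", exports))
```

Because the next stage's input structure is generated from the same list, both sides of the interface always agree on types and semantics.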

At this stage the imported code blocks are merged into the output:

//
// Imported code blocks
//

void function_from_utils_block(..)
{

}

..

Stage Main Function

Finally the shader stage main function is generated.

RSOutput main(RSInput input)
{
	RSOutput output;

Now the body of the main function as specified in the Code property starts (see ShaderStage subsection).

#line 99 "/path/to/myshaders.shaderlib"
	#define u_paintMat (float3x3(u_paintMat_0.xyz, u_paintMat_1.xyz, u_paintMat_2.xyz))

	output.v_position = mul(u_modelViewProj, float4(input.a_position, 0.0, 1.0) );
	output.v_texcoord0 = mul(float3(input.a_position, 1.0), u_paintMat).xy;
	output.v_color0 = input.a_color0;
	return output;

The user main function body is responsible for returning output.

}

The shader is complete now.
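Putting the pieces of this section together, the assembly order can be sketched like this. It assumes each part has already been generated as a string; the function and parameter names are made up for the example.

```python
def assemble_stage(defines: str, cbuffers: str, structs: str, imported: str,
                   body: str, body_line: int, file_uri: str) -> str:
    """Concatenate the generated parts of one shader stage in output order."""
    main = "\n".join([
        "RSOutput main(RSInput input)",
        "{",
        "\tRSOutput output;",
        f'#line {body_line} "{file_uri}"',  # map user code back to its source
        body,  # the user code is responsible for returning output
        "}",
    ])
    return "\n\n".join([defines, cbuffers, structs, imported, main])


source = assemble_stage(
    defines="#define P_SC 1",
    cbuffers="cbuffer ParamBlock_1 : register(b1) { float4x4 u_modelViewProj; }",
    structs="struct RSInput { float2 a_position:RS_VTXATTR0; };\n"
            "struct RSOutput { float4 v_position:SV_Position; };",
    imported="// Imported code blocks",
    body="\toutput.v_position = mul(u_modelViewProj, float4(input.a_position, 0.0, 1.0));\n"
         "\treturn output;",
    body_line=99,
    file_uri="/path/to/myshaders.shaderlib",
)
print(source)
```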

Target Compilation

The generated shader is now fed into the target platform compiler if the source shader language is supported. This means the D3D HLSL compiler on Windows and the MetalSL compiler on macOS/iOS. GLSL is processed using the Khronos glslang reference compiler. Since GLSL for OpenGL ES has no intermediate or binary format, this compile step only serves as verification to catch errors early; the GLSL shaders themselves are compiled at runtime by the GLES driver.

If HLSL is used as source shader language, but the target is not D3D, then the shaders will be cross-compiled.

Cross-Compilation

Since the goal is to write all shaders in HLSL, cross-compilation is the default when targeting GLSL/MetalSL-based platforms. There are several options to translate HLSL to GLSL and MetalSL. Generally one has to decide if translation should be based on bytecode/IL or source. Projects like HLSLcc allow translation of DirectX bytecode (DXBC) to MSL or GLSL.

The drawback of this approach, at least when starting from DXBC, is that shaders can only be compiled on Windows: the (old) DirectX shader compiler (fxc.exe) is not available on Linux or macOS.

For DX12 onwards, Microsoft started work on a new DirectX shader compiler: DXC. DXC is open source, based on LLVM, and works on Windows, macOS and Linux. It outputs DXIL (DirectX Intermediate Language) instead of DXBC. What makes DXC interesting for cross-platform work is that it also comes with SPIR-V support.

In the Khronos (OpenGL/Vulkan) ecosystem, GLSL shaders are compiled either directly at runtime by the driver (OpenGL ES) or offline by the Khronos reference compiler, which ships with the Vulkan SDK. In the latter case the output is SPIR-V, which can be used with the Vulkan runtime.

A little-known fact is that the Khronos GLSL compiler also has an HLSL frontend. This means it’s also possible to produce SPIR-V IL from HLSL source, not just from GLSL.

So there are two options to get SPIR-V from HLSL:

  • Microsoft DirectX Shader Compiler (DXC)
  • Khronos GLSL Reference Compiler

We decided to go with the Khronos compiler for now, since it’s far easier to build and integrate. In the long term DXC should be the better choice.

The last building block is SPIR-V Cross, a library designed for parsing SPIR-V IL, providing reflection data and generating high-level HLSL, GLSL and MSL code.

The cross-compilation result is then processed by the target platform compiler as if the source language was directly supported.

For GLSL, some additional optimization and validation passes are employed to ensure optimal output. Additionally, the #line directives are stripped from the final GLSL code, since we use the extended form with file, line and column information, which requires a GLSL extension that is rarely supported by target drivers.

The following graph shows the full processing pipeline:

graph TD;
    shlb(Shaderlib Binary)
    spv(SPIR-V)
    SL(.shaderlib) --> Parser
    Parser --> IL(Intermediate Representation)
    IL --> Generator
    Generator --> hlsl(HLSL)
    hlsl --> fxc[D3D HLSL Compiler]
    fxc --> shlb
    Generator --> glslc[Khronos GLSL Reference Compiler]
    glslc --> spv
    spv --> spvx[SPIR-V Cross]
    spvx --> glsl(GLSL)
    glsl --> glslv[Khronos GLSL Validator]
    glslv --> shlb
    spvx --> msl(MetalSL)
    msl --> mtlc[MetalSL Compiler]
    mtlc --> shlb
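Read as code, the graph boils down to choosing a tool chain per target. The sketch below returns the ordered list of tools that HLSL generated from a shaderlib passes through; the tool names are taken from the graph, while the function name and the target identifiers (`d3d`, `gles`, `metal`) are assumptions.

```python
def toolchain(source_lang: str, target: str) -> list[str]:
    """Return the tools generated shader code passes through for a target."""
    if source_lang != "hlsl":
        raise ValueError("this sketch only covers HLSL sources")
    if target == "d3d":
        return ["D3D HLSL Compiler"]
    # Every non-D3D target goes through SPIR-V first.
    cross = ["Khronos GLSL Reference Compiler", "SPIR-V Cross"]
    if target == "gles":
        return cross + ["Khronos GLSL Validator"]
    if target == "metal":
        return cross + ["MetalSL Compiler"]
    raise ValueError(f"unknown target: {target}")


for name in ("d3d", "gles", "metal"):
    print(name, "->", " -> ".join(toolchain("hlsl", name)))
```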

Tooling / IDE support

Custom build rules/steps are used to handle .shaderlib files as source files in Visual Studio and Xcode. As a minimum, diagnostics like compiler warnings and errors should show up properly in the IDE. Additionally, basic syntax highlighting should be provided to make editing shaderlib files within an IDE more pleasant.

Visual Studio and Xcode are the primary IDEs used, with VSCode used in addition. Tooling support is therefore targeted at those three.

Diagnostics

Let’s start with diagnostics. The IDE has to understand the format of the shader compiler’s warnings and errors. This means that on Windows the output is formatted to be MSBuild/Visual Studio aware:

sourcefile(lineno,column): warning CS0168: The variable 'foo' is declared but never used

On macOS the GCC/Clang output format is used:

sourcefile:lineno:column: message

Since we place #line ... markers during code generation, error messages and warnings point directly at the source shaderlib file.
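Both formats can be produced from the same diagnostic record. The following is a small sketch; the `Diagnostic` fields and the `SC0001` code are invented for the example, and the GCC-style line places the severity in front of the message text.

```python
from dataclasses import dataclass


@dataclass
class Diagnostic:
    file: str
    line: int
    column: int
    severity: str  # "warning" or "error"
    code: str      # tool-specific diagnostic code (hypothetical)
    message: str


def format_msbuild(d: Diagnostic) -> str:
    """Format understood by MSBuild/Visual Studio."""
    return f"{d.file}({d.line},{d.column}): {d.severity} {d.code}: {d.message}"


def format_gcc(d: Diagnostic) -> str:
    """GCC/Clang-style format understood by Xcode."""
    return f"{d.file}:{d.line}:{d.column}: {d.severity}: {d.message}"


diag = Diagnostic("myshaders.shaderlib", 99, 5, "warning", "SC0001",
                  "The variable 'foo' is declared but never used")
print(format_msbuild(diag))
print(format_gcc(diag))
```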

Syntax Highlighting

Currently only Visual Studio and VSCode provide extension points for syntax highlighting. Visual Studio supports TextMate .tmLanguage files out of the box, so it isn’t even necessary to write an extension/add-on. All it takes is placing the *.tmLanguage files in %USERPROFILE%\.vs\Extensions\Shaderlib\*

The official HJSON site provides a HJSON.tmLanguage. To get embedded HLSL highlighting within multi-line code strings, the HLSL.tmLanguage from VSCode is used and activated within multi-line string sections.

    ....
	<!-- modify mstring rule.. -->
	<key>mstring</key>
      <dict>
        <key>begin</key>
        <string>'''</string>
        <key>beginCaptures</key>
        <array>
          <dict/>
        </array>
        <key>end</key>
        <string>(''')(?:\s*((?:[^\s#/]|/[^/*]).*)$)?</string>
        <key>endCaptures</key>
        <dict>
          <key>1</key>
          <dict/>
          <key>2</key>
          <dict>
            <key>name</key>
            <string>invalid.illegal.value.shaderlib</string>
          </dict>
        </dict>
        <key>patterns</key>
        <array>
          <dict>
            <key>include</key>
			<!-- Activate HLSL here... -->
            <string>source.hlsl</string>
          </dict>
        </array>
        <key>name</key>
        <string>string.quoted.multiline.shaderlib</string>
      </dict>

Visual Studio Code provides a rich extension API. Information on how to add custom syntax highlighting can be found in the VSCode documentation.

Wrap up

The pipeline outlined in this post has served us well for a year now and was surprisingly straightforward to build. This is largely due to the excellent SPIR-V Cross project.

There are also some more generic solutions out there, likewise based on SPIR-V, which provide similar or better functionality.

Happy to discuss on Twitter