saturate and max #198

Closed
qgj opened this issue Dec 22, 2017 · 6 comments

Comments

qgj commented Dec 22, 2017

In the basic lighting chapter, the dot product of worldNormal and worldLightDir is wrapped in saturate, while later chapters use max(0, …). Do these two approaches produce different results?

@candycat1992
Owner

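For variables whose values lie in the range -1 to 1, the two approaches give the same result. However, saturate in Unity usually compiles to two instructions, roughly min(max(val, 0.0), 1.0). If you already know the range of the value, you can use max directly and save one instruction.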
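
To check that claim numerically, here is a minimal sketch in Python rather than shader code. The helpers `saturate` and `max0` are hypothetical names written only to mirror the two shader expressions; they are not Unity APIs.

```python
def saturate(x: float) -> float:
    """Clamp x to [0, 1], modeling the HLSL saturate intrinsic."""
    return min(max(x, 0.0), 1.0)


def max0(x: float) -> float:
    """The max(0, x) form used in the later chapters."""
    return max(0.0, x)


# A dot product of two unit vectors always lies in [-1, 1].
in_range = [-1.0, -0.25, 0.0, 0.5, 1.0]
same_in_range = all(saturate(x) == max0(x) for x in in_range)

# Above 1 the two forms diverge: saturate clamps, max does not.
above = 1.5
print(same_in_range, saturate(above), max0(above))
```

The two forms only agree because both input vectors are normalized. If either one is not, the dot product can exceed 1 and max(0, …) will let that value through.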

Author

qgj commented Dec 25, 2017

OK, thanks. One more question: when computing tangentNormal.z, I don't quite understand the dot(tangentNormal.xy, tangentNormal.xy) operation.

@candycat1992
Owner

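Taking the dot product of a vector with itself gives its squared length. Because tangentNormal is a unit vector, the z component can be computed from the xy components.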
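
For a unit vector, x² + y² + z² = 1, so z = sqrt(1 - dot(xy, xy)). The following is a minimal Python sketch of that step, not the book's shader code. The function name `reconstruct_z` is hypothetical, and clamping the squared length before the square root is an assumption added here to guard against rounding error. Taking the positive root assumes the tangent-space normal points away from the surface.

```python
import math


def reconstruct_z(x: float, y: float) -> float:
    """Recover the z component of a unit vector from its x and y components."""
    xy_len_sq = x * x + y * y  # same as dot(xy, xy)
    # Assumption: clamp to [0, 1] so rounding error cannot make the
    # argument of sqrt negative.
    xy_len_sq = min(max(xy_len_sq, 0.0), 1.0)
    return math.sqrt(1.0 - xy_len_sq)


z = reconstruct_z(0.6, 0.0)
print(z)
```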

Author

qgj commented Dec 25, 2017

Oh, I see. Thank you very much.


alasja commented Mar 17, 2018

In Unity, I clicked the "Compile and show code" button in the shader's Inspector and saw:

u_xlat0.x = dot(u_xlat1.xyz, u_xlat0.xyz);
u_xlat0.x = clamp(u_xlat0.x, 0.0, 1.0);

This corresponds to saturate(dot(lightDir, worldNormal));. So it doesn't seem to be translated into min(max(val, 0.0), 1.0), does it?

I ask because today I read the following in Real-Time Rendering, 3rd edition, Section 5.5, page 115:

saturate is faster than the more general max function on most hardware.

So I find this a bit odd.

Owner

candycat1992 commented Mar 19, 2018

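@alasja My earlier answer was indeed vague, so let me add some detail here.

In Unity, the same ShaderLab code compiles to different output on different target platforms, and even on different devices. Take this fragment shader as an example: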

fixed4 frag (v2f i) : SV_Target
{
	fixed4 col = i.uv.xyxy;
	col = saturate(col);
	return col;
}

Under DX9 it compiles to:

-- Fragment shader for "d3d9":
// Stats: 1 math
Shader Disassembly:
//
// Generated by Microsoft (R) HLSL Shader Compiler 10.1
    ps_3_0
    dcl_texcoord_pp v0.xy
    mov_sat_pp oC0, v0.xyxy

// approximately 1 instruction slot used

Under DX11:

-- Fragment shader for "d3d11":
Shader Disassembly:
//
// Generated by Microsoft (R) D3D Shader Disassembler
//
//
// Input signature:
//
// Name                 Index   Mask Register SysValue  Format   Used
// -------------------- ----- ------ -------- -------- ------- ------
// TEXCOORD                 0   xy          0     NONE   float   xy  
// SV_POSITION              0   xyzw        1      POS   float       
//
//
// Output signature:
//
// Name                 Index   Mask Register SysValue  Format   Used
// -------------------- ----- ------ -------- -------- ------- ------
// SV_Target                0   xyzw        0   TARGET   float   xyzw
//
      ps_4_0
      dcl_input_ps linear v0.xy
      dcl_output o0.xyzw
   0: mov_sat o0.xyzw, v0.xyxy
   1: ret 
// Approximately 0 instruction slots used

Under OpenGL ES 2.0:

#ifdef FRAGMENT
varying highp vec2 xlv_TEXCOORD0;
void main ()
{
  lowp vec4 col_1;
  highp vec4 tmpvar_2;
  tmpvar_2 = xlv_TEXCOORD0.xyxy;
  col_1 = tmpvar_2;
  lowp vec4 tmpvar_3;
  tmpvar_3 = clamp (col_1, 0.0, 1.0);
  col_1 = tmpvar_3;
  gl_FragData[0] = tmpvar_3;
}

Under OpenGL ES 3.0:

#ifdef FRAGMENT
#version 300 es

precision highp int;
in highp vec2 vs_TEXCOORD0;
layout(location = 0) out mediump vec4 SV_Target0;
void main()
{
    SV_Target0 = vs_TEXCOORD0.xyxy;
#ifdef UNITY_ADRENO_ES3
    SV_Target0 = min(max(SV_Target0, 0.0), 1.0);
#else
    SV_Target0 = clamp(SV_Target0, 0.0, 1.0);
#endif
    return;
}

#endif

Under Metal:

-- Fragment shader for "metal":
Shader Disassembly:
#include <metal_stdlib>
#include <metal_texture>
using namespace metal;
struct Mtl_FragmentIn
{
    float2 TEXCOORD0 [[ user(TEXCOORD0) ]] ;
};

struct Mtl_FragmentOut
{
    float4 SV_Target0 [[ color(0) ]];
};

fragment Mtl_FragmentOut xlatMtlMain(
    Mtl_FragmentIn input [[ stage_in ]])
{
    Mtl_FragmentOut output;
    output.SV_Target0 = input.TEXCOORD0.xyxy;
    output.SV_Target0 = clamp(output.SV_Target0, 0.0f, 1.0f);
    return output;
}

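The Vulkan output is too long, so I won't paste it here.

As you can see, on some platforms, such as DX9 and DX11, the compiler really does emit a true saturate operation (the _sat modifier on mov). On most mobile targets, such as ES 2.0 and Metal, it is compiled to clamp, and on ES 3.0 the result also depends on the device: min(max(…)) when UNITY_ADRENO_ES3 is defined, clamp otherwise. As for why there are so many cases, Unity presumably weighs the cost of each operation on each platform and picks the cheapest equivalent that produces the same result as saturate.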
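
The listings spell the same operation in two ways in source form: clamp(x, 0.0, 1.0) and min(max(x, 0.0), 1.0). As a rough sanity check that they agree for ordinary finite inputs, here is a small Python sketch. The function names are hypothetical models of the shader built-ins, and NaN inputs are deliberately left out because their treatment may vary by platform.

```python
def clamp(x: float, lo: float, hi: float) -> float:
    """Model of a GLSL/Metal-style clamp, written with explicit branches."""
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x


def clamp01(x: float) -> float:
    """The clamp(x, 0.0, 1.0) spelling seen in the ES 2.0 and Metal output."""
    return clamp(x, 0.0, 1.0)


def min_max(x: float) -> float:
    """The min(max(x, 0.0), 1.0) spelling seen under UNITY_ADRENO_ES3."""
    return min(max(x, 0.0), 1.0)


samples = [-2.0, 0.0, 0.3, 1.0, 7.5]
results = [(x, clamp01(x), min_max(x)) for x in samples]
all_equal = all(c == m for _, c, m in results)
print(all_equal)
```

Since the values match, any difference between the spellings is about instruction cost on a given GPU and driver, not about the rendered result.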
