Shader models are feature sets introduced with successive Direct3D releases. Because the term is widely used and often quoted to end users, it is now commonly used to identify a specific class of hardware.
This page covers only how the different shader models affect shader programming. A shader model usually introduces other functionality as well, such as higher precision or instancing; describing those features is left to a future revision of this document.
Version | Instruction slots | Constant count |
---|---|---|
vs 1.1 | 128 | (4) |
vs 2.0 | 256 | (4) |
vs 2.a | 256 | (4) |
vs 3.0 | (1) | (4) |
ps 1.1 - ps 1.3 | 8 - 12 | 8 |
ps 1.4 | 28 (two phases) | 8 |
ps 2.0 | 96 | 32 |
ps 2.a | (2) | 32 |
ps 2.b | (2) | 32 |
ps 3.0 | (3) | 224 |
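(1): D3DCAPS9.MaxVertexShader30InstructionSlots
(2): D3DCAPS9.PS20Caps.NumInstructionSlots
(3): D3DCAPS9.MaxPixelShader30InstructionSlots
(4): D3DCAPS9.MaxVertexShaderConst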
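The fixed entries of the table above can be captured in a small lookup, with the footnoted entries treated as "ask the device". The following is a minimal sketch under that reading: the function names and string keys are hypothetical and are not part of Direct3D, and only rows with a single fixed value are included.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Fixed instruction-slot limits copied from a few rows of the table above.
// std::nullopt marks profiles whose limit is a footnoted, caps-reported
// value, and is also returned for profiles this sketch does not list.
inline std::optional<int> fixedInstructionSlots(std::string_view profile) {
    struct Row { std::string_view name; std::optional<int> slots; };
    static const Row rows[] = {
        {"vs_2_0", 256}, {"vs_2_a", 256}, {"vs_3_0", std::nullopt},
        {"ps_1_4", 28},  {"ps_2_0", 96},  {"ps_2_a", std::nullopt},
        {"ps_2_b", std::nullopt}, {"ps_3_0", std::nullopt},
    };
    for (const Row& row : rows) {
        if (row.name == profile) return row.slots;
    }
    return std::nullopt;
}

// Returns the fixed limit when there is one, otherwise the value the
// caller read from the device caps (passed in as capsReported).
inline int slotsOr(std::string_view profile, int capsReported) {
    assert(capsReported >= 0);
    return fixedInstructionSlots(profile).value_or(capsReported);
}
```

A caller would pass the number read from the device capabilities as `capsReported`; the sketch itself never talks to Direct3D.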
Hardware support for shading models
Card (chip) | Vertex shader | Pixel shader |
---|---|---|
GeForce 3 (NV20) | 1.1 | 1.1 |
Radeon 8500 up to 9200 (R200) | 1.1 | 1.4 |
GeForce 4 Ti (NV25) | 1.1 | 1.3 |
Parhelia | 2.0 | 1.3 |
Radeon 9500 and later (R3xx) | 2.0 | 2.0 |
Wildcat VP 10 | 1.1 | 1.2 |
GeForce FX (NV30) | 2.a | 2.a |
GeForce 6xxx (NV4x) | 3.0 | 3.0 |
Radeon X800 (R4xx) | 2.0 | 2.b (1) |
(1): The 2.b pixel shader model is actually a subset of 2.a and therefore provides less functionality.
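Applications normally do not match card names; they compare the shader version reported by the driver against the version they need. Direct3D 9 packs the major and minor numbers into a 32-bit value (the D3DVS_VERSION and D3DPS_VERSION macros in the SDK headers do this). The sketch below re-creates that packing with local helpers so that it compiles without the Windows SDK; the helper names are hypothetical. As far as the version number goes, the 2.a and 2.b variants are not distinguished, so telling them apart needs additional capability checks that are not shown here.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-ins for the D3DVS_VERSION / D3DPS_VERSION macros.
constexpr std::uint32_t vsVersion(unsigned major, unsigned minor) {
    return 0xFFFE0000u | (major << 8) | minor;
}
constexpr std::uint32_t psVersion(unsigned major, unsigned minor) {
    return 0xFFFF0000u | (major << 8) | minor;
}

// Extract the two version bytes from a packed value.
constexpr unsigned versionMajor(std::uint32_t v) { return (v >> 8) & 0xFFu; }
constexpr unsigned versionMinor(std::uint32_t v) { return v & 0xFFu; }

// True if the reported version is at least the required one.
// Both values must describe the same shader type (same high word).
inline bool supportsAtLeast(std::uint32_t reported, std::uint32_t required) {
    assert((reported >> 16) == (required >> 16));
    if (versionMajor(reported) != versionMajor(required)) {
        return versionMajor(reported) > versionMajor(required);
    }
    return versionMinor(reported) >= versionMinor(required);
}
```

For example, a card from the table that reports pixel shader 1.4 passes a check for 1.1 but fails a check for 2.0.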
Arithmetic instructions
How to read this table: dark gray means not available in this profile.
Instruction mnemonic | Description | Used slots | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
Vertex shader | Pixel shader | |||||||||
1.1-1.3 | 1.4 | 2.0 | 2.x | 3.0 | 1.1-1.3 | 1.4 | 2.0 | 3.0 | ||
add | Add two vector registers | 1 | ||||||||
abs | Absolute value | 1 | ||||||||
crs | Cross product | 2 | ||||||||
dp3, dp4 | Dot product of 3D or 4D vectors | 1 | ||||||||
dst | Compute distance vector (1, d, d², 1/d) | 1 | ||||||||
exp/expp | 2^x, full and partial precision. | 10/1 | 1/1 | |||||||
frc | Fractional part | 3 | 1 | |||||||
lrp | Linear interpolation of values. | 2 | ||||||||
lit | Compute diffuse and specular lighting coefficients from N·L, N·H and a specular power. | 1 | 3??? | |||||||
log/logp | log2(x), full and partial precision. | 10/1 | 1/1 | |||||||
m3x2/m3x3/m3x4 | Multiply a 3-component vector by a 3×n matrix. | 2/3/4 | ||||||||
m4x3/m4x4 | Multiply a 4-component vector by a 4×n matrix. | 3/4 | ||||||||
mad | Multiply first two vectors, then add third one. | 1 | ||||||||
min/max | Return component-wise min/max vector. | 1 | ||||||||
mov | Copy the source register to the destination register. | 1 | ||||||||
mova | Move float value to address register. | 1 | ||||||||
mul | Component-wise multiply | 1 | ||||||||
nop | No operation. | 1 | ||||||||
nrm | Normalize vector | 3 | ||||||||
pow | x^y | 3 | ||||||||
rcp | Reciprocal. | 1 | ||||||||
rsq | Reciprocal SQuare root. | 1 | ||||||||
sincos | Optimized computation of both sine and cosine. | 8 | ||||||||
sge | Set if greater than or equal to. | 1 | ||||||||
sgn | Sign | 3 | ||||||||
slt | Set if less than. | 1 | ||||||||
sub | Subtract. | 1 |
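The multi-slot entries in the table can be read as short sequences of single-slot instructions. The sketch below emulates a few instructions on the CPU to illustrate this: `crs` as a `mul` followed by a `mad` (2 slots), `lrp` as a `sub` followed by a `mad` (2 slots), and `nrm` as `dp3`, `rsq` and `mul` (3 slots). These decompositions are one plausible reading that matches the slot counts, not a statement of how any particular hardware expands them. The `Vec4` type and all helper names are assumptions of the example.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec4 = std::array<float, 4>;

// Single-slot building blocks.
inline Vec4 mul(const Vec4& a, const Vec4& b) {
    return {a[0] * b[0], a[1] * b[1], a[2] * b[2], a[3] * b[3]};
}
inline Vec4 sub(const Vec4& a, const Vec4& b) {
    return {a[0] - b[0], a[1] - b[1], a[2] - b[2], a[3] - b[3]};
}
inline Vec4 mad(const Vec4& a, const Vec4& b, const Vec4& c) {
    return {a[0] * b[0] + c[0], a[1] * b[1] + c[1],
            a[2] * b[2] + c[2], a[3] * b[3] + c[3]};
}
inline float dp3(const Vec4& a, const Vec4& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}
inline float rsq(float x) {
    assert(x > 0.0f);  // the sketch only handles positive inputs
    return 1.0f / std::sqrt(x);
}

// Source swizzles and negation are modifiers, not extra instructions.
inline Vec4 yzx(const Vec4& v) { return {v[1], v[2], v[0], v[3]}; }
inline Vec4 zxy(const Vec4& v) { return {v[2], v[0], v[1], v[3]}; }
inline Vec4 negate(const Vec4& v) { return {-v[0], -v[1], -v[2], -v[3]}; }

// crs: cross product as mul + mad.
inline Vec4 crs(const Vec4& a, const Vec4& b) {
    const Vec4 t = mul(yzx(a), zxy(b));
    Vec4 r = mad(negate(zxy(a)), yzx(b), t);
    r[3] = 0.0f;  // only xyz is meaningful; w is zeroed for a defined result
    return r;
}

// lrp: t * (a - b) + b, as sub + mad.
inline Vec4 lrp(const Vec4& t, const Vec4& a, const Vec4& b) {
    return mad(t, sub(a, b), b);
}

// nrm: scale by the reciprocal length of the xyz part, as dp3 + rsq + mul.
inline Vec4 nrm(const Vec4& a) {
    const float s = rsq(dp3(a, a));
    return mul(a, {s, s, s, s});
}
```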
Flow instructions
Instruction mnemonic | Description | Used slots | |||||||
---|---|---|---|---|---|---|---|---|---|
Vertex shader | Pixel shader | ||||||||
2.0 | 2.x | 3.0 | 1.1-1.3 | 1.4 | 2.0 | 3.0 | |||
call | "Push IP and jump" to subroutine. | 2 | |||||||
callnz bool | Conditional jump to subroutine if boolean register is not zero. | 3 | |||||||
ret | Return from subroutine. | 1 | |||||||
if bool | Execute block if condition is met. | 3 | |||||||
else | Execute block if condition is not met. | 1 | |||||||
endif | End of if/else instructions. | 1 | |||||||
loop | Begin loop instruction block. | 3 | |||||||
endloop | End of loop instruction block. | 2 | |||||||
rep | Begin rep instruction block | 2 | |||||||
endrep | End of rep instruction block. | 2 |
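The "Used slots" columns of the arithmetic and flow tables can be used to estimate whether a program fits a profile's instruction-slot limit. The sketch below sums slot costs for a list of mnemonics. The function names are hypothetical, the costs are copied from the tables above (only rows with a single value are included), and the result is a static slot count, not the number of instructions executed when loops run.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Slot cost of one instruction, taken from the tables above.
// Throws for mnemonics this sketch does not list.
inline int slotCost(const std::string& mnemonic) {
    static const std::map<std::string, int> costs = {
        // arithmetic
        {"add", 1}, {"sub", 1}, {"mul", 1}, {"mad", 1}, {"mov", 1},
        {"dp3", 1}, {"dp4", 1}, {"rcp", 1}, {"rsq", 1}, {"crs", 2},
        {"lrp", 2}, {"nrm", 3}, {"pow", 3}, {"sgn", 3}, {"sincos", 8},
        {"m3x2", 2}, {"m3x3", 3}, {"m3x4", 4}, {"m4x3", 3}, {"m4x4", 4},
        // flow control
        {"call", 2}, {"callnz", 3}, {"ret", 1}, {"if", 3}, {"else", 1},
        {"endif", 1}, {"loop", 3}, {"endloop", 2}, {"rep", 2}, {"endrep", 2},
    };
    const auto it = costs.find(mnemonic);
    if (it == costs.end()) {
        throw std::invalid_argument("unknown mnemonic: " + mnemonic);
    }
    return it->second;
}

// Total static slot usage of a program given as a list of mnemonics.
inline int totalSlots(const std::vector<std::string>& program) {
    int total = 0;
    for (const std::string& mnemonic : program) {
        total += slotCost(mnemonic);
    }
    return total;
}

// True if the program fits within the given slot limit.
inline bool fitsBudget(const std::vector<std::string>& program, int limit) {
    assert(limit >= 0);
    return totalSlots(program) <= limit;
}
```

For instance, a transform (`m4x4`), a normalization (`nrm`), a dot product and a `mad` add up to 9 slots, well within the 256 slots listed for vs 2.0.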
Setup instructions
All setup instructions take 0 slots.
Instruction mnemonic | Description | Notes | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
dcl_usage_input | Bind vertex stream to input register. | |||||||||
def | Define constant. | |||||||||
defb | Define boolean constant (for static flow control). | Introduced in vs2.0 | ||||||||
defi | Define integer constant (for static flow control). | Introduced in vs2.0 | ||||||||
label | Mark the target location of a call or callnz instruction. | Introduced in vs2.0 | ||||||||
vs | Declare profile, must be first instruction. | VS only |
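To show how the setup instructions sit in front of the arithmetic ones, the sketch below keeps a minimal vs_1_1 program as text and checks that its first statement is the profile declaration. The shader text is an illustrative example, not taken from any particular application. The checker functions are hypothetical helpers, and treating lines that begin with `//` or `;` as comments is an assumption of this sketch. Accepting a `ps_` prefix for pixel shaders is likewise an addition of the example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// A minimal vertex shader: profile declaration, input binding, one
// constant definition, then the instructions that actually use slots.
const char* const kExampleShader =
    "vs_1_1\n"
    "dcl_position v0\n"
    "def c4, 1.0, 1.0, 1.0, 1.0\n"
    "m4x4 oPos, v0, c0\n"
    "mov oD0, c4\n";

// Returns the first non-blank, non-comment line with surrounding
// whitespace removed, or an empty string if there is none.
inline std::string firstStatement(const std::string& source) {
    std::istringstream in(source);
    std::string line;
    while (std::getline(in, line)) {
        const auto begin = line.find_first_not_of(" \t\r");
        if (begin == std::string::npos) continue;          // blank line
        if (line.compare(begin, 2, "//") == 0) continue;   // comment
        if (line[begin] == ';') continue;                  // comment
        const auto end = line.find_last_not_of(" \t\r");
        assert(end >= begin);
        return line.substr(begin, end - begin + 1);
    }
    return {};
}

// True if the first statement looks like a profile declaration.
inline bool declaresProfileFirst(const std::string& source) {
    const std::string first = firstStatement(source);
    return first.rfind("vs_", 0) == 0 || first.rfind("ps_", 0) == 0;
}
```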
Vertex shader registers
All registers are four components wide, unless otherwise noted. TODO: review this table.
Register mnemonic | Description | Count | Read/Write | Relative addressing | Notes | ||
---|---|---|---|---|---|---|---|
1.1 | In | a0 | Address register | 14 | RW | No | Only the .x write mask is allowed in vs1.x; all four components are available in vs2 |
aL | Loop register | 1 | Ro | No | Only the .x write mask is allowed in vs1.x; all four components are available in vs2 | ||
cn | Float constant register | >96 (1) | Ro | Use a0.x | Default 0,0,0,0. INF reads/inst in vs1.x, 2 reads/inst in vs2 | ||
vn | Input register from vertex stream | 16 | Ro | No | Default 0,0,0,1. | ||
rn | Temporary register | 12 | RW | No | Undefined, will cause error if read before initialization. | ||
Out | ??? | ??? | ??? | ??? | ??? | ??? |
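The "Relative addressing" column says that constant registers can be indexed through a0.x, as in `c[a0.x + n]`. The sketch below models that lookup on the CPU. The struct and member names are hypothetical, the register count of 96 is an assumed size for the example, and throwing on an out-of-range index is a choice made for the sketch, not a description of hardware behavior.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>

using Vec4 = std::array<float, 4>;

struct VertexRegisters {
    // Constant registers; value-initialized, so each defaults to 0,0,0,0.
    std::array<Vec4, 96> c{};
    // The .x component of the address register, already an integer here.
    int a0x = 0;

    // Models a read of c[a0.x + n].
    const Vec4& readConstant(int n) const {
        assert(n >= 0);
        const int index = a0x + n;
        if (index < 0 || index >= static_cast<int>(c.size())) {
            throw std::out_of_range("constant index outside register file");
        }
        return c[static_cast<std::size_t>(index)];
    }
};
```

With `a0x` set to 2, `readConstant(5)` returns the contents of c7; a typical use is selecting one matrix out of an array of matrices stored in consecutive constants.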
Pixel shader registers
All registers are four-component wide, unless otherwise noted.
Register mnemonic | Description | Count | Read/Write | Dimensionality | Notes | ||
---|---|---|---|---|---|---|---|
1.1 | In | ??? | ??? | ??? | ??? | ??? | ??? |
Out | oPos | Position register | 1 | Wo | 4D | ||
oFog | Fog density register | 1 | Wo | 1D | |||
oDn | Color register | 2 | Wo | 4D | oD0 is diffuse color, oD1 is specular. | ||
oTn | Texcoord register | 8 | Wo | 4D | Indexable as oT[a0.x+n]. |
1.1: first release.
2.0: static flow control, four-component address register, new instructions, new registers.
2.x: dynamic flow control, nesting, more temporary registers, predication, new instructions, new registers.
3.0: texture lookups (samplers), indexable registers, 32 temporary registers, new instructions.
- "Programming vertex and pixel shaders" by Wolfgang Engel, ISBN 1-58450-349-1.
- Microsoft MSDN, Direct3D reference, the D3DCAPS9 and D3DPSHADERCAPS2_0 structures.