gallium: add blitter

Hi Keith,

I've finished the blitter module. It fully implements the clear, surface_copy, and surface_fill functions. It properly falls back to software when a surface cannot be sampled or rendered to according to its usage. Copying a stencil buffer always falls back unless the ignore_stencil parameter (see util_blitter_copy) is set to TRUE. To my knowledge, GPUs cannot copy the stencil buffer (I'm not sure whether fiddling with texture formats can help). It's all documented in u_blitter.h.

The pipe driver can optionally hook up a function to draw a quad (blitter_context::draw_quad). I realized that embedding 4 vertices into a command stream (AKA immediate mode) is much faster than writing them to a vertex buffer due to reduced driver overhead. It might be worth considering adding the draw_quad function to pipe_context.

While working on the blitter, I added the following things to util/u_simple_shaders:

- util_make_fragment_tex_shader has a new parameter, tex_target, whose value should be one of the TGSI_TEXTURE_* enums, so that the shader can be used to sample from any kind of texture.
- Added util_make_fragment_tex_shader_writedepth, which writes depth sampled from a texture. It's used for copying depth textures.
- Added util_make_fragment_clonecolor_shader, which copies input COLOR[0] to a specified number of render targets. It's used to clear MRTs.

Also, I moved the code for converting 2D texture coordinates into cubemap texture coordinates from u_gen_mipmap to a new function in util/u_texture.

Please review/push. Once it gets approved, I will send patches with r300g blit support to Corbin. With this work, untiling a texture will be as easy as calling surface_copy while the driver state remains intact (theoretically).

Cheers.

Marek

On Thu, Dec 10, 2009 at 6:23 PM, Keith Whitwell <keithw@...> wrote:
> On Thu, 2009-12-10 at 01:52 -0800, Marek Olšák wrote:
>> Keith,
>>
>> I've taken your comment into consideration and started laying out a
>> new simple driver module which I call Blitter. The idea is to provide
>> acceleration for operations like clear, surface_copy, and
>> surface_fill. The module doesn't depend on a CSO context, instead, a
>> driver must call appropriate util_blitter_save* functions to save CSOs
>> and a blit operation takes care of their restoration once it's done.
>>
>> I attached a patch illustrating the idea with the clear implemented
>> and a working example of usage, but it's not ready to get pushed yet.
>>
>> Please tell me what you think about it.
>
> Marek,
>
> This looks good to me. It looks like this approach keeps the
> implementation entirely on the driver side of the interface, which is
> what I was hoping for.
>
> I had assumed that doing this type of operation in the driver would
> require assistance "from above" for saving and restoring state. But it
> seems like you've been able to do without that, which is nice.
>
> Let me know how it progresses.
>
> Keith
>
>

[0001-util-add-new-fragment-shaders-to-simple_shaders.patch]

From 511f58a54315d07740493cdda050d1ebd5a4ecd3 Mon Sep 17 00:00:00 2001
From: =?utf-8?q?Marek=20Ol=C5=A1=C3=A1k?= <maraeo@...>
Date: Sat, 12 Dec 2009 06:34:29 +0100
Subject: [PATCH 1/3] util: add new fragment shaders to simple_shaders

New shaders:
* Fragment shader which writes depth sampled from a texture
* Fragment shader which copies COLOR[0] to multiple render targets

Additional improvements:
* The fragment 'tex' shaders now take a sampler type (TGSI_TEXTURE_*) so that
  they can sample from any type of texture, not only from a 2D one.
---
 src/gallium/auxiliary/util/u_blit.c           |    7 ++-
 src/gallium/auxiliary/util/u_gen_mipmap.c     |    2 +-
 src/gallium/auxiliary/util/u_simple_shaders.c |   70 ++++++++++++++++++++++---
 src/gallium/auxiliary/util/u_simple_shaders.h |   13 ++++-
 4 files changed, 80 insertions(+), 12 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_blit.c b/src/gallium/auxiliary/util/u_blit.c
index abe1de3..c9050ca 100644
--- a/src/gallium/auxiliary/util/u_blit.c
+++ b/src/gallium/auxiliary/util/u_blit.c
@@ -126,7 +126,8 @@ util_create_blit(struct pipe_context *pipe, struct cso_context *cso)
    }

    /* fragment shader */
-   ctx->fs[TGSI_WRITEMASK_XYZW] = util_make_fragment_tex_shader(pipe);
+   ctx->fs[TGSI_WRITEMASK_XYZW] =
+      util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_2D);
    ctx->vbuf = NULL;

    /* init vertex data that doesn't change */
@@ -420,7 +421,9 @@ util_blit_pixels_writemask(struct blit_state *ctx,
    cso_set_sampler_textures(ctx->cso, 1, &tex);

    if (ctx->fs[writemask] == NULL)
-      ctx->fs[writemask] = util_make_fragment_tex_shader_writemask(pipe, writemask);
+      ctx->fs[writemask] =
+         util_make_fragment_tex_shader_writemask(pipe, TGSI_TEXTURE_2D,
+                                                 writemask);

    /* shaders */
    cso_set_fragment_shader_handle(ctx->cso, ctx->fs[writemask]);
diff --git a/src/gallium/auxiliary/util/u_gen_mipmap.c b/src/gallium/auxiliary/util/u_gen_mipmap.c
index 83263d9..1728e66 100644
--- a/src/gallium/auxiliary/util/u_gen_mipmap.c
+++ b/src/gallium/auxiliary/util/u_gen_mipmap.c
@@ -1317,7 +1317,7 @@ util_create_gen_mipmap(struct pipe_context *pipe,
    }

    /* fragment shader */
-   ctx->fs = util_make_fragment_tex_shader(pipe);
+   ctx->fs = util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_2D);

    /* vertex data that doesn't change */
    for (i = 0; i < 4; i++) {
diff --git a/src/gallium/auxiliary/util/u_simple_shaders.c b/src/gallium/auxiliary/util/u_simple_shaders.c
index 1c8b157..8172ead 100644
--- a/src/gallium/auxiliary/util/u_simple_shaders.c
+++ b/src/gallium/auxiliary/util/u_simple_shaders.c
@@ -2,6 +2,7 @@
  *
  * Copyright 2008 Tungsten Graphics, Inc., Cedar Park, Texas.
  * All Rights Reserved.
+ * Copyright 2009 Marek Olšák <maraeo@...>
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the
@@ -30,6 +31,7 @@
  * Simple vertex/fragment shader generators.
  *
  * @author Brian Paul
+           Marek Olšák
  */


@@ -87,6 +89,7 @@ util_make_vertex_passthrough_shader(struct pipe_context *pipe,
  */
 void *
 util_make_fragment_tex_shader_writemask(struct pipe_context *pipe,
+                                        unsigned tex_target,
                                         unsigned writemask )
 {
    struct ureg_program *ureg;
@@ -116,20 +119,63 @@ util_make_fragment_tex_shader_writemask(struct pipe_context *pipe,

    ureg_TEX( ureg,
              ureg_writemask(out, writemask),
-             TGSI_TEXTURE_2D, tex, sampler );
+             tex_target, tex, sampler );
    ureg_END( ureg );

    return ureg_create_shader_and_destroy( ureg, pipe );
 }


 void *
-util_make_fragment_tex_shader(struct pipe_context *pipe )
+util_make_fragment_tex_shader(struct pipe_context *pipe, unsigned tex_target )
 {
    return util_make_fragment_tex_shader_writemask( pipe,
+                                                   tex_target,
                                                    TGSI_WRITEMASK_XYZW );
 }

+/**
+ * Make a simple fragment texture shader which reads an X component from
+ * a texture and writes it as depth.
+ */
+void *
+util_make_fragment_tex_shader_writedepth(struct pipe_context *pipe,
+                                         unsigned tex_target)
+{
+   struct ureg_program *ureg;
+   struct ureg_src sampler;
+   struct ureg_src tex;
+   struct ureg_dst out, depth;
+   struct ureg_src imm;
+   ureg = ureg_create( TGSI_PROCESSOR_FRAGMENT );
+   if (ureg == NULL)
+      return NULL;
+
+   sampler = ureg_DECL_sampler( ureg, 0 );
+
+   tex = ureg_DECL_fs_input( ureg,
+                             TGSI_SEMANTIC_GENERIC, 0,
+                             TGSI_INTERPOLATE_PERSPECTIVE );
+
+   out = ureg_DECL_output( ureg,
+                           TGSI_SEMANTIC_COLOR,
+                           0 );
+
+   depth = ureg_DECL_output( ureg,
+                             TGSI_SEMANTIC_POSITION,
+                             0 );
+
+   imm = ureg_imm4f( ureg, 0, 0, 0, 1 );
+
+   ureg_MOV( ureg, out, imm );
+
+   ureg_TEX( ureg,
+             ureg_writemask(depth, TGSI_WRITEMASK_Z),
+             tex_target, tex, sampler );
+   ureg_END( ureg );
+
+   return ureg_create_shader_and_destroy( ureg, pipe );
+}

 /**
  * Make simple fragment color pass-through shader.
@@ -137,9 +183,18 @@ util_make_fragment_tex_shader(struct pipe_context *pipe )
 void *
 util_make_fragment_passthrough_shader(struct pipe_context *pipe)
 {
+   return util_make_fragment_clonecolor_shader(pipe, 1);
+}
+
+void *
+util_make_fragment_clonecolor_shader(struct pipe_context *pipe, int num_cbufs)
+{
    struct ureg_program *ureg;
    struct ureg_src src;
-   struct ureg_dst dst;
+   struct ureg_dst dst[8];
+   int i;
+
+   assert(num_cbufs <= 8);

    ureg = ureg_create( TGSI_PROCESSOR_FRAGMENT );
    if (ureg == NULL)
@@ -148,12 +203,13 @@ util_make_fragment_passthrough_shader(struct pipe_context *pipe)
    src = ureg_DECL_fs_input( ureg, TGSI_SEMANTIC_COLOR, 0,
                              TGSI_INTERPOLATE_PERSPECTIVE );

-   dst = ureg_DECL_output( ureg, TGSI_SEMANTIC_COLOR, 0 );
+   for (i = 0; i < num_cbufs; i++)
+      dst[i] = ureg_DECL_output( ureg, TGSI_SEMANTIC_COLOR, i );
+
+   for (i = 0; i < num_cbufs; i++)
+      ureg_MOV( ureg, dst[i], src );

-   ureg_MOV( ureg, dst, src );
    ureg_END( ureg );

    return ureg_create_shader_and_destroy( ureg, pipe );
 }
-
-
diff --git a/src/gallium/auxiliary/util/u_simple_shaders.h b/src/gallium/auxiliary/util/u_simple_shaders.h
index d2e80d6..6e76094 100644
--- a/src/gallium/auxiliary/util/u_simple_shaders.h
+++ b/src/gallium/auxiliary/util/u_simple_shaders.h
@@ -51,16 +51,25 @@ util_make_vertex_passthrough_shader(struct pipe_context *pipe,

 extern void *
 util_make_fragment_tex_shader_writemask(struct pipe_context *pipe,
-                                        unsigned writemask );
+                                        unsigned tex_target,
+                                        unsigned writemask);

 extern void *
-util_make_fragment_tex_shader(struct pipe_context *pipe);
+util_make_fragment_tex_shader(struct pipe_context *pipe, unsigned tex_target);
+
+
+extern void *
+util_make_fragment_tex_shader_writedepth(struct pipe_context *pipe,
+                                         unsigned tex_target);


 extern void *
 util_make_fragment_passthrough_shader(struct pipe_context *pipe);


+extern void *
+util_make_fragment_clonecolor_shader(struct pipe_context *pipe, int num_cbufs);
+
 #ifdef __cplusplus
 }
 #endif
--
1.6.3.3

[0002-util-add-a-function-which-converts-2D-coordinates-to.patch]

From dddb77c058d67c0a192b871deb8d837dfabbefce Mon Sep 17 00:00:00 2001
From: =?utf-8?q?Marek=20Ol=C5=A1=C3=A1k?= <maraeo@...>
Date: Sat, 12 Dec 2009 23:38:17 +0100
Subject: [PATCH 2/3] util: add a function which converts 2D coordinates to cubemap coordinates

The code was taken over from u_gen_mipmap.
--- src/gallium/auxiliary/util/Makefile | 1 + src/gallium/auxiliary/util/SConscript | 1 + src/gallium/auxiliary/util/u_gen_mipmap.c | 55 +--------------- src/gallium/auxiliary/util/u_texture.c | 102 +++++++++++++++++++++++++++++ src/gallium/auxiliary/util/u_texture.h | 54 +++++++++++++++ 5 files changed, 161 insertions(+), 52 deletions(-) create mode 100644 src/gallium/auxiliary/util/u_texture.c create mode 100644 src/gallium/auxiliary/util/u_texture.h diff --git a/src/gallium/auxiliary/util/Makefile b/src/gallium/auxiliary/util/Makefile index 1d8bb55..894958f 100644 --- a/src/gallium/auxiliary/util/Makefile +++ b/src/gallium/auxiliary/util/Makefile @@ -30,6 +30,7 @@ C_SOURCES = \ u_stream_stdc.c \ u_stream_wd.c \ u_surface.c \ + u_texture.c \ u_tile.c \ u_time.c \ u_timed_winsys.c \ diff --git a/src/gallium/auxiliary/util/SConscript b/src/gallium/auxiliary/util/SConscript index 8d99106..0c0e048 100644 --- a/src/gallium/auxiliary/util/SConscript +++ b/src/gallium/auxiliary/util/SConscript @@ -48,6 +48,7 @@ util = env.ConvenienceLibrary( 'u_stream_stdc.c', 'u_stream_wd.c', 'u_surface.c', + 'u_texture.c', 'u_tile.c', 'u_time.c', 'u_timed_winsys.c', diff --git a/src/gallium/auxiliary/util/u_gen_mipmap.c b/src/gallium/auxiliary/util/u_gen_mipmap.c index 1728e66..69ff3b9 100644 --- a/src/gallium/auxiliary/util/u_gen_mipmap.c +++ b/src/gallium/auxiliary/util/u_gen_mipmap.c @@ -46,6 +46,7 @@ #include "util/u_gen_mipmap.h" #include "util/u_simple_shaders.h" #include "util/u_math.h" +#include "util/u_texture.h" #include "cso_cache/cso_context.h" @@ -1383,59 +1384,9 @@ set_vertex_data(struct gen_mipmap_state *ctx, static const float st[4][2] = { {0.0f, 0.0f}, {1.0f, 0.0f}, {1.0f, 1.0f}, {0.0f, 1.0f} }; - float rx, ry, rz; - uint i; - - /* loop over quad verts */ - for (i = 0; i < 4; i++) { - /* Compute sc = +/-scale and tc = +/-scale. - * Not +/-1 to avoid cube face selection ambiguity near the edges, - * though that can still sometimes happen with this scale factor... 
- */ - const float scale = 0.9999f; - const float sc = (2.0f * st[i][0] - 1.0f) * scale; - const float tc = (2.0f * st[i][1] - 1.0f) * scale; - - switch (face) { - case PIPE_TEX_FACE_POS_X: - rx = 1.0f; - ry = -tc; - rz = -sc; - break; - case PIPE_TEX_FACE_NEG_X: - rx = -1.0f; - ry = -tc; - rz = sc; - break; - case PIPE_TEX_FACE_POS_Y: - rx = sc; - ry = 1.0f; - rz = tc; - break; - case PIPE_TEX_FACE_NEG_Y: - rx = sc; - ry = -1.0f; - rz = -tc; - break; - case PIPE_TEX_FACE_POS_Z: - rx = sc; - ry = -tc; - rz = 1.0f; - break; - case PIPE_TEX_FACE_NEG_Z: - rx = -sc; - ry = -tc; - rz = -1.0f; - break; - default: - rx = ry = rz = 0.0f; - assert(0); - } - ctx->vertices[i][1][0] = rx; /*s*/ - ctx->vertices[i][1][1] = ry; /*t*/ - ctx->vertices[i][1][2] = rz; /*r*/ - } + util_map_texcoords2d_onto_cubemap(face, &st[0][0], 2, + &ctx->vertices[0][1][0], 8); } else { /* 1D/2D */ diff --git a/src/gallium/auxiliary/util/u_texture.c b/src/gallium/auxiliary/util/u_texture.c new file mode 100644 index 0000000..cd477ab --- /dev/null +++ b/src/gallium/auxiliary/util/u_texture.c @@ -0,0 +1,102 @@ +/************************************************************************** + * + * Copyright 2008 Tungsten Graphics, Inc., Cedar Park, Texas. + * All Rights Reserved. + * Copyright 2008 VMware, Inc. All rights reserved. + * Copyright 2009 Marek Olšák <maraeo@...> + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the + * "Software"), to deal in the Software without restriction, including + * without limitation the rights to use, copy, modify, merge, publish, + * distribute, sub license, and/or sell copies of the Software, and to + * permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice (including the + * next paragraph) shall be included in all copies or substantial portions + * of the Software. 
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. + * IN NO EVENT SHALL TUNGSTEN GRAPHICS AND/OR ITS SUPPLIERS BE LIABLE FOR + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + * + **************************************************************************/ + +/** + * @file + * Texture mapping utility functions. + * + * @author Brian Paul + * Marek Olšák + */ + +#include "pipe/p_defines.h" + +#include "util/u_texture.h" + +void util_map_texcoords2d_onto_cubemap(unsigned face, + const float *in_st, unsigned in_stride, + float *out_str, unsigned out_stride) +{ + int i; + float rx, ry, rz; + + /* loop over quad verts */ + for (i = 0; i < 4; i++) { + /* Compute sc = +/-scale and tc = +/-scale. + * Not +/-1 to avoid cube face selection ambiguity near the edges, + * though that can still sometimes happen with this scale factor... 
+ */ + const float scale = 0.9999f; + const float sc = (2 * in_st[0] - 1) * scale; + const float tc = (2 * in_st[1] - 1) * scale; + + switch (face) { + case PIPE_TEX_FACE_POS_X: + rx = 1; + ry = -tc; + rz = -sc; + break; + case PIPE_TEX_FACE_NEG_X: + rx = -1; + ry = -tc; + rz = sc; + break; + case PIPE_TEX_FACE_POS_Y: + rx = sc; + ry = 1; + rz = tc; + break; + case PIPE_TEX_FACE_NEG_Y: + rx = sc; + ry = -1; + rz = -tc; + break; + case PIPE_TEX_FACE_POS_Z: + rx = sc; + ry = -tc; + rz = 1; + break; + case PIPE_TEX_FACE_NEG_Z: + rx = -sc; + ry = -tc; + rz = -1; + break; + default: + rx = ry = rz = 0; + assert(0); + } + + out_str[0] = rx; /*s*/ + out_str[1] = ry; /*t*/ + out_str[2] = rz; /*r*/ + + in_st += in_stride; + out_str += out_stride; + } +} diff --git a/src/gallium/auxiliary/util/u_texture.h b/src/gallium/auxiliary/util/u_texture.h new file mode 100644 index 0000000..93b2f1e --- /dev/null +++ b/src/gallium/auxiliary/util/u_texture.h @@ -0,0 +1,54 @@ +/************************************************************************** + * + * Copyright 2009 Marek Olšák <maraeo@...> + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the + * "Software"), to deal in the Software without restriction, including + * without limitation the rights to use, copy, modify, merge, publish, + * distribute, sub license, and/or sell copies of the Software, and to + * permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice (including the + * next paragraph) shall be included in all copies or substantial portions + * of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. 
+ * IN NO EVENT SHALL TUNGSTEN GRAPHICS AND/OR ITS SUPPLIERS BE LIABLE FOR + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + * + **************************************************************************/ + +#ifndef U_TEXTURE_H +#define U_TEXTURE_H + +#ifdef __cplusplus +extern "C" { +#endif + +/** + * Convert 2D texture coordinates of 4 vertices into cubemap coordinates + * in the given face. + * Coordinates must be in the range [0,1]. + * + * \param face Cubemap face. + * \param in_st 4 pairs of 2D texture coordinates to convert. + * \param in_stride Stride of in_st in floats. + * \param out_str STR cubemap texture coordinates to compute. + * \param out_stride Stride of out_str in floats. + */ +void util_map_texcoords2d_onto_cubemap(unsigned face, + const float *in_st, unsigned in_stride, + float *out_str, unsigned out_stride); + + +#ifdef __cplusplus +} +#endif + +#endif -- 1.6.3.3 [0003-util-add-blitter.patch] From 0917877d9326d63378548defce0d7233b90f4b60 Mon Sep 17 00:00:00 2001 From: =?utf-8?q?Marek=20Ol=C5=A1=C3=A1k?= <maraeo@...> Date: Thu, 10 Dec 2009 10:25:33 +0100 Subject: [PATCH 3/3] util: add blitter --- src/gallium/auxiliary/util/Makefile | 1 + src/gallium/auxiliary/util/SConscript | 1 + src/gallium/auxiliary/util/u_blitter.c | 605 ++++++++++++++++++++++++++++++++ src/gallium/auxiliary/util/u_blitter.h | 242 +++++++++++++ 4 files changed, 849 insertions(+), 0 deletions(-) create mode 100644 src/gallium/auxiliary/util/u_blitter.c create mode 100644 src/gallium/auxiliary/util/u_blitter.h diff --git a/src/gallium/auxiliary/util/Makefile b/src/gallium/auxiliary/util/Makefile index 894958f..f81fc46 100644 --- a/src/gallium/auxiliary/util/Makefile +++ b/src/gallium/auxiliary/util/Makefile @@ -9,6 +9,7 @@ C_SOURCES = \ u_debug_symbol.c \ u_debug_stack.c \ u_blit.c \ + u_blitter.c \ u_cache.c \ 
u_cpu_detect.c \ u_draw_quad.c \ diff --git a/src/gallium/auxiliary/util/SConscript b/src/gallium/auxiliary/util/SConscript index 0c0e048..024a370 100644 --- a/src/gallium/auxiliary/util/SConscript +++ b/src/gallium/auxiliary/util/SConscript @@ -23,6 +23,7 @@ util = env.ConvenienceLibrary( source = [ 'u_bitmask.c', 'u_blit.c', + 'u_blitter.c', 'u_cache.c', 'u_cpu_detect.c', 'u_debug.c', diff --git a/src/gallium/auxiliary/util/u_blitter.c b/src/gallium/auxiliary/util/u_blitter.c new file mode 100644 index 0000000..e51a5df --- /dev/null +++ b/src/gallium/auxiliary/util/u_blitter.c @@ -0,0 +1,605 @@ +/************************************************************************** + * + * Copyright 2009 Marek Olšák <maraeo@...> + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the + * "Software"), to deal in the Software without restriction, including + * without limitation the rights to use, copy, modify, merge, publish, + * distribute, sub license, and/or sell copies of the Software, and to + * permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice (including the + * next paragraph) shall be included in all copies or substantial portions + * of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. + * IN NO EVENT SHALL TUNGSTEN GRAPHICS AND/OR ITS SUPPLIERS BE LIABLE FOR + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
+ * + **************************************************************************/ + +/** + * @file + * Blitter utility to facilitate acceleration of the clear, surface_copy, + * and surface_fill functions. + * + * @author Marek Olšák + */ + +#include "pipe/p_context.h" +#include "pipe/p_defines.h" +#include "pipe/p_inlines.h" +#include "pipe/p_shader_tokens.h" +#include "pipe/p_state.h" + +#include "util/u_memory.h" +#include "util/u_math.h" +#include "util/u_blitter.h" +#include "util/u_draw_quad.h" +#include "util/u_pack_color.h" +#include "util/u_rect.h" +#include "util/u_simple_shaders.h" +#include "util/u_texture.h" + +struct blitter_context_priv +{ + struct blitter_context blitter; + + struct pipe_context *pipe; /**< pipe context */ + struct pipe_buffer *vbuf; /**< quad */ + + float vertices[4][2][4]; /**< {pos, color} or {pos, texcoord} */ + + /* Constant state objects. */ + /* Vertex shaders. */ + void *vs_col; /**< Vertex shader which passes {pos, color} to the output */ + void *vs_tex; /**<Vertex shader which passes {pos, texcoord} to the output.*/ + + /* Fragment shaders. */ + void *fs_col[8]; /**< FS which outputs colors to 1-8 color buffers */ + void *fs_texfetch_col[4]; /**< FS which outputs a color from a texture */ + void *fs_texfetch_depth[4]; /**< FS which outputs a depth from a texture, + where the index is PIPE_TEXTURE_* to be sampled */ + + /* Blend state. */ + void *blend_write_color; /**< blend state with writemask of RGBA */ + void *blend_keep_color; /**< blend state with writemask of 0 */ + + /* Depth stencil alpha state. */ + void *dsa_write_depth_stencil[0xff]; /**< indices are stencil clear values */ + void *dsa_write_depth_keep_stencil; + void *dsa_keep_depth_stencil; + + /* Other state. 
*/ + void *sampler_state[16]; /**< sampler state for clamping to a miplevel */ + void *rs_state; /**< rasterizer state */ +}; + +struct blitter_context *util_blitter_create(struct pipe_context *pipe) +{ + struct blitter_context_priv *ctx; + struct pipe_blend_state blend; + struct pipe_depth_stencil_alpha_state dsa; + struct pipe_rasterizer_state rs_state; + struct pipe_sampler_state sampler_state; + unsigned i, max_render_targets; + + ctx = CALLOC_STRUCT(blitter_context_priv); + if (!ctx) + return NULL; + + ctx->pipe = pipe; + + /* init state objects for them to be considered invalid */ + ctx->blitter.saved_fb_state.nr_cbufs = ~0; + ctx->blitter.saved_num_textures = ~0; + ctx->blitter.saved_num_sampler_states = ~0; + + /* blend state objects */ + memset(&blend, 0, sizeof(blend)); + ctx->blend_keep_color = pipe->create_blend_state(pipe, &blend); + + blend.colormask = PIPE_MASK_RGBA; + ctx->blend_write_color = pipe->create_blend_state(pipe, &blend); + + /* depth stencil alpha state objects */ + memset(&dsa, 0, sizeof(dsa)); + ctx->dsa_keep_depth_stencil = + pipe->create_depth_stencil_alpha_state(pipe, &dsa); + + dsa.depth.enabled = 1; + dsa.depth.writemask = 1; + dsa.depth.func = PIPE_FUNC_ALWAYS; + ctx->dsa_write_depth_keep_stencil = + pipe->create_depth_stencil_alpha_state(pipe, &dsa); + + dsa.stencil[0].enabled = 1; + dsa.stencil[0].func = PIPE_FUNC_ALWAYS; + dsa.stencil[0].fail_op = PIPE_STENCIL_OP_REPLACE; + dsa.stencil[0].zpass_op = PIPE_STENCIL_OP_REPLACE; + dsa.stencil[0].zfail_op = PIPE_STENCIL_OP_REPLACE; + dsa.stencil[0].valuemask = 0xff; + dsa.stencil[0].writemask = 0xff; + + /* create a depth stencil alpha state for each possible stencil clear + * value */ + for (i = 0; i < 0xff; i++) { + dsa.stencil[0].ref_value = i; + + ctx->dsa_write_depth_stencil[i] = + pipe->create_depth_stencil_alpha_state(pipe, &dsa); + } + + /* sampler state */ + memset(&sampler_state, 0, sizeof(sampler_state)); + sampler_state.wrap_s = PIPE_TEX_WRAP_CLAMP_TO_EDGE; + 
sampler_state.wrap_t = PIPE_TEX_WRAP_CLAMP_TO_EDGE; + sampler_state.wrap_r = PIPE_TEX_WRAP_CLAMP_TO_EDGE; + + for (i = 0; i < 16; i++) { + sampler_state.lod_bias = i; + sampler_state.min_lod = i; + sampler_state.max_lod = i; + + ctx->sampler_state[i] = pipe->create_sampler_state(pipe, &sampler_state); + } + + /* rasterizer state */ + memset(&rs_state, 0, sizeof(rs_state)); + rs_state.front_winding = PIPE_WINDING_CW; + rs_state.cull_mode = PIPE_WINDING_NONE; + rs_state.bypass_vs_clip_and_viewport = 1; + rs_state.gl_rasterization_rules = 1; + ctx->rs_state = pipe->create_rasterizer_state(pipe, &rs_state); + + /* vertex shaders */ + { + const uint semantic_names[] = { TGSI_SEMANTIC_POSITION, + TGSI_SEMANTIC_COLOR }; + const uint semantic_indices[] = { 0, 0 }; + ctx->vs_col = + util_make_vertex_passthrough_shader(pipe, 2, semantic_names, + semantic_indices); + } + { + const uint semantic_names[] = { TGSI_SEMANTIC_POSITION, + TGSI_SEMANTIC_GENERIC }; + const uint semantic_indices[] = { 0, 0 }; + ctx->vs_tex = + util_make_vertex_passthrough_shader(pipe, 2, semantic_names, + semantic_indices); + } + + /* fragment shaders */ + ctx->fs_texfetch_col[PIPE_TEXTURE_1D] = + util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_1D); + ctx->fs_texfetch_col[PIPE_TEXTURE_2D] = + util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_2D); + ctx->fs_texfetch_col[PIPE_TEXTURE_3D] = + util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_3D); + ctx->fs_texfetch_col[PIPE_TEXTURE_CUBE] = + util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_CUBE); + + ctx->fs_texfetch_depth[PIPE_TEXTURE_1D] = + util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_1D); + ctx->fs_texfetch_depth[PIPE_TEXTURE_2D] = + util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_2D); + ctx->fs_texfetch_depth[PIPE_TEXTURE_3D] = + util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_3D); + ctx->fs_texfetch_depth[PIPE_TEXTURE_CUBE] = + util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_CUBE); + + 
max_render_targets = pipe->screen->get_param(pipe->screen, + PIPE_CAP_MAX_RENDER_TARGETS); + assert(max_render_targets <= 8); + for (i = 0; i < max_render_targets; i++) + ctx->fs_col[i] = util_make_fragment_clonecolor_shader(pipe, 1+i); + + /* set invariant vertex coordinates */ + for (i = 0; i < 4; i++) + ctx->vertices[i][0][3] = 1; /*v.w*/ + + /* create the vertex buffer */ + ctx->vbuf = pipe_buffer_create(ctx->pipe->screen, + 32, + PIPE_BUFFER_USAGE_VERTEX, + sizeof(ctx->vertices)); + + return &ctx->blitter; +} + +void util_blitter_destroy(struct blitter_context *blitter) +{ + struct blitter_context_priv *ctx = (struct blitter_context_priv*)blitter; + struct pipe_context *pipe = ctx->pipe; + int i; + + pipe->delete_blend_state(pipe, ctx->blend_write_color); + pipe->delete_blend_state(pipe, ctx->blend_keep_color); + pipe->delete_depth_stencil_alpha_state(pipe, ctx->dsa_keep_depth_stencil); + pipe->delete_depth_stencil_alpha_state(pipe, + ctx->dsa_write_depth_keep_stencil); + + for (i = 0; i < 0xff; i++) + pipe->delete_depth_stencil_alpha_state(pipe, + ctx->dsa_write_depth_stencil[i]); + + pipe->delete_rasterizer_state(pipe, ctx->rs_state); + pipe->delete_vs_state(pipe, ctx->vs_col); + pipe->delete_vs_state(pipe, ctx->vs_tex); + + for (i = 0; i < 4; i++) { + pipe->delete_fs_state(pipe, ctx->fs_texfetch_col[i]); + pipe->delete_fs_state(pipe, ctx->fs_texfetch_depth[i]); + } + for (i = 0; i < 8 && ctx->fs_col[i]; i++) + pipe->delete_fs_state(pipe, ctx->fs_col[i]); + + pipe_buffer_reference(&ctx->vbuf, NULL); + FREE(ctx); +} + +static void blitter_check_saved_CSOs(struct blitter_context_priv *ctx) +{ + /* make sure these CSOs have been saved */ + assert(ctx->blitter.saved_blend_state && + ctx->blitter.saved_dsa_state && + ctx->blitter.saved_rs_state && + ctx->blitter.saved_fs && + ctx->blitter.saved_vs); +} + +static void blitter_restore_CSOs(struct blitter_context_priv *ctx) +{ + struct pipe_context *pipe = ctx->pipe; + + /* restore the state objects which are always 
required to be saved */ + pipe->bind_blend_state(pipe, ctx->blitter.saved_blend_state); + pipe->bind_depth_stencil_alpha_state(pipe, ctx->blitter.saved_dsa_state); + pipe->bind_rasterizer_state(pipe, ctx->blitter.saved_rs_state); + pipe->bind_fs_state(pipe, ctx->blitter.saved_fs); + pipe->bind_vs_state(pipe, ctx->blitter.saved_vs); + + ctx->blitter.saved_blend_state = 0; + ctx->blitter.saved_dsa_state = 0; + ctx->blitter.saved_rs_state = 0; + ctx->blitter.saved_fs = 0; + ctx->blitter.saved_vs = 0; + + /* restore the state objects which are required to be saved before copy/fill + */ + if (ctx->blitter.saved_fb_state.nr_cbufs != ~0) { + pipe->set_framebuffer_state(pipe, &ctx->blitter.saved_fb_state); + ctx->blitter.saved_fb_state.nr_cbufs = ~0; + } + + if (ctx->blitter.saved_num_sampler_states != ~0) { + pipe->bind_fragment_sampler_states(pipe, + ctx->blitter.saved_num_sampler_states, + ctx->blitter.saved_sampler_states); + ctx->blitter.saved_num_sampler_states = ~0; + } + + if (ctx->blitter.saved_num_textures != ~0) { + pipe->set_fragment_sampler_textures(pipe, + ctx->blitter.saved_num_textures, + ctx->blitter.saved_textures); + ctx->blitter.saved_num_textures = ~0; + } +} + +static void blitter_set_rectangle(struct blitter_context_priv *ctx, + unsigned x1, unsigned y1, + unsigned x2, unsigned y2, + float depth) +{ + int i; + + /* set vertex positions */ + ctx->vertices[0][0][0] = x1; /*v0.x*/ + ctx->vertices[0][0][1] = y1; /*v0.y*/ + + ctx->vertices[1][0][0] = x2; /*v1.x*/ + ctx->vertices[1][0][1] = y1; /*v1.y*/ + + ctx->vertices[2][0][0] = x2; /*v2.x*/ + ctx->vertices[2][0][1] = y2; /*v2.y*/ + + ctx->vertices[3][0][0] = x1; /*v3.x*/ + ctx->vertices[3][0][1] = y2; /*v3.y*/ + + for (i = 0; i < 4; i++) + ctx->vertices[i][0][2] = depth; /*z*/ +} + +static void blitter_set_clear_color(struct blitter_context_priv *ctx, + const float *rgba) +{ + int i; + + for (i = 0; i < 4; i++) { + ctx->vertices[i][1][0] = rgba[0]; + ctx->vertices[i][1][1] = rgba[1]; + 
ctx->vertices[i][1][2] = rgba[2]; + ctx->vertices[i][1][3] = rgba[3]; + } +} + +static void blitter_set_texcoords_2d(struct blitter_context_priv *ctx, + struct pipe_surface *surf, + unsigned x1, unsigned y1, + unsigned x2, unsigned y2) +{ + int i; + float s1 = x1 / (float)surf->width; + float t1 = y1 / (float)surf->height; + float s2 = x2 / (float)surf->width; + float t2 = y2 / (float)surf->height; + + ctx->vertices[0][1][0] = s1; /*t0.s*/ + ctx->vertices[0][1][1] = t1; /*t0.t*/ + + ctx->vertices[1][1][0] = s2; /*t1.s*/ + ctx->vertices[1][1][1] = t1; /*t1.t*/ + + ctx->vertices[2][1][0] = s2; /*t2.s*/ + ctx->vertices[2][1][1] = t2; /*t2.t*/ + + ctx->vertices[3][1][0] = s1; /*t3.s*/ + ctx->vertices[3][1][1] = t2; /*t3.t*/ + + for (i = 0; i < 4; i++) { + ctx->vertices[i][1][2] = 0; /*r*/ + ctx->vertices[i][1][3] = 1; /*q*/ + } +} + +static void blitter_set_texcoords_3d(struct blitter_context_priv *ctx, + struct pipe_surface *surf, + unsigned x1, unsigned y1, + unsigned x2, unsigned y2) +{ + int i; + float depth = u_minify(surf->texture->depth0, surf->level); + float r = surf->zslice / depth; + + blitter_set_texcoords_2d(ctx, surf, x1, y1, x2, y2); + + for (i = 0; i < 4; i++) + ctx->vertices[i][1][2] = r; /*r*/ +} + +static void blitter_set_texcoords_cube(struct blitter_context_priv *ctx, + struct pipe_surface *surf, + unsigned x1, unsigned y1, + unsigned x2, unsigned y2) +{ + int i; + float s1 = x1 / (float)surf->width; + float t1 = y1 / (float)surf->height; + float s2 = x2 / (float)surf->width; + float t2 = y2 / (float)surf->height; + const float st[4][2] = { + {s1, t1}, {s2, t1}, {s2, t2}, {s1, t2} + }; + + util_map_texcoords2d_onto_cubemap(surf->face, + /* pointer, stride in floats */ + &st[0][0], 2, + &ctx->vertices[0][1][0], 8); + + for (i = 0; i < 4; i++) + ctx->vertices[i][1][3] = 1; /*q*/ +} + +static void blitter_draw_quad(struct blitter_context_priv *ctx) +{ + struct blitter_context *blitter = &ctx->blitter; + struct pipe_context *pipe = ctx->pipe; + + if 
(blitter->draw_quad) { + blitter->draw_quad(pipe, &ctx->vertices[0][0][0]); + } else { + /* write vertices and draw them */ + pipe_buffer_write(pipe->screen, ctx->vbuf, + 0, sizeof(ctx->vertices), ctx->vertices); + + util_draw_vertex_buffer(ctx->pipe, ctx->vbuf, 0, PIPE_PRIM_TRIANGLE_FAN, + 4, /* verts */ + 2); /* attribs/vert */ + } +} + +void util_blitter_clear(struct blitter_context *blitter, + unsigned width, unsigned height, + unsigned num_cbufs, + unsigned clear_buffers, + const float *rgba, + double depth, unsigned stencil) +{ + struct blitter_context_priv *ctx = (struct blitter_context_priv*)blitter; + struct pipe_context *pipe = ctx->pipe; + + assert(num_cbufs <= 8); + + blitter_check_saved_CSOs(ctx); + + /* bind CSOs */ + if (clear_buffers & PIPE_CLEAR_COLOR) + pipe->bind_blend_state(pipe, ctx->blend_write_color); + else + pipe->bind_blend_state(pipe, ctx->blend_keep_color); + + if (clear_buffers & PIPE_CLEAR_DEPTHSTENCIL) + pipe->bind_depth_stencil_alpha_state(pipe, + ctx->dsa_write_depth_stencil[stencil&0xff]); + else + pipe->bind_depth_stencil_alpha_state(pipe, ctx->dsa_keep_depth_stencil); + + pipe->bind_rasterizer_state(pipe, ctx->rs_state); + pipe->bind_fs_state(pipe, ctx->fs_col[num_cbufs ? 
num_cbufs-1 : 0]); + pipe->bind_vs_state(pipe, ctx->vs_col); + + blitter_set_clear_color(ctx, rgba); + blitter_set_rectangle(ctx, 0, 0, width, height, depth); + blitter_draw_quad(ctx); + blitter_restore_CSOs(ctx); +} + +void util_blitter_copy(struct blitter_context *blitter, + struct pipe_surface *dst, + unsigned dstx, unsigned dsty, + struct pipe_surface *src, + unsigned srcx, unsigned srcy, + unsigned width, unsigned height, + boolean ignore_stencil) +{ + struct blitter_context_priv *ctx = (struct blitter_context_priv*)blitter; + struct pipe_context *pipe = ctx->pipe; + struct pipe_screen *screen = pipe->screen; + struct pipe_framebuffer_state fb_state; + boolean is_stencil, is_depth; + unsigned dst_tex_usage; + + /* give up if textures are not set */ + assert(dst->texture && src->texture); + if (!dst->texture || !src->texture) + return; + + is_depth = pf_get_component_bits(src->format, PIPE_FORMAT_COMP_Z) != 0; + is_stencil = pf_get_component_bits(src->format, PIPE_FORMAT_COMP_S) != 0; + dst_tex_usage = is_depth || is_stencil ? 
PIPE_TEXTURE_USAGE_DEPTH_STENCIL : + PIPE_TEXTURE_USAGE_RENDER_TARGET; + + /* check if we can sample from and render to the surfaces */ + /* (assuming copying a stencil buffer is not possible) */ + if ((!ignore_stencil && is_stencil) || + !screen->is_format_supported(screen, dst->format, dst->texture->target, + dst_tex_usage, 0) || + !screen->is_format_supported(screen, src->format, src->texture->target, + PIPE_TEXTURE_USAGE_SAMPLER, 0)) { + util_surface_copy(pipe, FALSE, dst, dstx, dsty, src, srcx, srcy, + width, height); + return; + } + + /* check whether the states are properly saved */ + blitter_check_saved_CSOs(ctx); + assert(blitter->saved_fb_state.nr_cbufs != ~0); + assert(blitter->saved_num_textures != ~0); + assert(blitter->saved_num_sampler_states != ~0); + assert(src->texture->target < 4); + + /* bind CSOs */ + fb_state.width = dst->width; + fb_state.height = dst->height; + + if (is_depth) { + pipe->bind_blend_state(pipe, ctx->blend_keep_color); + pipe->bind_depth_stencil_alpha_state(pipe, + ctx->dsa_write_depth_keep_stencil); + pipe->bind_fs_state(pipe, ctx->fs_texfetch_depth[src->texture->target]); + + fb_state.nr_cbufs = 0; + fb_state.zsbuf = dst; + } else { + pipe->bind_blend_state(pipe, ctx->blend_write_color); + pipe->bind_depth_stencil_alpha_state(pipe, ctx->dsa_keep_depth_stencil); + pipe->bind_fs_state(pipe, ctx->fs_texfetch_col[src->texture->target]); + + fb_state.nr_cbufs = 1; + fb_state.cbufs[0] = dst; + fb_state.zsbuf = 0; + } + pipe->bind_rasterizer_state(pipe, ctx->rs_state); + pipe->bind_vs_state(pipe, ctx->vs_tex); + pipe->bind_fragment_sampler_states(pipe, 1, &ctx->sampler_state[src->level]); + pipe->set_fragment_sampler_textures(pipe, 1, &src->texture); + pipe->set_framebuffer_state(pipe, &fb_state); + + /* set texture coordinates */ + switch (src->texture->target) { + case PIPE_TEXTURE_1D: + case PIPE_TEXTURE_2D: + blitter_set_texcoords_2d(ctx, src, srcx, srcy, + srcx+width, srcy+height); + break; + case PIPE_TEXTURE_3D: + 
blitter_set_texcoords_3d(ctx, src, srcx, srcy, + srcx+width, srcy+height); + break; + case PIPE_TEXTURE_CUBE: + blitter_set_texcoords_cube(ctx, src, srcx, srcy, + srcx+width, srcy+height); + break; + } + + blitter_set_rectangle(ctx, dstx, dsty, dstx+width, dsty+height, 0); + blitter_draw_quad(ctx); + blitter_restore_CSOs(ctx); +} + +void util_blitter_fill(struct blitter_context *blitter, + struct pipe_surface *dst, + unsigned dstx, unsigned dsty, + unsigned width, unsigned height, + unsigned value) +{ + struct blitter_context_priv *ctx = (struct blitter_context_priv*)blitter; + struct pipe_context *pipe = ctx->pipe; + struct pipe_screen *screen = pipe->screen; + struct pipe_framebuffer_state fb_state; + float rgba[4]; + ubyte ub_rgba[4] = {0}; + union util_color color; + int i; + + assert(dst->texture); + if (!dst->texture) + return; + + /* check if we can render to the surface */ + if (pf_is_depth_or_stencil(dst->format) || /* unlikely, but you never know */ + !screen->is_format_supported(screen, dst->format, dst->texture->target, + PIPE_TEXTURE_USAGE_RENDER_TARGET, 0)) { + util_surface_fill(pipe, dst, dstx, dsty, width, height, value); + return; + } + + /* unpack the color */ + color.ui = value; + util_unpack_color_ub(dst->format, &color, + ub_rgba, ub_rgba+1, ub_rgba+2, ub_rgba+3); + for (i = 0; i < 4; i++) + rgba[i] = ubyte_to_float(ub_rgba[i]); + + /* check the saved state */ + blitter_check_saved_CSOs(ctx); + assert(blitter->saved_fb_state.nr_cbufs != ~0); + + /* bind CSOs */ + pipe->bind_blend_state(pipe, ctx->blend_write_color); + pipe->bind_depth_stencil_alpha_state(pipe, ctx->dsa_keep_depth_stencil); + pipe->bind_rasterizer_state(pipe, ctx->rs_state); + pipe->bind_fs_state(pipe, ctx->fs_col[0]); + pipe->bind_vs_state(pipe, ctx->vs_col); + + /* set a framebuffer state */ + fb_state.width = dst->width; + fb_state.height = dst->height; + fb_state.nr_cbufs = 1; + fb_state.cbufs[0] = dst; + fb_state.zsbuf = 0; + pipe->set_framebuffer_state(pipe, &fb_state); + 
+ blitter_set_clear_color(ctx, rgba); + blitter_set_rectangle(ctx, 0, 0, width, height, 0); + blitter_draw_quad(ctx); + blitter_restore_CSOs(ctx); +} diff --git a/src/gallium/auxiliary/util/u_blitter.h b/src/gallium/auxiliary/util/u_blitter.h new file mode 100644 index 0000000..d03915c --- /dev/null +++ b/src/gallium/auxiliary/util/u_blitter.h @@ -0,0 +1,242 @@ +/************************************************************************** + * + * Copyright 2009 Marek Olšák <maraeo@...> + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the + * "Software"), to deal in the Software without restriction, including + * without limitation the rights to use, copy, modify, merge, publish, + * distribute, sub license, and/or sell copies of the Software, and to + * permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice (including the + * next paragraph) shall be included in all copies or substantial portions + * of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. + * IN NO EVENT SHALL TUNGSTEN GRAPHICS AND/OR ITS SUPPLIERS BE LIABLE FOR + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + * + **************************************************************************/ + +#ifndef U_BLITTER_H +#define U_BLITTER_H + +#include "pipe/p_state.h" + + +#ifdef __cplusplus +extern "C" { +#endif + +struct pipe_context; + +struct blitter_context +{ + /** + * Draw a quad. + * + * The pipe driver can set this to provide a more efficient way of drawing + * a quad. 
If it's NULL, the quad is drawn using a vertex buffer. + * + * There are always 4 vertices with interleaved vertex elements of type + * RGBA32F. See the vertex shader _output_ semantics to know what those are. + * The primitive type is always PIPE_PRIM_TRIANGLE_FAN and VS/clip/viewport + * is bypasssed. + */ + void (*draw_quad)(struct pipe_context *pipe, + const float *vertices); + + /* Private members, really. */ + void *saved_blend_state; /**< blend state */ + void *saved_dsa_state; /**< depth stencil alpha state */ + void *saved_rs_state; /**< rasterizer state */ + void *saved_fs, *saved_vs; /**< fragment shader, vertex shader */ + + struct pipe_framebuffer_state saved_fb_state; /**< framebuffer state */ + + int saved_num_sampler_states; + void *saved_sampler_states[32]; + + int saved_num_textures; + struct pipe_texture *saved_textures[32]; /* is 32 enough? */ +}; + +/** + * Create a blitter context. + */ +struct blitter_context *util_blitter_create(struct pipe_context *pipe); + +/** + * Destroy a blitter context. + */ +void util_blitter_destroy(struct blitter_context *blitter); + +/* + * These CSOs must be saved before any of the following functions is called: + * - blend state + * - depth stencil alpha state + * - rasterizer state + * - vertex shader + * - fragment shader + */ + +/** + * Clear a specified set of currently bound buffers to specified values. + */ +void util_blitter_clear(struct blitter_context *blitter, + unsigned width, unsigned height, + unsigned num_cbufs, + unsigned clear_buffers, + const float *rgba, + double depth, unsigned stencil); + +/** + * Copy a block of pixels from one surface to another. + * + * You can copy from any color format to any other color format provided + * the former can be sampled and the latter can be rendered to. Otherwise, + * a software fallback path is taken and both surfaces must be of the same + * format. 
+ * + * The same holds for depth-stencil formats with the exception that stencil + * cannot be copied unless you set ignore_stencil to FALSE. In that case, + * a software fallback path is taken and both surfaces must be of the same + * format. + * + * Use pipe_screen->is_format_supported to know your options. + * + * These states must be saved in the blitter in addition to the state objects + * already required to be saved: + * - framebuffer state + * - fragment sampler states + * - fragment sampler textures + */ +void util_blitter_copy(struct blitter_context *blitter, + struct pipe_surface *dst, + unsigned dstx, unsigned dsty, + struct pipe_surface *src, + unsigned srcx, unsigned srcy, + unsigned width, unsigned height, + boolean ignore_stencil); + +/** + * Fill a region of a surface with a constant value. + * + * If the surface cannot be rendered to or it's a depth-stencil format, + * a software fallback path is taken. + * + * These states must be saved in the blitter in addition to the state objects + * already required to be saved: + * - framebuffer state + */ +void util_blitter_fill(struct blitter_context *blitter, + struct pipe_surface *dst, + unsigned dstx, unsigned dsty, + unsigned width, unsigned height, + unsigned value); + +/** + * Copy all pixels from one surface to another. + * + * The rules are the same as in util_blitter_copy with the addition that + * surfaces must have the same size. + */ +static INLINE +void util_blitter_copy_surface(struct blitter_context *blitter, + struct pipe_surface *dst, + struct pipe_surface *src, + boolean ignore_stencil) +{ + assert(dst->width == src->width && dst->height == src->height); + + util_blitter_copy(blitter, dst, 0, 0, src, 0, 0, src->width, src->height, + ignore_stencil); +} + + +/* The functions below should be used to save currently bound constant state + * objects inside a driver. 
The objects are automatically restored at the end + * of the util_blitter_{clear, fill, copy, copy_surface} functions and then + * forgotten. + * + * CSOs not listed here are not affected by util_blitter. */ + +static INLINE +void util_blitter_save_blend(struct blitter_context *blitter, + void *state) +{ + blitter->saved_blend_state = state; +} + +static INLINE +void util_blitter_save_depth_stencil_alpha(struct blitter_context *blitter, + void *state) +{ + blitter->saved_dsa_state = state; +} + +static INLINE +void util_blitter_save_rasterizer(struct blitter_context *blitter, + void *state) +{ + blitter->saved_rs_state = state; +} + +static INLINE +void util_blitter_save_fragment_shader(struct blitter_context *blitter, + void *fs) +{ + blitter->saved_fs = fs; +} + +static INLINE +void util_blitter_save_vertex_shader(struct blitter_context *blitter, + void *vs) +{ + blitter->saved_vs = vs; +} + +static INLINE +void util_blitter_save_framebuffer(struct blitter_context *blitter, + struct pipe_framebuffer_state *state) +{ + blitter->saved_fb_state = *state; +} + +static INLINE +void util_blitter_save_fragment_sampler_states( + struct blitter_context *blitter, + int num_sampler_states, + void **sampler_states) +{ + assert(num_textures <= 32); + + blitter->saved_num_sampler_states = num_sampler_states; + memcpy(blitter->saved_sampler_states, sampler_states, + num_sampler_states * sizeof(void *)); +} + +static INLINE +void util_blitter_save_fragment_sampler_textures( + struct blitter_context *blitter, + int num_textures, + struct pipe_texture **textures) +{ + assert(num_textures <= 32); + + blitter->saved_num_textures = num_textures; + memcpy(blitter->saved_textures, textures, + num_textures * sizeof(struct pipe_texture *)); +} + +#ifdef __cplusplus +} +#endif + +#endif -- 1.6.3.3 ------------------------------------------------------------------------------ Return on Information: Google Enterprise Search pays you back Get the facts. 
http://p.sf.net/sfu/google-dev2dev _______________________________________________ Mesa3d-dev mailing list Mesa3d-dev@... https://lists.sourceforge.net/lists/listinfo/mesa3d-dev |
|
|
Re: gallium: add blitter
On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
> Hi Keith,
>
> I've finished the blitter module. It fully implements the clear,
> surface_copy, and surface_fill functions. It properly falls back to
> software in case a surface cannot be sampled or rendered to according
> to usage. Copying a stencil buffer always falls back unless the
> ignore_stencil parameter (see util_blitter_copy) is set to TRUE. To my
> knowledge, GPUs cannot copy the stencil buffer (not sure if fiddling
> with texture formats can help). It's all documented in u_blitter.h.
>
> The pipe driver can optionally hook up a function to draw a quad
> (blitter_context::draw_quad). I realized that embedding 4 vertices
> into a command stream (AKA immediate mode) is much faster than writing
> them to a vertex buffer due to reduced driver overhead. It might be
> worth considering adding the draw_quad function to pipe_context.
>
> When working on the blitter, I added the following things to
> util/u_simple_shaders:
> - util_make_fragment_tex_shader has a new parameter tex_target and the
>   value should be one of TGSI_TEXTURE_* enums so that it can be used to
>   sample from any kind of texture.
> - Added util_make_fragment_tex_shader_writedepth, which writes depth
>   sampled from a texture. It's used for copying depth textures.
> - Added util_make_fragment_clonecolor_shader, which copies input
>   COLOR[0] to a specified number of render targets. It's used to clear
>   MRTs.
>
> Also, I moved the code for converting 2D texture coordinates into
> cubemap texture coordinates from u_gen_mipmap to a new function in
> util/u_texture.
>
> Please review/push.
>
> Once it gets approved, I will send patches with r300g blit support to
> Corbin. With this work, untiling a texture will be as easy as calling
> surface_copy whereas the driver state remains intact (theoretically).

Marek,

This all looks great. Many thanks for adding this functionality - I'm sure we'll be building on it in many ways going forward.

I'll push the patches intact, but one thing we need to start thinking about is the mix of code in the util/ directory -- there's some stuff in there that's only legal/useful for state-trackers, some that's likewise only legal for drivers, and a lot that is valid everywhere. At some stage we want to split that up.

Keith
|
|
Re: gallium: add blitter
On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
>
> +static INLINE
> +void util_blitter_save_fragment_sampler_states(
> +   struct blitter_context *blitter,
> +   int num_sampler_states,
> +   void **sampler_states)
> +{
> +   assert(num_textures <= 32);
> +
> +   blitter->saved_num_sampler_states = num_sampler_states;
> +   memcpy(blitter->saved_sampler_states, sampler_states,
> +          num_sampler_states * sizeof(void *));
> +}
> +

Have you tried compiling with debug enabled? The assert above fails to compile. Also, can you use Elements() or similar instead of the hard-coded 32?

Maybe we can figure out how to go back to having asserts keep exposing their contents to the compiler even on non-debug builds. This used to work without problem on linux and helped a lot to avoid these types of problems.

Keith
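For illustration, here is a minimal sketch of the kind of fix being requested: the assert names the parameter that is actually in scope (the quoted code checks `num_textures`, which this function does not declare) and takes its bound from the array itself. `Elements()` is defined locally so the snippet stands alone, and the struct is a cut-down stand-in for the one in u_blitter.h, so treat both as assumptions rather than the final patch.

```c
#include <assert.h>
#include <string.h>

/* Local stand-in for an Elements()-style macro: number of array elements. */
#define Elements(x) (sizeof(x) / sizeof((x)[0]))

/* Cut-down stand-in for struct blitter_context (only the fields used here). */
struct blitter_context {
   int saved_num_sampler_states;
   void *saved_sampler_states[32];
};

static void
util_blitter_save_fragment_sampler_states(struct blitter_context *blitter,
                                          int num_sampler_states,
                                          void **sampler_states)
{
   /* Check the parameter that exists here, against the real array size. */
   assert(num_sampler_states >= 0);
   assert((size_t)num_sampler_states <=
          Elements(blitter->saved_sampler_states));

   blitter->saved_num_sampler_states = num_sampler_states;
   memcpy(blitter->saved_sampler_states, sampler_states,
          num_sampler_states * sizeof(void *));
}
```

With the bound derived from the array, changing the array size later cannot silently leave a stale constant in the assert.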
|
|
Re: gallium: add blitter
On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
> --- /dev/null
> +++ b/src/gallium/auxiliary/util/u_blitter.c
> @@ -0,0 +1,605 @@
> +/**************************************************************************
> + *
> + * Copyright 2009 Marek Olšák <maraeo@...>
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> + * "Software"), to deal in the Software without restriction, including
> + * without limitation the rights to use, copy, modify, merge, publish,
> + * distribute, sub license, and/or sell copies of the Software, and to
> + * permit persons to whom the Software is furnished to do so, subject to
> + * the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> + * next paragraph) shall be included in all copies or substantial portions
> + * of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
> + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
> + * IN NO EVENT SHALL TUNGSTEN GRAPHICS AND/OR ITS SUPPLIERS BE LIABLE FOR
> + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
> + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
> + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
> + *
> + **************************************************************************/
> +
> +/**
> + * @file
> + * Blitter utility to facilitate acceleration of the clear, surface_copy,
> + * and surface_fill functions.
> + *
> + * @author Marek Olšák
> + */
> +
> +#include "pipe/p_context.h"
> +#include "pipe/p_defines.h"
> +#include "pipe/p_inlines.h"
> +#include "pipe/p_shader_tokens.h"
> +#include "pipe/p_state.h"
> +
> +#include "util/u_memory.h"
> +#include "util/u_math.h"
> +#include "util/u_blitter.h"
> +#include "util/u_draw_quad.h"
> +#include "util/u_pack_color.h"
> +#include "util/u_rect.h"
> +#include "util/u_simple_shaders.h"
> +#include "util/u_texture.h"
> +
> +struct blitter_context_priv
> +{
> +   struct blitter_context blitter;
> +
> +   struct pipe_context *pipe; /**< pipe context */
> +   struct pipe_buffer *vbuf; /**< quad */
> +
> +   float vertices[4][2][4]; /**< {pos, color} or {pos, texcoord} */
> +
> +   /* Constant state objects. */
> +   /* Vertex shaders. */
> +   void *vs_col; /**< Vertex shader which passes {pos, color} to the output */
> +   void *vs_tex; /**<Vertex shader which passes {pos, texcoord} to the output.*/
> +
> +   /* Fragment shaders. */
> +   void *fs_col[8]; /**< FS which outputs colors to 1-8 color buffers */
> +   void *fs_texfetch_col[4]; /**< FS which outputs a color from a texture */
> +   void *fs_texfetch_depth[4]; /**< FS which outputs a depth from a texture,
> +      where the index is PIPE_TEXTURE_* to be sampled */

Please use PIPE_MAX_COLOR_BUFS or other defines to size these arrays.

> +   /* Blend state. */
> +   void *blend_write_color; /**< blend state with writemask of RGBA */
> +   void *blend_keep_color; /**< blend state with writemask of 0 */
> +
> +   /* Depth stencil alpha state. */
> +   void *dsa_write_depth_stencil[0xff]; /**< indices are stencil clear values */

That's a lot of state objects...

> +   void *dsa_write_depth_keep_stencil;
> +   void *dsa_keep_depth_stencil;
> +
> +   /* Other state. */
> +   void *sampler_state[16]; /**< sampler state for clamping to a miplevel */
> +   void *rs_state; /**< rasterizer state */
> +};
> +
> +struct blitter_context *util_blitter_create(struct pipe_context *pipe)
> +{
> +   struct blitter_context_priv *ctx;
> +   struct pipe_blend_state blend;
> +   struct pipe_depth_stencil_alpha_state dsa;
> +   struct pipe_rasterizer_state rs_state;
> +   struct pipe_sampler_state sampler_state;
> +   unsigned i, max_render_targets;
> +
> +   ctx = CALLOC_STRUCT(blitter_context_priv);
> +   if (!ctx)
> +      return NULL;
> +
> +   ctx->pipe = pipe;
> +
> +   /* init state objects for them to be considered invalid */
> +   ctx->blitter.saved_fb_state.nr_cbufs = ~0;
> +   ctx->blitter.saved_num_textures = ~0;
> +   ctx->blitter.saved_num_sampler_states = ~0;
> +
> +   /* blend state objects */
> +   memset(&blend, 0, sizeof(blend));
> +   ctx->blend_keep_color = pipe->create_blend_state(pipe, &blend);
> +
> +   blend.colormask = PIPE_MASK_RGBA;
> +   ctx->blend_write_color = pipe->create_blend_state(pipe, &blend);
> +
> +   /* depth stencil alpha state objects */
> +   memset(&dsa, 0, sizeof(dsa));
> +   ctx->dsa_keep_depth_stencil =
> +      pipe->create_depth_stencil_alpha_state(pipe, &dsa);
> +
> +   dsa.depth.enabled = 1;
> +   dsa.depth.writemask = 1;
> +   dsa.depth.func = PIPE_FUNC_ALWAYS;
> +   ctx->dsa_write_depth_keep_stencil =
> +      pipe->create_depth_stencil_alpha_state(pipe, &dsa);
> +
> +   dsa.stencil[0].enabled = 1;
> +   dsa.stencil[0].func = PIPE_FUNC_ALWAYS;
> +   dsa.stencil[0].fail_op = PIPE_STENCIL_OP_REPLACE;
> +   dsa.stencil[0].zpass_op = PIPE_STENCIL_OP_REPLACE;
> +   dsa.stencil[0].zfail_op = PIPE_STENCIL_OP_REPLACE;
> +   dsa.stencil[0].valuemask = 0xff;
> +   dsa.stencil[0].writemask = 0xff;
> +
> +   /* create a depth stencil alpha state for each possible stencil clear
> +    * value */
> +   for (i = 0; i < 0xff; i++) {
> +      dsa.stencil[0].ref_value = i;
> +
> +      ctx->dsa_write_depth_stencil[i] =
> +         pipe->create_depth_stencil_alpha_state(pipe, &dsa);
> +   }

Ouch - that's an unexpectedly large number of state objects being created for this path. Can these be created on-demand / lazily? Can you maybe limit this code to a (much) smaller maximum number of simultaneously live states of this type? E.g. 4 or 8 of them? Creating states isn't so terribly expensive, and this seems a bit excessive.

> +   /* sampler state */
> +   memset(&sampler_state, 0, sizeof(sampler_state));
> +   sampler_state.wrap_s = PIPE_TEX_WRAP_CLAMP_TO_EDGE;
> +   sampler_state.wrap_t = PIPE_TEX_WRAP_CLAMP_TO_EDGE;
> +   sampler_state.wrap_r = PIPE_TEX_WRAP_CLAMP_TO_EDGE;
> +
> +   for (i = 0; i < 16; i++) {
> +      sampler_state.lod_bias = i;
> +      sampler_state.min_lod = i;
> +      sampler_state.max_lod = i;
> +
> +      ctx->sampler_state[i] = pipe->create_sampler_state(pipe, &sampler_state);
> +   }

Similarly, create on demand? And use a PIPE_MAX_xxx enum for the loop?

Keith
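The on-demand idea raised in this review could look roughly like the sketch below. It is only an illustration: `create_dsa_for_stencil` is a hypothetical stand-in for `pipe->create_depth_stencil_alpha_state()`, the cache struct is invented for the example, and it covers lazy creation only, not the suggested cap on simultaneously live states. The table is sized 256 here so that every value of `stencil & 0xff`, including 0xff, is a valid index.

```c
#define NUM_STENCIL_VALUES 256

/* Hypothetical stand-in for pipe->create_depth_stencil_alpha_state().
 * It only counts calls and hands back a distinct dummy pointer. */
static int num_created = 0;
static int dummy_states[NUM_STENCIL_VALUES];

static void *create_dsa_for_stencil(unsigned stencil)
{
   num_created++;
   dummy_states[stencil] = (int)stencil;
   return &dummy_states[stencil];
}

/* Invented for this sketch: one slot per stencil clear value, NULL until
 * the value is first requested. */
struct lazy_dsa_cache {
   void *dsa_write_depth_stencil[NUM_STENCIL_VALUES];
};

/* Return the state for a stencil clear value, creating it on first use. */
static void *get_dsa_write_depth_stencil(struct lazy_dsa_cache *cache,
                                         unsigned stencil)
{
   unsigned idx = stencil & 0xff;

   if (!cache->dsa_write_depth_stencil[idx])
      cache->dsa_write_depth_stencil[idx] = create_dsa_for_stencil(idx);

   return cache->dsa_write_depth_stencil[idx];
}
```

An application that only ever clears stencil to 0 would then create a single state object instead of one per possible value.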
|
|
Re: gallium: add blitter
Keith Whitwell wrote:
> On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
>
>> +static INLINE
>> +void util_blitter_save_fragment_sampler_states(
>> +   struct blitter_context *blitter,
>> +   int num_sampler_states,
>> +   void **sampler_states)
>> +{
>> +   assert(num_textures <= 32);
>> +
>> +   blitter->saved_num_sampler_states = num_sampler_states;
>> +   memcpy(blitter->saved_sampler_states, sampler_states,
>> +          num_sampler_states * sizeof(void *));
>> +}
>> +
>
> Have you tried compiling with debug enabled? The assert above fails to
> compile. Also, can you use Elements() or similar instead of the
> hard-coded 32?
>
> Maybe we can figure out how to go back to having asserts keep exposing
> their contents to the compiler even on non-debug builds. This used to
> work without problem on linux and helped a lot to avoid these types of
> problems.

__assume() for non-debug builds on Windows and MSVC.
http://msdn.microsoft.com/en-us/library/1b3fsfxw%28VS.80%29.aspx
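One way to read this suggestion is sketched below. The macro name `blitter_assert` is hypothetical, and `DEBUG` is assumed to be the macro that marks debug builds; only `__assume` itself is the MSVC intrinsic the link refers to. On MSVC release builds the expression is still parsed, so a misspelled identifier would be caught at compile time, and the optimizer is told it may rely on the condition.

```c
/* Sketch only: blitter_assert is a hypothetical name, and DEBUG is assumed
 * to be the macro that marks debug builds. */
#if defined(DEBUG)
#include <assert.h>
#define blitter_assert(expr) assert(expr)
#elif defined(_MSC_VER)
/* MSVC release build: the expression is still seen by the compiler, and
 * the optimizer is told it may assume the condition holds. */
#define blitter_assert(expr) __assume(expr)
#else
/* Other compilers, release build: the expression is discarded. */
#define blitter_assert(expr) ((void)0)
#endif

static int checked_count(int num_sampler_states)
{
   blitter_assert(num_sampler_states <= 32);
   return num_sampler_states;
}
```

Note the trade-off: `__assume` is a promise to the optimizer rather than a check, so a condition that turns out false at run time in a release build may lead to unexpected behaviour instead of a diagnostic.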
|
|
Re: gallium: add blitter
On Mon, 2009-12-14 at 05:39 -0800, Keith Whitwell wrote:
> On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
> >
> > +static INLINE
> > +void util_blitter_save_fragment_sampler_states(
> > +   struct blitter_context *blitter,
> > +   int num_sampler_states,
> > +   void **sampler_states)
> > +{
> > +   assert(num_textures <= 32);
> > +
> > +   blitter->saved_num_sampler_states = num_sampler_states;
> > +   memcpy(blitter->saved_sampler_states, sampler_states,
> > +          num_sampler_states * sizeof(void *));
> > +}
> > +
>
> Have you tried compiling with debug enabled? The assert above fails to
> compile. Also, can you use Elements() or similar instead of the
> hard-coded 32?
>
> Maybe we can figure out how to go back to having asserts keep exposing
> their contents to the compiler even on non-debug builds. This used to
> work without problem on linux and helped a lot to avoid these types of
> problems.

I wouldn't say without a problem: defining assert(expr) as (void)0 instead of (void)(expr) on release builds yielded a non-negligible performance improvement. I don't recall the exact figure, but I believe it was 3-5% for the driver I was benchmarking at the time. YMMV. Different drivers will give different results, but there's nothing platform-specific about this.

I believe the problem is that we sometimes have

    assert(very_expensive_check());

when it should really be

    #ifdef DEBUG
    assert(very_expensive_check());
    #endif

We could go through the files with a fine-toothed comb and fix it, but it's quite likely that this sort of check will creep back in unnoticed and the whole thing repeats. Between having debug builds temporarily broken and slower release builds, I'm personally for the former. It's no surprise that (void)0 is the common practice: glibc, MS's headers, etc. all do that.

Also, I don't understand why a developer wouldn't want to use a debug build unless he's profiling. I don't see why we should make it easy for a developer not to test his code, and running a debug build is the bare minimum.

Jose
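The difference between the two release-build definitions discussed in this message can be seen in a small, self-contained sketch. The macro names `ASSERT_DISCARD` and `ASSERT_EVAL` and the call counter are made up for the illustration; they are not taken from any Mesa header.

```c
static int calls = 0;

/* Stands in for an expensive validation routine. */
static int very_expensive_check(void)
{
   calls++;
   return 1;
}

/* Release-style definition A: the expression is discarded entirely, so it
 * costs nothing, but the compiler never sees what is inside it. */
#define ASSERT_DISCARD(expr) ((void)0)

/* Release-style definition B: the expression is still compiled and
 * evaluated, so errors in it are caught, at a run-time cost. */
#define ASSERT_EVAL(expr) ((void)(expr))

static int count_calls_discard(void)
{
   calls = 0;
   ASSERT_DISCARD(very_expensive_check());
   return calls;
}

static int count_calls_eval(void)
{
   calls = 0;
   ASSERT_EVAL(very_expensive_check());
   return calls;
}
```

Definition A is why an assert that references an undeclared name can slip through a release build: the preprocessor removes the expression before the compiler has a chance to reject it.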
|
|
Re: gallium: add blitter
On Mon, 2009-12-14 at 03:52 -0800, Keith Whitwell wrote:
> On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
> > Hi Keith,
> >
> > I've finished the blitter module. It fully implements the clear,
> > surface_copy, and surface_fill functions. It properly falls back to
> > software in case a surface cannot be sampled or rendered to according
> > to usage. Copying a stencil buffer always falls back unless the
> > ignore_stencil parameter (see util_blitter_copy) is set to TRUE. To my
> > knowledge, GPUs cannot copy the stencil buffer (not sure if fiddling
> > with texture formats can help). It's all documented in u_blitter.h.
> >
> > The pipe driver can optionally hook up a function to draw a quad
> > (blitter_context::draw_quad). I realized that embedding 4 vertices
> > into a command stream (AKA immediate mode) is much faster than writing
> > them to a vertex buffer due to reduced driver overhead. It might be
> > worth considering adding the draw_quad function to pipe_context.
> >
> > When working on the blitter, I added the following things to
> > util/u_simple_shaders:
> > - util_make_fragment_tex_shader has a new parameter tex_target and the
> >   value should be one of TGSI_TEXTURE_* enums so that it can be used to
> >   sample from any kind of texture.
> > - Added util_make_fragment_tex_shader_writedepth, which writes depth
> >   sampled from a texture. It's used for copying depth textures.
> > - Added util_make_fragment_clonecolor_shader, which copies input
> >   COLOR[0] to a specified number of render targets. It's used to clear
> >   MRTs.
> >
> > Also, I moved the code for converting 2D texture coordinates into
> > cubemap texture coordinates from u_gen_mipmap to a new function in
> > util/u_texture.
> >
> > Please review/push.
> >
> > Once it gets approved, I will send patches with r300g blit support to
> > Corbin. With this work, untiling a texture will be as easy as calling
> > surface_copy whereas the driver state remains intact (theoretically).
>
> Marek,
>
> This all looks great. Many thanks for adding this functionality - I'm
> sure we'll be building on it in many ways going forward.
>
> I'll push the patches intact, but one thing we need to start thinking
> about is the mix of code in the util/ directory -- there's some stuff in
> there that's only legal/useful for state-trackers, some that's likewise
> only legal for drivers, and a lot that is valid everywhere. At some
> stage we want to split that up.

I plan to split the OS-specific stuff out soon. I'm referring to memory allocation, debug printing, file abstraction, etc.: all the stuff that is not Gallium-related and is needed everywhere.

Jose
|
|
Re: gallium: add blitter
On Mon, 2009-12-14 at 03:52 -0800, Keith Whitwell wrote:
> On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
> > Hi Keith,
> >
> > I've finished the blitter module. It fully implements the clear,
> > surface_copy, and surface_fill functions. It properly falls back to
> > software in case a surface cannot be sampled or rendered to according
> > to usage. Copying a stencil buffer always falls back unless the
> > ignore_stencil parameter (see util_blitter_copy) is set to TRUE. To my
> > knowledge, GPUs cannot copy the stencil buffer (not sure if fiddling
> > with texture formats can help). It's all documented in u_blitter.h.
> >
> > The pipe driver can optionally hook up a function to draw a quad
> > (blitter_context::draw_quad). I realized that embedding 4 vertices
> > into a command stream (AKA immediate mode) is much faster than writing
> > them to a vertex buffer due to reduced driver overhead. It might be
> > worth considering adding the draw_quad function to pipe_context.
> >
> > When working on the blitter, I added the following things to
> > util/u_simple_shaders:
> > - util_make_fragment_tex_shader has a new parameter tex_target and the
> >   value should be one of TGSI_TEXTURE_* enums so that it can be used to
> >   sample from any kind of texture.
> > - Added util_make_fragment_tex_shader_writedepth, which writes depth
> >   sampled from a texture. It's used for copying depth textures.
> > - Added util_make_fragment_clonecolor_shader, which copies input
> >   COLOR[0] to a specified number of render targets. It's used to clear
> >   MRTs.
> >
> > Also, I moved the code for converting 2D texture coordinates into
> > cubemap texture coordinates from u_gen_mipmap to a new function in
> > util/u_texture.
> >
> > Please review/push.
> >
> > Once it gets approved, I will send patches with r300g blit support to
> > Corbin. With this work, untiling a texture will be as easy as calling
> > surface_copy whereas the driver state remains intact (theoretically).
>
> Marek,
>
> This all looks great. Many thanks for adding this functionality - I'm
> sure we'll be building on it in many ways going forward.

Nice stuff indeed.

FWIW, I also think that setting a reasonable functionality bar, instead of querying the pipe for every little capability, will benefit us in the long term. It worked well for vertex processing and for API quirks the hardware doesn't support (via the draw module); it's nice to see the same for blits, and I hope this becomes a trend.

It not only makes things less complex; having all pipe drivers offer similar capabilities is what allows us to plug'n'play pipe drivers, do things like replay a trace of one driver on top of another, perhaps in the future code a driver that does differential analysis against a reference one, etc.

Jose
|
|
Re: gallium: add blitter
On Mon, 2009-12-14 at 08:04 -0800, José Fonseca wrote:
> On Mon, 2009-12-14 at 05:39 -0800, Keith Whitwell wrote:
> > On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
> > >
> > > +static INLINE
> > > +void util_blitter_save_fragment_sampler_states(
> > > +                  struct blitter_context *blitter,
> > > +                  int num_sampler_states,
> > > +                  void **sampler_states)
> > > +{
> > > +   assert(num_textures <= 32);
> > > +
> > > +   blitter->saved_num_sampler_states = num_sampler_states;
> > > +   memcpy(blitter->saved_sampler_states, sampler_states,
> > > +          num_sampler_states * sizeof(void *));
> > > +}
> > > +
> >
> > Have you tried compiling with debug enabled? The assert above fails to
> > compile. Also, can you use Elements() or similar instead of the
> > hard-coded 32?
> >
> > Maybe we can figure out how to go back to having asserts keep exposing
> > their contents to the compiler even on non-debug builds. This used to
> > work without problem on linux and helped a lot to avoid these type of
> > problems.
>
> I wouldn't say without a problem: defining assert(expr) as (void)0
> instead of (void)(expr) on release builds yielded a non-negligible
> performance improvement. I don't recall the exact figure, but I believe
> it was the 3-5% for the driver I was benchmarking at the time. YMMV.
> Different drivers will give different results, but there's nothing
> platform specific about this.

It's not hard to avoid executing code... For instance we could always have it translated to something like:

   if (0) {
      (void)(expr);
   }
   (void)(0)

> I believe the problem is we sometimes have
>
>   assert(very_expensive_check());
>
> and it should be really
>
> #ifdef DEBUG
>   assert(very_expensive_check());
> #endif

I think the above translation is fine, without the extra ifdefs.

> We could go through the files with a fine-toothed comb and fix it, but
> it's quite likely this sort of checks creep back in unnoticed and the
> thing repeats again. Between having debug builds temporarily broken and
> slower release builds, I'm personally for the former.
>
> No surprise that (void)0 is the common practice: glibc, ms's headers,
> etc. all do that.

> Also, I don't understand why a developer wouldn't want to use a debug
> build unless he's profiling. I don't see why we should make it easy for a
> developer not to test his code, and running a debug build is the bare
> minimum.

There are other issues as well, such as unused variable warnings for vars used only in asserts, etc.

Keith
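To make the review comment above concrete (the assert names an undeclared `num_textures` and hard-codes 32), here is a minimal, self-contained sketch of how the helper might be corrected. It is not the code from the patch: `struct blitter_sketch`, `MAX_SAVED_SAMPLER_STATES`, and `ARRAY_LEN()` (standing in for Gallium's `Elements()`) are assumptions made for this example, and the early-out return is just one possible way of handling a bad count in non-debug builds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for Gallium's Elements() macro (assumed here, not copied). */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical capacity; the thread only mentions a hard-coded 32. */
#define MAX_SAVED_SAMPLER_STATES 32

/* Hypothetical, trimmed-down stand-in for struct blitter_context. */
struct blitter_sketch {
   int saved_num_sampler_states;
   void *saved_sampler_states[MAX_SAVED_SAMPLER_STATES];
};

/* Returns 0 on success, -1 if the caller passed an out-of-range count. */
static int
save_fragment_sampler_states(struct blitter_sketch *blitter,
                             int num_sampler_states,
                             void **sampler_states)
{
   const int capacity = (int)ARRAY_LEN(blitter->saved_sampler_states);

   /* The assert refers to the real parameter, and the bound is derived
    * from the array itself rather than from a magic number. */
   assert(num_sampler_states >= 0 && num_sampler_states <= capacity);

   /* Same check again so that builds without asserts still bail out. */
   if (num_sampler_states < 0 || num_sampler_states > capacity)
      return -1;

   blitter->saved_num_sampler_states = num_sampler_states;
   memcpy(blitter->saved_sampler_states, sampler_states,
          (size_t)num_sampler_states * sizeof(void *));
   return 0;
}
```

Deriving the bound from the array means the check keeps working if the array size changes later, which seems to be the point of the `Elements()` suggestion.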
|
|
Re: gallium: add blitter
On Mon, 2009-12-14 at 08:19 -0800, Keith Whitwell wrote:
> It's not hard to avoid executing code... For instance we could always
> have it translated to something like:
>
> if (0) {
>    (void)(expr);
> }
> (void)(0)

Obviously I would have meant to say something cleaner like:

   do {
      if (0) { (void)(expr); }
   } while (0)

Keith
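The macro proposed above can be tried out in isolation. The following is a small sketch, not Gallium's actual assert implementation: the names `SKETCH_ASSERT_DEBUG`, `SKETCH_ASSERT_RELEASE`, and the call counter are invented for the example. It shows that the `if (0)` form keeps the expression visible to the compiler (so a misspelled identifier would still be a compile error) while never evaluating it at run time.

```c
#include <stdio.h>
#include <stdlib.h>

/* Debug flavour (hypothetical name): evaluates expr, aborts on failure. */
#define SKETCH_ASSERT_DEBUG(expr) \
   do { \
      if (!(expr)) { \
         fprintf(stderr, "%s:%d: assertion failed: %s\n", \
                 __FILE__, __LINE__, #expr); \
         abort(); \
      } \
   } while (0)

/* Release flavour (hypothetical name): expr is parsed and type-checked,
 * but sits in a branch that is never taken, so it is never evaluated. */
#define SKETCH_ASSERT_RELEASE(expr) \
   do { \
      if (0) { (void)(expr); } \
   } while (0)

static int expensive_check_calls = 0;

/* Stand-in for the very_expensive_check() mentioned in the thread. */
static int very_expensive_check(void)
{
   expensive_check_calls++;
   return 1;
}

/* Number of times the check ran for one release-style assert. */
static int calls_after_release_assert(void)
{
   expensive_check_calls = 0;
   SKETCH_ASSERT_RELEASE(very_expensive_check());
   return expensive_check_calls;
}

/* Number of times the check ran for one debug-style assert. */
static int calls_after_debug_assert(void)
{
   expensive_check_calls = 0;
   SKETCH_ASSERT_DEBUG(very_expensive_check());
   return expensive_check_calls;
}
```

Because the release form still references every variable in the expression, it should also avoid "unused variable" warnings for variables that only appear in asserts.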
|
|
Re: gallium: add blitter
As far as immediate verts, why don't we just add support to r300g to switch to immediate mode for small VBOs?

Posting from a mobile, pardon my terseness. ~ C.

On Dec 13, 2009 3:28 PM, "Marek Olšák" <maraeo@...> wrote:
|
|
Re: gallium: add blitter
On Mon, 2009-12-14 at 08:22 -0800, Keith Whitwell wrote:
> Obviously I would have meant to say something cleaner like:
>
> do {
>    if (0) { (void)(expr); }
> } while (0)

This only works if expr has no calls, or just inline calls. Using my earlier example, if very_expensive_check() is in another file then the compiler has to assume the function will have side effects, and the call can't be removed.

I'm not sure the __assume keyword that Michal mentioned helps. It's more a hint to the compiler to help it optimize code around the assertion, but perhaps it helps with the warnings too.

Jose
|
|
Re: gallium: add blitter
On Mon, Dec 14, 2009 at 11:42 AM, Corbin Simpson <mostawesomedude@...> wrote:
> As far as immediate verts, why don't we just add support to r300g to switch
> to immediate mode for small VBOs?
>
> Posting from a mobile, pardon my terseness. ~ C.

That was what I was thinking for Nouveau: silently create a user buffer for size < some threshold, and when we get a draw call with a user vertex buffer, submit it in immediate mode.
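The threshold idea described above can be sketched as a simple decision function. Everything here is hypothetical: the enum, the function names, and especially the 256-byte cutoff are invented for illustration and do not come from r300g, Nouveau, or the Gallium interface.

```c
#include <stddef.h>

/* Hypothetical submission paths, named for this sketch only. */
enum vertex_submit_path {
   SUBMIT_IMMEDIATE,      /* vertices embedded in the command stream */
   SUBMIT_VERTEX_BUFFER   /* vertices fetched from a buffer object */
};

/* Made-up cutoff; a real driver would have to measure and tune this. */
#define IMMEDIATE_MODE_MAX_BYTES 256

/* Buffers below the cutoff are silently treated as user buffers. */
static int buffer_is_user(size_t size_bytes)
{
   return size_bytes < IMMEDIATE_MODE_MAX_BYTES;
}

/* Draws from user buffers go out in immediate mode; the rest use a VBO. */
static enum vertex_submit_path choose_submit_path(size_t size_bytes)
{
   return buffer_is_user(size_bytes) ? SUBMIT_IMMEDIATE
                                     : SUBMIT_VERTEX_BUFFER;
}
```

With this kind of policy, a quad of four float4 positions (64 bytes) would take the immediate path without the caller needing a dedicated draw_quad hook.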
|
|
Re: gallium: add blitter
José Fonseca wrote:
> This only works if expr has no calls, or just inline calls. Using my
> earlier example, if very_expensive_check() is in another file then the
> compiler has to assume the function will have side effects, and the call
> can't be removed.
>
> I'm not sure the __assume keyword that Michal mentioned helps. It's more a
> hint to the compiler to help it optimize code around the assertion, but
> perhaps it helps with the warnings too.

If I try to compile this:

   __assume(lalala);

I get:

   error C2065: 'lalala' : undeclared identifier

On the other hand, the compiler is going to be serious about the assumptions inside __assume(), and if they happen to be false, the application may not behave as expected. This is against the current gallium paradigm, where we put assertions, but also do the same check in non-debug builds to early out from a function or provide default values (e.g. in switch-case statements).
|
|
Re: gallium: add blitter
On Mon, 2009-12-14 at 08:58 -0800, michal wrote:
> If I try to compile this:
>
> __assume(lalala);
>
> I get:
>
> error C2065: 'lalala' : undeclared identifier
>
> On the other side, the compiler is going to be serious about the
> assumptions inside __assume(), and if they happen to be false, the
> application can behave not as expected. This is against current gallium
> paradigm, where we put assertions, but also do the same check in
> non-debug builds to early out from a function or provide default values
> (e.g. in switch-case statements).

Bummer... that's no good.

Jose
|
|
Re: gallium: add blitter
José Fonseca wrote:
> On Mon, 2009-12-14 at 08:58 -0800, michal wrote:
>> On the other side, the compiler is going to be serious about the
>> assumptions inside __assume(), and if they happen to be false, the
>> application can behave not as expected. This is against current gallium
>> paradigm, where we put assertions, but also do the same check in
>> non-debug builds to early out from a function or provide default values
>> (e.g. in switch-case statements).
>
> Bummer... that's no good.

   switch (foo) {
   case 1:
      bar = 22;
   default:
      assert(0);
      bar = 11; /* Safe value. */
   }

to use some flavour of assert() that doesn't get substituted with __assume() on non-debug builds. Something like weak_assert() or warning(). Then assert() could be used in places where there is no backup plan and the app is going to crash anyway.

Or... do the opposite and introduce strong_assert() that translates to __assume() and leave assert() as it is now.
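The weak_assert() idea above could look roughly like the sketch below. This is only an illustration under stated assumptions: `weak_assert` is the hypothetical name floated in the message, the failure counter and `pick_bar()` are invented for the example, and the `break` statements are added here because the snippet as archived has none. The sketch also reports failures unconditionally, whereas a real version might compile to nothing in release builds.

```c
#include <stdio.h>

/* Counts failed weak assertions so the behaviour can be observed. */
static int weak_assert_failures = 0;

/* Hypothetical weak_assert(): reports a failed condition but is never
 * turned into a compiler assumption, so fallback code after it stays
 * meaningful and execution simply continues. */
#define weak_assert(expr) \
   do { \
      if (!(expr)) { \
         weak_assert_failures++; \
         fprintf(stderr, "%s:%d: weak assertion failed: %s\n", \
                 __FILE__, __LINE__, #expr); \
      } \
   } while (0)

/* Mirrors the switch from the message, with a safe default value. */
static int pick_bar(int foo)
{
   int bar;

   switch (foo) {
   case 1:
      bar = 22;
      break;            /* added for this sketch */
   default:
      weak_assert(0);   /* unexpected value: complain, then recover */
      bar = 11;         /* Safe value. */
      break;
   }
   return bar;
}
```

The design question raised in the thread is which flavour keeps the plain `assert()` name: the recoverable one shown here, or a stricter one that the compiler is allowed to treat as an assumption.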
|
|
Re: gallium: add blitter
On Mon, 2009-12-14 at 08:51 -0800, José Fonseca wrote:
> On Mon, 2009-12-14 at 08:22 -0800, Keith Whitwell wrote:
> > Obviously I would have meant to say something cleaner like:
> >
> > do {
> >    if (0) { (void)(expr); }
> > } while (0)
>
> This only works if expr has no calls, or just inline calls. Using my
> earlier example, if very_expensive_check() is in another file then the
> compiler has to assume the function will have side effects, and the call
> can't be removed.

What call?!?

   if (0)
      do_something_with_side_effects();

Has no side effects.

Keith
|
|
Re: gallium: add blitter
On Mon, 2009-12-14 at 09:28 -0800, Keith Whitwell wrote:
> What call?!?
>
> if (0) do_something_with_side_effects();
>
> Has no side effects.

Never mind. Don't know what I was thinking.

Jose
|
|
Re: gallium: add blitter
Keith,

thanks for reviewing.

On Mon, Dec 14, 2009 at 2:39 PM, Keith Whitwell <keithw@...> wrote:
> On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
>>
>> +static INLINE
>> +void util_blitter_save_fragment_sampler_states(
>> +                  struct blitter_context *blitter,
>> +                  int num_sampler_states,
>> +                  void **sampler_states)
>> +{
>> +   assert(num_textures <= 32);
>> +
>> +   blitter->saved_num_sampler_states = num_sampler_states;
>> +   memcpy(blitter->saved_sampler_states, sampler_states,
>> +          num_sampler_states * sizeof(void *));
>> +}
>> +
>
> Have you tried compiling with debug enabled? The assert above fails to
> compile. Also, can you use Elements() or similar instead of the
> hard-coded 32?

that don't break the compilation are in separate patches.

On Mon, Dec 14, 2009 at 3:44 PM, Keith Whitwell <keithw@...> wrote:
> Can these be created on-demand / lazily?

Done. It now creates even fragment shaders on demand, because some of them might not be used at all in some applications.

On Mon, Dec 14, 2009 at 3:44 PM, Keith Whitwell <keithw@...> wrote:
> Can you maybe limit this code to a (much) smaller maximum number of
> simultaneously live states of this type? Eg. 4 or 8 of them? Creating
> states isn't so terribly expensive, and this seems a bit excessive.

Well, I'd like to avoid re-allocating state objects too often. Moreover, it's quite rare for an application to use more than 4 values to clear the stencil buffer.

I also removed the draw_quad callback, as there appears to be a more efficient way of handling this, and cleaned up the code to use PIPE_MAX_* constants.

Please review/push.
Marek [0001-util-add-new-fragment-shaders-to-simple_shaders.patch] From 511f58a54315d07740493cdda050d1ebd5a4ecd3 Mon Sep 17 00:00:00 2001 From: =?utf-8?q?Marek=20Ol=C5=A1=C3=A1k?= <maraeo@...> Date: Sat, 12 Dec 2009 06:34:29 +0100 Subject: [PATCH 1/7] util: add new fragment shaders to simple_shaders New shaders: * Fragment shader which writes depth sampled from a texture * Fragment shader which copies COLOR[0] to multiple render targets Additional improvements: * The fragment 'tex' shaders now take a sampler type (TGSI_TEXTURE_*) so that they can sample from any type of texture, not only from a 2D one. --- src/gallium/auxiliary/util/u_blit.c | 7 ++- src/gallium/auxiliary/util/u_gen_mipmap.c | 2 +- src/gallium/auxiliary/util/u_simple_shaders.c | 70 ++++++++++++++++++++++--- src/gallium/auxiliary/util/u_simple_shaders.h | 13 ++++- 4 files changed, 80 insertions(+), 12 deletions(-) diff --git a/src/gallium/auxiliary/util/u_blit.c b/src/gallium/auxiliary/util/u_blit.c index abe1de3..c9050ca 100644 --- a/src/gallium/auxiliary/util/u_blit.c +++ b/src/gallium/auxiliary/util/u_blit.c @@ -126,7 +126,8 @@ util_create_blit(struct pipe_context *pipe, struct cso_context *cso) } /* fragment shader */ - ctx->fs[TGSI_WRITEMASK_XYZW] = util_make_fragment_tex_shader(pipe); + ctx->fs[TGSI_WRITEMASK_XYZW] = + util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_2D); ctx->vbuf = NULL; /* init vertex data that doesn't change */ @@ -420,7 +421,9 @@ util_blit_pixels_writemask(struct blit_state *ctx, cso_set_sampler_textures(ctx->cso, 1, &tex); if (ctx->fs[writemask] == NULL) - ctx->fs[writemask] = util_make_fragment_tex_shader_writemask(pipe, writemask); + ctx->fs[writemask] = + util_make_fragment_tex_shader_writemask(pipe, TGSI_TEXTURE_2D, + writemask); /* shaders */ cso_set_fragment_shader_handle(ctx->cso, ctx->fs[writemask]); diff --git a/src/gallium/auxiliary/util/u_gen_mipmap.c b/src/gallium/auxiliary/util/u_gen_mipmap.c index 83263d9..1728e66 100644 --- 
a/src/gallium/auxiliary/util/u_gen_mipmap.c +++ b/src/gallium/auxiliary/util/u_gen_mipmap.c @@ -1317,7 +1317,7 @@ util_create_gen_mipmap(struct pipe_context *pipe, } /* fragment shader */ - ctx->fs = util_make_fragment_tex_shader(pipe); + ctx->fs = util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_2D); /* vertex data that doesn't change */ for (i = 0; i < 4; i++) { diff --git a/src/gallium/auxiliary/util/u_simple_shaders.c b/src/gallium/auxiliary/util/u_simple_shaders.c index 1c8b157..8172ead 100644 --- a/src/gallium/auxiliary/util/u_simple_shaders.c +++ b/src/gallium/auxiliary/util/u_simple_shaders.c @@ -2,6 +2,7 @@ * * Copyright 2008 Tungsten Graphics, Inc., Cedar Park, Texas. * All Rights Reserved. + * Copyright 2009 Marek Olšák <maraeo@...> * * Permission is hereby granted, free of charge, to any person obtaining a * copy of this software and associated documentation files (the @@ -30,6 +31,7 @@ * Simple vertex/fragment shader generators. * * @author Brian Paul + Marek Olšák */ @@ -87,6 +89,7 @@ util_make_vertex_passthrough_shader(struct pipe_context *pipe, */ void * util_make_fragment_tex_shader_writemask(struct pipe_context *pipe, + unsigned tex_target, unsigned writemask ) { struct ureg_program *ureg; @@ -116,20 +119,63 @@ util_make_fragment_tex_shader_writemask(struct pipe_context *pipe, ureg_TEX( ureg, ureg_writemask(out, writemask), - TGSI_TEXTURE_2D, tex, sampler ); + tex_target, tex, sampler ); ureg_END( ureg ); return ureg_create_shader_and_destroy( ureg, pipe ); } void * -util_make_fragment_tex_shader(struct pipe_context *pipe ) +util_make_fragment_tex_shader(struct pipe_context *pipe, unsigned tex_target ) { return util_make_fragment_tex_shader_writemask( pipe, + tex_target, TGSI_WRITEMASK_XYZW ); } +/** + * Make a simple fragment texture shader which reads an X component from + * a texture and writes it as depth. 
+ */ +void * +util_make_fragment_tex_shader_writedepth(struct pipe_context *pipe, + unsigned tex_target) +{ + struct ureg_program *ureg; + struct ureg_src sampler; + struct ureg_src tex; + struct ureg_dst out, depth; + struct ureg_src imm; + ureg = ureg_create( TGSI_PROCESSOR_FRAGMENT ); + if (ureg == NULL) + return NULL; + + sampler = ureg_DECL_sampler( ureg, 0 ); + + tex = ureg_DECL_fs_input( ureg, + TGSI_SEMANTIC_GENERIC, 0, + TGSI_INTERPOLATE_PERSPECTIVE ); + + out = ureg_DECL_output( ureg, + TGSI_SEMANTIC_COLOR, + 0 ); + + depth = ureg_DECL_output( ureg, + TGSI_SEMANTIC_POSITION, + 0 ); + + imm = ureg_imm4f( ureg, 0, 0, 0, 1 ); + + ureg_MOV( ureg, out, imm ); + + ureg_TEX( ureg, + ureg_writemask(depth, TGSI_WRITEMASK_Z), + tex_target, tex, sampler ); + ureg_END( ureg ); + + return ureg_create_shader_and_destroy( ureg, pipe ); +} /** * Make simple fragment color pass-through shader. @@ -137,9 +183,18 @@ util_make_fragment_tex_shader(struct pipe_context *pipe ) void * util_make_fragment_passthrough_shader(struct pipe_context *pipe) { + return util_make_fragment_clonecolor_shader(pipe, 1); +} + +void * +util_make_fragment_clonecolor_shader(struct pipe_context *pipe, int num_cbufs) +{ struct ureg_program *ureg; struct ureg_src src; - struct ureg_dst dst; + struct ureg_dst dst[8]; + int i; + + assert(num_cbufs <= 8); ureg = ureg_create( TGSI_PROCESSOR_FRAGMENT ); if (ureg == NULL) @@ -148,12 +203,13 @@ util_make_fragment_passthrough_shader(struct pipe_context *pipe) src = ureg_DECL_fs_input( ureg, TGSI_SEMANTIC_COLOR, 0, TGSI_INTERPOLATE_PERSPECTIVE ); - dst = ureg_DECL_output( ureg, TGSI_SEMANTIC_COLOR, 0 ); + for (i = 0; i < num_cbufs; i++) + dst[i] = ureg_DECL_output( ureg, TGSI_SEMANTIC_COLOR, i ); + + for (i = 0; i < num_cbufs; i++) + ureg_MOV( ureg, dst[i], src ); - ureg_MOV( ureg, dst, src ); ureg_END( ureg ); return ureg_create_shader_and_destroy( ureg, pipe ); } - - diff --git a/src/gallium/auxiliary/util/u_simple_shaders.h 
b/src/gallium/auxiliary/util/u_simple_shaders.h index d2e80d6..6e76094 100644 --- a/src/gallium/auxiliary/util/u_simple_shaders.h +++ b/src/gallium/auxiliary/util/u_simple_shaders.h @@ -51,16 +51,25 @@ util_make_vertex_passthrough_shader(struct pipe_context *pipe, extern void * util_make_fragment_tex_shader_writemask(struct pipe_context *pipe, - unsigned writemask ); + unsigned tex_target, + unsigned writemask); extern void * -util_make_fragment_tex_shader(struct pipe_context *pipe); +util_make_fragment_tex_shader(struct pipe_context *pipe, unsigned tex_target); + + +extern void * +util_make_fragment_tex_shader_writedepth(struct pipe_context *pipe, + unsigned tex_target); extern void * util_make_fragment_passthrough_shader(struct pipe_context *pipe); +extern void * +util_make_fragment_clonecolor_shader(struct pipe_context *pipe, int num_cbufs); + #ifdef __cplusplus } #endif -- 1.6.3.3 [0002-util-add-a-function-which-converts-2D-coordinates-to.patch] From dddb77c058d67c0a192b871deb8d837dfabbefce Mon Sep 17 00:00:00 2001 From: =?utf-8?q?Marek=20Ol=C5=A1=C3=A1k?= <maraeo@...> Date: Sat, 12 Dec 2009 23:38:17 +0100 Subject: [PATCH 2/7] util: add a function which converts 2D coordinates to cubemap coordinates The code was taken over from u_gen_mipmap. 
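For readers following along, here is a minimal standalone sketch of the 2D-to-cubemap mapping that this patch moves out of u_gen_mipmap. It is not the code from the patch: the enum cube_face values and the function name map_texcoords2d_onto_cubemap are hypothetical stand-ins for PIPE_TEX_FACE_* and util_map_texcoords2d_onto_cubemap, chosen so that it compiles without the Gallium headers. The per-face formulas and the 0.9999 scale factor are taken from the diff that follows.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for Gallium's PIPE_TEX_FACE_* values; the real
 * definitions live in pipe/p_defines.h and are not reproduced here. */
enum cube_face {
   FACE_POS_X, FACE_NEG_X,
   FACE_POS_Y, FACE_NEG_Y,
   FACE_POS_Z, FACE_NEG_Z
};

/* Map 4 (s,t) pairs in [0,1] onto (s,t,r) vectors lying on one cube face.
 * Both strides are counted in floats, as in the patch. */
static void map_texcoords2d_onto_cubemap(enum cube_face face,
                                         const float *in_st, size_t in_stride,
                                         float *out_str, size_t out_stride)
{
   /* Slightly less than 1 to reduce face selection ambiguity at the edges. */
   const float scale = 0.9999f;
   int i;

   /* loop over the 4 quad vertices */
   for (i = 0; i < 4; i++) {
      /* Remap [0,1] to roughly [-1,1]. */
      const float sc = (2.0f * in_st[0] - 1.0f) * scale;
      const float tc = (2.0f * in_st[1] - 1.0f) * scale;
      float rx = 0.0f, ry = 0.0f, rz = 0.0f;

      switch (face) {
      case FACE_POS_X: rx =  1.0f; ry = -tc;   rz = -sc;   break;
      case FACE_NEG_X: rx = -1.0f; ry = -tc;   rz =  sc;   break;
      case FACE_POS_Y: rx =  sc;   ry =  1.0f; rz =  tc;   break;
      case FACE_NEG_Y: rx =  sc;   ry = -1.0f; rz = -tc;   break;
      case FACE_POS_Z: rx =  sc;   ry = -tc;   rz =  1.0f; break;
      case FACE_NEG_Z: rx = -sc;   ry = -tc;   rz = -1.0f; break;
      default:
         assert(0); /* unknown face */
         break;
      }

      out_str[0] = rx; /* s */
      out_str[1] = ry; /* t */
      out_str[2] = rz; /* r */

      in_st += in_stride;
      out_str += out_stride;
   }
}
```

With in_stride = 2 and out_stride = 8, the output pointer can aim at the texcoord slot of a float vertices[4][2][4] array and leave the position slot untouched, which is how both u_gen_mipmap and the blitter call the real function.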
--- src/gallium/auxiliary/util/Makefile | 1 + src/gallium/auxiliary/util/SConscript | 1 + src/gallium/auxiliary/util/u_gen_mipmap.c | 55 +--------------- src/gallium/auxiliary/util/u_texture.c | 102 +++++++++++++++++++++++++++++ src/gallium/auxiliary/util/u_texture.h | 54 +++++++++++++++ 5 files changed, 161 insertions(+), 52 deletions(-) create mode 100644 src/gallium/auxiliary/util/u_texture.c create mode 100644 src/gallium/auxiliary/util/u_texture.h diff --git a/src/gallium/auxiliary/util/Makefile b/src/gallium/auxiliary/util/Makefile index 1d8bb55..894958f 100644 --- a/src/gallium/auxiliary/util/Makefile +++ b/src/gallium/auxiliary/util/Makefile @@ -30,6 +30,7 @@ C_SOURCES = \ u_stream_stdc.c \ u_stream_wd.c \ u_surface.c \ + u_texture.c \ u_tile.c \ u_time.c \ u_timed_winsys.c \ diff --git a/src/gallium/auxiliary/util/SConscript b/src/gallium/auxiliary/util/SConscript index 8d99106..0c0e048 100644 --- a/src/gallium/auxiliary/util/SConscript +++ b/src/gallium/auxiliary/util/SConscript @@ -48,6 +48,7 @@ util = env.ConvenienceLibrary( 'u_stream_stdc.c', 'u_stream_wd.c', 'u_surface.c', + 'u_texture.c', 'u_tile.c', 'u_time.c', 'u_timed_winsys.c', diff --git a/src/gallium/auxiliary/util/u_gen_mipmap.c b/src/gallium/auxiliary/util/u_gen_mipmap.c index 1728e66..69ff3b9 100644 --- a/src/gallium/auxiliary/util/u_gen_mipmap.c +++ b/src/gallium/auxiliary/util/u_gen_mipmap.c @@ -46,6 +46,7 @@ #include "util/u_gen_mipmap.h" #include "util/u_simple_shaders.h" #include "util/u_math.h" +#include "util/u_texture.h" #include "cso_cache/cso_context.h" @@ -1383,59 +1384,9 @@ set_vertex_data(struct gen_mipmap_state *ctx, static const float st[4][2] = { {0.0f, 0.0f}, {1.0f, 0.0f}, {1.0f, 1.0f}, {0.0f, 1.0f} }; - float rx, ry, rz; - uint i; - - /* loop over quad verts */ - for (i = 0; i < 4; i++) { - /* Compute sc = +/-scale and tc = +/-scale. - * Not +/-1 to avoid cube face selection ambiguity near the edges, - * though that can still sometimes happen with this scale factor... 
- */ - const float scale = 0.9999f; - const float sc = (2.0f * st[i][0] - 1.0f) * scale; - const float tc = (2.0f * st[i][1] - 1.0f) * scale; - - switch (face) { - case PIPE_TEX_FACE_POS_X: - rx = 1.0f; - ry = -tc; - rz = -sc; - break; - case PIPE_TEX_FACE_NEG_X: - rx = -1.0f; - ry = -tc; - rz = sc; - break; - case PIPE_TEX_FACE_POS_Y: - rx = sc; - ry = 1.0f; - rz = tc; - break; - case PIPE_TEX_FACE_NEG_Y: - rx = sc; - ry = -1.0f; - rz = -tc; - break; - case PIPE_TEX_FACE_POS_Z: - rx = sc; - ry = -tc; - rz = 1.0f; - break; - case PIPE_TEX_FACE_NEG_Z: - rx = -sc; - ry = -tc; - rz = -1.0f; - break; - default: - rx = ry = rz = 0.0f; - assert(0); - } - ctx->vertices[i][1][0] = rx; /*s*/ - ctx->vertices[i][1][1] = ry; /*t*/ - ctx->vertices[i][1][2] = rz; /*r*/ - } + util_map_texcoords2d_onto_cubemap(face, &st[0][0], 2, + &ctx->vertices[0][1][0], 8); } else { /* 1D/2D */ diff --git a/src/gallium/auxiliary/util/u_texture.c b/src/gallium/auxiliary/util/u_texture.c new file mode 100644 index 0000000..cd477ab --- /dev/null +++ b/src/gallium/auxiliary/util/u_texture.c @@ -0,0 +1,102 @@ +/************************************************************************** + * + * Copyright 2008 Tungsten Graphics, Inc., Cedar Park, Texas. + * All Rights Reserved. + * Copyright 2008 VMware, Inc. All rights reserved. + * Copyright 2009 Marek Olšák <maraeo@...> + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the + * "Software"), to deal in the Software without restriction, including + * without limitation the rights to use, copy, modify, merge, publish, + * distribute, sub license, and/or sell copies of the Software, and to + * permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice (including the + * next paragraph) shall be included in all copies or substantial portions + * of the Software. 
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. + * IN NO EVENT SHALL TUNGSTEN GRAPHICS AND/OR ITS SUPPLIERS BE LIABLE FOR + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + * + **************************************************************************/ + +/** + * @file + * Texture mapping utility functions. + * + * @author Brian Paul + * Marek Olšák + */ + +#include "pipe/p_defines.h" + +#include "util/u_texture.h" + +void util_map_texcoords2d_onto_cubemap(unsigned face, + const float *in_st, unsigned in_stride, + float *out_str, unsigned out_stride) +{ + int i; + float rx, ry, rz; + + /* loop over quad verts */ + for (i = 0; i < 4; i++) { + /* Compute sc = +/-scale and tc = +/-scale. + * Not +/-1 to avoid cube face selection ambiguity near the edges, + * though that can still sometimes happen with this scale factor... 
+ */ + const float scale = 0.9999f; + const float sc = (2 * in_st[0] - 1) * scale; + const float tc = (2 * in_st[1] - 1) * scale; + + switch (face) { + case PIPE_TEX_FACE_POS_X: + rx = 1; + ry = -tc; + rz = -sc; + break; + case PIPE_TEX_FACE_NEG_X: + rx = -1; + ry = -tc; + rz = sc; + break; + case PIPE_TEX_FACE_POS_Y: + rx = sc; + ry = 1; + rz = tc; + break; + case PIPE_TEX_FACE_NEG_Y: + rx = sc; + ry = -1; + rz = -tc; + break; + case PIPE_TEX_FACE_POS_Z: + rx = sc; + ry = -tc; + rz = 1; + break; + case PIPE_TEX_FACE_NEG_Z: + rx = -sc; + ry = -tc; + rz = -1; + break; + default: + rx = ry = rz = 0; + assert(0); + } + + out_str[0] = rx; /*s*/ + out_str[1] = ry; /*t*/ + out_str[2] = rz; /*r*/ + + in_st += in_stride; + out_str += out_stride; + } +} diff --git a/src/gallium/auxiliary/util/u_texture.h b/src/gallium/auxiliary/util/u_texture.h new file mode 100644 index 0000000..93b2f1e --- /dev/null +++ b/src/gallium/auxiliary/util/u_texture.h @@ -0,0 +1,54 @@ +/************************************************************************** + * + * Copyright 2009 Marek Olšák <maraeo@...> + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the + * "Software"), to deal in the Software without restriction, including + * without limitation the rights to use, copy, modify, merge, publish, + * distribute, sub license, and/or sell copies of the Software, and to + * permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice (including the + * next paragraph) shall be included in all copies or substantial portions + * of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. 
+ * IN NO EVENT SHALL TUNGSTEN GRAPHICS AND/OR ITS SUPPLIERS BE LIABLE FOR + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + * + **************************************************************************/ + +#ifndef U_TEXTURE_H +#define U_TEXTURE_H + +#ifdef __cplusplus +extern "C" { +#endif + +/** + * Convert 2D texture coordinates of 4 vertices into cubemap coordinates + * in the given face. + * Coordinates must be in the range [0,1]. + * + * \param face Cubemap face. + * \param in_st 4 pairs of 2D texture coordinates to convert. + * \param in_stride Stride of in_st in floats. + * \param out_str STR cubemap texture coordinates to compute. + * \param out_stride Stride of out_str in floats. + */ +void util_map_texcoords2d_onto_cubemap(unsigned face, + const float *in_st, unsigned in_stride, + float *out_str, unsigned out_stride); + + +#ifdef __cplusplus +} +#endif + +#endif -- 1.6.3.3 [0003-util-add-blitter.patch] From 6ff91fad38eae6d489f2d0ac2dac4508a499bbdc Mon Sep 17 00:00:00 2001 From: =?utf-8?q?Marek=20Ol=C5=A1=C3=A1k?= <maraeo@...> Date: Thu, 10 Dec 2009 10:25:33 +0100 Subject: [PATCH 3/7] util: add blitter --- src/gallium/auxiliary/util/Makefile | 1 + src/gallium/auxiliary/util/SConscript | 1 + src/gallium/auxiliary/util/u_blitter.c | 605 ++++++++++++++++++++++++++++++++ src/gallium/auxiliary/util/u_blitter.h | 244 +++++++++++++ 4 files changed, 851 insertions(+), 0 deletions(-) create mode 100644 src/gallium/auxiliary/util/u_blitter.c create mode 100644 src/gallium/auxiliary/util/u_blitter.h diff --git a/src/gallium/auxiliary/util/Makefile b/src/gallium/auxiliary/util/Makefile index 894958f..f81fc46 100644 --- a/src/gallium/auxiliary/util/Makefile +++ b/src/gallium/auxiliary/util/Makefile @@ -9,6 +9,7 @@ C_SOURCES = \ u_debug_symbol.c \ u_debug_stack.c \ u_blit.c \ + u_blitter.c \ u_cache.c \ 
u_cpu_detect.c \ u_draw_quad.c \ diff --git a/src/gallium/auxiliary/util/SConscript b/src/gallium/auxiliary/util/SConscript index 0c0e048..024a370 100644 --- a/src/gallium/auxiliary/util/SConscript +++ b/src/gallium/auxiliary/util/SConscript @@ -23,6 +23,7 @@ util = env.ConvenienceLibrary( source = [ 'u_bitmask.c', 'u_blit.c', + 'u_blitter.c', 'u_cache.c', 'u_cpu_detect.c', 'u_debug.c', diff --git a/src/gallium/auxiliary/util/u_blitter.c b/src/gallium/auxiliary/util/u_blitter.c new file mode 100644 index 0000000..e51a5df --- /dev/null +++ b/src/gallium/auxiliary/util/u_blitter.c @@ -0,0 +1,605 @@ +/************************************************************************** + * + * Copyright 2009 Marek Olšák <maraeo@...> + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the + * "Software"), to deal in the Software without restriction, including + * without limitation the rights to use, copy, modify, merge, publish, + * distribute, sub license, and/or sell copies of the Software, and to + * permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice (including the + * next paragraph) shall be included in all copies or substantial portions + * of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. + * IN NO EVENT SHALL TUNGSTEN GRAPHICS AND/OR ITS SUPPLIERS BE LIABLE FOR + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
+ * + **************************************************************************/ + +/** + * @file + * Blitter utility to facilitate acceleration of the clear, surface_copy, + * and surface_fill functions. + * + * @author Marek Olšák + */ + +#include "pipe/p_context.h" +#include "pipe/p_defines.h" +#include "pipe/p_inlines.h" +#include "pipe/p_shader_tokens.h" +#include "pipe/p_state.h" + +#include "util/u_memory.h" +#include "util/u_math.h" +#include "util/u_blitter.h" +#include "util/u_draw_quad.h" +#include "util/u_pack_color.h" +#include "util/u_rect.h" +#include "util/u_simple_shaders.h" +#include "util/u_texture.h" + +struct blitter_context_priv +{ + struct blitter_context blitter; + + struct pipe_context *pipe; /**< pipe context */ + struct pipe_buffer *vbuf; /**< quad */ + + float vertices[4][2][4]; /**< {pos, color} or {pos, texcoord} */ + + /* Constant state objects. */ + /* Vertex shaders. */ + void *vs_col; /**< Vertex shader which passes {pos, color} to the output */ + void *vs_tex; /**<Vertex shader which passes {pos, texcoord} to the output.*/ + + /* Fragment shaders. */ + void *fs_col[8]; /**< FS which outputs colors to 1-8 color buffers */ + void *fs_texfetch_col[4]; /**< FS which outputs a color from a texture */ + void *fs_texfetch_depth[4]; /**< FS which outputs a depth from a texture, + where the index is PIPE_TEXTURE_* to be sampled */ + + /* Blend state. */ + void *blend_write_color; /**< blend state with writemask of RGBA */ + void *blend_keep_color; /**< blend state with writemask of 0 */ + + /* Depth stencil alpha state. */ + void *dsa_write_depth_stencil[0xff]; /**< indices are stencil clear values */ + void *dsa_write_depth_keep_stencil; + void *dsa_keep_depth_stencil; + + /* Other state. 
*/ + void *sampler_state[16]; /**< sampler state for clamping to a miplevel */ + void *rs_state; /**< rasterizer state */ +}; + +struct blitter_context *util_blitter_create(struct pipe_context *pipe) +{ + struct blitter_context_priv *ctx; + struct pipe_blend_state blend; + struct pipe_depth_stencil_alpha_state dsa; + struct pipe_rasterizer_state rs_state; + struct pipe_sampler_state sampler_state; + unsigned i, max_render_targets; + + ctx = CALLOC_STRUCT(blitter_context_priv); + if (!ctx) + return NULL; + + ctx->pipe = pipe; + + /* init state objects for them to be considered invalid */ + ctx->blitter.saved_fb_state.nr_cbufs = ~0; + ctx->blitter.saved_num_textures = ~0; + ctx->blitter.saved_num_sampler_states = ~0; + + /* blend state objects */ + memset(&blend, 0, sizeof(blend)); + ctx->blend_keep_color = pipe->create_blend_state(pipe, &blend); + + blend.colormask = PIPE_MASK_RGBA; + ctx->blend_write_color = pipe->create_blend_state(pipe, &blend); + + /* depth stencil alpha state objects */ + memset(&dsa, 0, sizeof(dsa)); + ctx->dsa_keep_depth_stencil = + pipe->create_depth_stencil_alpha_state(pipe, &dsa); + + dsa.depth.enabled = 1; + dsa.depth.writemask = 1; + dsa.depth.func = PIPE_FUNC_ALWAYS; + ctx->dsa_write_depth_keep_stencil = + pipe->create_depth_stencil_alpha_state(pipe, &dsa); + + dsa.stencil[0].enabled = 1; + dsa.stencil[0].func = PIPE_FUNC_ALWAYS; + dsa.stencil[0].fail_op = PIPE_STENCIL_OP_REPLACE; + dsa.stencil[0].zpass_op = PIPE_STENCIL_OP_REPLACE; + dsa.stencil[0].zfail_op = PIPE_STENCIL_OP_REPLACE; + dsa.stencil[0].valuemask = 0xff; + dsa.stencil[0].writemask = 0xff; + + /* create a depth stencil alpha state for each possible stencil clear + * value */ + for (i = 0; i < 0xff; i++) { + dsa.stencil[0].ref_value = i; + + ctx->dsa_write_depth_stencil[i] = + pipe->create_depth_stencil_alpha_state(pipe, &dsa); + } + + /* sampler state */ + memset(&sampler_state, 0, sizeof(sampler_state)); + sampler_state.wrap_s = PIPE_TEX_WRAP_CLAMP_TO_EDGE; + 
sampler_state.wrap_t = PIPE_TEX_WRAP_CLAMP_TO_EDGE; + sampler_state.wrap_r = PIPE_TEX_WRAP_CLAMP_TO_EDGE; + + for (i = 0; i < 16; i++) { + sampler_state.lod_bias = i; + sampler_state.min_lod = i; + sampler_state.max_lod = i; + + ctx->sampler_state[i] = pipe->create_sampler_state(pipe, &sampler_state); + } + + /* rasterizer state */ + memset(&rs_state, 0, sizeof(rs_state)); + rs_state.front_winding = PIPE_WINDING_CW; + rs_state.cull_mode = PIPE_WINDING_NONE; + rs_state.bypass_vs_clip_and_viewport = 1; + rs_state.gl_rasterization_rules = 1; + ctx->rs_state = pipe->create_rasterizer_state(pipe, &rs_state); + + /* vertex shaders */ + { + const uint semantic_names[] = { TGSI_SEMANTIC_POSITION, + TGSI_SEMANTIC_COLOR }; + const uint semantic_indices[] = { 0, 0 }; + ctx->vs_col = + util_make_vertex_passthrough_shader(pipe, 2, semantic_names, + semantic_indices); + } + { + const uint semantic_names[] = { TGSI_SEMANTIC_POSITION, + TGSI_SEMANTIC_GENERIC }; + const uint semantic_indices[] = { 0, 0 }; + ctx->vs_tex = + util_make_vertex_passthrough_shader(pipe, 2, semantic_names, + semantic_indices); + } + + /* fragment shaders */ + ctx->fs_texfetch_col[PIPE_TEXTURE_1D] = + util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_1D); + ctx->fs_texfetch_col[PIPE_TEXTURE_2D] = + util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_2D); + ctx->fs_texfetch_col[PIPE_TEXTURE_3D] = + util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_3D); + ctx->fs_texfetch_col[PIPE_TEXTURE_CUBE] = + util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_CUBE); + + ctx->fs_texfetch_depth[PIPE_TEXTURE_1D] = + util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_1D); + ctx->fs_texfetch_depth[PIPE_TEXTURE_2D] = + util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_2D); + ctx->fs_texfetch_depth[PIPE_TEXTURE_3D] = + util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_3D); + ctx->fs_texfetch_depth[PIPE_TEXTURE_CUBE] = + util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_CUBE); + + 
max_render_targets = pipe->screen->get_param(pipe->screen, + PIPE_CAP_MAX_RENDER_TARGETS); + assert(max_render_targets <= 8); + for (i = 0; i < max_render_targets; i++) + ctx->fs_col[i] = util_make_fragment_clonecolor_shader(pipe, 1+i); + + /* set invariant vertex coordinates */ + for (i = 0; i < 4; i++) + ctx->vertices[i][0][3] = 1; /*v.w*/ + + /* create the vertex buffer */ + ctx->vbuf = pipe_buffer_create(ctx->pipe->screen, + 32, + PIPE_BUFFER_USAGE_VERTEX, + sizeof(ctx->vertices)); + + return &ctx->blitter; +} + +void util_blitter_destroy(struct blitter_context *blitter) +{ + struct blitter_context_priv *ctx = (struct blitter_context_priv*)blitter; + struct pipe_context *pipe = ctx->pipe; + int i; + + pipe->delete_blend_state(pipe, ctx->blend_write_color); + pipe->delete_blend_state(pipe, ctx->blend_keep_color); + pipe->delete_depth_stencil_alpha_state(pipe, ctx->dsa_keep_depth_stencil); + pipe->delete_depth_stencil_alpha_state(pipe, + ctx->dsa_write_depth_keep_stencil); + + for (i = 0; i < 0xff; i++) + pipe->delete_depth_stencil_alpha_state(pipe, + ctx->dsa_write_depth_stencil[i]); + + pipe->delete_rasterizer_state(pipe, ctx->rs_state); + pipe->delete_vs_state(pipe, ctx->vs_col); + pipe->delete_vs_state(pipe, ctx->vs_tex); + + for (i = 0; i < 4; i++) { + pipe->delete_fs_state(pipe, ctx->fs_texfetch_col[i]); + pipe->delete_fs_state(pipe, ctx->fs_texfetch_depth[i]); + } + for (i = 0; i < 8 && ctx->fs_col[i]; i++) + pipe->delete_fs_state(pipe, ctx->fs_col[i]); + + pipe_buffer_reference(&ctx->vbuf, NULL); + FREE(ctx); +} + +static void blitter_check_saved_CSOs(struct blitter_context_priv *ctx) +{ + /* make sure these CSOs have been saved */ + assert(ctx->blitter.saved_blend_state && + ctx->blitter.saved_dsa_state && + ctx->blitter.saved_rs_state && + ctx->blitter.saved_fs && + ctx->blitter.saved_vs); +} + +static void blitter_restore_CSOs(struct blitter_context_priv *ctx) +{ + struct pipe_context *pipe = ctx->pipe; + + /* restore the state objects which are always 
required to be saved */ + pipe->bind_blend_state(pipe, ctx->blitter.saved_blend_state); + pipe->bind_depth_stencil_alpha_state(pipe, ctx->blitter.saved_dsa_state); + pipe->bind_rasterizer_state(pipe, ctx->blitter.saved_rs_state); + pipe->bind_fs_state(pipe, ctx->blitter.saved_fs); + pipe->bind_vs_state(pipe, ctx->blitter.saved_vs); + + ctx->blitter.saved_blend_state = 0; + ctx->blitter.saved_dsa_state = 0; + ctx->blitter.saved_rs_state = 0; + ctx->blitter.saved_fs = 0; + ctx->blitter.saved_vs = 0; + + /* restore the state objects which are required to be saved before copy/fill + */ + if (ctx->blitter.saved_fb_state.nr_cbufs != ~0) { + pipe->set_framebuffer_state(pipe, &ctx->blitter.saved_fb_state); + ctx->blitter.saved_fb_state.nr_cbufs = ~0; + } + + if (ctx->blitter.saved_num_sampler_states != ~0) { + pipe->bind_fragment_sampler_states(pipe, + ctx->blitter.saved_num_sampler_states, + ctx->blitter.saved_sampler_states); + ctx->blitter.saved_num_sampler_states = ~0; + } + + if (ctx->blitter.saved_num_textures != ~0) { + pipe->set_fragment_sampler_textures(pipe, + ctx->blitter.saved_num_textures, + ctx->blitter.saved_textures); + ctx->blitter.saved_num_textures = ~0; + } +} + +static void blitter_set_rectangle(struct blitter_context_priv *ctx, + unsigned x1, unsigned y1, + unsigned x2, unsigned y2, + float depth) +{ + int i; + + /* set vertex positions */ + ctx->vertices[0][0][0] = x1; /*v0.x*/ + ctx->vertices[0][0][1] = y1; /*v0.y*/ + + ctx->vertices[1][0][0] = x2; /*v1.x*/ + ctx->vertices[1][0][1] = y1; /*v1.y*/ + + ctx->vertices[2][0][0] = x2; /*v2.x*/ + ctx->vertices[2][0][1] = y2; /*v2.y*/ + + ctx->vertices[3][0][0] = x1; /*v3.x*/ + ctx->vertices[3][0][1] = y2; /*v3.y*/ + + for (i = 0; i < 4; i++) + ctx->vertices[i][0][2] = depth; /*z*/ +} + +static void blitter_set_clear_color(struct blitter_context_priv *ctx, + const float *rgba) +{ + int i; + + for (i = 0; i < 4; i++) { + ctx->vertices[i][1][0] = rgba[0]; + ctx->vertices[i][1][1] = rgba[1]; + 
ctx->vertices[i][1][2] = rgba[2]; + ctx->vertices[i][1][3] = rgba[3]; + } +} + +static void blitter_set_texcoords_2d(struct blitter_context_priv *ctx, + struct pipe_surface *surf, + unsigned x1, unsigned y1, + unsigned x2, unsigned y2) +{ + int i; + float s1 = x1 / (float)surf->width; + float t1 = y1 / (float)surf->height; + float s2 = x2 / (float)surf->width; + float t2 = y2 / (float)surf->height; + + ctx->vertices[0][1][0] = s1; /*t0.s*/ + ctx->vertices[0][1][1] = t1; /*t0.t*/ + + ctx->vertices[1][1][0] = s2; /*t1.s*/ + ctx->vertices[1][1][1] = t1; /*t1.t*/ + + ctx->vertices[2][1][0] = s2; /*t2.s*/ + ctx->vertices[2][1][1] = t2; /*t2.t*/ + + ctx->vertices[3][1][0] = s1; /*t3.s*/ + ctx->vertices[3][1][1] = t2; /*t3.t*/ + + for (i = 0; i < 4; i++) { + ctx->vertices[i][1][2] = 0; /*r*/ + ctx->vertices[i][1][3] = 1; /*q*/ + } +} + +static void blitter_set_texcoords_3d(struct blitter_context_priv *ctx, + struct pipe_surface *surf, + unsigned x1, unsigned y1, + unsigned x2, unsigned y2) +{ + int i; + float depth = u_minify(surf->texture->depth0, surf->level); + float r = surf->zslice / depth; + + blitter_set_texcoords_2d(ctx, surf, x1, y1, x2, y2); + + for (i = 0; i < 4; i++) + ctx->vertices[i][1][2] = r; /*r*/ +} + +static void blitter_set_texcoords_cube(struct blitter_context_priv *ctx, + struct pipe_surface *surf, + unsigned x1, unsigned y1, + unsigned x2, unsigned y2) +{ + int i; + float s1 = x1 / (float)surf->width; + float t1 = y1 / (float)surf->height; + float s2 = x2 / (float)surf->width; + float t2 = y2 / (float)surf->height; + const float st[4][2] = { + {s1, t1}, {s2, t1}, {s2, t2}, {s1, t2} + }; + + util_map_texcoords2d_onto_cubemap(surf->face, + /* pointer, stride in floats */ + &st[0][0], 2, + &ctx->vertices[0][1][0], 8); + + for (i = 0; i < 4; i++) + ctx->vertices[i][1][3] = 1; /*q*/ +} + +static void blitter_draw_quad(struct blitter_context_priv *ctx) +{ + struct blitter_context *blitter = &ctx->blitter; + struct pipe_context *pipe = ctx->pipe; + + if 
(blitter->draw_quad) { + blitter->draw_quad(pipe, &ctx->vertices[0][0][0]); + } else { + /* write vertices and draw them */ + pipe_buffer_write(pipe->screen, ctx->vbuf, + 0, sizeof(ctx->vertices), ctx->vertices); + + util_draw_vertex_buffer(ctx->pipe, ctx->vbuf, 0, PIPE_PRIM_TRIANGLE_FAN, + 4, /* verts */ + 2); /* attribs/vert */ + } +} + +void util_blitter_clear(struct blitter_context *blitter, + unsigned width, unsigned height, + unsigned num_cbufs, + unsigned clear_buffers, + const float *rgba, + double depth, unsigned stencil) +{ + struct blitter_context_priv *ctx = (struct blitter_context_priv*)blitter; + struct pipe_context *pipe = ctx->pipe; + + assert(num_cbufs <= 8); + + blitter_check_saved_CSOs(ctx); + + /* bind CSOs */ + if (clear_buffers & PIPE_CLEAR_COLOR) + pipe->bind_blend_state(pipe, ctx->blend_write_color); + else + pipe->bind_blend_state(pipe, ctx->blend_keep_color); + + if (clear_buffers & PIPE_CLEAR_DEPTHSTENCIL) + pipe->bind_depth_stencil_alpha_state(pipe, + ctx->dsa_write_depth_stencil[stencil&0xff]); + else + pipe->bind_depth_stencil_alpha_state(pipe, ctx->dsa_keep_depth_stencil); + + pipe->bind_rasterizer_state(pipe, ctx->rs_state); + pipe->bind_fs_state(pipe, ctx->fs_col[num_cbufs ? 
num_cbufs-1 : 0]); + pipe->bind_vs_state(pipe, ctx->vs_col); + + blitter_set_clear_color(ctx, rgba); + blitter_set_rectangle(ctx, 0, 0, width, height, depth); + blitter_draw_quad(ctx); + blitter_restore_CSOs(ctx); +} + +void util_blitter_copy(struct blitter_context *blitter, + struct pipe_surface *dst, + unsigned dstx, unsigned dsty, + struct pipe_surface *src, + unsigned srcx, unsigned srcy, + unsigned width, unsigned height, + boolean ignore_stencil) +{ + struct blitter_context_priv *ctx = (struct blitter_context_priv*)blitter; + struct pipe_context *pipe = ctx->pipe; + struct pipe_screen *screen = pipe->screen; + struct pipe_framebuffer_state fb_state; + boolean is_stencil, is_depth; + unsigned dst_tex_usage; + + /* give up if textures are not set */ + assert(dst->texture && src->texture); + if (!dst->texture || !src->texture) + return; + + is_depth = pf_get_component_bits(src->format, PIPE_FORMAT_COMP_Z) != 0; + is_stencil = pf_get_component_bits(src->format, PIPE_FORMAT_COMP_S) != 0; + dst_tex_usage = is_depth || is_stencil ? 
PIPE_TEXTURE_USAGE_DEPTH_STENCIL : + PIPE_TEXTURE_USAGE_RENDER_TARGET; + + /* check if we can sample from and render to the surfaces */ + /* (assuming copying a stencil buffer is not possible) */ + if ((!ignore_stencil && is_stencil) || + !screen->is_format_supported(screen, dst->format, dst->texture->target, + dst_tex_usage, 0) || + !screen->is_format_supported(screen, src->format, src->texture->target, + PIPE_TEXTURE_USAGE_SAMPLER, 0)) { + util_surface_copy(pipe, FALSE, dst, dstx, dsty, src, srcx, srcy, + width, height); + return; + } + + /* check whether the states are properly saved */ + blitter_check_saved_CSOs(ctx); + assert(blitter->saved_fb_state.nr_cbufs != ~0); + assert(blitter->saved_num_textures != ~0); + assert(blitter->saved_num_sampler_states != ~0); + assert(src->texture->target < 4); + + /* bind CSOs */ + fb_state.width = dst->width; + fb_state.height = dst->height; + + if (is_depth) { + pipe->bind_blend_state(pipe, ctx->blend_keep_color); + pipe->bind_depth_stencil_alpha_state(pipe, + ctx->dsa_write_depth_keep_stencil); + pipe->bind_fs_state(pipe, ctx->fs_texfetch_depth[src->texture->target]); + + fb_state.nr_cbufs = 0; + fb_state.zsbuf = dst; + } else { + pipe->bind_blend_state(pipe, ctx->blend_write_color); + pipe->bind_depth_stencil_alpha_state(pipe, ctx->dsa_keep_depth_stencil); + pipe->bind_fs_state(pipe, ctx->fs_texfetch_col[src->texture->target]); + + fb_state.nr_cbufs = 1; + fb_state.cbufs[0] = dst; + fb_state.zsbuf = 0; + } + pipe->bind_rasterizer_state(pipe, ctx->rs_state); + pipe->bind_vs_state(pipe, ctx->vs_tex); + pipe->bind_fragment_sampler_states(pipe, 1, &ctx->sampler_state[src->level]); + pipe->set_fragment_sampler_textures(pipe, 1, &src->texture); + pipe->set_framebuffer_state(pipe, &fb_state); + + /* set texture coordinates */ + switch (src->texture->target) { + case PIPE_TEXTURE_1D: + case PIPE_TEXTURE_2D: + blitter_set_texcoords_2d(ctx, src, srcx, srcy, + srcx+width, srcy+height); + break; + case PIPE_TEXTURE_3D: + 
blitter_set_texcoords_3d(ctx, src, srcx, srcy, + srcx+width, srcy+height); + break; + case PIPE_TEXTURE_CUBE: + blitter_set_texcoords_cube(ctx, src, srcx, srcy, + srcx+width, srcy+height); + break; + } + + blitter_set_rectangle(ctx, dstx, dsty, dstx+width, dsty+height, 0); + blitter_draw_quad(ctx); + blitter_restore_CSOs(ctx); +} + +void util_blitter_fill(struct blitter_context *blitter, + struct pipe_surface *dst, + unsigned dstx, unsigned dsty, + unsigned width, unsigned height, + unsigned value) +{ + struct blitter_context_priv *ctx = (struct blitter_context_priv*)blitter; + struct pipe_context *pipe = ctx->pipe; + struct pipe_screen *screen = pipe->screen; + struct pipe_framebuffer_state fb_state; + float rgba[4]; + ubyte ub_rgba[4] = {0}; + union util_color color; + int i; + + assert(dst->texture); + if (!dst->texture) + return; + + /* check if we can render to the surface */ + if (pf_is_depth_or_stencil(dst->format) || /* unlikely, but you never know */ + !screen->is_format_supported(screen, dst->format, dst->texture->target, + PIPE_TEXTURE_USAGE_RENDER_TARGET, 0)) { + util_surface_fill(pipe, dst, dstx, dsty, width, height, value); + return; + } + + /* unpack the color */ + color.ui = value; + util_unpack_color_ub(dst->format, &color, + ub_rgba, ub_rgba+1, ub_rgba+2, ub_rgba+3); + for (i = 0; i < 4; i++) + rgba[i] = ubyte_to_float(ub_rgba[i]); + + /* check the saved state */ + blitter_check_saved_CSOs(ctx); + assert(blitter->saved_fb_state.nr_cbufs != ~0); + + /* bind CSOs */ + pipe->bind_blend_state(pipe, ctx->blend_write_color); + pipe->bind_depth_stencil_alpha_state(pipe, ctx->dsa_keep_depth_stencil); + pipe->bind_rasterizer_state(pipe, ctx->rs_state); + pipe->bind_fs_state(pipe, ctx->fs_col[0]); + pipe->bind_vs_state(pipe, ctx->vs_col); + + /* set a framebuffer state */ + fb_state.width = dst->width; + fb_state.height = dst->height; + fb_state.nr_cbufs = 1; + fb_state.cbufs[0] = dst; + fb_state.zsbuf = 0; + pipe->set_framebuffer_state(pipe, &fb_state); + 
+
+   blitter_set_clear_color(ctx, rgba);
+   blitter_set_rectangle(ctx, dstx, dsty, dstx+width, dsty+height, 0);
+   blitter_draw_quad(ctx);
+   blitter_restore_CSOs(ctx);
+}
diff --git a/src/gallium/auxiliary/util/u_blitter.h b/src/gallium/auxiliary/util/u_blitter.h
new file mode 100644
index 0000000..e4cbb5c
--- /dev/null
+++ b/src/gallium/auxiliary/util/u_blitter.h
@@ -0,0 +1,244 @@
+/**************************************************************************
+ *
+ * Copyright 2009 Marek Olšák <maraeo@...>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
+ * IN NO EVENT SHALL TUNGSTEN GRAPHICS AND/OR ITS SUPPLIERS BE LIABLE FOR
+ * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ **************************************************************************/
+
+#ifndef U_BLITTER_H
+#define U_BLITTER_H
+
+#include "util/u_memory.h"
+
+#include "pipe/p_state.h"
+
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+struct pipe_context;
+
+struct blitter_context
+{
+   /**
+    * Draw a quad.
+ * + * The pipe driver can set this to provide a more efficient way of drawing + * a quad. If it's NULL, the quad is drawn using a vertex buffer. + * + * There are always 4 vertices with interleaved vertex elements of type + * RGBA32F. See the vertex shader _output_ semantics to know what those are. + * The primitive type is always PIPE_PRIM_TRIANGLE_FAN and VS/clip/viewport + * is bypasssed. + */ + void (*draw_quad)(struct pipe_context *pipe, + const float *vertices); + + /* Private members, really. */ + void *saved_blend_state; /**< blend state */ + void *saved_dsa_state; /**< depth stencil alpha state */ + void *saved_rs_state; /**< rasterizer state */ + void *saved_fs, *saved_vs; /**< fragment shader, vertex shader */ + + struct pipe_framebuffer_state saved_fb_state; /**< framebuffer state */ + + int saved_num_sampler_states; + void *saved_sampler_states[32]; + + int saved_num_textures; + struct pipe_texture *saved_textures[32]; /* is 32 enough? */ +}; + +/** + * Create a blitter context. + */ +struct blitter_context *util_blitter_create(struct pipe_context *pipe); + +/** + * Destroy a blitter context. + */ +void util_blitter_destroy(struct blitter_context *blitter); + +/* + * These CSOs must be saved before any of the following functions is called: + * - blend state + * - depth stencil alpha state + * - rasterizer state + * - vertex shader + * - fragment shader + */ + +/** + * Clear a specified set of currently bound buffers to specified values. + */ +void util_blitter_clear(struct blitter_context *blitter, + unsigned width, unsigned height, + unsigned num_cbufs, + unsigned clear_buffers, + const float *rgba, + double depth, unsigned stencil); + +/** + * Copy a block of pixels from one surface to another. + * + * You can copy from any color format to any other color format provided + * the former can be sampled and the latter can be rendered to. Otherwise, + * a software fallback path is taken and both surfaces must be of the same + * format. 
+ * + * The same holds for depth-stencil formats with the exception that stencil + * cannot be copied unless you set ignore_stencil to FALSE. In that case, + * a software fallback path is taken and both surfaces must be of the same + * format. + * + * Use pipe_screen->is_format_supported to know your options. + * + * These states must be saved in the blitter in addition to the state objects + * already required to be saved: + * - framebuffer state + * - fragment sampler states + * - fragment sampler textures + */ +void util_blitter_copy(struct blitter_context *blitter, + struct pipe_surface *dst, + unsigned dstx, unsigned dsty, + struct pipe_surface *src, + unsigned srcx, unsigned srcy, + unsigned width, unsigned height, + boolean ignore_stencil); + +/** + * Fill a region of a surface with a constant value. + * + * If the surface cannot be rendered to or it's a depth-stencil format, + * a software fallback path is taken. + * + * These states must be saved in the blitter in addition to the state objects + * already required to be saved: + * - framebuffer state + */ +void util_blitter_fill(struct blitter_context *blitter, + struct pipe_surface *dst, + unsigned dstx, unsigned dsty, + unsigned width, unsigned height, + unsigned value); + +/** + * Copy all pixels from one surface to another. + * + * The rules are the same as in util_blitter_copy with the addition that + * surfaces must have the same size. + */ +static INLINE +void util_blitter_copy_surface(struct blitter_context *blitter, + struct pipe_surface *dst, + struct pipe_surface *src, + boolean ignore_stencil) +{ + assert(dst->width == src->width && dst->height == src->height); + + util_blitter_copy(blitter, dst, 0, 0, src, 0, 0, src->width, src->height, + ignore_stencil); +} + + +/* The functions below should be used to save currently bound constant state + * objects inside a driver. 
The objects are automatically restored at the end + * of the util_blitter_{clear, fill, copy, copy_surface} functions and then + * forgotten. + * + * CSOs not listed here are not affected by util_blitter. */ + +static INLINE +void util_blitter_save_blend(struct blitter_context *blitter, + void *state) +{ + blitter->saved_blend_state = state; +} + +static INLINE +void util_blitter_save_depth_stencil_alpha(struct blitter_context *blitter, + void *state) +{ + blitter->saved_dsa_state = state; +} + +static INLINE +void util_blitter_save_rasterizer(struct blitter_context *blitter, + void *state) +{ + blitter->saved_rs_state = state; +} + +static INLINE +void util_blitter_save_fragment_shader(struct blitter_context *blitter, + void *fs) +{ + blitter->saved_fs = fs; +} + +static INLINE +void util_blitter_save_vertex_shader(struct blitter_context *blitter, + void *vs) +{ + blitter->saved_vs = vs; +} + +static INLINE +void util_blitter_save_framebuffer(struct blitter_context *blitter, + struct pipe_framebuffer_state *state) +{ + blitter->saved_fb_state = *state; +} + +static INLINE +void util_blitter_save_fragment_sampler_states( + struct blitter_context *blitter, + int num_sampler_states, + void **sampler_states) +{ + assert(num_sampler_states <= Elements(blitter->saved_sampler_states)); + + blitter->saved_num_sampler_states = num_sampler_states; + memcpy(blitter->saved_sampler_states, sampler_states, + num_sampler_states * sizeof(void *)); +} + +static INLINE +void util_blitter_save_fragment_sampler_textures( + struct blitter_context *blitter, + int num_textures, + struct pipe_texture **textures) +{ + assert(num_textures <= Elements(blitter->saved_textures)); + + blitter->saved_num_textures = num_textures; + memcpy(blitter->saved_textures, textures, + num_textures * sizeof(struct pipe_texture *)); +} + +#ifdef __cplusplus +} +#endif + +#endif -- 1.6.3.3 [0004-pipe-add-PIPE_MAX_TEXTURE_TYPES.patch] From b781b83f0d119b0c3dc6a4ce3f7e31a7084219be Mon Sep 17 00:00:00 2001 
From: =?utf-8?q?Marek=20Ol=C5=A1=C3=A1k?= <maraeo@...> Date: Mon, 14 Dec 2009 19:05:15 +0100 Subject: [PATCH 4/7] pipe: add PIPE_MAX_TEXTURE_TYPES --- src/gallium/include/pipe/p_defines.h | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h index 69a0970..fe1390d 100644 --- a/src/gallium/include/pipe/p_defines.h +++ b/src/gallium/include/pipe/p_defines.h @@ -140,7 +140,8 @@ enum pipe_texture_target { PIPE_TEXTURE_1D = 0, PIPE_TEXTURE_2D = 1, PIPE_TEXTURE_3D = 2, - PIPE_TEXTURE_CUBE = 3 + PIPE_TEXTURE_CUBE = 3, + PIPE_MAX_TEXTURE_TYPES }; #define PIPE_TEX_FACE_POS_X 0 -- 1.6.3.3 [0005-util-blitter-use-PIPE_MAX_-limits-and-fix-a-memory-l.patch] From 2f8bfcbe1223e29efa188b7bdd0d87fc64b749a8 Mon Sep 17 00:00:00 2001 From: =?utf-8?q?Marek=20Ol=C5=A1=C3=A1k?= <maraeo@...> Date: Mon, 14 Dec 2009 19:14:49 +0100 Subject: [PATCH 5/7] util/blitter: use PIPE_MAX_* limits, and fix a memory leak --- src/gallium/auxiliary/util/u_blitter.c | 40 +++++++++++++++++++++---------- 1 files changed, 27 insertions(+), 13 deletions(-) diff --git a/src/gallium/auxiliary/util/u_blitter.c b/src/gallium/auxiliary/util/u_blitter.c index e51a5df..f8f9e4a 100644 --- a/src/gallium/auxiliary/util/u_blitter.c +++ b/src/gallium/auxiliary/util/u_blitter.c @@ -62,10 +62,16 @@ struct blitter_context_priv void *vs_tex; /**<Vertex shader which passes {pos, texcoord} to the output.*/ /* Fragment shaders. */ - void *fs_col[8]; /**< FS which outputs colors to 1-8 color buffers */ - void *fs_texfetch_col[4]; /**< FS which outputs a color from a texture */ - void *fs_texfetch_depth[4]; /**< FS which outputs a depth from a texture, - where the index is PIPE_TEXTURE_* to be sampled */ + /* FS which outputs a color to multiple color buffers. */ + void *fs_col[PIPE_MAX_COLOR_BUFS]; + + /* FS which outputs a color from a texture, + where the index is PIPE_TEXTURE_* to be sampled. 
*/ + void *fs_texfetch_col[PIPE_MAX_TEXTURE_TYPES]; + + /* FS which outputs a depth from a texture, + where the index is PIPE_TEXTURE_* to be sampled. */ + void *fs_texfetch_depth[PIPE_MAX_TEXTURE_TYPES]; /* Blend state. */ void *blend_write_color; /**< blend state with writemask of RGBA */ @@ -76,9 +82,11 @@ struct blitter_context_priv void *dsa_write_depth_keep_stencil; void *dsa_keep_depth_stencil; - /* Other state. */ - void *sampler_state[16]; /**< sampler state for clamping to a miplevel */ - void *rs_state; /**< rasterizer state */ + /* Sampler state for clamping to a miplevel. */ + void *sampler_state[PIPE_MAX_TEXTURE_LEVELS]; + + /* Rasterizer state. */ + void *rs_state; }; struct blitter_context *util_blitter_create(struct pipe_context *pipe) @@ -142,7 +150,7 @@ struct blitter_context *util_blitter_create(struct pipe_context *pipe) sampler_state.wrap_t = PIPE_TEX_WRAP_CLAMP_TO_EDGE; sampler_state.wrap_r = PIPE_TEX_WRAP_CLAMP_TO_EDGE; - for (i = 0; i < 16; i++) { + for (i = 0; i < PIPE_MAX_TEXTURE_LEVELS; i++) { sampler_state.lod_bias = i; sampler_state.min_lod = i; sampler_state.max_lod = i; @@ -197,7 +205,7 @@ struct blitter_context *util_blitter_create(struct pipe_context *pipe) max_render_targets = pipe->screen->get_param(pipe->screen, PIPE_CAP_MAX_RENDER_TARGETS); - assert(max_render_targets <= 8); + assert(max_render_targets <= PIPE_MAX_COLOR_BUFS); for (i = 0; i < max_render_targets; i++) ctx->fs_col[i] = util_make_fragment_clonecolor_shader(pipe, 1+i); @@ -234,13 +242,17 @@ void util_blitter_destroy(struct blitter_context *blitter) pipe->delete_vs_state(pipe, ctx->vs_col); pipe->delete_vs_state(pipe, ctx->vs_tex); - for (i = 0; i < 4; i++) { + for (i = 0; i < PIPE_MAX_TEXTURE_TYPES; i++) { pipe->delete_fs_state(pipe, ctx->fs_texfetch_col[i]); pipe->delete_fs_state(pipe, ctx->fs_texfetch_depth[i]); } - for (i = 0; i < 8 && ctx->fs_col[i]; i++) + + for (i = 0; i < PIPE_MAX_COLOR_BUFS && ctx->fs_col[i]; i++) pipe->delete_fs_state(pipe, 
ctx->fs_col[i]); + for (i = 0; i < PIPE_MAX_TEXTURE_LEVELS; i++) + pipe->delete_sampler_state(pipe, ctx->sampler_state[i]); + pipe_buffer_reference(&ctx->vbuf, NULL); FREE(ctx); } @@ -426,7 +438,7 @@ void util_blitter_clear(struct blitter_context *blitter, struct blitter_context_priv *ctx = (struct blitter_context_priv*)blitter; struct pipe_context *pipe = ctx->pipe; - assert(num_cbufs <= 8); + assert(num_cbufs <= PIPE_MAX_COLOR_BUFS); blitter_check_saved_CSOs(ctx); @@ -494,7 +506,7 @@ void util_blitter_copy(struct blitter_context *blitter, assert(blitter->saved_fb_state.nr_cbufs != ~0); assert(blitter->saved_num_textures != ~0); assert(blitter->saved_num_sampler_states != ~0); - assert(src->texture->target < 4); + assert(src->texture->target < PIPE_MAX_TEXTURE_TYPES); /* bind CSOs */ fb_state.width = dst->width; @@ -538,6 +550,8 @@ void util_blitter_copy(struct blitter_context *blitter, blitter_set_texcoords_cube(ctx, src, srcx, srcy, srcx+width, srcy+height); break; + default: + assert(0); } blitter_set_rectangle(ctx, dstx, dsty, dstx+width, dsty+height, 0); -- 1.6.3.3 [0006-util-blitter-allocate-most-of-the-state-objects-on-d.patch] From 61d103c43b7e26bc406f159aa572e468366abcae Mon Sep 17 00:00:00 2001 From: =?utf-8?q?Marek=20Ol=C5=A1=C3=A1k?= <maraeo@...> Date: Tue, 15 Dec 2009 00:26:10 +0100 Subject: [PATCH 6/7] util/blitter: allocate most of the state objects on-demand --- src/gallium/auxiliary/util/u_blitter.c | 254 ++++++++++++++++++++++---------- 1 files changed, 179 insertions(+), 75 deletions(-) diff --git a/src/gallium/auxiliary/util/u_blitter.c b/src/gallium/auxiliary/util/u_blitter.c index f8f9e4a..42efa86 100644 --- a/src/gallium/auxiliary/util/u_blitter.c +++ b/src/gallium/auxiliary/util/u_blitter.c @@ -56,6 +56,10 @@ struct blitter_context_priv float vertices[4][2][4]; /**< {pos, color} or {pos, texcoord} */ + /* Templates for various state objects. 
*/ + struct pipe_depth_stencil_alpha_state template_dsa; + struct pipe_sampler_state template_sampler_state; + /* Constant state objects. */ /* Vertex shaders. */ void *vs_col; /**< Vertex shader which passes {pos, color} to the output */ @@ -93,10 +97,10 @@ struct blitter_context *util_blitter_create(struct pipe_context *pipe) { struct blitter_context_priv *ctx; struct pipe_blend_state blend; - struct pipe_depth_stencil_alpha_state dsa; + struct pipe_depth_stencil_alpha_state *dsa; struct pipe_rasterizer_state rs_state; - struct pipe_sampler_state sampler_state; - unsigned i, max_render_targets; + struct pipe_sampler_state *sampler_state; + unsigned i; ctx = CALLOC_STRUCT(blitter_context_priv); if (!ctx) @@ -117,46 +121,33 @@ struct blitter_context *util_blitter_create(struct pipe_context *pipe) ctx->blend_write_color = pipe->create_blend_state(pipe, &blend); /* depth stencil alpha state objects */ - memset(&dsa, 0, sizeof(dsa)); + dsa = &ctx->template_dsa; ctx->dsa_keep_depth_stencil = - pipe->create_depth_stencil_alpha_state(pipe, &dsa); + pipe->create_depth_stencil_alpha_state(pipe, dsa); - dsa.depth.enabled = 1; - dsa.depth.writemask = 1; - dsa.depth.func = PIPE_FUNC_ALWAYS; + dsa->depth.enabled = 1; + dsa->depth.writemask = 1; + dsa->depth.func = PIPE_FUNC_ALWAYS; ctx->dsa_write_depth_keep_stencil = - pipe->create_depth_stencil_alpha_state(pipe, &dsa); - - dsa.stencil[0].enabled = 1; - dsa.stencil[0].func = PIPE_FUNC_ALWAYS; - dsa.stencil[0].fail_op = PIPE_STENCIL_OP_REPLACE; - dsa.stencil[0].zpass_op = PIPE_STENCIL_OP_REPLACE; - dsa.stencil[0].zfail_op = PIPE_STENCIL_OP_REPLACE; - dsa.stencil[0].valuemask = 0xff; - dsa.stencil[0].writemask = 0xff; - - /* create a depth stencil alpha state for each possible stencil clear - * value */ - for (i = 0; i < 0xff; i++) { - dsa.stencil[0].ref_value = i; - - ctx->dsa_write_depth_stencil[i] = - pipe->create_depth_stencil_alpha_state(pipe, &dsa); - } + pipe->create_depth_stencil_alpha_state(pipe, dsa); + + 
dsa->stencil[0].enabled = 1; + dsa->stencil[0].func = PIPE_FUNC_ALWAYS; + dsa->stencil[0].fail_op = PIPE_STENCIL_OP_REPLACE; + dsa->stencil[0].zpass_op = PIPE_STENCIL_OP_REPLACE; + dsa->stencil[0].zfail_op = PIPE_STENCIL_OP_REPLACE; + dsa->stencil[0].valuemask = 0xff; + dsa->stencil[0].writemask = 0xff; + /* The DSA state objects which write depth and stencil are created + * on-demand. */ /* sampler state */ - memset(&sampler_state, 0, sizeof(sampler_state)); - sampler_state.wrap_s = PIPE_TEX_WRAP_CLAMP_TO_EDGE; - sampler_state.wrap_t = PIPE_TEX_WRAP_CLAMP_TO_EDGE; - sampler_state.wrap_r = PIPE_TEX_WRAP_CLAMP_TO_EDGE; - - for (i = 0; i < PIPE_MAX_TEXTURE_LEVELS; i++) { - sampler_state.lod_bias = i; - sampler_state.min_lod = i; - sampler_state.max_lod = i; - - ctx->sampler_state[i] = pipe->create_sampler_state(pipe, &sampler_state); - } + sampler_state = &ctx->template_sampler_state; + sampler_state->wrap_s = PIPE_TEX_WRAP_CLAMP_TO_EDGE; + sampler_state->wrap_t = PIPE_TEX_WRAP_CLAMP_TO_EDGE; + sampler_state->wrap_r = PIPE_TEX_WRAP_CLAMP_TO_EDGE; + /* The sampler state objects which sample from a specified mipmap level + * are created on-demand. 
*/ /* rasterizer state */ memset(&rs_state, 0, sizeof(rs_state)); @@ -166,6 +157,8 @@ struct blitter_context *util_blitter_create(struct pipe_context *pipe) rs_state.gl_rasterization_rules = 1; ctx->rs_state = pipe->create_rasterizer_state(pipe, &rs_state); + /* fragment shaders are created on-demand */ + /* vertex shaders */ { const uint semantic_names[] = { TGSI_SEMANTIC_POSITION, @@ -184,31 +177,6 @@ struct blitter_context *util_blitter_create(struct pipe_context *pipe) semantic_indices); } - /* fragment shaders */ - ctx->fs_texfetch_col[PIPE_TEXTURE_1D] = - util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_1D); - ctx->fs_texfetch_col[PIPE_TEXTURE_2D] = - util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_2D); - ctx->fs_texfetch_col[PIPE_TEXTURE_3D] = - util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_3D); - ctx->fs_texfetch_col[PIPE_TEXTURE_CUBE] = - util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_CUBE); - - ctx->fs_texfetch_depth[PIPE_TEXTURE_1D] = - util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_1D); - ctx->fs_texfetch_depth[PIPE_TEXTURE_2D] = - util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_2D); - ctx->fs_texfetch_depth[PIPE_TEXTURE_3D] = - util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_3D); - ctx->fs_texfetch_depth[PIPE_TEXTURE_CUBE] = - util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_CUBE); - - max_render_targets = pipe->screen->get_param(pipe->screen, - PIPE_CAP_MAX_RENDER_TARGETS); - assert(max_render_targets <= PIPE_MAX_COLOR_BUFS); - for (i = 0; i < max_render_targets; i++) - ctx->fs_col[i] = util_make_fragment_clonecolor_shader(pipe, 1+i); - /* set invariant vertex coordinates */ for (i = 0; i < 4; i++) ctx->vertices[i][0][3] = 1; /*v.w*/ @@ -235,23 +203,28 @@ void util_blitter_destroy(struct blitter_context *blitter) ctx->dsa_write_depth_keep_stencil); for (i = 0; i < 0xff; i++) - pipe->delete_depth_stencil_alpha_state(pipe, - ctx->dsa_write_depth_stencil[i]); + if (ctx->dsa_write_depth_stencil[i]) + 
pipe->delete_depth_stencil_alpha_state(pipe, + ctx->dsa_write_depth_stencil[i]); pipe->delete_rasterizer_state(pipe, ctx->rs_state); pipe->delete_vs_state(pipe, ctx->vs_col); pipe->delete_vs_state(pipe, ctx->vs_tex); for (i = 0; i < PIPE_MAX_TEXTURE_TYPES; i++) { - pipe->delete_fs_state(pipe, ctx->fs_texfetch_col[i]); - pipe->delete_fs_state(pipe, ctx->fs_texfetch_depth[i]); + if (ctx->fs_texfetch_col[i]) + pipe->delete_fs_state(pipe, ctx->fs_texfetch_col[i]); + if (ctx->fs_texfetch_depth[i]) + pipe->delete_fs_state(pipe, ctx->fs_texfetch_depth[i]); } for (i = 0; i < PIPE_MAX_COLOR_BUFS && ctx->fs_col[i]; i++) - pipe->delete_fs_state(pipe, ctx->fs_col[i]); + if (ctx->fs_col[i]) + pipe->delete_fs_state(pipe, ctx->fs_col[i]); for (i = 0; i < PIPE_MAX_TEXTURE_LEVELS; i++) - pipe->delete_sampler_state(pipe, ctx->sampler_state[i]); + if (ctx->sampler_state[i]) + pipe->delete_sampler_state(pipe, ctx->sampler_state[i]); pipe_buffer_reference(&ctx->vbuf, NULL); FREE(ctx); @@ -428,6 +401,133 @@ static void blitter_draw_quad(struct blitter_context_priv *ctx) } } +static INLINE +void *blitter_get_state_write_depth_stencil( + struct blitter_context_priv *ctx, + unsigned stencil) +{ + struct pipe_context *pipe = ctx->pipe; + + stencil &= 0xff; + + /* Create the DSA state on-demand. */ + if (!ctx->dsa_write_depth_stencil[stencil]) { + ctx->template_dsa.stencil[0].ref_value = stencil; + + ctx->dsa_write_depth_stencil[stencil] = + pipe->create_depth_stencil_alpha_state(pipe, &ctx->template_dsa); + } + + return ctx->dsa_write_depth_stencil[stencil]; +} + +static INLINE +void **blitter_get_sampler_state(struct blitter_context_priv *ctx, + int miplevel) +{ + struct pipe_context *pipe = ctx->pipe; + struct pipe_sampler_state *sampler_state = &ctx->template_sampler_state; + + assert(miplevel < PIPE_MAX_TEXTURE_LEVELS); + + /* Create the sampler state on-demand. 
*/ + if (!ctx->sampler_state[miplevel]) { + sampler_state->lod_bias = miplevel; + sampler_state->min_lod = miplevel; + sampler_state->max_lod = miplevel; + + ctx->sampler_state[miplevel] = pipe->create_sampler_state(pipe, + sampler_state); + } + + /* Return void** so that it can be passed to bind_fragment_sampler_states + * directly. */ + return &ctx->sampler_state[miplevel]; +} + +static INLINE +void *blitter_get_fs_col(struct blitter_context_priv *ctx, unsigned num_cbufs) +{ + struct pipe_context *pipe = ctx->pipe; + unsigned index = num_cbufs ? num_cbufs - 1 : 0; + + assert(num_cbufs <= PIPE_MAX_COLOR_BUFS); + + if (!ctx->fs_col[index]) + ctx->fs_col[index] = + util_make_fragment_clonecolor_shader(pipe, num_cbufs); + + return ctx->fs_col[index]; +} + +static INLINE +void *blitter_get_fs_texfetch_col(struct blitter_context_priv *ctx, + unsigned tex_target) +{ + struct pipe_context *pipe = ctx->pipe; + + assert(tex_target < PIPE_MAX_TEXTURE_TYPES); + + /* Create the fragment shader on-demand. */ + if (!ctx->fs_texfetch_col[tex_target]) { + switch (tex_target) { + case PIPE_TEXTURE_1D: + ctx->fs_texfetch_col[PIPE_TEXTURE_1D] = + util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_1D); + break; + case PIPE_TEXTURE_2D: + ctx->fs_texfetch_col[PIPE_TEXTURE_2D] = + util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_2D); + break; + case PIPE_TEXTURE_3D: + ctx->fs_texfetch_col[PIPE_TEXTURE_3D] = + util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_3D); + break; + case PIPE_TEXTURE_CUBE: + ctx->fs_texfetch_col[PIPE_TEXTURE_CUBE] = + util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_CUBE); + break; + default:; + } + } + + return ctx->fs_texfetch_col[tex_target]; +} + +static INLINE +void *blitter_get_fs_texfetch_depth(struct blitter_context_priv *ctx, + unsigned tex_target) +{ + struct pipe_context *pipe = ctx->pipe; + + assert(tex_target < PIPE_MAX_TEXTURE_TYPES); + + /* Create the fragment shader on-demand. 
*/ + if (!ctx->fs_texfetch_depth[tex_target]) { + switch (tex_target) { + case PIPE_TEXTURE_1D: + ctx->fs_texfetch_depth[PIPE_TEXTURE_1D] = + util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_1D); + break; + case PIPE_TEXTURE_2D: + ctx->fs_texfetch_depth[PIPE_TEXTURE_2D] = + util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_2D); + break; + case PIPE_TEXTURE_3D: + ctx->fs_texfetch_depth[PIPE_TEXTURE_3D] = + util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_3D); + break; + case PIPE_TEXTURE_CUBE: + ctx->fs_texfetch_depth[PIPE_TEXTURE_CUBE] = + util_make_fragment_tex_shader_writedepth(pipe,TGSI_TEXTURE_CUBE); + break; + default:; + } + } + + return ctx->fs_texfetch_depth[tex_target]; +} + void util_blitter_clear(struct blitter_context *blitter, unsigned width, unsigned height, unsigned num_cbufs, @@ -450,12 +550,12 @@ void util_blitter_clear(struct blitter_context *blitter, if (clear_buffers & PIPE_CLEAR_DEPTHSTENCIL) pipe->bind_depth_stencil_alpha_state(pipe, - ctx->dsa_write_depth_stencil[stencil&0xff]); + blitter_get_state_write_depth_stencil(ctx, stencil)); else pipe->bind_depth_stencil_alpha_state(pipe, ctx->dsa_keep_depth_stencil); pipe->bind_rasterizer_state(pipe, ctx->rs_state); - pipe->bind_fs_state(pipe, ctx->fs_col[num_cbufs ? 
num_cbufs-1 : 0]); + pipe->bind_fs_state(pipe, blitter_get_fs_col(ctx, num_cbufs)); pipe->bind_vs_state(pipe, ctx->vs_col); blitter_set_clear_color(ctx, rgba); @@ -516,22 +616,26 @@ void util_blitter_copy(struct blitter_context *blitter, pipe->bind_blend_state(pipe, ctx->blend_keep_color); pipe->bind_depth_stencil_alpha_state(pipe, ctx->dsa_write_depth_keep_stencil); - pipe->bind_fs_state(pipe, ctx->fs_texfetch_depth[src->texture->target]); + pipe->bind_fs_state(pipe, + blitter_get_fs_texfetch_depth(ctx, src->texture->target)); fb_state.nr_cbufs = 0; fb_state.zsbuf = dst; } else { pipe->bind_blend_state(pipe, ctx->blend_write_color); pipe->bind_depth_stencil_alpha_state(pipe, ctx->dsa_keep_depth_stencil); - pipe->bind_fs_state(pipe, ctx->fs_texfetch_col[src->texture->target]); + pipe->bind_fs_state(pipe, + blitter_get_fs_texfetch_col(ctx, src->texture->target)); fb_state.nr_cbufs = 1; fb_state.cbufs[0] = dst; fb_state.zsbuf = 0; } + pipe->bind_rasterizer_state(pipe, ctx->rs_state); pipe->bind_vs_state(pipe, ctx->vs_tex); - pipe->bind_fragment_sampler_states(pipe, 1, &ctx->sampler_state[src->level]); + pipe->bind_fragment_sampler_states(pipe, 1, + blitter_get_sampler_state(ctx, src->level)); pipe->set_fragment_sampler_textures(pipe, 1, &src->texture); pipe->set_framebuffer_state(pipe, &fb_state); @@ -601,7 +705,7 @@ void util_blitter_fill(struct blitter_context *blitter, pipe->bind_blend_state(pipe, ctx->blend_write_color); pipe->bind_depth_stencil_alpha_state(pipe, ctx->dsa_keep_depth_stencil); pipe->bind_rasterizer_state(pipe, ctx->rs_state); - pipe->bind_fs_state(pipe, ctx->fs_col[0]); + pipe->bind_fs_state(pipe, blitter_get_fs_col(ctx, 1)); pipe->bind_vs_state(pipe, ctx->vs_col); /* set a framebuffer state */ -- 1.6.3.3 [0007-util-blitter-kill-the-draw_quad-callback.patch] From 4e1a135d7cef207b7bbff1759031c338e91750b5 Mon Sep 17 00:00:00 2001 From: =?utf-8?q?Marek=20Ol=C5=A1=C3=A1k?= <maraeo@...> Date: Tue, 15 Dec 2009 01:11:22 +0100 Subject: [PATCH 7/7] 
util/blitter: kill the draw_quad callback --- src/gallium/auxiliary/util/u_blitter.c | 17 ++++++----------- src/gallium/auxiliary/util/u_blitter.h | 14 -------------- 2 files changed, 6 insertions(+), 25 deletions(-) diff --git a/src/gallium/auxiliary/util/u_blitter.c b/src/gallium/auxiliary/util/u_blitter.c index 42efa86..895af2c 100644 --- a/src/gallium/auxiliary/util/u_blitter.c +++ b/src/gallium/auxiliary/util/u_blitter.c @@ -385,20 +385,15 @@ static void blitter_set_texcoords_cube(struct blitter_context_priv *ctx, static void blitter_draw_quad(struct blitter_context_priv *ctx) { - struct blitter_context *blitter = &ctx->blitter; struct pipe_context *pipe = ctx->pipe; - if (blitter->draw_quad) { - blitter->draw_quad(pipe, &ctx->vertices[0][0][0]); - } else { - /* write vertices and draw them */ - pipe_buffer_write(pipe->screen, ctx->vbuf, - 0, sizeof(ctx->vertices), ctx->vertices); + /* write vertices and draw them */ + pipe_buffer_write(pipe->screen, ctx->vbuf, + 0, sizeof(ctx->vertices), ctx->vertices); - util_draw_vertex_buffer(ctx->pipe, ctx->vbuf, 0, PIPE_PRIM_TRIANGLE_FAN, - 4, /* verts */ - 2); /* attribs/vert */ - } + util_draw_vertex_buffer(pipe, ctx->vbuf, 0, PIPE_PRIM_TRIANGLE_FAN, + 4, /* verts */ + 2); /* attribs/vert */ } static INLINE diff --git a/src/gallium/auxiliary/util/u_blitter.h b/src/gallium/auxiliary/util/u_blitter.h index e4cbb5c..3da5a6c 100644 --- a/src/gallium/auxiliary/util/u_blitter.h +++ b/src/gallium/auxiliary/util/u_blitter.h @@ -40,20 +40,6 @@ struct pipe_context; struct blitter_context { - /** - * Draw a quad. - * - * The pipe driver can set this to provide a more efficient way of drawing - * a quad. If it's NULL, the quad is drawn using a vertex buffer. - * - * There are always 4 vertices with interleaved vertex elements of type - * RGBA32F. See the vertex shader _output_ semantics to know what those are. - * The primitive type is always PIPE_PRIM_TRIANGLE_FAN and VS/clip/viewport - * is bypasssed. 
- */ - void (*draw_quad)(struct pipe_context *pipe, - const float *vertices); - /* Private members, really. */ void *saved_blend_state; /**< blend state */ void *saved_dsa_state; /**< depth stencil alpha state */ -- 1.6.3.3 _______________________________________________ Mesa3d-dev mailing list Mesa3d-dev@... https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
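The on-demand allocation in patch 6/7 boils down to an array of state objects that starts out empty and is filled on first use. Below is a minimal standalone sketch of that pattern. It is not the Gallium code: `state_cache`, `sampler_obj`, `get_sampler`, `destroy_cache` and `MAX_LEVELS` are hypothetical stand-ins for the blitter context, a driver sampler state, `blitter_get_sampler_state`, `util_blitter_destroy` and `PIPE_MAX_TEXTURE_LEVELS`, and `malloc` stands in for `pipe->create_sampler_state`.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_LEVELS 16 /* hypothetical stand-in for PIPE_MAX_TEXTURE_LEVELS */

/* Hypothetical stand-in for a driver-created sampler state object. */
struct sampler_obj {
    int min_lod;
    int max_lod;
};

struct state_cache {
    struct sampler_obj *sampler[MAX_LEVELS]; /* NULL until first requested */
    int created;                             /* number of live objects */
};

/* Return the state object for a mip level, creating it on first use. */
static struct sampler_obj *get_sampler(struct state_cache *cache, int level)
{
    assert(level >= 0 && level < MAX_LEVELS);

    if (!cache->sampler[level]) {
        struct sampler_obj *s = malloc(sizeof(*s));
        if (!s)
            return NULL;

        /* Clamp sampling to exactly one mip level. */
        s->min_lod = level;
        s->max_lod = level;

        cache->sampler[level] = s;
        cache->created++;
    }
    return cache->sampler[level];
}

/* Free whatever was created. Every slot is checked, because on-demand
 * creation can leave gaps between populated entries. */
static void destroy_cache(struct state_cache *cache)
{
    int i;

    for (i = 0; i < MAX_LEVELS; i++) {
        if (cache->sampler[i]) {
            free(cache->sampler[i]);
            cache->sampler[i] = NULL;
            cache->created--;
        }
    }
}
```

Requesting level 3 twice returns the same pointer and creates only one object. The cost of creating a state object is paid only for the levels, stencil values or shader variants an application actually uses.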
Re: gallium: add blitter

On Mon, Dec 14, 2009 at 5:42 PM, Corbin Simpson <mostawesomedude@...> wrote:
> As far as immediate verts, why don't we just add support to r300g to switch
> to immediate mode for small VBOs?
>
> Posting from a mobile, pardon my terseness. ~ C.
>

Corbin, that seems reasonable, and it's the reason I killed the draw_quad function. BTW immediate mode doubles the performance in glxgears.

To others: I noticed that there is a weird optimization in u_gen_mipmap. It allocates a large vertex buffer and uses small chunks of it to render consecutive quads (one for each mipmap level and cubemap face). If we implement switching to immediate mode, it would be nice for VBOs to be as small as possible so that the driver can easily recognize the most efficient path. The simplest solution (4 vertices in a VBO) may end up being the fastest one here.

Marek
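A rough sketch of the "switch to immediate mode for small VBOs" idea, only to illustrate the size check a driver could make. None of this is r300g code: `choose_draw_path`, `quad_bytes`, `enum draw_path` and the `IMMEDIATE_MAX_BYTES` cutoff are hypothetical, and the thread does not say where the real break-even point lies. The one figure taken from the patches is the blitter quad layout, 4 vertices with 2 attributes of 4 floats each.

```c
#include <stddef.h>

enum draw_path {
    DRAW_IMMEDIATE,    /* embed the vertices in the command stream */
    DRAW_VERTEX_BUFFER /* fetch the vertices from a VBO */
};

/* Hypothetical cutoff; a real driver would have to measure this. */
#define IMMEDIATE_MAX_BYTES 256

/* Size of interleaved vertex data where every attribute is 4 floats. */
static size_t quad_bytes(unsigned num_verts, unsigned attribs_per_vert)
{
    return (size_t)num_verts * attribs_per_vert * 4 * sizeof(float);
}

/* Pick the cheaper submission path based on the buffer size alone. */
static enum draw_path choose_draw_path(size_t vbo_bytes)
{
    return vbo_bytes <= IMMEDIATE_MAX_BYTES ? DRAW_IMMEDIATE
                                            : DRAW_VERTEX_BUFFER;
}
```

With 4-byte floats the blitter quad takes 128 bytes, so it would fall on the immediate path under this assumed cutoff. A large shared buffer like the one u_gen_mipmap allocates would not qualify if the driver looks only at the buffer size, which is the argument for keeping such VBOs small.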