gallium: add blitter



by Marek Olšák


Hi Keith,

I've finished the blitter module. It fully implements the clear,
surface_copy, and surface_fill functions. It properly falls back to
software if a surface cannot be sampled from or rendered to according
to its usage. Copying a stencil buffer always falls back unless the
ignore_stencil parameter (see util_blitter_copy) is set to TRUE. To my
knowledge, GPUs cannot copy the stencil buffer (I'm not sure whether
fiddling with texture formats can help). It's all documented in
u_blitter.h.
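
The fallback rule described above can be sketched in isolation. This is only a simplified model and not code from u_blitter: the struct, its fields, and the function name are hypothetical stand-ins for whatever the module actually checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified description of a surface (not a Gallium type). */
struct sketch_surface {
   bool can_sample;   /* usage allows sampling from it as a texture */
   bool can_render;   /* usage allows rendering to it */
   bool has_stencil;  /* format carries stencil bits */
};

/* Returns true if the copy could take the accelerated path,
 * false if it would have to fall back to software. */
static bool sketch_copy_uses_hw(const struct sketch_surface *dst,
                                const struct sketch_surface *src,
                                bool ignore_stencil)
{
   assert(dst != NULL && src != NULL);

   /* The source is sampled and the destination is rendered to. */
   if (!src->can_sample || !dst->can_render)
      return false;

   /* Stencil cannot be copied on the GPU unless the caller opts out. */
   if ((src->has_stencil || dst->has_stencil) && !ignore_stencil)
      return false;

   return true;
}
```

In this model a depth/stencil copy only takes the accelerated path when the caller passes ignore_stencil as true, which mirrors the behaviour described in the paragraph above.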

The pipe driver can optionally hook up a function to draw a quad
(blitter_context::draw_quad). I realized that embedding 4 vertices
into a command stream (AKA immediate mode) is much faster than writing
them to a vertex buffer due to reduced driver overhead. It might be
worth considering adding the draw_quad function to pipe_context.
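
A minimal sketch of such an optional hook, assuming a function pointer that may be left NULL. The struct, the hook signature, and the counters below are hypothetical and only illustrate the dispatch; the real signature of blitter_context::draw_quad is the one documented in u_blitter.h.

```c
#include <stddef.h>

/* Hypothetical stand-in for the blitter context (not the real struct). */
struct sketch_blitter {
   /* Optional driver hook; NULL means "use the vertex buffer path". */
   void (*draw_quad)(struct sketch_blitter *b, const float *verts);
   int immediate_draws;  /* quads emitted through the driver hook */
   int vbuf_draws;       /* quads emitted through the generic path */
};

/* Example driver hook: would embed the 4 vertices in the command stream. */
static void sketch_draw_quad_immediate(struct sketch_blitter *b,
                                       const float *verts)
{
   (void)verts;
   b->immediate_draws++;
}

/* Generic path: would write the vertices to a vertex buffer and draw. */
static void sketch_draw_quad_vbuf(struct sketch_blitter *b,
                                  const float *verts)
{
   (void)verts;
   b->vbuf_draws++;
}

/* Use the driver hook when present, otherwise the generic path. */
static void sketch_draw(struct sketch_blitter *b, const float *verts)
{
   if (b->draw_quad)
      b->draw_quad(b, verts);
   else
      sketch_draw_quad_vbuf(b, verts);
}
```

A driver that never sets the hook keeps working through the vertex buffer path, so the hook stays a pure optimization.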

When working on the blitter, I added the following things to
util/u_simple_shaders:
- util_make_fragment_tex_shader has a new parameter, tex_target, whose
value should be one of the TGSI_TEXTURE_* enums, so that the shader can
sample from any kind of texture.
- Added util_make_fragment_tex_shader_writedepth, which writes depth
sampled from a texture. It's used for copying depth textures.
- Added util_make_fragment_clonecolor_shader, which copies input
COLOR[0] to a specified number of render targets. It's used to clear
MRTs.
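
To make the last item concrete, here is a CPU-side model of what the clonecolor shader computes per fragment. It is only an illustration with hypothetical names; the actual shader is built from TGSI MOV instructions, as the patch below shows.

```c
#include <assert.h>

#define SKETCH_MAX_CBUFS 8  /* same upper bound the patch asserts on */

/* Copy one input color to the first num_cbufs render-target outputs. */
static void sketch_clone_color(const float color[4],
                               float out[][4],
                               int num_cbufs)
{
   int i, c;

   assert(num_cbufs >= 0 && num_cbufs <= SKETCH_MAX_CBUFS);

   for (i = 0; i < num_cbufs; i++)
      for (c = 0; c < 4; c++)
         out[i][c] = color[c];
}
```

With num_cbufs set to 1 this degenerates to a plain pass-through, which is how the patch reimplements util_make_fragment_passthrough_shader.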

Also, I moved the code for converting 2D texture coordinates into
cubemap texture coordinates from u_gen_mipmap to a new function in
util/u_texture.
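
The conversion can be tried outside of Mesa with a small standalone version. The sketch below follows the face mapping from the patch, but the enum and function names are local, hypothetical stand-ins for PIPE_TEX_FACE_* and util_map_texcoords2d_onto_cubemap.

```c
#include <assert.h>

/* Hypothetical local face indices; the real code uses PIPE_TEX_FACE_*. */
enum sketch_face {
   SKETCH_FACE_POS_X, SKETCH_FACE_NEG_X,
   SKETCH_FACE_POS_Y, SKETCH_FACE_NEG_Y,
   SKETCH_FACE_POS_Z, SKETCH_FACE_NEG_Z
};

/* Map 4 (s,t) pairs in [0,1] to (s,t,r) cubemap coordinates on one face.
 * Both strides are counted in floats, as in the patch. */
static void sketch_map_st_onto_cube(enum sketch_face face,
                                    const float *in_st, unsigned in_stride,
                                    float *out_str, unsigned out_stride)
{
   int i;

   for (i = 0; i < 4; i++) {
      /* Slightly less than +/-1 to reduce face selection ambiguity. */
      const float scale = 0.9999f;
      const float sc = (2.0f * in_st[0] - 1.0f) * scale;
      const float tc = (2.0f * in_st[1] - 1.0f) * scale;
      float rx = 0.0f, ry = 0.0f, rz = 0.0f;

      switch (face) {
      case SKETCH_FACE_POS_X: rx =  1.0f; ry = -tc;   rz = -sc;   break;
      case SKETCH_FACE_NEG_X: rx = -1.0f; ry = -tc;   rz =  sc;   break;
      case SKETCH_FACE_POS_Y: rx =  sc;   ry =  1.0f; rz =  tc;   break;
      case SKETCH_FACE_NEG_Y: rx =  sc;   ry = -1.0f; rz = -tc;   break;
      case SKETCH_FACE_POS_Z: rx =  sc;   ry = -tc;   rz =  1.0f; break;
      case SKETCH_FACE_NEG_Z: rx = -sc;   ry = -tc;   rz = -1.0f; break;
      default: assert(0);
      }

      out_str[0] = rx; /* s */
      out_str[1] = ry; /* t */
      out_str[2] = rz; /* r */

      in_st += in_stride;
      out_str += out_stride;
   }
}
```

With a vertex array laid out as float vertices[4][2][4], the call would pass an input stride of 2 and an output stride of 8 and point out_str at vertices[0][1], so only the texcoord slot of each vertex is written.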

Please review/push.

Once it gets approved, I will send patches with r300g blit support to
Corbin. With this work, untiling a texture will be as easy as calling
surface_copy, while the driver state remains intact (theoretically).
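
The reason the driver state can stay intact is the save/restore contract: the driver hands its currently bound state to the blitter before a blit, and the blitter rebinds it afterwards. A minimal model of that contract for a single state object follows; all names are hypothetical, and the real module covers blend, depth/stencil/alpha, rasterizer, and shader state through the util_blitter_save* functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins (not Gallium types). */
struct sketch_pipe {
   const void *bound_blend;   /* blend state currently bound */
};

struct sketch_blitter {
   struct sketch_pipe *pipe;
   const void *own_blend;     /* state the blitter uses while drawing */
   const void *saved_blend;   /* driver state to restore, NULL if unsaved */
};

/* Called by the driver before a blit operation. */
static void sketch_save_blend(struct sketch_blitter *b, const void *state)
{
   b->saved_blend = state;
}

static void sketch_blit(struct sketch_blitter *b)
{
   /* The driver must have saved its state first. */
   assert(b->saved_blend != NULL);

   b->pipe->bound_blend = b->own_blend;
   /* ... the quad would be drawn here ... */

   /* Restore the driver state and mark the slot as unsaved again. */
   b->pipe->bound_blend = b->saved_blend;
   b->saved_blend = NULL;
}
```

Resetting the saved slot after each blit means a forgotten save shows up as an assertion failure on the next blit rather than silently restoring stale state.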

Cheers.

Marek

On Thu, Dec 10, 2009 at 6:23 PM, Keith Whitwell <keithw@...> wrote:

> On Thu, 2009-12-10 at 01:52 -0800, Marek Olšák wrote:
>> Keith,
>>
>> I've taken your comment into consideration and started laying out a
>> new simple driver module which I call Blitter. The idea is to provide
>> acceleration for operations like clear, surface_copy, and
>> surface_fill. The module doesn't depend on a CSO context, instead, a
>> driver must call appropriate util_blitter_save* functions to save CSOs
>> and a blit operation takes care of their restoration once it's done.
>>
>> I attached a patch illustrating the idea with the clear implemented
>> and a working example of usage, but it's not ready to get pushed yet.
>>
>> Please tell me what you think about it.
>
> Marek,
>
> This looks good to me.  It looks like this approach keeps the
> implementation entirely on the driver side of the interface, which is
> what I was hoping for.
>
> I had assumed that doing this type of operation in the driver would
> require assistance "from above" for saving and restoring state.  But it
> seems like you've been able to do without that, which is nice.
>
> Let me know how it progresses.
>
> Keith
>
>

[0001-util-add-new-fragment-shaders-to-simple_shaders.patch]

From 511f58a54315d07740493cdda050d1ebd5a4ecd3 Mon Sep 17 00:00:00 2001
From: =?utf-8?q?Marek=20Ol=C5=A1=C3=A1k?= <maraeo@...>
Date: Sat, 12 Dec 2009 06:34:29 +0100
Subject: [PATCH 1/3] util: add new fragment shaders to simple_shaders

New shaders:
* Fragment shader which writes depth sampled from a texture
* Fragment shader which copies COLOR[0] to multiple render targets

Additional improvements:
* The fragment 'tex' shaders now take a sampler type (TGSI_TEXTURE_*)
  so that they can sample from any type of texture, not only from a 2D one.
---
 src/gallium/auxiliary/util/u_blit.c           |    7 ++-
 src/gallium/auxiliary/util/u_gen_mipmap.c     |    2 +-
 src/gallium/auxiliary/util/u_simple_shaders.c |   70 ++++++++++++++++++++++---
 src/gallium/auxiliary/util/u_simple_shaders.h |   13 ++++-
 4 files changed, 80 insertions(+), 12 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_blit.c b/src/gallium/auxiliary/util/u_blit.c
index abe1de3..c9050ca 100644
--- a/src/gallium/auxiliary/util/u_blit.c
+++ b/src/gallium/auxiliary/util/u_blit.c
@@ -126,7 +126,8 @@ util_create_blit(struct pipe_context *pipe, struct cso_context *cso)
    }
 
    /* fragment shader */
-   ctx->fs[TGSI_WRITEMASK_XYZW] = util_make_fragment_tex_shader(pipe);
+   ctx->fs[TGSI_WRITEMASK_XYZW] =
+      util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_2D);
    ctx->vbuf = NULL;
 
    /* init vertex data that doesn't change */
@@ -420,7 +421,9 @@ util_blit_pixels_writemask(struct blit_state *ctx,
    cso_set_sampler_textures(ctx->cso, 1, &tex);
 
    if (ctx->fs[writemask] == NULL)
-      ctx->fs[writemask] = util_make_fragment_tex_shader_writemask(pipe, writemask);
+      ctx->fs[writemask] =
+         util_make_fragment_tex_shader_writemask(pipe, TGSI_TEXTURE_2D,
+                                                 writemask);
 
    /* shaders */
    cso_set_fragment_shader_handle(ctx->cso, ctx->fs[writemask]);
diff --git a/src/gallium/auxiliary/util/u_gen_mipmap.c b/src/gallium/auxiliary/util/u_gen_mipmap.c
index 83263d9..1728e66 100644
--- a/src/gallium/auxiliary/util/u_gen_mipmap.c
+++ b/src/gallium/auxiliary/util/u_gen_mipmap.c
@@ -1317,7 +1317,7 @@ util_create_gen_mipmap(struct pipe_context *pipe,
    }
 
    /* fragment shader */
-   ctx->fs = util_make_fragment_tex_shader(pipe);
+   ctx->fs = util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_2D);
 
    /* vertex data that doesn't change */
    for (i = 0; i < 4; i++) {
diff --git a/src/gallium/auxiliary/util/u_simple_shaders.c b/src/gallium/auxiliary/util/u_simple_shaders.c
index 1c8b157..8172ead 100644
--- a/src/gallium/auxiliary/util/u_simple_shaders.c
+++ b/src/gallium/auxiliary/util/u_simple_shaders.c
@@ -2,6 +2,7 @@
  *
  * Copyright 2008 Tungsten Graphics, Inc., Cedar Park, Texas.
  * All Rights Reserved.
+ * Copyright 2009 Marek Olšák <maraeo@...>
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the
@@ -30,6 +31,7 @@
  * Simple vertex/fragment shader generators.
  *  
  * @author Brian Paul
+ *         Marek Olšák
  */
 
 
@@ -87,6 +89,7 @@ util_make_vertex_passthrough_shader(struct pipe_context *pipe,
  */
 void *
 util_make_fragment_tex_shader_writemask(struct pipe_context *pipe,
+                                        unsigned tex_target,
                                         unsigned writemask )
 {
    struct ureg_program *ureg;
@@ -116,20 +119,63 @@ util_make_fragment_tex_shader_writemask(struct pipe_context *pipe,
 
    ureg_TEX( ureg,
              ureg_writemask(out, writemask),
-             TGSI_TEXTURE_2D, tex, sampler );
+             tex_target, tex, sampler );
    ureg_END( ureg );
 
    return ureg_create_shader_and_destroy( ureg, pipe );
 }
 
 void *
-util_make_fragment_tex_shader(struct pipe_context *pipe )
+util_make_fragment_tex_shader(struct pipe_context *pipe, unsigned tex_target )
 {
    return util_make_fragment_tex_shader_writemask( pipe,
+                                                   tex_target,
                                                    TGSI_WRITEMASK_XYZW );
 }
 
+/**
+ * Make a simple fragment texture shader which reads an X component from
+ * a texture and writes it as depth.
+ */
+void *
+util_make_fragment_tex_shader_writedepth(struct pipe_context *pipe,
+                                         unsigned tex_target)
+{
+   struct ureg_program *ureg;
+   struct ureg_src sampler;
+   struct ureg_src tex;
+   struct ureg_dst out, depth;
+   struct ureg_src imm;
 
+   ureg = ureg_create( TGSI_PROCESSOR_FRAGMENT );
+   if (ureg == NULL)
+      return NULL;
+
+   sampler = ureg_DECL_sampler( ureg, 0 );
+
+   tex = ureg_DECL_fs_input( ureg,
+                             TGSI_SEMANTIC_GENERIC, 0,
+                             TGSI_INTERPOLATE_PERSPECTIVE );
+
+   out = ureg_DECL_output( ureg,
+                           TGSI_SEMANTIC_COLOR,
+                           0 );
+
+   depth = ureg_DECL_output( ureg,
+                             TGSI_SEMANTIC_POSITION,
+                             0 );
+
+   imm = ureg_imm4f( ureg, 0, 0, 0, 1 );
+
+   ureg_MOV( ureg, out, imm );
+
+   ureg_TEX( ureg,
+             ureg_writemask(depth, TGSI_WRITEMASK_Z),
+             tex_target, tex, sampler );
+   ureg_END( ureg );
+
+   return ureg_create_shader_and_destroy( ureg, pipe );
+}
 
 /**
  * Make simple fragment color pass-through shader.
@@ -137,9 +183,18 @@ util_make_fragment_tex_shader(struct pipe_context *pipe )
 void *
 util_make_fragment_passthrough_shader(struct pipe_context *pipe)
 {
+   return util_make_fragment_clonecolor_shader(pipe, 1);
+}
+
+void *
+util_make_fragment_clonecolor_shader(struct pipe_context *pipe, int num_cbufs)
+{
    struct ureg_program *ureg;
    struct ureg_src src;
-   struct ureg_dst dst;
+   struct ureg_dst dst[8];
+   int i;
+
+   assert(num_cbufs <= 8);
 
    ureg = ureg_create( TGSI_PROCESSOR_FRAGMENT );
    if (ureg == NULL)
@@ -148,12 +203,13 @@ util_make_fragment_passthrough_shader(struct pipe_context *pipe)
    src = ureg_DECL_fs_input( ureg, TGSI_SEMANTIC_COLOR, 0,
                              TGSI_INTERPOLATE_PERSPECTIVE );
 
-   dst = ureg_DECL_output( ureg, TGSI_SEMANTIC_COLOR, 0 );
+   for (i = 0; i < num_cbufs; i++)
+      dst[i] = ureg_DECL_output( ureg, TGSI_SEMANTIC_COLOR, i );
+
+   for (i = 0; i < num_cbufs; i++)
+      ureg_MOV( ureg, dst[i], src );
 
-   ureg_MOV( ureg, dst, src );
    ureg_END( ureg );
 
    return ureg_create_shader_and_destroy( ureg, pipe );
 }
-
-
diff --git a/src/gallium/auxiliary/util/u_simple_shaders.h b/src/gallium/auxiliary/util/u_simple_shaders.h
index d2e80d6..6e76094 100644
--- a/src/gallium/auxiliary/util/u_simple_shaders.h
+++ b/src/gallium/auxiliary/util/u_simple_shaders.h
@@ -51,16 +51,25 @@ util_make_vertex_passthrough_shader(struct pipe_context *pipe,
 
 extern void *
 util_make_fragment_tex_shader_writemask(struct pipe_context *pipe,
-                                        unsigned writemask );
+                                        unsigned tex_target,
+                                        unsigned writemask);
 
 extern void *
-util_make_fragment_tex_shader(struct pipe_context *pipe);
+util_make_fragment_tex_shader(struct pipe_context *pipe, unsigned tex_target);
+
+
+extern void *
+util_make_fragment_tex_shader_writedepth(struct pipe_context *pipe,
+                                         unsigned tex_target);
 
 
 extern void *
 util_make_fragment_passthrough_shader(struct pipe_context *pipe);
 
 
+extern void *
+util_make_fragment_clonecolor_shader(struct pipe_context *pipe, int num_cbufs);
+
 #ifdef __cplusplus
 }
 #endif
--
1.6.3.3



[0002-util-add-a-function-which-converts-2D-coordinates-to.patch]

From dddb77c058d67c0a192b871deb8d837dfabbefce Mon Sep 17 00:00:00 2001
From: =?utf-8?q?Marek=20Ol=C5=A1=C3=A1k?= <maraeo@...>
Date: Sat, 12 Dec 2009 23:38:17 +0100
Subject: [PATCH 2/3] util: add a function which converts 2D coordinates to cubemap coordinates

The code was taken from u_gen_mipmap.
---
 src/gallium/auxiliary/util/Makefile       |    1 +
 src/gallium/auxiliary/util/SConscript     |    1 +
 src/gallium/auxiliary/util/u_gen_mipmap.c |   55 +---------------
 src/gallium/auxiliary/util/u_texture.c    |  102 +++++++++++++++++++++++++++++
 src/gallium/auxiliary/util/u_texture.h    |   54 +++++++++++++++
 5 files changed, 161 insertions(+), 52 deletions(-)
 create mode 100644 src/gallium/auxiliary/util/u_texture.c
 create mode 100644 src/gallium/auxiliary/util/u_texture.h

diff --git a/src/gallium/auxiliary/util/Makefile b/src/gallium/auxiliary/util/Makefile
index 1d8bb55..894958f 100644
--- a/src/gallium/auxiliary/util/Makefile
+++ b/src/gallium/auxiliary/util/Makefile
@@ -30,6 +30,7 @@ C_SOURCES = \
  u_stream_stdc.c \
  u_stream_wd.c \
  u_surface.c \
+ u_texture.c \
  u_tile.c \
  u_time.c \
  u_timed_winsys.c \
diff --git a/src/gallium/auxiliary/util/SConscript b/src/gallium/auxiliary/util/SConscript
index 8d99106..0c0e048 100644
--- a/src/gallium/auxiliary/util/SConscript
+++ b/src/gallium/auxiliary/util/SConscript
@@ -48,6 +48,7 @@ util = env.ConvenienceLibrary(
  'u_stream_stdc.c',
  'u_stream_wd.c',
  'u_surface.c',
+ 'u_texture.c',
  'u_tile.c',
  'u_time.c',
  'u_timed_winsys.c',
diff --git a/src/gallium/auxiliary/util/u_gen_mipmap.c b/src/gallium/auxiliary/util/u_gen_mipmap.c
index 1728e66..69ff3b9 100644
--- a/src/gallium/auxiliary/util/u_gen_mipmap.c
+++ b/src/gallium/auxiliary/util/u_gen_mipmap.c
@@ -46,6 +46,7 @@
 #include "util/u_gen_mipmap.h"
 #include "util/u_simple_shaders.h"
 #include "util/u_math.h"
+#include "util/u_texture.h"
 
 #include "cso_cache/cso_context.h"
 
@@ -1383,59 +1384,9 @@ set_vertex_data(struct gen_mipmap_state *ctx,
       static const float st[4][2] = {
          {0.0f, 0.0f}, {1.0f, 0.0f}, {1.0f, 1.0f}, {0.0f, 1.0f}
       };
-      float rx, ry, rz;
-      uint i;
-
-      /* loop over quad verts */
-      for (i = 0; i < 4; i++) {
-         /* Compute sc = +/-scale and tc = +/-scale.
-          * Not +/-1 to avoid cube face selection ambiguity near the edges,
-          * though that can still sometimes happen with this scale factor...
-          */
-         const float scale = 0.9999f;
-         const float sc = (2.0f * st[i][0] - 1.0f) * scale;
-         const float tc = (2.0f * st[i][1] - 1.0f) * scale;
-
-         switch (face) {
-         case PIPE_TEX_FACE_POS_X:
-            rx = 1.0f;
-            ry = -tc;
-            rz = -sc;
-            break;
-         case PIPE_TEX_FACE_NEG_X:
-            rx = -1.0f;
-            ry = -tc;
-            rz = sc;
-            break;
-         case PIPE_TEX_FACE_POS_Y:
-            rx = sc;
-            ry = 1.0f;
-            rz = tc;
-            break;
-         case PIPE_TEX_FACE_NEG_Y:
-            rx = sc;
-            ry = -1.0f;
-            rz = -tc;
-            break;
-         case PIPE_TEX_FACE_POS_Z:
-            rx = sc;
-            ry = -tc;
-            rz = 1.0f;
-            break;
-         case PIPE_TEX_FACE_NEG_Z:
-            rx = -sc;
-            ry = -tc;
-            rz = -1.0f;
-            break;
-         default:
-            rx = ry = rz = 0.0f;
-            assert(0);
-         }
 
-         ctx->vertices[i][1][0] = rx; /*s*/
-         ctx->vertices[i][1][1] = ry; /*t*/
-         ctx->vertices[i][1][2] = rz; /*r*/
-      }
+      util_map_texcoords2d_onto_cubemap(face, &st[0][0], 2,
+                                        &ctx->vertices[0][1][0], 8);
    }
    else {
       /* 1D/2D */
diff --git a/src/gallium/auxiliary/util/u_texture.c b/src/gallium/auxiliary/util/u_texture.c
new file mode 100644
index 0000000..cd477ab
--- /dev/null
+++ b/src/gallium/auxiliary/util/u_texture.c
@@ -0,0 +1,102 @@
+/**************************************************************************
+ *
+ * Copyright 2008 Tungsten Graphics, Inc., Cedar Park, Texas.
+ * All Rights Reserved.
+ * Copyright 2008 VMware, Inc.  All rights reserved.
+ * Copyright 2009 Marek Olšák <maraeo@...>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
+ * IN NO EVENT SHALL TUNGSTEN GRAPHICS AND/OR ITS SUPPLIERS BE LIABLE FOR
+ * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ **************************************************************************/
+
+/**
+ * @file
+ * Texture mapping utility functions.
+ *
+ * @author Brian Paul
+ *         Marek Olšák
+ */
+
+#include "pipe/p_defines.h"
+
+#include "util/u_texture.h"
+
+void util_map_texcoords2d_onto_cubemap(unsigned face,
+                                       const float *in_st, unsigned in_stride,
+                                       float *out_str, unsigned out_stride)
+{
+   int i;
+   float rx, ry, rz;
+
+   /* loop over quad verts */
+   for (i = 0; i < 4; i++) {
+      /* Compute sc = +/-scale and tc = +/-scale.
+       * Not +/-1 to avoid cube face selection ambiguity near the edges,
+       * though that can still sometimes happen with this scale factor...
+       */
+      const float scale = 0.9999f;
+      const float sc = (2 * in_st[0] - 1) * scale;
+      const float tc = (2 * in_st[1] - 1) * scale;
+
+      switch (face) {
+         case PIPE_TEX_FACE_POS_X:
+            rx = 1;
+            ry = -tc;
+            rz = -sc;
+            break;
+         case PIPE_TEX_FACE_NEG_X:
+            rx = -1;
+            ry = -tc;
+            rz = sc;
+            break;
+         case PIPE_TEX_FACE_POS_Y:
+            rx = sc;
+            ry = 1;
+            rz = tc;
+            break;
+         case PIPE_TEX_FACE_NEG_Y:
+            rx = sc;
+            ry = -1;
+            rz = -tc;
+            break;
+         case PIPE_TEX_FACE_POS_Z:
+            rx = sc;
+            ry = -tc;
+            rz = 1;
+            break;
+         case PIPE_TEX_FACE_NEG_Z:
+            rx = -sc;
+            ry = -tc;
+            rz = -1;
+            break;
+         default:
+            rx = ry = rz = 0;
+            assert(0);
+      }
+
+      out_str[0] = rx; /*s*/
+      out_str[1] = ry; /*t*/
+      out_str[2] = rz; /*r*/
+
+      in_st += in_stride;
+      out_str += out_stride;
+   }
+}
diff --git a/src/gallium/auxiliary/util/u_texture.h b/src/gallium/auxiliary/util/u_texture.h
new file mode 100644
index 0000000..93b2f1e
--- /dev/null
+++ b/src/gallium/auxiliary/util/u_texture.h
@@ -0,0 +1,54 @@
+/**************************************************************************
+ *
+ * Copyright 2009 Marek Olšák <maraeo@...>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
+ * IN NO EVENT SHALL TUNGSTEN GRAPHICS AND/OR ITS SUPPLIERS BE LIABLE FOR
+ * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ **************************************************************************/
+
+#ifndef U_TEXTURE_H
+#define U_TEXTURE_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Convert 2D texture coordinates of 4 vertices into cubemap coordinates
+ * in the given face.
+ * Coordinates must be in the range [0,1].
+ *
+ * \param face          Cubemap face.
+ * \param in_st         4 pairs of 2D texture coordinates to convert.
+ * \param in_stride     Stride of in_st in floats.
+ * \param out_str       STR cubemap texture coordinates to compute.
+ * \param out_stride    Stride of out_str in floats.
+ */
+void util_map_texcoords2d_onto_cubemap(unsigned face,
+                                       const float *in_st, unsigned in_stride,
+                                       float *out_str, unsigned out_stride);
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
--
1.6.3.3



[0003-util-add-blitter.patch]

From 0917877d9326d63378548defce0d7233b90f4b60 Mon Sep 17 00:00:00 2001
From: =?utf-8?q?Marek=20Ol=C5=A1=C3=A1k?= <maraeo@...>
Date: Thu, 10 Dec 2009 10:25:33 +0100
Subject: [PATCH 3/3] util: add blitter

---
 src/gallium/auxiliary/util/Makefile    |    1 +
 src/gallium/auxiliary/util/SConscript  |    1 +
 src/gallium/auxiliary/util/u_blitter.c |  605 ++++++++++++++++++++++++++++++++
 src/gallium/auxiliary/util/u_blitter.h |  242 +++++++++++++
 4 files changed, 849 insertions(+), 0 deletions(-)
 create mode 100644 src/gallium/auxiliary/util/u_blitter.c
 create mode 100644 src/gallium/auxiliary/util/u_blitter.h

diff --git a/src/gallium/auxiliary/util/Makefile b/src/gallium/auxiliary/util/Makefile
index 894958f..f81fc46 100644
--- a/src/gallium/auxiliary/util/Makefile
+++ b/src/gallium/auxiliary/util/Makefile
@@ -9,6 +9,7 @@ C_SOURCES = \
  u_debug_symbol.c \
  u_debug_stack.c \
  u_blit.c \
+ u_blitter.c \
  u_cache.c \
  u_cpu_detect.c \
  u_draw_quad.c \
diff --git a/src/gallium/auxiliary/util/SConscript b/src/gallium/auxiliary/util/SConscript
index 0c0e048..024a370 100644
--- a/src/gallium/auxiliary/util/SConscript
+++ b/src/gallium/auxiliary/util/SConscript
@@ -23,6 +23,7 @@ util = env.ConvenienceLibrary(
  source = [
  'u_bitmask.c',
  'u_blit.c',
+ 'u_blitter.c',
  'u_cache.c',
  'u_cpu_detect.c',
  'u_debug.c',
diff --git a/src/gallium/auxiliary/util/u_blitter.c b/src/gallium/auxiliary/util/u_blitter.c
new file mode 100644
index 0000000..e51a5df
--- /dev/null
+++ b/src/gallium/auxiliary/util/u_blitter.c
@@ -0,0 +1,605 @@
+/**************************************************************************
+ *
+ * Copyright 2009 Marek Olšák <maraeo@...>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
+ * IN NO EVENT SHALL TUNGSTEN GRAPHICS AND/OR ITS SUPPLIERS BE LIABLE FOR
+ * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ **************************************************************************/
+
+/**
+ * @file
+ * Blitter utility to facilitate acceleration of the clear, surface_copy,
+ * and surface_fill functions.
+ *
+ * @author Marek Olšák
+ */
+
+#include "pipe/p_context.h"
+#include "pipe/p_defines.h"
+#include "pipe/p_inlines.h"
+#include "pipe/p_shader_tokens.h"
+#include "pipe/p_state.h"
+
+#include "util/u_memory.h"
+#include "util/u_math.h"
+#include "util/u_blitter.h"
+#include "util/u_draw_quad.h"
+#include "util/u_pack_color.h"
+#include "util/u_rect.h"
+#include "util/u_simple_shaders.h"
+#include "util/u_texture.h"
+
+struct blitter_context_priv
+{
+   struct blitter_context blitter;
+
+   struct pipe_context *pipe; /**< pipe context */
+   struct pipe_buffer *vbuf;  /**< quad */
+
+   float vertices[4][2][4];   /**< {pos, color} or {pos, texcoord} */
+
+   /* Constant state objects. */
+   /* Vertex shaders. */
+   void *vs_col; /**< Vertex shader which passes {pos, color} to the output */
+   void *vs_tex; /**<Vertex shader which passes {pos, texcoord} to the output.*/
+
+   /* Fragment shaders. */
+   void *fs_col[8];     /**< FS which outputs colors to 1-8 color buffers */
+   void *fs_texfetch_col[4];   /**< FS which outputs a color from a texture */
+   void *fs_texfetch_depth[4]; /**< FS which outputs a depth from a texture,
+                              where the index is PIPE_TEXTURE_* to be sampled */
+
+   /* Blend state. */
+   void *blend_write_color;   /**< blend state with writemask of RGBA */
+   void *blend_keep_color;    /**< blend state with writemask of 0 */
+
+   /* Depth stencil alpha state. */
+   void *dsa_write_depth_stencil[0xff]; /**< indices are stencil clear values */
+   void *dsa_write_depth_keep_stencil;
+   void *dsa_keep_depth_stencil;
+
+   /* Other state. */
+   void *sampler_state[16];   /**< sampler state for clamping to a miplevel */
+   void *rs_state;            /**< rasterizer state */
+};
+
+struct blitter_context *util_blitter_create(struct pipe_context *pipe)
+{
+   struct blitter_context_priv *ctx;
+   struct pipe_blend_state blend;
+   struct pipe_depth_stencil_alpha_state dsa;
+   struct pipe_rasterizer_state rs_state;
+   struct pipe_sampler_state sampler_state;
+   unsigned i, max_render_targets;
+
+   ctx = CALLOC_STRUCT(blitter_context_priv);
+   if (!ctx)
+      return NULL;
+
+   ctx->pipe = pipe;
+
+   /* init state objects for them to be considered invalid */
+   ctx->blitter.saved_fb_state.nr_cbufs = ~0;
+   ctx->blitter.saved_num_textures = ~0;
+   ctx->blitter.saved_num_sampler_states = ~0;
+
+   /* blend state objects */
+   memset(&blend, 0, sizeof(blend));
+   ctx->blend_keep_color = pipe->create_blend_state(pipe, &blend);
+
+   blend.colormask = PIPE_MASK_RGBA;
+   ctx->blend_write_color = pipe->create_blend_state(pipe, &blend);
+
+   /* depth stencil alpha state objects */
+   memset(&dsa, 0, sizeof(dsa));
+   ctx->dsa_keep_depth_stencil =
+      pipe->create_depth_stencil_alpha_state(pipe, &dsa);
+
+   dsa.depth.enabled = 1;
+   dsa.depth.writemask = 1;
+   dsa.depth.func = PIPE_FUNC_ALWAYS;
+   ctx->dsa_write_depth_keep_stencil =
+      pipe->create_depth_stencil_alpha_state(pipe, &dsa);
+
+   dsa.stencil[0].enabled = 1;
+   dsa.stencil[0].func = PIPE_FUNC_ALWAYS;
+   dsa.stencil[0].fail_op = PIPE_STENCIL_OP_REPLACE;
+   dsa.stencil[0].zpass_op = PIPE_STENCIL_OP_REPLACE;
+   dsa.stencil[0].zfail_op = PIPE_STENCIL_OP_REPLACE;
+   dsa.stencil[0].valuemask = 0xff;
+   dsa.stencil[0].writemask = 0xff;
+
+   /* create a depth stencil alpha state for each possible stencil clear
+    * value */
+   for (i = 0; i < 0xff; i++) {
+      dsa.stencil[0].ref_value = i;
+
+      ctx->dsa_write_depth_stencil[i] =
+         pipe->create_depth_stencil_alpha_state(pipe, &dsa);
+   }
+
+   /* sampler state */
+   memset(&sampler_state, 0, sizeof(sampler_state));
+   sampler_state.wrap_s = PIPE_TEX_WRAP_CLAMP_TO_EDGE;
+   sampler_state.wrap_t = PIPE_TEX_WRAP_CLAMP_TO_EDGE;
+   sampler_state.wrap_r = PIPE_TEX_WRAP_CLAMP_TO_EDGE;
+
+   for (i = 0; i < 16; i++) {
+      sampler_state.lod_bias = i;
+      sampler_state.min_lod = i;
+      sampler_state.max_lod = i;
+
+      ctx->sampler_state[i] = pipe->create_sampler_state(pipe, &sampler_state);
+   }
+
+   /* rasterizer state */
+   memset(&rs_state, 0, sizeof(rs_state));
+   rs_state.front_winding = PIPE_WINDING_CW;
+   rs_state.cull_mode = PIPE_WINDING_NONE;
+   rs_state.bypass_vs_clip_and_viewport = 1;
+   rs_state.gl_rasterization_rules = 1;
+   ctx->rs_state = pipe->create_rasterizer_state(pipe, &rs_state);
+
+   /* vertex shaders */
+   {
+      const uint semantic_names[] = { TGSI_SEMANTIC_POSITION,
+                                      TGSI_SEMANTIC_COLOR };
+      const uint semantic_indices[] = { 0, 0 };
+      ctx->vs_col =
+         util_make_vertex_passthrough_shader(pipe, 2, semantic_names,
+                                             semantic_indices);
+   }
+   {
+      const uint semantic_names[] = { TGSI_SEMANTIC_POSITION,
+                                      TGSI_SEMANTIC_GENERIC };
+      const uint semantic_indices[] = { 0, 0 };
+      ctx->vs_tex =
+         util_make_vertex_passthrough_shader(pipe, 2, semantic_names,
+                                             semantic_indices);
+   }
+
+   /* fragment shaders */
+   ctx->fs_texfetch_col[PIPE_TEXTURE_1D] =
+      util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_1D);
+   ctx->fs_texfetch_col[PIPE_TEXTURE_2D] =
+      util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_2D);
+   ctx->fs_texfetch_col[PIPE_TEXTURE_3D] =
+      util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_3D);
+   ctx->fs_texfetch_col[PIPE_TEXTURE_CUBE] =
+      util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_CUBE);
+
+   ctx->fs_texfetch_depth[PIPE_TEXTURE_1D] =
+      util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_1D);
+   ctx->fs_texfetch_depth[PIPE_TEXTURE_2D] =
+      util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_2D);
+   ctx->fs_texfetch_depth[PIPE_TEXTURE_3D] =
+      util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_3D);
+   ctx->fs_texfetch_depth[PIPE_TEXTURE_CUBE] =
+      util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_CUBE);
+
+   max_render_targets = pipe->screen->get_param(pipe->screen,
+                                                PIPE_CAP_MAX_RENDER_TARGETS);
+   assert(max_render_targets <= 8);
+   for (i = 0; i < max_render_targets; i++)
+      ctx->fs_col[i] = util_make_fragment_clonecolor_shader(pipe, 1+i);
+
+   /* set invariant vertex coordinates */
+   for (i = 0; i < 4; i++)
+      ctx->vertices[i][0][3] = 1; /*v.w*/
+
+   /* create the vertex buffer */
+   ctx->vbuf = pipe_buffer_create(ctx->pipe->screen,
+                                  32,
+                                  PIPE_BUFFER_USAGE_VERTEX,
+                                  sizeof(ctx->vertices));
+
+   return &ctx->blitter;
+}
+
+void util_blitter_destroy(struct blitter_context *blitter)
+{
+   struct blitter_context_priv *ctx = (struct blitter_context_priv*)blitter;
+   struct pipe_context *pipe = ctx->pipe;
+   int i;
+
+   pipe->delete_blend_state(pipe, ctx->blend_write_color);
+   pipe->delete_blend_state(pipe, ctx->blend_keep_color);
+   pipe->delete_depth_stencil_alpha_state(pipe, ctx->dsa_keep_depth_stencil);
+   pipe->delete_depth_stencil_alpha_state(pipe,
+                                          ctx->dsa_write_depth_keep_stencil);
+
+   for (i = 0; i < 0xff; i++)
+      pipe->delete_depth_stencil_alpha_state(pipe,
+                                             ctx->dsa_write_depth_stencil[i]);
+
+   pipe->delete_rasterizer_state(pipe, ctx->rs_state);
+   pipe->delete_vs_state(pipe, ctx->vs_col);
+   pipe->delete_vs_state(pipe, ctx->vs_tex);
+
+   for (i = 0; i < 4; i++) {
+      pipe->delete_fs_state(pipe, ctx->fs_texfetch_col[i]);
+      pipe->delete_fs_state(pipe, ctx->fs_texfetch_depth[i]);
+   }
+   for (i = 0; i < 8 && ctx->fs_col[i]; i++)
+      pipe->delete_fs_state(pipe, ctx->fs_col[i]);
+
+   pipe_buffer_reference(&ctx->vbuf, NULL);
+   FREE(ctx);
+}
+
+static void blitter_check_saved_CSOs(struct blitter_context_priv *ctx)
+{
+   /* make sure these CSOs have been saved */
+   assert(ctx->blitter.saved_blend_state &&
+          ctx->blitter.saved_dsa_state &&
+          ctx->blitter.saved_rs_state &&
+          ctx->blitter.saved_fs &&
+          ctx->blitter.saved_vs);
+}
+
+static void blitter_restore_CSOs(struct blitter_context_priv *ctx)
+{
+   struct pipe_context *pipe = ctx->pipe;
+
+   /* restore the state objects which are always required to be saved */
+   pipe->bind_blend_state(pipe, ctx->blitter.saved_blend_state);
+   pipe->bind_depth_stencil_alpha_state(pipe, ctx->blitter.saved_dsa_state);
+   pipe->bind_rasterizer_state(pipe, ctx->blitter.saved_rs_state);
+   pipe->bind_fs_state(pipe, ctx->blitter.saved_fs);
+   pipe->bind_vs_state(pipe, ctx->blitter.saved_vs);
+
+   ctx->blitter.saved_blend_state = 0;
+   ctx->blitter.saved_dsa_state = 0;
+   ctx->blitter.saved_rs_state = 0;
+   ctx->blitter.saved_fs = 0;
+   ctx->blitter.saved_vs = 0;
+
+   /* restore the state objects which are required to be saved before copy/fill
+    */
+   if (ctx->blitter.saved_fb_state.nr_cbufs != ~0) {
+      pipe->set_framebuffer_state(pipe, &ctx->blitter.saved_fb_state);
+      ctx->blitter.saved_fb_state.nr_cbufs = ~0;
+   }
+
+   if (ctx->blitter.saved_num_sampler_states != ~0) {
+      pipe->bind_fragment_sampler_states(pipe,
+                                         ctx->blitter.saved_num_sampler_states,
+                                         ctx->blitter.saved_sampler_states);
+      ctx->blitter.saved_num_sampler_states = ~0;
+   }
+
+   if (ctx->blitter.saved_num_textures != ~0) {
+      pipe->set_fragment_sampler_textures(pipe,
+                                          ctx->blitter.saved_num_textures,
+                                          ctx->blitter.saved_textures);
+      ctx->blitter.saved_num_textures = ~0;
+   }
+}
+
+static void blitter_set_rectangle(struct blitter_context_priv *ctx,
+                                  unsigned x1, unsigned y1,
+                                  unsigned x2, unsigned y2,
+                                  float depth)
+{
+   int i;
+
+   /* set vertex positions */
+   ctx->vertices[0][0][0] = x1; /*v0.x*/
+   ctx->vertices[0][0][1] = y1; /*v0.y*/
+
+   ctx->vertices[1][0][0] = x2; /*v1.x*/
+   ctx->vertices[1][0][1] = y1; /*v1.y*/
+
+   ctx->vertices[2][0][0] = x2; /*v2.x*/
+   ctx->vertices[2][0][1] = y2; /*v2.y*/
+
+   ctx->vertices[3][0][0] = x1; /*v3.x*/
+   ctx->vertices[3][0][1] = y2; /*v3.y*/
+
+   for (i = 0; i < 4; i++)
+      ctx->vertices[i][0][2] = depth; /*z*/
+}
+
+static void blitter_set_clear_color(struct blitter_context_priv *ctx,
+                                    const float *rgba)
+{
+   int i;
+
+   for (i = 0; i < 4; i++) {
+      ctx->vertices[i][1][0] = rgba[0];
+      ctx->vertices[i][1][1] = rgba[1];
+      ctx->vertices[i][1][2] = rgba[2];
+      ctx->vertices[i][1][3] = rgba[3];
+   }
+}
+
+static void blitter_set_texcoords_2d(struct blitter_context_priv *ctx,
+                                     struct pipe_surface *surf,
+                                     unsigned x1, unsigned y1,
+                                     unsigned x2, unsigned y2)
+{
+   int i;
+   float s1 = x1 / (float)surf->width;
+   float t1 = y1 / (float)surf->height;
+   float s2 = x2 / (float)surf->width;
+   float t2 = y2 / (float)surf->height;
+
+   ctx->vertices[0][1][0] = s1; /*t0.s*/
+   ctx->vertices[0][1][1] = t1; /*t0.t*/
+
+   ctx->vertices[1][1][0] = s2; /*t1.s*/
+   ctx->vertices[1][1][1] = t1; /*t1.t*/
+
+   ctx->vertices[2][1][0] = s2; /*t2.s*/
+   ctx->vertices[2][1][1] = t2; /*t2.t*/
+
+   ctx->vertices[3][1][0] = s1; /*t3.s*/
+   ctx->vertices[3][1][1] = t2; /*t3.t*/
+
+   for (i = 0; i < 4; i++) {
+      ctx->vertices[i][1][2] = 0; /*r*/
+      ctx->vertices[i][1][3] = 1; /*q*/
+   }
+}
+
+static void blitter_set_texcoords_3d(struct blitter_context_priv *ctx,
+                                     struct pipe_surface *surf,
+                                     unsigned x1, unsigned y1,
+                                     unsigned x2, unsigned y2)
+{
+   int i;
+   float depth = u_minify(surf->texture->depth0, surf->level);
+   float r = surf->zslice / depth;
+
+   blitter_set_texcoords_2d(ctx, surf, x1, y1, x2, y2);
+
+   for (i = 0; i < 4; i++)
+      ctx->vertices[i][1][2] = r; /*r*/
+}
+
+static void blitter_set_texcoords_cube(struct blitter_context_priv *ctx,
+                                       struct pipe_surface *surf,
+                                       unsigned x1, unsigned y1,
+                                       unsigned x2, unsigned y2)
+{
+   int i;
+   float s1 = x1 / (float)surf->width;
+   float t1 = y1 / (float)surf->height;
+   float s2 = x2 / (float)surf->width;
+   float t2 = y2 / (float)surf->height;
+   const float st[4][2] = {
+      {s1, t1}, {s2, t1}, {s2, t2}, {s1, t2}
+   };
+
+   util_map_texcoords2d_onto_cubemap(surf->face,
+                                     /* pointer, stride in floats */
+                                     &st[0][0], 2,
+                                     &ctx->vertices[0][1][0], 8);
+
+   for (i = 0; i < 4; i++)
+      ctx->vertices[i][1][3] = 1; /*q*/
+}
+
+static void blitter_draw_quad(struct blitter_context_priv *ctx)
+{
+   struct blitter_context *blitter = &ctx->blitter;
+   struct pipe_context *pipe = ctx->pipe;
+
+   if (blitter->draw_quad) {
+      blitter->draw_quad(pipe, &ctx->vertices[0][0][0]);
+   } else {
+      /* write vertices and draw them */
+      pipe_buffer_write(pipe->screen, ctx->vbuf,
+                        0, sizeof(ctx->vertices), ctx->vertices);
+
+      util_draw_vertex_buffer(ctx->pipe, ctx->vbuf, 0, PIPE_PRIM_TRIANGLE_FAN,
+                              4,  /* verts */
+                              2); /* attribs/vert */
+   }
+}
+
+void util_blitter_clear(struct blitter_context *blitter,
+                        unsigned width, unsigned height,
+                        unsigned num_cbufs,
+                        unsigned clear_buffers,
+                        const float *rgba,
+                        double depth, unsigned stencil)
+{
+   struct blitter_context_priv *ctx = (struct blitter_context_priv*)blitter;
+   struct pipe_context *pipe = ctx->pipe;
+
+   assert(num_cbufs <= 8);
+
+   blitter_check_saved_CSOs(ctx);
+
+   /* bind CSOs */
+   if (clear_buffers & PIPE_CLEAR_COLOR)
+      pipe->bind_blend_state(pipe, ctx->blend_write_color);
+   else
+      pipe->bind_blend_state(pipe, ctx->blend_keep_color);
+
+   if (clear_buffers & PIPE_CLEAR_DEPTHSTENCIL)
+      pipe->bind_depth_stencil_alpha_state(pipe,
+         ctx->dsa_write_depth_stencil[stencil&0xff]);
+   else
+      pipe->bind_depth_stencil_alpha_state(pipe, ctx->dsa_keep_depth_stencil);
+
+   pipe->bind_rasterizer_state(pipe, ctx->rs_state);
+   pipe->bind_fs_state(pipe, ctx->fs_col[num_cbufs ? num_cbufs-1 : 0]);
+   pipe->bind_vs_state(pipe, ctx->vs_col);
+
+   blitter_set_clear_color(ctx, rgba);
+   blitter_set_rectangle(ctx, 0, 0, width, height, depth);
+   blitter_draw_quad(ctx);
+   blitter_restore_CSOs(ctx);
+}
+
+void util_blitter_copy(struct blitter_context *blitter,
+                       struct pipe_surface *dst,
+                       unsigned dstx, unsigned dsty,
+                       struct pipe_surface *src,
+                       unsigned srcx, unsigned srcy,
+                       unsigned width, unsigned height,
+                       boolean ignore_stencil)
+{
+   struct blitter_context_priv *ctx = (struct blitter_context_priv*)blitter;
+   struct pipe_context *pipe = ctx->pipe;
+   struct pipe_screen *screen = pipe->screen;
+   struct pipe_framebuffer_state fb_state;
+   boolean is_stencil, is_depth;
+   unsigned dst_tex_usage;
+
+   /* give up if textures are not set */
+   assert(dst->texture && src->texture);
+   if (!dst->texture || !src->texture)
+      return;
+
+   is_depth = pf_get_component_bits(src->format, PIPE_FORMAT_COMP_Z) != 0;
+   is_stencil = pf_get_component_bits(src->format, PIPE_FORMAT_COMP_S) != 0;
+   dst_tex_usage = is_depth || is_stencil ? PIPE_TEXTURE_USAGE_DEPTH_STENCIL :
+                                            PIPE_TEXTURE_USAGE_RENDER_TARGET;
+
+   /* check if we can sample from and render to the surfaces */
+   /* (assuming copying a stencil buffer is not possible) */
+   if ((!ignore_stencil && is_stencil) ||
+       !screen->is_format_supported(screen, dst->format, dst->texture->target,
+                                    dst_tex_usage, 0) ||
+       !screen->is_format_supported(screen, src->format, src->texture->target,
+                                    PIPE_TEXTURE_USAGE_SAMPLER, 0)) {
+      util_surface_copy(pipe, FALSE, dst, dstx, dsty, src, srcx, srcy,
+                        width, height);
+      return;
+   }
+
+   /* check whether the states are properly saved */
+   blitter_check_saved_CSOs(ctx);
+   assert(blitter->saved_fb_state.nr_cbufs != ~0);
+   assert(blitter->saved_num_textures != ~0);
+   assert(blitter->saved_num_sampler_states != ~0);
+   assert(src->texture->target < 4);
+
+   /* bind CSOs */
+   fb_state.width = dst->width;
+   fb_state.height = dst->height;
+
+   if (is_depth) {
+      pipe->bind_blend_state(pipe, ctx->blend_keep_color);
+      pipe->bind_depth_stencil_alpha_state(pipe,
+                                           ctx->dsa_write_depth_keep_stencil);
+      pipe->bind_fs_state(pipe, ctx->fs_texfetch_depth[src->texture->target]);
+
+      fb_state.nr_cbufs = 0;
+      fb_state.zsbuf = dst;
+   } else {
+      pipe->bind_blend_state(pipe, ctx->blend_write_color);
+      pipe->bind_depth_stencil_alpha_state(pipe, ctx->dsa_keep_depth_stencil);
+      pipe->bind_fs_state(pipe, ctx->fs_texfetch_col[src->texture->target]);
+
+      fb_state.nr_cbufs = 1;
+      fb_state.cbufs[0] = dst;
+      fb_state.zsbuf = 0;
+   }
+   pipe->bind_rasterizer_state(pipe, ctx->rs_state);
+   pipe->bind_vs_state(pipe, ctx->vs_tex);
+   pipe->bind_fragment_sampler_states(pipe, 1, &ctx->sampler_state[src->level]);
+   pipe->set_fragment_sampler_textures(pipe, 1, &src->texture);
+   pipe->set_framebuffer_state(pipe, &fb_state);
+
+   /* set texture coordinates */
+   switch (src->texture->target) {
+      case PIPE_TEXTURE_1D:
+      case PIPE_TEXTURE_2D:
+         blitter_set_texcoords_2d(ctx, src, srcx, srcy,
+                                  srcx+width, srcy+height);
+         break;
+      case PIPE_TEXTURE_3D:
+         blitter_set_texcoords_3d(ctx, src, srcx, srcy,
+                                  srcx+width, srcy+height);
+         break;
+      case PIPE_TEXTURE_CUBE:
+         blitter_set_texcoords_cube(ctx, src, srcx, srcy,
+                                    srcx+width, srcy+height);
+         break;
+   }
+
+   blitter_set_rectangle(ctx, dstx, dsty, dstx+width, dsty+height, 0);
+   blitter_draw_quad(ctx);
+   blitter_restore_CSOs(ctx);
+}
+
+void util_blitter_fill(struct blitter_context *blitter,
+                       struct pipe_surface *dst,
+                       unsigned dstx, unsigned dsty,
+                       unsigned width, unsigned height,
+                       unsigned value)
+{
+   struct blitter_context_priv *ctx = (struct blitter_context_priv*)blitter;
+   struct pipe_context *pipe = ctx->pipe;
+   struct pipe_screen *screen = pipe->screen;
+   struct pipe_framebuffer_state fb_state;
+   float rgba[4];
+   ubyte ub_rgba[4] = {0};
+   union util_color color;
+   int i;
+
+   assert(dst->texture);
+   if (!dst->texture)
+      return;
+
+   /* check if we can render to the surface */
+   if (pf_is_depth_or_stencil(dst->format) || /* unlikely, but you never know */
+       !screen->is_format_supported(screen, dst->format, dst->texture->target,
+                                    PIPE_TEXTURE_USAGE_RENDER_TARGET, 0)) {
+      util_surface_fill(pipe, dst, dstx, dsty, width, height, value);
+      return;
+   }
+
+   /* unpack the color */
+   color.ui = value;
+   util_unpack_color_ub(dst->format, &color,
+                        ub_rgba, ub_rgba+1, ub_rgba+2, ub_rgba+3);
+   for (i = 0; i < 4; i++)
+      rgba[i] = ubyte_to_float(ub_rgba[i]);
+
+   /* check the saved state */
+   blitter_check_saved_CSOs(ctx);
+   assert(blitter->saved_fb_state.nr_cbufs != ~0);
+
+   /* bind CSOs */
+   pipe->bind_blend_state(pipe, ctx->blend_write_color);
+   pipe->bind_depth_stencil_alpha_state(pipe, ctx->dsa_keep_depth_stencil);
+   pipe->bind_rasterizer_state(pipe, ctx->rs_state);
+   pipe->bind_fs_state(pipe, ctx->fs_col[0]);
+   pipe->bind_vs_state(pipe, ctx->vs_col);
+
+   /* set a framebuffer state */
+   fb_state.width = dst->width;
+   fb_state.height = dst->height;
+   fb_state.nr_cbufs = 1;
+   fb_state.cbufs[0] = dst;
+   fb_state.zsbuf = 0;
+   pipe->set_framebuffer_state(pipe, &fb_state);
+
+   blitter_set_clear_color(ctx, rgba);
+   blitter_set_rectangle(ctx, dstx, dsty, dstx+width, dsty+height, 0);
+   blitter_draw_quad(ctx);
+   blitter_restore_CSOs(ctx);
+}
diff --git a/src/gallium/auxiliary/util/u_blitter.h b/src/gallium/auxiliary/util/u_blitter.h
new file mode 100644
index 0000000..d03915c
--- /dev/null
+++ b/src/gallium/auxiliary/util/u_blitter.h
@@ -0,0 +1,242 @@
+/**************************************************************************
+ *
+ * Copyright 2009 Marek Olšák <maraeo@...>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
+ * IN NO EVENT SHALL TUNGSTEN GRAPHICS AND/OR ITS SUPPLIERS BE LIABLE FOR
+ * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ **************************************************************************/
+
+#ifndef U_BLITTER_H
+#define U_BLITTER_H
+
+#include "pipe/p_state.h"
+
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+struct pipe_context;
+
+struct blitter_context
+{
+   /**
+    * Draw a quad.
+    *
+    * The pipe driver can set this to provide a more efficient way of drawing
+    * a quad. If it's NULL, the quad is drawn using a vertex buffer.
+    *
+    * There are always 4 vertices with interleaved vertex elements of type
+    * RGBA32F. See the vertex shader _output_ semantics to know what those are.
+    * The primitive type is always PIPE_PRIM_TRIANGLE_FAN and VS/clip/viewport
+    * is bypassed.
+    */
+   void (*draw_quad)(struct pipe_context *pipe,
+                     const float *vertices);
+
+   /* Private members, really. */
+   void *saved_blend_state;   /**< blend state */
+   void *saved_dsa_state;     /**< depth stencil alpha state */
+   void *saved_rs_state;      /**< rasterizer state */
+   void *saved_fs, *saved_vs; /**< fragment shader, vertex shader */
+
+   struct pipe_framebuffer_state saved_fb_state;  /**< framebuffer state */
+
+   int saved_num_sampler_states;
+   void *saved_sampler_states[32];
+
+   int saved_num_textures;
+   struct pipe_texture *saved_textures[32]; /* is 32 enough? */
+};
+
+/**
+ * Create a blitter context.
+ */
+struct blitter_context *util_blitter_create(struct pipe_context *pipe);
+
+/**
+ * Destroy a blitter context.
+ */
+void util_blitter_destroy(struct blitter_context *blitter);
+
+/*
+ * These CSOs must be saved before any of the following functions is called:
+ * - blend state
+ * - depth stencil alpha state
+ * - rasterizer state
+ * - vertex shader
+ * - fragment shader
+ */
+
+/**
+ * Clear a specified set of currently bound buffers to specified values.
+ */
+void util_blitter_clear(struct blitter_context *blitter,
+                        unsigned width, unsigned height,
+                        unsigned num_cbufs,
+                        unsigned clear_buffers,
+                        const float *rgba,
+                        double depth, unsigned stencil);
+
+/**
+ * Copy a block of pixels from one surface to another.
+ *
+ * You can copy from any color format to any other color format provided
+ * the former can be sampled and the latter can be rendered to. Otherwise,
+ * a software fallback path is taken and both surfaces must be of the same
+ * format.
+ *
+ * The same holds for depth-stencil formats with the exception that stencil
+ * cannot be copied unless you set ignore_stencil to FALSE. In that case,
+ * a software fallback path is taken and both surfaces must be of the same
+ * format.
+ *
+ * Use pipe_screen->is_format_supported to know your options.
+ *
+ * These states must be saved in the blitter in addition to the state objects
+ * already required to be saved:
+ * - framebuffer state
+ * - fragment sampler states
+ * - fragment sampler textures
+ */
+void util_blitter_copy(struct blitter_context *blitter,
+                       struct pipe_surface *dst,
+                       unsigned dstx, unsigned dsty,
+                       struct pipe_surface *src,
+                       unsigned srcx, unsigned srcy,
+                       unsigned width, unsigned height,
+                       boolean ignore_stencil);
+
+/**
+ * Fill a region of a surface with a constant value.
+ *
+ * If the surface cannot be rendered to or it's a depth-stencil format,
+ * a software fallback path is taken.
+ *
+ * These states must be saved in the blitter in addition to the state objects
+ * already required to be saved:
+ * - framebuffer state
+ */
+void util_blitter_fill(struct blitter_context *blitter,
+                       struct pipe_surface *dst,
+                       unsigned dstx, unsigned dsty,
+                       unsigned width, unsigned height,
+                       unsigned value);
+
+/**
+ * Copy all pixels from one surface to another.
+ *
+ * The rules are the same as in util_blitter_copy with the addition that
+ * surfaces must have the same size.
+ */
+static INLINE
+void util_blitter_copy_surface(struct blitter_context *blitter,
+                               struct pipe_surface *dst,
+                               struct pipe_surface *src,
+                               boolean ignore_stencil)
+{
+   assert(dst->width == src->width && dst->height == src->height);
+
+   util_blitter_copy(blitter, dst, 0, 0, src, 0, 0, src->width, src->height,
+                     ignore_stencil);
+}
+
+
+/* The functions below should be used to save currently bound constant state
+ * objects inside a driver. The objects are automatically restored at the end
+ * of the util_blitter_{clear, fill, copy, copy_surface} functions and then
+ * forgotten.
+ *
+ * CSOs not listed here are not affected by util_blitter. */
+
+static INLINE
+void util_blitter_save_blend(struct blitter_context *blitter,
+                             void *state)
+{
+   blitter->saved_blend_state = state;
+}
+
+static INLINE
+void util_blitter_save_depth_stencil_alpha(struct blitter_context *blitter,
+                                           void *state)
+{
+   blitter->saved_dsa_state = state;
+}
+
+static INLINE
+void util_blitter_save_rasterizer(struct blitter_context *blitter,
+                                  void *state)
+{
+   blitter->saved_rs_state = state;
+}
+
+static INLINE
+void util_blitter_save_fragment_shader(struct blitter_context *blitter,
+                                       void *fs)
+{
+   blitter->saved_fs = fs;
+}
+
+static INLINE
+void util_blitter_save_vertex_shader(struct blitter_context *blitter,
+                                     void *vs)
+{
+   blitter->saved_vs = vs;
+}
+
+static INLINE
+void util_blitter_save_framebuffer(struct blitter_context *blitter,
+                                   struct pipe_framebuffer_state *state)
+{
+   blitter->saved_fb_state = *state;
+}
+
+static INLINE
+void util_blitter_save_fragment_sampler_states(
+                  struct blitter_context *blitter,
+                  int num_sampler_states,
+                  void **sampler_states)
+{
+   assert(num_textures <= 32);
+
+   blitter->saved_num_sampler_states = num_sampler_states;
+   memcpy(blitter->saved_sampler_states, sampler_states,
+          num_sampler_states * sizeof(void *));
+}
+
+static INLINE
+void util_blitter_save_fragment_sampler_textures(
+                  struct blitter_context *blitter,
+                  int num_textures,
+                  struct pipe_texture **textures)
+{
+   assert(num_textures <= 32);
+
+   blitter->saved_num_textures = num_textures;
+   memcpy(blitter->saved_textures, textures,
+          num_textures * sizeof(struct pipe_texture *));
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
--
1.6.3.3



------------------------------------------------------------------------------
Return on Information:
Google Enterprise Search pays you back
Get the facts.
http://p.sf.net/sfu/google-dev2dev

_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@...
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev

Re: gallium: add blitter

by Keith Whitwell-3

On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:

> Hi Keith,
>
> I've finished the blitter module. It fully implements the clear,
> surface_copy, and surface_fill functions. It properly fallbacks to
> software in case a surface cannot be sampled or rendered to according
> to usage. Copying a stencil buffer always fallbacks unless the
> ignore_stencil parameter (see util_blitter_copy) is set to TRUE. To my
> knowledge, GPUs cannot copy the stencil buffer (not sure if fiddling
> with texture formats can help). It's all documented in u_blitter.h.
>
> The pipe driver can optionally hook up a function to draw a quad
> (blitter_context::draw_quad). I realized that embedding 4 vertices
> into a command stream (AKA immediate mode) is much faster than writing
> them to a vertex buffer due to reduced driver overhead. It might be
> worth to consider adding the draw_quad function to pipe_context.
>
> When working on the blitter, I added the following things to
> util/u_simple_shaders:
> - util_make_fragment_tex_shader has a new parametr tex_target and the
> value should be one of TGSI_TEXTURE_* enums so that it can be used to
> sample from any kind of texture.
> - Added util_make_fragment_tex_shader_writedepth, which writes depth
> sampled from a texture. It's used for copying depth textures.
> - Added util_make_fragment_clonecolor_shader, which copies input
> COLOR[0] to a specified number of render targets. It's used to clear
> MRTs.
>
> Also, I moved the code for converting 2D texture coordinates into
> cubemap texture coordinates from u_gen_mipmap to a new function in
> util/u_texture.
>
> Please review/push.
>
> Once it gets approved, I will send patches with r300g blit support to
> Corbin. With this work, untiling a texture will be as easy as calling
> surface_copy whereas the driver state remains intact (theoretically).

Marek,

This all looks great.  Many thanks for adding this functionality - I'm
sure we'll be building on it in many ways going forward.

I'll push the patches intact, but one thing we need to start thinking
about is the mix of code in the util/ directory -- there's some stuff in
there that's only legal/useful for state-trackers, some that's likewise
only legal for drivers, and a lot that is valid everywhere.  At some
stage we want to split that up.

Keith



Re: gallium: add blitter

by Keith Whitwell-3

On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:

>
> +static INLINE
> +void util_blitter_save_fragment_sampler_states(
> +                  struct blitter_context *blitter,
> +                  int num_sampler_states,
> +                  void **sampler_states)
> +{
> +   assert(num_textures <= 32);
> +
> +   blitter->saved_num_sampler_states = num_sampler_states;
> +   memcpy(blitter->saved_sampler_states, sampler_states,
> +          num_sampler_states * sizeof(void *));
> +}
> +

Have you tried compiling with debug enabled?  The assert above fails to
compile.  Also, can you use Elements() or similar instead of the
hard-coded 32?
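
A minimal sketch of the kind of check meant here, assuming a trimmed-down struct in place of the real blitter_context. ARRAY_ELEMS is a hypothetical stand-in for Elements(), and save_sampler_states is an illustrative name, not the function from the patch:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for Elements(): entry count of a true array. */
#define ARRAY_ELEMS(a) (sizeof(a) / sizeof((a)[0]))

/* Trimmed-down illustration, not the real blitter_context. */
struct saved_samplers {
   int num_states;
   void *states[32];
};

static void save_sampler_states(struct saved_samplers *s,
                                int num_states, void **states)
{
   /* Check the parameter this function actually receives, and derive the
    * limit from the destination array instead of repeating "32". */
   assert(num_states >= 0 &&
          (size_t)num_states <= ARRAY_ELEMS(s->states));

   s->num_states = num_states;
   memcpy(s->states, states, (size_t)num_states * sizeof(void *));
}
```

With the limit taken from the array itself, resizing states[] later cannot leave a stale bound behind in the assert.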

Maybe we can figure out how to go back to having asserts keep exposing
their contents to the compiler even on non-debug builds.  This used to
work without problems on Linux and helped a lot to avoid these kinds of
problems.
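
One way this might be done, as a sketch only: checked_assert is a made-up macro name, and the DEBUG symbol is assumed to be what selects debug builds. In the non-debug branch the expression is wrapped in sizeof, so the compiler still parses and type-checks it but generates no code for it:

```c
#include <assert.h>

/* Hypothetical macro name; not an existing Mesa macro. */
#ifdef DEBUG
#define checked_assert(expr) assert(expr)
#else
/* The operand of sizeof is parsed and type-checked but never evaluated,
 * so a misspelled identifier is still a compile error in release builds. */
#define checked_assert(expr) ((void)sizeof((expr) ? 1 : 0))
#endif

static int clamp_count(int num_states)
{
   /* Writing num_textures here would fail to compile in both build types. */
   checked_assert(num_states <= 32);
   return num_states > 32 ? 32 : num_states;
}
```

The trade-off is that expressions with side effects behave differently between the two build types, which is already true of a plain assert.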

Keith



Re: gallium: add blitter

by Keith Whitwell-3

On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:

> --- /dev/null
> +++ b/src/gallium/auxiliary/util/u_blitter.c
> @@ -0,0 +1,605 @@
> +/**************************************************************************
> + *
> + * Copyright 2009 Marek Olšák <maraeo@...>
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> + * "Software"), to deal in the Software without restriction, including
> + * without limitation the rights to use, copy, modify, merge, publish,
> + * distribute, sub license, and/or sell copies of the Software, and to
> + * permit persons to whom the Software is furnished to do so, subject to
> + * the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> + * next paragraph) shall be included in all copies or substantial portions
> + * of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
> + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
> + * IN NO EVENT SHALL TUNGSTEN GRAPHICS AND/OR ITS SUPPLIERS BE LIABLE FOR
> + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
> + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
> + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
> + *
> + **************************************************************************/
> +
> +/**
> + * @file
> + * Blitter utility to facilitate acceleration of the clear, surface_copy,
> + * and surface_fill functions.
> + *
> + * @author Marek Olšák
> + */
> +
> +#include "pipe/p_context.h"
> +#include "pipe/p_defines.h"
> +#include "pipe/p_inlines.h"
> +#include "pipe/p_shader_tokens.h"
> +#include "pipe/p_state.h"
> +
> +#include "util/u_memory.h"
> +#include "util/u_math.h"
> +#include "util/u_blitter.h"
> +#include "util/u_draw_quad.h"
> +#include "util/u_pack_color.h"
> +#include "util/u_rect.h"
> +#include "util/u_simple_shaders.h"
> +#include "util/u_texture.h"
> +
> +struct blitter_context_priv
> +{
> +   struct blitter_context blitter;
> +
> +   struct pipe_context *pipe; /**< pipe context */
> +   struct pipe_buffer *vbuf;  /**< quad */
> +
> +   float vertices[4][2][4];   /**< {pos, color} or {pos, texcoord} */
> +
> +   /* Constant state objects. */
> +   /* Vertex shaders. */
> +   void *vs_col; /**< Vertex shader which passes {pos, color} to the output */
> +   void *vs_tex; /**<Vertex shader which passes {pos, texcoord} to the output.*/
> +
> +   /* Fragment shaders. */
> +   void *fs_col[8];     /**< FS which outputs colors to 1-8 color buffers */
> +   void *fs_texfetch_col[4];   /**< FS which outputs a color from a texture */
> +   void *fs_texfetch_depth[4]; /**< FS which outputs a depth from a texture,
> +                              where the index is PIPE_TEXTURE_* to be sampled */

Please use PIPE_MAX_COLOR_BUFS or other defines to size these arrays.

> +   /* Blend state. */
> +   void *blend_write_color;   /**< blend state with writemask of RGBA */
> +   void *blend_keep_color;    /**< blend state with writemask of 0 */
> +
> +   /* Depth stencil alpha state. */
> +   void *dsa_write_depth_stencil[0xff]; /**< indices are stencil clear values */

That's a lot of state objects...

> +   void *dsa_write_depth_keep_stencil;
> +   void *dsa_keep_depth_stencil;
> +
> +   /* Other state. */
> +   void *sampler_state[16];   /**< sampler state for clamping to a miplevel */
> +   void *rs_state;            /**< rasterizer state */
> +};
> +
> +struct blitter_context *util_blitter_create(struct pipe_context *pipe)
> +{
> +   struct blitter_context_priv *ctx;
> +   struct pipe_blend_state blend;
> +   struct pipe_depth_stencil_alpha_state dsa;
> +   struct pipe_rasterizer_state rs_state;
> +   struct pipe_sampler_state sampler_state;
> +   unsigned i, max_render_targets;
> +
> +   ctx = CALLOC_STRUCT(blitter_context_priv);
> +   if (!ctx)
> +      return NULL;
> +
> +   ctx->pipe = pipe;
> +
> +   /* init state objects for them to be considered invalid */
> +   ctx->blitter.saved_fb_state.nr_cbufs = ~0;
> +   ctx->blitter.saved_num_textures = ~0;
> +   ctx->blitter.saved_num_sampler_states = ~0;
> +
> +   /* blend state objects */
> +   memset(&blend, 0, sizeof(blend));
> +   ctx->blend_keep_color = pipe->create_blend_state(pipe, &blend);
> +
> +   blend.colormask = PIPE_MASK_RGBA;
> +   ctx->blend_write_color = pipe->create_blend_state(pipe, &blend);
> +
> +   /* depth stencil alpha state objects */
> +   memset(&dsa, 0, sizeof(dsa));
> +   ctx->dsa_keep_depth_stencil =
> +      pipe->create_depth_stencil_alpha_state(pipe, &dsa);
> +
> +   dsa.depth.enabled = 1;
> +   dsa.depth.writemask = 1;
> +   dsa.depth.func = PIPE_FUNC_ALWAYS;
> +   ctx->dsa_write_depth_keep_stencil =
> +      pipe->create_depth_stencil_alpha_state(pipe, &dsa);
> +
> +   dsa.stencil[0].enabled = 1;
> +   dsa.stencil[0].func = PIPE_FUNC_ALWAYS;
> +   dsa.stencil[0].fail_op = PIPE_STENCIL_OP_REPLACE;
> +   dsa.stencil[0].zpass_op = PIPE_STENCIL_OP_REPLACE;
> +   dsa.stencil[0].zfail_op = PIPE_STENCIL_OP_REPLACE;
> +   dsa.stencil[0].valuemask = 0xff;
> +   dsa.stencil[0].writemask = 0xff;
> +
> +   /* create a depth stencil alpha state for each possible stencil clear
> +    * value */
> +   for (i = 0; i < 0xff; i++) {
> +      dsa.stencil[0].ref_value = i;
> +
> +      ctx->dsa_write_depth_stencil[i] =
> +         pipe->create_depth_stencil_alpha_state(pipe, &dsa);
> +   }

Ouch - that's an unexpectedly large number of state objects being
created for this path.

Can these be created on-demand / lazily?

Can you maybe limit this code to a (much) smaller maximum number of
simultaneously live states of this type?  Eg. 4 or 8 of them?  Creating
states isn't so terribly expensive, and this seems a bit excessive.
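
A rough sketch of what a small, lazily filled cache could look like. Everything here is an assumption for illustration: the struct, the function names, the cache size of 8, and the two callbacks that stand in for create_depth_stencil_alpha_state and delete_depth_stencil_alpha_state. It also assumes an evicted state is never still bound, which should hold if the blitter restores the saved state after each operation:

```c
#include <assert.h>
#include <stddef.h>

#define DSA_CACHE_SIZE 8   /* assumed limit, in the "4 or 8" range */

/* Hypothetical hooks standing in for the pipe_context create/delete calls. */
typedef void *(*create_dsa_fn)(void *pipe, unsigned stencil_ref);
typedef void (*delete_dsa_fn)(void *pipe, void *state);

struct dsa_cache {
   void *pipe;
   create_dsa_fn create;
   delete_dsa_fn destroy;
   unsigned ref[DSA_CACHE_SIZE];  /* stencil clear value held in each slot */
   void *state[DSA_CACHE_SIZE];   /* NULL means the slot is empty */
   unsigned next;                 /* round-robin eviction cursor */
};

static void *dsa_cache_get(struct dsa_cache *c, unsigned stencil)
{
   unsigned i, ref = stencil & 0xff;

   /* hit: reuse a state created earlier for this clear value */
   for (i = 0; i < DSA_CACHE_SIZE; i++)
      if (c->state[i] && c->ref[i] == ref)
         return c->state[i];

   /* miss: evict the oldest slot and create the state on demand */
   i = c->next;
   c->next = (c->next + 1) % DSA_CACHE_SIZE;
   if (c->state[i])
      c->destroy(c->pipe, c->state[i]);
   c->ref[i] = ref;
   c->state[i] = c->create(c->pipe, ref);
   return c->state[i];
}

/* Tiny mock "driver" so the sketch can be exercised on its own. */
static unsigned mock_dsa_created, mock_dsa_deleted;
static unsigned char mock_dsa_pool[256];

static void *mock_create_dsa(void *pipe, unsigned ref)
{
   (void)pipe;
   mock_dsa_created++;
   return &mock_dsa_pool[ref];
}

static void mock_delete_dsa(void *pipe, void *state)
{
   (void)pipe;
   (void)state;
   mock_dsa_deleted++;
}
```

Applications tend to clear stencil to only a handful of values, so a cache this small should rarely evict. At most DSA_CACHE_SIZE objects are live at once, against 255 created up front in the patch.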

> +   /* sampler state */
> +   memset(&sampler_state, 0, sizeof(sampler_state));
> +   sampler_state.wrap_s = PIPE_TEX_WRAP_CLAMP_TO_EDGE;
> +   sampler_state.wrap_t = PIPE_TEX_WRAP_CLAMP_TO_EDGE;
> +   sampler_state.wrap_r = PIPE_TEX_WRAP_CLAMP_TO_EDGE;
> +
> +   for (i = 0; i < 16; i++) {
> +      sampler_state.lod_bias = i;
> +      sampler_state.min_lod = i;
> +      sampler_state.max_lod = i;
> +
> +      ctx->sampler_state[i] = pipe->create_sampler_state(pipe, &sampler_state);
> +   }

Similarly, create on demand?  And use a PIPE_MAX_xxx enum for the loop?
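
As a sketch of the on-demand variant for the per-level sampler states: MAX_MIP_LEVELS is a placeholder for whichever PIPE_MAX_* define applies, and the struct, callback, and function names are hypothetical. Each state is created the first time its mip level is requested and reused afterwards:

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder constant; the real code would use a PIPE_MAX_* define. */
#define MAX_MIP_LEVELS 16

/* Hypothetical hook standing in for create_sampler_state. */
typedef void *(*create_sampler_fn)(void *pipe, unsigned level);

struct level_samplers {
   void *pipe;
   create_sampler_fn create;
   void *state[MAX_MIP_LEVELS];   /* NULL means not created yet */
};

static void *sampler_for_level(struct level_samplers *s, unsigned level)
{
   assert(level < MAX_MIP_LEVELS);

   /* create the state the first time this level is requested */
   if (!s->state[level])
      s->state[level] = s->create(s->pipe, level);
   return s->state[level];
}

/* Tiny mock so the sketch can be exercised on its own. */
static unsigned mock_sampler_count;
static unsigned char mock_sampler_pool[MAX_MIP_LEVELS];

static void *mock_create_sampler(void *pipe, unsigned level)
{
   (void)pipe;
   mock_sampler_count++;
   return &mock_sampler_pool[level];
}
```

The destroy path then only needs to delete the non-NULL entries, looping up to the same constant.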

Keith



Re: gallium: add blitter

by michal-9

Keith Whitwell wrote:

> On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
>  
>> +static INLINE
>> +void util_blitter_save_fragment_sampler_states(
>> +                  struct blitter_context *blitter,
>> +                  int num_sampler_states,
>> +                  void **sampler_states)
>> +{
>> +   assert(num_textures <= 32);
>> +
>> +   blitter->saved_num_sampler_states = num_sampler_states;
>> +   memcpy(blitter->saved_sampler_states, sampler_states,
>> +          num_sampler_states * sizeof(void *));
>> +}
>> +
>>    
>
> Have you tried compiling with debug enabled?  The assert above fails to
> compile.  Also, can you use Elements() or similar instead of the
> hard-coded 32?
>
> Maybe we can figure out how to go back to having asserts keep exposing
> their contents to the compiler even on non-debug builds.  This used to
> work without problem on linux and helped a lot to avoid these type of
> problems.
>
>  
Precisely. Recently I've been thinking about mapping assert() to
__assume() for non-debug builds on Windows with MSVC.

http://msdn.microsoft.com/en-us/library/1b3fsfxw%28VS.80%29.aspx

Re: gallium: add blitter

by Jose Fonseca

On Mon, 2009-12-14 at 05:39 -0800, Keith Whitwell wrote:

> On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
> >
> > +static INLINE
> > +void util_blitter_save_fragment_sampler_states(
> > +                  struct blitter_context *blitter,
> > +                  int num_sampler_states,
> > +                  void **sampler_states)
> > +{
> > +   assert(num_textures <= 32);
> > +
> > +   blitter->saved_num_sampler_states = num_sampler_states;
> > +   memcpy(blitter->saved_sampler_states, sampler_states,
> > +          num_sampler_states * sizeof(void *));
> > +}
> > +
>
> Have you tried compiling with debug enabled?  The assert above fails to
> compile.  Also, can you use Elements() or similar instead of the
> hard-coded 32?
>
> Maybe we can figure out how to go back to having asserts keep exposing
> their contents to the compiler even on non-debug builds.  This used to
> work without problem on linux and helped a lot to avoid these type of
> problems.

I wouldn't say without a problem: defining assert(expr) as (void)0
instead of (void)(expr) on release builds yielded a non-negligible
performance improvement. I don't recall the exact figure, but I believe
it was 3-5% for the driver I was benchmarking at the time. YMMV.
Different drivers will give different results, but there's nothing
platform specific about this.
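
To make the two release-build definitions concrete, here is a small sketch; the macro and function names are made up for illustration and are not the actual Gallium ones. The counter shows whether the asserted expression is evaluated at all.

```c
#include <assert.h>

static int check_calls;   /* counts how often the check actually runs */

static int expensive_check(void)
{
   check_calls++;
   return 1;
}

/* Release-build variant 1: the expression disappears before compilation. */
#define release_assert_discard(expr)  ((void)0)

/* Release-build variant 2: the expression is still compiled and evaluated;
 * only its result is ignored. */
#define release_assert_evaluate(expr) ((void)(expr))

static int run_discard(void)
{
   check_calls = 0;
   release_assert_discard(expensive_check());
   return check_calls;   /* the check was never called */
}

static int run_evaluate(void)
{
   check_calls = 0;
   release_assert_evaluate(expensive_check());
   return check_calls;   /* the check ran once */
}
```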

I believe the problem is we sometimes have

  assert(very_expensive_check());

and it should really be

#ifdef DEBUG
  assert(very_expensive_check());
#endif

We could go through the files with a fine-toothed comb and fix it, but
it's quite likely that this sort of check will creep back in unnoticed and
the whole thing will repeat. Between having debug builds temporarily broken
and slower release builds, I'm personally for the former.

No surprise that (void)0 is the common practice: glibc, MS's headers,
etc. all do that.

Also, I don't understand why a developer wouldn't want to use a debug
build unless he's profiling. I don't see why we should make it easy for a
developer not to test his code, and running a debug build is the bare
minimum.

Jose


Re: gallium: add blitter

by Jose Fonseca

On Mon, 2009-12-14 at 03:52 -0800, Keith Whitwell wrote:

> On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
> > Hi Keith,
> >
> > I've finished the blitter module. It fully implements the clear,
> > surface_copy, and surface_fill functions. It properly fallbacks to
> > software in case a surface cannot be sampled or rendered to according
> > to usage. Copying a stencil buffer always fallbacks unless the
> > ignore_stencil parameter (see util_blitter_copy) is set to TRUE. To my
> > knowledge, GPUs cannot copy the stencil buffer (not sure if fiddling
> > with texture formats can help). It's all documented in u_blitter.h.
> >
> > The pipe driver can optionally hook up a function to draw a quad
> > (blitter_context::draw_quad). I realized that embedding 4 vertices
> > into a command stream (AKA immediate mode) is much faster than writing
> > them to a vertex buffer due to reduced driver overhead. It might be
> > worth to consider adding the draw_quad function to pipe_context.
> >
> > When working on the blitter, I added the following things to
> > util/u_simple_shaders:
> > - util_make_fragment_tex_shader has a new parametr tex_target and the
> > value should be one of TGSI_TEXTURE_* enums so that it can be used to
> > sample from any kind of texture.
> > - Added util_make_fragment_tex_shader_writedepth, which writes depth
> > sampled from a texture. It's used for copying depth textures.
> > - Added util_make_fragment_clonecolor_shader, which copies input
> > COLOR[0] to a specified number of render targets. It's used to clear
> > MRTs.
> >
> > Also, I moved the code for converting 2D texture coordinates into
> > cubemap texture coordinates from u_gen_mipmap to a new function in
> > util/u_texture.
> >
> > Please review/push.
> >
> > Once it gets approved, I will send patches with r300g blit support to
> > Corbin. With this work, untiling a texture will be as easy as calling
> > surface_copy whereas the driver state remains intact (theoretically).
>
> Marek,
>
> This all looks great.  Many thanks for adding this functionality - I'm
> sure we'll be building on it in many ways going forward.
>
> I'll push the patches intact, but one thing we need to start thinking
> about is the mix of code in the util/ directory -- there's some stuff in
> there that's only legal/useful for state-trackers, some that's likewise
> only legal for drivers, and a lot that is valid everywhere.  At some
> stage we want to split that up.

I plan to split the OS-specific stuff out soon. I'm referring to memory
allocation, debug printing, file abstraction, etc.: all the stuff that is
not Gallium-related and is needed everywhere.

Jose


Re: gallium: add blitter

by Jose Fonseca

On Mon, 2009-12-14 at 03:52 -0800, Keith Whitwell wrote:

> On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
> > Hi Keith,
> >
> > I've finished the blitter module. It fully implements the clear,
> > surface_copy, and surface_fill functions. It properly fallbacks to
> > software in case a surface cannot be sampled or rendered to according
> > to usage. Copying a stencil buffer always fallbacks unless the
> > ignore_stencil parameter (see util_blitter_copy) is set to TRUE. To my
> > knowledge, GPUs cannot copy the stencil buffer (not sure if fiddling
> > with texture formats can help). It's all documented in u_blitter.h.
> >
> > The pipe driver can optionally hook up a function to draw a quad
> > (blitter_context::draw_quad). I realized that embedding 4 vertices
> > into a command stream (AKA immediate mode) is much faster than writing
> > them to a vertex buffer due to reduced driver overhead. It might be
> > worth to consider adding the draw_quad function to pipe_context.
> >
> > When working on the blitter, I added the following things to
> > util/u_simple_shaders:
> > - util_make_fragment_tex_shader has a new parametr tex_target and the
> > value should be one of TGSI_TEXTURE_* enums so that it can be used to
> > sample from any kind of texture.
> > - Added util_make_fragment_tex_shader_writedepth, which writes depth
> > sampled from a texture. It's used for copying depth textures.
> > - Added util_make_fragment_clonecolor_shader, which copies input
> > COLOR[0] to a specified number of render targets. It's used to clear
> > MRTs.
> >
> > Also, I moved the code for converting 2D texture coordinates into
> > cubemap texture coordinates from u_gen_mipmap to a new function in
> > util/u_texture.
> >
> > Please review/push.
> >
> > Once it gets approved, I will send patches with r300g blit support to
> > Corbin. With this work, untiling a texture will be as easy as calling
> > surface_copy whereas the driver state remains intact (theoretically).
>
> Marek,
>
> This all looks great.  Many thanks for adding this functionality - I'm
> sure we'll be building on it in many ways going forward.

Nice stuff indeed.

FWIW, I also think that setting a reasonable functionality bar, instead of
querying the pipe for every little capability, will benefit us in the
long term. It worked well for vertex processing and hardware-unsupported
API quirks (via the draw module); it's nice to see the same for blits, and I
hope this becomes a trend.

It not only makes things less complex: having all pipe drivers with
similar capabilities is what allows us to plug'n'play pipe drivers, do
things like replay a trace of one driver on top of another, and perhaps in
the future write drivers that do differential analysis against a reference
one, etc.

Jose


Re: gallium: add blitter

by Keith Whitwell-3

On Mon, 2009-12-14 at 08:04 -0800, José Fonseca wrote:

> On Mon, 2009-12-14 at 05:39 -0800, Keith Whitwell wrote:
> > On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
> > >
> > > +static INLINE
> > > +void util_blitter_save_fragment_sampler_states(
> > > +                  struct blitter_context *blitter,
> > > +                  int num_sampler_states,
> > > +                  void **sampler_states)
> > > +{
> > > +   assert(num_textures <= 32);
> > > +
> > > +   blitter->saved_num_sampler_states = num_sampler_states;
> > > +   memcpy(blitter->saved_sampler_states, sampler_states,
> > > +          num_sampler_states * sizeof(void *));
> > > +}
> > > +
> >
> > Have you tried compiling with debug enabled?  The assert above fails to
> > compile.  Also, can you use Elements() or similar instead of the
> > hard-coded 32?
> >
> > Maybe we can figure out how to go back to having asserts keep exposing
> > their contents to the compiler even on non-debug builds.  This used to
> > work without problem on linux and helped a lot to avoid these type of
> > problems.
>
> I wouldn't say without a problem: defining assert(expr) as (void)0
> instead of (void)(expr) on release builds yielded a non-negligible
> performance improvement. I don't recall the exact figure, but I believe
> it was the 3-5% for the driver I was benchmarking at the time. YMMV.
> Different drivers will give different results, but there's nothing
> platform specific about this.

It's not hard to avoid executing code...  For instance we could always
have it translated to something like:

  if (0) {
    (void)(expr);
  }
  (void)(0)



> I believe the problem is we sometimes have
>
>   assert(very_expensive_check());
>
> and it should be really
>
> #ifdef DEBUG
>   assert(very_expensive_check());
> #endf

I think the above translation is fine, without the extra ifdefs.

> We could go through the files with a fine-toothed comb and fix it, but
> it's quite likely this sort of checks creep back in unnoticed and the
> thing repeats again. Between having debug builds temporarily broken and
> slower release builds I personally I'm for the former.
>
> No suprise that (void)0 is the common practice: glibc, ms's headers,
> etc. all do that.

> Also, I don't understand why a developer wouldn't want to use a debug
> build unless he's profiling. I don't see why we should make easy for a
> developer not to test its code, and running a debug build is the bare
> minimum.

There are other issues as well, such as unused variable warnings for
vars used only in asserts, etc.

Keith





Re: gallium: add blitter

by Keith Whitwell-3

On Mon, 2009-12-14 at 08:19 -0800, Keith Whitwell wrote:

> On Mon, 2009-12-14 at 08:04 -0800, José Fonseca wrote:
> > On Mon, 2009-12-14 at 05:39 -0800, Keith Whitwell wrote:
> > > On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
> > > >
> > > > +static INLINE
> > > > +void util_blitter_save_fragment_sampler_states(
> > > > +                  struct blitter_context *blitter,
> > > > +                  int num_sampler_states,
> > > > +                  void **sampler_states)
> > > > +{
> > > > +   assert(num_textures <= 32);
> > > > +
> > > > +   blitter->saved_num_sampler_states = num_sampler_states;
> > > > +   memcpy(blitter->saved_sampler_states, sampler_states,
> > > > +          num_sampler_states * sizeof(void *));
> > > > +}
> > > > +
> > >
> > > Have you tried compiling with debug enabled?  The assert above fails to
> > > compile.  Also, can you use Elements() or similar instead of the
> > > hard-coded 32?
> > >
> > > Maybe we can figure out how to go back to having asserts keep exposing
> > > their contents to the compiler even on non-debug builds.  This used to
> > > work without problem on linux and helped a lot to avoid these type of
> > > problems.
> >
> > I wouldn't say without a problem: defining assert(expr) as (void)0
> > instead of (void)(expr) on release builds yielded a non-negligible
> > performance improvement. I don't recall the exact figure, but I believe
> > it was the 3-5% for the driver I was benchmarking at the time. YMMV.
> > Different drivers will give different results, but there's nothing
> > platform specific about this.
>
> It's not hard to avoid excuting code...  For instance we could always
> have it translated to something like:
>
>   if (0) {
>     (void)(expr);
>   }
>   (void)(0)
>

Obviously I would have meant to say something cleaner like:

 do {
   if (0) { (void)(expr);  }
 }
 while (0)
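
Spelled out as a macro, this could look roughly like the sketch below. release_assert_checked and the helper functions are hypothetical names. The expression is still parsed and type-checked, so a misspelled identifier would fail to compile, but the constant-false branch means it is never evaluated at run time, and variables referenced only inside it still count as used.

```c
#include <assert.h>

static int guarded_calls;   /* counts evaluations of the asserted expression */

static int guarded_check(int value)
{
   guarded_calls++;
   return value >= 0;
}

/* Hypothetical release-build assert: expr stays visible to the compiler
 * but sits in a branch that is never taken. */
#define release_assert_checked(expr) \
   do { if (0) { (void)(expr); } } while (0)

static int run_checked(int index)
{
   int limit = 32;   /* referenced only inside the assertion */

   guarded_calls = 0;
   release_assert_checked(guarded_check(index) && index <= limit);
   return guarded_calls;   /* stays 0: the check is never called */
}
```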

Keith


Re: gallium: add blitter

by Corbin Simpson

As for immediate verts, why don't we just add support to r300g to switch to immediate mode for small VBOs?

Posting from a mobile, pardon my terseness. ~ C.

On Dec 13, 2009 3:28 PM, "Marek Olšák" <maraeo@...> wrote:

Hi Keith,

I've finished the blitter module. It fully implements the clear,
surface_copy, and surface_fill functions. It properly fallbacks to
software in case a surface cannot be sampled or rendered to according
to usage. Copying a stencil buffer always fallbacks unless the
ignore_stencil parameter (see util_blitter_copy) is set to TRUE. To my
knowledge, GPUs cannot copy the stencil buffer (not sure if fiddling
with texture formats can help). It's all documented in u_blitter.h.

The pipe driver can optionally hook up a function to draw a quad
(blitter_context::draw_quad). I realized that embedding 4 vertices
into a command stream (AKA immediate mode) is much faster than writing
them to a vertex buffer due to reduced driver overhead. It might be
worth to consider adding the draw_quad function to pipe_context.

When working on the blitter, I added the following things to
util/u_simple_shaders:
- util_make_fragment_tex_shader has a new parametr tex_target and the
value should be one of TGSI_TEXTURE_* enums so that it can be used to
sample from any kind of texture.
- Added util_make_fragment_tex_shader_writedepth, which writes depth
sampled from a texture. It's used for copying depth textures.
- Added util_make_fragment_clonecolor_shader, which copies input
COLOR[0] to a specified number of render targets. It's used to clear
MRTs.

Also, I moved the code for converting 2D texture coordinates into
cubemap texture coordinates from u_gen_mipmap to a new function in
util/u_texture.

Please review/push.

Once it gets approved, I will send patches with r300g blit support to
Corbin. With this work, untiling a texture will be as easy as calling
surface_copy whereas the driver state remains intact (theoretically).

Cheers.

Marek

On Thu, Dec 10, 2009 at 6:23 PM, Keith Whitwell <keithw@...> wrote:
> On Thu, 2009-12-10 at 01:52 -0800, Marek Olšák wrote:
>> Keith,
>>
>> I've taken your comment into consideration and started laying out a
>> new simple driver module which I call Blitter. The idea is to provide
>> acceleration for operations like clear, surface_copy, and
>> surface_fill. The module doesn't depend on a CSO context, instead, a
>> driver must call appropriate util_blitter_save* functions to save CSOs
>> and a blit operation takes care of their restoration once it's done.
>>
>> I attached a patch illustrating the idea with the clear implemented
>> and a working example of usage, but it's not ready to get pushed yet.
>>
>> Please tell me what you think about it.
>
> Marek,
>
> This looks good to me.  It looks like this approach keeps the
> implementation entirely on the driver side of the interface, which is
> what I was hoping for.
>
> I had assumed that doing this type of operation in the driver would
> require assistance "from above" for saving and restoring state.  But it
> seems like you've been able to do without that, which is nice.
>
> Let me know how it progresses.
>
> Keith
>
>

Re: gallium: add blitter

by Jose Fonseca

On Mon, 2009-12-14 at 08:22 -0800, Keith Whitwell wrote:

> On Mon, 2009-12-14 at 08:19 -0800, Keith Whitwell wrote:
> > On Mon, 2009-12-14 at 08:04 -0800, José Fonseca wrote:
> > > On Mon, 2009-12-14 at 05:39 -0800, Keith Whitwell wrote:
> > > > On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
> > > > >
> > > > > +static INLINE
> > > > > +void util_blitter_save_fragment_sampler_states(
> > > > > +                  struct blitter_context *blitter,
> > > > > +                  int num_sampler_states,
> > > > > +                  void **sampler_states)
> > > > > +{
> > > > > +   assert(num_textures <= 32);
> > > > > +
> > > > > +   blitter->saved_num_sampler_states = num_sampler_states;
> > > > > +   memcpy(blitter->saved_sampler_states, sampler_states,
> > > > > +          num_sampler_states * sizeof(void *));
> > > > > +}
> > > > > +
> > > >
> > > > Have you tried compiling with debug enabled?  The assert above fails to
> > > > compile.  Also, can you use Elements() or similar instead of the
> > > > hard-coded 32?
> > > >
> > > > Maybe we can figure out how to go back to having asserts keep exposing
> > > > their contents to the compiler even on non-debug builds.  This used to
> > > > work without problem on linux and helped a lot to avoid these type of
> > > > problems.
> > >
> > > I wouldn't say without a problem: defining assert(expr) as (void)0
> > > instead of (void)(expr) on release builds yielded a non-negligible
> > > performance improvement. I don't recall the exact figure, but I believe
> > > it was the 3-5% for the driver I was benchmarking at the time. YMMV.
> > > Different drivers will give different results, but there's nothing
> > > platform specific about this.
> >
> > It's not hard to avoid excuting code...  For instance we could always
> > have it translated to something like:
> >
> >   if (0) {
> >     (void)(expr);
> >   }
> >   (void)(0)
> >
>
> Obviously I would have meant to say something cleaner like:
>
>  do {
>    if (0) { (void)(expr);  }
>  }
>  while (0)

This only works if expr has no calls, or just inline calls. Using my
earlier example, if very_expensive_check() is in another file then the
compiler has to assume the function will have side effects, and the call
can't be removed.

I'm not sure the __assume keyword that Michal mentioned helps. It's more of a
hint to the compiler to help it optimize code around the assertion, but
perhaps it helps with the warnings too.

Jose


Re: gallium: add blitter

by Younes Manton

On Mon, Dec 14, 2009 at 11:42 AM, Corbin Simpson
<mostawesomedude@...> wrote:
> As far as immediate verts, why don't we just add support to r300g to switch
> to immediate mode for small VBOs?
>
> Posting from a mobile, pardon my terseness. ~ C.

That was what I was thinking for Nouveau, silently create a user
buffer for size < some threshold and when we get a draw call with a
user vertex buffer submit it in immediate mode.
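
A rough sketch of that two-step idea, with every name and the threshold value invented for illustration (none of this is Nouveau or r300g code): the buffer is flagged as a user buffer at creation time when it is small, and the draw path picks immediate mode based on that flag.

```c
#include <assert.h>
#include <stddef.h>

#define IMMEDIATE_MODE_MAX_BYTES 256  /* hypothetical size threshold */

enum demo_submit_path {
   DEMO_SUBMIT_IMMEDIATE,      /* vertices embedded in the command stream */
   DEMO_SUBMIT_VERTEX_BUFFER   /* vertices fetched from a buffer object */
};

struct demo_buffer {
   size_t size;
   int is_user;   /* non-zero when backed by plain user memory */
};

/* Small buffers silently become user buffers. */
static struct demo_buffer demo_buffer_create(size_t size)
{
   struct demo_buffer buf;

   buf.size = size;
   buf.is_user = size < IMMEDIATE_MODE_MAX_BYTES;
   return buf;
}

/* At draw time, user vertex buffers go through immediate mode. */
static enum demo_submit_path demo_draw_path(const struct demo_buffer *buf)
{
   return buf->is_user ? DEMO_SUBMIT_IMMEDIATE : DEMO_SUBMIT_VERTEX_BUFFER;
}
```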

Re: gallium: add blitter

by michal-9

José Fonseca wrote:

> On Mon, 2009-12-14 at 08:22 -0800, Keith Whitwell wrote:
>  
>> On Mon, 2009-12-14 at 08:19 -0800, Keith Whitwell wrote:
>>    
>>> On Mon, 2009-12-14 at 08:04 -0800, José Fonseca wrote:
>>>      
>>>> On Mon, 2009-12-14 at 05:39 -0800, Keith Whitwell wrote:
>>>>        
>>>>> On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
>>>>>          
>>>>>> +static INLINE
>>>>>> +void util_blitter_save_fragment_sampler_states(
>>>>>> +                  struct blitter_context *blitter,
>>>>>> +                  int num_sampler_states,
>>>>>> +                  void **sampler_states)
>>>>>> +{
>>>>>> +   assert(num_textures <= 32);
>>>>>> +
>>>>>> +   blitter->saved_num_sampler_states = num_sampler_states;
>>>>>> +   memcpy(blitter->saved_sampler_states, sampler_states,
>>>>>> +          num_sampler_states * sizeof(void *));
>>>>>> +}
>>>>>> +
>>>>>>            
>>>>> Have you tried compiling with debug enabled?  The assert above fails to
>>>>> compile.  Also, can you use Elements() or similar instead of the
>>>>> hard-coded 32?
>>>>>
>>>>> Maybe we can figure out how to go back to having asserts keep exposing
>>>>> their contents to the compiler even on non-debug builds.  This used to
>>>>> work without problem on linux and helped a lot to avoid these type of
>>>>> problems.
>>>>>          
>>>> I wouldn't say without a problem: defining assert(expr) as (void)0
>>>> instead of (void)(expr) on release builds yielded a non-negligible
>>>> performance improvement. I don't recall the exact figure, but I believe
>>>> it was the 3-5% for the driver I was benchmarking at the time. YMMV.
>>>> Different drivers will give different results, but there's nothing
>>>> platform specific about this.
>>>>        
>>> It's not hard to avoid excuting code...  For instance we could always
>>> have it translated to something like:
>>>
>>>   if (0) {
>>>     (void)(expr);
>>>   }
>>>   (void)(0)
>>>
>>>      
>> Obviously I would have meant to say something cleaner like:
>>
>>  do {
>>    if (0) { (void)(expr);  }
>>  }
>>  while (0)
>>    
>
> This only works if expr has no calls, or just inline calls. Using my
> earlier example, if very_expensive_check() is in another file then the
> compiler has to assume the function will have side effects, and the call
> can't be removed.
>
> I'm not sure __assume keyword that Michal mentioned helps. It's more a
> hint to the compiler to help him optimize code around the assertion, but
> perhaps it helps with the warnings too.
>
>  
If I try to compile this:

__assume(lalala);

I get:

error C2065: 'lalala' : undeclared identifier

On the other hand, the compiler is going to take the assumptions inside
__assume() seriously, and if they happen to be false, the application
may not behave as expected. This goes against the current Gallium
paradigm, where we put in assertions but also do the same check in
non-debug builds to early out from a function or provide default values
(e.g. in switch-case statements).
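
As an illustration of that paradigm, here is a made-up example (the enum and function are not Gallium code): the default case asserts in debug builds and still returns a safe value in release builds. If the assertion were mapped to __assume, the compiler would be entitled to treat the fallback as unreachable.

```c
#include <assert.h>

/* Hypothetical format enum used only for this sketch. */
enum demo_format {
   DEMO_FORMAT_R8,
   DEMO_FORMAT_RGBA8
};

static unsigned demo_format_size(enum demo_format format)
{
   switch (format) {
   case DEMO_FORMAT_R8:
      return 1;
   case DEMO_FORMAT_RGBA8:
      return 4;
   default:
      assert(0);   /* debug builds stop here */
      return 0;    /* release builds fall back to a default value */
   }
}
```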

Re: gallium: add blitter

by Jose Fonseca

On Mon, 2009-12-14 at 08:58 -0800, michal wrote:

> José Fonseca wrote:
> > On Mon, 2009-12-14 at 08:22 -0800, Keith Whitwell wrote:
> >  
> >> On Mon, 2009-12-14 at 08:19 -0800, Keith Whitwell wrote:
> >>    
> >>> On Mon, 2009-12-14 at 08:04 -0800, José Fonseca wrote:
> >>>      
> >>>> On Mon, 2009-12-14 at 05:39 -0800, Keith Whitwell wrote:
> >>>>        
> >>>>> On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
> >>>>>          
> >>>>>> +static INLINE
> >>>>>> +void util_blitter_save_fragment_sampler_states(
> >>>>>> +                  struct blitter_context *blitter,
> >>>>>> +                  int num_sampler_states,
> >>>>>> +                  void **sampler_states)
> >>>>>> +{
> >>>>>> +   assert(num_textures <= 32);
> >>>>>> +
> >>>>>> +   blitter->saved_num_sampler_states = num_sampler_states;
> >>>>>> +   memcpy(blitter->saved_sampler_states, sampler_states,
> >>>>>> +          num_sampler_states * sizeof(void *));
> >>>>>> +}
> >>>>>> +
> >>>>>>            
> >>>>> Have you tried compiling with debug enabled?  The assert above fails to
> >>>>> compile.  Also, can you use Elements() or similar instead of the
> >>>>> hard-coded 32?
> >>>>>
> >>>>> Maybe we can figure out how to go back to having asserts keep exposing
> >>>>> their contents to the compiler even on non-debug builds.  This used to
> >>>>> work without problem on linux and helped a lot to avoid these type of
> >>>>> problems.
> >>>>>          
> >>>> I wouldn't say without a problem: defining assert(expr) as (void)0
> >>>> instead of (void)(expr) on release builds yielded a non-negligible
> >>>> performance improvement. I don't recall the exact figure, but I believe
> >>>> it was the 3-5% for the driver I was benchmarking at the time. YMMV.
> >>>> Different drivers will give different results, but there's nothing
> >>>> platform specific about this.
> >>>>        
> >>> It's not hard to avoid excuting code...  For instance we could always
> >>> have it translated to something like:
> >>>
> >>>   if (0) {
> >>>     (void)(expr);
> >>>   }
> >>>   (void)(0)
> >>>
> >>>      
> >> Obviously I would have meant to say something cleaner like:
> >>
> >>  do {
> >>    if (0) { (void)(expr);  }
> >>  }
> >>  while (0)
> >>    
> >
> > This only works if expr has no calls, or just inline calls. Using my
> > earlier example, if very_expensive_check() is in another file then the
> > compiler has to assume the function will have side effects, and the call
> > can't be removed.
> >
> > I'm not sure __assume keyword that Michal mentioned helps. It's more a
> > hint to the compiler to help him optimize code around the assertion, but
> > perhaps it helps with the warnings too.
> >
> >  
> If I try to compile this:
>
> __assume(lalala);
>
> I get:
>
> error C2065: 'lalala' : undeclared identifier
>
> On the other side, the compiler is going to be serious about the
> assumptions inside __assume(), and if they happen to be false, the
> application can behave not as expected. This is against current gallium
> paradigm, where we put assertions, but also do the same check in
> non-debug builds to early out from a function or provide default values
> (e.g. in switch-case statements).

Bummer... that's no good.

Jose


Re: gallium: add blitter

by michal-9

José Fonseca wrote:

> On Mon, 2009-12-14 at 08:58 -0800, michal wrote:
>  
>> José Fonseca wrote:
>>    
>>> On Mon, 2009-12-14 at 08:22 -0800, Keith Whitwell wrote:
>>>  
>>>      
>>>> On Mon, 2009-12-14 at 08:19 -0800, Keith Whitwell wrote:
>>>>    
>>>>        
>>>>> On Mon, 2009-12-14 at 08:04 -0800, José Fonseca wrote:
>>>>>      
>>>>>          
>>>>>> On Mon, 2009-12-14 at 05:39 -0800, Keith Whitwell wrote:
>>>>>>        
>>>>>>            
>>>>>>> On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
>>>>>>>          
>>>>>>>              
>>>>>>>> +static INLINE
>>>>>>>> +void util_blitter_save_fragment_sampler_states(
>>>>>>>> +                  struct blitter_context *blitter,
>>>>>>>> +                  int num_sampler_states,
>>>>>>>> +                  void **sampler_states)
>>>>>>>> +{
>>>>>>>> +   assert(num_textures <= 32);
>>>>>>>> +
>>>>>>>> +   blitter->saved_num_sampler_states = num_sampler_states;
>>>>>>>> +   memcpy(blitter->saved_sampler_states, sampler_states,
>>>>>>>> +          num_sampler_states * sizeof(void *));
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>            
>>>>>>>>                
>>>>>>> Have you tried compiling with debug enabled?  The assert above fails to
>>>>>>> compile.  Also, can you use Elements() or similar instead of the
>>>>>>> hard-coded 32?
>>>>>>>
>>>>>>> Maybe we can figure out how to go back to having asserts keep exposing
>>>>>>> their contents to the compiler even on non-debug builds.  This used to
>>>>>>> work without problem on Linux and helped a lot to avoid this type of
>>>>>>> problem.
>>>>>>>
>>>>>> I wouldn't say without a problem: defining assert(expr) as (void)0
>>>>>> instead of (void)(expr) on release builds yielded a non-negligible
>>>>>> performance improvement. I don't recall the exact figure, but I believe
>>>>>> it was in the 3-5% range for the driver I was benchmarking at the time.
>>>>>> YMMV. Different drivers will give different results, but there's nothing
>>>>>> platform specific about this.
>>>>>>
>>>>> It's not hard to avoid executing code...  For instance we could always
>>>>> have it translated to something like:
>>>>>
>>>>>   if (0) {
>>>>>     (void)(expr);
>>>>>   }
>>>>>   (void)(0)
>>>>>
>>>>>      
>>>>>          
>>>> Obviously I would have meant to say something cleaner like:
>>>>
>>>>  do {
>>>>    if (0) { (void)(expr);  }
>>>>  }
>>>>  while (0)
>>>>    
>>>>        
>>> This only works if expr has no calls, or just inline calls. Using my
>>> earlier example, if very_expensive_check() is in another file then the
>>> compiler has to assume the function will have side effects, and the call
>>> can't be removed.
>>>
>>> I'm not sure the __assume keyword that Michal mentioned helps. It's more a
>>> hint to the compiler to help it optimize code around the assertion, but
>>> perhaps it helps with the warnings too.
>>>
>>>  
>>>      
>> If I try to compile this:
>>
>> __assume(lalala);
>>
>> I get:
>>
>> error C2065: 'lalala' : undeclared identifier
>>
>> On the other hand, the compiler takes the assumptions inside __assume()
>> seriously, and if they happen to be false, the application may not behave
>> as expected. This goes against the current gallium paradigm, where we put
>> in assertions but also do the same check in non-debug builds to return
>> early from a function or provide default values (e.g. in switch-case
>> statements).
>>    
>
> Bummer... that's no good.
>
>
>  
On the third hand, we could transform the following idiom

switch (foo) {
case 1:
   bar = 22;
   break;
default:
   assert(0);
   bar = 11;   /* Safe value. */
}

to use some flavour of assert() that doesn't get substituted with
__assume() on non-debug builds. Something like weak_assert() or
warning(). Then assert() could be used in places where there is no
backup plan and the app is going to crash anyway.

Or... do the opposite and introduce strong_assert() that translates to
__assume() and leave assert() as it is now.
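
A minimal sketch of the split described above, assuming the hypothetical macro names weak_assert and strong_assert from this message (gallium does not define them) and an assumed DEBUG build flag. In non-debug builds strong_assert becomes an optimizer hint (__assume on MSVC, __builtin_unreachable on GCC/Clang), while weak_assert compiles to nothing so the fallback code behind it stays valid.

```c
#include <assert.h>

/* weak_assert/strong_assert are the hypothetical names from the message
 * above; gallium does not define them. DEBUG is an assumed build flag. */
#ifdef DEBUG
#  define weak_assert(expr)   assert(expr)
#  define strong_assert(expr) assert(expr)
#else
/* Release: the weak check vanishes, so the fallback code stays reachable. */
#  define weak_assert(expr)   ((void)0)
#  if defined(_MSC_VER)
#    define strong_assert(expr) __assume(expr)
#  elif defined(__GNUC__)
#    define strong_assert(expr) \
        do { if (!(expr)) __builtin_unreachable(); } while (0)
#  else
#    define strong_assert(expr) ((void)0)
#  endif
#endif

/* Fallback idiom: an unexpected value still produces a safe result. */
int pick_bar(int foo)
{
   int bar;

   switch (foo) {
   case 1:
      bar = 22;
      break;
   default:
      weak_assert(0);   /* must never turn into __assume(0) */
      bar = 11;         /* safe value */
      break;
   }
   return bar;
}

/* No backup plan: the caller must guarantee count > 0. */
int first_value(const int *values, int count)
{
   strong_assert(count > 0);
   return values[0];
}
```

The point of keeping two macros is that only the strong form lets the compiler assume the condition; using it in pick_bar() would allow the optimizer to drop the "safe value" path entirely.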

Re: gallium: add blitter

by Keith Whitwell-3

On Mon, 2009-12-14 at 08:51 -0800, José Fonseca wrote:

> On Mon, 2009-12-14 at 08:22 -0800, Keith Whitwell wrote:
> > On Mon, 2009-12-14 at 08:19 -0800, Keith Whitwell wrote:
> > > On Mon, 2009-12-14 at 08:04 -0800, José Fonseca wrote:
> > > > On Mon, 2009-12-14 at 05:39 -0800, Keith Whitwell wrote:
> > > > > On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
> > > > > >
> > > > > > +static INLINE
> > > > > > +void util_blitter_save_fragment_sampler_states(
> > > > > > +                  struct blitter_context *blitter,
> > > > > > +                  int num_sampler_states,
> > > > > > +                  void **sampler_states)
> > > > > > +{
> > > > > > +   assert(num_textures <= 32);
> > > > > > +
> > > > > > +   blitter->saved_num_sampler_states = num_sampler_states;
> > > > > > +   memcpy(blitter->saved_sampler_states, sampler_states,
> > > > > > +          num_sampler_states * sizeof(void *));
> > > > > > +}
> > > > > > +
> > > > >
> > > > > Have you tried compiling with debug enabled?  The assert above fails to
> > > > > compile.  Also, can you use Elements() or similar instead of the
> > > > > hard-coded 32?
> > > > >
> > > > > > Maybe we can figure out how to go back to having asserts keep exposing
> > > > > > their contents to the compiler even on non-debug builds.  This used to
> > > > > > work without problem on Linux and helped a lot to avoid this type of
> > > > > > problem.
> > > > >
> > > > > I wouldn't say without a problem: defining assert(expr) as (void)0
> > > > > instead of (void)(expr) on release builds yielded a non-negligible
> > > > > performance improvement. I don't recall the exact figure, but I believe
> > > > > it was in the 3-5% range for the driver I was benchmarking at the time.
> > > > > YMMV. Different drivers will give different results, but there's nothing
> > > > > platform specific about this.
> > > >
> > > > It's not hard to avoid executing code...  For instance we could always
> > > have it translated to something like:
> > >
> > >   if (0) {
> > >     (void)(expr);
> > >   }
> > >   (void)(0)
> > >
> >
> > Obviously I would have meant to say something cleaner like:
> >
> >  do {
> >    if (0) { (void)(expr);  }
> >  }
> >  while (0)
>
> This only works if expr has no calls, or just inline calls. Using my
> earlier example, if very_expensive_check() is in another file then the
> compiler has to assume the function will have side effects, and the call
> can't be removed.

What call?!?

  if (0) do_something_with_side_effects();

Has no side effects.

Keith
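
A small sketch of the point made above, under the assumption of hypothetical macro and function names (release_assert_sketch, debug_assert_sketch, very_expensive_check). The expression inside "if (0)" is still parsed and type-checked, so a misspelled identifier such as num_textures would fail to compile, but the call is never executed even when the function lives in another file.

```c
#include <assert.h>

/* Debug flavour: evaluates the expression (unless NDEBUG is defined). */
#define debug_assert_sketch(expr) assert(expr)

/* Release flavour: expr is parsed and type-checked, but sits in dead
 * code and is never evaluated. */
#define release_assert_sketch(expr) \
   do { if (0) { (void)(expr); } } while (0)

/* Counts how often the check really ran. */
static int check_calls = 0;

/* Stand-in for a costly validation with a visible side effect. */
int very_expensive_check(void)
{
   check_calls++;
   return 1;
}

/* Uses the release flavour; the check must not run. */
int calls_after_release_assert(void)
{
   release_assert_sketch(very_expensive_check());
   return check_calls;
}

/* Uses the debug flavour; the check runs when asserts are enabled. */
int calls_after_debug_assert(void)
{
   debug_assert_sketch(very_expensive_check());
   return check_calls;
}
```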


Re: gallium: add blitter

by Jose Fonseca

On Mon, 2009-12-14 at 09:28 -0800, Keith Whitwell wrote:

> On Mon, 2009-12-14 at 08:51 -0800, José Fonseca wrote:
> > On Mon, 2009-12-14 at 08:22 -0800, Keith Whitwell wrote:
> > > On Mon, 2009-12-14 at 08:19 -0800, Keith Whitwell wrote:
> > > > On Mon, 2009-12-14 at 08:04 -0800, José Fonseca wrote:
> > > > > On Mon, 2009-12-14 at 05:39 -0800, Keith Whitwell wrote:
> > > > > > On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
> > > > > > >
> > > > > > > +static INLINE
> > > > > > > +void util_blitter_save_fragment_sampler_states(
> > > > > > > +                  struct blitter_context *blitter,
> > > > > > > +                  int num_sampler_states,
> > > > > > > +                  void **sampler_states)
> > > > > > > +{
> > > > > > > +   assert(num_textures <= 32);
> > > > > > > +
> > > > > > > +   blitter->saved_num_sampler_states = num_sampler_states;
> > > > > > > +   memcpy(blitter->saved_sampler_states, sampler_states,
> > > > > > > +          num_sampler_states * sizeof(void *));
> > > > > > > +}
> > > > > > > +
> > > > > >
> > > > > > Have you tried compiling with debug enabled?  The assert above fails to
> > > > > > compile.  Also, can you use Elements() or similar instead of the
> > > > > > hard-coded 32?
> > > > > >
> > > > > > > Maybe we can figure out how to go back to having asserts keep exposing
> > > > > > > their contents to the compiler even on non-debug builds.  This used to
> > > > > > > work without problem on Linux and helped a lot to avoid this type of
> > > > > > > problem.
> > > > > >
> > > > > > I wouldn't say without a problem: defining assert(expr) as (void)0
> > > > > > instead of (void)(expr) on release builds yielded a non-negligible
> > > > > > performance improvement. I don't recall the exact figure, but I believe
> > > > > > it was in the 3-5% range for the driver I was benchmarking at the time.
> > > > > > YMMV. Different drivers will give different results, but there's nothing
> > > > > > platform specific about this.
> > > > >
> > > > > It's not hard to avoid executing code...  For instance we could always
> > > > have it translated to something like:
> > > >
> > > >   if (0) {
> > > >     (void)(expr);
> > > >   }
> > > >   (void)(0)
> > > >
> > >
> > > Obviously I would have meant to say something cleaner like:
> > >
> > >  do {
> > >    if (0) { (void)(expr);  }
> > >  }
> > >  while (0)
> >
> > This only works if expr has no calls, or just inline calls. Using my
> > earlier example, if very_expensive_check() is in another file then the
> > compiler has to assume the function will have side effects, and the call
> > can't be removed.
>
> What call?!?
>
>   if (0) do_something_with_side_effects();
>
> Has no side effects.

Never mind. I don't know what I was thinking.

Jose


Re: gallium: add blitter

by Marek Olšák

Keith,

thanks for reviewing.

On Mon, Dec 14, 2009 at 2:39 PM, Keith Whitwell <keithw@...> wrote:

> On Sun, 2009-12-13 at 15:27 -0800, Marek Olšák wrote:
>>
>> +static INLINE
>> +void util_blitter_save_fragment_sampler_states(
>> +                  struct blitter_context *blitter,
>> +                  int num_sampler_states,
>> +                  void **sampler_states)
>> +{
>> +   assert(num_textures <= 32);
>> +
>> +   blitter->saved_num_sampler_states = num_sampler_states;
>> +   memcpy(blitter->saved_sampler_states, sampler_states,
>> +          num_sampler_states * sizeof(void *));
>> +}
>> +
>
> Have you tried compiling with debug enabled?  The assert above fails to
> compile.  Also, can you use Elements() or similar instead of the
> hard-coded 32?
Ouch. It's fixed in the attached "add blitter" patch. Other changes
that don't break the compilation are in separate patches.
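
For illustration only, a sketch of what such a fix could look like. The struct below is a simplified stand-in for blitter_context (its array size of 32 is an assumption taken from the original assert), and ARRAY_COUNT is a hypothetical local replacement for the Elements() macro mentioned in the review.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the Elements() macro mentioned in the review. */
#define ARRAY_COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Simplified stand-in for struct blitter_context; the size is assumed. */
struct blitter_sketch {
   int saved_num_sampler_states;
   void *saved_sampler_states[32];
};

void save_fragment_sampler_states(struct blitter_sketch *blitter,
                                  int num_sampler_states,
                                  void **sampler_states)
{
   /* Check the parameter that is actually passed in, against the real
    * capacity of the destination array instead of a hard-coded 32. */
   assert(num_sampler_states >= 0);
   assert((size_t)num_sampler_states <=
          ARRAY_COUNT(blitter->saved_sampler_states));

   blitter->saved_num_sampler_states = num_sampler_states;
   memcpy(blitter->saved_sampler_states, sampler_states,
          num_sampler_states * sizeof(void *));
}
```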


On Mon, Dec 14, 2009 at 3:44 PM, Keith Whitwell <keithw@...> wrote:
> Can these be created on-demand / lazily?
>

Done. It now creates even the fragment shaders on demand, because some of
them might never be used by a given application.
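
A minimal sketch of the on-demand pattern, assuming a hypothetical context struct and a stub creation function in place of the real shader generators (create_shader_stub stands in for something like util_make_fragment_tex_shader; MAX_TARGETS is an assumed size).

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TARGETS 4   /* assumed number of texture targets */

/* Hypothetical stand-in for the blitter's private context. */
struct lazy_ctx {
   void *fs_texfetch_col[MAX_TARGETS];   /* NULL until first use */
   int   num_created;                    /* counts real creations */
};

/* Dummy objects so the stub can hand out distinct non-NULL handles. */
static int shader_storage[MAX_TARGETS];

/* Hypothetical stand-in for a real shader generator. */
void *create_shader_stub(struct lazy_ctx *ctx, unsigned target)
{
   ctx->num_created++;
   return &shader_storage[target];
}

/* Return the shader for a target, creating it on first request only. */
void *get_texfetch_col(struct lazy_ctx *ctx, unsigned target)
{
   assert(target < MAX_TARGETS);

   if (ctx->fs_texfetch_col[target] == NULL)
      ctx->fs_texfetch_col[target] = create_shader_stub(ctx, target);

   return ctx->fs_texfetch_col[target];
}
```

Because the context is zero-initialized (CALLOC_STRUCT in the patch), a NULL slot doubles as the "not created yet" marker and no extra bookkeeping is needed.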


On Mon, Dec 14, 2009 at 3:44 PM, Keith Whitwell <keithw@...> wrote:
> Can you maybe limit this code to a (much) smaller maximum number of
> simultaneously live states of this type?  Eg. 4 or 8 of them?  Creating
> states isn't so terribly expensive, and this seems a bit excessive.
>

Well, I'd like to avoid re-allocating state objects too often.
Moreover, it's quite rare for an application to use more than 4 values
to clear the stencil buffer.
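
For comparison, the bounded alternative Keith suggests could be sketched roughly as below. This is not what the patch does; all names are hypothetical, the creation and destruction functions are stubs, and round-robin replacement is just one assumed eviction policy.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SIZE 4   /* assumed limit, per the "4 or 8" suggestion */

struct dsa_entry {
   int valid;
   unsigned stencil_value;
   void *state;
};

struct dsa_cache {
   struct dsa_entry entries[CACHE_SIZE];
   unsigned next_victim;   /* round-robin replacement index */
   int num_created;
   int num_destroyed;
};

/* Dummy objects so the stub can return distinct non-NULL handles. */
static int dummy_states[256];

/* Hypothetical stand-ins for creating/deleting a driver state object. */
void *create_dsa_stub(struct dsa_cache *c, unsigned value)
{
   c->num_created++;
   return &dummy_states[value & 0xff];
}

void destroy_dsa_stub(struct dsa_cache *c, void *state)
{
   (void)state;
   c->num_destroyed++;
}

/* Look up the state for a stencil clear value, evicting the oldest
 * slot in round-robin order when the value is not cached. */
void *get_dsa_for_stencil(struct dsa_cache *c, unsigned value)
{
   unsigned i;
   struct dsa_entry *e;

   for (i = 0; i < CACHE_SIZE; i++) {
      if (c->entries[i].valid && c->entries[i].stencil_value == value)
         return c->entries[i].state;
   }

   e = &c->entries[c->next_victim];
   c->next_victim = (c->next_victim + 1) % CACHE_SIZE;

   if (e->valid)
      destroy_dsa_stub(c, e->state);

   e->valid = 1;
   e->stencil_value = value;
   e->state = create_dsa_stub(c, value);
   return e->state;
}
```

The trade-off is the one discussed in this exchange: the small cache caps the number of live state objects, while a table indexed directly by stencil value never re-creates a state once it exists.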

I also removed the draw_quad callback, as there appears to be a more
efficient way of handling this, and cleaned up the code to use
PIPE_MAX_* constants.

Please review/push.

Marek

[0001-util-add-new-fragment-shaders-to-simple_shaders.patch]

From 511f58a54315d07740493cdda050d1ebd5a4ecd3 Mon Sep 17 00:00:00 2001
From: =?utf-8?q?Marek=20Ol=C5=A1=C3=A1k?= <maraeo@...>
Date: Sat, 12 Dec 2009 06:34:29 +0100
Subject: [PATCH 1/7] util: add new fragment shaders to simple_shaders

New shaders:
* Fragment shader which writes depth sampled from a texture
* Fragment shader which copies COLOR[0] to multiple render targets

Additional improvements:
* The fragment 'tex' shaders now take a sampler type (TGSI_TEXTURE_*)
  so that they can sample from any type of texture, not only from a 2D one.
---
 src/gallium/auxiliary/util/u_blit.c           |    7 ++-
 src/gallium/auxiliary/util/u_gen_mipmap.c     |    2 +-
 src/gallium/auxiliary/util/u_simple_shaders.c |   70 ++++++++++++++++++++++---
 src/gallium/auxiliary/util/u_simple_shaders.h |   13 ++++-
 4 files changed, 80 insertions(+), 12 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_blit.c b/src/gallium/auxiliary/util/u_blit.c
index abe1de3..c9050ca 100644
--- a/src/gallium/auxiliary/util/u_blit.c
+++ b/src/gallium/auxiliary/util/u_blit.c
@@ -126,7 +126,8 @@ util_create_blit(struct pipe_context *pipe, struct cso_context *cso)
    }
 
    /* fragment shader */
-   ctx->fs[TGSI_WRITEMASK_XYZW] = util_make_fragment_tex_shader(pipe);
+   ctx->fs[TGSI_WRITEMASK_XYZW] =
+      util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_2D);
    ctx->vbuf = NULL;
 
    /* init vertex data that doesn't change */
@@ -420,7 +421,9 @@ util_blit_pixels_writemask(struct blit_state *ctx,
    cso_set_sampler_textures(ctx->cso, 1, &tex);
 
    if (ctx->fs[writemask] == NULL)
-      ctx->fs[writemask] = util_make_fragment_tex_shader_writemask(pipe, writemask);
+      ctx->fs[writemask] =
+         util_make_fragment_tex_shader_writemask(pipe, TGSI_TEXTURE_2D,
+                                                 writemask);
 
    /* shaders */
    cso_set_fragment_shader_handle(ctx->cso, ctx->fs[writemask]);
diff --git a/src/gallium/auxiliary/util/u_gen_mipmap.c b/src/gallium/auxiliary/util/u_gen_mipmap.c
index 83263d9..1728e66 100644
--- a/src/gallium/auxiliary/util/u_gen_mipmap.c
+++ b/src/gallium/auxiliary/util/u_gen_mipmap.c
@@ -1317,7 +1317,7 @@ util_create_gen_mipmap(struct pipe_context *pipe,
    }
 
    /* fragment shader */
-   ctx->fs = util_make_fragment_tex_shader(pipe);
+   ctx->fs = util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_2D);
 
    /* vertex data that doesn't change */
    for (i = 0; i < 4; i++) {
diff --git a/src/gallium/auxiliary/util/u_simple_shaders.c b/src/gallium/auxiliary/util/u_simple_shaders.c
index 1c8b157..8172ead 100644
--- a/src/gallium/auxiliary/util/u_simple_shaders.c
+++ b/src/gallium/auxiliary/util/u_simple_shaders.c
@@ -2,6 +2,7 @@
  *
  * Copyright 2008 Tungsten Graphics, Inc., Cedar Park, Texas.
  * All Rights Reserved.
+ * Copyright 2009 Marek Olšák <maraeo@...>
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the
@@ -30,6 +31,7 @@
  * Simple vertex/fragment shader generators.
  *  
  * @author Brian Paul
+           Marek Olšák
  */
 
 
@@ -87,6 +89,7 @@ util_make_vertex_passthrough_shader(struct pipe_context *pipe,
  */
 void *
 util_make_fragment_tex_shader_writemask(struct pipe_context *pipe,
+                                        unsigned tex_target,
                                         unsigned writemask )
 {
    struct ureg_program *ureg;
@@ -116,20 +119,63 @@ util_make_fragment_tex_shader_writemask(struct pipe_context *pipe,
 
    ureg_TEX( ureg,
              ureg_writemask(out, writemask),
-             TGSI_TEXTURE_2D, tex, sampler );
+             tex_target, tex, sampler );
    ureg_END( ureg );
 
    return ureg_create_shader_and_destroy( ureg, pipe );
 }
 
 void *
-util_make_fragment_tex_shader(struct pipe_context *pipe )
+util_make_fragment_tex_shader(struct pipe_context *pipe, unsigned tex_target )
 {
    return util_make_fragment_tex_shader_writemask( pipe,
+                                                   tex_target,
                                                    TGSI_WRITEMASK_XYZW );
 }
 
+/**
+ * Make a simple fragment texture shader which reads an X component from
+ * a texture and writes it as depth.
+ */
+void *
+util_make_fragment_tex_shader_writedepth(struct pipe_context *pipe,
+                                         unsigned tex_target)
+{
+   struct ureg_program *ureg;
+   struct ureg_src sampler;
+   struct ureg_src tex;
+   struct ureg_dst out, depth;
+   struct ureg_src imm;
 
+   ureg = ureg_create( TGSI_PROCESSOR_FRAGMENT );
+   if (ureg == NULL)
+      return NULL;
+
+   sampler = ureg_DECL_sampler( ureg, 0 );
+
+   tex = ureg_DECL_fs_input( ureg,
+                             TGSI_SEMANTIC_GENERIC, 0,
+                             TGSI_INTERPOLATE_PERSPECTIVE );
+
+   out = ureg_DECL_output( ureg,
+                           TGSI_SEMANTIC_COLOR,
+                           0 );
+
+   depth = ureg_DECL_output( ureg,
+                             TGSI_SEMANTIC_POSITION,
+                             0 );
+
+   imm = ureg_imm4f( ureg, 0, 0, 0, 1 );
+
+   ureg_MOV( ureg, out, imm );
+
+   ureg_TEX( ureg,
+             ureg_writemask(depth, TGSI_WRITEMASK_Z),
+             tex_target, tex, sampler );
+   ureg_END( ureg );
+
+   return ureg_create_shader_and_destroy( ureg, pipe );
+}
 
 /**
  * Make simple fragment color pass-through shader.
@@ -137,9 +183,18 @@ util_make_fragment_tex_shader(struct pipe_context *pipe )
 void *
 util_make_fragment_passthrough_shader(struct pipe_context *pipe)
 {
+   return util_make_fragment_clonecolor_shader(pipe, 1);
+}
+
+void *
+util_make_fragment_clonecolor_shader(struct pipe_context *pipe, int num_cbufs)
+{
    struct ureg_program *ureg;
    struct ureg_src src;
-   struct ureg_dst dst;
+   struct ureg_dst dst[8];
+   int i;
+
+   assert(num_cbufs <= 8);
 
    ureg = ureg_create( TGSI_PROCESSOR_FRAGMENT );
    if (ureg == NULL)
@@ -148,12 +203,13 @@ util_make_fragment_passthrough_shader(struct pipe_context *pipe)
    src = ureg_DECL_fs_input( ureg, TGSI_SEMANTIC_COLOR, 0,
                              TGSI_INTERPOLATE_PERSPECTIVE );
 
-   dst = ureg_DECL_output( ureg, TGSI_SEMANTIC_COLOR, 0 );
+   for (i = 0; i < num_cbufs; i++)
+      dst[i] = ureg_DECL_output( ureg, TGSI_SEMANTIC_COLOR, i );
+
+   for (i = 0; i < num_cbufs; i++)
+      ureg_MOV( ureg, dst[i], src );
 
-   ureg_MOV( ureg, dst, src );
    ureg_END( ureg );
 
    return ureg_create_shader_and_destroy( ureg, pipe );
 }
-
-
diff --git a/src/gallium/auxiliary/util/u_simple_shaders.h b/src/gallium/auxiliary/util/u_simple_shaders.h
index d2e80d6..6e76094 100644
--- a/src/gallium/auxiliary/util/u_simple_shaders.h
+++ b/src/gallium/auxiliary/util/u_simple_shaders.h
@@ -51,16 +51,25 @@ util_make_vertex_passthrough_shader(struct pipe_context *pipe,
 
 extern void *
 util_make_fragment_tex_shader_writemask(struct pipe_context *pipe,
-                                        unsigned writemask );
+                                        unsigned tex_target,
+                                        unsigned writemask);
 
 extern void *
-util_make_fragment_tex_shader(struct pipe_context *pipe);
+util_make_fragment_tex_shader(struct pipe_context *pipe, unsigned tex_target);
+
+
+extern void *
+util_make_fragment_tex_shader_writedepth(struct pipe_context *pipe,
+                                         unsigned tex_target);
 
 
 extern void *
 util_make_fragment_passthrough_shader(struct pipe_context *pipe);
 
 
+extern void *
+util_make_fragment_clonecolor_shader(struct pipe_context *pipe, int num_cbufs);
+
 #ifdef __cplusplus
 }
 #endif
--
1.6.3.3



[0002-util-add-a-function-which-converts-2D-coordinates-to.patch]

From dddb77c058d67c0a192b871deb8d837dfabbefce Mon Sep 17 00:00:00 2001
From: =?utf-8?q?Marek=20Ol=C5=A1=C3=A1k?= <maraeo@...>
Date: Sat, 12 Dec 2009 23:38:17 +0100
Subject: [PATCH 2/7] util: add a function which converts 2D coordinates to cubemap coordinates

The code was taken over from u_gen_mipmap.
---
 src/gallium/auxiliary/util/Makefile       |    1 +
 src/gallium/auxiliary/util/SConscript     |    1 +
 src/gallium/auxiliary/util/u_gen_mipmap.c |   55 +---------------
 src/gallium/auxiliary/util/u_texture.c    |  102 +++++++++++++++++++++++++++++
 src/gallium/auxiliary/util/u_texture.h    |   54 +++++++++++++++
 5 files changed, 161 insertions(+), 52 deletions(-)
 create mode 100644 src/gallium/auxiliary/util/u_texture.c
 create mode 100644 src/gallium/auxiliary/util/u_texture.h

diff --git a/src/gallium/auxiliary/util/Makefile b/src/gallium/auxiliary/util/Makefile
index 1d8bb55..894958f 100644
--- a/src/gallium/auxiliary/util/Makefile
+++ b/src/gallium/auxiliary/util/Makefile
@@ -30,6 +30,7 @@ C_SOURCES = \
  u_stream_stdc.c \
  u_stream_wd.c \
  u_surface.c \
+ u_texture.c \
  u_tile.c \
  u_time.c \
  u_timed_winsys.c \
diff --git a/src/gallium/auxiliary/util/SConscript b/src/gallium/auxiliary/util/SConscript
index 8d99106..0c0e048 100644
--- a/src/gallium/auxiliary/util/SConscript
+++ b/src/gallium/auxiliary/util/SConscript
@@ -48,6 +48,7 @@ util = env.ConvenienceLibrary(
  'u_stream_stdc.c',
  'u_stream_wd.c',
  'u_surface.c',
+ 'u_texture.c',
  'u_tile.c',
  'u_time.c',
  'u_timed_winsys.c',
diff --git a/src/gallium/auxiliary/util/u_gen_mipmap.c b/src/gallium/auxiliary/util/u_gen_mipmap.c
index 1728e66..69ff3b9 100644
--- a/src/gallium/auxiliary/util/u_gen_mipmap.c
+++ b/src/gallium/auxiliary/util/u_gen_mipmap.c
@@ -46,6 +46,7 @@
 #include "util/u_gen_mipmap.h"
 #include "util/u_simple_shaders.h"
 #include "util/u_math.h"
+#include "util/u_texture.h"
 
 #include "cso_cache/cso_context.h"
 
@@ -1383,59 +1384,9 @@ set_vertex_data(struct gen_mipmap_state *ctx,
       static const float st[4][2] = {
          {0.0f, 0.0f}, {1.0f, 0.0f}, {1.0f, 1.0f}, {0.0f, 1.0f}
       };
-      float rx, ry, rz;
-      uint i;
-
-      /* loop over quad verts */
-      for (i = 0; i < 4; i++) {
-         /* Compute sc = +/-scale and tc = +/-scale.
-          * Not +/-1 to avoid cube face selection ambiguity near the edges,
-          * though that can still sometimes happen with this scale factor...
-          */
-         const float scale = 0.9999f;
-         const float sc = (2.0f * st[i][0] - 1.0f) * scale;
-         const float tc = (2.0f * st[i][1] - 1.0f) * scale;
-
-         switch (face) {
-         case PIPE_TEX_FACE_POS_X:
-            rx = 1.0f;
-            ry = -tc;
-            rz = -sc;
-            break;
-         case PIPE_TEX_FACE_NEG_X:
-            rx = -1.0f;
-            ry = -tc;
-            rz = sc;
-            break;
-         case PIPE_TEX_FACE_POS_Y:
-            rx = sc;
-            ry = 1.0f;
-            rz = tc;
-            break;
-         case PIPE_TEX_FACE_NEG_Y:
-            rx = sc;
-            ry = -1.0f;
-            rz = -tc;
-            break;
-         case PIPE_TEX_FACE_POS_Z:
-            rx = sc;
-            ry = -tc;
-            rz = 1.0f;
-            break;
-         case PIPE_TEX_FACE_NEG_Z:
-            rx = -sc;
-            ry = -tc;
-            rz = -1.0f;
-            break;
-         default:
-            rx = ry = rz = 0.0f;
-            assert(0);
-         }
 
-         ctx->vertices[i][1][0] = rx; /*s*/
-         ctx->vertices[i][1][1] = ry; /*t*/
-         ctx->vertices[i][1][2] = rz; /*r*/
-      }
+      util_map_texcoords2d_onto_cubemap(face, &st[0][0], 2,
+                                        &ctx->vertices[0][1][0], 8);
    }
    else {
       /* 1D/2D */
diff --git a/src/gallium/auxiliary/util/u_texture.c b/src/gallium/auxiliary/util/u_texture.c
new file mode 100644
index 0000000..cd477ab
--- /dev/null
+++ b/src/gallium/auxiliary/util/u_texture.c
@@ -0,0 +1,102 @@
+/**************************************************************************
+ *
+ * Copyright 2008 Tungsten Graphics, Inc., Cedar Park, Texas.
+ * All Rights Reserved.
+ * Copyright 2008 VMware, Inc.  All rights reserved.
+ * Copyright 2009 Marek Olšák <maraeo@...>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
+ * IN NO EVENT SHALL TUNGSTEN GRAPHICS AND/OR ITS SUPPLIERS BE LIABLE FOR
+ * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ **************************************************************************/
+
+/**
+ * @file
+ * Texture mapping utility functions.
+ *
+ * @author Brian Paul
+ *         Marek Olšák
+ */
+
+#include "pipe/p_defines.h"
+
+#include "util/u_texture.h"
+
+void util_map_texcoords2d_onto_cubemap(unsigned face,
+                                       const float *in_st, unsigned in_stride,
+                                       float *out_str, unsigned out_stride)
+{
+   int i;
+   float rx, ry, rz;
+
+   /* loop over quad verts */
+   for (i = 0; i < 4; i++) {
+      /* Compute sc = +/-scale and tc = +/-scale.
+       * Not +/-1 to avoid cube face selection ambiguity near the edges,
+       * though that can still sometimes happen with this scale factor...
+       */
+      const float scale = 0.9999f;
+      const float sc = (2 * in_st[0] - 1) * scale;
+      const float tc = (2 * in_st[1] - 1) * scale;
+
+      switch (face) {
+         case PIPE_TEX_FACE_POS_X:
+            rx = 1;
+            ry = -tc;
+            rz = -sc;
+            break;
+         case PIPE_TEX_FACE_NEG_X:
+            rx = -1;
+            ry = -tc;
+            rz = sc;
+            break;
+         case PIPE_TEX_FACE_POS_Y:
+            rx = sc;
+            ry = 1;
+            rz = tc;
+            break;
+         case PIPE_TEX_FACE_NEG_Y:
+            rx = sc;
+            ry = -1;
+            rz = -tc;
+            break;
+         case PIPE_TEX_FACE_POS_Z:
+            rx = sc;
+            ry = -tc;
+            rz = 1;
+            break;
+         case PIPE_TEX_FACE_NEG_Z:
+            rx = -sc;
+            ry = -tc;
+            rz = -1;
+            break;
+         default:
+            rx = ry = rz = 0;
+            assert(0);
+      }
+
+      out_str[0] = rx; /*s*/
+      out_str[1] = ry; /*t*/
+      out_str[2] = rz; /*r*/
+
+      in_st += in_stride;
+      out_str += out_stride;
+   }
+}
diff --git a/src/gallium/auxiliary/util/u_texture.h b/src/gallium/auxiliary/util/u_texture.h
new file mode 100644
index 0000000..93b2f1e
--- /dev/null
+++ b/src/gallium/auxiliary/util/u_texture.h
@@ -0,0 +1,54 @@
+/**************************************************************************
+ *
+ * Copyright 2009 Marek Olšák <maraeo@...>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
+ * IN NO EVENT SHALL TUNGSTEN GRAPHICS AND/OR ITS SUPPLIERS BE LIABLE FOR
+ * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ **************************************************************************/
+
+#ifndef U_TEXTURE_H
+#define U_TEXTURE_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Convert 2D texture coordinates of 4 vertices into cubemap coordinates
+ * in the given face.
+ * Coordinates must be in the range [0,1].
+ *
+ * \param face          Cubemap face.
+ * \param in_st         4 pairs of 2D texture coordinates to convert.
+ * \param in_stride     Stride of in_st in floats.
+ * \param out_str       STR cubemap texture coordinates to compute.
+ * \param out_stride    Stride of out_str in floats.
+ */
+void util_map_texcoords2d_onto_cubemap(unsigned face,
+                                       const float *in_st, unsigned in_stride,
+                                       float *out_str, unsigned out_stride);
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
--
1.6.3.3
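
A usage sketch for the stride parameters, using a condensed local copy of the mapping from the patch above so that it compiles on its own. The FACE_* enum is a local stand-in for PIPE_TEX_FACE_* (its values are assumptions), and the flat vertex array mirrors the vertices[4][2][4] layout from u_gen_mipmap: 8 floats per vertex, with the texcoord starting at float 4.

```c
#include <assert.h>

/* Local stand-ins for PIPE_TEX_FACE_*; the values are assumptions. */
enum { FACE_POS_X, FACE_NEG_X, FACE_POS_Y, FACE_NEG_Y, FACE_POS_Z, FACE_NEG_Z };

/* Condensed copy of the mapping in the patch, for 4 vertices.
 * Both strides are counted in floats. */
void map_texcoords2d_onto_cubemap(unsigned face,
                                  const float *in_st, unsigned in_stride,
                                  float *out_str, unsigned out_stride)
{
   unsigned i;

   for (i = 0; i < 4; i++) {
      const float *st = in_st + i * in_stride;
      float *out = out_str + i * out_stride;
      /* Slightly less than +/-1 to reduce face selection ambiguity. */
      const float scale = 0.9999f;
      const float sc = (2 * st[0] - 1) * scale;
      const float tc = (2 * st[1] - 1) * scale;
      float rx = 0, ry = 0, rz = 0;

      switch (face) {
      case FACE_POS_X: rx =  1;  ry = -tc; rz = -sc; break;
      case FACE_NEG_X: rx = -1;  ry = -tc; rz =  sc; break;
      case FACE_POS_Y: rx =  sc; ry =  1;  rz =  tc; break;
      case FACE_NEG_Y: rx =  sc; ry = -1;  rz = -tc; break;
      case FACE_POS_Z: rx =  sc; ry = -tc; rz =  1;  break;
      case FACE_NEG_Z: rx = -sc; ry = -tc; rz = -1;  break;
      default: assert(0); break;
      }

      out[0] = rx; /* s */
      out[1] = ry; /* t */
      out[2] = rz; /* r */
   }
}

/* 4 vertices x {pos, texcoord} x 4 floats, stored flat. */
float quad_vertices[4 * 2 * 4];

void fill_cube_texcoords(unsigned face)
{
   /* Four (s, t) pairs, 2 floats each. */
   static const float st[8] = {
      0.0f, 0.0f,  1.0f, 0.0f,  1.0f, 1.0f,  0.0f, 1.0f
   };

   /* in_stride = 2 floats per pair; out_stride = 8 floats per vertex,
    * starting at the texcoord slot (float 4) of vertex 0. */
   map_texcoords2d_onto_cubemap(face, st, 2, quad_vertices + 4, 8);
}
```

Passing the strides in floats rather than bytes is what lets the same function write into an interleaved {pos, texcoord} array without touching the position slots.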



[0003-util-add-blitter.patch]

From 6ff91fad38eae6d489f2d0ac2dac4508a499bbdc Mon Sep 17 00:00:00 2001
From: =?utf-8?q?Marek=20Ol=C5=A1=C3=A1k?= <maraeo@...>
Date: Thu, 10 Dec 2009 10:25:33 +0100
Subject: [PATCH 3/7] util: add blitter

---
 src/gallium/auxiliary/util/Makefile    |    1 +
 src/gallium/auxiliary/util/SConscript  |    1 +
 src/gallium/auxiliary/util/u_blitter.c |  605 ++++++++++++++++++++++++++++++++
 src/gallium/auxiliary/util/u_blitter.h |  244 +++++++++++++
 4 files changed, 851 insertions(+), 0 deletions(-)
 create mode 100644 src/gallium/auxiliary/util/u_blitter.c
 create mode 100644 src/gallium/auxiliary/util/u_blitter.h

diff --git a/src/gallium/auxiliary/util/Makefile b/src/gallium/auxiliary/util/Makefile
index 894958f..f81fc46 100644
--- a/src/gallium/auxiliary/util/Makefile
+++ b/src/gallium/auxiliary/util/Makefile
@@ -9,6 +9,7 @@ C_SOURCES = \
  u_debug_symbol.c \
  u_debug_stack.c \
  u_blit.c \
+ u_blitter.c \
  u_cache.c \
  u_cpu_detect.c \
  u_draw_quad.c \
diff --git a/src/gallium/auxiliary/util/SConscript b/src/gallium/auxiliary/util/SConscript
index 0c0e048..024a370 100644
--- a/src/gallium/auxiliary/util/SConscript
+++ b/src/gallium/auxiliary/util/SConscript
@@ -23,6 +23,7 @@ util = env.ConvenienceLibrary(
  source = [
  'u_bitmask.c',
  'u_blit.c',
+ 'u_blitter.c',
  'u_cache.c',
  'u_cpu_detect.c',
  'u_debug.c',
diff --git a/src/gallium/auxiliary/util/u_blitter.c b/src/gallium/auxiliary/util/u_blitter.c
new file mode 100644
index 0000000..e51a5df
--- /dev/null
+++ b/src/gallium/auxiliary/util/u_blitter.c
@@ -0,0 +1,605 @@
+/**************************************************************************
+ *
+ * Copyright 2009 Marek Olšák <maraeo@...>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
+ * IN NO EVENT SHALL TUNGSTEN GRAPHICS AND/OR ITS SUPPLIERS BE LIABLE FOR
+ * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ **************************************************************************/
+
+/**
+ * @file
+ * Blitter utility to facilitate acceleration of the clear, surface_copy,
+ * and surface_fill functions.
+ *
+ * @author Marek Olšák
+ */
+
+#include "pipe/p_context.h"
+#include "pipe/p_defines.h"
+#include "pipe/p_inlines.h"
+#include "pipe/p_shader_tokens.h"
+#include "pipe/p_state.h"
+
+#include "util/u_memory.h"
+#include "util/u_math.h"
+#include "util/u_blitter.h"
+#include "util/u_draw_quad.h"
+#include "util/u_pack_color.h"
+#include "util/u_rect.h"
+#include "util/u_simple_shaders.h"
+#include "util/u_texture.h"
+
+struct blitter_context_priv
+{
+   struct blitter_context blitter;
+
+   struct pipe_context *pipe; /**< pipe context */
+   struct pipe_buffer *vbuf;  /**< quad */
+
+   float vertices[4][2][4];   /**< {pos, color} or {pos, texcoord} */
+
+   /* Constant state objects. */
+   /* Vertex shaders. */
+   void *vs_col; /**< Vertex shader which passes {pos, color} to the output */
+   void *vs_tex; /**<Vertex shader which passes {pos, texcoord} to the output.*/
+
+   /* Fragment shaders. */
+   void *fs_col[8];     /**< FS which outputs colors to 1-8 color buffers */
+   void *fs_texfetch_col[4];   /**< FS which outputs a color from a texture */
+   void *fs_texfetch_depth[4]; /**< FS which outputs a depth from a texture,
+                              where the index is PIPE_TEXTURE_* to be sampled */
+
+   /* Blend state. */
+   void *blend_write_color;   /**< blend state with writemask of RGBA */
+   void *blend_keep_color;    /**< blend state with writemask of 0 */
+
+   /* Depth stencil alpha state. */
+   void *dsa_write_depth_stencil[0xff]; /**< indices are stencil clear values */
+   void *dsa_write_depth_keep_stencil;
+   void *dsa_keep_depth_stencil;
+
+   /* Other state. */
+   void *sampler_state[16];   /**< sampler state for clamping to a miplevel */
+   void *rs_state;            /**< rasterizer state */
+};
+
+struct blitter_context *util_blitter_create(struct pipe_context *pipe)
+{
+   struct blitter_context_priv *ctx;
+   struct pipe_blend_state blend;
+   struct pipe_depth_stencil_alpha_state dsa;
+   struct pipe_rasterizer_state rs_state;
+   struct pipe_sampler_state sampler_state;
+   unsigned i, max_render_targets;
+
+   ctx = CALLOC_STRUCT(blitter_context_priv);
+   if (!ctx)
+      return NULL;
+
+   ctx->pipe = pipe;
+
+   /* mark the optional saved states as invalid, i.e. not saved yet */
+   ctx->blitter.saved_fb_state.nr_cbufs = ~0;
+   ctx->blitter.saved_num_textures = ~0;
+   ctx->blitter.saved_num_sampler_states = ~0;
+
+   /* blend state objects */
+   memset(&blend, 0, sizeof(blend));
+   ctx->blend_keep_color = pipe->create_blend_state(pipe, &blend);
+
+   blend.colormask = PIPE_MASK_RGBA;
+   ctx->blend_write_color = pipe->create_blend_state(pipe, &blend);
+
+   /* depth stencil alpha state objects */
+   memset(&dsa, 0, sizeof(dsa));
+   ctx->dsa_keep_depth_stencil =
+      pipe->create_depth_stencil_alpha_state(pipe, &dsa);
+
+   dsa.depth.enabled = 1;
+   dsa.depth.writemask = 1;
+   dsa.depth.func = PIPE_FUNC_ALWAYS;
+   ctx->dsa_write_depth_keep_stencil =
+      pipe->create_depth_stencil_alpha_state(pipe, &dsa);
+
+   dsa.stencil[0].enabled = 1;
+   dsa.stencil[0].func = PIPE_FUNC_ALWAYS;
+   dsa.stencil[0].fail_op = PIPE_STENCIL_OP_REPLACE;
+   dsa.stencil[0].zpass_op = PIPE_STENCIL_OP_REPLACE;
+   dsa.stencil[0].zfail_op = PIPE_STENCIL_OP_REPLACE;
+   dsa.stencil[0].valuemask = 0xff;
+   dsa.stencil[0].writemask = 0xff;
+
+   /* create a depth stencil alpha state for each possible stencil clear
+    * value */
+   for (i = 0; i < 0xff; i++) {
+      dsa.stencil[0].ref_value = i;
+
+      ctx->dsa_write_depth_stencil[i] =
+         pipe->create_depth_stencil_alpha_state(pipe, &dsa);
+   }
+
+   /* sampler state */
+   memset(&sampler_state, 0, sizeof(sampler_state));
+   sampler_state.wrap_s = PIPE_TEX_WRAP_CLAMP_TO_EDGE;
+   sampler_state.wrap_t = PIPE_TEX_WRAP_CLAMP_TO_EDGE;
+   sampler_state.wrap_r = PIPE_TEX_WRAP_CLAMP_TO_EDGE;
+
+   for (i = 0; i < 16; i++) {
+      sampler_state.lod_bias = i;
+      sampler_state.min_lod = i;
+      sampler_state.max_lod = i;
+
+      ctx->sampler_state[i] = pipe->create_sampler_state(pipe, &sampler_state);
+   }
+
+   /* rasterizer state */
+   memset(&rs_state, 0, sizeof(rs_state));
+   rs_state.front_winding = PIPE_WINDING_CW;
+   rs_state.cull_mode = PIPE_WINDING_NONE;
+   rs_state.bypass_vs_clip_and_viewport = 1;
+   rs_state.gl_rasterization_rules = 1;
+   ctx->rs_state = pipe->create_rasterizer_state(pipe, &rs_state);
+
+   /* vertex shaders */
+   {
+      const uint semantic_names[] = { TGSI_SEMANTIC_POSITION,
+                                      TGSI_SEMANTIC_COLOR };
+      const uint semantic_indices[] = { 0, 0 };
+      ctx->vs_col =
+         util_make_vertex_passthrough_shader(pipe, 2, semantic_names,
+                                             semantic_indices);
+   }
+   {
+      const uint semantic_names[] = { TGSI_SEMANTIC_POSITION,
+                                      TGSI_SEMANTIC_GENERIC };
+      const uint semantic_indices[] = { 0, 0 };
+      ctx->vs_tex =
+         util_make_vertex_passthrough_shader(pipe, 2, semantic_names,
+                                             semantic_indices);
+   }
+
+   /* fragment shaders */
+   ctx->fs_texfetch_col[PIPE_TEXTURE_1D] =
+      util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_1D);
+   ctx->fs_texfetch_col[PIPE_TEXTURE_2D] =
+      util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_2D);
+   ctx->fs_texfetch_col[PIPE_TEXTURE_3D] =
+      util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_3D);
+   ctx->fs_texfetch_col[PIPE_TEXTURE_CUBE] =
+      util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_CUBE);
+
+   ctx->fs_texfetch_depth[PIPE_TEXTURE_1D] =
+      util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_1D);
+   ctx->fs_texfetch_depth[PIPE_TEXTURE_2D] =
+      util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_2D);
+   ctx->fs_texfetch_depth[PIPE_TEXTURE_3D] =
+      util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_3D);
+   ctx->fs_texfetch_depth[PIPE_TEXTURE_CUBE] =
+      util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_CUBE);
+
+   max_render_targets = pipe->screen->get_param(pipe->screen,
+                                                PIPE_CAP_MAX_RENDER_TARGETS);
+   assert(max_render_targets <= 8);
+   for (i = 0; i < max_render_targets; i++)
+      ctx->fs_col[i] = util_make_fragment_clonecolor_shader(pipe, 1+i);
+
+   /* set invariant vertex coordinates */
+   for (i = 0; i < 4; i++)
+      ctx->vertices[i][0][3] = 1; /*v.w*/
+
+   /* create the vertex buffer */
+   ctx->vbuf = pipe_buffer_create(ctx->pipe->screen,
+                                  32,
+                                  PIPE_BUFFER_USAGE_VERTEX,
+                                  sizeof(ctx->vertices));
+
+   return &ctx->blitter;
+}
+
+void util_blitter_destroy(struct blitter_context *blitter)
+{
+   struct blitter_context_priv *ctx = (struct blitter_context_priv*)blitter;
+   struct pipe_context *pipe = ctx->pipe;
+   int i;
+
+   pipe->delete_blend_state(pipe, ctx->blend_write_color);
+   pipe->delete_blend_state(pipe, ctx->blend_keep_color);
+   pipe->delete_depth_stencil_alpha_state(pipe, ctx->dsa_keep_depth_stencil);
+   pipe->delete_depth_stencil_alpha_state(pipe,
+                                          ctx->dsa_write_depth_keep_stencil);
+
+   for (i = 0; i < 0xff; i++)
+      pipe->delete_depth_stencil_alpha_state(pipe,
+                                             ctx->dsa_write_depth_stencil[i]);
+
+   pipe->delete_rasterizer_state(pipe, ctx->rs_state);
+   pipe->delete_vs_state(pipe, ctx->vs_col);
+   pipe->delete_vs_state(pipe, ctx->vs_tex);
+
+   for (i = 0; i < 4; i++) {
+      pipe->delete_fs_state(pipe, ctx->fs_texfetch_col[i]);
+      pipe->delete_fs_state(pipe, ctx->fs_texfetch_depth[i]);
+   }
+   for (i = 0; i < 8 && ctx->fs_col[i]; i++)
+      pipe->delete_fs_state(pipe, ctx->fs_col[i]);
+
+   pipe_buffer_reference(&ctx->vbuf, NULL);
+   FREE(ctx);
+}
+
+static void blitter_check_saved_CSOs(struct blitter_context_priv *ctx)
+{
+   /* make sure these CSOs have been saved */
+   assert(ctx->blitter.saved_blend_state &&
+          ctx->blitter.saved_dsa_state &&
+          ctx->blitter.saved_rs_state &&
+          ctx->blitter.saved_fs &&
+          ctx->blitter.saved_vs);
+}
+
+static void blitter_restore_CSOs(struct blitter_context_priv *ctx)
+{
+   struct pipe_context *pipe = ctx->pipe;
+
+   /* restore the state objects which are always required to be saved */
+   pipe->bind_blend_state(pipe, ctx->blitter.saved_blend_state);
+   pipe->bind_depth_stencil_alpha_state(pipe, ctx->blitter.saved_dsa_state);
+   pipe->bind_rasterizer_state(pipe, ctx->blitter.saved_rs_state);
+   pipe->bind_fs_state(pipe, ctx->blitter.saved_fs);
+   pipe->bind_vs_state(pipe, ctx->blitter.saved_vs);
+
+   ctx->blitter.saved_blend_state = 0;
+   ctx->blitter.saved_dsa_state = 0;
+   ctx->blitter.saved_rs_state = 0;
+   ctx->blitter.saved_fs = 0;
+   ctx->blitter.saved_vs = 0;
+
+   /* restore the state objects which are required to be saved before copy/fill
+    */
+   if (ctx->blitter.saved_fb_state.nr_cbufs != ~0) {
+      pipe->set_framebuffer_state(pipe, &ctx->blitter.saved_fb_state);
+      ctx->blitter.saved_fb_state.nr_cbufs = ~0;
+   }
+
+   if (ctx->blitter.saved_num_sampler_states != ~0) {
+      pipe->bind_fragment_sampler_states(pipe,
+                                         ctx->blitter.saved_num_sampler_states,
+                                         ctx->blitter.saved_sampler_states);
+      ctx->blitter.saved_num_sampler_states = ~0;
+   }
+
+   if (ctx->blitter.saved_num_textures != ~0) {
+      pipe->set_fragment_sampler_textures(pipe,
+                                          ctx->blitter.saved_num_textures,
+                                          ctx->blitter.saved_textures);
+      ctx->blitter.saved_num_textures = ~0;
+   }
+}
+
+static void blitter_set_rectangle(struct blitter_context_priv *ctx,
+                                  unsigned x1, unsigned y1,
+                                  unsigned x2, unsigned y2,
+                                  float depth)
+{
+   int i;
+
+   /* set vertex positions */
+   ctx->vertices[0][0][0] = x1; /*v0.x*/
+   ctx->vertices[0][0][1] = y1; /*v0.y*/
+
+   ctx->vertices[1][0][0] = x2; /*v1.x*/
+   ctx->vertices[1][0][1] = y1; /*v1.y*/
+
+   ctx->vertices[2][0][0] = x2; /*v2.x*/
+   ctx->vertices[2][0][1] = y2; /*v2.y*/
+
+   ctx->vertices[3][0][0] = x1; /*v3.x*/
+   ctx->vertices[3][0][1] = y2; /*v3.y*/
+
+   for (i = 0; i < 4; i++)
+      ctx->vertices[i][0][2] = depth; /*z*/
+}
+
+static void blitter_set_clear_color(struct blitter_context_priv *ctx,
+                                    const float *rgba)
+{
+   int i;
+
+   for (i = 0; i < 4; i++) {
+      ctx->vertices[i][1][0] = rgba[0];
+      ctx->vertices[i][1][1] = rgba[1];
+      ctx->vertices[i][1][2] = rgba[2];
+      ctx->vertices[i][1][3] = rgba[3];
+   }
+}
+
+static void blitter_set_texcoords_2d(struct blitter_context_priv *ctx,
+                                     struct pipe_surface *surf,
+                                     unsigned x1, unsigned y1,
+                                     unsigned x2, unsigned y2)
+{
+   int i;
+   float s1 = x1 / (float)surf->width;
+   float t1 = y1 / (float)surf->height;
+   float s2 = x2 / (float)surf->width;
+   float t2 = y2 / (float)surf->height;
+
+   ctx->vertices[0][1][0] = s1; /*t0.s*/
+   ctx->vertices[0][1][1] = t1; /*t0.t*/
+
+   ctx->vertices[1][1][0] = s2; /*t1.s*/
+   ctx->vertices[1][1][1] = t1; /*t1.t*/
+
+   ctx->vertices[2][1][0] = s2; /*t2.s*/
+   ctx->vertices[2][1][1] = t2; /*t2.t*/
+
+   ctx->vertices[3][1][0] = s1; /*t3.s*/
+   ctx->vertices[3][1][1] = t2; /*t3.t*/
+
+   for (i = 0; i < 4; i++) {
+      ctx->vertices[i][1][2] = 0; /*r*/
+      ctx->vertices[i][1][3] = 1; /*q*/
+   }
+}
+
+static void blitter_set_texcoords_3d(struct blitter_context_priv *ctx,
+                                     struct pipe_surface *surf,
+                                     unsigned x1, unsigned y1,
+                                     unsigned x2, unsigned y2)
+{
+   int i;
+   float depth = u_minify(surf->texture->depth0, surf->level);
+   float r = surf->zslice / depth;
+
+   blitter_set_texcoords_2d(ctx, surf, x1, y1, x2, y2);
+
+   for (i = 0; i < 4; i++)
+      ctx->vertices[i][1][2] = r; /*r*/
+}
+
+static void blitter_set_texcoords_cube(struct blitter_context_priv *ctx,
+                                       struct pipe_surface *surf,
+                                       unsigned x1, unsigned y1,
+                                       unsigned x2, unsigned y2)
+{
+   int i;
+   float s1 = x1 / (float)surf->width;
+   float t1 = y1 / (float)surf->height;
+   float s2 = x2 / (float)surf->width;
+   float t2 = y2 / (float)surf->height;
+   const float st[4][2] = {
+      {s1, t1}, {s2, t1}, {s2, t2}, {s1, t2}
+   };
+
+   util_map_texcoords2d_onto_cubemap(surf->face,
+                                     /* pointer, stride in floats */
+                                     &st[0][0], 2,
+                                     &ctx->vertices[0][1][0], 8);
+
+   for (i = 0; i < 4; i++)
+      ctx->vertices[i][1][3] = 1; /*q*/
+}
+
+static void blitter_draw_quad(struct blitter_context_priv *ctx)
+{
+   struct blitter_context *blitter = &ctx->blitter;
+   struct pipe_context *pipe = ctx->pipe;
+
+   if (blitter->draw_quad) {
+      blitter->draw_quad(pipe, &ctx->vertices[0][0][0]);
+   } else {
+      /* write vertices and draw them */
+      pipe_buffer_write(pipe->screen, ctx->vbuf,
+                        0, sizeof(ctx->vertices), ctx->vertices);
+
+      util_draw_vertex_buffer(ctx->pipe, ctx->vbuf, 0, PIPE_PRIM_TRIANGLE_FAN,
+                              4,  /* verts */
+                              2); /* attribs/vert */
+   }
+}
+
+void util_blitter_clear(struct blitter_context *blitter,
+                        unsigned width, unsigned height,
+                        unsigned num_cbufs,
+                        unsigned clear_buffers,
+                        const float *rgba,
+                        double depth, unsigned stencil)
+{
+   struct blitter_context_priv *ctx = (struct blitter_context_priv*)blitter;
+   struct pipe_context *pipe = ctx->pipe;
+
+   assert(num_cbufs <= 8);
+
+   blitter_check_saved_CSOs(ctx);
+
+   /* bind CSOs */
+   if (clear_buffers & PIPE_CLEAR_COLOR)
+      pipe->bind_blend_state(pipe, ctx->blend_write_color);
+   else
+      pipe->bind_blend_state(pipe, ctx->blend_keep_color);
+
+   if (clear_buffers & PIPE_CLEAR_DEPTHSTENCIL)
+      pipe->bind_depth_stencil_alpha_state(pipe,
+         ctx->dsa_write_depth_stencil[stencil&0xff]);
+   else
+      pipe->bind_depth_stencil_alpha_state(pipe, ctx->dsa_keep_depth_stencil);
+
+   pipe->bind_rasterizer_state(pipe, ctx->rs_state);
+   pipe->bind_fs_state(pipe, ctx->fs_col[num_cbufs ? num_cbufs-1 : 0]);
+   pipe->bind_vs_state(pipe, ctx->vs_col);
+
+   blitter_set_clear_color(ctx, rgba);
+   blitter_set_rectangle(ctx, 0, 0, width, height, depth);
+   blitter_draw_quad(ctx);
+   blitter_restore_CSOs(ctx);
+}
+
+void util_blitter_copy(struct blitter_context *blitter,
+                       struct pipe_surface *dst,
+                       unsigned dstx, unsigned dsty,
+                       struct pipe_surface *src,
+                       unsigned srcx, unsigned srcy,
+                       unsigned width, unsigned height,
+                       boolean ignore_stencil)
+{
+   struct blitter_context_priv *ctx = (struct blitter_context_priv*)blitter;
+   struct pipe_context *pipe = ctx->pipe;
+   struct pipe_screen *screen = pipe->screen;
+   struct pipe_framebuffer_state fb_state;
+   boolean is_stencil, is_depth;
+   unsigned dst_tex_usage;
+
+   /* give up if textures are not set */
+   assert(dst->texture && src->texture);
+   if (!dst->texture || !src->texture)
+      return;
+
+   is_depth = pf_get_component_bits(src->format, PIPE_FORMAT_COMP_Z) != 0;
+   is_stencil = pf_get_component_bits(src->format, PIPE_FORMAT_COMP_S) != 0;
+   dst_tex_usage = is_depth || is_stencil ? PIPE_TEXTURE_USAGE_DEPTH_STENCIL :
+                                            PIPE_TEXTURE_USAGE_RENDER_TARGET;
+
+   /* check if we can sample from and render to the surfaces */
+   /* (assuming copying a stencil buffer is not possible) */
+   if ((!ignore_stencil && is_stencil) ||
+       !screen->is_format_supported(screen, dst->format, dst->texture->target,
+                                    dst_tex_usage, 0) ||
+       !screen->is_format_supported(screen, src->format, src->texture->target,
+                                    PIPE_TEXTURE_USAGE_SAMPLER, 0)) {
+      util_surface_copy(pipe, FALSE, dst, dstx, dsty, src, srcx, srcy,
+                        width, height);
+      return;
+   }
+
+   /* check whether the states are properly saved */
+   blitter_check_saved_CSOs(ctx);
+   assert(blitter->saved_fb_state.nr_cbufs != ~0);
+   assert(blitter->saved_num_textures != ~0);
+   assert(blitter->saved_num_sampler_states != ~0);
+   assert(src->texture->target < 4);
+
+   /* bind CSOs */
+   fb_state.width = dst->width;
+   fb_state.height = dst->height;
+
+   if (is_depth) {
+      pipe->bind_blend_state(pipe, ctx->blend_keep_color);
+      pipe->bind_depth_stencil_alpha_state(pipe,
+                                           ctx->dsa_write_depth_keep_stencil);
+      pipe->bind_fs_state(pipe, ctx->fs_texfetch_depth[src->texture->target]);
+
+      fb_state.nr_cbufs = 0;
+      fb_state.zsbuf = dst;
+   } else {
+      pipe->bind_blend_state(pipe, ctx->blend_write_color);
+      pipe->bind_depth_stencil_alpha_state(pipe, ctx->dsa_keep_depth_stencil);
+      pipe->bind_fs_state(pipe, ctx->fs_texfetch_col[src->texture->target]);
+
+      fb_state.nr_cbufs = 1;
+      fb_state.cbufs[0] = dst;
+      fb_state.zsbuf = 0;
+   }
+   pipe->bind_rasterizer_state(pipe, ctx->rs_state);
+   pipe->bind_vs_state(pipe, ctx->vs_tex);
+   pipe->bind_fragment_sampler_states(pipe, 1, &ctx->sampler_state[src->level]);
+   pipe->set_fragment_sampler_textures(pipe, 1, &src->texture);
+   pipe->set_framebuffer_state(pipe, &fb_state);
+
+   /* set texture coordinates */
+   switch (src->texture->target) {
+      case PIPE_TEXTURE_1D:
+      case PIPE_TEXTURE_2D:
+         blitter_set_texcoords_2d(ctx, src, srcx, srcy,
+                                  srcx+width, srcy+height);
+         break;
+      case PIPE_TEXTURE_3D:
+         blitter_set_texcoords_3d(ctx, src, srcx, srcy,
+                                  srcx+width, srcy+height);
+         break;
+      case PIPE_TEXTURE_CUBE:
+         blitter_set_texcoords_cube(ctx, src, srcx, srcy,
+                                    srcx+width, srcy+height);
+         break;
+   }
+
+   blitter_set_rectangle(ctx, dstx, dsty, dstx+width, dsty+height, 0);
+   blitter_draw_quad(ctx);
+   blitter_restore_CSOs(ctx);
+}
+
+void util_blitter_fill(struct blitter_context *blitter,
+                       struct pipe_surface *dst,
+                       unsigned dstx, unsigned dsty,
+                       unsigned width, unsigned height,
+                       unsigned value)
+{
+   struct blitter_context_priv *ctx = (struct blitter_context_priv*)blitter;
+   struct pipe_context *pipe = ctx->pipe;
+   struct pipe_screen *screen = pipe->screen;
+   struct pipe_framebuffer_state fb_state;
+   float rgba[4];
+   ubyte ub_rgba[4] = {0};
+   union util_color color;
+   int i;
+
+   assert(dst->texture);
+   if (!dst->texture)
+      return;
+
+   /* check if we can render to the surface */
+   if (pf_is_depth_or_stencil(dst->format) || /* unlikely, but you never know */
+       !screen->is_format_supported(screen, dst->format, dst->texture->target,
+                                    PIPE_TEXTURE_USAGE_RENDER_TARGET, 0)) {
+      util_surface_fill(pipe, dst, dstx, dsty, width, height, value);
+      return;
+   }
+
+   /* unpack the color */
+   color.ui = value;
+   util_unpack_color_ub(dst->format, &color,
+                        ub_rgba, ub_rgba+1, ub_rgba+2, ub_rgba+3);
+   for (i = 0; i < 4; i++)
+      rgba[i] = ubyte_to_float(ub_rgba[i]);
+
+   /* check the saved state */
+   blitter_check_saved_CSOs(ctx);
+   assert(blitter->saved_fb_state.nr_cbufs != ~0);
+
+   /* bind CSOs */
+   pipe->bind_blend_state(pipe, ctx->blend_write_color);
+   pipe->bind_depth_stencil_alpha_state(pipe, ctx->dsa_keep_depth_stencil);
+   pipe->bind_rasterizer_state(pipe, ctx->rs_state);
+   pipe->bind_fs_state(pipe, ctx->fs_col[0]);
+   pipe->bind_vs_state(pipe, ctx->vs_col);
+
+   /* set a framebuffer state */
+   fb_state.width = dst->width;
+   fb_state.height = dst->height;
+   fb_state.nr_cbufs = 1;
+   fb_state.cbufs[0] = dst;
+   fb_state.zsbuf = 0;
+   pipe->set_framebuffer_state(pipe, &fb_state);
+
+   blitter_set_clear_color(ctx, rgba);
+   blitter_set_rectangle(ctx, dstx, dsty, dstx+width, dsty+height, 0);
+   blitter_draw_quad(ctx);
+   blitter_restore_CSOs(ctx);
+}
diff --git a/src/gallium/auxiliary/util/u_blitter.h b/src/gallium/auxiliary/util/u_blitter.h
new file mode 100644
index 0000000..e4cbb5c
--- /dev/null
+++ b/src/gallium/auxiliary/util/u_blitter.h
@@ -0,0 +1,244 @@
+/**************************************************************************
+ *
+ * Copyright 2009 Marek Olšák <maraeo@...>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
+ * IN NO EVENT SHALL TUNGSTEN GRAPHICS AND/OR ITS SUPPLIERS BE LIABLE FOR
+ * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ **************************************************************************/
+
+#ifndef U_BLITTER_H
+#define U_BLITTER_H
+
+#include "util/u_memory.h"
+
+#include "pipe/p_state.h"
+
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+struct pipe_context;
+
+struct blitter_context
+{
+   /**
+    * Draw a quad.
+    *
+    * The pipe driver can set this to provide a more efficient way of drawing
+    * a quad. If it's NULL, the quad is drawn using a vertex buffer.
+    *
+    * There are always 4 vertices with interleaved vertex elements of type
+    * RGBA32F. See the vertex shader _output_ semantics to know what those are.
+    * The primitive type is always PIPE_PRIM_TRIANGLE_FAN and VS/clip/viewport
+    * is bypassed.
+    */
+   void (*draw_quad)(struct pipe_context *pipe,
+                     const float *vertices);
+
+   /* Private members, really. */
+   void *saved_blend_state;   /**< blend state */
+   void *saved_dsa_state;     /**< depth stencil alpha state */
+   void *saved_rs_state;      /**< rasterizer state */
+   void *saved_fs, *saved_vs; /**< fragment shader, vertex shader */
+
+   struct pipe_framebuffer_state saved_fb_state;  /**< framebuffer state */
+
+   int saved_num_sampler_states;
+   void *saved_sampler_states[32];
+
+   int saved_num_textures;
+   struct pipe_texture *saved_textures[32]; /* is 32 enough? */
+};
+
+/**
+ * Create a blitter context.
+ */
+struct blitter_context *util_blitter_create(struct pipe_context *pipe);
+
+/**
+ * Destroy a blitter context.
+ */
+void util_blitter_destroy(struct blitter_context *blitter);
+
+/*
+ * These CSOs must be saved before any of the following functions is called:
+ * - blend state
+ * - depth stencil alpha state
+ * - rasterizer state
+ * - vertex shader
+ * - fragment shader
+ */
+
+/**
+ * Clear a specified set of currently bound buffers to specified values.
+ */
+void util_blitter_clear(struct blitter_context *blitter,
+                        unsigned width, unsigned height,
+                        unsigned num_cbufs,
+                        unsigned clear_buffers,
+                        const float *rgba,
+                        double depth, unsigned stencil);
+
+/**
+ * Copy a block of pixels from one surface to another.
+ *
+ * You can copy from any color format to any other color format provided
+ * the former can be sampled and the latter can be rendered to. Otherwise,
+ * a software fallback path is taken and both surfaces must be of the same
+ * format.
+ *
+ * The same holds for depth-stencil formats, except that the stencil part
+ * is copied only if ignore_stencil is FALSE. In that case, a software
+ * fallback path is taken and both surfaces must be of the same
+ * format.
+ *
+ * Use pipe_screen->is_format_supported to know your options.
+ *
+ * These states must be saved in the blitter in addition to the state objects
+ * already required to be saved:
+ * - framebuffer state
+ * - fragment sampler states
+ * - fragment sampler textures
+ */
+void util_blitter_copy(struct blitter_context *blitter,
+                       struct pipe_surface *dst,
+                       unsigned dstx, unsigned dsty,
+                       struct pipe_surface *src,
+                       unsigned srcx, unsigned srcy,
+                       unsigned width, unsigned height,
+                       boolean ignore_stencil);
+
+/**
+ * Fill a region of a surface with a constant value.
+ *
+ * If the surface cannot be rendered to or it's a depth-stencil format,
+ * a software fallback path is taken.
+ *
+ * These states must be saved in the blitter in addition to the state objects
+ * already required to be saved:
+ * - framebuffer state
+ */
+void util_blitter_fill(struct blitter_context *blitter,
+                       struct pipe_surface *dst,
+                       unsigned dstx, unsigned dsty,
+                       unsigned width, unsigned height,
+                       unsigned value);
+
+/**
+ * Copy all pixels from one surface to another.
+ *
+ * The rules are the same as in util_blitter_copy with the addition that
+ * surfaces must have the same size.
+ */
+static INLINE
+void util_blitter_copy_surface(struct blitter_context *blitter,
+                               struct pipe_surface *dst,
+                               struct pipe_surface *src,
+                               boolean ignore_stencil)
+{
+   assert(dst->width == src->width && dst->height == src->height);
+
+   util_blitter_copy(blitter, dst, 0, 0, src, 0, 0, src->width, src->height,
+                     ignore_stencil);
+}
+
+
+/* The functions below should be used to save currently bound constant state
+ * objects inside a driver. The objects are automatically restored at the end
+ * of the util_blitter_{clear, fill, copy, copy_surface} functions and then
+ * forgotten.
+ *
+ * CSOs not listed here are not affected by util_blitter. */
+
+static INLINE
+void util_blitter_save_blend(struct blitter_context *blitter,
+                             void *state)
+{
+   blitter->saved_blend_state = state;
+}
+
+static INLINE
+void util_blitter_save_depth_stencil_alpha(struct blitter_context *blitter,
+                                           void *state)
+{
+   blitter->saved_dsa_state = state;
+}
+
+static INLINE
+void util_blitter_save_rasterizer(struct blitter_context *blitter,
+                                  void *state)
+{
+   blitter->saved_rs_state = state;
+}
+
+static INLINE
+void util_blitter_save_fragment_shader(struct blitter_context *blitter,
+                                       void *fs)
+{
+   blitter->saved_fs = fs;
+}
+
+static INLINE
+void util_blitter_save_vertex_shader(struct blitter_context *blitter,
+                                     void *vs)
+{
+   blitter->saved_vs = vs;
+}
+
+static INLINE
+void util_blitter_save_framebuffer(struct blitter_context *blitter,
+                                   struct pipe_framebuffer_state *state)
+{
+   blitter->saved_fb_state = *state;
+}
+
+static INLINE
+void util_blitter_save_fragment_sampler_states(
+                  struct blitter_context *blitter,
+                  int num_sampler_states,
+                  void **sampler_states)
+{
+   assert(num_sampler_states <= Elements(blitter->saved_sampler_states));
+
+   blitter->saved_num_sampler_states = num_sampler_states;
+   memcpy(blitter->saved_sampler_states, sampler_states,
+          num_sampler_states * sizeof(void *));
+}
+
+static INLINE
+void util_blitter_save_fragment_sampler_textures(
+                  struct blitter_context *blitter,
+                  int num_textures,
+                  struct pipe_texture **textures)
+{
+   assert(num_textures <= Elements(blitter->saved_textures));
+
+   blitter->saved_num_textures = num_textures;
+   memcpy(blitter->saved_textures, textures,
+          num_textures * sizeof(struct pipe_texture *));
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
--
1.6.3.3
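
For readers following the save/restore contract in u_blitter.h, here is a minimal stand-alone sketch of the idea. It is not Gallium code: every sketch_* name is hypothetical and only models the pattern visible in the patch, where mandatory CSOs are asserted to be saved, optional state uses ~0 as a "not saved" marker, and everything is restored and then forgotten after the draw.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real Gallium types. */
#define SKETCH_NOT_SAVED (~0)

struct sketch_blitter {
   void *saved_blend;       /* mandatory: must be set before every blit */
   int saved_num_textures;  /* optional: SKETCH_NOT_SAVED until saved */
};

struct sketch_driver {
   void *bound_blend;
   int bound_num_textures;
};

static void sketch_blitter_init(struct sketch_blitter *b)
{
   b->saved_blend = NULL;
   b->saved_num_textures = SKETCH_NOT_SAVED;
}

static void sketch_blit(struct sketch_driver *drv, struct sketch_blitter *b,
                        void *blit_blend)
{
   /* the mandatory state must have been saved by the caller */
   assert(b->saved_blend);

   /* bind the blitter's own state, clobbering the driver's */
   drv->bound_blend = blit_blend;
   drv->bound_num_textures = 1;

   /* ... the quad would be drawn here ... */

   /* restore the mandatory state and forget it */
   drv->bound_blend = b->saved_blend;
   b->saved_blend = NULL;

   /* restore the optional state only if it was saved, then forget it */
   if (b->saved_num_textures != SKETCH_NOT_SAVED) {
      drv->bound_num_textures = b->saved_num_textures;
      b->saved_num_textures = SKETCH_NOT_SAVED;
   }
}
```

In a real driver the equivalent steps would presumably be the util_blitter_save_* calls from the header followed by util_blitter_copy or util_blitter_fill; since the saved pointers are cleared on restore, they have to be saved again before each call.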

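One detail in the patch above may be worth double-checking: dsa_write_depth_stencil is declared with 0xff (255) elements and filled by a loop running while i < 0xff, but util_blitter_clear indexes it with stencil&0xff, which can be 255. A minimal sketch of the sizing, assuming one state per 8-bit stencil value is what is intended (all sketch_* names are hypothetical, and ints stand in for the state objects):

```c
#include <assert.h>

/* 0x100 slots cover every index that (stencil & 0xff) can produce. */
#define SKETCH_NUM_STENCIL_VALUES 0x100

static int sketch_state_for_stencil[SKETCH_NUM_STENCIL_VALUES];

static void sketch_create_states(void)
{
   unsigned i;

   for (i = 0; i < SKETCH_NUM_STENCIL_VALUES; i++)
      sketch_state_for_stencil[i] = (int)i + 1; /* nonzero means "created" */
}

static int sketch_lookup(unsigned stencil)
{
   /* the mask keeps the index within 0x00..0xff */
   return sketch_state_for_stencil[stencil & 0xff];
}
```

With 0xff slots, a clear to stencil value 0xff would read one element past the end of the array; the create and destroy loops would need the same bound as the declaration.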

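The vertex layout handed to draw_quad can be hard to picture from the diff alone. The sketch below is a stand-alone model, not the patch's code: it fills a float[4][2][4] array the way blitter_set_rectangle and blitter_set_texcoords_2d appear to, with position in attribute 0 and normalized texcoords in attribute 1. The sketch_* names and the combined helper are assumptions for illustration.

```c
#include <assert.h>

/* 4 vertices, 2 attributes per vertex, 4 floats per attribute. */
static float sketch_vertices[4][2][4];

static void sketch_set_quad(unsigned x1, unsigned y1,
                            unsigned x2, unsigned y2,
                            unsigned surf_width, unsigned surf_height)
{
   /* corner order matches a triangle fan: v0, v1, v2, v3 */
   const unsigned px[4] = { x1, x2, x2, x1 };
   const unsigned py[4] = { y1, y1, y2, y2 };
   int i;

   assert(surf_width && surf_height);

   for (i = 0; i < 4; i++) {
      /* attribute 0: position in pixels, z = 0, w = 1 */
      sketch_vertices[i][0][0] = (float)px[i];
      sketch_vertices[i][0][1] = (float)py[i];
      sketch_vertices[i][0][2] = 0.0f;
      sketch_vertices[i][0][3] = 1.0f;

      /* attribute 1: texcoords normalized by the surface size, r = 0, q = 1 */
      sketch_vertices[i][1][0] = px[i] / (float)surf_width;
      sketch_vertices[i][1][1] = py[i] / (float)surf_height;
      sketch_vertices[i][1][2] = 0.0f;
      sketch_vertices[i][1][3] = 1.0f;
   }
}
```

Viewed as a flat float pointer, which is what draw_quad receives, each vertex occupies 8 floats, so the texcoord of vertex n starts at index n*8 + 4. That matches the stride of 8 passed to util_map_texcoords2d_onto_cubemap in the cube path.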

[0004-pipe-add-PIPE_MAX_TEXTURE_TYPES.patch]

From b781b83f0d119b0c3dc6a4ce3f7e31a7084219be Mon Sep 17 00:00:00 2001
From: =?utf-8?q?Marek=20Ol=C5=A1=C3=A1k?= <maraeo@...>
Date: Mon, 14 Dec 2009 19:05:15 +0100
Subject: [PATCH 4/7] pipe: add PIPE_MAX_TEXTURE_TYPES

---
 src/gallium/include/pipe/p_defines.h |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h
index 69a0970..fe1390d 100644
--- a/src/gallium/include/pipe/p_defines.h
+++ b/src/gallium/include/pipe/p_defines.h
@@ -140,7 +140,8 @@ enum pipe_texture_target {
    PIPE_TEXTURE_1D   = 0,
    PIPE_TEXTURE_2D   = 1,
    PIPE_TEXTURE_3D   = 2,
-   PIPE_TEXTURE_CUBE = 3
+   PIPE_TEXTURE_CUBE = 3,
+   PIPE_MAX_TEXTURE_TYPES
 };
 
 #define PIPE_TEX_FACE_POS_X 0
--
1.6.3.3
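
The trailing enumerator added by this patch is the usual C idiom for sizing tables by the number of enum values. A minimal stand-alone sketch follows, using hypothetical SKETCH_* names in place of the real pipe_texture_target enum and strings in place of shader objects:

```c
#include <assert.h>
#include <stddef.h>

enum sketch_texture_target {
   SKETCH_TEXTURE_1D   = 0,
   SKETCH_TEXTURE_2D   = 1,
   SKETCH_TEXTURE_3D   = 2,
   SKETCH_TEXTURE_CUBE = 3,
   SKETCH_MAX_TEXTURE_TYPES /* one past the last target, i.e. the count */
};

/* One slot per target; grows automatically if a target is added above. */
static const char *sketch_shader_for[SKETCH_MAX_TEXTURE_TYPES];

static unsigned sketch_fill_table(void)
{
   static const char *const names[SKETCH_MAX_TEXTURE_TYPES] = {
      "1D", "2D", "3D", "CUBE"
   };
   unsigned i, filled = 0;

   for (i = 0; i < SKETCH_MAX_TEXTURE_TYPES; i++) {
      sketch_shader_for[i] = names[i];
      filled++;
   }
   return filled;
}
```

Because an enumerator without an explicit value takes the previous value plus one, the sentinel equals 4 here and keeps tracking the count as long as new targets are inserted before it.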



[0005-util-blitter-use-PIPE_MAX_-limits-and-fix-a-memory-l.patch]

From 2f8bfcbe1223e29efa188b7bdd0d87fc64b749a8 Mon Sep 17 00:00:00 2001
From: =?utf-8?q?Marek=20Ol=C5=A1=C3=A1k?= <maraeo@...>
Date: Mon, 14 Dec 2009 19:14:49 +0100
Subject: [PATCH 5/7] util/blitter: use PIPE_MAX_* limits, and fix a memory leak

---
 src/gallium/auxiliary/util/u_blitter.c |   40 +++++++++++++++++++++----------
 1 files changed, 27 insertions(+), 13 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_blitter.c b/src/gallium/auxiliary/util/u_blitter.c
index e51a5df..f8f9e4a 100644
--- a/src/gallium/auxiliary/util/u_blitter.c
+++ b/src/gallium/auxiliary/util/u_blitter.c
@@ -62,10 +62,16 @@ struct blitter_context_priv
    void *vs_tex; /**<Vertex shader which passes {pos, texcoord} to the output.*/
 
    /* Fragment shaders. */
-   void *fs_col[8];     /**< FS which outputs colors to 1-8 color buffers */
-   void *fs_texfetch_col[4];   /**< FS which outputs a color from a texture */
-   void *fs_texfetch_depth[4]; /**< FS which outputs a depth from a texture,
-                              where the index is PIPE_TEXTURE_* to be sampled */
+   /* FS which outputs a color to multiple color buffers. */
+   void *fs_col[PIPE_MAX_COLOR_BUFS];
+
+   /* FS which outputs a color from a texture,
+      where the index is PIPE_TEXTURE_* to be sampled. */
+   void *fs_texfetch_col[PIPE_MAX_TEXTURE_TYPES];
+
+   /* FS which outputs a depth from a texture,
+      where the index is PIPE_TEXTURE_* to be sampled. */
+   void *fs_texfetch_depth[PIPE_MAX_TEXTURE_TYPES];
 
    /* Blend state. */
    void *blend_write_color;   /**< blend state with writemask of RGBA */
@@ -76,9 +82,11 @@ struct blitter_context_priv
    void *dsa_write_depth_keep_stencil;
    void *dsa_keep_depth_stencil;
 
-   /* Other state. */
-   void *sampler_state[16];   /**< sampler state for clamping to a miplevel */
-   void *rs_state;            /**< rasterizer state */
+   /* Sampler state for clamping to a miplevel. */
+   void *sampler_state[PIPE_MAX_TEXTURE_LEVELS];
+
+   /* Rasterizer state. */
+   void *rs_state;
 };
 
 struct blitter_context *util_blitter_create(struct pipe_context *pipe)
@@ -142,7 +150,7 @@ struct blitter_context *util_blitter_create(struct pipe_context *pipe)
    sampler_state.wrap_t = PIPE_TEX_WRAP_CLAMP_TO_EDGE;
    sampler_state.wrap_r = PIPE_TEX_WRAP_CLAMP_TO_EDGE;
 
-   for (i = 0; i < 16; i++) {
+   for (i = 0; i < PIPE_MAX_TEXTURE_LEVELS; i++) {
       sampler_state.lod_bias = i;
       sampler_state.min_lod = i;
       sampler_state.max_lod = i;
@@ -197,7 +205,7 @@ struct blitter_context *util_blitter_create(struct pipe_context *pipe)
 
    max_render_targets = pipe->screen->get_param(pipe->screen,
                                                 PIPE_CAP_MAX_RENDER_TARGETS);
-   assert(max_render_targets <= 8);
+   assert(max_render_targets <= PIPE_MAX_COLOR_BUFS);
    for (i = 0; i < max_render_targets; i++)
       ctx->fs_col[i] = util_make_fragment_clonecolor_shader(pipe, 1+i);
 
@@ -234,13 +242,17 @@ void util_blitter_destroy(struct blitter_context *blitter)
    pipe->delete_vs_state(pipe, ctx->vs_col);
    pipe->delete_vs_state(pipe, ctx->vs_tex);
 
-   for (i = 0; i < 4; i++) {
+   for (i = 0; i < PIPE_MAX_TEXTURE_TYPES; i++) {
       pipe->delete_fs_state(pipe, ctx->fs_texfetch_col[i]);
       pipe->delete_fs_state(pipe, ctx->fs_texfetch_depth[i]);
    }
-   for (i = 0; i < 8 && ctx->fs_col[i]; i++)
+
+   for (i = 0; i < PIPE_MAX_COLOR_BUFS && ctx->fs_col[i]; i++)
       pipe->delete_fs_state(pipe, ctx->fs_col[i]);
 
+   for (i = 0; i < PIPE_MAX_TEXTURE_LEVELS; i++)
+      pipe->delete_sampler_state(pipe, ctx->sampler_state[i]);
+
    pipe_buffer_reference(&ctx->vbuf, NULL);
    FREE(ctx);
 }
@@ -426,7 +438,7 @@ void util_blitter_clear(struct blitter_context *blitter,
    struct blitter_context_priv *ctx = (struct blitter_context_priv*)blitter;
    struct pipe_context *pipe = ctx->pipe;
 
-   assert(num_cbufs <= 8);
+   assert(num_cbufs <= PIPE_MAX_COLOR_BUFS);
 
    blitter_check_saved_CSOs(ctx);
 
@@ -494,7 +506,7 @@ void util_blitter_copy(struct blitter_context *blitter,
    assert(blitter->saved_fb_state.nr_cbufs != ~0);
    assert(blitter->saved_num_textures != ~0);
    assert(blitter->saved_num_sampler_states != ~0);
-   assert(src->texture->target < 4);
+   assert(src->texture->target < PIPE_MAX_TEXTURE_TYPES);
 
    /* bind CSOs */
    fb_state.width = dst->width;
@@ -538,6 +550,8 @@ void util_blitter_copy(struct blitter_context *blitter,
          blitter_set_texcoords_cube(ctx, src, srcx, srcy,
                                     srcx+width, srcy+height);
          break;
+      default:
+         assert(0);
    }
 
    blitter_set_rectangle(ctx, dstx, dsty, dstx+width, dsty+height, 0);
--
1.6.3.3



[0006-util-blitter-allocate-most-of-the-state-objects-on-d.patch]

From 61d103c43b7e26bc406f159aa572e468366abcae Mon Sep 17 00:00:00 2001
From: =?utf-8?q?Marek=20Ol=C5=A1=C3=A1k?= <maraeo@...>
Date: Tue, 15 Dec 2009 00:26:10 +0100
Subject: [PATCH 6/7] util/blitter: allocate most of the state objects on-demand

---
 src/gallium/auxiliary/util/u_blitter.c |  254 ++++++++++++++++++++++----------
 1 files changed, 179 insertions(+), 75 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_blitter.c b/src/gallium/auxiliary/util/u_blitter.c
index f8f9e4a..42efa86 100644
--- a/src/gallium/auxiliary/util/u_blitter.c
+++ b/src/gallium/auxiliary/util/u_blitter.c
@@ -56,6 +56,10 @@ struct blitter_context_priv
 
    float vertices[4][2][4];   /**< {pos, color} or {pos, texcoord} */
 
+   /* Templates for various state objects. */
+   struct pipe_depth_stencil_alpha_state template_dsa;
+   struct pipe_sampler_state template_sampler_state;
+
    /* Constant state objects. */
    /* Vertex shaders. */
    void *vs_col; /**< Vertex shader which passes {pos, color} to the output */
@@ -93,10 +97,10 @@ struct blitter_context *util_blitter_create(struct pipe_context *pipe)
 {
    struct blitter_context_priv *ctx;
    struct pipe_blend_state blend;
-   struct pipe_depth_stencil_alpha_state dsa;
+   struct pipe_depth_stencil_alpha_state *dsa;
    struct pipe_rasterizer_state rs_state;
-   struct pipe_sampler_state sampler_state;
-   unsigned i, max_render_targets;
+   struct pipe_sampler_state *sampler_state;
+   unsigned i;
 
    ctx = CALLOC_STRUCT(blitter_context_priv);
    if (!ctx)
@@ -117,46 +121,33 @@ struct blitter_context *util_blitter_create(struct pipe_context *pipe)
    ctx->blend_write_color = pipe->create_blend_state(pipe, &blend);
 
    /* depth stencil alpha state objects */
-   memset(&dsa, 0, sizeof(dsa));
+   dsa = &ctx->template_dsa;
    ctx->dsa_keep_depth_stencil =
-      pipe->create_depth_stencil_alpha_state(pipe, &dsa);
+      pipe->create_depth_stencil_alpha_state(pipe, dsa);
 
-   dsa.depth.enabled = 1;
-   dsa.depth.writemask = 1;
-   dsa.depth.func = PIPE_FUNC_ALWAYS;
+   dsa->depth.enabled = 1;
+   dsa->depth.writemask = 1;
+   dsa->depth.func = PIPE_FUNC_ALWAYS;
    ctx->dsa_write_depth_keep_stencil =
-      pipe->create_depth_stencil_alpha_state(pipe, &dsa);
-
-   dsa.stencil[0].enabled = 1;
-   dsa.stencil[0].func = PIPE_FUNC_ALWAYS;
-   dsa.stencil[0].fail_op = PIPE_STENCIL_OP_REPLACE;
-   dsa.stencil[0].zpass_op = PIPE_STENCIL_OP_REPLACE;
-   dsa.stencil[0].zfail_op = PIPE_STENCIL_OP_REPLACE;
-   dsa.stencil[0].valuemask = 0xff;
-   dsa.stencil[0].writemask = 0xff;
-
-   /* create a depth stencil alpha state for each possible stencil clear
-    * value */
-   for (i = 0; i < 0xff; i++) {
-      dsa.stencil[0].ref_value = i;
-
-      ctx->dsa_write_depth_stencil[i] =
-         pipe->create_depth_stencil_alpha_state(pipe, &dsa);
-   }
+      pipe->create_depth_stencil_alpha_state(pipe, dsa);
+
+   dsa->stencil[0].enabled = 1;
+   dsa->stencil[0].func = PIPE_FUNC_ALWAYS;
+   dsa->stencil[0].fail_op = PIPE_STENCIL_OP_REPLACE;
+   dsa->stencil[0].zpass_op = PIPE_STENCIL_OP_REPLACE;
+   dsa->stencil[0].zfail_op = PIPE_STENCIL_OP_REPLACE;
+   dsa->stencil[0].valuemask = 0xff;
+   dsa->stencil[0].writemask = 0xff;
+   /* The DSA state objects which write depth and stencil are created
+    * on-demand. */
 
    /* sampler state */
-   memset(&sampler_state, 0, sizeof(sampler_state));
-   sampler_state.wrap_s = PIPE_TEX_WRAP_CLAMP_TO_EDGE;
-   sampler_state.wrap_t = PIPE_TEX_WRAP_CLAMP_TO_EDGE;
-   sampler_state.wrap_r = PIPE_TEX_WRAP_CLAMP_TO_EDGE;
-
-   for (i = 0; i < PIPE_MAX_TEXTURE_LEVELS; i++) {
-      sampler_state.lod_bias = i;
-      sampler_state.min_lod = i;
-      sampler_state.max_lod = i;
-
-      ctx->sampler_state[i] = pipe->create_sampler_state(pipe, &sampler_state);
-   }
+   sampler_state = &ctx->template_sampler_state;
+   sampler_state->wrap_s = PIPE_TEX_WRAP_CLAMP_TO_EDGE;
+   sampler_state->wrap_t = PIPE_TEX_WRAP_CLAMP_TO_EDGE;
+   sampler_state->wrap_r = PIPE_TEX_WRAP_CLAMP_TO_EDGE;
+   /* The sampler state objects which sample from a specified mipmap level
+    * are created on-demand. */
 
    /* rasterizer state */
    memset(&rs_state, 0, sizeof(rs_state));
@@ -166,6 +157,8 @@ struct blitter_context *util_blitter_create(struct pipe_context *pipe)
    rs_state.gl_rasterization_rules = 1;
    ctx->rs_state = pipe->create_rasterizer_state(pipe, &rs_state);
 
+   /* fragment shaders are created on-demand */
+
    /* vertex shaders */
    {
       const uint semantic_names[] = { TGSI_SEMANTIC_POSITION,
@@ -184,31 +177,6 @@ struct blitter_context *util_blitter_create(struct pipe_context *pipe)
                                              semantic_indices);
    }
 
-   /* fragment shaders */
-   ctx->fs_texfetch_col[PIPE_TEXTURE_1D] =
-      util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_1D);
-   ctx->fs_texfetch_col[PIPE_TEXTURE_2D] =
-      util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_2D);
-   ctx->fs_texfetch_col[PIPE_TEXTURE_3D] =
-      util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_3D);
-   ctx->fs_texfetch_col[PIPE_TEXTURE_CUBE] =
-      util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_CUBE);
-
-   ctx->fs_texfetch_depth[PIPE_TEXTURE_1D] =
-      util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_1D);
-   ctx->fs_texfetch_depth[PIPE_TEXTURE_2D] =
-      util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_2D);
-   ctx->fs_texfetch_depth[PIPE_TEXTURE_3D] =
-      util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_3D);
-   ctx->fs_texfetch_depth[PIPE_TEXTURE_CUBE] =
-      util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_CUBE);
-
-   max_render_targets = pipe->screen->get_param(pipe->screen,
-                                                PIPE_CAP_MAX_RENDER_TARGETS);
-   assert(max_render_targets <= PIPE_MAX_COLOR_BUFS);
-   for (i = 0; i < max_render_targets; i++)
-      ctx->fs_col[i] = util_make_fragment_clonecolor_shader(pipe, 1+i);
-
    /* set invariant vertex coordinates */
    for (i = 0; i < 4; i++)
       ctx->vertices[i][0][3] = 1; /*v.w*/
@@ -235,23 +203,28 @@ void util_blitter_destroy(struct blitter_context *blitter)
                                           ctx->dsa_write_depth_keep_stencil);
 
    for (i = 0; i < 0xff; i++)
-      pipe->delete_depth_stencil_alpha_state(pipe,
-                                             ctx->dsa_write_depth_stencil[i]);
+      if (ctx->dsa_write_depth_stencil[i])
+         pipe->delete_depth_stencil_alpha_state(pipe,
+            ctx->dsa_write_depth_stencil[i]);
 
    pipe->delete_rasterizer_state(pipe, ctx->rs_state);
    pipe->delete_vs_state(pipe, ctx->vs_col);
    pipe->delete_vs_state(pipe, ctx->vs_tex);
 
    for (i = 0; i < PIPE_MAX_TEXTURE_TYPES; i++) {
-      pipe->delete_fs_state(pipe, ctx->fs_texfetch_col[i]);
-      pipe->delete_fs_state(pipe, ctx->fs_texfetch_depth[i]);
+      if (ctx->fs_texfetch_col[i])
+         pipe->delete_fs_state(pipe, ctx->fs_texfetch_col[i]);
+      if (ctx->fs_texfetch_depth[i])
+         pipe->delete_fs_state(pipe, ctx->fs_texfetch_depth[i]);
    }
 
    for (i = 0; i < PIPE_MAX_COLOR_BUFS && ctx->fs_col[i]; i++)
-      pipe->delete_fs_state(pipe, ctx->fs_col[i]);
+      if (ctx->fs_col[i])
+         pipe->delete_fs_state(pipe, ctx->fs_col[i]);
 
    for (i = 0; i < PIPE_MAX_TEXTURE_LEVELS; i++)
-      pipe->delete_sampler_state(pipe, ctx->sampler_state[i]);
+      if (ctx->sampler_state[i])
+         pipe->delete_sampler_state(pipe, ctx->sampler_state[i]);
 
    pipe_buffer_reference(&ctx->vbuf, NULL);
    FREE(ctx);
@@ -428,6 +401,133 @@ static void blitter_draw_quad(struct blitter_context_priv *ctx)
    }
 }
 
+static INLINE
+void *blitter_get_state_write_depth_stencil(
+               struct blitter_context_priv *ctx,
+               unsigned stencil)
+{
+   struct pipe_context *pipe = ctx->pipe;
+
+   stencil &= 0xff;
+
+   /* Create the DSA state on-demand. */
+   if (!ctx->dsa_write_depth_stencil[stencil]) {
+      ctx->template_dsa.stencil[0].ref_value = stencil;
+
+      ctx->dsa_write_depth_stencil[stencil] =
+         pipe->create_depth_stencil_alpha_state(pipe, &ctx->template_dsa);
+   }
+
+   return ctx->dsa_write_depth_stencil[stencil];
+}
+
+static INLINE
+void **blitter_get_sampler_state(struct blitter_context_priv *ctx,
+                                 int miplevel)
+{
+   struct pipe_context *pipe = ctx->pipe;
+   struct pipe_sampler_state *sampler_state = &ctx->template_sampler_state;
+
+   assert(miplevel < PIPE_MAX_TEXTURE_LEVELS);
+
+   /* Create the sampler state on-demand. */
+   if (!ctx->sampler_state[miplevel]) {
+      sampler_state->lod_bias = miplevel;
+      sampler_state->min_lod = miplevel;
+      sampler_state->max_lod = miplevel;
+
+      ctx->sampler_state[miplevel] = pipe->create_sampler_state(pipe,
+                                                                sampler_state);
+   }
+
+   /* Return void** so that it can be passed to bind_fragment_sampler_states
+    * directly. */
+   return &ctx->sampler_state[miplevel];
+}
+
+static INLINE
+void *blitter_get_fs_col(struct blitter_context_priv *ctx, unsigned num_cbufs)
+{
+   struct pipe_context *pipe = ctx->pipe;
+   unsigned index = num_cbufs ? num_cbufs - 1 : 0;
+
+   assert(num_cbufs <= PIPE_MAX_COLOR_BUFS);
+
+   if (!ctx->fs_col[index])
+      ctx->fs_col[index] =
+         util_make_fragment_clonecolor_shader(pipe, num_cbufs);
+
+   return ctx->fs_col[index];
+}
+
+static INLINE
+void *blitter_get_fs_texfetch_col(struct blitter_context_priv *ctx,
+                                  unsigned tex_target)
+{
+   struct pipe_context *pipe = ctx->pipe;
+
+   assert(tex_target < PIPE_MAX_TEXTURE_TYPES);
+
+   /* Create the fragment shader on-demand. */
+   if (!ctx->fs_texfetch_col[tex_target]) {
+      switch (tex_target) {
+         case PIPE_TEXTURE_1D:
+            ctx->fs_texfetch_col[PIPE_TEXTURE_1D] =
+               util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_1D);
+            break;
+         case PIPE_TEXTURE_2D:
+            ctx->fs_texfetch_col[PIPE_TEXTURE_2D] =
+               util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_2D);
+            break;
+         case PIPE_TEXTURE_3D:
+            ctx->fs_texfetch_col[PIPE_TEXTURE_3D] =
+               util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_3D);
+            break;
+         case PIPE_TEXTURE_CUBE:
+            ctx->fs_texfetch_col[PIPE_TEXTURE_CUBE] =
+               util_make_fragment_tex_shader(pipe, TGSI_TEXTURE_CUBE);
+            break;
+         default:;
+      }
+   }
+
+   return ctx->fs_texfetch_col[tex_target];
+}
+
+static INLINE
+void *blitter_get_fs_texfetch_depth(struct blitter_context_priv *ctx,
+                                    unsigned tex_target)
+{
+   struct pipe_context *pipe = ctx->pipe;
+
+   assert(tex_target < PIPE_MAX_TEXTURE_TYPES);
+
+   /* Create the fragment shader on-demand. */
+   if (!ctx->fs_texfetch_depth[tex_target]) {
+      switch (tex_target) {
+         case PIPE_TEXTURE_1D:
+            ctx->fs_texfetch_depth[PIPE_TEXTURE_1D] =
+               util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_1D);
+            break;
+         case PIPE_TEXTURE_2D:
+            ctx->fs_texfetch_depth[PIPE_TEXTURE_2D] =
+               util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_2D);
+            break;
+         case PIPE_TEXTURE_3D:
+            ctx->fs_texfetch_depth[PIPE_TEXTURE_3D] =
+               util_make_fragment_tex_shader_writedepth(pipe, TGSI_TEXTURE_3D);
+            break;
+         case PIPE_TEXTURE_CUBE:
+            ctx->fs_texfetch_depth[PIPE_TEXTURE_CUBE] =
+               util_make_fragment_tex_shader_writedepth(pipe,TGSI_TEXTURE_CUBE);
+            break;
+         default:;
+      }
+   }
+
+   return ctx->fs_texfetch_depth[tex_target];
+}
+
 void util_blitter_clear(struct blitter_context *blitter,
                         unsigned width, unsigned height,
                         unsigned num_cbufs,
@@ -450,12 +550,12 @@ void util_blitter_clear(struct blitter_context *blitter,
 
    if (clear_buffers & PIPE_CLEAR_DEPTHSTENCIL)
       pipe->bind_depth_stencil_alpha_state(pipe,
-         ctx->dsa_write_depth_stencil[stencil&0xff]);
+         blitter_get_state_write_depth_stencil(ctx, stencil));
    else
       pipe->bind_depth_stencil_alpha_state(pipe, ctx->dsa_keep_depth_stencil);
 
    pipe->bind_rasterizer_state(pipe, ctx->rs_state);
-   pipe->bind_fs_state(pipe, ctx->fs_col[num_cbufs ? num_cbufs-1 : 0]);
+   pipe->bind_fs_state(pipe, blitter_get_fs_col(ctx, num_cbufs));
    pipe->bind_vs_state(pipe, ctx->vs_col);
 
    blitter_set_clear_color(ctx, rgba);
@@ -516,22 +616,26 @@ void util_blitter_copy(struct blitter_context *blitter,
       pipe->bind_blend_state(pipe, ctx->blend_keep_color);
       pipe->bind_depth_stencil_alpha_state(pipe,
                                            ctx->dsa_write_depth_keep_stencil);
-      pipe->bind_fs_state(pipe, ctx->fs_texfetch_depth[src->texture->target]);
+      pipe->bind_fs_state(pipe,
+         blitter_get_fs_texfetch_depth(ctx, src->texture->target));
 
       fb_state.nr_cbufs = 0;
       fb_state.zsbuf = dst;
    } else {
       pipe->bind_blend_state(pipe, ctx->blend_write_color);
       pipe->bind_depth_stencil_alpha_state(pipe, ctx->dsa_keep_depth_stencil);
-      pipe->bind_fs_state(pipe, ctx->fs_texfetch_col[src->texture->target]);
+      pipe->bind_fs_state(pipe,
+         blitter_get_fs_texfetch_col(ctx, src->texture->target));
 
       fb_state.nr_cbufs = 1;
       fb_state.cbufs[0] = dst;
       fb_state.zsbuf = 0;
    }
+
    pipe->bind_rasterizer_state(pipe, ctx->rs_state);
    pipe->bind_vs_state(pipe, ctx->vs_tex);
-   pipe->bind_fragment_sampler_states(pipe, 1, &ctx->sampler_state[src->level]);
+   pipe->bind_fragment_sampler_states(pipe, 1,
+      blitter_get_sampler_state(ctx, src->level));
    pipe->set_fragment_sampler_textures(pipe, 1, &src->texture);
    pipe->set_framebuffer_state(pipe, &fb_state);
 
@@ -601,7 +705,7 @@ void util_blitter_fill(struct blitter_context *blitter,
    pipe->bind_blend_state(pipe, ctx->blend_write_color);
    pipe->bind_depth_stencil_alpha_state(pipe, ctx->dsa_keep_depth_stencil);
    pipe->bind_rasterizer_state(pipe, ctx->rs_state);
-   pipe->bind_fs_state(pipe, ctx->fs_col[0]);
+   pipe->bind_fs_state(pipe, blitter_get_fs_col(ctx, 1));
    pipe->bind_vs_state(pipe, ctx->vs_col);
 
    /* set a framebuffer state */
--
1.6.3.3
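
The on-demand allocation used in the patch above can be sketched in isolation as follows. This is a minimal, self-contained illustration and not the u_blitter code: struct state_cache, cache_get, cache_destroy and MAX_SLOTS are hypothetical names, and malloc/free stand in for the pipe create/delete callbacks. Because slots are filled in no particular order, the teardown in this sketch checks every slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_SLOTS 8   /* stand-in for a limit such as PIPE_MAX_COLOR_BUFS */

/* Hypothetical cache of opaque state objects, created lazily. */
struct state_cache {
   void *slot[MAX_SLOTS];   /* NULL means "not created yet" */
   unsigned live;           /* number of objects currently allocated */
};

/* Return the object for `index`, creating it on first use. */
static void *cache_get(struct state_cache *c, unsigned index)
{
   assert(index < MAX_SLOTS);

   if (!c->slot[index]) {
      c->slot[index] = malloc(1);   /* stand-in for a create_*_state call */
      if (c->slot[index])
         c->live++;
   }
   return c->slot[index];
}

/* Destroy every object that was created. Each slot is checked on its
 * own, so a NULL entry does not end the loop early. */
static void cache_destroy(struct state_cache *c)
{
   unsigned i;

   for (i = 0; i < MAX_SLOTS; i++) {
      if (c->slot[i]) {
         free(c->slot[i]);          /* stand-in for a delete_*_state call */
         c->slot[i] = NULL;
         c->live--;
      }
   }
}
```

A caller would zero-initialize the cache (as CALLOC_STRUCT does for the blitter context), call cache_get() wherever a state object is bound, and call cache_destroy() once at teardown.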



[0007-util-blitter-kill-the-draw_quad-callback.patch]

From 4e1a135d7cef207b7bbff1759031c338e91750b5 Mon Sep 17 00:00:00 2001
From: =?utf-8?q?Marek=20Ol=C5=A1=C3=A1k?= <maraeo@...>
Date: Tue, 15 Dec 2009 01:11:22 +0100
Subject: [PATCH 7/7] util/blitter: kill the draw_quad callback

---
 src/gallium/auxiliary/util/u_blitter.c |   17 ++++++-----------
 src/gallium/auxiliary/util/u_blitter.h |   14 --------------
 2 files changed, 6 insertions(+), 25 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_blitter.c b/src/gallium/auxiliary/util/u_blitter.c
index 42efa86..895af2c 100644
--- a/src/gallium/auxiliary/util/u_blitter.c
+++ b/src/gallium/auxiliary/util/u_blitter.c
@@ -385,20 +385,15 @@ static void blitter_set_texcoords_cube(struct blitter_context_priv *ctx,
 
 static void blitter_draw_quad(struct blitter_context_priv *ctx)
 {
-   struct blitter_context *blitter = &ctx->blitter;
    struct pipe_context *pipe = ctx->pipe;
 
-   if (blitter->draw_quad) {
-      blitter->draw_quad(pipe, &ctx->vertices[0][0][0]);
-   } else {
-      /* write vertices and draw them */
-      pipe_buffer_write(pipe->screen, ctx->vbuf,
-                        0, sizeof(ctx->vertices), ctx->vertices);
+   /* write vertices and draw them */
+   pipe_buffer_write(pipe->screen, ctx->vbuf,
+                     0, sizeof(ctx->vertices), ctx->vertices);
 
-      util_draw_vertex_buffer(ctx->pipe, ctx->vbuf, 0, PIPE_PRIM_TRIANGLE_FAN,
-                              4,  /* verts */
-                              2); /* attribs/vert */
-   }
+   util_draw_vertex_buffer(pipe, ctx->vbuf, 0, PIPE_PRIM_TRIANGLE_FAN,
+                           4,  /* verts */
+                           2); /* attribs/vert */
 }
 
 static INLINE
diff --git a/src/gallium/auxiliary/util/u_blitter.h b/src/gallium/auxiliary/util/u_blitter.h
index e4cbb5c..3da5a6c 100644
--- a/src/gallium/auxiliary/util/u_blitter.h
+++ b/src/gallium/auxiliary/util/u_blitter.h
@@ -40,20 +40,6 @@ struct pipe_context;
 
 struct blitter_context
 {
-   /**
-    * Draw a quad.
-    *
-    * The pipe driver can set this to provide a more efficient way of drawing
-    * a quad. If it's NULL, the quad is drawn using a vertex buffer.
-    *
-    * There are always 4 vertices with interleaved vertex elements of type
-    * RGBA32F. See the vertex shader _output_ semantics to know what those are.
-    * The primitive type is always PIPE_PRIM_TRIANGLE_FAN and VS/clip/viewport
-    * is bypasssed.
-    */
-   void (*draw_quad)(struct pipe_context *pipe,
-                     const float *vertices);
-
    /* Private members, really. */
    void *saved_blend_state;   /**< blend state */
    void *saved_dsa_state;     /**< depth stencil alpha state */
--
1.6.3.3



_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@...
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev

Re: gallium: add blitter

by Marek Olšák

On Mon, Dec 14, 2009 at 5:42 PM, Corbin Simpson
<mostawesomedude@...> wrote:
> As far as immediate verts, why don't we just add support to r300g to switch
> to immediate mode for small VBOs?
>
> Posting from a mobile, pardon my terseness. ~ C.
>

Corbin,

that seems reasonable, and it's the reason I killed the draw_quad
function. BTW immediate mode doubles the performance in glxgears.

To others:
I noticed that there is a weird optimization in u_gen_mipmap. It
allocates a large vertex buffer and uses small chunks of it to render
consecutive quads (one for each mipmap level and cubemap face). If we
implement switching to immediate mode, it would be nice for VBOs to be
as small as possible so that the driver can easily recognize the most
efficient path. The simplest solution (4 vertices in a VBO) may end up
being the fastest one here.

Marek
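
The small-VBO heuristic discussed above could look roughly like the sketch below. It is an illustration under stated assumptions, not r300g code: choose_draw_path, enum draw_path and QUAD_VBO_BYTES are hypothetical names, and the cutoff is simply the size of the blitter's float[4][2][4] vertex array. A check on buffer size like this one would send a single-quad VBO down the immediate path, while a large shared buffer that is consumed in chunks would not be recognized.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cutoff: one quad of 4 vertices, 2 attributes per vertex,
 * 4 floats per attribute. The value is illustrative only. */
#define QUAD_VBO_BYTES (4 * 2 * 4 * sizeof(float))

enum draw_path {
   DRAW_PATH_IMMEDIATE,      /* vertices embedded in the command stream */
   DRAW_PATH_VERTEX_BUFFER   /* vertices fetched from the VBO */
};

/* Pick a path from the size of the bound vertex buffer alone. */
static enum draw_path choose_draw_path(size_t vbo_size)
{
   assert(vbo_size > 0);

   return vbo_size <= QUAD_VBO_BYTES ? DRAW_PATH_IMMEDIATE
                                     : DRAW_PATH_VERTEX_BUFFER;
}
```

With this rule, choose_draw_path(sizeof(float[4][2][4])) selects the immediate path and a 64 KiB buffer selects the vertex-buffer path.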
