Initial version of GL_MESA_gpu_program3

by Ian Romanick

Here is the initial version of the assembly extension that was discussed
at XDC.  This is a very early alpha version, and some parts are not yet
complete.  At this point, I am mainly looking for four things in a review:

- Are there any issues marked "RESOLVED" where you disagree with the
resolution?  I'm especially interested in issues 2, 4, and 19.

- Are there any issues marked "UNRESOLVED" that you have an opinion on
or data to support a resolution?  I'm especially interested in issues 7,
11, 15, and 34 (resolution may be related to 4).

- Are there any instructions listed that cannot be trivially supported
on some relevant hardware?  Some instructions expand to multiple real
instructions (e.g., NRM).  As long as the expansion is trivial and only
adds one or two extra instructions, this is okay.

- Is there some important SM3 feature that's missing?  I plan to
circulate this around the Wine community after the next revision.
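
As an illustration of the kind of "trivial expansion" meant for NRM above, here is a minimal Python sketch of the usual three-step lowering (dot product, reciprocal square root, multiply). The helper names dp3, rsq and nrm are hypothetical, and the instruction sequence in the comments is the common lowering, not something taken from the spec text.

```python
import math

def dp3(a, b):
    """Three-component dot product, as a DP3 instruction would compute."""
    return sum(x * y for x, y in zip(a[:3], b[:3]))

def rsq(x):
    """Reciprocal square root, as an RSQ instruction would compute."""
    return 1.0 / math.sqrt(x)

def nrm(v):
    """Normalize the xyz part of v using three simpler operations."""
    length_sq = dp3(v, v)                 # DP3 tmp.w, v, v
    inv_len = rsq(length_sq)              # RSQ tmp.w, tmp.w
    return [c * inv_len for c in v[:3]]   # MUL dst.xyz, v, tmp.w

if __name__ == "__main__":
    print(nrm([3.0, 0.0, 4.0]))
```

On hardware without a native NRM this costs two instructions more than a single native one would, which is within the "one or two extra instructions" budget stated above.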

There is some goofy formatting and issue numbering.  This is done to
minimize the diffs with GL_NV_gpu_program4.  The output of 'diff -d
--side-by-side -W 165 MESA_gpu_program3 NV_gpu_program4' is pretty
readable and useful.

http://people.freedesktop.org/~idr/MESA_gpu_program3.txt

------------------------------------------------------------------------------
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@...
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev

Re: Initial version of GL_MESA_gpu_program3

by Nicolai Hähnle

On Tuesday 13 October 2009 21:20:40, Ian Romanick wrote:
> Here is the initial version of the assembly extension that was discussed
> at XDC.  This is a very early alpha version, and some parts are not yet
> complete.  At this point, I am mainly looking for four things in a review:

Looks good from a very cursory look.

> - Are there any issues marked "RESOLVED" where you disagree with the
> resolution?  I'm especially interested in issues 2, 4, and 19.

Note: The following replies are based on my understanding of the hardware.
There may still be some missing or unclear information in the docs by AMD. If
this is the case, then it can hopefully be clarified in the course of this
thread.

Issue 2:
1) R500 supports unstructured branching in fragment programs but not in vertex
programs, so I'm happy about leaving it out.

2) R500 supports address registers as described in vertex programs (including
input/output offsets), but has no address registers at all in fragment
programs. A loop address register can be used as offsets in loops, but the
values loaded into this register must be determined at compile time.

Issue 4: Agreed. R500 does not support address register math.

Issue 6 (predicate registers):
Is it correct that there is only a PSEQ instruction and not the full
orthogonal set? The grammar includes the full orthogonal set, but the
instruction list seems to be missing something.

I assume predicate registers can be used to mask writes of ordinary ALU
instructions. Can they also mask TEX instructions? (R500 supports both, and
it's easy to emulate, but see caveat).

I think we can do everything you throw at us on R500. The only difficulty is
that R500 is a bit schizophrenic in that vertex programs are very different
from fragment programs, but we can emulate things. The only stupid weakness is
that swizzling predicates in fragment programs is essentially impossible (the
only natively supported swizzles are .rgba and the smears .rrrr, .gggg, .bbbb,
.aaaa). Obviously we can emulate this.
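
To make the swizzle restriction concrete, here is a small Python sketch. It only encodes what is stated above (identity and smears are native on R500 fragment programs); the per-component gating in predicated_write is an assumed model of a predicated write, not the extension's definition, and all names are hypothetical.

```python
# Predicate swizzles that, per the description above, R500 fragment
# programs support natively: the identity and the four smears.
NATIVE_SWIZZLES = {"rgba", "rrrr", "gggg", "bbbb", "aaaa"}

COMPONENT_INDEX = {"r": 0, "g": 1, "b": 2, "a": 3}

def is_native(swizzle):
    """True if the swizzle needs no emulation under the stated limits."""
    return swizzle in NATIVE_SWIZZLES

def predicated_write(dst, src, pred, swizzle="rgba"):
    """Assumed model: component i of dst is overwritten with src[i] only
    if the swizzled predicate component i is true."""
    gate = [pred[COMPONENT_INDEX[c]] for c in swizzle]
    return [s if g else d for d, s, g in zip(dst, src, gate)]

if __name__ == "__main__":
    pred = [True, False, True, False]
    print(is_native("grab"))
    print(predicated_write([0, 0, 0, 0], [1, 2, 3, 4], pred, "grab"))
```

A driver emulating a non-native swizzle such as "grab" would have to materialize the reordered predicate first, which is the extra work being discussed.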

Issue 11:
R500 supposedly supports relative addressing of temporary registers in vertex
programs, and also in fragment programs (but only using loop indices). I have
never tested whether it actually works, though.

Issue 13:
Similar to issue 2, R500 fragment programs support unstructured everything but
vertex programs don't, so not overlapping sounds good to me.

Issue 15:
I know R500 fragment programs can support a CONT, but I'm not so familiar with
the R500 vertex programs, and they seem generally less flexible.

Issue 17:
I would *expect* negative addressing offsets to work on R500, but somehow I
haven't been able to get them to work. I'll see if I can look into it again.

Issue 34:
I don't see any support for an address register stack on R500, or anything
else to provide for a subroutine stack.

Thanks for working on this!

cu,
Nicolai


Re: Initial version of GL_MESA_gpu_program3

by Ian Romanick

Nicolai Hähnle wrote:

> On Tuesday 13 October 2009 21:20:40, Ian Romanick wrote:
>> Here is the initial version of the assembly extension that was discussed
>> at XDC.  This is a very early alpha version, and some parts are not yet
>> complete.  At this point, I am mainly looking for four things in a review:
>
> Looks good from a very cursory look.
>
>> - Are there any issues marked "RESOLVED" where you disagree with the
>> resolution?  I'm especially interested in issues 2, 4, and 19.
>
> Note: The following replies are based on my understanding of the hardware.
> There may still be some missing or unclear information in the docs by AMD. If
> this is the case, then it can hopefully be clarified in the course of this
> thread.
>
> Issue 2:
> 1) R500 supports unstructured branching in fragment programs but not in vertex
> programs, so I'm happy about leaving it out.

Weird.  That's backwards from how other SM3 GPUs do it.  Usually you get
unstructured branching in the AoS vertex shader.

> 2) R500 supports address registers as described in vertex programs (including
> input/output offsets), but has no address registers at all in fragment
> programs. A loop address register can be used as offsets in loops, but the
> values loaded into this register must be determined at compile time.

I had intended to move the grammar for ARL and ARR out of the generic
GPU grammar and into the vertex program-specific grammar.  The intention
is that LOOP/ENDLOOP is the only way to load an address register in a
fragment program.  LOOP/ENDLOOP set the .x component and leave the other
components undefined.  Since the ENDLOOP restores the "previous" value
of the address register, the last ENDLOOP restores garbage.  My
intention was to provide consistent syntactic sugar over the constrained
functionality of the loop index.
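
A rough Python model of the save/restore behaviour described above, in case it helps the discussion. The class and method names are hypothetical, and the assumption that LOOP takes an initial index value is mine; the only things taken from the text are that LOOP sets the .x component and ENDLOOP restores whatever was there before.

```python
UNDEFINED = object()  # stands in for "garbage" in the address register

class LoopIndexModel:
    """Models A0.x as seen by a fragment program that can only load it
    through LOOP/ENDLOOP."""

    def __init__(self):
        self.a0_x = UNDEFINED   # never written before the first LOOP
        self._saved = []        # values to restore at each ENDLOOP

    def loop(self, start):
        """LOOP: remember the current value, then set the loop index."""
        self._saved.append(self.a0_x)
        self.a0_x = start

    def endloop(self):
        """ENDLOOP: restore the value from before the matching LOOP."""
        self.a0_x = self._saved.pop()

if __name__ == "__main__":
    m = LoopIndexModel()
    m.loop(0)
    m.loop(4)
    print(m.a0_x)                 # inner loop index
    m.endloop()
    print(m.a0_x)                 # outer loop index is visible again
    m.endloop()
    print(m.a0_x is UNDEFINED)    # outermost ENDLOOP restores garbage
```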

> Issue 4: Agreed. R500 does not support address register math.

I looked at the documentation, and I didn't see a way to do it.

> Issue 6 (predicate registers):
> Is it correct that there is only a PSEQ instruction and not the full
> orthogonal set? The grammar includes the full orthogonal set, but the
> instruction list seems to be missing something.

The full complement is supposed to be there.  I created the entry for
PSEQ, got distracted, and never came back to it.

> I assume predicate registers can be used to mask writes of ordinary ALU
> instructions. Can they also mask TEX instructions? (R500 supports both, and
> it's easy to emulate, but see caveat).

Yes.  Predicates can apply to anything.

> I think we can do everything you throw at us on R500. The only difficulty is
> that R500 is a bit schizophrenic in that vertex programs are very different
> from fragment programs, but we can emulate things. The only stupid weakness is
> that swizzling predicates in fragment programs is essentially impossible (the
> only natively supported swizzles are .rgba and the smears .rrrr, .gggg, .bbbb,
> .aaaa). Obviously we can emulate this.

How painful would it be to emulate?  We could restrict the set of
available predicate swizzles.  I think this matches D3D, so it shouldn't
be a problem for Wine.

> Issue 11:
> R500 supposedly supports relative addressing of temporary registers in vertex
> programs, and also in fragment programs (but only using loop indices). I have
> never tested whether it actually works, though.

This would be a good feature to have.  Would it be possible to hack up a
test?  Do you know of any limitations?

> Issue 13:
> Similar to issue 2, R500 fragment programs support unstructured everything but
> vertex programs don't, so not overlapping sounds good to me.
>
> Issue 15:
> I know R500 fragment programs can support a CONT, but I'm not so familiar with
> the R500 vertex programs, and they seem generally less flexible.

I didn't see an explicit CONT instruction.  If there's no unstructured
branch, there probably isn't a way to do it.

> Issue 17:
> I would *expect* negative addressing offsets to work on R500, but somehow I
> haven't been able to get them to work. I'll see if I can look into it again.

No hardware that I'm aware of supports true negative offsets in the
instructions.  This is made to work with program parameters by putting
the base of the array at a large enough positive offset to make the
largest negative offset be zero.  For example, if the program uses
my_array[A0.x - 10], the driver has to place my_array at parameter slot
10 or higher.
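
The placement rule can be written down as a short Python sketch. The function names are hypothetical and this is only the arithmetic of the example above, not code from any driver.

```python
def min_base_slot(constant_offsets):
    """Lowest parameter slot at which an array may start so that the most
    negative constant offset still lands on slot 0 or above."""
    most_negative = min(constant_offsets, default=0)
    return max(0, -most_negative)

def encoded_offset(base_slot, constant_offset):
    """Non-negative offset that actually ends up in the instruction once
    the array base has been folded in."""
    encoded = base_slot + constant_offset
    if encoded < 0:
        raise ValueError("array base is too low for this offset")
    return encoded

if __name__ == "__main__":
    # A program that uses my_array[A0.x - 10], my_array[A0.x] and
    # my_array[A0.x + 3].
    base = min_base_slot([-10, 0, 3])
    print(base)                        # 10
    print(encoded_offset(base, -10))   # 0
    print(encoded_offset(base, 3))     # 13
```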

I don't think we can do similar trickery for attributes and results.  I
think we may have to leave the negative offsets just for program
parameters and only allow positive offsets for attributes and results.
Note that NV_gpu_program4 only allows positive offsets.  It can get away
with this because SM4 has general purpose integer instructions and any
register can be used for indirect addressing.

> Issue 34:
> I don't see any support for an address register stack on R500, or anything
> else to provide for a subroutine stack.

If you can do relative addressing of temporaries, you can fake a small
stack.  It's ugly, but it's possible.  Of course, without address
register math it's even more ugly.
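
To show what "faking a small stack" could look like, here is a hedged Python model. It assumes an indexable bank of temporaries and an ARL-style load (floor of a float) as the only way to set the address register, which matches the vertex-program case discussed here; the class name and layout are hypothetical. Because there is no address register math, the stack pointer lives in an ordinary float temporary and is reloaded before every access.

```python
import math

class TempStack:
    """Small stack kept in relatively addressed temporaries."""

    def __init__(self, depth):
        self.temps = [None] * depth   # TEMP stack[depth];
        self.sp = 0.0                 # stack pointer held in a float temp

    def push(self, value):
        a0 = math.floor(self.sp)      # ARL A0.x, sp.x;
        if a0 >= len(self.temps):
            raise OverflowError("stack depth exceeded")
        self.temps[a0] = value        # MOV stack[A0.x], value;
        self.sp += 1.0                # ADD sp.x, sp.x, 1;

    def pop(self):
        if self.sp < 1.0:
            raise IndexError("pop from empty stack")
        self.sp -= 1.0                # SUB sp.x, sp.x, 1;
        a0 = math.floor(self.sp)      # ARL A0.x, sp.x;
        return self.temps[a0]         # MOV result, stack[A0.x];

if __name__ == "__main__":
    s = TempStack(4)
    s.push("A0 saved by caller")
    s.push("A0 saved by callee")
    print(s.pop())
    print(s.pop())
```

The extra ARL on every push and pop is the "even more ugly" part: with address register math the pointer could be adjusted in place.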

I'll post an updated version in the morning with the grammar change (for
ARL and ARR) and the documentation for the other predicate-set instructions.

Re: Initial version of GL_MESA_gpu_program3

by Nicolai Hähnle

Alex, I added you to the CC in case you can help clarify the points on R500
vertex programs.

On Wednesday 14 October 2009 08:20:42, Ian Romanick wrote:
> > Issue 2:
> > 1) R500 supports unstructured branching in fragment programs but not in
> > vertex programs, so I'm happy about leaving it out.
>
> Weird.  That's backwards from how other SM3 GPUs do it.  Usually you get
> unstructured branching in the AoS vertex shader.

I agree. To be honest, the vertex processor documentation for R500 confuses
the hell out of me. Somehow, the way it is written it suggests that there is a
JUMP instruction that can only jump based on a constant register, which just
seems extremely bizarre, but the documentation is quite consistent about it,
because it tells us to use conditional write instructions to implement
if-else statements.

Maybe there is a part which is simply missing? I also see neither JUMP nor
LOOP opcodes anywhere, just registers describing the first and last
instruction pointer for a loop.


> > 2) R500 supports address registers as described in vertex programs
> > (including input/output offsets), but has no address registers at all in
> > fragment programs. A loop address register can be used as offsets in
> > loops, but the values loaded into this register must be determined at
> > compile time.
>
> I had intended to move the grammar for ARL and ARR out of the generic
> GPU grammar and into the vertex program-specific grammar.  The intention
> is that LOOP/ENDLOOP is the only way to load an address register in a
> fragment program.  LOOP/ENDLOOP set the .x component and leave the other
> components undefined.  Since the ENDLOOP restores the "previous" value
> of the address register, the last ENDLOOP restores garbage.  My
> intention was to provide consistent syntactic sugar over the constrained
> functionality of the loop index.

Sounds good.

<snip>

> > I think we can do everything you throw at us on R500. The only difficulty
> > is that R500 is a bit schizophrenic in that vertex programs are very
> > different from fragment programs, but we can emulate things. The only
> > stupid weakness is that swizzling predicates in fragment programs is
> > essentially impossible (the only natively supported swizzles are .rgba
> > and the smears .rrrr, .gggg, .bbbb, .aaaa). Obviously we can emulate
> > this.
>
> How painful would it be to emulate?  We could restrict the set of
> available predicate swizzles.  I think this matches D3D, so it shouldn't
> be a problem for Wine.

I'd always be happier if I didn't have to do it, but it's certainly easier
than what we're already doing for R300 fragment programs anyway. The question
is whether you want to add a fragment-program-only restriction to the provided
swizzles. I don't feel very strongly either way.

> > Issue 11:
> > R500 supposedly supports relative addressing of temporary registers in
> > vertex programs, and also in fragment programs (but only using loop
> > indices). I have never tested whether it actually works, though.
>
> This would be a good feature to have.  Would it be possible to hack up a
> test?  Do you know of any limitations?

Will do this weekend, at least for vertex programs; I don't know of any
limitations.

I don't know if I'll get to hacking something up for fragment programs soon,
because that's slightly more involved (I haven't done fragment program loops
yet).


> > Issue 13:
> > Similar to issue 2, R500 fragment programs support unstructured
> > everything but vertex programs don't, so not overlapping sounds good to
> > me.
> >
> > Issue 15:
> > I know R500 fragment programs can support a CONT, but I'm not so familiar
> > with the R500 vertex programs, and they seem generally less flexible.
>
> I didn't see an explicit CONT instruction.  If there's no unstructured
> branch, there probably isn't a way to do it.
>
> > Issue 17:
> > I would *expect* negative addressing offsets to work on R500, but somehow
> > I haven't been able to get them to work. I'll see if I can look into it
> > again.
>
> No hardware that I'm aware of supports true negative offsets in the
> instructions.  This is made to work with program parameters by putting
> the base of the array at a large enough positive offset to make the
> largest negative offset be zero.  For example, if the program uses
> my_array[A0.x - 10], the driver has to place my_array at parameter slot
> 10 or higher.

I see.

> I don't think we can do similar trickery for attributes and results.  I
> think we may have to leave the negative offsets just for program
> parameters and only allow positive offsets for attributes and results.
> Note that NV_gpu_program4 only allows positive offsets.  It can get away
> with this because SM4 has general purpose integer instructions and any
> register can be used for indirect addressing.

Well, one possible trickery that I believe Corbin suggested was transforming:

ARL A0.x, R.x;
MOV R, CONST[A0.x - 5];

into:

SUB TMP.x, R.x, 5;
ARL A0.x, TMP.x;
MOV R, CONST[A0.x];
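
A quick Python check of why this rewrite preserves behaviour, assuming the ARL in the rewritten sequence is meant to read the adjusted temporary (TMP.x) and that ARL loads the floor of its operand, as in ARB_vertex_program. The function names are hypothetical. Since the constant offset is an integer, floor(x - 5) equals floor(x) - 5, so both sequences fetch the same constant.

```python
import math

def arl(value):
    """ARL semantics: the address register receives floor(value)."""
    return math.floor(value)

def original(consts, r_x, offset):
    a0 = arl(r_x)                # ARL A0.x, R.x;
    return consts[a0 + offset]   # MOV R, CONST[A0.x - 5];  (offset = -5)

def rewritten(consts, r_x, offset):
    tmp = r_x + offset           # SUB TMP.x, R.x, 5;
    a0 = arl(tmp)                # ARL A0.x, TMP.x;
    return consts[a0]            # MOV R, CONST[A0.x];

if __name__ == "__main__":
    consts = list(range(100, 132))
    for r_x in (5.0, 7.75, 12.25):
        print(original(consts, r_x, -5) == rewritten(consts, r_x, -5))
```

The cost is one extra ALU instruction and a temporary per distinct offset, in exchange for never encoding a negative offset in the instruction.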

> > Issue 34:
> > I don't see any support for an address register stack on R500, or
> > anything else to provide for a subroutine stack.
>
> If you can do relative addressing of temporaries, you can fake a small
> stack.  It's ugly, but it's possible.  Of course, without address
> register math it's even more ugly.

True, that's a good argument in favour of relative addressing of temporaries.


Re: Initial version of GL_MESA_gpu_program3

by Alex Deucher

On Wed, Oct 14, 2009 at 3:02 PM, Nicolai Hähnle <nhaehnle@...> wrote:

> Alex, I added you to the CC in case you can help clarify the points on R500
> vertex programs.
>
> On Wednesday 14 October 2009 08:20:42, Ian Romanick wrote:
>> > Issue 2:
>> > 1) R500 supports unstructured branching in fragment programs but not in
>> > vertex programs, so I'm happy about leaving it out.
>>
>> Weird.  That's backwards from how other SM3 GPUs do it.  Usually you get
>> unstructured branching in the AoS vertex shader.
>
> I agree. To be honest, the vertex processor documentation for R500 confuses
> the hell out of me. Somehow, the way it is written it suggests that there is a
> JUMP instruction that can only jump based on a constant register, which just
> seems extremely bizarre, but the documentation is quite consistent about it,
> because it tells us to use conditional write instructions to implement
> if-else statements.
>
> Maybe there is a part which is simply missing? I also see neither JUMP nor
> LOOP opcodes anywhere, just registers describing the first and last
> instruction pointer for a loop.
>

It's not part of the actual shader code per se;  it's implemented in
parallel to the vertex shader and interacts with it. See the
VAP_PVS_FLOW_CNTL_* regs.  r3xx-r5xx have them so it should be
possible on all 3 generations; r5xx supports longer programs however.

Alex


Re: Initial version of GL_MESA_gpu_program3

by Nicolai Hähnle

On Tuesday 13 October 2009 21:20:40, Ian Romanick wrote:

> Here is the initial version of the assembly extension that was discussed
> at XDC.  This is a very early alpha version, and some parts are not yet
> complete.  At this point, I am mainly looking for four things in a review:
>
> - Are there any issues marked "RESOLVED" where you disagree with the
> resolution?  I'm especially interested in issues 2, 4, and 19.
>
> - Are there any issues marked "UNRESOLVED" that you have an opinion on
> or data to support a resolution?  I'm especially interested in issues 7,
> 11, 15, and 34 (resolution may be related to 4).

I forgot about Issue 14:
R300-R500 vertex programs do support EXP and LOG natively.

cu,
Nicolai


Re: Initial version of GL_MESA_gpu_program3

by José Fonseca

On Tue, 2009-10-13 at 12:20 -0700, Ian Romanick wrote:

> Here is the initial version of the assembly extension that was discussed
> at XDC.  This is a very early alpha version, and some parts are not yet
> complete.  At this point, I am mainly looking for four things in a review:
>
> - Are there any issues marked "RESOLVED" where you disagree with the
> resolution?  I'm especially interested in issues 2, 4, and 19.
>
> - Are there any issues marked "UNRESOLVED" that you have an opinion on
> or data to support a resolution?  I'm especially interested in issues 7,
> 11, 15, and 34 (resolution may be related to 4).

Ian,

I've been looking a lot into D3D shaders recently and below is my
feedback from that perspective. Given that the hardware usually followed
the D3D specs it might help resolve some issues.

(11) Is relative addressing of temporaries allowed?

        UNRESOLVED.  It is unclear whether the relevant hardware has this
        capability.


D3D's SM3 spec doesn't allow it:
http://msdn.microsoft.com/en-us/library/ee417947%28VS.85%29.aspx 
http://msdn.microsoft.com/en-us/library/ee418032%28VS.85%29.aspx

(14) Should the EXP and LOG instructions be included?
 
        UNRESOLVED.  It appears that no hardware capable of supporting
        this extension natively supports them.  Would this introduce a
        portability issue for old programs or D3D cross compilers?

D3D's SM3 defines approximate EXP/LOG tokens (EXPP and LOGP) for
vertex shaders; see
http://msdn.microsoft.com/en-us/library/ee417981%28VS.85%29.aspx
EXPP is not listed on MSDN, but it is listed in the SDK docs.

(17) Should negative offsets be available for relative addressing?

        UNRESOLVED.  See the "FINISHME" block in section 2.X.4.2.

D3D doesn't seem to allow it.
http://msdn.microsoft.com/en-us/library/ee417949%28VS.85%29.aspx


(34) This extension provides subroutines, but doesn't provide a stack to
    push and pop parameters.  How do we deal with this?  NV_vertex_program3
    supported PUSHA/POPA instructions to push and pop address registers.

        UNRESOLVED.

D3D semantics here are that subroutines have access to all registers
visible to the caller, i.e., all subroutines appear as if they were
inlined in the calling code, and have access to all registers visible at
the time.

A good example of such a D3D shader is LightingVS.fx, included in the
DXSDK. It simulates fixed-function lighting and has a function that
accesses the a0 register of the caller's outer loop.

> - Are there any instructions listed that cannot be trivially supported
> on some relevant hardware?  Some instructions expand to multiple real
> instructions (e.g., NRM).  As long as the expansion is trivial and only
> adds one or two extra instructions, this is okay.
>
> - Is there some important SM3 feature that's missing?

I didn't notice anything obvious.

>  I plan to
> circulate this around the Wine community after the next revision.

Jose



Re: Initial version of GL_MESA_gpu_program3

by Ian Romanick

Version 2 of the extension spec has been posted:

http://people.freedesktop.org/~idr/MESA_gpu_program3.txt

Unless anyone has major comments or objections, I think it's time to
circulate this to down-stream users (e.g., Wine).  Who are the right
contacts?
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkrc8LwACgkQX1gOwKyEAw8yawCbBxaySDhe8rhRIq43GGqbHXT9
E9sAnRaPBIfid5Zl0gk8J1xniPuIAlL8
=VLxn
-----END PGP SIGNATURE-----


Re: Initial version of GL_MESA_gpu_program3

by Henri Verbeet

2009/10/20 Ian Romanick <idr@...>:
> Unless anyone has major comments or objections, I think it's time to
> circulate this to down-stream users (e.g., Wine).  Who are the right
> contacts?

For Wine, you can generally just post to wine-devel@..., but
you can also directly mail Stefan (stefan@...) and me
(either this address or hverbeet@...).


Re: Initial version of GL_MESA_gpu_program3

by Ian Romanick-4

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Henri Verbeet wrote:
> 2009/10/20 Ian Romanick <idr@...>:
>> Unless anyone has major comments or objections, I think it's time to
>> circulate this to down-stream users (e.g., Wine).  Who are the right
>> contacts?
>
> For Wine, you can generally just post to wine-devel@..., but
> you can also directly mail Stefan (stefan@...) and me
> (either this address or hverbeet@...).

Okay.  Can I assume you'll forward this to Stefan? :)  Also, do I have
to subscribe to wine-devel in order to post to it?  If so, I'll probably
just e-mail you and Stefan directly in the future.  I have enough
mailing lists filling my inbox these days. :)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkrfRVYACgkQX1gOwKyEAw9PAwCfVp6QaU7YhEEf8XSCzOQsS5g/
9FwAn3Ht+NthqR+IunI+Z6p+r1PY+/wP
=piUP
-----END PGP SIGNATURE-----


Re: Initial version of GL_MESA_gpu_program3

by Henri Verbeet

2009/10/21 Ian Romanick <idr@...>:
> Okay.  Can I assume you'll forward this to Stefan? :)  Also, do I have
> to subscribe to wine-devel in order to post to it?  If so, I'll probably
> just e-mail you and Stefan directly in the future.  I have enough
> mailing lists filling my inbox these days. :)

If you're not subscribed, the mail will go through moderation, but it'll
get there eventually. You can also subscribe without mail delivery.


Re: Initial version of GL_MESA_gpu_program3

by Keith Whitwell-3

On Mon, 2009-10-19 at 16:05 -0700, Ian Romanick wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Version 2 of the extension spec has been posted:
>
> http://people.freedesktop.org/~idr/MESA_gpu_program3.txt
>
> Unless anyone has major comments or objections, I think it's time to
> circulate this to down-stream users (e.g., Wine).  Who are the right
> contacts?

Ian,

Is the intention to fill the gap between where we are now and the NV
program4 extensions, or to start out on a new MESA-specific path which
would include later on a MESA_gpu_program4 extension?

The question is relevant because of things like condition-codes which
are in the NV GPU4 extension.  
  - If this is an intermediate step on the way to NV GPU4, then we
should probably prefer condition-codes over predicates.
  - If we expect to define a Mesa GPU4, then sticking with predicates is
fine.
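
For readers less familiar with the distinction, the following is a rough
Python sketch of the two models. It is scalar only, ignores NaN and the
unordered case, and is not the encoding used by any real hardware or by
either spec; all function and table names are hypothetical. A
condition-code update records flags about a result and lets later
instructions pick the test, while a predicate write bakes one comparison in
up front.

```python
import operator


def update_cc(value):
    """Condition-code style: record flags now, choose the test later."""
    return {"zero": value == 0.0, "sign": value < 0.0}


CC_RULES = {
    "EQ": lambda cc: cc["zero"],
    "NE": lambda cc: not cc["zero"],
    "LT": lambda cc: cc["sign"],
    "GE": lambda cc: not cc["sign"],
    "GT": lambda cc: not cc["sign"] and not cc["zero"],
    "LE": lambda cc: cc["sign"] or cc["zero"],
}


def check_cc(cc, rule):
    """Apply one of the test rules to a previously recorded condition code."""
    return CC_RULES[rule](cc)


PREDICATE_OPS = {
    "EQ": operator.eq, "NE": operator.ne,
    "LT": operator.lt, "LE": operator.le,
    "GT": operator.gt, "GE": operator.ge,
}


def set_predicate(op, a, b):
    """Predicate style: the comparison is fixed when the predicate is written."""
    return PREDICATE_OPS[op](a, b)


def emulate_cc_with_predicates(value):
    """Two predicate writes stand in for a single condition-code update."""
    return {
        "zero": set_predicate("EQ", value, 0.0),
        "sign": set_predicate("LT", value, 0.0),
    }


cc = update_cc(2.0 - 5.0)  # one update, several later tests
results = {rule: check_cc(cc, rule) for rule in ("LT", "EQ", "GT")}
```

Even in this toy form, emulating one condition-code update takes two
predicate writes, and a test such as GT then has to combine them. That is
the kind of instruction expansion at stake when choosing between the two.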

I guess in either case the Mesa IR can opt to choose predicates over
cond-codes, but a great benefit of an extension like this is that we can
point at Mesa IR and say "the semantics of this language are documented
in MESA_gpu_program3".

IE. if you're not intending to provide a Mesa SM4 extension, it might be
better to stick closer to the NV usage for SM3 also.

Keith



Re: Initial version of GL_MESA_gpu_program3

by Ian Romanick-4

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Keith Whitwell wrote:

> On Mon, 2009-10-19 at 16:05 -0700, Ian Romanick wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Version 2 of the extension spec has been posted:
>>
>> http://people.freedesktop.org/~idr/MESA_gpu_program3.txt
>>
>> Unless anyone has major comments or objections, I think it's time to
>> circulate this to down-stream users (e.g., Wine).  Who are the right
>> contacts?
>
> Is the intention to fill the gap between where we are now and the NV
> program4 extensions, or to start out on a new MESA-specific path which
> would include later on a MESA_gpu_program4 extension?
>
> The question is relevant because of things like condition-codes which
> are in the NV GPU4 extension.  
>   - If this is an intermediate step on the way to NV GPU4, then we
> should probably prefer condition-codes over predicates.
>   - If we expect to define a Mesa GPU4, then sticking with predicates is
> fine.

For right now, the intention is to target what shipping hardware that
Mesa supports can do. :)  R500 can't do NVIDIA-style condition codes.
We can fake it on 965, but in some cases a single instruction would be
expanded to a big pile of instructions.

Eric keeps telling me that he's going to get NV_fragment_program running
on 965, and that will give us a better idea of how much instruction
expansion we'll see.  He keeps finding other fun things to work on, though.

My expectation is that there will be a gpu_program4 follow-on, but it
probably won't be for a while.  I want to see how well this works out
for Wine, for example, before I commit to doing the work of
gpu_program4.  Coming up with the gpu_program3 spec, even with all the
"harvesting" I did from other specs, was a shocking amount of work.

> I guess in either case the Mesa IR can opt to choose predicates over
> cond-codes, but a great benefit of an extension like this is that we can
> point at Mesa IR and say "the semantics of this language are documented
> in MESA_gpu_program3".
>
> IE. if you're not intending to provide a Mesa SM4 extension, it might be
> better to stick closer to the NV usage for SM3 also.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkrs38gACgkQX1gOwKyEAw9PeACggPMDer/f9l4SCna4JT1vza+/
3xYAoIuYpBV3At5Cqj9fWo9bKQUR7U5O
=i1R6
-----END PGP SIGNATURE-----
