[PATCH, i386]: Fix PR 41900, call *%esp shouldn't be generated because of CPU errata

View: New views
15 Messages — Rating Filter:   Alert me  

[PATCH, i386]: Fix PR 41900, call *%esp shouldn't be generated because of CPU errata

by Uros Bizjak-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hello!

As explained in the PR:

Intel P6 family of processors (Pentium Pro, 2, 3) have a bug in call *%esp
instruction. The instruction should put current EIP to stack, decrement ESP by
4 and jump to a value of ESP before the decrement. P6 processors will jump to
the address after the decrement (so the will execute return address as code).
See Pentium Pro errata 70, Pentium 2 errata A33, Pentium 3 errata E17.

Attached patch avoids %esp by constraining available registers for a
call operand with "index_register_operand" which happens to exclude
%esp. Due to the way we define "generic" target, we have to include
i386, i486 and pentium in addition to PPRO, but since you have to use
lots of magic to trigger this bug (see attached testcase) I don't
think this matters at all.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
{,-m32}. Patch was committed to mainline, will be backported to 4.4
and 4.3 branches.

2009-11-03  Uros Bizjak  <ubizjak@...>

        PR target/41900
        * config/i386/i386.h (ix86_arch_indices) <X86_ARCH_CALL_ESP>: New.
        (TARGET_CALL_ESP): New define.
        * config/i386/i386.c (initial_ix86_tune_features): Initialize
        X86_ARCH_CALL_ESP.
        * config/i386/i386.md
        (*call_pop_1_esp, *call_1_esp, *call_value_pop_1_esp,
        *call_value_1_esp): Rename from *call_pop_1, *call_1,
        *call_value_pop_1 and *call_value_1.  Depend on TARGET_CALL_ESP.
        (*call_pop_1, *call_1, *call_value_pop_1, *call_value_1):
        New patterns, use "lsm" as operand 1 constraint.
        * config/i386/predicates.md (call_insn_operand): Depend on
        index_register_operand for !TARGET_CALL_ESP to avoid %esp register.

testsuite/ChangeLog:

2009-11-03  Uros Bizjak  <ubizjak@...>

        PR target/41900
        * gcc.target/i386/pr41900.c: New test.

Uros.

Index: testsuite/gcc.target/i386/pr41900.c
===================================================================
--- testsuite/gcc.target/i386/pr41900.c (revision 0)
+++ testsuite/gcc.target/i386/pr41900.c (revision 0)
@@ -0,0 +1,11 @@
+/* { dg-do run } */
+/* { dg-require-effective-target ilp32 } */
+/* { dg-options "-O2 -fomit-frame-pointer -mpreferred-stack-boundary=2" } */
+
+int main ()
+{
+  unsigned code = 0xc3;
+
+  ((void (*)(void)) &code) ();
+  return 0;
+}
Index: config/i386/i386.h
===================================================================
--- config/i386/i386.h (revision 153837)
+++ config/i386/i386.h (working copy)
@@ -400,6 +400,7 @@ enum ix86_arch_indices {
   X86_ARCH_CMPXCHG8B,
   X86_ARCH_XADD,
   X86_ARCH_BSWAP,
+  X86_ARCH_CALL_ESP,
 
   X86_ARCH_LAST
 };
@@ -411,6 +412,7 @@ extern unsigned char ix86_arch_features[
 #define TARGET_CMPXCHG8B ix86_arch_features[X86_ARCH_CMPXCHG8B]
 #define TARGET_XADD ix86_arch_features[X86_ARCH_XADD]
 #define TARGET_BSWAP ix86_arch_features[X86_ARCH_BSWAP]
+#define TARGET_CALL_ESP ix86_arch_features[X86_ARCH_CALL_ESP]
 
 #define TARGET_FISTTP (TARGET_SSE3 && TARGET_80387)
 
Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md (revision 153837)
+++ config/i386/i386.md (working copy)
@@ -14566,12 +14566,25 @@
 }
   [(set_attr "type" "call")])
 
-(define_insn "*call_pop_1"
+(define_insn "*call_pop_1_esp"
   [(call (mem:QI (match_operand:SI 0 "call_insn_operand" "rsm"))
  (match_operand:SI 1 "" ""))
    (set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG)
     (match_operand:SI 2 "immediate_operand" "i")))]
-  "!SIBLING_CALL_P (insn) && !TARGET_64BIT"
+  "!TARGET_64BIT && TARGET_CALL_ESP && !SIBLING_CALL_P (insn)"
+{
+  if (constant_call_address_operand (operands[0], Pmode))
+    return "call\t%P0";
+  return "call\t%A0";
+}
+  [(set_attr "type" "call")])
+
+(define_insn "*call_pop_1"
+  [(call (mem:QI (match_operand:SI 0 "call_insn_operand" "lsm"))
+ (match_operand:SI 1 "" ""))
+   (set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG)
+    (match_operand:SI 2 "immediate_operand" "i")))]
+  "!TARGET_64BIT && !TARGET_CALL_ESP && !SIBLING_CALL_P (insn)"
 {
   if (constant_call_address_operand (operands[0], Pmode))
     return "call\t%P0";
@@ -14584,7 +14597,7 @@
  (match_operand:SI 1 "" ""))
    (set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG)
     (match_operand:SI 2 "immediate_operand" "i,i")))]
-  "SIBLING_CALL_P (insn) && !TARGET_64BIT"
+  "!TARGET_64BIT && SIBLING_CALL_P (insn)"
   "@
    jmp\t%P0
    jmp\t%A0"
@@ -14622,10 +14635,21 @@
 }
   [(set_attr "type" "call")])
 
-(define_insn "*call_1"
+(define_insn "*call_1_esp"
   [(call (mem:QI (match_operand:SI 0 "call_insn_operand" "rsm"))
  (match_operand 1 "" ""))]
-  "!SIBLING_CALL_P (insn) && !TARGET_64BIT"
+  "!TARGET_64BIT && TARGET_CALL_ESP && !SIBLING_CALL_P (insn)"
+{
+  if (constant_call_address_operand (operands[0], Pmode))
+    return "call\t%P0";
+  return "call\t%A0";
+}
+  [(set_attr "type" "call")])
+
+(define_insn "*call_1"
+  [(call (mem:QI (match_operand:SI 0 "call_insn_operand" "lsm"))
+ (match_operand 1 "" ""))]
+  "!TARGET_64BIT && !TARGET_CALL_ESP && !SIBLING_CALL_P (insn)"
 {
   if (constant_call_address_operand (operands[0], Pmode))
     return "call\t%P0";
@@ -14636,7 +14660,7 @@
 (define_insn "*sibcall_1"
   [(call (mem:QI (match_operand:SI 0 "sibcall_insn_operand" "s,U"))
  (match_operand 1 "" ""))]
-  "SIBLING_CALL_P (insn) && !TARGET_64BIT"
+  "!TARGET_64BIT && SIBLING_CALL_P (insn)"
   "@
    jmp\t%P0
    jmp\t%A0"
@@ -14645,7 +14669,7 @@
 (define_insn "*call_1_rex64"
   [(call (mem:QI (match_operand:DI 0 "call_insn_operand" "rsm"))
  (match_operand 1 "" ""))]
-  "!SIBLING_CALL_P (insn) && TARGET_64BIT
+  "TARGET_64BIT && !SIBLING_CALL_P (insn)
    && ix86_cmodel != CM_LARGE && ix86_cmodel != CM_LARGE_PIC"
 {
   if (constant_call_address_operand (operands[0], Pmode))
@@ -14670,7 +14694,7 @@
    (clobber (reg:TI XMM15_REG))
    (clobber (reg:DI SI_REG))
    (clobber (reg:DI DI_REG))]
-  "!SIBLING_CALL_P (insn) && TARGET_64BIT"
+  "TARGET_64BIT && !SIBLING_CALL_P (insn)"
 {
   if (constant_call_address_operand (operands[0], Pmode))
     return "call\t%P0";
@@ -14681,14 +14705,14 @@
 (define_insn "*call_1_rex64_large"
   [(call (mem:QI (match_operand:DI 0 "call_insn_operand" "rm"))
  (match_operand 1 "" ""))]
-  "!SIBLING_CALL_P (insn) && TARGET_64BIT"
+  "TARGET_64BIT && !SIBLING_CALL_P (insn)"
   "call\t%A0"
   [(set_attr "type" "call")])
 
 (define_insn "*sibcall_1_rex64"
   [(call (mem:QI (match_operand:DI 0 "sibcall_insn_operand" "s,U"))
  (match_operand 1 "" ""))]
-  "SIBLING_CALL_P (insn) && TARGET_64BIT"
+  "TARGET_64BIT && SIBLING_CALL_P (insn)"
   "@
    jmp\t%P0
    jmp\t%A0"
@@ -21084,13 +21108,27 @@
 }
   [(set_attr "type" "callv")])
 
-(define_insn "*call_value_pop_1"
+(define_insn "*call_value_pop_1_esp"
   [(set (match_operand 0 "" "")
  (call (mem:QI (match_operand:SI 1 "call_insn_operand" "rsm"))
       (match_operand:SI 2 "" "")))
    (set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG)
     (match_operand:SI 3 "immediate_operand" "i")))]
-  "!SIBLING_CALL_P (insn) && !TARGET_64BIT"
+  "!TARGET_64BIT && TARGET_CALL_ESP && !SIBLING_CALL_P (insn)"
+{
+  if (constant_call_address_operand (operands[1], Pmode))
+    return "call\t%P1";
+  return "call\t%A1";
+}
+  [(set_attr "type" "callv")])
+
+(define_insn "*call_value_pop_1"
+  [(set (match_operand 0 "" "")
+ (call (mem:QI (match_operand:SI 1 "call_insn_operand" "lsm"))
+      (match_operand:SI 2 "" "")))
+   (set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG)
+    (match_operand:SI 3 "immediate_operand" "i")))]
+  "!TARGET_64BIT && !TARGET_CALL_ESP && !SIBLING_CALL_P (insn)"
 {
   if (constant_call_address_operand (operands[1], Pmode))
     return "call\t%P1";
@@ -21104,7 +21142,7 @@
       (match_operand:SI 2 "" "")))
    (set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG)
     (match_operand:SI 3 "immediate_operand" "i,i")))]
-  "SIBLING_CALL_P (insn) && !TARGET_64BIT"
+  "!TARGET_64BIT && SIBLING_CALL_P (insn)"
   "@
    jmp\t%P1
    jmp\t%A1"
@@ -21153,7 +21191,7 @@
    (clobber (reg:TI XMM15_REG))
    (clobber (reg:DI SI_REG))
    (clobber (reg:DI DI_REG))]
-  "!SIBLING_CALL_P (insn) && TARGET_64BIT"
+  "TARGET_64BIT && !SIBLING_CALL_P (insn)"
 {
   if (SIBLING_CALL_P (insn))
     return "jmp\t%P1";
@@ -21162,11 +21200,23 @@
 }
   [(set_attr "type" "callv")])
 
-(define_insn "*call_value_1"
+(define_insn "*call_value_1_esp"
   [(set (match_operand 0 "" "")
  (call (mem:QI (match_operand:SI 1 "call_insn_operand" "rsm"))
       (match_operand:SI 2 "" "")))]
-  "!SIBLING_CALL_P (insn) && !TARGET_64BIT"
+  "!TARGET_64BIT && TARGET_CALL_ESP && !SIBLING_CALL_P (insn)"
+{
+  if (constant_call_address_operand (operands[1], Pmode))
+    return "call\t%P1";
+  return "call\t%A1";
+}
+  [(set_attr "type" "callv")])
+
+(define_insn "*call_value_1"
+  [(set (match_operand 0 "" "")
+ (call (mem:QI (match_operand:SI 1 "call_insn_operand" "lsm"))
+      (match_operand:SI 2 "" "")))]
+  "!TARGET_64BIT && !TARGET_CALL_ESP && !SIBLING_CALL_P (insn)"
 {
   if (constant_call_address_operand (operands[1], Pmode))
     return "call\t%P1";
@@ -21178,7 +21228,7 @@
   [(set (match_operand 0 "" "")
  (call (mem:QI (match_operand:SI 1 "sibcall_insn_operand" "s,U"))
       (match_operand:SI 2 "" "")))]
-  "SIBLING_CALL_P (insn) && !TARGET_64BIT"
+  "!TARGET_64BIT && SIBLING_CALL_P (insn)"
   "@
    jmp\t%P1
    jmp\t%A1"
@@ -21188,7 +21238,7 @@
   [(set (match_operand 0 "" "")
  (call (mem:QI (match_operand:DI 1 "call_insn_operand" "rsm"))
       (match_operand:DI 2 "" "")))]
-  "!SIBLING_CALL_P (insn) && TARGET_64BIT
+  "TARGET_64BIT && !SIBLING_CALL_P (insn)
    && ix86_cmodel != CM_LARGE && ix86_cmodel != CM_LARGE_PIC"
 {
   if (constant_call_address_operand (operands[1], Pmode))
@@ -21226,7 +21276,7 @@
   [(set (match_operand 0 "" "")
  (call (mem:QI (match_operand:DI 1 "call_insn_operand" "rm"))
       (match_operand:DI 2 "" "")))]
-  "!SIBLING_CALL_P (insn) && TARGET_64BIT"
+  "TARGET_64BIT && !SIBLING_CALL_P (insn)"
   "call\t%A1"
   [(set_attr "type" "callv")])
 
@@ -21234,7 +21284,7 @@
   [(set (match_operand 0 "" "")
  (call (mem:QI (match_operand:DI 1 "sibcall_insn_operand" "s,U"))
       (match_operand:DI 2 "" "")))]
-  "SIBLING_CALL_P (insn) && TARGET_64BIT"
+  "TARGET_64BIT && SIBLING_CALL_P (insn)"
   "@
    jmp\t%P1
    jmp\t%A1"
Index: config/i386/predicates.md
===================================================================
--- config/i386/predicates.md (revision 153837)
+++ config/i386/predicates.md (working copy)
@@ -561,7 +561,9 @@
 ;; Test for a valid operand for a call instruction.
 (define_predicate "call_insn_operand"
   (ior (match_operand 0 "constant_call_address_operand")
-       (ior (match_operand 0 "register_no_elim_operand")
+       (ior (and (match_operand 0 "register_no_elim_operand")
+ (ior (match_test "TARGET_CALL_ESP")
+      (match_operand 0 "index_register_operand")))
     (match_operand 0 "memory_operand"))))
 
 ;; Similarly, but for tail calls, in which we cannot allow memory references.
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c (revision 153837)
+++ config/i386/i386.c (working copy)
@@ -1553,6 +1553,11 @@ static unsigned int initial_ix86_arch_fe
 
   /* X86_ARCH_BSWAP: Byteswap was added for 80486.  */
   ~m_386,
+
+  /* X86_ARCH_CALL_ESP: P6 processors will jump to the address after
+     the decrement (so they will execute return address as code).  See
+     Pentium Pro errata 70, Pentium 2 errata A33, Pentium 3 errata E17.  */
+  ~(m_386 | m_486 | m_PENT | m_PPRO),
 };
 
 static const unsigned int x86_accumulate_outgoing_args

Re: [PATCH, i386]: Fix PR 41900, call *%esp shouldn't be generated because of CPU errata

by Jakub Jelinek :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tue, Nov 03, 2009 at 08:48:15AM +0100, Uros Bizjak wrote:
> 2009-11-03  Uros Bizjak  <ubizjak@...>
>
> PR target/41900
> * gcc.target/i386/pr41900.c: New test.

The testsuite fails on i?86-linux.
I believe it is very rare these days to have stack executable for security
reasons.

IMHO you should change the test into dg-do compile with some scan-assembler.

> --- testsuite/gcc.target/i386/pr41900.c (revision 0)
> +++ testsuite/gcc.target/i386/pr41900.c (revision 0)
> @@ -0,0 +1,11 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target ilp32 } */
> +/* { dg-options "-O2 -fomit-frame-pointer -mpreferred-stack-boundary=2" } */
> +
> +int main ()
> +{
> +  unsigned code = 0xc3;
> +
> +  ((void (*)(void)) &code) ();
> +  return 0;
> +}

        Jakub

Re: [PATCH, i386]: Fix PR 41900, call *%esp shouldn't be generated because of CPU errata

by Uros Bizjak-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tue, Nov 3, 2009 at 12:56 PM, Jakub Jelinek <jakub@...> wrote:

>>       PR target/41900
>>       * gcc.target/i386/pr41900.c: New test.
>
> The testsuite fails on i?86-linux.
> I believe it is very rare these days to have stack executable for security
> reasons.
>
> IMHO you should change the test into dg-do compile with some scan-assembler.

No, this test failed due to missing "volatile" (fixed now). AFAICS, it
works OK on Fedora 11 and CentOS 5.3.

Uros.

Re: [PATCH, i386]: Fix PR 41900, call *%esp shouldn't be generated because of CPU errata

by Jakub Jelinek :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tue, Nov 03, 2009 at 04:08:11PM +0100, Uros Bizjak wrote:

> On Tue, Nov 3, 2009 at 12:56 PM, Jakub Jelinek <jakub@...> wrote:
>
> >>       PR target/41900
> >>       * gcc.target/i386/pr41900.c: New test.
> >
> > The testsuite fails on i?86-linux.
> > I believe it is very rare these days to have stack executable for security
> > reasons.
> >
> > IMHO you should change the test into dg-do compile with some scan-assembler.
>
> No, this test failed due to missing "volatile" (fixed now). AFAICS, it
> works OK on Fedora 11 and CentOS 5.3.

The extra volatile doesn't help at all, it still fails here (Fedora 11,
x86_64 kernel, SELinux enforcing).

        Jakub

Re: [PATCH, i386]: Fix PR 41900, call *%esp shouldn't be generated because of CPU errata

by Uros Bizjak-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On 11/03/2009 04:13 PM, Jakub Jelinek wrote:

>>>> � � � PR target/41900
>>>> � � � * gcc.target/i386/pr41900.c: New test.
>>>>          
>>> The testsuite fails on i?86-linux.
>>> I believe it is very rare these days to have stack executable for security
>>> reasons.
>>>
>>> IMHO you should change the test into dg-do compile with some scan-assembler.
>>>        
>> No, this test failed due to missing "volatile" (fixed now). AFAICS, it
>> works OK on Fedora 11 and CentOS 5.3.
>>      
> The extra volatile doesn't help at all, it still fails here (Fedora 11,
> x86_64 kernel, SELinux enforcing).
>    

Hm, indeed. Is there a solution by somehow enabling execution (like
trampolines)?

Uros.


Re: [PATCH, i386]: Fix PR 41900, call *%esp shouldn't be generated because of CPU errata

by Jakub Jelinek :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Tue, Nov 03, 2009 at 06:27:21PM +0100, Uros Bizjak wrote:
>>> No, this test failed due to missing "volatile" (fixed now). AFAICS, it
>>> works OK on Fedora 11 and CentOS 5.3.
>>>      
>> The extra volatile doesn't help at all, it still fails here (Fedora 11,
>> x86_64 kernel, SELinux enforcing).
>>    
>
> Hm, indeed. Is there a solution by somehow enabling execution (like  
> trampolines)?

You can assemble with -Wa,--execstack or link with -Wl,-z,execstack.
But it helps only in some cases, some versions of the policy won't allow
unconfined processes to have exec stack, you'd need to chcon -t java_exec_t
or similar.  This is highly Linux specific though.

        Jakub

Re: [PATCH, i386]: Fix PR 41900, call *%esp shouldn't be generated because of CPU errata

by Uros Bizjak-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On 11/03/2009 06:31 PM, Jakub Jelinek wrote:

>>>> No, this test failed due to missing "volatile" (fixed now). AFAICS, it
>>>> works OK on Fedora 11 and CentOS 5.3.
>>>>
>>>>          
>>> The extra volatile doesn't help at all, it still fails here (Fedora 11,
>>> x86_64 kernel, SELinux enforcing).
>>>
>>>        
>> Hm, indeed. Is there a solution by somehow enabling execution (like
>> trampolines)?
>>      
> You can assemble with -Wa,--execstack or link with -Wl,-z,execstack.
> But it helps only in some cases, some versions of the policy won't allow
> unconfined processes to have exec stack, you'd need to chcon -t java_exec_t
> or similar.  This is highly Linux specific though.
>    

I think I'll just make the test compile-only and scan for call without
*%esp argument, as you suggested in your previous mails.

Thanks,
Uros.



Re: [PATCH, i386]: Fix PR 41900, call *%esp shouldn't be generated because of CPU errata

by Uros Bizjak-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On 11/03/2009 07:47 PM, Uros Bizjak wrote:

>>>>> works OK on Fedora 11 and CentOS 5.3.
>>>>>
>>>> The extra volatile doesn't help at all, it still fails here (Fedora
>>>> 11,
>>>> x86_64 kernel, SELinux enforcing).
>>>>
>>> Hm, indeed. Is there a solution by somehow enabling execution (like
>>> trampolines)?
>> You can assemble with -Wa,--execstack or link with -Wl,-z,execstack.
>> But it helps only in some cases, some versions of the policy won't allow
>> unconfined processes to have exec stack, you'd need to chcon -t
>> java_exec_t
>> or similar.  This is highly Linux specific though.
> No, this test failed due to missing "volatile" (fixed now). AFAICS, it
>
> I think I'll just make the test compile-only and scan for call without
> *%esp argument, as you suggested in your previous mails.

Like this:

2009-11-03  Uros Bizjak <ubizjak@...>

     * gcc.target/i386/pr41900.c: Make test compile only.  Scan assembler
     dump to not include "call *%esp".

Tested on F11, -m32.

Uros.

Index: gcc.target/i386/pr41900.c
===================================================================
--- gcc.target/i386/pr41900.c    (revision 153863)
+++ gcc.target/i386/pr41900.c    (working copy)
@@ -1,4 +1,4 @@
-/* { dg-do run } */
+/* { dg-do compile } */
  /* { dg-require-effective-target ilp32 } */
  /* { dg-options "-O2 -fomit-frame-pointer
-mpreferred-stack-boundary=2" } */

@@ -9,3 +9,5 @@
    ((void (*)(void)) &code) ();
    return 0;
  }
+
+/* { dg-final { scan-assembler-not "call\[ \\t\]+\\*%esp" } } */


Re: [PATCH, i386]: Fix PR 41900, call *%esp shouldn't be generated because of CPU errata

by Richard Henderson-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On 11/02/2009 11:48 PM, Uros Bizjak wrote:
> (TARGET_CALL_ESP): New define.

Please don't bother with this and all the pattern duplication.
Just put a comment in front of the call pattern and use "lsm".


r~

Re: [PATCH, i386]: Fix PR 41900, call *%esp shouldn't be generated because of CPU errata

by Uros Bizjak-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Fri, Nov 13, 2009 at 1:39 AM, Richard Henderson <rth@...> wrote:
> On 11/02/2009 11:48 PM, Uros Bizjak wrote:
>>
>>        (TARGET_CALL_ESP): New define.
>
> Please don't bother with this and all the pattern duplication.
> Just put a comment in front of the call pattern and use "lsm".

But all new targets (newer than P3) can use %esp here without problems...

IMO, since we already went through the pain of duplicating (in
mainline and backports to 4.3 and 4.4), I think that removing
specialized patterns and limiting existing ones would be a step
backward. If you are concerned with maintenance costs, I can perhaps
compensate extra growth by macroizing 32 and 64bit call patterns.

Uros.

Re: [PATCH, i386]: Fix PR 41900, call *%esp shouldn't be generated because of CPU errata

by Richard Henderson-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On 11/12/2009 11:28 PM, Uros Bizjak wrote:
> But all new targets (newer than P3) can use %esp here without problems...

I'm concerned with the extra code required to allow the compiler to
properly compile something that's absolutely silly.


r~

Re: [PATCH, i386]: Fix PR 41900, call *%esp shouldn't be generated because of CPU errata

by Uros Bizjak-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On 11/13/2009 05:44 PM, Richard Henderson wrote:

>> But all new targets (newer than P3) can use %esp here without
>> problems...
>
> I'm concerned with the extra code required to allow the compiler to
> properly compile something that's absolutely silly.

OK, patch will be (partially) reverted.

2009-11-13  Uros Bizjak <ubizjak@...>

     PR target/41900
     (*call_pop_1, *call_1, *call_value_pop_1, *call_value_1): Use "lsm"
     as operand 1 constraint.
     * config/i386/predicates.md (call_insn_operand): Depend on
     index_register_operand to avoid %esp register.

2009-11-13  Uros Bizjak <ubizjak@...>

     Revert:
     2009-11-03  Uros Bizjak <ubizjak@...>

     PR target/41900
     * config/i386/i386.h (ix86_arch_indices) <X86_ARCH_CALL_ESP>: New.
     (TARGET_CALL_ESP): New define.
     * config/i386/i386.c (initial_ix86_tune_features): Initialize
     X86_ARCH_CALL_ESP.
     * config/i386/i386.md (*call_pop_1_esp, *call_1_esp,
     *call_value_pop_1_esp, *call_value_1_esp): Rename from *call_pop_1,
     *call_1, *call_value_pop_1 and *call_value_1.  Depend on
     TARGET_CALL_ESP.
     (*call_pop_1, *call_1, *call_value_pop_1, *call_value_1):
     New patterns, use "lsm" as operand 1 constraint.
     * config/i386/predicates.md (call_insn_operand): Depend on
     index_register_operand for !TARGET_CALL_ESP to avoid %esp register.

Thanks,
Uros.


Index: i386/i386.h
===================================================================
--- i386/i386.h (revision 154156)
+++ i386/i386.h (working copy)
@@ -402,7 +402,6 @@ enum ix86_arch_indices {
   X86_ARCH_CMPXCHG8B,
   X86_ARCH_XADD,
   X86_ARCH_BSWAP,
-  X86_ARCH_CALL_ESP,
 
   X86_ARCH_LAST
 };
@@ -414,7 +413,6 @@ extern unsigned char ix86_arch_features[
 #define TARGET_CMPXCHG8B ix86_arch_features[X86_ARCH_CMPXCHG8B]
 #define TARGET_XADD ix86_arch_features[X86_ARCH_XADD]
 #define TARGET_BSWAP ix86_arch_features[X86_ARCH_BSWAP]
-#define TARGET_CALL_ESP ix86_arch_features[X86_ARCH_CALL_ESP]
 
 #define TARGET_FISTTP (TARGET_SSE3 && TARGET_80387)
 
Index: i386/i386.md
===================================================================
--- i386/i386.md (revision 154156)
+++ i386/i386.md (working copy)
@@ -14562,6 +14562,10 @@
 ;; checked for calls.  This is a bug in the generic code, but it isn't that
 ;; easy to fix.  Ignore it for now and be prepared to fix things up.
 
+;; P6 processors will jump to the address after the decrement when %esp
+;; is used as a call operand, so they will execute return address as a code.
+;; See Pentium Pro errata 70, Pentium 2 errata A33 and Pentium 3 errata E17.
+
 ;; Call subroutine returning no value.
 
 (define_expand "call_pop"
@@ -14592,27 +14596,13 @@
 }
   [(set_attr "type" "call")])
 
-(define_insn "*call_pop_1_esp"
-  [(call (mem:QI (match_operand:SI 0 "call_insn_operand" "rsm"))
- (match_operand:SI 1 "" ""))
-   (set (reg:SI SP_REG)
- (plus:SI (reg:SI SP_REG)
- (match_operand:SI 2 "immediate_operand" "i")))]
-  "!TARGET_64BIT && TARGET_CALL_ESP && !SIBLING_CALL_P (insn)"
-{
-  if (constant_call_address_operand (operands[0], Pmode))
-    return "call\t%P0";
-  return "call\t%A0";
-}
-  [(set_attr "type" "call")])
-
 (define_insn "*call_pop_1"
   [(call (mem:QI (match_operand:SI 0 "call_insn_operand" "lsm"))
  (match_operand:SI 1 "" ""))
    (set (reg:SI SP_REG)
  (plus:SI (reg:SI SP_REG)
  (match_operand:SI 2 "immediate_operand" "i")))]
-  "!TARGET_64BIT && !TARGET_CALL_ESP && !SIBLING_CALL_P (insn)"
+  "!TARGET_64BIT && !SIBLING_CALL_P (insn)"
 {
   if (constant_call_address_operand (operands[0], Pmode))
     return "call\t%P0";
@@ -14664,21 +14654,10 @@
 }
   [(set_attr "type" "call")])
 
-(define_insn "*call_1_esp"
-  [(call (mem:QI (match_operand:SI 0 "call_insn_operand" "rsm"))
- (match_operand 1 "" ""))]
-  "!TARGET_64BIT && TARGET_CALL_ESP && !SIBLING_CALL_P (insn)"
-{
-  if (constant_call_address_operand (operands[0], Pmode))
-    return "call\t%P0";
-  return "call\t%A0";
-}
-  [(set_attr "type" "call")])
-
 (define_insn "*call_1"
   [(call (mem:QI (match_operand:SI 0 "call_insn_operand" "lsm"))
  (match_operand 1 "" ""))]
-  "!TARGET_64BIT && !TARGET_CALL_ESP && !SIBLING_CALL_P (insn)"
+  "!TARGET_64BIT && !SIBLING_CALL_P (insn)"
 {
   if (constant_call_address_operand (operands[0], Pmode))
     return "call\t%P0";
@@ -21158,8 +21137,9 @@
   [(set (match_operand 0 "" "")
  (call (mem:QI (match_operand:SI 1 "constant_call_address_operand" ""))
       (match_operand:SI 2 "" "")))
-   (set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG)
-    (match_operand:SI 3 "immediate_operand" "")))]
+   (set (reg:SI SP_REG)
+ (plus:SI (reg:SI SP_REG)
+ (match_operand:SI 3 "immediate_operand" "")))]
   "!TARGET_64BIT"
 {
   if (SIBLING_CALL_P (insn))
@@ -21169,27 +21149,14 @@
 }
   [(set_attr "type" "callv")])
 
-(define_insn "*call_value_pop_1_esp"
-  [(set (match_operand 0 "" "")
- (call (mem:QI (match_operand:SI 1 "call_insn_operand" "rsm"))
-      (match_operand:SI 2 "" "")))
-   (set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG)
-    (match_operand:SI 3 "immediate_operand" "i")))]
-  "!TARGET_64BIT && TARGET_CALL_ESP && !SIBLING_CALL_P (insn)"
-{
-  if (constant_call_address_operand (operands[1], Pmode))
-    return "call\t%P1";
-  return "call\t%A1";
-}
-  [(set_attr "type" "callv")])
-
 (define_insn "*call_value_pop_1"
   [(set (match_operand 0 "" "")
  (call (mem:QI (match_operand:SI 1 "call_insn_operand" "lsm"))
       (match_operand:SI 2 "" "")))
-   (set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG)
-    (match_operand:SI 3 "immediate_operand" "i")))]
-  "!TARGET_64BIT && !TARGET_CALL_ESP && !SIBLING_CALL_P (insn)"
+   (set (reg:SI SP_REG)
+ (plus:SI (reg:SI SP_REG)
+ (match_operand:SI 3 "immediate_operand" "i")))]
+  "!TARGET_64BIT && !SIBLING_CALL_P (insn)"
 {
   if (constant_call_address_operand (operands[1], Pmode))
     return "call\t%P1";
@@ -21201,8 +21168,9 @@
   [(set (match_operand 0 "" "")
  (call (mem:QI (match_operand:SI 1 "sibcall_insn_operand" "s,U"))
       (match_operand:SI 2 "" "")))
-   (set (reg:SI SP_REG) (plus:SI (reg:SI SP_REG)
-    (match_operand:SI 3 "immediate_operand" "i,i")))]
+   (set (reg:SI SP_REG)
+ (plus:SI (reg:SI SP_REG)
+ (match_operand:SI 3 "immediate_operand" "i,i")))]
   "!TARGET_64BIT && SIBLING_CALL_P (insn)"
   "@
    jmp\t%P1
@@ -21261,23 +21229,11 @@
 }
   [(set_attr "type" "callv")])
 
-(define_insn "*call_value_1_esp"
-  [(set (match_operand 0 "" "")
- (call (mem:QI (match_operand:SI 1 "call_insn_operand" "rsm"))
-      (match_operand:SI 2 "" "")))]
-  "!TARGET_64BIT && TARGET_CALL_ESP && !SIBLING_CALL_P (insn)"
-{
-  if (constant_call_address_operand (operands[1], Pmode))
-    return "call\t%P1";
-  return "call\t%A1";
-}
-  [(set_attr "type" "callv")])
-
 (define_insn "*call_value_1"
   [(set (match_operand 0 "" "")
  (call (mem:QI (match_operand:SI 1 "call_insn_operand" "lsm"))
       (match_operand:SI 2 "" "")))]
-  "!TARGET_64BIT && !TARGET_CALL_ESP && !SIBLING_CALL_P (insn)"
+  "!TARGET_64BIT && !SIBLING_CALL_P (insn)"
 {
   if (constant_call_address_operand (operands[1], Pmode))
     return "call\t%P1";
Index: i386/predicates.md
===================================================================
--- i386/predicates.md (revision 154156)
+++ i386/predicates.md (working copy)
@@ -561,9 +561,7 @@
 ;; Test for a valid operand for a call instruction.
 (define_predicate "call_insn_operand"
   (ior (match_operand 0 "constant_call_address_operand")
-       (ior (and (match_operand 0 "register_no_elim_operand")
- (ior (match_test "TARGET_CALL_ESP")
-      (match_operand 0 "index_register_operand")))
+       (ior (match_operand 0 "index_register_operand")
     (match_operand 0 "memory_operand"))))
 
 ;; Similarly, but for tail calls, in which we cannot allow memory references.
Index: i386/i386.c
===================================================================
--- i386/i386.c (revision 154156)
+++ i386/i386.c (working copy)
@@ -1553,11 +1553,6 @@ static unsigned int initial_ix86_arch_fe
 
   /* X86_ARCH_BSWAP: Byteswap was added for 80486.  */
   ~m_386,
-
-  /* X86_ARCH_CALL_ESP: P6 processors will jump to the address after
-     the decrement (so they will execute return address as code).  See
-     Pentium Pro errata 70, Pentium 2 errata A33, Pentium 3 errata E17.  */
-  ~(m_386 | m_486 | m_PENT | m_PPRO),
 };
 
 static const unsigned int x86_accumulate_outgoing_args

Re: [PATCH, i386]: Fix PR 41900, call *%esp shouldn't be generated because of CPU errata

by Eric Botcazou-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> 2009-11-13  Uros Bizjak <ubizjak@...>
>
>      PR target/41900
>      (*call_pop_1, *call_1, *call_value_pop_1, *call_value_1): Use "lsm"
>      as operand 1 constraint.
>      * config/i386/predicates.md (call_insn_operand): Depend on
>      index_register_operand to avoid %esp register.
>
> 2009-11-13  Uros Bizjak <ubizjak@...>
>
>      Revert:
>      2009-11-03  Uros Bizjak <ubizjak@...>
>
>      PR target/41900
>      * config/i386/i386.h (ix86_arch_indices) <X86_ARCH_CALL_ESP>: New.
>      (TARGET_CALL_ESP): New define.
>      * config/i386/i386.c (initial_ix86_tune_features): Initialize
>      X86_ARCH_CALL_ESP.
>      * config/i386/i386.md (*call_pop_1_esp, *call_1_esp,
>      *call_value_pop_1_esp, *call_value_1_esp): Rename from *call_pop_1,
>      *call_1, *call_value_pop_1 and *call_value_1.  Depend on
>      TARGET_CALL_ESP.
>      (*call_pop_1, *call_1, *call_value_pop_1, *call_value_1):
>      New patterns, use "lsm" as operand 1 constraint.
>      * config/i386/predicates.md (call_insn_operand): Depend on
>      index_register_operand for !TARGET_CALL_ESP to avoid %esp register.

This has introduced a regression

FAIL: gcc.c-torture/compile/930117-1.c  -O1  (internal compiler error)
FAIL: gcc.c-torture/compile/930117-1.c  -O1  (test for excess errors)
FAIL: gcc.c-torture/compile/930117-1.c  -O2  (internal compiler error)
FAIL: gcc.c-torture/compile/930117-1.c  -O2  (test for excess errors)
FAIL: gcc.c-torture/compile/930117-1.c  -O3 -fomit-frame-pointer  (internal
compiler error)
FAIL: gcc.c-torture/compile/930117-1.c  -O3 -fomit-frame-pointer  (test for
excess errors)
FAIL: gcc.c-torture/compile/930117-1.c  -O3 -g  (internal compiler error)
FAIL: gcc.c-torture/compile/930117-1.c  -O3 -g  (test for excess errors)
FAIL: gcc.c-torture/compile/930117-1.c  -Os  (internal compiler error)
FAIL: gcc.c-torture/compile/930117-1.c  -Os  (test for excess errors)

on x86 (all branches).

--
Eric Botcazou

Re: [PATCH, i386]: Fix PR 41900, call *%esp shouldn't be generated because of CPU errata

by Uros Bizjak-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On 11/14/2009 10:19 AM, Eric Botcazou wrote:

>> 2009-11-13  Uros Bizjak<ubizjak@...>
>>
>>       PR target/41900
>>       (*call_pop_1, *call_1, *call_value_pop_1, *call_value_1): Use "lsm"
>>       as operand 1 constraint.
>>       * config/i386/predicates.md (call_insn_operand): Depend on
>>       index_register_operand to avoid %esp register.
>>
>> 2009-11-13  Uros Bizjak<ubizjak@...>
>>
>>       Revert:
>>       2009-11-03  Uros Bizjak<ubizjak@...>
>>
>>       PR target/41900
>>       * config/i386/i386.h (ix86_arch_indices)<X86_ARCH_CALL_ESP>: New.
>>       (TARGET_CALL_ESP): New define.
>>       * config/i386/i386.c (initial_ix86_tune_features): Initialize
>>       X86_ARCH_CALL_ESP.
>>       * config/i386/i386.md (*call_pop_1_esp, *call_1_esp,
>>       *call_value_pop_1_esp, *call_value_1_esp): Rename from *call_pop_1,
>>       *call_1, *call_value_pop_1 and *call_value_1.  Depend on
>>       TARGET_CALL_ESP.
>>       (*call_pop_1, *call_1, *call_value_pop_1, *call_value_1):
>>       New patterns, use "lsm" as operand 1 constraint.
>>       * config/i386/predicates.md (call_insn_operand): Depend on
>>       index_register_operand for !TARGET_CALL_ESP to avoid %esp register.
>>      
> This has introduced a regression
>
> FAIL: gcc.c-torture/compile/930117-1.c  -O1  (internal compiler error)
> FAIL: gcc.c-torture/compile/930117-1.c  -O1  (test for excess errors)
> FAIL: gcc.c-torture/compile/930117-1.c  -O2  (internal compiler error)
> FAIL: gcc.c-torture/compile/930117-1.c  -O2  (test for excess errors)
> FAIL: gcc.c-torture/compile/930117-1.c  -O3 -fomit-frame-pointer  (interna
> compiler error)
> FAIL: gcc.c-torture/compile/930117-1.c  -O3 -fomit-frame-pointer  (test for
> excess errors)
> FAIL: gcc.c-torture/compile/930117-1.c  -O3 -g  (internal compiler error)
> FAIL: gcc.c-torture/compile/930117-1.c  -O3 -g  (test for excess errors)
> FAIL: gcc.c-torture/compile/930117-1.c  -Os  (internal compiler error)
> FAIL: gcc.c-torture/compile/930117-1.c  -Os  (test for excess errors)
>
> on x86 (all branches).
>    
Uh, for some reason -m32 was ineffective in my test scripts :( .

Apparently, "index_register_operand" predicate is not _that_ similar to
"register_no_elim_operand".

2009-11-14  Uros Bizjak <ubizjak@...>

     * config/i386/predicates.md (call_register_no_elim_operand):
     New predicate.  Reject stack register as valid call operand
     for 32bit targets.
     (call_insn_operand): Use call_register_no_elim_operand.

I'm testing attached patch.

Thanks,
Uros.

Index: predicates.md
===================================================================
--- predicates.md (revision 154160)
+++ predicates.md (working copy)
@@ -533,6 +533,22 @@
  FIRST_PSEUDO_REGISTER, LAST_VIRTUAL_REGISTER));
 })
 
+;; P6 processors will jump to the address after the decrement when %esp
+;; is used as a call operand, so they will execute return address as a code.
+;; See Pentium Pro errata 70, Pentium 2 errata A33 and Pentium 3 errata E17.
+
+(define_predicate "call_register_no_elim_operand"
+  (match_operand 0 "register_operand")
+{
+  if (GET_CODE (op) == SUBREG)
+    op = SUBREG_REG (op);
+
+  if (!TARGET_64BIT && op == stack_pointer_rtx)
+    return 0;
+
+  return register_no_elim_operand (op, mode);
+})
+
 ;; Similarly, but include the stack pointer.  This is used to prevent esp
 ;; from being used as an index reg.
 (define_predicate "index_register_operand"
@@ -561,7 +577,7 @@
 ;; Test for a valid operand for a call instruction.
 (define_predicate "call_insn_operand"
   (ior (match_operand 0 "constant_call_address_operand")
-       (ior (match_operand 0 "index_register_operand")
+       (ior (match_operand 0 "call_register_no_elim_operand")
     (match_operand 0 "memory_operand"))))
 
 ;; Similarly, but for tail calls, in which we cannot allow memory references.

Re: [PATCH, i386]: Fix PR 41900, call *%esp shouldn't be generated because of CPU errata

by Eric Botcazou-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

> Apparently, "index_register_operand" predicate is not _that_ similar to
> "register_no_elim_operand".
>
> 2009-11-14  Uros Bizjak <ubizjak@...>
>
>      * config/i386/predicates.md (call_register_no_elim_operand):
>      New predicate.  Reject stack register as valid call operand
>      for 32bit targets.
>      (call_insn_operand): Use call_register_no_elim_operand.

I confirm that this has fixed the regression, thanks!

--
Eric Botcazou