[PATCH: PR target/40835] Remove comparison with 0 after instruction movs for thumb

View: New views
8 Messages — Rating Filter:   Alert me  

[PATCH: PR target/40835] Remove comparison with 0 after instruction movs for thumb

by Carrot Wei :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

In thumb mode the cmp instruction in the following sequence can be removed
since the previous movs instruction has already set the condition code
according to the value of r1.

       movs r0, r1
       cmp  r1, 0
       beq  .L3

In thumb mode we don't have insn patterns to express compare and conditional
branch separately, so this optimization is implemented as a peephole rule.

Test:
This patch was applied to trunk GCC and tested on the arm emulator with newlib.
No new failure.

ChangeLog:
2009-10-29  Wei Guozhi  <carrot@...>

        PR target/40835
        * config/arm/arm.c (emit_branch_after_movs): New function.
        (removable_cmp_0): New function.
        * config/arm/arm-protos.h (emit_branch_after_movs): Declare it.
        (removable_cmp_0): Declare it.
        * config/arm/arm.md: Add peephole rule to do the optimization.
               
2009-10-29  Wei Guozhi  <carrot@...>

        PR target/40835
        * gcc.target/arm/pr40835: New testcase.


thanks
Wei Guozhi


Index: arm-protos.h
===================================================================
--- arm-protos.h        (revision 153642)
+++ arm-protos.h        (working copy)
@@ -146,7 +146,8 @@ extern const char *vfp_output_fstmd (rtx
 extern void arm_set_return_address (rtx, rtx);
 extern int arm_eliminable_register (rtx);
 extern const char *arm_output_shift(rtx *, int);
-
+extern bool removable_cmp_0 (rtx cmp_op);
+extern const char *emit_branch_after_movs (rtx *operands);
 extern bool arm_output_addr_const_extra (FILE *, rtx);

 #if defined TREE_CODE


Index: arm.c
===================================================================
--- arm.c       (revision 153642)
+++ arm.c       (working copy)
@@ -21195,4 +21195,63 @@ arm_have_conditional_execution (void)
   return !TARGET_THUMB1;
 }

+/* In thumb mode the cmp instruction in the following sequence can be
+   removed since the previous movs instruction has already set the
+   condition code according to the value of r1.
+       movs r0, r1
+       cmp  r1, 0
+       beq  .L3      */
+
+const char *
+emit_branch_after_movs (rtx *operands)
+{
+  char buf[100];
+
+  /* operands[2] is the compare rtx.  */
+  switch (GET_CODE (operands[2]))
+    {
+      case EQ:
+        sprintf (buf, "beq");
+        break;
+
+      case NE:
+        sprintf (buf, "bne");
+        break;
+
+      case LT:
+        sprintf (buf, "bmi");
+        break;
+
+      case GE:
+        sprintf (buf, "bpl");
+        break;
+
+      default:
+        gcc_unreachable ();
+    }
+
+  output_asm_insn ("movs\t%0, %1", operands);
+  strcat (buf, "\t%l3");
+  output_asm_insn (buf, operands);
+  return "";
+}
+
+/* If the second operand is 0 the following compare rtx codes depend on N
+   and Z flags only, which have been set by previous movs instruction. So
+   can be removed.  */
+bool
+removable_cmp_0 (rtx cmp_op)
+{
+  switch (GET_CODE (cmp_op))
+    {
+      case EQ:
+      case NE:
+      case LT:
+      case GE:
+        return true;
+      default:
+        return false;
+    }
+}
+
 #include "gt-arm.h"


Index: arm.md
===================================================================
--- arm.md      (revision 153642)
+++ arm.md      (working copy)
@@ -7708,6 +7708,28 @@
                (const_int 8))))]
 )

+;; In thumb mode the cmp instruction in the following sequence can be
+;; removed since the previous movs instruction has already set the
+;; condition code according to the value of r1.
+;;    movs r0, r1
+;;    cmp  r1, 0
+;;    beq  .L3
+
+(define_peephole
+  [(set (match_operand:SI 0 "low_register_operand" "")
+        (match_operand:SI 1 "low_register_operand" ""))
+   (set (pc)
+        (if_then_else (match_operator 2 "arm_comparison_operator"
+                       [(match_dup 1) (const_int 0)])
+                      (label_ref (match_operand 3 "" ""))
+                      (pc)))]
+  "TARGET_THUMB && (get_attr_length (insn) == 4)
+   && removable_cmp_0 (operands[2])"
+  "*
+  return emit_branch_after_movs (operands);
+  "
+)
+
 ;; Comparison and test insns

 (define_insn "*arm_cmpsi_insn"


Index: pr40835.c
===================================================================
--- pr40835.c   (revision 0)
+++ pr40835.c   (revision 0)
@@ -0,0 +1,39 @@
+/* { dg-options "-mthumb -Os -march=armv5te" }  */
+/* { dg-final { scan-assembler-not "cmp" } } */
+
+int bar();
+void goo(int, int);
+
+void eq()
+{
+  int v = bar();
+  if (v == 0)
+    return;
+  goo(1, v);
+}
+
+void ge()
+{
+  int v = bar();
+  if (v >= 0)
+    return;
+  goo(1, v);
+}
+
+void lt()
+{
+  int v = bar();
+  if (v < 0)
+    return;
+  goo(1, v);
+}
+
+unsigned int foo();
+
+void leu()
+{
+  unsigned int v = foo();
+  if (v <= 0)
+    return;
+  goo(1, v);
+}

Re: [PATCH: PR target/40835] Remove comparison with 0 after instruction movs for thumb

by Richard Earnshaw :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Thu, 2009-10-29 at 17:11 +0800, Carrot Wei wrote:

> In thumb mode the cmp instruction in the following sequence can be removed
> since the previous movs instruction has already set the condition code
> according to the value of r1.
>
>        movs r0, r1
>        cmp  r1, 0
>        beq  .L3
>
> In thumb mode we don't have insn patterns to express compare and conditional
> branch separately, so this optimization is implemented as a peephole rule.
>

I don't like peepholes; they usually shouldn't be necessary; though I
accept that it might be hard for combine to spot this case because it
isn't identified by the data-flow (the output of the mov does not flow
into the compare).

However, you are missing several other cases that you could handle
because you've tried to preserve the use of movs rather than using subs:

        subs r0, r1, #0
is identical in flag-setting behaviour to cmp and has the side effect of
setting r0 to r1 as well, so we can replace

        movs r0, r1
        cmp r1, #0
        b<any> ...

with
        subs r0, r1, #0
        b<any> ...

Please can you at least update the patch to handle this.

R.

> Test:
> This patch was applied to trunk GCC and tested on the arm emulator with newlib.
> No new failure.
>
> ChangeLog:
> 2009-10-29  Wei Guozhi  <carrot@...>
>
>         PR target/40835
>         * config/arm/arm.c (emit_branch_after_movs): New function.
>         (removable_cmp_0): New function.
>         * config/arm/arm-protos.h (emit_branch_after_movs): Declare it.
>         (removable_cmp_0): Declare it.
>         * config/arm/arm.md: Add peephole rule to do the optimization.
>
> 2009-10-29  Wei Guozhi  <carrot@...>
>
>         PR target/40835
>         * gcc.target/arm/pr40835: New testcase.
>
>
> thanks
> Wei Guozhi
>
>
> Index: arm-protos.h
> ===================================================================
> --- arm-protos.h        (revision 153642)
> +++ arm-protos.h        (working copy)
> @@ -146,7 +146,8 @@ extern const char *vfp_output_fstmd (rtx
>  extern void arm_set_return_address (rtx, rtx);
>  extern int arm_eliminable_register (rtx);
>  extern const char *arm_output_shift(rtx *, int);
> -
> +extern bool removable_cmp_0 (rtx cmp_op);
> +extern const char *emit_branch_after_movs (rtx *operands);
>  extern bool arm_output_addr_const_extra (FILE *, rtx);
>
>  #if defined TREE_CODE
>
>
> Index: arm.c
> ===================================================================
> --- arm.c       (revision 153642)
> +++ arm.c       (working copy)
> @@ -21195,4 +21195,63 @@ arm_have_conditional_execution (void)
>    return !TARGET_THUMB1;
>  }
>
> +/* In thumb mode the cmp instruction in the following sequence can be
> +   removed since the previous movs instruction has already set the
> +   condition code according to the value of r1.
> +       movs r0, r1
> +       cmp  r1, 0
> +       beq  .L3      */
> +
> +const char *
> +emit_branch_after_movs (rtx *operands)
> +{
> +  char buf[100];
> +
> +  /* operands[2] is the compare rtx.  */
> +  switch (GET_CODE (operands[2]))
> +    {
> +      case EQ:
> +        sprintf (buf, "beq");
> +        break;
> +
> +      case NE:
> +        sprintf (buf, "bne");
> +        break;
> +
> +      case LT:
> +        sprintf (buf, "bmi");
> +        break;
> +
> +      case GE:
> +        sprintf (buf, "bpl");
> +        break;
> +
> +      default:
> +        gcc_unreachable ();
> +    }
> +
> +  output_asm_insn ("movs\t%0, %1", operands);
> +  strcat (buf, "\t%l3");
> +  output_asm_insn (buf, operands);
> +  return "";
> +}
> +
> +/* If the second operand is 0 the following compare rtx codes depend on N
> +   and Z flags only, which have been set by previous movs instruction. So
> +   can be removed.  */
> +bool
> +removable_cmp_0 (rtx cmp_op)
> +{
> +  switch (GET_CODE (cmp_op))
> +    {
> +      case EQ:
> +      case NE:
> +      case LT:
> +      case GE:
> +        return true;
> +      default:
> +        return false;
> +    }
> +}
> +
>  #include "gt-arm.h"
>
>
> Index: arm.md
> ===================================================================
> --- arm.md      (revision 153642)
> +++ arm.md      (working copy)
> @@ -7708,6 +7708,28 @@
>                 (const_int 8))))]
>  )
>
> +;; In thumb mode the cmp instruction in the following sequence can be
> +;; removed since the previous movs instruction has already set the
> +;; condition code according to the value of r1.
> +;;    movs r0, r1
> +;;    cmp  r1, 0
> +;;    beq  .L3
> +
> +(define_peephole
> +  [(set (match_operand:SI 0 "low_register_operand" "")
> +        (match_operand:SI 1 "low_register_operand" ""))
> +   (set (pc)
> +        (if_then_else (match_operator 2 "arm_comparison_operator"
> +                       [(match_dup 1) (const_int 0)])
> +                      (label_ref (match_operand 3 "" ""))
> +                      (pc)))]
> +  "TARGET_THUMB && (get_attr_length (insn) == 4)
> +   && removable_cmp_0 (operands[2])"
> +  "*
> +  return emit_branch_after_movs (operands);
> +  "
> +)
> +
>  ;; Comparison and test insns
>
>  (define_insn "*arm_cmpsi_insn"
>
>
> Index: pr40835.c
> ===================================================================
> --- pr40835.c   (revision 0)
> +++ pr40835.c   (revision 0)
> @@ -0,0 +1,39 @@
> +/* { dg-options "-mthumb -Os -march=armv5te" }  */
> +/* { dg-final { scan-assembler-not "cmp" } } */
> +
> +int bar();
> +void goo(int, int);
> +
> +void eq()
> +{
> +  int v = bar();
> +  if (v == 0)
> +    return;
> +  goo(1, v);
> +}
> +
> +void ge()
> +{
> +  int v = bar();
> +  if (v >= 0)
> +    return;
> +  goo(1, v);
> +}
> +
> +void lt()
> +{
> +  int v = bar();
> +  if (v < 0)
> +    return;
> +  goo(1, v);
> +}
> +
> +unsigned int foo();
> +
> +void leu()
> +{
> +  unsigned int v = foo();
> +  if (v <= 0)
> +    return;
> +  goo(1, v);
> +}



Re: [PATCH: PR target/40835] Remove comparison with 0 after instruction movs for thumb

by Carrot Wei :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

You are right. Use sub can remove more cmp instructions. The patch
has been updated to reflect this.

Test:
This patch was applied to trunk GCC and tested on the arm emulator with newlib.
No new failure.

ChangeLog:
2009-10-31  Wei Guozhi  <carrot@...>

        PR target/40835
        * config/arm/arm.c (emit_branch_after_movs): New function.
        (removable_cmp_0): New function.
        * config/arm/arm-protos.h (emit_branch_after_movs): Declare it.
        (removable_cmp_0): Declare it.
        * config/arm/arm.md: Add peephole rule to do the optimization.
               
2009-10-31  Wei Guozhi  <carrot@...>

        PR target/40835
        * gcc.target/arm/pr40835: New testcase.


thanks
Wei Guozhi


Index: arm.c
===================================================================
--- arm.c       (revision 153642)
+++ arm.c       (working copy)
@@ -21195,4 +21195,71 @@ arm_have_conditional_execution (void)
   return !TARGET_THUMB1;
 }

+/* In thumb mode the cmp instruction in the following sequence can be
+   removed since the previous movs instruction has already set the
+   condition code according to the value of r1.
+       movs r0, r1
+       cmp  r1, 0
+       beq  .L3
+   If we replace "movs r0, r1" with "sub r0, r1, #0", we can remove more
+   cmp instructions since more condition codes are set by sub.  */
+
+const char *
+emit_branch_after_movs (rtx *operands)
+{
+  char buf[100];
+
+  /* operands[2] is the compare rtx.  */
+  switch (GET_CODE (operands[2]))
+    {
+      case EQ:
+        sprintf (buf, "beq");
+        break;
+
+      case NE:
+        sprintf (buf, "bne");
+        break;
+
+      case LT:
+        sprintf (buf, "bmi");
+        break;
+
+      case LE:
+        sprintf (buf, "ble");
+        break;
+
+      case GT:
+        sprintf (buf, "bgt");
+        break;
+
+      case GE:
+        sprintf (buf, "bpl");
+        break;
+
+      case GTU:
+        sprintf (buf, "bhi");
+        break;
+
+      case GEU:
+        sprintf (buf, "bhs");
+        break;
+
+      case LTU:
+        sprintf (buf, "blo");
+        break;
+
+      case LEU:
+        sprintf (buf, "bls");
+        break;
+
+      default:
+        gcc_unreachable ();
+    }
+
+  output_asm_insn ("sub\t%0, %1, #0", operands);
+  strcat (buf, "\t%l3");
+  output_asm_insn (buf, operands);
+  return "";
+}
+
 #include "gt-arm.h"
Index: arm-protos.h
===================================================================
--- arm-protos.h        (revision 153642)
+++ arm-protos.h        (working copy)
@@ -146,7 +146,7 @@ extern const char *vfp_output_fstmd (rtx
 extern void arm_set_return_address (rtx, rtx);
 extern int arm_eliminable_register (rtx);
 extern const char *arm_output_shift(rtx *, int);
-
+extern const char *emit_branch_after_movs (rtx *operands);
 extern bool arm_output_addr_const_extra (FILE *, rtx);

 #if defined TREE_CODE
Index: arm.md
===================================================================
--- arm.md      (revision 153642)
+++ arm.md      (working copy)
@@ -7708,6 +7708,27 @@
                (const_int 8))))]
 )

+;; In thumb mode the cmp instruction in the following sequence can be
+;; removed since the previous movs instruction has already set the
+;; condition code according to the value of r1.
+;;    movs r0, r1
+;;    cmp  r1, 0
+;;    beq  .L3
+
+(define_peephole
+  [(set (match_operand:SI 0 "low_register_operand" "")
+        (match_operand:SI 1 "low_register_operand" ""))
+   (set (pc)
+        (if_then_else (match_operator 2 "arm_comparison_operator"
+                       [(match_dup 1) (const_int 0)])
+                      (label_ref (match_operand 3 "" ""))
+                      (pc)))]
+  "TARGET_THUMB && (get_attr_length (insn) == 4)"
+  "*
+  return emit_branch_after_movs (operands);
+  "
+)
+
 ;; Comparison and test insns

 (define_insn "*arm_cmpsi_insn"


Index: pr40835.c
===================================================================
--- pr40835.c   (revision 0)
+++ pr40835.c   (revision 0)
@@ -0,0 +1,55 @@
+/* { dg-options "-mthumb -Os -march=armv5te" }  */
+/* { dg-final { scan-assembler-not "cmp" } } */
+
+int bar();
+void goo(int, int);
+
+void eq()
+{
+  int v = bar();
+  if (v == 0)
+    return;
+  goo(1, v);
+}
+
+void ge()
+{
+  int v = bar();
+  if (v >= 0)
+    return;
+  goo(1, v);
+}
+
+void gt()
+{
+  int v = bar();
+  if (v > 0)
+    return;
+  goo(1, v);
+}
+
+void lt()
+{
+  int v = bar();
+  if (v < 0)
+    return;
+  goo(1, v);
+}
+
+void le()
+{
+  int v = bar();
+  if (v <= 0)
+    return;
+  goo(1, v);
+}
+
+unsigned int foo();
+
+void leu()
+{
+  unsigned int v = foo();
+  if (v <= 0)
+    return;
+  goo(1, v);
+}



On Thu, Oct 29, 2009 at 6:36 PM, Richard Earnshaw <rearnsha@...> wrote:

> On Thu, 2009-10-29 at 17:11 +0800, Carrot Wei wrote:
>> In thumb mode the cmp instruction in the following sequence can be removed
>> since the previous movs instruction has already set the condition code
>> according to the value of r1.
>>
>>        movs r0, r1
>>        cmp  r1, 0
>>        beq  .L3
>>
>> In thumb mode we don't have insn patterns to express compare and conditional
>> branch separately, so this optimization is implemented as a peephole rule.
>>
>
> I don't like peepholes; they usually shouldn't be necessary; though I
> accept that it might be hard for combine to spot this case because it
> isn't identified by the data-flow (the output of the mov does not flow
> into the compare).
>
> However, you are missing several other cases that you could handle
> because you've tried to preserve the use of movs rather than using subs:
>
>        subs r0, r1, #0
> is identical in flag-setting behaviour to cmp and has the side effect of
> setting r0 to r1 as well, so we can replace
>
>        movs    r0, r1
>        cmp     r1, #0
>        b<any>  ...
>
> with
>        subs    r0, r1, #0
>        b<any>  ...
>
> Please can you at least update the patch to handle this.
>
> R.
>
>> Test:
>> This patch was applied to trunk GCC and tested on the arm emulator with newlib.
>> No new failure.
>>
>> ChangeLog:
>> 2009-10-29  Wei Guozhi  <carrot@...>
>>
>>         PR target/40835
>>         * config/arm/arm.c (emit_branch_after_movs): New function.
>>         (removable_cmp_0): New function.
>>         * config/arm/arm-protos.h (emit_branch_after_movs): Declare it.
>>         (removable_cmp_0): Declare it.
>>         * config/arm/arm.md: Add peephole rule to do the optimization.
>>
>> 2009-10-29  Wei Guozhi  <carrot@...>
>>
>>         PR target/40835
>>         * gcc.target/arm/pr40835: New testcase.
>>
>>
>> thanks
>> Wei Guozhi
>>
>>
>> Index: arm-protos.h
>> ===================================================================
>> --- arm-protos.h        (revision 153642)
>> +++ arm-protos.h        (working copy)
>> @@ -146,7 +146,8 @@ extern const char *vfp_output_fstmd (rtx
>>  extern void arm_set_return_address (rtx, rtx);
>>  extern int arm_eliminable_register (rtx);
>>  extern const char *arm_output_shift(rtx *, int);
>> -
>> +extern bool removable_cmp_0 (rtx cmp_op);
>> +extern const char *emit_branch_after_movs (rtx *operands);
>>  extern bool arm_output_addr_const_extra (FILE *, rtx);
>>
>>  #if defined TREE_CODE
>>
>>
>> Index: arm.c
>> ===================================================================
>> --- arm.c       (revision 153642)
>> +++ arm.c       (working copy)
>> @@ -21195,4 +21195,63 @@ arm_have_conditional_execution (void)
>>    return !TARGET_THUMB1;
>>  }
>>
>> +/* In thumb mode the cmp instruction in the following sequence can be
>> +   removed since the previous movs instruction has already set the
>> +   condition code according to the value of r1.
>> +       movs r0, r1
>> +       cmp  r1, 0
>> +       beq  .L3      */
>> +
>> +const char *
>> +emit_branch_after_movs (rtx *operands)
>> +{
>> +  char buf[100];
>> +
>> +  /* operands[2] is the compare rtx.  */
>> +  switch (GET_CODE (operands[2]))
>> +    {
>> +      case EQ:
>> +        sprintf (buf, "beq");
>> +        break;
>> +
>> +      case NE:
>> +        sprintf (buf, "bne");
>> +        break;
>> +
>> +      case LT:
>> +        sprintf (buf, "bmi");
>> +        break;
>> +
>> +      case GE:
>> +        sprintf (buf, "bpl");
>> +        break;
>> +
>> +      default:
>> +        gcc_unreachable ();
>> +    }
>> +
>> +  output_asm_insn ("movs\t%0, %1", operands);
>> +  strcat (buf, "\t%l3");
>> +  output_asm_insn (buf, operands);
>> +  return "";
>> +}
>> +
>> +/* If the second operand is 0 the following compare rtx codes depend on N
>> +   and Z flags only, which have been set by previous movs instruction. So
>> +   can be removed.  */
>> +bool
>> +removable_cmp_0 (rtx cmp_op)
>> +{
>> +  switch (GET_CODE (cmp_op))
>> +    {
>> +      case EQ:
>> +      case NE:
>> +      case LT:
>> +      case GE:
>> +        return true;
>> +      default:
>> +        return false;
>> +    }
>> +}
>> +
>>  #include "gt-arm.h"
>>
>>
>> Index: arm.md
>> ===================================================================
>> --- arm.md      (revision 153642)
>> +++ arm.md      (working copy)
>> @@ -7708,6 +7708,28 @@
>>                 (const_int 8))))]
>>  )
>>
>> +;; In thumb mode the cmp instruction in the following sequence can be
>> +;; removed since the previous movs instruction has already set the
>> +;; condition code according to the value of r1.
>> +;;    movs r0, r1
>> +;;    cmp  r1, 0
>> +;;    beq  .L3
>> +
>> +(define_peephole
>> +  [(set (match_operand:SI 0 "low_register_operand" "")
>> +        (match_operand:SI 1 "low_register_operand" ""))
>> +   (set (pc)
>> +        (if_then_else (match_operator 2 "arm_comparison_operator"
>> +                       [(match_dup 1) (const_int 0)])
>> +                      (label_ref (match_operand 3 "" ""))
>> +                      (pc)))]
>> +  "TARGET_THUMB && (get_attr_length (insn) == 4)
>> +   && removable_cmp_0 (operands[2])"
>> +  "*
>> +  return emit_branch_after_movs (operands);
>> +  "
>> +)
>> +
>>  ;; Comparison and test insns
>>
>>  (define_insn "*arm_cmpsi_insn"
>>
>>
>> Index: pr40835.c
>> ===================================================================
>> --- pr40835.c   (revision 0)
>> +++ pr40835.c   (revision 0)
>> @@ -0,0 +1,39 @@
>> +/* { dg-options "-mthumb -Os -march=armv5te" }  */
>> +/* { dg-final { scan-assembler-not "cmp" } } */
>> +
>> +int bar();
>> +void goo(int, int);
>> +
>> +void eq()
>> +{
>> +  int v = bar();
>> +  if (v == 0)
>> +    return;
>> +  goo(1, v);
>> +}
>> +
>> +void ge()
>> +{
>> +  int v = bar();
>> +  if (v >= 0)
>> +    return;
>> +  goo(1, v);
>> +}
>> +
>> +void lt()
>> +{
>> +  int v = bar();
>> +  if (v < 0)
>> +    return;
>> +  goo(1, v);
>> +}
>> +
>> +unsigned int foo();
>> +
>> +void leu()
>> +{
>> +  unsigned int v = foo();
>> +  if (v <= 0)
>> +    return;
>> +  goo(1, v);
>> +}
>
>
>

Re: [PATCH: PR target/40835] Remove comparison with 0 after instruction movs for thumb

by Richard Earnshaw :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


On Sat, 2009-10-31 at 20:56 +0800, Carrot Wei wrote:

> You are right. Use sub can remove more cmp instructions. The patch
> has been updated to reflect this.
>
> Test:
> This patch was applied to trunk GCC and tested on the arm emulator with newlib.
> No new failure.
>
> ChangeLog:
> 2009-10-31  Wei Guozhi  <carrot@...>
>
>         PR target/40835
>         * config/arm/arm.c (emit_branch_after_movs): New function.
>         (removable_cmp_0): New function.
>         * config/arm/arm-protos.h (emit_branch_after_movs): Declare it.
>         (removable_cmp_0): Declare it.
>         * config/arm/arm.md: Add peephole rule to do the optimization.
>
> 2009-10-31  Wei Guozhi  <carrot@...>
>
>         PR target/40835
>         * gcc.target/arm/pr40835: New testcase.
>
This is needlessly complicated.  I've only tested the following by
examination, but it's a much more robust solution.  

Note I wouldn't normally condone adding peepholes for the second case
(where there's a natural data-flow dependency between the insns), but it
turns out that in this scenario combine is being prevented from
optimizing this case by the need to reduce register pressure and this is
common enough to warrant an exception.

R.

[thumb.patch]

*** arm.md (revision 153753)
--- arm.md (local)
*************** (define_insn "cbranchsi4_scratch"
*** 6770,6775 ****
--- 6770,6776 ----
  (const_int 6)
  (const_int 8))))]
  )
+
  (define_insn "*movsi_cbranchsi4"
    [(set (pc)
  (if_then_else
*************** (define_insn "*movsi_cbranchsi4"
*** 6833,6838 ****
--- 6834,6878 ----
    (const_int 10)))))]
  )
 
+ (define_peephole2
+   [(set (match_operand:SI 0 "low_register_operand" "")
+ (match_operand:SI 1 "low_register_operand" ""))
+    (set (pc)
+ (if_then_else (match_operator 2 "arm_comparison_operator"
+       [(match_dup 1) (const_int 0)])
+      (label_ref (match_operand 3 "" ""))
+      (pc)))]
+   "TARGET_THUMB1"
+   [(parallel
+     [(set (pc)
+ (if_then_else (match_op_dup 2 [(match_dup 1) (const_int 0)])
+      (label_ref (match_dup 3))
+      (pc)))
+      (set (match_dup 0) (match_dup 1))])]
+   ""
+ )
+
+ ;; Sigh!  This variant shouldn't be needed, but combine often fails to
+ ;; merge cases like this because the op1 is a hard register in
+ ;; CLASS_LIKELY_SPILLED_P.
+ (define_peephole2
+   [(set (match_operand:SI 0 "low_register_operand" "")
+ (match_operand:SI 1 "low_register_operand" ""))
+    (set (pc)
+ (if_then_else (match_operator 2 "arm_comparison_operator"
+       [(match_dup 0) (const_int 0)])
+      (label_ref (match_operand 3 "" ""))
+      (pc)))]
+   "TARGET_THUMB1"
+   [(parallel
+     [(set (pc)
+ (if_then_else (match_op_dup 2 [(match_dup 1) (const_int 0)])
+      (label_ref (match_dup 3))
+      (pc)))
+      (set (match_dup 0) (match_dup 1))])]
+   ""
+ )
+
  (define_insn "*negated_cbranchsi4"
    [(set (pc)
  (if_then_else


Re: [PATCH: PR target/40835] Remove comparison with 0 after instruction movs for thumb

by Carrot Wei :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Great!  I didn't know there is an insn pattern "movsi_cbranchsi4" can
be used. When will you check in this patch?

thanks
Wei Guozhi


On Sat, Oct 31, 2009 at 9:38 PM, Richard Earnshaw <rearnsha@...> wrote:

>
> On Sat, 2009-10-31 at 20:56 +0800, Carrot Wei wrote:
>> You are right. Use sub can remove more cmp instructions. The patch
>> has been updated to reflect this.
>>
>> Test:
>> This patch was applied to trunk GCC and tested on the arm emulator with newlib.
>> No new failure.
>>
>> ChangeLog:
>> 2009-10-31  Wei Guozhi  <carrot@...>
>>
>>         PR target/40835
>>         * config/arm/arm.c (emit_branch_after_movs): New function.
>>         (removable_cmp_0): New function.
>>         * config/arm/arm-protos.h (emit_branch_after_movs): Declare it.
>>         (removable_cmp_0): Declare it.
>>         * config/arm/arm.md: Add peephole rule to do the optimization.
>>
>> 2009-10-31  Wei Guozhi  <carrot@...>
>>
>>         PR target/40835
>>         * gcc.target/arm/pr40835: New testcase.
>>
>
> This is needlessly complicated.  I've only tested the following by
> examination, but it's a much more robust solution.
>
> Note I wouldn't normally condone adding peepholes for the second case
> (where there's a natural data-flow dependency between the insns), but it
> turns out that in this scenario combine is being prevented from
> optimizing this case by the need to reduce register pressure and this is
> common enough to warrant an exception.
>
> R.
>

Re: [PATCH: PR target/40835] Remove comparison with 0 after instruction movs for thumb

by Carrot Wei :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

I've tested this patch without any new failure. If you agree I can
check in this patch and the new test case.

thanks
Wei Guozhi

On Sun, Nov 1, 2009 at 5:41 PM, Carrot Wei <carrot@...> wrote:

> Great!  I didn't know there is an insn pattern "movsi_cbranchsi4" can
> be used. When will you check in this patch?
>
> thanks
> Wei Guozhi
>
>
> On Sat, Oct 31, 2009 at 9:38 PM, Richard Earnshaw <rearnsha@...> wrote:
>>
>> On Sat, 2009-10-31 at 20:56 +0800, Carrot Wei wrote:
>>> You are right. Use sub can remove more cmp instructions. The patch
>>> has been updated to reflect this.
>>>
>>> Test:
>>> This patch was applied to trunk GCC and tested on the arm emulator with newlib.
>>> No new failure.
>>>
>>> ChangeLog:
>>> 2009-10-31  Wei Guozhi  <carrot@...>
>>>
>>>         PR target/40835
>>>         * config/arm/arm.c (emit_branch_after_movs): New function.
>>>         (removable_cmp_0): New function.
>>>         * config/arm/arm-protos.h (emit_branch_after_movs): Declare it.
>>>         (removable_cmp_0): Declare it.
>>>         * config/arm/arm.md: Add peephole rule to do the optimization.
>>>
>>> 2009-10-31  Wei Guozhi  <carrot@...>
>>>
>>>         PR target/40835
>>>         * gcc.target/arm/pr40835: New testcase.
>>>
>>
>> This is needlessly complicated.  I've only tested the following by
>> examination, but it's a much more robust solution.
>>
>> Note I wouldn't normally condone adding peepholes for the second case
>> (where there's a natural data-flow dependency between the insns), but it
>> turns out that in this scenario combine is being prevented from
>> optimizing this case by the need to reduce register pressure and this is
>> common enough to warrant an exception.
>>
>> R.
>>
>

Re: [PATCH: PR target/40835] Remove comparison with 0 after instruction movs for thumb

by Richard Earnshaw :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


On Tue, 2009-11-03 at 22:57 +0800, Carrot Wei wrote:
> I've tested this patch without any new failure. If you agree I can
> check in this patch and the new test case.
>

I've installed my patch along with your test.

R.

2009-11-04  Richard Earnshaw  <rearnsha@...>

        PR target/40835
        * arm.md (peephole2 patterns for move and compare): New.

2009-11-04  Wei Guozhi  <carrot@...>

        PR target/40835
        * gcc.target/arm/pr40835: New testcase.

> thanks
> Wei Guozhi
>
> On Sun, Nov 1, 2009 at 5:41 PM, Carrot Wei <carrot@...> wrote:
> > Great!  I didn't know there is an insn pattern "movsi_cbranchsi4" can
> > be used. When will you check in this patch?
> >
> > thanks
> > Wei Guozhi
> >
> >
> > On Sat, Oct 31, 2009 at 9:38 PM, Richard Earnshaw <rearnsha@...> wrote:
> >>
> >> On Sat, 2009-10-31 at 20:56 +0800, Carrot Wei wrote:
> >>> You are right. Use sub can remove more cmp instructions. The patch
> >>> has been updated to reflect this.
> >>>
> >>> Test:
> >>> This patch was applied to trunk GCC and tested on the arm emulator with newlib.
> >>> No new failure.
> >>>
> >>> ChangeLog:
> >>> 2009-10-31  Wei Guozhi  <carrot@...>
> >>>
> >>>         PR target/40835
> >>>         * config/arm/arm.c (emit_branch_after_movs): New function.
> >>>         (removable_cmp_0): New function.
> >>>         * config/arm/arm-protos.h (emit_branch_after_movs): Declare it.
> >>>         (removable_cmp_0): Declare it.
> >>>         * config/arm/arm.md: Add peephole rule to do the optimization.
> >>>
> >>> 2009-10-31  Wei Guozhi  <carrot@...>
> >>>
> >>>         PR target/40835
> >>>         * gcc.target/arm/pr40835: New testcase.
> >>>
> >>
> >> This is needlessly complicated.  I've only tested the following by
> >> examination, but it's a much more robust solution.
> >>
> >> Note I wouldn't normally condone adding peepholes for the second case
> >> (where there's a natural data-flow dependency between the insns), but it
> >> turns out that in this scenario combine is being prevented from
> >> optimizing this case by the need to reduce register pressure and this is
> >> common enough to warrant an exception.
> >>
> >> R.
> >>
> >


Re: [PATCH: PR target/40835] Remove comparison with 0 after instruction movs for thumb

by Richard Earnshaw :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


On Tue, 2009-11-03 at 22:57 +0800, Carrot Wei wrote:
> I've tested this patch without any new failure. If you agree I can
> check in this patch and the new test case.

I've made a small change to this test so that it won't cause spurious
fails when testing cortex-a8 (or other thumb-2 targets).

Applied to trunk.

2009-11-25  Richard Earnshaw  <rearnsha@...>

        * gcc.target/arm/pr40835.c: Require a thumb1 target, do not force
        -march=armv5te.



[testsuite.patch]

*** gcc.target/arm/pr40835.c (revision 154657)
--- gcc.target/arm/pr40835.c (local)
***************
*** 1,4 ****
! /* { dg-options "-mthumb -Os -march=armv5te" }  */
  /* { dg-final { scan-assembler-not "cmp" } } */
 
  int bar();
--- 1,5 ----
! /* { dg-options "-mthumb -Os" }  */
! /* { dg-require-effective-target arm_thumb1_ok } */
  /* { dg-final { scan-assembler-not "cmp" } } */
 
  int bar();