How to support 40bit GP register

Hi all,

I am porting GCC 4.4.0 for a 32bit target. The target has 40bit data registers and 32bit address registers that can be used as general purpose registers. When 40bit registers are used for arithmetic or comparison operations, GCC generates code assuming that they are 32bit registers. Whenever there is a move from an address register to a data register, sign extension is automatically performed by the target. Since the data registers are 40bit, after some operations sign/zero extension has to be performed for the result to be correct. Take the following test case for example:

    #include <stdlib.h>  /* abort, exit */

    typedef struct {
      char b0;
      char b1;
      char b2;
      char b3;
      char b4;
      char b5;
    } __attribute__ ((packed)) b_struct;

    typedef struct {
      short a;
      long b;
      short c;
      short d;
      b_struct e;
    } __attribute__ ((packed)) a_struct;

    int main(void)
    {
      volatile a_struct *a;
      volatile a_struct b;

      a = &b;
      *a = (a_struct){1,2,3,4};
      a->e.b4 = 'c';

      if (a->b != 2)
        abort ();

      exit (0);
    }

For accessing a->b GCC generates the following code:

    move.l (sp-16), d3
    lsrr.l #<16, d3
    move.l (sp-12),d2
    asll #<16,d2
    or d3,d2
    cmpeq.w #<2,d2
    jf _L2

Because the data registers are 40 bits wide, for the 'asll' operation the shift count should be 16+8, or there should be a sign extension from 32 bits to 40 bits after the 'or' operation. The target has an instruction to sign extend from 32 bits to 40 bits.

Similarly, there are other operations that require sign/zero extension. So is there any way to tell GCC that the data registers are 40bit and thereby have it generate sign/zero extension accordingly?

Regards,
Shafi
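To make the failure concrete, here is a minimal C model of the instruction sequence above. It is only a sketch, and it rests on assumptions the post does not spell out: a little-endian layout, move.l loading 32 bits and sign-extending them into the 40-bit register, lsrr.l shifting only the low 32 bits, and asll, or and the compare acting on all 40 bits. All helper names (reg40, sext32_to_40, load32, lsr32, asl40, load_b) are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define REG_MASK UINT64_C(0xFFFFFFFFFF)   /* a 40-bit data register */

/* Truncate a value to what a 40-bit register can hold. */
static uint64_t reg40(uint64_t v) { return v & REG_MASK; }

/* Model of the 32-to-40-bit sign-extend instruction. */
static uint64_t sext32_to_40(uint64_t v)
{
    uint64_t low = v & UINT64_C(0xFFFFFFFF);
    if (low & UINT64_C(0x80000000))
        low |= UINT64_C(0xFF00000000);
    return low;
}

/* move.l: 32-bit load, sign-extended into the register (assumed). */
static uint64_t load32(uint32_t word) { return sext32_to_40(word); }

/* lsrr.l: logical right shift of the low 32 bits (assumed). */
static uint64_t lsr32(uint64_t r, unsigned n)
{
    return (r & UINT64_C(0xFFFFFFFF)) >> n;
}

/* asll: left shift across the whole 40-bit register (assumed). */
static uint64_t asl40(uint64_t r, unsigned n) { return reg40(r << n); }

/* The generated sequence for a->b; returns d2 as the compare sees it.
   word_lo holds a and the low half of b, word_hi the high half of b and c.
   With fix != 0, a sign extension from 32 bits is applied after the 'or'. */
static uint64_t load_b(uint32_t word_lo, uint32_t word_hi, int fix)
{
    uint64_t d3 = lsr32(load32(word_lo), 16);
    uint64_t d2 = asl40(load32(word_hi), 16);
    d2 |= d3;
    return fix ? sext32_to_40(d2) : d2;
}
```

With the values from the test case (a = 1, b = 2, c = 3), the two words are 0x00020001 and 0x00030000. Without the fix, the shift pushes c into bits 32..39 and d2 becomes 0x0300000002, so a 40-bit compare against 2 fails. With the extra sign extension, d2 is 2.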
Re: How to support 40bit GP register

On 10/21/2009 07:25 AM, Mohamed Shafi wrote:
> For accessing a->b GCC generates the following code:
>
>     move.l (sp-16), d3
>     lsrr.l #<16, d3
>     move.l (sp-12),d2
>     asll #<16,d2
>     or d3,d2
>     cmpeq.w #<2,d2
>     jf _L2
>
> Because data registers are 40 bit for 'asll' operation the shift count
> should be 16+8 or there should be sign extension from 32bit to 40 bits
> after the 'or' operation. The target has instruction to sign extend
> from 32bit to 40 bit.
>
> Similarly there are other operation that requires sign/zero extension.
> So is there any way to tell GCC that the data registers are 40bit and
> there by expect it to generate sign/zero extension accordingly ?

Define a machine mode for your 40-bit type in cpu-modes.def. Depending on how your 40-bit type is stored in memory, you'll use either

    INT_MODE (RI, 5)                 // load-store uses exactly 5 bytes
    FRACTIONAL_INT_MODE (RI, 40, 8)  // load-store uses 8 bytes

where I've arbitrarily chosen "RImode" as a mnemonic for Register Integral Mode.

Now you define arithmetic operations, as needed, on RImode. You define the "extendsiri" pattern to be that sign-extend from 32-to-40-bit instruction. You define your comparison patterns on RImode, and not on SImode, since your comparison instruction works on the entire 40 bits.

You'll wind up with a selection of patterns in your machine description that have a sign-extension pattern built in, depending on the exact behaviour of your ISA. There are plenty of examples on x86_64, mips64, and Alpha (to name a few) that have similar properties with SI and DImodes. Examine the -fdump-rtl-combine-details dump for exemplars of the canonical forms that the combiner creates when it tries to merge sign-extension instructions into preceding patterns.

r~
Re: How to support 40bit GP register

2009/10/22 Richard Henderson <rth@...>:
> On 10/21/2009 07:25 AM, Mohamed Shafi wrote:
>>
>> For accessing a->b GCC generates the following code:
>>
>>     move.l (sp-16), d3
>>     lsrr.l #<16, d3
>>     move.l (sp-12),d2
>>     asll #<16,d2
>>     or d3,d2
>>     cmpeq.w #<2,d2
>>     jf _L2
>>
>> Because data registers are 40 bit for 'asll' operation the shift count
>> should be 16+8 or there should be sign extension from 32bit to 40 bits
>> after the 'or' operation. The target has instruction to sign extend
>> from 32bit to 40 bit.
>>
>> Similarly there are other operation that requires sign/zero extension.
>> So is there any way to tell GCC that the data registers are 40bit and
>> there by expect it to generate sign/zero extension accordingly ?
>
> Define a machine mode for your 40-bit type in cpu-modes.def. Depending on
> how your 40-bit type is stored in memory, you'll use either
>
>     INT_MODE (RI, 5) // load-store uses exactly 5 bytes
>     FRACTIONAL_INT_MODE (RI, 40, 8) // load-store uses 8 bytes
>

Load-store uses 32 bits. Sign extension happens automatically. So I have chosen INT_MODE (RI, 5), copied movsi and renamed it to movri. I have also specified that RImode needs only one register.

> Where I've arbitrarily chosen "RImode" as a mnemonic for Register Integral
> Mode. Now you define arithmetic operations, as needed, on
> RImode. You define the "extendsiri" pattern to be that sign-extend from
> 32-to-40-bit instruction. You define your comparison patterns on RImode,
> and not on SImode, since your comparison instruction works on the entire 40
> bits.

I have defined extendsiri and cbranchri4 patterns. When I compile a program like

    #include <stdlib.h>  /* abort */

    unsigned long xh = 1;

    int main ()
    {
      unsigned long yh = 0xffffull;
      unsigned long z = xh * yh;

      if (z != yh)
        abort ();

      return 0;
    }

I get the following ICE:

    internal compiler error: in immed_double_const, at emit-rtl.c:553

This happens when cse_insn () calls insert () -> gen_lowpart -> gen_lowpart_common -> simplify_gen_subreg -> simplify_immed_subreg. simplify_immed_subreg is called with the parameters (outermode=RImode, (const_int 65535), innermode=DImode, byte=0).

cse_insn is called for the following insn:

    (insn 10 9 11 3 bug7.c:14 (set (reg:RI 67)
            (const_int 65535 [0xffff])) 4 {movri} (nil))

How can I overcome this?

Regards,
Shafi

> You'll wind up with a selection of patterns in your machine description that
> have a sign-extension pattern built in, depending on the exact behaviour of
> your ISA. There are plenty of examples on x86_64, mips64, and Alpha (to
> name a few) that have similar properties with SI and DImodes. Examine the
> -fdump-rtl-combine-details dump for exemplars of the canonical forms that
> the combiner creates when it tries to merge sign-extension instructions into
> preceding patterns.
Re: How to support 40bit GP register

Mohamed Shafi wrote:
> Load-store uses 32bits. Sign extension happens automatically. So i
> have choosen INT_MODE (RI, 5) and copied movsi and renamed it to
> movri. I have also specified that RImode need only one register.

> I get the following ICE
>
> internal compiler error: in immed_double_const, at emit-rtl.c:553
>
> This happens from cse_insn () calls insert() -> gen_lowpart ->
> gen_lowpart_common -> simplify_gen_subreg -> simplify_immed_subreg.
> simplify_immed_subreg is called with the parameters (outermode=RImode,
> (const_int 65535), innermode=DImode, byte=0)
>
> cse_insn is called for the following insn
>
> (insn 10 9 11 3 bug7.c:14 (set (reg:RI 67)
>         (const_int 65535 [0xffff])) 4 {movri} (nil))
>
>
> How can i overcome this?

Just from reading the source for immed_double_const, I see:

>   /* There are the following cases (note that there are no modes with
>      HOST_BITS_PER_WIDE_INT < GET_MODE_BITSIZE (mode) < 2 * HOST_BITS_PER_WIDE_INT):

Oops. That's no longer true if HBPWI == 32 and your new mode has 40 bits.

>   gcc_assert (GET_MODE_BITSIZE (mode) == 2 * HOST_BITS_PER_WIDE_INT);

I would guess that assert is firing.

>   /* If this integer fits in one word, return a CONST_INT.  */
>   if ((i1 == 0 && i0 >= 0) || (i1 == ~0 && i0 < 0))
>     return GEN_INT (i0);

Here you'll want to mask out and check only the low 8 (== 40 - 32, i.e. GET_MODE_BITSIZE(mode) - HOST_BITS_PER_WIDE_INT) bits of i1, I think. The rest of the code looks like it should work.

cheers,
DaveK
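The masking idea above can be sketched as a standalone C function. This is not GCC's code and not a patch to immed_double_const. The helper name fits_in_one_word is hypothetical, and the sketch assumes a 32-bit HOST_WIDE_INT and that only the low (bitsize - 32) bits of the high word i1 are significant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HOST_BITS_PER_WIDE_INT 32   /* assumption: 32-bit host wide int */

/* Hypothetical helper: returns true if the pair (i1:i0), read as a value of
   `bitsize` bits with 32 < bitsize <= 64, is just the sign extension of i0,
   so that it could be represented by i0 alone.  */
static bool fits_in_one_word(int32_t i0, int32_t i1, unsigned bitsize)
{
    /* Number of significant bits in the high word: 8 for a 40-bit mode.  */
    unsigned high_bits = bitsize - HOST_BITS_PER_WIDE_INT;

    /* Avoid an undefined shift by 32 when the mode is a full double word.  */
    uint32_t mask = high_bits >= 32 ? UINT32_MAX
                                    : (UINT32_C(1) << high_bits) - 1;

    /* Ignore anything in i1 above the mode's precision.  */
    uint32_t hi = (uint32_t)i1 & mask;

    return (hi == 0 && i0 >= 0) || (hi == mask && i0 < 0);
}
```

For a 64-bit mode the mask covers all of i1 and the test reduces to the original `(i1 == 0 && i0 >= 0) || (i1 == ~0 && i0 < 0)` check. For a 40-bit mode only the low 8 bits of i1 are examined.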
Re: How to support 40bit GP register

On 11/04/2009 05:34 AM, Mohamed Shafi wrote:
> Load-store uses 32bits. Sign extension happens automatically. So i
> have choosen INT_MODE (RI, 5) and copied movsi and renamed it to
> movri. I have also specified that RImode need only one register.

This isn't going to work. In order to get correct code, you're going to need to be able to spill and reload the full 40-bit value. If you can't do this easily... then I'm afraid we'll have to find a more complicated solution which involves only exposing RImode values after register allocation.

> internal compiler error: in immed_double_const, at emit-rtl.c:553

Hmm. This is a nasty little logic error. The quickest work-around for the problem is to set need_64bit_hwint in config.gcc.

r~
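The spill concern can be illustrated with a small C model. It assumes, as described earlier in the thread, that a store writes only 32 bits and a load sign-extends into the 40-bit register. The function name spill_reload_32 is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model of spilling a 40-bit register through a 32-bit stack slot and
   reloading it with automatic sign extension (both assumed behaviours).  */
static uint64_t spill_reload_32(uint64_t reg40)
{
    uint32_t slot = (uint32_t)reg40;      /* 32-bit store drops bits 32..39 */
    uint64_t v = slot;                    /* 32-bit load ...                */
    if (v & UINT64_C(0x80000000))         /* ... sign-extended to 40 bits   */
        v |= UINT64_C(0xFF00000000);
    return v;
}
```

A register holding a sign-extended 32-bit value such as 0xFFFFFFFFFE survives the round trip. A genuine 40-bit value such as 0x0300000002 comes back as 2, which is why an RImode value that lives across a spill needs a full-width store and load.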
Re: How to support 40bit GP register

2009/10/22 Richard Henderson <rth@...>:
> You'll wind up with a selection of patterns in your machine description that
> have a sign-extension pattern built in, depending on the exact behaviour of
> your ISA. There are plenty of examples on x86_64, mips64, and Alpha (to
> name a few) that have similar properties with SI and DImodes. Examine the
> -fdump-rtl-combine-details dump for exemplars of the canonical forms that
> the combiner creates when it tries to merge sign-extension instructions into
> preceding patterns.

Ok, I have comparison patterns written in RImode. When you say that I will wind up with a selection of patterns, do you mean that I should have patterns for operations that operate on the full 40 bits in RImode and disable the corresponding SImode patterns? Or is it that I have to write nameless patterns in RImode for arithmetic operations and look at the dumps to see how the combiner will merge the patterns, so that it can match the comparison operations?

Regards,
Shafi
Re: How to support 40bit GP register

On 11/09/2009 06:13 AM, Mohamed Shafi wrote:
> Ok i have comparison patterns written in RImode. When you say that i
> will wind up with a selection of patterns do you mean to say that i
> should have patterns for operations that operate on full 40bits in
> RImode and disable the corresponding SImode patterns? Or is it that i
> have to write nameless patterns in RImode for arithmetic operations
> and look at the dumps to see how the combiner will merge the patterns
> so that it can match the comparison operations?

The latter.

r~