New Opteron box, dedicated to PostgreSQL

by Axel Rau

Hi,

while configuring my 1st PostgreSQL box with dual Opterons (2212) on  
FreeBSD, I have some questions:

1. Are there any precautions in fbsd to prevent excessive page
switching between the memory nodes?
    E.g. what happens if file cache pages (heavily used by Pg,
planned 6GB) are frequently not found on the local node?
2. If there is no good solution for this problem, does it make sense  
to stay with one dualcore (soon quadcore) processor?
3. What are the recommendations for tuning I/O?
    - setting sysctl vfs.read_max to 16 or 32
    - rebuilding the relevant filesystem with 32K blocks and 4K frags
    Are these reliable?
4. Can I install the system on an Areca raid?

Thanks, Axel
---------------------------------------------------------------------
Axel Rau, ☀Frankfurt , Germany                       +49 69 9514 18 0


_______________________________________________
freebsd-database@... mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-database
To unsubscribe, send any mail to "freebsd-database-unsubscribe@..."

RE: New Opteron box, dedicated to PostgreSQL

by Jan Mikkelsen-3

Hi,

Axel Rau wrote:
 
> 1. Are there any precautions in fbsd to prevent from excessive page  
> switching between the memory nodes?
>     E.g. what happens if file cache pages (heavily used by Pg,  
> planned 6GB) frequently not found on local node?

Not that I'm aware of.  FreeBSD doesn't have NUMA awareness.

> 2. If there is no good solution for this problem, does it make sense  
> to stay with one dualcore (soon quadcore) processor?

Benchmark and see.

> 4. Can I install the system on an Areca raid?

Yes.  If you are planning to use 6.2-RELEASE, make sure that you get the
arcmsr driver from 6-STABLE.  The arcmsr driver in 6.2-RELEASE can show
I/O errors under high load.

Regards,

Jan Mikkelsen

Re: New Opteron box, dedicated to PostgreSQL

by Oliver Fromme

Axel Rau wrote:
 > while configuring my 1st PostgreSQL box with dual Opterons (2212) on  
 > FreeBSD, I have some questions:
 > [...]
 > 3. What are the recommendations for tuning I/O)?
 >     - setting sysctl vfs.read_max to 16 or 32
 >     - rebuilding the relevant filesystem with 32K blocks and 4K frags
 >     Are these reliable?

Personally I would leave the FS parameters at the defaults.
The bsize/fsize defaults of 16k/2k are actually quite
well-suited for PostgreSQL, as far as I can tell.  (If there
are people who have had different experiences, I'd like to
hear about it.)

If you expect that only few, large tables will be used,
then reducing the inode density (i.e. increasing the value
of the -i option to newfs) might be a good idea.  Note that
PostgreSQL stores each object (table, index etc.) in its
own file.
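
As a rough illustration of the inode-density point: newfs allocates about one inode per -i bytes of data space, so raising -i shrinks the inode tables. The sketch below is plain arithmetic only. The 500 GiB partition size and the 8192-byte baseline (four times the 2k fragment size mentioned above) are assumptions made for the example, not figures from this thread.

```python
# Rough inode-count estimate for a UFS file system at different newfs -i values.
# Assumptions (not from the thread): a 500 GiB data partition and a baseline
# density of one inode per 8192 bytes (4 x a 2 KiB fragment).

GIB = 1024 ** 3


def estimated_inodes(fs_bytes: int, bytes_per_inode: int) -> int:
    """Approximate inode count; ignores cylinder-group rounding done by newfs."""
    if bytes_per_inode <= 0:
        raise ValueError("bytes_per_inode must be positive")
    return fs_bytes // bytes_per_inode


if __name__ == "__main__":
    fs_bytes = 500 * GIB  # hypothetical partition size
    for density in (8192, 65536, 262144):
        print(f"-i {density:>6}: ~{estimated_inodes(fs_bytes, density):,} inodes")
```

Since PostgreSQL keeps each table and index in its own file, even the sparsest setting in this example leaves far more inodes than a database with a few thousand objects would use.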

Be sure to disable background fsck via /etc/rc.conf.  I've
seen it breaking file systems under certain circumstances.
Instead, you might want to give pjd's new gjournal
code a try (it's in -current, but I think there's a port
to RELENG_6, too).  On a related note, setting up gmirror
for RAID-1 has worked very well for me, including postgres
machines (the balance algorithm "load" seems to work best
with pgsql databases).

Of course, the usual PostgreSQL tuning advice applies, e.g.
increase the SysV IPC resources (i.e. kernel parameters
for semaphores and shared memory), optimize postgresql.conf
etc.  There's currently a known bottleneck regarding SysV
IPC on FreeBSD, if a lot of processes are waiting on the
same semaphore, which can affect PostgreSQL under very high
load.  I don't know what the status of that is, but I think
nobody is currently working on a fix.

Best regards
   Oliver

--
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Commercial register: Registergericht München, HRA 74606; management:
secnetix Verwaltungsgesellsch. mbH, commercial register: Registergericht
München, HRB 125758; managing directors: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD services, products and more:  http://www.secnetix.de/bsd

PI:
int f[9814],b,c=9814,g,i;long a=1e4,d,e,h;
main(){for(;b=c,c-=14;i=printf("%04d",e+d/a),e=d%a)
while(g=--b*2)d=h*b+a*(i?f[b]:a/5),h=d/--g,f[b]=d%g;}

Re: New Opteron box, dedicated to PostgreSQL

by kometen

> while configuring my 1st PostgreSQL box with dual Opterons (2212) on
> FreeBSD, I have some questions:
>
> 3. What are the recommendations for tuning I/O)?
>     - setting sysctl vfs.read_max to 16 or 32
>     - rebuilding the relevant filesystem with 32K blocks and 4K frags
>     Are these reliable?

The following were suggestions from Vivek Khera:


/boot/loader.conf:

kern.ipc.semmni=256
kern.ipc.semmns=2048


/etc/sysctl.conf:

kern.ipc.somaxconn=2048
kern.maxfiles=65536

# Suggestions by Vivek Khera from postgresql performance
kern.ipc.shm_use_phys=1
kern.ipc.shmmax=1073741824
kern.ipc.shmall=262144
kern.ipc.semmsl=512
kern.ipc.semmap=256


This is on a four-way Woodcrest at 3 GHz with 16 GB RAM, so you may want
to adjust the values.
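
When adjusting these values, it may help to keep shmmax and shmall in step: shmmax is given in bytes and shmall in pages. A minimal sketch of that check follows, assuming the usual 4096-byte page size on i386/amd64; the helper names are made up for the example.

```python
# Check that kern.ipc.shmall (pages) covers kern.ipc.shmmax (bytes).
# Assumption: 4096-byte pages, as on i386/amd64.

PAGE_SIZE = 4096


def pages_needed(shmmax_bytes: int, page_size: int = PAGE_SIZE) -> int:
    """Smallest page count whose total size is at least shmmax_bytes."""
    return -(-shmmax_bytes // page_size)  # ceiling division


def is_consistent(shmmax_bytes: int, shmall_pages: int) -> bool:
    """True if shmall allows at least one segment of the maximum size."""
    return shmall_pages >= pages_needed(shmmax_bytes)


if __name__ == "__main__":
    shmmax, shmall = 1073741824, 262144  # values quoted above
    print(pages_needed(shmmax), is_consistent(shmmax, shmall))
    # Halving both for a smaller machine keeps them consistent.
    print(is_consistent(shmmax // 2, shmall // 2))
```

With the values above, 1073741824 bytes is exactly 262144 pages, so the two settings describe the same 1 GiB limit.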

regards
Claus

Re: New Opteron box, dedicated to PostgreSQL

by Astrodog

Jan Mikkelsen wrote:

> Hi,
>
> Axel Rau wrote:
>  
>  
>> 1. Are there any precautions in fbsd to prevent from excessive page  
>> switching between the memory nodes?
>>     E.g. what happens if file cache pages (heavily used by Pg,  
>> planned 6GB) frequently not found on local node?
>>    
>
> Not that I'm aware of.  FreeBSD doesn't have NUMA awareness.
>  
FreeBSD does not support this. I've been working on the scheduler side
of things, but I do not understand the VM well enough to do that side of it.
>  
>> 2. If there is no good solution for this problem, does it make sense  
>> to stay with one dualcore (soon quadcore) processor?
>>    
No. Two memory controllers beat one, even with the remote-access penalty occurring.

The difference either way will be fairly trivial, though.

>
> Benchmark and see.
>
>  
>> 4. Can I install the system on an Areca raid?
>>    
>
> Yes.  If you are planning to use 6.2-RELEASE, make sure that you get the
> arcmsr driver from 6-STABLE.  The arcmsr driver in 6.2-RELEASE can show
> I/O errors under high load.
>
> Regards,
>
> Jan Mikkelsen

Re: New Opteron box, dedicated to PostgreSQL

by Axel Rau


On 16.03.2007 at 00:11, Jan Mikkelsen wrote:
>
> Not that I'm aware of.  FreeBSD doesn't have NUMA awareness.
And fbsd does not allow tying, say, a process group to a CPU?
>
>> 2. If there is no good solution for this problem, does it make sense
>> to stay with one dualcore (soon quadcore) processor?
>
> Benchmark and see.
Would be nice to know before I order the stuff.
>
>> 4. Can I install the system on an Areca raid?
>
> Yes.  If you are planning to use 6.2-RELEASE, make sure that you  
> get the
> arcmsr driver from 6-STABLE.  The arcmsr driver in 6.2-RELEASE can  
> show
> I/O errors under high load.
That's valuable info. Thanks,

Axel
---------------------------------------------------------------------
Axel Rau, ☀Frankfurt , Germany                       +49 69 9514 18 0


Re: New Opteron box, dedicated to PostgreSQL

by Axel Rau


On 16.03.2007 at 09:08, Oliver Fromme wrote:

> Be sure to disable background fsck via /etc/rc.conf.  I've
> seen it breaking file systems under certain circumstances.
Good to know. What about softupdates?
> Instead, you might want want to give PWD's new gjournal
> code a try (it's in -current, but I think there's a port
> to RELENG_6, too).
I will have a look at it.

Axel
---------------------------------------------------------------------
Axel Rau, ☀Frankfurt , Germany                       +49 69 9514 18 0


Re: New Opteron box, dedicated to PostgreSQL

by Vick Khera


On Mar 16, 2007, at 7:17 AM, Claus Guttesen wrote:

>> while configuring my 1st PostgreSQL box with dual Opterons (2212) on
>> FreeBSD, I have some questions:
>>
>> 3. What are the recommendations for tuning I/O)?
>>     - setting sysctl vfs.read_max to 16 or 32
>>     - rebuilding the relevant filesystem with 32K blocks and 4K frags
>>     Are these reliable?
>
> The following were suggetions from Vivek Khera:

I've recently bumped the shmall and shmmax on my dual Opteron with
16GB of RAM, and increased the shared buffers correspondingly.

The max shmall you can set on freebsd (at least 6.1) is 2147483647,  
so I set shmall to 524288 to correspond.  This supports 250000 shared  
buffers and 100 max connections.  Might support more, but definitely  
not 260000.
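
The numbers above can be sanity-checked with a little arithmetic. The figure 2147483647 reads like a byte limit (it is 2^31 - 1, and 524288 pages of 4096 bytes come to 2 GiB), but that reading is an assumption. The sketch also assumes 8 KiB PostgreSQL pages and does not model the other structures PostgreSQL keeps in shared memory, so it only shows how much headroom the buffer pages leave.

```python
# Buffer-pool size implied by a shared_buffers count, versus a segment limit.
# Assumptions: 8 KiB PostgreSQL pages and a 2147483647-byte ceiling on the
# shared memory segment; PostgreSQL's other shared structures are not modelled.

BLOCK_SIZE = 8192
SEGMENT_LIMIT = 2147483647


def buffer_pool_bytes(shared_buffers: int, block_size: int = BLOCK_SIZE) -> int:
    """Bytes taken by the buffer pages alone."""
    return shared_buffers * block_size


def headroom_bytes(shared_buffers: int, limit: int = SEGMENT_LIMIT) -> int:
    """Bytes left under the limit for everything else in the segment."""
    return limit - buffer_pool_bytes(shared_buffers)


if __name__ == "__main__":
    for n in (250000, 260000):
        print(n, buffer_pool_bytes(n), headroom_bytes(n))
```

Under these assumptions 250000 buffers leave roughly 95 MiB below the limit and 260000 leave under 17 MiB, which is at least consistent with the larger setting not fitting once per-buffer and per-connection overhead is added.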

I'm also using vfs.read_max=32 but I haven't really tested if it  
makes a big difference in formal benchmarks.
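
To put vfs.read_max in perspective, here is a small sketch of the read-ahead window it implies. It assumes the sysctl counts file system blocks; the block sizes are the 16k default and the 32k value discussed earlier in the thread.

```python
# Read-ahead window implied by vfs.read_max.
# Assumption: the sysctl is a count of file system blocks.


def readahead_bytes(read_max: int, block_size: int) -> int:
    """Maximum sequential read-ahead in bytes for a given block size."""
    return read_max * block_size


if __name__ == "__main__":
    for read_max in (16, 32):
        for block_size in (16 * 1024, 32 * 1024):
            kib = readahead_bytes(read_max, block_size) // 1024
            print(f"read_max={read_max} bsize={block_size // 1024}k -> {kib} KiB")
```

Under that assumption, read_max=32 on 16k blocks and read_max=16 on 32k blocks both give a 512 KiB window, so the two tuning knobs in the original question overlap to some degree.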

The other day I was having some I/O overload, so I tried setting  
vfs.hirunningspace to 3K but it didn't solve my immediate problem.  
I've left that setting for now.  Doesn't seem to really make a big  
difference.

I find that the Adaptec 2230SLP RAID controllers are not able to keep  
up with my load, but the LSI 320-2X is.  I'm currently investigating  
external arrays attached via fibre for some boost.


Re: New Opteron box, dedicated to PostgreSQL

by Axel Rau


On 16.03.2007 at 21:04, Vivek Khera wrote:

>
> On Mar 16, 2007, at 7:17 AM, Claus Guttesen wrote:
>
>>> while configuring my 1st PostgreSQL box with dual Opterons (2212) on
>>> FreeBSD, I have some questions:
>>>
>>> 3. What are the recommendations for tuning I/O)?
>>>     - setting sysctl vfs.read_max to 16 or 32
>>>     - rebuilding the relevant filesystem with 32K blocks and 4K  
>>> frags
>>>     Are these reliable?
>>
>> The following were suggetions from Vivek Khera:
>
> I've recently bumped the shmall and shmmax on my dual opteron with  
> 16GB of RAM, and increased correspondingly the shared buffers.
>
> The max shmall you can set on freebsd (at least 6.1) is 2147483647,  
> so I set shmall to 524288 to correspond.  This supports 250000  
> shared buffers and 100 max connections.  Might support more, but  
> definitely not 260000.
This box will start with 8GB of RAM and should also support 100
connections, so I will try half of your values (pretty much your settings
as posted by Claus Guttesen <kometen@...>).

>
> I'm also using vfs.read_max=32 but I haven't really tested if it  
> makes a big difference in formal benchmarks.
>
> The other day I was having some I/O overload, so I tried setting  
> vfs.hirunningspace to 3K but it didn't solve my immediate problem.  
> I've left that setting for now.  Doesn't seem to really make a big  
> difference.
>
> I find that the adaptec 2230SLP RAID controllers are not able to  
> keep up with my load, but the LSI 320-2X is.  I'm currently  
> investigating external arrays attached via fibre for some boost.
This box will have an Areca ARC-1261ML (RAID 1 for OS and WAL, RAID 0  
with 7xRAID1 for pg_data).
Any hints besides the usual partition alignment and stripe size of
128kB?
Do you use ufs2 with softupdates?

regards, Axel
---------------------------------------------------------------------
Axel Rau, ☀Frankfurt , Germany                       +49 69 9514 18 0


Re: New Opteron box, dedicated to PostgreSQL

by Axel Rau


On 16.03.2007 at 12:17, Claus Guttesen wrote:

>
> /boot/loader.conf:
>
> kern.ipc.semmni=256
> kern.ipc.semmns=2048
>
>
> /etc/sysctl.conf:
>

> kern.ipc.somaxconn=2048
This one is new to me.

>
> # Suggestions by Vivek Khera from postgresql performance
> kern.ipc.shm_use_phys=1
> kern.ipc.shmmax=1073741824
> kern.ipc.shmall=262144
> kern.ipc.semmsl=512
> kern.ipc.semmap=256
>
>
> This is on a four-way woodcrest at 3 GHz and 16 GB ram so you may want
> to adjust the values.

Thank you for this valuable info. I will use it as a starting point.

Regards, Axel
---------------------------------------------------------------------
Axel Rau, ☀Frankfurt , Germany                       +49 69 9514 18 0


Re: New Opteron box, dedicated to PostgreSQL

by Oliver Fromme

Axel Rau wrote:
 > Oliver Fromme wrote:
 >
 > > Be sure to disable background fsck via /etc/rc.conf.  I've
 > > seen it breaking file systems under certain circumstances.
 >
 > Good to know. What about softupdates?

It makes a bit of a difference, but not a big one.  Note
that you should enable fsync in the PostgreSQL config
anyway, which defeats the purpose of softupdates somewhat.

If you disable fsync (and you know what you're doing),
using softupdates will probably improve i/o performance
significantly, but I haven't tried that.  Note that you
might lose data if you set up your DB server that way.

Best regards
   Oliver

PS:  Please respect the Reply-To header.  I don't need
to receive a separate copy of the mail; I'm reading the
list.

--
"If you think C++ is not overly complicated, just what is a protected
abstract virtual base pure virtual private destructor, and when was the
last time you needed one?"
        -- Tom Cargill, C++ Journal

Re: New Opteron box, dedicated to PostgreSQL

by Vick Khera


On Mar 18, 2007, at 7:51 AM, Axel Rau wrote:

> This box will have an Areca ARC-1261ML (RAID 1 for OS and WAL, RAID  
> 0 with 7xRAID1 for pg_data).
> Any hints beside the usual partition alignment and stripe size of  
> 128kB ?
> Do you use ufs2 with softupdates?

You don't value your data?  Why not RAID10?

I use UFS2 with softupdates.  I generally use the default RAID stripe  
sizes. Postgres works in 8k pages, so if you have a lot of locality  
in your db reference, larger stripes might help.  I don't know.
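
For a feel of how 8k pages relate to the stripe size, the following sketch maps a page offset on the array to a stripe member. It assumes the 128 kB stripe unit and the seven mirrored pairs mentioned in the question, a data area aligned to a stripe boundary, and page numbers counted as raw offsets on the array (the file system's own block placement is ignored).

```python
# Which member of a stripe set holds a given 8 KiB page?
# Assumptions: 128 KiB stripe unit, 7 mirrored pairs striped together,
# data area aligned to a stripe boundary, pages numbered by array offset.

PAGE = 8 * 1024
STRIPE_UNIT = 128 * 1024
MEMBERS = 7


def pages_per_stripe_unit(stripe_unit: int = STRIPE_UNIT, page: int = PAGE) -> int:
    """How many database pages fit into one stripe unit."""
    return stripe_unit // page


def member_for_page(page_no: int, stripe_unit: int = STRIPE_UNIT,
                    page: int = PAGE, members: int = MEMBERS) -> int:
    """Index of the stripe member (mirrored pair) that stores this page."""
    return (page_no * page // stripe_unit) % members


if __name__ == "__main__":
    print("pages per stripe unit:", pages_per_stripe_unit())
    for page_no in (0, 15, 16, 111, 112):
        print(f"page {page_no:>3} -> member {member_for_page(page_no)}")
```

With these figures, 16 consecutive pages share a stripe unit, so a scan with good locality stays on one pair for 128 kB at a time while scattered single-page reads spread across all seven pairs.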

Re: New Opteron box, dedicated to PostgreSQL

by Vick Khera


On Mar 19, 2007, at 11:38 AM, Vivek Khera wrote:

> You don't value your data?  Why not RAID10?

eek  never mind... re-read your message and saw that is what you're  
doing...

Re: New Opteron box, dedicated to PostgreSQL

by Ivan Voras-2

Axel Rau wrote:

>> kern.ipc.somaxconn=2048
> This one is new to me.

Shouldn't make a difference. It's the TCP "backlog" size: how many clients
will be allowed to wait for the server's accept() call. It only matters
if you have a *really* high rate of connections/sec, and then only up to
the maximum number of connections you allow. It's practically never a
factor in performance.
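
The backlog described here is the value a server passes to listen(). A minimal sketch with Python's standard socket module shows where it sits; the function name and the loopback setup are made up for the example, and the note that the kernel caps the requested value at kern.ipc.somaxconn describes FreeBSD behaviour as I understand it.

```python
import socket


def accept_one(backlog: int = 128) -> bool:
    """Open a listening socket, connect one client, and accept it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen(backlog)         # requested queue length; the kernel may cap it
        server.settimeout(5)
        port = server.getsockname()[1]
        with socket.create_connection(("127.0.0.1", port), timeout=5):
            # The handshake completes in the kernel; the connection then
            # waits in the accept queue until accept() picks it up.
            conn, _addr = server.accept()
            conn.close()
            return True


if __name__ == "__main__":
    print(accept_one())
```

Only connections that arrive faster than the server calls accept() ever queue up, which is why the setting rarely matters for a database with a bounded number of clients.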
