Re: Core team statement on replication in PostgreSQL
On 5/29/08, Joshua D. Drake <jd@...> wrote:
> On Thu, 2008-05-29 at 08:21 -0700, David Fetter wrote:
> > On Thu, May 29, 2008 at 10:12:55AM -0400, Tom Lane wrote:
> > This part is a deal-killer. It's a giant up-hill slog to sell warm
> > standby to those in charge of making resources available because the
> > warm standby machine consumes SA time, bandwidth, power, rack space,
> > etc., but provides no tangible benefit, and this feature would have
> > exactly the same problem.
> >
> > IMHO, without the ability to do read-only queries on slaves, it's not
> > worth doing this feature at all.
>
> The only question I have is... what does this give us that PITR doesn't
> give us?

Tom is talking about synchronous WAL replication. So you can do lossless failover. Currently there is no good solution for this.

And it needs to live in the core backend. Yes, it could somehow be implemented by filling the backend with hooks, but the question is how it would stay in sync with changes in the core backend after a couple of releases. The WAL writing and txid/snapshot handling receive heavy changes in each release.

No external project that needs deep hooks has been able to keep pace with core changes thus far, unless heavily commercially backed, which means not open source. Companies can tell you the price they pay for such syncing.

The other solution would indeed be to have fixed hooks guaranteed to be stable between releases (replica-hooks-discuss?). But that would mean limiting the changes we can make to the WAL-writing/snapshot handling code, and that does not seem like an attractive solution.

By having replication code that ties tightly into core code included in the main Postgres source, we are still free to make any changes we feel like and are not tied to external API promises.

--
marko

--
Sent via pgsql-hackers mailing list (pgsql-hackers@...)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
|
|
Re: Core team statement on replication in PostgreSQL
Josh Berkus wrote:
> Bruce,
>
> > Another idea I discussed with Tom is having the slave _delay_ applying
> > WAL files until all slave snapshots are ready.
> >
>
> Well, again, that only works for async mode. I personally think that's
> the correct solution for async. But for synch mode, I think we need to
> push the xids back to the master; generally if a user is running in
> synch mode they're concerned about failover time and zero data loss, so
> holding back the WAL files doesn't make sense.

You send the WAL to the slave, but the slave doesn't apply them right away --- it isn't related to async.

> Also, if you did delay applying WAL files on an async slave, you'd reach
> a point (perhaps after a 6-hour query) where it'd actually be cheaper to
> rebuild the slave than to apply the pent-up WAL files.

True.

--
Bruce Momjian <bruce@...> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
|
|
Re: Core team statement on replication in PostgreSQL
On May 29, 2008, at 9:12 AM, David Fetter wrote:
> On Thu, May 29, 2008 at 11:58:31AM -0400, Bruce Momjian wrote:
>> Josh Berkus wrote:
>>> Publishing the XIDs back to the master is one possibility. We
>>> also looked at using "spillover segments" for vacuumed rows, but
>>> that seemed even less viable.
>>>
>>> I'm also thinking, for *async replication*, that we could simply
>>> halt replication on the slave whenever a transaction passes minxid
>>> on the master. However, the main focus will be on synchronous
>>> hot standby.
>>
>> Another idea I discussed with Tom is having the slave _delay_
>> applying WAL files until all slave snapshots are ready.
>
> Either one of these would be great, but something that involves
> machines that stay useless most of the time is just not going to work.

I have customers who are thinking about warm standby functionality, and the only thing stopping them from deploying it is complexity and maintenance, not the cost of the HA hardware. If trivial-to-deploy replication that didn't offer read-only access to the slaves were available today, I'd bet that most of them would be using it.

Read-only slaves would certainly be nice, but (for me) it's making it trivial to deploy and maintain that's more interesting.

Cheers,
  Steve
|
|
Re: Core team statement on replication in PostgreSQL
On 5/29/08, Aidan Van Dyk <aidan@...> wrote:
> * Dave Page <dpage@...> [080529 12:03]:
> > On Thu, May 29, 2008 at 4:48 PM, Douglas McNaught <doug@...> wrote:
> > > I think the idea is that WAL records would be shipped (possibly via
> > > socket) and applied as they're generated, rather than on a
> > > file-by-file basis. At least that's what "real-time" implies to me...
> >
> > Yes, we're talking real-time streaming (synchronous) log shipping.
>
> But synchronous streaming doesn't mean the WAL has to be *applied* on
> the slave yet. Just that it has to be "safely" on the slave (i.e. on
> disk, not just in kernel buffers).
>
> The whole single-threaded WAL replay problem is going to rear its ugly
> head here too, and mean that a slave *won't* be able to keep up with a
> busy master if it's actually trying to apply all the changes in
> real-time. Well, actually, if it's synchronous, it will keep up, but it
> just means that now your master's IO capability is limited to the
> speed of the slave's single-threaded WAL application.

I don't think that's a problem. If the user runs its server at the limit of write-bandwidth, that's its problem.

IOW, with synchronous replication, we _want_ the server to lag behind slaves.

About the single-threading problem - afaik, the replay is mostly I/O bound so threading would not buy you much.

--
marko
|
|
Re: Core team statement on replication in PostgreSQL
On Thu, 2008-05-29 at 09:10 -0700, Josh Berkus wrote:
> Joshua D. Drake wrote:
> >
> > The only question I have is... what does this give us that PITR doesn't
> > give us?
>
> Since people seem to be unclear on what we're proposing:
>
> 8.4 Synchronous Warm Standby: makes PostgreSQL more suitable for HA
> systems by eliminating failover data loss and cutting failover time.
>

What does this give us that Solaris Cluster, RedHat Cluster, DRBD etc.. doesn't give us? I am not trying to be a poison pill, but I am just not seeing the benefit over the solutions that already exist. I could probably argue, if I had more time, that this solution doesn't do anything but make us look like we are half-baked in implementation.

If the real goal is read-only slaves with synchronous capability, then let's implement that. If we can't do that by 8.4 it gets pushed to 8.5. We already have a dozen different utilities to give us what is currently being proposed.

Sincerely,

Joshua D. Drake
|
|
Re: Core team statement on replication in PostgreSQL
* Marko Kreen <markokr@...> [080529 12:27]:
> I don't think that's a problem. If the user runs its server at the
> limit of write-bandwidth, that's its problem.
>
> IOW, with synchronous replication, we _want_ the server to lag behind
> slaves.
>
> About the single-threading problem - afaik, the replay is mostly I/O bound
> so threading would not buy you much.

Right - the problem is that the master has N>1 backends working away, preloading the modified heap pages into shared buffers, where they are modified w/ WAL. This means the kernel/controller has many read-requests in flight at a time as the modifies/writes chug along.

The slave has to read/modify/write every buffer, one at a time, as WAL arrives, meaning there is only ever 1 IO request in flight at a time.

So the server has a queue of many parallel reads going on, while the slave has a set of sequential random reads going on.

a.

--
Aidan Van Dyk                  Create like a god,
aidan@...                      command like a king,
http://www.highrise.ca/        work like a slave.
|
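A rough way to see the effect Aidan describes is to model replay as a batch of random page reads and vary only how many can be outstanding at once. This is an idealized sketch, not a measurement: the function name, the 5 ms latency, and the figure of 8 concurrent backends are all assumptions made for illustration, and it assumes the storage can service concurrent reads fully in parallel.

```python
import math


def read_phase_ms(page_reads: int, reads_in_flight: int, read_latency_ms: float) -> float:
    """Time to finish `page_reads` random reads when at most
    `reads_in_flight` of them can be outstanding at the same time.

    Idealized: every read costs the same latency and concurrent reads
    overlap perfectly, so the work completes in full "rounds".
    """
    if reads_in_flight < 1:
        raise ValueError("need at least one outstanding read")
    rounds = math.ceil(page_reads / reads_in_flight)
    return rounds * read_latency_ms


# Master: several backends each have a read outstanding (8 is an assumed number).
master_ms = read_phase_ms(page_reads=1000, reads_in_flight=8, read_latency_ms=5.0)

# Slave: single-threaded replay, so only one read is ever outstanding.
slave_ms = read_phase_ms(page_reads=1000, reads_in_flight=1, read_latency_ms=5.0)
```

Under these assumptions the slave needs eight times as long for the same set of pages, even though both sides are I/O bound, which is why "replay is I/O bound" and "replay cannot keep up" are not contradictory.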
|
Re: Core team statement on replication in PostgreSQL
Andrew Dunstan <andrew@...> writes:
> Dave Page wrote:
>> Yes, we're talking real-time streaming (synchronous) log shipping.

> That's not what Tom's email said, AIUI.

Sorry, I was a bit sloppy about that. If we go with a WAL-shipping solution it would be pretty easy to support both synchronous and asynchronous cases (synchronous == master doesn't report commit until the WAL is down to disk on the slaves too). There are different use-cases for both so it'd make sense to do both.

			regards, tom lane
|
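The definition Tom gives (synchronous == the master does not report commit until the WAL is on disk on the slaves too) can be sketched as a toy model. Everything here is hypothetical: the `Master` and `Slave` classes, the integer WAL positions, and the method names are invented for illustration and do not correspond to any PostgreSQL code or API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Slave:
    flushed_lsn: int = 0  # highest WAL position durably on this slave's disk

    def receive_and_flush(self, lsn: int) -> None:
        self.flushed_lsn = max(self.flushed_lsn, lsn)


@dataclass
class Master:
    slaves: List[Slave]
    synchronous: bool
    unshipped: List[int] = field(default_factory=list)

    def commit(self, lsn: int) -> bool:
        """Return True once the commit may be reported to the client."""
        if self.synchronous:
            # Ship and wait: every slave must have the WAL on disk first.
            for slave in self.slaves:
                slave.receive_and_flush(lsn)
            return all(s.flushed_lsn >= lsn for s in self.slaves)
        # Asynchronous: report success now, ship the WAL later.
        self.unshipped.append(lsn)
        return True

    def ship_pending(self) -> None:
        """Deliver WAL that was committed but not yet sent (async mode)."""
        for lsn in self.unshipped:
            for slave in self.slaves:
                slave.receive_and_flush(lsn)
        self.unshipped.clear()


sync_slave, async_slave = Slave(), Slave()
sync_ok = Master([sync_slave], synchronous=True).commit(100)

async_master = Master([async_slave], synchronous=False)
async_ok = async_master.commit(100)
lsn_before_shipping = async_slave.flushed_lsn  # still 0: this is the loss window
async_master.ship_pending()
```

In the asynchronous case the client has already been told the commit succeeded while the slave still has nothing on disk; a master failure at that moment loses the transaction, which is the window the synchronous mode closes.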
|
Re: Core team statement on replication in PostgreSQL
David Fetter wrote:
> This part is a deal-killer. It's a giant up-hill slog to sell warm
> standby to those in charge of making resources available because the
> warm standby machine consumes SA time, bandwidth, power, rack space,
> etc., but provides no tangible benefit, and this feature would have
> exactly the same problem.
>
> IMHO, without the ability to do read-only queries on slaves, it's not
> worth doing this feature at all.

+1

I would think that a read-only WAL slave is more valuable than a real-time backup (especially as the topic is about adding slaves, not increasing the effectiveness of backups).

I also think that starting with a read-only WAL slave will ease the transition between delayed slave updating and real-time slave updating.

--
Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz
|
|
Re: Core team statement on replication in PostgreSQL
David Fetter <david@...> writes:
> On Thu, May 29, 2008 at 08:46:22AM -0700, Joshua D. Drake wrote:
>> The only question I have is... what does this give us that PITR
>> doesn't give us?

> It looks like a wrapper for PITR to me, so the gain would be ease of
> use.

A couple of points about that:

* Yeah, ease of use is a huge concern here. We're getting beat up because people have to go find a separate package (and figure out which one they want), install it, learn how to use it, etc. It doesn't help that the most mature package is Slony which is, um, not very novice-friendly or low-admin-complexity. I personally got religion on this about two months ago when Red Hat switched their bugzilla from Postgres to MySQL because the admins didn't want to deal with Slony any more. People want simple.

* The proposed approach is trying to get to "real" replication incrementally. Getting rid of the loss window involved in file-by-file log shipping is step one, and I suspect that step two is going to be fixing performance issues in WAL replay to ensure that slaves can keep up. After that we'd start thinking about how to let slaves run read-only queries. But even without read-only queries, this will be a useful improvement for HA/backup scenarios.

			regards, tom lane
|
|
Re: Core team statement on replication in PostgreSQL
On Thu, 29 May 2008, David Fetter wrote:
> It's a giant up-hill slog to sell warm standby to those in charge of
> making resources available because the warm standby machine consumes SA
> time, bandwidth, power, rack space, etc., but provides no tangible
> benefit, and this feature would have exactly the same problem.

This is an interesting commentary on the priorities of the customers you're selling to, but I don't think you can extrapolate from that too much. The deployments I normally deal with won't run a system unless there's a failover backup available, period, and the fact that such a feature is not integrated into the core yet is a major problem for them. Read-only slaves are a very nice-to-have, but by no means a prerequisite before core replication will be useful to some people. Hardware/machine resources are only worth a tiny fraction of what the data is in some environments, and in some of those downtime is really, really expensive.

--
* Greg Smith gsmith@... http://www.gregsmith.com Baltimore, MD
|
|
Re: Core team statement on replication in PostgreSQL
On Thu, 2008-05-29 at 09:18 -0700, Josh Berkus wrote:
> Bruce,
>
> > Another idea I discussed with Tom is having the slave _delay_ applying
> > WAL files until all slave snapshots are ready.
> >
>
> Well, again, that only works for async mode.

It depends on what we mean by synchronous. Do we mean "the WAL record has made it to the disk on the slave system," or "the WAL record has been applied on the slave system"?

With this type of replication there will always be a difference for some small window, but most people would expect that window to be very small for synchronous replication.

Regards,
	Jeff Davis
|
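Jeff's two candidate meanings of "synchronous" can be written down as two different acknowledgement conditions. This is only a sketch under invented names: `StandbyState`, `flush_lsn`, `apply_lsn`, and the `wait_for` argument are hypothetical and stand in for "how far the slave has written WAL to disk" and "how far it has replayed it".

```python
from dataclasses import dataclass


@dataclass
class StandbyState:
    flush_lsn: int  # last WAL position durably on the slave's disk
    apply_lsn: int  # last WAL position replayed into the slave's data files


def commit_may_be_acknowledged(commit_lsn: int, standby: StandbyState,
                               wait_for: str = "flush") -> bool:
    """Decide whether the master may report a commit at `commit_lsn`.

    wait_for="flush": the WAL record has made it to disk on the slave.
    wait_for="apply": the WAL record has been applied on the slave.
    """
    if wait_for == "flush":
        return standby.flush_lsn >= commit_lsn
    if wait_for == "apply":
        return standby.apply_lsn >= commit_lsn
    raise ValueError(f"unknown wait_for mode: {wait_for!r}")


def unapplied_window(standby: StandbyState) -> int:
    """Amount of WAL that is safe on the slave's disk but not yet replayed."""
    return standby.flush_lsn - standby.apply_lsn


standby = StandbyState(flush_lsn=500, apply_lsn=420)
ack_on_flush = commit_may_be_acknowledged(480, standby, wait_for="flush")
ack_on_apply = commit_may_be_acknowledged(480, standby, wait_for="apply")
window = unapplied_window(standby)
```

The same commit is acknowledged under the first definition and still pending under the second; the gap between the two positions is the "small window" mentioned above.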
|
Re: Core team statement on replication in PostgreSQL
Josh,

> What does this give us that Solaris Cluster, RedHat Cluster, DRBD etc..
> doesn't give us?

Actually, these solutions all have some serious drawbacks, not the least of which is difficult administration (I speak from bitter personal experience). Also, most of them require installation at the filesystem level, something which often isn't available in a hosted environment.

--Josh Berkus
|
|
Re: Core team statement on replication in PostgreSQL
On Thu, May 29, 2008 at 12:11:21PM -0400, Brian Hurt wrote:
>
> Being able to do read-only queries makes this feature more valuable in more
> situations, but I disagree that it's a deal-breaker.

Your managers are apparently more enlightened than some. ;-)

A

--
Andrew Sullivan
ajs@...
+1 503 667 4564 x104
http://www.commandprompt.com/
|
|
Re: Core team statement on replication in PostgreSQL
On Thu, May 29, 2008 at 07:20:37PM +0300, Marko Kreen wrote:
>
> So you can do lossless failover. Currently there is no good
> solution for this.

Indeed. Getting lossless failover would be excellent. I understand David's worry (having had those arguments more times than I care to admit), but if people don't want to spend the money on the extra machine that can't be queried, they can use another solution for the time being.

The big missing piece is lossless failover. People are currently doing it with DRBD, various clustering things, &c., and those are complicated to set up and maintain. (As I've told more than one person looking at it, there is a risk that you'll actually make your installation complicated enough that you'll make it _less_ reliable. I have some bitter personal experiences with this effect, and I know some others on this list do as well.)

A

--
Andrew Sullivan
ajs@...
+1 503 667 4564 x104
http://www.commandprompt.com/
|
|
Re: Core team statement on replication in PostgreSQL
On Thu, May 29, 2008 at 02:13:26PM -0400, Andrew Sullivan wrote:
> On Thu, May 29, 2008 at 12:11:21PM -0400, Brian Hurt wrote:
> > Being able to do read-only queries makes this feature more
> > valuable in more situations, but I disagree that it's a
> > deal-breaker.
>
> Your managers are apparently more enlightened than some. ;-)

Than most managers, at least in my experience, and since this feature is (IMHO rightly) based around broad adoption, it's a good thing to bring up.

Cheers,
David.
--
David Fetter <david@...> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@...

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
|
|
Re: Core team statement on replication in PostgreSQL
> in this case too. So each slave just needs to report its own longest
> open tx as "open" to master. Yes, it bloats master but no way around it.

Slaves should not report it every time or for every transaction. Vacuum on the master will ask them before doing any real work.

--
Teodor Sigaev E-mail: teodor@...
WWW: http://www.sigaev.ru/
|
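The scheme discussed here (each slave reports its oldest open transaction, and vacuum on the master asks the slaves before removing anything) amounts to taking a minimum over all nodes. The sketch below is a deliberate simplification with hypothetical names: transaction IDs are plain integers, wraparound is ignored, and the removability rule is reduced to a single comparison.

```python
from typing import Iterable


def vacuum_horizon(master_oldest_xid: int, slave_oldest_xids: Iterable[int]) -> int:
    """Oldest transaction ID that any node still considers open.

    Vacuum on the master would ask each slave for its oldest open
    transaction and must not clean up anything that transaction could
    still see. Simplification: xids are plain integers, no wraparound.
    """
    return min([master_oldest_xid, *slave_oldest_xids])


def removable(deleting_xid: int, horizon: int) -> bool:
    """A dead row can go only if it was deleted before the horizon."""
    return deleting_xid < horizon


# One slave is running a long query that started at xid 950.
horizon = vacuum_horizon(master_oldest_xid=1000, slave_oldest_xids=[950, 990])
```

Here the long-running slave query holds the horizon at 950 even though the master itself could otherwise clean up to 1000, which is the bloat on the master that the quoted text accepts as unavoidable.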
|
Re: Core team statement on replication in PostgreSQL
On Thu, May 29, 2008 at 12:19 PM, Andrew Dunstan <andrew@...> wrote:
> That's not what Tom's email said, AIUI. "Synchronous" replication surely
> means that the master and slave always have the same set of transactions
> applied. Streaming <> synchronous. But streaming log shipping will allow us
> to get closer to synchronicity in some situations, i.e. the window for
> missing transactions will be much smaller.
>
> Some of us were discussing this late on Friday night after PGcon. ISTM that
> we can have either 1) fairly hot failover slaves that are guaranteed to be
> almost up to date, or 2) slaves that can support read-only transactions but
> might get somewhat out of date if they run long transactions. The big
> problem is in having slaves which are both highly up to date and support
> arbitrary read-only transactions. Maybe in the first instance, at least, we
> need to make slaves choose which role they will play.

I personally would be thrilled to have slaves be query-able in any fashion, even if 'wrong' under certain circumstances. Any asynchronous solution by definition gives the wrong answer on the slave. Read-only slaves are the #1 most anticipated feature in the circles I run with. It would literally transform how the database world thinks about postgres overnight.

This, coupled with easier standby setup (a pg_archive to mirror pg_restore) would be most welcome!

merlin
|
|
Re: Core team statement on replication in PostgreSQL
First of all, I’m absolutely delighted that the PG community is thinking seriously about replication.

Second, having a solid, easy-to-use database availability solution that works more or less out of the box would be an enormous benefit to customers. Availability is the single biggest problem for customers in my experience and, as other people have commented, the alternatives are not nice. It’s an excellent idea to build off an existing feature: PITR is already pretty useful and the proposed features are solid next steps. The fact that it does not solve all problems is not a drawback but means it’s likely to get done in a reasonable timeframe.

Third, you can’t stop with just this feature. (This is the BUT part of the post.) The use cases not covered by this feature are actually pretty large. Here are a few that concern me:

1.) Partial replication.
2.) WAN replication.
3.) Bi-directional replication. (Yes, this is evil but there are problems where it is indispensable.)
4.) Upgrade support. Aside from database upgrade (how would this ever really work between versions?), it would not support zero-downtime app upgrades, which depend on bi-directional replication tricks.
5.) Heterogeneous replication.
6.) Finally, performance scaling using scale-out over large numbers of replicas. I think it’s possible to get tunnel vision on this: it’s not a big requirement in the PG community because people don’t use PG in the first place when they want to do this. They use MySQL, which has very good replication for performance scaling, though it’s rather weak for availability.

As a consequence, I don’t see how you can get around doing some sort of row-based replication like all the other databases. Now that people are starting to get religion on this issue I would strongly advocate a parallel effort to put in a change-set extraction API that would allow construction of comprehensive master/slave replication. (Another approach would be to make it possible for third party apps to read the logs and regenerate SQL.) There are existing models for how to do change set extraction; we have done it several times at my company already. There are also research projects like GORDA that have looked fairly comprehensively at this problem.

My company would be quite happy to participate in or even sponsor such an API. Between the proposed WAL-based approach and change-set-based replication it’s not hard to see PG becoming the open source database of choice for a very large number of users.

Cheers, Robert

On 5/29/08 6:37 PM, "Tom Lane" <tgl@...> wrote:
David Fetter <david@...> writes:

--
Robert Hodges, CTO, Continuent, Inc.
Email: robert.hodges@...
Mobile: +1-510-501-3728 Skype: hodgesrm
|
|
Re: Core team statement on replication in PostgreSQL
On Thu, May 29, 2008 at 3:05 PM, Robert Hodges <robert.hodges@...> wrote:
> Third, you can't stop with just this feature. (This is the BUT part of the
> post.) The use cases not covered by this feature are actually pretty
> large. Here are a few that concern me:
>
> 1.) Partial replication.
> 2.) WAN replication.
> 3.) Bi-directional replication. (Yes, this is evil but there are problems
> where it is indispensable.)
> 4.) Upgrade support. Aside from database upgrade (how would this ever
> really work between versions?), it would not support zero-downtime app
> upgrades, which depend on bi-directional replication tricks.
> 5.) Heterogeneous replication.
> 6.) Finally, performance scaling using scale-out over large numbers of
> replicas. I think it's possible to get tunnel vision on this: it's not a big
> requirement in the PG community because people don't use PG in the first
> place when they want to do this. They use MySQL, which has very good
> replication for performance scaling, though it's rather weak for
> availability.

These types of things are what Slony is for. Slony is trigger based. This makes it more complex than log-shipping-style replication, but provides lots of functionality.

WAL-shipping-based replication is maybe the fastest possible solution... you are already paying the overhead so it comes virtually for free from the point of view of the master.

MySQL replication is imo nearly worthless from a backup standpoint.

merlin
|
|
Re: Core team statement on replication in PostgreSQL
No doubt. But defining the minimum acceptable feature set by the demands of the dumbest manager is a no-win proposition.

On Thu, May 29, 2008 at 12:11:21PM -0400, Brian Hurt wrote:

Brian