Split brain in a dual primary configuration

Hello,

We have a two-node Citrix XenServer cluster in which each node has a local partition configured as a DRBD resource. The resource is set to become primary on both nodes simultaneously. XenServer uses LVM, and it is my understanding that it works in such a way that no LV will ever be in use on both hosts at the same time, thus ensuring consistency between our dual-primary hosts. For DRBD connectivity, the two nodes are connected directly through a crossover cable.

For testing purposes, we unplugged the network interfaces, which forced both nodes into the WFConnection state with Primary/Unknown roles. VMs on each node kept working as usual. However, after we reconnected the network interfaces, both nodes became StandAlone and the logs showed that a split brain had been detected.

It was my understanding that DRBD would be able to sync the out-of-sync (OOS) blocks from each node to the other. What is supposed to happen when the nodes of a dual-primary configuration reconnect to each other?
Our configuration is as follows:

    global {
        usage-count no;
    }

    common {
        protocol C;

        startup {
            become-primary-on both;
        }

        syncer {
            rate 33M;
            verify-alg crc32c;
            al-extents 1801;
        }

        net {
            cram-hmac-alg sha1;
            max-epoch-size 8192;
            max-buffers 8192;
            after-sb-0pri discard-zero-changes;
            after-sb-1pri discard-secondary;
            after-sb-2pri disconnect;
            allow-two-primaries;
        }

        disk {
            on-io-error detach;
            no-disk-flushes;
            no-disk-barrier;
            no-md-flushes;
        }
    }

    resource drbd0 {
        disk /dev/sda3;
        device /dev/drbd0;
        flexible-meta-disk internal;

        on node1 {
            address 10.10.0.1:7788;
        }
        on node2 {
            address 10.10.0.2:7788;
        }
    }

Logs from when we reconnected both nodes:

    block drbd0: Handshake successful: Agreed network protocol version 91
    block drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC
    block drbd0: conn( WFConnection -> WFReportParams )
    block drbd0: Starting asender thread (from drbd0_receiver [7644])
    block drbd0: data-integrity-alg: <not-used>
    block drbd0: drbd_sync_handshake:
    block drbd0: self 95BA39C140141F17:ADE0E340AD8230BB:0CAA835AA97548CC:CF72ED70E8F22F57 bits:160 flags:0
    block drbd0: peer F83F651106A22A31:ADE0E340AD8230BB:0CAA835AA97548CC:CF72ED70E8F22F57 bits:51795 flags:0
    block drbd0: uuid_compare()=100 by rule 90
    block drbd0: Split-Brain detected, dropping connection!
    block drbd0: helper command: /sbin/drbdadm split-brain minor-0
    block drbd0: helper command: /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
    block drbd0: conn( WFReportParams -> Disconnecting )
    block drbd0: error receiving ReportState, l: 4!
    block drbd0: asender terminated
    block drbd0: Terminating asender thread
    block drbd0: Connection closed
    block drbd0: conn( Disconnecting -> StandAlone )
    block drbd0: receiver terminated
    block drbd0: Terminating receiver thread

Can anyone tell me why I am not getting the behavior I am expecting?

Regards,
-- 
Jean-François Chevrette [iWeb]
_______________________________________________
drbd-user mailing list
drbd-user@...
http://lists.linbit.com/mailman/listinfo/drbd-user
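As a rough illustration of why the handshake in the log ends in "Split-Brain detected", the sketch below parses the two UUID lines and compares them. It is only a simplified reading of the check, not DRBD's actual code: the slot names (current, bitmap, history), the masking of the lowest bit, and the exact condition are assumptions made for this example. The point it shows is that both nodes report the same bitmap UUID but different current UUIDs, i.e. each side kept writing on its own after the link went down.

```python
import re

# The two UUID lines from the handshake log in the post above.
LOG = """\
block drbd0: self 95BA39C140141F17:ADE0E340AD8230BB:0CAA835AA97548CC:CF72ED70E8F22F57 bits:160 flags:0
block drbd0: peer F83F651106A22A31:ADE0E340AD8230BB:0CAA835AA97548CC:CF72ED70E8F22F57 bits:51795 flags:0
"""

# who, four colon-separated 64-bit UUIDs in hex, number of out-of-sync bits
LINE = re.compile(r"(self|peer) ([0-9A-F]{16}(?::[0-9A-F]{16}){3}) bits:(\d+)")


def parse(log):
    """Return {'self': {...}, 'peer': {...}} with UUID slots and dirty-bit counts."""
    nodes = {}
    for who, uuids, bits in LINE.findall(log):
        current, bitmap, *history = (int(u, 16) for u in uuids.split(":"))
        nodes[who] = {
            "current": current,
            "bitmap": bitmap,
            "history": history,
            "bits": int(bits),
        }
    return nodes


def diverged_from_common_ancestor(a, b):
    """Assumed, simplified check: same non-zero bitmap UUID, different current UUID.

    The lowest bit is masked off because it is assumed to be a flag, not part
    of the generation identifier.
    """
    mask = ~1
    same_bitmap = (a["bitmap"] & mask) == (b["bitmap"] & mask)
    bitmap_set = (a["bitmap"] & mask) != 0
    different_current = (a["current"] & mask) != (b["current"] & mask)
    return same_bitmap and bitmap_set and different_current


nodes = parse(LOG)
print(diverged_from_common_ancestor(nodes["self"], nodes["peer"]))
```

Both nodes also report a non-zero out-of-sync count (160 and 51795 bits), which matches the description of VMs continuing to write on each side while the link was unplugged.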
Re: Split brain in a dual primary configuration

On Fri, Oct 30, 2009 at 7:43 PM, Jean-Francois Chevrette <jfchevrette@...> wrote:
> Hello,

Hello,

The message is self-explanatory: in drbd.conf you set the policy to "disconnect" for a split brain (sb) that arises in a two-primary scenario (after-sb-2pri disconnect), and that is exactly what DRBD does.

By the way, since you are running dual primary with LVM, are you also using CLVMD? Otherwise, if you modify a VG on one node (for example by adding an LV), the other node will not see the change immediately, because there is no cluster locking.

Bye,
Gianluca
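To make the point about the configured policy concrete, here is a minimal sketch of how the three after-sb-* options from the posted net section map onto the number of primaries present when the split brain is detected. The dictionary values are copied from the configuration in the first post; the function name policy_for is hypothetical and exists only for this illustration.

```python
# Policies copied from the "net" section of the configuration in the first post.
NET_POLICIES = {
    "after-sb-0pri": "discard-zero-changes",
    "after-sb-1pri": "discard-secondary",
    "after-sb-2pri": "disconnect",
}


def policy_for(primaries):
    """Return the (option, value) pair that applies for a given number of primaries.

    `primaries` is how many nodes are in the Primary role at the moment the
    split brain is detected: 0, 1 or 2.
    """
    if primaries not in (0, 1, 2):
        raise ValueError("a two-node resource has 0, 1 or 2 primaries")
    option = f"after-sb-{primaries}pri"
    return option, NET_POLICIES[option]


# Both nodes were Primary when the cable was plugged back in.
print(policy_for(2))
```

With both nodes in the Primary role, the option that applies is after-sb-2pri, and its configured value is disconnect, which is consistent with both nodes ending up StandAlone in the log.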
Re: Split brain in a dual primary configuration

Hello,

On 09-10-30 3:07 PM, Gianluca Cecchi wrote:
> Hello,
> the message is self explanatory: in drbd.conf you define the policy to
> "disconnect" when you get a split brain (sb) deriving from a 2-primary
> scenario.
> And so does drbd...

But what else would be more appropriate in this situation? In fact, that is what we want: both nodes should disconnect. We don't want either of them to become secondary.

Is it acceptable to have both nodes remain primary while they are disconnected and expect them to sync with each other properly when they are connected again?

> btw: having you dual primary and LVM, are you using also CLVMD?
> Otherwise if you do modifications on one VG (such as add an lv) you
> don't see them immediately, because you don't have cluster locking...

We are not using CLVM. When a new VG or LV is created, we see it immediately on the second node. Maybe Citrix XenServer has a mechanism that reloads LVM on both nodes when a new VM is created on the cluster?

Regards,
-- 
Jean-François Chevrette [iWeb]
Re: Split brain in a dual primary configuration

Hi,

Mind that I'm no expert and could be completely wrong, but...

LVM works on top of DRBD in active/passive mode only. For active/active you need CLVM (and all of that RH Cluster Suite S**t).

It's the same as with filesystems: if you want one mounted in many locations at the same time, you need locking, so that no two nodes write to the same spot/block at the same time. LVM by itself doesn't guarantee that.

For what it's worth, you're not the only one who has tried that setup. :-) I'm currently looking for an appropriate solution.

One might be:
Disk <-> LVM <-> DRBD[X] <-> domU[X]
where each DRBD instance backs one virtual machine. It would work only if, during live migration of a virtual machine from host A to host B, writing on host B starts _after_ all writing on host A has ceased.

Does anyone know if this would work, and whether Xen can/will write concurrently to both backing devices (DRBD[X]) during live migration?

Regards,
M.
Re: Split brain in a dual primary configuration

Well,

it's actually being done that way. Apparently it even has its own manual chapter. :-)
http://www.drbd.org/users-guide/ch-xen.html

And a blog entry:
http://blogs.linbit.com/florian/2007/09/03/drbd-806-brings-full-live-migration-for-xen-on-drbd/

I guess that's the way to go. :-)

Regards,
M.