Truck replication


Truck replication

by JayBee :: Thu, 29 Oct 2009 19:41:16 +0000

Hi All:

I am stuck with truck-based replication. I have the following setup:

node1 (primary):
/dev/sda7 [4TB]
/dev/drbd1
I have also created an ext3 filesystem on drbd1

node2:
<same configuration as above, but there is NO filesystem>

I am basically trying to avoid the initial sync time, if possible. node1 is now primary, and if I follow the steps on this page, http://www.drbd.org/users-guide/s-using-truck-based-replication.html, I can get both servers to display the UpToDate/UpToDate state.

However, when I mount /dev/drbd1 on /mnt/drbd (on node1, which is primary) and copy files to /mnt/drbd, I do not see an UpToDate/Inconsistent state (via cat /proc/drbd on node1).

At this point I would expect node2 to receive some data and try to sync with the primary. Am I missing any steps?

Please advise.

Regards,
-Jay


_______________________________________________
drbd-user mailing list
drbd-user@...
http://lists.linbit.com/mailman/listinfo/drbd-user

Re: Truck replication

by Lars Ellenberg :: Sat, 31 Oct 2009 10:17:22 +0100

On Thu, Oct 29, 2009 at 07:41:16PM +0000, jay b wrote:

>
> Hi All:
>
> I am stuck with truck based replication.  I have the following setup:
>
> node1 (primary):
> /dev/sda7 [4TB]
> /dev/drbd1
> I have also created filesystem ext3 on drbd1
>
> node2:
> <same configuration as above, but there is NO filesystem>
>
> I am basically trying to avoid the initial sync time (if possible).
> So, now node1 is primary and if I follow steps written on this page
> http://www.drbd.org/users-guide/s-using-truck-based-replication.html
> I can get both servers to display uptodate/uptodate state.
>
> But, when I mount /dev/drbd1 to /mnt/drbd (on node1 which is primary),
> and copy files to /mnt/drbd, I do not see Uptodate/Inconsistent
> message (via cat /proc/drbd on node1)

Why would you expect node2 to become "Inconsistent"
during normal operation?

> At this point I would assume that node2 should get some data and try
> to sync with primary.  Am I missing any steps?

When you _connect_ the second node,
it will do some (bitmap based) resync.

Once connected, and "Connected Uptodate",
it will do _online replication_.

If there is a problem, I don't see it.
Either you, or I, or DRBD misunderstood something, I guess.
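
For readers following along: the connection state and disk states discussed here appear in the cs: and ds: fields of /proc/drbd. Below is a minimal sketch of pulling those fields out of a status line. It assumes a line shaped like the 8.3-era /proc/drbd output quoted later in this thread; the sample line and the helper name drbd_field are illustrative and not taken from the thread.

```shell
#!/usr/bin/env bash
# Hypothetical sample line; on a real node you would read /proc/drbd instead.
sample=' 1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----'

# drbd_field NAME LINE: print the value of the "NAME:value" token in LINE.
drbd_field() {
    _name=$1
    for _tok in $2; do          # unquoted on purpose: split the line on whitespace
        case $_tok in
            "$_name":*) printf '%s\n' "${_tok#*:}"; return 0 ;;
        esac
    done
    return 1
}

cs=$(drbd_field cs "$sample")
ds=$(drbd_field ds "$sample")
echo "connection: $cs, local disk: ${ds%%/*}, peer disk: ${ds#*/}"
```

In steady-state replication both disk states are expected to stay UpToDate; ongoing writes show up in the counters (for example dw on the receiving node) rather than as a change in ds.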

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed

Re: Truck replication

by JayBee :: Mon, 2 Nov 2009 19:29:01 +0000

That's correct. The problem was my perception of how it is supposed to work versus the actual behavior. Truck-based replication worked perfectly for me. After the initial UpToDate/UpToDate message I untarred my directory structure onto the DRBD-mounted device and monitored the dw (disk write) counter, from which I could see disk writes happening. Before this I was expecting to see an UpToDate/Inconsistent (or Outdated) ds message, but that was just my misconception.

Thanks a lot for help.

Regards,
Jay



Re: Truck replication

by JayBee :: Mon, 9 Nov 2009 22:26:07 +0000

Hi All:

I am hitting strange behavior during truck replication (so far it has occurred on one machine out of 10). I have two machines, node1 and node2, and I am restoring metadata on node2. Whenever I do the following (as described on the truck replication page), a resynchronization takes place. Any idea what could be causing this, or any clues on how to debug such issues?

drbdsetup 1 new-current-uuid --clear-bitmap
drbdadm detach res
drbdmeta_cmd=$(drbdadm -d dump-md res)
drbdadm detach res
${drbdmeta_cmd/dump-md/restore-md} /var/testmeta


dmesg output:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
block drbd1: conn( Unconnected -> WFConnection )
block drbd1: Handshake successful: Agreed network protocol version 91
block drbd1: conn( WFConnection -> WFReportParams )
block drbd1: Starting asender thread (from drbd1_receiver [24837])
block drbd1: data-integrity-alg: <not-used>
block drbd1: drbd_sync_handshake:
block drbd1: self C48D0F6F05011D66:02D67DD221F19CD8:0000000000000004:0000000000000000 bits:669554939 flags:0
block drbd1: peer C48D0F6F05011D66:02D67DD221F19CD8:0000000000000004:0000000000000000 bits:0 flags:0
block drbd1: uuid_compare()=0 by rule 40
block drbd1: No resync, but 669554939 bits in bitmap!
block drbd1: peer( Unknown -> Secondary ) conn( WFReportParams -> Connected ) pdsk( DUnknown -> Inconsistent )
block drbd1: peer( Secondary -> Primary ) pdsk( Inconsistent -> UpToDate )
block drbd1: drbd_sync_handshake:
block drbd1: self C48D0F6F05011D66:02D67DD221F19CD8:0000000000000004:0000000000000000 bits:669554939 flags:0
block drbd1: peer 40BD0C3BDCF22CB9:C48D0F6F05011D66:0000000000000004:0000000000000000 bits:0 flags:0
block drbd1: uuid_compare()=-1 by rule 50
block drbd1: Becoming sync target due to disk states.
block drbd1: conn( Connected -> WFBitMapT )
block drbd1: conn( WFBitMapT -> WFSyncUUID )
block drbd1: helper command: /sbin/drbdadm before-resync-target minor-1
block drbd1: helper command: /sbin/drbdadm before-resync-target minor-1 exit code 0 (0x0)
block drbd1: conn( WFSyncUUID -> SyncTarget )
block drbd1: Began resync as SyncTarget (will sync 2678219756 KB [669554939 bits set]).

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
version: 8.3.4 (api:88/proto:86-91)
GIT-hash: 70a645ae080411c87b4482a135847d69dc90a6a2 build by rmake-chroot@..., 2009-10-27 15:57:44

 1: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r----
    ns:0 nr:9735808 dw:9731552 dr:0 al:0 bm:592 lo:135 pe:29068 ua:133 ap:0 ep:1 wo:b oos:2668488204
        [>....................] sync'ed:  0.4% (2605944/2615448)M
        finish: 62:56:09 speed: 11,768 (11,460) K/sec
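
The resync size in the log follows directly from the bitmap count: DRBD tracks out-of-sync data with one bit per 4 KiB block, so 669554939 set bits correspond to the 2678219756 KB reported. A small sketch of that arithmetic (the helper name bits_to_kib is made up for illustration):

```shell
#!/usr/bin/env bash
# One bitmap bit covers 4 KiB of the backing device.
KIB_PER_BIT=4

# bits_to_kib BITS: convert a count of set bitmap bits to KiB.
bits_to_kib() {
    echo $(( $1 * KIB_PER_BIT ))
}

bits=669554939                 # "bits:" value from drbd_sync_handshake
kib=$(bits_to_kib "$bits")
echo "$bits bits -> $kib KiB = $(( kib / 1024 )) MiB"
```

The MiB figure equals the total shown in the sync'ed line (2615448 M), which suggests that the entire device is marked out of sync on this node rather than only a few blocks.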




Re: Truck replication

by Lars Ellenberg :: Tue, 10 Nov 2009 16:10:44 +0100

On Mon, Nov 09, 2009 at 10:26:07PM +0000, jay b wrote:
>
>
> Hi All:
>
> I am hitting a strange behavior during truck replication (so far it
> has occurred to one machine out of 10).  I have two machines node1 and
> node2, and I am restoring metadata on node2.  Whenever I do the
> following (as mentioned on truck replication page) resynchronization
> takes place. Any idea what could be causing this,

You ;)

> any clues to debug such issues?  
>
> drbdsetup 1 new-current-uuid --clear-bitmap

fine.

> drbdadm detach res

why?

> drbdmeta_cmd=$(drbdadm -d dump-md res)

why?

> drbdadm detach res

why?

> ${drbdmeta_cmd/dump-md/restore-md} /var/testmeta

why?


May I ask
 * what are you trying to achieve?
 * which "truck replication page" are you referring to?


Re: Truck replication

by JayBee :: Tue, 10 Nov 2009 19:18:16 +0000

Truck replication page: http://www.drbd.org/users-guide/users-guide.html
For dump /restore part: http://www.drbd.org/users-guide/s-resizing.html

What I am trying to achieve:
~~~~~~~~~~~~~~~~~~
I have a 2-node setup (primary/secondary); each node has a ~4TB disk, and the rest of the hardware configuration is exactly the same. I want this setup so that the secondary can be promoted to primary in a failover scenario. I am using truck replication because I want to avoid the initial sync time, which in our case is more than 12 hours.

> > takes place. Any idea what could be causing this,
> You ;)
Glad if this is the case :)

Basically, the steps I mentioned earlier were for the secondary node only.

> > drbdadm detach res
> why?
Because if the resource is attached, we cannot restore the metadata (the metadata dump from the primary).

> > ${drbdmeta_cmd/dump-md/restore-md} /var/testmeta
>
> why?
Basically, it issues the command "drbdmeta 1 v08 /dev/sdb3 internal restore-md /var/testmeta" to restore the metadata on this secondary node (sorry for including the extra bash commands). The idea is that when both nodes connect they are UpToDate/UpToDate.
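
The bash construct in question is pattern substitution, ${var/pattern/replacement}. Here is a minimal sketch of how it turns the dump command into the restore command. The value assigned to drbdmeta_cmd is a stand-in for what `drbdadm -d dump-md res` would print, reconstructed from the restore command quoted above, and the result is only echoed, never executed.

```shell
#!/usr/bin/env bash
# Stand-in for the dry-run output of `drbdadm -d dump-md res` (an assumption,
# reconstructed from the restore command given in the message above).
drbdmeta_cmd='drbdmeta 1 v08 /dev/sdb3 internal dump-md'

# ${var/pattern/replacement} replaces the first match of pattern in $var.
restore_cmd="${drbdmeta_cmd/dump-md/restore-md} /var/testmeta"

# Print rather than execute: restore-md overwrites the on-disk metadata.
echo "$restore_cmd"
```

This form requires bash; a plain POSIX sh does not support the substitution.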

Let me know if there is something wrong with this approach.


Thanks much,
Jay



Re: Truck replication

by Lars Ellenberg :: Wed, 11 Nov 2009 10:22:10 +0100

On Tue, Nov 10, 2009 at 07:18:16PM +0000, jay b wrote:
>
>
> Truck replication page: http://www.drbd.org/users-guide/users-guide.html 
> For dump /restore part: http://www.drbd.org/users-guide/s-resizing.html 
>
> What I am trying to achieve:

...

> I want to avoid initial sync time,
> which in our case takes more than 12hrs.

Right.

Then how about the approach documented in the man page?
http://www.drbd.org/users-guide/re-drbdsetup.html

Currently the sub section on "new-current-uuid" is
http://www.drbd.org/users-guide/re-drbdsetup.html#id1229962

But those id anchors are likely to change on future updates,
so if someone digs this from an archive,
just "find" the heading/subsection on "new-current-uuid".
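
For orientation, the "new-current-uuid" procedure in that man page is, as far as the 8.3-era documentation describes it, shorter than the dump/restore route: create metadata and bring the resource up on both nodes, clear the bitmap once while the nodes are connected, then promote one side. The sketch below only prints the commands (dry run). The command spellings are the ones used elsewhere in this thread; the exact sequence is an assumption to verify against the man page for the installed version, and the helper names run and skip_initial_sync are made up.

```shell
#!/usr/bin/env bash
res=my_res      # resource name from the configuration posted in this thread
minor=1         # minor number of /dev/drbd1
DRY_RUN=1       # set to 0 only after checking the steps against the man page

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

skip_initial_sync() {
    # Steps 1 and 2 run on BOTH nodes: fresh metadata, then attach and connect.
    run drbdadm create-md "$res"
    run drbdadm up "$res"
    # Step 3 runs on ONE node while the pair is connected and shows
    # Secondary/Secondary Inconsistent/Inconsistent.
    run drbdsetup "$minor" new-current-uuid --clear-bitmap
    # Step 4 runs on ONE node once both sides report UpToDate/UpToDate.
    run drbdadm primary "$res"
}

skip_initial_sync
```

Skipping the initial sync this way is only appropriate when the data on both backing devices is identical or irrelevant, for example when the filesystem is created afterwards on the new primary.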



Re: Truck replication

by JayBee

> Then how about the approach documented in the man page?
> http://www.drbd.org/users-guide/re-drbdsetup.html

That approach is working perfectly, except on one set of boxes with exactly the same hardware configuration, and I just want to figure out why it behaves this way on these boxes. Okay, let me explain what I've been doing.

config:
~~~~
global {
        usage-count yes;
}
common {
        protocol C;
        startup {
                wfc-timeout 120;
                degr-wfc-timeout 120;
        }
}
resource my_res {
        syncer {
                rate 333M;
        }
        on node1 {
                device /dev/drbd1;
                disk /dev/sdb3;
                address 192.168.2.1:7791;
                meta-disk internal;
        }
        on node2 {
                device /dev/drbd1;
                disk /dev/sdb3;
                address 192.168.2.2:7791;
                meta-disk internal;
        }
}

node1 (primary)
~~~~~~~~~
- drbdadm create-md my_res
- drbdadm up my_res
- drbdadm disconnect my_res (because new-current-uuid only takes effect in the "StandAlone" state)
- drbdsetup 1 new-current-uuid --clear-bitmap
- drbdadm new-current-uuid my_res
- drbdadm detach my_res (because I need to produce a dump from the primary, and I cannot do this while the resource is attached)
- drbdadm dump-md my_res > primary_dump
- scp this "primary_dump" to node2

node2 (secondary)
~~~~~~~~~~~
- drbdsetup 1 new-current-uuid --clear-bitmap
- drbdmeta 1 /dev/sdb3 internal restore-md primary_dump

On both nodes:
~~~~~~~~~
- drbdadm adjust all

At this point both nodes are connected in the "Secondary/Secondary Inconsistent/Inconsistent" state.

Make node1 primary:
~~~~~~~~~~~~
drbdadm -- --overwrite-data-of-peer primary my_res

* At this point I would expect both nodes to be in the "Primary/Secondary UpToDate/UpToDate" state, but they are not. Instead the state is "Primary/Secondary UpToDate/Inconsistent", with a [=>...........] sync'ed:0.01% progress message.

I have successfully tested this procedure on 10 other boxes without any issue; only this one box misbehaves. So I am trying to learn more about this behavior and the circumstances under which it may occur. I'd greatly appreciate any help.

Here is dmesg output:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
block drbd1: conn( Unconnected -> WFConnection )
block drbd1: Handshake successful: Agreed network protocol version 91
block drbd1: conn( WFConnection -> WFReportParams )
block drbd1: Starting asender thread (from drbd1_receiver [24837])
block drbd1: data-integrity-alg: <not-used>
block drbd1: drbd_sync_handshake:
block drbd1: self C48D0F6F05011D66:02D67DD221F19CD8:0000000000000004:0000000000000000 bits:669554939 flags:0
block drbd1: peer C48D0F6F05011D66:02D67DD221F19CD8:0000000000000004:0000000000000000 bits:0 flags:0
block drbd1: uuid_compare()=0 by rule 40
block drbd1: No resync, but 669554939 bits in bitmap!
block drbd1: peer( Unknown -> Secondary ) conn( WFReportParams -> Connected ) pdsk( DUnknown -> Inconsistent )
block drbd1: peer( Secondary -> Primary ) pdsk( Inconsistent -> UpToDate )
block drbd1: drbd_sync_handshake:
block drbd1: self C48D0F6F05011D66:02D67DD221F19CD8:0000000000000004:0000000000000000 bits:669554939 flags:0
block drbd1: peer 40BD0C3BDCF22CB9:C48D0F6F05011D66:0000000000000004:0000000000000000 bits:0 flags:0
block drbd1: uuid_compare()=-1 by rule 50
block drbd1: Becoming sync target due to disk states.
block drbd1: conn( Connected -> WFBitMapT )
block drbd1: conn( WFBitMapT -> WFSyncUUID )
block drbd1: helper command: /sbin/drbdadm before-resync-target minor-1
block drbd1: helper command: /sbin/drbdadm before-resync-target minor-1 exit code 0 (0x0)
block drbd1: conn( WFSyncUUID -> SyncTarget )
block drbd1: Began resync as SyncTarget (will sync 2678219756 KB [669554939 bits set]).

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
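
One way to read the two drbd_sync_handshake entries above: each UUID set is printed as current:bitmap:history:history. In the first handshake both current UUIDs are equal, so no resync starts. After the peer is promoted, its current UUID is new and its bitmap UUID equals this node's current UUID, which is consistent with this node being chosen as sync target. The sketch below only checks those two equalities on the logged values. Mapping them to "rule 40" and "rule 50" is an assumption and is not confirmed anywhere in this thread.

```shell
#!/usr/bin/env bash
# UUID sets copied from the second drbd_sync_handshake in the dmesg output.
self='C48D0F6F05011D66:02D67DD221F19CD8:0000000000000004:0000000000000000'
peer='40BD0C3BDCF22CB9:C48D0F6F05011D66:0000000000000004:0000000000000000'

# Field 1 is the current UUID, field 2 the bitmap UUID.
self_current=${self%%:*}
peer_current=${peer%%:*}
peer_rest=${peer#*:}
peer_bitmap=${peer_rest%%:*}

if [ "$self_current" = "$peer_current" ]; then
    verdict='same current UUID: no resync expected'
elif [ "$peer_bitmap" = "$self_current" ]; then
    verdict='peer bitmap UUID equals our current UUID: we are behind, sync target'
else
    verdict='not covered by this sketch'
fi
echo "$verdict"
```

Note also that the peer reports bits:0 while this node reports 669554939 set bits, which matches the "No resync, but 669554939 bits in bitmap!" warning logged at the first handshake.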

Regards,
-Jay
