A good way to implement failover is to make the Namenode log transactions to
more than one directory, typically a local directory and an NFS-mounted
directory. The Namenode writes transactions to both directories
synchronously.
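To make "synchronously" concrete, here is a rough Python sketch. It is not Namenode code: the function name log_transaction, the file name "edits", and the record format are all made up for illustration. The point is only that a transaction is treated as committed after it has been flushed to every configured directory.

```python
import os
import tempfile


def log_transaction(record, edit_dirs, filename="edits"):
    """Append one record to the log file in every directory.

    Returns only after the record has been flushed and fsynced in all
    directories. If any directory fails, the exception propagates and
    the caller must not treat the transaction as committed.
    (Hypothetical helper, for illustration only.)
    """
    for d in edit_dirs:
        path = os.path.join(d, filename)
        with open(path, "ab") as f:
            f.write(record + b"\n")
            f.flush()
            os.fsync(f.fileno())  # force the bytes out to the device or NFS server


if __name__ == "__main__":
    local_dir = tempfile.mkdtemp(prefix="local-")
    nfs_dir = tempfile.mkdtemp(prefix="nfs-")  # stand-in for an NFS mount
    log_transaction(b"mkdir /user/demo", [local_dir, nfs_dir])
    log_transaction(b"create /user/demo/part-0", [local_dir, nfs_dir])
    for d in (local_dir, nfs_dir):
        with open(os.path.join(d, "edits"), "rb") as f:
            print(d, f.read())
```

Because each call returns only after every copy is on disk, the NFS copy is never behind the local one for any transaction that was acknowledged.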
If the Namenode machine dies, copy the fsimage and fsedits from the NFS
server and you will have recovered *all* committed transactions.
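The copy step could be scripted along these lines. This is only a sketch under assumptions: recover_from_backup is a hypothetical helper, the file names default to the ones used in this mail, and the directory paths are whatever your own setup uses.

```python
import os
import shutil
import tempfile


def recover_from_backup(backup_dir, new_name_dir, files=("fsimage", "fsedits")):
    """Copy the image and the edit log from the backup (NFS) directory
    into a fresh name directory and return the destination paths.
    (Hypothetical helper; file names are assumptions taken from this mail.)
    """
    # Check everything is present first, so a partial copy is never left behind.
    for name in files:
        src = os.path.join(backup_dir, name)
        if not os.path.isfile(src):
            raise FileNotFoundError("missing in backup: " + src)
    os.makedirs(new_name_dir, exist_ok=True)
    copied = []
    for name in files:
        dst = os.path.join(new_name_dir, name)
        shutil.copy2(os.path.join(backup_dir, name), dst)  # keeps timestamps
        copied.append(dst)
    return copied


if __name__ == "__main__":
    backup = tempfile.mkdtemp(prefix="nfs-")  # stand-in for the NFS directory
    for name in ("fsimage", "fsedits"):
        with open(os.path.join(backup, name), "wb") as f:
            f.write(b"demo " + name.encode())
    target = os.path.join(tempfile.mkdtemp(prefix="new-"), "name")
    print(recover_from_backup(backup, target))
```

After the copy, you would start the Namenode on the new machine against that directory.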
The SecondaryNamenode pulls the fsimage and fsedits once every configured
period, typically ranging from a few minutes to an hour. If you use the
image from the SecondaryNamenode, you might lose the last few minutes of
transactions.
Thanks
dhruba
On 7/20/07 9:53 AM, "Doug Cutting" <cutting@...> wrote:
>> So far I learned that the secondary namenode keeps refreshing
>> periodically its backup copies of fsimage and editlog files, and if the
>> primary namenode disappears, it's the responsibility of the cluster
>> admin to notice this, shut down the cluster, switch the configs across
>> the cluster to point to the secondary namenode, start a primary namenode
>> on the secondary namenode's host, and restart the rest of the daemons.
>
> If you use DNS to switch the namenode from the primary to the secondary,
> then no configuration changes or other daemon restarts are required. I
> think that is the best practice.