This is version 2. It does not address Rick's recent mail in which
he talked about using NFS4ERR_EXPIRED for the expired clientid as well.
It looks like there will be a version 3.
I'm not sure if we have "seems like consensus" (SLC), as Spencer
was looking for. Perhaps it is "minimally like consensus" (MLC)
and thus less reliable. In the interests of getting to a 'clean'
proposal I've come up with a list of specification changes keyed
INTRO -- Above
TOC -- This list of stuff in here
NOTES -- Discussion of remaining issues and things about which we
may need to get to consensus or at least something that seems
DIFFS -- Discussion of changes from version one.
CURRENT -- current proposal for changes to deal with persistent
delegations plus all the stuff that I had to fix/correct/
clean up to address in connection with those
9.6.3 -- Discussion about correcting current (bis-16) organization
Most of what is in (CURRENT) has been mentioned in my previous sketch of
a proposal and if there is no actual consensus, at least there does not
seem to have been any dissension. If that's not SLC, then at least it
Here is a list of other stuff that is either in the proposal or where
it has been suggested as something that should be in it. The items will
be identified by the proposer and an integer.
DAVE0 -- IN-CURRENT MLC-AWAITING-SLC
treat delegation revocation after lease expiration in a fashion like
courtesy locks, rather than entirely separate.
Made this change mainly because I couldn't understand what is there and
had to write something that I could understand.
My impression is that Rick would agree but we haven't reached seems-
like-consensus. As of v1, nobody had seen this text yet.
Can rework if there is disagreement.
RICK0 -- IN-CURRENT AWAITING-SLC
saying that when a delegation is revoked after lease expiration
but all locks are not released (no lease cancellation),
NFS4ERR_BAD_STATEID should be returned.
I agree that it is consistent with DAVE0.
Heard no other opinions.
RICK1 -- IN-CURRENT AWAITING-SLC
saying that when a delegation is revoked and there is no lease
expiration, NFS4ERR_ADMIN_REVOKED is returned.
I'm OK with this but think that we can only do it if there are some
other spec changes to go with it, e.g. DAVE1.
Heard no other opinions.
DAVE1 -- IN-CURRENT AWAITING-SLC
changing the definition or name of NFS4ERR_ADMIN_REVOKED to be
consistent with RICK1.
It seems to me that this is a requirement for RICK1, but I'm not sure
if Rick agrees.
Heard no other opinions on this.
DAVE2 -- NOW-OUT WRONG
saying that NFS4ERR_EXPIRED is not the signal for lease
cancellation (uses invalidation of the clientid instead)
name    Presence      Consensus or an incredible facsimile
        v1    v2      V1              V2
DAVE0   IN    IN      AWAITING-SLC    AWAITING-SLC
DAVE1   OUT   IN      AWAITING-SLC    AWAITING-SLC
RICK0   OUT   IN      AWAITING-SLC    AWAITING-SLC
RICK1   OUT   IN      AWAITING-SLC    AWAITING-SLC
DAVE2   IN    OUT     AWAITING-SLC    WRONG
=== Other changes
revision of error descriptions.
making it clear you can do CLAIM_DELEGATE_PREV reclaims during grace period.
explicitly discussing delegations in the "courtesy locks" section.
===> In section 184.108.40.206, in the third set of bulleted items, replace
===> the fourth bulleted item therein by:
o If the stateid represents revoked state or state lost as a result
of lease expiration, then return NFS4ERR_EXPIRED,
NFS4ERR_BAD_STATEID, or NFS4ERR_ADMIN_REVOKED, as appropriate.
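
One possible reading of "as appropriate", based on the DAVE0, RICK0 and
RICK1 items earlier in this mail, is sketched below in Python. The function
name and its two boolean parameters are hypothetical and are used only for
illustration; they are not part of the proposed specification text.

```python
from enum import Enum


class Status(Enum):
    NFS4ERR_EXPIRED = "NFS4ERR_EXPIRED"
    NFS4ERR_BAD_STATEID = "NFS4ERR_BAD_STATEID"
    NFS4ERR_ADMIN_REVOKED = "NFS4ERR_ADMIN_REVOKED"


def error_for_lost_state(lease_cancelled: bool, lease_expired: bool) -> Status:
    """Pick the error for a stateid whose state was revoked or lost.

    lease_cancelled: the server cancelled the lease and freed all locks.
    lease_expired:   the lease period ran out (courtesy-lock territory).
    Both parameters are illustrative assumptions.
    """
    if lease_cancelled:
        # All lock state for the client is gone.
        return Status.NFS4ERR_EXPIRED
    if lease_expired:
        # RICK0: revoked after lease expiration, other locks not released.
        return Status.NFS4ERR_BAD_STATEID
    # RICK1: revoked while the lease was still valid.
    return Status.NFS4ERR_ADMIN_REVOKED
```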
===> replace the last paragraph of section 9.1.10, by:
Requiring open confirmation on reclaim-type opens is avoidable
because of the nature of the environments in which such opens are
done. For CLAIM_PREVIOUS opens, this is immediately after server
reboot, so there should be no time for open-owners to be created,
found to be unused, and recycled. For CLAIM_DELEGATE_PREV opens, we
are dealing with either a client reboot situation or a network partition
resulting in deletion of lease state (and returning NFS4ERR_EXPIRED).
A server which supports delegation can be sure that no open-owners
for that client have been recycled since client initialization or
deletion of lease state and thus can ensure that confirmation will
not be required.
===> replace the first paragraph of the second bulleted item in 9.5
with the following:
o Any operation made with a valid clientid (DELEGPURGE, RENEW,
OPEN, LOCK) or a valid stateid (CLOSE, DELEGRETURN, LOCK, LOCKU,
OPEN, OPEN_CONFIRM, OPEN_DOWNGRADE, READ, SETATTR, or WRITE).
In the latter case, the stateid must not be one of the
special stateids consisting of all bits 0 or all bits 1.
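
As a rough illustration of the renewal rule in this bullet, here is a small
Python sketch. The operation lists are copied from the text above; the
16-byte model of a stateid and the function and parameter names are
assumptions made for the example.

```python
# Operation names taken from the proposed text for section 9.5.
CLIENTID_OPS = {"DELEGPURGE", "RENEW", "OPEN", "LOCK"}
STATEID_OPS = {
    "CLOSE", "DELEGRETURN", "LOCK", "LOCKU", "OPEN", "OPEN_CONFIRM",
    "OPEN_DOWNGRADE", "READ", "SETATTR", "WRITE",
}

# Assumption: a stateid is modelled as 16 opaque bytes.
STATEID_LEN = 16
ALL_ZEROS = bytes(STATEID_LEN)
ALL_ONES = b"\xff" * STATEID_LEN


def is_special_stateid(stateid: bytes) -> bool:
    """True for the special stateids of all bits 0 or all bits 1."""
    return stateid in (ALL_ZEROS, ALL_ONES)


def renews_lease(op, clientid_valid=False, stateid=None, stateid_valid=False):
    """Return True if this request would implicitly renew the lease."""
    if op in CLIENTID_OPS and clientid_valid:
        return True
    if op in STATEID_OPS and stateid is not None and stateid_valid:
        # Special stateids never count toward renewal.
        return not is_special_stateid(stateid)
    return False
```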
===> replace the second sentence of the third paragraph of section
During the grace period, clients recover locks and the associated
state by reclaim-type locking requests (i.e., LOCK requests with
reclaim set to true and OPEN operations with the claim types of
CLAIM_PREVIOUS and CLAIM_DELEGATE_PREV).
===> replace the second sentence of section 9.6.3 by the following:
If this occurs, the server may cancel the lease and free all
locks held for the client.
===> Replace the first two paragraphs of section 220.127.116.11 by the
As a courtesy to the client or as an optimization, the server may
continue to hold locks, including delegations, on behalf of a client
for which recent communication has extended beyond the lease period,
delaying the cancellation of the lease. If the server receives a lock
or I/O request that conflicts with one of these courtesy locks or if it
runs out of resources, the server MAY cause lease cancellation to occur at
that time and henceforth return NFS4ERR_EXPIRED when any of the stateids
associated with the freed locks is used. If lease cancellation has not
occurred and the server receives a lock or I/O request that conflicts with
one of the courtesy locks, the requirements are as follows:
o In the case of a courtesy lock which is not a delegation, it MUST free
the courtesy lock and grant the new request.
o In the case of a lock or I/O request which conflicts with a delegation
which is being held as a courtesy lock, the server MAY delay resolution
of the request but MUST NOT reject the request and MUST free the delegation
and grant the new request eventually.
o In the case of a request for a delegation which conflicts with a
delegation which is being held as a courtesy lock, the server MAY
grant the new request or not as it chooses, but if it grants the
conflicting request, the delegation held as a courtesy lock MUST be
If the server does not reboot or cancel the lease before the network
partition is healed, when the original client tries to access a courtesy
lock which was freed, the server SHOULD send back a NFS4ERR_BAD_STATEID
to the client. If the client tries to access a courtesy lock which
was not freed, then the server SHOULD mark all of the courtesy locks as
implicitly being renewed.
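
The three conflict cases listed above (for the situation where lease
cancellation has not occurred) could be modelled roughly as follows. This is
only a sketch: the CourtesyLock class, the resolve_conflict function and its
flags are hypothetical names, and the last case assumes that the old
delegation is freed when the new one is granted.

```python
from dataclasses import dataclass


@dataclass
class CourtesyLock:
    is_delegation: bool
    freed: bool = False


def resolve_conflict(lock, request_is_delegation=False,
                     still_delaying=False, grant_delegation=True):
    """Return 'grant', 'delay', or 'deny' for a request that conflicts
    with a courtesy lock when the lease has not been cancelled."""
    if not lock.is_delegation:
        # Non-delegation courtesy lock: MUST free it and grant the request.
        lock.freed = True
        return "grant"
    if not request_is_delegation:
        # Lock or I/O request against a courtesy delegation: MAY delay,
        # MUST NOT reject, MUST eventually free the delegation and grant.
        if still_delaying:
            return "delay"
        lock.freed = True
        return "grant"
    # Delegation request against a courtesy delegation: server's choice.
    if grant_delegation:
        # Assumption: granting the new delegation frees the old one.
        lock.freed = True
        return "grant"
    # 'deny' here means only that the new delegation is not granted.
    return "deny"
```

Note that there is deliberately no "reject" outcome for lock or I/O
requests; the only choices are to delay or to grant.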
===> After what is now the fourth non-bulleted paragraph of section 18.104.22.168
===> but probably shouldn't be (see the bottom of this mail),
===> add the following paragraph.
The above are directed to CLAIM_PREVIOUS reclaims and not to
CLAIM_DELEGATE_PREV reclaims, which generally do not involve
a server reboot. However, when a server persistently stores
delegation information to support CLAIM_DELEGATE_PREV across
a period in which both client and server are down at the same
time, similar strictures apply.
===> Update the following items in section 22.214.171.124 as follows:
NFS4ERR_EXPIRED:
All locks have been freed as a result of a lease cancellation
which occurred during the partition. The client should use a
SETCLIENTID to recover.
NFS4ERR_ADMIN_REVOKED:
The current lock has been revoked before, during, or after
the partition. The client SHOULD handle this error
as it normally would.
NFS4ERR_BAD_STATEID:
The current lock has been revoked/released during the partition and the
server did not reboot. Other locks MAY still be renewed. The
client need not do a SETCLIENTID and instead SHOULD probe
via a RENEW call.
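
A client-side dispatch for these three items might look something like the
sketch below. It takes the items to describe NFS4ERR_EXPIRED,
NFS4ERR_ADMIN_REVOKED and NFS4ERR_BAD_STATEID respectively, which is an
assumption about the labelling; the action strings and the function name are
hypothetical.

```python
# Assumed mapping from the error seen on a freed lock to the client's
# next step; the action names are illustrative only.
RECOVERY_ACTION = {
    # Lease cancelled, all locks freed: start over with SETCLIENTID.
    "NFS4ERR_EXPIRED": "SETCLIENTID",
    # Lock revoked independently of the partition: normal error handling.
    "NFS4ERR_ADMIN_REVOKED": "HANDLE_NORMALLY",
    # Only this lock was lost; probe the rest of the lease with RENEW.
    "NFS4ERR_BAD_STATEID": "RENEW",
}


def client_reaction(status: str) -> str:
    """Return the recovery step for an error seen on a freed lock."""
    try:
        return RECOVERY_ACTION[status]
    except KeyError:
        raise ValueError(f"not a freed-lock error: {status}") from None
```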
===> Replace section 10.2.1 by the following
There are three situations that delegation recovery must deal with:
o Client reboot or restart
o Server reboot or restart
o Network partition (full or callback-only)
In the event the client reboots or restarts, the confirmation of
a SETCLIENTID done with an nfs_client_id4 with a new verifier4 value
will result in the release of byte-range locks and share reservations.
Delegations, however, may be treated a bit differently.
There will be situations in which delegations will need to be
reestablished after a client reboots or restarts. The reason for
this is the client may have file data stored locally and this data
was associated with the previously held delegations. The client will
need to reestablish the appropriate file state on the server.
To allow for this type of client recovery, the server MAY allow
delegations to be retained after other sorts of locks are released.
This implies that requests from other clients that conflict with these
delegations will need to wait. Because the normal recall process may
require significant time for the client to flush changed state to the
server, other clients need to be prepared for delays that occur because
of a conflicting delegation. In order to give clients a chance to get
through the reboot process during which leases will not be renewed,
the server MAY extend the period for delegation recovery beyond the
typical lease expiration period. For open delegations, such delegations
that are not released are reclaimed using OPEN with a claim type of
CLAIM_DELEGATE_PREV. (See Section 10.5 and Section 15.18 for discussion
of open delegation and the details of OPEN respectively).
A server MAY support a claim type of CLAIM_DELEGATE_PREV, but if it
does, it MUST NOT remove delegations upon SETCLIENTID_CONFIRM, and
instead MUST make them available until the client indicates that
delegation recovery is complete, by doing a DELEGPURGE. The only exception
to this requirement is when the client lease expires.
A server that supports a claim type of CLAIM_DELEGATE_PREV MUST support
the DELEGPURGE operation, and similarly a server that supports
DELEGPURGE MUST support CLAIM_DELEGATE_PREV. A server which does
not support CLAIM_DELEGATE_PREV MUST return NFS4ERR_NOTSUPP if the
client attempts to use that feature or performs a DELEGPURGE operation.
Support for a claim type of CLAIM_DELEGATE_PREV is often referred to
as providing for "client-persistent delegations" in that it allows
use of persistent storage on the client to store data written
by the client, even across a client restart. It should be noted that,
with the optional exception noted below, this feature requires persistent
storage to be used on the client and does not add to persistent storage
requirements on the server.
One good way to think about client-persistent delegations is that for
the most part, they function like "courtesy locks", with special
semantic adjustments to allow them to be retained across a client restart,
which causes all other sorts of locks to be freed. Such locks are generally
not retained across a server restart. The one exception is the
case of simultaneous failure of the client and server and is discussed
When the server indicates support of CLAIM_DELEGATE_PREV (implicitly)
by returning NFS4_OK to DELEGPURGE, a client with a write delegation
can use write-back caching for data to be written to the server, deferring
the write-back until such time as the delegation is recalled, possibly
after intervening client restarts. Similarly, when the server indicates
support of CLAIM_DELEGATE_PREV, a client with a read delegation and
an open-for-write subordinate to that delegation may be sure of the
integrity of its persistently cached copy of the file after a client
restart without specific verification of the change attribute.
When the server reboots or restarts, delegations are reclaimed (using
the OPEN operation with CLAIM_PREVIOUS) in a similar fashion to byte-
range locks and share reservations. However, there is a slight
semantic difference. In the normal case, if the server decides that a
delegation should not be granted, it performs the requested action
(e.g., OPEN) without granting any delegation. For reclaim, the
server grants the delegation but a special designation is applied so
that the client treats the delegation as having been granted but
recalled by the server. Because of this, the client has the duty to
write all modified state to the server and then return the
delegation. This process of handling delegation reclaim reconciles
three principles of the NFSv4 protocol:
o Upon reclaim, a client reporting resources assigned to it by an
earlier server instance must be granted those resources.
o The server has unquestionable authority to determine whether
delegations are to be granted and, once granted, whether they are
to be continued.
o The use of callbacks is not to be depended upon until the client
has proven its ability to receive them.
When a client has more than a single open associated with a
delegation, state for those additional opens can be established using
OPEN operations of type CLAIM_DELEGATE_CUR. When these are used to
establish opens associated with reclaimed delegations, the server
MUST allow them when made within the grace period.
Situations in which there is a series of client and server restarts, with
no restart of both at the same time, are dealt with via a
combination of CLAIM_DELEGATE_PREV and CLAIM_PREVIOUS reclaim cycles.
Persistent storage is needed only on the client. For each server
failure, a CLAIM_PREVIOUS reclaim cycle is done, while for each client
restart, a CLAIM_DELEGATE_PREV reclaim cycle is done.
To deal with the possibility of simultaneous failure of client and
server (e.g., a power outage), the server MAY
persistently store delegation information so that it can respond to
a CLAIM_DELEGATE_PREV reclaim request which it receives from a
restarting client. This is the one case in which persistent
delegation state can be retained across a server restart. A server
is not required to store this information, but if it does do so, it
should do so for write delegations and for read delegations, during
the pendency of which (across multiple client and/or server
instances), some open-for-write was done as part of delegation. When
the space to persistently record such information is limited, the
server should recall delegations in this class in preference to
keeping them active without persistent storage recording.
When a network partition occurs, delegations are subject to freeing
by the server when the lease renewal period expires. This is similar
to the behavior for locks and share reservations, and, as for locks
and share reservations it may be modified by support for "courtesy
locks" in which locks are not freed in the absence of a conflicting
lock request. Whereas, for locks and share reservations, freeing of
locks will occur immediately upon the appearance of a conflicting
request, for delegations, the server may institute a period during which
conflicting requests are held off. Eventually the occurrence of a
conflicting request from another client will cause revocation of the
A loss of the callback path (e.g., by later network configuration
change) will have a similar effect in that it can also result in
revocation of a delegation. A recall request will fail and revocation
of the delegation will result.
A client normally finds out about revocation of a delegation when it
uses a stateid associated with a delegation and receives one of the
errors NFS4ERR_EXPIRED, NFS4ERR_BAD_STATEID, or NFS4ERR_ADMIN_REVOKED
(NFS4ERR_EXPIRED indicates that all lock state associated with the
client has been lost). It also may find out about delegation revocation
after a client reboot when it attempts to reclaim a delegation and
receives NFS4ERR_EXPIRED. Note that in the case of a revoked
OPEN_DELEGATE_WRITE delegation, there are issues because data may
have been modified by the client whose delegation is revoked and
separately by other clients. See Section 10.5.1 for a discussion of
such issues. Note also that when delegations are revoked,
information about the revoked delegation will be written by the
server to stable storage (as described in Section 9.6). This is done
to deal with the case in which a server reboots after revoking a
delegation but before the client holding the revoked delegation is
notified about the revocation.
Note that when there is a loss of a delegation, due to a network
partition in which all locks associated with the lease are lost,
the client will also receive the error NFS4ERR_EXPIRED. This case
can be distinguished from other situations in which delegations
are revoked by seeing that the associated clientid becomes invalid
so that NFS4ERR_STALE_CLIENTID is returned when it is used.
When NFS4ERR_EXPIRED is returned, the server MAY retain information
about the delegations held by the client, deleting those that are
invalidated by a conflicting request. Retaining such information
will allow the client to recover all non-invalidated delegations
using the claim type CLAIM_DELEGATE_PREV, once the
SETCLIENTID_CONFIRM is done to recover. Attempted recovery of
a delegation that the server has no record of, typically because it
was invalidated by a conflicting request, will get the error
NFS4ERR_BAD_RECLAIM. Once a reclaim is attempted for all delegations
that the client held, it SHOULD do a DELEGPURGE to allow any remaining
server delegation information to be freed.
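
The recovery sequence described in this paragraph can be sketched as below,
under the assumption that SETCLIENTID and SETCLIENTID_CONFIRM have already
completed. ToyServer and recover_delegations are hypothetical stand-ins used
only to show the ordering of reclaims and the final DELEGPURGE; they do not
model any real implementation.

```python
class ToyServer:
    """In-memory stand-in for a server that retained some delegations."""

    def __init__(self, retained):
        self.retained = set(retained)   # delegations not yet invalidated
        self.purged = False

    def open_delegate_prev(self, filehandle):
        # Reclaim succeeds only if the server still has a record.
        if filehandle in self.retained:
            self.retained.discard(filehandle)
            return "NFS4_OK"
        return "NFS4ERR_BAD_RECLAIM"

    def delegpurge(self):
        # Client is done reclaiming; leftover records can be freed.
        self.retained.clear()
        self.purged = True
        return "NFS4_OK"


def recover_delegations(server, held):
    """Try CLAIM_DELEGATE_PREV for every delegation the client held,
    then issue DELEGPURGE. Returns (recovered, lost)."""
    recovered, lost = [], []
    for filehandle in held:
        status = server.open_delegate_prev(filehandle)
        if status == "NFS4_OK":
            recovered.append(filehandle)
        else:
            lost.append(filehandle)
    server.delegpurge()
    return recovered, lost
```

In this model, a delegation retained by the server but never reclaimed by
the client is released by the DELEGPURGE at the end.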
NFS4ERR_ADMIN_REVOKED:
A stateid designates locking state of any type that has been revoked
due to administrative interaction, possibly while the lease is valid,
or because a delegation was revoked because of failure to return it,
while the lease was valid.
NFS4ERR_EXPIRED:
A stateid designates locking state of any type that has been
revoked or released due to cancellation of the client's lease, either
immediately upon lease expiration, or following a later request for
a conflicting lock.
===> Note the paragraph with the hanging "CLAIM_DELEGATE_PREV:" in
===> section 15.18.5.
===> In that paragraph, replace "; used after the client reboots:" with
===> the following:
This claim type is for use after a SETCLIENTID_CONFIRM and before
the corresponding DELEGPURGE in two situations: after a client reboot
and after a lease expiration that resulted in loss of all lock
===> after that paragraph, add the following material with indenting that
matches the non-hanging portion of the noted paragraph.
The following errors apply to use of the CLAIM_DELEGATE_PREV claim
o NFS4ERR_NOTSUPP is returned if the server does not support this
o NFS4ERR_INVAL is returned if the reclaim is done at an inappropriate
time, e.g. after DELEGPURGE has been done.
o NFS4ERR_BAD_RECLAIM is returned if the other error conditions do not
apply and the server has no record of the delegation whose reclaim
is being attempted.
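
The order in which a server might check these conditions is sketched below.
The function and its three boolean inputs are hypothetical; the ordering
follows the wording that NFS4ERR_BAD_RECLAIM applies only when the other
error conditions do not.

```python
def check_delegate_prev(supported: bool, purge_done: bool,
                        has_record: bool) -> str:
    """Return the status for an OPEN with claim type CLAIM_DELEGATE_PREV.

    supported:  server implements CLAIM_DELEGATE_PREV (and DELEGPURGE).
    purge_done: the client already issued DELEGPURGE for this recovery.
    has_record: server still has a record of the delegation.
    """
    if not supported:
        return "NFS4ERR_NOTSUPP"
    if purge_done:
        # Reclaim attempted at an inappropriate time.
        return "NFS4ERR_INVAL"
    if not has_record:
        return "NFS4ERR_BAD_RECLAIM"
    return "NFS4_OK"
```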
===> Replace the last two paragraphs of 15.7.4 by the following:
This operation is provided to support clients that record delegation
information on stable storage on the client. In this case,
DELEGPURGE should be issued immediately after doing delegation
recovery (using CLAIM_DELEGATE_PREV) on all delegations known to
the client. Doing so will notify the server that no additional
delegations for the client will be recovered allowing it to free
resources, and avoid delaying other clients who make requests that
conflict with the unrecovered delegations. All clients SHOULD use
DELEGPURGE as part of recovery once it is known that no further
CLAIM_DELEGATE_PREV recovery will be done. This includes clients
that do not record delegation information on stable storage, who would
then do a DELEGPURGE immediately after SETCLIENTID_CONFIRM.
The set of delegations known to the server and the
client may be different. The reasons for this include:
o a client may fail after making a request which resulted in delegation
but before it received the results and committed them to the client's
o a client may fail after deleting its indication that a delegation exists
but before the delegation return is fully processed by the server.
o in the case in which the server and the client restart, the server
may have limited persistent recording of delegation to a subset of
those in existence.
o a client may have only persistently recorded information about a subset of
The server MAY support DELEGPURGE, but its support or non-support should
match that of CLAIM_DELEGATE_PREV:
o A server may support both DELEGPURGE and CLAIM_DELEGATE_PREV
o A server may support neither DELEGPURGE nor CLAIM_DELEGATE_PREV
This fact allows a client starting up to determine if the server is prepared
to support persistent storage of delegation information and thus whether it
may use write-back caching to local persistent storage, relying on
CLAIM_DELEGATE_PREV recovery to allow such changed data to be flushed
safely to the server in the event of client restart.
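
A client could use the DELEGPURGE result at startup along these lines. This
is a sketch only: the function name is hypothetical, and treating any status
other than NFS4_OK or NFS4ERR_NOTSUPP as inconclusive is an assumption of
the example.

```python
def may_use_persistent_writeback(delegpurge_status: str) -> bool:
    """Decide, from the status of a DELEGPURGE sent at startup, whether
    the client may rely on CLAIM_DELEGATE_PREV recovery."""
    if delegpurge_status == "NFS4_OK":
        # Support for DELEGPURGE implies support for CLAIM_DELEGATE_PREV.
        return True
    if delegpurge_status == "NFS4ERR_NOTSUPP":
        # Neither operation is supported; do not cache across restarts.
        return False
    # Assumption: any other status tells us nothing about support.
    raise ValueError(f"inconclusive DELEGPURGE status: {delegpurge_status}")
```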
===> The section organization of 9.6.3 is messed up. See below.
===> The big issue is that all of the edge condition stuff is part of
===> 'courtesy locks' when it shouldn't be
===> Here's what exists now.
9.6.3. Network Partitions and Recovery
9.6.3.1. Courtesy Locks
9.6.3.1.1. First Server Edge Condition
9.6.3.1.2. Second Server Edge Condition
9.6.3.1.3. Handling Server Edge Conditions
9.6.3.1.4. Client Edge Condition
9.6.3.1.5. Client's Handling of NFS4ERR_NO_GRACE
9.6.3.2. Client's Reaction to a Freed Lock
===> Here's a suggested replacement:
9.6.3. Network Partitions and Recovery
9.6.3.1. Courtesy Locks
9.6.3.2. Client's Reaction to a Freed Lock
9.6.3.3. Edge Conditions (NEW)
9.6.3.3.1. First Server Edge Condition
9.6.3.3.2. Second Server Edge Condition
9.6.3.3.3. Handling Server Edge Conditions
9.6.3.3.4. Client Edge Condition
9.6.3.4. Client's Handling of NFS4ERR_NO_GRACE