Re: [HACKERS] Subtransaction commits and Hot Standby

View: New views
3 Messages — Rating Filter:   Alert me  

Parent Message unknown Re: [HACKERS] Subtransaction commits and Hot Standby

by Simon Riggs :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


On Thu, 2008-09-18 at 15:59 +0100, Simon Riggs wrote:

> On Tue, 2008-09-16 at 10:11 -0400, Alvaro Herrera wrote:
>
> > I wonder if the improved clog API required to mark multiple
> > transactions as committed at once would be also useful to
> > TransactionIdCommitTree which is used in regular transaction commit.
>
> I've hacked together this concept patch (WIP).
>
> Not fully tested yet, but it gives a flavour of the API rearrangements
> required for atomic clog updates. It passes make check, but that's not
> saying enough for a serious review yet. I expect to pick this up again
> next week.
I've tested this some more and am much happier with it now.
Also added README details; there are no user interface or behaviour
changes.

The patch removes the need for RecordSubTransactionCommit() which

* reduces response times of subtransaction queries because we are able
to apply these changes in batches at commit time. This requires a
batch-style API that now works atomically, so there is much change in
transam.c

* reduces the path length for visibility tests for all users viewing
concurrent subtransaction activity since we are much less likely to
waste time following a long trail to an uncommitted higher-level
transaction

* removes the need for additional WAL logging to implement
subtransaction commits for Hot Standby

So half the patch is refactoring, half rearranging of clog access
functions to support batched-access.

An early review would greatly help my work on Hot Standby. Thanks.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support

[atomic_subxids.v3a.patch]

Index: src/backend/access/transam/README
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/access/transam/README,v
retrieving revision 1.11
diff -c -r1.11 README
*** src/backend/access/transam/README 21 Mar 2008 13:23:28 -0000 1.11
--- src/backend/access/transam/README 23 Sep 2008 21:23:02 -0000
***************
*** 342,351 ****
  an XID.  A transaction can be in progress, committed, aborted, or
  "sub-committed".  This last state means that it's a subtransaction that's no
  longer running, but its parent has not updated its state yet (either it is
! still running, or the backend crashed without updating its status).  A
! sub-committed transaction's status will be updated again to the final value as
! soon as the parent commits or aborts, or when the parent is detected to be
! aborted.
 
  Savepoints are implemented using subtransactions.  A subtransaction is a
  transaction inside a transaction; its commit or abort status is not only
--- 342,360 ----
  an XID.  A transaction can be in progress, committed, aborted, or
  "sub-committed".  This last state means that it's a subtransaction that's no
  longer running, but its parent has not updated its state yet (either it is
! still running, or the backend crashed without updating its status).  Prior
! to 8.4 we updated the status to sub-committed in clog as soon as
! sub-commit had happened.  It was later realised that this is not actually
! required for any purpose and the action can be deferred until transaction
! commit. The main role of marking transactions as sub-committed is to
! provide an atomic commit protocol when transaction status is spread across
! multiple clog pages. As a result whenever transaction status spreads
! across multiple pages we must use a two-phase commit protocol. The first
! phase is to mark the subtransactions as sub-committed, then we mark the
! top level transaction and all its subtransactions committed (in that order).
! So in 8.4 sub-committed state still exists, but as a transitory state as
! part of final commit. Subtransaction abort is always marked in clog as
! soon as it occurs, to allow locks to be released.
 
  Savepoints are implemented using subtransactions.  A subtransaction is a
  transaction inside a transaction; its commit or abort status is not only
Index: src/backend/access/transam/clog.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/access/transam/clog.c,v
retrieving revision 1.47
diff -c -r1.47 clog.c
*** src/backend/access/transam/clog.c 1 Aug 2008 13:16:08 -0000 1.47
--- src/backend/access/transam/clog.c 23 Sep 2008 20:41:17 -0000
***************
*** 80,89 ****
  static bool CLOGPagePrecedes(int page1, int page2);
  static void WriteZeroPageXlogRec(int pageno);
  static void WriteTruncateXlogRec(int pageno);
!
!
! /*
!  * Record the final state of a transaction in the commit log.
   *
   * lsn must be the WAL location of the commit record when recording an async
   * commit. For a synchronous commit it can be InvalidXLogRecPtr, since the
--- 80,105 ----
  static bool CLOGPagePrecedes(int page1, int page2);
  static void WriteZeroPageXlogRec(int pageno);
  static void WriteTruncateXlogRec(int pageno);
! static void TransactionIdSetPageStatus(TransactionId xid, int nsubxids,
! TransactionId *subxids, XidStatus status, XLogRecPtr lsn, int pageno, bool subcommitted);
! static void TransactionIdSetStatusBit(TransactionId xid, XidStatus status,
! XLogRecPtr lsn, int slotno);
!
! /*
!  * TransactionIdSetTreeStatus
!  *
!  * Record the final state of transaction entries in the commit log for
!  * a transaction and its subtransaction tree. Take care to ensure this is
!  * both atomic and efficient. Prior to 8.4, this capability was provided
!  * by the non-atomic TransactionIdSetStatus, which is replaced by this
!  * new atomic version.
!  *
!  * xid is a single xid to set status for. This will typically be
!  * the top level transactionid for a top level commit or abort. It can
!  * also be a subtransaction when we record transaction aborts.
!  *
!  * subxids is an array of xids of length nsubxids, representing subtransactions
!  * in the tree of xid. In various cases nsubxids may be zero.
   *
   * lsn must be the WAL location of the commit record when recording an async
   * commit. For a synchronous commit it can be InvalidXLogRecPtr, since the
***************
*** 91,107 ****
   * should be InvalidXLogRecPtr for abort cases, too.
   *
   * NB: this is a low-level routine and is NOT the preferred entry point
!  * for most uses; TransactionLogUpdate() in transam.c is the intended caller.
   */
  void
! TransactionIdSetStatus(TransactionId xid, XidStatus status, XLogRecPtr lsn)
  {
- int pageno = TransactionIdToPage(xid);
- int byteno = TransactionIdToByte(xid);
- int bshift = TransactionIdToBIndex(xid) * CLOG_BITS_PER_XACT;
  int slotno;
! char   *byteptr;
! char byteval;
 
  Assert(status == TRANSACTION_STATUS_COMMITTED ||
    status == TRANSACTION_STATUS_ABORTED ||
--- 107,221 ----
   * should be InvalidXLogRecPtr for abort cases, too.
   *
   * NB: this is a low-level routine and is NOT the preferred entry point
!  * for most uses; functions in in transam.c are the intended callers.
!  *
!  * Note that no lock requests are made at this level, only lower functions.
!  *
!  * XXX Think about issuing FADVISE_WILLNEED on pages that we will need,
!  * but aren't yet in cache, as well as hinting pages not to fall out of
!  * cache yet.
   */
  void
! TransactionIdSetTreeStatus(TransactionId xid, int nsubxids,
! TransactionId *subxids, XidStatus status, XLogRecPtr lsn)
! {
! int pageno = TransactionIdToPage(xid); /* get page of parent */
! int i;
! bool subcommitted = false;
!
! /*
! * See how many subxids, if any, are on the same page as the parent, if any.
! */
! for (i = 0; i < nsubxids; i++)
! {
! if (TransactionIdToPage(subxids[i]) != pageno)
! break;
! }
!
! /*
! * Do all items fit on a single page?
! */
! if (i == nsubxids)
! {
! /*
! * Set the parent and any subtransactions on same page as it
! */
! TransactionIdSetPageStatus(xid, nsubxids, subxids, status, lsn,
! pageno, subcommitted);
! }
! else
! {
! int start_subxid = 0;
! int num_on_page = 1;
!
! if (status == TRANSACTION_STATUS_COMMITTED)
! {
! /*
! * If this is a commit then we care about doing this atomically.
! * By here, we know we're updating more than one page of clog,
! * so we must mark entries that are not on the first page so
! * that they show as subcommitted before we then return to
! * update the status to fully committed.
! * We don't mark the first page, because we will be doing that
! * part when we access the first page in our next step.
! */
! TransactionIdSetPageStatus(InvalidTransactionId,
! nsubxids - i, subxids + i,
! TRANSACTION_STATUS_SUB_COMMITTED, lsn, pageno,
! false /* this setting never relevant */);
! }
!
! /*
! * Now set the parent and any subtransactions on same page as it
! */
! TransactionIdSetPageStatus(xid, i, subxids, status, lsn,
! pageno, subcommitted);
!
! /*
! * By now, all subtransactions have been subcommitted, if relevant
! */
! subcommitted = true;
!
! /*
! * Now work through the rest of the subxids one clog page at a time,
! * starting from next subxid.
! */
! start_subxid = i;
! pageno = TransactionIdToPage(subxids[i]);
! for (; i < nsubxids; i++)
! {
! if (TransactionIdToPage(subxids[i]) != pageno)
! {
! TransactionIdSetPageStatus(InvalidTransactionId,
! num_on_page, subxids + start_subxid,
! status, lsn, pageno, true);
! start_subxid = i;
! num_on_page = 1;
! pageno = TransactionIdToPage(subxids[start_subxid]);
! }
! else
! num_on_page++;
! }
!
! /* Write last page */
! TransactionIdSetPageStatus(InvalidTransactionId,
! num_on_page, subxids + start_subxid,  
! status, lsn, pageno, true);
! }
! }
!
! /*
!  * Record the final state of transaction entries in the commit log for
!  * all entries on *one* page only. Atomic only on this page.
!  *
!  * Otherwise API is same as TransactionIdSetTreeStatus()
!  */
! static void
! TransactionIdSetPageStatus(TransactionId xid, int nsubxids,
! TransactionId *subxids, XidStatus status, XLogRecPtr lsn, int pageno, bool subcommitted)
  {
  int slotno;
! int i;
 
  Assert(status == TRANSACTION_STATUS_COMMITTED ||
    status == TRANSACTION_STATUS_ABORTED ||
***************
*** 116,124 ****
  * mustn't let it reach disk until we've done the appropriate WAL flush.
  * But when lsn is invalid, it's OK to scribble on a page while it is
  * write-busy, since we don't care if the update reaches disk sooner than
! * we think.  Hence, pass write_ok = XLogRecPtrIsInvalid(lsn).
  */
  slotno = SimpleLruReadPage(ClogCtl, pageno, XLogRecPtrIsInvalid(lsn), xid);
  byteptr = ClogCtl->shared->page_buffer[slotno] + byteno;
 
  /* Current state should be 0, subcommitted or target state */
--- 230,284 ----
  * mustn't let it reach disk until we've done the appropriate WAL flush.
  * But when lsn is invalid, it's OK to scribble on a page while it is
  * write-busy, since we don't care if the update reaches disk sooner than
! * we think.  
  */
  slotno = SimpleLruReadPage(ClogCtl, pageno, XLogRecPtrIsInvalid(lsn), xid);
+
+ /*
+ * If we synch commit more than one xid on this page while write busy
+ * we might find that some of the bits go to disk and others don't.
+ * That would break atomicity, so if we haven't already subcommitted
+ * the xids for this commit, we do that first and then come back
+ * to start marking commits. If using async commit then we already
+ * waited for the write I/O to complete by this point, so nothing to do.
+ */
+ if (status == TRANSACTION_STATUS_COMMITTED && !subcommitted && XLogRecPtrIsInvalid(lsn))
+ {
+ for (i = 0; i < nsubxids; i++)
+ {
+ Assert(ClogCtl->shared->page_number[slotno] == TransactionIdToPage(subxids[i]));
+ TransactionIdSetStatusBit(subxids[i],
+ TRANSACTION_STATUS_SUB_COMMITTED, lsn, slotno);
+ }
+ }
+
+ /* Set the main transaction id, if any */
+ if (TransactionIdIsValid(xid))
+ TransactionIdSetStatusBit(xid, status, lsn, slotno);
+
+ /* Set the subtransactions on this page only */
+ for (i = 0; i < nsubxids; i++)
+ {
+ Assert(ClogCtl->shared->page_number[slotno] == TransactionIdToPage(subxids[i]));
+ TransactionIdSetStatusBit(subxids[i], status, lsn, slotno);
+ }
+
+ ClogCtl->shared->page_dirty[slotno] = true;
+
+ LWLockRelease(CLogControlLock);
+ }
+
+ /*
+  * Must be called with CLogControlLock held
+  */
+ static void
+ TransactionIdSetStatusBit(TransactionId xid, XidStatus status, XLogRecPtr lsn, int slotno)
+ {
+ int byteno = TransactionIdToByte(xid);
+ int bshift = TransactionIdToBIndex(xid) * CLOG_BITS_PER_XACT;
+ char   *byteptr;
+ char byteval;
+
  byteptr = ClogCtl->shared->page_buffer[slotno] + byteno;
 
  /* Current state should be 0, subcommitted or target state */
***************
*** 132,139 ****
  byteval |= (status << bshift);
  *byteptr = byteval;
 
- ClogCtl->shared->page_dirty[slotno] = true;
-
  /*
  * Update the group LSN if the transaction completion LSN is higher.
  *
--- 292,297 ----
***************
*** 149,156 ****
  if (XLByteLT(ClogCtl->shared->group_lsn[lsnindex], lsn))
  ClogCtl->shared->group_lsn[lsnindex] = lsn;
  }
-
- LWLockRelease(CLogControlLock);
  }
 
  /*
--- 307,312 ----
Index: src/backend/access/transam/transam.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/access/transam/transam.c,v
retrieving revision 1.76
diff -c -r1.76 transam.c
*** src/backend/access/transam/transam.c 26 Mar 2008 18:48:59 -0000 1.76
--- src/backend/access/transam/transam.c 23 Sep 2008 20:41:17 -0000
***************
*** 40,54 ****
 
  /* Local functions */
  static XidStatus TransactionLogFetch(TransactionId transactionId);
- static void TransactionLogUpdate(TransactionId transactionId,
- XidStatus status, XLogRecPtr lsn);
 
 
  /* ----------------------------------------------------------------
   * Postgres log access method interface
   *
   * TransactionLogFetch
!  * TransactionLogUpdate
   * ----------------------------------------------------------------
   */
 
--- 40,58 ----
 
  /* Local functions */
  static XidStatus TransactionLogFetch(TransactionId transactionId);
 
 
  /* ----------------------------------------------------------------
   * Postgres log access method interface
   *
   * TransactionLogFetch
!  *
!  * Prior to 8.4, we also had TransactionLogUpdate and
!  * TransactionLogMultiUpdate. These have now been merged
!  * into a single command TransactionIdSetTreeStatus(),
!  * though that is now part of clog.c because of the need
!  * for closer integration with clog code to achieve
!  * atomic clog updates for subtransactions.
   * ----------------------------------------------------------------
   */
 
***************
*** 100,140 ****
  return xidstatus;
  }
 
- /*
-  * TransactionLogUpdate
-  *
-  * Store the new status of a transaction.  The commit record LSN must be
-  * passed when recording an async commit; else it should be InvalidXLogRecPtr.
-  */
- static inline void
- TransactionLogUpdate(TransactionId transactionId,
- XidStatus status, XLogRecPtr lsn)
- {
- /*
- * update the commit log
- */
- TransactionIdSetStatus(transactionId, status, lsn);
- }
-
- /*
-  * TransactionLogMultiUpdate
-  *
-  * Update multiple transaction identifiers to a given status.
-  * Don't depend on this being atomic; it's not.
-  */
- static inline void
- TransactionLogMultiUpdate(int nxids, TransactionId *xids,
-  XidStatus status, XLogRecPtr lsn)
- {
- int i;
-
- Assert(nxids != 0);
-
- for (i = 0; i < nxids; i++)
- TransactionIdSetStatus(xids[i], status, lsn);
- }
-
-
  /* ----------------------------------------------------------------
   * Interface functions
   *
--- 104,109 ----
***************
*** 143,154 ****
   * ========
   *   these functions test the transaction status of
   *   a specified transaction id.
!  *
!  * TransactionIdCommit
!  * TransactionIdAbort
   * ========
   *   these functions set the transaction status
!  *   of the specified xid.
   *
   * See also TransactionIdIsInProgress, which once was in this module
   * but now lives in procarray.c.
--- 112,125 ----
   * ========
   *   these functions test the transaction status of
   *   a specified transaction id.
!  *
!  * TransactionIdCommitTree
!  * TransactionIdAsyncCommitTree
!  * TransactionIdAbortTree
   * ========
   *   these functions set the transaction status
!  *   of the specified transaction tree. As of 8.4, these
!  *   are now atomic so we set the whole tree in a single call.
   *
   * See also TransactionIdIsInProgress, which once was in this module
   * but now lives in procarray.c.
***************
*** 287,374 ****
  return false;
  }
 
-
- /*
-  * TransactionIdCommit
-  * Commits the transaction associated with the identifier.
-  *
-  * Note:
-  * Assumes transaction identifier is valid.
-  */
- void
- TransactionIdCommit(TransactionId transactionId)
- {
- TransactionLogUpdate(transactionId, TRANSACTION_STATUS_COMMITTED,
- InvalidXLogRecPtr);
- }
-
- /*
-  * TransactionIdAsyncCommit
-  * Same as above, but for async commits.  The commit record LSN is needed.
-  */
- void
- TransactionIdAsyncCommit(TransactionId transactionId, XLogRecPtr lsn)
- {
- TransactionLogUpdate(transactionId, TRANSACTION_STATUS_COMMITTED, lsn);
- }
-
- /*
-  * TransactionIdAbort
-  * Aborts the transaction associated with the identifier.
-  *
-  * Note:
-  * Assumes transaction identifier is valid.
-  * No async version of this is needed.
-  */
- void
- TransactionIdAbort(TransactionId transactionId)
- {
- TransactionLogUpdate(transactionId, TRANSACTION_STATUS_ABORTED,
- InvalidXLogRecPtr);
- }
-
- /*
-  * TransactionIdSubCommit
-  * Marks the subtransaction associated with the identifier as
-  * sub-committed.
-  *
-  * Note:
-  * No async version of this is needed.
-  */
- void
- TransactionIdSubCommit(TransactionId transactionId)
- {
- TransactionLogUpdate(transactionId, TRANSACTION_STATUS_SUB_COMMITTED,
- InvalidXLogRecPtr);
- }
-
  /*
   * TransactionIdCommitTree
!  * Marks all the given transaction ids as committed.
!  *
!  * The caller has to be sure that this is used only to mark subcommitted
!  * subtransactions as committed, and only *after* marking the toplevel
!  * parent as committed.  Otherwise there is a race condition against
!  * TransactionIdDidCommit.
   */
  void
! TransactionIdCommitTree(int nxids, TransactionId *xids)
  {
! if (nxids > 0)
! TransactionLogMultiUpdate(nxids, xids, TRANSACTION_STATUS_COMMITTED,
!  InvalidXLogRecPtr);
  }
 
  /*
   * TransactionIdAsyncCommitTree
!  * Same as above, but for async commits.  The commit record LSN is needed.
   */
  void
! TransactionIdAsyncCommitTree(int nxids, TransactionId *xids, XLogRecPtr lsn)
  {
! if (nxids > 0)
! TransactionLogMultiUpdate(nxids, xids, TRANSACTION_STATUS_COMMITTED,
!  lsn);
  }
 
  /*
--- 258,284 ----
  return false;
  }
 
  /*
   * TransactionIdCommitTree
!  * Marks all the given transaction ids as committed, atomically.
   */
  void
! TransactionIdCommitTree(TransactionId xid, int nxids, TransactionId *xids)
  {
! return TransactionIdSetTreeStatus(xid, nxids, xids,
! TRANSACTION_STATUS_COMMITTED, InvalidXLogRecPtr);
  }
 
  /*
   * TransactionIdAsyncCommitTree
!  * Same as above, but for async commits, atomically.  The commit record
!  * LSN is needed.
   */
  void
! TransactionIdAsyncCommitTree(TransactionId xid, int nxids, TransactionId *xids, XLogRecPtr lsn)
  {
! return TransactionIdSetTreeStatus(xid, nxids, xids,
! TRANSACTION_STATUS_COMMITTED, lsn);
  }
 
  /*
***************
*** 379,392 ****
   * will consider all the xacts as not-yet-committed anyway.
   */
  void
! TransactionIdAbortTree(int nxids, TransactionId *xids)
  {
! if (nxids > 0)
! TransactionLogMultiUpdate(nxids, xids, TRANSACTION_STATUS_ABORTED,
!  InvalidXLogRecPtr);
  }
 
-
  /*
   * TransactionIdPrecedes --- is id1 logically < id2?
   */
--- 289,300 ----
   * will consider all the xacts as not-yet-committed anyway.
   */
  void
! TransactionIdAbortTree(TransactionId xid, int nxids, TransactionId *xids)
  {
! TransactionIdSetTreeStatus(xid, nxids, xids,
! TRANSACTION_STATUS_ABORTED, InvalidXLogRecPtr);
  }
 
  /*
   * TransactionIdPrecedes --- is id1 logically < id2?
   */
Index: src/backend/access/transam/twophase.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/access/transam/twophase.c,v
retrieving revision 1.45
diff -c -r1.45 twophase.c
*** src/backend/access/transam/twophase.c 11 Aug 2008 11:05:10 -0000 1.45
--- src/backend/access/transam/twophase.c 23 Sep 2008 20:41:17 -0000
***************
*** 1745,1753 ****
  XLogFlush(recptr);
 
  /* Mark the transaction committed in pg_clog */
! TransactionIdCommit(xid);
! /* to avoid race conditions, the parent must commit first */
! TransactionIdCommitTree(nchildren, children);
 
  /* Checkpoint can proceed now */
  MyProc->inCommit = false;
--- 1745,1751 ----
  XLogFlush(recptr);
 
  /* Mark the transaction committed in pg_clog */
! TransactionIdCommitTree(xid, nchildren, children);
 
  /* Checkpoint can proceed now */
  MyProc->inCommit = false;
***************
*** 1822,1829 ****
  * Mark the transaction aborted in clog.  This is not absolutely necessary
  * but we may as well do it while we are here.
  */
! TransactionIdAbort(xid);
! TransactionIdAbortTree(nchildren, children);
 
  END_CRIT_SECTION();
  }
--- 1820,1826 ----
  * Mark the transaction aborted in clog.  This is not absolutely necessary
  * but we may as well do it while we are here.
  */
! TransactionIdAbortTree(xid, nchildren, children);
 
  END_CRIT_SECTION();
  }
Index: src/backend/access/transam/xact.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/access/transam/xact.c,v
retrieving revision 1.265
diff -c -r1.265 xact.c
*** src/backend/access/transam/xact.c 11 Aug 2008 11:05:10 -0000 1.265
--- src/backend/access/transam/xact.c 23 Sep 2008 20:41:17 -0000
***************
*** 254,260 ****
  static TransactionId RecordTransactionAbort(bool isSubXact);
  static void StartTransaction(void);
 
- static void RecordSubTransactionCommit(void);
  static void StartSubTransaction(void);
  static void CommitSubTransaction(void);
  static void AbortSubTransaction(void);
--- 254,259 ----
***************
*** 952,962 ****
  * Now we may update the CLOG, if we wrote a COMMIT record above
  */
  if (markXidCommitted)
! {
! TransactionIdCommit(xid);
! /* to avoid race conditions, the parent must commit first */
! TransactionIdCommitTree(nchildren, children);
! }
  }
  else
  {
--- 951,957 ----
  * Now we may update the CLOG, if we wrote a COMMIT record above
  */
  if (markXidCommitted)
! TransactionIdCommitTree(xid, nchildren, children);
  }
  else
  {
***************
*** 974,984 ****
  * flushed before the CLOG may be updated.
  */
  if (markXidCommitted)
! {
! TransactionIdAsyncCommit(xid, XactLastRecEnd);
! /* to avoid race conditions, the parent must commit first */
! TransactionIdAsyncCommitTree(nchildren, children, XactLastRecEnd);
! }
  }
 
  /*
--- 969,975 ----
  * flushed before the CLOG may be updated.
  */
  if (markXidCommitted)
! TransactionIdAsyncCommitTree(xid, nchildren, children, XactLastRecEnd);
  }
 
  /*
***************
*** 1156,1191 ****
  s->maxChildXids = 0;
  }
 
- /*
-  * RecordSubTransactionCommit
-  */
- static void
- RecordSubTransactionCommit(void)
- {
- TransactionId xid = GetCurrentTransactionIdIfAny();
-
- /*
- * We do not log the subcommit in XLOG; it doesn't matter until the
- * top-level transaction commits.
- *
- * We must mark the subtransaction subcommitted in the CLOG if it had a
- * valid XID assigned. If it did not, nobody else will ever know about
- * the existence of this subxact.  We don't have to deal with deletions
- * scheduled for on-commit here, since they'll be reassigned to our parent
- * (who might still abort).
- */
- if (TransactionIdIsValid(xid))
- {
- /* XXX does this really need to be a critical section? */
- START_CRIT_SECTION();
-
- /* Record subtransaction subcommit */
- TransactionIdSubCommit(xid);
-
- END_CRIT_SECTION();
- }
- }
-
  /* ----------------------------------------------------------------
   * AbortTransaction stuff
   * ----------------------------------------------------------------
--- 1147,1152 ----
***************
*** 1288,1301 ****
  * waiting for already-aborted subtransactions.  It is OK to do it without
  * having flushed the ABORT record to disk, because in event of a crash
  * we'd be assumed to have aborted anyway.
- *
- * The ordering here isn't critical but it seems best to mark the parent
- * first.  This assures an atomic transition of all the subtransactions to
- * aborted state from the point of view of concurrent
- * TransactionIdDidAbort calls.
  */
! TransactionIdAbort(xid);
! TransactionIdAbortTree(nchildren, children);
 
  END_CRIT_SECTION();
 
--- 1249,1256 ----
  * waiting for already-aborted subtransactions.  It is OK to do it without
  * having flushed the ABORT record to disk, because in event of a crash
  * we'd be assumed to have aborted anyway.
  */
! TransactionIdAbortTree(xid, nchildren, children);
 
  END_CRIT_SECTION();
 
***************
*** 3791,3798 ****
  /* Must CCI to ensure commands of subtransaction are seen as done */
  CommandCounterIncrement();
 
! /* Mark subtransaction as subcommitted */
! RecordSubTransactionCommit();
 
  /* Post-commit cleanup */
  if (TransactionIdIsValid(s->transactionId))
--- 3746,3757 ----
  /* Must CCI to ensure commands of subtransaction are seen as done */
  CommandCounterIncrement();
 
! /*
! * Prior to 8.4 we marked subcommit in clog at this point.
! * We now only perform that step, if required, as part of the
! * atomic update of the whole transaction tree at top level
! * commit or abort.
! */
 
  /* Post-commit cleanup */
  if (TransactionIdIsValid(s->transactionId))
***************
*** 4259,4269 ****
  TransactionId max_xid;
  int i;
 
- TransactionIdCommit(xid);
-
  /* Mark committed subtransactions as committed */
  sub_xids = (TransactionId *) &(xlrec->xnodes[xlrec->nrels]);
! TransactionIdCommitTree(xlrec->nsubxacts, sub_xids);
 
  /* Make sure nextXid is beyond any XID mentioned in the record */
  max_xid = xid;
--- 4218,4226 ----
  TransactionId max_xid;
  int i;
 
  /* Mark committed subtransactions as committed */
  sub_xids = (TransactionId *) &(xlrec->xnodes[xlrec->nrels]);
! TransactionIdCommitTree(xid, xlrec->nsubxacts, sub_xids);
 
  /* Make sure nextXid is beyond any XID mentioned in the record */
  max_xid = xid;
***************
*** 4299,4309 ****
  TransactionId max_xid;
  int i;
 
- TransactionIdAbort(xid);
-
  /* Mark subtransactions as aborted */
  sub_xids = (TransactionId *) &(xlrec->xnodes[xlrec->nrels]);
! TransactionIdAbortTree(xlrec->nsubxacts, sub_xids);
 
  /* Make sure nextXid is beyond any XID mentioned in the record */
  max_xid = xid;
--- 4256,4264 ----
  TransactionId max_xid;
  int i;
 
  /* Mark subtransactions as aborted */
  sub_xids = (TransactionId *) &(xlrec->xnodes[xlrec->nrels]);
! TransactionIdAbortTree(xid, xlrec->nsubxacts, sub_xids);
 
  /* Make sure nextXid is beyond any XID mentioned in the record */
  max_xid = xid;
Index: src/include/access/clog.h
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/include/access/clog.h,v
retrieving revision 1.21
diff -c -r1.21 clog.h
*** src/include/access/clog.h 1 Jan 2008 19:45:56 -0000 1.21
--- src/include/access/clog.h 23 Sep 2008 20:41:17 -0000
***************
*** 32,38 ****
  #define NUM_CLOG_BUFFERS 8
 
 
! extern void TransactionIdSetStatus(TransactionId xid, XidStatus status, XLogRecPtr lsn);
  extern XidStatus TransactionIdGetStatus(TransactionId xid, XLogRecPtr *lsn);
 
  extern Size CLOGShmemSize(void);
--- 32,39 ----
  #define NUM_CLOG_BUFFERS 8
 
 
! extern void TransactionIdSetTreeStatus(TransactionId xid, int nsubxids,
! TransactionId *subxids, XidStatus status, XLogRecPtr lsn);
  extern XidStatus TransactionIdGetStatus(TransactionId xid, XLogRecPtr *lsn);
 
  extern Size CLOGShmemSize(void);
Index: src/include/access/transam.h
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/include/access/transam.h,v
retrieving revision 1.65
diff -c -r1.65 transam.h
*** src/include/access/transam.h 11 Mar 2008 20:20:35 -0000 1.65
--- src/include/access/transam.h 23 Sep 2008 20:41:17 -0000
***************
*** 139,151 ****
  extern bool TransactionIdDidCommit(TransactionId transactionId);
  extern bool TransactionIdDidAbort(TransactionId transactionId);
  extern bool TransactionIdIsKnownCompleted(TransactionId transactionId);
- extern void TransactionIdCommit(TransactionId transactionId);
- extern void TransactionIdAsyncCommit(TransactionId transactionId, XLogRecPtr lsn);
  extern void TransactionIdAbort(TransactionId transactionId);
! extern void TransactionIdSubCommit(TransactionId transactionId);
! extern void TransactionIdCommitTree(int nxids, TransactionId *xids);
! extern void TransactionIdAsyncCommitTree(int nxids, TransactionId *xids, XLogRecPtr lsn);
! extern void TransactionIdAbortTree(int nxids, TransactionId *xids);
  extern bool TransactionIdPrecedes(TransactionId id1, TransactionId id2);
  extern bool TransactionIdPrecedesOrEquals(TransactionId id1, TransactionId id2);
  extern bool TransactionIdFollows(TransactionId id1, TransactionId id2);
--- 139,148 ----
  extern bool TransactionIdDidCommit(TransactionId transactionId);
  extern bool TransactionIdDidAbort(TransactionId transactionId);
  extern bool TransactionIdIsKnownCompleted(TransactionId transactionId);
  extern void TransactionIdAbort(TransactionId transactionId);
! extern void TransactionIdCommitTree(TransactionId xid, int nxids, TransactionId *xids);
! extern void TransactionIdAsyncCommitTree(TransactionId xid, int nxids, TransactionId *xids, XLogRecPtr lsn);
! extern void TransactionIdAbortTree(TransactionId xid, int nxids, TransactionId *xids);
  extern bool TransactionIdPrecedes(TransactionId id1, TransactionId id2);
  extern bool TransactionIdPrecedesOrEquals(TransactionId id1, TransactionId id2);
  extern bool TransactionIdFollows(TransactionId id1, TransactionId id2);



--
Sent via pgsql-patches mailing list (pgsql-patches@...)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches

Re: [HACKERS] Subtransaction commits and Hot Standby

by Simon Riggs :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


On Tue, 2008-09-23 at 22:47 +0100, Simon Riggs wrote:

> I've tested this some more and am much happier with it now.

The concept is fine, but I've found a coding bug in further testing.
Please wait now for new version before review.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


--
Sent via pgsql-patches mailing list (pgsql-patches@...)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches

Re: [HACKERS] Subtransaction commits and Hot Standby

by Simon Riggs :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


On Wed, 2008-09-24 at 13:48 +0100, Simon Riggs wrote:
> On Tue, 2008-09-23 at 22:47 +0100, Simon Riggs wrote:
>
> > I've tested this some more and am much happier with it now.
>
> The concept is fine, but I've found a coding bug in further testing.
> Please wait now for new version before review.

OK, spent long time testing various batching scenarios for this using a
custom test harness to simulate various spreads of xids in transaction
trees. All looks fine now.

The main work is done in new clog.c functions:
TransactionIdSetTreeStatus() which sets whole tree atomically by calling
TransactionIdSetPageStatus(), which in turn calls
TransactionIdSetStatusBit() for each xid status change.

TransactionIdSetPageStatus() performs locking and handles write_ok
problem, as did code it replaces. TransactionIdSetPageStatus() is called
theoretical minimum number of times for any transaction tree.

Patch slightly fumbles diff-ing new and replacement code, so there are
two chunks that appear to show I'm removing locking. I'm not!!

Everything else is just API changes.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support

[atomic_subxids.v4.patch]

Index: src/backend/access/transam/README
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/access/transam/README,v
retrieving revision 1.11
diff -c -r1.11 README
*** src/backend/access/transam/README 21 Mar 2008 13:23:28 -0000 1.11
--- src/backend/access/transam/README 24 Sep 2008 17:33:23 -0000
***************
*** 342,351 ****
  an XID.  A transaction can be in progress, committed, aborted, or
  "sub-committed".  This last state means that it's a subtransaction that's no
  longer running, but its parent has not updated its state yet (either it is
! still running, or the backend crashed without updating its status).  A
! sub-committed transaction's status will be updated again to the final value as
! soon as the parent commits or aborts, or when the parent is detected to be
! aborted.
 
  Savepoints are implemented using subtransactions.  A subtransaction is a
  transaction inside a transaction; its commit or abort status is not only
--- 342,360 ----
  an XID.  A transaction can be in progress, committed, aborted, or
  "sub-committed".  This last state means that it's a subtransaction that's no
  longer running, but its parent has not updated its state yet (either it is
! still running, or the backend crashed without updating its status).  Prior
! to 8.4 we updated the status to sub-committed in clog as soon as
! sub-commit had happened.  It was later realised that this is not actually
! required for any purpose and the action can be deferred until transaction
! commit. The main role of marking transactions as sub-committed is to
! provide an atomic commit protocol when transaction status is spread across
! multiple clog pages. As a result whenever transaction status spreads
! across multiple pages we must use a two-phase commit protocol. The first
! phase is to mark the subtransactions as sub-committed, then we mark the
! top level transaction and all its subtransactions committed (in that order).
! So in 8.4 sub-committed state still exists, but as a transitory state as
! part of final commit. Subtransaction abort is always marked in clog as
! soon as it occurs, to allow locks to be released.
 
  Savepoints are implemented using subtransactions.  A subtransaction is a
  transaction inside a transaction; its commit or abort status is not only
Index: src/backend/access/transam/clog.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/access/transam/clog.c,v
retrieving revision 1.47
diff -c -r1.47 clog.c
*** src/backend/access/transam/clog.c 1 Aug 2008 13:16:08 -0000 1.47
--- src/backend/access/transam/clog.c 24 Sep 2008 21:24:02 -0000
***************
*** 80,89 ****
  static bool CLOGPagePrecedes(int page1, int page2);
  static void WriteZeroPageXlogRec(int pageno);
  static void WriteTruncateXlogRec(int pageno);
!
!
! /*
!  * Record the final state of a transaction in the commit log.
   *
   * lsn must be the WAL location of the commit record when recording an async
   * commit. For a synchronous commit it can be InvalidXLogRecPtr, since the
--- 80,105 ----
  static bool CLOGPagePrecedes(int page1, int page2);
  static void WriteZeroPageXlogRec(int pageno);
  static void WriteTruncateXlogRec(int pageno);
! static void TransactionIdSetPageStatus(TransactionId xid, int nsubxids,
! TransactionId *subxids, XidStatus status, XLogRecPtr lsn, int pageno, bool subcommit);
! static void TransactionIdSetStatusBit(TransactionId xid, XidStatus status,
! XLogRecPtr lsn, int slotno);
!
! /*
!  * TransactionIdSetTreeStatus
!  *
!  * Record the final state of transaction entries in the commit log for
!  * a transaction and its subtransaction tree. Take care to ensure this is
!  * both atomic and efficient. Prior to 8.4, this capability was provided
!  * by the non-atomic TransactionIdSetStatus, which is replaced by this
!  * new atomic version.
!  *
!  * xid is a single xid to set status for. This will typically be
!  * the top level transactionid for a top level commit or abort. It can
!  * also be a subtransaction when we record transaction aborts.
!  *
!  * subxids is an array of xids of length nsubxids, representing subtransactions
!  * in the tree of xid. In various cases nsubxids may be zero.
   *
   * lsn must be the WAL location of the commit record when recording an async
   * commit. For a synchronous commit it can be InvalidXLogRecPtr, since the
***************
*** 91,107 ****
   * should be InvalidXLogRecPtr for abort cases, too.
   *
   * NB: this is a low-level routine and is NOT the preferred entry point
!  * for most uses; TransactionLogUpdate() in transam.c is the intended caller.
   */
  void
! TransactionIdSetStatus(TransactionId xid, XidStatus status, XLogRecPtr lsn)
  {
- int pageno = TransactionIdToPage(xid);
- int byteno = TransactionIdToByte(xid);
- int bshift = TransactionIdToBIndex(xid) * CLOG_BITS_PER_XACT;
  int slotno;
! char   *byteptr;
! char byteval;
 
  Assert(status == TRANSACTION_STATUS_COMMITTED ||
    status == TRANSACTION_STATUS_ABORTED ||
--- 107,238 ----
   * should be InvalidXLogRecPtr for abort cases, too.
   *
   * NB: this is a low-level routine and is NOT the preferred entry point
!  * for most uses; functions in in transam.c are the intended callers.
!  *
!  * Note that no lock requests are made at this level, only lower functions.
!  *
!  * XXX Think about issuing FADVISE_WILLNEED on pages that we will need,
!  * but aren't yet in cache, as well as hinting pages not to fall out of
!  * cache yet.
   */
  void
! TransactionIdSetTreeStatus(TransactionId xid, int nsubxids,
! TransactionId *subxids, XidStatus status, XLogRecPtr lsn)
! {
! int pageno = TransactionIdToPage(xid); /* get page of parent */
! int i;
!
! Assert(status == TRANSACTION_STATUS_COMMITTED ||
!   status == TRANSACTION_STATUS_ABORTED);
!
! /*
! * See how many subxids, if any, are on the same page as the parent, if any.
! */
! for (i = 0; i < nsubxids; i++)
! {
! if (TransactionIdToPage(subxids[i]) != pageno)
! break;
! }
!
! /*
! * Do all items fit on a single page?
! */
! if (i == nsubxids)
! {
! /*
! * Set the parent and any subtransactions on same page as it
! */
! TransactionIdSetPageStatus(xid, nsubxids, subxids, status, lsn,
! pageno, true);
! }
! else
! {
! int num_on_first_page = i;
! int num_on_page = 0;
! int offset;
!
! if (status == TRANSACTION_STATUS_COMMITTED)
! {
! /*
! * If this is a commit then we care about doing this atomically.
! * By here, we know we're updating more than one page of clog,
! * so we must mark entries that are *not* on the first page so
! * that they show as subcommitted before we then return to
! * update the status to fully committed.
! * We don't mark the first page because we will be doing that
! * when we mark the main commit, so we wish to avoid touching
! * that page twice.
! */
! num_on_page = 0;
! i = offset = num_on_first_page;
! pageno = TransactionIdToPage(subxids[num_on_first_page]);
! while (i < nsubxids)
! {
! while (TransactionIdToPage(subxids[i]) == pageno && i < nsubxids)
! {
! num_on_page++;
! i++;
! }
!
! TransactionIdSetPageStatus(InvalidTransactionId,
! num_on_page, subxids + offset,
! TRANSACTION_STATUS_SUB_COMMITTED, lsn, pageno, true);
! offset = i;
! num_on_page = 0;
! pageno = TransactionIdToPage(subxids[offset]);
! }
! }
!
! /*
! * Now set the parent and subtransactions on same page as it, if any
! */
! pageno = TransactionIdToPage(xid);
! TransactionIdSetPageStatus(xid, num_on_first_page, subxids, status, lsn,
! pageno, true);
!
! /*
! * By now, all subtransactions have been subcommitted, so all calls
! * to TransactionIdSetPageStatus() will use subcommit=false after
! * this point for this transaction tree.
! */
!
! /*
! * Now work through the rest of the subxids one clog page at a time,
! * starting from the second page onwards, like we did above.
! */
! num_on_page = 0;
! i = offset = num_on_first_page;
! pageno = TransactionIdToPage(subxids[num_on_first_page]);
! while (i < nsubxids)
! {
! while (TransactionIdToPage(subxids[i]) == pageno && i < nsubxids)
! {
! num_on_page++;
! i++;
! }
!
! TransactionIdSetPageStatus(InvalidTransactionId,
! num_on_page, subxids + offset,
! status, lsn, pageno, false);
! offset = i;
! num_on_page = 0;
! pageno = TransactionIdToPage(subxids[offset]);
! }
! }
! }
!
! /*
!  * Record the final state of transaction entries in the commit log for
!  * all entries on *one* page only. Atomic only on this page.
!  *
!  * Otherwise API is same as TransactionIdSetTreeStatus()
!  */
! static void
! TransactionIdSetPageStatus(TransactionId xid, int nsubxids,
! TransactionId *subxids, XidStatus status, XLogRecPtr lsn, int pageno, bool subcommit)
  {
  int slotno;
! int i;
 
  Assert(status == TRANSACTION_STATUS_COMMITTED ||
    status == TRANSACTION_STATUS_ABORTED ||
***************
*** 116,124 ****
  * mustn't let it reach disk until we've done the appropriate WAL flush.
  * But when lsn is invalid, it's OK to scribble on a page while it is
  * write-busy, since we don't care if the update reaches disk sooner than
! * we think.  Hence, pass write_ok = XLogRecPtrIsInvalid(lsn).
  */
  slotno = SimpleLruReadPage(ClogCtl, pageno, XLogRecPtrIsInvalid(lsn), xid);
  byteptr = ClogCtl->shared->page_buffer[slotno] + byteno;
 
  /* Current state should be 0, subcommitted or target state */
--- 247,303 ----
  * mustn't let it reach disk until we've done the appropriate WAL flush.
  * But when lsn is invalid, it's OK to scribble on a page while it is
  * write-busy, since we don't care if the update reaches disk sooner than
! * we think.  
  */
  slotno = SimpleLruReadPage(ClogCtl, pageno, XLogRecPtrIsInvalid(lsn), xid);
+
+ /*
+ * If we synch commit more than one xid on this page while write busy
+ * we might find that some of the bits go to disk and others don't.
+ * That would break atomicity, so if we haven't already subcommitted
+ * the xids for this commit, we do that first and then come back
+ * to start marking commits. If using async commit then we already
+ * waited for the write I/O to complete by this point, so nothing to do.
+ */
+ if (subcommit && status == TRANSACTION_STATUS_COMMITTED &&
+ XLogRecPtrIsInvalid(lsn))
+ {
+ for (i = 0; i < nsubxids; i++)
+ {
+ Assert(ClogCtl->shared->page_number[slotno] == TransactionIdToPage(subxids[i]));
+ TransactionIdSetStatusBit(subxids[i],
+ TRANSACTION_STATUS_SUB_COMMITTED, lsn, slotno);
+ }
+ }
+
+
+ /* Set the main transaction id, if any */
+ if (TransactionIdIsValid(xid))
+ TransactionIdSetStatusBit(xid, status, lsn, slotno);
+
+ /* Set the subtransactions on this page only */
+ for (i = 0; i < nsubxids; i++)
+ {
+ Assert(ClogCtl->shared->page_number[slotno] == TransactionIdToPage(subxids[i]));
+ TransactionIdSetStatusBit(subxids[i], status, lsn, slotno);
+ }
+
+ ClogCtl->shared->page_dirty[slotno] = true;
+
+ LWLockRelease(CLogControlLock);
+ }
+
+ /*
+  * Must be called with CLogControlLock held
+  */
+ static void
+ TransactionIdSetStatusBit(TransactionId xid, XidStatus status, XLogRecPtr lsn, int slotno)
+ {
+ int byteno = TransactionIdToByte(xid);
+ int bshift = TransactionIdToBIndex(xid) * CLOG_BITS_PER_XACT;
+ char   *byteptr;
+ char byteval;
+
  byteptr = ClogCtl->shared->page_buffer[slotno] + byteno;
 
  /* Current state should be 0, subcommitted or target state */
***************
*** 132,139 ****
  byteval |= (status << bshift);
  *byteptr = byteval;
 
- ClogCtl->shared->page_dirty[slotno] = true;
-
  /*
  * Update the group LSN if the transaction completion LSN is higher.
  *
--- 311,316 ----
***************
*** 149,156 ****
  if (XLByteLT(ClogCtl->shared->group_lsn[lsnindex], lsn))
  ClogCtl->shared->group_lsn[lsnindex] = lsn;
  }
-
- LWLockRelease(CLogControlLock);
  }
 
  /*
--- 326,331 ----
Index: src/backend/access/transam/transam.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/access/transam/transam.c,v
retrieving revision 1.76
diff -c -r1.76 transam.c
*** src/backend/access/transam/transam.c 26 Mar 2008 18:48:59 -0000 1.76
--- src/backend/access/transam/transam.c 24 Sep 2008 17:33:23 -0000
***************
*** 40,54 ****
 
  /* Local functions */
  static XidStatus TransactionLogFetch(TransactionId transactionId);
- static void TransactionLogUpdate(TransactionId transactionId,
- XidStatus status, XLogRecPtr lsn);
 
 
  /* ----------------------------------------------------------------
   * Postgres log access method interface
   *
   * TransactionLogFetch
!  * TransactionLogUpdate
   * ----------------------------------------------------------------
   */
 
--- 40,58 ----
 
  /* Local functions */
  static XidStatus TransactionLogFetch(TransactionId transactionId);
 
 
  /* ----------------------------------------------------------------
   * Postgres log access method interface
   *
   * TransactionLogFetch
!  *
!  * Prior to 8.4, we also had TransactionLogUpdate and
!  * TransactionLogMultiUpdate. These have now been merged
!  * into a single command TransactionIdSetTreeStatus(),
!  * though that is now part of clog.c because of the need
!  * for closer integration with clog code to achieve
!  * atomic clog updates for subtransactions.
   * ----------------------------------------------------------------
   */
 
***************
*** 100,140 ****
  return xidstatus;
  }
 
- /*
-  * TransactionLogUpdate
-  *
-  * Store the new status of a transaction.  The commit record LSN must be
-  * passed when recording an async commit; else it should be InvalidXLogRecPtr.
-  */
- static inline void
- TransactionLogUpdate(TransactionId transactionId,
- XidStatus status, XLogRecPtr lsn)
- {
- /*
- * update the commit log
- */
- TransactionIdSetStatus(transactionId, status, lsn);
- }
-
- /*
-  * TransactionLogMultiUpdate
-  *
-  * Update multiple transaction identifiers to a given status.
-  * Don't depend on this being atomic; it's not.
-  */
- static inline void
- TransactionLogMultiUpdate(int nxids, TransactionId *xids,
-  XidStatus status, XLogRecPtr lsn)
- {
- int i;
-
- Assert(nxids != 0);
-
- for (i = 0; i < nxids; i++)
- TransactionIdSetStatus(xids[i], status, lsn);
- }
-
-
  /* ----------------------------------------------------------------
   * Interface functions
   *
--- 104,109 ----
***************
*** 143,154 ****
   * ========
   *   these functions test the transaction status of
   *   a specified transaction id.
!  *
!  * TransactionIdCommit
!  * TransactionIdAbort
   * ========
   *   these functions set the transaction status
!  *   of the specified xid.
   *
   * See also TransactionIdIsInProgress, which once was in this module
   * but now lives in procarray.c.
--- 112,125 ----
   * ========
   *   these functions test the transaction status of
   *   a specified transaction id.
!  *
!  * TransactionIdCommitTree
!  * TransactionIdAsyncCommitTree
!  * TransactionIdAbortTree
   * ========
   *   these functions set the transaction status
!  *   of the specified transaction tree. As of 8.4, these
!  *   are now atomic so we set the whole tree in a single call.
   *
   * See also TransactionIdIsInProgress, which once was in this module
   * but now lives in procarray.c.
***************
*** 287,374 ****
  return false;
  }
 
-
- /*
-  * TransactionIdCommit
-  * Commits the transaction associated with the identifier.
-  *
-  * Note:
-  * Assumes transaction identifier is valid.
-  */
- void
- TransactionIdCommit(TransactionId transactionId)
- {
- TransactionLogUpdate(transactionId, TRANSACTION_STATUS_COMMITTED,
- InvalidXLogRecPtr);
- }
-
- /*
-  * TransactionIdAsyncCommit
-  * Same as above, but for async commits.  The commit record LSN is needed.
-  */
- void
- TransactionIdAsyncCommit(TransactionId transactionId, XLogRecPtr lsn)
- {
- TransactionLogUpdate(transactionId, TRANSACTION_STATUS_COMMITTED, lsn);
- }
-
- /*
-  * TransactionIdAbort
-  * Aborts the transaction associated with the identifier.
-  *
-  * Note:
-  * Assumes transaction identifier is valid.
-  * No async version of this is needed.
-  */
- void
- TransactionIdAbort(TransactionId transactionId)
- {
- TransactionLogUpdate(transactionId, TRANSACTION_STATUS_ABORTED,
- InvalidXLogRecPtr);
- }
-
- /*
-  * TransactionIdSubCommit
-  * Marks the subtransaction associated with the identifier as
-  * sub-committed.
-  *
-  * Note:
-  * No async version of this is needed.
-  */
- void
- TransactionIdSubCommit(TransactionId transactionId)
- {
- TransactionLogUpdate(transactionId, TRANSACTION_STATUS_SUB_COMMITTED,
- InvalidXLogRecPtr);
- }
-
  /*
   * TransactionIdCommitTree
!  * Marks all the given transaction ids as committed.
!  *
!  * The caller has to be sure that this is used only to mark subcommitted
!  * subtransactions as committed, and only *after* marking the toplevel
!  * parent as committed.  Otherwise there is a race condition against
!  * TransactionIdDidCommit.
   */
  void
! TransactionIdCommitTree(int nxids, TransactionId *xids)
  {
! if (nxids > 0)
! TransactionLogMultiUpdate(nxids, xids, TRANSACTION_STATUS_COMMITTED,
!  InvalidXLogRecPtr);
  }
 
  /*
   * TransactionIdAsyncCommitTree
!  * Same as above, but for async commits.  The commit record LSN is needed.
   */
  void
! TransactionIdAsyncCommitTree(int nxids, TransactionId *xids, XLogRecPtr lsn)
  {
! if (nxids > 0)
! TransactionLogMultiUpdate(nxids, xids, TRANSACTION_STATUS_COMMITTED,
!  lsn);
  }
 
  /*
--- 258,284 ----
  return false;
  }
 
  /*
   * TransactionIdCommitTree
!  * Marks all the given transaction ids as committed, atomically.
   */
  void
! TransactionIdCommitTree(TransactionId xid, int nxids, TransactionId *xids)
  {
! return TransactionIdSetTreeStatus(xid, nxids, xids,
! TRANSACTION_STATUS_COMMITTED, InvalidXLogRecPtr);
  }
 
  /*
   * TransactionIdAsyncCommitTree
!  * Same as above, but for async commits, atomically.  The commit record
!  * LSN is needed.
   */
  void
! TransactionIdAsyncCommitTree(TransactionId xid, int nxids, TransactionId *xids, XLogRecPtr lsn)
  {
! return TransactionIdSetTreeStatus(xid, nxids, xids,
! TRANSACTION_STATUS_COMMITTED, lsn);
  }
 
  /*
***************
*** 379,392 ****
   * will consider all the xacts as not-yet-committed anyway.
   */
  void
! TransactionIdAbortTree(int nxids, TransactionId *xids)
  {
! if (nxids > 0)
! TransactionLogMultiUpdate(nxids, xids, TRANSACTION_STATUS_ABORTED,
!  InvalidXLogRecPtr);
  }
 
-
  /*
   * TransactionIdPrecedes --- is id1 logically < id2?
   */
--- 289,300 ----
   * will consider all the xacts as not-yet-committed anyway.
   */
  void
! TransactionIdAbortTree(TransactionId xid, int nxids, TransactionId *xids)
  {
! TransactionIdSetTreeStatus(xid, nxids, xids,
! TRANSACTION_STATUS_ABORTED, InvalidXLogRecPtr);
  }
 
  /*
   * TransactionIdPrecedes --- is id1 logically < id2?
   */
Index: src/backend/access/transam/twophase.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/access/transam/twophase.c,v
retrieving revision 1.45
diff -c -r1.45 twophase.c
*** src/backend/access/transam/twophase.c 11 Aug 2008 11:05:10 -0000 1.45
--- src/backend/access/transam/twophase.c 24 Sep 2008 17:33:23 -0000
***************
*** 1745,1753 ****
  XLogFlush(recptr);
 
  /* Mark the transaction committed in pg_clog */
! TransactionIdCommit(xid);
! /* to avoid race conditions, the parent must commit first */
! TransactionIdCommitTree(nchildren, children);
 
  /* Checkpoint can proceed now */
  MyProc->inCommit = false;
--- 1745,1751 ----
  XLogFlush(recptr);
 
  /* Mark the transaction committed in pg_clog */
! TransactionIdCommitTree(xid, nchildren, children);
 
  /* Checkpoint can proceed now */
  MyProc->inCommit = false;
***************
*** 1822,1829 ****
  * Mark the transaction aborted in clog.  This is not absolutely necessary
  * but we may as well do it while we are here.
  */
! TransactionIdAbort(xid);
! TransactionIdAbortTree(nchildren, children);
 
  END_CRIT_SECTION();
  }
--- 1820,1826 ----
  * Mark the transaction aborted in clog.  This is not absolutely necessary
  * but we may as well do it while we are here.
  */
! TransactionIdAbortTree(xid, nchildren, children);
 
  END_CRIT_SECTION();
  }
Index: src/backend/access/transam/xact.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/access/transam/xact.c,v
retrieving revision 1.265
diff -c -r1.265 xact.c
*** src/backend/access/transam/xact.c 11 Aug 2008 11:05:10 -0000 1.265
--- src/backend/access/transam/xact.c 24 Sep 2008 17:33:23 -0000
***************
*** 254,260 ****
  static TransactionId RecordTransactionAbort(bool isSubXact);
  static void StartTransaction(void);
 
- static void RecordSubTransactionCommit(void);
  static void StartSubTransaction(void);
  static void CommitSubTransaction(void);
  static void AbortSubTransaction(void);
--- 254,259 ----
***************
*** 952,962 ****
  * Now we may update the CLOG, if we wrote a COMMIT record above
  */
  if (markXidCommitted)
! {
! TransactionIdCommit(xid);
! /* to avoid race conditions, the parent must commit first */
! TransactionIdCommitTree(nchildren, children);
! }
  }
  else
  {
--- 951,957 ----
  * Now we may update the CLOG, if we wrote a COMMIT record above
  */
  if (markXidCommitted)
! TransactionIdCommitTree(xid, nchildren, children);
  }
  else
  {
***************
*** 974,984 ****
  * flushed before the CLOG may be updated.
  */
  if (markXidCommitted)
! {
! TransactionIdAsyncCommit(xid, XactLastRecEnd);
! /* to avoid race conditions, the parent must commit first */
! TransactionIdAsyncCommitTree(nchildren, children, XactLastRecEnd);
! }
  }
 
  /*
--- 969,975 ----
  * flushed before the CLOG may be updated.
  */
  if (markXidCommitted)
! TransactionIdAsyncCommitTree(xid, nchildren, children, XactLastRecEnd);
  }
 
  /*
***************
*** 1156,1191 ****
  s->maxChildXids = 0;
  }
 
- /*
-  * RecordSubTransactionCommit
-  */
- static void
- RecordSubTransactionCommit(void)
- {
- TransactionId xid = GetCurrentTransactionIdIfAny();
-
- /*
- * We do not log the subcommit in XLOG; it doesn't matter until the
- * top-level transaction commits.
- *
- * We must mark the subtransaction subcommitted in the CLOG if it had a
- * valid XID assigned. If it did not, nobody else will ever know about
- * the existence of this subxact.  We don't have to deal with deletions
- * scheduled for on-commit here, since they'll be reassigned to our parent
- * (who might still abort).
- */
- if (TransactionIdIsValid(xid))
- {
- /* XXX does this really need to be a critical section? */
- START_CRIT_SECTION();
-
- /* Record subtransaction subcommit */
- TransactionIdSubCommit(xid);
-
- END_CRIT_SECTION();
- }
- }
-
  /* ----------------------------------------------------------------
   * AbortTransaction stuff
   * ----------------------------------------------------------------
--- 1147,1152 ----
***************
*** 1288,1301 ****
  * waiting for already-aborted subtransactions.  It is OK to do it without
  * having flushed the ABORT record to disk, because in event of a crash
  * we'd be assumed to have aborted anyway.
- *
- * The ordering here isn't critical but it seems best to mark the parent
- * first.  This assures an atomic transition of all the subtransactions to
- * aborted state from the point of view of concurrent
- * TransactionIdDidAbort calls.
  */
! TransactionIdAbort(xid);
! TransactionIdAbortTree(nchildren, children);
 
  END_CRIT_SECTION();
 
--- 1249,1256 ----
  * waiting for already-aborted subtransactions.  It is OK to do it without
  * having flushed the ABORT record to disk, because in event of a crash
  * we'd be assumed to have aborted anyway.
  */
! TransactionIdAbortTree(xid, nchildren, children);
 
  END_CRIT_SECTION();
 
***************
*** 3791,3798 ****
  /* Must CCI to ensure commands of subtransaction are seen as done */
  CommandCounterIncrement();
 
! /* Mark subtransaction as subcommitted */
! RecordSubTransactionCommit();
 
  /* Post-commit cleanup */
  if (TransactionIdIsValid(s->transactionId))
--- 3746,3757 ----
  /* Must CCI to ensure commands of subtransaction are seen as done */
  CommandCounterIncrement();
 
! /*
! * Prior to 8.4 we marked subcommit in clog at this point.
! * We now only perform that step, if required, as part of the
! * atomic update of the whole transaction tree at top level
! * commit or abort.
! */
 
  /* Post-commit cleanup */
  if (TransactionIdIsValid(s->transactionId))
***************
*** 4259,4269 ****
  TransactionId max_xid;
  int i;
 
- TransactionIdCommit(xid);
-
  /* Mark committed subtransactions as committed */
  sub_xids = (TransactionId *) &(xlrec->xnodes[xlrec->nrels]);
! TransactionIdCommitTree(xlrec->nsubxacts, sub_xids);
 
  /* Make sure nextXid is beyond any XID mentioned in the record */
  max_xid = xid;
--- 4218,4226 ----
  TransactionId max_xid;
  int i;
 
  /* Mark committed subtransactions as committed */
  sub_xids = (TransactionId *) &(xlrec->xnodes[xlrec->nrels]);
! TransactionIdCommitTree(xid, xlrec->nsubxacts, sub_xids);
 
  /* Make sure nextXid is beyond any XID mentioned in the record */
  max_xid = xid;
***************
*** 4299,4309 ****
  TransactionId max_xid;
  int i;
 
- TransactionIdAbort(xid);
-
  /* Mark subtransactions as aborted */
  sub_xids = (TransactionId *) &(xlrec->xnodes[xlrec->nrels]);
! TransactionIdAbortTree(xlrec->nsubxacts, sub_xids);
 
  /* Make sure nextXid is beyond any XID mentioned in the record */
  max_xid = xid;
--- 4256,4264 ----
  TransactionId max_xid;
  int i;
 
  /* Mark subtransactions as aborted */
  sub_xids = (TransactionId *) &(xlrec->xnodes[xlrec->nrels]);
! TransactionIdAbortTree(xid, xlrec->nsubxacts, sub_xids);
 
  /* Make sure nextXid is beyond any XID mentioned in the record */
  max_xid = xid;
Index: src/include/access/clog.h
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/include/access/clog.h,v
retrieving revision 1.21
diff -c -r1.21 clog.h
*** src/include/access/clog.h 1 Jan 2008 19:45:56 -0000 1.21
--- src/include/access/clog.h 24 Sep 2008 17:33:23 -0000
***************
*** 32,38 ****
  #define NUM_CLOG_BUFFERS 8
 
 
! extern void TransactionIdSetStatus(TransactionId xid, XidStatus status, XLogRecPtr lsn);
  extern XidStatus TransactionIdGetStatus(TransactionId xid, XLogRecPtr *lsn);
 
  extern Size CLOGShmemSize(void);
--- 32,39 ----
  #define NUM_CLOG_BUFFERS 8
 
 
! extern void TransactionIdSetTreeStatus(TransactionId xid, int nsubxids,
! TransactionId *subxids, XidStatus status, XLogRecPtr lsn);
  extern XidStatus TransactionIdGetStatus(TransactionId xid, XLogRecPtr *lsn);
 
  extern Size CLOGShmemSize(void);
Index: src/include/access/transam.h
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/include/access/transam.h,v
retrieving revision 1.65
diff -c -r1.65 transam.h
*** src/include/access/transam.h 11 Mar 2008 20:20:35 -0000 1.65
--- src/include/access/transam.h 24 Sep 2008 17:33:23 -0000
***************
*** 139,151 ****
  extern bool TransactionIdDidCommit(TransactionId transactionId);
  extern bool TransactionIdDidAbort(TransactionId transactionId);
  extern bool TransactionIdIsKnownCompleted(TransactionId transactionId);
- extern void TransactionIdCommit(TransactionId transactionId);
- extern void TransactionIdAsyncCommit(TransactionId transactionId, XLogRecPtr lsn);
  extern void TransactionIdAbort(TransactionId transactionId);
! extern void TransactionIdSubCommit(TransactionId transactionId);
! extern void TransactionIdCommitTree(int nxids, TransactionId *xids);
! extern void TransactionIdAsyncCommitTree(int nxids, TransactionId *xids, XLogRecPtr lsn);
! extern void TransactionIdAbortTree(int nxids, TransactionId *xids);
  extern bool TransactionIdPrecedes(TransactionId id1, TransactionId id2);
  extern bool TransactionIdPrecedesOrEquals(TransactionId id1, TransactionId id2);
  extern bool TransactionIdFollows(TransactionId id1, TransactionId id2);
--- 139,148 ----
  extern bool TransactionIdDidCommit(TransactionId transactionId);
  extern bool TransactionIdDidAbort(TransactionId transactionId);
  extern bool TransactionIdIsKnownCompleted(TransactionId transactionId);
  extern void TransactionIdAbort(TransactionId transactionId);
! extern void TransactionIdCommitTree(TransactionId xid, int nxids, TransactionId *xids);
! extern void TransactionIdAsyncCommitTree(TransactionId xid, int nxids, TransactionId *xids, XLogRecPtr lsn);
! extern void TransactionIdAbortTree(TransactionId xid, int nxids, TransactionId *xids);
  extern bool TransactionIdPrecedes(TransactionId id1, TransactionId id2);
  extern bool TransactionIdPrecedesOrEquals(TransactionId id1, TransactionId id2);
  extern bool TransactionIdFollows(TransactionId id1, TransactionId id2);



--
Sent via pgsql-patches mailing list (pgsql-patches@...)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches