Explaining the block format from the source code

This is an original article by freas_1990. Please credit the source when reposting: http://www.jb51.cc/article/p-dcjqjvyw-yu.html

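I have always wanted to look at Oracle's page (block) format, but never got the chance.

Today, while browsing the PostgreSQL source code, I stumbled on something interesting.

Let's study it together.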

/*
 * a postgres disk page is an abstraction layered on top of a postgres
 * disk block (which is simply a unit of i/o, see block.h).
 *
 * specifically, while a disk block can be unformatted, a postgres
 * disk page is always a slotted page of the form:
 *
 * +----------------+---------------------------------+
 * | PageHeaderData | linp0 linp1 linp2 ...           |
 * +-----------+----+---------------------------------+
 * | ... linpN |                                      |
 * +-----------+--------------------------------------+
 * |           ^ pd_lower                             |
 * |                                                  |
 * |             v pd_upper                           |
 * +-------------+------------------------------------+
 * |             | tupleN ...                         |
 * +-------------+------------------+-----------------+
 * |       ... tuple2 tuple1 tuple0 | "special space" |
 * +--------------------------------+-----------------+
 *                                  ^ pd_special
 *
 * a page is full when nothing can be added between pd_lower and
 * pd_upper.
 *
 * all blocks written out by an access method must be disk pages.
 *
 * EXCEPTIONS:
 *
 * obviously, a page is not formatted before it is initialized by
 * a call to PageInit.
 *
 * the contents of the special pg_variable/pg_time/pg_log tables are
 * raw disk blocks with special formats.  these are the only "access
 * methods" that need not write disk pages.
 *
 * NOTES:
 *
 * linp0..N form an ItemId array.  ItemPointers point into this array
 * rather than pointing directly to a tuple.
 *
 * tuple0..N are added "backwards" on the page.  because a tuple's
 * ItemPointer points to its ItemId entry rather than its actual
 * byte-offset position, tuples can be physically shuffled on a page
 * whenever the need arises.
 *
 * AM-generic per-page information is kept in the pd_opaque field of
 * the PageHeaderData.  (this is currently only the page size.)
 * AM-specific per-page data is kept in the area marked "special
 * space"; each AM has an "opaque" structure defined somewhere that is
 * stored as the page trailer.  an access method should always
 * initialize its pages with PageInit and then set its own opaque
 * fields.
 */
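To make the diagram above more concrete, here is a minimal sketch of a slotted page in plain C. It is not the PostgreSQL code: `ToyPage`, `ToyHeader`, `ToyItemId`, `TOY_PAGE_SIZE` and the `toy_*` functions are all hypothetical names made up for illustration, and the page is only 128 bytes so the numbers stay easy to follow. It only tries to show the idea from the comment: line pointers grow up from the header, tuples grow down from the special space, and the free space is whatever remains between the two.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 128   /* hypothetical; a real page is BLCKSZ bytes */

/* Hypothetical stand-in for an ItemId: where the tuple is and how long it is. */
typedef struct { uint16_t off, len; } ToyItemId;

/* Hypothetical stand-in for the page header. */
typedef struct {
    uint16_t lower;     /* like pd_lower: end of the line pointer array */
    uint16_t upper;     /* like pd_upper: start of the tuple data */
    uint16_t special;   /* like pd_special: start of the special space */
} ToyHeader;

typedef union {
    ToyHeader     hdr;
    unsigned char bytes[TOY_PAGE_SIZE];
} ToyPage;

/* Format an empty page: no line pointers, no tuples, special space at the end. */
void toy_init(ToyPage *p, uint16_t special_size)
{
    assert(special_size < TOY_PAGE_SIZE - sizeof(ToyHeader));
    memset(p->bytes, 0, TOY_PAGE_SIZE);
    p->hdr.lower = (uint16_t) sizeof(ToyHeader);
    p->hdr.upper = (uint16_t) (TOY_PAGE_SIZE - special_size);
    p->hdr.special = p->hdr.upper;
}

/* Free space is the gap between the two cursors. */
uint16_t toy_free_space(const ToyPage *p)
{
    return (uint16_t) (p->hdr.upper - p->hdr.lower);
}

/* Add a tuple; returns its slot number, or -1 if it does not fit. */
int toy_add(ToyPage *p, const void *data, uint16_t len)
{
    ToyItemId id;
    int slot = (int) ((p->hdr.lower - sizeof(ToyHeader)) / sizeof(ToyItemId));

    if (toy_free_space(p) < len + sizeof(ToyItemId))
        return -1;

    /* tuple data is placed "backwards", just below the previous tuple */
    p->hdr.upper = (uint16_t) (p->hdr.upper - len);
    memcpy(p->bytes + p->hdr.upper, data, len);

    /* the new line pointer is appended right after the existing ones */
    id.off = p->hdr.upper;
    id.len = len;
    memcpy(p->bytes + p->hdr.lower, &id, sizeof id);
    p->hdr.lower = (uint16_t) (p->hdr.lower + sizeof id);
    return slot;
}

/* Find a tuple through its slot, never through a raw byte offset. */
const unsigned char *toy_get(const ToyPage *p, int slot, uint16_t *len)
{
    ToyItemId id;

    memcpy(&id, p->bytes + sizeof(ToyHeader) + (size_t) slot * sizeof id,
           sizeof id);
    *len = id.len;
    return p->bytes + id.off;
}
```

Because callers only ever hold a slot number, the bytes of a tuple could be moved elsewhere on the page and only the `off` field of its slot would need updating. That is the "physically shuffled" property the comment mentions.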

Now let's look at how a page (block) is initialized:

/*
 * PageInit --
 *	Initializes the contents of a page.
 */
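void
PageInit(Page page, Size pageSize, Size specialSize)
{
    PageHeader p = (PageHeader) page;

    Assert(pageSize == BLCKSZ);
    Assert(pageSize >
           specialSize + sizeof(PageHeaderData) - sizeof(ItemIdData));

    /* round the special space up to a double-aligned size */
    specialSize = DOUBLEALIGN(specialSize);

    /* no line pointers yet: free space starts right after the fixed header */
    p->pd_lower = sizeof(PageHeaderData) - sizeof(ItemIdData);
    /* no tuples yet: free space ends where the special space begins */
    p->pd_upper = pageSize - specialSize;
    p->pd_special = pageSize - specialSize;
    PageSetPageSize(page, pageSize);
}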
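The arithmetic in PageInit is easier to see with concrete numbers. The sketch below redoes the same three assignments on plain `size_t` values. `ToyOffsets`, `toy_page_offsets` and `TOY_DOUBLEALIGN` are hypothetical names, and the macro assumes that double alignment means rounding up to a multiple of 8 bytes, which is typical but platform dependent. Unlike PageInit, the sketch aligns the special size before the sanity check, so the check sees the size that is actually reserved.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: "double aligned" means rounded up to a multiple of 8 bytes. */
#define TOY_DOUBLEALIGN(len) (((size_t) (len) + 7) & ~(size_t) 7)

typedef struct {
    size_t lower;    /* plays the role of pd_lower */
    size_t upper;    /* plays the role of pd_upper */
    size_t special;  /* plays the role of pd_special */
} ToyOffsets;

/*
 * header_size stands for sizeof(PageHeaderData) - sizeof(ItemIdData),
 * i.e. the fixed part of the header without any line pointer.
 */
ToyOffsets toy_page_offsets(size_t page_size, size_t header_size,
                            size_t special_size)
{
    ToyOffsets o;

    special_size = TOY_DOUBLEALIGN(special_size);
    assert(page_size > special_size + header_size);

    o.lower = header_size;                 /* empty line pointer array */
    o.upper = page_size - special_size;    /* no tuples stored yet */
    o.special = page_size - special_size;  /* special space sits at the end */
    return o;
}
```

For example, with an 8192-byte page, an 8-byte fixed header and 20 bytes of special space requested, the special space is rounded up to 24 bytes, so `upper` and `special` both come out as 8168 and `lower` as 8. On a freshly initialized page `upper` always equals `special`; they only drift apart once tuples are added.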

The PageHeader used here is defined as follows:

/*
 * disk page organization
 */
typedef struct PageHeaderData {
    LocationIndex pd_lower;    /* offset to start of free space */
    LocationIndex pd_upper;    /* offset to end of free space */
    LocationIndex pd_special;  /* offset to start of special space */
    OpaqueData    pd_opaque;   /* AM-generic information */
    ItemIdData    pd_linp[1];  /* line pointers */
} PageHeaderData;

typedef PageHeaderData *PageHeader;
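The one-element `pd_linp[1]` array at the end of the struct is why PageInit computes `sizeof(PageHeaderData) - sizeof(ItemIdData)`: subtracting the single declared element gives the offset at which the line pointer array starts. The sketch below checks this on a look-alike struct. The `Demo*` types and functions are hypothetical, and the field widths (16-bit offsets, a 16-bit opaque field, a 32-bit line pointer) are assumptions for illustration, not taken from the real headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed widths; the real typedefs live in the PostgreSQL headers. */
typedef uint16_t DemoLocationIndex;
typedef struct { uint16_t od_pagesize; } DemoOpaqueData;
typedef struct { uint32_t bits; } DemoItemIdData;

typedef struct {
    DemoLocationIndex pd_lower;
    DemoLocationIndex pd_upper;
    DemoLocationIndex pd_special;
    DemoOpaqueData    pd_opaque;
    DemoItemIdData    pd_linp[1];   /* first slot of a variable-length array */
} DemoPageHeaderData;

/* pd_lower of an empty page: header size minus the one declared slot. */
size_t demo_empty_lower(void)
{
    return sizeof(DemoPageHeaderData) - sizeof(DemoItemIdData);
}

/* Byte offset of line pointer n within the page. */
size_t demo_linp_offset(size_t n)
{
    return offsetof(DemoPageHeaderData, pd_linp) + n * sizeof(DemoItemIdData);
}

/* Number of line pointers on a page, derived from its pd_lower. */
size_t demo_item_count(size_t pd_lower)
{
    return (pd_lower - demo_empty_lower()) / sizeof(DemoItemIdData);
}
```

With these assumed widths there is no padding after `pd_linp`, so `demo_empty_lower()` equals `offsetof(DemoPageHeaderData, pd_linp)`. Each item added to the page then pushes pd_lower up by one `DemoItemIdData`, which is how the item count can be recovered from pd_lower alone.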
