Getting started

An article by momjian, a well-known figure in the PostgreSQL community, gives the pseudocode for a hash join:

for (j = 0; j < length(inner); j++)
    hash_key = hash(inner[j]);
    append(hash_store[hash_key], inner[j]);
for (i = 0; i < length(outer); i++)
    hash_key = hash(outer[i]);
    for (j = 0; j < length(hash_store[hash_key]); j++)
        if (outer[i] == hash_store[hash_key][j])
            output(outer[i], inner[j]);

To make it a little clearer, here it is again with my own comments added:

// Build the hash table (held in memory) from the inner table
for (j = 0; j < length(inner); j++)
{
    hash_key = hash(inner[j]);
    append(hash_store[hash_key], inner[j]);
}
// Iterate over every element of the outer table
for (i = 0; i < length(outer); i++)
{
    // Take one element of the outer table and hash it to get its hash_key
    hash_key = hash(outer[i]);
    // Probe the hash table with that hash_key (assuming the key exists in the hash table).
    // The loop bound is length(hash_store[hash_key]) because, once the hash table is built,
    // several elements may sit under the same key.
    // For example: hash_key 100 holds a, c, e; hash_key 200 holds d; hash_key 300 holds f.
    // So the loop below only visits the elements that share the hash_key just computed.
    for (j = 0; j < length(hash_store[hash_key]); j++)
    {
        // On a match, output one result row.
        // (j indexes the bucket here, so inner[j] stands for the matched entry hash_store[hash_key][j].)
        if (outer[i] == hash_store[hash_key][j])
            output(outer[i], inner[j]);
    }
}
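The commented pseudocode above can be turned into a small runnable sketch. This is only an illustration in Python, not PostgreSQL's actual implementation; the function name `hash_join`, the key-extractor parameters, and the sample rows are all assumptions made up for the example.

```python
from collections import defaultdict


def hash_join(outer, inner, key_outer, key_inner):
    """Equi-join two lists of rows, mirroring the build/probe pseudocode."""
    # Build phase: bucket every inner row by the hash of its join key.
    hash_store = defaultdict(list)
    for inner_row in inner:
        hash_store[hash(key_inner(inner_row))].append(inner_row)

    # Probe phase: hash each outer row's key and scan only the matching bucket.
    results = []
    for outer_row in outer:
        outer_key = key_outer(outer_row)
        for candidate in hash_store.get(hash(outer_key), []):
            # Different keys can land in the same bucket, so compare the keys too.
            if outer_key == key_inner(candidate):
                results.append((outer_row, candidate))
    return results


# Hypothetical sample data: inner rows are (key, value), outer rows are (id, key).
inner = [(10, "a"), (20, "d"), (10, "c")]
outer = [(1, 10), (2, 30), (3, 20)]

joined = hash_join(outer, inner, key_outer=lambda r: r[1], key_inner=lambda r: r[0])
for pair in joined:
    print(pair)
```

Note that the sketch outputs the matched bucket entry itself rather than indexing back into `inner`, which is what the pseudocode's `inner[j]` is meant to convey.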

[Author: 技术者高健 @ 博客园, mail: luckyjackgao@gmail.com]

Let's try it out:

postgres=# \d employee
          Table "public.employee"
 Column |         Type          | Modifiers 
--------+-----------------------+-----------
 id     | integer               | 
 name   | character varying(20) | 
 deptno | integer               | 
 age    | integer               | 
Indexes:
    "idx_id_dept" btree (id, deptno)

postgres=# \d deptment
           Table "public.deptment"
  Column  |         Type          | Modifiers 
----------+-----------------------+-----------
 deptno   | integer               | 
 deptname | character varying(20) | 

postgres=# 

postgres=# select count(*) from employee;
 count 
-------
  1000
(1 row)

postgres=# select count(*) from deptment;
 count 
-------
   102
(1 row)

postgres=#

The execution plan:

postgres=# explain select a.name,b.deptname from employee a,deptment b where a.deptno=b.deptno;
                               QUERY PLAN                                
-------------------------------------------------------------------------
 Hash Join  (cost=3.29..34.05 rows=1000 width=14)
   Hash Cond: (a.deptno = b.deptno)
   ->  Seq Scan on employee a  (cost=0.00..17.00 rows=1000 width=10)
   ->  Hash  (cost=2.02..2.02 rows=102 width=12)
         ->  Seq Scan on deptment b  (cost=0.00..2.02 rows=102 width=12)
(5 rows)

postgres=#
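Reading the plan against the pseudocode: the Hash node sits on top of the scan of deptment (102 rows), so deptment plays the role of the inner table from which the hash table is built, while employee (1000 rows) is the outer table whose rows probe it. The following Python sketch imitates that shape. The sample rows are invented for illustration, since the article does not show the actual table contents.

```python
# Hypothetical sample rows; the real contents of the two tables are not shown.
employee = [  # (id, name, deptno, age)
    (1, "alice", 10, 30),
    (2, "bob", 20, 41),
    (3, "carol", 10, 25),
    (4, "dave", 99, 52),  # deptno 99 has no matching department
]
deptment = [  # (deptno, deptname)
    (10, "sales"),
    (20, "dev"),
    (30, "ops"),
]

# Counterpart of the "Hash" node: build an in-memory table over deptment.
dept_by_no = {}
for deptno, deptname in deptment:
    dept_by_no.setdefault(deptno, []).append(deptname)

# Counterpart of "Seq Scan on employee" feeding the Hash Join:
# each employee row probes the table using the Hash Cond a.deptno = b.deptno.
result = []
for _id, name, deptno, _age in employee:
    for deptname in dept_by_no.get(deptno, []):
        result.append((name, deptname))

print(result)
# [('alice', 'sales'), ('bob', 'dev'), ('carol', 'sales')]
```

Only the smaller table has to be held in memory here; the larger one is read once, row by row.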