数据集是多边形,表示全球的轮廓.表中有106,000行,多边形存储在几何字段中.
我的问题是,许多多边形覆盖了全球的很大一部分.这似乎使得很难获得一个空间索引,这将消除主要过滤器中的许多行.例如,查看以下查询:
SELECT "ID","CODE","geom".STAsBinary() as "geom" FROM "dbo"."ContA" WHERE "geom".Filter( geometry::STGeomFromText('POLYGON ((-142.03193662573682 59.53396984952896,-142.03193662573682 59.88928136451884,-141.32743833481925 59.88928136451884,-141.32743833481925 59.53396984952896,-142.03193662573682 59.53396984952896))',4326) ) = 1
这是查询与表中只有两个多边形相交的区域.无论我选择什么空间索引设置的组合,Filter()总是返回大约60,000行.
用STIntersects()替换Filter()当然只返回我想要的两个多边形,但当然需要更长的时间(Filter()为6秒,STIntersects()为12秒).
任何人都可以提供关于是否有空间索引设置有可能提高60,000行的提示,或者我的数据集是否与sql Server的空间索引不是很好的匹配?
更多信息:
如所建议的那样,我将多边形拆分,在全球范围内使用4×4格网.我看不到用QGIS做的方法,所以我写了我自己的查询来做.首先我定义了16个边框,第一个看起来像这样:
declare @Box1 geometry = geometry::STGeomFromText('POLYGON (( -180 90,-90 90,-90 45,-180 45,-180 90))',4326)
然后我使用每个边框来选择和截断与该框相交的多边形:
insert ContASplit select CODE,geom.STIntersection(@Box1),CODE_DESC from ContA where geom.STIntersects(@Box1) = 1
我显然是在4×4网格中的所有16个边界框中.最终的结果是我有一个新的表,大约有107,000行(这证实我实际上没有很多巨型多边形).
我添加了每个对象1024个单元格的空间索引,每级别的单元格为低,低,低.
然而,非常奇怪的是,这个新的表与拆分多边形仍然与旧的一样.上面列出的.Filter仍然返回〜60,000行.我真的不明白这一点,显然我不明白空间指数如何实际运作.
矛盾的是,虽然.Filter()仍返回约60,但性能却有所提高. .Filter()现在需要大约2秒而不是6,而.STIntersects()现在需要6秒而不是12.
根据请求,这里是索引的sql示例:
CREATE SPATIAL INDEX [contasplit_sidx] ON [dbo].[ContASplit] ( [geom] )USING GEOMETRY_GRID WITH ( BOUNDING_Box =(-90,-180,90,180),GRIDS =(LEVEL_1 = LOW,LEVEL_2 = LOW,LEVEL_3 = LOW,LEVEL_4 = LOW),CELLS_PER_OBJECT = 1024,PAD_INDEX = OFF,SORT_IN_TEMPDB = OFF,DROP_EXISTING = OFF,ALLOW_ROW_LOCKS = ON,ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
虽然记住,我已经为每个对象的网格和单元格尝试了一系列不同的设置,每次都有相同的结果.
以下是运行sp_help_spatial_geometry_index的结果,这是在我的分割数据集上,其中没有单个多边形占据全球的1/16以上.
Base_Table_Rows 215138
Bounding_Box_xmin -90
Bounding_Box_ymin -180
Bounding_Box_xmax 90
Bounding_Box_ymax 180
Grid_Size_Level_1 64
Grid_Size_Level_2 64
Grid_Size_Level_3 64
Grid_Size_Level_4 64
Cells_Per_Object 16
Total_Primary_Index_Rows 378650
Total_Primary_Index_Pages 1129
Average_Number_Of_Index_Rows_Per_Base_Row 1
Total_Number_Of_ObjectCells_In_Level0_For_QuerySample 1
Total_Number_Of_ObjectCells_In_Level0_In_Index 60956
Total_Number_Of_ObjectCells_In_Level1_In_Index 361
Total_Number_Of_ObjectCells_In_Level2_In_Index 2935
Total_Number_Of_ObjectCells_In_Level3_In_Index 32420
Total_Number_Of_ObjectCells_In_Level4_In_Index 281978
Total_Number_Of_Interior_ObjectCells_In_Level2_In_Index 1
Total_Number_Of_Interior_ObjectCells_In_Level3_In_Index 49
Total_Number_Of_Interior_ObjectCells_In_Level4_In_Index 4236
Total_Number_Of_Intersecting_ObjectCells_In_Level1_In_Index 29
Total_Number_Of_Intersecting_ObjectCells_In_Level2_In_Index 1294
Total_Number_Of_Intersecting_ObjectCells_In_Level3_In_Index 29680
Total_Number_Of_Intersecting_ObjectCells_In_Level4_In_Index 251517
Total_Number_Of_Border_ObjectCells_In_Level0_For_QuerySample 1
Total_Number_Of_Border_ObjectCells_In_Level0_In_Index 60956
Total_Number_Of_Border_ObjectCells_In_Level1_In_Index 332
Total_Number_Of_Border_ObjectCells_In_Level2_In_Index 1640
Total_Number_Of_Border_ObjectCells_In_Level3_In_Index 2691
Total_Number_Of_Border_ObjectCells_In_Level4_In_Index 26225
Interior_To_Total_Cells_Normalized_To_Leaf_Grid_Percentage 0.004852925
Intersecting_To_Total_Cells_Normalized_To_Leaf_Grid_Percentage 0.288147586
Border_To_Total_Cells_Normalized_To_Leaf_Grid_Percentage 99.70699949
Average_Cells_Per_Object_Normalized_To_Leaf_Grid 405.7282349
Average_Objects_PerLeaf_GridCell 0.002464704
Number_Of_SRIDs_Found 1
Width_Of_Cell_In_Level1 2.8125
Width_Of_Cell_In_Level2 0.043945313
Width_Of_Cell_In_Level3 0.000686646
Width_Of_Cell_In_Level4 1.07E-05
Height_Of_Cell_In_Level1 5.625
Height_Of_Cell_In_Level2 0.087890625
Height_Of_Cell_In_Level3 0.001373291
Height_Of_Cell_In_Level4 2.15E-05
Area_Of_Cell_In_Level1 1012.5
Area_Of_Cell_In_Level2 15.8203125
Area_Of_Cell_In_Level3 0.247192383
Area_Of_Cell_In_Level4 0.003862381
CellArea_To_BoundingBoxArea_Percentage_In_Level1 1.5625
CellArea_To_BoundingBoxArea_Percentage_In_Level2 0.024414063
CellArea_To_BoundingBoxArea_Percentage_In_Level3 0.00038147
CellArea_To_BoundingBoxArea_Percentage_In_Level4 5.96E-06
Number_Of_Rows_Selected_By_Primary_Filter 60956
Number_Of_Rows_Selected_By_Internal_Filter 0
Number_Of_Times_Secondary_Filter_Is_Called 60956
Number_Of_Rows_Output 2
Percentage_Of_Rows_NotSelected_By_Primary_Filter 71.66655821
Percentage_Of_Primary_Filter_Rows_Selected_By_Internal_Filter 0
Internal_Filter_Efficiency 0
Primary_Filter_Efficiency 0.003281055
“Base_Table_Rows 215138”对我来说没有什么意义,表中有107,而不是215,000行
当渲染时,数据集看起来像这样:
alt text http://norman.cx/photos/links/wms.png
进一步的研究:
我对这个数据的主要滤波器性能不佳感到困惑.所以我做了一个测试,看看我的数据如何分割.使用我原来的非分割功能,我在表中添加了一个“单元格”列.然后我运行16个查询来计算4×4网格中跨越多少个单元格.所以我为每个单元格运行了一个这样的查询:
declare @Box1 geometry = geometry::STGeomFromText('POLYGON (( -180 90,4326) update ContA set cells = cells + 1 where geom.STIntersects(@Box1) = 1
如果我再看表格中的“单元格”列,我的整个数据集中只有672个特征与4×4网格中的多个单元相交.那么在地球上呢,从字面上来说,主滤波器可以返回60,000个功能,用于查询一个小的200英里宽的矩形吗?
解决方法
CREATE SPATIAL INDEX [contasplit_sidx] ON [dbo].[ContASplit] ( [geom] )USING GEOMETRY_GRID WITH ( BOUNDING_Box =(-90,...
因此,BOUNDING_Box映射到:
xmin = -90 ymin = -180 xmax = 90 ymax = 180
> Longtitude(-180 to
180 – 指定东西方
子午线)应映射到X
>纬度(-90到90 – 指定如何
赤道以北或南方)
应该映射到Y
所以要为世界创建BOUNDING_Box,你应该使用:
CREATE SPATIAL INDEX [contasplit_sidx] ON [dbo].[ContASplit] ( [geom] )USING GEOMETRY_GRID WITH ( BOUNDING_Box =(-180,-90,180,90),...
这应该创建一个适合您的数据的索引,意味着所有的功能都被索引覆盖.