我使用的是oracle10gR2 10.2.0.4和solaris10 64bit
我需要从xmltype列表(word.testMeta)中的xml中选择数据值
并插入另一个表(word.testwordyy)
desc word.testMeta; Name Null? Type -------------------------------------- FILENAME CHAR(2000) XMLDATA XMLTYPE desc word.testwordyy; Name Null? Type --------------------------------------- ID VARCHAR2(255) KEYWORD VARCHAR2(4000)
我使用xmltable并执行:
insert /*+append */ into word.testwordyy(KEYWORD) select /*+ gather_plan_statistics */ dbms_lob.substr(xmltype.getclobval(b.KEWOR),254) from word.testMeta,xmltable ( '$B/mets/KEWOR' passing word.testMeta.XMLDATA as B columns KEWOR xmltype path '/KEWOR/text()' ) b
这是解释计划select * from table(dbms_xplan.display_cursor(null,null,’iostats last’));
PLAN_TABLE_OUTPUT ----------------------------------------------------------------------------------------------------------------------------------- sql_ID 37ua3npnxx8su,child number 0 ------------------------------------- insert /*+append */ into word.testwordyy(KEYWORD) select /*+ gather_plan_statistics */ dbms_lob.substr(xmltype.getclobval(b.KEWOR),254) from word.testMeta,xmltable ( '$B/mets/KEWOR' passing > word.testMeta.XMLDATA as B columns KEWOR xmltype path '/KEWOR/text()' ) b Plan hash value: 875848213 ----------------------------------------------------------------------------------------------------------------------------------- PLAN_TABLE_OUTPUT ----------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | ----------------------------------------------------------------------------------------------------------------------------------- | 1 | LOAD AS SELECT | | 1 | | 1 |00:10:32.72 | 16832 | 7 | 90 | | 2 | NESTED LOOPS | | 1 | 29M| 34688 |00:00:25.95 | 12639 | 5 | 0 | | 3 | TABLE ACCESS FULL | TESTMeta | 1 | 3638 | 3999 |00:00:00.08 | 909 | 0 | 0 | | 4 | COLLECTION ITERATOR PICKLER FETCH| XMLSEQUENCEFROMXMLTYPE | 3999 | | 34688 |00:00:24.50 | 11730 | 5 | 0 | Note ----- - dynamic sampling used for this statement 21 rows selected.
表word.testMeta中的行数越多,每行花费的时间就越多
我的XML简单而小巧,但需要处理大量的XML(5000000)
当行数超过8000时,处理速度非常慢,需要几个小时.
有没有优化或更快的方式?
解决方法
您已将KEYWOR列定义为XMLTYPE.这是为什么? XMLTABLE的重点是将XML结构转换为关系列.如果您将列定义为简单字符串,则可以避免大量不必要的转换.
“the contents of the tag are more than 4000 characters>>> are there any methods to substring the contents of the tag in the xmltable”
有XPath子串函数.
insert /*+append */ into word.testwordyy(KEYWORD) select /*+ gather_plan_statistics */ b.KEWOR from word.testMeta,xmltable ( '$B/mets/KEWOR' passing word.testMeta.XMLDATA as B columns KEWOR varchar2(4000) path 'substring(KEWOR,254,4000)' ) b
在这里,我已经启动了您在原始帖子中使用的偏移量为254的子字符串.我还明确地将其长度设置为4000.
在声明列时,我认为您不需要显式引用text()节点.