[1] HUANG Yu-qing, QI Guang-zhi, ZHANG Fu-yan. Constructing extractors for semi-structured information from Web documents [J]. Journal of Software (软件学报), 2000. (in Chinese)
[2] ZHANG Shaohua, XU Linhao, YANG Wenzhu, XUE Wenling, LI Tianzhu. Web information extraction based on sample instances [J]. Journal of Hebei University (Natural Science Edition) (河北大学学报(自然科学版)), 2001. doi:10.3969/j.issn.1000-1565.2001.04.023 (in Chinese)
[3] LAENDER A, RIBEIRO-NETO B, SILVA A. A brief survey of Web data extraction tools [J]. SIGMOD Record, 2002, 31(2).
[4] WANG T, TANG S, YANG D. COMMIX: Towards effective Web information extraction, integration and query answering [A]. 2002.
[5] GOTTLOB G, KOCH C. Monadic datalog and the expressive power of languages for Web information extraction [A]. 2002.
[6] FREITAG D. Machine learning for information extraction in informal domains [J]. Machine Learning, 2000, (2/3).
[7] SODERLAND S. Learning information extraction rules for semi-structured and free text [J]. Machine Learning, 1999, 34.
[8] KUSHMERICK N. Wrapper induction: Efficiency and expressiveness [J]. Artificial Intelligence, 2000, 118(1/2).
[9] MUSLEA I, MINTON S, KNOBLOCK C A. Hierarchical wrapper induction for semistructured information sources [J]. Autonomous Agents and Multi-Agent Systems, 2001, (1/2).
[10] MUSLEA I. Extraction patterns for information extraction tasks: A survey [A]. Orlando, Florida, 1999.
[11] KNOBLOCK C A, LERMAN K. Accurately and reliably extracting data from the Web: A machine learning approach [J]. Data Engineering Bulletin, 2000, 23(4).
[12] MUSLEA I, MINTON S, KNOBLOCK C A. Active learning for hierarchical wrapper induction [A]. Orlando, Florida, USA, 1999.
[13] MUSLEA I, MINTON S, KNOBLOCK C A. A hierarchical approach to wrapper induction [A]. Washington, USA, 1999.
[14] HSU C N, DUNG M T. Generating finite-state transducers for semi-structured data extraction from the Web [J]. Information Systems, 1998, (8).
[15] EMBLEY D, CAMPBELL D, JIANG S. Conceptual-model-based data extraction from multiple-record Web pages [J]. Data and Knowledge Engineering, 1999, 31(3).
[16] EMBLEY D, JIANG S, NG Y. Record-boundary discovery in Web documents [A]. Philadelphia, Pennsylvania, USA, 1999.
[17] EMBLEY D, XU L. Record location and reconfiguration in unstructured multiple-record Web documents [A]. Dallas, Texas, USA, 2000.
[18] EMBLEY D, CAMPBELL D, LIDDLE S. Ontology-based extraction and structuring of information from data-rich unstructured documents [A]. Bethesda, Maryland, USA, 1998.
[19] CHUNG C Y, GERTZ M, SUNDARESAN N. Reverse engineering for Web data: From visual to semantic structures [A]. San Jose, California, 2002.
[20] CHUNG C Y, SUNDARESAN N. Quixote: Building XML repositories from topic specific Web documents [A]. 2001.
[21] BAUMGARTNER R, FLESCA S, GOTTLOB G. Supervised wrapper generation with Lixto [A]. Roma, Italy, 2001.
[22] BAUMGARTNER R, FLESCA S, GOTTLOB G. Visual Web information extraction with Lixto [A]. Roma, Italy, 2001.
[23] LIU L, PU C, HAN W. XWRAP: An XML-enabled wrapper construction system for Web information sources [A]. San Diego, 2000.
[24] LIU L, HAN W, BUTTLER D. An XML-based wrapper generator for Web information extraction [A]. Philadelphia, Pennsylvania, USA, 1999.
[25] AROCENA G, MENDELZON A. WebOQL: Restructuring documents, databases and webs [A]. Orlando, Florida, USA, 1998.
[26] AROCENA G. WebOQL: Exploiting document structure in Web queries [D]. Toronto: Master's thesis, University of Toronto, 1997.
[27] SAHUGUET A, AZAVANT F. Building intelligent Web applications using lightweight wrappers [J]. Data & Knowledge Engineering, 2001, (3).
[28] SEO H, YANG J. Knowledge-based wrapper generation by using XML [A]. Washington, USA, 2001.
[29] MYLLYMAKI J. Effective Web data extraction with standard XML technologies [A]. Hong Kong, China, 2001.
[30] SAHUGUET A, AZAVANT F. Building light-weight wrappers for legacy Web data-sources using W4F [A]. Edinburgh, Scotland, UK, 1999.
[31] SAHUGUET A, AZAVANT F. Web Ecology: Recycling HTML pages as XML documents using W4F [A]. Philadelphia, Pennsylvania, USA, 1999.
[32] HAMMER J, GARCIA-MOLINA H. Template-based wrappers in the TSIMMIS system [A]. Tucson, Arizona, USA, 1997.
[33] CRESCENZI V, MECCA G. RoadRunner: Towards automatic data extraction from large Web sites [A]. Rome, Italy, 2001.
[34] CHANG C, LUI C. IEPAD: Information extraction based on pattern discovery [A]. Hong Kong, China, 2001.
[35] HAN W, BUTTLER D, PU C. Wrapping Web data into XML [J]. SIGMOD Record, 2001, 30(3).
[36] HODGSON J. Do HTML tags flag semantic content? [J]. IEEE Internet Computing, 2001, 5(1).
[37] DOM [EB/OL]. http://www.w3.org/TR/REC-DOM-Level1
[38] XML Path Language Version 2.0 [EB/OL]. http://www.w3.org/TR/xpath20
[39] XQuery [EB/OL]. http://www.w3.org/TR/xquery
[40] XU Linhao, YANG Wenzhu, CHEN Shaofei. Web information extraction based on XPath [A]. Zhengzhou, 2002. (in Chinese)
[41] YANG Wenzhu, XU Linhao, HAO Yanan. Design and implementation of a personalized Web query assistant [A]. Zhengzhou, 2002. (in Chinese)
[42] CALIFF M, MOONEY R. Relational learning of pattern-match rules for information extraction [A]. Orlando, Florida, 1999.