TITLE: Declarative Information Extraction, Web Crawling and Recursive Wrapping with Lixto
SPEAKER: Georg Gottlob, Vienna University of Technology (TU Wien)
ABSTRACT: Lixto is a system and method for the visual and interactive generation of wrappers for Web pages under the supervision of a human developer, for automatically extracting information from Web pages using such wrappers, and for translating the extracted content into XML. In this talk, we describe some advanced features of Lixto, such as disjunctive pattern definitions, specialization rules, and Lixto's capability of collecting and aggregating information from several linked Web pages. We illustrate these features with significant examples from the commercial domain. Joint work with Robert Baumgartner and Sergio Flesca.
8:35 Invited talk by Georg Gottlob
9:30 Coffee break
10:00 Session 1
12:00 Lunch (included in the registration fee)
13:30 Session 2
16:00 Session 3
Workshop organizers
Program committee
The proceedings will be available electronically at http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS. A hard copy of the proceedings will be handed out to participants at the workshop.
For more information on the workshop, including registration, please see the KRDB-2001 Web page at http://www.dis.uniroma1.it/~lenzerini/krdb01. Enquiries about the KRDB-2001 workshop can be sent by e-mail to lenzerini@dis.uniroma1.it.