Keynote Speeches

 

Topic: Exploring the Deep Web through Search Engines
Speaker: Weiyi Meng, State University of New York at Binghamton, USA

The Web can be divided into the Surface Web and the Deep Web. The former consists of Web pages that can be crawled by following URLs, while the latter contains un-crawlable data hidden behind search interfaces. The Deep Web holds far more data than the Surface Web, and its data are of better quality. For an application to use Deep Web data, it must interact with Deep Web search engines. Working with search engines raises numerous interesting research problems, many of which are in fact data mining problems. They include search engine categorization and clustering, automatic search result extraction and wrapper generation, publication time extraction, result template mining and data alignment, and entity identification. In this talk, I will review some of these problems and introduce some of our efforts to solve them.

Topic: An Introduction to Transfer Learning
Speaker: Qiang Yang, Professor, Department of Computer Science and Engineering, Hong Kong University of Science and Technology

Many existing data mining and machine learning techniques assume that training and test data follow the same distribution. However, this assumption often does not hold, for example in Web mining and wireless computing, when labeled data become outdated or the test data come from a different domain than the training data. In these cases, most machine learning methods fail to classify new and future data correctly. Collecting and labeling enough new training data would be very costly, and often infeasible. Instead, we would like to recover as much useful knowledge as possible from the old data. This problem is known as transfer learning. In this talk, I will give an overview of the transfer learning problem, present a number of important research directions, and discuss our own novel solutions.