Wednesday, August 20, 2014

Web Crawling and Data Mining with Apache Nutch

Perform web crawling and apply data mining in your application.

DETAIL

  • Authors: Dr. Zakir Laliwala and Abdulbasit Shaikh
  • Language: English
  • Published: 2013
  • Pages: 136
  • Size: 2 MB
  • Format: pdf

CONTENTS

Preface

Chapter 1: Getting Started with Apache Nutch
Introduction to Apache Nutch
Installing and configuring Apache Nutch
Crawling your website using the crawl script
Crawling the Web, the CrawlDb, and URL filters
Parsing and parse filters
The Apache Nutch plugin
Understanding the Nutch Plugin architecture

Chapter 2: Deployment, Sharding, and AJAX Solr with Apache Nutch
Deployment of Apache Solr
Sharding using Apache Solr
Working with AJAX Solr

Chapter 3: Integration of Apache Nutch with Apache Hadoop and Eclipse
Integrating Apache Nutch with Apache Hadoop
Configuring Apache Nutch with Eclipse

Chapter 4: Apache Nutch with Gora, Accumulo, and MySQL
Introduction to Apache Accumulo
Introduction to Apache Gora
Use of Apache Gora
Integration of Apache Nutch with Apache Accumulo
Integration of Apache Nutch with MySQL

Index
