How to compile Nutch 2.3.1 with HBase 1.2.6

Problem description

I have to set up a Hadoop stack with Nutch 2.3.1. The supported HBase version for Hadoop 2.7.4 is 1.2.6, which I have configured and tested successfully. But when I compile Nutch and crawl a sample page, I get this error:

/usr/local/nutch/runtime/local/bin/nutch inject urls/ -crawlId kics
InjectorJob: starting at 2017-09-21 14:20:10
InjectorJob: Injecting urlDir: urls
Exception in thread "main" java.lang.NoSuchFieldError: HBASE_CLIENT_PREFETCH_LIMIT
    at org.apache.hadoop.hbase.client.HConnectionKey.<clinit>(HConnectionKey.java:43)
    at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:267)
    at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:194)
    at org.apache.gora.hbase.store.HBaseStore.initialize(HBaseStore.java:115)
    at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
    at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
    at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
    at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:78)
    at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:218)
    at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:252)
    at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:275)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:284)
Error running:

According to what I found while searching, HBase 1.x can be used with Nutch 2.3.1, but I have no idea how to compile it. Can someone please guide me through the steps?

Solution

Apache Gora 0.7 is the version that supports HBase 1.2.3(+): https://issues.apache.org/jira/browse/GORA-443
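A NoSuchFieldError raised while a class is being initialized usually means that classes compiled against one version of a library are running against another, so it can help to see which HBase and Gora jars actually ended up in the Nutch runtime. The following is only a rough sketch for that check: the helper name `jar_versions` is made up for illustration, and the lib directory is an assumption derived from the `/usr/local/nutch/runtime/local/bin/nutch` path in the question.

```python
import re
from pathlib import Path

# Splits names such as "hbase-client-0.98.8-hadoop2.jar" into
# artifact "hbase-client" and version "0.98.8-hadoop2".
# Only jars whose names start with "hbase" or "gora" are considered.
JAR_NAME = re.compile(
    r"^(?P<artifact>(?:hbase|gora)[\w-]*?)-(?P<version>\d[\w.\-]*)\.jar$"
)


def jar_versions(names):
    """Map each HBase/Gora artifact name to the sorted list of versions found."""
    found = {}
    for name in names:
        match = JAR_NAME.match(name)
        if match:
            found.setdefault(match.group("artifact"), set()).add(match.group("version"))
    return {artifact: sorted(versions) for artifact, versions in found.items()}


if __name__ == "__main__":
    # Assumed location, inferred from the command line in the question.
    lib_dir = Path("/usr/local/nutch/runtime/local/lib")
    names = [p.name for p in lib_dir.glob("*.jar")] if lib_dir.is_dir() else []
    for artifact, versions in sorted(jar_versions(names).items()):
        print(artifact, ", ".join(versions))
```

If the listing shows HBase client jars older than the 1.2.6 server, or several versions of the same artifact, that would be consistent with the error above.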

You can take a look at https://stackoverflow.com/a/39837926/582789 where I described how to modify Nutch 2.3.1 to work with Apache Gora 0.7. In the patch https://paste.apache.org/jjqz from that answer, use "0.7" wherever it shows "0.7-SNAPSHOT".

By the way, Apache Gora 0.8 was released yesterday :) Just changing 0.7 to 0.8 should work.

http://gora.apache.org/#20-september-2017-apache-gora-08-release
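The version change itself can be sketched as a small text rewrite. This is an illustration only, not the patch from the linked answer: it assumes the Gora dependencies are declared as `<dependency org="org.apache.gora" ... rev="...">` elements (as in an Ivy file such as Nutch's `ivy/ivy.xml`), with `org` written before `rev`, and `set_gora_rev` is a hypothetical helper name.

```python
import re

# Captures everything up to the opening quote of rev="..." inside a
# <dependency> element whose org attribute is org.apache.gora.
# Assumption: org appears before rev within the element.
GORA_DEP = re.compile(r'(<dependency\s+org="org\.apache\.gora"[^>]*?\brev=")[^"]*(")')


def set_gora_rev(ivy_xml: str, new_rev: str) -> str:
    """Return the Ivy file text with every Gora dependency set to new_rev."""
    # A function replacement avoids ambiguity between group references
    # and a version string that starts with a digit.
    return GORA_DEP.sub(lambda m: m.group(1) + new_rev + m.group(2), ivy_xml)


if __name__ == "__main__":
    sample = (
        '<dependency org="org.apache.gora" name="gora-core" rev="0.6.1" conf="*->default"/>\n'
        '<dependency org="org.apache.gora" name="gora-hbase" rev="0.6.1" conf="*->default"/>\n'
    )
    print(set_gora_rev(sample, "0.8"))
```

Because it works on plain text, it also rewrites Gora dependencies that are commented out. The remaining changes from the patch still have to be applied, and Nutch has to be rebuilt afterwards (normally with `ant runtime`) so that the new jars land in the runtime directory.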
