Problem description
I have to set up a Hadoop stack with Nutch 2.3.1. The supported version of HBase for Hadoop 2.7.4 is 1.2.6, which I have configured and tested successfully. But when I compile Nutch and crawl a sample page, I get the following error.
/usr/local/nutch/runtime/local/bin/nutch inject urls/ -crawlId kics
InjectorJob: starting at 2017-09-21 14:20:10
InjectorJob: Injecting urlDir: urls
Exception in thread "main" java.lang.NoSuchFieldError: HBASE_CLIENT_PREFETCH_LIMIT
at org.apache.hadoop.hbase.client.HConnectionKey.<clinit>(HConnectionKey.java:43)
at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:267)
at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:194)
at org.apache.gora.hbase.store.HBaseStore.initialize(HBaseStore.java:115)
at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:78)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:218)
at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:252)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:275)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:284)
Error running:
According to what I found while searching, Nutch 2.3.1 can be compiled against HBase 1.x, but I have no idea how to do it. Can someone please guide me through it (steps, etc.)?
Solution
Apache Gora 0.7 is the version that supports HBase 1.2.3 and later: https://issues.apache.org/jira/browse/GORA-443
See https://stackoverflow.com/a/39837926/582789, where I described how to modify Nutch 2.3.1 to work with Apache Gora 0.7. In the patch linked from that answer (https://paste.apache.org/jjqz), use "0.7" wherever it shows "0.7-SNAPSHOT".
By the way, Apache Gora 0.8 was released yesterday :) Just changing 0.7 to 0.8 should work.
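The version substitution itself can be scripted. The following is only a sketch, assuming the Gora dependencies are declared as Ivy `<dependency ... rev="..."/>` lines with `org="org.apache.gora"` (the sample lines in the usage note are illustrative, not copied from the patch). The function name `set_gora_rev` is hypothetical, and any other changes the patch makes are not reproduced here.

```python
import re

# Lines that declare a Gora artifact (assumed Ivy syntax: org="org.apache.gora").
GORA_LINE = re.compile(r'org="org\.apache\.gora"')
# The rev="..." attribute that carries the version.
REV_ATTR = re.compile(r'rev="[^"]*"')


def set_gora_rev(ivy_text: str, release: str) -> str:
    """Return ivy_text with rev="..." set to `release` on Gora lines only.

    Lines for other organisations are left untouched, so unrelated
    version numbers are not rewritten by accident.
    """
    out = []
    for line in ivy_text.splitlines(keepends=True):
        if GORA_LINE.search(line):
            line = REV_ATTR.sub(f'rev="{release}"', line)
        out.append(line)
    return "".join(out)
```

For example, `set_gora_rev('<dependency org="org.apache.gora" name="gora-core" rev="0.7-SNAPSHOT"/>\n', "0.7")` rewrites the `rev` attribute to `0.7`. Nutch still has to be rebuilt afterwards so that the new jars end up in the runtime directory.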
http://gora.apache.org/#20-september-2017-apache-gora-08-release
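After rebuilding, it may be worth checking which HBase and Gora jars actually ended up in the runtime library directory, since a `NoSuchFieldError` commonly means classes compiled against one library version are running against another. The sketch below assumes jar files follow the usual `artifact-version.jar` naming and that the libraries live under `runtime/local/lib` (inferred from the command path in the question). The helper names are hypothetical.

```python
import re
from collections import defaultdict

# artifact-version.jar, where the version starts with a digit (assumed naming).
JAR_NAME = re.compile(r'^(?P<artifact>.+?)-(?P<version>\d[\w.\-]*)\.jar$')


def versions_by_family(jar_names, families=("hbase", "gora")):
    """Map each family prefix to the set of versions seen among jar_names."""
    found = defaultdict(set)
    for name in jar_names:
        match = JAR_NAME.match(name)
        if not match:
            continue  # not a versioned jar, skip it
        for family in families:
            if match.group("artifact").startswith(family):
                found[family].add(match.group("version"))
    return dict(found)


def mixed_families(jar_names):
    """Return only the families that appear with more than one version."""
    return {
        family: sorted(versions)
        for family, versions in versions_by_family(jar_names).items()
        if len(versions) > 1
    }

# Possible usage against a real build (directory is an assumption):
#   from pathlib import Path
#   lib = Path("/usr/local/nutch/runtime/local/lib")
#   print(mixed_families(p.name for p in lib.glob("*.jar")))
```

An empty result means each family appears with a single version. A non-empty result lists the families with conflicting versions, which is one plausible source of the error shown in the question.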