This article describes how to deal with a MapReduce job that, when submitted through Java's ProcessBuilder, never terminates. It may serve as a useful reference for anyone hitting the same problem.

Problem description

I have a mapreduce job packaged as a jar file, say 'mapred.jar'. The JobTracker is running on a remote Linux machine. When I run the jar from my local machine, the job inside it is submitted to the remote JobTracker and works fine, as shown below:

java -jar F:/hadoop/mapred.jar
     13/12/19 12:40:27 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
     13/12/19 12:40:27 INFO input.FileInputFormat: Total input paths to process : 49
     13/12/19 12:40:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
     13/12/19 12:40:27 WARN snappy.LoadSnappy: Snappy native library not loaded
     13/12/19 12:40:28 INFO mapred.JobClient: Running job: job_201312160716_0063
     13/12/19 12:40:29 INFO mapred.JobClient:  map 0% reduce 0%
     13/12/19 12:40:50 INFO mapred.JobClient:  map 48% reduce 0%
     13/12/19 12:40:53 INFO mapred.JobClient:  map 35% reduce 0%
     13/12/19 12:40:56 INFO mapred.JobClient:  map 29% reduce 0%
     13/12/19 12:41:02 INFO mapred.JobClient:  map 99% reduce 0%
     13/12/19 12:41:08 INFO mapred.JobClient:  map 100% reduce 0%
     13/12/19 12:41:23 INFO mapred.JobClient:  map 100% reduce 100%
     13/12/19 12:41:28 INFO mapred.JobClient: Job complete: job_201312160716_0063
      ...

But when I executed the same thing through Java's ProcessBuilder, as below:

ProcessBuilder pb = new ProcessBuilder("java", "-jar", "F:/hadoop/mapred.jar");
pb.directory(new File("D:/test"));
final Process process = pb.start();
InputStream is = process.getInputStream();
InputStreamReader isr = new InputStreamReader(is);
BufferedReader br = new BufferedReader(isr);
String line;
while ((line = br.readLine()) != null) {
    System.out.println(line);
}

System.out.println("Waited for: " + process.waitFor());
System.out.println("Program terminated!");

It also worked, and I could view the job status through http://192.168.1.112:50030/jobtracker.jsp.

Problem

My problem is that the Java program never ends; it runs indefinitely even after the mapreduce job has completed. I also do not get any of the output messages that I got through the command line. How can I tell that the job has finished?

Recommended answer

You should probably redirect stderr to stdout before starting to read:

pb.redirectErrorStream(true)

The reason is described in the documentation of the Process class: some native platforms provide only a limited buffer for a subprocess's standard output and error streams, so if the parent does not promptly read them, the subprocess may block or even deadlock. Here, the job's progress messages are most likely written to stderr, which your code never reads, so the child process eventually blocks on a full stderr buffer and neither process makes further progress.
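Putting the fix together, here is a minimal, self-contained sketch of the corrected reading loop. It uses `java -version` as a stand-in for the Hadoop jar, because `-version` conveniently writes to stderr; the class and method names are just for illustration, not from the original question:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class MergedOutput {
    // Runs a command with stderr merged into stdout and returns everything it printed.
    static String runAndCapture(String... cmd) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(cmd);
        pb.redirectErrorStream(true);          // merge stderr into stdout
        Process process = pb.start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = br.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        process.waitFor();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // "java -version" writes to stderr, so without redirectErrorStream(true)
        // this would capture nothing at all.
        System.out.print(runAndCapture("java", "-version"));
    }
}
```

Because the reader now drains both streams, the child can never block on a full stderr buffer, and `waitFor()` returns as soon as the job client exits.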

If you are using Java 7, where ProcessBuilder and Process were significantly improved, you could also just do

pb.inheritIO()

which will redirect the process's stderr and stdout to those of your Java process.
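A minimal sketch of that Java 7 variant, again using `java -version` as a placeholder for the real command:

```java
import java.io.IOException;

public class InheritDemo {
    public static void main(String[] args) throws IOException, InterruptedException {
        // The child's stdin, stdout, and stderr are connected to this JVM's own
        // streams, so its log lines appear directly on our console without any
        // manual reading loop.
        Process p = new ProcessBuilder("java", "-version")
                .inheritIO()
                .start();
        int exit = p.waitFor();
        System.out.println("Exited with " + exit);
    }
}
```

With inheritIO() there is no buffer for the parent to drain, so the hang cannot occur; the trade-off is that the output is no longer available as a String inside the parent program.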

Update: By the way, you are better off submitting the Hadoop job through the Hadoop API (the Job and Configuration classes); see e.g. "Calling a mapreduce job from a simple java program".
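For illustration, a rough sketch of what a client-side submission through the Hadoop 1.x era API might look like. This is not runnable as-is: it requires the Hadoop client jars on the classpath, and MyJob, MyMapper, MyReducer, the JobTracker address, and the input/output paths below are all hypothetical placeholders:

```java
// Sketch only: assumes Hadoop 1.x client jars on the classpath.
Configuration conf = new Configuration();
conf.set("mapred.job.tracker", "192.168.1.112:9001"); // assumed JobTracker address

Job job = new Job(conf, "my-job");
job.setJarByClass(MyJob.class);          // placeholder driver class
job.setMapperClass(MyMapper.class);      // placeholder mapper
job.setReducerClass(MyReducer.class);    // placeholder reducer
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
FileInputFormat.addInputPath(job, new Path("/input"));    // placeholder path
FileOutputFormat.setOutputPath(job, new Path("/output")); // placeholder path

// Blocks until the job finishes and streams progress to the console,
// so there is no ambiguity about when the job is done.
boolean ok = job.waitForCompletion(true);
System.exit(ok ? 0 : 1);
```

Submitting through the API avoids the child process entirely, so the stream-buffering problem cannot arise, and `waitForCompletion` gives you the job's success or failure directly instead of having to parse console output.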

