I am not totally sure I understand the problem you are facing, but here is
what we do in version 0.16.4 (where the hod shell is deprecated):
a) Use shell scripts to echo commands into a runme.hod script.
b) An example of a runme.hod script is:
   hadoop jar /grid/0/hadoop/current/hadoop-streaming.jar \
     -input xxx -mapper "./mr_merge_mapper_bin" \
     -output xxx -reducer "./mr_merge_reducer_bin --num_feats 24" \
     -file ../m45scripts/mr_merge_reducer_bin \
     -file ../m45scripts/mr_merge_mapper_bin
   You can include many such calls in runme.hod, so the jobs run in
   sequence.
c) chmod +x runme.hod
d) hod script -d $working_dir -n $machines -s $abs_path_to_runme \
     --hod.script-wait-time=$wait_time
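
The steps above could be sketched roughly as follows. This is only an
illustration, not our actual setup: the input/output names, the /bin/cat
mapper and reducer, and the default values are placeholders. The script
only prints the hod command instead of running it.

```shell
#!/bin/sh
# Sketch: generate a runme.hod that runs two streaming jobs in sequence.
# All job arguments and default values below are illustrative placeholders.
working_dir=${working_dir:-$PWD/hod_work}
machines=${machines:-4}
wait_time=${wait_time:-20}
streaming_jar=/grid/0/hadoop/current/hadoop-streaming.jar
runme=$working_dir/runme.hod

mkdir -p "$working_dir"

# a) + b) echo one hadoop call per job into the script
{
  echo "hadoop jar $streaming_jar -input in1 -output out1 -mapper /bin/cat -reducer /bin/cat"
  echo "hadoop jar $streaming_jar -input out1 -output out2 -mapper /bin/cat -reducer /bin/cat"
} > "$runme"

# c) make it executable
chmod +x "$runme"

# d) print the hod invocation (run it by hand once it looks right)
echo "hod script -d $working_dir -n $machines -s $runme --hod.script-wait-time=$wait_time"
```

The second job reads the output of the first, which is the usual reason for
wanting the jobs to run strictly one after the other.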
I usually set wait_time to 20 (seconds), which is enough to get past the
initialization problem.
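
If you are not going through hod, a similar effect can probably be had by
polling until the JobTracker accepts requests before submitting the first
job. Below is a minimal sketch. The function name wait_until_ready is made
up, and using "bin/hadoop job -list" as the probe is an assumption on my
part: check that your version has it and that it fails while the JobTracker
is still initializing. Note also that bash runs foreground commands one
after another anyway; "wait" only applies to background children of the
same shell, which the daemons started by start-all.sh are not.

```shell
#!/bin/sh
# Sketch: retry a probe command until it succeeds or attempts run out.
# usage: wait_until_ready <max_attempts> <sleep_seconds> <probe command...>
wait_until_ready() {
  max_attempts=$1
  pause=$2
  shift 2
  attempt=1
  # Run the probe; any non-zero exit status counts as "not ready yet".
  while ! "$@" >/dev/null 2>&1; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$pause"
  done
  return 0
}

# Hypothetical use in a driver script (the probe command is an assumption):
#   bin/start-all.sh
#   wait_until_ready 30 2 bin/hadoop job -list || exit 1
#   bin/hadoop jar hadoop-0.17.0-examples.jar randomwriter \
#     -D test.randomwrite.bytes_per_map=107374182 rand
```

With 30 attempts and a 2 second pause this gives up after about a minute,
which is in the same range as the wait time I use with hod.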
Hope this helped...
Ashish
On Tue, Jun 10, 2008 at 6:10 PM, Miles Osborne <miles@...> wrote:
> You have another problem in that Hadoop is still initialising; this will
> cause subsequent jobs to fail.
>
> I've not yet migrated to 17.0 (I still use 16.3), but all my jobs are done
> from nohuped scripts. If you really want to check on the running status
> and busy wait, you can look at the jobtracker log and poll it for when
> everything is finished.
>
> My turn to ask a question in the next post ..
>
> Miles
> 2008/6/10 Richard Zhang <richardtechzh@...>:
>
> > Hello folks:
> > I am running several Hadoop applications on HDFS. To avoid issuing the
> > same set of commands every time, I am trying to use a bash script to run
> > the applications sequentially. To let each job finish before the script
> > proceeds to the next one, I am using wait in the script, as below.
> >
> > sh bin/start-all.sh
> > wait
> > echo cluster start
> > (bin/hadoop jar hadoop-0.17.0-examples.jar randomwriter \
> >   -D test.randomwrite.bytes_per_map=107374182 rand)
> > wait
> > bin/hadoop jar hadoop-0.17.0-examples.jar randomtextwriter \
> >   -D test.randomtextwrite.total_bytes=107374182 rand-text
> > bin/stop-all.sh
> > echo finished hdfs randomwriter experiment
> >
> >
> > However, it always gives the errors below. Does anyone have a better
> > idea of how to run multiple sequential jobs with a bash script?
> >
> > HadoopScript.sh: line 39: wait: pid 10 is not a child of this shell
> >
> > org.apache.hadoop.ipc.RemoteException:
> > org.apache.hadoop.mapred.JobTracker$IllegalStateException: Job tracker still initializing
> >     at org.apache.hadoop.mapred.JobTracker.ensureRunning(JobTracker.java:1722)
> >     at org.apache.hadoop.mapred.JobTracker.getNewJobId(JobTracker.java:1730)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >     at java.lang.reflect.Method.invoke(Method.java:597)
> >     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:446)
> >     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
> >
> >     at org.apache.hadoop.ipc.Client.call(Client.java:557)
> >     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:212)
> >     at $Proxy1.getNewJobId(Unknown Source)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >     at java.lang.reflect.Method.invoke(Method.java:597)
> >     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
> >     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
> >     at $Proxy1.getNewJobId(Unknown Source)
> >     at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:696)
> >     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
> >     at org.apache.hadoop.examples.RandomWriter.run(RandomWriter.java:276)
> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >     at org.apache.hadoop.examples.RandomWriter.main(RandomWriter.java:287)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >     at java.lang.reflect.Method.invoke(Method.java:597)
> >     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> >     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> >     at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:53)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >     at java.lang.reflect.Method.invoke(Method.java:597)
> >     at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
> >     at org.apache.hadoop.mapred.JobShell.run(JobShell.java:194)
> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> >     at org.apache.hadoop.mapred.JobShell.main(JobShell.java:220)
> >