Mapper Out of Memory

Hi,

I run Hadoop on a BSD4 cluster and the input to each map task is a gzip file (about 10MB). Some tasks finished, but many of them failed because the heap ran out of memory. I got the following syslogs:

2007-12-06 12:16:50,277 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2007-12-06 12:16:53,128 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 256
2007-12-06 12:16:53,638 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2007-12-06 12:18:19,079 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
java.lang.OutOfMemoryError: Java heap space

Does anyone know what the reason is and how we should avoid it?

Thanks,

Rui
|
|
RE: Mapper Out of Memory

You can control the heap size using the 'mapred.child.java.opts' option.

Check your program logic though. In my personal experience, running out of heap space in a map task usually suggests some runaway logic somewhere.
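One way to confirm that a value passed through 'mapred.child.java.opts' (a JVM flag such as -Xmx512m) actually reached the child JVM is to log the heap ceiling from inside the task. A minimal sketch using only the standard library; the class HeapCheck and its method names are hypothetical and not part of Hadoop, and where you call it from (for example your mapper's setup code) is left to you:

```java
// Hypothetical helper (not a Hadoop class): compares the -Xmx value in an
// options string with the heap ceiling the running JVM actually reports.
class HeapCheck {
    /** Maximum heap this JVM will attempt to use, in megabytes. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** Returns the -Xmx value in megabytes, or -1 if no usable -Xmx flag is present. */
    static long parseXmxMb(String javaOpts) {
        for (String opt : javaOpts.trim().split("\\s+")) {
            if (!opt.startsWith("-Xmx") || opt.length() <= 4) {
                continue;
            }
            String value = opt.substring(4).toLowerCase();
            char unit = value.charAt(value.length() - 1);
            if (Character.isDigit(unit)) {
                return Long.parseLong(value) / (1024L * 1024L); // plain bytes
            }
            long n = Long.parseLong(value.substring(0, value.length() - 1));
            switch (unit) {
                case 'k': return n / 1024L;
                case 'm': return n;
                case 'g': return n * 1024L;
                default:  return -1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String opts = "-Xmx512m"; // example value only
        System.out.println("requested (MB): " + parseXmxMb(opts));
        System.out.println("effective (MB): " + maxHeapMb());
    }
}
```

The effective figure can come out somewhat below the requested -Xmx, depending on the JVM and garbage collector, so treat a small gap as normal and a large one as a sign the option was not applied.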
|
|
Re: Mapper Out of Memory

Hello,

There is a setting in hadoop-0.15.0/bin/rcc. The default is:

JAVA_HEAP_MAX=-Xmx1000m

For 2GB of memory you can set this to about:

JAVA_HEAP_MAX=-Xmx1700m

2048m is the highest allowed setting on a Mac, Linux, non-Solaris Unix, or Windows box.

Peter W.

On Dec 6, 2007, at 12:30 PM, Rui Shi wrote:
> I run hadoop on a BSD4 clusters and each map task is a gzip file
> (about 10MB). Some tasks finished. But many of them failed due to
> heap out of memory.
|
|
Re: Mapper Out of Memory

Hi,

It is hard to believe that the heap size needs to be enlarged given that the input size is only 10MB. In particular, the input is not all loaded at the same time. As for the program logic, there is not much fancy stuff, mostly cutting and sorting. So GC should be able to handle it...

Thanks,

Rui

----- Original Message ----
From: Joydeep Sen Sarma <jssarma@...>
To: hadoop-user@...
Sent: Thursday, December 6, 2007 1:14:51 PM
Subject: RE: Mapper Out of Memory

Can control heap size using 'mapred.child.java.opts' option.

Check ur program logic though. Personal experience is that running out of heap space in map task usually suggests some runaway logic somewhere.
|
|
Re: Mapper Out of Memory

Rui Shi wrote:
> It is hard to believe that you need to enlarge heap size given the input size is only 10MB. In particular, you don't load all input at the same time. As for the program logic, no much fancy stuff, mostly cut and sorting. So GC should be able to handle...

Out-of-memory exceptions can also be caused by having too many files open at once. What does 'ulimit -n' show?

You presented an excerpt from a jobtracker log, right? What do the tasktracker logs show?

Can you monitor a node while it is running to see whether the JVM's heap is growing, or whether the number of open files (lsof -p) is large?

Also, can you please provide more details about your application? I.e., what is your inputformat, map function, etc.

Doug
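When shell access to the node is awkward, the two things asked about above (heap growth and open file count) can also be sampled from inside the JVM. This is only a sketch under assumptions: TaskResourceProbe is a hypothetical class, not part of Hadoop, and the descriptor count relies on the com.sun.management bean that HotSpot-style JVMs expose on Unix-like systems, so it falls back to -1 elsewhere:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical probe (not a Hadoop class): samples heap use and open file
// descriptors of the JVM it runs in.
class TaskResourceProbe {
    /** Heap currently in use by this JVM, in bytes. */
    static long heapUsedBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Open file descriptors of this JVM, or -1 if the platform bean is unavailable. */
    static long openFileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getOpenFileDescriptorCount();
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("heap used (bytes): " + heapUsedBytes());
        System.out.println("open descriptors:  " + openFileDescriptors());
    }
}
```

Logging these two numbers every few thousand records would show whether either one climbs steadily before the task dies.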
|
|
Re: Mapper Out of Memory

There is a bug in the GZIPInputStream on Java 1.5 that can cause an out-of-memory error on malformed gzip input.

It is possible that you are trying to treat this input as a splittable file, which is causing your maps to be fed from chunks of the gzip file. Those chunks would be ill-formed, of course, and it is possible that this is causing an out-of-memory condition.

I am just speculating, however. To confirm or discard this possibility, you should examine the stack traces for the maps that are falling over.

On 12/6/07 2:05 PM, "Rui Shi" <shearershot@...> wrote:
> It is hard to believe that you need to enlarge heap size given the input size
> is only 10MB. In particular, you don't load all input at the same time. As for
> the program logic, no much fancy stuff, mostly cut and sorting. So GC should
> be able to handle...
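To see why a chunk taken from the middle of a gzip file is ill-formed, the following sketch compresses a line in memory and then tries to decompress it from the start and from a later byte offset. GzipChunkDemo is a hypothetical class for illustration; it uses only java.util.zip, it does not reproduce the Java 1.5 bug mentioned above, and it says nothing about how Hadoop itself computes splits:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo (not a Hadoop class): a gzip stream can only be read
// from its beginning, so a chunk that starts later is rejected.
class GzipChunkDemo {
    /** Compresses text into a single gzip member held in memory. */
    static byte[] gzip(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Returns true if the bytes from 'offset' onward decompress as gzip. */
    static boolean decompresses(byte[] data, int offset) {
        try (GZIPInputStream in = new GZIPInputStream(
                new ByteArrayInputStream(data, offset, data.length - offset))) {
            byte[] buf = new byte[4096];
            while (in.read(buf) != -1) {
                // Drain the stream; only well-formedness matters here.
            }
            return true;
        } catch (IOException e) {
            return false; // the gzip header or body could not be parsed
        }
    }

    public static void main(String[] args) {
        byte[] data = gzip("field1\tfield2\tfield3\n");
        System.out.println("from byte 0: " + decompresses(data, 0));
        System.out.println("from byte 2: " + decompresses(data, 2));
    }
}
```

A modern JVM rejects the offset chunk cleanly with an exception; the concern raised in this message is that an older decoder might instead misbehave on such input.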
|
|
Re: Mapper Out of Memory

The JDK also provides "jmap -histo PID" which will give you some crude information about where the memory is going.

-Michael

On 12/6/07 2:16 PM, "Ted Dunning" <tdunning@...> wrote:
> There is a bug in the GZipInputStream on java 1.5 that can cause an
> out-of-memory error on a malformed gzip input.
> [...]
> To confirm or discard this possibility, you should examine the stack traces
> for the maps that are falling over.
|
|
Re: Mapper Out of Memory

Hi,

> Out-of-memory exceptions can also be caused by having too many files open at once. What does 'ulimit -n' show?

29491

> You presented an excerpt from a jobtracker log, right? What do the tasktracker logs show?

I saw some warnings in the tasktracker log:

2007-12-06 12:23:41,604 WARN org.apache.hadoop.ipc.Server: IPC Server handler 0 on 50050, call progress(task_200712031900_0014_m_000058_0, 9.126612E-12, hdfs:///usr/ruish/400.gz:0+9528361, MAP, org.apache.hadoop.mapred.Counters@11c135c) from: output error
java.nio.channels.ClosedChannelException
    at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:125)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:294)
    at org.apache.hadoop.ipc.SocketChannelOutputStream.flushBuffer(SocketChannelOutputStream.java:108)
    at org.apache.hadoop.ipc.SocketChannelOutputStream.write(SocketChannelOutputStream.java:89)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
    at java.io.DataOutputStream.flush(DataOutputStream.java:106)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:585)

And in the datanode logs:

2007-12-06 14:42:20,831 ERROR org.apache.hadoop.dfs.DataNode: DataXceiver: java.io.IOException: Block blk_-8176614602638949879 is valid, and cannot be written to.
    at org.apache.hadoop.dfs.FSDataset.writeToBlock(FSDataset.java:515)
    at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:822)
    at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:727)
    at java.lang.Thread.run(Thread.java:595)

> Also, can you please provide more details about your application? I.e., what is your inputformat, map function, etc.

Very simple stuff: projecting certain fields as the key and sorting. The input is gzipped files in which each line has some fields separated by a delimiter.

> Doug
|
|
Re: Mapper Out of Memory

On Dec 6, 2007, at 12:30 PM, Rui Shi wrote:
> Does anyone know what is the reason and how should we avoid it?

Java 6 gives a little better information in the form of a stack trace. My patch HADOOP-2367 will also help after it is finished and committed. It will allow you to get CPU and heap summaries from representative tasks.

-- Owen
|
|
|
|
|
RE: Mapper Out of Memory

Was the value of mapred.child.java.opts set to something like 512MB? What's the io.sort.mb set to?

> -----Original Message-----
> From: Rui Shi [mailto:shearershot@...]
> Sent: Sunday, December 09, 2007 6:02 AM
> To: hadoop-user@...
> Subject: Re: Mapper Out of Memory
>
> Hi,
>
> I did some experiments on a single Linux machine. I generated
> some data using the 'random writer' and use the 'sort' in the
> hadoop-examples to sort them. I still got some out of memory
> exceptions as follows:
>
> java.lang.OutOfMemoryError: Java heap space
>     at java.util.Arrays.copyOf(Unknown Source)
>     at java.io.ByteArrayOutputStream.write(Unknown Source)
>     at java.io.DataOutputStream.write(Unknown Source)
>     at org.apache.hadoop.io.BytesWritable.write(BytesWritable.java:137)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:340)
>     at org.apache.hadoop.mapred.lib.IdentityMapper.map(IdentityMapper.java:39)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:189)
>     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1777)
>
> Any ideas?
>
> Thanks,
>
> Rui
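The quoted trace fails inside MapTask$MapOutputBuffer.collect, which is why the question pairs the child heap size with io.sort.mb: the map-side sort buffer has to fit inside the task heap with room left over for everything else. A rough sanity check can be sketched as below. SortBufferCheck, the 0.6 free-heap threshold, and the sample sizes are all assumptions for illustration, not Hadoop defaults or Hadoop code:

```java
// Hypothetical check (not a Hadoop class): does a sort buffer of the given
// size leave enough of the task heap free for other allocations?
class SortBufferCheck {
    /**
     * Returns true when the sort buffer leaves at least minFreeFraction
     * of the heap available for the rest of the task.
     */
    static boolean fitsInHeap(long heapMb, long sortBufferMb, double minFreeFraction) {
        if (heapMb <= 0 || sortBufferMb < 0) {
            throw new IllegalArgumentException("heap must be > 0 and buffer >= 0");
        }
        double free = (double) (heapMb - sortBufferMb) / heapMb;
        return free >= minFreeFraction;
    }

    public static void main(String[] args) {
        // Illustrative sizes only, not read from any real configuration.
        System.out.println("200 MB heap, 100 MB buffer: " + fitsInHeap(200, 100, 0.6));
        System.out.println("512 MB heap, 100 MB buffer: " + fitsInHeap(512, 100, 0.6));
    }
}
```

The threshold is a judgment call; the point is only that raising the heap or lowering the sort buffer both move a job toward the safe side of the same ratio.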
|
|
|
|
|
RE: Mapper Out of Memory

Rui, please set mapred.child.java.opts to 512m. That should take care of the OOM problem.

> -----Original Message-----
> From: Rui Shi [mailto:shearershot@...]
> Sent: Tuesday, December 11, 2007 3:15 AM
> To: hadoop-user@...
> Subject: Re: Mapper Out of Memory
>
> Hi,
>
> I didn't change those numbers. Basically, using the defaults.
>
> Thanks,
>
> Rui
>
> ----- Original Message ----
> From: Devaraj Das <ddas@...>
> To: hadoop-user@...
> Sent: Monday, December 10, 2007 4:48:59 AM
> Subject: RE: Mapper Out of Memory
>
> Was the value of mapred.child.java.opts set to something like 512MB ?
> What's the io.sort.mb set to?