Mapper Out of Memory


Mapper Out of Memory

by Rui Shi


Hi,

I run Hadoop on a BSD4 cluster, and each map task processes one gzip file (about 10 MB). Some tasks finished, but many of them failed with heap out-of-memory errors. I got the following syslogs:

2007-12-06 12:16:50,277 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2007-12-06 12:16:53,128 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 256
2007-12-06 12:16:53,638 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2007-12-06 12:18:19,079 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
java.lang.OutOfMemoryError: Java heap space

Does anyone know what the reason is and how we can avoid it?

Thanks,

Rui






RE: Mapper Out of Memory

by Joydeep Sen Sarma

You can control the heap size using the 'mapred.child.java.opts' option.

Check your program logic, though. In my experience, running out
of heap space in a map task usually suggests runaway logic somewhere.
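
To confirm that a value passed through 'mapred.child.java.opts' actually reaches the task JVM, one option is to print the heap ceiling the JVM itself reports. The following is a minimal sketch using only the Java standard library; the class name HeapLimit is hypothetical and nothing in it is Hadoop-specific.

```java
// Hypothetical helper, not part of Hadoop: reports the heap ceiling
// (roughly the -Xmx value) that the running JVM was started with.
class HeapLimit {
    /** Maximum heap the JVM will attempt to use, in whole megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("max heap (MB): " + maxHeapMegabytes());
    }
}
```

Logging this value at the start of a map task, or running the class by hand with the same -Xmx flag, shows whether the setting took effect.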


Re: Mapper Out of Memory

by Peter W.-3

Hello,

There is a setting in hadoop-0.15.0/bin/rcc

default:

JAVA_HEAP_MAX=-Xmx1000m

With 2 GB of memory you can set this to about:

JAVA_HEAP_MAX=-Xmx1700m

2048m is the highest allowed setting on a Mac, Linux,
non-Solaris Unix, or Windows box.

Peter W.


Re: Mapper Out of Memory

by Rui Shi


Hi,

It is hard to believe that the heap size needs to be enlarged when the input is only 10 MB. In particular, the input is not all loaded at the same time. As for the program logic, there is not much fancy stuff, mostly cutting and sorting, so GC should be able to handle it...

Thanks,

Rui



Re: Mapper Out of Memory

by Doug Cutting-4

Rui Shi wrote:
> It is hard to believe that the heap size needs to be enlarged when the input is only 10 MB. In particular, the input is not all loaded at the same time. As for the program logic, there is not much fancy stuff, mostly cutting and sorting, so GC should be able to handle it...

Out-of-memory exceptions can also be caused by having too many files
open at once.  What does 'ulimit -n' show?

You presented an excerpt from a jobtracker log, right?  What do the
tasktracker logs show?

Can you monitor a node while it is running to see whether the jvm's heap
is growing, or whether the number of open files (lsof -p) is large?

Also, can you please provide more details about your application?  I.e.,
what is your inputformat, map function, etc.

Doug
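
One way to watch whether the heap is growing, as suggested above, is to sample the JVM's own memory statistics from inside the process. The sketch below uses the standard java.lang.management API; the class name HeapSampler, the sample count, and the one-second interval are assumptions for illustration. It only observes the JVM it runs in, so to watch a map task it would have to be called from within that task.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical monitoring helper: samples this JVM's heap usage so that
// steady growth can be told apart from a one-off spike.
class HeapSampler {
    /** Returns the heap bytes currently in use by this JVM. */
    static long usedHeapBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return heap.getUsed();
    }

    public static void main(String[] args) throws InterruptedException {
        // Take a few samples one second apart and print them.
        for (int i = 0; i < 5; i++) {
            System.out.println("used heap (bytes): " + usedHeapBytes());
            Thread.sleep(1000L);
        }
    }
}
```

A series of samples that keeps climbing across garbage collections points toward retained objects rather than a momentary peak.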

Re: Mapper Out of Memory

by Ted Dunning-3


There is a bug in GZIPInputStream on Java 1.5 that can cause an
out-of-memory error on malformed gzip input.

It is possible that you are trying to treat this input as a splittable file
which is causing your maps to be fed from chunks of the gzip file.  Those
chunks would be ill-formed, of course, and it is possible that this is
causing an out-of-memory condition.

I am just speculating, however.  To confirm or discard this possibility, you
should examine the stack traces for the maps that are falling over.
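
Why a chunk taken from the middle of a gzip file is ill-formed can be shown with the standard java.util.zip classes. This is a sketch under stated assumptions: the class name GzipSplitDemo and the sample data are made up, and it does not attempt to reproduce the Java 1.5 bug mentioned above. In this sketch the failure surfaces as an IOException, not an out-of-memory error.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Illustration only: a gzip stream is readable from its first byte but not
// from an arbitrary offset, which is what a split of the file would see.
class GzipSplitDemo {
    /** Compresses the given bytes into a single gzip stream. */
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Returns true if the bytes from 'offset' onward decompress cleanly. */
    static boolean readableFrom(byte[] gzipped, int offset) {
        try (InputStream in = new GZIPInputStream(
                new ByteArrayInputStream(gzipped, offset, gzipped.length - offset))) {
            byte[] buf = new byte[4096];
            while (in.read(buf) != -1) {
                // Discard the data; only success or failure matters here.
            }
            return true;
        } catch (IOException e) {
            // ZipException and EOFException are both IOExceptions.
            return false;
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10000; i++) {
            sb.append("field1\tfield2\tline ").append(i).append('\n');
        }
        byte[] gzipped = gzip(sb.toString().getBytes(StandardCharsets.UTF_8));
        System.out.println("from start:  " + readableFrom(gzipped, 0));
        System.out.println("from middle: " + readableFrom(gzipped, gzipped.length / 2));
    }
}
```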



Re: Mapper Out of Memory

by Michael Bieniosek

The JDK also provides "jmap -histo PID" which will give you some crude information about where the memory is going.

-Michael





Re: Mapper Out of Memory

by Rui Shi

Hi,

> Out-of-memory exceptions can also be caused by having too many files
> open at once.  What does 'ulimit -n' show?

29491

> You presented an excerpt from a jobtracker log, right?  What do the
> tasktracker logs show?

I saw some warnings in the tasktracker log:

2007-12-06 12:23:41,604 WARN org.apache.hadoop.ipc.Server: IPC Server handler 0 on 50050, call progress(task_200712031900_0014_m_000058_0, 9.126612E-12, hdfs:///usr/ruish/400.gz:0+9528361, MAP, org.apache.hadoop.mapred.Counters@11c135c) from: output error
java.nio.channels.ClosedChannelException
        at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:125)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:294)
        at org.apache.hadoop.ipc.SocketChannelOutputStream.flushBuffer(SocketChannelOutputStream.java:108)
        at org.apache.hadoop.ipc.SocketChannelOutputStream.write(SocketChannelOutputStream.java:89)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
        at java.io.DataOutputStream.flush(DataOutputStream.java:106)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:585)

And in the datanode logs:

2007-12-06 14:42:20,831 ERROR org.apache.hadoop.dfs.DataNode: DataXceiver: java.io.IOException: Block blk_-8176614602638949879 is valid, and cannot be written to.
        at org.apache.hadoop.dfs.FSDataset.writeToBlock(FSDataset.java:515)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:822)
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:727)
        at java.lang.Thread.run(Thread.java:595)

> Also, can you please provide more details about your application?  I.e.,
> what is your inputformat, map function, etc.

Very simple stuff: projecting certain fields as the key and sorting. The input is gzipped files in which each line has some fields separated by a delimiter.
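
The kind of projection described above might look roughly like the following. This is a hypothetical sketch, not the poster's code: the tab delimiter, the field positions, and the class name FieldProjection are all assumptions, and Collections.sort stands in for the sort that the framework would normally perform between map and reduce.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch of "project certain fields as the key, then sort".
// The tab delimiter and the field positions are assumptions for illustration.
class FieldProjection {
    /** Joins the selected fields of a delimited line into a key. */
    static String projectKey(String line, String delimiter, int... fields) {
        // Quote the delimiter so it is matched literally, and keep empty fields.
        String[] parts = line.split(Pattern.quote(delimiter), -1);
        StringBuilder key = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                key.append(delimiter);
            }
            key.append(parts[fields[i]]);
        }
        return key.toString();
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("b\t2\tx", "a\t1\ty", "c\t3\tz");
        List<String> keys = new ArrayList<>();
        for (String line : lines) {
            keys.add(projectKey(line, "\t", 0, 2));
        }
        Collections.sort(keys);
        System.out.println(keys);
    }
}
```

Nothing in a projection like this holds more than one line at a time, which is consistent with the expectation that the map logic itself should not need a large heap.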


Re: Mapper Out of Memory

by Owen O'Malley-2


On Dec 6, 2007, at 12:30 PM, Rui Shi wrote:

> Does anyone know what is the reason and how should we avoid it?

Java 6 gives slightly better information in the form of a stack
trace. My patch HADOOP-2367 will also help once it is finished and
committed. It will allow you to get CPU and heap summaries from
representative tasks.

-- Owen

Re: Mapper Out of Memory

by Rui Shi

Hi,

I did some experiments on a single Linux machine. I generated some data using the 'random writer' and used the 'sort' in hadoop-examples to sort it. I still got some out-of-memory exceptions, as follows:

java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Unknown Source)
        at java.io.ByteArrayOutputStream.write(Unknown Source)
        at java.io.DataOutputStream.write(Unknown Source)
        at org.apache.hadoop.io.BytesWritable.write(BytesWritable.java:137)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:340)
        at org.apache.hadoop.mapred.lib.IdentityMapper.map(IdentityMapper.java:39)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:189)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1777)

Any ideas?

Thanks,

Rui
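
The Arrays.copyOf frame beneath ByteArrayOutputStream.write in the trace above is the stream enlarging its backing array: a larger array is allocated and the old contents are copied into it, so for a moment both arrays are live on the heap. The sketch below makes that growth visible with a small subclass; the class name GrowthProbe and the 1 KB record size are assumptions for illustration. This may be part of why a buffer that nominally fits can still exhaust a small heap.

```java
import java.io.ByteArrayOutputStream;

// Illustration only: exposes the backing array of ByteArrayOutputStream to
// show that it grows by reallocating and copying, so the old and the new
// array briefly coexist on the heap.
class GrowthProbe extends ByteArrayOutputStream {
    /** Current length of the backing array, which can exceed size(). */
    int capacity() {
        return buf.length;
    }

    public static void main(String[] args) {
        GrowthProbe out = new GrowthProbe();
        int last = out.capacity();
        byte[] record = new byte[1024]; // stand-in for one serialized record
        for (int i = 0; i < 4096; i++) {
            out.write(record, 0, record.length);
            if (out.capacity() != last) {
                // A change in capacity means a reallocation and copy happened.
                last = out.capacity();
                System.out.println("size=" + out.size() + " capacity=" + last);
            }
        }
    }
}
```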


RE: Mapper Out of Memory

by Devaraj Das

Was mapred.child.java.opts set to something like 512 MB? What is
io.sort.mb set to?



Re: Mapper Out of Memory

by Rui Shi

Hi,

I didn't change those numbers; I am basically using the defaults.

Thanks,

Rui


RE: Mapper Out of Memory

by Devaraj Das

Rui, please set mapred.child.java.opts to -Xmx512m. That should take care of
the OOM problem.
