Hadoop wants to do whoami?

Hadoop wants to do whoami?

by Paul Tomblin

I'm trying to move my crawler from a shared hosting environment (where
it kept getting killed off for using too much memory) to a VPS.  But
on the new host, I'm getting the following exception:

[ WARN] 01:15:17 (FileSystem.java:<init>:1440)
uri=file:///

javax.security.auth.login.LoginException: Login failed: Cannot run
program "whoami": java.io.IOException: error=12, Cannot allocate
memory
        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:250)
        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:275)
        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:257)
        at org.apache.hadoop.security.UserGroupInformation.login(UserGroupInformation.java:67)
        at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:1438)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1376)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:215)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:120)
        at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:319)
        at org.apache.hadoop.mapred.FileInputFormat.addInputPath(FileInputFormat.java:313)
        at org.apache.nutch.crawl.Injector.inject(Injector.java:152)
        at com.lucidityworks.nutch.crawler.Crawler.crawlIt(Crawler.java:407)
        at com.lucidityworks.nutch.crawler.Crawler.crawlSite(Crawler.java:381)
        at com.lucidityworks.nutch.crawler.Crawler.crawlCategory(Crawler.java:255)
        at com.lucidityworks.nutch.crawler.Crawler.crawl(Crawler.java:166)
        at com.lucidityworks.nutch.crawler.Crawler.main(Crawler.java:724)

What is going on here?

--
http://www.linkedin.com/in/paultomblin
http://careers.stackoverflow.com/ptomblin

Re: Hadoop wants to do whoami?

by Ken Krugler

Hi Paul,

Hadoop runs the whoami command to find out which user it's running as.

When Java launches a command-line tool, the process running the JVM
gets forked.

That fork can in turn trigger out-of-memory errors if you're running
with little or no swap, or if your OS doesn't support memory overcommit.

I ran into something similar when running some Java code in a VMware
environment where the instances hadn't been set up with any swap space.

See http://issues.apache.org/jira/browse/HADOOP-5059 for more details.
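
A minimal Java sketch of the difference, not Hadoop's actual code: viaWhoami
spawns the external command, which is the step that needs a fork and can fail
with error=12, while viaSystemProperty reads the JVM's own user.name property
without creating a child process. The class and method names are made up for
illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Hadoop or Nutch.
final class CurrentUser {

    // Spawns the external "whoami" command. Starting the child process is the
    // step that can throw IOException ("error=12, Cannot allocate memory").
    static String viaWhoami() throws IOException, InterruptedException {
        Process p = new ProcessBuilder("whoami").redirectErrorStream(true).start();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            String line = r.readLine();
            p.waitFor();
            return line == null ? "" : line.trim();
        }
    }

    // Reads the user name from the JVM itself; no child process is created.
    static String viaSystemProperty() {
        return System.getProperty("user.name");
    }

    public static void main(String[] args) {
        System.out.println("user.name property: " + viaSystemProperty());
        try {
            System.out.println("whoami output:      " + viaWhoami());
        } catch (IOException | InterruptedException e) {
            System.out.println("whoami could not be run: " + e.getMessage());
        }
    }
}
```

On a memory-starved host the first line should still print, while the second
call is the one expected to hit the exception path.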

-- Ken

--------------------------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g

Re: Hadoop wants to do whoami?

by Neera

Most likely the memory configured for your Java process is too high.

You usually get this error if
2 * (memory configured for the process) > (total physical memory
available + swap space)
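
As a rough illustration only, the rule of thumb above can be written as a
one-line check. The class name, method name, and the example sizes are
assumptions; the kernel's real decision also depends on its overcommit
settings.

```java
// Hypothetical helper; encodes only the rule of thumb quoted above.
final class ForkHeadroom {

    // True when 2 * processBytes exceeds physical memory plus swap.
    static boolean forkLikelyToFail(long processBytes, long physicalBytes, long swapBytes) {
        return 2 * processBytes > physicalBytes + swapBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Assumed example: a 300 MB process on a 512 MB host without swap.
        System.out.println(forkLikelyToFail(300 * mb, 512 * mb, 0));
        // Same process and host with 512 MB of swap added.
        System.out.println(forkLikelyToFail(300 * mb, 512 * mb, 512 * mb));
    }
}
```

With these assumed numbers the first case exceeds the limit (600 MB against
512 MB) and the second does not (600 MB against 1024 MB).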


You can find more explanation about this error at

http://issues.apache.org/jira/browse/HADOOP-5059

HTH.

Neera

Re: Hadoop wants to do whoami?

by Paul Tomblin

Wait a second, doesn't Linux have a fork that does copy-on-write?  I
mean, SunOS got that around 1990 or so (at least that's when I was
told not to bother using vfork because fork no longer used as much
memory).  Surely Linux has caught up to 15 years ago.

Re: Hadoop wants to do whoami?

by Ken Krugler

Hi Paul,

On Nov 6, 2009, at 5:39pm, Paul Tomblin wrote:

> Wait a second, doesn't Linux have a fork that does copy-on-write?  I
> mean, SunOS got that around 1990 or so (at least that's when I was
> told not to bother using vfork because fork no longer used as much
> memory).  Surely Linux has caught up to 15 years ago.

Normally it works fine, but the fork can fail if you don't have swap
space allocated, because swap is factored into the free-memory
calculation when the fork happens.

What's the swap space setup on your VPS?
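
One way to check from Java on Linux, as a sketch under assumptions: it reads
/proc/meminfo and pulls out the MemTotal and SwapTotal fields. The class and
method names are hypothetical, and the file only exists on Linux.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper for illustration; Linux-only because of /proc/meminfo.
final class SwapCheck {

    // Returns the value in kB for a /proc/meminfo key such as "SwapTotal",
    // or -1 if the key is not present.
    static long meminfoKb(List<String> lines, String key) {
        String prefix = key + ":";
        for (String line : lines) {
            if (line.startsWith(prefix)) {
                // Lines look like "SwapTotal:       1048572 kB".
                String[] parts = line.substring(prefix.length()).trim().split("\\s+");
                return Long.parseLong(parts[0]);
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = Files.readAllLines(Paths.get("/proc/meminfo"));
        System.out.println("MemTotal  kB: " + meminfoKb(lines, "MemTotal"));
        System.out.println("SwapTotal kB: " + meminfoKb(lines, "SwapTotal"));
    }
}
```

A SwapTotal of 0 would mean the host has no swap configured.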

-- Ken

Re: Hadoop wants to do whoami?

by Fadzi Ushewokunze-2

If you are running under Windows, try running the crawler under Cygwin.


Re: Hadoop wants to do whoami?

by Paul Tomblin

On Fri, Nov 6, 2009 at 11:44 PM, Ken Krugler
<kkrugler_lists@...> wrote:

> Normally it works fine, but the fork can fail if you don't have swap
> space allocated, because swap is factored into the free-memory
> calculation when the fork happens.
>
> What's the swap space setup on your VPS?

There's no swap space.
