[jira] Created: (HADOOP-2680) Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery

View: New views
13 Messages — Rating Filter:   Alert me  

[jira] Created: (HADOOP-2680) Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery

by JIRA jira@apache.org :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery
--------------------------------------------------------------------------------------------------

                 Key: HADOOP-2680
                 URL: https://issues.apache.org/jira/browse/HADOOP-2680
             Project: Hadoop
          Issue Type: Bug
          Components: dfs
    Affects Versions: 0.16.0
            Reporter: Mukund Madhugiri
            Priority: Blocker
             Fix For: 0.16.0


The unit test: TestDFSStorageStateRecovery has failed a couple of times with a timeout on the nightly build:

Here are the logs from the failures:
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/373/console
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/369/console


Here is a build where it passed:
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/365/console

Here is the part where it failed:

[junit] 2008-01-22 12:16:39,499 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(84)) - ============================================================
    [junit] 2008-01-22 12:16:39,502 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(85)) - ***TEST 34*** NAME_NODE recovery: numDirs=1 testCase=17 current=false previous=true previous.tmp=false removed.tmp=true
    [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 4 on 36207: exiting
    [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 3 on 36207: exiting
    [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 7 on 36207: exiting
    [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(353)) - Stopping IPC Server listener on 36207
    [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 2 on 36207: exiting
    [junit] 2008-01-22 12:16:40,153 INFO  jvm.JvmMetrics (RpcMetrics.java:setTags(49)) - Initializing RPC Metrics with serverName=NameNode, port=0
    [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=RpcStatistics
    [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
    [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
    [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
    [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
    [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
    [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
    [junit] at org.apache.hadoop.ipc.metrics.RpcMgt.<init>(RpcMgt.java:37)
    [junit] at org.apache.hadoop.ipc.metrics.RpcMetrics.<init>(RpcMetrics.java:60)
    [junit] at org.apache.hadoop.ipc.Server.<init>(Server.java:969)
    [junit] at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:393)
    [junit] at org.apache.hadoop.ipc.RPC.getServer(RPC.java:355)
    [junit] 2008-01-22 12:16:40,156 INFO  dfs.NameNode (NameNode.java:initialize(125)) - Namenode up at: localhost/127.0.0.1:36768
    [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:119)
    [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
    [junit] 2008-01-22 12:16:40,156 INFO  jvm.JvmMetrics (JvmMetrics.java:init(51)) - Cannot initialize JVM Metrics with processName=NameNode, sessionId=null - already initialized
    [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
    [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
    [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
    [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
    [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
    [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
    [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    [junit] at java.lang.reflect.Method.invoke(Method.java:585)
    [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
    [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
    [junit] 2008-01-22 12:16:40,158 INFO  dfs.NameNodeMetrics (NameNodeMetrics.java:<init>(74)) - Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
    [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
    [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
    [junit] at junit.framework.TestResult.run(TestResult.java:109)
    [junit] at junit.framework.TestCase.run(TestCase.java:118)
    [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
    [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
    [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
    [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
    [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
    [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=NameNodeStatistics
    [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
    [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
    [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
    [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
    [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
    [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
    [junit] at org.apache.hadoop.dfs.namenode.metrics.NameNodeMgt.<init>(NameNodeMgt.java:36)
    [junit] at org.apache.hadoop.dfs.NameNodeMetrics.<init>(NameNodeMetrics.java:67)
    [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:127)
    [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
    [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
    [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
    [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
    [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
    [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
    [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
    [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    [junit] at java.lang.reflect.Method.invoke(Method.java:585)
    [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
    [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
    [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
    [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
    [junit] at junit.framework.TestResult.run(TestResult.java:109)
    [junit] at junit.framework.TestCase.run(TestCase.java:118)
    [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
    [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
    [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
    [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
    [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
    [junit] 2008-01-22 12:16:40,193 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(321)) - fsOwner=hudson,hudson
    [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(325)) - supergroup=supergroup
    [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(326)) - isPermissionEnabled=true
    [junit] 2008-01-22 12:16:40,198 INFO  dfs.Storage (Storage.java:doRecover(369)) - Recovering storage directory /export/home/hudson/hudson/jobs/Hadoop-Nightly/workspace/trunk/build/test/data/name1 from previous rollback.
    [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
    [junit] Test org.apache.hadoop.dfs.TestDFSStorageStateRecovery FAILED (timeout)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HADOOP-2680) Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery

by JIRA jira@apache.org :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


     [ https://issues.apache.org/jira/browse/HADOOP-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mukund Madhugiri reassigned HADOOP-2680:
----------------------------------------

    Assignee: Robert Chansler

> Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery
> --------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2680
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2680
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Mukund Madhugiri
>            Assignee: Robert Chansler
>            Priority: Blocker
>             Fix For: 0.16.0
>
>
> The unit test: TestDFSStorageStateRecovery has failed a couple of times with a timeout on the nightly build:
> Here are the logs from the failures:
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/373/console
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/369/console
> Here is a build where it passed:
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/365/console
> Here is the part where it failed:
> [junit] 2008-01-22 12:16:39,499 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(84)) - ============================================================
>     [junit] 2008-01-22 12:16:39,502 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(85)) - ***TEST 34*** NAME_NODE recovery: numDirs=1 testCase=17 current=false previous=true previous.tmp=false removed.tmp=true
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 4 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 3 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 7 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(353)) - Stopping IPC Server listener on 36207
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 2 on 36207: exiting
>     [junit] 2008-01-22 12:16:40,153 INFO  jvm.JvmMetrics (RpcMetrics.java:setTags(49)) - Initializing RPC Metrics with serverName=NameNode, port=0
>     [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=RpcStatistics
>     [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
>     [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
>     [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
>     [junit] at org.apache.hadoop.ipc.metrics.RpcMgt.<init>(RpcMgt.java:37)
>     [junit] at org.apache.hadoop.ipc.metrics.RpcMetrics.<init>(RpcMetrics.java:60)
>     [junit] at org.apache.hadoop.ipc.Server.<init>(Server.java:969)
>     [junit] at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:393)
>     [junit] at org.apache.hadoop.ipc.RPC.getServer(RPC.java:355)
>     [junit] 2008-01-22 12:16:40,156 INFO  dfs.NameNode (NameNode.java:initialize(125)) - Namenode up at: localhost/127.0.0.1:36768
>     [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:119)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
>     [junit] 2008-01-22 12:16:40,156 INFO  jvm.JvmMetrics (JvmMetrics.java:init(51)) - Cannot initialize JVM Metrics with processName=NameNode, sessionId=null - already initialized
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
>     [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
>     [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     [junit] at java.lang.reflect.Method.invoke(Method.java:585)
>     [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
>     [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
>     [junit] 2008-01-22 12:16:40,158 INFO  dfs.NameNodeMetrics (NameNodeMetrics.java:<init>(74)) - Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
>     [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
>     [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] at junit.framework.TestResult.run(TestResult.java:109)
>     [junit] at junit.framework.TestCase.run(TestCase.java:118)
>     [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
>     [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
>     [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=NameNodeStatistics
>     [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
>     [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
>     [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
>     [junit] at org.apache.hadoop.dfs.namenode.metrics.NameNodeMgt.<init>(NameNodeMgt.java:36)
>     [junit] at org.apache.hadoop.dfs.NameNodeMetrics.<init>(NameNodeMetrics.java:67)
>     [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:127)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
>     [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
>     [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     [junit] at java.lang.reflect.Method.invoke(Method.java:585)
>     [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
>     [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
>     [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
>     [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] at junit.framework.TestResult.run(TestResult.java:109)
>     [junit] at junit.framework.TestCase.run(TestCase.java:118)
>     [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
>     [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
>     [junit] 2008-01-22 12:16:40,193 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(321)) - fsOwner=hudson,hudson
>     [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(325)) - supergroup=supergroup
>     [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(326)) - isPermissionEnabled=true
>     [junit] 2008-01-22 12:16:40,198 INFO  dfs.Storage (Storage.java:doRecover(369)) - Recovering storage directory /export/home/hudson/hudson/jobs/Hadoop-Nightly/workspace/trunk/build/test/data/name1 from previous rollback.
>     [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
>     [junit] Test org.apache.hadoop.dfs.TestDFSStorageStateRecovery FAILED (timeout)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HADOOP-2680) Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery

by JIRA jira@apache.org :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


     [ https://issues.apache.org/jira/browse/HADOOP-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur reassigned HADOOP-2680:
----------------------------------------

    Assignee: Konstantin Shvachko  (was: Robert Chansler)

> Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery
> --------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2680
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2680
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Mukund Madhugiri
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.16.0
>
>
> The unit test: TestDFSStorageStateRecovery has failed a couple of times with a timeout on the nightly build:
> Here are the logs from the failures:
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/373/console
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/369/console
> Here is a build where it passed:
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/365/console
> Here is the part where it failed:
> [junit] 2008-01-22 12:16:39,499 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(84)) - ============================================================
>     [junit] 2008-01-22 12:16:39,502 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(85)) - ***TEST 34*** NAME_NODE recovery: numDirs=1 testCase=17 current=false previous=true previous.tmp=false removed.tmp=true
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 4 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 3 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 7 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(353)) - Stopping IPC Server listener on 36207
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 2 on 36207: exiting
>     [junit] 2008-01-22 12:16:40,153 INFO  jvm.JvmMetrics (RpcMetrics.java:setTags(49)) - Initializing RPC Metrics with serverName=NameNode, port=0
>     [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=RpcStatistics
>     [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
>     [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
>     [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
>     [junit] at org.apache.hadoop.ipc.metrics.RpcMgt.<init>(RpcMgt.java:37)
>     [junit] at org.apache.hadoop.ipc.metrics.RpcMetrics.<init>(RpcMetrics.java:60)
>     [junit] at org.apache.hadoop.ipc.Server.<init>(Server.java:969)
>     [junit] at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:393)
>     [junit] at org.apache.hadoop.ipc.RPC.getServer(RPC.java:355)
>     [junit] 2008-01-22 12:16:40,156 INFO  dfs.NameNode (NameNode.java:initialize(125)) - Namenode up at: localhost/127.0.0.1:36768
>     [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:119)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
>     [junit] 2008-01-22 12:16:40,156 INFO  jvm.JvmMetrics (JvmMetrics.java:init(51)) - Cannot initialize JVM Metrics with processName=NameNode, sessionId=null - already initialized
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
>     [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
>     [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     [junit] at java.lang.reflect.Method.invoke(Method.java:585)
>     [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
>     [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
>     [junit] 2008-01-22 12:16:40,158 INFO  dfs.NameNodeMetrics (NameNodeMetrics.java:<init>(74)) - Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
>     [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
>     [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] at junit.framework.TestResult.run(TestResult.java:109)
>     [junit] at junit.framework.TestCase.run(TestCase.java:118)
>     [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
>     [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
>     [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=NameNodeStatistics
>     [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
>     [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
>     [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
>     [junit] at org.apache.hadoop.dfs.namenode.metrics.NameNodeMgt.<init>(NameNodeMgt.java:36)
>     [junit] at org.apache.hadoop.dfs.NameNodeMetrics.<init>(NameNodeMetrics.java:67)
>     [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:127)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
>     [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
>     [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     [junit] at java.lang.reflect.Method.invoke(Method.java:585)
>     [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
>     [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
>     [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
>     [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] at junit.framework.TestResult.run(TestResult.java:109)
>     [junit] at junit.framework.TestCase.run(TestCase.java:118)
>     [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
>     [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
>     [junit] 2008-01-22 12:16:40,193 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(321)) - fsOwner=hudson,hudson
>     [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(325)) - supergroup=supergroup
>     [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(326)) - isPermissionEnabled=true
>     [junit] 2008-01-22 12:16:40,198 INFO  dfs.Storage (Storage.java:doRecover(369)) - Recovering storage directory /export/home/hudson/hudson/jobs/Hadoop-Nightly/workspace/trunk/build/test/data/name1 from previous rollback.
>     [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
>     [junit] Test org.apache.hadoop.dfs.TestDFSStorageStateRecovery FAILED (timeout)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-2680) Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery

by JIRA jira@apache.org :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


    [ https://issues.apache.org/jira/browse/HADOOP-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12561542#action_12561542 ]

Mukund Madhugiri commented on HADOOP-2680:
------------------------------------------

This test was first seen failing on Jan 04, 2008 (see HADOOP-2524). Since this JIRA has the latest information, I am going to mark HADOOP-2524 as a DUP of this one.

> Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery
> --------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2680
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2680
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Mukund Madhugiri
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.16.0
>
>
> The unit test: TestDFSStorageStateRecovery has failed a couple of times with a timeout on the nightly build:
> Here are the logs from the failures:
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/373/console
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/369/console
> Here is a build where it passed:
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/365/console
> Here is the part where it failed:
> [junit] 2008-01-22 12:16:39,499 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(84)) - ============================================================
>     [junit] 2008-01-22 12:16:39,502 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(85)) - ***TEST 34*** NAME_NODE recovery: numDirs=1 testCase=17 current=false previous=true previous.tmp=false removed.tmp=true
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 4 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 3 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 7 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(353)) - Stopping IPC Server listener on 36207
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 2 on 36207: exiting
>     [junit] 2008-01-22 12:16:40,153 INFO  jvm.JvmMetrics (RpcMetrics.java:setTags(49)) - Initializing RPC Metrics with serverName=NameNode, port=0
>     [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=RpcStatistics
>     [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
>     [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
>     [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
>     [junit] at org.apache.hadoop.ipc.metrics.RpcMgt.<init>(RpcMgt.java:37)
>     [junit] at org.apache.hadoop.ipc.metrics.RpcMetrics.<init>(RpcMetrics.java:60)
>     [junit] at org.apache.hadoop.ipc.Server.<init>(Server.java:969)
>     [junit] at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:393)
>     [junit] at org.apache.hadoop.ipc.RPC.getServer(RPC.java:355)
>     [junit] 2008-01-22 12:16:40,156 INFO  dfs.NameNode (NameNode.java:initialize(125)) - Namenode up at: localhost/127.0.0.1:36768
>     [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:119)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
>     [junit] 2008-01-22 12:16:40,156 INFO  jvm.JvmMetrics (JvmMetrics.java:init(51)) - Cannot initialize JVM Metrics with processName=NameNode, sessionId=null - already initialized
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
>     [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
>     [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     [junit] at java.lang.reflect.Method.invoke(Method.java:585)
>     [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
>     [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
>     [junit] 2008-01-22 12:16:40,158 INFO  dfs.NameNodeMetrics (NameNodeMetrics.java:<init>(74)) - Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
>     [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
>     [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] at junit.framework.TestResult.run(TestResult.java:109)
>     [junit] at junit.framework.TestCase.run(TestCase.java:118)
>     [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
>     [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
>     [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=NameNodeStatistics
>     [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
>     [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
>     [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
>     [junit] at org.apache.hadoop.dfs.namenode.metrics.NameNodeMgt.<init>(NameNodeMgt.java:36)
>     [junit] at org.apache.hadoop.dfs.NameNodeMetrics.<init>(NameNodeMetrics.java:67)
>     [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:127)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
>     [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
>     [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     [junit] at java.lang.reflect.Method.invoke(Method.java:585)
>     [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
>     [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
>     [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
>     [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] at junit.framework.TestResult.run(TestResult.java:109)
>     [junit] at junit.framework.TestCase.run(TestCase.java:118)
>     [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
>     [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
>     [junit] 2008-01-22 12:16:40,193 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(321)) - fsOwner=hudson,hudson
>     [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(325)) - supergroup=supergroup
>     [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(326)) - isPermissionEnabled=true
>     [junit] 2008-01-22 12:16:40,198 INFO  dfs.Storage (Storage.java:doRecover(369)) - Recovering storage directory /export/home/hudson/hudson/jobs/Hadoop-Nightly/workspace/trunk/build/test/data/name1 from previous rollback.
>     [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
>     [junit] Test org.apache.hadoop.dfs.TestDFSStorageStateRecovery FAILED (timeout)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-2680) Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery

by JIRA jira@apache.org :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


     [ https://issues.apache.org/jira/browse/HADOOP-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated HADOOP-2680:
---------------------------------

    Attachment: build-369-TestEditLog.txt

build-369-TestEditLog.txt : contains the output of this failed test from the console.

Konstantin and I looked at this and it looks like disk sync was very slow during this test. We don't know if there is some other reason, but a very slow or busy partition could cause this.

Not sure if this is related to TestDFSStorageStateRecovery failures.

> Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery
> --------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2680
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2680
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Mukund Madhugiri
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.16.0
>
>         Attachments: build-369-TestEditLog.txt
>
>
> The unit test: TestDFSStorageStateRecovery has failed a couple of times with a timeout on the nightly build:
> Here are the logs from the failures:
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/373/console
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/369/console
> Here is a build where it passed:
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/365/console
> Here is the part where it failed:
> [junit] 2008-01-22 12:16:39,499 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(84)) - ============================================================
>     [junit] 2008-01-22 12:16:39,502 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(85)) - ***TEST 34*** NAME_NODE recovery: numDirs=1 testCase=17 current=false previous=true previous.tmp=false removed.tmp=true
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 4 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 3 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 7 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(353)) - Stopping IPC Server listener on 36207
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 2 on 36207: exiting
>     [junit] 2008-01-22 12:16:40,153 INFO  jvm.JvmMetrics (RpcMetrics.java:setTags(49)) - Initializing RPC Metrics with serverName=NameNode, port=0
>     [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=RpcStatistics
>     [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
>     [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
>     [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
>     [junit] at org.apache.hadoop.ipc.metrics.RpcMgt.<init>(RpcMgt.java:37)
>     [junit] at org.apache.hadoop.ipc.metrics.RpcMetrics.<init>(RpcMetrics.java:60)
>     [junit] at org.apache.hadoop.ipc.Server.<init>(Server.java:969)
>     [junit] at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:393)
>     [junit] at org.apache.hadoop.ipc.RPC.getServer(RPC.java:355)
>     [junit] 2008-01-22 12:16:40,156 INFO  dfs.NameNode (NameNode.java:initialize(125)) - Namenode up at: localhost/127.0.0.1:36768
>     [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:119)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
>     [junit] 2008-01-22 12:16:40,156 INFO  jvm.JvmMetrics (JvmMetrics.java:init(51)) - Cannot initialize JVM Metrics with processName=NameNode, sessionId=null - already initialized
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
>     [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
>     [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     [junit] at java.lang.reflect.Method.invoke(Method.java:585)
>     [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
>     [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
>     [junit] 2008-01-22 12:16:40,158 INFO  dfs.NameNodeMetrics (NameNodeMetrics.java:<init>(74)) - Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
>     [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
>     [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] at junit.framework.TestResult.run(TestResult.java:109)
>     [junit] at junit.framework.TestCase.run(TestCase.java:118)
>     [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
>     [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
>     [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=NameNodeStatistics
>     [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
>     [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
>     [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
>     [junit] at org.apache.hadoop.dfs.namenode.metrics.NameNodeMgt.<init>(NameNodeMgt.java:36)
>     [junit] at org.apache.hadoop.dfs.NameNodeMetrics.<init>(NameNodeMetrics.java:67)
>     [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:127)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
>     [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
>     [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     [junit] at java.lang.reflect.Method.invoke(Method.java:585)
>     [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
>     [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
>     [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
>     [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] at junit.framework.TestResult.run(TestResult.java:109)
>     [junit] at junit.framework.TestCase.run(TestCase.java:118)
>     [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
>     [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
>     [junit] 2008-01-22 12:16:40,193 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(321)) - fsOwner=hudson,hudson
>     [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(325)) - supergroup=supergroup
>     [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(326)) - isPermissionEnabled=true
>     [junit] 2008-01-22 12:16:40,198 INFO  dfs.Storage (Storage.java:doRecover(369)) - Recovering storage directory /export/home/hudson/hudson/jobs/Hadoop-Nightly/workspace/trunk/build/test/data/name1 from previous rollback.
>     [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
>     [junit] Test org.apache.hadoop.dfs.TestDFSStorageStateRecovery FAILED (timeout)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-2680) Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery

by JIRA jira@apache.org :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


    [ https://issues.apache.org/jira/browse/HADOOP-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12561552#action_12561552 ]

Raghu Angadi commented on HADOOP-2680:
--------------------------------------

More regd TestEditLog : even when it passes, it came dangerously close to the limit, 8 minutes :

http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/373/testReport/org.apache.hadoop.dfs/TestEditLog/

For Hudson, we might have to reduce the number of transaction in this test.


> Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery
> --------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2680
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2680
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Mukund Madhugiri
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.16.0
>
>         Attachments: build-369-TestEditLog.txt
>
>
> The unit test: TestDFSStorageStateRecovery has failed a couple of times with a timeout on the nightly build:
> Here are the logs from the failures:
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/373/console
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/369/console
> Here is a build where it passed:
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/365/console
> Here is the part where it failed:
> [junit] 2008-01-22 12:16:39,499 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(84)) - ============================================================
>     [junit] 2008-01-22 12:16:39,502 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(85)) - ***TEST 34*** NAME_NODE recovery: numDirs=1 testCase=17 current=false previous=true previous.tmp=false removed.tmp=true
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 4 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 3 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 7 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(353)) - Stopping IPC Server listener on 36207
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 2 on 36207: exiting
>     [junit] 2008-01-22 12:16:40,153 INFO  jvm.JvmMetrics (RpcMetrics.java:setTags(49)) - Initializing RPC Metrics with serverName=NameNode, port=0
>     [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=RpcStatistics
>     [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
>     [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
>     [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
>     [junit] at org.apache.hadoop.ipc.metrics.RpcMgt.<init>(RpcMgt.java:37)
>     [junit] at org.apache.hadoop.ipc.metrics.RpcMetrics.<init>(RpcMetrics.java:60)
>     [junit] at org.apache.hadoop.ipc.Server.<init>(Server.java:969)
>     [junit] at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:393)
>     [junit] at org.apache.hadoop.ipc.RPC.getServer(RPC.java:355)
>     [junit] 2008-01-22 12:16:40,156 INFO  dfs.NameNode (NameNode.java:initialize(125)) - Namenode up at: localhost/127.0.0.1:36768
>     [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:119)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
>     [junit] 2008-01-22 12:16:40,156 INFO  jvm.JvmMetrics (JvmMetrics.java:init(51)) - Cannot initialize JVM Metrics with processName=NameNode, sessionId=null - already initialized
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
>     [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
>     [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     [junit] at java.lang.reflect.Method.invoke(Method.java:585)
>     [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
>     [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
>     [junit] 2008-01-22 12:16:40,158 INFO  dfs.NameNodeMetrics (NameNodeMetrics.java:<init>(74)) - Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
>     [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
>     [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] at junit.framework.TestResult.run(TestResult.java:109)
>     [junit] at junit.framework.TestCase.run(TestCase.java:118)
>     [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
>     [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
>     [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=NameNodeStatistics
>     [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
>     [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
>     [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
>     [junit] at org.apache.hadoop.dfs.namenode.metrics.NameNodeMgt.<init>(NameNodeMgt.java:36)
>     [junit] at org.apache.hadoop.dfs.NameNodeMetrics.<init>(NameNodeMetrics.java:67)
>     [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:127)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
>     [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
>     [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     [junit] at java.lang.reflect.Method.invoke(Method.java:585)
>     [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
>     [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
>     [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
>     [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] at junit.framework.TestResult.run(TestResult.java:109)
>     [junit] at junit.framework.TestCase.run(TestCase.java:118)
>     [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
>     [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
>     [junit] 2008-01-22 12:16:40,193 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(321)) - fsOwner=hudson,hudson
>     [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(325)) - supergroup=supergroup
>     [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(326)) - isPermissionEnabled=true
>     [junit] 2008-01-22 12:16:40,198 INFO  dfs.Storage (Storage.java:doRecover(369)) - Recovering storage directory /export/home/hudson/hudson/jobs/Hadoop-Nightly/workspace/trunk/build/test/data/name1 from previous rollback.
>     [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
>     [junit] Test org.apache.hadoop.dfs.TestDFSStorageStateRecovery FAILED (timeout)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-2680) Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery

by JIRA jira@apache.org :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


    [ https://issues.apache.org/jira/browse/HADOOP-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12561553#action_12561553 ]

Raghu Angadi commented on HADOOP-2680:
--------------------------------------

> For Hudson, we might have to reduce the number of transaction in this test.

Actually this was not common till build #366 or so.. the test used to take 1-2 min in earlier builds. may be the Hudson got busier?

> Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery
> --------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2680
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2680
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Mukund Madhugiri
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.16.0
>
>         Attachments: build-369-TestEditLog.txt
>
>
> The unit test: TestDFSStorageStateRecovery has failed a couple of times with a timeout on the nightly build:
> Here are the logs from the failures:
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/373/console
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/369/console
> Here is a build where it passed:
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/365/console
> Here is the part where it failed:
> [junit] 2008-01-22 12:16:39,499 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(84)) - ============================================================
>     [junit] 2008-01-22 12:16:39,502 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(85)) - ***TEST 34*** NAME_NODE recovery: numDirs=1 testCase=17 current=false previous=true previous.tmp=false removed.tmp=true
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 4 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 3 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 7 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(353)) - Stopping IPC Server listener on 36207
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 2 on 36207: exiting
>     [junit] 2008-01-22 12:16:40,153 INFO  jvm.JvmMetrics (RpcMetrics.java:setTags(49)) - Initializing RPC Metrics with serverName=NameNode, port=0
>     [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=RpcStatistics
>     [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
>     [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
>     [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
>     [junit] at org.apache.hadoop.ipc.metrics.RpcMgt.<init>(RpcMgt.java:37)
>     [junit] at org.apache.hadoop.ipc.metrics.RpcMetrics.<init>(RpcMetrics.java:60)
>     [junit] at org.apache.hadoop.ipc.Server.<init>(Server.java:969)
>     [junit] at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:393)
>     [junit] at org.apache.hadoop.ipc.RPC.getServer(RPC.java:355)
>     [junit] 2008-01-22 12:16:40,156 INFO  dfs.NameNode (NameNode.java:initialize(125)) - Namenode up at: localhost/127.0.0.1:36768
>     [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:119)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
>     [junit] 2008-01-22 12:16:40,156 INFO  jvm.JvmMetrics (JvmMetrics.java:init(51)) - Cannot initialize JVM Metrics with processName=NameNode, sessionId=null - already initialized
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
>     [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
>     [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     [junit] at java.lang.reflect.Method.invoke(Method.java:585)
>     [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
>     [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
>     [junit] 2008-01-22 12:16:40,158 INFO  dfs.NameNodeMetrics (NameNodeMetrics.java:<init>(74)) - Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
>     [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
>     [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] at junit.framework.TestResult.run(TestResult.java:109)
>     [junit] at junit.framework.TestCase.run(TestCase.java:118)
>     [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
>     [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
>     [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=NameNodeStatistics
>     [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
>     [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
>     [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
>     [junit] at org.apache.hadoop.dfs.namenode.metrics.NameNodeMgt.<init>(NameNodeMgt.java:36)
>     [junit] at org.apache.hadoop.dfs.NameNodeMetrics.<init>(NameNodeMetrics.java:67)
>     [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:127)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
>     [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
>     [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     [junit] at java.lang.reflect.Method.invoke(Method.java:585)
>     [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
>     [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
>     [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
>     [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] at junit.framework.TestResult.run(TestResult.java:109)
>     [junit] at junit.framework.TestCase.run(TestCase.java:118)
>     [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
>     [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
>     [junit] 2008-01-22 12:16:40,193 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(321)) - fsOwner=hudson,hudson
>     [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(325)) - supergroup=supergroup
>     [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(326)) - isPermissionEnabled=true
>     [junit] 2008-01-22 12:16:40,198 INFO  dfs.Storage (Storage.java:doRecover(369)) - Recovering storage directory /export/home/hudson/hudson/jobs/Hadoop-Nightly/workspace/trunk/build/test/data/name1 from previous rollback.
>     [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
>     [junit] Test org.apache.hadoop.dfs.TestDFSStorageStateRecovery FAILED (timeout)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-2680) Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery

by JIRA jira@apache.org :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


    [ https://issues.apache.org/jira/browse/HADOOP-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12561554#action_12561554 ]

Konstantin Shvachko commented on HADOOP-2680:
---------------------------------------------

I think these problems are related to the slowness of the mounted disk drive.
In case of TestEditLog name-node is supposed to make 100,000 transactions and synchronize them to disk.
And this takes more than 10 minutes with less than a half of transaction completed.
Here is the relevant part of the log:
{code}
    [junit] 2008-01-18 13:38:04,406 INFO  fs.FSNamesystem (FSEditLog.java:printStatistics(772)) - Number of transactions: 2727 Total time for transactions(ms): 180 Number of syncs: 54 SyncTimes(ms): 33105 28125
    [junit] 2008-01-18 13:39:05,135 INFO  fs.FSNamesystem (FSEditLog.java:printStatistics(772)) - Number of transactions: 5728 Total time for transactions(ms): 231 Number of syncs: 113 SyncTimes(ms): 63838 58107
    [junit] 2008-01-18 13:40:05,714 INFO  fs.FSNamesystem (FSEditLog.java:printStatistics(772)) - Number of transactions: 8874 Total time for transactions(ms): 294 Number of syncs: 177 SyncTimes(ms): 95192 87320
    [junit] 2008-01-18 13:41:05,910 INFO  fs.FSNamesystem (FSEditLog.java:printStatistics(772)) - Number of transactions: 12400 Total time for transactions(ms): 378 Number of syncs: 247 SyncTimes(ms): 125102 117586
    [junit] 2008-01-18 13:42:06,136 INFO  fs.FSNamesystem (FSEditLog.java:printStatistics(772)) - Number of transactions: 15279 Total time for transactions(ms): 432 Number of syncs: 303 SyncTimes(ms): 156284 146617
    [junit] 2008-01-18 13:43:06,820 INFO  fs.FSNamesystem (FSEditLog.java:printStatistics(772)) - Number of transactions: 18602 Total time for transactions(ms): 467 Number of syncs: 369 SyncTimes(ms): 187323 176248
    [junit] 2008-01-18 13:44:07,818 INFO  fs.FSNamesystem (FSEditLog.java:printStatistics(772)) - Number of transactions: 21877 Total time for transactions(ms): 514 Number of syncs: 434 SyncTimes(ms): 218905 205651
    [junit] 2008-01-18 13:45:08,400 INFO  fs.FSNamesystem (FSEditLog.java:printStatistics(772)) - Number of transactions: 25358 Total time for transactions(ms): 562 Number of syncs: 503 SyncTimes(ms): 250827 234296
    [junit] 2008-01-18 13:46:08,565 INFO  fs.FSNamesystem (FSEditLog.java:printStatistics(772)) - Number of transactions: 28183 Total time for transactions(ms): 611 Number of syncs: 559 SyncTimes(ms): 281769 263511
    [junit] 2008-01-18 13:47:09,728 INFO  fs.FSNamesystem (FSEditLog.java:printStatistics(772)) - Number of transactions: 31349 Total time for transactions(ms): 639 Number of syncs: 622 SyncTimes(ms): 311388 295040
    [junit] 2008-01-18 13:48:10,725 INFO  fs.FSNamesystem (FSEditLog.java:printStatistics(772)) - Number of transactions: 34534 Total time for transactions(ms): 668 Number of syncs: 686 SyncTimes(ms): 342038 325370
    [junit] 2008-01-18 13:49:11,673 INFO  fs.FSNamesystem (FSEditLog.java:printStatistics(772)) - Number of transactions: 37197 Total time for transactions(ms): 695 Number of syncs: 738 SyncTimes(ms): 372024 356322
    [junit] 2008-01-18 13:50:12,050 INFO  fs.FSNamesystem (FSEditLog.java:printStatistics(772)) - Number of transactions: 40113 Total time for transactions(ms): 723 Number of syncs: 796 SyncTimes(ms): 403170 385541
    [junit] 2008-01-18 13:51:12,981 INFO  fs.FSNamesystem (FSEditLog.java:printStatistics(772)) - Number of transactions: 42734 Total time for transactions(ms): 745 Number of syncs: 848 SyncTimes(ms): 432674 416957
    [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
    [junit] Test org.apache.hadoop.dfs.TestEditLog FAILED (timeout)
{code}

In case of TestDFSStorageStateRecovery the name-node tries to recover from the aborted rollback, which means
it needs to rename some directories. And this also takes more than a minute and causes the test to timeout.
{code}
    [junit] 2008-01-22 12:16:40,198 INFO  dfs.Storage (Storage.java:doRecover(369)) - Recovering storage directory /export/home/hudson/hudson/jobs/Hadoop-Nightly/workspace/trunk/build/test/data/name1 from previous rollback.
    [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
    [junit] Test org.apache.hadoop.dfs.TestDFSStorageStateRecovery FAILED (timeout)
{code}

Could anyone please check the partition Hundson uses for storage directories.

> Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery
> --------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2680
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2680
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Mukund Madhugiri
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.16.0
>
>         Attachments: build-369-TestEditLog.txt
>
>
> The unit test: TestDFSStorageStateRecovery has failed a couple of times with a timeout on the nightly build:
> Here are the logs from the failures:
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/373/console
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/369/console
> Here is a build where it passed:
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/365/console
> Here is the part where it failed:
> [junit] 2008-01-22 12:16:39,499 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(84)) - ============================================================
>     [junit] 2008-01-22 12:16:39,502 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(85)) - ***TEST 34*** NAME_NODE recovery: numDirs=1 testCase=17 current=false previous=true previous.tmp=false removed.tmp=true
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 4 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 3 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 7 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(353)) - Stopping IPC Server listener on 36207
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 2 on 36207: exiting
>     [junit] 2008-01-22 12:16:40,153 INFO  jvm.JvmMetrics (RpcMetrics.java:setTags(49)) - Initializing RPC Metrics with serverName=NameNode, port=0
>     [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=RpcStatistics
>     [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
>     [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
>     [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
>     [junit] at org.apache.hadoop.ipc.metrics.RpcMgt.<init>(RpcMgt.java:37)
>     [junit] at org.apache.hadoop.ipc.metrics.RpcMetrics.<init>(RpcMetrics.java:60)
>     [junit] at org.apache.hadoop.ipc.Server.<init>(Server.java:969)
>     [junit] at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:393)
>     [junit] at org.apache.hadoop.ipc.RPC.getServer(RPC.java:355)
>     [junit] 2008-01-22 12:16:40,156 INFO  dfs.NameNode (NameNode.java:initialize(125)) - Namenode up at: localhost/127.0.0.1:36768
>     [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:119)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
>     [junit] 2008-01-22 12:16:40,156 INFO  jvm.JvmMetrics (JvmMetrics.java:init(51)) - Cannot initialize JVM Metrics with processName=NameNode, sessionId=null - already initialized
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
>     [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
>     [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     [junit] at java.lang.reflect.Method.invoke(Method.java:585)
>     [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
>     [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
>     [junit] 2008-01-22 12:16:40,158 INFO  dfs.NameNodeMetrics (NameNodeMetrics.java:<init>(74)) - Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
>     [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
>     [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] at junit.framework.TestResult.run(TestResult.java:109)
>     [junit] at junit.framework.TestCase.run(TestCase.java:118)
>     [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
>     [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
>     [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=NameNodeStatistics
>     [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
>     [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
>     [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
>     [junit] at org.apache.hadoop.dfs.namenode.metrics.NameNodeMgt.<init>(NameNodeMgt.java:36)
>     [junit] at org.apache.hadoop.dfs.NameNodeMetrics.<init>(NameNodeMetrics.java:67)
>     [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:127)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
>     [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
>     [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     [junit] at java.lang.reflect.Method.invoke(Method.java:585)
>     [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
>     [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
>     [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
>     [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] at junit.framework.TestResult.run(TestResult.java:109)
>     [junit] at junit.framework.TestCase.run(TestCase.java:118)
>     [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
>     [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
>     [junit] 2008-01-22 12:16:40,193 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(321)) - fsOwner=hudson,hudson
>     [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(325)) - supergroup=supergroup
>     [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(326)) - isPermissionEnabled=true
>     [junit] 2008-01-22 12:16:40,198 INFO  dfs.Storage (Storage.java:doRecover(369)) - Recovering storage directory /export/home/hudson/hudson/jobs/Hadoop-Nightly/workspace/trunk/build/test/data/name1 from previous rollback.
>     [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
>     [junit] Test org.apache.hadoop.dfs.TestDFSStorageStateRecovery FAILED (timeout)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-2680) Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery

by JIRA jira@apache.org :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


    [ https://issues.apache.org/jira/browse/HADOOP-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12561558#action_12561558 ]

Tsz Wo (Nicholas), SZE commented on HADOOP-2680:
------------------------------------------------

Could we simply decrease the value of TestEditLog.numberTransactions to 500, or even 200?  I did that in my local copy for saving time.

> Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery
> --------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2680
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2680
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Mukund Madhugiri
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.16.0
>
>         Attachments: build-369-TestEditLog.txt
>
>
> The unit test: TestDFSStorageStateRecovery has failed a couple of times with a timeout on the nightly build:
> Here are the logs from the failures:
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/373/console
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/369/console
> Here is a build where it passed:
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/365/console
> Here is the part where it failed:
> [junit] 2008-01-22 12:16:39,499 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(84)) - ============================================================
>     [junit] 2008-01-22 12:16:39,502 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(85)) - ***TEST 34*** NAME_NODE recovery: numDirs=1 testCase=17 current=false previous=true previous.tmp=false removed.tmp=true
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 4 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 3 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 7 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(353)) - Stopping IPC Server listener on 36207
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 2 on 36207: exiting
>     [junit] 2008-01-22 12:16:40,153 INFO  jvm.JvmMetrics (RpcMetrics.java:setTags(49)) - Initializing RPC Metrics with serverName=NameNode, port=0
>     [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=RpcStatistics
>     [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
>     [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
>     [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
>     [junit] at org.apache.hadoop.ipc.metrics.RpcMgt.<init>(RpcMgt.java:37)
>     [junit] at org.apache.hadoop.ipc.metrics.RpcMetrics.<init>(RpcMetrics.java:60)
>     [junit] at org.apache.hadoop.ipc.Server.<init>(Server.java:969)
>     [junit] at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:393)
>     [junit] at org.apache.hadoop.ipc.RPC.getServer(RPC.java:355)
>     [junit] 2008-01-22 12:16:40,156 INFO  dfs.NameNode (NameNode.java:initialize(125)) - Namenode up at: localhost/127.0.0.1:36768
>     [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:119)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
>     [junit] 2008-01-22 12:16:40,156 INFO  jvm.JvmMetrics (JvmMetrics.java:init(51)) - Cannot initialize JVM Metrics with processName=NameNode, sessionId=null - already initialized
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
>     [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
>     [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     [junit] at java.lang.reflect.Method.invoke(Method.java:585)
>     [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
>     [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
>     [junit] 2008-01-22 12:16:40,158 INFO  dfs.NameNodeMetrics (NameNodeMetrics.java:<init>(74)) - Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
>     [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
>     [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] at junit.framework.TestResult.run(TestResult.java:109)
>     [junit] at junit.framework.TestCase.run(TestCase.java:118)
>     [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
>     [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
>     [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=NameNodeStatistics
>     [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
>     [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
>     [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
>     [junit] at org.apache.hadoop.dfs.namenode.metrics.NameNodeMgt.<init>(NameNodeMgt.java:36)
>     [junit] at org.apache.hadoop.dfs.NameNodeMetrics.<init>(NameNodeMetrics.java:67)
>     [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:127)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
>     [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
>     [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     [junit] at java.lang.reflect.Method.invoke(Method.java:585)
>     [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
>     [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
>     [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
>     [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] at junit.framework.TestResult.run(TestResult.java:109)
>     [junit] at junit.framework.TestCase.run(TestCase.java:118)
>     [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
>     [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
>     [junit] 2008-01-22 12:16:40,193 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(321)) - fsOwner=hudson,hudson
>     [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(325)) - supergroup=supergroup
>     [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(326)) - isPermissionEnabled=true
>     [junit] 2008-01-22 12:16:40,198 INFO  dfs.Storage (Storage.java:doRecover(369)) - Recovering storage directory /export/home/hudson/hudson/jobs/Hadoop-Nightly/workspace/trunk/build/test/data/name1 from previous rollback.
>     [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
>     [junit] Test org.apache.hadoop.dfs.TestDFSStorageStateRecovery FAILED (timeout)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-2680) Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery

by JIRA jira@apache.org :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


    [ https://issues.apache.org/jira/browse/HADOOP-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12561563#action_12561563 ]

Raghu Angadi commented on HADOOP-2680:
--------------------------------------

> Could we simply decrease the value of TestEditLog.numberTransactions to 500, or even 200? I did that in my local copy for saving time.

Before we do this probably its better to know what changed on Hudson. This slowness would affect pretty much everything that Hudson does...

> Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery
> --------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2680
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2680
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Mukund Madhugiri
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.16.0
>
>         Attachments: build-369-TestEditLog.txt
>
>
> The unit test: TestDFSStorageStateRecovery has failed a couple of times with a timeout on the nightly build:
> Here are the logs from the failures:
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/373/console
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/369/console
> Here is a build where it passed:
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/365/console
> Here is the part where it failed:
> [junit] 2008-01-22 12:16:39,499 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(84)) - ============================================================
>     [junit] 2008-01-22 12:16:39,502 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(85)) - ***TEST 34*** NAME_NODE recovery: numDirs=1 testCase=17 current=false previous=true previous.tmp=false removed.tmp=true
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 4 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 3 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 7 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(353)) - Stopping IPC Server listener on 36207
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 2 on 36207: exiting
>     [junit] 2008-01-22 12:16:40,153 INFO  jvm.JvmMetrics (RpcMetrics.java:setTags(49)) - Initializing RPC Metrics with serverName=NameNode, port=0
>     [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=RpcStatistics
>     [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
>     [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
>     [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
>     [junit] at org.apache.hadoop.ipc.metrics.RpcMgt.<init>(RpcMgt.java:37)
>     [junit] at org.apache.hadoop.ipc.metrics.RpcMetrics.<init>(RpcMetrics.java:60)
>     [junit] at org.apache.hadoop.ipc.Server.<init>(Server.java:969)
>     [junit] at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:393)
>     [junit] at org.apache.hadoop.ipc.RPC.getServer(RPC.java:355)
>     [junit] 2008-01-22 12:16:40,156 INFO  dfs.NameNode (NameNode.java:initialize(125)) - Namenode up at: localhost/127.0.0.1:36768
>     [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:119)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
>     [junit] 2008-01-22 12:16:40,156 INFO  jvm.JvmMetrics (JvmMetrics.java:init(51)) - Cannot initialize JVM Metrics with processName=NameNode, sessionId=null - already initialized
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
>     [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
>     [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     [junit] at java.lang.reflect.Method.invoke(Method.java:585)
>     [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
>     [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
>     [junit] 2008-01-22 12:16:40,158 INFO  dfs.NameNodeMetrics (NameNodeMetrics.java:<init>(74)) - Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
>     [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
>     [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] at junit.framework.TestResult.run(TestResult.java:109)
>     [junit] at junit.framework.TestCase.run(TestCase.java:118)
>     [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
>     [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
>     [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=NameNodeStatistics
>     [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
>     [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
>     [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
>     [junit] at org.apache.hadoop.dfs.namenode.metrics.NameNodeMgt.<init>(NameNodeMgt.java:36)
>     [junit] at org.apache.hadoop.dfs.NameNodeMetrics.<init>(NameNodeMetrics.java:67)
>     [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:127)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
>     [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
>     [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     [junit] at java.lang.reflect.Method.invoke(Method.java:585)
>     [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
>     [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
>     [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
>     [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] at junit.framework.TestResult.run(TestResult.java:109)
>     [junit] at junit.framework.TestCase.run(TestCase.java:118)
>     [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
>     [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
>     [junit] 2008-01-22 12:16:40,193 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(321)) - fsOwner=hudson,hudson
>     [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(325)) - supergroup=supergroup
>     [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(326)) - isPermissionEnabled=true
>     [junit] 2008-01-22 12:16:40,198 INFO  dfs.Storage (Storage.java:doRecover(369)) - Recovering storage directory /export/home/hudson/hudson/jobs/Hadoop-Nightly/workspace/trunk/build/test/data/name1 from previous rollback.
>     [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
>     [junit] Test org.apache.hadoop.dfs.TestDFSStorageStateRecovery FAILED (timeout)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-2680) Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery

by JIRA jira@apache.org :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


    [ https://issues.apache.org/jira/browse/HADOOP-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12561577#action_12561577 ]

Nigel Daley commented on HADOOP-2680:
-------------------------------------

Perhaps this is causing the slowness:

https://issues.apache.org/jira/browse/INFRA-1460?
focusedCommentId=12557402#action_12557402






> Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery
> --------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2680
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2680
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Mukund Madhugiri
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.16.0
>
>         Attachments: build-369-TestEditLog.txt
>
>
> The unit test: TestDFSStorageStateRecovery has failed a couple of times with a timeout on the nightly build:
> Here are the logs from the failures:
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/373/console
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/369/console
> Here is a build where it passed:
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/365/console
> Here is the part where it failed:
> [junit] 2008-01-22 12:16:39,499 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(84)) - ============================================================
>     [junit] 2008-01-22 12:16:39,502 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(85)) - ***TEST 34*** NAME_NODE recovery: numDirs=1 testCase=17 current=false previous=true previous.tmp=false removed.tmp=true
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 4 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 3 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 7 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(353)) - Stopping IPC Server listener on 36207
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 2 on 36207: exiting
>     [junit] 2008-01-22 12:16:40,153 INFO  jvm.JvmMetrics (RpcMetrics.java:setTags(49)) - Initializing RPC Metrics with serverName=NameNode, port=0
>     [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=RpcStatistics
>     [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
>     [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
>     [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
>     [junit] at org.apache.hadoop.ipc.metrics.RpcMgt.<init>(RpcMgt.java:37)
>     [junit] at org.apache.hadoop.ipc.metrics.RpcMetrics.<init>(RpcMetrics.java:60)
>     [junit] at org.apache.hadoop.ipc.Server.<init>(Server.java:969)
>     [junit] at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:393)
>     [junit] at org.apache.hadoop.ipc.RPC.getServer(RPC.java:355)
>     [junit] 2008-01-22 12:16:40,156 INFO  dfs.NameNode (NameNode.java:initialize(125)) - Namenode up at: localhost/127.0.0.1:36768
>     [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:119)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
>     [junit] 2008-01-22 12:16:40,156 INFO  jvm.JvmMetrics (JvmMetrics.java:init(51)) - Cannot initialize JVM Metrics with processName=NameNode, sessionId=null - already initialized
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
>     [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
>     [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     [junit] at java.lang.reflect.Method.invoke(Method.java:585)
>     [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
>     [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
>     [junit] 2008-01-22 12:16:40,158 INFO  dfs.NameNodeMetrics (NameNodeMetrics.java:<init>(74)) - Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
>     [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
>     [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] at junit.framework.TestResult.run(TestResult.java:109)
>     [junit] at junit.framework.TestCase.run(TestCase.java:118)
>     [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
>     [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
>     [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=NameNodeStatistics
>     [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
>     [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
>     [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
>     [junit] at org.apache.hadoop.dfs.namenode.metrics.NameNodeMgt.<init>(NameNodeMgt.java:36)
>     [junit] at org.apache.hadoop.dfs.NameNodeMetrics.<init>(NameNodeMetrics.java:67)
>     [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:127)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
>     [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
>     [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     [junit] at java.lang.reflect.Method.invoke(Method.java:585)
>     [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
>     [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
>     [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
>     [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] at junit.framework.TestResult.run(TestResult.java:109)
>     [junit] at junit.framework.TestCase.run(TestCase.java:118)
>     [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
>     [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
>     [junit] 2008-01-22 12:16:40,193 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(321)) - fsOwner=hudson,hudson
>     [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(325)) - supergroup=supergroup
>     [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(326)) - isPermissionEnabled=true
>     [junit] 2008-01-22 12:16:40,198 INFO  dfs.Storage (Storage.java:doRecover(369)) - Recovering storage directory /export/home/hudson/hudson/jobs/Hadoop-Nightly/workspace/trunk/build/test/data/name1 from previous rollback.
>     [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
>     [junit] Test org.apache.hadoop.dfs.TestDFSStorageStateRecovery FAILED (timeout)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-2680) Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery

by JIRA jira@apache.org :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


    [ https://issues.apache.org/jira/browse/HADOOP-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12561600#action_12561600 ]

dhruba borthakur commented on HADOOP-2680:
------------------------------------------

I see that the hudson tests run on a machine that shows that lots of swap space is being used:

Memory: 16G real, 5912M free, 13G swap in use, 30G swap free

Can somebody please check if these machines are really swapping user processes (this could cause test runs to slow down drastically)?

> Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery
> --------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2680
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2680
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Mukund Madhugiri
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.16.0
>
>         Attachments: build-369-TestEditLog.txt
>
>
> The unit test: TestDFSStorageStateRecovery has failed a couple of times with a timeout on the nightly build:
> Here are the logs from the failures:
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/373/console
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/369/console
> Here is a build where it passed:
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/365/console
> Here is the part where it failed:
> [junit] 2008-01-22 12:16:39,499 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(84)) - ============================================================
>     [junit] 2008-01-22 12:16:39,502 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(85)) - ***TEST 34*** NAME_NODE recovery: numDirs=1 testCase=17 current=false previous=true previous.tmp=false removed.tmp=true
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 4 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 3 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 7 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(353)) - Stopping IPC Server listener on 36207
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 2 on 36207: exiting
>     [junit] 2008-01-22 12:16:40,153 INFO  jvm.JvmMetrics (RpcMetrics.java:setTags(49)) - Initializing RPC Metrics with serverName=NameNode, port=0
>     [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=RpcStatistics
>     [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
>     [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
>     [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
>     [junit] at org.apache.hadoop.ipc.metrics.RpcMgt.<init>(RpcMgt.java:37)
>     [junit] at org.apache.hadoop.ipc.metrics.RpcMetrics.<init>(RpcMetrics.java:60)
>     [junit] at org.apache.hadoop.ipc.Server.<init>(Server.java:969)
>     [junit] at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:393)
>     [junit] at org.apache.hadoop.ipc.RPC.getServer(RPC.java:355)
>     [junit] 2008-01-22 12:16:40,156 INFO  dfs.NameNode (NameNode.java:initialize(125)) - Namenode up at: localhost/127.0.0.1:36768
>     [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:119)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
>     [junit] 2008-01-22 12:16:40,156 INFO  jvm.JvmMetrics (JvmMetrics.java:init(51)) - Cannot initialize JVM Metrics with processName=NameNode, sessionId=null - already initialized
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
>     [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
>     [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     [junit] at java.lang.reflect.Method.invoke(Method.java:585)
>     [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
>     [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
>     [junit] 2008-01-22 12:16:40,158 INFO  dfs.NameNodeMetrics (NameNodeMetrics.java:<init>(74)) - Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
>     [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
>     [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] at junit.framework.TestResult.run(TestResult.java:109)
>     [junit] at junit.framework.TestCase.run(TestCase.java:118)
>     [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
>     [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
>     [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=NameNodeStatistics
>     [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
>     [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
>     [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
>     [junit] at org.apache.hadoop.dfs.namenode.metrics.NameNodeMgt.<init>(NameNodeMgt.java:36)
>     [junit] at org.apache.hadoop.dfs.NameNodeMetrics.<init>(NameNodeMetrics.java:67)
>     [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:127)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
>     [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
>     [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     [junit] at java.lang.reflect.Method.invoke(Method.java:585)
>     [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
>     [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
>     [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
>     [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] at junit.framework.TestResult.run(TestResult.java:109)
>     [junit] at junit.framework.TestCase.run(TestCase.java:118)
>     [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
>     [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
>     [junit] 2008-01-22 12:16:40,193 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(321)) - fsOwner=hudson,hudson
>     [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(325)) - supergroup=supergroup
>     [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(326)) - isPermissionEnabled=true
>     [junit] 2008-01-22 12:16:40,198 INFO  dfs.Storage (Storage.java:doRecover(369)) - Recovering storage directory /export/home/hudson/hudson/jobs/Hadoop-Nightly/workspace/trunk/build/test/data/name1 from previous rollback.
>     [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
>     [junit] Test org.apache.hadoop.dfs.TestDFSStorageStateRecovery FAILED (timeout)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-2680) Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery

by JIRA jira@apache.org :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


    [ https://issues.apache.org/jira/browse/HADOOP-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12561602#action_12561602 ]

dhruba borthakur commented on HADOOP-2680:
------------------------------------------

Also, a "iostat -xc 5" on the test machine yields:

sd19     92.8   99.4  237.9  430.5  0.0 14.7   76.6   0 100

which means that device sd19 has a service wait time of 76 ms and 100% busy (this is very very bad).

> Unit test fails with a timeout on nightly build: org.apache.hadoop.dfs.TestDFSStorageStateRecovery
> --------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2680
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2680
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Mukund Madhugiri
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>             Fix For: 0.16.0
>
>         Attachments: build-369-TestEditLog.txt
>
>
> The unit test: TestDFSStorageStateRecovery has failed a couple of times with a timeout on the nightly build:
> Here are the logs from the failures:
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/373/console
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/369/console
> Here is a build where it passed:
> http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/365/console
> Here is the part where it failed:
> [junit] 2008-01-22 12:16:39,499 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(84)) - ============================================================
>     [junit] 2008-01-22 12:16:39,502 INFO  dfs.TestDFSStorageStateRecovery (TestDFSStorageStateRecovery.java:log(85)) - ***TEST 34*** NAME_NODE recovery: numDirs=1 testCase=17 current=false previous=true previous.tmp=false removed.tmp=true
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 4 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 3 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 7 on 36207: exiting
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(353)) - Stopping IPC Server listener on 36207
>     [junit] 2008-01-22 12:16:39,499 INFO  ipc.Server (Server.java:run(939)) - IPC Server handler 2 on 36207: exiting
>     [junit] 2008-01-22 12:16:40,153 INFO  jvm.JvmMetrics (RpcMetrics.java:setTags(49)) - Initializing RPC Metrics with serverName=NameNode, port=0
>     [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=RpcStatistics
>     [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
>     [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
>     [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
>     [junit] at org.apache.hadoop.ipc.metrics.RpcMgt.<init>(RpcMgt.java:37)
>     [junit] at org.apache.hadoop.ipc.metrics.RpcMetrics.<init>(RpcMetrics.java:60)
>     [junit] at org.apache.hadoop.ipc.Server.<init>(Server.java:969)
>     [junit] at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:393)
>     [junit] at org.apache.hadoop.ipc.RPC.getServer(RPC.java:355)
>     [junit] 2008-01-22 12:16:40,156 INFO  dfs.NameNode (NameNode.java:initialize(125)) - Namenode up at: localhost/127.0.0.1:36768
>     [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:119)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
>     [junit] 2008-01-22 12:16:40,156 INFO  jvm.JvmMetrics (JvmMetrics.java:init(51)) - Cannot initialize JVM Metrics with processName=NameNode, sessionId=null - already initialized
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
>     [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
>     [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     [junit] at java.lang.reflect.Method.invoke(Method.java:585)
>     [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
>     [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
>     [junit] 2008-01-22 12:16:40,158 INFO  dfs.NameNodeMetrics (NameNodeMetrics.java:<init>(74)) - Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
>     [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
>     [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] at junit.framework.TestResult.run(TestResult.java:109)
>     [junit] at junit.framework.TestCase.run(TestCase.java:118)
>     [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
>     [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
>     [junit] javax.management.InstanceAlreadyExistsException: hadoop.dfs:service=NameNode,name=NameNodeStatistics
>     [junit] at com.sun.jmx.mbeanserver.RepositorySupport.addMBean(RepositorySupport.java:452)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1410)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:936)
>     [junit] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:337)
>     [junit] at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:497)
>     [junit] at org.apache.hadoop.metrics.util.MBeanUtil.registerMBean(MBeanUtil.java:50)
>     [junit] at org.apache.hadoop.dfs.namenode.metrics.NameNodeMgt.<init>(NameNodeMgt.java:36)
>     [junit] at org.apache.hadoop.dfs.NameNodeMetrics.<init>(NameNodeMetrics.java:67)
>     [junit] at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:127)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:174)
>     [junit] at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:160)
>     [junit] at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:849)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:195)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:134)
>     [junit] at org.apache.hadoop.dfs.MiniDFSCluster.<init>(MiniDFSCluster.java:86)
>     [junit] at org.apache.hadoop.dfs.TestDFSStorageStateRecovery.testStorageStates(TestDFSStorageStateRecovery.java:193)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     [junit] at java.lang.reflect.Method.invoke(Method.java:585)
>     [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
>     [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
>     [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
>     [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] at junit.framework.TestResult.run(TestResult.java:109)
>     [junit] at junit.framework.TestCase.run(TestCase.java:118)
>     [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
>     [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
>     [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
>     [junit] 2008-01-22 12:16:40,193 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(321)) - fsOwner=hudson,hudson
>     [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(325)) - supergroup=supergroup
>     [junit] 2008-01-22 12:16:40,195 INFO  fs.FSNamesystem (FSNamesystem.java:setConfigurationParameters(326)) - isPermissionEnabled=true
>     [junit] 2008-01-22 12:16:40,198 INFO  dfs.Storage (Storage.java:doRecover(369)) - Recovering storage directory /export/home/hudson/hudson/jobs/Hadoop-Nightly/workspace/trunk/build/test/data/name1 from previous rollback.
>     [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
>     [junit] Test org.apache.hadoop.dfs.TestDFSStorageStateRecovery FAILED (timeout)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.