[jira] Created: (HADOOP-2232) Add option to disable nagles algorithm in the IPC Server

Add option to disable nagles algorithm in the IPC Server
--------------------------------------------------------

Key: HADOOP-2232
URL: https://issues.apache.org/jira/browse/HADOOP-2232
Project: Hadoop
Issue Type: Improvement
Components: ipc
Affects Versions: 0.16.0
Reporter: Clint Morgan

While investigating hbase performance, I found a bottleneck caused by Nagle's algorithm. For some reads I would get a bimodal distribution of read times, with about half the times being around 20 ms and half around 200 ms. I tracked this down to the well-known interaction between Nagle's algorithm and TCP delayed acknowledgments.

I found that calling setTcpNoDelay(true) on the server's socket connection dropped all of my read times back to a constant 20 ms.

I propose a patch to make this TCP_NODELAY option configurable. The attached patch allows one to set the TCP_NODELAY option on both the client and the server side. Currently it defaults to false (i.e., with Nagle's algorithm enabled).

To see the effect, I have included a test which provokes the issue by sending a MapWritable over an IPC call. On my machine this test shows a speedup of 117 times when using TCP_NODELAY.

These tests were done on OS X 10.4. Your mileage may vary with other TCP/IP stack implementations.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
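The setTcpNoDelay(true) call mentioned in the issue is the standard java.net.Socket method for setting TCP_NODELAY. The following is a minimal sketch of applying such a flag to both ends of a connection, not the code from the attached patch: the class TcpNoDelayDemo and its methods are hypothetical names, and a loopback connection stands in for the IPC client and server.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class; not part of Hadoop or of the attached patch.
final class TcpNoDelayDemo {

    // Applies the flag to a connected socket. true sets TCP_NODELAY, which
    // disables Nagle's algorithm; false leaves Nagle's algorithm enabled.
    static void configure(Socket socket, boolean tcpNoDelay) throws IOException {
        socket.setTcpNoDelay(tcpNoDelay);
    }

    // Opens a loopback connection, configures both ends, and reports
    // whether TCP_NODELAY ended up set on both of them.
    static boolean noDelayOnBothEnds(boolean tcpNoDelay) {
        try (ServerSocket listener =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client =
                 new Socket(listener.getInetAddress(), listener.getLocalPort());
             Socket accepted = listener.accept()) {
            configure(client, tcpNoDelay);
            configure(accepted, tcpNoDelay);
            return client.getTcpNoDelay() && accepted.getTcpNoDelay();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("flag=false: " + noDelayOnBothEnds(false));
        System.out.println("flag=true:  " + noDelayOnBothEnds(true));
    }
}
```

The option is per socket, so a server has to set it on each accepted connection rather than on the listening socket alone.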
|
|
[jira] Updated: (HADOOP-2232) Add option to disable nagles algorithm in the IPC Server

[ https://issues.apache.org/jira/browse/HADOOP-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clint Morgan updated HADOOP-2232:
---------------------------------
Status: Patch Available (was: Open)
|
|
[jira] Updated: (HADOOP-2232) Add option to disable nagles algorithm in the IPC Server

[ https://issues.apache.org/jira/browse/HADOOP-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clint Morgan updated HADOOP-2232:
---------------------------------
Attachment: HADOOP-2232-1.patch
|
|
[jira] Commented: (HADOOP-2232) Add option to disable nagles algorithm in the IPC Server

[ https://issues.apache.org/jira/browse/HADOOP-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12543737 ]

Chris Douglas commented on HADOOP-2232:
---------------------------------------
Patch looks good; I've seen disabling Nagle's algorithm work well for other RPC systems and like the idea of permitting it here.

The unit test fails if TcpNoDelay doesn't improve performance by at least 2x. I'm uncertain what sorts of regressions we'd prevent by including it, particularly since it's using a custom protocol. Did you have something in mind, or did you include it as a useful illustration of the concept?
|
|
[jira] Commented: (HADOOP-2232) Add option to disable nagles algorithm in the IPC Server

[ https://issues.apache.org/jira/browse/HADOOP-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12543742 ]

Clint Morgan commented on HADOOP-2232:
--------------------------------------
Yeah, that test was just to illustrate the issue, and not for inclusion.
|
|
[jira] Commented: (HADOOP-2232) Add option to disable nagles algorithm in the IPC Server

[ https://issues.apache.org/jira/browse/HADOOP-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12543801 ]

Hadoop QA commented on HADOOP-2232:
-----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12369828/HADOOP-2232-1.patch
against trunk revision r596495.

@author +1. The patch does not contain any @author tags.
javadoc +1. The javadoc tool did not generate any warning messages.
javac +1. The applied patch does not generate any new compiler warnings.
findbugs +1. The patch does not introduce any new Findbugs warnings.
core tests -1. The patch failed core unit tests.
contrib tests +1. The patch passed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1122/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1122/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1122/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1122/console

This message is automatically generated.
|
|
[jira] Commented: (HADOOP-2232) Add option to disable nagles algorithm in the IPC Server

[ https://issues.apache.org/jira/browse/HADOOP-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12543802 ]

stack commented on HADOOP-2232:
-------------------------------
Clint, can you redo your patch so it doesn't include a test that fails? Otherwise, +1 on the patch. It looks good.

I am running some hbase tests to see the difference before and after the patch. Will report back when done.
|
|
[jira] Updated: (HADOOP-2232) Add option to disable nagles algorithm in the IPC Server

[ https://issues.apache.org/jira/browse/HADOOP-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clint Morgan updated HADOOP-2232:
---------------------------------
Attachment: HADOOP-2232-2.patch

Same patch, but with the test removed. However, the test failure was in an unrelated test.

I don't think TCP_NODELAY will affect the current PerformanceEvaluation. It only affected my results with getRow() when the results were of particular sizes. However, if people notice consistently and uniformly high response latencies, that could be a sign of a bad interaction between Nagle's algorithm and delayed ACKs, which TCP_NODELAY would help with.
|
|
[jira] Commented: (HADOOP-2232) Add option to disable nagles algorithm in the IPC Server

[ https://issues.apache.org/jira/browse/HADOOP-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12543964 ]

Raghu Angadi commented on HADOOP-2232:
--------------------------------------
Are there any disadvantages of enabling this option by default?
|
|
[jira] Commented: (HADOOP-2232) Add option to disable nagles algorithm in the IPC Server

[ https://issues.apache.org/jira/browse/HADOOP-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12544648 ]

stack commented on HADOOP-2232:
-------------------------------
Dang. So this IPC switch-flipping ain't the silver bullet that's going to fix all hbase performance issues?

I did rough timings using sequentialRead in PE. Setting ipc.server.tcpnodelay to true made the test run much slower (cells are 1k in size). Otherwise, +1 on the patch after review and test on my little cluster. Seems like an option that will be important for certain loadings.
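As a rough illustration of how a switch such as ipc.server.tcpnodelay can be read with a default of false, here is a small sketch that uses java.util.Properties in place of Hadoop's own configuration class so that it stays self-contained. The class NoDelayConfig is hypothetical, and the client-side key ipc.client.tcpnodelay is an assumption: the thread only names the server-side key.

```java
import java.util.Properties;

// Hypothetical helper; Hadoop reads such values through its own configuration
// class, which is deliberately not used here.
final class NoDelayConfig {
    // Key quoted in the thread.
    static final String SERVER_KEY = "ipc.server.tcpnodelay";
    // Assumed client-side counterpart; the thread does not name it.
    static final String CLIENT_KEY = "ipc.client.tcpnodelay";

    // Returns the configured flag, defaulting to false (Nagle's algorithm
    // enabled), which matches the default described in the issue.
    static boolean tcpNoDelay(Properties conf, String key) {
        return Boolean.parseBoolean(conf.getProperty(key, "false"));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(SERVER_KEY, "true");   // server opts in explicitly
        System.out.println(SERVER_KEY + " = " + tcpNoDelay(conf, SERVER_KEY));
        System.out.println(CLIENT_KEY + " = " + tcpNoDelay(conf, CLIENT_KEY));
    }
}
```

Keeping the default at false means existing deployments keep their current behaviour unless they opt in, which fits the mixed timing results reported in this thread.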
|
|
[jira] Commented: (HADOOP-2232) Add option to disable nagles algorithm in the IPC Server

[ https://issues.apache.org/jira/browse/HADOOP-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12544891 ]

Clint Morgan commented on HADOOP-2232:
--------------------------------------
As I understand it, a possible disadvantage is increased network traffic and bandwidth use. Disabling Nagle's algorithm means we send packets for small amounts of data, and so we spend more bandwidth on packet headers.
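The header cost mentioned above can be estimated with simple arithmetic. The sketch below assumes 40 bytes of headers per segment (a minimum 20-byte IPv4 header plus a minimum 20-byte TCP header, with no options and ignoring link-layer framing); the class HeaderOverhead and the example sizes are illustrative only and were not measured in this thread.

```java
import java.util.Locale;

// Hypothetical calculator for the share of on-wire bytes spent on headers.
final class HeaderOverhead {
    // Minimum IPv4 header (20 bytes) plus minimum TCP header (20 bytes),
    // assuming no IP or TCP options; link-layer framing is ignored.
    static final int HEADER_BYTES = 40;

    // Fraction of bytes on the wire spent on headers when payloadBytes of
    // data is carried in the given number of TCP segments.
    static double overhead(int payloadBytes, int segments) {
        double headers = (double) HEADER_BYTES * segments;
        return headers / (headers + payloadBytes);
    }

    public static void main(String[] args) {
        // The same 1000 bytes sent as one segment versus ten small segments.
        System.out.println(String.format(Locale.ROOT,
            "1 segment:   %.1f%%", 100.0 * overhead(1000, 1)));
        System.out.println(String.format(Locale.ROOT,
            "10 segments: %.1f%%", 100.0 * overhead(1000, 10)));
    }
}
```

Under these assumptions, 1000 bytes sent in one segment spends about 3.8% of the bytes on headers, while the same data sent as ten 100-byte segments spends about 28.6%.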
|
|
[jira] Assigned: (HADOOP-2232) Add option to disable nagles algorithm in the IPC Server

[ https://issues.apache.org/jira/browse/HADOOP-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley reassigned HADOOP-2232:
-------------------------------------
Assignee: Clint Morgan
|
|
[jira] Commented: (HADOOP-2232) Add option to disable nagles algorithm in the IPC Server

[ https://issues.apache.org/jira/browse/HADOOP-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12549578 ]

Owen O'Malley commented on HADOOP-2232:
---------------------------------------
Mukund,
Can you please run a 500 node sort and look for task failures and execution time degradations?

Thanks!
|
|
[jira] Commented: (HADOOP-2232) Add option to disable nagles algorithm in the IPC Server

[ https://issues.apache.org/jira/browse/HADOOP-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12561453#action_12561453 ]

Mukund Madhugiri commented on HADOOP-2232:
------------------------------------------
Finally getting around to running a 500 node sort benchmark. I tried to apply the patch to trunk and it fails. Is it possible to upload a new patch that works with trunk?

Thanks
|
|
[jira] Updated: (HADOOP-2232) Add option to disable nagles algorithm in the IPC Server

[ https://issues.apache.org/jira/browse/HADOOP-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HADOOP-2232:
----------------------------------
Attachment: 2232-3.patch

Merged patch with latest trunk
|
|
[jira] Updated: (HADOOP-2232) Add option to disable nagles algorithm in the IPC Server

[ https://issues.apache.org/jira/browse/HADOOP-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HADOOP-2232:
----------------------------------
Status: Open (was: Patch Available)
|
|
[jira] Updated: (HADOOP-2232) Add option to disable nagles algorithm in the IPC Server

[ https://issues.apache.org/jira/browse/HADOOP-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HADOOP-2232:
----------------------------------
Status: Patch Available (was: Open)

Submitting re-merged patch for Hudson
|
|
[jira] Commented: (HADOOP-2232) Add option to disable nagles algorithm in the IPC Server

[ https://issues.apache.org/jira/browse/HADOOP-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12561551#action_12561551 ]

Mukund Madhugiri commented on HADOOP-2232:
------------------------------------------

Here is data from the Sort benchmark run on 500 nodes. Sort validation could not be compared as it is broken on trunk (HADOOP-2646).

NOTE: The trunk run is 4 days old; I will have new data on the latest trunk tomorrow.

|*500 nodes*|*trunk*|*trunk + patch*|*Difference (%)*|
|randomWriter (mins)|24|28|( 17.9% )|
|sort (mins)|91|113|( 23% )|

I see exceptions of this kind in the JT logs:

2008-01-22 22:21:35,162 INFO org.apache.hadoop.mapred.TaskInProgress: Error from task_200801222207_0001_m_000989_0: java.io.IOException: All datanodes are bad. Aborting...
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:1831)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1100(DFSClient.java:1479)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1571)

2008-01-22 23:09:01,629 INFO org.apache.hadoop.mapred.TaskInProgress: Error from task_200801222207_0002_m_036186_0: java.net.UnknownHostException: unknown host: <hostname>
    at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:142)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:568)
    at org.apache.hadoop.ipc.Client.call(Client.java:501)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:198)
    at org.apache.hadoop.dfs.$Proxy1.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:291)
    at org.apache.hadoop.dfs.DFSClient.createNamenode(DFSClient.java:127)
    at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:143)
    at org.apache.hadoop.dfs.DistributedFileSystem.initialize(DistributedFileSystem.java:64)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:164)
|
|
[jira] Commented: (HADOOP-2232) Add option to disable nagles algorithm in the IPC Server

[ https://issues.apache.org/jira/browse/HADOOP-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12561595#action_12561595 ]

dhruba borthakur commented on HADOOP-2232:
------------------------------------------

The exception messages are related to HADOOP-1707, which changed the error recovery model. Previously, the client cached an entire disk block and, once the block was full, uploaded it to a pipeline of datanodes. If the upload to the first datanode succeeded, the operation was deemed successful. In the new model, the client uploads the block to all datanodes in the pipeline. In the case of an error, the client establishes a new pipeline (by removing the bad datanode(s) from the pipeline) and resends the outstanding data for this block. This change means that a client is now *more* likely to detect datanode failures.

My question then is: do you see any of these exceptions when you run your test on trunk without the patch for this JIRA?
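The recovery model described in the comment above can be sketched roughly as follows. This is a simplified illustration and not the actual DFSClient code: the class, the method, and the node names are hypothetical, datanodes are plain strings, and an unchecked exception stands in for the `java.io.IOException` seen in the JT logs so the example stays short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PipelineRecoverySketch {
    /**
     * Builds a new pipeline by dropping the datanodes reported as bad.
     * The caller would then resend the outstanding data for the block to the survivors.
     */
    static List<String> rebuildPipeline(List<String> pipeline, Set<String> badNodes) {
        List<String> survivors = new ArrayList<>(pipeline);
        survivors.removeAll(badNodes);
        if (survivors.isEmpty()) {
            // Mirrors the log message quoted earlier in the thread; the real client throws an IOException.
            throw new IllegalStateException("All datanodes are bad. Aborting...");
        }
        return survivors;
    }

    public static void main(String[] args) {
        List<String> pipeline = Arrays.asList("dn1", "dn2", "dn3"); // hypothetical node names
        Set<String> bad = new HashSet<>(Arrays.asList("dn2"));
        System.out.println("New pipeline: " + rebuildPipeline(pipeline, bad));
    }
}
```

Because every datanode in the pipeline now takes part in the upload, a failure anywhere is visible to the client, which is consistent with the comment's point that failures are more likely to be detected than under the old model.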
|
|
[jira] Commented: (HADOOP-2232) Add option to disable nagles algorithm in the IPC Server

[ https://issues.apache.org/jira/browse/HADOOP-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12561645#action_12561645 ]

Hadoop QA commented on HADOOP-2232:
-----------------------------------

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12373779/2232-3.patch
against trunk revision r614413.

    @author +1. The patch does not contain any @author tags.
    javadoc +1. The javadoc tool did not generate any warning messages.
    javac +1. The applied patch does not generate any new compiler warnings.
    findbugs +1. The patch does not introduce any new Findbugs warnings.
    core tests +1. The patch passed core unit tests.
    contrib tests +1. The patch passed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1682/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1682/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1682/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1682/console

This message is automatically generated.