[jira] Commented: (HADOOP-2284) BasicTypeSorterBase.compare calls progress on each compare

by JIRA jira@apache.org


    [ https://issues.apache.org/jira/browse/HADOOP-2284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554382 ]

Amar Kamat commented on HADOOP-2284:
------------------------------------

Looks like there are two possibilities:
-  _counter-threshold based technique_ : Send the progress after a certain number of compares.
-  _time-interval based technique_ : Send the progress once a fixed amount of time has elapsed, with the wait determined by *mapred.task.timeout*.
Currently the time-based technique seems like the better one. Comments?
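
To make the comparison concrete, here is a minimal sketch of both techniques. It is not the Hadoop code: the class ThrottledProgress, its method names, and the injected clock are all hypothetical, and the Runnable stands in for whatever call actually reports progress. How the interval would be derived from *mapred.task.timeout* is left to the caller and is an assumption.

```java
import java.util.function.LongSupplier;

/** Hypothetical helper, not part of Hadoop: throttles progress reports from a sort's compare loop. */
final class ThrottledProgress {
    private final Runnable progress;      // stand-in for the real progress call
    private final int compareThreshold;   // counter technique: report every N compares
    private final long intervalMillis;    // time technique: report at most once per interval
    private final LongSupplier clock;     // injected clock, e.g. System::currentTimeMillis

    private int comparesSinceReport = 0;
    private long lastReportMillis;

    ThrottledProgress(Runnable progress, int compareThreshold,
                      long intervalMillis, LongSupplier clock) {
        this.progress = progress;
        this.compareThreshold = compareThreshold;
        this.intervalMillis = intervalMillis;
        this.clock = clock;
        this.lastReportMillis = clock.getAsLong();
    }

    /** Counter-threshold technique: call from compare(); reports once per compareThreshold calls. */
    void onCompareCounted() {
        if (++comparesSinceReport >= compareThreshold) {
            comparesSinceReport = 0;
            progress.run();
        }
    }

    /** Time-interval technique: call from compare(); reports only after intervalMillis has elapsed. */
    void onCompareTimed() {
        long now = clock.getAsLong();
        if (now - lastReportMillis >= intervalMillis) {
            lastReportMillis = now;
            progress.run();
        }
    }
}
```

Note that the time-based variant as sketched still reads the clock on every compare, which has a cost of its own. One option would be to combine the two and check the clock only once every N compares.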

> BasicTypeSorterBase.compare calls progress on each compare
> ----------------------------------------------------------
>
>                 Key: HADOOP-2284
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2284
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Owen O'Malley
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>
> The inner loop of the sort is calling progress on each compare. I think it would make more sense to call progress in the sort rather than in the compare, or at most once every 10000 compares. In the performance numbers, the calls to progress during the sort are consuming 12% of the total CPU time when running word count under the local runner.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.