Task Optimization

I am interested in ways to short circuit task execution for the purpose of optimization. I would love to see some of this in 0.7 and would be glad to contribute.

Here are some ideas:

1) Add an "onlyIf" method to Task that is given a closure. The closure would be executed before the first action of the task and would cancel execution of the task (with appropriate lifecycle message) if it returned false. This closure would have as a delegate an optimization container with some helper methods that would provide more convenient access to change detection (among other things). Then you could do:

    mytask.onlyIf {
        timestampChanged 'src/main/mysrc'
        // or contentsChanged 'src/main/mysrc'
    }

2) Running a clean should probably remove the change detection state information for a project (or at least the clean task should be able to be configured to do this conveniently).

3) I would like some general way for tasks to indicate that they did anything. Perhaps task.getDidWork(). BTW, I figured out how to do this for gradle's use of ant.javac and can now tell if it really compiled anything.

4) I would like to be able to specify that a chain of dependent tasks only execute a task if Task.didWork is true for all of its dependents. Note that this is not always desired, so you need to be able to turn this on and off. I'm not sure of the best way to configure this. If we use the onlyIf method suggested above, it might take another closure to check this that would be returned from a "needed" method. This would look like:

    myTask.onlyIf(needed())

This probably should be the default for tests, but perhaps not for all Tasks.

Javac is already checking to see if the source files are out of date with the classes, so I don't think that the javac task needs to use the new changedetection. This would, however let you stop other tasks in the chain (like test) if nothing needed to be compiled. (unrelated: I would also like to see an option on compile to use Ant's depend task. I think the current dependencyTracking option doesn't work with the modern compiler.)

Other types of tasks could make good use of Tom's change detection.

5) We probably want a command line option to be able to disable all of these optimizations. Sometimes you really want to force a build with no optimizations (without running clean).

In the race for speed, Gradle will probably never catch Ant in a clean build (at least while you are delegating most of the expensive stuff to ant). However, most of the time developers are doing incremental changes on existing systems and not running clean. In this case, if Gradle can support features to conveniently bypass unneeded steps, it can be much faster. Also, Gradle has a huge advantage of a more maintainable and modular build specification.

--
Steve Appling
Automated Logic Research Team
---------------------------------------------------------------------
To unsubscribe from this list, please visit: http://xircles.codehaus.org/manage_email
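The onlyIf idea in point 1 can be sketched roughly as follows. This is a minimal model in Java, not Gradle's code: `SketchTask`, `Outcome`, and `doLast` as used here are hypothetical stand-ins, and the condition is a plain `BooleanSupplier` rather than a closure with an optimization delegate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in for a task; this is not Gradle's Task class. */
class SketchTask {
    enum Outcome { NOT_RUN, SKIPPED, EXECUTED }

    private final String name;
    private final List<Runnable> actions = new ArrayList<>();
    private BooleanSupplier onlyIf = () -> true; // assumed default: always run
    private Outcome outcome = Outcome.NOT_RUN;

    SketchTask(String name) { this.name = name; }

    /** Appends an action; actions run in the order they were added. */
    SketchTask doLast(Runnable action) { actions.add(action); return this; }

    /** Sets the condition that is evaluated before the first action. */
    SketchTask onlyIf(BooleanSupplier condition) { this.onlyIf = condition; return this; }

    /** Evaluates the condition once, then either skips or runs every action. */
    Outcome execute() {
        if (!onlyIf.getAsBoolean()) {
            // The "appropriate lifecycle message" from the proposal.
            System.out.println(":" + name + " SKIPPED (onlyIf returned false)");
            outcome = Outcome.SKIPPED;
        } else {
            actions.forEach(Runnable::run);
            outcome = Outcome.EXECUTED;
        }
        return outcome;
    }

    Outcome getOutcome() { return outcome; }
}
```

A task configured with `onlyIf(() -> false)` reports `SKIPPED` and none of its actions run. Whether an executed task actually changed anything is a separate question that this sketch does not model.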
|
|
Re: Task Optimization

I have a proof of concept implementation of some of this at git://github.com/sappling/gradle.git in the "opt" branch.

This includes:

1) A new onlyIf method on Task
2) A new didWork method on Task
3) Implementations of didWork for Compile and GroovyCompile. I don't think that we can handle Ant's Copy task in this same way. We may have to use a replacement, but this has other consequences.
4) Changes to src/samples/java/quickstart/build.gradle to demo didWork and onlyIf.
5) A start at OptimizationHelper.isNeeded method. This will require some additional dependency management features, so I stopped development until I got some feedback on this whole approach.

Steve Appling wrote:
> <snip>

--
Steve Appling
Automated Logic Research Team
|
|
Re: Task Optimization

On Jun 19, 2009, at 9:44 PM, Steve Appling wrote:

> I am interested in ways to short circuit task execution for the purpose of optimization. I would love to see some of this in 0.7 and would be glad to contribute.
>
> Here are some ideas:
> 1) Add an "onlyIf" method to Task that is given a closure. The closure would be executed before the first action of the task and would cancel execution of the task (with appropriate lifecycle message) if it returned false. This closure would have as a delegate an optimization container with some helper methods that would provide more convenient access to change detection (among other things). Then you could do:
>
>     mytask.onlyIf {
>         timestampChanged 'src/main/mysrc'
>         // or contentsChanged 'src/main/mysrc'
>     }

I like the syntax. I'm also thinking about the following use cases: I want to _add_ a custom onlyIf condition. I want to remove a condition. What you can do to add:

    oldOnlyIf = myTask.onlyIf
    myTask.onlyIf { value == 5 && oldOnlyIf.call() }

It is not very nice but it works. So I think that should be good enough for 0.7. Later we might add a spec like API. Removing and replacing is obviously easy.

> 2) Running a clean should probably remove the change detection state information for a project (or at least the clean task should be able to be configured to do this conveniently).

That would be important.

> 3) I would like some general way for tasks to indicate that they did anything. Perhaps task.getDidWork(). BTW, I figured out how to do this for gradle's use of ant.javac and can now tell if it really compiled anything.

This makes sense.

> 4) I would like to be able to specify that a chain of dependent tasks only execute a task if Task.didWork is true for all of its dependents. Note that this is not always desired, so you need to be able to turn this on and off. I'm not sure of the best way to configure this. If we use the onlyIf method suggested above, it might take another closure to check this that would be returned from a "needed" method. This would look like:
>
>     myTask.onlyIf(needed())
>
> This probably should be the default for tests, but perhaps not for all Tasks.
>
> Javac is already checking to see if the source files are out of date with the classes, so I don't think that the javac task needs to use the new changedetection.

Right. But we can set the didWork flag.

> This would, however let you stop other tasks in the chain (like test) if nothing needed to be compiled.

Right.

> (unrelated: I would also like to see an option on compile to use Ant's depend task. I think the current dependencyTracking option doesn't work with the modern compiler.)

Interesting. This deserves a discussion on its own. I think this is an important topic.

> Other types of tasks could make good use of Tom's change detection.
>
> 5) We probably want a command line option to be able to disable all of these optimizations. Sometimes you really want to force a build with no optimizations (without running clean).

Right. A contrived example: You want the tests to be run even if nothing needs to be compiled as your tests depend on some dynamic properties retrieved from the network. Adam has come up with the idea of introducing the notion of a build type. We should discuss this now in more detail.

- Hans

--
Hans Dockter
Gradle Project Manager
http://www.gradle.org
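The "capture the old condition, then AND it with a new one" workaround for adding an onlyIf condition can be written out as a small helper. The sketch below is in Java for illustration only; `ConditionHolder` and `andOnlyIf` are hypothetical names, not part of any Gradle API discussed in the thread.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical holder for a task's skip condition; not a Gradle class. */
class ConditionHolder {
    private BooleanSupplier onlyIf = () -> true;

    BooleanSupplier getOnlyIf() { return onlyIf; }

    /** Replaces the condition outright (the "removing and replacing" case). */
    void setOnlyIf(BooleanSupplier condition) { this.onlyIf = condition; }

    /** Adds a condition: both the new one and the previous one must hold. */
    void andOnlyIf(BooleanSupplier extra) {
        BooleanSupplier previous = this.onlyIf; // capture before overwriting
        this.onlyIf = () -> extra.getAsBoolean() && previous.getAsBoolean();
    }

    boolean shouldRun() { return onlyIf.getAsBoolean(); }
}
```

Capturing `previous` in a local variable before reassigning the field is the important step. A lambda that read the field directly would end up calling itself.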
|
|
Re: Task Optimization

Cool. I'm keen to get this into 0.7.

On Jun 22, 2009, at 9:03 PM, Steve Appling wrote:

> I have a proof of concept implementation of some of this at git://github.com/sappling/gradle.git in the "opt" branch.
>
> This includes:
> 1) A new onlyIf method on Task
> 2) A new didWork method on Task
> 3) Implementations of didWork for Compile and GroovyCompile.

Amazing. For Ant 1.7.1 you probably could also use the new updatedProperty. But no such thing exists for Groovyc. Excellent. And I think we should expose all the information you gather. The public API of the Compile task could return a list with compiled files.

> I don't think that we can handle Ant's Copy task in this same way. We may have to use a replacement, but this has other consequences.

I guess the problem is that as long as the Copy task is not able to tell if it did work, we can't decide whether to skip the tests (unless we check the binary dir). Isn't it?

> 5) A start at OptimizationHelper.isNeeded method. This will require some additional dependency management features, so I stopped development until I got some feedback on this whole approach.

I have to think about the whole 'needed' thing. I will give feedback later.

- Hans

--
Hans Dockter
Gradle Project Manager
http://www.gradle.org
|
|
Re: Task Optimization

<snip>

> 4) I would like to be able to specify that a chain of dependent tasks only execute a task if Task.didWork is true for all of its dependents.

I don't fully understand this. Could you explain this a bit more?

<snip>

- Hans

--
Hans Dockter
Gradle Project Manager
http://www.gradle.org
|
|
Re: Task Optimization

Hans Dockter wrote:

> <snip>
>
>> 4) I would like to be able to specify that a chain of dependent tasks only execute a task if Task.didWork is true for all of its dependents.
>
> I don't fully understand this. Could you explain this a bit more?
>
> <snip>
>
> - Hans

Sure - I did not express that very well at all. I also wrote it before I attempted an implementation, so I think I have a better idea of what might be needed now.

In the syntax that I implemented, you could say:

    test.onlyIf { isNeeded() }

I wanted this to be able to look at the TaskDependencies for the test task and only execute if Task.didWork was true for one of them. I was not able to figure out how to use TaskDependencies to accomplish this. task.getTaskDependencies(task) only returns the tasks that are explicitly added using dependsOn and doesn't seem to take into account the tasks needed to build the artifacts in the configurations that are contained in the TaskDependencies object.

In this case (a Test task), I would like the isNeeded method to return true if either compile or compileTests didWork() is true or if any of the tasks needed to build artifacts in the testRuntime configuration didWork() was true. Currently it does not check the tasks that might be derived from the configuration.

I was hoping that a general purpose isNeeded helper could do this for all tasks in the same way, but it is possible that certain subclasses of Task just need their own specific implementations.

--
Steve Appling
Automated Logic Research Team
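The isNeeded behaviour described in this message (run test only if some task it depends on did work) might look like the following. This is a simplified Java model: `DepTask` and `NeededCheck` are hypothetical, dependencies are plain object references rather than Gradle's TaskDependencies, and following dependencies transitively is an assumption, since the message only names the direct cases.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical task node: a name, its dependencies, and a didWork flag. */
class DepTask {
    final String name;
    final List<DepTask> dependsOn = new ArrayList<>();
    boolean didWork; // assumed to be set by the task itself after it runs

    DepTask(String name, DepTask... deps) {
        this.name = name;
        dependsOn.addAll(List.of(deps));
    }
}

/** Hypothetical helper: a task is "needed" if any dependency did work. */
class NeededCheck {
    static boolean isNeeded(DepTask task) {
        return anyDependencyDidWork(task, new HashSet<>());
    }

    // Depth-first walk; the visited set keeps shared dependencies
    // from being examined more than once.
    private static boolean anyDependencyDidWork(DepTask task, Set<DepTask> visited) {
        for (DepTask dep : task.dependsOn) {
            if (!visited.add(dep)) {
                continue; // already checked via another path
            }
            if (dep.didWork || anyDependencyDidWork(dep, visited)) {
                return true;
            }
        }
        return false;
    }
}
```

Tasks that build the artifacts of a configuration such as testRuntime would have to appear in `dependsOn` for this to cover the case the message says is currently missing.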
|
|
Re: Task Optimization

Hans Dockter wrote:

> Cool. I'm keen to get this into 0.7.
>
> On Jun 22, 2009, at 9:03 PM, Steve Appling wrote:
>
>> I have a proof of concept implementation of some of this at git://github.com/sappling/gradle.git in the "opt" branch.
>>
>> This includes:
>> 1) A new onlyIf method on Task
>> 2) A new didWork method on Task
>> 3) Implementations of didWork for Compile and GroovyCompile.
>
> Amazing. For Ant 1.7.1 you probably could also use the new updatedProperty. But no such thing exists for Groovyc. Excellent. And I think we should expose all the information you gather. The public API of the Compile task could return a list with compiled files.
>
>> I don't think that we can handle Ant's Copy task in this same way. We may have to use a replacement, but this has other consequences.
>
> I guess the problem is that as long as the Copy task is not able to tell if it did work, we can't decide whether to skip the tests (unless we check the binary dir). Isn't it?

The Ant copy task keeps a list of the files to copy, but clears it at the end of the execute method (comment says this is to clean up so a single instance can be reused).

We need to know if the copy did anything (for tasks like processResources) so that processResources.didWork can be part of the onlyIf closure for test.

I have a replacement implementation of Copy that doesn't use Ant, but I would want to give it a closer look before making it public. It is not exactly the same syntax as the current Copy task, but has some nice extra features including file renaming based on regular expressions and filtering content during a copy. It can also track if any files were actually copied.

>> 5) A start at OptimizationHelper.isNeeded method. This will require some additional dependency management features, so I stopped development until I got some feedback on this whole approach.
>
> I have to think about the whole 'needed' thing. I will give feedback later.
>
> - Hans
>
> --
> Hans Dockter
> Gradle Project Manager
> http://www.gradle.org

--
Steve Appling
Automated Logic Research Team
|
|
Re: Task Optimization

On Jun 24, 2009, at 10:42 PM, Steve Appling wrote:

> Hans Dockter wrote:
>> <snip>
>>> 4) I would like to be able to specify that a chain of dependent tasks only execute a task if Task.didWork is true for all of its dependents.
>> I don't fully understand this. Could you explain this a bit more?
>> <snip>
>> - Hans
>
> Sure - I did not express that very well at all. I also wrote it before I attempted an implementation, so I think I have a better idea of what might be needed now.
>
> In the syntax that I implemented, you could say:
>
>     test.onlyIf { isNeeded() }
>
> I wanted this to be able to look at the TaskDependencies for the test task and only execute if Task.didWork was true for one of them. I was not able to figure out how to use TaskDependencies to accomplish this. task.getTaskDependencies(task) only returns the tasks that are explicitly added using dependsOn and doesn't seem to take into account the tasks needed to build the artifacts in the configurations that are contained in the TaskDependencies object.

At the moment our task execution graph does not provide this information, nor does it have a data model for this. What should be straightforward to do is to add a method to the execution graph that computes this on the fly for a certain task.

> In this case (a Test task), I would like the isNeeded method to return true if either compile or compileTests didWork() is true or if any of the tasks needed to build artifacts in the testRuntime configuration didWork() was true. Currently it does not check the tasks that might be derived from the configuration.

It is similar to the idea of smart exclusion except that this needs to be done at execution time.

> I was hoping that a general purpose isNeeded helper could do this for all tasks in the same way, but it is possible that certain subclasses of Task just need their own specific implementations.

Right.

- Hans

--
Hans Dockter
Gradle Project Manager
http://www.gradle.org
|
|
Re: Task Optimization

On Jun 24, 2009, at 11:09 PM, Steve Appling wrote:

> Hans Dockter wrote:
>> Cool. I'm keen to get this into 0.7.
>> On Jun 22, 2009, at 9:03 PM, Steve Appling wrote:
>>> I have a proof of concept implementation of some of this at git://github.com/sappling/gradle.git in the "opt" branch.
>>>
>>> This includes:
>>> 1) A new onlyIf method on Task
>>> 2) A new didWork method on Task
>>> 3) Implementations of didWork for Compile and GroovyCompile.
>> Amazing. For Ant 1.7.1 you probably could also use the new updatedProperty. But no such thing exists for Groovyc. Excellent. And I think we should expose all the information you gather. The public API of the Compile task could return a list with compiled files.
>>> I don't think that we can handle Ant's Copy task in this same way. We may have to use a replacement, but this has other consequences.
>> I guess the problem is that as long as the Copy task is not able to tell if it did work, we can't decide whether to skip the tests (unless we check the binary dir). Isn't it?
>
> The Ant copy task keeps a list of the files to copy, but clears it at the end of the execute method (comment says this is to clean up so a single instance can be reused).
>
> We need to know if the copy did anything (for tasks like processResources) so that processResources.didWork can be part of the onlyIf closure for test.
>
> I have a replacement implementation of Copy that doesn't use Ant, but I would want to give it a closer look before making it public. It is not exactly the same syntax as the current Copy task, but has some nice extra features including file renaming based on regular expressions and filtering content during a copy. It can also track if any files were actually copied.

I'm very happy to switch the Copy implementation even if it introduces some breaking changes. I'm very interested to have a look at your implementation.

- Hans

--
Hans Dockter
Gradle Project Manager
http://www.gradle.org
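For readers wondering what "a Copy that can tell whether it copied anything" might involve, here is a rough Java sketch. It is not the replacement implementation mentioned in the thread, which was not posted. `TrackingCopy` is a hypothetical class, the up-to-date test is a plain timestamp comparison, and renaming and content filtering are left out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical copy helper that remembers which files it actually copied. */
class TrackingCopy {
    private final List<Path> copied = new ArrayList<>();

    /** Target files written by the most recent call to copy(). */
    List<Path> getCopied() { return copied; }

    /** True if the most recent call to copy() wrote at least one file. */
    boolean getDidWork() { return !copied.isEmpty(); }

    void copy(Path sourceDir, Path targetDir) throws IOException {
        copied.clear(); // reset at the start, so results survive after the run
        List<Path> sources;
        try (Stream<Path> walk = Files.walk(sourceDir)) {
            sources = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        for (Path source : sources) {
            Path target = targetDir.resolve(sourceDir.relativize(source));
            if (isUpToDate(source, target)) {
                continue; // nothing to do for this file
            }
            Files.createDirectories(target.getParent());
            Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
            copied.add(target);
        }
    }

    // Assumption: a target that exists and is at least as new as its source is current.
    private static boolean isUpToDate(Path source, Path target) throws IOException {
        return Files.exists(target)
            && Files.getLastModifiedTime(target).compareTo(Files.getLastModifiedTime(source)) >= 0;
    }
}
```

Clearing the list at the start of `copy` rather than at the end is what lets a caller query `getDidWork()` afterwards, which is the information the Ant task discards.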
|
|
Re: Task Optimization

Steve Appling wrote:

> I am interested in ways to short circuit task execution for the purpose of optimization. I would love to see some of this in 0.7 and would be glad to contribute.
>
> Here are some ideas:
> 1) Add an "onlyIf" method to Task that is given a closure. The closure would be executed before the first action of the task and would cancel execution of the task (with appropriate lifecycle message) if it returned false. This closure would have as a delegate an optimization container with some helper methods that would provide more convenient access to change detection (among other things). Then you could do:
>
>     mytask.onlyIf {
>         timestampChanged 'src/main/mysrc'
>         // or contentsChanged 'src/main/mysrc'
>     }

I think this is a good idea.

> 2) Running a clean should probably remove the change detection state information for a project (or at least the clean task should be able to be configured to do this conveniently).

I think the change detection mechanism should figure out that the output artifacts don't exist any more instead.

One thing that clean should arguably get rid of is the internal repository in $rootDir/.gradle. I wonder if it should also clean the buildSrc project?

> 3) I would like some general way for tasks to indicate that they did anything. Perhaps task.getDidWork(). BTW, I figured out how to do this for gradle's use of ant.javac and can now tell if it really compiled anything.

When you say 'it really compiled anything' do you mean you can tell whether the task decided to invoke javac or not?

I think it would be better if Gradle could figure out whether a task did anything, rather than require the task writer to do anything. I think we could assume that if a task executes any task action, it has done work. If a task wants to do any short-circuiting, it would need to use an onlyIf() predicate. In addition, if we provided an easy way for a task to declare its output artifacts, then Gradle can additionally automatically apply change detection to these output artifacts in order to decide whether the task did any work.

So, instead of adding a Task.didWork property, perhaps we should merge this concept with the existing Task.executed property into a single read-only Task.state property with an enum with values something like: created, executed, or skipped.

> 4) I would like to be able to specify that a chain of dependent tasks only execute a task if Task.didWork is true for all of its dependents. Note that this is not always desired, so you need to be able to turn this on and off. I'm not sure of the best way to configure this. If we use the onlyIf method suggested above, it might take another closure to check this that would be returned from a "needed" method. This would look like:
>
>     myTask.onlyIf(needed())
>
> This probably should be the default for tests, but perhaps not for all Tasks.

I'm not sure about this approach.

The tests should run if either the test classes or the classes under test have changed since last time we successfully ran the tests. Arguably a change to the test runtime classpath should also cause the tests to run. In other words, the tests should be run only if the input artifacts have changed since last time we ran the tests. Checking whether all the dependencies of the test task have executed or not is only an approximation of this, and not a general solution. For example, if I assemble my classes under test using, say, 2 independent Compile tasks, then the test task should run if either task has done something. Or, I may assemble my classes using some other build tool, so that there's no task which we can use to check whether or not the classes have changed.

To me, the key to task optimisation is to base it on the input and output artifacts of a task. If we make it easy to declare both the input and output artifacts of a task, we make the model much richer, and from this we get a lot of goodness.

For example, if we know what the input artifacts for a task are, Gradle can apply change detection to those input artifacts on the task's behalf. If we also know which tasks produce those artifacts, then Gradle can optimise the change detection. Gradle could, for example, when it knows which task produces a given artifact, simply use the fact that the producer task executed an action or not to decide whether the input artifacts have changed, and only fall back to hashing or timestamps or a Java 7 file watcher or whatever when it doesn't know how the artifact is produced. Similarly, it could use the fact that a Jar was downloaded by the dependency management system to decide whether the input artifacts have changed.

Adding input and output artifacts to the model also lets us use this information to build the DAG, and to be smart about skipping tasks. For example, if the test task were to declare that it uses the tests classes directory and the test runtime configuration as input artifacts, then Gradle would be able to automatically add the tasks that produce these (if any) to the task dependencies of the test task.

Knowing which tasks produce and consume a given artifact also allows us to extract concurrency constraints from the model. If 2 tasks both contribute to the production of the same artifact (classes dir, say), they should not run concurrently. Or if 2 tasks both consume the same artifact, they should not run concurrently. And obviously a producer and consumer task for a given artifact should not run concurrently.

Extending this further, if we know the input and output artifacts of a task, or subgraph of tasks, we can distribute the work to remote machines.

> Javac is already checking to see if the source files are out of date with the classes, so I don't think that the javac task needs to use the new changedetection. This would, however let you stop other tasks in the chain (like test) if nothing needed to be compiled.
> (unrelated: I would also like to see an option on compile to use Ant's depend task. I think the current dependencyTracking option doesn't work with the modern compiler.)
>
> Other types of tasks could make good use of Tom's change detection.
>
> 5) We probably want a command line option to be able to disable all of these optimizations. Sometimes you really want to force a build with no optimizations (without running clean).
>
> In the race for speed, Gradle will probably never catch Ant in a clean build (at least while you are delegating most of the expensive stuff to ant).

I wonder. The richer our model, the more scope we have to optimise without the build script author or task author having to do anything special. We can automatically extract parallelism. We can inline and batch tasks. We can distribute bits of the build. We can reuse work that other machines have already done.

Adam
|
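One way to picture the input/output-based approach from the message above is a checker that records a hash of a task's declared input files after a successful run, and treats the task as up to date only if that hash is unchanged and every declared output still exists. The Java below is a sketch under those assumptions. `UpToDateChecker` is a hypothetical name, state is held in memory rather than persisted, and inputs are assumed to be regular files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory change detector keyed by task name. */
class UpToDateChecker {
    private final Map<String, String> lastInputHash = new HashMap<>();

    /** Up to date only if all outputs exist and the inputs hash as they did last time. */
    boolean isUpToDate(String taskName, List<Path> inputs, List<Path> outputs) {
        boolean outputsExist = outputs.stream().allMatch(Files::exists);
        return outputsExist && hash(inputs).equals(lastInputHash.get(taskName));
    }

    /** Call after the task has run successfully. */
    void recordSuccess(String taskName, List<Path> inputs) {
        lastInputHash.put(taskName, hash(inputs));
    }

    // Hashes each file's path and contents, in the order given.
    private static String hash(List<Path> files) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            for (Path file : files) {
                digest.update(file.toString().getBytes(StandardCharsets.UTF_8));
                digest.update(Files.readAllBytes(file));
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because missing outputs already make the task out of date, a clean that deletes the outputs would not also need to delete the recorded state in this model.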
|
Re: Task OptimizationAdam Murdoch wrote: > > > Steve Appling wrote: >> I am interested in ways to short circuit task execution for the >> purpose of optimization. I would love to see some of this in 0.7 and >> would be glad to contribute. >> >> Here are some ideas: >> 1) Add an "onlyIf" method to Task that is given a closure. The closure >> would be executed before the first action of the task and would cancel >> execution of the task (with appropriate lifecycle message) if it >> returned false. This closure would have as a delegate an optimization >> container with some helper methods that would provide more convenient >> access to change detection (among other things). Then you could do: >> mytask.onlyIf { >> timestampChanged 'src/main/mysrc' >> // or contentsChanged 'src/main/mysrc' >> } >> > > I think this is a good idea. > >> 2) Running a clean should probably remove the change detection state >> information for a project (or at least the clean task should be able >> to be configured to do this conveniently). >> > > I think the change detection mechanism should figure out that the output > artifacts don't exist any more instead. > > One thing that clean should arguably get rid of is the internal > repository in $rootDir/.gradle. I wonder if it should also clean the > buildSrc project? > >> 3) I would like some general way for tasks to indicate that they did >> anything. Perhaps task.getDidWork(). BTW, I figured out how to do >> this for gradle's use of ant.javac and can now tell if it really >> compiled anything. >> > > When you say 'it really compiled anything' do you mean you can tell > whether the task decided to invoke javac or not? Ant's javac scans the source and class files itself to see if any source files are newer than the corresponding class files. If so, it then calls Java's javac with this list of outdated files. After executing the gradle task, I can determine which files were actually passed to Java's javac by ant. 
For several types of tasks (compile, groovycompile, copy, directory, zip, jar, tar), the task is already doing its own optimization by comparing source timestamps to some target during execution. It is possible to execute the task without it having any side effects. Since most of them have the information about what they actually did, it seems better (and faster) to use this information instead of scanning source / output a second time externally to see what changed. > > I think it would be better if Gradle could figure out whether a task did > anything, rather than require the task writer to do anything. I would like this, but I'm not sure how to accomplish it in the general case. Tasks may have input/output other than just a set of files (like network operations, web services calls, deploy over webdav). Even tasks like copy may do the work in a way that makes it hard to see what happened after the fact. I know that we have several tasks which have output that is put into the same directory with the output from other tasks. It would not be sufficient to just scan the output directories after each execution since they would also include the results from other tasks. If you never allow parallel execution, then you could scan the output directories both before and after a tasks execution, but this seems expensive. If the task already knows what it did, why not make use of that information. For custom tasks (instances of DefaultTask), it seems simpler for a build writer to set some state to indicate if they did anything than to specify the set of files to check. If this check is best done by comparing files, then we should provide easy ways to call into the change detection code to set this state. > I think we could assume that if a task executes any task action, it has done work. I don't think this is true. As I discussed above, there are many tasks (like compile) that execute their task action, but decide during execution to not cause any side effects. 
> If a task wants to do any short-circuiting, it would need to use an > onlyIf() predicate. In addition, if we provided any easy way for a task > to declare its output artifacts, then Gradle can additionally > automatically apply change detection to these output artifacts in order > to decide whether the task did any work. > > So, instead of adding a Task.didWork property, perhaps we should merge > this concept with the existing Task.executed property into a single > read-only Task.state property with an enum with values something like: > created, executed, or skipped. > I think you should be able to distinguish executed and did something from executed and didn't do anything. >> 4) I would like to be able to specify that a chain of dependent tasks >> only execute a task if Task.didWork is true for all of its >> dependents. Note that this is not always desired, so you need to be >> able to turn this on and off. I'm not sure of the best way to >> configure this. If we use the onlyIf method suggested above, it might >> take another closure to check this that would be returned from a >> "needed" method. This would look like: >> myTask.onlyIf(needed()) >> >> This probably should be the default for tests, but perhaps not for all >> Tasks. >> > > I'm not sure about this approach. either. I don't think there is anything appropriate to do "for a chain of dependent tasks". I do still like the general idea of onlyIf { isNeeded() }. I think that isNeeded may be a good place contain any mechanism for Gradle to automatically determine if artifacts it depends changed or tasks it depends on did work. > > The tests should run if either the test classes or the classes under > test have changed since last time we successfully ran the tests. > Arguably a change to the test runtime classpath should also cause the > tests to run. In other words, the tests should be run only if the input > artifacts have not changed since last time we ran the tests. 
> Checking whether all the dependencies of the test task have executed
> or not is only an approximation of this, and not a general solution.
> For example, if I assemble my classes under test using, say, 2
> independent Compile tasks, then the test task should run if either
> task has done something. Or, I may assemble my classes using some
> other build tool, so that there's no task which we can use to check
> whether or not the classes have changed.
>
> To me, the key to task optimisation is to base it on the input and
> output artifacts of a task. If we make it easy to declare both the input
> and output artifacts of a task, we make the model much richer, and from
> this we get a lot of goodness.
>
> For example, if we know what the input artifacts for a task are, Gradle
> can apply change detection to those input artifacts on the task's
> behalf. If we also know which tasks produce those artifacts, then Gradle
> can optimise the change detection. Gradle could, for example, when it
> knows which task produces a given artifact, simply use the fact that the
> producer task executed an action or not to decide whether the input
> artifacts have changed, and only fall back to hashing or timestamps or a
> Java 7 file watcher or whatever when it doesn't know how the artifact is
> produced. Similarly, it could use the fact that a Jar was downloaded by
> the dependency management system to decide whether the input artifacts
> have changed.
>
> Adding input and output artifacts to the model also lets us use this
> information to build the DAG, and to be smart about skipping tasks. For
> example, if the test task were to declare that it uses the tests classes
> directory and the test runtime configuration as input artifacts, then
> Gradle would be able to automatically add the tasks that produce these
> (if any) to the task dependencies of the test task.
>
> Knowing which tasks produce and consume a given artifact also allows us
> to extract concurrency constraints from the model. If 2 tasks both
> contribute to the production of the same artifact (classes dir, say),
> they should not run concurrently. Or if 2 tasks both consume the same
> artifact, they should not run concurrently. And obviously a producer and
> consumer task for a given artifact should not run concurrently.
>
> Extending this further, if we know the input and output artifacts of a
> task, or subgraph of tasks, we can distribute the work to remote machines.

and some helpers to allow manual use of optimization and then investigate techniques to allow Gradle to be smarter about this and do more automatically. If Gradle just adds optimization rules to tasks in the built-in plugins and doesn't provide automated optimization for custom tasks, you will still get a lot of benefit.

I generally like the idea of a richer model that has information about what each task consumes and produces, but I'm not clear exactly how this would be specified. I don't want to require the build writer to duplicate information about what the task inputs / outputs are. I would love to see some examples of how this would work for general tasks.

>> Javac is already checking to see if the source files are out of date
>> with the classes, so I don't think that the javac task needs to use
>> the new changedetection. This would, however let you stop other tasks
>> in the chain (like test) if nothing needed to be compiled.
>> (unrelated: I would also like to see an option on compile to use Ant's
>> depend task. I think the current dependencyTracking option doesn't
>> work with the modern compiler.)
>>
>> Other types of tasks could make good use of Tom's change detection.
>>
>> 5) We probably want a command line option to be able to disable all of
>> these optimizations. Sometimes you really want to force a build with
>> no optimizations (without running clean).
>>
>> In the race for speed, Gradle will probably never catch Ant in a clean
>> build (at least while you are delegating most of the expensive stuff
>> to ant).
>
> I wonder. The richer our model, the more scope we have to optimise
> without the build script author or task author having to do anything
> special. We can automatically extract parallelism. We can inline and
> batch tasks. We can distribute bits of the build. We can reuse work that
> other machines have already done.
>
> Adam

--
Steve Appling
Automated Logic Research Team

---------------------------------------------------------------------
To unsubscribe from this list, please visit:
http://xircles.codehaus.org/manage_email
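A rough sketch of the input-based change detection proposed in the quoted text of this message: a task runs only if its declared inputs differ from those recorded after its last successful run. Everything here is an assumption for illustration (`InputChangeDetector`, `inputsChanged`, `recordSuccess`, and inputs modelled as name-to-content strings). It shows only the hashing fallback, not the smarter "did the producer task execute" shortcut, and it is not how Gradle implements anything.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not Gradle's API.
class InputChangeDetector {
    // Digest of the inputs seen at each task's last successful run.
    // A real build tool would persist this between builds.
    private final Map<String, byte[]> lastSuccessful = new HashMap<>();

    // True if there is no record for the task, or the inputs differ from the record.
    boolean inputsChanged(String taskName, Map<String, String> inputs) {
        byte[] previous = lastSuccessful.get(taskName);
        return previous == null || !Arrays.equals(previous, digest(inputs));
    }

    // Call only after the task has completed successfully.
    void recordSuccess(String taskName, Map<String, String> inputs) {
        lastSuccessful.put(taskName, digest(inputs));
    }

    private static byte[] digest(Map<String, String> inputs) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            // Sort by name so the digest does not depend on map iteration order.
            for (Map.Entry<String, String> e : new TreeMap<>(inputs).entrySet()) {
                md.update(e.getKey().getBytes(StandardCharsets.UTF_8));
                md.update((byte) 0);
                md.update(e.getValue().getBytes(StandardCharsets.UTF_8));
                md.update((byte) 0);
            }
            return md.digest();
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException("SHA-256 not available", ex);
        }
    }
}
```

Keeping the check separate from `recordSuccess` matches the "since last time we successfully ran the tests" wording: a failed run leaves the old record in place, so the task is still considered out of date next time.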
Re: Task Optimization
Hans Dockter wrote:
> I'm very happy to switch the Copy implementation even if it introduces
> some breaking changes. I'm very interested to have a look at your
> implementation.
>
> - Hans

I'll put my changes in a public repo in a few days and start another thread here with information about the syntax changes and new features.

--
Steve Appling
Automated Logic Research Team
Re: Task Optimization
> One thing that clean should arguably get rid of is the internal
> repository in $rootDir/.gradle. That's where we have our cached stuff.

I'm wondering if clean should really affect the cache. Many people always do a clean when they do a build. That means buildSrc would always be built and the build script would always be compiled. That's usually not what they want, I think.

> I wonder if it should also clean the buildSrc project?

The buildSrc jar is cached. If the cached jar is not available (e.g. deleted by -C rebuild) or is out of date, the buildSrc project is rebuilt with a clean.

- Hans

--
Hans Dockter
Gradle Project Manager
http://www.gradle.org
Re: Task Optimization
>> 4) I would like to be able to specify that a chain of dependent
>> tasks only execute a task if Task.didWork is true for all of its
>> dependents. Note that this is not always desired, so you need to
>> be able to turn this on and off. I'm not sure of the best way to
>> configure this. If we use the onlyIf method suggested above, it
>> might take another closure to check this that would be returned
>> from a "needed" method. This would look like:
>> myTask.onlyIf(needed())
>>
>> This probably should be the default for tests, but perhaps not for
>> all Tasks.
>
> I'm not sure about this approach.
>
> The tests should run if either the test classes or the classes under
> test have changed since last time we successfully ran the tests.
> Arguably a change to the test runtime classpath should also cause
> the tests to run. In other words, the tests should be run only if
> the input artifacts have changed since last time we ran the
> tests. Checking whether all the dependencies of the test task have
> executed or not is only an approximation of this, and not a general
> solution. For example, if I assemble my classes under test using,
> say, 2 independent Compile tasks, then the test task should run if
> either task has done something. Or, I may assemble my classes using
> some other build tool, so that there's no task which we can use to
> check whether or not the classes have changed.
>
> To me, the key to task optimisation is to base it on the input and
> output artifacts of a task. If we make it easy to declare both the
> input and output artifacts of a task, we make the model much richer,
> and from this we get a lot of goodness.
>
> For example, if we know what the input artifacts for a task are,
> Gradle can apply change detection to those input artifacts on the
> task's behalf. If we also know which tasks produce those artifacts,
> then Gradle can optimise the change detection.
> Gradle could, for example, when it knows which task produces a given
> artifact, simply use the fact that the producer task executed an
> action or not to decide whether the input artifacts have changed,
> and only fall back to hashing or timestamps or a Java 7 file watcher
> or whatever when it doesn't know how the artifact is produced.
> Similarly, it could use the fact that a Jar was downloaded by the
> dependency management system to decide whether the input artifacts
> have changed.

This is very interesting. I'm just trying to play a little with some terminology.

There are output-affecting input values (e.g. classpaths, src dirs, compiler options, ...) and also some non-output-affecting input values like log level. The output-affecting input values can be subdivided into those belonging to something like an Outputter and something like plain input values. Outputters can tell if they did some work; for plain input values the task needs its own history and change detection management. By providing a rich domain model, important types of plain input values can be turned into outputters (e.g. SourceDir). And for a subset of the remaining range of input value types we should be able to provide a nice toolkit that makes it easy to define change detection.

With the above model, the default behavior of onlyIf is

inputValues.haveChanged == true

There are also scenarios like: This task should not be executed on Friday. I think they don't fit into the input value model. So we still need to accommodate custom onlyIf rules.

One of the interesting issues is to make it easy to write such tasks.

> Adding input and output artifacts to the model also lets us use this
> information to build the DAG, and to be smart about skipping tasks.
> For example, if the test task were to declare that it uses the tests
> classes directory and the test runtime configuration as input
> artifacts, then Gradle would be able to automatically add the tasks
> that produce these (if any) to the task dependencies of the test task.

One thing that comes to my mind is a scenario where two tasks output into classesDir, but a third task only wants to be dependent on one of those other tasks. Yet I see your point. It is a very interesting question how to integrate the concepts of the input/output model with the DAG model. Again, a richer domain model can help. If the test task declares that it uses, for example, a SourceDir object as an input value, my scenario from above could easily be solved. But you could ask: why not declare a dependsOn relation from test to SourceDir? I think this is basically what we do with this new input model, with the difference that it is more specific. Instead of just providing a dependsOn method, an input value of type SourceDir could be translated into: This is a dependsOn for the purpose of having the classpath of the production code in the runtime classpath of the tests.

- Hans

--
Hans Dockter
Gradle Project Manager
http://www.gradle.org
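To make the DAG discussion above concrete, here is a small sketch of inferring task dependencies from declared input and output artifacts. The names (`ArtifactGraph`, `declareInput`, `declareOutput`, `inferredDependencies`) and the use of plain strings for tasks and artifacts are assumptions for illustration only; nothing here is an existing Gradle API.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not Gradle's API.
class ArtifactGraph {
    private final Map<String, Set<String>> producersByArtifact = new LinkedHashMap<>();
    private final Map<String, Set<String>> inputsByTask = new LinkedHashMap<>();

    void declareOutput(String task, String artifact) {
        producersByArtifact.computeIfAbsent(artifact, k -> new LinkedHashSet<>()).add(task);
    }

    void declareInput(String task, String artifact) {
        inputsByTask.computeIfAbsent(task, k -> new LinkedHashSet<>()).add(artifact);
    }

    // Every task that produces one of the given task's inputs becomes a dependency.
    // Inputs with no known producer (e.g. built by another tool) add nothing.
    Set<String> inferredDependencies(String task) {
        Set<String> dependencies = new LinkedHashSet<>();
        for (String artifact : inputsByTask.getOrDefault(task, Collections.emptySet())) {
            dependencies.addAll(producersByArtifact.getOrDefault(artifact, Collections.emptySet()));
        }
        dependencies.remove(task);
        return dependencies;
    }
}
```

With a coarse artifact such as a shared classesDir, a consumer picks up every producer of that directory, which is exactly the two-producers scenario raised above. A finer-grained artifact (something like the SourceDir object mentioned in the message) would let the consumer depend on just one of them.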
Re: Task Optimization
Steve Appling wrote:
> I have a proof of concept implementation of some of this at
> git://github.com/sappling/gradle.git in the "opt" branch.

This looks pretty good. Some comments below.

> This includes:
> 1) A new onlyIf method on Task

We will need an overload of onlyIf() which takes a TaskAction, so that build logic implemented in Java (eg the Java plugin) can make use of this too.

> 2) A new didWork method on Task

The didWork property should default to false, until the execute() method first executes a task action, when it should change to true.

> 3) Implementations of didWork for Compile and GroovyCompile.

Could you add some unit test coverage for this?

> I don't think that we can handle Ant's Copy task in this same way.
> We may have to use a replacement, but this has other consequences.
> 4) Changes to src/samples/java/quickstart/build.gradle to demo didWork
> and onlyIf.

I don't think these belong in the quickstart sample. It would be unfortunate if users really had to think about optimisation when being introduced to their very first Gradle build. A better place for this would be in the Java plugin.

Some integration test coverage would be good too.

> 5) A start at OptimizationHelper.isNeeded method. This will require
> some additional dependency management features, so I stopped
> development until I got some feedback on this whole approach.

I'm still not convinced about this one. I'd rather get the above bits into the 0.7 release and leave this one out until we have a better idea of how it should work (which may also be in time for 0.7). If we have Task.onlyIf(), one can very easily add an equivalent of isNeeded() in their build script.

> Steve Appling wrote:
>> I am interested in ways to short circuit task execution for the
>> purpose of optimization. I would love to see some of this in 0.7 and
>> would be glad to contribute.
>>
>> Here are some ideas:
>> 1) Add an "onlyIf" method to Task that is given a closure.
>> The closure would be executed before the first action of the task and
>> would cancel execution of the task (with appropriate lifecycle
>> message) if it returned false. This closure would have as a delegate
>> an optimization container with some helper methods that would provide
>> more convenient access to change detection (among other things). Then
>> you could do:
>> mytask.onlyIf {
>>     timestampChanged 'src/main/mysrc'
>>     // or contentsChanged 'src/main/mysrc'
>> }
>>
>> 2) Running a clean should probably remove the change detection state
>> information for a project (or at least the clean task should be able
>> to be configured to do this conveniently).
>>
>> 3) I would like some general way for tasks to indicate that they did
>> anything. Perhaps task.getDidWork(). BTW, I figured out how to do
>> this for gradle's use of ant.javac and can now tell if it really
>> compiled anything.
>>
>> 4) I would like to be able to specify that a chain of dependent tasks
>> only execute a task if Task.didWork is true for all of its
>> dependents. Note that this is not always desired, so you need to be
>> able to turn this on and off. I'm not sure of the best way to
>> configure this. If we use the onlyIf method suggested above, it
>> might take another closure to check this that would be returned from
>> a "needed" method. This would look like:
>> myTask.onlyIf(needed())
>>
>> This probably should be the default for tests, but perhaps not for
>> all Tasks.
>>
>> Javac is already checking to see if the source files are out of date
>> with the classes, so I don't think that the javac task needs to use
>> the new changedetection. This would, however let you stop other
>> tasks in the chain (like test) if nothing needed to be compiled.
>> (unrelated: I would also like to see an option on compile to use
>> Ant's depend task. I think the current dependencyTracking option
>> doesn't work with the modern compiler.)
>>
>> Other types of tasks could make good use of Tom's change detection.
>>
>> 5) We probably want a command line option to be able to disable all
>> of these optimizations. Sometimes you really want to force a build
>> with no optimizations (without running clean).
>>
>> In the race for speed, Gradle will probably never catch Ant in a
>> clean build (at least while you are delegating most of the expensive
>> stuff to ant). However, most of the time developers are doing
>> incremental changes on existing systems and not running clean. In
>> this case, if Gradle can support features to conveniently bypass
>> unneeded steps, it can be much faster. Also, Gradle has a huge
>> advantage of a more maintainable and modular build specification.
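The review comments in this message (an onlyIf usable from Java, and didWork defaulting to false until an action executes) can be sketched roughly as follows. `SketchTask` and all of its methods are made up for illustration, and `java.util.function.Predicate` stands in for whatever interface a real overload would take; this is not the patch under review and not Gradle's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch, not Gradle's API.
class SketchTask {
    private final String name;
    private final List<Predicate<SketchTask>> conditions = new ArrayList<>();
    private final List<Runnable> actions = new ArrayList<>();
    private boolean didWork = false; // stays false until an action executes
    private boolean skipped = false;

    SketchTask(String name) {
        this.name = name;
    }

    // Java-friendly onlyIf: a typed predicate instead of a Groovy closure.
    SketchTask onlyIf(Predicate<SketchTask> condition) {
        conditions.add(condition);
        return this;
    }

    SketchTask doLast(Runnable action) {
        actions.add(action);
        return this;
    }

    void execute() {
        skipped = false;
        didWork = false;
        // All conditions are checked before the first action runs.
        for (Predicate<SketchTask> condition : conditions) {
            if (!condition.test(this)) {
                skipped = true;
                return;
            }
        }
        for (Runnable action : actions) {
            didWork = true; // default rule: executing any action counts as work
            action.run();
        }
    }

    // Lets an action that found nothing to do override the default.
    void setDidWork(boolean didWork) {
        this.didWork = didWork;
    }

    String getName() { return name; }
    boolean getDidWork() { return didWork; }
    boolean isSkipped() { return skipped; }
}
```

The `setDidWork` hook is there because of the objection raised earlier in the thread: an action such as a compile may run and still change nothing, so it needs a way to report that.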
Re: Task Optimization
Adam Murdoch wrote:
> Steve Appling wrote:
>> I have a proof of concept implementation of some of this at
>> git://github.com/sappling/gradle.git in the "opt" branch.
>
> This looks pretty good. Some comments below.
>
>> This includes:
>> 1) A new onlyIf method on Task
>
> We will need an overload of onlyIf() which takes a TaskAction, so that
> build logic implemented in Java (eg the Java plugin) can make use of
> this too.
>
>> 2) A new didWork method on Task
>
> The didWork property should default to false, until the execute() method
> first executes a task action, when it should change to true.
>
>> 3) Implementations of didWork for Compile and GroovyCompile.
>
> Could you add some unit test coverage for this?

implementation. I wanted to get feedback about this general approach. I'll add unit/integration tests and more javadoc comments.

>> I don't think that we can handle Ant's Copy task in this same way.
>> We may have to use a replacement, but this has other consequences.
>> 4) Changes to src/samples/java/quickstart/build.gradle to demo didWork
>> and onlyIf.
>
> I don't think these belong in the quickstart sample. It would be
> unfortunate if users really had to think about optimisation when being
> introduced to their very first Gradle build. A better place for this
> would be in the Java plugin.

This was not intended to ship as part of the quickstart samples, but was a convenient place to show the syntax for comment.

> Some integration test coverage would be good too.
>
>> 5) A start at OptimizationHelper.isNeeded method. This will require
>> some additional dependency management features, so I stopped
>> development until I got some feedback on this whole approach.
>
> I'm still not convinced about this one. I'd rather get the above bits
> into the 0.7 release and leave this one out until we have a better idea
> of how it should work (which may also be in time for 0.7).
> If we have Task.onlyIf(), one can very easily add an equivalent of
> isNeeded() in their build script.

--
Steve Appling
Automated Logic Research Team
Re: Task Optimization
Steve Appling wrote:
> Hans Dockter wrote:
>> <snip>
>>
>>> 4) I would like to be able to specify that a chain of dependent
>>> tasks only execute a task if Task.didWork is true for all of its
>>> dependents.
>>
>> I don't fully understand this. Could you explain this a bit more?
>>
>> <snip>
>>
>> - Hans
>
> Sure - I did not express that very well at all. I also wrote it
> before I attempted an implementation, so I think I have a better idea
> of what might be needed now.
>
> In the syntax that I implemented, you could say:
> test.onlyIf { isNeeded() }
>
> I wanted this to be able to look at the TaskDependencies for the test
> task and only execute if Task.didWork was true for one of them. I was
> not able to figure out how to use TaskDependencies to accomplish this.
> task.getTaskDependencies(task) only returns the tasks that are
> explicitly added using dependsOn and doesn't seem to take into account
> the tasks needed to build the artifacts in the configurations that are
> contained in the TaskDependencies object.

It should return both types of tasks. However, it will only start doing this once all the projects have been evaluated, which is fine for implementing isNeeded().

Adam
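The isNeeded() idea described in this message (run a task only if one of the tasks it depends on did work) could be sketched like this. `DepTask` and its fields are hypothetical and deliberately tiny; the sketch assumes the dependency list is already fully resolved, which, as noted above, is only true once all projects have been evaluated. Treating a task with no dependencies as needed is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Gradle's API.
class DepTask {
    final String name;
    final List<DepTask> dependsOn = new ArrayList<>();
    boolean didWork;

    DepTask(String name, boolean didWork) {
        this.name = name;
        this.didWork = didWork;
    }

    // Needed if any direct dependency did work. A task with no dependencies
    // has nothing to go on, so this sketch assumes it is needed.
    static boolean isNeeded(DepTask task) {
        if (task.dependsOn.isEmpty()) {
            return true;
        }
        for (DepTask dependency : task.dependsOn) {
            if (dependency.didWork) {
                return true;
            }
        }
        return false;
    }
}
```

This is the "approximation" criticised earlier in the thread: it says nothing about inputs that no task produces, so it complements input-based change detection rather than replacing it.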
Re: Task Optimization
I've made a Jira issue for this as a new feature and submitted a patch. See http://jira.codehaus.org/browse/GRADLE-533

This is just a starting point to enable users to manually specify optimization conditions. We need to add some built-in support for optimizing the tasks added by the built-in plugins. I had problems last week when I was trying to do this. I'll try to work on this more tomorrow and start another thread with a description of the issues I encountered.

--
Steve Appling
Automated Logic Research Team