Issue 602: Integrate IPC 2014 benchmarks - Fast Downward issue tracker

Title	Integrate IPC 2014 benchmarks
Priority	wish	Status	resolved
Superseder		Nosy List	jendrik, malte, silvan
Assigned To	silvan	Keywords

Created on 2015-12-03.15:02:05 by silvan, last changed by malte.

Messages
msg4944 (view)	Author: malte	Date: 2015-12-10.15:03:50
> Should I announce the integration on the developers and/or public mailing list?

As you prefer -- perhaps it's not necessary for the public list, especially if we want to move the benchmarks out of the repository soon.
msg4943 (view)	Author: silvan	Date: 2015-12-10.14:11:33
Merged and pushed. Should I announce the integration on the developers and/or public mailing list? The next step is to also update lab.
msg4942 (view)	Author: malte	Date: 2015-12-10.13:53:07
As discussed live: feel free to remove the unneeded ones.
msg4941 (view)	Author: silvan	Date: 2015-12-10.12:28:09
Malte, we also changed the method that determines domain file names from given task file names in the driver script, and added a unit test to make sure that we can determine a domain file name for all task file names. When running the test, we noted that two out of five ways of looking for file names are not required. If the candidate list of domain file names was as follows before:

    domain_filenames = [
        'domain.pddl',
        taskfilename[:4] + 'domain.pddl',
        taskfilename[:3] + '-domain.pddl',
        'domain_' + taskfilename,
        taskfilename[:-13] + "-domain.pddl"
    ]

then the following is enough to pass the test:

    domain_basenames = [
        'domain.pddl',
        taskfilename[:3] + '-domain.pddl',
        'domain_' + taskfilename,
    ]

Was the list of possible domain file names used for other purposes as well? Can we remove the unused ways of matching names?
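For illustration, the reduced lookup could be sketched roughly as follows. This is not the code from the driver script: the function names `candidate_domain_basenames` and `find_domain_filename` and the injectable `exists` parameter are hypothetical and only chosen to make the idea testable without touching the file system.

```python
import os


def candidate_domain_basenames(task_basename):
    """Return the three candidate domain file names, in search order."""
    return [
        "domain.pddl",
        task_basename[:3] + "-domain.pddl",
        "domain_" + task_basename,
    ]


def find_domain_filename(task_filename, exists=os.path.exists):
    """Return the first candidate next to the task file that exists.

    `exists` is a hypothetical hook so that a test can pass in a fake
    file-existence check. Raises ValueError if no candidate is found.
    """
    dirname, basename = os.path.split(task_filename)
    for candidate in candidate_domain_basenames(basename):
        path = os.path.join(dirname, candidate)
        if exists(path):
            return path
    raise ValueError("no domain file found for " + task_filename)
```

A unit test in this spirit would call `find_domain_filename` for every task file in the benchmarks directory and fail if any call raises.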
msg4927 (view)	Author: jendrik	Date: 2015-12-09.17:44:34
The domain and task names look fine to me.
msg4924 (view)	Author: silvan	Date: 2015-12-09.16:27:47
Jendrik insisted on reviewing this issue, hence I added him to the nosy list :-) You can find the code here: https://bitbucket.org/SilvanS/fd-dev/branch/issue602
msg4921 (view)	Author: silvan	Date: 2015-12-09.15:36:50
I started the branch over again, (hopefully) adding the files with their correct names, and also including the tasks from the agile and multicore tracks. Results for all 4 tracks can be found here (I ran our standard satisficing configurations for the agile and multicore tasks):

http://ai.cs.unibas.ch/_tmp_files/sieverss/issue602-v1-agl.html
http://ai.cs.unibas.ch/_tmp_files/sieverss/issue602-v1-mco.html
http://ai.cs.unibas.ch/_tmp_files/sieverss/issue602-v1-opt.html
http://ai.cs.unibas.ch/_tmp_files/sieverss/issue602-v1-sat.html

There are three domains with ADL requirements; however, maintenance is translated/preprocessed to a STRIPS task. I think that someone should briefly look over the domain and file names (no "." except before "pddl" etc.), but other than that, we could integrate the benchmarks. The next step would then be to update lab to include the new domains in the suites module.
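The file-name check mentioned above could be automated with something like the sketch below. The exact rule is an assumption (a name must end in ".pddl" and contain no other "."), and both function names are hypothetical.

```python
import os


def is_clean_pddl_name(filename):
    """True if the name ends in '.pddl' and contains no other '.'."""
    stem, ext = os.path.splitext(filename)
    return ext == ".pddl" and stem != "" and "." not in stem


def find_suspicious_names(benchmarks_dir):
    """Return paths of all files below benchmarks_dir that break the rule."""
    suspicious = []
    for dirpath, _dirnames, filenames in os.walk(benchmarks_dir):
        for name in sorted(filenames):
            if not is_clean_pddl_name(name):
                suspicious.append(os.path.join(dirpath, name))
    return suspicious
```

Running `find_suspicious_names` on the benchmarks directory would list the files that still need a manual look; an empty list means every file name follows the assumed rule.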
msg4866 (view)	Author: malte	Date: 2015-12-06.14:37:10
Sounds good!
msg4865 (view)	Author: silvan	Date: 2015-12-06.13:33:10
http://ai.cs.unibas.ch/_tmp_files/sieverss/issue602-issue602-v1-opt.html
http://ai.cs.unibas.ch/_tmp_files/sieverss/issue602-issue602-v1-sat.html

I had to exclude two domains from the optimal track, which none of our optimal planners support: cavediving (:requirements :typing :action-costs :adl) and citycar (:requirements :typing :equality :negative-preconditions :action-costs :conditional-effects). I don't know what conclusions to draw from the experiments, except that there have been no errors. Maybe the next step is to have a look at the domains and to discuss how to name the domains and problem files if we actually want to include them.
msg4856 (view)	Author: malte	Date: 2015-12-03.17:38:43
Sounds good, thanks for taking care of this!
msg4852 (view)	Author: silvan	Date: 2015-12-03.15:02:05
For future (paper) experiments, it would be nice to have the benchmarks of the latest IPC (2014) in the Fast Downward repository (at least as long as we are still including the benchmarks in our repository). On the competition's website, version 1.1 of the benchmarks, containing bug fixes, has been released. We should probably test them with "all common" configurations, e.g. the configs listed in misc/tests/configs.py.

History
Date	User	Action	Args
2015-12-10 15:03:50	malte	set	messages: + msg4944
2015-12-10 14:11:33	silvan	set	status: chatting -> resolved messages: + msg4943
2015-12-10 13:53:07	malte	set	messages: + msg4942
2015-12-10 12:28:09	silvan	set	messages: + msg4941
2015-12-09 17:44:34	jendrik	set	messages: + msg4927
2015-12-09 16:27:47	silvan	set	messages: + msg4924
2015-12-09 16:25:55	silvan	set	nosy: + jendrik
2015-12-09 15:36:50	silvan	set	messages: + msg4921
2015-12-06 14:37:10	malte	set	messages: + msg4866
2015-12-06 13:33:10	silvan	set	messages: + msg4865
2015-12-03 17:38:43	malte	set	messages: + msg4856
2015-12-03 15:02:05	silvan	create
