We started doing some simple benchmarking of the planner on OS X and ran into a
few issues. The idea was to report on the website what performance to expect on
which platform, and to get a clearer idea of whether we should recommend some
platform over another. (In the past, we've occasionally warned people against
using OS X for benchmarking since the planner is essentially untested on that
platform.)
Here's what I wrote about this in an email:
===========================================================================
[I wrote] a script for some very informal and anecdotal
benchmarking. To run it, just get the most recent code, ideally using a
clean clone, go to new-scripts, and run
$ ./run-calibration-test
This will probably take an hour or so. It'd be good to keep the machine
quiet while you're running this.
Alternatively, or rather additionally, it'd also be interesting to see
what happens if you run multiple instances of this in parallel, but I
fear that currently each instance would need its own checkout since they
would clobber each other's temporary files. If you have k GB of RAM,
don't run more than k/2 instances in parallel.
The script will generate up to three log files (the third one only if
something goes wrong) named "calibration-test.*". I'd need those, as
well as the output of
$ hg identify
and
$ cat /proc/cpuinfo
(or if that doesn't exist on a Mac, whatever other info identifies your
CPU).
Once we're happy that everything went fine, it'd be good to put this
info up on the wiki. I'll contribute data for a bunch of Linux machines.
===========================================================================
Emil reported the following problems/bits of feedback (this is now a bit older,
so might no longer be current):
- The VAL Makefile didn't work out of the box because it linked statically.
This can be fixed the same way as with our own code.
- To get CPU info on the Mac, run /usr/sbin/system_profiler.
- "date" on OS X doesn't print nanoseconds, so the script's
'date +%s.%N' doesn't work. 'date +%s' does, but only has second resolution.
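One portable workaround, assuming Python is available (which the planner
requires anyway), is to take the timestamp via Python's time module rather
than GNU date; this is only a sketch, not what the script actually does:

```python
import time

# time.time() returns a float with sub-second resolution on both
# Linux and OS X, so it can stand in for 'date +%s.%N', which only
# works with GNU date.
print("%.9f" % time.time())
```

From a shell script, this could be invoked as
`python -c 'import time; print("%.9f" % time.time())'`.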
It further transpired that there were additional problems:
=============================================================================
some of the output looks weird, e.g. this:
> Testing lazy search with h^cea...
> Solving satellite/p30-HC-pfile10.pddl...
> Translator: [145.500s CPU, 90.416s wall-clock]
> Plan cost: 280
> Evaluated 4048 state(s).
> Search: 16.22s
> Peak memory: 778332 KB
> Elapsed wall-clock time: 125.000 seconds
> Plan valid
The line relating to the translator is strange because CPU time should
not exceed wall-clock time. (What that line suggests is that the
translator ran for a total of 90.416 seconds real-time and that during
that time it used the CPU for 145.5 seconds.)
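The invariant that is violated here can be checked directly. The following
sanity check is hypothetical (it is not part of the calibration script) and
assumes a single-threaded workload, for which CPU time can never legitimately
exceed wall-clock time:

```python
import os
import time

start_wall = time.time()
start_cpu = sum(os.times()[:2])  # user + system CPU time of this process

# Burn some CPU so the measurement is non-trivial.
x = 0
for i in range(10 ** 6):
    x += i

cpu = sum(os.times()[:2]) - start_cpu
wall = time.time() - start_wall

# For a single-threaded process, CPU time should not exceed wall-clock
# time (modulo timer granularity). The broken os.times() multiplier on
# OS X violated exactly this invariant.
print("CPU: %.3fs, wall-clock: %.3fs" % (cpu, wall))
assert cpu <= wall + 0.1, "CPU time exceeds wall-clock time"
```

On an affected Python build, the assertion would fail because os.times()
reports inflated CPU times, matching the 145.5s-CPU/90.4s-wall output above.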
=============================================================================
[...]
=============================================================================
There was a bug in Python a while ago where on some OSes os.times() was
off due to using a wrong multiplier internally. I remember that one well
because I was the one who reported it. Let me see if I can dig out
something relevant...
Here is the issue: http://bugs.python.org/issue1040026
=============================================================================
The Python bug was apparently present in Python 2.6.1 but fixed in 2.6.2 and
2.7. So requiring a sufficiently recent Python version should be enough.
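A sketch of such a version guard (hypothetical, not part of the actual
scripts; the version numbers come from the discussion above):

```python
import sys

# The os.times() multiplier bug (http://bugs.python.org/issue1040026)
# was reportedly fixed in Python 2.6.2 and 2.7, so refuse to run on
# anything older.
if sys.version_info < (2, 6, 2):
    sys.exit("Error: Python >= 2.6.2 required; os.times() is unreliable "
             "in older versions on some platforms.")
print("Python version OK: %s" % sys.version.split()[0])
```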