Issue 611: add simple logger - Fast Downward issue tracker

Title	add simple logger
Priority	feature	Status	resolved
Superseder		Nosy List	florian, jendrik, malte
Assigned To	jendrik	Keywords
Optional summary

Created on 2015-12-09.16:40:15 by jendrik, last changed by jendrik.

Messages
msg4981 (view)	Author: jendrik	Date: 2015-12-14.11:25:50
Merged and pushed, thanks guys!
msg4978 (view)	Author: florian	Date: 2015-12-12.23:45:00
Fine with me.
msg4977 (view)	Author: jendrik	Date: 2015-12-12.23:34:04
Thanks for explaining! I forgot to answer Florian's questions: this issue is only concerned with adding the logger, not with actually using it. It's already used by the CEGAR code, but we didn't want to sneak the logger in as a side-effect of merging the CEGAR code. I moved g_log into globals.h to mimic g_rng and so we can write "g_log" instead of Utils::g_log. I also believe that we should accept the runtime penalty of logging the memory usage for now, since we should try not to log too excessively anyway. Are you guys fine with me merging the code?
msg4976 (view)	Author: malte	Date: 2015-12-12.21:30:36
Will add the experiment to the repository once I'm done waiting for Bitbucket. Memory freed by a program is rarely returned to the OS, and it's not really possible to know if and when it happens without knowing a lot about the implementation details of the memory allocators (malloc, operator new and friends). Hence, VmSize tends to be harder to predict/interpret than VmPeak. Let's say I allocate 1000 data structures, each of which take up something like 1 MB. Let's say at this point VmSize and VmPeak will be something like 1 GB. I then release the memory for 999 of the data structures. VmPeak will remain at 1 GB. What will VmSize be? It depends on - whether the memory allocator internally used sbrk or mmap (which in turn usually depends mainly on how large each individual allocation is, but it can also depend on whether we ran out of contiguous address space for the heap due to shared libraries mapped into the program's address space) - whether the data structures we did not return was close to the start or the end of our program's memory barrier - whether we did any other allocations in between, and if yes, if they could be satisfied with spare memory that was already under our program's control or had to be satisfied from the memory - whether the allocator library tends to return memory to the OS whenever possible or hang on to it to satisfy future memory requests without going to the OS. As a consequence of this, VmSize in the example can be anywhere between 1 MB and 1 GB, and tiny changes of compiler/library/program can make the difference between these two.
msg4975 (view)	Author: jendrik	Date: 2015-12-12.21:04:29
I'd be interested in the micro benchmark. Can you push it to the issue611 branch at https://bitbucket.org/jendrikseipp/downward, please? While searching for alternatives for measuring memory usage, I asked myself why we report VmPeak instead of VmSize, the current amount of virtual memory? I guess this is due to the fact that we originally only reported memory usage at the end of program execution when VmSize doesn't give us much information. During execution however, wouldn't it be more informative to use VmSize? If so, we could read /proc/self/statm instead of /proc/self/status where VmSize is the first number in the file, given in number of memory pages (http://man7.org/linux/man-pages/man5/proc.5.html). Retrieving the number of bytes per page seems rather hacky, however (http://man7.org/linux/man- pages/man2/getpagesize.2.html, see "notes" paragraph).
msg4973 (view)	Author: malte	Date: 2015-12-12.19:56:58
It's not worth it -- on 32-bit systems with tasks that consume lots of memory, we hit an address space barrier for sbrk, at which points mmap is used internally all the time. So using sbrk would make the output very inaccurate.
msg4970 (view)	Author: florian	Date: 2015-12-12.18:49:21
We already have two versions of functions that determine peak memory: get_peak_memory_in_kb and get_peak_memory_in_kb_reentrant. (The second one is slower but can be used in exit handlers.) Maybe we should add a third alternative which is inaccurate but fast using the system call? On the other hand, spending 0.25 additional seconds while generating a 1MB log seems acceptable, so I'm also fine with leaving it as it is.
msg4969 (view)	Author: malte	Date: 2015-12-12.13:42:25
> If we use it in time critical places, get_peak_memory_in_kb could slow > things down. I guess this can only happen if we log excessively, which causes other problems, too (such as running out of disk space to store the logs). I think it's good to have some quantitative info on the overhead, so I ran a microbenchmark for get_peak_memory_usage and ran it on my notebook. Let me know if I should commit this somewhere, otherwise I'll just delete it again. Output: Running nothing 1000000 times: 0.00204 seconds Running get_peak_memory_in_kb 1000000 times: 10.4001 seconds So one call takes about 10 microseconds, which is quite a lot. If, say, a typical output line is 40 bytes long, then producing 1 MB of output where every line is prefixed with current peak memory takes about 0.25 seconds just for computing the peak memory usage. This is perhaps still acceptable if we don't log excessively (which we shouldn't), but it's also non-negligible. Out of curiosity, I also ran a test with the sbrk system call instead of get_peak_memory_in_kb, which is almost good enough to determine peak memory, but not quite because it ignores memory allocated with mmap. sbrk() is about 4000 times faster than get_peak_memory_in_kb: Running nothing 1000000 times: 0.001637 seconds Running get_peak_memory_in_kb 1000000 times: 10.4792 seconds Running sbrk 1000000 times: 0.00262 seconds
msg4968 (view)	Author: florian	Date: 2015-12-12.13:07:32
I agree with Malte's comment about the global logger. Otherwise it looks fine to me. I was surprised that you only introduce the class here and do not use it for existing logging output of that form. Did you want to minimize the diff? Do we have a plan about when and where to use the logger? If we use it in time critical places, get_peak_memory_in_kb could slow things down.
msg4963 (view)	Author: jendrik	Date: 2015-12-11.15:36:04
I have created a pull request at https://bitbucket.org/jendrikseipp/downward/pull-requests/38 Florian, can you have a look, please?
msg4925 (view)	Author: jendrik	Date: 2015-12-09.16:40:15
We want to add a simple logging module that supports the following usage: Log() << "States: " << num_states << endl; It prepends time and peak memory info to messages. Traditionally, we used to append this info to our planner messages, but prepending it is much easier to implement and more in line with other logging tools (e.g., the python logging module).

History
Date	User	Action	Args
2015-12-14 11:25:50	jendrik	set	status: reviewing -> resolved messages: + msg4981
2015-12-12 23:45:00	florian	set	messages: + msg4978
2015-12-12 23:34:04	jendrik	set	messages: + msg4977
2015-12-12 21:30:36	malte	set	messages: + msg4976
2015-12-12 21:04:29	jendrik	set	messages: + msg4975
2015-12-12 19:56:58	malte	set	messages: + msg4973
2015-12-12 18:49:21	florian	set	messages: + msg4970
2015-12-12 13:42:25	malte	set	messages: + msg4969
2015-12-12 13:07:32	florian	set	messages: + msg4968
2015-12-11 15:36:04	jendrik	set	status: in-progress -> reviewing messages: + msg4963
2015-12-10 11:29:56	florian	set	nosy: + florian
2015-12-09 16:40:15	jendrik	create

Issue611