Issue404

Title iPDB random walk evaluates every state
Priority wish Status resolved
Superseder Nosy List florian, jendrik, malte, silvan
Assigned To silvan Keywords
Optional summary

Created on 2013-12-17.11:06:58 by florian, last changed by silvan.

Messages
msg2879 (view) Author: silvan Date: 2014-01-05.15:47:18
Marking this one as resolved.
msg2878 (view) Author: silvan Date: 2014-01-05.15:47:02
I merged and pushed. Thank you for your feedback.
msg2876 (view) Author: malte Date: 2014-01-05.15:41:35
Looks good. If you've done a spot-check on 1-2 instances that the behaviour is
as before, I don't think we need another experiment.
msg2874 (view) Author: silvan Date: 2014-01-05.15:39:05
I integrated your suggested changes.
Do you also want another experiment here?
msg2868 (view) Author: silvan Date: 2013-12-30.19:51:54
I created an rietveld issue; for the moment, only Malte has access.
https://codereview.appspot.com/41330044/

I don't think the code got much complicated, but on the other side, there is no
real visible improvement. I think we could merge it in, though, but I am fine
with both choices.
msg2865 (view) Author: malte Date: 2013-12-30.19:35:50
Thanks! I'd like to have a brief look at the patch.

I assume it complicates the code, so we should probably only merge it if we
think the tradeoff between code effort and time saved is worth it. 

To see if there is a statistically significant difference, I looked at the
per-domain total time scores, comparing the domains where they differ and
checking if the difference can be interpreted as randomness. Assuming I counted
correctly, the improvement seems to be highly statistically significant, with a
p level of less than 0.000013.

However, even if the effect is significant, it doesn't seem to be strong. (In
other words, it often leads to an improvement, but not by much.)

What do the others think?
msg2862 (view) Author: silvan Date: 2013-12-30.18:21:14
Here are the results:
http://ai.cs.unibas.ch/_tmp_files/sieverss/2013-12-29-issue404-abs-d.html
http://ai.cs.unibas.ch/_tmp_files/sieverss/2013-12-29-issue404-abs-p.html

There is no visible difference. If anyone wants to see the patch, let me know,
otherwise I'll merge and push soon.
msg2858 (view) Author: silvan Date: 2013-12-29.14:52:16
Okay. I will run some experiments.
msg2857 (view) Author: malte Date: 2013-12-29.11:22:16
Let's say we have N PDBs and M maximal additive subsets, which are of size K on
average.

To detect a dead end, we need to check all PDBs once: O(N). To compute the
heuristic value, we need to look up the heuristic values and then maximize over
all maximal additive subsets: O(N + M*K).

If we only care about dead ends and not heuristic values, it would be nicer to
only pay the O(N) cost rather that O(N + M*K).

How much difference does it make? Florian has looked at the relationship between
N and M more thoroughly than me, but if I recall correctly, there are many
cases, where M is much larger than N. In the scatter plots on
http://ai.cs.unibas.ch/papers/pommerening-et-al-ijcai2013-poster.pdf (where
|MAS(C)| denotes M and |C| denotes N), there is a significant number of cases
where M is 100x or 1000x larger than N. (Although this is with the Sys^2 PDB
generation method and not iPDB; I would expect iPDB to have lower values of both
M and N, but still the number of maximal additive subsets is low.)
msg2856 (view) Author: silvan Date: 2013-12-29.11:06:52
Okay. But then again, all we can do is have a method in the canonical heuristic
function that iteratively evaluates all of its PDBs and stops as soon as one of
them is a dead end for the given state. There is no way to avoid retrieving the
heuristic value from the PDB, right? And this is cheap anyway. Or do I still
miss something?
If this is correct, everything we can save is just the summation over all PDB
heuristic values when computing the heuristic value of the canonical heuristic
function.
msg2855 (view) Author: malte Date: 2013-12-28.22:40:17
The problem (or rather, wasted effort) occurs when the sampled state is not a
dead-end.Then compute_heuristic evaluates all current maximal cliques to compute
the heuristic value, which can be quite expensive. But the sampling process
doesn't care about the heuristic value, only whether or not it's a dead end.

So it would be useful to have a way to do the dead-end test without computing
the exact heuristic value in case of a non-dead-end.
msg2852 (view) Author: silvan Date: 2013-12-28.18:15:25
I am not sure if I miss something, but I do not see where there is any overhead
in the way it is currently implemented: random_walk indeed calls evaluate(), but
having a look at evaluate(), the first call there is compute_heuristic(), which
in turn does the following:

    for (size_t i = 0; i < pattern_databases.size(); ++i) {
        pattern_databases[i]->evaluate(state);
        if (pattern_databases[i]->is_dead_end())
            return -1;
    }

(the last return should be return DEAD_END; I just fixed this)

For me, this looks exactly like the way you proposed it should be implemented.
The computation of the maximal additive subsets is only triggered when a new
pattern is added to the canonical heuristic function, something which does not
happen in the process of sampling.
msg2809 (view) Author: florian Date: 2013-12-17.11:06:58
The iPDB random walk that is used to create sample states calls evaluate() on
every intermediate state to determine if it is a recognized dead end. This can
be checked cheaper by adding a specialized method to canonical heuristic that
checks whether one of its component PDBs recognizes the state as a dead end.
This would avoid the overhead of evaluating the maximal additive subsets for
these states.
History
Date User Action Args
2014-01-05 15:47:18silvansetstatus: testing -> resolved
messages: + msg2879
2014-01-05 15:47:02silvansetmessages: + msg2878
2014-01-05 15:41:35maltesetmessages: + msg2876
2014-01-05 15:39:05silvansetmessages: + msg2874
2013-12-30 19:51:54silvansetmessages: + msg2868
2013-12-30 19:35:50maltesetmessages: + msg2865
2013-12-30 18:21:14silvansetmessages: + msg2862
2013-12-29 14:52:16silvansetstatus: chatting -> testing
assignedto: silvan
messages: + msg2858
2013-12-29 11:22:16maltesetmessages: + msg2857
2013-12-29 11:06:52silvansetmessages: + msg2856
2013-12-28 22:40:17maltesetmessages: + msg2855
2013-12-28 18:15:26silvansetmessages: + msg2852
2013-12-17 12:21:29jendriksetnosy: + jendrik
2013-12-17 11:06:58floriancreate