Replies: 6 comments
-
Hi Theodor! 👋 I like the idea and propose the following. We keep the existing function and add one or two new variants. Variant 1 is simple but only works with Java's `Map`. Variant 2 is universal and does not assume a specific cache type. By contract, the cache needs to be mutable; we can't check that at function call time. This works the same way for `Function1..N`; `Function0` does not need a cache argument because it only provides one value.

```java
// variant 1
Function2<T1, T2, R> memoized() {
    return memoized(new HashMap<>());
}

// variant 2
Function2<T1, T2, R> memoized() {
    var cache = new HashMap<Tuple2<T1, T2>, R>();
    return memoized(cache::get, cache::put);
}

// variant 1, e.g. works with Java's WeakHashMap
Function2<T1, T2, R> memoized(Map<Tuple2<T1, T2>, R> cache) {
    ...
}

// variant 2, e.g. works with Guava's LoadingCache
Function2<T1, T2, R> memoized(Function<Tuple2<T1, T2>, R> get, BiConsumer<Tuple2<T1, T2>, R> put) {
    ...
}
```
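To make the variant-2 contract concrete, here is a minimal sketch of how such a memoizer could be implemented for a one-argument function. It uses plain `java.util.function` types instead of Vavr's `Function1`/`Tuple2` so it is dependency-free; the class and method names are illustrative, not Vavr's actual API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Function;

// Illustrative sketch only, not Vavr's real implementation.
public class Memoizer {

    // Variant 2: the caller supplies the cache as a get/put pair,
    // so any backing store (HashMap, WeakHashMap, Guava, ...) works.
    public static <T, R> Function<T, R> memoized(
            Function<T, R> f,
            Function<T, R> cacheGet,
            BiConsumer<T, R> cachePut) {
        return t -> {
            R cached = cacheGet.apply(t);
            if (cached != null) {
                return cached; // cache hit: skip recomputation
            }
            R result = f.apply(t);
            cachePut.accept(t, result);
            return result;
        };
    }

    // Variant 1-style convenience overload backed by a HashMap.
    public static <T, R> Function<T, R> memoized(Function<T, R> f) {
        Map<T, R> cache = new HashMap<>();
        return memoized(f, cache::get, cache::put);
    }
}
```

Note one simplification in this sketch: a `null` result is indistinguishable from a cache miss and would be recomputed on every call; a production version would need a sentinel or `Optional` wrapping to cache `null` properly.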
-
Hi Daniel 😄 I like your second proposal. Actually, both variants look good to me, but personally I'd use the second one more (through function composition I could do a lot of fun things). However, let me clarify this: are you saying that you prefer to externalize the responsibility for providing a caching solution rather than creating our own mutable data structure for caching? Maybe we could use the second variant and also provide a module, such as vavr-mutable-ds, containing our default implementation for local caching.
-
Hi :)
When a cache removes items after a certain time, memoization will not work as expected. Example: we memoize a function that returns random values. The memoized version will return constant values if the cache keeps items forever; if the cache loses items, the random number will probably change the next time we ask the memoized function. However, Vavr is built with immutability in mind, so by default we expect that re-calculated values are the same as before, and the user needs to keep in mind that they might change if their function isn't pure. Therefore, it is perfectly ok if we implement a mutable cache. What do you think?
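The point above can be demonstrated with a small sketch: once a result is cached, even an impure "random" function behaves as a constant function for a given argument, for as long as the cache keeps the entry. This is an illustrative snippet, not Vavr's implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;
import java.util.function.Function;

// Illustrative demo: memoization pins an impure function's results.
public class ImpureMemoDemo {

    public static <T, R> Function<T, R> memoized(Function<T, R> f) {
        Map<T, R> cache = new HashMap<>();
        return t -> cache.computeIfAbsent(t, f);
    }

    public static void main(String[] args) {
        Random rng = new Random();
        Function<Integer, Integer> impure = seed -> rng.nextInt();
        Function<Integer, Integer> memo = memoized(impure);

        int first = memo.apply(42);
        int second = memo.apply(42);
        // Same argument, same cached value, even though the underlying
        // function is impure. A cache that evicts entries would break
        // this invariant the next time the entry is recomputed.
        System.out.println(first == second); // prints "true"
    }
}
```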
-
I strongly agree. Here we are focusing on pure data structures and pure functions, and memoization itself cannot make sense if a function is not pure. Furthermore, forcing the usage of any memoization technique on an impure function leads to a completely different behavior anyway. Considering your example, if we memoize a random generator we get the behavior of a constant function.
Unfortunately I don't. During the past week I've been studying the implementation of Guava's LocalCache. I'm going to draft our version during the Christmas holidays.
-
Please be aware that soft references are problematic as a caching approach. They are collected in LRU order, which may mean a flood of low-value objects forces out more useful ones. They also had a tendency to cause GC thrashing by filling the old gen: a stop-the-world event recovering just enough to unpause, with the cycle repeating. Newer collectors are more aggressive, which means less caching. There is still a higher cost for the GC to handle soft-reference processing, so the overall penalty is probably higher than expected. Guava's cache was originally designed and optimized for this approach. That looked great in microbenchmarks but suffered in practice. It was non-trivial to inherit and refocus it towards a size-based approach, which is the recommended configuration. Caffeine optimizes for that, at the cost of soft-reference support being a little more expensive footprint-wise. I'd recommend against soft references in practice and suggest considering other approaches (weak is okay).
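For reference, the size-based eviction recommended above can be sketched with nothing but the JDK: a `LinkedHashMap` in access order with `removeEldestEntry` overridden gives a simple LRU cache that could be handed to the `memoized(Map)` overload proposed earlier. The class name and `maximumSize` parameter are illustrative; production code would more likely reach for Caffeine.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal size-bounded LRU cache, sketched on the JDK's LinkedHashMap.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maximumSize;

    public LruCache(int maximumSize) {
        super(16, 0.75f, true); // accessOrder = true gives LRU ordering
        this.maximumSize = maximumSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least-recently-accessed entry once the bound is hit.
        return size() > maximumSize;
    }
}
```

Note this sketch is not thread-safe; concurrent use would need external synchronization or a dedicated caching library.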
-
Thanks for the input @ben-manes, that's interesting! This might be the reason there isn't a SoftHashMap in Java's standard library. I agree, we should provide an API (as suggested above) that is independent of the cache implementation. We will describe possible caching solutions in the Javadoc, with hints about their drawbacks.
-
I'd like to discuss the possibility of improving the current memoization.
Right now we are relying on java.util.HashMap for storing the values computed by a memoized function. Thus, we are using hard references to store each pair <$argument, $result>, so if the function is called with a wide range of arguments it will lead to a huge amount of memory being consumed. This is even more critical when multiple memoized functions are used in the same codebase.
I'd like to address this by proposing an alternative solution, or a new API alongside the current one.
We could create an API
or a new module vavr-memoization having an API such as
A soft-memoized function is a memoized function using a Map<SoftReference<$argument>, SoftReference<$result>>. Using soft references allows the GC to manage the memoization's memory consumption by removing some pairs <$argument, $result> when needed.
Also, for the soft-memoization we could implement something like Guava's LocalCache with softValues.
A soft-memoization is also useful for scenarios where, during a given continuous time window Tn, the function is called with arguments from a limited set of possible arguments Sn, and in the next time window Tn+1 the set Sn+1 is composed of other values. Sooner or later, the GC will remove from memory all the previously cached values which are no longer used.
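One way to sketch the soft-memoization idea is shown below. The sketch keeps keys strongly referenced and soft-wraps only the values, since `SoftReference` has no value-based `equals`/`hashCode` and would defeat map lookups if used as a key. Class and method names are hypothetical, not a proposed Vavr API.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative sketch of soft-value memoization: results are held via
// SoftReference, so the GC may reclaim them under memory pressure, in
// which case the function is simply recomputed on the next call.
public class SoftMemoizer {

    public static <T, R> Function<T, R> softMemoized(Function<T, R> f) {
        Map<T, SoftReference<R>> cache = new HashMap<>();
        return t -> {
            SoftReference<R> ref = cache.get(t);
            R cached = (ref == null) ? null : ref.get();
            if (cached != null) {
                return cached; // still cached, GC has not reclaimed it
            }
            R result = f.apply(t); // first call, or recompute after GC
            cache.put(t, new SoftReference<>(result));
            return result;
        };
    }
}
```

Even this sketch illustrates ben-manes's point above: eviction timing is entirely up to the GC, whereas a size-based policy makes the memory bound explicit.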