-
Notifications
You must be signed in to change notification settings - Fork 42
Refactor value types #93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor value types #93
Conversation
almost done, got bigger and bigger would be good to get some 1st feedback on the changes, IMHO it simplifies usage and the code base despite the need of a bit more boiler plate code I also see some more improvements, e.g. further separating writing and merging (requires some thought on the generator), but I prefer doing this in separate follow up PR's. |
keyvi::dictionary::DictionaryCompiler<keyvi::dictionary::fsa::internal::SparseArrayPersistence<BucketT>, | ||
keyvi::dictionary::fsa::internal::IntInnerWeightsValueStore> | ||
compiler(value_store_params); | ||
keyvi::dictionary::DictionaryCompiler<keyvi::dictionary::dictionary_type_t::INT_WITH_WEIGHTS> compiler( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe here we can use types defined in dictionary_types.h
?
* Comparison for the string value store which does not use length prefixes | ||
*/ | ||
template <class PersistenceT, class HashCodeTypeT = int32_t> | ||
struct RawPointerForCompareString final { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this will be removed once current StringValueStore
is deprecated, right ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we have to keep the reader for some time, but the writing part can be removed as part of the reimplementation.
JSON_VALUE_STORE_DEPRECATED = 4, // !< JsonValueStoreDeprecated | ||
JSON_VALUE_STORE = 5, // !< JsonValueStore | ||
INT_INNER_WEIGHTS_VALUE_STORE = 6, // !< IntInnerWeightsValueStore | ||
KEY_ONLY = 1, //!< NullValueStore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this refactoring is a good time to use scoped enums here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent Idea!
class IntInnerWeightsValueStore final : public IntInnerWeightsValueStoreBase { | ||
public: | ||
explicit IntInnerWeightsValueStore(const keyvi::util::parameters_t& parameters = keyvi::util::parameters_t()) {} | ||
IntInnerWeightsValueStore(const IntInnerWeightsValueStore& that) = delete; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we delete these methods in base class, we don't need to delete them in derived ones.
*/ | ||
class IntInnerWeightsValueStore final : public IntInnerWeightsValueStoreBase { | ||
public: | ||
explicit IntInnerWeightsValueStore(const keyvi::util::parameters_t& parameters = keyvi::util::parameters_t()) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can implement this constructor in base class, and later just make use of constructor inheritance.
|
||
class IntInnerWeightsValueStoreAppendMerge final : public IntInnerWeightsValueStoreBase { | ||
public: | ||
explicit IntInnerWeightsValueStoreAppendMerge( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this constructor needed for append merger ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, because the generator constructs ValueStore if it isn't passed into it, similar to the AddValue problem. We should look into this in 1 change.
@@ -99,6 +106,13 @@ class IntInnerWeightsValueStoreReader final : public IValueStoreReader { | |||
std::string GetValueAsString(uint64_t fsa_value) const override { return std::to_string(fsa_value); } | |||
}; | |||
|
|||
template <> | |||
struct Dict<value_store_t::INT_WITH_WEIGHTS> { | |||
typedef IntInnerWeightsValueStore value_store_t; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think modern C++ is encouraging use of using
statement over typedef
.
GeneratorAdapterInterface<PersistenceT, ValueStoreT>::CreateGenerator(size_t size_of_keys, | ||
const keyvi::util::parameters_t& params, | ||
ValueStoreT* value_store) { | ||
typename GeneratorAdapterInterface<ValueT>::AdapterPtr GeneratorAdapterInterface<ValueT>::CreateGenerator( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems that we are passing 2 value store related template params, maybe we can eliminate one of them ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks strange but note that ValueT is the template of the class, PersistenceT, ValueStoreT are templates of the method.
template <class PersistenceT, class ValueStoreT = fsa::internal::NullValueStore> | ||
template <int DictionaryType = fsa::internal::value_store_t::KEY_ONLY, | ||
typename ValueStoreMergeT = typename fsa::internal::Dict<DictionaryType>::value_store_merge_t, | ||
typename ValueStoreAppendMergeT = typename fsa::internal::Dict<DictionaryType>::value_store_append_merge_t> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to have this type deduction as a default behavior which is possible to override, or it should be enforced and be the only one ?
@@ -196,7 +196,7 @@ class Generator final { | |||
|
|||
// get value and mark final state | |||
bool no_minimization = false; | |||
uint64_t value_idx = value_store_->GetValue(value, &no_minimization); | |||
uint64_t value_idx = value_store_->AddValue(value, &no_minimization); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hendrikmuhs looks like this function is not used anymore, and can be deleted. What do you think ?
(this will also eliminate passing value_store_
to generator).
Note: Probably we can do that in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, this is basically the reason why the value stores for merging still require the constructor and the Add(key, value) method.
Still, I would like to work on that in a separate PR.
} | ||
|
||
} else { | ||
if (KeyDeleted(segment_it.segmentIndex(), top_key) == false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe if ( ! KeyDeleted(...))
?
} | ||
} | ||
|
||
void Write(std::ostream& stream) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I doubt if we need this function for merger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's afaik not used at the moment, but that's true for the compiler as well. Not added, but just some strange merge artifact.
I did a couple of the suggested re-factorings. Thanks for the review @narekgharibyan! Looks much better now. Would be nice if you can have another look. I also changed vector to use the new template typing system. Separation is still not perfect, to be continued in #102 |
/** | ||
* An IntValue store which uses the values as inner weights. | ||
*/ | ||
class IntInnerWeightsValueStore final : public IntInnerWeightsValueStoreBase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems we don't need this class anymore. Maybe it's worth to delete it and just rename IntInnerWeightsValueStoreBase
-> IntInnerWeightsValueStore
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With #102 the AddValue(...)
methods will move out of the base class into here.
|
||
uint64_t GetValue(const char* p, uint64_t v, bool* no_minimization) { return v; } | ||
class IntValueStore final : public IntValueStoreBase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -37,10 +37,11 @@ | |||
namespace keyvi { | |||
namespace vector { | |||
|
|||
template <class ValueStoreT> | |||
template <keyvi::dictionary::fsa::internal::value_store_t DictionaryType> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Truth be told I'm not sure about this change, as this adds a requirement of Dict< ... >
to be defined for value store, which theoretically can be missing for some special value store, implemented for keyvi::vector
only. No strong feelings though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds more like a naming issue to me: I haven't thought about vector when calling the type Dict<...>
- thinking about it, it would make sense to rename. All this does is simplifying the selection of a value store.
Any suggestions for a better name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I think that this class should have a member for value store reader as well, which will ultimately eliminate a need of having ValueStoreFactory
as a separate class/method. Also it will ensure that someone implementing new value store, implements all it's components.
So maybe value_store_components
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good, I doubt we can get rid of the factory as this is part of runtime abstraction but I think we can do some improvements
@hendrikmuhs This PR is really great, looking forward to have this merged! And more idea to be discussed later on: In a long run I see |
I improved the naming, there were some overlaps, I hope this improves readability. The readers are also now part of the new helper type. @narekgharibyan Can you create an issue for your idea about |
3727b53
to
0fcf738
Compare
fixes #93
old:
todo: