Enable optional user traits on serializers #282

Closed

Matthew-Whitlock
Contributor

UserTraits

What: Serialize/deserialize calls are now also templated on UserTraits..., which are passed on to the base serializer. Helpers for adding, removing, and checking for the existence of a trait are added as well.
Why: User traits allow arbitrary customization of serialization behavior based on user-defined terms. This gives more flexibility in checkpointing than an extra boolean variable, and is clearer in user-space code than a predefined variable name (e.g., VT::RestoreProxies instead of s.isCheckpointRecovery()).
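
A minimal sketch of the idea, not the code in this PR: `Sizer`, `has_trait`, `sizeOf`, `app::SkipCache`, and `app::Particle` are all hypothetical names invented for illustration. It shows a serializer templated on a trait pack and a `serialize` method that changes behavior based on whether a user-defined trait is present.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for a sizing serializer; not the library's real class.
template <typename... UserTraits>
struct Sizer {
    std::size_t bytes = 0;

    template <typename T>
    constexpr void contribute(const T&) { bytes += sizeof(T); }
};

// Primary template: anything that is not a traited serializer has no traits.
template <typename Trait, typename SerializerT>
struct has_trait : std::false_type {};

// True when Trait appears anywhere in the serializer's trait pack.
template <typename Trait, template <typename...> class SerializerT, typename... UserTraits>
struct has_trait<Trait, SerializerT<UserTraits...>>
    : std::bool_constant<(std::is_same_v<Trait, UserTraits> || ...)> {};

namespace app {
struct SkipCache {};  // hypothetical user-defined trait tag

struct Particle {
    double position[3] = {0, 0, 0};
    double cached_energy = 0;

    template <typename SerializerT>
    constexpr void serialize(SerializerT& s) {
        s.contribute(position);
        // The trait is checked at compile time, so the untraited path pays nothing.
        if constexpr (!has_trait<SkipCache, SerializerT>::value) {
            s.contribute(cached_energy);
        }
    }
};
}  // namespace app

// Sizes an object using a serializer that carries the requested traits.
template <typename... UserTraits, typename T>
constexpr std::size_t sizeOf(T obj) {
    Sizer<UserTraits...> s{};
    obj.serialize(s);
    return s.bytes;
}
```

With these definitions, `sizeOf(app::Particle{})` counts all four doubles, while `sizeOf<app::SkipCache>(app::Particle{})` counts only the three position components.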

Serialization overriding

What: When a non-intrusive serializer and an intrusive serializer are both found, call the non-intrusive serializer.
Why: Non-intrusive serializers can follow up with calls to the intrusive copy, so a non-intrusive serializer can act as a form of serialization subscriber. With user traits in particular, we can define non-intrusive serializers that only run when a particular trait is used on a given object, allowing third parties (e.g., KokkosResilience) to perform custom logic before serialization of arbitrary objects. This can also be used to attach to the serialization of objects more generally, for purposes such as tracing.
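
The override order can be sketched as follows. This is an illustration under assumptions, not the dispatcher in this PR: `Serializer`, `dispatch`, `has_free_serialize`, `serializeWidget`, `tracing::Trace`, and `app::Widget` are hypothetical, and the counters exist only to make the call order observable.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical serializer that records which serialize functions ran.
template <typename... UserTraits>
struct Serializer {
    int hook_calls = 0;       // non-intrusive (free function) invocations
    int intrusive_calls = 0;  // intrusive (member function) invocations
};

namespace tracing {
struct Trace {};  // hypothetical user trait

// Non-intrusive serializer, enabled only when Trace is among the traits.
template <typename T, typename... Traits>
constexpr std::enable_if_t<(std::is_same_v<Trace, Traits> || ...)>
serialize(Serializer<Traits...>& s, T& obj) {
    ++s.hook_calls;
    obj.serialize(s);  // follow up with the intrusive serializer
}
}  // namespace tracing

namespace app {
struct Widget {
    int value = 0;

    template <typename SerializerT>
    constexpr void serialize(SerializerT& s) { ++s.intrusive_calls; }
};
}  // namespace app

// Detects whether an unqualified serialize(s, obj) call is well-formed.
template <typename S, typename T, typename = void>
struct has_free_serialize : std::false_type {};

template <typename S, typename T>
struct has_free_serialize<
    S, T, std::void_t<decltype(serialize(std::declval<S&>(), std::declval<T&>()))>>
    : std::true_type {};

// When both forms exist, the non-intrusive one is called.
template <typename S, typename T>
constexpr void dispatch(S& s, T& obj) {
    if constexpr (has_free_serialize<S, T>::value) {
        serialize(s, obj);
    } else {
        obj.serialize(s);
    }
}

// Helper for demonstration: serializes a Widget and returns the counters.
template <typename... Traits>
constexpr Serializer<Traits...> serializeWidget() {
    Serializer<Traits...> s{};
    app::Widget w{};
    dispatch(s, w);
    return s;
}
```

Here `serializeWidget<tracing::Trace>()` runs the hook once and then the member function once, while `serializeWidget<>()` runs only the member function.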

For examples of these changes in action, see checkpoint_example_user_traits.cc/hpp


template<typename Trait, template<typename...> typename SerializerT, typename... UserTraits>
auto& withoutTrait(SerializerT<UserTraits...>& obj){
return reinterpret_cast<typename removeTrait<Trait, SerializerT, UserTraits...>::type&>(obj);
Contributor

Is there a way to do this without the reinterpret_cast? I think this is UB.

Contributor Author

There's some difficulty since the serializers are expected to be used through the same reference. I've been looking into it, and technically any mechanism we use to refer to the same memory by the two different types is undefined behavior. The user traits aren't used to change any members, so we should be fine on memory layout, but strict aliasing rules could cause issues with compilers keeping member variables in different registers.

I haven't tested, but I think we could ensure correct reads/writes by:

  1. Requiring that all serializers be stored as volatile variables, or that all non-reference/non-pointer members of serializers be declared volatile
  2. Requiring that all serializer types be declared with the may_alias attribute, which should be available on gcc/clang/icc. I expect this would have more side effects on compiler optimizations.

Any thoughts?
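
For comparison, one way to sidestep the cast entirely is sketched below. It is a different technique from the two options above and relaxes the "same reference" expectation: all mutable state lives in a single non-templated core, and the traited type is only a cheap view that is re-created, by value, when a trait is removed. Every name here (`SerializerCore`, `SerializerView`, `RemoveTraitImpl`, `TraitA`, `TraitB`, `sizeWithAndWithoutTrait`) is hypothetical and not part of this PR.

```cpp
#include <cstddef>
#include <type_traits>

// All mutable serializer state lives in one object with a single type.
struct SerializerCore {
    std::size_t bytes = 0;
};

// Trait-carrying handle: stores only a reference to the shared core.
template <typename... UserTraits>
struct SerializerView {
    SerializerCore& core;

    template <typename T>
    constexpr void contribute(const T&) { core.bytes += sizeof(T); }
};

template <typename... Ts>
struct TypeList {};

// Walks the trait pack, keeping every trait that is not the one being removed.
template <typename Trait, typename Kept, typename... Remaining>
struct RemoveTraitImpl;

template <typename Trait, typename... Kept>
struct RemoveTraitImpl<Trait, TypeList<Kept...>> {
    using type = SerializerView<Kept...>;
};

template <typename Trait, typename... Kept, typename Head, typename... Tail>
struct RemoveTraitImpl<Trait, TypeList<Kept...>, Head, Tail...> {
    using type = typename std::conditional_t<
        std::is_same_v<Trait, Head>,
        RemoveTraitImpl<Trait, TypeList<Kept...>, Tail...>,
        RemoveTraitImpl<Trait, TypeList<Kept..., Head>, Tail...>>::type;
};

// Returns a new view over the same core; no reinterpret_cast is involved.
template <typename Trait, typename... UserTraits>
constexpr auto withoutTrait(SerializerView<UserTraits...> view) {
    using Result = typename RemoveTraitImpl<Trait, TypeList<>, UserTraits...>::type;
    return Result{view.core};
}

struct TraitA {};  // hypothetical trait tags
struct TraitB {};

// Both views write into the same core, so the sizes accumulate.
constexpr std::size_t sizeWithAndWithoutTrait() {
    SerializerCore core{};
    SerializerView<TraitA, TraitB> traited{core};
    traited.contribute(1);    // int
    auto stripped = withoutTrait<TraitA>(traited);
    stripped.contribute(1.0); // double
    return core.bytes;
}
```

The trade-off is that code holding a `SerializerView<TraitA, TraitB>&` cannot be handed the stripped view through that same reference; callees would need to accept views by value or be templated on the view type.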

@Matthew-Whitlock
Contributor Author

@nmm0 @lifflander I've been considering the issues we discussed a while ago about virtual serialization - could you take a glance at this and let me know if it seems viable to try implementing?

The problem was that, for classes with inheritance, we need to call a virtual function to get to the appropriate serialize function. Since virtual functions can't be templated, we have to pre-register each serializer's type and do some shenanigans to cast an untyped object to the right serializer and go from there. The UserTraits aren't known beforehand, so we can't register the types of the traited serializers, and therefore can't virtually serialize with the traits.

Based on a fun trick from this blog, I've worked out how to create an insertable type list. Types add themselves to the list of inheritors for the type they inherit from; the inherited class does nothing. The dispatcher can then use that list to build templated calls that try to convert to any derived class.

#include <cstdio>
#include <type_traits>
#include <string>

template<typename T, int index>
struct Inheritors;


//This bit inspired by https://devblogs.microsoft.com/oldnewthing/20190711-00/?p=102682
//Evaluates whether T has an inheritor defined for a given index at the code location this is initialized at
//Multiple calls w/ same parameters will always return same value, regardless of future changes.
//Uses Requester parameter to allow each inheritor to check independently.
template<typename T, typename Requester, int index = 0, typename = void>
constexpr bool hasPrevInheritor = false;
template<typename T, typename Requester, int index>
constexpr bool hasPrevInheritor<T, Requester, index, std::void_t<decltype(sizeof(Inheritors<T,index>))>> = true;

template<typename T, typename Requester, int offset=0>
constexpr int numPrevInheritorsImpl(){
    if constexpr(hasPrevInheritor<T,Requester,offset>) {
        return 1 + numPrevInheritorsImpl<T,Requester,offset+1>();
    } else {
        return 0;
    }   
}
template<typename T, typename Requester>
constexpr int numPrevInheritors = numPrevInheritorsImpl<T,Requester>();

#define DeclareInheritance(Parent, Child) \
    template<>                                                   \
    struct Inheritors<Parent, numPrevInheritors<Parent, Child>>{ \
        using type = Child;                                      \
    }



//These get you "final" results, independent of call ordering
template<typename T, int index, typename = void>
struct GetInheritor {
    using type = void;
    static constexpr bool exists = false;
};
template<typename T, int index>
struct GetInheritor<T, index, std::void_t<typename Inheritors<T,index>::type>> {
    using type = typename Inheritors<T,index>::type;
    static constexpr bool exists = true;
};



template<typename T>
void serialize(const T& obj){
    printf("Serializing %s as %s\n", obj.getRealType().c_str(), obj.getCurrentType().c_str());
}

template<int index = 0, typename T>
void polymorphicSerialize(const T& obj){
    if constexpr(GetInheritor<T, index>::exists) {
        using TryType = typename GetInheritor<T, index>::type;

        const TryType* objAs = dynamic_cast<const TryType*>(&obj);

        if(objAs != nullptr){
            polymorphicSerialize(*objAs);
        } else {
            polymorphicSerialize<index+1>(obj);
        }
    } else {
        serialize(obj);
    }
}


struct A {
    std::string getCurrentType() const {
       return "A";
    }
    virtual std::string getRealType() const {return getCurrentType();}
};
struct AA : public A {
    std::string getCurrentType() const {
       return "AA";
    }
    virtual std::string getRealType() const override {return getCurrentType();}
};
struct AB : public A {
    std::string getCurrentType() const {
       return "AB";
    }
    virtual std::string getRealType() const override {return getCurrentType();}
};

DeclareInheritance(A, AA);
DeclareInheritance(A, AB);

int main(int argc, char** argv){

    if(GetInheritor<A, 0>::exists){
        printf("A has an inheritor\n");
    } else {
        printf("A has no inheritors\n");
    }
    /*
    //These values cannot safely be referenced directly, since they are not order-independent.
    //Should only be used by the macro
    printf("A had %d inheritors when AA asked\n", numPrevInheritors<A, AA>);
    printf("A had %d inheritors when AB asked\n", numPrevInheritors<A, AB>);
    printf("A still had %d inheritors when AA asked\n", numPrevInheritors<A, AA>);
    */
    AB ab;
    A& a = ab;

    printf("\nNon-poly serialize a:\n");
    serialize(a);
    printf("Non-poly serialize ab:\n");
    serialize(ab);

    printf("\nPoly serialize a:\n");
    polymorphicSerialize(a);
    printf("Poly serialize ab:\n");
    polymorphicSerialize(ab);
}

//Could even declare after main
//DeclareInheritance(A, AA);
//DeclareInheritance(A, AB);

@PhilMiller
Member

Seeing this revised after a while, this has gone way down a rabbit hole of type system magic. Is there such a pressing concern for performance or compile-time checking that these sorts of traits can't be tracked as some sort of dynamic runtime attribute? E.g. a bitset with indices assigned by position of the attribute in the set of requested attributes? Heck, those attributes could be types in a registry, if you wanted. Then dispatch to the specialized serialization logic could happen based on whether a serializer has a given bit set.
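
A rough sketch of what that runtime alternative might look like, assuming a registry that hands each attribute type a bit index on first use. `RuntimeSerializer`, `traitIndex`, `nextTraitIndex`, `kMaxTraits`, `RestoreProxies`, and `Widget` are hypothetical names, and the index assignment is not thread-safe as written.

```cpp
#include <bitset>
#include <cstddef>

constexpr std::size_t kMaxTraits = 64;  // arbitrary capacity chosen for this sketch

// Hands out the next unused bit index.
inline std::size_t nextTraitIndex() {
    static std::size_t next = 0;
    return next++;
}

// Registry: each trait type gets one stable index the first time it is requested.
template <typename Trait>
std::size_t traitIndex() {
    static const std::size_t index = nextTraitIndex();
    return index;
}

// The serializer has a single concrete type; traits are runtime bits.
class RuntimeSerializer {
public:
    template <typename Trait>
    RuntimeSerializer& with() {
        traits_.set(traitIndex<Trait>());  // throws std::out_of_range past kMaxTraits
        return *this;
    }

    template <typename Trait>
    RuntimeSerializer& without() {
        traits_.reset(traitIndex<Trait>());
        return *this;
    }

    template <typename Trait>
    bool has() const {
        return traits_.test(traitIndex<Trait>());
    }

private:
    std::bitset<kMaxTraits> traits_;
};

struct RestoreProxies {};  // hypothetical attribute tag

struct Widget {
    int proxy_fixups = 0;

    // Specialized logic is selected by a runtime branch rather than by type.
    void serialize(RuntimeSerializer& s) {
        if (s.has<RestoreProxies>()) {
            ++proxy_fixups;
        }
    }
};
```

Because `RuntimeSerializer` is one concrete type, removing an attribute is a plain `without<RestoreProxies>()` call on the same object, and a virtual `serialize(RuntimeSerializer&)` needs no per-trait registration. What this form does not provide is overload selection on the trait type.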

@Matthew-Whitlock
Contributor Author

The main motivation for using type-based traits is to allow non-intrusive third-party serialization hooks.

For example, KokkosResilience could serialize (or just size) a generic object with a KokkosResilience::ProxyReferences trait and gain fully-typed references to any VT Proxies that object holds as a member. During deserialization for recovery, we could verify that any referenced proxies are available and have the same proxy ID, and if not either recover the other object or replace the proxy with the updated ID.

The compiler finds the hook regardless of the object's type or the invoking namespace, because the serializer has a KokkosResilience-namespaced template parameter.
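
The lookup rule at work is argument-dependent lookup: the associated namespaces of a class template specialization include the namespaces of its template type arguments. A small sketch, with hypothetical namespaces and names (`core`, `resilience`, `app`, `checkpoint`, `hooksRun`) standing in for the real libraries:

```cpp
namespace core {
// Hypothetical serializer; the counter only makes the hook call observable.
template <typename... UserTraits>
struct Serializer {
    int hooks_run = 0;
};

// Generic entry point. The unqualified call is resolved by ADL at instantiation,
// so core never needs to know which third parties exist.
template <typename S, typename T>
constexpr void checkpoint(S& s, T& obj) {
    serialize(s, obj);
}
}  // namespace core

namespace resilience {  // hypothetical third-party library
struct ProxyReferences {};

// Hook for any object type, selected purely by the serializer's trait.
template <typename T>
constexpr void serialize(core::Serializer<ProxyReferences>& s, T& /*obj*/) {
    ++s.hooks_run;
}
}  // namespace resilience

namespace app {
struct Model {
    int id = 0;
};
}  // namespace app

// The call originates in core and the object lives in app, yet the hook in
// resilience is found because ProxyReferences is a template argument of s.
constexpr int hooksRun() {
    core::Serializer<resilience::ProxyReferences> s{};
    app::Model m{};
    core::checkpoint(s, m);
    return s.hooks_run;
}
```

Neither `core` nor `app` mentions `resilience`, and the hook still receives the object as a fully typed `T&`.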

@nmm0
Contributor

nmm0 commented Jun 20, 2024

Superseded by #345


3 participants