This repository contains an analysis of the Soft Actor-Critic (SAC) algorithm in the context of decision-focused learning for solving the set covering problem.
We show the pros and cons of SAC in this context: although it achieved better performance than on-policy algorithms, it still converges more slowly than they do.
This problem was partially solved by introducing the Prioritized Experience Replay technique.
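
For readers unfamiliar with Prioritized Experience Replay, the idea can be sketched as follows: transitions are sampled from the replay buffer with probability proportional to a priority (typically the magnitude of the TD error), and importance-sampling weights correct the resulting bias. This is a minimal illustrative sketch, not the code used in this repository; the class name `PrioritizedReplayBuffer`, its methods, and the default values of `alpha`, `beta` and `eps` are assumptions chosen for the example.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay (hypothetical sketch, list-based)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # how strongly priorities skew sampling (0 = uniform)
        self.eps = eps          # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0            # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions receive the current maximum priority,
        # so they are likely to be replayed before being re-scored.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        # Sampling probability: P(i) = p_i^alpha / sum_k p_k^alpha
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        n = len(self.data)
        idxs = random.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights: w_i = (N * P(i))^(-beta),
        # normalised here by the largest weight in the batch.
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        batch = [self.data[i] for i in idxs]
        return batch, idxs, weights

    def update_priorities(self, idxs, td_errors):
        # After a learning step, re-score the sampled transitions.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a SAC training loop, one would typically multiply each sample's critic loss by its weight and then call `update_priorities` with the new TD errors. The linear-time sampling above is kept for readability; larger buffers usually rely on a sum-tree instead.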
A more exhaustive analysis of the experiments and results can be found in the report.