PATECTGAN performs poorly with categorical values, and seems to have been broken since at least 0.2.1. Continuous values work OK in PATECTGAN, and categorical and continuous both work OK with DPCTGAN, so this doesn't appear to be a bug in the conditional vector sampler.
repro code:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from snsynth.pytorch.nn import PATECTGAN

size = 10000
eps = 3.0

# Two columns with 5 categories each, all mass is on diagonal
np_data_xy = (
    np.array([np.arange(0, size) % 5, (np.arange(0, size) % 5) * 10]).astype(np.int16).T
)
df = pd.DataFrame(np_data_xy, columns=["x", "y"])
sns.scatterplot(data=df, x="x", y="y")
plt.show()

# train and synthesize
dpgan = PATECTGAN(epsilon=eps)
dpgan.train(np_data_xy, categorical_columns=[0, 1])
synth_data = dpgan.generate(size)
synth_df = pd.DataFrame(synth_data, columns=["x", "y"])
sns.scatterplot(data=synth_df, x="x", y="y")
sns.kdeplot(data=synth_df, x="x", y="y", levels=5, alpha=0.5, fill=True)
plt.show()
expected: density plot with most mass on the diagonal
observed: density plot with mass evenly spread across all combinations
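The expected/observed difference can also be checked numerically rather than by eye. The sketch below is not part of the original report: `diagonal_mass` is a hypothetical helper that returns the fraction of rows where `y == 10 * x`, which is the diagonal in the repro data. It uses only the standard library, so it runs without the synthesizer installed.

```python
def diagonal_mass(rows, scale=10):
    """Fraction of (x, y) rows that satisfy y == scale * x.

    Hypothetical helper for this repro: `scale=10` matches how the
    second column is built from the first in the repro data.
    """
    rows = list(rows)
    if not rows:
        raise ValueError("rows must be non-empty")
    on_diagonal = sum(1 for x, y in rows if y == scale * x)
    return on_diagonal / len(rows)


# Same construction as the repro data: every row lies on the diagonal.
real = [(i % 5, (i % 5) * 10) for i in range(10000)]
print(diagonal_mass(real))  # 1.0

# All 25 category combinations equally likely, i.e. independent columns.
uniform = [(x, y * 10) for x in range(5) for y in range(5)]
print(diagonal_mass(uniform))  # 0.2
```

Calling `diagonal_mass(synth_data)` on the generated array should give a value close to 1.0 if the synthesizer preserved the dependency. A value near 0.2 corresponds to the observed behavior, where mass is spread evenly over all 5 x 5 combinations.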