About the behavior types in the training data #3

Open
XGodLike opened this issue Apr 23, 2020 · 2 comments

@XGodLike

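I looked at the preprocessing of the Taobao training data, and the behavior type does not seem to be used. Is that right? Also, is the timestamp used only for sorting, without taking the time gaps between behaviors into account?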

@nwf5d

nwf5d commented Sep 3, 2020

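The paper says the dataset contains clicks, purchases, and other behaviors, and that only the click behavior data is used: sort by time, take 200, and predict the last one. The data preprocessing indeed does not filter by behavior type. So do add-to-cart, favorite, purchase, and click all count as clicks?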

Taobao Dataset3 is a collection of user behaviors from Taobao’s recommender system [12]. The dataset contains several types of user behaviors including click, purchase, etc. It contains user behavior sequences of about one million users. We take the click behaviors for each user and sort them according to time in an attempt to construct the behavior sequence. Assuming there are T behaviors of user u, we use the former T-1 clicked products as features to predict whether users will click the T-th product. The behavior sequence is truncated at length 200.
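
To make the quoted procedure concrete, here is a minimal Python sketch of what filtering by behavior type would look like. It is not the repository's preprocessing code. The function name `build_click_samples`, the row layout `(user_id, item_id, behavior_type, timestamp)`, and the click label `"pv"` are all assumptions made for illustration. Keeping the most recent 200 history items is also an assumption, since the paper only says the sequence is truncated at length 200.

```python
from collections import defaultdict

MAX_SEQ_LEN = 200  # truncation length quoted from the paper


def build_click_samples(rows, keep_types=("pv",), max_len=MAX_SEQ_LEN):
    """Build one (history, target) pair per user from raw behavior rows.

    rows: iterable of (user_id, item_id, behavior_type, timestamp) tuples
          (assumed layout, not taken from the repository).
    keep_types: behavior types treated as clicks; "pv" is an assumed label.
    """
    per_user = defaultdict(list)
    for user_id, item_id, behavior_type, timestamp in rows:
        # The behavior-type filter that the thread says is missing.
        if behavior_type in keep_types:
            per_user[user_id].append((timestamp, item_id))

    samples = {}
    for user_id, events in per_user.items():
        if len(events) < 2:
            continue  # need at least one history item plus a target
        # Timestamps only decide the order; time gaps are discarded.
        events.sort(key=lambda event: event[0])
        items = [item_id for _, item_id in events]
        history, target = items[:-1], items[-1]
        # Assumption: truncation keeps the most recent max_len items.
        samples[user_id] = (history[-max_len:], target)
    return samples
```

Passing every behavior type in `keep_types` mimics the unfiltered preprocessing described in this thread, where add-to-cart, favorite, and purchase events all end up in the "click" sequence.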

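Also, regarding the last question above (whether the timestamp is used only for sorting, without taking the time gaps between behaviors into account): you are right. Time gaps are indeed not considered; the user's entire history is merged into the embeddings of several channels. I suggest taking another look at the paper.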

@shuDaoNan9

Does anyone have an open-source TF2 implementation?
