KNN crashes - leaked semaphore - how to get to an error message? #508

Open
thomasf1 opened this issue May 2, 2023 · 1 comment

thomasf1 commented May 2, 2023

Description

When training multiple models on one particular dataset, the KNN model crashes during training, unfortunately without any error message.

To rule out a local issue, I reproduced the crash both on my Mac and on Google Colab. The same setup works with other data.

Locally, I get the following output:

Training data:
Number of users = 213962
Number of items = 785
Number of ratings = 712017
Max rating = 5.0
Min rating = 4.0
Global mean = 4.7
---
Test data:
Number of users = 44634
Number of items = 68
Number of ratings = 85231
Number of unknown users = 28896
Number of unknown items = 68
---
Total users = 242858
Total items = 853

[MF] Training started!

[MF] Evaluation started!
Ranking: 100%|██████████████████████████| 44634/44634 [00:09<00:00, 4825.82it/s]

[PMF] Training started!

[PMF] Evaluation started!
Ranking: 100%|███████████████████████████████████████████████████████████████████████| 44634/44634 [00:08<00:00, 5243.91it/s]

[BPR] Training started!

[BPR] Evaluation started!
Ranking: 100%|███████████████████████████████████████████████████████████████████████| 44634/44634 [00:10<00:00, 4297.13it/s]

[BiVAECF] Training started!

[BiVAECF] Evaluation started!
Ranking: 100%|███████████████████████████████████████████████████████████████████████| 44634/44634 [00:09<00:00, 4845.66it/s]

[BaselineOnly] Training started!

[BaselineOnly] Evaluation started!
Ranking: 100%|███████████████████████████████████████████████████████████████████████| 44634/44634 [00:08<00:00, 5355.77it/s]

[SVD] Training started!

[SVD] Evaluation started!
Ranking: 100%|███████████████████████████████████████████████████████████████████████| 44634/44634 [00:09<00:00, 4747.14it/s]

[MMMF] Training started!

[MMMF] Evaluation started!
Ranking: 100%|███████████████████████████████████████████████████████████████████████| 44634/44634 [00:10<00:00, 4246.24it/s]

[UserKNN-Cosine] Training started!
 31%|████████████████████████▊                                                      | 67299/213962 [00:22<00:44, 3289.25it/s]
zsh: killed     python3 testCornac.py
(base) tom@(...) % /Users/tom/miniconda/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

My guess is that no exception is shown because of the multiprocessing.
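The "zsh: killed" line in the output above suggests the process was terminated by a signal rather than by a Python exception, in which case there may be no traceback to recover. A minimal sketch for checking how the run ended, using only the standard library; the helper names `describe_exit` and `run_and_report` are hypothetical, and the negative-return-code convention applies to POSIX systems:

```python
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Translate a subprocess return code into a readable description."""
    if returncode < 0:
        # On POSIX, subprocess reports -N when the child was ended by signal N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"terminated by signal {name}"
    return f"exited with status {returncode}"


def run_and_report(script_args: list[str]) -> str:
    """Run a Python script in a child process and describe how it ended."""
    completed = subprocess.run([sys.executable, *script_args])
    return describe_exit(completed.returncode)


# Hypothetical usage with the script name from the log:
# print(run_and_report(["testCornac.py"]))
```

If the report names SIGKILL, the process was stopped from outside Python (for example by the operating system), so it is worth watching memory usage during the UserKNN step.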

In which platform does it happen?

MacOS 12.6.2
Google Colab

How do we replicate the issue?

On request, I can provide the source code and training data.

Expected behavior (i.e. solution)

No crash ;).
It would be helpful to be able to disable multiprocessing in order to see the exception that causes the crash.

Other Comments

@tqtg
Member

tqtg commented Jun 30, 2023

You can set num_threads=1 to disable multiprocessing. We're also curious about the cause, so please share more information (for example, the source code and training data you offered). Thanks.
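A minimal sketch of that suggestion, assuming the model is Cornac's `UserKNN` with cosine similarity (the name "UserKNN-Cosine" is taken from the log; `k=20` and the helper `build_user_knn` are assumptions, not the reporter's actual settings):

```python
# Keyword arguments for the KNN model. num_threads=1 is the setting
# suggested above; the other values are illustrative assumptions.
KNN_KWARGS = dict(
    name="UserKNN-Cosine",
    similarity="cosine",
    k=20,
    num_threads=1,  # single thread, so failures surface in the main process
    verbose=True,
)


def build_user_knn(**overrides):
    """Create a UserKNN model, letting callers override any default above."""
    # Imported lazily so this module loads even where cornac is not installed.
    from cornac.models import UserKNN

    return UserKNN(**{**KNN_KWARGS, **overrides})


# Hypothetical usage: pass the result in the experiment's model list.
# model = build_user_knn()
```

Running only this model, rather than the full list, shortens the time needed to reproduce the crash.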
