Skip to content

Update Readme instruction to install cython #223

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Sep 4, 2024

Conversation

praateekmahajan
Copy link
Contributor

@praateekmahajan praateekmahajan commented Aug 30, 2024

Description

Currently the installation using pip install . doesn't work. We add instructions to pre-install cython to make it work.

Root cause for cpython

nemo_toolkit depends on YouTokenToMe (and that requires Cython to exist). This is also addressed in our .github/workflows/test.yml
https://github.com/NVIDIA/NeMo-Curator/blob/2e0af575e65b73b7cf9455d36a68c5dba51a6553/.github/workflows/test.yml#L40

Even nemo_toolkit requires an explicit installation of Cython before installing nemo_toolkit
https://github.com/NVIDIA/NeMo/blob/a777a442c43cd2092684c8b8f06701ed62134a9f/README.md#pip

Future

However once nemo_toolkit 2.x is released we might be able to resolve this since NVIDIA/NeMo#8322 has been merged. In future we can also consider moving nemo_toolkit as an optional dependency since it's just used in a single place https://github.com/NVIDIA/NeMo-Curator/blob/2e0af575e65b73b7cf9455d36a68c5dba51a6553/nemo_curator/filters/code.py#L105

Why not add to install_requires?

IIUC explicitly having Cython as a dependency in install_requires within setup.py doesn't solve the issue because IIUC pip doesn't respect any specific order of installation, so when it's installing nemo_toolkit, Cython hasn't been installed by then.

Alternative solution

Tried adding build-system to the pyproject.toml however that didn't work as cython was still not found during install step.
ae26c34

Checklist

  • I am familiar with the Contributing Guide.
  • New or Existing tests cover these changes.
  • The documentation is up to date with these changes.

Signed-off-by: Praateek Mahajan <[email protected]>
Signed-off-by: Praateek Mahajan <[email protected]>
@praateekmahajan praateekmahajan changed the title Praateek/cpython build Add Cython in build-system within pyproject.toml Aug 30, 2024
@randerzander
Copy link

randerzander commented Aug 30, 2024

I just checked out your branch, and tried to install like so:

conda create --name curator python=3.10
conda activate curator
pip install .

But I still get a failure to install:

Collecting youtokentome>=1.0.5 (from nemo-toolkit[nlp]>=1.23.0->nemo_curator==0.4.0)

  Using cached youtokentome-1.0.6.tar.gz (86 kB)

  Preparing metadata (setup.py) ... error

  error: subprocess-exited-with-error

  

  × python setup.py egg_info did not run successfully.

  │ exit code: 1

  ╰─> [6 lines of output]

      Traceback (most recent call last):

        File "<string>", line 2, in <module>

        File "<pip-setuptools-caller>", line 34, in <module>

        File "/tmp/pip-install-z18e90x3/youtokentome_3bb31052da034e6e89300cb60cb5d3d8/setup.py", line 5, in <module>

          from Cython.Build import cythonize

      ModuleNotFoundError: No module named 'Cython'

      [end of output]

  

  note: This error originates from a subprocess, and is likely not a problem with pip.

error: metadata-generation-failed



× Encountered error while generating package metadata.

╰─> See above for output.



note: This is an issue with the package mentioned above, not pip.

hint: See above for details.

If I first pip install cython, then pip install . succeeds as expected, implying the pyproject.toml changes still don't successfully install cython first.

for what it's worth:

conda --version
conda 24.7.1

@praateekmahajan praateekmahajan marked this pull request as draft August 30, 2024 20:15
@praateekmahajan praateekmahajan changed the title Add Cython in build-system within pyproject.toml Update Readme instruction to install cython pybind11 Aug 30, 2024
@praateekmahajan praateekmahajan marked this pull request as ready for review September 3, 2024 23:40
@ryantwolf
Copy link
Contributor

Thanks for doing all this work debugging! Curator requires all commits to be signed (-s) and signed-off on (-S). Looks like you forgot to sign off on one of your commits. Follow step 3 in the contribution guide to fix this: https://github.com/NVIDIA/NeMo-Curator/blob/main/CONTRIBUTING.md#pull-requests-pr-guidelines

Signed-off-by: Praateek Mahajan <[email protected]>
@praateekmahajan praateekmahajan changed the title Update Readme instruction to install cython pybind11 Update Readme instruction to install cython Sep 4, 2024
Copy link
Contributor

@ryantwolf ryantwolf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@ryantwolf ryantwolf merged commit ad13c61 into NVIDIA-NeMo:main Sep 4, 2024
3 checks passed
yyu22 pushed a commit to yyu22/NeMo-Curator that referenced this pull request Oct 9, 2024
* add build-system, remove cython

Signed-off-by: Praateek Mahajan <[email protected]>

* remove commented cython

Signed-off-by: Praateek Mahajan <[email protected]>

* remove pyproject.toml changes, add readme instructions

Signed-off-by: Praateek Mahajan <[email protected]>

* Add build system

Signed-off-by: Praateek Mahajan <[email protected]>

---------

Signed-off-by: Praateek Mahajan <[email protected]>
Signed-off-by: Yang Yu <[email protected]>
yyu22 pushed a commit to yyu22/NeMo-Curator that referenced this pull request Oct 10, 2024
* add build-system, remove cython

Signed-off-by: Praateek Mahajan <[email protected]>

* remove commented cython

Signed-off-by: Praateek Mahajan <[email protected]>

* remove pyproject.toml changes, add readme instructions

Signed-off-by: Praateek Mahajan <[email protected]>

* Add build system

Signed-off-by: Praateek Mahajan <[email protected]>

---------

Signed-off-by: Praateek Mahajan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants