Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

We need more maintainers #343

Open
MartinThoma opened this issue Jan 6, 2023 · 33 comments
Open

We need more maintainers #343

MartinThoma opened this issue Jan 6, 2023 · 33 comments

Comments

@MartinThoma
Copy link
Contributor

It seems like camelot is dead:

Besides the owner there are only 35 other contributors.

https://opencollective.com/camelot might be another way to check if it's dead.

Does anybody know more? Should we try to transfer the project to https://github.com/jazzband ?

@MartinThoma
Copy link
Contributor Author

Or is somebody there who would like to become a maintainer?

@vinayak-mehta
Copy link
Member

Hi @MartinThoma, sorry for not being responsive here. I've been busy with some life stuff for some time now and haven't had the mindspace to look into the issues here. I've been wanting to get back into it, I'll look into them over this weekend.

I also want to stop being a single point of failure here and would love to get help maintaining camelot going forward.

@MartinThoma MartinThoma changed the title Project Governance: Is camelot dead? Project Governance: Is camelot IS NOT dead 🎉 Jan 6, 2023
@vinayak-mehta vinayak-mehta changed the title Project Governance: Is camelot IS NOT dead 🎉 We need more maintainers Jan 8, 2023
@vinayak-mehta vinayak-mehta pinned this issue Jan 8, 2023
@foarsitter
Copy link
Contributor

Is there any movement on this topic?

@ramSeraph
Copy link

@vinayak-mehta Would you mind if I post this on the Indian FOSS and opendata channels? This tool has been extremely helpful in dealing with the PDF crap Indian Government puts out.

@foarsitter
Copy link
Contributor

foarsitter commented Feb 8, 2023

Sorry for my inpatientence but the show must go on. In order to take camelot to production I created a fork and released it to pypi under camelot-fork==0.20.0. My intentions are limited and I hope this project finds new maintainers soon.

When the request came to extract tables from pdf files I thought it would be very tricky job but camelot does it all. Therefore I want to express my gratitude to all of you that made that possible.

If people are in need of a fix I'm willing to accept pull request as long as they have test-coverage.

@vinayak-mehta
Copy link
Member

Sorry for being a bit unresponsive since I created this issue. I've pushed a release based on @MartinThoma's last PR: https://pypi.org/project/camelot-py/

@MartinThoma Thank you for the PR, would you like to be added to the github org so that you have push access to the repo?

@foarsitter Are you interested in maintaining the project here instead of the fork? I can add you to the github org too.

@MartinThoma
Copy link
Contributor Author

MartinThoma commented Feb 26, 2023

Thank you for making a new release 🙏

would you like to be added to the github org so that you have push access to the repo?

I would probably not be super active as I spend most of my time with pypdf. If that is ok for you, then yes, please add me :-) I could probably go over a couple of the PRs / introduce update CI so that maintaining the library becomes easier :-)

@vinayak-mehta
Copy link
Member

@MartinThoma That would be awesome! Just sent you an invite ✉️

@foarsitter
Copy link
Contributor

@vinayak-mehta should be awesome

@MartinThoma
Copy link
Contributor Author

MartinThoma commented Feb 26, 2023

Are there any rules I should follow, e.g.

  1. Reviews: When I make a PR, should I ask you (or somebody else) for a review before merging it? (Here is the first one, btw: MAINT: Drop Python 3.6 from CI + skip failing Mac tests #356 😄 )
  2. Commit Messages: e.g. something like https://docs.scipy.org/doc/scipy/dev/contributor/development_workflow.html#writing-the-commit-message ?
  3. Merges: Merge / Squash+Merge / Rebase+Merge: I prefer squash+merge, but you seem to do normal merges only. Is sqash+merge ok?
  4. Tests: Do you have any hard rules in regards to unit tests, e.g. that every new feature needs to have full test coverage?

@foarsitter
Copy link
Contributor

foarsitter commented Feb 26, 2023

As I see it:

  1. a review is always good, unless it is something realy trivial. Be patiënt. Reverting releases because we are to eager is something we should want to avoid.
  2. a commit message should be clear about its contents, which style applied is less important to me. As I see it we can generate a changelog based on the titles of the merged pull requests (see my fork: https://github.com/foarsitter/camelot/releases)
  3. If there are more commit message needed because there are changes in various parts of the codebase then a squash seems not a good fit to me, so it depends on the PR.
  4. Adding tests afterwards is really hard, even harder when you are not the author of the code. So full coverage is recommended here in my point of view. If the addition is trivial the test should be trivial too right?

@vinayak-mehta
Copy link
Member

@foarsitter I just invited you, sorry it took so long

I'm a big fan of the scikit-learn contributing guidelines.

  1. We should go for at least 1 review.
  2. I agree with @foarsitter's point.
  3. I'm gonna lean towards squash + merge because it's easier to revert, just in case we need to do that. It should also lead to small PRs which would be easier to review. If we have PRs with major enhancements that touch various parts of the codebase, then squashing might not make sense.
  4. I agree with @foarsitter's point.

@nmstoker
Copy link

I'd be interested in contributing, particularly to the docs initially.

It seems to me there's a lot of value in this repo, but things seem to have got into a fairly confusing state. Devoting time to it, I think I've figured out most of the misunderstandings I had and it seems like it's worth sharing / updating docs, so that others don't fall into the exact same traps I (and others) did.

Correct me if I'm wrong but I get the sense that what has made things harder overall is that the migration to pdftopng/poppler backend was in progress yet not completed when the maintenance fell away (quite reasonably given world events!)

@foarsitter 's idea of a fork that cuts pdftopng out is interesting, although I would feel more comfortable if it was directly part of the main repo.

How feasible is it to make the "base" install be equivalent to the fork (ie such that it doesn't install pdftopng as a requirement)?
And with that, introduce a "pdftopng" extra requires option so people can optionally try it and then - only once it's deemed to work sufficiently well - it is switched to be what gets delivered with "base" at some later point. Presumably for that to happen there needs to be a bit of maintenance upstream in pdftopng too. If this last paragraph is best discussed in a separate issue, that's fine by me, just say 🙂

@MartinThoma
Copy link
Contributor Author

@nmstoker I cannot answer that question, but at least I could review/merge PRs with documentation updates :-) so if there are specific learnings you want to share, I would support you :-)

@nmstoker
Copy link

Sounds a good start, thanks @MartinThoma !

@foarsitter
Copy link
Contributor

@nmstoker Looking forward to your learnings!

@kshitiz305
Copy link
Contributor

After using the products for a long time in my developer career, I just started my contribution to Camlot with my in initial pull request for (#364). I would love to contribute to other projects as well. Thanks

@bosd
Copy link

bosd commented Jul 14, 2023

How about Excalibur?? That might need some ❤️ as well.
There is still an open refresh issue on windows which makes it unusuable.

P.s. I'm happy to contribute/maintain a bit on both projects.

@foarsitter
Copy link
Contributor

Looking forward to your contributions @bosd

@foarsitter
Copy link
Contributor

@MartinThoma I'm willing to help where I can!

@python3-dev
Copy link

@MartinThoma : Please pull me in. I would like to contribute to the code.

@ammadakram
Copy link

I can fix the PdfFileReader deprecation error, please pull me in.

@bosd
Copy link

bosd commented Mar 22, 2024

I can fix the PdfFileReader deprecation error, please pull me in.

@ammadakram Can you please open a PR here: https://github.com/py-pdf/pypdf_table_extraction

@jatinchhabriya
Copy link

jatinchhabriya commented Aug 20, 2024

@MartinThoma @vinayak-mehta @bosd I am facing the same error as Kushal,
Expected Output: List of tables
Standard Output since this week: "Attribute Error: File Format not supported". Could you please let me know if a fix has been deployed on the forked branch, this was working a week ago and for my particular use case lattice boundary provided exclusively in camelot-py[cv] is required.

@bosd
Copy link

bosd commented Aug 20, 2024

Please try the code from the new repo.
If the problem exists, please open a issue there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests