✨ Add Core #40

Merged
aerosadegh merged 27 commits into develop from feature/core
Jul 23, 2023

ManiMozaffar
Member

@ManiMozaffar ManiMozaffar commented Jul 17, 2023

  • Implement Spider
  • Implement Crawler
  • Connect Crawler to Rocketry
  • Implement FastCrawler

closes #16

@ManiMozaffar ManiMozaffar changed the title Feature/core Implement Core Jul 17, 2023
@ManiMozaffar ManiMozaffar linked an issue Jul 17, 2023 that may be closed by this pull request
@ManiMozaffar ManiMozaffar marked this pull request as ready for review July 18, 2023 16:25
@ManiMozaffar ManiMozaffar marked this pull request as draft July 20, 2023 15:50
@ManiMozaffar ManiMozaffar marked this pull request as ready for review July 22, 2023 16:40
@ManiMozaffar ManiMozaffar changed the title Implement Core ✨ Add Core Jul 22, 2023
@ManiMozaffar ManiMozaffar self-assigned this Jul 22, 2023


class AioHttpEngine:
    default_request_limit = 1
    request_cls = Request
    response_cls = Response

Where are request_cls and response_cls used?

Member Author

Nice catch, I forgot to use them in the code. I'll push another commit now.

Member Author

self.response_cls

This one is already used, when translating the response.
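
For readers following the thread, here is a minimal sketch of the pattern under discussion: class-level `request_cls` / `response_cls` attributes used as override points. It is not the code from this PR. `SketchEngine`, `JsonResponse`, `JsonEngine`, `build_request`, `translate_response`, and the field names are all hypothetical, and no network I/O is performed.

```python
from dataclasses import dataclass


@dataclass
class Request:
    # Hypothetical fields, chosen only for illustration.
    url: str
    method: str = "GET"


@dataclass
class Response:
    # Hypothetical fields, chosen only for illustration.
    status_code: int
    text: str


class SketchEngine:
    """Hypothetical stand-in for an engine; performs no network I/O."""

    request_cls = Request
    response_cls = Response

    def build_request(self, url: str) -> Request:
        # Construct through the class attribute so a subclass can swap the type.
        return self.request_cls(url=url)

    def translate_response(self, status: int, body: str) -> Response:
        # Same idea on the way back: wrap raw data in response_cls.
        return self.response_cls(status_code=status, text=body)


class JsonResponse(Response):
    """Hypothetical subclass used to show the override point."""


class JsonEngine(SketchEngine):
    # Only the response type is overridden; request_cls is inherited.
    response_cls = JsonResponse


engine = JsonEngine()
request = engine.build_request("https://example.com")
response = engine.translate_response(200, "{}")
```

If the attributes are declared but the methods instantiate `Request` or `Response` directly, overriding them in a subclass has no effect, which is the gap the review question points at.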

        cond="every 1 second",
        controller=ProcessController(app=RocketryApplication()),
    )
    await process.add_spiders()

@aerosadegh aerosadegh Jul 23, 2023

add_spiders() is called with no arguments?!
Where are the spiders to add?

Member Author

We define them in the chain:

async def main():
    process = Process(
        spider=MySpider() >> MySpiderTwo() >> MySpiderThree(),
        cond="every 1 second",
        controller=ProcessController(app=RocketryApplication()),
    )
    await process.add_spiders()

The add_spiders method also adds them to the controller, if one is defined.
It is not meant to be called like this; usually this happens under the hood in the FastCrawler app. But if a user wants to initiate and run a process manually, that's the way to go.
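
To make the chain idea concrete, here is a minimal sketch of how `>>` chaining and an argument-free `add_spiders()` could fit together. It is an illustration under stated assumptions, not the implementation in this PR: the `next_spider` attribute, the `chain()` helper, the `registered` list, and the plain list used as a controller are all hypothetical.

```python
import asyncio
from typing import Iterator, List, Optional


class Spider:
    """Hypothetical minimal spider that can be chained with >>."""

    def __init__(self) -> None:
        self.next_spider: Optional["Spider"] = None

    def __rshift__(self, other: "Spider") -> "Spider":
        # a >> b >> c evaluates as (a >> b) >> c, so append `other`
        # at the tail of the chain and return the head.
        tail = self
        while tail.next_spider is not None:
            tail = tail.next_spider
        tail.next_spider = other
        return self

    def chain(self) -> Iterator["Spider"]:
        # Walk the linked spiders from head to tail.
        node: Optional["Spider"] = self
        while node is not None:
            yield node
            node = node.next_spider


class MySpider(Spider):
    pass


class MySpiderTwo(Spider):
    pass


class MySpiderThree(Spider):
    pass


class Process:
    """Hypothetical process; the controller is modelled as a plain list."""

    def __init__(self, spider: Spider, controller: Optional[List[str]] = None) -> None:
        self.spider = spider
        self.controller = controller
        self.registered: List[str] = []

    async def add_spiders(self) -> None:
        # No arguments needed: the spiders come from the chain given
        # at construction time.
        for spider in self.spider.chain():
            name = type(spider).__name__
            self.registered.append(name)
            if self.controller is not None:
                self.controller.append(name)


controller: List[str] = []
process = Process(
    spider=MySpider() >> MySpiderTwo() >> MySpiderThree(),
    controller=controller,
)
asyncio.run(process.add_spiders())
```

In this sketch the head spider carries the whole chain, which is why `add_spiders()` can discover every spider without being passed a list.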

@aerosadegh aerosadegh left a comment

Needs some changes:

  1. add_spiders: where do the spiders come from?
  2. request_cls: what is it used for?

@aerosadegh aerosadegh merged commit d424bf5 into develop Jul 23, 2023
@ManiMozaffar ManiMozaffar deleted the feature/core branch July 27, 2023 10:56