Using pretrained T5 model for abstractive summarization of books

saarthdeshpande/book-summarizer

Book-Summarizer

An NLP-based book summarizer that summarizes a book chapter by chapter.
If the book does not contain chapters, the entire book is summarized instead.

Why summarize a book?

  • A summary of an article, a single chapter, or a whole book aims to convey the full sense of the original as accurately as possible, in a more condensed form.
  • A summary restates the author's main point, purpose, intent and supporting details in your own words.

How does the summarizer work?

  • The summarizer is built on the pretrained T5-small model from HuggingFace Transformers.
  • Each chapter is split into chunks.
  • The chunks are tokenized using T5Tokenizer.
  • The tokenized text is passed to the T5ForConditionalGeneration model class, which generates the summary IDs.
  • The summary IDs are decoded back to text using the decode() method of T5Tokenizer.
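
The steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the helper names chunk_text and summarize_chapter, the 300-word chunk size, and the generation parameters (max_length, min_length, num_beams) are all assumptions chosen for the example. It assumes the transformers, torch, and sentencepiece packages are installed.

```python
from typing import List


def chunk_text(text: str, max_words: int = 300) -> List[str]:
    """Split text into consecutive chunks of at most max_words words.

    Hypothetical helper: the 300-word default is an assumption, picked so
    that a chunk usually fits within T5's 512-token input limit.
    """
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def summarize_chapter(chapter: str, model_name: str = "t5-small") -> str:
    """Summarize one chapter chunk by chunk and join the partial summaries."""
    # Imported lazily so chunk_text can be used without loading the model.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name)

    summaries = []
    for chunk in chunk_text(chapter):
        # T5 is a text-to-text model; the "summarize: " prefix selects the task.
        input_ids = tokenizer.encode(
            "summarize: " + chunk,
            return_tensors="pt",
            max_length=512,
            truncation=True,
        )
        # Generation settings below are illustrative assumptions.
        summary_ids = model.generate(
            input_ids,
            max_length=150,
            min_length=30,
            num_beams=4,
            early_stopping=True,
        )
        # Decode the generated IDs back to text, dropping <pad> and </s>.
        summaries.append(
            tokenizer.decode(summary_ids[0], skip_special_tokens=True)
        )
    return " ".join(summaries)
```

Calling summarize_chapter(chapter_text) on the text of one chapter would return the concatenated chunk summaries. Chunking by word count is only one option; splitting on sentence boundaries would avoid cutting a sentence in half.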

How to run the book summarizer:

  1. Clone the repository.
       git clone https://github.com/saarthdeshpande/book-summarizer.git

  2. Install all the dependencies mentioned in the requirements.txt.
       pip install -r requirements.txt

  3. To run via CLI:
       python3 bsCLI.py --path <path-to-PDF-file>

  4. To run on Flask server with frontend and mail:
    1. Update the value of sender_address and sender_pass in mail.py.
    2. Run views.py.
         python3 views.py
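
For readers curious how a command-line entry point might accept the --path option shown above, here is a minimal sketch using Python's standard argparse module. It is not the contents of bsCLI.py: the function names parse_args and main, the help text, and the file-existence check are assumptions made for illustration.

```python
import argparse
from pathlib import Path
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse command-line arguments; argv=None falls back to sys.argv[1:]."""
    parser = argparse.ArgumentParser(
        description="Summarize a book stored as a PDF file."
    )
    parser.add_argument(
        "--path",
        type=Path,
        required=True,
        help="Path to the PDF file to summarize.",
    )
    return parser.parse_args(argv)


def main(argv: Optional[List[str]] = None) -> int:
    """Hypothetical entry point: validate the path, then hand off to the summarizer."""
    args = parse_args(argv)
    if not args.path.is_file():
        print(f"File not found: {args.path}")
        return 1
    # A real CLI would extract the PDF text and summarize it here.
    print(f"Would summarize: {args.path}")
    return 0
```

In a script, main() would typically be invoked as raise SystemExit(main()) under an if __name__ == "__main__": guard.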
      

Screenshots

Home Page

Mail on Successful Processing
