Character-Level-NER-using-Bidirectional-LSTM

It's a well known problem in the field of Natural Language Processing(NLP) where we need to find the Named Entities given in a sentence

Problem Statement

There are various categories of named entities in a sentence which include B-Pers, B-ORG, B-LOC and many others. Given a sentence we need to identify these Named Entities. Eg : New Delhi is very much polluted. Here the named entity is New Delhi which falls under the category of B-LOC. The dataset here is taken from the Twitter Annotated Dataset which includes both English and Hindi words along with various hashtags.

Approach

We mainly approach this problem at a Character Level and for this we use Bidirectional LSTM i.e. 2 LSTMs one forward and one backward t preserve th context of the words. We mark all the characters in the Named Entity using BIO where the boundaries are marked with B and all the characters within it are marked with I and the rest are marked with O. For out of vocabulary characters, we can use Out of Vocabulary token provided by Keras Tokenizer or we can also use our particular version using character dictionary.

Tools Used

Python, Tensorflow, Keras

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
Dataset		Dataset
.gitignore		.gitignore
CharacterLevel_NER.ipynb		CharacterLevel_NER.ipynb
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Character-Level-NER-using-Bidirectional-LSTM

Problem Statement

Approach

Tools Used

About

Releases

Packages

Languages

demonpratapdemon/Character-Level-NER-using-Bidirectional-LSTM

Folders and files

Latest commit

History

Repository files navigation

Character-Level-NER-using-Bidirectional-LSTM

Problem Statement

Approach

Tools Used

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages