Skip to content

k66inthesky/Predict_News

Folders and files

NameName
Last commit message
Last commit date

Latest commit

67e357b · Nov 26, 2023

History

41 Commits
Oct 10, 2019
May 5, 2020
Oct 10, 2019
Nov 26, 2023
Oct 10, 2019
Oct 11, 2019
Oct 10, 2019
Oct 20, 2019
Oct 20, 2019
Oct 10, 2019
Oct 10, 2019
Nov 11, 2019
Oct 11, 2019
Nov 11, 2019
Oct 10, 2019
Oct 10, 2019

Repository files navigation

Implementation of the paper 2019 IMP conferrence(https://arxiv.org/abs/2204.12093) in Kaohsiung, Taiwan - Chen, C. L., Huang, P. Y., Lin J., and Huang, Y. T. 2019 Approach to Predicting News ─ A Precise Multi-LSTM Network With BERT.

While predicting news from LTN website during 2019 July 25th to August 5th, the accuracy may up to 99%. The accuracy of experiment is related to the corpus, that is, if one news including many new words, this program may do bad.

Author: 
	Lana Chen(k66inthesky@gmail.com)
	Amanda Huang(amanda10702@gmail.com)
Advisor:
	Meng-Chang Chen(mcc@iis.sinica.edu.tw)
Update: Nov.11th,2019
Target: To predict an unknown news/article from the eight categories(Technology, Finance, Politics, Entertainment, International, Sports, Health, Fashion)
File description:
	To use this project, you should download the whole package of files and the bert_model.ckpt.data-00000-of-00001,
	and follow the following install instructions.
	You only need to execute two programs--news2E.ipynb and predict.py.
	It's optional for replacinging the context in 'myinput.csv' if you wish to predict your own news.

Installation:

(Request)
-Python(3.6.2)
-Tensorflow(1.15)
-Numpy
-json
-Pandas
-Keras
-re

(Optional)
-collections
-copy
-math
-six
-shutil
-unicodedata

(MUST DOWNLOAD FILE)
"bert_model.ckpt.data-00000-of-00001"
(It's one of the file inside this folder "chinese_L-12_H-768_A-12"
from this website: https://github.com/google-research/bert)
AND PUT IT IN THE FOLDER:
"Predict_News/chinese_L-12_H-768_A-12/"

OR YOU CAN JUST DOWNLOAD THE WHOLE FOLDER "chinese_L-12_H-768_A-12"
from this website: https://github.com/google-research/bert)

Notice:

*input:
	
	CHINESE ONLY!!!
	
	Should be like the example'myinput.csv':
		two rows: 
			-context
			-your news // It can be any size, but 30*20 words per paragraph may be better.
			
*MUST CREATE AN EMPTY FOLDER, "embedding" BY YOURSELF 

File:

|-chinese_L-12_H-768_A-12
|-embedding
	|-myoutput.npy
|-news2E.ipynb #turn your input news(myinput.csv) into mytmpfile.jsonl and myoutput.npy(located at ./embedding/)
|-news2E_optional #only need to choose one of the two--news2E.ipynb or news2E_optional
|-predict.py #predict the last output to one of the eight categories 
|-extract_features.py
|-modeling.py
|-myinput.csv #it's optional for replacinging the context
|-mytmpfile.jsonl
|-optimization.py
|-our_model.h5 # it had already been trained with 28,768 corpus
|-our_model_weight.h5 # it had already been trained with 28,768 corpus
|-tokenization.py

Input(input will be a table/dataframe with only one row): image

Output(You can find the answer on the bottom): image

If you have any questions, please feel free to ask;)

About

Predict News

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published