These are simple scripts for scraping tweets and running some analysis on them. Here, we search for IPL tweets and analyse them using the Gemini LLM. You can also use the JS snippets on their own for scraping and then do your own analysis.
- Go to the Twitter Explore section: https://twitter.com/explore
- Add `scrape.js` and `auto_scroll.js` to Chrome DevTools as snippets under the `Sources` section.
- Search for your query. Using Twitter advanced search to filter out spam tweets and apply other filters is highly recommended.
- Run the `scrape` snippet.
- Run the `auto_scroll` snippet.
- Wait until you are satisfied with the number of tweets scraped. You can watch the console for logs.
- Once you get rate limited or your search reaches the bottom of the results, log the variable `tweets` in the console. You can then right-click it and choose "Copy object".
- In the `data/` folder, create a new JSON file and paste your object into it.
- Now you can merge all the files into one by running the `merge.py` script.
- Run the `run_genai.py` file after entering your Gemini API key in it. This will run through the tweets and create a file `analysed.json` in the `results/` directory.
- Use `preprocess.py` to make sure the results data is in a consistent format.
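
If you want to see roughly what the merge step involves, here is a minimal sketch. It is not the actual `merge.py`: it assumes each file in `data/` holds a JSON array of tweet objects, and the function name `merge_json_files` and the output file name are made up for illustration. Exact duplicates (which can happen when the same tweet gets scraped in two sessions) are dropped by comparing the serialized objects.

```python
import json
from pathlib import Path


def merge_json_files(data_dir, output_path):
    """Merge every *.json file in data_dir into one list and write it out.

    Assumption: each input file contains a JSON array of tweet objects.
    """
    merged = []
    seen = set()
    output_path = Path(output_path)

    for path in sorted(Path(data_dir).glob("*.json")):
        # Skip an earlier merge result if it lives in the same folder.
        if path.resolve() == output_path.resolve():
            continue
        with path.open(encoding="utf-8") as f:
            items = json.load(f)
        for item in items:
            # Serialize with sorted keys so identical objects compare equal.
            key = json.dumps(item, sort_keys=True)
            if key not in seen:
                seen.add(key)
                merged.append(item)

    with output_path.open("w", encoding="utf-8") as f:
        json.dump(merged, f, ensure_ascii=False, indent=2)
    return merged
```

For example, `merge_json_files("data", "merged.json")` would read every JSON file under `data/` and write the combined, de-duplicated list to `merged.json`.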