[CV2-2674] google factcheck tools lookup bot (#30)
* [CV2-2674] sketching structure for managing google factcheck data from archive file (unused)
* extracting and filtering from local file working
* bot lambda working in QA and live with google-fact-check-tools and testing workspaces
* separate API keys for read and write, and adjusted query url to not include feed id
---------
Co-authored-by: Skye Bender-deMoll <[email protected]>
Shows Notes with Claim Review content, imported from https://toolbox.google.com/factcheck/explorer, that is similar to the Project Media item.

# Overview
NOTE: These instructions include references to documentation and AWS infrastructure that are not
visible outside the Meedan organization. Please contact us if you need information about any of
these resources to install this code on your own system. The installation process is also recorded on an internal wiki page: `Installing the Google factcheck tools bot in a workspace`
## Background data
For items to become available for display by the bot:
* ClaimReview objects are parsed daily by the Fetch plugin `fetch/lib/claim_review_parsers/google_fact_check.rb`
  - TODO: replace this with a more efficient ingest process, perhaps based on the code in `/ingest` (this code is currently not used)
* The google-fact-check-tools workspace https://checkmedia.org/google-fact-check-tools/project/15547 listens for new ClaimReviews, which are stored in Check under its team id.
* The items in the workspace are available for similarity queries via the Check API, with access permission determined by API key.
* NOTE: items in the project usually need to be in the 'published' state to be available for similarity matching. It may be necessary to toggle them using a script like https://github.com/meedan/check-scripts/blob/main/publish_imported_reports.rb

## Bot operation
When this bot is configured in a workspace:
* The bot listens on a webhook for new ProjectMedia creation events for a team, configured as per the internal wiki page `How to configure a webhook for a Check Bot`
```
bot_user = BotUser.where(name: "Google fact check workspace API Client")[0]
```
* The text from the PM is compared for similarity with the available set of ClaimReview items via the Check API, in a query managed by an AWS Lambda function defined in `/google-factcheck-explorer-bot-lambda`
* Any resulting ClaimReview links are written back as 'comments' on the ProjectMedia items in the workspace, to be displayed as Notes in the sidebar
* An appropriately configured API key is needed to give the bot permission to write to the workspace. This is done by mapping the BotUser to the team. See the internal wiki page `How to create an API key for a Check workspace`
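
The webhook-to-comment flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in `/google-factcheck-explorer-bot-lambda`: the function names `parseWebhookEvent` and `formatComment` are hypothetical, and joining the title and description into one query text is an assumption. Only the payload field names (`event`, `team.slug`, `data.dbid`, `data.title`, `data.description`) are taken from the sample test event in this README.

```javascript
// Hypothetical helpers; only the payload field names come from this README.

// Parse the API Gateway event into the fields the bot needs.
// Returns null for events other than ProjectMedia creation.
function parseWebhookEvent(lambdaEvent) {
  // API Gateway delivers the webhook payload as a JSON string in `body`.
  const payload = JSON.parse(lambdaEvent.body);
  if (payload.event !== "create_project_media") {
    return null;
  }
  const { dbid, title, description } = payload.data;
  return {
    teamSlug: payload.team.slug,
    projectMediaId: dbid,
    // Assumption: title and description together form the similarity query.
    text: [title, description].filter(Boolean).join(" "),
  };
}

// Format matched fact checks as the text of one comment (Note).
// `matches` is assumed to be an array of { title, url } objects.
function formatComment(matches) {
  return matches.map((m) => `${m.title}: ${m.url}`).join("\n");
}
```

Returning `null` for unrecognized events lets a handler exit early without querying the Check API.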
# Bot testing setup
## Testing the background side
* Set up a workspace in QA to host the content (open question: will this work in QA, or does it need to be done locally?)
* Configure an API key with permissions to access the workspace via a `BotUser`. See the internal wiki page `How to create an API key for a Check workspace`
* Import the Google claim review content from Fetch. See the internal wiki page `How to re-import content from Fetch into a Check workspace`
* Confirm that the feed can be queried:
```
curl -X GET -H "Accept: application/vnd.api+json" -H "X-Check-Token: <API_KEY_GOES_HERE>" "https://qa-check-api.checkmedia.org/api/v2/feeds?filter\[query\]=test"
```
which should give a response like:
```
{"data":[{"id":"20007","type":"feeds","links":{"self":"https://qa-check-api.checkmedia.org/api/v2/feeds/20007"},"attributes":{"claim":"-","claim-context":null,"claim-tags":"","fact-check-title":"Viral Test: Big B, Madhuri Dixit campaigning For Imran Khan?","fact-check-summary":"Pakistan's PTI party is using Amitabh Bachchan and Madhuri Dixit photos on their campaign posters","fact-check-published-on":1679572669,"fact-check-rating":"undetermined","published-article-url":"https://www.indiatoday.in/fact-check/story/viral-test-big-b-madhuri-dixit-campaigning-for-imran-khan-1294131-2018-07-24","organization":"Google fact check tools"}},{"id":"19384","type":"feeds","links":{"self":"https://qa-check-api.checkmedia.org/api/v2/feeds/19384"},"attributes":{"claim":"-","claim-context":null,"claim-tags":"","fact-check-title":"The Legend of the 'Pencil Death' Exam Suicide","fact-check-summary":"A student, stressed to the breaking point by the pressures of exams, committed suicide during a test by shoving pencils up his nostrils and into his brain.","fact-check-published-on":1679571320,"fact-check-rating":"undetermined","published-article-url":"https://www.snopes.com/fact-check/pencil-death/","organization":"Google fact check tools"}},{"id":"19302","type":"feeds","links":{"self":"https://qa-check-api.checkmedia.org/api/v2/feeds/19302"},"attributes":{"claim":"-","claim-context":null,"claim-tags":"","fact-check-title":"FACT CHECK: Poppy Seeds Alter Drug Test Results?","fact-check-summary":"The consumption of poppy seeds used on bagels and muffins can produce positive results on drug screening tests.","fact-check-published-on":1679571142,"fact-check-rating":"undetermined","published-article-url":"https://www.snopes.com/fact-check/poppy-seeds-alter-drug-test-results/","organization":"Google fact check tools"}}],"meta":{"record-count":3}}
```
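
The same query can be issued from Node. The sketch below is illustrative only and is not taken from the bot's source: `buildFeedQueryUrl`, `extractFactChecks` and `queryFeed` are hypothetical names, and `queryFeed` assumes a runtime with a global `fetch` (Node 18 or later). The endpoint path, the two headers and the attribute names are the ones shown in the curl example and its response.

```javascript
// Hypothetical helpers mirroring the curl example; not the bot's actual code.

// Build the feed query URL. `apiHost` is a bare host name such as
// "qa-check-api.checkmedia.org". The brackets in "filter[query]" are
// percent-encoded by URLSearchParams, so no shell-style escaping is needed.
function buildFeedQueryUrl(apiHost, queryText) {
  const url = new URL("/api/v2/feeds", `https://${apiHost}`);
  url.searchParams.set("filter[query]", queryText);
  return url.toString();
}

// Reduce a JSON:API feeds response to the fields needed for a Note.
function extractFactChecks(response) {
  return (response.data || []).map((item) => ({
    id: item.id,
    title: item.attributes["fact-check-title"],
    summary: item.attributes["fact-check-summary"],
    url: item.attributes["published-article-url"],
  }));
}

// Run the query with the read API key and return the simplified matches.
// Assumes a global fetch (Node 18+).
async function queryFeed(apiHost, apiKey, queryText) {
  const res = await fetch(buildFeedQueryUrl(apiHost, queryText), {
    headers: {
      Accept: "application/vnd.api+json",
      "X-Check-Token": apiKey,
    },
  });
  if (!res.ok) {
    throw new Error(`Feed query failed with HTTP ${res.status}`);
  }
  return extractFactChecks(await res.json());
}
```

In a deployed Lambda, `apiHost` and `apiKey` would presumably come from the `CHECK_API_URL` and `CHECK_API_GOOGLE_FACT_CHECK_ACCESS_TOKEN` environment variables described under "Deploying the AWS lambda".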
## Deploying the AWS lambda
This internal wiki page gives instructions for deploying a related bot: `How to deploy Check Slack Bot`

General AWS docs on how to deploy Lambdas: https://docs.aws.amazon.com/lambda/latest/dg/lambda-deploy-functions.html
* If this is a release, bump the version number in `package.json`
* Rename `config.js.example` to `config.js` (`config.js` is git-ignored to avoid committing secrets)
* Run `npm install` to install all the required libraries locally so they will get packaged up by the build for deployment.
* Run `npm run build`; this runs the top-level build script in `package.json` and creates a `google-factcheck-explorer-bot-lambda.zip` file with the bot script and all of its requirements

* For the first deployment, create a Lambda via the AWS web console, similar to https://eu-west-1.console.aws.amazon.com/lambda/home?region=eu-west-1#/functions/qa-google-factcheck-explorer-bot
* The Lambda needs an API Gateway trigger set up so that there is an external HTTP endpoint that can be called.
* The endpoint URL from the trigger needs to be set as the '`<webhook>`' when setting the bot configuration, as per the instructions on the internal wiki page `How to configure a webhook for a Check Bot`
* The Lambda timeout can be increased to 3 minutes on the Configuration tab
* Update the secrets and config appropriate to the environment (live/QA) in the Lambda's Configuration > Environment Variables section
  * `CHECK_API_GOOGLE_FACT_CHECK_ACCESS_TOKEN` <-- this needs the key to the GoogleFactCheck feed workspace
  * `CHECK_API_WORKSPACE_ACCESS_TOKEN` <-- this needs to authorize annotations on a team's ProjectMedia
  * `CHECK_API_URL` <-- usually `qa-check-api.checkmedia.org` or `check-api.checkmedia.org`
* To deploy, start an `aws cli` session and deploy local files to the Lambda location (best for quick redeploys during development)
* For 'real' deployments, we want to keep an archive of the deployed code, so it is best to deploy via https://s3.console.aws.amazon.com/s3/buckets/meedan-check-bot-deployments?region=eu-west-1&tab=objects and use the 'upload from S3 location' option in the AWS Lambda console UI
* The Lambda can be tested in the AWS web console by firing an appropriately formatted 'test' event (note that the team slug needs to correspond to the team hosting the project media, and the data `dbid` is the project media id)
  ```
  {
    "body": "{\"event\": \"create_project_media\", \"team\": {\"dbid\": 1506991, \"id\": \"abcdefg\", \"avatar\": \"https://assets.checkmedia.org/uploads/team/6503/Group_89.png\", \"name\": \"Check testing\", \"slug\": \"check-testing\"}, \"data\": {\"type\": \"Claim\", \"dbid\": 19205, \"title\": \"Is it true Pakistan's PTI party is using Amitabh Bachchan and Madhuri Dixit photos on their campaign posters?\", \"description\": \"Charles III come\\u00e7a reinado em busca de monarquia simplificada e papel pol\\u00edtico mais ativo\"}}"
  }
  ```
* The bot needs to be authorized to write to the project media of the target team by being added as a `TeamBotInstallation`.
* The event structure sent by the webhook needs to match what the bot is expecting to parse out of the JSON payload, ie