From feb645034070f9287dcabfac931df31a8de07a79 Mon Sep 17 00:00:00 2001 From: Dolsy Smith Date: Mon, 24 Apr 2023 14:01:07 -0400 Subject: [PATCH 1/7] Updates for v.2 API --- docs/collections.rst | 89 +++++++++++++++++++++++++++++++++++++++----- docs/credentials.rst | 51 +++++++------------------ docs/userguide.rst | 14 ++----- 3 files changed, 96 insertions(+), 58 deletions(-) diff --git a/docs/collections.rst b/docs/collections.rst index 52e50567..306a4299 100644 --- a/docs/collections.rst +++ b/docs/collections.rst @@ -9,23 +9,92 @@ Reading the social media platform's documentation provides further important details. Collection types - * `Twitter user timeline`_: Collect tweets from specific Twitter accounts - * `Twitter search`_: Collects tweets by a user-provided search query from recent tweets - * `Twitter sample`_: Collects a Twitter provided stream of a subset of all tweets in real - time. - * `Twitter filter`_: Collects tweets by user-provided criteria from a stream of - tweets in real time. + + * `Twitter user timeline (v. 2)`_: Collect tweets from specific Twitter accounts. + * `Twitter search (v. 2)`_: Collect tweets by a user-provided search query from recent tweets. * `Flickr user`_: Collects posts and photos from specific Flickr accounts * `Weibo timeline`_: Collects posts from the user and the user's friends * `Weibo search`_: Collects recent weibo posts by a user-provided search query * `Tumblr blog posts`_: Collects blog posts from specific Tumblr blogs +Deprecated collection types + +As of April 29, 2023, new collections of these types have been deprecated, due to changes in the Twitter API. + + * `Twitter user timeline`_: Collect tweets from specific Twitter accounts. **Deprecated** + * `Twitter search`_: Collects tweets by a user-provided search query from recent tweets. **Deprecated** + * `Twitter sample`_: Collects a Twitter-provided stream of a subset of all tweets in real + time. **Deprecated** 
+ * `Twitter filter`_: Collects tweets by user-provided criteria from a stream of + tweets in real time. **Deprecated** + +.. _guide-twitter-user-timeline-2: + +.. _Twitter user timeline (v. 2): + +--------------------- +Twitter user timeline (v. 2) +--------------------- + +Twitter user timeline collections collect the 3,200 most recent tweets from each of +a list of Twitter accounts using `Twitter's user_timeline API +`_. + +**Seeds** for Twitter user timelines are individual Twitter accounts. + +To identify a user timeline, you can provide a screen name +(the string after @, like NASA for @NASA) +or Twitter user ID (a numeric string which never changes, like 11348282 for +@NASA). If you provide one identifier, the other will be looked up and displayed +in SFM the first time the harvester runs. The user may change the screen name +over time, and the seed will be updated accordingly. + +The harvest schedule should depend on how prolific the Twitter users are. +In general, the more frequent the tweeter, the more frequent you’ll want to +schedule harvests. + +SFM will notify you when incorrect or private user timeline seeds are requested; +all other valid seeds will be collected. + +See :ref:`guide-incremental-collecting` to decide whether or not to collect +incrementally. + +.. _guide-twitter-search-2: + +.. _Twitter search (v. 2): + +--------------- +Twitter search (v. 2) +--------------- + +Twitter searches collect tweets from the last 7-9 days that match search +queries, similar to a regular search done on Twitter, using +the `Twitter Search API `__. +This is **not** a complete search of all tweets; results are limited +both by time and arbitrary relevance (determined by Twitter). + +Search queries must follow the guidelines described in the Twitter documentation for `Building queries for Search Tweets `_. + +In SFM, each Twitter search (v. 2) collection can contain only one seed (query), though this may be a complex Boolean query. 
+ +In creating the seed, you can also specify an upper limit (in number of Tweets). This technique is useful given the low monthly cap on data retrieval with Basic access. + +In choosing a schedule for your Twitter v. 2 collection, make sure to leave enough time between +searches. (If there is not enough time between searches, later harvests will +be skipped until earlier harvests complete.) In some cases, you may only +want to run the search once and then turn off the collection. + +See :ref:`guide-incremental-collecting` to decide whether or not to collect +incrementally. + +Only one active seed can be used per search collection. If you need to run multiple searches in parallel, create a new collection for each search, each with a single seed. + .. _guide-twitter-user-timelines: .. _Twitter user timeline: --------------------- -Twitter user timeline +Twitter user timeline (DEPRECATED) --------------------- Twitter user timeline collections collect the 3,200 most recent tweets from each of @@ -56,7 +125,7 @@ incrementally. .. _Twitter search: --------------- -Twitter search +Twitter search (DEPRECATED) --------------- Twitter searches collect tweets from the last 7-9 days that match search @@ -89,7 +158,7 @@ Only one active seed can be used per search collection. If you need to run multi .. _Twitter sample: -------------- -Twitter sample +Twitter sample (DEPRECATED) -------------- Twitter samples are a random collection of approximately 0.5--1% of public @@ -110,7 +179,7 @@ Only one sample or :ref:`Twitter filter` can be run at a time per credential. .. _Twitter filter: --------------- -Twitter filter +Twitter filter (DEPRECATED) --------------- Twitter Filter collections harvest a live selection of public tweets from diff --git a/docs/credentials.rst b/docs/credentials.rst index 19c59adc..205aa914 100644 --- a/docs/credentials.rst +++ b/docs/credentials.rst @@ -54,29 +54,14 @@ Accounts section of the Admin interface. 
Adding Twitter Credentials -------------------------- -As a user, the easiest way to set up Twitter credentials is to connect them to your -personal Twitter account or another Twitter account you control. If you want -more fine-tuned control, you can manually set up application-level credentials -(see below). To connect Twitter credentials, first sign in to Twitter with the account -you want to use. Then, on the Credentials page, click *Connect to Twitter*. Your browser will open a page from Twitter, asking you for authorization. Click *Authorize*, -and your credentials will automatically connect. Once credentials are connected, -you can start :ref:`guide-creating-collections`. - -Twitter application credentials can be obtained from `the Twitter API -`_. This process requires applying for -a developer account for your organization or your personal use and describing -your use case for SFM. Be sure to answer all of the questions in the -application. You may receive email follow-up requesting additional -information before the application is approved. - -Creating application credentials and manually adding Twitter credentials, -rather than connecting them automatically -using your Twitter account (see above), gives you greater control over your -credentials and allows you to use multiple credentials. +To harvest data from the Twitter API as of April 29, 2023, it is necessary to sign up **and pay for** `Basic access `_. (The "Free" access tier does not permit users to retrieve Tweets via API, only to publish them.) + +Due to the low monthly limits on data retrieval imposed by Twitter (10K Tweets per month, as of 4/29/2023), each SFM user should obtain their own API credentials. To obtain application credentials: * Navigate to ``_. - * Sign in to Twitter. + * Sign in to Twitter, or create an account if you don't already have one. + * Once you are logged into the Twitter Developer Portal, you can click the **Upgrade** button to upgrade your account to Basic Access. 
 * Follow the prompts to describe your intended use case for academic research. * When a description for your app is requested, you may include: *This is an instance of Social Feed Manager, a social media research and @@ -89,25 +74,15 @@ To obtain application credentials: * It is recommended to change the application permissions to read-only. * **Review and agree to the Twitter Developer Agreement**. -You may need to wait several days for the account and app to be approved. One -approved, it is recommended that you: - * Click on your new application. - * Navigate to the *Permissions* tab. - * Select *Read only* then *Update settings*. - -You now have application-level credentials you can use in your ``.env`` file. - To manually add a Twitter Credential in your SFM user account: - * **Go to the Credentials page of SFM,** and click *Add Twitter Credential*. - * Fill out all fields: - * On the Twitter apps page (https://apps.twitter.com/) click your new - application. - * Navigate to the *Keys and Access Tokens* tab. - * From the top half of the page, copy and paste into the matching fields - in SFM: *Consumer Key* and *Consumer Secret*. - * From the bottom half of the page, copy and paste into the matching - fields in SFM: *Access Token* and *Access Token Secret*. - * **Click** *Save* + * **Go to the Credentials page of SFM,** and click *Add Twitter2 Credential*. + * There are two ways of saving your credentials in SFM: + 1. Enter an `API key`, `API key secret`, `Access token`, and `Access token secret`. + 2. Or enter a `Bearer token` (recommended). + * To obtain your credentials, visit the `Twitter Developer Portal Dashboard `_ and select your project. Under the `Apps` section, click on the key icon to access the `Keys and tokens` menu. + * Generate the credentials needed (either the API key/secret and Access token/secret, or the Bearer token). + * Save these keys, tokens, and secrets somewhere secure. 
+ * Enter the credentials in the Twitter2 Credential form on SFM, and click `Save`. .. _flickr-credentials: diff --git a/docs/userguide.rst b/docs/userguide.rst index 338034d9..888dbaaa 100644 --- a/docs/userguide.rst +++ b/docs/userguide.rst @@ -81,14 +81,10 @@ Here's a sample of what a collection set looks like: Types of Collections ^^^^^^^^^^^^^^^^^^^^ - * :ref:`guide-twitter-user-timelines`: Collect tweets from specific + * :ref:`guide-twitter-user-timeline-2`: Collect tweets from specific Twitter accounts - * :ref:`guide-twitter-search`: Collects tweets by a user-provided search query + * :ref:`guide-twitter-search-2`: Collects tweets by a user-provided search query from recent tweets - * :ref:`guide-twitter-sample`: Collects a Twitter-provided stream of a subset - of all tweets in real time. - * :ref:`guide-twitter-filter`: Collects tweets by user-provided criteria from - a stream of tweets in real time. * :ref:`guide-flickr-user-timeline`: Collects posts and photos from specific Flickr accounts * :ref:`guide-weibo-timelines`: Collects posts from the user and the user's @@ -225,10 +221,8 @@ for Twitter Sample and Sina Weibo: For details on each collection type, see: -| :ref:`guide-twitter-user-timelines` -| :ref:`guide-twitter-search` -| :ref:`guide-twitter-sample` -| :ref:`guide-twitter-filter` +| :ref:`guide-twitter-user-timeline-2` +| :ref:`guide-twitter-search-2` | :ref:`guide-flickr-user-timeline` | :ref:`guide-weibo-timelines` | :ref:`guide-tumblr-blog-posts` From 37a22ac3fcb72fb058df887128afa8253467cca1 Mon Sep 17 00:00:00 2001 From: Dolsy Smith Date: Mon, 24 Apr 2023 14:14:52 -0400 Subject: [PATCH 2/7] Fixed formatting --- docs/collections.rst | 24 ++++++++++++------------ docs/userguide.rst | 5 +---- 2 files changed, 13 insertions(+), 16 deletions(-) diff --git a/docs/collections.rst b/docs/collections.rst index 306a4299..02220878 100644 --- a/docs/collections.rst +++ b/docs/collections.rst @@ -32,9 +32,9 @@ As of April 29, 2023, new collections of 
these types have been deprecated, due t .. _Twitter user timeline (v. 2): ---------------------- +---------------------------- Twitter user timeline (v. 2) ---------------------- +---------------------------- Twitter user timeline collections collect the 3,200 most recent tweets from each of a list of Twitter accounts using `Twitter's user_timeline API @@ -63,9 +63,9 @@ incrementally. .. _Twitter search (v. 2): ---------------- +--------------------- Twitter search (v. 2) ---------------- +--------------------- Twitter searches collect tweets from the last 7-9 days that match search queries, similar to a regular search done on Twitter, using @@ -93,9 +93,9 @@ Only one active seed can be used per search collection. If you need to run multi .. _Twitter user timeline: ---------------------- +---------------------------------- Twitter user timeline (DEPRECATED) ---------------------- +---------------------------------- Twitter user timeline collections collect the 3,200 most recent tweets from each of a list of Twitter accounts using `Twitter's user_timeline API @@ -124,9 +124,9 @@ incrementally. .. _Twitter search: ---------------- +--------------------------- Twitter search (DEPRECATED) ---------------- +--------------------------- Twitter searches collect tweets from the last 7-9 days that match search queries, similar to a regular search done on Twitter, using @@ -157,9 +157,9 @@ Only one active seed can be used per search collection. If you need to run multi .. _Twitter sample: --------------- +--------------------------- Twitter sample (DEPRECATED) --------------- +--------------------------- Twitter samples are a random collection of approximately 0.5--1% of public tweets, using the `Twitter sample stream @@ -178,9 +178,9 @@ Only one sample or :ref:`Twitter filter` can be run at a time per credential. .. 
_Twitter filter: ---------------- +--------------------------- Twitter filter (DEPRECATED) ---------------- +--------------------------- Twitter Filter collections harvest a live selection of public tweets from criteria matching keywords, locations, languages, or users, based on the diff --git a/docs/userguide.rst b/docs/userguide.rst index 888dbaaa..ea3713c3 100644 --- a/docs/userguide.rst +++ b/docs/userguide.rst @@ -58,12 +58,9 @@ Users can then use this collected data for research, analysis or archiving. Some ideas for how to use SFM: - **Collecting from individual accounts** such as the tweets of every U.S. - Senator (:ref:`guide-twitter-user-timelines`). + Senator (:ref:`guide-twitter-user-timeline-2`). - **Gathering Flickr images for analysis** or archiving the photographs from accounts donated to your organization (:ref:`guide-flickr-user-timeline`). - - **Researching social media use** by retrieving a sample of all tweets - (:ref:`guide-twitter-sample`), or by filtering by specific search terms - (:ref:`guide-twitter-filter`). - **Capturing a major event** by collecting tweets in a specific geographic location or by following specific hashtags. - **Collecting Tumblr posts** for preserving institutional blogs or the work From 837daa9bceee8a7ed8f2b89cfb61d5e033505084 Mon Sep 17 00:00:00 2001 From: Adhithya Kiran Date: Mon, 24 Apr 2023 15:58:34 -0400 Subject: [PATCH 3/7] updated data dictionary --- docs/collections.rst | 271 --------------------------------------- docs/data_dictionary.rst | 3 + 2 files changed, 3 insertions(+), 271 deletions(-) delete mode 100644 docs/collections.rst diff --git a/docs/collections.rst b/docs/collections.rst deleted file mode 100644 index 52e50567..00000000 --- a/docs/collections.rst +++ /dev/null @@ -1,271 +0,0 @@ -================ -Collection types -================ - -Each collection type connects to one of a social media platform's APIs, or -methods for retrieving data. 
Understanding what each collection type provides is -important to ensure you collect what you need and are aware of any limitations. -Reading the social media platform's documentation provides further important -details. - -Collection types - * `Twitter user timeline`_: Collect tweets from specific Twitter accounts - * `Twitter search`_: Collects tweets by a user-provided search query from recent tweets - * `Twitter sample`_: Collects a Twitter provided stream of a subset of all tweets in real - time. - * `Twitter filter`_: Collects tweets by user-provided criteria from a stream of - tweets in real time. - * `Flickr user`_: Collects posts and photos from specific Flickr accounts - * `Weibo timeline`_: Collects posts from the user and the user's friends - * `Weibo search`_: Collects recent weibo posts by a user-provided search query - * `Tumblr blog posts`_: Collects blog posts from specific Tumblr blogs - -.. _guide-twitter-user-timelines: - -.. _Twitter user timeline: - ---------------------- -Twitter user timeline ---------------------- - -Twitter user timeline collections collect the 3,200 most recent tweets from each of -a list of Twitter accounts using `Twitter's user_timeline API -`_. - -**Seeds** for Twitter user timelines are individual Twitter accounts. - -To identify a user timeline, you can provide a screen name -(the string after @, like NASA for @NASA) -or Twitter user ID (a numeric string which never changes, like 11348282 for -@NASA). If you provide one identifier, the other will be looked up and displayed -in SFM the first time the harvester runs. The user may change the screen name -over time, and the seed will be updated accordingly. - -The harvest schedule should depend on how prolific the Twitter users are. -In general, the more frequent the tweeter, the more frequent you’ll want to -schedule harvests. - -SFM will notify you when incorrect or private user timeline seeds are requested; -all other valid seeds will be collected. 
- -See :ref:`guide-incremental-collecting` to decide whether or not to collect -incrementally. - -.. _guide-twitter-search: - -.. _Twitter search: - ---------------- -Twitter search ---------------- - -Twitter searches collect tweets from the last 7-9 days that match search -queries, similar to a regular search done on Twitter, using -the `Twitter Search API `__. -This is **not** a complete search of all tweets; results are limited -both by time and arbitrary relevance (determined by Twitter). - -Search queries must follow standard search term formulation; permitted queries -are listed in the documentation for the `Twitter Search API -`__, -or you can construct a query -using the `Twitter Advanced Search query builder -`_. - -Broad Twitter searches may take longer to complete -- possibly days -- due -to Twitter’s rate limits and the amount of data available from the Search -API. In choosing a schedule, make sure that there is enough time between -searches. (If there is not enough time between searches, later harvests will -be skipped until earlier harvests complete.) In some cases, you may only -want to run the search once and then turn off the collection. - -See :ref:`guide-incremental-collecting` to decide whether or not to collect -incrementally. - -Only one active seed can be used per search collection. If you need to run multiple searches in parallel, create a new collection for each search, each with a single seed. - -.. _guide-twitter-sample: - -.. _Twitter sample: - --------------- -Twitter sample --------------- - -Twitter samples are a random collection of approximately 0.5--1% of public -tweets, using the `Twitter sample stream -`_, useful for -capturing a sample of what people are talking about on Twitter. -The Twitter sample stream returns approximately 0.5-1% of public tweets, -which is approximately 3GB a day (compressed). - -Unlike other Twitter collections, there are no seeds for a Twitter sample. 
- -When on, the sample returns data every 30 minutes. - -Only one sample or :ref:`Twitter filter` can be run at a time per credential. - -.. _guide-twitter-filter: - -.. _Twitter filter: - ---------------- -Twitter filter ---------------- - -Twitter Filter collections harvest a live selection of public tweets from -criteria matching keywords, locations, languages, or users, based on the -`Twitter filter streaming API -`_. Because -tweets are collected live, tweets from the past are not included. (Use a -:ref:`Twitter search` collection to find tweets from the recent past.) - -There are four different filter queries supported by SFM: track, follow, -location, and language. - -**Track** collects tweets based on a keyword search. A space between words -is treated as 'AND' and a comma is treated as 'OR'. Note that exact phrase -matching is not supported. See the `track parameter documentation -`_ for more -information. - -- Note: When entering a comma-separated list of search terms for the track or follow parameters, make sure to use the standard ``,`` character. When typing in certain languages that use a non-Roman alphabet, a different character is generated for commas. For example, when typing in languages such as Arabic, Farsi, Urdu, etc., typing a comma generates the ``،`` character. To avoid errors, the Track parameter should use the Roman ``,`` character; for example: سواقة المرأه , قرار قيادة سيارة - -**Follow** collects tweets that are posted by or about a user (not including -mentions) from a comma separated list of user IDs (the numeric identifier for -a user account). Tweets collected will include those made by the user, retweeting -the user, or replying to the user. See the `follow parameter documentation -`_ for -more information. - -- Note: The Twitter website does not provide a way to look up the user ID for a user account. You can use `https://tweeterid.com `_ for this purpose. 
- - -**Location** collects tweets that were geolocated within specific parameters, -based on a bounding box made using the southwest and northeast corner -coordinates. See the `location parameter documentation -`_ for -more information. - -**Language** collects tweets that Twitter detected as being written in the specified languages. -For example, specifying `en,es` will only collect Tweets detected to be in the English or Spanish languages. -See the `language parameter documentation -`_ for -more information. - -Twitter will return a limited number of tweets, so filters that return many -results will not return all available tweets. Therefore, more narrow filters -will usually return more complete results. - -Only one filter or :ref:`Twitter sample` can be run at a time per credential. - -SFM captures the filter stream in 30 minute chunks and then momentarily stops. -Between rate limiting and these momentary stops, you should never assume that -you are getting every tweet. - -There is only one seed in a filter collection. Twitter filter collection are -either turned on or off (there is no schedule). - -.. _guide-flickr-user-timeline: - -.. _Flickr user: - ------------ -Flickr user ------------ - -Flickr User Timeline collections gather metadata about public photos by a -specific Flickr user, and, optionally, copies of the photos at specified sizes. - -Each Flickr user collection can have multiple seeds, where each seed is a Flickr -user. To identify a user, you can provide a either a username or an NSID. If you -provide one, the other will be looked up and displayed in the SFM UI during the -first harvest. The NSID is a unique identifier and does not change; usernames -may be changed but are unique. - -Usernames can be difficult to find, so to ensure that you have the correct -account, use `this tool `_ to find the -NSID from the account URL (i.e., the URL when viewing the account on the Flickr -website). 
- -Depending on the image sizes you select, the actual photo files will be -collected as well. Be very careful in selecting the original file size, as this -may require a significant amount of storage. Also note that some Flickr users -may have a large number of public photos, which may require a significant amount -of storage. It is advisable to check the Flickr website to determine the number -of photos in each Flickr user's public photo stream before harvesting. - -For each user, the user's information will be collected using Flickr's -`people.getInfo `_ -API and the list of her public photos will be retrieved from `people.getPublicPhotos -`_. -Information on each photo will be collected with -`photos.getInfo `_. - -See :ref:`guide-incremental-collecting` to decide whether or not to collect -incrementally. - -.. _guide-tumblr-blog-posts: - -.. _Tumblr blog posts: - ------------------ -Tumblr blog posts ------------------ - -Tumblr Blog Post collections harvest posts by specified Tumblr blogs using the -`Tumblr Posts API `_. - -**Seeds** are individual blogs for these collections. Blogs can be specified with -or without the .tumblr.com extension. - -See :ref:`guide-incremental-collecting` to decide whether or not to collect incrementally. - -.. _guide-weibo-timelines: -.. _Weibo timeline: - --------------- -Weibo timeline --------------- - -Weibo Timeline collections harvest weibos (microblogs) by the user and friends -of the user whose credentials are provided using the `Weibo friends_timeline API -`_. - -Note that because collection is determined by the user whose credentials are -provided, there are no seeds for a Weibo timeline collection. To change what is -being collected, change the user's friends from the Weibo website or app. - -.. _Weibo search: - --------------- -Weibo search --------------- - -Collects recent weibos that match a search query using the `Weibo -search_topics API `_. -The Weibo API does not return a complete search of all Weibo posts. 
-It only returns the most recent 200 posts matching a single keyword -when found between pairs of '#' in Weibo posts (for example: `#keyword#` or -`#你好#`) - -The incremental option will attempt to only count weibo posts that haven't been harvested before, -maintaining a count of non-duplicate weibo posts. Because the Weibo search API does not accept -`since_id` or `max_id` parameters, filtering out already-harvested weibos from the -search count is accomplished within SFM. - -When the incremental option is not selected, the search will be performed again, -and there will most likely be duplicates in the count. - - -.. _guide-incremental-collecting: - ----------------------- -Incremental collecting ----------------------- - -The incremental option is the default and will collect tweets or posts that have been published since the last harvest. -When the incremental option is not selected, the maximum number of tweets or posts will be harvested each -time the harvest runs. If a non-incremental harvest is performed multiple times, there will most likely be -duplicates. However, with these duplicates, you may be able to track changes across time in a user's -timeline, such as changes in retweet and like counts, deletion of tweets, and follower counts. diff --git a/docs/data_dictionary.rst b/docs/data_dictionary.rst index 7d134e9d..1601888a 100644 --- a/docs/data_dictionary.rst +++ b/docs/data_dictionary.rst @@ -186,6 +186,9 @@ and `Entities | user_verified | Indicates that the user's account is verified. | true | | | | | +------------------------------+-----------------------------------------------------+-------------------------------------------+ +| referenced_tweets | Describes referenced tweets in current tweet. 
| Referenced,replied or retweets | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ ----------------- Tumblr Dictionary From 6192ea32492c2000e8f2b38aadd679d2f0d5a609 Mon Sep 17 00:00:00 2001 From: Adhithya Kiran Date: Mon, 24 Apr 2023 16:05:36 -0400 Subject: [PATCH 4/7] updated data dictionary,typo --- docs/data_dictionary.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/data_dictionary.rst b/docs/data_dictionary.rst index 1601888a..c2e3c670 100644 --- a/docs/data_dictionary.rst +++ b/docs/data_dictionary.rst @@ -186,7 +186,7 @@ and `Entities | user_verified | Indicates that the user's account is verified. | true | | | | | +------------------------------+-----------------------------------------------------+-------------------------------------------+ -| referenced_tweets | Describes referenced tweets in current tweet. | Referenced,replied or retweets | +| referenced_tweets | Describes referenced tweets in current tweet. | Referenced,replied or retweets | | | | | +------------------------------+-----------------------------------------------------+-------------------------------------------+ From 1ff2e391b84f361fa31b46fa94fde47c0382df92 Mon Sep 17 00:00:00 2001 From: Adhithya Kiran Date: Thu, 27 Apr 2023 18:13:35 -0400 Subject: [PATCH 5/7] Added v2 api data dictionary --- docs/data_dictionary.rst | 133 ++++++++++++++++++++++++++++++++++++++- 1 file changed, 132 insertions(+), 1 deletion(-) diff --git a/docs/data_dictionary.rst b/docs/data_dictionary.rst index c2e3c670..ac48658d 100644 --- a/docs/data_dictionary.rst +++ b/docs/data_dictionary.rst @@ -24,6 +24,8 @@ Twitter Dictionary For more info about source tweet data, see the `Twitter API documentation `_, including `Tweet data dictionaries `_. +V2 API data dictionary can be found here: +``_. 
Documentation about older archived tweets is archived by the Wayback Machine for the `Twitter API @@ -32,6 +34,9 @@ the `Twitter API and `Entities `_. + +V1 API + +------------------------------+-----------------------------------------------------+-------------------------------------------+ | Field | Description | Example | | | | | @@ -186,9 +191,135 @@ and `Entities | user_verified | Indicates that the user's account is verified. | true | | | | | +------------------------------+-----------------------------------------------------+-------------------------------------------+ -| referenced_tweets | Describes referenced tweets in current tweet. | Referenced,replied or retweets | + + +V2 API + ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| Field | Description | Example | +| | | | ++==============================+=====================================================+===========================================+ +| id | Twitter identifier for the tweet. | 114749583439036416 | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| tweet_url | URL of the tweet on Twitter's website. If the tweet | https://twitter.com/NASA/ | +| | is a retweet, the URL will be redirected to the | status/394883921303056384 | +| | original tweet. | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| created_at | Date and time the tweet was created, in Twitter's | Fri Sep 16 17:16:47 +0000 2011 | +| | default format. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| user_username | The unique screen name of the account that authored | NASA | +| | the tweet, at the time the tweet was posted. 
Screen | | +| | names are generally displayed with a @ prefixed. | | +| | Note that an account’s screen name may change over | | +| | time. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| text | The text of the tweet. Newline characters are | Observing Hurricane Raymond Lashing | +| | replaced with a space. | Western Mexico: Low pressure System 96E | +| | | developed quickly over the… | +| | | http://t.co/YpffdKVrgm | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| tweet_type | original, reply, quote, or retweet | retweet | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| bbox | The geographic coordinates of the tweet. This is | [-0.22012208, 51.59248806] | +| | only enabled if geotagging is enabled on the | | +| | account. The value, if present, is of the form | | +| | [longitude, latitude]. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| hashtags | Hashtags from the tweet text, as a comma-separated | Mars, askNASA | +| | list. Hashtags are generally displayed with a # | | +| | prefixed. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| media | URLs of media objects (photos, videos, GIFs) that | https://twitter.com/NASA_Orion/status/ | +| | are attached to the tweet. | 394866827857100800/photo/1 | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| urls | URLs entered by user as part of tweet. Note that | http://instagram.com/p/gA_zQ5IaCz/ | +| | URL may be a shortened URL, e.g. from bit.ly. 
| | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| like_count | Number of times this tweet had been favorited/liked | 12 | +| | by other users at the time the tweet was collected. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| in_reply_to_user_id | If tweet is a reply, the user id of the author | 2244994945 | +| | of the tweet that is being replied to. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| lang | Language of the tweet text, as determined by | en | +| | Twitter. | | | | | | +------------------------------+-----------------------------------------------------+-------------------------------------------+ +| place | The user or application-provided geographic | Washington, DC | +| | location from which a tweet was posted. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| possibly_sensitive | Indicates that URL contained in the tweet may | true | +| | reference sensitive content. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| retweet_count | Number of times the tweet had been retweeted at | 25 | +| | the time the tweet was collected. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| referenced_tweets_id | If tweet is a retweet or quote tweet, the Twitter | 114749583439036416 | +| | identifier of the source tweet. 
| | +| | | | +------------------------------+-----------------------------------------------------+-------------------------------------------+ +| source | The application from which the tweet was posted. | Twitter for | +| | | iPhone | +| | | | +------------------------------+-----------------------------------------------------+-------------------------------------------+ +| author_id | Twitter identifier for the author of the tweet. | 481186914 | +| | | | +------------------------------+-----------------------------------------------------+-------------------------------------------+ +| user_created_at | Date and time the account was created, in Twitter's | Wed Mar 18 13:46:38 +0000 2009 | +| | default format. | | +| | | | +------------------------------+-----------------------------------------------------+-------------------------------------------+ +| user_default_profile_image | URL of the user's profile image. | https://pbs.twimg.com/profile_images/ | +| | | 942858479592554497/BbazLO9L_normal.jpg | +| | | | +------------------------------+-----------------------------------------------------+-------------------------------------------+ +| user_description | The user-provided account description. Newline | The safest spacecraft designed by NASA, | +| | characters are replaced with a space. | Orion will carry humans to the moon and | +| | | beyond. | +| | | | +------------------------------+-----------------------------------------------------+-------------------------------------------+ +| user_followers_count | Number of followers this account had at the time | 235 | +| | the tweet was collected. | | +| | | | +------------------------------+-----------------------------------------------------+-------------------------------------------+ +| user_friends_count | Number of users this account was following at the | 114 | +| | time the tweet was collected.
| | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| user_listed_count | Number of public lists that this user is a member | 3 | +| | of. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| user_location | The user's self-described location. Not necessarily | San Francisco, California | +| | an actual place. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| username | The user's self-provided name. | Orion Spacecraft | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| user_urls | URLs entered by user as part of user's description. | http://www.Instagram.com/realDonaldTrump | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| user_verified | Indicates that the user's account is verified. | true | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ + ----------------- Tumblr Dictionary From 501da80e9e54ce26b701060a7ac91f0df06a73e1 Mon Sep 17 00:00:00 2001 From: adhithyakiran Date: Fri, 28 Apr 2023 11:12:36 -0400 Subject: [PATCH 6/7] referenced_tweets in v2 data dict --- docs/data_dictionary.rst | 3 +++ 1 file changed, 3 insertions(+) diff --git a/docs/data_dictionary.rst b/docs/data_dictionary.rst index ac48658d..96aa5319 100644 --- a/docs/data_dictionary.rst +++ b/docs/data_dictionary.rst @@ -319,6 +319,9 @@ V2 API | user_verified | Indicates that the user's account is verified. 
| true | | | | | +------------------------------+-----------------------------------------------------+-------------------------------------------+ +| referenced_tweets | A list of Tweets this Tweet refers to. | "Great book by @username on AI" | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ ----------------- From a332f46a7625a79f78f5327796f92d3644b134d1 Mon Sep 17 00:00:00 2001 From: Dolsy Smith Date: Tue, 23 May 2023 11:18:41 -0400 Subject: [PATCH 7/7] Minor fixes --- docs/conf.py | 2 +- docs/credentials.rst | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/conf.py b/docs/conf.py index 159d9549..7b53cb59 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -55,7 +55,7 @@ # built documents. # # The full version, including alpha/beta/rc tags. -release = '2.5.0' +release = '3.0.0' # The short X.Y version. version = release[0:release.rindex(".")] diff --git a/docs/credentials.rst b/docs/credentials.rst index 205aa914..5a3d399d 100644 --- a/docs/credentials.rst +++ b/docs/credentials.rst @@ -75,11 +75,11 @@ To obtain application credentials: * **Review and agree to the Twitter Developer Agreement**. To manually add a Twitter Credential in your SFM user account: - * **Go to the Credentials page of SFM,** and click *Add Twitter2 Credential*. + * Go to the Credentials page of SFM, and click `Add Twitter2 Credential`. * There are two ways of saving your credentials in SFM: 1. Enter an `API key`, `API key secret`, `Access token`, and `Access token secret`. 2. Or enter a `Bearer token` (recommended). - * To obtain your credentials, visit the `Twitter Developer Portal Dashboard `_and select your project. Under the `Apps` section, click on the key icon to access the `Keys and tokens` menu. + * To obtain your credentials, visit the `Twitter Developer Portal Dashboard `_ and select your project. 
Under the `Apps` section, click on the key icon to access the `Keys and tokens` menu. * Generate the credentials needed (either the API key/secret and Access token/secret, or the Bearer token). * Save these keys, tokens, and secrets somewhere secure. * Enter the credentials in the Twitter2 Credential form on SFM, and click `Save`.
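
The two ways of saving a Twitter2 credential described above (the four-part API key/secret and Access token/secret set, or a single Bearer token) can be sketched as a small check that a set of values satisfies one option or the other. This is only an illustration: the key names and the ``credential_mode`` helper are hypothetical and are not SFM's actual form fields or code.

```python
from typing import Mapping

# Hypothetical key names for illustration; they are not SFM's actual form fields.
OAUTH1_FIELDS = ("api_key", "api_key_secret", "access_token", "access_token_secret")
BEARER_FIELD = "bearer_token"


def credential_mode(fields: Mapping[str, str]) -> str:
    """Return "bearer" or "oauth1" depending on which option is complete."""
    # Treat missing and blank values the same way.
    filled = {name for name, value in fields.items() if value and value.strip()}
    # The Bearer token option is checked first because the text recommends it.
    if BEARER_FIELD in filled:
        return "bearer"
    missing = [name for name in OAUTH1_FIELDS if name not in filled]
    if not missing:
        return "oauth1"
    raise ValueError("incomplete credential; missing: " + ", ".join(missing))
```

Under these assumptions, a Bearer token alone is sufficient, while the key-based option is accepted only when all four values are present; anything else is reported with the names of the missing values.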