diff --git a/docs/collections.rst b/docs/collections.rst index 52e50567..02220878 100644 --- a/docs/collections.rst +++ b/docs/collections.rst @@ -9,25 +9,94 @@ Reading the social media platform's documentation provides further important details. Collection types - * `Twitter user timeline`_: Collect tweets from specific Twitter accounts - * `Twitter search`_: Collects tweets by a user-provided search query from recent tweets - * `Twitter sample`_: Collects a Twitter provided stream of a subset of all tweets in real - time. - * `Twitter filter`_: Collects tweets by user-provided criteria from a stream of - tweets in real time. + + * `Twitter user timeline (v. 2)`_: Collect tweets from specific Twitter accounts. + * `Twitter search (v. 2)`_: Collect tweets by a user-provided search query from recent tweets. * `Flickr user`_: Collects posts and photos from specific Flickr accounts * `Weibo timeline`_: Collects posts from the user and the user's friends * `Weibo search`_: Collects recent weibo posts by a user-provided search query * `Tumblr blog posts`_: Collects blog posts from specific Tumblr blogs -.. _guide-twitter-user-timelines: +Deprecated collection types -.. _Twitter user timeline: +As of April 29, 2023, new collections of these types have been deprecated, due to changes in the Twitter API. + + * `Twitter user timeline`_: Collect tweets from specific Twitter accounts. **Deprecated** + * `Twitter search`_: Collects tweets by a user-provided search query from recent tweets. **Deprecated** + * `Twitter sample`_: Collects a Twitter provided stream of a subset of all tweets in real + time. **Deprecated** + * `Twitter filter`_: Collects tweets by user-provided criteria from a stream of + tweets in real time. **Deprecated** + +.. _guide-twitter-user-timeline-2: + +.. _Twitter user timeline (v. 2): + +---------------------------- +Twitter user timeline (v. 
2) +---------------------------- + +Twitter user timeline collections collect the 3,200 most recent tweets from each of +a list of Twitter accounts using `Twitter's user_timeline API +`_. + +**Seeds** for Twitter user timelines are individual Twitter accounts. + +To identify a user timeline, you can provide a screen name +(the string after @, like NASA for @NASA) +or Twitter user ID (a numeric string which never changes, like 11348282 for +@NASA). If you provide one identifier, the other will be looked up and displayed +in SFM the first time the harvester runs. The user may change the screen name +over time, and the seed will be updated accordingly. + +The harvest schedule should depend on how prolific the Twitter users are. +In general, the more frequent the tweeter, the more frequent you’ll want to +schedule harvests. + +SFM will notify you when incorrect or private user timeline seeds are requested; +all other valid seeds will be collected. + +See :ref:`guide-incremental-collecting` to decide whether or not to collect +incrementally. + +.. _guide-twitter-search-2: + +.. _Twitter search (v. 2): --------------------- -Twitter user timeline +Twitter search (v. 2) --------------------- +Twitter searches collect tweets from the last 7-9 days that match search +queries, similar to a regular search done on Twitter, using +the `Twitter Search API `__. +This is **not** a complete search of all tweets; results are limited +both by time and arbitrary relevance (determined by Twitter). + +Search queries must follow the guidelines described in the Twitter documentation for `Building queries for Search Tweets `_. + +In SFM, each Twitter search (v. 2) collection can contain only one seed (query), though this may be a complex Boolean query. + +In creating the seed, you can also specify an upper limit (in number of Tweets). This technique is useful given the low monthly cap on data retrieval with Basic access. + +In choosing a schedule for your Twitter v. 
2 collection, make sure to leave enough time between +searches. (If there is not enough time between searches, later harvests will +be skipped until earlier harvests complete.) In some cases, you may only +want to run the search once and then turn off the collection. + +See :ref:`guide-incremental-collecting` to decide whether or not to collect +incrementally. + +Only one active seed can be used per search collection. If you need to run multiple searches in parallel, create a new collection for each search, each with a single seed. + +.. _guide-twitter-user-timelines: + +.. _Twitter user timeline: + +---------------------------------- +Twitter user timeline (DEPRECATED) +---------------------------------- + Twitter user timeline collections collect the 3,200 most recent tweets from each of a list of Twitter accounts using `Twitter's user_timeline API `_. @@ -55,9 +124,9 @@ incrementally. .. _Twitter search: ---------------- -Twitter search ---------------- +--------------------------- +Twitter search (DEPRECATED) +--------------------------- Twitter searches collect tweets from the last 7-9 days that match search queries, similar to a regular search done on Twitter, using @@ -88,9 +157,9 @@ Only one active seed can be used per search collection. If you need to run multi .. _Twitter sample: --------------- -Twitter sample --------------- +--------------------------- +Twitter sample (DEPRECATED) +--------------------------- Twitter samples are a random collection of approximately 0.5--1% of public tweets, using the `Twitter sample stream @@ -109,9 +178,9 @@ Only one sample or :ref:`Twitter filter` can be run at a time per credential. .. 
_Twitter filter: ---------------- -Twitter filter ---------------- +--------------------------- +Twitter filter (DEPRECATED) +--------------------------- Twitter Filter collections harvest a live selection of public tweets from criteria matching keywords, locations, languages, or users, based on the diff --git a/docs/conf.py b/docs/conf.py index 159d9549..7b53cb59 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -55,7 +55,7 @@ # built documents. # # The full version, including alpha/beta/rc tags. -release = '2.5.0' +release = '3.0.0' # The short X.Y version. version = release[0:release.rindex(".")] diff --git a/docs/credentials.rst b/docs/credentials.rst index 19c59adc..5a3d399d 100644 --- a/docs/credentials.rst +++ b/docs/credentials.rst @@ -54,29 +54,14 @@ Accounts section of the Admin interface. Adding Twitter Credentials -------------------------- -As a user, the easiest way to set up Twitter credentials is to connect them to your -personal Twitter account or another Twitter account you control. If you want -more fine-tuned control, you can manually set up application-level credentials -(see below). To connect Twitter credentials, first sign in to Twitter with the account -you want to use. Then, on the Credentials page, click *Connect to Twitter*. Your browser will open a page from Twitter, asking you for authorization. Click *Authorize*, -and your credentials will automatically connect. Once credentials are connected, -you can start :ref:`guide-creating-collections`. - -Twitter application credentials can be obtained from `the Twitter API -`_. This process requires applying for -a developer account for your organization or your personal use and describing -your use case for SFM. Be sure to answer all of the questions in the -application. You may receive email follow-up requesting additional -information before the application is approved. 
- -Creating application credentials and manually adding Twitter credentials, -rather than connecting them automatically -using your Twitter account (see above), gives you greater control over your -credentials and allows you to use multiple credentials. +To harvest data from the Twitter API as of April 29, 2023, it is necessary to sign up **and pay for** `Basic access `_. (The "Free" access tier does not permit users to retrieve Tweets via API, only to publish them.) + +Due to the low monthly limits on data retrieval imposed by Twitter (10K Tweets per month, as of 4/29/2023), each SFM user should obtain their own API credentials. To obtain application credentials: * Navigate to ``_. - * Sign in to Twitter. + * Sign in to Twitter, or create an account if you don't already have one. + * Once you are logged into the Twitter Developer Portal, you can click the **Upgrade** button to upgrade your account to Basic Access. * Follow the prompts to describe your intended use case for academic research. * When a description for your app is requested, you may include: *This is an instance of Social Feed Manager, a social media research and @@ -89,25 +74,15 @@ To obtain application credentials: * It is recommended to change the application permissions to read-only. * **Review and agree to the Twitter Developer Agreement**. -You may need to wait several days for the account and app to be approved. One -approved, it is recommended that you: - * Click on your new application. - * Navigate to the *Permissions* tab. - * Select *Read only* then *Update settings*. - -You now have application-level credentials you can use in your ``.env`` file. - To manually add a Twitter Credential in your SFM user account: - * **Go to the Credentials page of SFM,** and click *Add Twitter Credential*. - * Fill out all fields: - * On the Twitter apps page (https://apps.twitter.com/) click your new - application. - * Navigate to the *Keys and Access Tokens* tab. 
- * From the top half of the page, copy and paste into the matching fields - in SFM: *Consumer Key* and *Consumer Secret*. - * From the bottom half of the page, copy and paste into the matching - fields in SFM: *Access Token* and *Access Token Secret*. - * **Click** *Save* + * Go to the Credentials page of SFM, and click `Add Twitter2 Credential`. + * There are two ways of saving your credentials in SFM: + 1. Enter an `API key`, `API key secret`, `Access token`, and `Access token secret`. + 2. Or enter a `Bearer token` (recommended). + * To obtain your credentials, visit the `Twitter Developer Portal Dashboard `_ and select your project. Under the `Apps` section, click on the key icon to access the `Keys and tokens` menu. + * Generate the credentials needed (either the API key/secret and Access token/secret, or the Bearer token). + * Save these keys, tokens, and secrets somewhere secure. + * Enter the credentials in the Twitter2 Credential form on SFM, and click `Save`. .. _flickr-credentials: diff --git a/docs/data_dictionary.rst b/docs/data_dictionary.rst index 7d134e9d..96aa5319 100644 --- a/docs/data_dictionary.rst +++ b/docs/data_dictionary.rst @@ -24,6 +24,8 @@ Twitter Dictionary For more info about source tweet data, see the `Twitter API documentation `_, including `Tweet data dictionaries `_. +V2 API data dictionary can be found here: +``_. Documentation about older archived tweets is archived by the Wayback Machine for the `Twitter API @@ -32,6 +34,9 @@ the `Twitter API and `Entities `_. 
+ +V1 API + +------------------------------+-----------------------------------------------------+-------------------------------------------+ | Field | Description | Example | | | | | @@ -187,6 +192,138 @@ and `Entities | | | | +------------------------------+-----------------------------------------------------+-------------------------------------------+ + +V2 API + ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| Field | Description | Example | +| | | | ++==============================+=====================================================+===========================================+ +| id | Twitter identifier for the tweet. | 114749583439036416 | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| tweet_url | URL of the tweet on Twitter's website. If the tweet | https://twitter.com/NASA/ | +| | is a retweet, the URL will be redirected to the | status/394883921303056384 | +| | original tweet. | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| created_at | Date and time the tweet was created, in Twitter's | Fri Sep 16 17:16:47 +0000 2011 | +| | default format. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| user_username | The unique screen name of the account that authored | NASA | +| | the tweet, at the time the tweet was posted. Screen | | +| | names are generally displayed with a @ prefixed. | | +| | Note that an account’s screen name may change over | | +| | time. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| text | The text of the tweet. 
Newline characters are | Observing Hurricane Raymond Lashing | +| | replaced with a space. | Western Mexico: Low pressure System 96E | +| | | developed quickly over the… | +| | | http://t.co/YpffdKVrgm | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| tweet_type | original, reply, quote, or retweet | retweet | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| bbox | The geographic coordinates of the tweet. This is | [-0.22012208, 51.59248806] | +| | only enabled if geotagging is enabled on the | | +| | account. The value, if present, is of the form | | +| | [longitude, latitude]. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| hashtags | Hashtags from the tweet text, as a comma-separated | Mars, askNASA | +| | list. Hashtags are generally displayed with a # | | +| | prefixed. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| media | URLs of media objects (photos, videos, GIFs) that | https://twitter.com/NASA_Orion/status/ | +| | are attached to the tweet. | 394866827857100800/photo/1 | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| urls | URLs entered by user as part of tweet. Note that | http://instagram.com/p/gA_zQ5IaCz/ | +| | URL may be a shortened URL, e.g. from bit.ly. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| like_count | Number of times this tweet had been favorited/liked | 12 | +| | by other users at the time the tweet was collected. 
| | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| in_reply_to_user_id | If the tweet is a reply, the user id of the author | 2244994945 | +| | of the tweet that is being replied to. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| lang | Language of the tweet text, as determined by | en | +| | Twitter. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| place | The user or application-provided geographic | Washington, DC | +| | location from which a tweet was posted. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| possibly_sensitive | Indicates that a URL contained in the tweet may | true | +| | reference sensitive content. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| retweet_count | Number of times the tweet had been retweeted at | 25 | +| | the time the tweet was collected. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| referenced_tweets_id | If the tweet is a retweet or quote tweet, the | 114749583439036416 | +| | Twitter identifier of the source tweet. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| source | The application from which the tweet was posted. | Twitter for | +| | | iPhone | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| author_id | Twitter identifier for the author of the tweet. 
| 481186914 | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| user_created_at | Date and time the account was created, in Twitter's | Wed Mar 18 13:46:38 +0000 2009 | +| | default format. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| user_default_profile_image | URL of the user's profile image. | https://pbs.twimg.com/profile_images/ | +| | | 942858479592554497/BbazLO9L_normal.jpg | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| user_description | The user-provided account description. Newline | The safest spacecraft designed by NASA, | +| | characters are replaced with a space. | Orion will carry humans to the moon and | +| | | beyond. | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| user_followers_count | Number of followers this account had at the time | 235 | +| | the tweet was collected. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| user_friends_count | Number of users this account was following at the | 114 | +| | time the tweet was collected. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| user_listed_count | Number of public lists that this user is a member | 3 | +| | of. | | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| user_location | The user's self-described location. Not necessarily | San Francisco, California | +| | an actual place. 
| | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| username | The user's self-provided name. | Orion Spacecraft | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| user_urls | URLs entered by user as part of user's description. | http://www.Instagram.com/realDonaldTrump | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| user_verified | Indicates that the user's account is verified. | true | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ +| referenced_tweets | A list of Tweets this Tweet refers to. | "Great book by @username on AI" | +| | | | ++------------------------------+-----------------------------------------------------+-------------------------------------------+ + + ----------------- Tumblr Dictionary ----------------- diff --git a/docs/userguide.rst b/docs/userguide.rst index 338034d9..ea3713c3 100644 --- a/docs/userguide.rst +++ b/docs/userguide.rst @@ -58,12 +58,9 @@ Users can then use this collected data for research, analysis or archiving. Some ideas for how to use SFM: - **Collecting from individual accounts** such as the tweets of every U.S. - Senator (:ref:`guide-twitter-user-timelines`). + Senator (:ref:`guide-twitter-user-timeline-2`). - **Gathering Flickr images for analysis** or archiving the photographs from accounts donated to your organization (:ref:`guide-flickr-user-timeline`). - - **Researching social media use** by retrieving a sample of all tweets - (:ref:`guide-twitter-sample`), or by filtering by specific search terms - (:ref:`guide-twitter-filter`). 
- **Capturing a major event** by collecting tweets in a specific geographic location or by following specific hashtags. - **Collecting Tumblr posts** for preserving institutional blogs or the work @@ -81,14 +78,10 @@ Here's a sample of what a collection set looks like: Types of Collections ^^^^^^^^^^^^^^^^^^^^ - * :ref:`guide-twitter-user-timelines`: Collect tweets from specific + * :ref:`guide-twitter-user-timeline-2`: Collect tweets from specific Twitter accounts - * :ref:`guide-twitter-search`: Collects tweets by a user-provided search query + * :ref:`guide-twitter-search-2`: Collects tweets by a user-provided search query from recent tweets - * :ref:`guide-twitter-sample`: Collects a Twitter-provided stream of a subset - of all tweets in real time. - * :ref:`guide-twitter-filter`: Collects tweets by user-provided criteria from - a stream of tweets in real time. * :ref:`guide-flickr-user-timeline`: Collects posts and photos from specific Flickr accounts * :ref:`guide-weibo-timelines`: Collects posts from the user and the user's @@ -225,10 +218,8 @@ for Twitter Sample and Sina Weibo: For details on each collection type, see: -| :ref:`guide-twitter-user-timelines` -| :ref:`guide-twitter-search` -| :ref:`guide-twitter-sample` -| :ref:`guide-twitter-filter` +| :ref:`guide-twitter-user-timeline-2` +| :ref:`guide-twitter-search-2` | :ref:`guide-flickr-user-timeline` | :ref:`guide-weibo-timelines` | :ref:`guide-tumblr-blog-posts`