|
1 | 1 | # kdb+taq
|
2 | 2 |
|
3 |
| -## Changes to kdb+taq |
4 |
| - |
5 |
| -### 2022.01.06 (`Bid_Price` type fix in tq.q) |
6 |
| -changing `Bid_Price` from real to float to avoid `Bid_Price`, `Offer_Price` type inconsistency |
7 |
| - |
8 |
| -### 2020.06.12 (version 3.3 - sync with NYSE daily taq version 3.3) |
9 |
| -handle (ignore) additional flags TradedOnMEMX and TradedOnMIAX in master file |
10 |
| -https://www.nyse.com/publicdocs/nyse/data/Daily_TAQ_Client_Spec_v3.3.pdf |
11 |
| - |
12 |
| -### 2020.04.15 (version 3.2 - sync with NYSE daily taq version 3.2) |
13 |
| -handle (ignore) additional flag TradedOnLTSE in master file |
14 |
| -https://www.nyse.com/publicdocs/nyse/data/Daily_TAQ_Client_Spec_v3.2.pdf |
15 |
| - |
16 |
| -### 2018.10.05 |
17 |
| -small changes to field load types (mostly just wider) - see before/after |
18 |
| -versions of nf2, tf2, qf2 |
19 |
| - |
20 |
| -### 2017.10.27 (version 3.0 - jump in version number to sync with NYSE daily taq version 3.0b) |
21 |
| -support NYSE change to schema as specified in: |
22 |
| -https://list.theice.com/t/92262/395348/57007/0/ |
23 |
| - |
24 |
| -### 2016.10.18 (version 2.2 - jump in version number to sync with NYSE daily taq specs) |
25 |
| -support NYSE change to schema, format and datacentre as specified in: |
26 |
| -http://www.nyxdata.com/doc/247075 |
27 |
| - |
28 |
| -taq.k reads the new data files, but saves data to the old schema, discarding |
29 |
| -nano-second timestamp precision. It is a drop-in replacement for the previous |
30 |
| -version for those who don't want to move to the new schema yet. |
31 |
| - |
32 |
| -tq.q reads and exports ALL fields for trade, quote and nbbo |
33 |
| - |
34 |
| -### 2015.07.27 (version 1.12) |
35 |
| -Timestamp precision extended from millisecond to microseconds (additional 3 digits ignored for now). |
36 |
| -Ignore 3 additional fields participant timestamp, RRN and TRF |
37 |
| -http://www.nyxdata.com/nysedata/default.aspx?tabid=993&id=2784 |
38 |
| - |
39 |
| -### 2014.04.28 (version 1.11) |
40 |
| -Support additional exchange codes (12-15 - ZJKY) in taqmaster |
41 |
| - |
42 |
| -### 2014.01.15 (version 1.10) |
43 |
| -Amend taq.k to discard the first line of taqmaster* when it is just lists the record count |
44 |
| - |
45 |
| -### 2013.11.25 (version 1.9) |
46 |
| -Amend taq.k and tq.q to recognise and ignore the new quote fields: |
47 |
| -- SIP-generated Message Identifier |
48 |
| -- National BBO LULD Indicator |
49 |
| -tq.q to recognise and ignore the new NBBO fields: |
50 |
| -- Limit-Up/Limit-Down Indicator |
51 |
| -- Limit-up/Limit-down NBBO Indicator |
52 |
| -- SIP-generated Message Identifier |
53 |
| -and handle deletion of previous NBBO fields: |
54 |
| -- Limit-Up/Limit-Down NBBO (UTP) Indicator |
55 |
| -- Limit-Up/Limit-Down NBBO (CQS) Indicator |
56 |
| - |
57 |
| -The format change is scheduled to take effect from 2nd December 2013 |
58 |
| -see: |
59 |
| -http://www.nyxdata.com/nysedata/Default.aspx?tabID=993&id=2194 |
60 |
| - |
61 |
| -### 2013.01.22 (version 1.8) |
62 |
| -Amended taq.k and tq.q to recognise and ignore the new quote fields: |
63 |
| -- Short Sale Restriction (SSR) Indicator |
64 |
| -- Limit-Up/Limit-Down BBO (UTP) Indicator |
65 |
| -- Limit-Up/Limit-Down BBO (CQS) Indicator |
66 |
| -- FINRA ADF MPID Indicator |
67 |
| -and tq.q to recognise and ignore the new NBBO fields: |
68 |
| -- Limit-Up/Limit-Down NBBO (UTP) Indicator |
69 |
| -- Limit-Up/Limit-Down NBBO (CQS) Indicator |
70 |
| - |
71 |
| -The format change is scheduled to take effect from 1st February 2013 |
72 |
| -see: |
73 |
| -http://www.nyxdata.com/nysedata/default.aspx?tabid=993&id=1771 |
74 |
| -http://www.nyxdata.com/doc/185107 |
75 |
| - |
76 |
| -### 2012.07.31 (version 1.7) |
77 |
| -Amended taq.k and tq.q to recognise (and ignore) the new Quote field RPI (Retail Interest Indicator) |
78 |
| - |
79 |
| -### 2011.10.07 |
80 |
| -Added tq.q as example of how to load more fields from the FTP files. |
81 |
| -taq.k is unchanged. |
82 |
| - |
83 |
| -### 2011.08.11 |
84 |
| -Adjust partitioning used with -par to split at a whole symbol – so any particular symbol will only be in one partition. |
85 |
| - |
86 |
| -### 2011.08.07 |
87 |
| -Enabled to handle >2billion rows in input file. Use in combination with `-par` cmd line option and par.txt |
88 |
| - |
89 |
| -### 2010.08.19 |
90 |
| -Amend the file detection code to pick up the new file names (as well as the old ones) |
91 |
| -when NYSE changes the filenames for the FTP download on September 17th, 2010. |
92 |
| - |
93 |
| - |
94 |
| -## Hot linking from your application |
95 |
| - |
| 3 | +kdb-taq is a tool for processing and analyzing historical NYSE Daily TAQ (Trade and Quote) data using kdb+/q. This repository contains scripts and utilities to parse, load, and query TAQ datasets efficiently. |
| 4 | + |
| 5 | +## Prerequisites |
| 6 | + |
| 7 | +- [kdb+](https://kx.com/kdb-personal-edition-download/) installed on your machine |
| 8 | +- NYSE Daily TAQ files from [ftp.nyse.com](https://ftp.nyse.com) |
| 9 | + |
| 10 | +## Getting Started |
| 11 | + |
| 12 | +Follow the steps below to set up and process a TAQ file: |
| 13 | + |
| 14 | +### 1. Download a Sample TAQ File |
| 15 | + |
| 16 | +Obtain TAQ data files from the NYSE FTP link. For example: |
| 17 | + |
| 18 | +``` |
| 19 | +wget https://ftp.nyse.com/Historical%20Data%20Samples/DAILY%20TAQ/EQY_US_ALL_TRADE_20240702.gz |
| 20 | +``` |
| 21 | + |
| 22 | +These files are ~2GB each, so they may take significant time to download. |
| 23 | + |
| 24 | +### 2. Clone the Repository |
| 25 | + |
| 26 | +Clone the kdb-taq repository to your server: |
| 27 | + |
| 28 | +``` |
| 29 | +git clone https://github.com/KxSystems/kdb-taq.git |
| 30 | +cd kdb-taq |
| 31 | +``` |
| 32 | + |
| 33 | +### 3. Prepare the Data |
| 34 | + |
| 35 | +Create a source directory, move the downloaded TAQ file into it, and decompress it: |
| 36 | + |
| 37 | +``` |
| 38 | +mkdir SRC |
| 39 | +mv /path/to/EQY_US_ALL_TRADE_20240702.gz SRC/ |
| 40 | +gzip -d SRC/* |
| 41 | +``` |
| 42 | + |
| 43 | +### 4. Process the TAQ Data |
| 44 | + |
| 45 | +Run the tq.q script to process the data. Replace SRC with the full path to the source directory if necessary: |
| 46 | +``` |
| 47 | +q tq.q -s 8 SRC |
| 48 | +``` |
| 49 | + |
| 50 | +The optional `-s` option sets the number of secondary threads q uses. |
| 51 | + |
| 52 | +### 5. Load the Processed Data |
| 53 | + |
| 54 | +Load the data into the kdb+ environment: |
| 55 | +``` |
| 56 | +q)\l tq |
| 57 | +``` |
| 58 | + |
| 59 | +### 6. Query the Data |
| 60 | + |
| 61 | +You can now query the loaded data. For example, run `meta` to see the table schema and datatypes: |
| 62 | + |
| 63 | +``` |
| 64 | +q)meta trade |
| 65 | + |
| 66 | +c | t f a |
| 67 | +----------------------------------| ----- |
| 68 | +date | d |
| 69 | +Time | n |
| 70 | +Exchange | c |
| 71 | +Symbol | s p |
| 72 | +SaleCondition | s |
| 73 | +TradeVolume | i |
| 74 | +TradePrice | e |
| 75 | +TradeStopStockIndicator | b |
| 76 | +TradeCorrectionIndicator | h |
| 77 | +SequenceNumber | i |
| 78 | +TradeId | C |
| 79 | +SourceofTrade | c |
| 80 | +TradeReportingFacility | b |
| 81 | +ParticipantTimestamp | n |
| 82 | +TradeReportingFacilityTRFTimestamp| n |
| 83 | +TradeThroughExemptIndicator | b |
| 84 | +``` |
| 85 | +You can also run aggregations on the data, for example to get the number of trades and the maximum price for each hour: |
| 86 | +``` |
| 87 | +q)select numTrade:count i,maxPrice:max TradePrice by Time.hh from trade |
| 88 | + |
| 89 | +hh| numTrade maxPrice |
| 90 | +--| ------------------- |
| 91 | +1 | 14019 15.0399 |
| 92 | +2 | 28475 15.04391 |
| 93 | +3 | 28535 15.04839 |
| 94 | +4 | 194690 7465 |
| 95 | +5 | 122619 3880 |
| 96 | +6 | 117835 7475 |
| 97 | +7 | 281648 7460 |
| 98 | +8 | 676191 7458.8 |
| 99 | +9 | 7657888 611225.6 |
| 100 | +10| 11303243 611071.8 |
| 101 | +11| 8726594 610600 |
| 102 | +12| 7114388 610980 |
| 103 | +13| 7039454 611065 |
| 104 | +14| 7512397 611679.9 |
| 105 | +15| 16510252 613149.4 |
| 106 | +16| 385603 612600.2 |
| 107 | +17| 145800 7460 |
| 108 | +18| 121943 610668 |
| 109 | +19| 96918 610668 |
| 110 | +20| 6655 8662.955 |
| 111 | + |
| 112 | +``` |
| 113 | + |
| 114 | +## Changelog |
| 115 | +Detailed update history can be found in [CHANGELOG.md](CHANGELOG.md). |
| 116 | + |
| 117 | +## Best Practices for Integration |
96 | 118 |
|
97 | 119 | You are welcome to download and use this code according to the terms of the licence.
|
98 | 120 |
|
99 |
| -Kx Systems recommends you do not link your application to this repository, |
| 121 | +[KX](https://kx.com) recommends you do not link your application to this repository, |
100 | 122 | which would expose your application to various risks:
|
101 | 123 |
|
102 | 124 | - This is not a high-availability hosting service
|
103 | 125 | - Updates to the repo may break your application
|
104 | 126 | - Code refactoring might return 404s to your application
|
105 | 127 |
|
| 128 | +### Recommendation: |
106 | 129 | Instead, download code and subject it to the version control and regression testing
|
107 | 130 | you use for your application.
|