Commit f6ba6aa

committed
updated readme
1 parent fc7011e commit f6ba6aa

File tree

2 files changed

+206
-94
lines changed

CHANGELOG.md

Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
1+
## Changes to kdb+taq
2+
3+
### 2022.01.06 (`Bid_Price` type fix in tq.q)
4+
changing `Bid_Price` from real to float to avoid `Bid_Price`, `Offer_Price` type inconsistency
5+
6+
### 2020.06.12 (version 3.3 - sync with NYSE daily taq version 3.3)
7+
handle (ignore) additional flags TradedOnMEMX and TradedOnMIAX in master file
8+
https://www.nyse.com/publicdocs/nyse/data/Daily_TAQ_Client_Spec_v3.3.pdf
9+
10+
### 2020.04.15 (version 3.2 - sync with NYSE daily taq version 3.2)
11+
handle (ignore) additional flag TradedOnLTSE in master file
12+
https://www.nyse.com/publicdocs/nyse/data/Daily_TAQ_Client_Spec_v3.2.pdf
13+
14+
### 2018.10.05
15+
small changes to field load types (mostly just wider) - see before/after
16+
versions of nf2, tf2, qf2
17+
18+
### 2017.10.27 (version 3.0 - jump in version number to sync with NYSE daily taq version 3.0b)
19+
support NYSE change to schema as specified in:
20+
https://list.theice.com/t/92262/395348/57007/0/
21+
22+
### 2016.10.18 (version 2.2 - jump in version number to sync with NYSE daily taq specs)
23+
support NYSE change to schema, format and datacentre as specified in:
24+
http://www.nyxdata.com/doc/247075
25+
26+
taq.k reads the new data files, but saves data to the old schema, discarding
27+
nano-second timestamp precision. It is a drop-in replacement for the previous
28+
version for those who don't want to move to the new schema yet.
29+
30+
tq.q reads and exports ALL fields for trade, quote and nbbo
31+
32+
### 2015.07.27 (version 1.12)
33+
Timestamp precision extended from millisecond to microseconds (additional 3 digits ignored for now).
34+
Ignore 3 additional fields participant timestamp, RRN and TRF
35+
http://www.nyxdata.com/nysedata/default.aspx?tabid=993&id=2784
36+
37+
### 2014.04.28 (version 1.11)
38+
Support additional exchange codes (12-15 - ZJKY) in taqmaster
39+
40+
### 2014.01.15 (version 1.10)
41+
Amend taq.k to discard the first line of taqmaster* when it just lists the record count
42+
43+
### 2013.11.25 (version 1.9)
44+
Amend taq.k and tq.q to recognise and ignore the new quote fields:
45+
- SIP-generated Message Identifier
46+
- National BBO LULD Indicator
47+
tq.q to recognise and ignore the new NBBO fields:
48+
- Limit-Up/Limit-Down Indicator
49+
- Limit-up/Limit-down NBBO Indicator
50+
- SIP-generated Message Identifier
51+
and handle deletion of previous NBBO fields:
52+
- Limit-Up/Limit-Down NBBO (UTP) Indicator
53+
- Limit-Up/Limit-Down NBBO (CQS) Indicator
54+
55+
The format change is scheduled to take effect from 2nd December 2013
56+
see:
57+
http://www.nyxdata.com/nysedata/Default.aspx?tabID=993&id=2194
58+
59+
### 2013.01.22 (version 1.8)
60+
Amended taq.k and tq.q to recognise and ignore the new quote fields:
61+
- Short Sale Restriction (SSR) Indicator
62+
- Limit-Up/Limit-Down BBO (UTP) Indicator
63+
- Limit-Up/Limit-Down BBO (CQS) Indicator
64+
- FINRA ADF MPID Indicator
65+
and tq.q to recognise and ignore the new NBBO fields:
66+
- Limit-Up/Limit-Down NBBO (UTP) Indicator
67+
- Limit-Up/Limit-Down NBBO (CQS) Indicator
68+
69+
The format change is scheduled to take effect from 1st February 2013
70+
see:
71+
http://www.nyxdata.com/nysedata/default.aspx?tabid=993&id=1771
72+
http://www.nyxdata.com/doc/185107
73+
74+
### 2012.07.31 (version 1.7)
75+
Amended taq.k and tq.q to recognise (and ignore) the new Quote field RPI (Retail Interest Indicator)
76+
77+
### 2011.10.07
78+
Added tq.q as example of how to load more fields from the FTP files.
79+
taq.k is unchanged.
80+
81+
### 2011.08.11
82+
Adjust partitioning used with -par to split at a whole symbol – so any particular symbol will only be in one partition.
83+
84+
### 2011.08.07
85+
Enabled to handle >2billion rows in input file. Use in combination with `-par` cmd line option and par.txt
86+
87+
### 2010.08.19
88+
Amend the file detection code to pick up the new file names (as well as the old ones)
89+
when NYSE changes the filenames for the FTP download on September 17th, 2010.

README.md

Lines changed: 117 additions & 94 deletions
@@ -1,107 +1,130 @@
11
# kdb+taq
22

3-
## Changes to kdb+taq
4-
5-
### 2022.01.06 (`Bid_Price` type fix in tq.q)
6-
changing `Bid_Price` from real to float to avoid `Bid_Price`, `Offer_Price` type inconsistency
7-
8-
### 2020.06.12 (version 3.3 - sync with NYSE daily taq version 3.3)
9-
handle (ignore) additional flags TradedOnMEMX and TradedOnMIAX in master file
10-
https://www.nyse.com/publicdocs/nyse/data/Daily_TAQ_Client_Spec_v3.3.pdf
11-
12-
### 2020.04.15 (version 3.2 - sync with NYSE daily taq version 3.2)
13-
handle (ignore) additional flag TradedOnLTSE in master file
14-
https://www.nyse.com/publicdocs/nyse/data/Daily_TAQ_Client_Spec_v3.2.pdf
15-
16-
### 2018.10.05
17-
small changes to field load types (mostly just wider) - see before/after
18-
versions of nf2, tf2, qf2
19-
20-
### 2017.10.27 (version 3.0 - jump in version number to sync with NYSE daily taq version 3.0b)
21-
support NYSE change to schema as specified in:
22-
https://list.theice.com/t/92262/395348/57007/0/
23-
24-
### 2016.10.18 (version 2.2 - jump in version number to sync with NYSE daily taq specs)
25-
support NYSE change to schema, format and datacentre as specified in:
26-
http://www.nyxdata.com/doc/247075
27-
28-
taq.k reads the new data files, but saves data to the old schema, discarding
29-
nano-second timestamp precision. It is a drop-in replacement for the previous
30-
version for those who don't want to move to the new schema yet.
31-
32-
tq.q reads and exports ALL fields for trade, quote and nbbo
33-
34-
### 2015.07.27 (version 1.12)
35-
Timestamp precision extended from millisecond to microseconds (additional 3 digits ignored for now).
36-
Ignore 3 additional fields participant timestamp, RRN and TRF
37-
http://www.nyxdata.com/nysedata/default.aspx?tabid=993&id=2784
38-
39-
### 2014.04.28 (version 1.11)
40-
Support additional exchange codes (12-15 - ZJKY) in taqmaster
41-
42-
### 2014.01.15 (version 1.10)
43-
Amend taq.k to discard the first line of taqmaster* when it is just lists the record count
44-
45-
### 2013.11.25 (version 1.9)
46-
Amend taq.k and tq.q to recognise and ignore the new quote fields:
47-
- SIP-generated Message Identifier
48-
- National BBO LULD Indicator
49-
tq.q to recognise and ignore the new NBBO fields:
50-
- Limit-Up/Limit-Down Indicator
51-
- Limit-up/Limit-down NBBO Indicator
52-
- SIP-generated Message Identifier
53-
and handle deletion of previous NBBO fields:
54-
- Limit-Up/Limit-Down NBBO (UTP) Indicator
55-
- Limit-Up/Limit-Down NBBO (CQS) Indicator
56-
57-
The format change is scheduled to take effect from 2nd December 2013
58-
see:
59-
http://www.nyxdata.com/nysedata/Default.aspx?tabID=993&id=2194
60-
61-
### 2013.01.22 (version 1.8)
62-
Amended taq.k and tq.q to recognise and ignore the new quote fields:
63-
- Short Sale Restriction (SSR) Indicator
64-
- Limit-Up/Limit-Down BBO (UTP) Indicator
65-
- Limit-Up/Limit-Down BBO (CQS) Indicator
66-
- FINRA ADF MPID Indicator
67-
and tq.q to recognise and ignore the new NBBO fields:
68-
- Limit-Up/Limit-Down NBBO (UTP) Indicator
69-
- Limit-Up/Limit-Down NBBO (CQS) Indicator
70-
71-
The format change is scheduled to take effect from 1st February 2013
72-
see:
73-
http://www.nyxdata.com/nysedata/default.aspx?tabid=993&id=1771
74-
http://www.nyxdata.com/doc/185107
75-
76-
### 2012.07.31 (version 1.7)
77-
Amended taq.k and tq.q to recognise (and ignore) the new Quote field RPI (Retail Interest Indicator)
78-
79-
### 2011.10.07
80-
Added tq.q as example of how to load more fields from the FTP files.
81-
taq.k is unchanged.
82-
83-
### 2011.08.11
84-
Adjust partitioning used with -par to split at a whole symbol – so any particular symbol will only be in one partition.
85-
86-
### 2011.08.07
87-
Enabled to handle >2billion rows in input file. Use in combination with `-par` cmd line option and par.txt
88-
89-
### 2010.08.19
90-
Amend the file detection code to pick up the new file names (as well as the old ones)
91-
when NYSE changes the filenames for the FTP download on September 17th, 2010.
92-
93-
94-
## Hot linking from your application
95-
3+
kdb-taq is a tool for processing and analyzing historical NYSE Daily TAQ (Trade and Quote) data using kdb+/q. This repository contains scripts and utilities to parse, load, and query TAQ datasets efficiently.
4+
5+
## Prerequisites
6+
7+
- [kdb+](https://kx.com/kdb-personal-edition-download/) installed on your machine
8+
- NYSE Daily TAQ files from [ftp.nyse.com](https://ftp.nyse.com)
9+
10+
## Getting Started
11+
12+
Follow the steps below to set up and process a TAQ file:
13+
14+
### 1. Download a Sample TAQ File
15+
16+
Download TAQ data files from the NYSE FTP site. For example:
17+
18+
```
19+
wget https://ftp.nyse.com/Historical%20Data%20Samples/DAILY%20TAQ/EQY_US_ALL_TRADE_20240702.gz
20+
```
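
If you need samples for more than one date, the download command above can be wrapped in a small helper. This is only a sketch: the function name `taq_sample_url` is hypothetical, and it assumes other sample files follow the same naming pattern as the example file, which may not hold for every date.

```shell
#!/bin/sh
# Hypothetical helper: build the sample-file URL for a trade date (YYYYMMDD).
# Assumes the naming pattern of the example file above; a given date is not
# guaranteed to exist on the server.
taq_sample_url() {
    printf 'https://ftp.nyse.com/Historical%%20Data%%20Samples/DAILY%%20TAQ/EQY_US_ALL_TRADE_%s.gz\n' "$1"
}

# Print the URL for the example date used in this README.
taq_sample_url 20240702

# To download it, resuming a partial transfer if one exists (-c):
# wget -c "$(taq_sample_url 20240702)"
```

The `-c` flag lets `wget` continue an interrupted download, which can help with files of this size.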
21+
22+
These files are about 2GB each, so they may take a significant time to download.
23+
24+
### 2. Clone the Repository
25+
26+
Clone the kdb-taq repository to your server:
27+
28+
```
29+
git clone https://github.com/KxSystems/kdb-taq.git
30+
cd kdb-taq
31+
```
32+
33+
### 3. Prepare the Data
34+
35+
Create a source directory, move the downloaded TAQ file into it, and decompress it:
36+
37+
```
38+
mkdir SRC
39+
mv /path/to/EQY_US_ALL_TRADE_20240702.gz SRC/
40+
gzip -d SRC/*
41+
```
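
Because a large download can be truncated, it may be worth testing each archive before decompressing it. The sketch below shows one way to do that; `check_and_unpack` is a hypothetical helper and is not part of kdb-taq.

```shell
#!/bin/sh
# Hypothetical helper: test every .gz file in a directory and decompress
# only the ones that pass the integrity check.
check_and_unpack() {
    dir=$1
    for f in "$dir"/*.gz; do
        [ -e "$f" ] || continue      # no .gz files: the glob stayed literal
        if gzip -t "$f"; then        # -t tests the archive without writing output
            gzip -d "$f"             # replaces file.gz with file
        else
            echo "corrupt archive, skipped: $f" >&2
        fi
    done
}

# Usage, with the SRC directory created above:
# check_and_unpack SRC
```

A file that fails the check is left in place, so you can download it again without touching the files that were already unpacked.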
42+
43+
### 4. Process the TAQ Data
44+
45+
Run the tq.q script to process the data. Replace SRC with the full path to the source directory if necessary:
46+
```
47+
q tq.q -s 8 SRC
48+
```
49+
50+
The optional `-s` option sets the number of secondary threads that kdb+ uses for parallel processing.
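
One possible starting point for the `-s` value is the number of online CPUs. The sketch below is an assumption rather than a recommendation from this repository: `pick_threads` is a hypothetical helper, and the best value depends on your machine and workload.

```shell
#!/bin/sh
# Hypothetical helper: print the number of online CPUs, falling back to 1
# if it cannot be determined.
pick_threads() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
    case $n in
        ''|*[!0-9]*) n=1 ;;          # not a plain number: fall back to 1
    esac
    [ "$n" -ge 1 ] || n=1
    echo "$n"
}

# Usage (requires q on your PATH):
# q tq.q -s "$(pick_threads)" SRC
```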
51+
52+
### 5. Load the Processed Data
53+
54+
Load the data into the kdb+ environment:
55+
```
56+
q)\l tq
57+
```
58+
59+
### 6. Query the Data
60+
61+
You can now query the loaded data. For example, run `meta` to see the table schema and datatypes:
62+
63+
```
64+
q)meta trade
65+
66+
c | t f a
67+
----------------------------------| -----
68+
date | d
69+
Time | n
70+
Exchange | c
71+
Symbol | s p
72+
SaleCondition | s
73+
TradeVolume | i
74+
TradePrice | e
75+
TradeStopStockIndicator | b
76+
TradeCorrectionIndicator | h
77+
SequenceNumber | i
78+
TradeId | C
79+
SourceofTrade | c
80+
TradeReportingFacility | b
81+
ParticipantTimestamp | n
82+
TradeReportingFacilityTRFTimestamp| n
83+
TradeThroughExemptIndicator | b
84+
```
85+
You can also run aggregations on the data. For example, get the number of trades and the maximum price for each hour:
86+
```
87+
q)select numTrade:count i,maxPrice:max TradePrice by Time.hh from trade
88+
89+
hh| numTrade maxPrice
90+
--| -------------------
91+
1 | 14019 15.0399
92+
2 | 28475 15.04391
93+
3 | 28535 15.04839
94+
4 | 194690 7465
95+
5 | 122619 3880
96+
6 | 117835 7475
97+
7 | 281648 7460
98+
8 | 676191 7458.8
99+
9 | 7657888 611225.6
100+
10| 11303243 611071.8
101+
11| 8726594 610600
102+
12| 7114388 610980
103+
13| 7039454 611065
104+
14| 7512397 611679.9
105+
15| 16510252 613149.4
106+
16| 385603 612600.2
107+
17| 145800 7460
108+
18| 121943 610668
109+
19| 96918 610668
110+
20| 6655 8662.955
111+
112+
```
113+
114+
## Changelog
115+
Detailed update history can be found in [CHANGELOG.md](CHANGELOG.md).
116+
117+
## Best Practices for Integration
96118

97119
You are welcome to download and use this code according to the terms of the licence.
98120

99-
Kx Systems recommends you do not link your application to this repository,
121+
[KX](https://kx.com) recommends you do not link your application to this repository,
100122
which would expose your application to various risks:
101123

102124
- This is not a high-availability hosting service
103125
- Updates to the repo may break your application
104126
- Code refactoring might return 404s to your application
105127

128+
### Recommendation
106129
Instead, download code and subject it to the version control and regression testing
107130
you use for your application.
