Qlib to load csv minutes level trading data #1775
I think your csv file needs some preprocessing before it can be converted to a bin file, with the following caveats.
Hi there,
I am new to Qlib. I searched online and asked an LLM about the questions below, but have found no solution so far.
Here is what I am facing:
I have 1-minute trading data in more than 10 csv files, each over 500MB. All the csv files follow the same format:
[instrument, time, open, high, low, close, volume, turnover, is_paused].
The 'instrument' column holds the asset code, so each file contains many stock codes.
The 'time' column holds the trading timestamp, e.g. '1/2/2019 9:53:00 AM'.
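For reference, a minimal sketch of how such timestamps could be normalized with pandas before any conversion. It assumes the strings are month/day/year with a 12-hour clock (the example '1/2/2019' is ambiguous on its own), and `normalize_time` is a hypothetical helper name, not part of Qlib.

```python
import pandas as pd

# Assumption: timestamps are month/day/year with a 12-hour clock and AM/PM.
# If the data is day/month/year, swap %m and %d.
TIME_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def normalize_time(series: pd.Series) -> pd.Series:
    """Parse strings like '1/2/2019 9:53:00 AM' into 'YYYY-MM-DD HH:MM:SS' strings."""
    parsed = pd.to_datetime(series, format=TIME_FORMAT)
    return parsed.dt.strftime("%Y-%m-%d %H:%M:%S")


sample = pd.Series(["1/2/2019 9:53:00 AM", "1/2/2019 1:05:00 PM"])
normalized = normalize_time(sample)
print(normalized.tolist())
```

Passing an explicit format avoids per-row format guessing, which matters for files of this size.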
Problems:
1. All the csv files are in one folder. I tried running 'python dump_bin.py dump_all --csv_path 'csv file folder path' --qlib_dir 'target file path' --symbol_field_name instrument --date_field_name time --include_fields open,high,low,close,volume,turnover,is_paused',
and the system returned 'concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.'
Is this because of a memory shortage (files too large)?
I ask because when I put only one csv file in the folder, 'python dump_bin.py' worked, but only partially.
In the instruments folder I only see an 'all.txt' file, and it has only one row: the csv file name, start date and end date.
There is a 'day.txt' in the calendar folder, but it only holds date-level data, e.g. '2019-01-02'; there are no minute timestamps.
I would appreciate any advice!
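The single row in 'all.txt', named after the csv file, suggests that the dump script treats each csv file as one instrument. One possible preprocessing step, sketched below under that assumption, is to split the multi-instrument files into one csv per instrument code, reading in chunks so that a 500MB file is never held in memory at once. `split_by_instrument` and its parameters are hypothetical names used for illustration only.

```python
from pathlib import Path

import pandas as pd


def split_by_instrument(src_dir, out_dir, chunksize=1_000_000):
    """Write one csv per instrument code, reading the source files in chunks.

    Assumes every source csv has an 'instrument' column. out_dir should be
    empty beforehand, because rows are appended to the per-instrument files.
    """
    src_dir, out_dir = Path(src_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = set()  # targets that already have a header row from this run
    for csv_file in sorted(src_dir.glob("*.csv")):
        # dtype=str keeps leading zeros in codes such as '000001'
        reader = pd.read_csv(csv_file, chunksize=chunksize, dtype={"instrument": str})
        for chunk in reader:
            for code, rows in chunk.groupby("instrument"):
                target = out_dir / f"{code}.csv"
                rows.to_csv(target, mode="a", header=target not in written, index=False)
                written.add(target)
    return sorted(p.name for p in written)


# Hypothetical usage; both paths are placeholders:
# split_by_instrument("raw_csv_folder", "per_instrument_csv_folder")
```

The output folder could then be passed to dump_bin.py as the csv path. The script also appears to accept a `--freq` option (default `day`), which may be why only a 'day.txt' calendar was produced; that reading is an assumption and is worth checking against the script's own arguments.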