Remove indexes before bulk insert and add them back after #12
Conversation
Doesn't work for Rails < 4.0: travis.
Force-pushed from eb114e3 to ce0fd35
Sorry, didn't notice that – fixed now.
It will be better to use […]. More generally: I assume that removing indexes during insertion should make the whole process faster, but my attempts with and without removing indexes showed no difference using sqlite and postgres, or the opposite effect using mysql. Could you please create an example (schema and a script to generate data) which will be faster to expand with indexes removed?
@aishek Did you get a notification about my comment?
@toy yeah, I've got it. A lot of work these days, sorry :)
@aishek No worries :)
Let's discuss the impact of removing indexes on speed. Here is an example for sqlite3:
create_index_first.sql
I suppose the high selectivity of the age column in my example matters. What do you think, should this feature be present in dump? If so, I'll improve my pull request using your comments.
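The attached SQL file itself is not reproduced here, so the following is only a sketch of the kind of comparison being described: the people/age schema and the 100,000-row count are taken from the benchmark later in this thread, while the second file name and everything else are my assumptions.

```ruby
#!/usr/bin/env ruby
# Build two SQLite scripts that load the same rows, one creating the index
# before the inserts (create_index_first.sql) and one after (create_index_last.sql).
require 'benchmark'

schema  = "create table people (age int);\n"
index   = "create index people_age on people(age);\n"
inserts = "begin transaction;\n" +
          Array.new(100_000) { "insert into people (age) values (#{rand(20..30)});\n" }.join +
          "commit;\n"

File.write('create_index_first.sql', schema + index + inserts)
File.write('create_index_last.sql',  schema + inserts + index)

Benchmark.bm(12) do |x|
  x.report('index first') { system('sqlite3 first.sqlite < create_index_first.sql') }
  x.report('index last')  { system('sqlite3 last.sqlite < create_index_last.sql') }
end
```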
I am convinced. Reworked example for multiple databases:

#!/usr/bin/env ruby
require 'benchmark'

# SQL snippets shared by all databases
table_cmd = 'echo "create table people (age int);"'
index_cmd = 'echo "create index people_age on people(age);"'
drop_cmd  = 'echo "drop table people;"'

# Create data.sql inserting `n` rows, `per_insert` rows per statement
n = 100_000
per_insert = 500
values = Array(20..30)
File.open('data.sql', 'w') do |f|
  (n / per_insert).times do
    values_sql = Array.new(per_insert){ "('#{values.sample}')" }.join(',')
    f.puts("insert into people (age) VALUES #{values_sql};")
  end
end
data_cmd = 'cat data.sql'

# For each database, compare loading the data with the index created after
# the inserts vs. before them
Benchmark.bmbm do |x|
  {
    :sqlite   => 'sqlite3 test.sqlite',
    :mysql    => 'mysql5 -D test',
    :postgres => 'psql -q test',
  }.each do |db_name, db_cmd|
    x.report("#{db_name}: data then index") do
      system "(#{[table_cmd, data_cmd, index_cmd, drop_cmd] * ';'}) | #{db_cmd}"
    end
    x.report("#{db_name}: index then data") do
      system "(#{[table_cmd, index_cmd, data_cmd, drop_cmd] * ';'}) | #{db_cmd}"
    end
  end
end

Results:
Force-pushed from 1255e19 to 5df6df8
Force-pushed from cd89333 to b6e666f
Force-pushed from 78318c7 to 6adfbcf
I am using […]. I also removed additional […]. I also added tests, rebased the code, and fixed […].
After looking at the Rails DB adapters code, it seems to me that not all index options which are supported by […]. Your current approach for adding indexes ignores at least lengths and orders. To check: try to add an index with a length to a table in a MySQL DB and check what indexes are returned by […]. The test you've added checks only if […]. Changes to […]. I've already cherry-picked your commits about […].
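A minimal sketch of that check, assuming a Rails 4-era ActiveRecord with the mysql2 adapter and a local test database (the connection settings and the table are placeholders, not taken from the pull request):

```ruby
require 'active_record'

# Placeholder connection settings; adjust adapter/credentials for your setup.
ActiveRecord::Base.establish_connection(:adapter => 'mysql2', :database => 'test')
connection = ActiveRecord::Base.connection

connection.create_table(:people, :force => true) { |t| t.string :name }
connection.add_index :people, :name, :length => 10

connection.indexes(:people).each do |index|
  # MySQL reports the prefix length via index.lengths; re-creating the index
  # from index.columns alone would silently drop it.
  p [index.name, index.columns, index.lengths, index.orders, index.unique]
end
```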
I'll fix these issues later. Thanks for the detailed remarks!
Ok, I fixed all of them.
There are two things to discuss. First, I've extracted the assets-related logic from […]. Second, 7966b03#diff-89f6960ada41447910198cd9a526b3dcR73 – an ugly […]. What do you think?
Sorry for the long wait. About extracting the assets-related logic: better not to do it as part of this pull request; just increase […]. Entry in readme: there is […].
Also, you handle lengths but not orders, and the PostgreSQL adapter returns orders. So I still think that instead of using […].
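The alternative being suggested is cut off above. Purely as my own assumption of one possible direction, the raw index DDL could be captured and replayed verbatim so that adapter-specific details survive; a PostgreSQL-only sketch:

```ruby
require 'active_record'

# Full CREATE INDEX statements as PostgreSQL stores them (column orders,
# operator classes and partial-index predicates included). Note that
# pg_indexes also lists the primary-key index, which should not be dropped
# around a bulk insert.
def index_ddl(table)
  connection = ActiveRecord::Base.connection
  connection.select_values(<<-SQL)
    SELECT indexdef FROM pg_indexes
    WHERE schemaname = 'public' AND tablename = #{connection.quote(table.to_s)}
  SQL
end
```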
I've improved bulk insert performance: remove all indexes before the first insert, and re-create them after all inserts, to avoid rebuilding the indexes on every insert operation.
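A rough sketch of that pattern (my illustration, not the gem's actual implementation; the helper names are invented and, per the discussion above, the option handling is simplified):

```ruby
require 'active_record'

# Drop a table's indexes, run the bulk inserts, then rebuild the indexes once
# over the complete data set instead of updating them for every insert batch.
def with_indexes_removed(table)
  connection  = ActiveRecord::Base.connection
  definitions = connection.indexes(table)

  definitions.each { |index| connection.remove_index(table, :name => index.name) }
  yield
ensure
  (definitions || []).each do |index|
    # Simplified: as discussed above, lengths and orders would also need to be
    # carried over to fully restore the original definitions.
    connection.add_index(table, index.columns,
                         :name => index.name, :unique => index.unique)
  end
end

# Hypothetical usage around whatever performs the inserts:
#   with_indexes_removed(:people) { insert_all_rows(:people) }
```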