Skip to content

Latest commit

 

History

History
185 lines (121 loc) · 7.4 KB

import_csv_to_db.md

File metadata and controls

185 lines (121 loc) · 7.4 KB

Import CSV file to a Database, quickly

Challenge: you have a csv file and you need to import it into a new table in a database. You have 5 mins to get it done. What tools do you use?

@secretgeek

Below are many different solutions to this common problem, for a variety of databases and operating systems.

Table of contents:

Any database with any OS

Some brute force approaches (common!)

  • Sublime (editor), use Multiple line edit to turn the data into SQL

  • vim, "A combination of macros, regex, and standard commands that change each line to an insert statement." (via OJ)

  • (Any editor), use search/replace (perhaps with regular expressions) to turn the data into SQL

  • (Any spreadsheet), write formulae that combine the data together to turn the data into SQL, "="('" + A1 + "','" + A2..." etc.`

  • Automate the search/replace approach with NimbleText

      $ONCE
      CREATE TABLE Contacts
      ("<% $row.replace(/[ ]*,[ ]*/g,'" varchar(50) NULL,\n"') %>" varchar(50) NULL)
      GO
    
      Insert into Contacts
      ('<% $row.replace(/[ ]*,[ ]*/g,"','") %>')
      Values
      $EACH+
      (<% ("'" + $row + "'").replace(/[ ]*,[ ]*/g,"','").replace(/'NULL'/g,'NULL') %>)<% if ($rowNumOne != $numrows) { ',' } %>
      $ONCE
      GO
    

Any database with Windows

  • LinqPadCSV to IEnumerable (or Array) in Linqpad

  • Microsoft Log Parser

     Logparser -o sql -server 127.0.0.1 -database -createtable on "select * into newtable from C:\Apps\Logs\Logfile.log" 
    
  • Alteryx ($$$)

    • drag in an Input data, point it at the csv file.
    • (optional) drag in a Select, use it to configure the column types
    • drag in an Output data, point it at the database (and set the table name)
    • Run!
  • Generate SQL Insert Statements with NimbleText (desktop version)

    The desktop version lets you use a "real" CSV parser that handles embedded delimiters/line breaks etc. Just switch on the use qualifier option under tools | options.

Any Database with MacOS

(No examples available, so use solutions for "Any database with any OS")

MS SQL Server Specific

SQLite specific

  • Tutorial: Import a CSV File Into an SQLite Table

    If there are no header names in the csv, then find the column names and run:

      create table mytable (header_a, header_b, header_c) 
    

    Next step (do this whether you needed the previous step or not)

      .mode csv
      .import data.csv mytable
    

MySQL specific

  • Load Data Infile

      LOAD DATA INFILE 'c:/tmp/discounts.csv' 
      INTO TABLE discounts 
      FIELDS TERMINATED BY ',' 
      ENCLOSED BY '"'
      LINES TERMINATED BY '\n'
      IGNORE 1 ROWS;
    
  • PhpMyAdmin

Postgresql specific

Two techniques: Copy command in sql itself, and \copy command in psql (interactive commandline).

  • Sql-copy — requires root access

     COPY dog_habits FROM '/home/user521/pets/dog_habits.csv' DELIMITER ',' CSV HEADER;
    
  • psql \copy — use interactive psql commandline where file permissions are an issue

     psql -c "\copy sample FROM '/home/user521/pets/dog_habits.csv' WITH (FORMAT CSV)"
    

Oracle specific

Hadoop specific

  • @BenBrandwood gives this Hadoop example...

      hive> CREATE EXTERNAL TABLE IF NOT EXISTS test_ext
      > (ID int,
      > DEPT int,
      > NAME string
      > )
      > ROW FORMAT DELIMITED
      > FIELDS TERMINATED BY ','
      > STORED AS TEXTFILE
      > LOCATION '/test';
      OK
      Time taken: 0.395 seconds
    

Alternative: Just treat the CSV as SQL

  • TextQL — Execute SQL against CSV or TSV.
  • q — Run SQL directly on CSV Files
  • RBQL — Rainbow Query Language, a SQL-like language with JavaScript or Python backend.
  • PSKit Query — Powershell module lets you run simple queries over objects, including imported with csv.