0.13.0
We are excited to announce the release of Dolt 0.13.0, hot on the heels of relaunching DoltHub.
Easy Install Script
Installing Dolt is now a one-line affair. If you haven't tried it yet, you can obtain a copy with a single command and start playing with datasets:
$ curl -L https://github.com/liquidata-inc/dolt/releases/latest/download/install.sh | bash
The installer script works on Mac and Linux. For Windows, download the MSI installer below.
System Tables
We released a blog post detailing some exciting new functionality for surfacing versioning data in Dolt. This is the first of a set of features that will eventually expose all the Git-like internals of Dolt to SQL, and facilitate automated use of Dolt by allowing users to define their default choices inside SQL statements.
- dolt_log: Access the same information as the dolt log command via a SQL query.
- dolt_diff_$table: A system table for each of your tables, which lets you query the diff between two commits. See the blog post for more details.
- dolt_history_$table: A system table for each of your tables, which lets you query past values of rows in the table at any commit in its history. See the blog post for more details.
LICENSE and README functionality
We now allow users to create License and Readme documents as part of their Dolt repository. These appear as LICENSE.md and README.md files in the root of your repo. Edit them with the text editor of your choice, then add them to a commit with dolt add, the same as a table. Their contents are versioned alongside your tables' data. License and Readme files will soon be visible on DoltHub for repositories that provide them. Allowing users to specify the terms on which data is available is an important step towards creating a vibrant data-sharing community.
Views
Our SQL implementation now supports persistent views, taking us closer to having a fully functioning SQL engine. Create a view using the standard SQL syntax:
CREATE VIEW myview AS SELECT col1, col2 FROM mytable
Then query it like any other table:
SELECT * FROM myview
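As a rough illustration of the two statements above, the following Python sketch runs them against an in-memory SQLite database. SQLite stands in for Dolt here only because it ships with Python and accepts the same standard CREATE VIEW syntax; the col3 column and the sample rows are invented for the example.

```python
import sqlite3

# SQLite is only a stand-in for Dolt's SQL engine in this sketch.
conn = sqlite3.connect(":memory:")

# mytable, col1 and col2 come from the text above;
# col3 and the rows are hypothetical, added for illustration.
conn.execute("CREATE TABLE mytable (col1 INTEGER PRIMARY KEY, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, "a", "hidden"), (2, "b", "hidden")],
)

# Create the view with standard SQL syntax, then query it like any other table.
conn.execute("CREATE VIEW myview AS SELECT col1, col2 FROM mytable")
view_rows = conn.execute("SELECT * FROM myview ORDER BY col1").fetchall()
print(view_rows)
```

The view exposes only the two selected columns, so col3 never appears in the result.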
Other
We made performance enhancements to SQL, including supporting indexed joins on a table's primary key columns. This should make the engine usable for joins on the primary key columns of two tables. Additional improvements in join performance are in the works. We also fixed assorted bugs and made performance improvements in other areas of the SQL engine.
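To give a feel for why an indexed join on primary keys helps, here is a minimal Python sketch comparing a nested-loop join with a lookup through a primary-key index. It illustrates the general technique only, not the go-mysql-server implementation; the table contents and function names are hypothetical.

```python
# Two toy tables keyed by primary key (hypothetical data).
orders = [(1, "widget"), (2, "gadget"), (3, "gizmo")]  # (pk, item)
prices = [(1, 9.99), (3, 4.50), (4, 1.25)]             # (pk, price)


def nested_loop_join(left, right):
    """Compare every left row with every right row: O(len(left) * len(right))."""
    return [
        (lpk, item, price)
        for lpk, item in left
        for rpk, price in right
        if lpk == rpk
    ]


def indexed_join(left, right):
    """Look each left key up in an index on the right table's primary key."""
    # The dict stands in for a primary-key index that a database already maintains.
    pk_index = {rpk: price for rpk, price in right}
    return [
        (lpk, item, pk_index[lpk])
        for lpk, item in left
        if lpk in pk_index
    ]


nested = nested_loop_join(orders, prices)
indexed = indexed_join(orders, prices)
print(indexed)
```

Both functions return the same rows; the indexed version replaces the inner scan with one lookup per left row.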
Merged PRs
- 331: Removed Windows carriage-return and trailing whitespace from bats tests
- 329: CSV export compliant with RFC 4180
- 328: bats/helper/windows-compat.bash: Try mktemp on Windows.
- 326: one down
  The other 32 skipped bats tests are confirmed to fail
- 324: Removed old table and schema commands from the command line
- 323: fix buffered sequence iterator and put it back in row iterator
- 322: reverting buffered iter
- 320: bats/creds.bats: Debug windows failures.
- 319: Added a bats test for committing views and referencing them later
  Added some checks for checked in views.
- 318: Added test case for dolt reset --hard on new tables
- 316: Buffered Sequence Iterator
  - Created a new interface sequenceIterator for the use case when sequenceCursor is simply accessing elements in its sequence (i.e. MapIterator, SetIterator, and ListIterator)
  - Created a new buffered implementation of sequenceIterator, designed by @reltuk, to batch chunk fetching from the ValueStore. In use cases such as DoltHub where chunk fetching IO is slow, this will dramatically accelerate performance.
- 310: Km/non-trivial merge of master into doc feature branch
  This is just a merge from master into my doc feature branch. So you can ignore that there are many commits authored by not-me.
  I wanted to get eyes on the last commit before I merge it (d4de259). I had to remove 2 of the HasDoltPrefix checks that were breaking create-views.bats. Now I'm checking for DocTableName explicitly. I left the HasDoltPrefix function since I'm using it in the commands package, and presume we'll eventually need to use it again in the sqle package.
- 308: Updated to latest go-mysql-server. Re-enabled indexes by default, and un-skipped an integration test of indexed join behavior.
- 307: go/utils/publishrelease: First pass at an install.sh
- 306: Bumped go-mysql-server version
- 305: bats/creds.bats: Some initial bats tests for dolt creds new, ls and rm.
- 302: Km/doc tests
  This PR:
  - Simplifies tests in docs.bats
  - Adds tests for some helper functions in doltdb/root_val_test.go

  Will do more testing tomorrow, but wanted to get this in
- 301: dumps docs
  This code dumps the standard command line help pages for every command that isn't hidden.
  Because we only had functions for each command it was difficult to add a new method that would be implemented for each command, so I had to refactor all of that code. The refactor makes up the bulk of the PR.
- 299: dolt checkout, and merge with dolt docs, with bats coverage
  This PR includes:
  - Fixed a bug where dEnv.Docs was not always matching the docs of the current repo state (working root). This required changing the Docs type in the env package to []doltdb.DocDetails from []*doltdb.DocDetails. You'll see some reformatting to accommodate this change.
  - checkout <doc>
  - checkout <branch>
  - merge <branch> (one scenario is still buggy, need help identifying solution)
    - FF merge - docs on the FS get updated to target branch
    - Merge with conflicts - docs on the FS remain as is
    - Merge auto resolved conflicts (currently buggy) - docs on the FS should be updated to targetBranch, but they should not be added to the new working root. This would allow dolt status to indicate that the doc needs to be added and committed to finish merging. Right now it appears the doc is getting added to the working root.
- 298: go/cmd/dolt/commands/sql: Add view persistence into dolt database.
- 296: go/cmd/dolt: credcmds/check: Add dolt creds check command.
- 295: update go-mysql-server to be the latest from liquidata-inc/go-mysql-server@ld-master
- 294: Added indexes to dolt sqllogictest harness and updated dependency on go-mysql-server.
- 293: go/cmd/dolt/commands/credcmds: Add documentation and a little bit of chrome to dolt creds commands.
- 291: Fixed ignoring an error in put-row
- 290: go/go.mod: Run go get -u all. Migrate to dbr/v2.
- 289: Tim/add docs bats
  This is the test for branch, merge, and conflict resolve behavior. You can break it into multiple tests if you want but I think this is fine.
- 288: add diff_type column to be able to select where diff_type is added, removed, or modified
- 287: fixes casing issue with system tables
- 285: Added bad describe bats test per testing session with Katie
- 284: {go,bats}: Implement dolt diff by parsing docs from args, with bats test
- 283: Zachmu/explain
  Fixed describe table statements, and unskipped related tests.
- 282: change the date field to be a Sql.DateTime
  Output of the date field was in a format that wasn't able to be sorted properly.
- 281: fix select on system table that doesn't exist
  fix select on a system table that has a valid prefix but whose suffix does not match a valid table.
  What makes this a little bit tough is that you can query diffs or the history of a table that no longer exists. So need to process the entire history and then see if at any time there was a schema'd table with the given name.
- 280: {bats, go/libraries/doltcore/sqle/database.go}: Remove DoltNamespace from dolt sql command
- 279: {go,bats}: Remove DocTableName from dolt table, schema, ls, add, reset, diff
  This PR removes DocTableName from the outstanding commands so we don't expose the dolt docs table.
- 278: {go,bats}: Add dolt docs to dolt diff
  This PR adds docs to the dolt diff command. Diffing individual docs (dolt diff <doc>) will land in a subsequent PR.
  Here are some example outputs:
  Removing a file that has already been committed:
      rm LICENSE.md
      dolt diff
      diff --dolt a/LICENSE.md b/LICENSE.md
      deleted doc
      - new license
  Adding docs that don't already exist on the staged root:
      touch README.md
      touch LICENSE.md
      dolt diff
      diff --dolt a/README.md b/README.md
      added doc
      diff --dolt a/LICENSE.md b/LICENSE.md
      added doc
  Modifying a doc that has already been committed:
      dolt diff
      diff --dolt a/LICENSE.md b/LICENSE.md
      --- a/LICENSE.md
      +++ b/LICENSE.md
      this is my license
      +
      + How to use this repository:
      + Step 1)....
- 274: Zachmu/alter table engine
  Alter table statements on the go-mysql-server engine, with new support for modify column statements.
- 272: go/store/nbs: Fix CopySource in UploadPartCopy for namespaced aws table persisters.
- 271: go/store/nbs: Disable persisting table data in dynamodb.
- 270: Removed some extraneous bash commands I found in bats
- 269: go/store/util/verbose: Add ability to override Log function.
- 268: added args check to dolt sql command
- 267: Updated go.mod to point to newest go-mysql-server
- 266: fix and unit test
- 265: fixed bool conversion
- 264: Small fix to change strings back to LONGTEXT
- 263: dolt reset --soft and --hard for dolt docs
  This PR includes:
  - Adding docs to dolt reset --hard, with tests
  - Adding docs to dolt reset --soft, dolt reset ., dolt reset, dolt reset <doc>, with tests
  - Refactor doltcore/env/dolt_docs to more general use
  - ...and other helper functions in doltcore/env/environment.go to help stage and unstage docs
- 262: Include required length value for "varchar" column type in README tutorial
- 261: Dockerfile: First pass at a simple Dockerfile for building an image with dolt installed.
- 260: Now referencing newest go-mysql-server version
- 259: fix csv parsing
  Currently there is a panic if a csv line contains nothing but whitespace.
- 256: Andy/sqlweb
  I needed to reuse logic from SQL processing in Dolt, but as written it needed a CLI environment. Did some refactoring to de-couple env.DoltEnv and env.RepoState:
  - Removed env.DoltEnv as a dependency for sqlEngine in commands/sql
  - Removed FileSystem as a dependency in env.RepoState

  👀 that symmetric stat line tho
- 255: Changed Datasets to clone section
- 254: Fix diff where clause
  Fixes diff where so that you no longer need to use to_col or from_col.
  --where col=val is evaluated as: where to_col=val || from_col=val
- 249: go-mysql-server types
- 248: bh/skip describe bats
- 247: {go,bats}: dolt add
  I will change the base on this PR to the feature branch (#222) once #242 is merged
  I updated the base to the feature branch 👍
- 246: slow history tables
- 245: change super_schema to use a CommitItr
- 242: {go,bats}: dolt add . for dolt_docs, with updated dolt status and bats tests
- 55: Fixed integer overflow when compiling to 32 bit platform
- 54: Zachmu/join improvements
  Improved indexed join analysis to match more kinds of queries. Added analysis tests to ensure that queries get the expected join plans.
- 53: Zachmu/multi column index joins
  Multi-column joins. Also bug fixes for order by clauses.
- 52: sql/plan: {create,drop}_view: Improve ViewDropper and ViewCreator interactions to better support OR REPLACE / IF EXISTS.
- 51: absolute value function for SQL
- 50: Standardized error messages for unsupported syntax / features.
- 49: Bug fix for sorter where both values are nil.
- 48: Indexed joins
  Indexed joins for single-column indexes. Also:
  - Standardized capitalization of keywords in engine tests
  - Fixed some bugs in non-indexed join logic
  - Refactored and renamed a few things

  This PR exposes (but does not create) a bug in sort logic for float columns. One new test query fails sometimes on tests of parallel query execution, depending on the race outcome. I'll fix that in a separate PR.
- 47: Zachmu/describe
- 46: Implemented SET
- 45: Implemented ENUM
- 44: Zachmu/alter table
  Alter table implementation
- 43: Added DECIMAL type
- 42: Internal TEXT to LONGTEXT
  Changed all of the internal locations where we're using TEXT to now use LONGTEXT.
- 41: Fixed BETWEEN and IN issues due to type changes
- 40: sql: Add interfaces for signaling to database when views are created and dropped.
- 39: sql/analyzer: resolve_views: Make resolving views an independent analyzer rules that runs before resolve_subqueries.
- 38: sql/plan/create_view.go: Store the original parsed definition of the view, not the analyzed version.
  A problem occurs when things like the resolved table are stored in the view definition. This changes the analyze path to analyze the view definition anew each time, and stores the unanalyzed view definition in the view registry. CREATE VIEW still completely analyzes the view at creation time, in order to catch any errors in the view definition.
- 37: Zachmu/drop create fixes
  Added support for IF NOT EXISTS and column comments to CREATE TABLE statements
- 36: Zachmu/more index fixes
  Fixed bugs in the implementation of AscendIndex and DescendIndex for memory index lookups, and fixed a bug in assigning indexes to OR expressions.
- 35: Fixed null comparison panics in arithmetic operations and added tests
- 34: Add basic VIEWS support.
  Basically takes src-d/go-mysql-server#860 and adapts it for our sqlparser changes.
- 33: Better types compliance
  First off, I'm going to list the types that are missing from this PR:
      DECIMAL, ENUM, SET, GEOMETRY, GEOMETRYCOLLECTION, LINESTRING, MULTILINESTRING, POINT, MULTIPOINT, POLYGON, MULTIPOLYGON
  The first three mentioned are the last ones I'm going to implement for now, as the GEOMETRY types are very, very complex. However, DECIMAL proved to be a bit tougher than expected, and I didn't want to delay getting this PR out since I'll be gone over the entire Thanksgiving holiday (technically I'm on it now but I promised a PR by the 25th). Besides the ones listed above, every other type is in this PR in some capacity (NCHAR and friends are missing, but that type can be replicated as it's just a charset/collation shortcut).
  Some notes, because they're important. There are no tests. Changing the types has broken a fair amount, and I haven't bothered to fix those things yet before I can get started writing tests.
  I'm making use of higher-level interfaces that are specific to some type or group of types. I played around with the idea of adding a new function to the Type interface that returned an attribute map, but decided against it since I wanted to be able to add more than raw attributes, and a map of functions just seemed too complex for the tradeoff.
  Character sets and collations are complete, in that every single supported one in MySQL is here...which is why there are so many. I had many ideas on how to model this, but decided on using string constants as the base so that it's easy to reference them from outside code, and then have multiple maps for each specific attribute of a charset/collation. For example, if an implementer wants to specifically support the hebrew character set, then they can just reference CharacterSet_hebrew, rather than defining their own variable with custom_var, err := ParseCharacterSet("hebrew").
  JSON is actually a very complex type, so I didn't really change anything from the logic that was already present before. This does mean that we don't truly support JSON, but implementing it right may take 1-2 full weeks, which I don't feel is worth it right now.
  It pains me to say, but in the interest of full compliance, I am modeling a BOOLEAN as an int8. But that's how MySQL does it, so it's probably for the best.
  You'll notice that some types have global variables while others don't, and that's because some types don't make sense to reference without other information. Int8 can be a global type because it is fully self-contained in its information. Varchar, OTOH, requires a length to be meaningful.
  On the topic of the string types, if any logic looks weird, then it's to mimic something that MySQL is doing. I did a lot of testing to see what MySQL actually did when the documentation wasn't clear, so some rules seem inconsistent but MySQL is apparently inconsistent. I'm also doing nothing with the character set and collation stuff regarding comparisons, so we just support utf8mb4 as a character set (which mimics Go's string type implementation to the best of my knowledge), and changing it to something else doesn't change any actual behavior. That can come later because that's A LOT to do...
  The TIME type is weird, in that the minimum precision is microseconds, so I chose to base the entire type around that. All of the parseable variations are present as well. I was initially going to use time.Duration from Go, but it actually didn't suit the requirements. Working with it turned out to be more difficult than just using microseconds. For example, the only way to make a duration is to parse a string of a specific format, which would be parsing a string, then recreating a different string to parse, which is too much.
  YEAR is the most useless type I've ever encountered in anything, but hey it's included and it's fully supported now, hooray...
- 32: Zachmu/unary plus
- 31: bh/lazy loading
- 30: Zachmu/fix pushdown
  Bug fixes for table pushdowns and index lookups:
  - Failing to renumber fields in a filter after projecting a subset of columns onto tables in some cases
  - Incorrectly applying index lookups to OR clauses involving more than one table, which inappropriately restricts the indexed table to only matching values.

  Added a couple test tables to engine_test, and broke out the tables used for information schema into their own set of test definitions.
- 28: Zachmu/insert update delete perf
- 27: Fixed replace tests (broken by my changes to type handling code in Update)
- 26: DATETIME handling & microsecond support
- 25: Zachmu/better type checking
  Better type checking for inserts (results in an error message before any inserts are attempted)
- 24: Zachmu/insert into select
  First draft. Works, but error messages could be better. Working on an extension to the Type so that we can compare column types for compatibility (right now type mismatches fail at execution with e.g. "expecting int64 but got string" messages)
- 23: Removed unused variable assignments
- 22: Zachmu/index or bug fix
  Fixed a bug in indexing code that would cause queries to return incorrect results. Along the way, implemented index capabilities for the in-memory tables (only for correctness verification), and expanded engine_test to test every combination of indexes, partitions, and parallelism.
- 21: Fixed PK NN behavior to match MySQL
  Primary Keys should always be NOT NULL according to the MySQL documentation.
  https://dev.mysql.com/doc/refman/8.0/en/create-table.html#idm139638853954816
- 20: Fixed test error
- 19: PR feedback: added engine tests, removed Alterable interface, added comments.
- 18: Zachmu/drop table
  Drop and create table support, as well as support for 24-bit integers.
- 17: Throw an error on CREATE VIEW statements, which were parsing as CREATE TABLE with an empty table spec prior to this change.
- 16: Zachmu/drop table 2
  Drop and create table
- 14: Compare nulls
- 13: Implemented UPDATE
  Take a look at this. I have yet to implement all of the tests, as this is mainly to check the logic to make sure that I'm on the correct track and all. First time working with their pipeline stuff, it's kind of interesting.
- 12: Implemented REPLACE
  I decided to go with the Delete then Insert way instead of adding a Replacer interface. The insert-only functionality in the event of no primary key isn't what the "expected" behavior would be in my opinion, and since we don't have a way to check from this library, I think we should just do it this way instead. On the other side, if we have a Replacer interface, then we can't ensure that the implementer will properly choose when it should be insert-only. I'd rather enforce a behavior that will be applicable and correct in the most obvious case, and it reduces the necessary code from implementers too since they won't need to duplicate any trigger functionality or anything.
- 11: Deletes
  DELETE has been implemented!
- 10: Fixes for column duplicates and existence
  Fixes for both the duplicate columns not erroring and also for invalid column names.
- 9: Windows support
- 8: Insert Fixes
  Work so far for fixing issues in imports.
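
PR 254 in the list above describes dolt diff --where col=val as being evaluated against both the to_ and from_ versions of the column. The Python sketch below mimics that rule on plain dictionaries. It is an illustration under stated assumptions, not Dolt's code: the row layout, the matches_where function, and the sample data are all hypothetical.

```python
def matches_where(diff_row, col, val):
    """Return True if either side of the diff has col equal to val.

    Mirrors the rule from PR 254: `col=val` is treated as
    `to_col=val || from_col=val`.
    """
    return diff_row.get("to_" + col) == val or diff_row.get("from_" + col) == val


# Hypothetical diff rows: each holds the old (from_) and new (to_) value.
diff_rows = [
    {"from_name": "alice", "to_name": "alicia"},  # modified
    {"from_name": None, "to_name": "bob"},        # added
    {"from_name": "carol", "to_name": None},      # removed
]

selected = [row for row in diff_rows if matches_where(row, "name", "alice")]
print(selected)
```

Matching on either side means a row is still found when the value only exists before the change (a removal or modification) or only after it (an addition).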