Releases: dolthub/dolt
0.13.1
We are releasing a patch to Dolt, as the 0.13.0 release contained a bug: when cloning a repository with `dolt clone`, the new license and readme documents were not updated. This patch ensures that functionality works correctly.
Since this is a patch of a recent release, see 0.13.0 release notes for details about the new features recently introduced.
Merged PRs
0.13.0
We are excited to announce the release of Dolt 0.13.0, hot on the heels of relaunching DoltHub.
Easy Install Script
It's now incredibly easy to install Dolt. If you haven't tried it yet, you can obtain a copy with a single command and start playing with datasets:
$ curl -L https://github.com/liquidata-inc/dolt/releases/latest/download/install.sh | bash
The installer script works on Mac and Linux. For Windows, download the MSI installer below.
System Tables
We released a blog post detailing some exciting new functionality for surfacing versioning data in Dolt. This is the first of a set of features that will eventually expose all the Git-like internals of Dolt to SQL, and facilitate automated use of Dolt by allowing users to define their default choices inside SQL statements.
- `dolt_log`: Access the same information as the `dolt log` command via a SQL query.
- `dolt_diff_$table`: A system table for each of your tables, which lets you query the diff between two commits. See the blog post for more details.
- `dolt_history_$table`: A system table for each of your tables, which lets you query past values of rows in the table at any commit in its history. See the blog post for more details.
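As a sketch of how these system tables can be queried (the table name `mytable`, its column `pk`, and the commit hash placeholders are hypothetical; the `from_commit`/`to_commit` column names are assumptions based on the description above):

```shell
# Query the commit log through SQL instead of `dolt log`.
$ dolt sql -q "SELECT committer, date FROM dolt_log ORDER BY date DESC"

# Query the diff of a hypothetical table `mytable` between two commits.
$ dolt sql -q "SELECT * FROM dolt_diff_mytable WHERE from_commit = '<hash1>' AND to_commit = '<hash2>'"

# Query the historical values of a row across all commits.
$ dolt sql -q "SELECT * FROM dolt_history_mytable WHERE pk = 1"
```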
LICENSE and README functionality
We now allow users to create License and Readme documents as part of their Dolt repository; these appear as `LICENSE.md` and `README.md` files in the root of your repo. Edit them with the text editor of your choice, then add them to a commit with `dolt add`, the same as a table. Their contents are versioned alongside your tables' data. License and Readme files will soon be visible on DoltHub for repositories that provide them. Allowing users to specify the terms on which data is available is an important step towards creating a vibrant data-sharing community.
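A sketch of that workflow (file contents and commit message are illustrative; staging the docs by filename assumes `dolt add` accepts them like table names, as the text states):

```shell
# Create the two repo documents (contents are placeholders).
$ echo "# My Dataset" > README.md
$ echo "This data is released under CC-BY 4.0." > LICENSE.md

# Stage and commit them the same way you would a table.
$ dolt add README.md LICENSE.md
$ dolt commit -m "Add readme and license"
```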
Views
Our SQL implementation now supports persistent views, taking us closer to having a fully functioning SQL engine. Create a view using the standard SQL syntax:
CREATE VIEW myview AS SELECT col1, col2 FROM mytable
Then query it like any other table:
SELECT * FROM myview
Other
We made performance enhancements to SQL, including supporting indexed joins on a table's primary key columns. This should make the engine usable for joins on the primary key columns of two tables. Additional improvements in join performance are in the works. We also fixed assorted bugs and made performance improvements in other areas of the SQL engine.
Merged PRs
- 331: Removed Windows carriage-return and trailing whitespace from bats tests
- 329: CSV export compliant with RFC 4180
- 328: bats/helper/windows-compat.bash: Try mktemp on Windows.
- 326: one down
The other 32 skipped bats tests are confirmed to fail
- 324: Removed old table and schema commands from the command line
- 323: fix buffered sequence iterator and put it back in row iterator
- 322: reverting buffered iter
- 320: bats/creds.bats: Debug windows failures.
- 319: Added a bats test for committing views and referencing them later
Added some checks for checked-in views.
- 318: Added test case for dolt reset --hard on new tables
- 316: Buffered Sequence Iterator
  - Created a new interface `sequenceIterator` for the use case when `sequenceCursor` is simply accessing elements in its sequence (ie `MapIterator`, `SetIterator`, and `ListIterator`)
  - Created a new buffered implementation of `sequenceIterator`, designed by @reltuk, to batch chunk fetching from the `ValueStore`. In use cases such as DoltHub where chunk fetching IO is slow, this will dramatically accelerate performance.
- 310: Km/non-trivial merge of master into doc feature branch
This is just a merge from master into my doc feature branch, so you can ignore that there are many commits authored by not-me.
I wanted to get eyes on the last commit before I merge it (d4de259). I had to remove 2 of the HasDoltPrefix checks that were breaking create-views.bats. Now I'm checking for DocTableName explicitly. I left the HasDoltPrefix function since I'm using it in the commands package, and presume we'll eventually need to use it again in the sqle package.
- 308: Updated to latest go-mysql-server. Re-enabled indexes by default, and un-skipped an integration test of indexed join behavior.
- 307: go/utils/publishrelease: First pass at an install.sh
- 306: Bumped go-mysql-server version
- 305: bats/creds.bats: Some initial bats tests for dolt creds new, ls and rm.
- 302: Km/doc tests
This PR:
  - Simplifies tests in docs.bats
  - Adds tests for some helper functions in `doltdb/root_val_test.go`
Will do more testing tomorrow, but wanted to get this in.
- 301: dumps docs
This code dumps the standard command line help pages for every command that isn't hidden.
Because we only had functions for each command, it was difficult to add a new method that would be implemented for each command, so I had to refactor all of that code. The refactor makes up the bulk of the PR.
- 299: dolt checkout, and merge with dolt docs, with bats coverage
This PR includes:
  - Fixed a bug where dEnv.Docs was not always matching the docs of the current repo state (working root). This required changing the Docs type in the `env` package to `[]doltdb.DocDetails` from `[]*doltdb.DocDetails`. You'll see some reformatting to accommodate this change.
  - `checkout <doc>`
  - `checkout <branch>`
  - `merge <branch>` (one scenario is still buggy, need help identifying a solution)
    - FF merge - docs on the FS get updated to the target branch
    - Merge with conflicts - docs on the FS remain as is
    - Merge with auto-resolved conflicts (currently buggy) - docs on the FS should be updated to targetBranch, but they should not be added to the new working root. This would allow `dolt status` to indicate that the doc needs to be added and committed to finish merging. Right now it appears the doc is getting added to the working root.
- 298: go/cmd/dolt/commands/sql: Add view persistence into dolt database.
- 296: go/cmd/dolt: credcmds/check: Add dolt creds check command.
- 295: update go-mysql-server to be the latest from liquidata-inc/go-mysql-server@ld-master
- 294: Added indexes to dolt sqllogictest harness and updated dependency on go-mysql-server.
- 293: go/cmd/dolt/commands/credcmds: Add documentation and a little bit of chrome to dolt creds commands.
- 291: Fixed ignoring an error in put-row
- 290: go/go.mod: Run go get -u all. Migrate to dbr/v2.
- 289: Tim/add docs bats
This is the test for branch, merge, and conflict resolve behavior. You can break it into multiple tests if you want, but I think this is fine.
- 288: add diff_type column to be able to select where diff_type is added, removed, or modified
- 287: fixes casing issue with system tables
- 285: Added bad describe bats test per testing session with Katie
- 284: {go,bats}: Implement dolt diff by parsing docs from args, with bats test
- 283: Zachmu/explain
Fixed describe table statements, and unskipped related tests.
- 282: change the date field to be a Sql.DateTime
Output of the date field was in a format that wasn't able to be sorted properly.
- 281: fix select on system table that doesn't exist
Fix select on a system table that has a valid prefix but whose suffix does not match a valid table.
What makes this a little bit tough is that you can query diffs or the history of a table that no longer exists, so we need to process the entire history and then see if at any time there was a schema'd table with the given name.
- 280: {bats, go/libraries/doltcore/sqle/database.go}: Remove DoltNamespace from `dolt sql` command
- 279: {go,bats}: Remove DocTableName from dolt table, schema, ls, add, reset, diff
This PR removes DocTableName from the outstanding commands so we don't expose the dolt docs table.
- 278: {go,bats}: Add dolt docs to `dolt diff`
This PR adds docs to the dolt di...
0.12.0
We are excited to announce the release of Dolt 0.12.0!
Community
We have our first open-source committer to the Dolt project! Thanks to @namdnguyen for providing a helpful fix to our documentation. We are hoping this will be the first of many open-source contributions to Dolt.
SQL
As discussed in this blog post, we use sqllogictest to test our SQL implementation's logical correctness. It contains 5 million tests! This release marks a huge jump in compliance, with our implementation now hitting 89%, up from well under 50% just a few weeks ago.
Diff With Predicate
The new `--diff-where` option allows the user to add a predicate on the table being diff'd, reducing the surface area of the diff output to drill into specific data of interest.
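For example, a predicate-restricted diff might look like this (the table name `inventory` and the predicate are hypothetical placeholders):

```shell
# Only show diff rows for which the predicate holds.
$ dolt diff --diff-where "state='CA'" inventory
```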
Override Commit Date
When a user commits data, a timestamp is associated with the commit. By allowing Dolt users to customize the timestamp we allow the user to create an implicit bi-temporal database (based on commit time) while maintaining the ordinal integrity of the commit graph for querying history and reasoning about the sequence of updates.
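A sketch of the feature; we assume here that the commit command accepts a `--date` flag with an ISO-8601 timestamp, since the exact flag syntax is not shown in these notes:

```shell
# Commit with an explicit, backdated timestamp (flag name assumed).
$ dolt commit --date "2019-10-01T00:00:00Z" -m "Load October snapshot"
```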
SQL Diffs
Using the SQL diff command, that is `dolt diff -q` or `dolt diff --sql`, users can produce SQL output that will transform one branch into another. In other words, this command produces the difference, in data and schema transformations, between two refspecs in the commit log of a Dolt repository.
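For instance, to emit the SQL statements that would transform one branch into another (the branch names are illustrative placeholders):

```shell
# Print the data and schema statements that turn master into feature.
$ dolt diff --sql master feature
```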
As usual, this release also contains bug fixes and performance improvements. Please create an issue if you have any questions or find a bug.
Merged PRs
- 241: Bumped version and added release script
- 239: bats/create-views.bats: Pick up go-mysql-server support for views.
- 237: go/performance/benchmarks: remove id from results
- 236: Noticed an alter table test that now works was skipped. Unskipped.
- 235: Fix typo in README for table import
I ran into this typo while using Dolt yesterday. The command keywords were in the incorrect order in the README.
- 233: Zachmu/sql batch
Killed off the original sql batch inserter and implemented equivalent functionality for the new engine.
- 232: Andy/sqldiffrefactor
- 230: go/store/nbs: table_set.go: Rebase: Reuse upstream table file instances when supplied table specs correspond to them.
- 229: fix schema diff primary key changes
Output looks like this for changing a pk:
--- a/test @ 4uvb6bb3p7dqudnuidh9oh4ccsehik7n
+++ b/test @ 2tl4quv92ot0jg4v3ai204rld00trbo4
CREATE TABLE test (
  `pk` BIGINT NOT NULL COMMENT 'tag:0'
- `c1` BIGINT COMMENT 'tag:1'
  `c2` BIGINT COMMENT 'tag:2'
  `c3` BIGINT COMMENT 'tag:3'
  `c4` BIGINT COMMENT 'tag:4'
  `c5` BIGINT COMMENT 'tag:5'
< PRIMARY KEY (`pk`, `c1`)
> PRIMARY KEY (`pk`)
);
Also add the pk constraint so it shows when it is not changing:
--- a/test @ idfqe6c5s2i9ohihkk4r4tj70tf3l8c7
+++ b/test @ 2tl4quv92ot0jg4v3ai204rld00trbo4
CREATE TABLE test (
  `pk` BIGINT NOT NULL COMMENT 'tag:0'
  `c1` BIGINT COMMENT 'tag:1'
  `c2` BIGINT COMMENT 'tag:2'
< `c3` BIGINT COMMENT 'tag:3'
> `newColName3` BIGINT COMMENT 'tag:3'
  `c4` BIGINT COMMENT 'tag:4'
  `c5` BIGINT COMMENT 'tag:5'
  PRIMARY KEY (`pk`, `c1`)
);
- 228: bh/add commit date
- 227: Bug fixes for sqllogictest dolt harness:
- More inclusive types
- Better error handling for panics
- Cheat on tables without primary keys to allow more tests (~40%) to succeed.
- 225: disable benchmarking dolt sql imports
- 224: go/cmd/dolt: commands/sql: Keep the sql engine around throughout the lifetime of the shell / batch import.
- 221: improved super schema names
- 220: update go-mysql-server dependency
- 219: dolt benchmarking
Initial approach is to write a script that will run n benchmarks, collect their results, then serialize those results to later be imported into `dolt`. Looking for feedback on the approach before I head too far down this path, in case it is suboptimal.
In its current state, there are a lot of switch statements and panics, and it only accounts for the types `int` and `string` and for `.csv`-style test data formats, but I'd like to make my data generation functions robust enough to be able to account for all file formats that dolt supports and all noms types...
- 218: Added skipped bats test for schema diffs on adding a primary key
- 216: Andy/sqlschemadiffs
Adding schema changes to `dolt diff --sql` output. Supports:
  - add/drop table
  - add/drop column
  - rename table
  - rename column
- 215: diff table
- 214: Bh/super schema
- 213: Zachmu/sql performance
- 212: Zachmu/sql indexes2
- 211: Added time to the handled cases in DATETIME & changed tests
This won't compile until dolthub/go-mysql-server#26 is referenced in `go.mod`.
Dolt 0.11.0 released
We are excited to announce the release of Dolt 0.11.0.
SQL
System Table
We implemented a `dolt_log` system table, making our first attempt to surface Dolt version control concepts in SQL by exposing commit data. This will allow users to leverage commit data in an automated setting via SQL. Clone a public repo to see how it works:
$ dolt clone Liquidata/ip-to-country
$ cd ip-to-country
$ dolt sql -q "select * from dolt_log"
$ dolt sql -q "select committer,date from dolt_log order by date desc"
+-------------+--------------------------------+
| committer | date |
+-------------+--------------------------------+
| Tim Sehn | Wed Sep 25 12:30:43 -0400 2019 |
| Tim Sehn | Wed Sep 18 18:27:02 -0400 2019 |
.
.
.
Timestamps
We added support for the `DATETIME` data type in SQL. This is a major milestone in achieving compatibility with existing RDBMS solutions.
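A minimal sketch of the new type in action (the table, columns, and tag numbers are illustrative; the tag-comment syntax follows the schema output shown elsewhere in these notes):

```shell
$ dolt sql -q "CREATE TABLE events (id BIGINT COMMENT 'tag:0', occurred DATETIME COMMENT 'tag:1', PRIMARY KEY (id))"
$ dolt sql -q "INSERT INTO events VALUES (1, '2019-10-21 13:00:00')"
$ dolt sql -q "SELECT * FROM events ORDER BY occurred DESC"
```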
Performance
We continue to rapidly improve our SQL implementation. On the performance side some degenerate cases of query performance saw large improvements. We also resolved some issues where update statements had to be "over parenthesized", with the parser now matching the standard.
Other
We support null values in CSV files that are imported via the command line, as well as minor bug fixes under the hood.
If you find any bugs, or have any questions or feature requests, please create an issue and we will take a look.
Merged PRs
- 208: go/libraries/doltcore/row: tagged_values.go: Fix n^2 behavior in ParseTaggedValues.
ParseTaggedValues used to call Tuple.Get(0)...Tuple.Get(n), but `Tuple.Get(x)` has O(n) perf, so the function did O(n^2) decoding work to decode a tuple. Use a TupleIterator instead.
- 206: go/store/types: Improve perf of value decoding for primitive types.
This fixes a performance regression in value decoding after the work to make it easier to add primitive types to the storage layer.
First, we change some map lookups into slice lookups, because hashing the small integers on hot decode paths dominates CPU profiles.
Next, we inline logic for some frequently used primitive types in `value_decoder.go`, as opposed to going through the table indirection. This is about a 30% perf improvement for linear scans on `skipValue()`, which is worth the duplication here.
Code for adding a kind remains correct if the decoder isn't changed to include an inlined decode path. We omit inlining `UUID` and `InlineBlob` here for now.
- 199: Km/redo import nulls
- 198: checkout a remote only branch
- 196: Bh/log table
- 195: Added Timestamp to Dolt and Datetime to SQL
Have a look!
I ran into an import cycle issue that I just could not figure out how to avoid, except by putting the tests into their own test folder (`sqle/types/tests`), so that's why they're in there. In particular, the cycle was that `sqle` imports `sqle/types`, and the tests rely on (and thus must import) `sqle`, causing the cycle.
I'm thinking of adding tests for the other SQL types later so that we have a few more built-in tests using the server portion, rather than everything using the `-q` pathway. That will be a different/future PR though.
- 193: diff where and limit
- 191: fix branch name panic with period
Looked into supporting periods in branch names, but it looks like `noms` relies on periods specifically pretty heavily. They seem to be excluded from the regex below by design, since some types are built on the expectation that a branch name or `ref` will not contain a period.
My understanding is that a user's branch name is used to look up a particular dataset within the `noms` layer, and this variable (go/store/datas/dataset.go) acts as the regex source of "truth" for branch names / dataset lookups, and I believe more:
// DatasetRe is a regexp that matches a legal Dataset name anywhere within the
// target string.
var DatasetRe = regexp.MustCompile(`[a-zA-Z0-9\-_/]+`)
Noms also expects to be able to append a `.` to this string in order to parse the string later and correctly create its `Path` types...
I went down a rabbit hole trying to change all of the `noms` `Path` delimiters to be a different character, but the changes go pretty deep and start breaking a lot of things. Happy to continue down that course in order to support periods in branch names, but it might take me a bit of time to change everything. I'm also not sure what character should replace the period... asterisk? Anyway, this PR seemed like a low-hanging-fruit fix to resolve the panic at least.
- 190: Missed Kind to String
In my last PR, it looks like I missed that we were using the old `DoltToSQLType` hardcoded map from the original SQL implementation. I didn't change it everywhere (it's used heavily in the old SQL code that isn't even being called anymore), but it's changed where it matters. Added a new interface function and changed the printing code to be a bit more consistent (we were mixing uppercase with lowercase).
I'm also returning different values, such as `BIGINT` for `sql.Int64`, as `int` parses in MySQL to a 32-bit integer, which isn't correct. Essentially made it so that if you took the `CREATE` statement exactly as-is, exported your data to a bunch of inserts, and ran it in MySQL, it wouldn't error out, as it previously would have.
- 189: diff source refactor
- 188: go/cmd/git-dolt/README.md: Add comparison to git-lfs and note on updates
- 187: Remove skip on test for / in branch names. Added a skipped test for . in branch names. . in branch names panics right now.
- 186: go/go.mod: Pick up sqlparser improvements for ADD COLUMN, RENAME COLUMN. Fix some tests.
- 185: Moved command line SQL processing to use new engine for CREATE and DROP
- 183: add appid to logevents requests
Need to update the requests so that they reflect the current proto definitions.
- 182: Moved SQL types to an interface
Have a look! Just make an empty struct type that implements `SqlTypeInit` and add the struct to `sqlTypeInitializers`, and you've got a type that works in SQL now!
- 179: clone reliability
- 178: go/store/nbs: store.go: Be more careful about updates to nbs field values until all operations have completed successfully.
- 177: go/cmd/dolt: Bump version to 0.10.0
Closed Issues
- 194: Wide tables create poor query performance
0.10.0
We are excited to announce the latest release of Dolt, which includes a new feature, substantial improvements to existing features, and a new Windows installer.
Dolt Blame
Dolt now has a `blame` command, which provides row audit functionality familiar to Git users. (Blame for individual cells is in the works.) We have a deep dive on the implementation of dolt blame on our blog, so definitely check that out if you're interested.
One of the long-term goals of Dolt is to provide a database with fine-grained audit capabilities to support hygienic management of valuable human-scale data, and this feature is a huge step towards realizing that vision.
SQL Enhancements
One of our major goals for the product is full SQL compliance; this release contains steps towards achieving that. In particular, the following commands are now supported:
- `CREATE TABLE`
- `DROP TABLE`
- `INSERT VALUES` & `INSERT SET` (no `IGNORE` or `ON DUPLICATE KEY UPDATE` support yet; also no `INSERT SELECT` support yet)
- `UPDATE` (single table; no `IGNORE` support yet)
- `REPLACE VALUES` and `REPLACE SET` (no `REPLACE SELECT` support yet)
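A quick tour of the newly supported statements (the table, columns, and values are illustrative placeholders):

```shell
$ dolt sql -q "CREATE TABLE t (pk BIGINT COMMENT 'tag:0', c1 BIGINT COMMENT 'tag:1', PRIMARY KEY (pk))"
$ dolt sql -q "INSERT INTO t VALUES (1, 10), (2, 20)"
$ dolt sql -q "UPDATE t SET c1 = 30 WHERE pk = 2"
$ dolt sql -q "REPLACE INTO t VALUES (2, 40)"
$ dolt sql -q "DROP TABLE t"
```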
As well as making progress against our goal of full compliance, we also created a test suite that will help validate our SQL implementation. Check out the test suite and harness, and the related blog post. This is an important step in creating a fully transparent mechanism for our progress against our compliance goal.
We also fixed some bugs and made some performance improvements.
Schema Import
We now support schema inference from a CSV file. This is a convenience function to make importing a CSV with a correct schema easier. The command is best understood by looking at the help details:
$ dolt schema import --help
NAME
dolt schema import - Creates a new table with an inferred schema.
SYNOPSIS
dolt schema import [--create|--replace] [--force] [--dry-run] [--lower|--upper] [--keep-types] [--file-type <type>] [--float-threshold] [--map <mapping-file>] [--delim <delimiter>] --pks <field>,... <table> <file>
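For example, inferring a schema while importing a CSV (the table name, file, and primary key column are hypothetical; the flags come from the synopsis above):

```shell
# Infer a schema from the CSV and create the table with `id` as the primary key.
$ dolt schema import --create --pks id employees employees.csv

# Preview the inferred schema without creating anything.
$ dolt schema import --create --dry-run --pks id employees employees.csv
```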
Windows Installer Packages
We now provide both 32- and 64-bit MSI packages for easy installation of Dolt on Windows. These may be used instead of manually extracting the archives (which are now provided in `.zip` format instead of `.tar.gz`). Please let us know if you encounter any issues.
Other
Various bug fixes and enhancements, and also improvements to `dolt clone`, which had a problematic race condition.
As always, bug reports, feedback, and feature requests are very much appreciated. We hope you enjoy using Dolt!
Merged PRs
- 175: {bats, go}: Make commit spec truly optional in blame
$ dolt blame lunch-places
+--------------------+----------------------------------------------------+-----------------+------------------------------+----------------------------------+
| NAME               | COMMIT MSG                                         | AUTHOR          | TIME                         | COMMIT                           |
+--------------------+----------------------------------------------------+-----------------+------------------------------+----------------------------------+
| Boa                | Had an unhandled schema merge conflict which I ch… | Tim Sehn        | Fri Mar 22 12:21:59 PDT 2019 | ffabndafbp64r393ttghused8siip77j |
| Chipotle           | lunch-places: Added Chipotle                       | katie mcculloch | Thu Aug 29 11:38:00 PDT 2019 | m2jbro89ou8g6rv71rs7q9f3jsmjuk1d |
| Sidecar            | Had an unhandled schema merge conflict which I ch… | Tim Sehn        | Fri Mar 22 12:21:59 PDT 2019 | ffabndafbp64r393ttghused8siip77j |
| Wendy's            | Had an unhandled schema merge conflict which I ch… | Tim Sehn        | Fri Mar 22 12:21:59 PDT 2019 | ffabndafbp64r393ttghused8siip77j |
| Bangkok West Thai  | Had an unhandled schema merge conflict which I ch… | Tim Sehn        | Fri Mar 22 12:21:59 PDT 2019 | ffabndafbp64r393ttghused8siip77j |
| Jamba Juice        | Had an unhandled schema merge conflict which I ch… | Tim Sehn        | Fri Mar 22 12:21:59 PDT 2019 | ffabndafbp64r393ttghused8siip77j |
| Kazu Nori          | Had an unhandled schema merge conflict which I ch… | Tim Sehn        | Fri Mar 22 12:21:59 PDT 2019 | ffabndafbp64r393ttghused8siip77j |
| McDonald's         | Had an unhandled schema merge conflict which I ch… | Tim Sehn        | Fri Mar 22 12:21:59 PDT 2019 | ffabndafbp64r393ttghused8siip77j |
| Sunnin             | change rating                                      | bheni           | Thu Apr 4 15:43:00 PDT 2019  | 137qgvrsve1u458briekqar5f7iiqq2j |
| Bruxie             | Had an unhandled schema merge conflict which I ch… | Tim Sehn        | Fri Mar 22 12:21:59 PDT 2019 | ffabndafbp64r393ttghused8siip77j |
| Espresso Cielo     | added Espresso Cielo                               | Matt Jesuele    | Wed Jul 10 12:20:39 PDT 2019 | 314hls5ncucpol2qfdphf923s21luk16 |
| Seasalt Fish Grill | fixed ratings                                      | bheni           | Thu Apr 4 14:07:36 PDT 2019  | rqpd7ga1nic3jmc54h44qa05i8124vsp |
| Starbucks          | Had an unhandled schema merge conflict which I ch… | Tim Sehn        | Fri Mar 22 12:21:59 PDT 2019 | ffabndafbp64r393ttghused8siip77j |
| Tocaya             | update tocaya rating                               | bheni           | Thu Jun 6 17:22:24 PDT 2019  | qi331vjgoavqpi5am334cji1gmhlkdv5 |
| Sake House         | fixed ratings                                      | bheni           | Thu Apr 4 14:07:36 PDT 2019  | rqpd7ga1nic3jmc54h44qa05i8124vsp |
| Swingers           | Had an unhandled schema merge conflict which I ch… | Tim Sehn        | Fri Mar 22 12:21:59 PDT 2019 | ffabndafbp64r393ttghused8siip77j |
| Art's Table        | Had an unhandled schema merge conflict which I ch… | Tim Sehn        | Fri Mar 22 12:21:59 PDT 2019 | ffabndafbp64r393ttghused8siip77j |
| Bay Cities         | Had an unhandled schema merge conflict which I ch… | Tim Sehn        | Fri Mar 22 12:21:59 PDT 2019 | ffabndafbp64r393ttghused8siip77j |
| Benny's Tacos      | Had an unhandled schema merge conflict which I ch… | Tim Sehn        | Fri Mar 22 12:21:59 PDT 2019 | ffabndafbp64r393ttghused8siip77j |
| Bibibop            | Had an unhandled schema merge conflict which I ch… | Tim Sehn        | Fri Mar 22 12:21:59 PDT 2019 | ffabndafbp64r393ttghused8siip77j |
| Curious Palate     | Had an unhandled schema merge conflict which I ch… | Tim Sehn        | Fri Mar 22 12:21:59 PDT 2019 | ffabndafbp64r393ttghused8siip77j |
| Meat on Ocean      | Had an unhandled schema merge conflict which I ch… | Tim Sehn        | Fri Mar 22 12:21:59 PDT 2019 | ffabndafbp64r393ttghused8siip77j |
+--------------------+----------------------------------------------------+-----------------+------------------------------+----------------------------------+
- 174: Update dolt blame description
- 170: Skipping the right SQL test that has a hanging race condition (joins on legacy engine, code to be deleted soon)
- 168: Bh/correctness fixes
- 166: clone bug fix
- 165: README.md: Remove Tim's username from shell prompt
- 161: Zachmu/sql logictest
Improved main method for running or parsing sqllogic tests.
- 160: Bh/upload error checking
- 159: Zachmu/sql logictest
Removed code for sqllogic test and took a dependency on the new module instead.
- 158: Disabling sql server tests on linux, since they appear to be hanging waiting for the server to start
- 157: Basic dolt blame
This still needs some more BATS tests and maybe some UI touches like what @Hydrocharged suggested, but overall I think it's ready for some eyes.
The biggest thing I don't like is the logic surrounding pretty-printing of primary keys (you'll see why) but please do tell me if you notice other things that are janky.
In general, feedback and suggestions are very welcome.
- 156: Bh/schema import
- 155: Zachmu/sql logictest
Implementation of sqllogictest for dolt on go-mysql-server. After getting this merged, I plan to fork off the non-dolt portions to a separate repo.
- 154: Bumping dependency of go-mysql-server to head of ld-master branch. Also fixing several issues that come up when doing so.
- 152: go/store/nbs: Add recover in table_set Rebase goroutines. (Saw a SIGSEGV which crashed doltremoteapi).
- 148: Added new InlineBlob type
This turned out to be far smaller than I thought as far as changes go. This `types` change might be simpler than I first thought! I probably felt it was harder just because finding all of these locations was a major pain...
- 146: proto/dolt/services/eventsapi: Adopt the version of eventsapi that lives in the ld repo instead of here.
- 145: Update README.md
- 144: Miscellaneous...
0.9.9
Contained in this release
- remote performance improvements (clone, push, and pull)
- better support for MySQL in server mode, including `DROP`, `UPDATE`, and `INSERT`
- SQL performance improvement
- diff summary
- more metrics
- other assorted bug fixes and improvements
If you find any bugs, have a feature request, or an interesting use-case, please raise an issue.
Merged PRs
- 114: go/libraries/doltcore/sqle: types: Make SqlValToNomsVal compile for 32bit by checking for overflow on uint -> int64 differently.
- 112: Zachmu/drop table
- 110: go/utils/checkcommitters: Oscar is an allowed committer and author.
- 109: attempted deadlock fix
- 108: Correct the installation instructions
- 105: dolt diff --summary
Example output using Liquidata/tatoeba-sentence-translations:
$ dolt diff --summary rnfm50gmumlettuebt2latmer617ni3t
diff --dolt a/sentences b/sentences
--- a/sentences @ gd1v6fsc04k5676c105d046m04hla3ia
+++ b/sentences @ 2ttci8id13mijhv8u94qlioqegh7lgpo
7,800,102 Rows Unmodified (99.99%)
15,030 Rows Added (0.19%)
108 Rows Deleted (0.00%)
960 Rows Modified (0.01%)
1,888 Cells Modified (0.00%)
(7,801,170 Entries vs 7,816,092 Entries)
diff --dolt a/translations b/translations
--- a/translations @ p2355o6clst8ssvr9jha2bfgqbrstkmm
+++ b/translations @ 62ri8lmohbhs1mc01m9o4rbvj6rbl8ee
5,856,845 Rows Unmodified (90.91%)
468,173 Rows Added (7.27%)
578,242 Rows Deleted (8.98%)
7,626 Rows Modified (0.12%)
7,626 Cells Modified (0.06%)
(6,442,713 Entries vs 6,332,494 Entries)
Fixes #77
- 104: Bh/output updates3
- 103: dolt/go/store: Stop panicing on sequence walks when expected hashes are not in the ValueReader.
- 101: go/{store,libraries/doltcore/remotestorage}: Make the code peddling in nbs table file formats a little more explicit about it.
- 100: newline changes
- 99: Implemented UPDATE
I think we should delete the old SQL methods that are in the `sql.go` file. I know at first you mentioned keeping them there for reference, but they're not being used at all at this point, and they're still in git history if we want to look at them again in the future for some reason. It's clutter at this point.
I'm skipping that one test at the end because of a WHERE decision in `go-mysql-server`. The code looks intentional, in that converting strings to ints will return 0 if the string is not parsable. I'll file it as a non-conforming bug on their end, but for now I'm skipping the test.
- 98: Bh/output updates
- 97: store/{nbs,chunks}: Make ChunkStore#GetMany{,Compressed} take send-only channels.
- 96: update status messages for push/pull
- 94: Update README.md
Ensure that installing from source is properly documented, including go-gotchas.
- 93: Reverts the revert of my push/pull changes with fixes.
- 92: content length fix
- 91: go: store/nbs: table_reader: getManyAtOffsetsWithReadFunc: Stop unbounded I/O parallelism in GetMany implementation.
When we do things like push, pull or (soon-to-be) garbage collection, we have large sets of Chunk addresses that we pass intoChunkStore#GetMany
and then go off and process. Clients largely try to control the memory overhead and pipeline depth by passing in a buffered channel of an appropriate size. The expectation is that the implementation ofGetMany
will have an amount of data in flight at any give in time that is in some reasonable way proportional to the channel size.
In the current implementation, there is unbounded concurrency on the read destination allocations and the reads themselves, with one go routine spawned for each byte range we want to read. This results in absolutely massive (virtual) heap utilization and unreasonable I/O parallelism and context switch thrashing in large repo push/pull situations.
This is a small PR to change the concurrency paradigm insidegetManyAtOffsetsWithReadFunc
so that we only have 4 concurrent dispatched reads pertable_reader
instance at a time.
This is still not the behavior we actually want.- I/O concurrency should be configurable at the ChunkStore layer (or eventually per-device backing a set of
tableReader
s), and not depend on the number oftableReader
s which happen to back the chunk store. - Memory overhead is still not correctly bounded here, since read ahead batches are allowed to grow to arbitrary sizes. Reasonable bounds on memory overhead should be configurable at the ChunkStore layer.
I'm landing this as a big incremental improvement over status quo. Here are some non-reproducible one-shot test results from a test program. The test program walks the entire chunk graph, assembles every chunk address, and then does aGetManyCompressed
on every chunk address and copies their contents to/dev/null
. It was run on a ~10GB (compressed) data set:
Before:
```
$ /usr/bin/time -l -- go run test.go
...
MemStats: Sys: 16628128568
      161.29 real        67.29 user       456.38 sys
5106425856  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
  10805008  page reclaims
     23881  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
         8  signals received
    652686  voluntary context switches
  21071339  involuntary context switches
```
After:
```
$ /usr/bin/time -l -- go run test.go
...
MemStats: Sys: 4590759160
       32.17 real        30.53 user        29.62 sys
4561879040  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
   1228770  page reclaims
     67100  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
        14  signals received
    456898  voluntary context switches
   2954503  involuntary context switches
```
On these runs, sys time, wallclock time, vm page reclaims and virtual memory used are all improved pretty substantially.
Very open to feedback and discussion of potential performance regressions here, but I think this is an incremental win for now.
- 90: Implemented REPLACE
Mostly tests since this just uses the `Delete` and `Insert` functions that we already have. The previous delete would ignore a delete on a non-existent row, so I just changed it to throw the correct error if the row does not exist so that `REPLACE` works properly now (else it will always say a `REPLACE` did both a delete & insert).
- 89: Push and Pull v2
- 88: Add metrics attributes
Similar to previous PR db/event-metrics, but this time, no byte measurements on `clone`, as the implementation is different. Some things in the events package have been refactored to prevent circular dependencies. Adding `StandardAttributes` will help me generate the info for my new metrics.
- 87: {go, bats}: Replace table works with file with schema in different order
- 86: dolt table import -r
Fixes #76
Replaces existing table with the contents of the file while preserving the original schema - 85: Bh/cmp chunks
- 84: revert nil check and always require stats to match aws behavior
- 83: Bh/clone2
This version of clone works on the table files directly. It enumerates all the table files and downloads them. It does not inspect the chunks as v1 did. - 82: Naked deletes now just delete everything instead of iterating
I mean, this works, but it's ugly, and I'm not sure of a better way to do it really.
- 81: Progress on switching deletes to new engine
Currently works for deletes, but it is not thoroughly tested.
- 80: go/store/nbs: store.go: Make global index cache 64MB instead of 8MB.
- 79: Removed skips for tests that will now work
This will fail for now, waiting on dolthub/go-mysql-server#10 to be approved before I merge this in. Super small stuff though. - 73: go/libraries/doltcore/remotestorage: Add the ability to have a noop cache on DoltChunkStore.
- 72: proto: Use fully qualified paths for go_packages.
This allows cross-package references within proto files to work appropriately. - 71: Db/events dir lock
initial implementation of making event flush concurrency safe - 70: go/store/spec: Move to aws://[table:bucket] for NBS on AWS specs because of Go URL parsing changes.
See https://go.googlesourc...
0.9.8
We have released version 0.9.8 of Dolt, which as you probably know is now open source. A quick reminder that you can freely host the awesome public data you put in Dolt at DoltHub.
This release contains performance improvements and bug fixes, but no major new features. Please let me know if you have any questions.
Merged PRs
- 60: bump version
- 57: Added a PID to a directory. This was causing jenkins on windows to fail if it ran twice on the same instance.
- 55: {bats,go}: Log successful commits
This closes dolthub/ld#1744
Before:
```
$ dolt commit -m "commit ints"
```
After:
```
$ dolt commit -m "commit ints"
commit 3cvbeh6bn94hlhfaig5pa65peiribrhn
Author: Matt Jesuele <[email protected]>
Date:   Mon Aug 26 19:10:17 -0700 2019

	commit ints
```
- 50: add dustin to approved committers/authors
- 49: [WIP] Add client events to dolt commands
Added events to all of the dolt commands.
Turned logging back on while I work on this PR. (will remove before merge)
I need to write tests for these; should I create a test file for each command file, where I test to ensure that the command has an event and the appropriate metrics? Would love input on this.
- 48: client events
- 47: Threading context from app launch
- 46: Add client_event.proto and compiled .go file
- 45: Add support to get the last modified time from the filesys
- 44: Changed default remote host to use the env constant
Before we were using `dolthub.com` as the default, which is incorrect. I've changed it to the appropriate environment constant so that it also properly updates when we change from our beta domain.
- 43: Created skipped test for newlines on CSV
- 42: README.md: Remove erroneous go install instructions.
- 41: Make the InMemFS thread safe
The current InMemFS was failing in a multithreaded context because it edits a map, which is not thread safe. Something to note is that Go locks are not re-entrant; some of the refactoring is related to that. Locks are typically put on the exported methods and not the internal methods.
- 40: Fixed JSON imports and disallowed schemas on import updates
Fixes #36 - 39: Add move file functionality to the filesys package
- 38: Fixes a panic that occurs if multiple bad rows are found during import
When a pipeline is being run, any stage can write to the bad row channel when an error is encountered. There is a goroutine reading from this channel that will not exit until the channel is closed or an error is encountered. In typical operation, the pipeline's sink closes the bad row channel once the pipeline finishes (either via an error-triggered stoppage, or successful completion). However, when multiple errors are written to the bad row channel from multiple goroutines, it is possible for the channel to be written to (which triggers the pipeline to stop and the channel to be closed) and then for a goroutine to write to that now-closed channel.
The fix here is to not close the channel in the sink, but instead to write a marker to the channel which causes the goroutine watching for errors to exit.
- 37: go/go.mod: Do not depend on //proto/third_party/golang-protobuf.
Development ergonomics are much worse, and the runtime library will maintain compatibility with the generator major version anyway, or it will explicitly break compilation.
- 35: dolt/go: Fix spelling on ancestor
- 34: proto/Makefile: Use submodule for protoc-gen-go instead of whatever is on the path.
- 33: Jenkinsfile: Use goimports from go.mod for check_fmt.sh
- 31: support importing and exporting data to and from stdin and stdout
In the current releases it was possible to chain dolt with other programs via stdout/stdin like so:
dolt table export table_name --file-type csv /dev/stdout -f|python row_cleaner.py|dolt table import cleaned_data -u --file-type csv /dev/stdin
This only works in environments where stdin/stdout are mapped to files on the filesystem. This change will use the stdin/stdout streams for import/export when a file is not provided.
- 30: Added column lengths for schema output to varchar columns so that they can be re-imported
- 29: go/cmd/dolt: dolt ls -v shows number of rows in each table.
- 27: Refer to newest version of mmap-go
We now strictly refer to our own fork of mmap-go. Plus cleaned up the `go.mod`, as we have git history and don't quite need the comments.
- 25: Added .idea directory (goland) to top-level .gitignore file
- 24: fix race condition which caused reproducible crash
The variables readStart, readEnd, and batch are declared outside of the for loop, and it is possible for their values to change before the goroutine calls readAtOffsets, causing some or all of them to be incorrect. The fix is to save them to variables scoped to the loop before launching the goroutine.
- 23: Fixed a bug on windows when redirecting STDIN for SQL import, e.g. dolt sql < dump.sql. Also fixed up ip2nation sample so that it successfully imports