Contextual string tags to prevent SQL injection #29

mikesamuel · 2018-01-26T21:12:14Z

https://nodesecroadmap.fyi/chapter-7/query-langs.html describes
this approach as part of a larger discussion about library support
for safe coding practices.

This is one step in a larger effort to enable

connection.query`SELECT * FROM T WHERE x = ${x}, y = ${y}, z = ${z}`(callback)

and similar idioms.

This was broken out of mysqljs/mysql#1926

dougwilson · 2018-01-26T21:19:57Z

Thanks so much! I would love to play around with this to review, but I'm not 100% sure about all the different cases in here. When you get a chance, can you write up documentation for the README so I (and everyone else) know this exists and how to make use of it? I'm really confused why there is a SQL parser in here and concerned that there are going to be lots of edge cases with issues, and I'm not super interested in maintaining an actual SQL parser as part of this. Why is the parser necessary? I would almost expect this to act just like ? does today, or what is happening here?

dougwilson · 2018-01-26T21:21:46Z

lib/Template.js

+    return false;
+  };
+
+  function stringWrapper() {


What does this do, exactly? How is it different from the toSqlString already implemented here?

Explained when discussing Fragment and Identifier below.

So the difference is that you want to use instanceof I see. Well, that won't work, so there doesn't sound like a difference then?

Removed in favor of SqlString.raw and SqlString.identifier. The latter is newly added and round-trips through escapeId.

I apologize for not looking carefully enough for existing APIs. I read the mysql module APIs but did not read the sqlstring APIs. I now see that toSqlString is mentioned both places but it didn't register.

I reached for TypedString without looking too carefully because I'm arguing in "A common style guide for tag implementors" for using TypedString subclasses to encourage interoperability among tag implementations, but I should put in more effort to see that my PRs fit into the existing code well.

dougwilson · 2018-01-26T21:23:58Z

lib/Template.js

+   * @param {string} content
+   * @constructor
+   */
+  module.exports.Fragment = stringWrapper();


Identifier and Fragment look like they are identical. Was this an accident or can they just be combined into one thing?

The important fact is that instanceof Fragment is separate from instanceof Identifier.

A fragment is semantically a series of SQL tokens.
An identifier is a run of characters that can appear between `...`.

When the runtime supports ES6, these will be represented as subclasses of require('template-tag-common').TypedString, which allows associating a runtime type with a sub-language so that RTTI can be used to avoid over-escaping.

Without ES6, these are meant to just be valid right-hand-sides to the instanceof operator.

Sure, but instanceof is super broken when using modules in Node.js, as I explained. We don't accept using instanceof any longer because of these issues.

And Fragment sounds like it should be using our existing Fragment API: the toSqlString contract. If that is not possible, I need to understand why and then you should remove the toSqlString API completely in favor of the API because there is no reason to have two different Fragment APIs in the same module, that is silly like the module cannot coordinate with itself :)

Refactored to use SqlString.raw and SqlString.identifier.

dougwilson · 2018-01-26T21:26:22Z

lib/es6/Template.js

+if (require('process').env.npm_lifecycle_event === 'test') {
+  // Expose for testing.
+  // Harmless if this leaks
+  sql.makeLexer = makeLexer;


Just put this into a different file and require it directly in the tests, rather than using a conditional export here. When users run their tests, this will be exported in everyone's suite, and if they use it accidentally, their own test suite won't fail; instead they'll fail when it gets into production, probably the worst place to fail :)

TODO(msamuel): separate out into TemplateLexer.js and require it directly.

Done. Actually in lib/es6/Lexer.js

dougwilson · 2018-01-26T21:27:40Z

lib/es6/Template.js

+ * Template tag function that contextually autoescapes values
+ * producing a SqlFragment.
+ */
+const sql = memoizedTagFunction(computeStatic, interpolateSqlIntoFragment);


If this is an in memory cache, let's make sure to note in the README the maximum amount of RAM this is set to so folks will understand how much memory this will consume over the lifetime of their application.

Underlying the cache is a WeakMap, so it should adapt to memory pressure.

The maximum memory usage should scale with the number of uses of the tag that appear in the source code.

I'm not sure I understand, sorry. Doesn't memoizing something cache some aspect in memory for later use? What is the maximum number of objects that will end up in that cache?

The maximum number of objects is 3 per occurrence of a relevant tag usage:

Two created arrays

One record that bundles the two together.

They're not pinned in memory since they key off the template object, which is itself weakly referenced by the containing realm.

If that answers your question, I can put that in the docs.

I guess I'm still confused on how this is working and what it's doing under the hood. Is there another way you can explain what this is doing? For example, if you remove the memoization what does that do? What does this memoization provide? Is it calculating some things and storing them in memory for later use? If so, for how long is that thing stored in memory? I'm trying to figure this out because Node.js apps will typically stay running for months at a time if possible, and I'm concerned that this is going to introduce some kind of small long-term memory growth from memoization of many queries over the lifetime of the app. So maybe that helps provide context for what I'm trying to understand :)

When sql`foo${bar}` is parsed by the EcmaScript engine, it creates a frozen array ['foo', ''].

When control reaches that expression, the memoizer uses that array as a key in a WeakMap lookup to decide whether it has to run the lexer or whether it can reuse.

The frozen array is itself weakly held by the module scope. This happens in "12.2.9.3 Runtime Semantics: GetTemplateObject ( templateLiteral )". Although the spec doesn't make this explicit, it has to be weakly held because otherwise

function foo() {} while (1) { eval(' foo`...` ') }

would leak memory.

So after a full garbage collection, any memoized state for scopes that are not reachable by any re-enterable scope will be flushed.

So the max size of the memo-table post full garbage collection should be the count of sql`...` calls in loaded modules.

By "re-enterable scope" above, I mean that those used only in (function () { ... }()) module initializers and in eval-ed code should not contribute in the steady state.

The min size of the memo-table will be the count of those uses that have been evaluated.

The memory cost per entry in the memo table will be the 3 objects detailed earlier.
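
The lookup described in this comment can be sketched roughly as follows. This is a minimal illustration, not the code in template-tag-common: memoizedTag, computeStatic and interpolate are names assumed for the example, and the "lexer" here just counts holes.

```javascript
// Hypothetical helper: cache per-call-site work, keyed on the frozen
// strings array that the engine passes to a tag function.
function memoizedTag(computeStatic, interpolate) {
  const cache = new WeakMap();
  return function tag(strings, ...values) {
    let staticState = cache.get(strings);
    if (staticState === undefined) {
      // First time this call site is evaluated: do the expensive work once.
      staticState = computeStatic(strings);
      cache.set(strings, staticState);
    }
    return interpolate(staticState, strings, values);
  };
}

// Stand-in for the lexer: count how often the static analysis runs.
let lexRuns = 0;
const demoTag = memoizedTag(
  (strings) => { lexRuns += 1; return strings.length - 1; },
  (holeCount, strings, values) =>
    `${holeCount} hole(s), values: ${values.join(',')}`
);

// The same call site yields the same strings array on every evaluation,
// so repeated calls to runQuery reuse the cached state.
function runQuery(x) {
  return demoTag`SELECT ${x}`;
}
```

Calling runQuery repeatedly runs the static step only once, which is the saving the memoization is meant to provide. How long an entry stays alive depends on how the engine holds the strings array.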

I did a bit of digging around and found tc39/ecma262#840 so some of my assumptions were misfounded. I'm going to replace the cache with an LRU cache which should put a constant limit on the memory overhead.

I replaced the WeakMap with an LRU cache.

What is the LRU cache size in bytes set to / is it adjustable?

The cache caps at 100 entries.

I'm going to assume that in the 3σ case, all uses fit on a 120 column line and have 12 ${...}s.
In this 3σ case, a full cache consumes 11.7kB.

I'm going to assume that the average case has 3 or fewer ${...}.
In the typical case, a full cache consumes 4.7kB.

If storing a single ASCII character in an array costs less than an 8B pointer, then it will be correspondingly cheaper.

I could probably save some space by joining the contexts into a string instead of storing characters in an array, but I don't have the tooling to do memory micro-benchmarks so can't say for sure.

The LRU cache size is not adjustable. I could make that adjustable if you have an idea for an API.
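
For readers unfamiliar with the idea, a bounded LRU cache can be sketched in a few lines on top of Map, which iterates in insertion order. This is an illustrative sketch only; the class name and the capacity constructor parameter are assumptions, not the cache that ships in this PR.

```javascript
// Hypothetical bounded cache: evicts the least recently used entry
// once more than `capacity` entries are stored.
class LruCache {
  constructor(capacity = 100) {
    this.capacity = capacity;
    this.entries = new Map(); // Map preserves insertion order
  }

  get(key) {
    if (!this.entries.has(key)) {
      return undefined;
    }
    // Re-insert so this key becomes the most recently used.
    const value = this.entries.get(key);
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key, value) {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      this.entries.delete(oldest);
    }
  }
}
```

With a fixed capacity the number of retained entries is constant no matter how many distinct call sites are evaluated over the life of the process.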

dougwilson · 2018-01-26T21:32:15Z

Running npm test seems to fail on my machine for some reason:

$ npm test

> [email protected] test C:\Users\dougw\GitHub\nodejs-sqlstring
> node test/run.js

1..2
ok 1 test\unit\test-SqlString.js
  0 fail | 64 pass | 16 ms

not ok 2 test\unit\test-Template.js
  Failed: template tag date

    AssertionError: 'SELECT \'1999-12-31 19:00:00.000\'' == 'SELECT \'2000-01-01 00:00:00.000\''
        at runTagTest (C:\Users\dougw\GitHub\nodejs-sqlstring\test\unit\es6-Template.js:72:12)
        at Object.date (C:\Users\dougw\GitHub\nodejs-sqlstring\test\unit\es6-Template.js:83:5)
        at TestCase.run (C:\Users\dougw\GitHub\nodejs-sqlstring\node_modules\utest\lib\TestCase.js:30:10)
        at Collection._runTestCase (C:\Users\dougw\GitHub\nodejs-sqlstring\node_modules\utest\lib\Collection.js:44:6)
        at Collection.run (C:\Users\dougw\GitHub\nodejs-sqlstring\node_modules\utest\lib\Collection.js:23:10)
        at _combinedTickCallback (internal/process/next_tick.js:73:7)
        at process._tickCallback (internal/process/next_tick.js:104:9)
        at Module.runMain (module.js:606:11)
        at run (bootstrap_node.js:389:7)
        at startup (bootstrap_node.js:149:9)

  1 fail | 24 pass | 46 ms

npm ERR! Test failed.  See above for more details.

dougwilson · 2018-01-26T21:48:44Z

Ok, so I poked around a bit there. Let me know when you have some docs, because I can't (at least in the small time I have right now) figure out how to specify the timezone setting when using the template strings, for example (or stringifyObject).

dougwilson · 2018-01-26T21:49:35Z

In the end, would be great to see template tags land because they are very convenient.

dougwilson · 2018-01-26T21:57:01Z

lib/es6/Template.js

+
+    const one = valueArray[i];
+    let valueStr = null;
+    if (one instanceof SqlFragment) {


If users can construct the instance of these variables themselves, you need to be really careful with instanceof. For example, if a user uses an ORM that has mysql under the hood, which depends on sqlstring and then they are importing sqlstring in order to construct these instances, instanceof will end up being false unless the user is sure they depend on the exact version of sqlstring as the underlying mysql and npm was able to dedup the two installs into a single install. In practice, this has been a long term issue users encounter all the time, so if they can construct the instance, don't use instanceof check at all.

TODO(mikesamuel): I will rework TypedString to carry a language identifier and a static method TypedString.same which can be used in lieu of instanceof.

Out of curiosity, does this occur when two dependencies load two different versions of sqlstring, or is this a result of { "bundleDependencies": true } in package.json?

Both. For example, if a user uses an ORM that has mysql under the hood, which depends on sqlstring and then they are importing sqlstring in order to construct these instances, instanceof will end up being false unless the user is sure they depend on the exact version of sqlstring as the underlying mysql and npm was able to dedup the two installs into a single install.

When there are two copies in play, all reference checks will fail between them because they are completely separate module instances. That is why instanceof checks fail. You can see all this discussion from when the "toSqlString" was implemented, as that was originally going to use instanceof.
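
The failure mode can be reproduced without npm at all. The sketch below simulates two installed copies of one module by evaluating the same class definition twice; loadModuleCopy and isRawSql are names made up for this illustration, while toSqlString is the contract mentioned in this thread.

```javascript
// Each call stands in for a separately installed copy of the module:
// the class bodies are identical, but the constructors are distinct objects.
function loadModuleCopy() {
  class Fragment {
    constructor(sql) { this.sql = sql; }
    toSqlString() { return this.sql; }
  }
  return { Fragment };
}

const copyA = loadModuleCopy(); // e.g. the copy the application imports
const copyB = loadModuleCopy(); // e.g. the copy nested under an ORM

const fragment = new copyA.Fragment('NOW()');

// Identity-based check: fails across copies.
const viaInstanceof = fragment instanceof copyB.Fragment;

// Duck-typed check: works regardless of which copy built the value.
function isRawSql(value) {
  return value !== null &&
    typeof value === 'object' &&
    typeof value.toSqlString === 'function';
}
```

Here viaInstanceof is false even though the value was built by "the same" class, while the duck-typed check accepts it.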

As mentioned, removed SqlFragment and Identifier in favor of SqlString.raw and SqlString.identifier.

dougwilson · 2018-01-26T21:57:19Z

lib/es6/Template.js

+      }
+      valueStr = one.toString();
+      needsSpace = i + 1 === nValues;
+    } else if (one instanceof Identifier) {


Same instanceof comment here as above.

No longer relevant to this PR, but I rewrote TypedString in npmjs.com/package/template-tag-common to make it easy to avoid instanceof.

dougwilson · 2018-01-26T22:05:18Z

Is the following the expected output?

> SqlString.sql`SELECT * FROM foo WHERE bin = "${Buffer.from("1f3870be274f6c49b3e31a0c6728957f","hex")}"`.toString()
SELECT * FROM foo WHERE bin = "8p�\'OlI��\Z
                                           g(�"

dougwilson · 2018-01-26T22:09:00Z

Also not sure if I'm using this right: was just trying to put in a NULL:

> SqlString.sql`SELECT * FROM foo WHERE bin = ${null}`.toString()
SELECT * FROM foo WHERE bin = ?

dougwilson · 2018-01-27T00:13:35Z

The following seems to throw an error, even though it's valid SQL syntax:

> SqlString.sql`UPDATE scores SET score=score--1`.toString()
Error: Expected delimiter at "UPDATE scores SET score=score--1"

mikesamuel

Thanks for giving it a look.
I responded to comments with TODOs and will respond again as they're done.

Thanks so much! I would love to play around with this to
review, but I'm not 100% sure about all the different
cases in here.

When you get a chance, can
you write up documentation for the README so
myself (and everyone else) knows this exists and
how to make use of it?

TODO(mikesamuel): Document API changes

I'm really confused why there
is a SQL parser in here and concerned that there are
going to be lots of edge cases with issues and I'm
not super interested in maintaining an actual SQL
parser as part of this. Why is the parser necessary?
I would almost expect this to act just like ? does
today, or what is happening here?

There seem to be two separable questions:

Why is a sql parser needed?
Why is SqlString.sql`FOO ${bar} BAZ` not just
syntactic sugar for SqlString.format('FOO ? BAZ', [bar])?

I expect (1) is more controversial so I'll deal with that first.

.format might benefit from taking into account context.

.../sqlstring > node
> console.log(sqlstring.format(` SELECT '?' `, ['; DROP TABLE T -- ']))
 SELECT ''; DROP TABLE T -- ''
undefined

This is a footgun. The problem occurs because the author might have been thinking that the '?' specifies a string, and adds quotes. Those quotes interact badly with the quotes inserted by sqlstring.format.

In most cases, a developer will catch this in testing when ? gets filled with something that causes a parse error. It may not for holes that are often filled with empty strings or null values.

Lexing to find delimited string boundaries allows us to identify when a substitution point ( ? in prepared statement syntax ) appears inside quotes which also allows us to switch automagically between escape and escapeId semantics.
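
To make the idea concrete, here is a deliberately naive sketch of finding out whether a hole sits inside a quoted run. It is not the lexer in this PR: delimiterAtEnd is a hypothetical name, comments are ignored entirely, and backslash is assumed to escape only inside '...' and "..." strings.

```javascript
// Scan the static text that precedes a hole and report which delimiter,
// if any, is still open when the hole is reached.
function delimiterAtEnd(sqlChunk, startDelimiter = null) {
  let open = startDelimiter;
  for (let i = 0; i < sqlChunk.length; i++) {
    const ch = sqlChunk[i];
    if (open !== null) {
      if (ch === '\\' && open !== '`') {
        i++; // skip the escaped character inside a quoted string
      } else if (ch === open) {
        open = null; // closing delimiter (a doubled quote simply reopens)
      }
    } else if (ch === "'" || ch === '"' || ch === '`') {
      open = ch;
    }
  }
  return open;
}
```

A result of null suggests the value needs its own quotes, a quote character suggests escaping the value's content only, and a backtick suggests identifier-style escaping.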

I've had success preventing injection attacks with similar systems. SQL is a simpler language, so the approach may be overkill, but the case above worries me and the parser is correspondingly simpler.

and concerned that there are going to be lots of edge cases

Language corner cases are a valid concern, but there is a separation of concerns here that we can take advantage of.

An attacker attempts to use obscure language corner cases to escape containment, but the template author has no such incentives.

This lexer only applies to the portion of the SQL string written by the trusted author though, and the escapers that contain untrusted strings are unchanged by this PR.

(2) is an excellent question and the answer is because I did not thoroughly investigate extending the existing API before implementing. I think I can rework this CL to adapt and extend format in a way that should be backwards compatible except w.r.t. odd cases like the SELECT '?' case above.

The goal could be to provide two idioms:

SqlString.sql({ stringifyObjects, zone}) returns SqlString.sql curried with options.

Used as a tag, SqlString.sql`...` calls SqlString.format under the hood.

Combined, those would enable SqlString.sql({ stringifyObjects, zone })`...`.
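
One way such a dual-use function could tell the two call styles apart is to check for the raw property that engines attach to the strings array of a tagged template. The sketch below assumes that approach; makeSqlTag and render are hypothetical names, and render stands in for whatever does the actual escaping.

```javascript
// Build a function usable both as sql`...` and as sql(options)`...`.
function makeSqlTag(render) {
  return function sql(first, ...rest) {
    if (Array.isArray(first) && Array.isArray(first.raw)) {
      // Called directly as a template tag: use default options.
      return render({}, first, rest);
    }
    // Called with an options object: return a tag that closes over it.
    const options = first || {};
    return (strings, ...values) => render(options, strings, values);
  };
}

// Demo renderer that only reports what it was given.
const demoSql = makeSqlTag((options, strings, values) =>
  JSON.stringify({ options, strings: Array.from(strings), values }));

const direct = demoSql`SELECT ${1}`;
const curried = demoSql({ timeZone: 'GMT' })`SELECT ${2}`;
```

Both call styles reach the same renderer; only the options differ.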

mikesamuel · 2018-01-27T13:54:09Z

lib/Template.js

+   * @param {string} content
+   * @constructor
+   */
+  module.exports.Fragment = stringWrapper();


The important fact is that instanceof Fragment is separate from instanceof Identifier.

A fragment is semantically a series of SQL tokens.
An identifier is a run of characters that can appear between `...`.

When the runtime supports ES6, these will be represented as subclasses of require('template-tag-common').TypedString, which allows associating a runtime type with a sub-language so that RTTI can be used to avoid over-escaping.

Without ES6, these are meant to just be valid right-hand-sides to the instanceof operator.

mikesamuel · 2018-01-27T13:54:15Z

lib/Template.js

+    return false;
+  };
+
+  function stringWrapper() {


Explained when discussing Fragment and Identifier below.

mikesamuel · 2018-01-27T14:00:06Z

lib/es6/Template.js

+
+    const one = valueArray[i];
+    let valueStr = null;
+    if (one instanceof SqlFragment) {


TODO(mikesamuel): I will rework TypedString to carry a language identifier and a static method TypedString.same which can be used in lieu of instanceof.

Out of curiosity, does this occur when two dependencies load two different versions of sqlstring, or is this a result of { "bundleDependencies": true } in package.json?

mikesamuel · 2018-01-27T14:00:16Z

lib/es6/Template.js

+      }
+      valueStr = one.toString();
+      needsSpace = i + 1 === nValues;
+    } else if (one instanceof Identifier) {


mikesamuel · 2018-01-27T14:02:42Z

lib/es6/Template.js

+ * Template tag function that contextually autoescapes values
+ * producing a SqlFragment.
+ */
+const sql = memoizedTagFunction(computeStatic, interpolateSqlIntoFragment);


Underlying the cache is a WeakMap, so it should adapt to memory pressure.

The maximum memory usage should scale with the number of uses of the tag that appear in the source code.

mikesamuel · 2018-01-27T14:03:43Z

lib/es6/Template.js

+if (require('process').env.npm_lifecycle_event === 'test') {
+  // Expose for testing.
+  // Harmless if this leaks
+  sql.makeLexer = makeLexer;


TODO(msamuel): separate out into TemplateLexer.js and require it directly.

mikesamuel · 2018-01-27T14:49:15Z

Ok, so I poked around a bit there. Let me know when you have some docs, because I can't (at least in the small time I have right now) figure out how to specify the timezone setting when using the template strings, for example (or stringifyObject).

I neglected to provide a way to thread those through.

TODO(mikesamuel): allow sql(optionsObject) to specify a tag handler that closes over options including timezone and stringifyObject.
Fix the test that fails in non GMT contexts.
Run tests in two or more TZ=... contexts before submitting PRs.

dougwilson · 2018-01-27T16:58:00Z

Thanks for your comments! I don't see any replies for the bugs(?) I think I found and would love to hear back on those. Also, not sure how you made that second set of line comments but there is no reply button for them so I cannot reply to them.

mikesamuel · 2018-01-29T16:03:21Z

I don't see any replies for the bugs(?) I think I found and would love to hear back on those.

Looking at this again. Will address those and others shortly.

Also, not sure how you made that second set of line comments but there is no reply button for them so I cannot reply to them.

Hmm. That's odd. I'll try to leave things as file comments.

mikesamuel · 2018-01-29T22:17:39Z

lib/es6/Lexer.js

+    '|' +
+    (
+      // Run of non-comment non-string starts
+      '(?:[^\'"`\\-/#]|-(?!-' + WS + ')|/(?![*]))'


@dougwilson, The addition of WS fixes the failure to lex X--1 example you found.
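
For context, MySQL only treats -- as a comment start when the second dash is followed by whitespace or a control character, which is why score--1 is ordinary arithmetic. A simplified stand-alone check is sketched below; it uses \s as an assumed approximation of WS, whose exact definition is not shown in the quoted diff, and it ignores quoted strings.

```javascript
// Simplified check: does the text contain a "-- " style comment start?
// (\s approximates the whitespace class; string literals are not skipped.)
const LINE_COMMENT_START = /--\s/;

function hasDashComment(sql) {
  return LINE_COMMENT_START.test(sql);
}
```

Under this rule 'UPDATE scores SET score=score--1' contains no comment, while 'SELECT 1 -- trailing note' does.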

mikesamuel · 2018-01-29T22:18:23Z

lib/es6/Lexer.js

@@ -0,0 +1,107 @@
+// A simple lexer for SQL.


@dougwilson This separate file is now imported by the unit test instead, getting rid of the conditional export.

mikesamuel · 2018-01-29T22:19:02Z

lib/es6/Template.js

+      if (delimiter) {
+        result += escapeDelimitedValue(value, delimiter, timeZone);
+      } else {
+        result += SqlString.escape(value, stringifyObjects, timeZone);


The responsibility for escaping is almost entirely delegated to ../SqlString.js now.

mikesamuel · 2018-01-29T22:20:02Z

lib/es6/Template.js

+    return SqlString.escapeId(String(value)).replace(/^`|`$/g, '');
+  }
+  if (Buffer.isBuffer(value)) {
+    value = value.toString('binary');


I found no good way to use an X"..." style syntax the way format/escape do.

Will this insert the data into the database without silent data loss for all possible values? If it's not possible to do the right escaping, we probably don't want to end up formatting silently for data loss, right?

Hmm. https://dev.mysql.com/doc/refman/5.7/en/string-literals.html seems relevant

[_charset_name]'string' [COLLATE collation_name]

so it seems to depend on whether collations can be lossy even if charset decoding doesn't substitute U+FFFD or the like.

https://dev.mysql.com/doc/refman/5.7/en/adding-collation-simple-8bit.html seems to suggest they can be. The custom <collation name="latin1_test_ci"> at that link seems to downgrade to LATIN. Default collations can be associated with columns, so there may be no cues in the text.
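
On the JavaScript side, at least, the 'binary' encoding is lossless: every byte maps to one code unit and back. The sketch below demonstrates only that half of the question; whether the server preserves those bytes inside a quoted string is the open issue above. The hex-literal line is shown for comparison and is built by hand here, not by calling sqlstring.

```javascript
// The same 16 bytes used in the bug report.
const original = Buffer.from('1f3870be274f6c49b3e31a0c6728957f', 'hex');

// 'binary' (latin1) maps each byte to a single UTF-16 code unit.
const asBinaryString = original.toString('binary');

// Converting back with the same encoding recovers the original bytes.
const roundTripped = Buffer.from(asBinaryString, 'binary');

// For comparison: an X'...' hex literal carries the bytes as plain ASCII.
const hexLiteral = "X'" + original.toString('hex') + "'";
```

The hex form avoids charset and collation questions entirely, but it cannot be spliced into the middle of an already-open quoted string, which is the difficulty described here.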

mikesamuel · 2018-01-29T22:20:53Z

package.json

  ],
  "repository": "mysqljs/sqlstring",
+  "dependencies": {
+    "template-tag-common": "2.0.1"


Bumped the major version because I reworked it to allow efficiently threading options objects through to address my failure to thread timeZone and stringifyObjects properly.

mikesamuel · 2018-01-29T22:22:31Z

test/unit/es6-Lexer.js

+    assert.equal(tokens('SELECT ```', '`'), '`,_');
+    assert.equal(tokens('SELECT `\\`', '`'), '`,_');
+  },
+  'comment': function () {


Cruft. Will remove.

mikesamuel · 2018-01-29T22:22:50Z

test/unit/es6-Template.js

+  'date': function () {
+    runTagTest(
+      `SELECT '2000-01-01 00:00:00.000'`,
+      () => sql({ timeZone: 'GMT' })`SELECT ${new Date(Date.UTC(2000, 0, 1, 0, 0, 0))}`);


This is how timezones can thread through.

mikesamuel · 2018-01-29T22:23:42Z

test/unit/es6-Template.js

+    runTagTest(
+      'SELECT "\x1f8p\xbe\\\'OlI\xb3\xe3\\Z\x0cg(\x95\x7f"',
+      () =>
+        sql`SELECT "${Buffer.from("1f3870be274f6c49b3e31a0c6728957f","hex")}"`


In case incremental diffs aren't working, I added this and the following two testcases to address other bugs that you found.

mikesamuel · 2018-01-29T22:24:05Z

test/unit/test-Lexer.js

+
+// If we're on a Node runtime that should support ES6, run the ES6 tests.
+var nodeVersion = process.env.npm_config_node_version;
+if (/^[0-5][.]/.test(nodeVersion || '')) {


TODO(mikesamuel): Is there a common place I can put this old-ES-engine-test-skipping machinery?

mikesamuel · 2018-01-29T22:25:17Z

test/unit/test-SqlString.js

+  'double quest marks passes pre-escaped id': function () {
+    var sql = SqlString.format(
+      'SELECT * FROM ?? WHERE id = ?',
+      [SqlString.identifier('table'), 42]);


I can't quite put my finger on why, but it seems like if SqlString.identifier is an API, this is important for some kind of symmetry.

mikesamuel · 2018-01-29T22:32:31Z

The test failures seem to be related to npm run-script of eslint and test-ci.

I'll tackle that in the next commit but probably won't push anything until tomorrow.

mikesamuel · 2018-01-30T02:16:37Z

The remaining Travis CI failures seem to be in test-{Lexer,Template} because the following test is not working on Node runtimes with version 0.x.x:

var nodeVersion = process.env.npm_config_node_version;
if (/^[0-5][.]/.test(nodeVersion || '')) {

I don't know enough about historical oddities of Node runtimes, so I'll have to figure out how to do a robust Node version test.

I'll see if https://www.npmjs.com/package/check-node-version works on older node tomorrow.

mikesamuel · 2018-01-30T15:24:43Z

I discovered npx which lets me test with various versions.

$ npm install --no-save npx
$ ./node_modules/.bin/npx [email protected] test/run.js
1..3
ok 1 test/unit/test-Lexer.js
  Skipping ES6 tests for node_version v0.12.18

ok 2 test/unit/test-SqlString.js
  0 fail | 72 pass | 11 ms

ok 3 test/unit/test-Template.js
  Skipping ES6 tests for node_version v0.12.18
  0 fail | 3 pass | 1 ms

mikesamuel · 2018-01-30T15:57:12Z

Tests run green now.

I've looked over the coverage report. The main sticky point there is

// lib/Template.js
try {
  module.exports = require('./es6/Template');
} catch (ignored) {
  // ES6 code failed to load.
  ...
}

I added tests for the missing branch but those won't be reflected in the coverage report since it does not, IIUC, union coverage from runs on multiple node versions.

dougwilson · 2018-01-30T16:13:15Z

Sorry I've been busy these past 2 days. The coverage is definitely a union as reported to the PR status -- I maintain many modules that have separate code paths based on versions. Looking at the missing 2.2% it's lines not covered in any Node.js version run.

mikesamuel · 2018-01-30T16:48:07Z

The coverage is definitely a union as reported to the PR status

Ah. I see https://coveralls.io/builds/15290075/source?filename=lib%2FTemplate.js
Nice!

mikesamuel · 2018-01-30T16:49:10Z

it's lines not covered in any Node.js version run.

It was unreachable since the lexer patterns will all fall back to matching the empty string. Fixed.

dougwilson · 2018-01-30T16:50:49Z

It was unreachable since the lexer patterns will all fall back to matching the empty string. Fixed.

No, it was as I said at the time I made comments. I'm not sure why you are assuming I'm making comments on changes you made after I made comments. Or am I missing something here? You're saying all 2.2% of those uncovered lines were from the lexer pattern fallback stuff?

dougwilson · 2018-01-30T16:52:34Z

lib/SqlString.js

+    if (QUALIFIED_ID_REGEXP.test(sqlString)) {
+      return sqlString;
+    } else {
+      throw new TypeError();


Can this provide some kind of message / description that will assist the dev who encounters this thrown error?

throw new TypeError( 'raw sql reached ?? or escapeId but is not an identifier: ' + sqlString);

dougwilson · 2018-01-30T16:55:18Z

lib/Template.js

@@ -0,0 +1,32 @@
+/* eslint no-unused-vars: "off" */


Is this 100% necessary? If it's possible to not override eslint, please try not to. Otherwise, add comments somewhere in the code explaining why it's necessary so we know for the future :) And ideally, if it's possible to scope the override to only the specific line, that is better for the future because a line-scoped override probably doesn't need a comment (since it is likely to be more obvious) and it will continue to protect against this mistake in the rest of the file 👍

I can definitely do line overrides, or just get rid of the formal parameters if you're ok with calledAsTemplateTagQuick.length differing depending on whether the fallback is what's used.

What do you prefer?

Hmm, not sure. I feel like they aren't necessary, but maybe I'm not seeing something here. What would the users encounter if the property differed?

I'm not sure. I just bias towards minimizing programmatically visible differences.

Some code might treat zero-arity functions as a thunk, but I don't know why these functions would reach code that does.

mikesamuel · 2018-01-30T17:05:52Z

No, it was as I said at the time I made comments. I'm not sure why you are assuming I'm making comments on changes you made after I made comments. Or am I missing something here? You're saying all 2.2% of those uncovered lines were from the lexer pattern fallback stuff?

You're not missing anything, and I'm not saying that.

dougwilson · 2018-01-30T17:06:09Z

test/unit/test-Lexer.js

@@ -0,0 +1,7 @@
+// If we're on a Node runtime that should support ES6, run the ES6 tests.
+if (/^v?[0-5][.]/.test(process.version)) {


Rather than version detect, can the tests simply skip if loading the Lexer fails? Or at least use some kind of feature detection instead of version sniffing? I remember in the io.js days and post io.js, Node.js core constantly said never try to version sniff in the code, only feature detect. There are also things like Chakra Node these days too. Version skipping in the main mysql package is all done using feature detection.

If one of the source files fails to load because of an error in the source when read by an ES>=6 interpreter, then npm test would not alert us. npm run-script lint does give us some confidence here, but would not alert us if there was a failure like

const A_REGEXP = new RegExp('[invalid input to RegExp constructor');

Putting a stake in the ground seemed the safest strategy for test code, and degrading gracefully seemed the best for production code.

Makes sense not to just abort if the require fails. You can just add es6 feature detect here instead, then 👍

Ah. Good point. Were you thinking of something like

var es6Compatible = true; try { eval('const [ x ] = [ 1234 ];'); } catch (ignored) { es6Compatible = false; }

But I suppose that would interact badly if tests were run under FLAG_disallow_code_generation_from_strings.

TODO(mikesamuel): Add a simple es6 canary file that can be required inside a try block.
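
A rough sketch of how such a canary could be used follows. The helper name supportsEs6 and the path './es6-canary' are hypothetical; the point is that only the tiny canary is loaded inside the try block, so a genuine error in the real ES6 sources still fails the test run.

```javascript
// Returns true if the canary loads, false if loading it throws
// (for example a SyntaxError on an engine without ES6 support).
function supportsEs6(loadCanary) {
  try {
    loadCanary();
    return true;
  } catch (ignored) {
    return false;
  }
}

// Intended use in a test entry point (paths are assumptions):
//
//   if (supportsEs6(() => require('./es6-canary'))) {
//     require('./es6-Lexer'); // not guarded, so real load errors surface
//   } else {
//     console.log('Skipping ES6 tests for node_version ' + process.version);
//   }
```

Because nothing is evaluated from a string, this approach should not be affected by flags that disallow code generation from strings.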

dougwilson · 2018-01-30T17:07:27Z

You're not missing anything, and I'm not saying that.

Very sorry, I guess I just misunderstood what you're trying to say 😂 So what are you trying to say?

mikesamuel · 2018-01-30T17:12:10Z

Very sorry, I guess I just misunderstood what you're trying to say 😂 So what are you trying to say?

I saw your comment just after I'd convinced myself that the last missed branch was unreachable, and assumed I hadn't seen it earlier since it hadn't been there.

I was focused on particular lines in files, and did not intend to make a claim about percentages though that is a reasonable interpretation of what I wrote.

dougwilson · 2018-01-30T17:13:46Z

P.S. thanks so much for all this work! I think I'm finally understanding why you're adding the lexer, and it seems like the old .format would also benefit from the added context information, but the lexer is written in es6. There doesn't seem to be anything in the lexer itself that requires es6 to function, though. One day it would be sweet to port that back to es5 and add it to .format so the es6 template can be sugar and everyone can benefit from the additional context / protection that the lexer is enabling instead of letting those users continue to footgun 😂

I need to get going and I'll be back in a few days to continue to review and learn from the code. There is a lot of code, especially in the lexer that is new to me, at least :) Also something to think about would be if you'd be open to committing to helping maintain this stuff for the next 1 year or not. No big deal if not, but just asking because I don't really need to fully understand it prior to merging, for example, if I know you'll be around to help if issues come up 👍

mikesamuel · 2018-01-30T17:35:14Z

Also something to think about would be if you'd be open to committing to helping maintain this stuff for the next 1 year or not.

I'm planning on doing a variety of open-source hardening things, so I can be available to walk early adopters through and fix bugs that shake out. I'm unlikely to have reliable availability outside of GMT+5 working hours though.

One day it would be sweet to port that back to es5

It should be fairly straightforward.

Trying to recast format in terms of machinery currently provided by Template though should happen after we have early adopters' experiences to inform us.

I need to get going and ...

Ttyt.

dougwilson · 2018-02-24T16:09:18Z

Can you make changes to the new dependency template-tag-common so it does not have a fragile installation? It has dependencies with ">=" which means when new major versions of those are published they'll get installed, likely breaking installs of this module that used to work.

mikesamuel · 2018-02-26T19:06:04Z

Docs at https://github.com/mikesamuel/sqlstring/tree/contextual-template-tags#es6-template-tag-support

mikesamuel · 2018-02-27T16:48:02Z

lib/es6/Template.js

+ * Template tag function that contextually autoescapes values
+ * producing a SqlFragment.
+ */
+const sql = memoizedTagFunction(computeStatic, interpolateSqlIntoFragment);


The cache caps at 100 entries.

I'm going to assume that in the 3σ case, every use fits on a 120-column line and has 12 ${...}s.
In this 3σ case, a full cache consumes 11.7kB.

I'm going to assume that the average case has 3 or fewer ${...}.
In the typical case, a full cache consumes 4.7kB.

If storing a single ASCII character in an array costs less than an 8B pointer, then it will be correspondingly cheaper.

I could probably save some space by joining the contexts into a string instead of storing characters in an array, but I don't have the tooling to do memory micro-benchmarks so can't say for sure.

The LRU cache size is not adjustable. I could make that adjustable if you have an idea for an API.
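
For readers unfamiliar with the pattern, the following is a rough sketch of a memoizing tag function with a capped LRU cache. It is not the `memoizedTagFunction` from template-tag-common; the name `memoizeTagSketch` and the `maxEntries` parameter are hypothetical, the latter showing one possible shape for an adjustable size. It relies on the fact that a given template literal call site passes the same strings array on every evaluation, so that array can serve as the cache key.

```javascript
// Hypothetical sketch: caches the result of computeStatic per call site and
// evicts the least recently used entry once maxEntries is reached.
function memoizeTagSketch(computeStatic, interpolate, maxEntries = 100) {
  // A Map iterates in insertion order, so re-inserting on every hit keeps
  // the least recently used key first.
  const cache = new Map();
  return function tag(strings, ...values) {
    let staticState;
    if (cache.has(strings)) {
      staticState = cache.get(strings);
      cache.delete(strings); // re-inserted below to mark as most recent
    } else {
      staticState = computeStatic(strings);
      if (cache.size >= maxEntries) {
        cache.delete(cache.keys().next().value); // evict the oldest entry
      }
    }
    cache.set(strings, staticState);
    return interpolate(staticState, strings, values);
  };
}

// Usage: the static analysis runs once per call site, not once per call.
let computeCount = 0;
const demoTag = memoizeTagSketch(
  (strings) => { computeCount += 1; return strings.length - 1; },
  (holeCount, strings, values) => `${holeCount} hole(s), values: ${values.join(',')}`
);
function lookup(x) { return demoTag`SELECT * FROM T WHERE x = ${x}`; }
console.log(lookup(1));    // 1 hole(s), values: 1
console.log(lookup(2));    // 1 hole(s), values: 2
console.log(computeCount); // 1
```

Note that a Map holds strong references to the strings arrays; the cap bounds how many are retained.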

mikesamuel · 2018-02-27T16:53:22Z

docs/sql-railroad.svg

+* adding xmlns="http://www.w3.org/2000/svg" to the <svg> tag
+* adding <defs><style>...</style></defs> as the first child
+  of the <svg> element with the contents of
+  https://raw.githubusercontent.com/tabatkins/railroad-diagrams/gh-pages/railroad-diagrams.css


The code review tool doesn't show image sources by default, so see this if you want the diagram's derivation.

mikesamuel · 2018-02-27T16:57:21Z

I answered your questions on the cache, and did all the remaining TODOs.

If the cache is a concern, I could try benchmarking with and without. If that sounds useful, do you have any preferred way of doing micro-benchmarks in JS?
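
As one possible shape for such a comparison, here is a minimal timing sketch using `process.hrtime.bigint()` (available in Node 10.7 and later). The `analyze` function is a hypothetical stand-in for per-template static work, not the lexer from this PR, and timings from a loop like this are only indicative.

```javascript
// Returns the mean nanoseconds per call of fn, after a short warm-up.
function meanNanosPerCall(fn, iterations) {
  for (let i = 0; i < 1000; i++) fn(i); // warm-up so the JIT can optimize
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) fn(i);
  const elapsed = process.hrtime.bigint() - start;
  return Number(elapsed) / iterations;
}

// Hypothetical stand-in for static analysis: counts placeholder characters.
function analyze(text) {
  let contexts = 0;
  for (const ch of text) {
    if (ch === '?') contexts++;
  }
  return contexts;
}

const template = 'SELECT * FROM T WHERE x = ?, y = ?, z = ?';
const analysisCache = new Map();
function analyzeCached(text) {
  if (!analysisCache.has(text)) analysisCache.set(text, analyze(text));
  return analysisCache.get(text);
}

const uncachedNanos = meanNanosPerCall(() => analyze(template), 100000);
const cachedNanos = meanNanosPerCall(() => analyzeCached(template), 100000);
console.log({ uncachedNanos, cachedNanos });
```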


mikesamuel · 2018-03-07T17:15:06Z

I rebased around releases 2.3.{0,1} which required a git push -f. That required one manual merge around the new 'triple question marks are ignored' and a test I added in test-SqlString.js

mikesamuel · 2018-04-03T16:49:33Z

Ping?

mikesamuel · 2018-05-07T14:43:40Z

Ping?

mikesamuel · 2018-06-12T13:40:50Z

@dougwilson If this is effectively dead, please let me know and I'll close out the PR.

dougwilson · 2018-06-13T03:01:53Z

I am sorry I have been bad about providing updates. I have been working on rewriting the syntax to be es5 instead of es6 so it can be exposed to everyone and not restricted to only use in template strings (though that would still be supported). There are also some edge cases I fixed as well. I'm going to push them up (as separate commits -- not altering your existing commits) to the branch in this PR so we can review them to make sure they are correct. I know you didn't want to translate it to es5 from our earlier conversations, so I figured I'd help with that effort since I'm the one of us two who is interested in that support 👍

mikesamuel · 2018-06-13T17:08:39Z

Thanks for explaining. I wasn't averse to translating to es5. I was just worried that doing that in a single PR might be too much, but it sounds like you've got it sorted.

mikesamuel · 2018-09-15T19:38:37Z

Fyi, I needed something like this so I wrote safesql to provide template tags for MySQL and Postgres based in part on this PR.

mikesamuel mentioned this pull request Jan 26, 2018

Add a string template tag handler for securely composing queries. mysqljs/mysql#1926

Open

dougwilson self-assigned this Jan 26, 2018

dougwilson reviewed Jan 26, 2018


mikesamuel commented Jan 27, 2018


dougwilson added enhancement needs discussion labels Jan 29, 2018

mikesamuel commented Jan 29, 2018


dougwilson reviewed Jan 30, 2018


dougwilson added needs docs needs rebase labels Feb 24, 2018

mikesamuel commented Feb 27, 2018


dougwilson removed the needs docs label Mar 7, 2018

mikesamuel force-pushed the contextual-template-tags branch from 877df9f to fc306fa Compare March 7, 2018 17:07

dougwilson mentioned this pull request Jan 3, 2019

Add API for unquoted escaped strings #39

Closed

		@@ -0,0 +1,7 @@
		// If we're on a Node runtime that should support ES6, run the ES6 tests.
		if (/^v?[0-5][.]/.test(process.version)) {
