Skip to content

Result of query with comment is inconsistent and random #975

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
kkrypt0nn opened this issue Mar 28, 2023 · 18 comments · Fixed by #996
Closed

Result of query with comment is inconsistent and random #975

kkrypt0nn opened this issue Mar 28, 2023 · 18 comments · Fixed by #996

Comments

@kkrypt0nn
Copy link

kkrypt0nn commented Mar 28, 2023

There seem to be an inconsistent bug in the comments parsing method.

Having the following raw SQL query (making it a one liner doesn't change anything either):

const query = `SELECT *
						   FROM user_satellites_satellite
									FULL JOIN satellite ON user_satellites_satellite.satelliteId = satellite.id
						   WHERE user_satellites_satellite.userId = '${user.id}'
							 AND (satellite.name LIKE '%${searchQuery}%' OR satellite.description LIKE '%${searchQuery}%');`;

with searchQuery being (random can be anything):

random') UNION SELECT CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), tbl_name, CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR) FROM sqlite_master; --

And executing it with query()

await this.dataSource.query(query);

Will either return the correct and expected data or the error

The supplied SQL string contains more than one statement

Logging with NestJS shows the following query is getting executed (query logged being always the same):

Query getting executed

The exact same query being executed multiple times over and over, e.g. by spamming, will make the query return correct data after some time, which makes the bug inconsistent:

bug.mp4

Switching the SQLite driver to the sqlite - being installed with npm install sqlite3 - driver makes the query always work and return the expected result, hence is not related to TypeORM itself.

@kkrypt0nn kkrypt0nn changed the title Result of query is inconsistent and random Result of query with comment is inconsistent and random Mar 28, 2023
@Prinzhorn
Copy link
Contributor

Can you provide a self-contained reproduction? Preferably a single index.js that uses a :memory: database and can just be run standalone without anything other than better-sqlite3?

Alternatively, since you've seem to have narrowed down the possible cause, maybe you can contribute a failing test? But from what I understand you're saying it's not behaving consistently?

@Prinzhorn
Copy link
Contributor

Prinzhorn commented Mar 28, 2023

FWIW, what you're doing is not optimal. It allows SQL injections (including injecting comments) and prevents you from re-using prepared statements. I would do sth. like satellite.name LIKE ('%' || :searchQuery || '%') and pass it as a parameter.

@kkrypt0nn
Copy link
Author

kkrypt0nn commented Mar 28, 2023

Can you provide a self-contained reproduction? Preferably a single index.js that uses a :memory: database and can just be run standalone without anything other than better-sqlite3?

Alternatively, since you've seem to have narrowed down the possible cause, maybe you can contribute a failing test? But from what I understand you're saying it's not behaving consistently?

What exactly do you mean with "self-contained reproduction"? I could provide a NestJS example which has the issue contained in it if needed.

FWIW, what you're doing is not optimal. It allows SQL injections (including injecting comments) and prevents you from re-using prepared statements. I would do sth. like satellite.name LIKE '%' || :searchQuery || '%' and pass it as a parameter.

Yep I am aware of that, the purpose of the application is to showcase it, hence why the tested input is actually an SQLi and why I am using the dataSource to make a raw query instead of making use of TypeORM's repositories.

@Prinzhorn
Copy link
Contributor

What exactly do you mean with "self-contained reproduction"? I could provide a NestJS example which has the issue contained in it if needed.

import Database from 'better-sqlite3';

const db = new Database(':memory:');

// Your code here.

Yep I am aware of that, the purpose of the application is to showcase it, hence why the tested input is actually an SQLi

Ok, so the problem is not that you're getting "The supplied SQL string contains more than one statement" errors but that you claim for the same input better-sqlite3 sometimes errors and sometimes doesn't? Correct?

@kkrypt0nn
Copy link
Author

kkrypt0nn commented Mar 28, 2023

Ok, so the problem is not that you're getting "The supplied SQL string contains more than one statement" errors but that you claim for the same input better-sqlite3 sometimes errors and sometimes doesn't? Correct?

Absolutely, when spamming the same query over and over it leads to a previously failed query returning the correct data - which is there the big inconsistency.

I will provide the reproduction as soon as possible 👍

@Prinzhorn
Copy link
Contributor

I will provide the reproduction as soon as possible +1

I'm super curious to see. Right now I'm assuming the error is somewhere else in the chain and not in better-sqlite3 itself. But let's find out.

Absolutely, when spamming the same query over and over it leads to a previously failed query returning the correct data - which is there the big inconsistency.

This could very easily be a race condition (or other type of bug, caching etc.) in your server or client code, making it look like that when in reality it is actually consistent. Hence why we need a repro without any other dependency.

@kkrypt0nn
Copy link
Author

kkrypt0nn commented Mar 28, 2023

Seems like when just using it in a standalone way it's fine.

I suppose NestJS's TypeORM module is doing some magic which makes it behave differently 🤔

@kkrypt0nn kkrypt0nn closed this as not planned Won't fix, can't repro, duplicate, stale Mar 28, 2023
@kkrypt0nn
Copy link
Author

kkrypt0nn commented Mar 28, 2023

Seems like the issue is actually with the comments parsing here after all. When using template strings and new lines like in the example below, it will fail - though when using template strings and putting everything in a single line it succeeds perfectly. There is no reason it should fail for multiple statements as the query executed is

SELECT *
                                                   FROM user_satellites_satellite
                                                                        FULL JOIN satellite ON user_satellites_satellite.satelliteId = satellite.id
                                                   WHERE user_satellites_satellite.userId = '630633ba-030c-4967-acd4-d790bf26cfda'
                                                         AND (satellite.name LIKE '%ezrms7w8tzmjb3s02iu6') UNION SELECT tbl_name, CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR) FROM sqlite_master; --%' OR satellite.description LIKE '%ezrms7w8tzmjb3s02iu6') UNION SELECT tbl_name, CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR) FROM sqlite_master; --%');

Which has only one statement as the rest is commented out.

Source code used:

import Database from "better-sqlite3";

const db = new Database("db.sqlite");

let failed = 0;
let succeeded = 0;

for (let i = 0; i < 5000; i++) {
  const random = Array.from(Array(20), () =>
    Math.floor(Math.random() * 36).toString(36)
  ).join("");
  const searchQuery = `${random}') UNION SELECT tbl_name, CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR) FROM sqlite_master; --`;
  const query = `SELECT *
						   FROM user_satellites_satellite
									FULL JOIN satellite ON user_satellites_satellite.satelliteId = satellite.id
						   WHERE user_satellites_satellite.userId = '630633ba-030c-4967-acd4-d790bf26cfda'
							 AND (satellite.name LIKE '%${searchQuery}%' OR satellite.description LIKE '%${searchQuery}%');`;
  try {
    await db.prepare(query).all();
    succeeded++;
  } catch (_) {
    failed++;
  }
}

console.log(failed, succeeded); // Will never be consistent

The interesting part is that sometimes it works, sometimes it doesn't. You can also edit the code so that failed queries are executed again, some will work after some new attempts made.

Running the exact same code with sqlite3 works fine and will always log 0 5000:

const sqlite3 = require("sqlite3").verbose();
const db = new sqlite3.Database("db.sqlite");

async function main() {
  let failed = 0;
  let succeeded = 0;

  for (let i = 0; i < 5000; i++) {
    const random = Array.from(Array(20), () =>
      Math.floor(Math.random() * 36).toString(36)
    ).join("");
    const searchQuery = `${random}') UNION SELECT tbl_name, CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR) FROM sqlite_master; --`;
    const query = `SELECT *
						   FROM user_satellites_satellite
									FULL JOIN satellite ON user_satellites_satellite.satelliteId = satellite.id
						   WHERE user_satellites_satellite.userId = '630633ba-030c-4967-acd4-d790bf26cfda'
							 AND (satellite.name LIKE '%${searchQuery}%' OR satellite.description LIKE '%${searchQuery}%');`;
    try {
      await db.prepare(query).all();
      succeeded++;
    } catch (_) {
      failed++;
    }
  }

  console.log(failed, succeeded); // 0 5000
}

main();

The database used is available here: https://file.io/GaG5ThL0YbHs

@kkrypt0nn kkrypt0nn reopened this Mar 28, 2023
@Prinzhorn
Copy link
Contributor

Prinzhorn commented Mar 29, 2023

I'm getting consistent 5000 0. Are you on the latest version of better-sqlite3? What Node.js version? Does this only happen locally or also on other machines? Can you provide a Dockerfile that demonstrates the inconsistency so that I can see it too? Also can you reproduce this with a consistent seeded random value (e.g. https://github.com/davidbau/seedrandom)? This should give you at least consistent numbers, albeit maybe not 5000 0. If not, then this contradicts everything I know about computers.

@Prinzhorn
Copy link
Contributor

Prinzhorn commented Mar 29, 2023

Are you sure it's actually inconsistent or are you expecting the following to not produce an error?

import Database from "better-sqlite3";

const db = new Database(":memory:");

db.prepare(`SELECT 1; --%test`);

@kkrypt0nn
Copy link
Author

kkrypt0nn commented Mar 29, 2023

Are you on the latest version of better-sqlite3? What Node.js version? Does this only happen locally or also on other machines?

"better-sqlite3": "^8.2.0", Node.JS 19.8.1 (macOS) & 18.15.0 (Windows)

Are you sure it's actually inconsistent or are you expecting the following to not produce an error?

That is it output I am getting:

image

When making the query a one line query with template strings, without new lines or tabs, it works as it should:

image

It is very inconsistent, and the query executed should never produce an error (hence must always return 0 5000), as the query is a valid SQLite query with comment.

Can you provide a Dockerfile that demonstrates the inconsistency so that I can see it too?

Using that basic Dockerfile

FROM node:19.8.1

WORKDIR /app

COPY db.sqlite /app
COPY main.js /app
COPY package-lock.json /app
COPY package.json /app

RUN npm i

and getting an interactive shell with

docker run -it $(docker build -q .) bash

gives me sometimes fully failed, sometimes fully succeeded results:

image

When logging the error in the docker:

RangeError: The supplied SQL string contains more than one statement
    at Database.prepare (/app/node_modules/better-sqlite3/lib/methods/wrappers.js:5:21)
    at file:///app/main.js:19:14
    at ModuleJob.run (node:internal/modules/esm/module_job:193:25)

@kkrypt0nn
Copy link
Author

kkrypt0nn commented Mar 29, 2023

If you're interested - those are the results when using sqlite3 along with the same database & query:

Locally on macOS:
image

Within a docker:
image

@Prinzhorn
Copy link
Contributor

Prinzhorn commented Mar 29, 2023

I'm completely fascinated by this bug. I've created the following repo, which uses a memory database and does not use any random values. It behaves the same for Node 16 and 18. It will seemingly randomly work and then not work. The interesting bit is that when one of the three doIt() calls succeeds or fails, all others also will usually succeed or fail. So you get either of the two outputs:

 $ node main.js 
RangeError: The supplied SQL string contains more than one statement
    at Database.prepare (/src/issues/better-sqlite3-random/node_modules/better-sqlite3/lib/methods/wrappers.js:5:21)
    at doIt (file:///src/issues/better-sqlite3-random/main.js:22:8)
    at file:///src/issues/better-sqlite3-random/main.js:30:1
    at ModuleJob.run (node:internal/modules/esm/module_job:194:25)
RangeError: The supplied SQL string contains more than one statement
    at Database.prepare (/src/issues/better-sqlite3-random/node_modules/better-sqlite3/lib/methods/wrappers.js:5:21)
    at doIt (file:///src/issues/better-sqlite3-random/main.js:22:8)
    at file:///src/issues/better-sqlite3-random/main.js:31:1
    at ModuleJob.run (node:internal/modules/esm/module_job:194:25)
RangeError: The supplied SQL string contains more than one statement
    at Database.prepare (/src/issues/better-sqlite3-random/node_modules/better-sqlite3/lib/methods/wrappers.js:5:21)
    at doIt (file:///src/issues/better-sqlite3-random/main.js:22:8)
    at file:///src/issues/better-sqlite3-random/main.js:32:1
    at ModuleJob.run (node:internal/modules/esm/module_job:194:25)

Or

 $ node main.js 
OK
OK
OK

I only got a single very rare mix of

$ node main.js 
OK
RangeError: The supplied SQL string contains more than one statement
    at Database.prepare (/src/issues/better-sqlite3-random/node_modules/better-sqlite3/lib/methods/wrappers.js:5:21)
    at doIt (file:///src/issues/better-sqlite3-random/main.js:22:8)
    at file:///src/issues/better-sqlite3-random/main.js:31:1
    at ModuleJob.run (node:internal/modules/esm/module_job:194:25)
RangeError: The supplied SQL string contains more than one statement
    at Database.prepare (/src/issues/better-sqlite3-random/node_modules/better-sqlite3/lib/methods/wrappers.js:5:21)
    at doIt (file:///src/issues/better-sqlite3-random/main.js:22:8)
    at file:///src/issues/better-sqlite3-random/main.js:32:1
    at ModuleJob.run (node:internal/modules/esm/module_job:194:25)

Code:

import Database from 'better-sqlite3';

function doIt() {
  const db = new Database(':memory:');

  db.exec(`
    CREATE TABLE "satellite" ("id" varchar PRIMARY KEY NOT NULL, "name" varchar(255) NOT NULL, "description" varchar(255) NOT NULL, CONSTRAINT "UQ_ebcb5f391c90d59b38c2adecb13" UNIQUE ("name"));

    CREATE TABLE "user" ("id" varchar PRIMARY KEY NOT NULL, "username" varchar(255) NOT NULL, CONSTRAINT "UQ_78a916df40e02a9deb1c4b75edb" UNIQUE ("username"));

    CREATE TABLE "user_satellites_satellite" ("userId" varchar NOT NULL, "satelliteId" varchar NOT NULL, CONSTRAINT "FK_058e916625b4d0e5d920c62f2bb" FOREIGN KEY ("userId") REFERENCES "user" ("id") ON DELETE CASCADE ON UPDATE CASCADE, CONSTRAINT "FK_f3762d755df5bbba15af5507822" FOREIGN KEY ("satelliteId") REFERENCES "satellite" ("id") ON DELETE CASCADE ON UPDATE CASCADE, PRIMARY KEY ("userId", "satelliteId"));
  `);

  const searchQuery = `xxxxxxxxxxx') UNION SELECT tbl_name, CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR) FROM sqlite_master; --`;
  const query = `SELECT *
               FROM user_satellites_satellite
                  FULL JOIN satellite ON user_satellites_satellite.satelliteId = satellite.id
               WHERE user_satellites_satellite.userId = '630633ba-030c-4967-acd4-d790bf26cfda'
               AND (satellite.name LIKE '%${searchQuery}%' OR satellite.description LIKE '%${searchQuery}%');`;

  try {
    db.prepare(query);
    console.log('OK');
  } catch (err) {
    console.error(err);
  }
  db.close();
}

doIt();
doIt();
doIt();

With the length of xxxxxxxxxx you can control if it leans towards success or fail. E.g. using a single x or xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx I can't seem to make it fail, while xxxxxxxxxxxxxxxxxxxxxxxxxx fails almost always.

This is like better than Netflix. Invite your friends over 😄

@JoshuaWise is pretty good with this type of stuff, any idea WTF is happening? 👀

@kkrypt0nn
Copy link
Author

This is like better than Netflix. Invite your friends over 😄

Fully agree - if it was consistent it could potentially make sense, but right now from what I've seen is that it's fully random and if you spam failed query lots of times, at some point it will eventually succeed:

image

Within the docker it's a bit different but the same sort of happens:

image

image

Code used

import Database from 'better-sqlite3';

function doIt(i) {
  const db = new Database(':memory:');

  db.exec(`
    CREATE TABLE "satellite" ("id" varchar PRIMARY KEY NOT NULL, "name" varchar(255) NOT NULL, "description" varchar(255) NOT NULL, CONSTRAINT "UQ_ebcb5f391c90d59b38c2adecb13" UNIQUE ("name"));

    CREATE TABLE "user" ("id" varchar PRIMARY KEY NOT NULL, "username" varchar(255) NOT NULL, CONSTRAINT "UQ_78a916df40e02a9deb1c4b75edb" UNIQUE ("username"));

    CREATE TABLE "user_satellites_satellite" ("userId" varchar NOT NULL, "satelliteId" varchar NOT NULL, CONSTRAINT "FK_058e916625b4d0e5d920c62f2bb" FOREIGN KEY ("userId") REFERENCES "user" ("id") ON DELETE CASCADE ON UPDATE CASCADE, CONSTRAINT "FK_f3762d755df5bbba15af5507822" FOREIGN KEY ("satelliteId") REFERENCES "satellite" ("id") ON DELETE CASCADE ON UPDATE CASCADE, PRIMARY KEY ("userId", "satelliteId"));
  `);

  const replace = "x".repeat(i);
  const searchQuery = `${replace}') UNION SELECT tbl_name, CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR) FROM sqlite_master; --`;
  const query = `SELECT *
               FROM user_satellites_satellite
                  FULL JOIN satellite ON user_satellites_satellite.satelliteId = satellite.id
               WHERE user_satellites_satellite.userId = '630633ba-030c-4967-acd4-d790bf26cfda'
               AND (satellite.name LIKE '%${searchQuery}%' OR satellite.description LIKE '%${searchQuery}%');`;

  try {
    db.prepare(query);
  } catch (err) {
    console.error("FAIL, attempting 50 times the same query and hoping for success...");
    for (let index = 1; index <= 50; index++) {
      try {
        db.prepare(query);
        console.log(`Success after attempt #${index}`)
        break
      } catch {}
    }
  }
  db.close();
}

for (let index = 0; index < 50; index++) {
  doIt(index);
}

@Prinzhorn
Copy link
Contributor

FWIW ChatGPT is entirely useless telling me for the third time that I need to escape searchQuery because I'm apparently using two statements which better-sqlite3 does no support. Or it insists there might be race conditions between multiple processes or database connections after repeatedly being told it's a single-use in-memory database.

@arimah
Copy link
Contributor

arimah commented Apr 24, 2023

Hello, fellow developers! I happened upon this most fascinating of issue, which intrigued me so much I elected to abandon an hour of perfectly good sleep in favour of debugging the crap out of this.

The first thing I did was to insert a bunch of ugly printf statements directly into better-sqlite3.cpp, which mostly led me nowhere. However, after many twists and turns, I made the following modification, allowing me to see every single character processed as part of the query tail:

                 for (char c; (c = *tail); ++tail) {
+                        printf("%c", c);
                         if (IS_SKIPPED(c)) continue;
                         if (c == '/' && tail[1] == '*') {
                                 tail += 2;
+                                printf("*");
                                 for (char c; (c = *tail); ++tail) {
                                         printf("%c", c);
                                         if (c == '*' && tail[1] == '/') {
                                                 tail += 1;
                                                 break;
                                         }
                                 }
                         } else if (c == '-' && tail[1] == '-') {
                                 tail += 2;
+                                printf("-");
                                 for (char c; (c = *tail); ++tail) {
                                         printf("%c", c);
                                         if (c == '\n') break;
                                 }
                         } else {
                                 sqlite3_finalize(handle);
+                                printf("\n");
                                 return ThrowRangeError("The supplied SQL string contains more than one statement");
                         }
                 }
+                printf("\n");

And in my trusty terminal, I began to see something... intriguing. Something unexpected. A pattern of noise just past the last character emerges:

image

Is this an out-of-bounds read? Surely, v8::String::Utf8Value cannot possibly produce a string without zero terminator. But what if... Ah. It is as clear as day now. The nested loops are at fault! Let us consider a simplified double-loop:

for (char a; (a = *tail); ++tail) {
    for (char b; (b = *tail); ++tail) {
      whatever(c);
    }
}

When the condition (b = *tail) is false, then we know tail points to a zero byte. So far, so good. This is how we detect the end of the string, of course. The inner loop thus exits, and we proceed to the next iteration of the outer loop. The outer loop performs its increment: ++tail... and the world ceases to be beautiful. Now tail points almost certainly to garbage. The once innocent *tail invokes behaviour of an undefined nature. Probably. Certainly the trusty processor reads something, which it usually interprets as a query character.

The solution to this loopy confusion requires more thought than my 1am brain can produce. After some sleep, perhaps a pull request will present itself...

@Prinzhorn
Copy link
Contributor

Prinzhorn commented Apr 25, 2023

It's nice to see the chosen one show up in times of need 🙏

@mceachen
Copy link
Member

Let us consider a simplified double-loop

Your explanation: 🧑‍🍳 🤌 💯

Thanks for digging into this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

4 participants