
async operations #525

Open
jimmywarting opened this issue Aug 13, 2022 · 2 comments

Comments


jimmywarting commented Aug 13, 2022

I know you can create a SQL instance from a Uint8Array and run operations on that,
but I wonder: is it possible to just load a plain File instead of a Uint8Array?

I would like to load a read-only database and be able to do:

// load file from <input type="file">
const file = input.files[0]

// Create a database from a file without reading everything into memory
const db = new SQL.Database(file)

// Prepare an sql statement
const stmt = db.prepare("SELECT * FROM sqlite_master")

// Bind values to the parameters and fetch the results of the query
const result = await stmt.getAsObject()
// note: using await

The idea would just be for sql.js to slice out the part of the file that it needs, without loading the whole thing into memory,
and instead do something like await file.slice(start, end).arrayBuffer() operations
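
A minimal sketch of what that read-on-demand could look like, using only the standard Blob API (readPage and pageSize are illustrative names here, not anything sql.js exposes today):

async function readPage (file, pageNumber, pageSize = 4096) {
  const start = pageNumber * pageSize
  // only the requested byte range ever gets pulled into memory
  const buf = await file.slice(start, start + pageSize).arrayBuffer()
  return new Uint8Array(buf)
}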


jimmywarting commented Aug 13, 2022

It would technically also be possible to support some kind of write operations without loading everything into memory, if you did something like:

// slice out the bytes before and after the region being overwritten
const pre = this._file.slice(start1, end1)
const middle = new Blob([ data ])
const end = this._file.slice(start2, end2)

// recompose the file from the three pieces
this._file = new Blob([ pre, middle, end ])

Creating new blobs like this with .slice() does not allocate any more memory, it only creates a new handle with an offset and a size property

image taken from: https://docs.google.com/presentation/d/1MOm-8kacXAon1L2tF6VthesNjXgx0fp5AP17L7XDPSM/edit

Just because you concatenate two of the same blob chunks with new Blob([a, a]) does not mean you will use 2x as much memory, it only means: read the same chunk twice
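
As a hedged sketch of that splice idea as a standalone helper (spliceBlob is a made-up name, not a sql.js or Blob API), it only re-composes handles, so no file data is copied until something actually reads the result:

function spliceBlob (file, start, end, data) {
  const pre = file.slice(0, start)       // handle to the bytes before the edit
  const middle = new Blob([ data ])      // the new bytes
  const post = file.slice(end)           // handle to the bytes after the edit
  return new Blob([ pre, middle, post ]) // still just handles with offsets and sizes
}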


twoxfh commented Aug 23, 2022

Anything is possible, but I think you're looking for a file API rather than a SQLite API, such as https://github.com/mailru/FileAPI. Using NDJSON or CSV format may also help.

Edit: not sure why this was thumbs-downed, maybe the use case needs to be clarified? The architectural concern might be wasm: you copy data into the wasm filesystem, perform the operations, and then copy back out. There are various projects that use IndexedDB, such as absurd-sql, which may offload the memory once loaded; wa-sqlite lets you write your own VFS, but it still requires the data/wasm tango. If this is all server-side, a WASI-based SQLite might be better, but your use case needs to be clearer and is likely beyond the scope of this project, especially since it's not async. As mentioned in other issues, until pthreads are supported, adopted, and used with wasm, async really isn't async.

I have loaded files of around 300 MB with sql.js over the Emscripten filesystem and then loaded them into a db in under 10 seconds, but that required multiple copies in memory and was fairly straightforward data.
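
For reference, here is roughly that in-memory flow as it looks with the documented sql.js API (the locateFile URL is just an example); it shows where the extra copies come from:

// initSqlJs comes from the sql.js dist script (or `import initSqlJs from "sql.js"`)
const SQL = await initSqlJs({ locateFile: f => `https://sql.js.org/dist/${f}` })
const buffer = await file.arrayBuffer()              // copy #1: whole file into the JS heap
const db = new SQL.Database(new Uint8Array(buffer))  // copy #2: into the wasm heap
const result = db.exec("SELECT * FROM sqlite_master")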
