
Exploding .wt collection file #32307

Open
andrew-thought opened this issue Apr 24, 2024 · 7 comments

Comments

@andrew-thought

Description:

I have a WiredTiger file in /var/snap/rocketchat-server/common that grows by 10-20GB per day and is currently at 100GB. Our server is constantly running out of disk space.

I would like to trace what is happening here. Our group is not uploading large files and we do not have a large team.

Steps to reproduce:

  1. Navigate to /var/snap/rocketchat-server/common
  2. Check large files in directory
  3. See large .wt file -> -rw------- 1 root root 108925952000 Apr 24 00:27 collection-48--77318xxx37791.wt

Expected behavior:

Reasonable database and .wt file sizes, given the amount of data we put into Rocket.Chat

Actual behavior:

Exploding .wt file size

Server Setup Information:

  • Version of Rocket.Chat Server: 6.7.0
  • Operating System: Ubuntu 18.04
  • Deployment Method: Snap
  • Number of Running Instances: 1
  • DB Replicaset Oplog:
  • NodeJS Version:
  • MongoDB Version:

Client Setup Information

  • Desktop App or Browser Version: Brave 1.64.122
  • Operating System: Debian 12

Additional context

Relevant logs:

@andrew-thought
Author

Traced it to rocketchat_userDataFiles.chunks.
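
For anyone else trying to map an oversized collection-*.wt file back to its collection, here is a rough mongosh sketch, assuming the database is named rocketchat (collection stats expose the backing file name in wiredTiger.uri; not an official procedure, just what worked for inspection):

// Sketch only: print each collection's storage size and backing WiredTiger file.
// The "uri" value ends in the collection-NN-... name of the .wt file on disk.
const rc = db.getSiblingDB('rocketchat');
rc.getCollectionNames().forEach((name) => {
    try {
        const stats = rc.getCollection(name).stats();
        print(name + '  ' + stats.storageSize + ' bytes  ' + (stats.wiredTiger ? stats.wiredTiger.uri : ''));
    } catch (e) {
        // views and some system collections do not support collStats; skip them
    }
});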

@andrew-thought
Author

andrew-thought commented Apr 24, 2024

Could be a failed upload? Is there a way to clear out the userDataFiles and userDataFiles.chunks?

@reetp

reetp commented May 8, 2024

Are you using GridFS?

It really isn't suited to storing large amounts of files.

You should migrate to local file storage or some form of online storage.

https://github.com/RocketChat/filestore-migrator

@andrew-thought
Author

I found the issue: one of the users had started a download of their user content from the server. The download failed, but it was stuck in the cron list and kept re-initiating every 2 minutes. I believe the .wt file contained some portion of the failed zip(?) file, which repeatedly added 250MB every 2 minutes to userDataFiles, userDataFiles.chunks, and that .wt file. I deleted the cron job and the failed compressed images directly from the database in userDataFiles and userDataFiles.chunks, and that resolved the issue.

So possibly this issue is really related to the failed user content download, but it can be closed otherwise.
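
If anyone hits the same pattern, a rough mongosh sketch for spotting which user's export records are piling up (userId and size are the field names shown elsewhere in this thread; adjust as needed):

// Sketch only: group export records by user and sum the reported sizes
// to see which user's failed export keeps being re-queued.
db.rocketchat_user_data_files.aggregate([
    { $group: { _id: '$userId', files: { $sum: 1 }, totalBytes: { $sum: '$size' } } },
    { $sort: { totalBytes: -1 } },
    { $limit: 10 }
]);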

@reetp

reetp commented May 9, 2024

Ah excellent and thanks for letting us know!

I'll leave it open for the moment and ask someone to consider this.

@reetp reetp added the type: bug label May 9, 2024
@david-uhlig

We experienced the same issue. This added >150GB of data within 24h until our server ran out of space.

Every ~2 minutes, a new document was added to rocketchat_user_data_files that looked like this:

{
    _id: 'qcg7KSZGJWqnyPzMn',
    userId: 'SZJmM6FCEnm9W3PL3',
    type: 'application/zip',
    size: 353507086,
    name: '2024-05-15-John%20Doe-qcg7KSZGJWqnyPzMn.zip',
    store: 'GridFS:UserDataFiles',
    _updatedAt: ISODate('2024-05-15T09:36:02.597Z')
}

Also, corresponding documents were added to rocketchat_userDataFiles.chunks and rocketchat_userDataFiles.files. All the added data was related to the same user, who claims he did not upload or download anything. Our server is set up to accept uploads up to ~10MB, yet the size in the document is ~350MB. We were not able to reproduce the issue after fixing it on the MongoDB side.
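
To confirm which GridFS files actually account for the growth, a rough mongosh sketch (standard GridFS field names; $binarySize requires MongoDB 4.4+, so it should be available on 5.0.24):

// Sketch only: sum stored chunk bytes per GridFS file id, largest first.
db.rocketchat_userDataFiles.chunks.aggregate([
    { $group: { _id: '$files_id', chunks: { $sum: 1 }, bytes: { $sum: { $binarySize: '$data' } } } },
    { $sort: { bytes: -1 } },
    { $limit: 10 }
], { allowDiskUse: true });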

We were able to fix the issue by performing the following steps:

  1. Shut down the Rocket.Chat container: docker compose stop rocketchat
  2. Enter the mongo shell: docker compose exec -it mongodb mongosh
  3. Select the rocketchat database: use rocketchat
  4. Find the cause of the problem by looking at the data in rocketchat_user_data_files, rocketchat_userDataFiles.chunks and rocketchat_userDataFiles.files, e.g. db.rocketchat_user_data_files.find().pretty()
  5. For us, it was a single user with a fresh account, so we could simply select his userId. You may need to adjust for your case. You will also need some free disk space, as MongoDB will keep growing in size while deleting. You might need to execute the delete in batches and free up space with db.runCommand({compact: "rocketchat_userDataFiles.chunks", force: true}) in between (see the batched-deletion sketch after this list).
  6. Consider first creating an index on rocketchat_userDataFiles.chunks, otherwise deleting can take much longer:
db.rocketchat_userDataFiles.chunks.createIndex({"files_id": 1})
  7. Delete the files from the database:
const userId = 'identified-user-id'

var cursor = db.rocketchat_user_data_files.find({userId: userId})

while (cursor.hasNext()) {
	var file = cursor.next();
	db.rocketchat_userDataFiles.chunks.deleteMany({"files_id": file._id });
	db.rocketchat_userDataFiles.files.deleteOne({"_id": file._id });
}

db.rocketchat_user_data_files.deleteMany({userId: userId});
  8. Compact the collection to free up disk space: db.runCommand({compact: "rocketchat_userDataFiles.chunks", force: true})
  9. Drop the index: db.rocketchat_userDataFiles.chunks.dropIndex("files_id_1")
  10. Restart and recreate the Rocket.Chat container: docker compose up -d rocketchat --force-recreate
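
For reference, a minimal sketch of the batched approach mentioned in step 5, assuming the same collection names; 'identified-user-id' and the batch size are placeholders to adjust:

// Sketch only: delete GridFS chunks in batches, compacting between batches
// so the disk does not fill up before the cleanup finishes.
const userId = 'identified-user-id';
const fileIds = db.rocketchat_user_data_files
    .find({ userId: userId }, { _id: 1 })
    .toArray()
    .map(doc => doc._id);

const batchSize = 20; // files per batch, arbitrary
for (let i = 0; i < fileIds.length; i += batchSize) {
    const batch = fileIds.slice(i, i + batchSize);
    db.rocketchat_userDataFiles.chunks.deleteMany({ files_id: { $in: batch } });
    db.rocketchat_userDataFiles.files.deleteMany({ _id: { $in: batch } });
    // reclaim space before starting the next batch
    db.runCommand({ compact: 'rocketchat_userDataFiles.chunks', force: true });
}
db.rocketchat_user_data_files.deleteMany({ userId: userId });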

Server Setup Information:

  • Version of Rocket.Chat Server: 6.8.0
  • Operating System: Ubuntu 22.04
  • Deployment Method: Docker
  • Number of Running Instances: 1
  • DB Replicaset Oplog:
  • NodeJS Version:
  • MongoDB Version: 5.0.24

@david-uhlig

david-uhlig commented May 16, 2024

Just realized the data in these collections comes from the user data export on /account/preferences. So we might be fine just removing all data from the three mentioned collections?

However, since removing the excessive data with the process described above, the data export isn't processed anymore. We get the following error message in the logs, which possibly stems from the deleted data. It keeps repeating every few minutes:

rocketchat-1  | Error: ENOENT: no such file or directory, stat '/tmp/zipFiles/feefe5d3-7aa5-4c0f-b2c0-98517a42106f.zip'
rocketchat-1  |  => awaited here:
rocketchat-1  |     at Function.Promise.await (/app/bundle/programs/server/npm/node_modules/meteor/promise/node_modules/meteor-promise/promise_server.js:56:12)
rocketchat-1  |     at server/lib/dataExport/uploadZipFile.ts:12:16
rocketchat-1  |     at /app/bundle/programs/server/npm/node_modules/meteor/promise/node_modules/meteor-promise/fiber_pool.js:43:40
rocketchat-1  |  => awaited here:
rocketchat-1  |     at Function.Promise.await (/app/bundle/programs/server/npm/node_modules/meteor/promise/node_modules/meteor-promise/promise_server.js:56:12)
rocketchat-1  |     at server/lib/dataExport/processDataDownloads.ts:232:25
rocketchat-1  |     at /app/bundle/programs/server/npm/node_modules/meteor/promise/node_modules/meteor-promise/fiber_pool.js:43:40 {
rocketchat-1  |   errno: -2,
rocketchat-1  |   code: 'ENOENT',
rocketchat-1  |   syscall: 'stat',
rocketchat-1  |   path: '/tmp/zipFiles/feefe5d3-7aa5-4c0f-b2c0-98517a42106f.zip'
rocketchat-1  | }

Any way to fix this?

Edit: Simply creating an empty placeholder file inside the container does the trick and allows Rocket.Chat to continue, e.g. touch /tmp/zipFiles/feefe5d3-7aa5-4c0f-b2c0-98517a42106f.zip
