CockroachDB as File Store

Hi!

I’ve been searching for information in the docs (the BYTES data type section), but I’m a bit confused.

Scenario:

  1. I’m planning to use CockroachDB to store files I receive from an FTP server. The biggest files are capped at 10 MB. Most files are XML and average about 15 KB. The biggest files are text files (column-delimited, with lots of spaces inside). A few files are zip/rar/7zip/gz/tar archives.
  2. Most files are plain text and compress very well with any algorithm (better than 10:1 in some cases).
  3. After some processing, we’ll record a log entry linked to each file.
  4. After 30–90 days, I’ll delete the files and their logs.
  5. We process almost 25,000 files per day, 24×7.
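To sanity-check the footprint implied by the numbers above, here is a rough back-of-the-envelope estimate (the figures come straight from the scenario; treating the full 90-day window as the worst case, and CockroachDB's default 3-way replication, are my assumptions):

```python
# Rough storage estimate for the scenario above.
# Figures from the post: ~25,000 files/day, ~15 KB average size.
FILES_PER_DAY = 25_000
AVG_SIZE_BYTES = 15 * 1024
RETENTION_DAYS = 90          # worst case of the 30-90 day window (assumption)
REPLICATION_FACTOR = 3       # CockroachDB's default replication (assumption)

live_bytes = FILES_PER_DAY * AVG_SIZE_BYTES * RETENTION_DAYS
replicated_bytes = live_bytes * REPLICATION_FACTOR

print(f"logical data:    {live_bytes / 1024**3:.1f} GiB")     # ~32.2 GiB
print(f"with 3 replicas: {replicated_bytes / 1024**3:.1f} GiB")  # ~96.6 GiB
```

So the average-size files alone are on the order of tens of gigabytes at steady state, before counting the larger text files or MVCC versions kept within the GC TTL.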

My questions are:

  1. Is 5 MB too much for geo-replicated data (3 datacenters, at most 10 ms network latency between them, on 20 Mbps full-duplex links)? Should I expect problems?
  2. Does CockroachDB compress files by itself (easier for me), or should I compress them before sending (a bit harder, since I’d have to touch several systems)?
  3. Does CockroachDB support text indexes over BYTES columns for searching for certain blocks of text?

Thanks in advance!

ER

Hi! Thanks for trying out CRDB.

  1. CockroachDB is not well-suited for storing large blobs like that. Our docs recommend keeping the size to under 1MB. Larger values are possible, but the total size of a row (including all columns, and all versions within the GC TTL) must not exceed the maximum range size (64MB by default).
  2. CockroachDB does compress the data that is stored on disk.
  3. No, there isn’t support for full-text search. See the tracking issue “sql: Full Text Search” (cockroachdb/cockroach#7821) on GitHub.
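On point 2: CockroachDB compresses data at the storage layer, but compressing on the client before the INSERT also shrinks what travels over the 20 Mbps links and what counts toward row size. A minimal sketch using Python's standard gzip module (the repetitive sample text is made up to mimic the column-delimited files described above):

```python
import gzip

# Hypothetical record mimicking a column-delimited text file with many spaces.
record = "FIELD1    FIELD2        FIELD3   2024-01-01   OK\n"
payload = (record * 2000).encode("ascii")

compressed = gzip.compress(payload, compresslevel=6)
ratio = len(payload) / len(compressed)

# Highly repetitive plain text easily exceeds the 10:1 ratio mentioned above.
print(f"{len(payload)} -> {len(compressed)} bytes (ratio {ratio:.0f}:1)")

# Round-trip check before handing the bytes to a BYTES column.
assert gzip.decompress(compressed) == payload
```

The compressed bytes would then be stored as-is; the application is responsible for decompressing on read.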

Thank you! You cleared all my doubts.
Clearly I was going with the wrong solution: my plan B is the better way, which is to keep the files on the filesystem and use microservices to manage them as I would with an object store.
A million thanks! You saved me hours of headaches.
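For what it’s worth, a minimal sketch of that plan B: write the blob to the filesystem and keep only a small metadata record for the database. Everything here (the helper name, the field names, the sharding scheme) is hypothetical, and the actual INSERT is left out:

```python
import hashlib
import os
import time

def store_file(root: str, name: str, data: bytes) -> dict:
    """Write the blob to disk and return the metadata record that a
    microservice would insert into the database (hypothetical schema:
    name, path, size, checksum, timestamp)."""
    digest = hashlib.sha256(data).hexdigest()
    # Shard by checksum prefix so no single directory accumulates
    # 25,000 files per day.
    path = os.path.join(root, digest[:2], digest)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
    return {
        "name": name,
        "path": path,
        "size": len(data),
        "sha256": digest,
        "stored_at": time.time(),
    }
```

With this layout, the 30–90 day cleanup becomes a query over `stored_at` followed by an `os.remove` for each expired path.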