NewBudgetExceededError


(Chris Chambers) #1

Hi,

We’re running a v2.0.0 3-node cluster (replication factor 3, 4 cores and 16GiB per node) as a StatefulSet in Kubernetes on AWS. We’re seeing unusual memory behavior, something like this:

In the console, each node reports that its memory usage is 300-600MB, but after 10-15 minutes under mild load, queries start to fail with NewBudgetExceededError:

“memory budget exceeded: 3471360 bytes requested, 8588801296 currently allocated, 8589934592 bytes in budget”

We currently have --max-sql-memory set to 50% (it was 25% before, with the same error). I’m not sure what other information to include, but I’m happy to provide whatever is needed.
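
For reference, here’s roughly how we start the nodes (a sketch; everything other than --max-sql-memory is trimmed or illustrative):

```sh
# Start command inside the StatefulSet (hostnames illustrative, most flags omitted)
cockroach start \
  --max-sql-memory=50% \
  --cache=25% \
  --join=cockroachdb-0.cockroachdb,cockroachdb-1.cockroachdb,cockroachdb-2.cockroachdb
```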

Any ideas about what could cause this? In particular, the discrepancy between the reported memory usage and the amount of memory reported in the budget-exceeded message seems unusual.

Thanks in advance,

Chris


(Rebecca Taft) #2

Hi @chris-chambers,

Sorry you’re running into these issues. Where are you seeing the memory usage reported as 300-600 MB?

Also, have you followed all the recommendations in https://www.cockroachlabs.com/docs/stable/kubernetes-performance.html and https://www.cockroachlabs.com/docs/stable/orchestrate-cockroachdb-with-kubernetes.html? That’s probably the first place to start. I don’t see anything wrong with your configuration based on the information you’ve included above.

– Becca


(Chris Chambers) #3

The memory report is from the Cockroach console. I’m reading through the other links you sent now, to make sure we’re doing everything they describe.

Thanks for the quick reply!


(Alex Robinson) #4

I don’t think that looking at the Kubernetes configurations is likely to be of much help here, since the memory budget reported in the log message is correct given the node size and --max-sql-memory setting @chris-chambers stated in the original question.
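
Concretely, assuming 16GiB nodes and --max-sql-memory=50%, the budget works out exactly:

```sh
# 50% of 16 GiB, in bytes — matches the "8589934592 bytes in budget" in the error
echo $((16 * 1024**3 / 2))   # prints 8589934592
```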

What’s more likely to be of interest is information about the queries being run and the size of the data that they’re operating over.

cc @knz as an expert on memory budget concerns.


(Chris Chambers) #5

The Kubernetes dashboard reports similar memory usage to the Cockroach dashboard.


(Chris Chambers) #6

The size of the dataset is quite small: 105MiB for the whole database, split across four tables of roughly 73MiB, 27MiB, 2MiB, and 1MiB.

The queries haven’t been optimized, so they may well be the culprit. Let me find out whether I’ll be able to share the queries as-is, or whether I’ll need to obfuscate them.
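
In the meantime, we’ve started looking at the query plans ourselves, along these lines (the table and predicate here are placeholders, not our real schema):

```sh
# Show the plan for one of the suspect queries (names are placeholders)
cockroach sql --insecure -e "EXPLAIN SELECT * FROM orders WHERE customer_id = 42;"
```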


(Raphael 'kena' Poss) #7

Hi Chris,

Thanks for bringing this issue to our attention. Either the queries you are sending to CockroachDB take a pathological code path that causes a lot of RAM to be allocated, or (more likely) there is a leak in the memory accounting.

How many different queries are you sending? Do you reckon it would be possible to share your schema and queries with us? (The dataset itself would be less important.)


(Chris Chambers) #8

Hi Raphael,

I’ve got permission to send you the schema and queries. Would the output of cockroach debug zip cover the schema? Or would you prefer CREATE statements? We’re getting the queries into a more readable form now.
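
Here’s what I was planning to run for each option (the table name is a placeholder):

```sh
# Option 1: full diagnostic bundle
cockroach debug zip ./cockroach-debug.zip --insecure

# Option 2: CREATE statement per table (placeholder name shown)
cockroach sql --insecure -e "SHOW CREATE TABLE orders;"
```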

Is there a non-public channel I can send these artifacts to you on?

Thanks,

Chris


(Chris Chambers) #9

In the meantime, should I upgrade to v2.0.1? Or would that just muddy the waters?


(Raphael 'kena' Poss) #10

Let me get back to you on the non-public channel. Upgrading to 2.0.1 should be fine; I don’t see it influencing the situation.