Very slow inserts on Azure

Hello,

I might need a little help with a VERY slow test program I created to insert 1000 rows into a CockroachDB database.

I have created a CockroachDB cluster with 3 nodes running on Azure under Ubuntu (VM size was Standard_D4s_v3: 4 vCPUs, 16 GB RAM, and SSD storage).

After that I wrote a program to insert 1000 rows into a table with 2 columns, which took around 17 to 18 seconds! That is less than 60 inserts/sec. I tried many different setups (different VM sizes, even a single CockroachDB node), but there was no noticeable difference, which led me to believe that there might be something wrong with my test program.

I also ran the program against a PostgreSQL database (hosted on an Azure VM as well) and it only took around 200 to 300 milliseconds, so around 80 times faster.

My test program creates one string with all insert statements and sends it to the database. I have also tried using SqlCommands with parameters and sending each insert separately, but I got similar results.

I used CockroachDB version 1.1.5. I also read the “Recommended Production Settings” docs and tried different --cache and --max-sql-memory settings, but the results did not really change.
Is there anything else I could try?

This is my program:

using (var conn = new NpgsqlConnection(connString))
{
	conn.Open();
	// Insert some data
	using (var trans = conn.BeginTransaction(System.Data.IsolationLevel.Snapshot))
	{
		using (var cmd = new NpgsqlCommand())
		{
			cmd.Connection = conn;
			var sb_sql = new StringBuilder();
			for (int counter = 1; counter <= 1000; counter++)
			{
				sb_sql.Append("Insert into logentry (id, caption) values ('");
				sb_sql.Append(Guid.NewGuid().ToString());
				sb_sql.Append("', 'Text for our log entry ");
				sb_sql.Append(counter.ToString());
				sb_sql.AppendLine("');");
			}
			cmd.CommandText = sb_sql.ToString();
			cmd.ExecuteNonQuery();
		}
		trans.Commit();
	}
}

Change this:

for (int counter = 1; counter <= 1000; counter++)
{
	sb_sql.Append("Insert into logentry (id, caption) values ('");
	sb_sql.Append(Guid.NewGuid().ToString());
	sb_sql.Append("', 'Text for our log entry ");
	sb_sql.Append(counter.ToString());
	sb_sql.AppendLine("');");
}

to this:

sb_sql.Append("Insert into logentry (id, caption) values ");
var sep = "";
for (int counter = 1; counter <= 1000; counter++)
{
    sb_sql.Append(sep);
    sb_sql.Append("('");
    sb_sql.Append(Guid.NewGuid().ToString());
    sb_sql.Append("', 'Text for our log entry ");
    sb_sql.Append(counter.ToString());
    sb_sql.AppendLine("')");
    sep = ", ";
}
sb_sql.Append(";");

Also see https://www.cockroachlabs.com/docs/stable/insert.html#insert-multiple-rows-into-an-existing-table.
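If you also want to keep using parameters (and avoid concatenating values directly into the SQL string), the same multi-row approach works with placeholders. Here is an untested sketch along those lines; it assumes the same logentry table and connection string as the program above, and the placeholder names (@id1, @caption1, …) are just illustrative:

```csharp
using System;
using System.Text;
using Npgsql;

class MultiRowInsertSketch
{
    static void Main()
    {
        var connString = "Host=localhost;Username=root;Database=test"; // placeholder, adjust to your cluster
        using (var conn = new NpgsqlConnection(connString))
        {
            conn.Open();
            using (var cmd = new NpgsqlCommand())
            {
                cmd.Connection = conn;
                // Build one INSERT with 1000 value tuples, each tuple bound via parameters.
                var sb = new StringBuilder("INSERT INTO logentry (id, caption) VALUES ");
                for (int counter = 1; counter <= 1000; counter++)
                {
                    if (counter > 1) sb.Append(", ");
                    sb.AppendFormat("(@id{0}, @caption{0})", counter);
                    cmd.Parameters.AddWithValue("id" + counter, Guid.NewGuid().ToString());
                    cmd.Parameters.AddWithValue("caption" + counter, "Text for our log entry " + counter);
                }
                cmd.CommandText = sb.ToString();
                cmd.ExecuteNonQuery(); // still a single statement, so the batching benefit is kept
            }
        }
    }
}
```

This keeps the single-statement shape that makes the multi-row insert fast, while letting the driver handle quoting.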

Does this help?

Thank you! This is kind of embarrassing; I should have thought of that myself.

The performance is now very close to that of PostgreSQL (close enough for me).

I still have one question though: is there actually a difference between the “multiple row insert” and the “single row insert”, other than performance? Since the difference is so big, I suspect it has something to do with disk/SSD access, which seems odd when all the inserts are in one transaction. Is this something that might be optimized in a future version?

No.

No, it has to do with the fact that when you have just one statement, CockroachDB can batch the synchronization, so it avoids network round trips.

It can only be optimized if you can somehow change the laws of physics to make the speed of light (and electricity) infinite. The problem is not disk I/O but network latency: each statement pays roughly one synchronization round trip, so 1000 statements at roughly 17 ms each add up to the ~17 seconds you measured, while a single multi-row statement pays that cost once.

Okay, thank you for the clarification.

Although this seems kind of strange to me: if it is caused by network latency, shouldn't there be a notable performance difference between a one-node cluster and a three-node cluster? (That was not the case in my tests.)