On January 1st, 2024 (New Year’s Day, now that’s dedication), Gunnar Morling published the One Billion Row Challenge on his blog. The challenge is to load and aggregate one billion rows using Java. It quickly took on a life of its own, and there are now several implementations in other languages and on various databases (Robin, Hubert, Francesco, among others). The data to aggregate is a list of temperature readings from weather stations.
I thought it would be fun to do the challenge using Azure Data Explorer (ADX), since I like ADX and have written blog posts about it. ADX is a fast, scalable, and highly available data analytics service, optimized for data exploration over large data volumes. It is a columnar store that is queried with Kusto Query Language (KQL), and it is a fully managed service that is part of the Azure platform.
So, in this blog post, you’ll see what I did to load and aggregate one billion rows. Spoiler alert: certain things didn’t work out as I had hoped, but that could be down to me being rusty with ADX more than anything else.