K-ary heapsort: more comparisons, less memory traffic

The impetus for this post was a max heap routine I had to write because libc, unlike the STL, does not support incremental or even partial sorting. After staring at the standard implicit binary heap for a while, I realised how to generalise it to arbitrary arity. The routine will be used for medium size elements (a couple dozen bytes) and with a trivial comparison function; in that situation, it makes sense to implement a high arity heap and perform fewer swaps in return for additional comparisons. In fact, the trade-off is interesting enough that a heapsort based on this routine is competitive with BSD and glibc sorts. This post will present the k-ary heap code and explore the impact of memory traffic on the performance of a few classical sort routines. The worst performing sorts in BSD’s and GNU’s libc overlook swaps and focus on minimising comparisons. I argue this is rarely the correct choice, although our hands are partly tied by POSIX.

Continue reading

comments (4)

Enable Your Python Developers by Making “Code Investments”

Enable Your Python Developers by Making “Code Investments”

Note: portions of this post appeared on my personal blog under the title “Supercharge Your Python Developers

I think it’s safe to say that a project’s inception is the best, indeed perhaps only, opportunity to influence the quality of the code for years from now. Many (most?) projects are started without much direction; code simply springs into being and is put under version control. Making a series of thoughtful, upfront “investments,” however, can pay large dividends. In this post, I’ll describe investments I made at the start of a project that allowed a Python novice to quickly write concise, idiomatic, and well-tested code.

Continue reading

comments (0)

Hash Set versus Dense Hash

During the development of the Concurrency Kit hash set and hash table, detailed microbenchmarks were used to measure latency and variability of various operations in relation to various open-source hash table implementations. For read-mostly workloads, the implementation was at least twice as fast than Google Dense Hash on reference machines even though it provides stronger forward-progress guarantees for concurrent workloads. For example, it is lock-free for readers and wait-free for writers in single-writer many-reader use-cases. However, a recent use-case required the hash table implementations to handle delete-heavy workloads. As many open-addressing schemes, the implementation failed miserably in this workload due to tombstone accumulation. The strength of the collision resolution mechanism would very quickly lead to the complete elimination of probe sequence termination markers (empty slots) in favor of tombstones. Google Dense Hash performed in a stable manner with the delete-heavy workloads at the cost of a higher operation completion floor for more favorable workloads (due to increased clumping of the quadratic probing it uses). The performance gap for delete-heavy workloads has since then been closed with some trade-offs.

Continue reading

comments (1)

API Rearchitecture Series – The Juicy Details

In the previous post, my esteemed colleague and sometimes friend wrote about an epic quest that the API team is undertaking. I wanted to take a few moments to explain some problems we have had in our current system, what our new architecture will be, and how it will solve our problems. This blog post will talk mostly about what happens during the lifetime of a REST request and will ignore some of the dependencies for simplicity.

Continue reading

comments (3)

API Rearchitecture Series – Breaking down barriers for more powerful platform

This month, the Web Services team is embarking on a revolutionary change: a rearchitecture of how our platform operates. Upon completion, our clients will have access to powerful new features that will enable them to derive even greater value from building on top of our platform.

We want to share the details and experience of our technical development process. We’re currently working through the proof of concept phase and very soon will be developing our prototype. We believe the challenges we’re solving and the roadblocks we overcome are highly interesting and think others would be interested as well.

Continue reading

Tagged | comments (1)