Categories
Uncategorized

revdoor

revdoor is a single-file C++ library for visiting revolving door combinations.

The combinations without replacement generator implements Algorithm R from TAOCP 7.2.1.3 [1]. The combinations with replacement generator implements the same algorithm, modified to support replacement.

The algorithms visit combinations by indicating at most two pairs of items to swap in and out on each iteration.

The source code is available on GitHub:
https://github.com/dstein64/revdoor

Categories
Uncategorized

SMAWK in C++

I recently implemented kmeans1d—discussed in a prior post—for efficiently performing globally optimal 1D k-means clustering. The implementation utilizes the SMAWK algorithm (Aggarwal et al., 1987), which calculates argmin(i) for each row i of an arbitrary n × m totally monotone matrix, in O(m(1 + lg(n/m))).

I’ve factored out my SMAWK C++ code into the example below. In general, SMAWK works with an implicitly defined matrix, utilizing a function that returns a value corresponding to an arbitrary position in the matrix. An explicitly defined matrix is used in the example for the purpose of illustration.

The program prints the column indices corresponding to the minimum element of each row in a totally monotone matrix. The matrix is from monge.pdf—a course document that I found online.

Categories
Uncategorized

kmeans1d: Globally Optimal Efficient 1D k‑means Clustering

I implemented kmeans1d, a Python library for performing k-means clustering on 1D data, based on the algorithm from Xiaolin (1991), as presented by Grønlund et al. (2017, Section 2.2).

Globally optimal k-means clustering is NP-hard for multi-dimensional data. Lloyd’s algorithm is a popular approach for finding a locally optimal solution. For 1-dimensional data, there are polynomial time algorithms.

kmeans1d contains an O(kn + n log n) dynamic programming algorithm for finding the globally optimal k clusters for n 1D data points. The code is written in C++—for faster execution than a pure Python implementation—and wrapped in Python.

The source code is available on GitHub:
https://github.com/dstein64/kmeans1d