`colortrans` is a Python library for transferring colors between images. I’ve written a document, colortrans.pdf, that describes the implemented algorithms.

The library is available on PyPI and can be installed with *pip*.

`$ pip3 install colortrans`

The source code and documentation are available on GitHub:

https://github.com/dstein64/colortrans
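As a rough, self-contained sketch of what color transfer involves, here is per-channel statistic matching in plain Python. This assumes mean and standard deviation matching (Reinhard-style transfer) is among the implemented algorithms; the library itself operates on whole images, typically in a decorrelated color space.

```python
import statistics

def match_stats(content, reference):
    """Shift and scale one channel so its mean and standard deviation
    match the reference channel. This is the statistic-matching step of
    Reinhard-style color transfer; a real implementation applies it to
    whole images rather than flat lists of values."""
    mu_c = statistics.mean(content)
    mu_r = statistics.mean(reference)
    sd_c = statistics.pstdev(content)
    sd_r = statistics.pstdev(reference)
    return [(x - mu_c) * (sd_r / sd_c) + mu_r for x in content]

print(match_stats([0, 10], [100, 120]))  # [100.0, 120.0]
```

After the transformation, the content channel has the reference channel’s mean and spread, which is what makes the recolored image take on the reference image’s overall palette.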

`nvim-scrollview` is a Neovim plugin that displays interactive scrollbars.
The scrollbars serve as a visual aid, which can be helpful in addition to the position information already provided in the status line. The main features are 1) handling for folds, 2) support for mouse dragging, and 3) partial transparency so that text is not covered. Scrollbar generation and refreshing work automatically.

The plugin is implemented primarily in Vimscript, but requires Neovim 0.5 for its `WinScrolled` event. Additionally, Neovim’s built-in support for Lua was utilized to speed up processing.

The source code—along with installation instructions—is available on GitHub:

https://github.com/dstein64/nvim-scrollview

As shown by the GitHub contributions chart below, development was mostly inactive for a few years following the initial release.

I’ve recently added additional features. Thanks to the users who suggested some of these!

The extension permits three levels of highlighting. As of `v2.1.0`, it’s possible to use the context menu to directly apply the desired level of highlighting, rather than cycling through the levels. Additionally, controls for *Global Highlighting* and *Autonomous Highlights*—both discussed below—are available on the context menu.

The menu can be accessed by right-clicking either 1) the Auto Highlight icon in the browser toolbar, or 2) the currently open web page. In either case, the icons are shown only on Firefox, as Chrome and Edge do not currently support context menu icons below the top level.

Within the options menu, there are now settings to control the color of highlighted text.

The *Autonomous Highlights* option toggles whether pages are highlighted upon loading, without requiring user interaction. *Delay* specifies how many seconds to wait—after loading a page—until autonomous highlighting is applied. *State* controls the amount of autonomous highlighting.

The *Blocklist* controls which pages do not receive autonomous highlights. In addition to URLs and hostnames, URL match patterns can also be added to the blocklist. Rather than *blocking* autonomous highlighting on certain pages, the blocklist can alternatively be used for *allowing* autonomous highlighting on certain pages (this functionality is documented on the options page under *Blocklist Information > Allowing Certain Pages*). Items can be added to the blocklist by either 1) manually specifying them on the options page, or 2) using the context menu to add specific URLs and hostnames.

There is a new option that toggles whether highlights are tinted based on predicted sentence importance. Lower importance sentences, which receive more tinting, are highlighted with subsequent clicks of the highlighter icon.

*Global Highlighting* can be used to set the corresponding highlight level on every tab. This is applied once, upon clicking the desired highlighting level from the context menu or options page.

Initially available only on the Chrome Web Store, *Auto Highlight* is now available from three browser add-on repositories.

- Chrome
- Edge
- Firefox

Additionally, as of Chrome OS 80, *Auto Highlight* is available for download on Chromebooks with Family Link.

`revdoor` is a single-file C++ library for visiting *combinations without replacement* and *combinations with replacement*. The *combinations without replacement* generator implements Algorithm R from TAOCP 7.2.1.3 [1]. The *combinations with replacement* generator implements the same algorithm, modified to support replacement.

The algorithms visit combinations by indicating at most two pairs of items to swap in and out on each iteration.

The source code is available on GitHub:

https://github.com/dstein64/revdoor

The following example program—which uses the `revdoor` single-file header library—visits combinations, indicating the swaps on each iteration. For the purpose of illustration, the full set of combination items is tracked and printed on each iteration.

An example usage is shown below.

```
$ ./example 5 3
init: 0,1,2
out: 1, in: 3 state: 0,2,3
out: 0, in: 1 state: 1,2,3
out: 2, in: 0 state: 0,1,3
out: 1, in: 4 state: 0,3,4
out: 0, in: 1 state: 1,3,4
out: 1, in: 2 state: 2,3,4
out: 3, in: 0 state: 0,2,4
out: 0, in: 1 state: 1,2,4
out: 2, in: 0 state: 0,1,4
```
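The revolving-door order itself can be sketched with a short recursion (in Python here, independent of `revdoor`, and using the recursive construction rather than the iterative Algorithm R): the list for `n` items is the list for `n - 1` items, followed by the reversed list of `(k - 1)`-combinations with item `n - 1` appended, which is what guarantees consecutive combinations differ by a single swap.

```python
def revolving_door(n, k):
    """k-combinations of range(n) in revolving-door (Gray code) order:
    consecutive combinations differ by one element swapped out, one in."""
    if k == 0:
        return [[]]
    if k == n:
        return [list(range(k))]
    # A(n, k) = A(n-1, k), then reversed A(n-1, k-1) with n-1 appended
    return (revolving_door(n - 1, k) +
            [c + [n - 1] for c in reversed(revolving_door(n - 1, k - 1))])

combos = revolving_door(5, 3)
prev = combos[0]
print('init:', ','.join(map(str, prev)))
for cur in combos[1:]:
    out = (set(prev) - set(cur)).pop()  # element swapped out
    new = (set(cur) - set(prev)).pop()  # element swapped in
    print(f'out: {out}, in: {new} state: {",".join(map(str, cur))}')
    prev = cur
```

Running this for `n = 5, k = 3` reproduces the swap sequence shown in the example output.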

[1] Donald Knuth, The Art of Computer Programming, Volume 4, Fascicle 3: Generating All Combinations and Partitions (Addison-Wesley Professional, 2005).

`vim-startuptime` is a plugin for viewing Vim startup event timing—reported in milliseconds. This can be helpful when trying to modify your configuration to improve Vim’s startup time.

- Launch `vim-startuptime` with `:StartupTime`.
- Press `<space>` on events to get additional information.
- Press `<enter>` on sourcing events to load the corresponding file in a new split.
- Access documentation with `:help vim-startuptime`.

The source code—along with installation instructions—is available on GitHub:

https://github.com/dstein64/vim-startuptime

`vim-win` is a plugin for managing windows, including 1) selecting windows, 2) swapping window buffers, and 3) resizing windows. Full functionality requires `vim>=8.2` or `nvim>=0.4.0`.

- Enter `vim-win` with `<leader>w` or `:Win`.
- Arrows or `hjkl` keys are used for movement.
- Change windows with movement keys or numbers.
- Hold `<shift>` and use movement keys to resize the active window.
- Press `s` or `S`, followed by a movement key or window number, to swap buffers.
- Press `?` to show a help message.
- Press `<esc>` to leave `vim-win`.
- Access documentation with `:help vim-win`.

The source code—along with installation instructions—is available on GitHub:

https://github.com/dstein64/vim-win

The following animated GIFs—generated with gifcast—show a sample of available profiles.

Here is the asciinema cast file used to generate the animated GIFs: profile_demo.cast

https://dstein64.github.io/gifcast/

The JavaScript source code is available on GitHub:

https://github.com/dstein64/gifcast

The example below was generated with *gifcast*.

Here is the asciinema cast file used to generate the animated GIF: gifcast.cast

I recently implemented pastiche—discussed in a prior post—for applying neural style transfer. I encountered a size limit when uploading the library to PyPI, as a package cannot exceed 60MB. The 32-bit floating point weights for the underlying VGG model [1] were contained in an 80MB file. My package was subsequently approved for a size limit increase that could accommodate the VGG weights as-is, but I was still interested in compressing the model.

Various techniques have been proposed for compressing neural networks—including distillation [2] and quantization [3,4]—which have been shown to work well in the context of classification. My problem was in the context of style transfer, so I was not sure how model compression would impact the results.

I decided to experiment with weight quantization, using a scheme where I could store the quantized weights on disk, and then uncompress the weights to full 32-bit floats at runtime. This quantization scheme would allow me to continue using my existing code after the model is loaded. I am not targeting environments where memory is a constraint, so I was not particularly interested in approaches that would also reduce the model footprint at runtime. I used kmeans1d—discussed in a prior post—for quantizing each layer’s weights.
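To make the scheme concrete, here is a toy sketch in plain Python, with a crude rounding-based codebook standing in for kmeans1d centroids and lists standing in for weight tensors:

```python
# Hypothetical stand-in for the real pipeline: the codebook here is built
# by rounding, whereas the actual approach uses kmeans1d centroids.
weights = [0.12, -0.51, 0.13, 0.94, -0.49, 0.11]

codebook = sorted(set(round(w, 1) for w in weights))  # stand-in centroids
codes = [min(range(len(codebook)), key=lambda i: abs(codebook[i] - w))
         for w in weights]  # small integers, storable as uint8 on disk

# At load time, expand the codes back to full floats with a table lookup.
restored = [codebook[c] for c in codes]
print(codebook)  # [-0.5, 0.1, 0.9]
print(codes)     # [1, 0, 1, 2, 0, 1]
print(restored)  # [0.1, -0.5, 0.1, 0.9, -0.5, 0.1]
```

Only the codes and the small codebook need to be stored; decompression is a table lookup, after which the rest of the code sees ordinary floating point weights.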

Before I implemented support for loading a quantized VGG model, I first ran experiments to see how different levels of compression would impact style transfer. I did not conduct extensive experiments—just a few style transfers at different levels of compression. `quantize.py` creates updated VGG models with simulated quantization, and `quantized_pastiche.sh` runs style transfer using the updated VGG models. These scripts are in a separate branch I created for the experiments.

The images at the top of this post were generated with Edvard Munch’s *The Scream* and a photo I took at the Pittsburgh Zoo in 2017. The images below were generated with Vincent van Gogh’s *The Starry Night* and a photo I took in Boston in 2015. The image captions indicate the compression rate of the VGG model used for the corresponding style transfer.

| 32-bit float (no quantization) | 8-bit | 7-bit |
| --- | --- | --- |
| 6-bit | 5-bit | 4-bit |
| 3-bit | 2-bit | 1-bit |

I originally decided to compress the model using 6-bit weights, and ran a few additional style transfers to check the quality at this compression level. I modified the code to generate and load VGG models with weights quantized to arbitrary bit widths. Unfortunately, my implementation had a noticeable effect on latency when loading the model, taking almost twenty seconds for a model with weights compressed to 2 bits (I didn’t test for other compression rates, but larger bit widths would presumably take longer).

I subsequently decided to quantize the weights to 8 bits instead of 6 bits, since this allowed for fast processing using PyTorch’s built-in `uint8` type. The VGG file size decreased from 80MB to 20MB—8-bit codes take a quarter the space of 32-bit floats—well within the 60MB PyPI limit that I originally encountered. Loading the quantized model takes less than 1 second.

[1] Simonyan, Karen, and Andrew Zisserman. “Very Deep Convolutional Networks for Large-Scale Image Recognition.” ArXiv:1409.1556 [Cs], September 4, 2014. http://arxiv.org/abs/1409.1556.

[2] Hinton, Geoffrey, Oriol Vinyals, and Jeff Dean. “Distilling the Knowledge in a Neural Network.” ArXiv:1503.02531 [Cs, Stat], March 9, 2015. http://arxiv.org/abs/1503.02531.

[3] Vanhoucke, Vincent, Andrew Senior, and Mark Z. Mao. “Improving the Speed of Neural Networks on CPUs.” In Deep Learning and Unsupervised Feature Learning Workshop, NIPS 2011.

[4] Han, Song, Huizi Mao, and William J. Dally. “Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding.” ArXiv:1510.00149 [Cs], October 1, 2015. http://arxiv.org/abs/1510.00149.

SMAWK is an algorithm that computes `argmin(`*i*`)`—the column index of the minimum element—for each row *i* of an `n × m` totally monotone matrix, in `O(m(1 + lg(n/m)))` time.
I’ve factored out my SMAWK C++ code into the example below. In general, SMAWK works with an *implicitly* defined matrix, utilizing a function that returns a value corresponding to an arbitrary position in the matrix. An *explicitly* defined matrix is used in the example for the purpose of illustration.

The program prints the column indices corresponding to the minimum element of each row in a totally monotone matrix. The matrix is from monge.pdf—a course document that I found online.


```cpp
#include <functional>
#include <iostream>
#include <numeric>
#include <vector>
#include <unordered_map>

using namespace std;

typedef unsigned long ulong;

/*
 * Internal implementation of the SMAWK algorithm.
 */
template <typename T>
void _smawk(
        const vector<ulong>& rows,
        const vector<ulong>& cols,
        const function<T(ulong, ulong)>& lookup,
        vector<ulong>* result) {
    // Recursion base case
    if (rows.size() == 0) return;

    // ********************************
    // * REDUCE
    // ********************************

    vector<ulong> _cols;  // Stack of surviving columns
    for (ulong col : cols) {
        while (true) {
            if (_cols.size() == 0) break;
            ulong row = rows[_cols.size() - 1];
            if (lookup(row, col) >= lookup(row, _cols.back()))
                break;
            _cols.pop_back();
        }
        if (_cols.size() < rows.size())
            _cols.push_back(col);
    }

    // Call recursively on odd-indexed rows
    vector<ulong> odd_rows;
    for (ulong i = 1; i < rows.size(); i += 2) {
        odd_rows.push_back(rows[i]);
    }
    _smawk(odd_rows, _cols, lookup, result);

    unordered_map<ulong, ulong> col_idx_lookup;
    for (ulong idx = 0; idx < _cols.size(); ++idx) {
        col_idx_lookup[_cols[idx]] = idx;
    }

    // ********************************
    // * INTERPOLATE
    // ********************************

    // Fill-in even-indexed rows
    ulong start = 0;
    for (ulong r = 0; r < rows.size(); r += 2) {
        ulong row = rows[r];
        ulong stop = _cols.size() - 1;
        if (r < rows.size() - 1)
            stop = col_idx_lookup[(*result)[rows[r + 1]]];
        ulong argmin = _cols[start];
        T min = lookup(row, argmin);
        for (ulong c = start + 1; c <= stop; ++c) {
            T value = lookup(row, _cols[c]);
            if (c == start || value < min) {
                argmin = _cols[c];
                min = value;
            }
        }
        (*result)[row] = argmin;
        start = stop;
    }
}

/*
 * Interface for the SMAWK algorithm, for finding the minimum value
 * in each row of an implicitly-defined totally monotone matrix.
 */
template <typename T>
vector<ulong> smawk(
        const ulong num_rows,
        const ulong num_cols,
        const function<T(ulong, ulong)>& lookup) {
    vector<ulong> result;
    result.resize(num_rows);
    vector<ulong> rows(num_rows);
    iota(begin(rows), end(rows), 0);
    vector<ulong> cols(num_cols);
    iota(begin(cols), end(cols), 0);
    _smawk<T>(rows, cols, lookup, &result);
    return result;
}

#define NUM_ROWS 9
#define NUM_COLS 18

// SMAWK works on implicitly defined matrices, utilizing a function
// that returns a value as a function of matrix indices.
// An explicitly defined matrix is used here for the purpose of
// illustration.
// The matrix is from:
// http://web.cs.unlv.edu/larmore/Courses/CSC477/monge.pdf.
double matrix[NUM_ROWS][NUM_COLS] = {
    { 25, 21, 13,10,20,13,19,35,37,41,58,66,82,99,124,133,156,178},
    { 42, 35, 26,20,29,21,25,37,36,39,56,64,76,91,116,125,146,164},
    { 57, 48, 35,28,33,24,28,40,37,37,54,61,72,83,107,113,131,146},
    { 78, 65, 51,42,44,35,38,48,42,42,55,61,70,80,100,106,120,135},
    { 90, 76, 58,48,49,39,42,48,39,35,47,51,56,63, 80, 86, 97,110},
    {103, 85, 67,56,55,44,44,49,39,33,41,44,49,56, 71, 75, 84, 96},
    {123,105, 86,75,73,59,57,62,51,44,50,52,55,59, 72, 74, 80, 92},
    {142,123,100,86,82,65,61,62,50,43,47,45,46,46, 58, 59, 65, 73},
    {151,130,104,88,80,59,52,49,37,29,29,24,23,20, 28, 25, 31, 39},
};

int main(void) {
    auto lookup = [](ulong i, ulong j) {
        return matrix[i][j];
    };
    vector<ulong> argmin = smawk<double>(NUM_ROWS, NUM_COLS, lookup);
    for (ulong i = 0; i < argmin.size(); ++i) {
        cout << argmin[i] << endl;
    }
    return 0;
}
```

**References**

Aggarwal, Alok, Maria M. Klawe, Shlomo Moran, Peter Shor, and Robert Wilber. “Geometric Applications of a Matrix-Searching Algorithm.” Algorithmica 2, no. 1 (November 1, 1987): 195–208.

Globally optimal *k*-means clustering is NP-hard for multi-dimensional data. Lloyd’s algorithm is a popular approach for finding a locally optimal solution. For 1-dimensional data, there are polynomial time algorithms.

*kmeans1d* contains an *O(kn + n log n)* dynamic programming algorithm for finding the globally optimal *k* clusters for *n* 1D data points. The code is written in C++—for faster execution than a pure Python implementation—and wrapped in Python.

The source code is available on GitHub:

https://github.com/dstein64/kmeans1d

The package is available on PyPI, the Python Package Index. It can be installed with *pip*.

`$ pip3 install kmeans1d`

The snippet below includes an example of how to use the library.

**References**

[1] Wu, Xiaolin. “Optimal Quantization by Matrix Searching.” Journal of Algorithms 12, no. 4 (December 1, 1991): 663

[2] Grønlund, Allan, Kasper Green Larsen, Alexander Mathiasen, Jesper Sindahl Nielsen, Stefan Schneider, and Mingzhou Song. “Fast Exact K-Means, k-Medians and Bregman Divergence Clustering in 1D.” ArXiv:1701.07204 [Cs], January 25, 2017. http://arxiv.org/abs/1701.07204.