Magic Analysis
When analyzing keyboard layouts with magic keys, we face a unique challenge: the same text can be typed in multiple ways, each creating different patterns that affect typing comfort. Consider typing the word “boat” on a layout with the magic rule o★ → oa:
- Normal typing: b → o → a → t (4 keystrokes)
- Magic typing: b → o → ★ → t (4 keystrokes, but a different key sequence)
Both produce the text “boat”, but they create completely different n-gram patterns for analysis. The normal sequence creates the bigrams “bo”, “oa”, “at”, while the magic sequence creates “bo”, “o★”, “★t”. Since we analyze the keystrokes pressed rather than the characters output, these represent fundamentally different typing experiences.
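A minimal sketch of that distinction in Python (with ★ shown as a literal key label):

```python
from collections import Counter

def keystroke_bigrams(keys):
    """Count adjacent key pairs: we analyze keys pressed, not characters output."""
    return Counter(a + b for a, b in zip(keys, keys[1:]))

normal = ["b", "o", "a", "t"]  # plain typing of "boat"
magic  = ["b", "o", "★", "t"]  # ★ expands o★ → oa; same text, different keys
print(keystroke_bigrams(normal))  # bo, oa, at
print(keystroke_bigrams(magic))   # bo, o★, ★t
```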
Pathfinding
The challenge is choosing which typing sequence to use for each word. If the magic key and regular keys are on comfortable fingers, using magic might reduce Same-Finger Bigrams. But if the magic key is awkwardly positioned, normal typing might be better. We need an algorithm that can automatically find the optimal typing sequence for every word in a large corpus.
Just like layout optimization, magic analysis faces a combinatorial explosion. Consider a corpus with 100,000 unique words and magic rules that could apply to many of them. For each word, there might be multiple ways to type it using different combinations of magic rules. Checking every possibility for every word would take far too long.
To make matters worse, magic decisions can interact with each other. Using a magic rule early in a word might enable or prevent using another magic rule later in the same word. This means we can’t simply make locally optimal decisions—we need to consider the entire word as a unit.
The solution lies in treating each word as a pathfinding problem and using efficient algorithms to find the optimal route through the space of possible keystroke sequences.
Keygen Pro, AKLDB’s layout analyzer, uses two different approaches for magic analysis, depending on the complexity of the magic rules:
Fast Path: Word-Boundary Optimization
For layouts where magic rules don’t contain spaces, we can use an efficient word-by-word approach:
- Analyze each word independently using A* search to find the optimal keystroke sequence
- Apply changes incrementally by calculating the difference between normal and magic typing
- Aggregate results by scaling each word’s changes by its frequency in the corpus
This approach is fast because magic rules can’t affect each other across word boundaries. If a magic rule changes how you type “boat”, it has no impact on how you type “starts” in the phrase “boat starts”. This independence allows us to process each unique word once and multiply the results by that word’s frequency.
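A minimal sketch of this per-word loop, assuming best_keystrokes is the A* search described below; the names here are illustrative, not Keygen Pro’s actual API:

```python
from collections import Counter

def bigram_counts(keys):
    return Counter(a + b for a, b in zip(keys, keys[1:]))

def fast_path_delta(word_freqs, best_keystrokes):
    """Analyze each unique word once; scale its bigram delta by corpus frequency."""
    total = Counter()
    for word, freq in word_freqs.items():
        normal = list(word)            # one keystroke per character
        magic = best_keystrokes(word)  # optimal keystroke sequence for this word
        if magic == normal:
            continue                   # no magic used; base statistics already cover it
        delta = bigram_counts(magic)
        delta.subtract(bigram_counts(normal))
        for ngram, count in delta.items():
            total[ngram] += count * freq  # word independence makes scaling valid
    return total
```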
Slow Path: Typing Simulation
Section titled “Slow Path: Typing Simulation”For layouts with space-containing magic rules (like Magic Sturdy), word boundaries become relevant. A magic rule like , ★ → , but
produces a space character, which means magic decisions can affect text across word boundaries.
When this happens, we fall back to typing the entire corpus:
- Break the corpus into valid sequences, where characters excluded from our analysis separate one sequence from the next,
- Find the best way to type the entire sequence using A* search, then
- Extract n-gram statistics from the actual keystrokes used
This approach is much slower—multiple minutes per layout in practice—but handles the full generality of magic rules that can span word boundaries.
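A sketch of the first step, assuming the analysis alphabet is lowercase letters, apostrophe, and space (the real character set depends on the layout):

```python
import re

# Characters outside the analysis set split the corpus into independent
# sequences, each of which is then typed optimally with A* in step 2.
VALID = re.compile(r"[a-z' ]+")

def corpus_sequences(corpus):
    return [s for s in (m.strip() for m in VALID.findall(corpus.lower())) if s]

print(corpus_sequences("The boat starts, but... it's raining!"))
# ['the boat starts', 'but', "it's raining"]
```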
Key Algorithmic Techniques
Section titled “Key Algorithmic Techniques”A* Search: Finding Optimal Paths
At the heart of magic analysis is A* search, a pathfinding algorithm that finds the lowest-cost route from start to finish. Think of it like GPS navigation: given a starting point (the beginning of a word) and a destination (the end of the word), A* finds the best route (keystroke sequence) while avoiding high-cost areas (uncomfortable key combinations).
For each word, A* explores different ways to type it:
- States represent progress through the word: “I’ve typed 3 characters, last key was ‘o’”
- Actions represent pressing a key: normal keys, magic keys, or repeat keys
- Costs incorporate layout-specific penalties like Same-Finger Bigrams or Lateral Stretch
The algorithm uses a heuristic (educated guess) about remaining cost to guide its search toward promising paths while avoiding obviously bad ones. This makes it much faster than checking every possible sequence.
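Here is a minimal A* sketch over those states, under simplifying assumptions: rules maps a trigger key to the text the magic key outputs after it (o★ → oa becomes {"o": "a"}, t★ → tment becomes {"t": "ment"}), and cost(prev, key) scores a single keystroke bigram. The names are illustrative, not Keygen Pro’s actual implementation:

```python
import heapq
import itertools

def best_keystrokes(word, rules, cost, min_bigram_cost=0.0):
    max_out = max((len(c) for c in rules.values()), default=1)

    def h(pos):
        # Admissible heuristic: at least ceil(remaining / max_out) keystrokes
        # remain, each costing at least min_bigram_cost.
        return min_bigram_cost * -(-(len(word) - pos) // max_out)

    tie = itertools.count()  # break heap ties without comparing states
    frontier = [(h(0), next(tie), 0.0, 0, None, [])]
    settled = set()
    while frontier:
        _, _, g, pos, last, path = heapq.heappop(frontier)
        if pos == len(word):
            return path  # cheapest complete keystroke sequence
        if (pos, last) in settled:
            continue
        settled.add((pos, last))
        # Action 1: press the next character's key normally.
        nxt = word[pos]
        g1 = g + (cost(last, nxt) if last is not None else 0.0)
        heapq.heappush(frontier, (g1 + h(pos + 1), next(tie), g1,
                                  pos + 1, nxt, path + [nxt]))
        # Action 2: press ★ if the last key triggers a rule whose output
        # matches the text still to be typed.
        out = rules.get(last)
        if out and word.startswith(out, pos):
            g2 = g + cost(last, "★")
            heapq.heappush(frontier, (g2 + h(pos + len(out)), next(tie), g2,
                                      pos + len(out), "★", path + ["★"]))
    return list(word)  # every word is typeable normally, so this is unreachable

# With a penalty on the o→a same-finger bigram, the magic route wins:
sfb_penalty = {("o", "a"): 2.0}
print(best_keystrokes("boat", {"o": "a"},
                      lambda a, b: sfb_penalty.get((a, b), 1.0)))
# ['b', 'o', '★', 't']
```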
Beam Search: Managing Complexity
Pure A* can use exponential memory in the worst case, so we use beam search as an optimization. Instead of keeping track of every possible path, beam search only remembers the k best partial paths at each step (typically k=10). This bounds memory usage and makes the algorithm predictable.
The tradeoff is that beam search might occasionally miss the globally optimal solution, but in practice the best paths are usually clearly better than the alternatives, so this rarely matters.
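A sketch of the pruning step, reusing the simplified rule model from the A* sketch above:

```python
def beam_search(word, rules, cost, k=10):
    # Expand all surviving paths one keystroke at a time, keeping only the
    # k cheapest partial paths per step.
    beams = [(0.0, 0, None, [])]  # (cost so far, chars produced, last key, keys)
    done = []
    while beams:
        candidates = []
        for g, pos, last, path in beams:
            if pos == len(word):
                done.append((g, path))  # finished paths leave the beam
                continue
            nxt = word[pos]
            g1 = g + (cost(last, nxt) if last is not None else 0.0)
            candidates.append((g1, pos + 1, nxt, path + [nxt]))
            out = rules.get(last)
            if out and word.startswith(out, pos):
                candidates.append((g + cost(last, "★"),
                                   pos + len(out), "★", path + ["★"]))
        candidates.sort(key=lambda c: c[0])
        beams = candidates[:k]  # prune: memory stays O(k) regardless of word length
    return min(done, key=lambda d: d[0])[1] if done else list(word)
```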
Opportunity Detection: Avoiding Wasted Work
Before running A* on a word, we need to know if any magic rules could possibly apply. Rather than running expensive path-finding on every word, we use indexing to quickly identify candidates.
We preprocess the corpus to build indexes: “Which words contain the bigram ‘oa’?” or “Which words contain the trigram ‘men’?”. When analyzing a magic rule like t★ → tment, we can instantly find all words containing “tment” and only run A* on those candidates.
This reduces the search space from millions of words to hundreds or thousands of relevant candidates.
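A minimal sketch of such an index, using trigrams (the real index granularity is an assumption here):

```python
from collections import defaultdict

def build_index(words, n=3):
    """Map every n-gram to the set of words that contain it."""
    index = defaultdict(set)
    for word in words:
        for i in range(len(word) - n + 1):
            index[word[i:i + n]].add(word)
    return index

def candidates(index, pattern, n=3):
    # A word must contain every n-gram of the pattern to contain the pattern.
    grams = [pattern[i:i + n] for i in range(len(pattern) - n + 1)]
    found = set.intersection(*(index[g] for g in grams)) if grams else set()
    return {w for w in found if pattern in w}  # confirm the full match

index = build_index(["treatment", "moment", "boat", "tremendous"])
print(candidates(index, "tment"))  # {'treatment'}
```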
Delta-Based Updates: Incremental Efficiency
Rather than recalculating all n-gram statistics from scratch, we use a delta-based approach:
- Start with base statistics calculated assuming no magic rules are used
- For each word that uses magic, calculate the difference between normal and magic typing
- Apply the changes incrementally, scaled by word frequency
For example, if “treatment” appears 1000 times and switches from normal typing to using t★ → tment, we remove 1000 instances of bigrams like “tm”, “me”, “en” and add 1000 instances of “t★”.
This approach keeps non-magic layouts fast (they use base statistics directly) while efficiently handling magic layouts through targeted updates.
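The “treatment” example as a minimal sketch of the arithmetic, using the same simplified keystroke model as the earlier sketches (t★ → tment means ★ outputs “ment” after t):

```python
from collections import Counter

normal = ["t", "r", "e", "a", "t", "m", "e", "n", "t"]
magic  = ["t", "r", "e", "a", "t", "★"]

def bigram_counts(keys):
    return Counter(a + b for a, b in zip(keys, keys[1:]))

delta = bigram_counts(magic)
delta.subtract(bigram_counts(normal))           # magic minus normal
scaled = {bg: n * 1000 for bg, n in delta.items() if n != 0}
print(scaled)
# {'t★': 1000, 'tm': -1000, 'me': -1000, 'en': -1000, 'nt': -1000}
```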
Performance and Complexity
The two-path approach provides excellent scalability:
Fast path complexity: O(words_with_opportunities × beam_width × average_word_length)
- Typical values: 50,000 candidate words × 10 beam width × 6 characters = ~3 million operations
- Suitable for real-time analysis of large corpora
Slow path complexity: O(corpus_length × beam_width)
- Used only when necessary for space-containing rules
- Much slower but handles full generality
Memory usage: O(beam_width) for active search, plus O(unique_words + unique_bigrams/trigrams) for indexes
- Bounded and predictable regardless of corpus size
Metrics
Once the optimal keystroke sequences are calculated, the final n-gram counts (after applying magic optimizations) feed into the same Same-Finger Bigram, Lateral Stretch, and Rhythm calculators used for traditional layouts.
Additionally, magic layouts get a Speedup metric that measures typing efficiency gains:
A 5% speedup means magic keys reduce total keystrokes by 5% compared to normal typing.
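A sketch of that calculation under the definition above; word_freqs and keystrokes_for (the optimal sequence from A*) are illustrative names:

```python
def speedup(word_freqs, keystrokes_for):
    """Relative keystroke savings versus typing every character normally."""
    normal = sum(len(word) * freq for word, freq in word_freqs.items())
    magic = sum(len(keystrokes_for(word)) * freq for word, freq in word_freqs.items())
    return 1 - magic / normal  # 0.05 means 5% fewer keystrokes than normal typing

# "treatment" typed as t, r, e, a, t, ★ saves 3 of 9 keystrokes:
print(speedup({"treatment": 1000},
              lambda w: ["t", "r", "e", "a", "t", "★"]))  # ≈ 0.33
```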