# DETAILS: cowc internals, semantics, and performance

This document explains:

1. **COW language semantics** as implemented by the original C++ implementation
2. How `cowc.rs` matches those semantics
3. Code structure and maintainability choices
4. Performance decisions (and how they compare to `cow.cpp` and `cowcomp.cpp`)

## 1) What exactly is “COW” here?

The “spec” is the behavior of the original implementation:

- `cow.cpp`: interpreter (parses tokens and executes a program vector)
- `cowcomp.cpp`: compiler variant (parses tokens and emits a native program via generated C++)

`cowc.rs` targets **the compiler variant semantics** (especially runtime error behavior), while still matching the shared parsing and opcode behavior.

## 2) Tokenization: the sliding 3-byte window

Both `cow.cpp` and `cowcomp.cpp` tokenize using a rolling 3-byte buffer:

- read one byte into `buf[2]`
- compare `buf` against the 12 known tokens
- if matched: emit instruction and reset buffer to `{0,0,0}`
- else: shift (`buf[0]=buf[1]; buf[1]=buf[2]; buf[2]=0`)

`cowc.rs` implements the same logic in `parse_cow_source()`.

## 3) Instruction set

The 12 tokens map to numeric opcodes:

| Token | ID | Meaning |
|------:|---:|---------|
| `moo` | 0 | loop end (jump back) |
| `mOo` | 1 | move pointer left |
| `moO` | 2 | move pointer right |
| `mOO` | 3 | eval (execute instruction in current cell; if cell==3 exit) |
| `Moo` | 4 | if cell!=0 output char else read char and flush rest of line |
| `MOo` | 5 | decrement cell |
| `MoO` | 6 | increment cell |
| `MOO` | 7 | loop start (if cell==0 skip forward) |
| `OOO` | 8 | set cell to 0 |
| `MMM` | 9 | toggle register load/store |
| `OOM` | 10 | print int + newline |
| `oom` | 11 | read int line (`atoi`-style) |

## 4) Tape / pointer / register

The C++ compiler variant uses:

- `std::vector m;`
- `iterator p;`
- `int r; bool h;` for the register toggle.

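
To make the `r` / `h` pair concrete, here is a minimal sketch of a register toggle. It is not taken from `cowcomp.cpp` or `cowc.rs`: the names `Register` and `toggle`, the `i32` cell type, and the load-when-empty / store-when-full order (the usual description of COW's `MMM`) are all assumptions made for illustration.

```rust
/// Hypothetical register state, mirroring the `r` / `h` pair listed above.
struct Register {
    r: i32,  // value currently held by the register
    h: bool, // true while the register holds a value
}

impl Register {
    fn new() -> Self {
        Register { r: 0, h: false }
    }

    /// Assumed `MMM` behavior: when the register is empty, load the cell
    /// into it; when it is full, store it back into the cell. Either way
    /// the "has value" flag flips.
    fn toggle(&mut self, cell: &mut i32) {
        if self.h {
            *cell = self.r; // store: paste the register into the cell
        } else {
            self.r = *cell; // load: copy the cell into the register
        }
        self.h = !self.h;
    }
}
```

Calling `toggle` on one cell and then on another copies the first cell's value into the second, which is how the register is typically used to move values along the tape.
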
Rust uses:

- `Vec m`
- `usize p`
- `i32 r; bool h`

Rust uses explicit `wrapping_*` arithmetic for deterministic overflow.

## 5) Loop matching quirks

Loop matching in `cowcomp.cpp` is done by scanning through the instruction vector with a nesting counter, but it has a few peculiarities:

- For `moo` (case 0), it “skips previous command” before scanning backward, and it breaks when reaching the beginning without inspecting index 0.
- For `MOO` (case 7), it “skips next command” when scanning forward, and it decrements nesting twice when a `moo` immediately follows a `MOO` (`prev == 7` special case).

`cowc.rs` mirrors these behaviors in:

- `match_for_moo_back()`
- `match_for_moo_forward()`

### Why “virtual” matches?

`mOO` (eval) can dynamically execute `moo`/`MOO` relative to the *current program counter*. The C++ compiler variant implements this by calling `compile(op, false)` at the current position, reusing the same scanning behavior.

To match that cleanly, `cowc.rs` precomputes match results **for every instruction index** for both directions.

## 6) `mOO` (eval) semantics

In `cowcomp.cpp`:

- `mOO` emits a `switch(*p)` with cases 0..2 and 4..11
- it deliberately omits case 3; value 3 falls into `default: goto x;` (exit)
- unknown values also `goto x;` (exit)

`cowc.rs` matches that:

- cell value 3 exits
- unknown values exit
- otherwise it performs the instruction’s effect *without* advancing the program counter during the eval itself (except when the evaluated instruction causes a jump), and then execution continues to the next instruction.

## 7) I/O semantics

### `Moo`

Matches C++ compiler variant behavior:

- if cell != 0: output as a byte (`putchar(*p)`)
- else: read one byte and then flush until newline

### `oom`

The reference reads up to 99 chars into a fixed buffer, then calls `atoi`. It also tries to flush on overflow, but the condition never triggers (a small bug).

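
The capped read plus `atoi`-style parse can be sketched in Rust as below. This is an illustration under stated assumptions, not the code `cowc.rs` emits: the helper names `read_line_capped` and `atoi_like` are hypothetical, the newline is consumed without being stored, and overflow wraps (C's `atoi` leaves overflow undefined).

```rust
use std::io::BufRead;

/// Hypothetical helper: read at most `limit` bytes, stopping early at a
/// newline. Anything beyond the cap is left in the input (no flush).
fn read_line_capped<R: BufRead>(input: &mut R, limit: usize) -> Vec<u8> {
    let mut line = Vec::new();
    while line.len() < limit {
        let byte = match input.fill_buf() {
            Ok(buf) if !buf.is_empty() => buf[0],
            _ => break, // EOF or I/O error: stop reading
        };
        input.consume(1);
        if byte == b'\n' {
            break; // newline ends the line and is not stored
        }
        line.push(byte);
    }
    line
}

/// Hypothetical `atoi`-like parse: skip leading ASCII whitespace, accept an
/// optional sign, read digits until the first non-digit, return 0 if there
/// are no digits. Overflow wraps (an assumption of this sketch).
fn atoi_like(bytes: &[u8]) -> i32 {
    let mut i = 0;
    while i < bytes.len() && bytes[i].is_ascii_whitespace() {
        i += 1;
    }
    let mut negative = false;
    if i < bytes.len() && (bytes[i] == b'-' || bytes[i] == b'+') {
        negative = bytes[i] == b'-';
        i += 1;
    }
    let mut value: i32 = 0;
    while i < bytes.len() && bytes[i].is_ascii_digit() {
        let digit = (bytes[i] - b'0') as i32;
        value = value.wrapping_mul(10).wrapping_add(digit);
        i += 1;
    }
    if negative { value.wrapping_neg() } else { value }
}
```

With a cap of 99, an over-long line is split: the first call returns 99 bytes and the remainder stays queued for the next read, which is the "no flush" behavior described here.
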
`cowc.rs` intentionally preserves this: it reads at most 99 bytes or until newline and does not flush extra input.

## 8) “Fully Rust” output: why a `match pc` dispatch loop?

The original `cowcomp.cpp` emits gotos and relies on `g++ -O3` to build a fast binary.

Rust does not have `goto`, but the closest equivalent that optimizes well is:

```text
loop {
  match pc {
    0 => { ... pc = 1; continue; }
    1 => { ... pc = 2; continue; }
    ...
    _ => break
  }
}
```

With `-C opt-level=3`, this typically becomes a compact jump table plus tight blocks. It avoids interpreter overhead and stays close to the C++ compiler’s control-flow shape.

## 9) Performance choices

`cowc.rs` emits Rust that is tuned for speed:

- **Chunked stdin buffering**: reads from stdin on demand (does not block at startup on interactive consoles).
- **Buffered stdout**: append to `Vec` and write once (now optional with a flag, after issues arose).
- **Wrapping arithmetic**: `wrapping_add/sub` keeps semantics stable and avoids debug-vs-release surprises.
- **Unsafe cell access**: `get_unchecked` removes bounds checks in the hot path (safe because pointer growth is guarded).
- **Rustc flags**:
  - `-C opt-level=3`
  - `-C codegen-units=1` (better optimization at the cost of compile time)
  - `-C panic=abort` (smaller + faster)
  - optional `--lto` (`-C lto=fat`)
  - optional native CPU (`-C target-cpu=native`) for host builds

## 10) Comparing cowc.rs to the C++ files

### vs `cow.cpp` (interpreter)

- `cow.cpp` dispatches at runtime via a function / switch per step
- `cowc.rs` produces ahead-of-time code with a PC jump table
- Result: compiled output is generally much faster on loop-heavy programs

### vs `cowcomp.cpp` (compiler variant)

- `cowcomp.cpp` emits gotos in generated C++
- `cowc.rs` emits a `match pc` loop in generated Rust
- Both produce straight-line blocks with explicit jumps
- `cowc.rs` removes the dependency on an external C/C++ toolchain (clang/g++)

---

Details were documented using the AIGEN toolset.