# DETAILS: cowc internals, semantics, and performance

This document explains:

1. **COW language semantics** as defined by the original C++ implementation
2. How `cowc.rs` matches those semantics
3. Code structure and maintainability choices
4. Performance decisions (and how they compare to `cow.cpp` and `cowcomp.cpp`)

## 1) What exactly is “COW” here?

The “spec” is the behavior of the original implementation:

- `cow.cpp`: interpreter (parses tokens and executes a program vector)
- `cowcomp.cpp`: compiler variant (parses tokens and emits a native program via generated C++)

`cowc.rs` targets **the compiler variant semantics** (especially runtime error behavior), while still matching
the shared parsing and opcode behavior.

## 2) Tokenization: the sliding 3-byte window

Both `cow.cpp` and `cowcomp.cpp` tokenize using a rolling 3-byte buffer:

- read one byte into `buf[2]`
- compare `buf` against the 12 known tokens
- if matched: emit instruction and reset buffer to `{0,0,0}`
- else: shift (`buf[0]=buf[1]; buf[1]=buf[2]; buf[2]=0`)

`cowc.rs` implements the same logic in `parse_cow_source()`.

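The four steps above can be sketched in a few lines of Rust. This is a minimal illustration, not the body of `parse_cow_source()`: the names `TOKENS` and `tokenize` are hypothetical, and the token order is taken from the instruction table in this document.

```rust
/// The 12 COW tokens, indexed by opcode (0 = `moo`, ..., 11 = `oom`).
const TOKENS: [&[u8; 3]; 12] = [
    b"moo", b"mOo", b"moO", b"mOO", b"Moo", b"MOo",
    b"MoO", b"MOO", b"OOO", b"MMM", b"OOM", b"oom",
];

/// Hypothetical helper: scan `src` with a rolling 3-byte window and
/// return the opcode of every token found.
fn tokenize(src: &[u8]) -> Vec<u8> {
    let mut buf = [0u8; 3];
    let mut ops = Vec::new();
    for &byte in src {
        buf[2] = byte;
        if let Some(id) = TOKENS.iter().position(|t| **t == buf) {
            ops.push(id as u8);
            buf = [0; 3]; // a match consumes the whole window
        } else {
            // No match: slide the window left by one byte.
            buf[0] = buf[1];
            buf[1] = buf[2];
            buf[2] = 0;
        }
    }
    ops
}
```

With this sketch, `tokenize(b"MoO MoO moo")` yields `[6, 6, 0]`; bytes that are not part of a token simply slide out of the window.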
## 3) Instruction set

The 12 tokens map to numeric opcodes:

| Token | ID | Meaning |
|------:|---:|---------|
| `moo` | 0 | loop end (jump back) |
| `mOo` | 1 | move pointer left |
| `moO` | 2 | move pointer right |
| `mOO` | 3 | eval (execute instruction in current cell; if cell==3 exit) |
| `Moo` | 4 | if cell!=0, output char; else read char and flush rest of line |
| `MOo` | 5 | decrement cell |
| `MoO` | 6 | increment cell |
| `MOO` | 7 | loop start (if cell==0 skip forward) |
| `OOO` | 8 | set cell to 0 |
| `MMM` | 9 | toggle register load/store |
| `OOM` | 10 | print int + newline |
| `oom` | 11 | read int line (`atoi`-style) |

## 4) Tape / pointer / register

The C++ compiler variant uses:

- `std::vector<int> m;`
- `iterator p;`
- `int r; bool h;` for the register toggle.

Rust uses:

- `Vec<i32> m`
- `usize p`
- `i32 r; bool h`

Rust uses explicit `wrapping_*` arithmetic so that overflow is deterministic (signed overflow is undefined behavior in C++).

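As a rough sketch of how these four fields fit together, the snippet below models the state and the three operations that touch it most directly. The `State` type, its method names, and the one-cell initial tape are assumptions made for illustration; only the field names and the use of `wrapping_*` come from this document.

```rust
/// Hypothetical runtime state; the field names follow the list above.
struct State {
    m: Vec<i32>, // tape
    p: usize,    // index of the current cell
    r: i32,      // register value
    h: bool,     // true while the register holds a value
}

impl State {
    /// Assumption: the tape starts with a single zeroed cell.
    fn new() -> Self {
        State { m: vec![0], p: 0, r: 0, h: false }
    }

    /// `MoO`: increment the current cell, wrapping on overflow.
    fn inc(&mut self) {
        self.m[self.p] = self.m[self.p].wrapping_add(1);
    }

    /// `MOo`: decrement the current cell, wrapping on overflow.
    fn dec(&mut self) {
        self.m[self.p] = self.m[self.p].wrapping_sub(1);
    }

    /// `MMM`: store the cell into an empty register, otherwise load the
    /// register back into the cell and mark the register empty.
    fn toggle_register(&mut self) {
        if self.h {
            self.m[self.p] = self.r;
        } else {
            self.r = self.m[self.p];
        }
        self.h = !self.h;
    }
}
```

Incrementing a cell that holds `i32::MAX` wraps to `i32::MIN` here, in both debug and release builds.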
## 5) Loop matching quirks

Loop matching in `cowcomp.cpp` is done by scanning through the instruction vector with a nesting counter, but it
has a few peculiarities:

- For `moo` (case 0), it “skips previous command” before scanning backward, and it breaks when reaching the beginning
  without inspecting index 0.
- For `MOO` (case 7), it “skips next command” when scanning forward, and it decrements nesting twice when a `moo`
  immediately follows a `MOO` (`prev == 7` special case).

`cowc.rs` mirrors these behaviors in:

- `match_for_moo_back()`
- `match_for_moo_forward()`

### Why “virtual” matches?
`mOO` (eval) can dynamically execute `moo`/`MOO` relative to the *current program counter*. The C++ compiler variant
implements this by calling `compile(op, false)` at the current position, reusing the same scanning behavior.

To match that cleanly, `cowc.rs` precomputes match results **for every instruction index** for both directions.

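The backward direction can be illustrated with a short sketch. It encodes one reading of the `moo` description above (skip the previous command, then scan with a nesting counter, giving up at the beginning of the program); it is not the code of `match_for_moo_back()`, whose details may differ, and the names `scan_back` and `back_table` are hypothetical.

```rust
/// Hypothetical backward scan for a `moo` (opcode 0) executed at index `pc`.
/// Returns the index of the matching `MOO` (opcode 7), or `None`.
fn scan_back(prog: &[u8], pc: usize) -> Option<usize> {
    if pc == 0 {
        return None;
    }
    let mut i = pc - 1; // the previous command is skipped, not inspected
    let mut level = 1;
    while level > 0 {
        if i == 0 {
            break; // reached the beginning before the nesting closed
        }
        i -= 1;
        match prog[i] {
            0 => level += 1, // another `moo`: one level deeper
            7 => level -= 1, // a `MOO`: one level closed
            _ => {}
        }
    }
    if level == 0 { Some(i) } else { None }
}

/// Precompute the result for every index, so that a `moo` executed through
/// eval at any position can reuse the same answer.
fn back_table(prog: &[u8]) -> Vec<Option<usize>> {
    (0..prog.len()).map(|pc| scan_back(prog, pc)).collect()
}
```

Under this reading, the opcode sequence `[7, 0]` (a `MOO` immediately followed by `moo`) finds no match, because the only candidate is the skipped previous command. A table like `back_table` has an entry for every index, not only for indices that hold a `moo`, which is what makes the matches "virtual".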
## 6) `mOO` (eval) semantics

In `cowcomp.cpp`:

- `mOO` emits a `switch(*p)` with cases 0..2 and 4..11
- it deliberately omits case 3; value 3 falls into `default: goto x;` (exit)
- unknown values also `goto x;` (exit)

`cowc.rs` matches that:

- cell value 3 exits
- unknown values exit
- otherwise it performs the instruction’s effect *without* advancing the program counter during the eval itself
  (except when the evaluated instruction causes a jump), and then execution continues to the next instruction.

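The case analysis above reduces to a small decision function. The sketch below shows only that decision; the `Eval` type and `eval_cell` name are hypothetical, and actually performing the selected opcode is left out.

```rust
/// What a `mOO` should do for a given cell value (hypothetical type).
#[derive(Debug, PartialEq)]
enum Eval {
    Run(u8), // execute this opcode at the current position
    Exit,    // terminate the program
}

/// Mirror the `switch(*p)` shape: cases 0..2 and 4..11 run an
/// instruction, everything else falls through to the exit path.
fn eval_cell(cell: i32) -> Eval {
    match cell {
        0..=2 | 4..=11 => Eval::Run(cell as u8),
        _ => Eval::Exit, // 3, negative values, and anything above 11
    }
}
```

Treating 3 and out-of-range values through the same fallthrough arm keeps the sketch close to the `default: goto x;` structure described for `cowcomp.cpp`.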
## 7) I/O semantics

### `Moo`
Matches C++ compiler variant behavior:

- if cell != 0: output as a byte (`putchar(*p)`)
- else: read one byte, then discard the rest of the input line (up to and including the newline)

### `oom`
The reference reads up to 99 chars into a fixed buffer, then calls `atoi`.
It also tries to discard the remainder of an over-long line, but the condition never triggers (a small bug).
`cowc.rs` intentionally preserves this: it reads at most 99 bytes or until newline and does not flush extra input.

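A sketch of the `oom` behavior described above, under a few stated assumptions: the helper names `read_line_99` and `atoi_like` are hypothetical, the newline is consumed but not stored, and overflow wraps (C's `atoi` leaves overflow undefined, so wrapping is a choice made here, not something taken from the reference).

```rust
use std::io::Read;

/// Read at most 99 bytes, stopping after a newline. Extra input on an
/// over-long line is deliberately left unread.
fn read_line_99<R: Read>(input: &mut R) -> Vec<u8> {
    let mut line = Vec::new();
    let mut byte = [0u8; 1];
    while line.len() < 99 {
        match input.read(&mut byte) {
            Ok(1) => {
                if byte[0] == b'\n' {
                    break;
                }
                line.push(byte[0]);
            }
            _ => break, // EOF or I/O error ends the line
        }
    }
    line
}

/// `atoi`-style parse: skip leading whitespace, accept one optional sign,
/// then consume digits until the first non-digit. No digits gives 0.
fn atoi_like(s: &[u8]) -> i32 {
    let mut i = 0;
    while i < s.len() && s[i].is_ascii_whitespace() {
        i += 1;
    }
    let mut negative = false;
    if i < s.len() && (s[i] == b'+' || s[i] == b'-') {
        negative = s[i] == b'-';
        i += 1;
    }
    let mut value: i32 = 0;
    while i < s.len() && s[i].is_ascii_digit() {
        value = value.wrapping_mul(10).wrapping_add((s[i] - b'0') as i32);
        i += 1;
    }
    if negative { value.wrapping_neg() } else { value }
}
```

Because nothing is discarded after the 99th byte, the tail of an over-long line stays in the input and is seen by the next read, which is the preserved quirk.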
## 8) “Fully Rust” output: why a `match pc` dispatch loop?

The original `cowcomp.cpp` emits gotos and relies on `g++ -O3` to build a fast binary.

Rust does not have `goto`, but the closest equivalent that optimizes well is:

```text
loop {
  match pc {
    0 => { ... pc = 1; continue; }
    1 => { ... pc = 2; continue; }
    ...
    _ => break
  }
}
```

With `-C opt-level=3`, this typically becomes a compact jump table plus tight blocks. It avoids interpreter overhead
and stays close to the C++ compiler’s control-flow shape.

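To make the shape concrete, here is a hand-written instance of such a loop for the COW program `MoO MoO MoO MOO OOM MOo moo`, which counts down from 3. It is an illustration only, not actual `cowc.rs` output: the function name `run` is hypothetical, output is collected into a `String` for simplicity, and the jump targets were worked out by hand.

```rust
/// Hand-written dispatch loop for `MoO MoO MoO MOO OOM MOo moo`.
fn run() -> String {
    let mut m = vec![0i32]; // tape
    let p = 0usize;         // cell pointer (this program never moves it)
    let mut out = String::new();
    let mut pc = 0usize;
    loop {
        match pc {
            // MoO x3: increment the current cell
            0 => { m[p] = m[p].wrapping_add(1); pc = 1; continue; }
            1 => { m[p] = m[p].wrapping_add(1); pc = 2; continue; }
            2 => { m[p] = m[p].wrapping_add(1); pc = 3; continue; }
            // MOO: loop start, skip past the matching `moo` when cell == 0
            3 => { pc = if m[p] == 0 { 7 } else { 4 }; continue; }
            // OOM: print the cell as an integer plus newline
            4 => { out.push_str(&format!("{}\n", m[p])); pc = 5; continue; }
            // MOo: decrement the current cell
            5 => { m[p] = m[p].wrapping_sub(1); pc = 6; continue; }
            // moo: loop end, jump back to the matching `MOO`
            6 => { pc = 3; continue; }
            _ => break,
        }
    }
    out
}
```

Each arm is a straight-line block ending in an explicit assignment to `pc`, which plays the role of the `goto` in the generated C++.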
## 9) Performance choices

`cowc.rs` emits Rust that is tuned for speed:

- **Chunked stdin buffering**: reads from stdin on demand (does not block at startup on interactive consoles).
- **Buffered stdout**: appends output to a `Vec<u8>` and writes it once. This is now optional, controlled by a flag, after issues arose.
- **Wrapping arithmetic**: `wrapping_add/sub` keeps semantics stable and avoids debug-vs-release surprises (plain `+`/`-` panics on overflow in debug builds and wraps in release builds).
- **Unsafe cell access**: `get_unchecked` removes bounds checks in the hot path (safe because pointer growth is guarded).
- **Rustc flags**:
  - `-C opt-level=3`
  - `-C codegen-units=1` (better optimization at the cost of compile time)
  - `-C panic=abort` (smaller + faster)
  - optional `--lto` (`-C lto=fat`)
  - optional native CPU (`-C target-cpu=native`) for host builds

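The "guarded growth plus unchecked access" pairing can be sketched as follows. This is an assumption about how such a guard might look, not the emitted code: the function names are hypothetical, and growing the tape by one zeroed cell at a time is chosen for brevity.

```rust
/// `moO`: move right, growing the tape so that `p` always stays in bounds.
fn move_right(m: &mut Vec<i32>, p: &mut usize) {
    *p += 1;
    if *p == m.len() {
        m.push(0); // guard: extend the tape before the cell is ever touched
    }
}

/// `MoO` on the hot path: no bounds check on the cell access.
fn inc(m: &mut [i32], p: usize) {
    debug_assert!(p < m.len());
    // SAFETY: every pointer move keeps `p < m.len()` (see `move_right`).
    unsafe {
        let cell = m.get_unchecked_mut(p);
        *cell = cell.wrapping_add(1);
    }
}
```

The bounds check is paid once per pointer move instead of once per cell access, which matters when a loop body touches the same cell many times.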
## 10) Comparing cowc.rs to the C++ files

### vs `cow.cpp` (interpreter)
- `cow.cpp` dispatches at runtime via a function / switch per step
- `cowc.rs` produces ahead-of-time code with a PC jump table
- Result: compiled output is generally much faster on loop-heavy programs

### vs `cowcomp.cpp` (compiler variant)
- `cowcomp.cpp` emits gotos in generated C++
- `cowc.rs` emits a `match pc` loop in generated Rust
- Both produce straight-line blocks with explicit jumps
- `cowc.rs` removes the dependency on an external C/C++ toolchain (clang/g++)

---

Details were documented using the AIGEN toolset.