seppdroid/cowc


Sepp Jeremiah Morris 6e98f4860e made the stdout buffer optional with a flag, to prevent issues like issue#1

2026-02-21 16:36:41 +01:00


DETAILS: cowc internals, semantics, and performance

This document explains:

COW language semantics as defined by the behavior of the original C++ implementation
How cowc.rs matches those semantics
Code structure and maintainability choices
Performance decisions (and how they compare to cow.cpp and cowcomp.cpp)

1) What exactly is “COW” here?

The “spec” is the behavior of the original implementation:

cow.cpp: interpreter (parses tokens and executes a program vector)
cowcomp.cpp: compiler variant (parses tokens and emits a native program via generated C++)

cowc.rs targets the compiler variant semantics (especially runtime error behavior), while still matching the shared parsing and opcode behavior.

2) Tokenization: the sliding 3-byte window

Both cow.cpp and cowcomp.cpp tokenize using a rolling 3-byte buffer:

read one byte into buf[2]
compare buf against the 12 known tokens
if matched: emit instruction and reset buffer to {0,0,0}
else: shift (buf[0]=buf[1]; buf[1]=buf[2]; buf[2]=0)

cowc.rs implements the same logic in parse_cow_source().
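
The windowing steps above can be sketched as follows. This is a minimal illustration rather than the actual parse_cow_source(): the name tokenize_sketch and the layout of the TOKENS table are assumptions, and the opcode IDs are the ones listed in section 3.

```rust
/// The 12 COW tokens, indexed by opcode ID (0..=11), as listed in section 3.
const TOKENS: [&[u8; 3]; 12] = [
    b"moo", b"mOo", b"moO", b"mOO", b"Moo", b"MOo",
    b"MoO", b"MOO", b"OOO", b"MMM", b"OOM", b"oom",
];

/// Hypothetical stand-in for the parser: returns opcode IDs in source order.
fn tokenize_sketch(src: &[u8]) -> Vec<u8> {
    let mut ops = Vec::new();
    let mut buf = [0u8; 3];
    for &byte in src {
        // Read one byte into the last slot of the window.
        buf[2] = byte;
        if let Some(id) = TOKENS.iter().position(|tok| **tok == buf) {
            // Match: emit the opcode and reset the whole window.
            ops.push(id as u8);
            buf = [0, 0, 0];
        } else {
            // No match: shift the window left by one byte.
            buf[0] = buf[1];
            buf[1] = buf[2];
            buf[2] = 0;
        }
    }
    ops
}
```

Because the window is reset after every match, tokens never overlap: in this sketch the input `moOOO` yields only `moO`, since the two remaining `O` bytes cannot complete an `OOO` on their own.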

3) Instruction set

The 12 tokens map to numeric opcodes:

| Token | ID | Meaning |
|-------|----|---------|
| `moo` | 0 | loop end (jump back) |
| `mOo` | 1 | move pointer left |
| `moO` | 2 | move pointer right |
| `mOO` | 3 | eval (execute instruction in current cell; if cell==3 exit) |
| `Moo` | 4 | if cell!=0 output char else read char and flush rest of line |
| `MOo` | 5 | decrement cell |
| `MoO` | 6 | increment cell |
| `MOO` | 7 | loop start (if cell==0 skip forward) |
| `OOO` | 8 | set cell to 0 |
| `MMM` | 9 | toggle register load/store |
| `OOM` | 10 | print int + newline |
| `oom` | 11 | read int line (`atoi`-style) |

4) Tape / pointer / register

The C++ compiler variant uses:

std::vector<int> m;
iterator p;
int r; bool h; for the register toggle.

Rust uses:

Vec<i32> m
usize p
i32 r; bool h

The Rust side uses explicit wrapping_* arithmetic, so overflow wraps identically in debug and release builds (plain + and - would panic on overflow in a debug build).
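
A minimal sketch of that state and of the wrapping cell operations is shown below. The struct name MachineSketch, the one-cell initial tape, and the exact MMM behavior (copy the cell into the register when it is empty, otherwise paste the register back and clear it) are assumptions made for illustration, not code taken from cowc.rs.

```rust
/// Illustrative machine state using the field names listed above.
struct MachineSketch {
    m: Vec<i32>, // tape
    p: usize,    // index of the current cell
    r: i32,      // register value
    h: bool,     // true while the register holds a value
}

impl MachineSketch {
    /// Assumption: the tape starts with a single zeroed cell.
    fn new() -> Self {
        MachineSketch { m: vec![0], p: 0, r: 0, h: false }
    }

    /// MoO: increment the current cell, wrapping on overflow.
    fn inc(&mut self) {
        self.m[self.p] = self.m[self.p].wrapping_add(1);
    }

    /// MOo: decrement the current cell, wrapping on underflow.
    fn dec(&mut self) {
        self.m[self.p] = self.m[self.p].wrapping_sub(1);
    }

    /// MMM (assumed semantics): load the cell into an empty register,
    /// or store a held register value back into the cell.
    fn toggle_register(&mut self) {
        if self.h {
            self.m[self.p] = self.r;
        } else {
            self.r = self.m[self.p];
        }
        self.h = !self.h;
    }
}
```

With wrapping_add, incrementing a cell that holds i32::MAX produces i32::MIN in every build profile, which is the determinism the explicit wrapping_* calls are there to provide.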

5) Loop matching quirks

Loop matching in cowcomp.cpp is done by scanning through the instruction vector with a nesting counter, but it has a few peculiarities:

For moo (case 0), it “skips previous command” before scanning backward, and it breaks when reaching the beginning without inspecting index 0.
For MOO (case 7), it “skips next command” when scanning forward, and it decrements nesting twice when a moo immediately follows a MOO (prev == 7 special case).

cowc.rs mirrors these behaviors in:

match_for_moo_back()
match_for_moo_forward()

Why “virtual” matches?

mOO (eval) can dynamically execute moo/MOO relative to the current program counter. The C++ compiler variant implements this by calling compile(op, false) at the current position, reusing the same scanning behavior.

To match that cleanly, cowc.rs precomputes match results for every instruction index for both directions.
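
One possible reading of the backward scan, together with the per-index precomputation, is sketched below. The names match_back_sketch and precompute_back_sketch are hypothetical, the quirk handling follows the prose in this section rather than cowcomp.cpp itself, returning None stands in for whatever the real code does when no match is found, and the forward scan is left out.

```rust
const MOO_END: u8 = 0; // `moo`
const MOO_START: u8 = 7; // `MOO`

/// Backward scan for a `moo` executed (really or virtually) at index `at`.
/// Returns the index where nesting reaches zero, or None if it never does.
fn match_back_sketch(ops: &[u8], at: usize) -> Option<usize> {
    if at == 0 {
        return None; // nothing before the first instruction
    }
    let mut pos = at - 1; // "skip previous command": never inspected
    let mut level = 1u32;
    while level > 0 {
        if pos == 0 {
            break; // reached the beginning; index `pos` is not inspected here
        }
        pos -= 1;
        match ops[pos] {
            MOO_END => level += 1,
            MOO_START => level -= 1,
            _ => {}
        }
    }
    if level == 0 { Some(pos) } else { None }
}

/// Precompute the backward result for every index, so that an eval'd `moo`
/// at any program counter can look its target up directly.
fn precompute_back_sketch(ops: &[u8]) -> Vec<Option<usize>> {
    (0..ops.len()).map(|i| match_back_sketch(ops, i)).collect()
}
```

Under this reading, the program `MOO moo` has no backward match for its `moo`: the skipped instruction is the `MOO` at index 0, and the scan stops before looking at it. The program `MOO MoO moo` does match, because the skipped instruction is the `MoO`.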

6) `mOO` (eval) semantics

In cowcomp.cpp:

mOO emits a switch(*p) with cases 0..2 and 4..11
it deliberately omits case 3; value 3 falls into default: goto x; (exit)
unknown values also goto x; (exit)

cowc.rs matches that:

cell value 3 exits
unknown values exit
otherwise it performs the selected instruction’s effect. The program counter does not advance during the eval itself unless the evaluated instruction causes a jump; afterwards, execution continues with the next instruction.
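
The case selection can be sketched as a small classification function. The names EvalSketch and eval_target_sketch are hypothetical; only the value ranges come from the description above.

```rust
/// What an eval of the current cell value should do.
#[derive(Debug, PartialEq)]
enum EvalSketch {
    /// Cell value 3 or any value outside 0..=11: leave the program.
    Exit,
    /// Perform the effect of the instruction with this opcode ID.
    Run(u8),
}

/// Mirrors the switch shape described above: cases 0..2 and 4..11,
/// with everything else (including 3) falling through to exit.
fn eval_target_sketch(cell: i32) -> EvalSketch {
    match cell {
        0..=2 | 4..=11 => EvalSketch::Run(cell as u8),
        _ => EvalSketch::Exit,
    }
}
```

Treating 3 like any unknown value keeps the exit path in a single default arm, which is the same shape as the generated `default: goto x;`.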

7) I/O semantics

`Moo`

Matches C++ compiler variant behavior:

if cell != 0: output as a byte (putchar(*p))
else: read one byte and then flush until newline

`oom`

The reference reads up to 99 chars into a fixed buffer, then calls atoi. It also tries to flush on overflow, but the condition never triggers (a small bug). cowc.rs intentionally preserves this: it reads at most 99 bytes or until newline and does not flush extra input.
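
A sketch of that read-then-atoi behavior is given below. The function names are hypothetical, the input is taken through a generic Read so the example is easy to exercise, and the wrapping arithmetic on overflow is an assumption (C's atoi leaves overflow undefined).

```rust
use std::io::Read;

/// `atoi`-style parse: skip leading whitespace, accept one optional sign,
/// then consume digits until the first non-digit. Returns 0 if no digits.
fn atoi_sketch(bytes: &[u8]) -> i32 {
    let mut i = 0;
    while i < bytes.len() && matches!(bytes[i], b' ' | b'\t' | b'\n' | b'\r' | 0x0b | 0x0c) {
        i += 1;
    }
    let mut negative = false;
    if i < bytes.len() && (bytes[i] == b'+' || bytes[i] == b'-') {
        negative = bytes[i] == b'-';
        i += 1;
    }
    let mut value: i32 = 0;
    while i < bytes.len() && bytes[i].is_ascii_digit() {
        // Assumption: wrap on overflow instead of leaving it undefined.
        value = value.wrapping_mul(10).wrapping_add((bytes[i] - b'0') as i32);
        i += 1;
    }
    if negative { value.wrapping_neg() } else { value }
}

/// Read at most 99 bytes or up to (and including) a newline, then parse.
/// Input beyond the 99th byte is deliberately left unread.
fn read_int_line_sketch<R: Read>(input: &mut R) -> i32 {
    let mut buf = Vec::with_capacity(99);
    let mut byte = [0u8; 1];
    while buf.len() < 99 {
        match input.read(&mut byte) {
            Ok(1) => {
                if byte[0] == b'\n' {
                    break; // newline is consumed but not stored
                }
                buf.push(byte[0]);
            }
            _ => break, // end of input or read error
        }
    }
    atoi_sketch(&buf)
}
```

Because nothing is flushed after a 99-byte read, the remainder of an overlong line is seen by the next read as if it were a new line.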

8) “Fully Rust” output: why a `match pc` dispatch loop?

The original cowcomp.cpp emits gotos and relies on g++ -O3 to build a fast binary.

Rust does not have goto, but the closest equivalent that optimizes well is:

loop {
  match pc {
    0 => { ... pc = 1; continue; }
    1 => { ... pc = 2; continue; }
    ...
    _ => break
  }
}

With -C opt-level=3, this typically becomes a compact jump table plus tight blocks. It avoids interpreter overhead and stays close to the C++ compiler’s control-flow shape.
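
To make the shape concrete, here is a hand-written sketch of what such a loop could look like for the seven-instruction program `MoO MoO MoO MOO OOM MOo moo`. It is not actual cowc output: the function name is hypothetical, the jump targets were worked out by hand for this program, and printed integers are collected into a Vec instead of being written to stdout.

```rust
/// Hand-written illustration of a `match pc` dispatch loop.
/// Returns the integers that OOM would have printed.
fn run_countdown_sketch() -> Vec<i32> {
    let mut m: Vec<i32> = vec![0];
    let p: usize = 0; // the pointer never moves in this program
    let mut printed = Vec::new();
    let mut pc: usize = 0;
    loop {
        match pc {
            // MoO x3: increment the current cell.
            0 | 1 | 2 => {
                m[p] = m[p].wrapping_add(1);
                pc += 1;
            }
            // MOO: skip past the matching moo when the cell is zero.
            3 => {
                pc = if m[p] == 0 { 7 } else { 4 };
            }
            // OOM: print the cell as an integer.
            4 => {
                printed.push(m[p]);
                pc = 5;
            }
            // MOo: decrement the current cell.
            5 => {
                m[p] = m[p].wrapping_sub(1);
                pc = 6;
            }
            // moo: jump back to the matching MOO, which re-tests the cell.
            6 => {
                pc = 3;
            }
            // Falling off the end of the program terminates it.
            _ => break,
        }
    }
    printed
}
```

Each arm is a straight-line block that ends by assigning the next program counter, which plays the role of the goto in the generated C++.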

9) Performance choices

cowc.rs emits Rust that is tuned for speed:

Chunked stdin buffering: reads from stdin on demand (does not block at startup on interactive consoles).
Buffered stdout: appends output to a Vec<u8> and writes it out once. Buffering is now optional behind a flag, after it caused problems such as issue #1.
Wrapping arithmetic: wrapping_add/sub keeps semantics stable and avoids debug-vs-release surprises.
Unsafe cell access: get_unchecked removes bounds checks in the hot path (safe because pointer growth is guarded).
Rustc flags:
- -C opt-level=3
- -C codegen-units=1 (better optimization at the cost of compile time)
- -C panic=abort (smaller + faster)
- optional --lto (-C lto=fat)
- optional native CPU (-C target-cpu=native) for host builds
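
Two of these choices, unchecked cell access behind guarded pointer growth and optionally buffered output, can be sketched as follows. All names here are hypothetical, the output type is generic over Write so it can be tested without a console, and the real flag that toggles buffering is not shown because its name is not given in this document.

```rust
use std::io::{self, Write};

/// moO: move right, growing the tape so that `p` always stays in bounds.
fn move_right_sketch(m: &mut Vec<i32>, p: &mut usize) {
    *p += 1;
    if *p == m.len() {
        m.push(0);
    }
}

/// MoO in the hot path: increment without a bounds check.
fn inc_unchecked_sketch(m: &mut [i32], p: usize) {
    debug_assert!(p < m.len());
    // SAFETY: callers must keep p < m.len(); in this sketch `p` only
    // grows through move_right_sketch, which extends the tape first.
    unsafe {
        let cell = m.get_unchecked_mut(p);
        *cell = cell.wrapping_add(1);
    }
}

/// Output sink that either collects bytes for one final write
/// or writes and flushes every byte immediately.
struct OutSketch<W: Write> {
    sink: W,
    buffered: bool,
    pending: Vec<u8>,
}

impl<W: Write> OutSketch<W> {
    fn new(sink: W, buffered: bool) -> Self {
        OutSketch { sink, buffered, pending: Vec::new() }
    }

    /// Emit one byte, honoring the buffering mode.
    fn put(&mut self, byte: u8) -> io::Result<()> {
        if self.buffered {
            self.pending.push(byte);
            Ok(())
        } else {
            self.sink.write_all(&[byte])?;
            self.sink.flush()
        }
    }

    /// Write out anything still pending; call once before exiting.
    fn finish(&mut self) -> io::Result<()> {
        self.sink.write_all(&self.pending)?;
        self.pending.clear();
        self.sink.flush()
    }
}
```

In buffered mode nothing reaches the sink until finish() runs, which is fast for batch programs but means an interactive program shows no output while it is still waiting for input; the unbuffered mode trades throughput for immediate visibility.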

10) Comparing cowc.rs to the C++ files

vs `cow.cpp` (interpreter)

cow.cpp dispatches at runtime via a function / switch per step
cowc.rs produces ahead-of-time code with a PC jump table
Result: compiled output is generally much faster on loop-heavy programs

vs `cowcomp.cpp` (compiler variant)

cowcomp.cpp emits gotos in generated C++
cowc.rs emits a match pc loop in generated Rust
Both produce straight-line blocks with explicit jumps
cowc.rs removes the dependency on an external C/C++ toolchain (clang/g++)

Details were documented using the AIGEN toolset.
