Spark
A novel neural network built from scratch in C++ and Metal. No frameworks, no PyTorch. Trained on a single laptop.
Key Results
2.92
Val BPC
20 min
Wall Time
423K
Parameters
M1 Pro
Hardware
Timeline
Feb 16, 2026
Project started
First commit. C++ engine, Metal compute shaders, custom memory allocator.
Apr 9, 2026
First published results: 2.92 val BPC. 392 training runs, 207 merged PRs, 53 days of development.
April 2026 Results
Apr 9, 2026
Setup
Task
Character-level language modeling
Dataset
WikiText-103 (raw character stream)
Hardware
Apple M1 Pro, 16 GB unified memory
Architecture
Novel architecture (no attention, no convolution)
Results
Spark is a neural network engine written from scratch in C++ with Metal compute shaders for GPU acceleration on Apple Silicon. It implements a novel architecture where the network topology itself is a learned artifact: the structure evolves during training rather than being fixed upfront.
There are no attention layers, no convolution, no skip connections borrowed from existing architectures. Every component, from the memory allocator to the GPU kernels, was built from scratch in C++ and Metal by a solo researcher over 53 days of nights-and-weekends work. No frameworks were used.
On the WikiText-103 character-level language modeling benchmark, the best run reached 2.92 validation bits-per-character (BPC) in under 20 minutes of wall time, with the network growing to 423K parameters across 2,841 hidden units and a depth of 5 layers. The entire training run executed on a single M1 Pro laptop.
Over the course of the project, 392 training runs were executed and 207 pull requests were merged.
Learning Curve
| Wall Time | Train BPC | Val BPC |
|---|---|---|
| 0s | 6.64 | 6.62 |
| 1m 12s | 4.19 | 4.22 |
| 3m 02s | 3.71 | 3.74 |
| 5m 18s | 3.48 | 3.51 |
| 7m 45s | 3.32 | 3.36 |
| 10m 10s | 3.19 | 3.23 |
| 12m 38s | 3.10 | 3.14 |
| 15m 05s | 3.02 | 3.06 |
| 17m 30s | 2.95 | 2.99 |
| 19m 53s | 2.89 | 2.93 |
Best validation BPC: 2.92 (measured at final evaluation checkpoint)
Final Network Statistics
225,749
Edges
2,841
Hidden Units
5
Depth (Layers)
423K
Parameters
Articles
Sub-500K Parameters on WikiText-103 Char-Level
What can 423K parameters achieve? 2.92 BPC in 20 minutes on a single M1 Pro laptop.
Why Atomic-Free Sparse Backward Passes Are Slower on Metal
Eliminating atomics sounds like a win. On Apple Metal, it's a 2.8x slowdown.
Metal Compute Shader Patterns for Sparse ML Training
Zero academic literature exists on Metal ML kernels. Here's what we learned building one from scratch.