How I Started Recognizing Imaginary Creatures: My First Local ML Prototype
After I built a small system that could recognize birds locally - without cloud APIs, without expensive GPUs, running on a simple Intel NUC with 8 GB RAM - I became fascinated with how fast modern models can run even on modest hardware.
I’ve always wanted to create solutions that are fully independent of external cloud services. And that bird project gave me the motivation to push further.
That’s the idea behind my new experiment: teach a model to recognize a drawing of a non-existent, imaginary creature and automatically analyze its features.
The project turned out to be much more insightful and fun than I expected. It combines computer vision, preprocessing, dataset building, neural networks - basically a miniature ML pipeline, but in a playful format.
For the early prototype, I gathered several dozen drawings of fantasy animals from open sources. Each drawing was manually annotated with a set of binary features.
Here’s the first version of the feature list:
```json
{
  "image": "f3.jpg",
  "has_wings": 0,
  "has_tail": 1,
  "has_horns": 0,
  "has_big_teeth": 0,
  "has_spikes": 0,
  "has_armor": 0,
  "has_big_eyes": 0,
  "is_colorful": 0,
  "is_detailed": 0,
  "has_patterns": 0,
  "environment_flying": 0,
  "environment_water": 0,
  "environment_forest": 1,
  "environment_space": 0,
  "looks_friendly": 1,
  "looks_magical": 0
}
```
This list will grow, but it was enough to build the first working pipeline.
For the prototype, I chose ResNet18 from torchvision because it’s lightweight, fast, and great for experimentation. The final fully connected layer was replaced with a custom head that outputs as many values as there are features.
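For reference, the head swap looks roughly like this. It’s a minimal sketch, not the project’s exact code: the feature count of 16 simply matches the annotation keys above, and loading pretrained ImageNet weights is my assumption.

```python
import torch.nn as nn
from torchvision import models

NUM_FEATURES = 16  # one output per binary feature in the annotation list

# Pretrained ImageNet weights are an assumption; requires torchvision >= 0.13
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Replace the final fully connected layer with a multi-label head
model.fc = nn.Linear(model.fc.in_features, NUM_FEATURES)
```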
Before training, all images were resized to 256×256, normalized, and organized into a dataset paired with JSON annotations (about 20 drawings in the first iteration).
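A dataset along these lines could look like the sketch below. The feature order, the annotations file layout (a JSON list of records like the one shown earlier), and the ImageNet normalization stats are all assumptions on my part:

```python
import json
from pathlib import Path

import torch
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

# Feature order assumed to match the annotation keys shown above
FEATURE_KEYS = [
    "has_wings", "has_tail", "has_horns", "has_big_teeth",
    "has_spikes", "has_armor", "has_big_eyes", "is_colorful",
    "is_detailed", "has_patterns", "environment_flying",
    "environment_water", "environment_forest", "environment_space",
    "looks_friendly", "looks_magical",
]

TRANSFORM = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),  # ImageNet stats (assumption)
])

class CreatureDataset(Dataset):
    """Pairs each drawing with its vector of binary features."""

    def __init__(self, annotations_path: str, images_dir: str):
        # Assumes the annotations file is a JSON list of records
        self.records = json.loads(Path(annotations_path).read_text())
        self.images_dir = Path(images_dir)

    def __len__(self):
        return len(self.records)

    def __getitem__(self, idx):
        record = self.records[idx]
        image = Image.open(self.images_dir / record["image"]).convert("RGB")
        labels = torch.tensor([float(record[k]) for k in FEATURE_KEYS])
        return TRANSFORM(image), labels
```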
Training was straightforward: a standard DataLoader, ResNet18 as the backbone, several dozen epochs, BCE (or MSE) as the loss function, the Adam optimizer, and weight saving to a .pth file. Nothing exotic, but surprisingly effective.
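A training loop in that spirit, reusing the `model` and `CreatureDataset` sketches above; the batch size, learning rate, epoch count, and file paths are placeholders of mine:

```python
import torch
from torch.utils.data import DataLoader

dataset = CreatureDataset("annotations.json", "images/")  # hypothetical paths
loader = DataLoader(dataset, batch_size=8, shuffle=True)

# BCEWithLogitsLoss is BCE with a built-in sigmoid, the usual choice
# for a multi-label head that outputs raw logits
criterion = torch.nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

model.train()
for epoch in range(40):  # "several dozen epochs"
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

torch.save(model.state_dict(), "creature_model.pth")
```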
The second file is a small command-line tool that takes a path to a JPG file, runs it through the model, outputs the detected features, and attempts a short interpretation.
Something like:
```
tail: yes
wings: no
friendly: yes
forest creature: yes
```
…and then a short “psychological profile” based on the combination of features - partly serious, partly playful.
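A minimal version of such a tool might look like this, reusing the names from the earlier sketches; the 0.5 decision threshold is my assumption:

```python
import sys

import torch
from PIL import Image

# Reuses FEATURE_KEYS, TRANSFORM, and the model definition from above
model.load_state_dict(torch.load("creature_model.pth", map_location="cpu"))
model.eval()

image = Image.open(sys.argv[1]).convert("RGB")
with torch.no_grad():
    probs = torch.sigmoid(model(TRANSFORM(image).unsqueeze(0)))[0]

for key, p in zip(FEATURE_KEYS, probs):
    print(f"{key}: {'yes' if p > 0.5 else 'no'}")  # 0.5 threshold is arbitrary
```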
Even with this minimal setup - just two files - the prototype works surprisingly well. And most importantly, it runs entirely locally: no cloud, no API keys, no external inference, just a small model and some logic.
I’m now working on a more advanced version with better preprocessing, a richer feature dictionary, and a more robust model.
As for what I plan to do with this system next… I’ll leave a bit of intrigue for another post.