Developer Frameworks

High-performance, lightweight libraries optimized for local inference and edge hardware.

AUDIO ENGINE

Nanowakeword

Install package locally:

$ pip install nanowakeword

Train custom model:

$ nanowakeword -c ./config.yaml

An automated training toolkit designed for deploying production-grade, low-latency custom wake words directly onto edge devices and microcontrollers. Nanowakeword automates audio data synthesis, synthetic background noise mixing, and neural model quantization.

  • Production-ready keyword spotting (KWS) models in minutes
  • Advanced synthetic voice generation for negative data curation
  • Ultra-lightweight model exports optimized for ARM Cortex processors
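
The background noise mixing step described above can be illustrated with a small sketch. This is not Nanowakeword's own code or API: the function name `mix_at_snr`, the 16 kHz sample rate, and the sine-wave stand-in for speech are all assumptions made for illustration. It shows one common way to mix a noise clip into a speech clip at a chosen signal-to-noise ratio (SNR).

```python
import numpy as np

def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Mix noise into speech so the result has the requested SNR in decibels."""
    # Loop or trim the noise so it matches the speech length.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if speech_power == 0 or noise_power == 0:
        return speech.copy()  # nothing to scale against

    # Pick the gain so that speech_power / (gain**2 * noise_power) == 10**(snr_db / 10).
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + gain * noise

# Hypothetical inputs: a 1-second tone stands in for a spoken wake word.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
speech = 0.5 * np.sin(2 * np.pi * 440 * t)
noise = rng.normal(0.0, 1.0, 8000)  # stand-in for a background noise clip
mixed = mix_at_snr(speech, noise, snr_db=10.0)
```

Sweeping `snr_db` over a range of values is a simple way to produce training clips of varying difficulty from the same pair of recordings.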
NLP & LOGIC

Phonemize

Install package locally:

$ pip install phonemize

A lightweight, zero-dependency, pure-Python library that converts raw text into phonetic transcriptions in IPA. Essential for building high-fidelity Text-to-Speech (TTS) frontends and NLP text processing pipelines.

  • Accurate International Phonetic Alphabet (IPA) conversions
  • No external C binaries or compiled system dependencies required
  • Pre-trained multilingual phonetic mapping tables
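
To make the idea of a phonetic mapping table concrete, here is a minimal sketch of dictionary-based text-to-IPA lookup. It does not use or reproduce the Phonemize API: `TOY_LEXICON`, `to_ipa`, and the angle-bracket fallback for unknown words are hypothetical, and the transcriptions are illustrative only.

```python
import re

# Hypothetical toy lexicon; a real mapping table would be far larger.
TOY_LEXICON = {
    "hello": "həˈloʊ",
    "world": "wɜːld",
    "speech": "spiːtʃ",
}

def to_ipa(text: str, lexicon: dict[str, str]) -> str:
    """Convert text to IPA word by word; unknown words are kept in angle brackets."""
    words = re.findall(r"[a-z']+", text.lower())
    return " ".join(lexicon.get(word, f"<{word}>") for word in words)

result = to_ipa("Hello, world!", TOY_LEXICON)
```

Because the lookup is a plain dictionary access, the sketch needs only the standard library, which is the property the zero-dependency claim refers to.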

Research Datasets Hub

Our active machine learning datasets, hosted on Hugging Face.

arcosoph/SonicWeave-v2
Size: 12.4 GB
Format: 48 kHz stereo WAV
License: CC-BY-4.0

Academic Preprints

Read the methodology, mathematical formulations, and engineering optimizations behind our open-source tools.

PREPRINT arXiv:2603.01024

Optimizing Neural Keyword Spotting Models for Low-Power ARM Microcontrollers

Muhammad Abid, Sarah Chen, Arcosoph AI Team

We describe a custom training pipeline that compresses and compiles wake-word speech models. By combining synthetic soundscape augmentation with post-training 8-bit integer (INT8) quantization, we achieve robust wake-word recognition in under 15 KB of on-device SRAM.

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\tfrac{1}{2}\,\frac{Q K^{\top}}{\sqrt{d_k}}\right) V$$
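
The post-training 8-bit quantization mentioned in the abstract can be sketched as follows. This is an assumption-laden illustration, not the preprint's method: it uses symmetric per-tensor quantization with a single scale, and the names `quantize_int8` and `dequantize` and the random weight matrix are hypothetical.

```python
import numpy as np

def quantize_int8(weights: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor quantization: float32 weights -> int8 values plus one scale."""
    max_abs = float(np.max(np.abs(weights)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 weights from int8 values."""
    return q.astype(np.float32) * scale

# Hypothetical layer weights standing in for a trained model.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.1, size=(64, 32)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
max_error = float(np.max(np.abs(w - w_hat)))
```

Storing `q` instead of `w` cuts the weight footprint to a quarter, and the rounding error per weight is bounded by about half of `scale`.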
DOCUMENT arXiv:2602.04910

High-Speed IPA Phonetic Conversions: Pure-Python Frontends for Speech Pipelines

Arcosoph AI Team

This paper presents a zero-dependency phonetic translation toolkit that converts multilingual orthographic text to the International Phonetic Alphabet (IPA). We demonstrate a 40% reduction in inference latency using native dictionary mappings.

$$P(\Phi \mid W) = \prod_{i} P(\varphi_i \mid w_i, \varphi_{i-1})$$
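
A small worked example of the factorization above, under stated assumptions: each w_i is treated here as a single grapheme aligned one-to-one with a phoneme, the conditional table `COND` holds made-up probabilities, and `START` is a hypothetical placeholder for the phoneme before the first position. None of these names come from the paper.

```python
import math

START = "<s>"  # placeholder for the phoneme preceding the first position

# Hypothetical toy table: (grapheme w_i, previous phoneme) -> {phoneme: probability}
COND = {
    ("c", START): {"k": 0.8, "s": 0.2},
    ("a", "k"): {"æ": 0.7, "eɪ": 0.3},
    ("t", "æ"): {"t": 1.0},
}

def sequence_probability(graphemes, phonemes, table):
    """Multiply P(phi_i | w_i, phi_{i-1}) over aligned grapheme/phoneme pairs."""
    if len(graphemes) != len(phonemes):
        raise ValueError("graphemes and phonemes must be aligned one-to-one")
    log_p = 0.0
    prev = START
    for w, phi in zip(graphemes, phonemes):
        p = table.get((w, prev), {}).get(phi, 0.0)
        if p == 0.0:
            return 0.0  # unseen transition: the whole product is zero
        log_p += math.log(p)  # sum logs to avoid underflow on long sequences
        prev = phi
    return math.exp(log_p)

p_cat = sequence_probability("cat", ["k", "æ", "t"], COND)  # 0.8 * 0.7 * 1.0
```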
ARTICLE ARCOSOPH-2026

Active Loss-Weighted Sampling for Reinforcement Learning Alignment from Rationales

Sarah Chen, Muhammad Abid

We present a dynamic sampling strategy for alignment datasets. By grouping prompt logic trees into clusters C_k and sampling each input x_i in proportion to its historical loss, we achieve faster training convergence on multi-step reasoning tasks.

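$$P(x_i \mid x_i \in C_k, t) = \frac{\left(L_i^{(t-1)}\right)^{\alpha} + \varepsilon}{\sum_{x_j \in C_k} \left[\left(L_j^{(t-1)}\right)^{\alpha} + \varepsilon\right]}$$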
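
The sampling rule above can be sketched in a few lines. This is an illustrative reading of the formula, not the authors' implementation: the function name `sampling_probabilities`, the example losses, and the values chosen for `alpha` and `eps` are assumptions, and losses are assumed to be non-negative.

```python
import random

def sampling_probabilities(prev_losses, alpha=1.0, eps=1e-3):
    """Within one cluster C_k, weight each example by (loss ** alpha + eps) and normalize."""
    weights = [loss ** alpha + eps for loss in prev_losses]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical previous-step losses for three examples in the same cluster.
cluster_losses = [2.0, 1.0, 0.0]
probs = sampling_probabilities(cluster_losses, alpha=1.0, eps=0.5)

# Draw a small batch of example indices according to those probabilities.
rng = random.Random(0)
batch = rng.choices(range(len(cluster_losses)), weights=probs, k=4)
```

The `eps` term keeps examples with zero loss in rotation, while `alpha` controls how sharply sampling concentrates on high-loss examples.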