Developer Frameworks

High-performance, lightweight libraries optimized for local inference and edge hardware.

AUDIO ENGINE

Nanowakeword

Install package locally:

$ pip install nanowakeword

Train custom model:

$ nanowakeword -c ./config.yaml

An automated training toolkit for building production-grade, low-latency custom wake-word models and deploying them directly onto edge devices and microcontrollers. Nanowakeword automates audio data synthesis, background-noise mixing, and neural model quantization.
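To make the noise-mixing step concrete, here is a minimal standard-library sketch of mixing a background clip into a speech clip at a chosen signal-to-noise ratio. It is an illustration only and not Nanowakeword's own code or API: the function names `rms` and `mix_at_snr`, the toy sine-wave signals, and the 10 dB target are all assumptions made for the example.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so the speech-to-noise RMS ratio equals snr_db (in dB)."""
    if len(noise) < len(speech):
        raise ValueError("noise clip must be at least as long as the speech clip")
    noise = noise[:len(speech)]
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        # Silent noise clip: nothing to mix in.
        return list(speech)
    # Noise level implied by the requested SNR, then the gain that reaches it.
    target_noise_rms = rms(speech) / (10 ** (snr_db / 20))
    gain = target_noise_rms / noise_rms
    return [s + gain * n for s, n in zip(speech, noise)]


# Toy signals (hypothetical stand-ins for real recordings), 0.1 s at 16 kHz.
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(1600)]
noise = [math.sin(2 * math.pi * 1300 * t / 16000 + 0.5) for t in range(1600)]
mixed = mix_at_snr(speech, noise, snr_db=10)
```

A training pipeline would typically draw the SNR at random per example so the model hears the wake word under a range of noise conditions.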

GitHub Repository →

NLP & LOGIC

Phonemize

Install package locally:

$ pip install phonemize

A lightweight, zero-dependency, pure-Python library that converts raw text into International Phonetic Alphabet (IPA) transcriptions. Essential for building high-fidelity Text-to-Speech (TTS) frontends and NLP text-processing pipelines.

Get Started →

Download Model

Research Datasets Hub

A live feed of our active machine learning datasets on Hugging Face.

arcosoph/SonicWeave-v2
Size: 12.4 GB
Format: 48 kHz stereo WAV
License: CC-BY-4.0

Academic Preprints

Read the methodology, mathematical formulations, and engineering optimizations behind our open-source tools.

PREPRINT · arXiv:2603.01024

Optimizing Neural Keyword Spotting Models for Low-Power ARM Microcontrollers

Muhammad Abid, Sarah Chen, Arcosoph AI Team

We describe a custom training pipeline that compresses and compiles wake-word models. By combining synthetic soundscape augmentation with post-training 8-bit integer (INT8) quantization, we achieve robust wake-word recognition in under 15 KB of on-device SRAM.
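As a rough illustration of what post-training 8-bit quantization does to a weight tensor, the sketch below maps floats to signed 8-bit codes with a single scale factor and back again. The symmetric, per-tensor scheme and the names `quantize_int8` and `dequantize_int8` are assumptions chosen for simplicity; the preprint's actual quantization scheme is not specified here.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to integer codes in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate float weights from the integer codes."""
    return [c * scale for c in codes]


# Hypothetical weights, for illustration only.
weights = [0.42, -1.27, 0.003, 0.9, -0.55]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)
```

Each code fits in one byte instead of four, which is where the roughly 4x size reduction over 32-bit floats comes from; the price is a rounding error of at most half the scale per weight.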

Read Full Preprint →

DOCUMENT · arXiv:2602.04910

High-Speed IPA Phonetic Conversions: Pure-Python Frontends for Speech Pipelines

Arcosoph AI Team

This paper presents a zero-dependency phonetic conversion toolkit that converts multilingual orthographic text to the International Phonetic Alphabet (IPA). We demonstrate a 40% reduction in inference latency using native dictionary mappings.
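The idea of a dictionary-backed frontend can be sketched in a few lines of pure Python: tokenize the text, then look each word up in an in-memory mapping. This is not the toolkit's API. The three-entry `LEXICON`, the `to_ipa` function, and the `?` placeholder for unknown words are hypothetical and exist only to show the lookup pattern.

```python
import re

# Hypothetical mini-lexicon (British English transcriptions);
# a real frontend would ship a far larger dictionary.
LEXICON = {
    "hello": "həˈləʊ",
    "world": "wɜːld",
    "speech": "spiːtʃ",
}


def to_ipa(text, lexicon=LEXICON, unknown="?"):
    """Transcribe text word by word using a plain dictionary lookup."""
    words = re.findall(r"[a-z']+", text.lower())
    # Each lookup is a constant-time hash access; unknown words get a placeholder.
    return " ".join(lexicon.get(word, unknown) for word in words)


transcription = to_ipa("Hello, world!")
```

A production system would also need a fallback for out-of-vocabulary words, such as letter-to-sound rules, which this sketch deliberately leaves out.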

Read Documentation →

ARTICLE · ARCOSOPH-2026

Active Loss-Weighted Sampling for Reinforcement Learning Alignment from Rationales

Sarah Chen, Muhammad Abid

We present a dynamic sampling strategy for alignment datasets. By clustering prompt logic trees C_k and sampling inputs x_i with probability proportional to their historical losses, we achieve faster training convergence on multi-step reasoning tasks.
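The loss-proportional part of this strategy can be illustrated with a small sketch in which each input x_i is drawn with probability p_i = l_i / sum_j l_j, where l_i is its historical loss. The clustering over C_k is omitted, and the function names, the small probability floor, and the example losses are assumptions made for illustration rather than details taken from the article.

```python
import random


def sampling_probabilities(losses, floor=1e-8):
    """Return p_i = l_i / sum_j l_j, with a small floor so no input is starved."""
    clipped = [max(loss, floor) for loss in losses]
    total = sum(clipped)
    return [loss / total for loss in clipped]


def sample_batch(examples, losses, batch_size, rng=random):
    """Draw a batch with replacement, favoring inputs with higher historical loss."""
    probs = sampling_probabilities(losses)
    return rng.choices(examples, weights=probs, k=batch_size)


# Hypothetical inputs and their running-average losses.
examples = ["x_0", "x_1", "x_2", "x_3"]
historical_losses = [0.1, 0.4, 0.2, 1.3]
probs = sampling_probabilities(historical_losses)
batch = sample_batch(examples, historical_losses, batch_size=8, rng=random.Random(0))
```

In a training loop the stored losses would be refreshed after each step, so the sampling distribution shifts toward whichever inputs the model currently handles worst.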

Read Research Article →