Sovereign AI Hosting for the Fediverse
Netalia, an ACN-qualified Public Cloud provider, proposes NixInfer: a project to build the "missing link" in the Fediverse hosting stack—sovereign, privacy-preserving AI.
Recent academic analysis identifies "dependency hell" as a primary barrier to machine learning reproducibility: incompatible version constraints force researchers and self-hosters into time-consuming manual resolution. At the same time, Fediverse admins who want features like semantic search are pushed toward US-based APIs (e.g., OpenAI), which conflicts with the EU data residency principles championed by the European Sovereign Cloud initiative.
NixInfer solves this by delivering a NixOS-based AI Service Module that enables European ISPs to offer AI-enhanced services that are fully GDPR-compliant, running entirely on sovereign infrastructure.
Netalia uses its NVIDIA H200 infrastructure as a "Build Foundry," applying 4-bit quantization (which typically reduces VRAM usage by roughly 3-4x) to heavy models and exposing them via reproducible Nix Binary Caches.
Pre-quantized models from Netalia's H200s, served via a public Binary Cache.
No Docker. No dependency hell. Just services.nixinfer.enable = true.
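A minimal NixOS configuration might look like the sketch below. Only `services.nixinfer.enable` comes from this proposal; the other option names (`model`, `listenAddress`) are illustrative, not a final API.

```nix
# Hypothetical NixOS configuration for a Fediverse host.
# Only `enable` is taken from the proposal; other options are illustrative.
{ config, pkgs, ... }:
{
  services.nixinfer = {
    enable = true;
    # Pull a pre-quantized model from the public binary cache
    model = "llama-3-8b-q4";      # illustrative model identifier
    # Keep inference local to the instance; no third-party APIs involved
    listenAddress = "127.0.0.1";
  };
}
```

Because the module is declarative, `nixos-rebuild switch` would fetch the pre-built closure from the cache rather than compiling PyTorch locally.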
Declarative packaging pins every dependency for reproducible builds, solving the "dependency hell" problem that plagues ML deployments.
We implement GraphRAG, which recent benchmarks show outperforming standard Vector RAG by 4-11% on complex reasoning tasks, enabling Small Language Models (SLMs) to approach flagship-model quality on consumer hardware.
Smarter than Vector Search. Maps ActivityPub structure into Knowledge Graphs.
Incompatible version constraints force researchers and self-hosters into time-consuming manual resolution, breaking reproducibility.
Fediverse admins are forced to rely on US-based APIs (e.g., OpenAI) for features like semantic search, violating EU data residency principles.
Most open-source tools use Vector RAG, which struggles with multi-hop reasoning tasks compared to graph-native approaches.
By packaging AI as a NixOS service, intelligence lives inside the European data center. Netalia proves that European providers can offer value beyond simple storage by becoming the "Build Foundry" for the decentralized web.
We map the Fediverse (ActivityPub) structure into a Knowledge Graph. Pairing Small Language Models (like Llama-3-8B) with Knowledge Graphs compensates for their smaller parameter count, allowing efficient local models to approach the reasoning depth of proprietary giants.
We don't just release code; we release Binary Caches. We use our industrial capacity to do the heavy lifting, so the end-user downloads a pre-optimized binary. This mirrors impactful open-source strategies (like Hugging Face/ONNX) but brings them natively into the Nix ecosystem.
We lower the barrier to entry for "Smart Hosting," allowing European ISPs to offer AI-enhanced services that are fully GDPR-compliant, running on €10-20/month VPS instances.
Packaging AI runtimes (PyTorch/ONNX) in Nix is historically difficult: the build processes are non-standard, CUDA components are unfree, and platform-specific inconsistencies break reproducibility.
Solution: Netalia will engineer a Hardware-Aware Flake. The build system will detect the target architecture (NVIDIA GPU vs. CPU) and automatically fetch the correct closure from our Binary Cache.
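The flake structure could be sketched as follows. This is an assumption about the eventual design, not the implementation: `runtime.nix` and the `accel` parameter are hypothetical names, and the real flake would cover more systems.

```nix
# Sketch of a hardware-aware flake (structure and names are illustrative).
{
  description = "NixInfer inference runtime (sketch)";
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.05";

  outputs = { self, nixpkgs }:
    let
      # Build the runtime for a given system and accelerator backend.
      build = system: accel:
        let
          pkgs = import nixpkgs {
            inherit system;
            # CUDA libraries are unfree, so only allow them for GPU builds
            config.allowUnfree = accel == "cuda";
          };
        in
          pkgs.callPackage ./runtime.nix { inherit accel; };  # hypothetical file
    in {
      # One output per target; the binary cache serves whichever
      # pre-built closure matches the host's hardware.
      packages.x86_64-linux = {
        default = build "x86_64-linux" "cpu";
        cuda    = build "x86_64-linux" "cuda";
      };
    };
}
```

Because each backend is a distinct flake output with its own closure, a CPU-only VPS never downloads CUDA libraries, and the GPU variant is bit-identical to what the Build Foundry produced.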
Processing the "Firehose" of a Mastodon instance into a Knowledge Graph is computationally expensive.
Solution: We leverage recent NLP advancements in Incremental Graph Updates. The engine updates the knowledge graph asynchronously, without retraining the model, using lightweight extractors to keep the user experience responsive.
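Architecturally, the asynchronous updater could run as its own NixOS systemd unit, decoupled from the inference service so graph extraction never blocks query latency. A sketch under that assumption (the `pkgs.nixinfer` package and `graph-updater` binary are hypothetical):

```nix
# Illustrative: run the incremental graph updater separately from inference.
{ config, pkgs, ... }:
{
  systemd.services.nixinfer-graph-updater = {
    description = "Incremental ActivityPub knowledge-graph updater (sketch)";
    wantedBy = [ "multi-user.target" ];
    serviceConfig = {
      # Hypothetical binary name; consumes the instance firehose incrementally
      ExecStart = "${pkgs.nixinfer}/bin/graph-updater --incremental";
      Restart = "on-failure";
      # Deprioritize extraction so inference queries stay responsive
      Nice = 10;
    };
  };
}
```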
Our goal is to run this on €10-20/month VPS instances. Standard FP16 models are too RAM-heavy.
Solution: Industrial Quantization. We use Netalia's H200s to pre-quantize models into GGUF/AWQ formats (4-bit quantization typically cuts memory by roughly 3-4x), and we implement "Graph Pruning" to feed the model only the exact sub-graph needed for a query.
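The memory claim can be sanity-checked with back-of-envelope arithmetic on weights alone (KV cache and runtime overhead excluded), using the Llama-3-8B model mentioned above:

```latex
% Weight memory ~= parameter count x bytes per parameter,
% for an 8-billion-parameter model:
\begin{align*}
\text{FP16 (2 B/param):}  &\quad 8\times 10^{9} \times 2   \approx 16\ \text{GB} \\
\text{4-bit (0.5 B/param):} &\quad 8\times 10^{9} \times 0.5 \approx 4\ \text{GB}
\end{align*}
```

The theoretical 4x saving shrinks toward 3x in practice because quantization formats store per-group scale factors and some layers are kept at higher precision, which is consistent with the 3-4x figure cited above.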
cache.nixinfer.netalia.it
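Self-hosters would opt in to the cache with standard Nix substituter settings. The signing key below is a placeholder; a real deployment publishes its actual public key.

```nix
# Opt in to the NixInfer binary cache (public key is a placeholder).
{
  nix.settings = {
    substituters = [ "https://cache.nixinfer.netalia.it" ];
    trusted-public-keys = [ "cache.nixinfer.netalia.it-1:AAAA...=" ];
  };
}
```

Signed substituters mean users can verify that downloaded closures are exactly what the Build Foundry produced, without trusting the transport.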
NixInfer connects the European Cloud Industry, the NixOS Community, and the Fediverse.
Netalia will demonstrate NixInfer as a case study for Sovereign AI Hosting, promoting this solution to Public Administrations migrating to the Fediverse.
We will maintain the public cache and contribute the module upstream to nixpkgs, making "AI Moderation" a standard, built-in feature of the OS.
We will present results at NixCon and FOSDEM, showcasing how "Big Iron" (Netalia's H200s) can support "Small Tech" (Self-hosters) through Nix Build architecture.
Technical Partner • Italy
Leading Italian e-learning platform for AI and Data Science, with over 1,000 hours of exclusive content and expert instructors. Specializing in practical AI education and professional certification.
Provides R&D expertise for GraphRAG architectures and Nix packaging.
Lead Partner • Italy
ACN-qualified Public Cloud Service Provider, aligned with European Sovereign Cloud objectives. Lead partner for Mermaid-AI (PNRR) and architect of digital twin platforms for the City of Genoa.
Provides H200 GPU infrastructure and Build Foundry capacity.