naresh-cn2/Axiom-Zero-RAM-Extractor

Axiom Core: Zero-RAM CSV Extractor ⚡

A high-performance, hardware-aligned CSV column extractor written in pure C. Built to bypass the Python Pandas "Out-of-Memory" (OOM) ingestion bottleneck.

⚠️ The Problem

Data engineering teams frequently process massive CSV/NDJSON logs (10GB+). Calling pd.read_csv() with its default settings loads the entire dataset into RAM, which requires expensive memory-optimized AWS instances (e.g., r6g.xlarge) and causes frequent OOM crashes just to extract a few columns.

🛠️ The Axiom Solution

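This engine uses mmap (memory mapping) to map the file into the process's address space, so the kernel pages data in from the SSD on demand and the program never allocates a buffer for the whole file. It uses raw C pointers and a custom state machine to extract columns at speeds approaching the hardware limit.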

  • Includes a seamless Python wrapper (axiom_pandas_accelerator.py) so data engineers don't have to leave their native environment.
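The mmap-plus-state-machine idea described above can be sketched in a few lines of Python. This is a minimal illustration only, not the C engine or the code in axiom_pandas_accelerator.py: the function name `extract_columns`, its parameters, and the output layout are all assumptions made for this example, and a per-byte Python loop will be far slower than the C implementation.

```python
import mmap


def extract_columns(src_path, dst_path, wanted, delimiter=b",", quote=b'"'):
    """Copy the selected zero-based columns of src_path into dst_path.

    Hypothetical helper for illustration. Columns are written in file
    order, and quoted fields are copied through as raw bytes.
    """
    wanted = set(wanted)
    delim, quote_ch = delimiter[0], quote[0]
    newline, carriage_return = 0x0A, 0x0D

    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # Read-only mapping: the kernel pages data in on demand, so the
        # file is never copied into one large Python buffer.
        # Note: mmap raises ValueError for an empty file.
        with mmap.mmap(src.fileno(), 0, access=mmap.ACCESS_READ) as buf:
            size = len(buf)
            col = 0            # index of the field currently being scanned
            field_start = 0    # byte offset where that field begins
            emitted = 0        # fields already written for the current row
            in_quotes = False  # state: inside a quoted field?

            def emit(end):
                nonlocal emitted
                if col in wanted:
                    if emitted:
                        dst.write(delimiter)
                    dst.write(buf[field_start:end])
                    emitted += 1

            for pos in range(size):
                byte = buf[pos]
                if byte == quote_ch:
                    # An escaped quote ("") toggles twice, leaving the state unchanged.
                    in_quotes = not in_quotes
                elif not in_quotes and (byte == delim or byte == newline):
                    end = pos
                    # Drop the '\r' of a CRLF line ending.
                    if byte == newline and end > field_start and buf[end - 1] == carriage_return:
                        end -= 1
                    emit(end)
                    field_start = pos + 1
                    if byte == newline:
                        dst.write(b"\n")
                        col, emitted = 0, 0
                    else:
                        col += 1

            # Final row that is not terminated by a newline.
            if col > 0 or field_start < size:
                emit(size)
                dst.write(b"\n")
```

With this sketch, `extract_columns("huge_data.csv", "subset.csv", [0, 9])` would write columns 0 and 9 of each row to `subset.csv` (both file names are placeholders). Only the slices for the wanted fields are copied out of the mapping; everything else is scanned in place.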

📊 Benchmark (1GB CSV - 10 Million Rows)

Tested on: Acer Nitro 16 (Ryzen 7)

  • Pandas Baseline (Read-Only): 3.21 seconds ❌ (High RAM usage)
  • Axiom Engine (Read + Write): 1.23 seconds ✅ (Virtually Zero RAM)
  • Speedup: ~2.6x faster end-to-end execution, with a 99% smaller memory footprint.
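For anyone who wants to reproduce a comparison like this, a small timing harness is enough. The sketch below is an assumption about how such a measurement could be taken, not the script used for the numbers above; `best_time` and `speedup` are hypothetical helper names. Note that the two figures above cover different work (read-only versus read plus write), so a like-for-like run would time the same task on both sides.

```python
import time


def best_time(fn, *args, repeats=3):
    """Return the fastest wall-clock time, in seconds, over `repeats` calls to fn(*args)."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)
    return min(timings)


def speedup(baseline_seconds, candidate_seconds):
    """Ratio of baseline time to candidate time; values above 1 mean the candidate is faster."""
    return baseline_seconds / candidate_seconds


# Using the two timings quoted in the benchmark above.
print(f"{speedup(3.21, 1.23):.2f}x")  # prints "2.61x"
```

Taking the minimum over several runs reduces noise from disk caching and background processes, although the first run on a cold page cache may still be much slower than later ones.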

🚀 Usage (Python Wrapper)

from axiom_pandas_accelerator import extract_columns_fast
# Extracts Column 0 and Column 9 instantly
extract_columns_fast("huge_data.csv", 0, 9)
