
DEV Community

Pawan Singh Kapkoti


OpsMind: On-Prem AI for Manufacturing — No Cloud, No API Keys, No Budget

Manufacturing companies run on SQL Server ERPs with hundreds of tables. Shift managers need yield numbers, waste reports, temperature readings — daily. The usual path: email IT, wait for an SSRS report, get yesterday's numbers tomorrow.

OpsMind is an open-source tool that lets anyone on the factory floor type a question in English and get the SQL result in 5 seconds. No SQL knowledge required. Runs locally on Ollama, no cloud dependency.

Manager: "What was today's yield by product?"

OpsMind: Salmon fillets: 91.2% (target 90%)
         Cod loins: 88.7% (below target - check line 2 defrost timing)
         Haddock: 93.1%

It runs entirely on-premises. A Gemma 3 12B model via Ollama on a desktop PC. No data leaves the building. No cloud subscription. No API keys. Total hardware cost: one PC.

How it works

OpsMind uses a LangGraph state graph with 6 nodes:

question -> detect_domain -> check_library -> generate_sql -> validate_sql -> execute_sql -> explain_results
  1. detect_domain identifies which of 7 business areas the question belongs to (production, waste, orders, compliance, staff, suppliers, traceability).
  2. check_library checks a library of 20 pre-built queries first. If there is a match, it skips the LLM entirely: instant, guaranteed-correct SQL.
  3. generate_sql generates SQL with the LLM when there is no library match, scoped to only the relevant tables (not all 147).
  4. validate_sql applies a 5-stage safety check: statement type, injection detection, table existence, column existence, row limit enforcement.
  5. execute_sql runs the validated query.
  6. explain_results has the LLM explain the numbers in business terms.
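The real project wires these nodes as a LangGraph state graph; as a minimal plain-Python sketch of the same linear flow (node bodies and state keys here are illustrative stand-ins, not OpsMind's actual code), each node takes a state dict, enriches it, and passes it on:

```python
# Sketch of the node pipeline as plain Python. The real project uses a
# LangGraph StateGraph; these node bodies are simplified stand-ins.

def detect_domain(state: dict) -> dict:
    # Stand-in for the LLM router: crude keyword matching.
    for keyword, domain in [("yield", "production"), ("waste", "waste"), ("order", "orders")]:
        if keyword in state["question"].lower():
            state["domain"] = domain
            break
    else:
        state["domain"] = "production"
    return state

def check_library(state: dict) -> dict:
    # Stand-in: a hit here would let later nodes skip the LLM entirely.
    state["library_hit"] = False
    return state

# generate_sql, validate_sql, execute_sql, explain_results follow the
# same shape: take the state dict, enrich it, return it.
PIPELINE = [detect_domain, check_library]

def run(question: str) -> dict:
    state = {"question": question}
    for node in PIPELINE:
        state = node(state)
    return state
```

The point of the shape is that every node sees the full accumulated state, so routing decisions (domain, library hit) made early are available to every later step.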

The key insight: the pre-built query library handles the top 20 questions managers ask every day. The LLM is only the fallback. This means the most common queries are fast and reliable, while novel questions still work.
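A library-first lookup can be as simple as a normalized-question key mapped to vetted SQL. This sketch assumes a regex normalizer and an illustrative schema; the real project's matching and table names may differ:

```python
import re

# Hypothetical pre-built library: normalized question -> vetted SQL.
QUERY_LIBRARY = {
    "what was today s yield by product": (
        "SELECT product_name, AVG(yield_pct) AS yield "
        "FROM production_runs WHERE run_date = CAST(GETDATE() AS date) "
        "GROUP BY product_name"
    ),
}

def normalize(question: str) -> str:
    # Lowercase, strip punctuation, collapse whitespace so
    # near-identical phrasings land on the same key.
    cleaned = re.sub(r"[^a-z0-9 ]", " ", question.lower())
    return " ".join(cleaned.split())

def lookup(question: str):
    """Return vetted SQL on a library hit, or None to fall back to the LLM."""
    return QUERY_LIBRARY.get(normalize(question))
```

A hit returns instantly with zero hallucination risk; a miss (return value `None`) hands the question to the LLM path.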

The SQL validation layer

This is the critical layer. An LLM generating SQL against a production database needs safety gates. The 5-stage validation catches:

  • Tautologies like WHERE 1=1 (injection attempt)
  • UNION injection (appending malicious queries)
  • Comment injection (-- to truncate queries)
  • Non-existent tables (hallucinated table names)
  • Missing LIMIT (auto-adds LIMIT 1000 to prevent accidental full table scans)

Only SELECT and WITH (CTEs) are allowed. INSERT, UPDATE, DELETE, DROP are blocked at the validation layer. The database connection uses read-only credentials. Defence in depth.
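As a sketch of the kinds of checks such a gate performs, here is a simplified regex version (the real validator uses sqlparse and checks tables and columns against the live schema; the allowed-table set below is an assumption):

```python
import re

ALLOWED_TABLES = {"production_runs", "waste_events"}  # illustrative schema subset

def validate_sql(sql: str) -> str:
    """Raise ValueError on unsafe SQL; return the query with a row limit enforced."""
    stripped = sql.strip().rstrip(";")
    # 1. Statement type: only SELECT and WITH (CTEs) are allowed.
    if not re.match(r"^\s*(select|with)\b", stripped, re.IGNORECASE):
        raise ValueError("only SELECT/WITH statements are allowed")
    # 2. Tautology, UNION, and comment-injection patterns.
    if re.search(r"\b(where|or|and)\s+1\s*=\s*1\b", stripped, re.IGNORECASE):
        raise ValueError("tautology detected")
    if re.search(r"\bunion\b", stripped, re.IGNORECASE):
        raise ValueError("UNION not permitted")
    if "--" in stripped or "/*" in stripped:
        raise ValueError("SQL comment detected")
    # 3. Table existence (crude: every FROM/JOIN target must be known).
    for table in re.findall(r"\b(?:from|join)\s+([a-zA-Z_]\w*)", stripped, re.IGNORECASE):
        if table.lower() not in ALLOWED_TABLES:
            raise ValueError(f"unknown table: {table}")
    # 4. Row limit: append one if the query has none.
    if not re.search(r"\blimit\s+\d+\b", stripped, re.IGNORECASE):
        stripped += " LIMIT 1000"
    return stripped
```

Each check is cheap, so the gate adds no meaningful latency; the read-only database credentials then back it up if anything slips through.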

MCP server architecture

The architecture includes Model Context Protocol servers to decouple the data access layer:

  • Database server (port 9000) exposes query, table discovery, and domain schema as tools
  • Document search server (port 9001) exposes RAG search over factory SOPs

This means OpsMind is not the only tool that can use the data. Any MCP-compatible agent can connect to the same servers.

Key takeaways

  1. Pre-built queries beat LLM generation for common questions. A query library handles 80% of real usage with zero latency and zero hallucination risk.

  2. Domain scoping is critical. Exposing all 147 tables to the LLM produces garbage SQL. Scoping to 4-10 relevant tables per domain produces accurate SQL.

  3. Runtime-loaded documentation works. Business rules change. Compliance thresholds change. Loading these from markdown files at runtime keeps domain knowledge current without redeploying.

  4. Local LLMs are sufficient for structured tasks. Gemma 3 12B handles NL-to-SQL and result explanation. No GPT-4 needed. No internet dependency.
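Takeaway 2 can be sketched as a domain-to-tables map that renders only the relevant subset of the schema into the LLM prompt. The map and DDL below are made-up examples of the shape, not OpsMind's real schema:

```python
# Illustrative domain -> table map; the real project scopes 147 tables
# into 7 domains of roughly 4-10 tables each.
DOMAIN_TABLES = {
    "production": ["production_runs", "products", "lines", "shifts"],
    "waste": ["waste_events", "waste_reasons", "products"],
}

SCHEMAS = {
    "production_runs": "run_id INT, product_id INT, yield_pct FLOAT, run_date DATE",
    "products": "product_id INT, product_name VARCHAR(100)",
    "lines": "line_id INT, line_name VARCHAR(50)",
    "shifts": "shift_id INT, shift_name VARCHAR(20)",
    "waste_events": "event_id INT, product_id INT, kg FLOAT, reason_id INT",
    "waste_reasons": "reason_id INT, reason VARCHAR(100)",
}

def schema_prompt(domain: str) -> str:
    """Render only the detected domain's tables into the LLM's prompt context."""
    return "\n".join(
        f"CREATE TABLE {table} ({SCHEMAS[table]});"
        for table in DOMAIN_TABLES[domain]
    )
```

With a dozen-line schema in context instead of 147 tables, a 12B local model has far less room to hallucinate joins against tables that were never relevant.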

The code

Everything is open source: github.com/Pawansingh3889/OpsMind

Built with: Python, LangGraph, Ollama, Streamlit, SQLAlchemy, ChromaDB, pgvector, FastMCP, sqlparse.
