Almog Baku - LLM engineer and entrepreneur

README

Hi, I’m Almog - a hands-on AI engineering expert, doing AI for almost a decade - since long before LLMs - with dozens of LLM apps and agents deployed to production.

Author of The LLM Triangle Principle - a framework for building reliable LLM applications in production.
Founder of GenAI Israel - the largest LLM developers community in Israel (8,000+ members).
Serial founder and Kubernetes contributor since 2016 - production-grade distributed-systems scars, now applied to how I build AI.

I hack, build, write, and speak - often bluntly - about what actually breaks when you run AI agents in production, and what it takes to make them reliable.

Areas of Expertise

🧠 AI - LLM apps and agents engineered to survive real users: evals, feedback loops, reliability
🏗️ AI Infrastructure - the plumbing that keeps models fast, cheap, and observable in production
☁️ Cloud-Native - Kubernetes-native systems that scale without drama
📐 Software Architecture at Scale - distributed systems that hold up under real-world load
🔐 Cyber + AI - years in the cyber market: AI that’s secure, and security that’s (artificially) intelligent

Publications

The LLM Triangle Principle: Software Design Principles for Reliable LLM Apps
Software design principles for reliable, high-performing LLM apps - a framework for bridging the gap between a slick demo and production-grade performance.
Building LLM Apps: A Step-By-Step Guide
The end-to-end LLM development process - from first experiment to production.
8 Practical Prompt Engineering Tips for Better LLM Apps
Field-tested prompt engineering tips - no folklore, just what worked.
Effective AI Infrastructure Explained
What modern AI infrastructure actually needs to do for the ML lifecycle.

Talks

From time to time, I give talks at various meetups, podcasts, and conferences. You can find some of them on my LinkedIn profile. Make sure to follow me to get updates on upcoming talks.

As Seen On

Want me on your stage or podcast? Reach out via Email.

Recent appearances include:

AI Dev TLV ‘25 - Talk (English) / Feedback Is All You Need - How to Build Agents That Learn on the Job - a talk I gave at AI Dev TLV ‘25 about building AI agents that improve through feedback loops while on the job.
AI Engineer Summit (Online track) - Talk (English) / The LLM Triangle: Engineering Principles for Robust AI Applications - a talk I gave at the AI Engineer Summit about the LLM Triangle principles and how to architect reliable AI apps in a production-grade manner.
AI Dev TLV ‘24 - Talk (Hebrew) / The LLM Triangle: Engineering Principles for Robust AI Applications - a talk I gave at the AI Dev TLV ‘24 conference about the LLM Triangle principles and how to architect reliable AI apps in a production-grade manner.
LangTalk E35 (Hebrew) / LLM Applications Developer Guide - an end-to-end guide on how to get started with your LLM app and deploy it to production.
[ExplAInable Podcast](https://www.podbean.com/ew/pb-a2gr4-1984c3c) (Hebrew) / Are Evals a Scam? Feedback Is All You Need - how to effectively get working AI agents when you don’t know what you don’t know. A discussion of feedback loops and the importance of looking at real user data to improve your LLM applications.
Making Software (Osim Tochna) E165 (Hebrew) / From a PoC to a product - the hidden challenges of deploying LLM applications
AI In Production Conference - Talk (English) / How to Build LLM-native Apps with The LLM Triangle Blueprint - a talk I gave at the AI In Production Conference about the LLM Triangle principles and how to architect reliable AI apps in a production-grade manner.
The MLOps Podcast (English) / 🫣 Is Data Science a dying job? - About Kubernetes, Large Language Models (LLMs), how to get them into production, and how data is becoming a more central piece of the ML landscape.
AI Infra Stories (English) - A podcast about AI infrastructure, where I hosted world-class AI infrastructure experts to discuss the latest trends and challenges in AI infrastructure.
And many more… 🚀

Open Source Contributions

I’ve been an active contributor to open source projects for over 15 years. My contributions range from creating new tools to maintaining major projects, or just sending PRs for bugs 🙃

Notable contributions:

Creator of Raptor.ml: An AI infrastructure project that helps build and deploy AI to production, bridging the gap between data science and engineering.
Author of openai-streaming: A Python library that simplifies working with LLM streaming APIs, including for tool use.
Kubernetes Maintainer: Active contributor since 2016, focusing on cloud-native big data solutions and Kubernetes Native architectures.
pytest-evals: A pytest plugin for running and analyzing LLM evaluation tests.
LLM Playground: An interface to play with and compare different LLMs directly from your browser.
Various Contributions: Ongoing involvement in multiple open source projects, consistently pushing for advancements in technology and knowledge sharing.

Services

I help startups move fast, enterprises scale AI, and investors make smart AI bets. Consultant, exec, or hands-on builder - whichever the problem needs.

Fractional CTO / “CTO-team for hire” (2-3 days/week, hourly rate)
Move fast in the right direction on AI and infrastructure - executive-level leadership, part-time, without the full-time cost.
Workshops
Giving your team the skills and tools that create real value - hands-on workshops on AI Engineering, AI SDLC, and Agents.
AI Transformation
Unlocking the real value of AI agents inside your organization - designing and building the agents, tools, processes, and methodologies your team needs, shaped to fit your company’s existing DNA (not bolted on top of it).
Technical Talks
Practical, battle-tested insights on AI infrastructure and large-scale LLM applications - delivered as conference talks and panels.
Strategic Consulting
A trusted advisor on AI investment and strategy decisions - from high-stakes calls to lighter, ad-hoc guidance, drawing on expertise across engineering, infrastructure, and leadership.

Get in Touch

Connect with me on LinkedIn, GitHub, or via Email, follow me on X for updates, or book a meeting with me yourself during my Office Hours.

Office Hours

I offer free Office Hours for engineers, founders, and investors - bring a real problem, leave with a real plan.