Inside the Model Factory: How Enterprises Actually Build AI
An API call: send a prompt, receive a response, and hope the system behaves well enough to be trusted. This is how most organisations approach AI today. They don’t build it; they consume it. For early experimentation, this works. For real enterprise systems, it’s where the wheels fall off.
In our Season 2 opener of What The Tech Podcast (AU), Dave Lemphers, CEO and co-founder of Maincode, explained why this gap exists and what sits underneath organisations that successfully move beyond demos.
That foundation is what he calls a model factory.
What a “model factory” actually is
A model factory isn’t just model training. It’s the entire system required to deliberately design, operate, and evolve AI models.
As Dave put it:
“A model factory is the environment and infrastructure… racks of compute and storage, allocations of GPUs for training, serving models, massive storage for data, and the software layer that makes it all work.”
This includes far more than inference:
Data ingestion and curation
Training and evaluation pipelines
Model versioning and rollback
Deployment and serving layers
Monitoring, benchmarking, and governance
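The versioning and rollback piece of that pipeline can be sketched in a few lines. This is a toy registry, not any real platform’s API; the names (`ModelRegistry`, `promote`, `rollback`) and the `s3://` paths are illustrative assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class ModelRegistry:
    """Toy model registry: tracks artifacts and which version serves traffic."""
    versions: dict = field(default_factory=dict)   # version -> artifact location
    history: list = field(default_factory=list)    # promotion order; last = live

    def register(self, version: str, artifact: str) -> None:
        self.versions[version] = artifact

    def promote(self, version: str) -> None:
        # Make a registered version the one serving traffic
        if version not in self.versions:
            raise KeyError(f"unknown version: {version}")
        self.history.append(version)

    def current(self) -> str:
        return self.history[-1]

    def rollback(self) -> str:
        # Revert serving to the previously promoted version
        if len(self.history) < 2:
            raise RuntimeError("no earlier version to roll back to")
        self.history.pop()
        return self.history[-1]

registry = ModelRegistry()
registry.register("v1", "s3://models/v1")   # hypothetical artifact paths
registry.register("v2", "s3://models/v2")
registry.promote("v1")
registry.promote("v2")
registry.rollback()
print(registry.current())  # -> v1
```

The point of owning this layer is exactly the rollback path: when a new model regresses, serving reverts in one operation instead of a support ticket to a provider.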
Most organisations never see this layer because SaaS APIs abstract it away.
That abstraction is convenient. It also hides trade-offs.
SaaS-style AI vs a model factory
SaaS-style consumption looks like this:
Send prompt
Get response
Pay per token
Trust the provider’s decisions
Dave summed up the real limitation succinctly:
“They don’t really care about GPUs and hard drives. They care about tokens.”
That’s fine until organisations need:
Determinism
Explainability
Cost predictability
Regulatory clarity
Control over where data moves and who sees it
At that point, abstraction becomes friction.
A model factory flips the relationship:
You choose which model runs which task
You decide how data is used and retained
You benchmark models against your requirements
You optimise cost by routing work intelligently
Dave described how this works in practice:
“Running top-tier models for any possible question is insane… so you categorise the prompt and route it to the most appropriate model.”
This isn’t theoretical. It’s how teams survive at scale.
Prompt routing, distillation, and efficiency
One of the most misunderstood ideas in enterprise AI is that every task needs the best model.
Dave was blunt about this assumption:
“If you have a straightforward question… do you really need a top-tier model to answer that?”
Model factories rely on:
Prompt routing to classify intent and complexity
Smaller models for simple tasks
Specialised models for reasoning-heavy work
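The routing step can be illustrated with a deliberately simple sketch. The tier names and the keyword heuristic below are assumptions for illustration; production routers typically use a small classifier model rather than word matching.

```python
def route(prompt: str) -> str:
    """Classify a prompt's complexity and pick a model tier (toy heuristic)."""
    reasoning_markers = ("why", "explain", "compare", "derive", "prove")
    words = prompt.lower().split()
    if len(words) < 12 and not any(m in words for m in reasoning_markers):
        return "small-model"       # cheap and fast: FAQs, lookups
    if any(m in words for m in reasoning_markers):
        return "reasoning-model"   # specialised, reasoning-heavy work
    return "general-model"         # default mid-tier

print(route("What time do you open?"))                              # small-model
print(route("Explain why our Q3 churn rose and compare it to Q2"))  # reasoning-model
```

Even this crude version captures the economics: the short factual question never touches the expensive tier.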
They also leverage distillation and quantisation to reduce cost and compute:
“You can teach a smaller model to learn trends from a larger model… and then lower the precision to reduce the computation needed.”
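The "lower the precision" half of that quote is quantisation. A minimal sketch of symmetric int8 post-training quantisation, assuming a toy weight list rather than a real network, looks like this:

```python
def quantize_int8(weights):
    """Map float weights into the int8 range [-127, 127] with one scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    """Recover approximate float weights from int8 values."""
    return [q * scale for q in quantized]

weights = [0.5, -1.2, 0.03, 0.9]
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)
# int8 storage is 4x smaller than float32; per-weight error stays within
# half a quantisation step (scale / 2)
print(max(abs(w - r) for w, r in zip(weights, recovered)))
```

Real quantisers work per-tensor or per-channel on billions of parameters, but the trade is the same: a small accuracy loss bought for a large cut in memory and compute.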
This is how organisations operate within real constraints:
“We don’t have enough compute to serve the entire planet.”
Efficiency isn’t a nice-to-have. It’s survival.
Why benchmarking and ownership matter
Most enterprises rely on public benchmarks to choose models. Dave cautioned against that:
“Create your own benchmarks… questions that are specific to what you actually need.”
A model factory enables this because:
You control the evaluation data
You track improvement over time
You avoid optimising for irrelevant capabilities
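The "create your own benchmarks" advice reduces to a small harness. The eval questions and the stand-in model callables below are invented for illustration; real harnesses use larger sets and graded rubrics rather than exact-match scoring.

```python
# Hypothetical domain-specific eval set; replace with your own questions
EVAL_SET = [
    {"prompt": "What is our standard refund window?", "expected": "30 days"},
    {"prompt": "Which regulator oversees our sector?", "expected": "APRA"},
]

def score(model, eval_set) -> float:
    """Fraction of eval cases where the expected answer appears in the output."""
    hits = sum(1 for case in eval_set
               if case["expected"].lower() in model(case["prompt"]).lower())
    return hits / len(eval_set)

# Stand-in "models" so the sketch runs end to end
small = lambda p: "Refunds are honoured within 30 days."
large = lambda p: "Refunds: 30 days. The sector is overseen by APRA."

print(score(small, EVAL_SET))  # 0.5
print(score(large, EVAL_SET))  # 1.0
```

Run the same harness after every retrain and you get exactly what public leaderboards can’t give you: a trend line on the capabilities you actually need.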
This also ties directly to governance and risk:
“If you do not have ownership and transparency over what the AI model is doing… you are running a risk.”
Ownership isn’t ideological. It’s operational.
Why this matters for Australian enterprises
Australia doesn’t have unlimited compute. It doesn’t have hyperscaler-level infrastructure density. And it can’t rely on cross-region ambiguity when compliance matters.
Dave highlighted a hard truth:
“There is not enough compute around… especially here in Australia.”
Model factories allow organisations to:
Work within local constraints
Reduce dependency on opaque platforms
Build capability incrementally
Prioritise high-impact use cases first
This isn’t about rejecting global platforms. It’s about knowing when abstraction helps and when it hurts.
The takeaway
A model factory is not about building the biggest model.
It’s about building:
The right model
For the right task
At the right cost
With the right controls
Or as Dave put it more simply:
“It’s much easier to roll your sleeves up and start building.”
Watch the full episode
🎧 Australia vs Big Tech is now live on: Spotify, Apple Podcasts, Amazon Music & YouTube
If you’re responsible for AI outcomes, whether technical, financial, or regulatory, this episode is worth your time.
References
What The Tech Podcast (AU) – Season 2, Episode 1: Australia vs Big Tech
Primary source discussion with Dave Lemphers, CEO & Co-founder of Maincode, covering model factories, enterprise AI trade-offs, compute constraints, and governance.
👉 https://www.whatthetech.com.au/p/australia-vs-big-tech-why-build-ai
F5 Ecosystem – What is an AI Factory?
👉 https://www.f5.com/company/blog/defining-an-ai-factory
US Federal Government – Understanding and managing the AI lifecycle
👉 https://coe.gsa.gov/coe/ai-guide-for-government/understanding-managing-ai-lifecycle/

Solid breakdown on the abstraction trap. I worked on a project where we tried building on GPT APIs for regulatory compliance and hit the wall within two months when costs ballooned past budget. The routing point is often overlooked: you don’t need frontier models for every prompt when a tuned small model cuts cost by 80%. Also the benchmarking idea is key, because public evals rarely map to real use cases.