The Fine-Tuning War: How AI Giants are Mapping the Value Chain

In the world of Large Language Models (LLMs), we often talk about “Pre-training” as the heavy lifter. It’s where models learn the fundamental semantics: coding rules, grammar, and complex concepts. But as the market matures, the real battleground has shifted to Fine-tuning.

If pre-training builds the muscle, fine-tuning is the specialized training that teaches that muscle how to perform a specific task, whether it’s a coding assistant or a conversational chatbot.

The Evolution of the Value Chain

Looking at the map of the AI landscape, we can see a clear progression from Genesis to Commodity.

Currently, Fine-tuning and RLHF (Reinforcement Learning from Human Feedback) sit in the “Custom” and “Product” phases. While pre-training is becoming more standardized, the way a model is fine-tuned remains the “secret sauce” that determines the final product’s quality.

1. The Anthropic Move: Constitutional Automation

Anthropic has taken a unique path with Constitutional AI. Instead of relying solely on slow, manual human feedback, they’ve specialized their fine-tuning through automation.

In the realm of coding agents, this is a massive advantage. Reinforcement Learning (RL) for code is “easier” to automate because the feedback loop is objective: Does the code compile? Does it pass the unit tests? By using their own tools to code better and faster, Anthropic creates a self-improving loop that could leave the competition struggling to keep up.

2. The Google Move: Fine-Tuning at Global Scale

While Anthropic scales through clever automation, Google scales through sheer Infrastructure. By integrating Gemini into Google Search, they’ve turned their entire user base into a global fine-tuning engine.

This move is only possible because of their hardware advantage. Google's TPUs (Tensor Processing Units) are built specifically for inference at scale, making it significantly cheaper for them to process these billions of interactions than for competitors relying on general-purpose GPUs.

3. The OpenAI & Copilot Legacy

OpenAI was the first to prove that clever fine-tuning could capitalize on pre-trained knowledge, giving us ChatGPT. GitHub Copilot did the same for code completion. They moved the needle by finding a way to actually use the concepts the models had learned.

However, the “new” players are now perfecting this art, squeezing even more power out of the same pre-trained foundations.


Key Takeaway: Fine-tuning is what unleashes the power of pre-training. You can feed a model more data to teach it new concepts, but without a sophisticated fine-tuning layer, that knowledge remains trapped.


What’s Next? The Return to Data

Software moves in cycles. As fine-tuning techniques eventually move from “Custom” to “Product” and become more accessible, the battle will inevitably shift back to the Pre-training phase and the data that fuels it.

The big question remains: Will GitHub (and by extension, Microsoft/Copilot) be able to capitalize on the massive mountain of “free” data they sit on to dominate the next pre-training cycle?

Functional Domain Programming Cookbook

In this article, I will demonstrate how Typed Functional Programming can be used to effectively represent a domain. We will walk through a step-by-step example where different domains are modeled using types and subtypes, and functions are used to transform one domain into another.

Let’s consider the process of making a cake as described in a recipe. The first step is the list of ingredients.

sealed trait Ingredient
case object Flour extends Ingredient
case object BakingPowder extends Ingredient
case object Water extends Ingredient


This list is defined in the recipe, but we cannot be certain that we have all these ingredients in our kitchen. To express this uncertainty, we create another type:

sealed trait NeededIngredient {
  val ingredient: Ingredient
}
case class MissingIngredient(ingredient: Ingredient)
  extends NeededIngredient
case class PresentIngredient(ingredient: Ingredient)
  extends NeededIngredient


With these types, we can define the first operation needed to start the cake:

def getIngredients(ingredients: Ingredient*): List[NeededIngredient]


The result is a list of NeededIngredient, which can be either present or missing. While we could have used the standard library’s Option type, it wouldn’t offer the same level of expressivity. In this case, we explicitly want to know which ingredient is missing; a None would have hidden that information.

The next part of the recipe involves mixing those ingredients.

def checkAndMix(ingredients: List[NeededIngredient]):
Either[MissingIngredients, Mixture]


The checkAndMix function results in either a successful mixture or a collection of missing ingredients.

case class Mixture(ingredients: List[Ingredient])
case class MissingIngredients(ingredients: List[Ingredient])


We could have represented this result differently. Instead of Either, we could have created a custom sealed trait:

sealed trait MixtureResult
case class FullMixture(ingredients: List[Ingredient])
extends MixtureResult
case class PartialMixture(present: List[Ingredient],
missing: List[Ingredient])
extends MixtureResult


While this looks elegant, it is actually invalid for our domain: if you don’t have all the ingredients for a cake, you don’t start mixing – that would be a waste!

Either is the correct solution here because it represents a domain disjunction. This means two completely different domains have only one thing in common: they are both valid results of the function, but they cannot exist at the same time.

The final step is baking the mixture.

def bake(mixture: Mixture): Cake


This function only takes a Mixture as input, which is the correct definition from a domain perspective. We want to express that we can only bake after we have successfully mixed the ingredients. The Either type is perfect for this via a map operation:

val neededIngredients = getIngredients(Flour, Water, BakingPowder)
val possibleMixture = checkAndMix(neededIngredients)
val possibleCake = possibleMixture.map(bake)


The function is applied only if the result is a “Right” value (the mixture exists); otherwise, the “Left” value (the missing ingredients) remains unchanged.

However, a different implementation of the bake function might account for potential baking issues:

def bake(mixture: Mixture): Either[BakingIssue, Cake]


In this case, the result introduces another domain disjunction. If we chain these operations, we end up with nested types:

val possibleCake:
Either[MissingIngredients, Either[BakingIssue, Cake]]


Comparing this to our first implementation, the type has become complex and difficult to read. It would be much cleaner to have a type that clearly shows the final result – the cake – alongside any potential problems that occurred during preparation.

val idealCake:
Either[OneOfTwo[MissingIngredients, BakingIssue], Cake]


This ideal solution combines the different negative paths into a single type where only one issue can be present at a time. Here is a simple implementation that converts a nested Either into a “flat” one:

sealed trait OneOfTwo[+A, +B]
case class First[A](a: A) extends OneOfTwo[A, Nothing]
case class Second[B](b: B) extends OneOfTwo[Nothing, B]

def convert[A, B, C](either: Either[A, Either[B, C]]):
    Either[OneOfTwo[A, B], C] =
  either.fold(
    a => Left(First(a)),
    {
      case Left(b)  => Left(Second(b))
      case Right(c) => Right(c)
    }
  )


This approach keeps our domain disjunctions organized on the left-hand side, leaving the right-hand side clear for our successful result: the cake.

Functional.Programming

I like to think of Functional Programming as another level of abstraction that pulls code further away from the physical machine, much like how automatic memory management once hid raw memory addresses from the developer. To me, Functional Programming is a guide for building applications without the need for manual memory assignments.

It is all about Immutability

What is the problem with memory assignment, or more specifically, mutable data?

In a world where software runs in parallel, shared mutable states are a massive problem. Countless solutions have been built to address this, but they usually involve synchronization, locks, and waiting – all of which lead to performance and scalability bottlenecks.

Another way to solve the problem is to avoid it entirely by writing software without shared mutable states. This means using exclusively immutable data structures. In this context, a paradigm that thrives on immutability becomes very attractive. Functional Programming fits this role perfectly; it acts as a discipline that abstracts software away from the constant need for memory assignments.
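As a small illustration of why immutability removes the synchronization problem, Java's `List.of` (a sketch, assuming Java 9+) creates a list that can be shared across threads freely because it rejects any mutation:

```java
import java.util.List;

public class ImmutabilityDemo {
  public static void main(String[] args) {
    // List.of returns an immutable list: it can be shared between
    // threads without locks, because no one can change it.
    List<Integer> shared = List.of(1, 2, 3);

    boolean rejected = false;
    try {
      shared.add(4); // any mutation attempt fails fast
    } catch (UnsupportedOperationException e) {
      rejected = true;
    }
    System.out.println(rejected); // prints "true"
  }
}
```

No synchronization is needed anywhere: a reader can never observe a half-updated state, because there are no updates.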

Be Functional

The core principle of Functional Programming is transformation: a function takes an input, applies some logic, and returns an output. This approach has the same expressive power as imperative code, but it offers tools and patterns that guide developers to work within this different paradigm.

Let’s compare an imperative example of finding the maximum value in a list with a functional version:

Imperative Approach:

List<Integer> list = Arrays.asList(3, 6, 5);
int max = Integer.MIN_VALUE;
for (int value : list) {
  if (value > max) max = value;
}


In this example, the imperative code uses a max variable to store and update the result.

Functional Approach:

int calculateMax(List<Integer> list, int max) {
  if (list.isEmpty()) return max;
  else {
    int head = list.get(0);
    // subList's second argument is exclusive, so this is the whole tail
    List<Integer> tail = list.subList(1, list.size());
    int newMax = head > max ? head : max;
    return calculateMax(tail, newMax);
  }
}

List<Integer> list = Arrays.asList(3, 6, 5);
int max = calculateMax(list, Integer.MIN_VALUE);


While the imperative version relies on re-assignment, the functional version relies on passing the state through parameters. Concepts like pipelines, higher-order functions, and composition simply make this data transformation more efficient and easier to manage.
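The same transformation can also be written as a pipeline of higher-order functions. For instance, with Java Streams (a sketch, assuming Java 8+), the maximum is expressed as a single transformation with no visible accumulator:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class MaxDemo {
  public static void main(String[] args) {
    List<Integer> list = Arrays.asList(3, 6, 5);
    // The pipeline describes *what* to compute; the mutable state
    // (if any) is hidden inside the library, not in our code.
    int max = list.stream()
                  .max(Comparator.naturalOrder())
                  .orElse(Integer.MIN_VALUE);
    System.out.println(max); // prints 6
  }
}
```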

Be Pure

An entire application can be viewed as a single, massive function. Since a thousand-line function is obviously bad practice, we modularize: functions are composed to create larger transformations. Simply put, composition is using the output of one function as the input for the next.

int f(double input) { ... }
String g(int input) { ... }
String output = g(f(1.2));


Pure functions are functions with no side effects. This means every observable effect is contained within the output. When you compose functions, the output defines the “interface” between them. If a function throws an unhandled exception, it breaks that interface and falls out of the transformation flow.
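To make the composition concrete, here is a small sketch using `java.util.function.Function`; the names f and g mirror the snippet above:

```java
import java.util.function.Function;

public class ComposeDemo {
  public static void main(String[] args) {
    // Two pure functions: each output depends only on the input.
    Function<Double, Integer> f = d -> (int) Math.floor(d);
    Function<Integer, String> g = i -> "value: " + i;

    // andThen chains them: the output of f becomes the input of g.
    Function<Double, String> pipeline = f.andThen(g);
    System.out.println(pipeline.apply(1.2)); // prints "value: 1"
  }
}
```

Because both functions are pure, the composed pipeline is itself a pure function from Double to String.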

Be Typed

A typed language helps define a function’s signature. The more precise the type, the safer the composition. The type system acts as the contract between functions, allowing the compiler to handle a significant portion of the software validation for you.
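As a sketch of how a more precise type makes composition safer, consider a division function whose signature carries its failure case (safeDivide is a hypothetical name, not from the article):

```java
import java.util.Optional;

public class TypedDemo {
  // The return type makes the failure case part of the contract:
  // the compiler forces callers to handle "no result" explicitly.
  static Optional<Double> safeDivide(double a, double b) {
    return b == 0 ? Optional.empty() : Optional.of(a / b);
  }

  public static void main(String[] args) {
    System.out.println(safeDivide(6, 3).orElse(-1.0)); // prints 2.0
    System.out.println(safeDivide(6, 0).orElse(-1.0)); // prints -1.0
  }
}
```

Compare this with a plain `double divide(double, double)`: the precise signature turns a possible runtime surprise into a compile-time obligation.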

Operating On The Edge Of Failure… of MicroServices

Systems have Always been Complex

The complexity of a system reflects both the problem it aims to solve and the technology it is built upon.

Microservices Architectures offer solutions for contexts requiring extreme scalability, high performance, and rapid change. These systems are characterized by strong decoupling and redundancy. While they are excellent at solving complex problems by modularizing them into “sub-problems,” they introduce a new challenge: the macro-system must now handle the cohesion of many small, moving parts.

This introduces types of failures that were simply unknown to standard monolithic applications. It might seem like microservices “caused” new problems, but the reality is that these problems are inherent to complexity. Choosing one architecture over another doesn’t remove problems; it simply changes their nature.

As Richard Cook describes in his talk, a complex system is always operating in a state of partial failure. We should consider these failures as part of the normal context, not as rare exceptions.


The edge of failure

A system operates within three distinct boundaries. When a system crosses one of these lines, it either stops working or loses its reason to exist:

  • Economic Failure: An application exists only as long as it makes sense from an economic perspective. Management constantly pushes the application away from this boundary by requesting new features or changes to maintain value.
  • Unacceptable Workload: Every application requires a certain amount of effort to run or modify. This workload must remain bearable. Developers are constantly pushing away from this boundary, trying to minimize the effort and “friction” required to keep the system alive.
  • The Accident Boundary: This defines the point at which the application stops working. “Failure” means different things for different applications, and the definition changes over time.

It has been observed that applications tend to run very close to the Accident Boundary, separated only by a thin Error Margin. This margin represents the developers’ confidence, the calculated risk of how close they can operate and make decisions without triggering a catastrophe.

Microservices and “Accepted” Failures

The Microservices paradigm has pulled new types of failures into the “Error Margin”, failures that would be considered catastrophic in other architectures.

In a distributed system, unreliable communication and machine failure are not “bugs”; they are normalcy. Because this is the expected environment, every individual service must be designed to handle these interruptions gracefully.

Reliability by Design

This new class of failures has elevated reliability to a primary role in architectural design. Reliability cannot be an afterthought; it must be built into every component that interacts with unreliable resources or critical systems from day one.
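As a minimal, framework-free sketch of building reliability into a component, a retry wrapper treats transient failures as part of normal operation (illustrative only; a real system would add backoff, logging, and a circuit breaker):

```java
import java.util.function.Supplier;

public class Retry {
  // Calls the operation up to maxAttempts times (assumed >= 1),
  // treating a transient failure as normal rather than fatal.
  static <T> T withRetry(Supplier<T> operation, int maxAttempts) {
    RuntimeException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return operation.get();
      } catch (RuntimeException e) {
        last = e; // in a real system: back off and record the failure
      }
    }
    throw last; // give up only after exhausting the error margin
  }

  public static void main(String[] args) {
    int[] calls = {0};
    // A flaky operation that fails twice before succeeding.
    String result = withRetry(() -> {
      if (++calls[0] < 3) throw new RuntimeException("transient");
      return "ok";
    }, 5);
    System.out.println(result + " after " + calls[0] + " attempts");
  }
}
```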

Actor.Messages.As.Methods

One of the most useful ways to explain the interactions between actors is to think of them as method calls. Looking at them this way during the design phase helps to structure the code better.

A method is identified by its name, its arguments and its return value. The first describes the operation, the second the data necessary to complete it, and the last the result.

The method is also defined in a class or an interface, which provides its context.

class Contact {
  public boolean addNumber(String number) {
    ...
  }
}

For example, an unnamed method that takes a string and returns a boolean is very generic, while the method addNumber is more specific. The same method name can also have a different meaning when defined in a different class.

A message in the Actor model is an immutable class that is sent to an actor. The concept is the same as a method: the class name is the method name and the properties are the method arguments. The actor is the context of the message, just as the class is for the method.

Method Name      = Message Class Name
Method Arguments = Message Class Properties
Method Class     = Actor

The message response is similar to the return value, but it also contains information about the origin of the message (the operation and the actor).
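Following this mapping, the addNumber method from the Contact example could be translated into a message and a response. This is a plain-Java sketch without any actor framework; the AddNumber and NumberAdded names are hypothetical:

```java
// Message: the class name is the operation, the fields are the arguments.
final class AddNumber {
  final String number;
  AddNumber(String number) { this.number = number; }
}

// Response: carries the result plus the origin of the operation.
final class NumberAdded {
  final String number;
  final boolean success;
  NumberAdded(String number, boolean success) {
    this.number = number;
    this.success = success;
  }
}

public class MessageDemo {
  // The "actor" side: handling the message plays the role of the
  // method body in the Contact class.
  static NumberAdded handle(AddNumber message) {
    boolean valid = message.number.matches("[0-9+ ]+");
    return new NumberAdded(message.number, valid);
  }

  public static void main(String[] args) {
    NumberAdded reply = handle(new AddNumber("+44 1234"));
    System.out.println(reply.success); // prints "true"
  }
}
```

Both classes are immutable, so they can safely cross thread boundaries, which is exactly what actor messages must do.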

Four lessons

  1. The method name should be meaningful; the message name should be too.
  2. There are generic methods (toString) and there are generic messages (PoisonPill). All the messages specific to an actor should be used only by that actor (and kept in its companion object).
  3. A method should do one thing, and an actor should treat a message in the same way. Avoid parameters that switch between different operations, such as boolean flags; create another message instead, just as you would create another method.
  4. The result of a message also tells the origin of the message. Avoid generic or shared messages; use actor-specific result messages.

Actor.Types

The actor in the actor model is defined on Wikipedia as “the universal primitive of concurrent computation”. This definition places actors at the same level as language primitives like if, for, while, etc., in a concurrency context.

Like a primitive, an actor can be used to solve problems in many ways and, as the name suggests, it can play a different role specific to the problem at hand.

Here is a non-exhaustive list of possible types of actor.

Actor as Worker

The actor acts as a worker for a specific operation: it has no state and is specialized to solve one problem.

class Worker extends Actor {
  def receive = {
    case Operation(data) => ...
  }
}

The same operation can be parallelized by creating many instances of the same actor behind a router (see the documentation).

val router: ActorRef =
  context.actorOf(RoundRobinPool(5).props(Props[Worker]),"router")

The worker runs in a different context from the invoker (the component that sent the operation) and from the other workers. This isolation is used to obtain parallelism, but it also has the benefit of separating blocking or failing operations.

Mapping Resources

When a worker (or a group of workers) maps an external resource, like a connection pool, it creates a protection for the rest of the application. This protection can isolate an external blocking resource if the worker runs on a different thread pool or dispatcher (see the documentation).

worker-dispatcher {
  type = Dispatcher
  executor = "fork-join-executor"
  throughput = 100
}

In this way only the worker thread is blocked and not the rest of the application.

The dispatcher (thread pool) can be easily associated with a router.

val router: ActorRef =
  context.actorOf(
     RoundRobinPool(5).props(Props[Worker])
     .withDispatcher("my-dispatcher"), "router")

In the case of a pool of resources, it is possible to match the size of the pool with the number of workers (a connection pool of 10 = a router of 10 actors).

Failure of a Worker

The isolation of the worker can be useful for handling failures without the risk of propagating them to the rest of the application. The worker can, for example, implement retry logic or a circuit breaker, or it can benefit from the supervision model, escalating the failure to the parent actor (see the documentation).

Domain Actor

The domain actor represents an instance of a single domain object, for example a person identified by first and second name. The life cycle of the object and of the actor is the same, as is the status. The actor messages are the way the object interacts with the rest of the world.

class Person(firstName: String, secondName: String) extends Actor {
  var status = Status()
  def receive = {
    case Create(status) => ...
    case Read() => ...
    case Update(change) => ...
    case Delete() => ...
  }
}

An example of a domain actor is one that maps a single entity, like a row in a database table. Due to the nature of the actor, all the operations on the entity are atomic, so this model can be adopted to implement transactions when the storage system doesn’t support them.

The domain actor status is important because it is the entity status, and it must be protected so it is not lost in case of failure. Linking the actor to a storage system is therefore a natural consequence, and it can be done by synchronizing whenever a change occurs. In this way the actor can crash, or be stopped and restarted, without losing any data.

An actor can persist its status by storing all the events that change it. This is known as the event sourcing model, and it has been implemented in the Akka Persistence module.

class Person(firstName: String, secondName: String) extends PersistentActor {
  def persistenceId = firstName + "-" + secondName
  val receiveCommand: Receive = {
    case create: Create => persist(create) { event => ... }
    case Read() => ...
    case update: Update => persist(update) { event => ... }
    case delete: Delete => persist(delete) { event => ... }
  }
  val receiveRecover: Receive = {
    case Create(status) => ...
    case Update(change) => ...
    case Delete() => ...
  }
}

The actor persists the event before updating its internal status; this is not needed for read operations. The persistent actor is identified by its persistenceId, which is used as the key during the storing and recovery process.

Event Sourcing vs Status Persistence

The advantage of an event sourcing model is the possibility of rebuilding the internal status of the actor from the events that contributed to building it. During its life the actor may have changed its behavior, which can differ from the initial one; replacing only the status would not help in this case.

Request Actor

The request actor is an actor created to satisfy a single request. Its status represents the initial request and its progress. A new actor is created for each request, and when the request is completed the actor is stopped.

class RequestActor(<request parameters>) extends Actor {
  val requestProgress = ...
  override def preStart() {
    <start interacting with the rest of the system>
  }
  def receive = {
    case MessageFromSystem() => ...
    case LastMessageNeeded() =>
      <send request response back>
      context stop (self)
  }
}

Binding a request to an actor has its advantages in the status: the actor can track the progress, react to failures, implement retry logic, and make decisions that influence the result of the request.

Builder Pattern

In Java, one of the most powerful ways to create an instance, aside from dependency injection, is the builder pattern.

The builder pattern helps to create immutable objects while avoiding long constructors or many constructor overloads.

Immutability in Java beans can be guaranteed by declaring all the attributes final: once set, they never change, so the state of the object never changes. As a consequence, the constructor must set all the attributes, and when there are many of them, mistakes become easy. Java constructors identify parameters only by position, so placing them in the wrong order is very easy, especially when several parameters share the same type.

public class Person {
  private final String name;
  private final String surname;
  private final Integer age;

  public Person(String name, String surname, Integer age) {
    this.name = name;
    this.surname = surname;
    this.age = age;
  }

  public String getName() { return name; }

  public String getSurname() { return surname; }

  public Integer getAge() { return age; }

}

In this example it is easy to construct a Person with name and surname inverted.

To avoid this problem, it is possible to use a second class, the Builder, where the constructor is replaced by methods with meaningful names.

public class Builder {

  private String name;
  private String surname;
  private Integer age;

  public Builder withName(String name) {
    this.name = name;
    return this;
  }

  public Builder withSurname(String surname) {
    this.surname = surname;
    return this;
  }

  public Builder withAge(Integer age) {
    this.age = age;
    return this;
  }

  public Person build() {
    return new Person(name, surname, age);
  }

}

The Builder class holds the attribute values until the object is built, and each parameter is set through a dedicated method.

Person person = new Builder()
  .withName("Alessandro")
  .withSurname("Simi")
  .withAge(32)
  .build();

The disadvantage of this solution (although it is a good one) is the Builder itself: it requires another object, and nothing guarantees it is used instead of the Person constructor.

To improve the solution, the Builder can be defined as an inner class of the Person class, and the visibility of the constructor can be reduced, so the object can be created only through the Builder.

public class Person {
  ...
  private Person(String name, String surname, Integer age) {
  ...
  public static class Builder {
    ...
  }
}

Still, this solution has two drawbacks. First, the object can be built before all the parameters are set. Second, the Builder is not immutable.

Solving the first problem means introducing a chain of Builders where only the leaf contains the build method. This approach can, for example, separate the mandatory attributes from the optional ones, and in general it can be expressed through a Fluent Interface.

Immutability is possible if every method returns a new instance of the Builder, though this can produce a lot of boilerplate code.
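A sketch of what such an immutable Builder could look like for the Person example (illustrative; each with- method copies the whole state into a brand new Builder, and a simplified Person is nested to keep the example self-contained):

```java
public class PersonBuilder {
  // Minimal Person, nested only to make the example self-contained.
  public static final class Person {
    private final String name;
    private final String surname;
    private final Integer age;
    Person(String name, String surname, Integer age) {
      this.name = name; this.surname = surname; this.age = age;
    }
    public Integer getAge() { return age; }
  }

  private final String name;
  private final String surname;
  private final Integer age;

  public PersonBuilder() { this(null, null, null); }

  private PersonBuilder(String name, String surname, Integer age) {
    this.name = name; this.surname = surname; this.age = age;
  }

  // Each method returns a new Builder instead of mutating this one,
  // so a partially configured Builder can be shared safely.
  public PersonBuilder withName(String name) {
    return new PersonBuilder(name, surname, age);
  }

  public PersonBuilder withSurname(String surname) {
    return new PersonBuilder(name, surname, age);
  }

  public PersonBuilder withAge(Integer age) {
    return new PersonBuilder(name, surname, age);
  }

  public Person build() { return new Person(name, surname, age); }

  public static void main(String[] args) {
    PersonBuilder base = new PersonBuilder().withName("Alessandro");
    PersonBuilder older = base.withAge(32); // base is left untouched
    System.out.println(base == older); // prints "false"
  }
}
```

The price is one extra allocation per step, which is usually negligible next to the safety of being able to reuse a base Builder as a template.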

public class Person {
  private final String name;
  private final String surname;
  private final Integer age;

  private Person(String name, String surname, Integer age) {
    this.name = name;
    this.surname = surname;
    this.age = age;
  }

  public String getName() { return name; }

  public Integer getAge() { return age; }

  public String getSurname() { return surname; }

  public static AfterName name(String name) {
    return new Person(name, null, null).new AfterName();
  }

  public class AfterName {
    public AfterAge age(Integer age) {
      return new Person(name, null, age).new AfterAge();
    }
  }

  public class AfterAge {
    public Builder surname(String surname) {
      return new Person(name, surname, age).new Builder();
    }
  }

  public class Builder {
    public Person build() {
      return Person.this;
    }
  }
}

This solution defines the Builder as an inner class so that it can access the attributes and create a new instance for each method call. It also ensures the object can be built only once all the properties have been set.