Profit Optimization Theory

Info: For standard ACoS optimization strategies, see Intermediate > ACoS Optimization.

ACoS Minimization Problem

  • minimizing ACoS is not the same as maximizing profit

Most resources on Amazon PPC will encourage you to minimize your ACoS. For many people that is good advice, since many advertisers struggle to reach their target ACoS at all. However, once you have accomplished that goal, you will want to turn your attention away from minimizing ACoS and toward maximizing profit.

Let's start by explaining why that is not the same thing.

Here are some metrics for our calculations:

| Abbreviation | Definition |
| --- | --- |
| SP | Sales Price (customer price per unit) |
| U | Units Sold |
| S | Sales (SP * U) |
| C | Cost (Advertising Cost) |
| ACoS | Advertising Cost of Sales (C / S) |
| NS | Net Sales (S - C) |

Let's look at two advertising campaign scenarios. In both scenarios we have a product priced at $50 (SP = 50).

Scenario 1:

SP   = $50    <-- Sales Price
U    = 4      <-- Units Sold
S    = $200   <-- Sales
C    = $40    <-- Cost
ACoS = 20%    <-- Advertising Cost of Sales
NS   = $160   <-- Net Sales

Scenario 2:

SP   = $50    <-- Sales Price
U    = 10     <-- Units Sold
S    = $500   <-- Sales
C    = $150   <-- Cost
ACoS = 30%    <-- Advertising Cost of Sales
NS   = $350   <-- Net Sales

Scenario 1 has a lower ACoS, but sells fewer units and has lower net sales.

Scenario 2 has a higher ACoS, but sells more units and has higher net sales.

If we were to look only at ACoS, we would choose Scenario 1 as the better scenario. But Scenario 2 has higher net sales.

We can't tell which scenario is more profitable without knowing the profit margin of the product.

Calculating Profit

Let us introduce some new metrics now:

| Abbreviation | Definition |
| --- | --- |
| COGS | Cost of Goods Sold |
| PM | Profit Margin |
| GP | Gross Profit |
| P | Profit |

We can calculate the profit margin on an individual unit as follows:

$$ PM = \frac{SP - COGS}{SP} $$

So that given a Sales Price (SP) = $50 and a Cost of Goods Sold (COGS) = $25, we would have a profit margin of 50%.

$$ 0.50 = \frac{50 - 25}{50} $$

Using the profit margin of the product we can calculate the Gross Profit of a campaign as follows:

$$ GP = S * PM $$

This Gross Profit is our total profit before advertising costs.

*This assumes that the profit margin for all products in the campaign is the same.

Once we know the Gross Profit, to calculate the total Profit on a campaign we can simply subtract the advertising Cost from the Gross Profit:

$$ P = GP - C $$

If we wanted to roll all of that up into one profit formula with primitives, we could do so as follows:

$$ P = \bigg(S * \frac{(SP - COGS)}{SP}\bigg) - C $$

So going back to scenario 1 and 2, we can calculate the profit for each scenario:

Scenario 1:

SP   = $50    <-- Sales Price
PM   = 50%    <-- Profit Margin
U    = 4      <-- Units Sold
S    = $200   <-- Sales
C    = $40    <-- Cost
ACoS = 20%    <-- Advertising Cost of Sales
NS   = $160   <-- Net Sales
GP   = $100   <-- Gross Profit
P    = $60    <-- Profit

Profit Calculation:

$$ P = \bigg(200 * \frac{(50 - 25)}{50}\bigg) - 40 $$

Scenario 2:

SP   = $50    <-- Sales Price
PM   = 50%    <-- Profit Margin
U    = 10     <-- Units Sold
S    = $500   <-- Sales
C    = $150   <-- Cost
ACoS = 30%    <-- Advertising Cost of Sales
NS   = $350   <-- Net Sales
GP   = $250   <-- Gross Profit
P    = $100   <-- Profit

Profit Calculation:

$$ P = \bigg(500 * \frac{(50 - 25)}{50}\bigg) - 150 $$

The net result is that the Scenario 1 profit is $60 and the Scenario 2 profit is $100, even though Scenario 1 has a significantly lower ACoS.
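The arithmetic for both scenarios can be reproduced with a short script. This is a minimal sketch: the `Campaign` class and its field names are illustrative and not part of any Amazon tooling, and it assumes a single product with COGS = $25 as in the example above.

```python
from dataclasses import dataclass


@dataclass
class Campaign:
    """Campaign totals, using the abbreviations from the tables above."""
    sp: float    # SP: sales price per unit
    cogs: float  # COGS: cost of goods sold per unit
    units: int   # U: units sold
    cost: float  # C: advertising cost

    @property
    def sales(self) -> float:
        return self.sp * self.units              # S = SP * U

    @property
    def acos(self) -> float:
        return self.cost / self.sales            # ACoS = C / S

    @property
    def profit_margin(self) -> float:
        return (self.sp - self.cogs) / self.sp   # PM = (SP - COGS) / SP

    @property
    def gross_profit(self) -> float:
        return self.sales * self.profit_margin   # GP = S * PM

    @property
    def profit(self) -> float:
        return self.gross_profit - self.cost     # P = GP - C


scenario_1 = Campaign(sp=50, cogs=25, units=4, cost=40)
scenario_2 = Campaign(sp=50, cogs=25, units=10, cost=150)

for name, c in [("Scenario 1", scenario_1), ("Scenario 2", scenario_2)]:
    print(f"{name}: ACoS = {c.acos:.0%}, GP = ${c.gross_profit:.0f}, P = ${c.profit:.0f}")
```

Changing `cogs` is a quick way to see how the ranking of the two scenarios depends on the margin.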

The conclusion here is that in many situations it would actually be more profitable to raise bids and have a higher ACoS, if that results in more sales. Our goal is to maximize profit, not minimize ACoS.
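To see where the tipping point lies for the two scenarios above, one can solve for the margin at which they break even. This is a worked sketch that assumes both scenarios share the same profit margin PM and uses P = S * PM - C from the formulas above; the subscripts 1 and 2 are introduced here only to label the scenarios.

$$
P_2 > P_1 \iff S_2 * PM - C_2 > S_1 * PM - C_1 \iff PM > \frac{C_2 - C_1}{S_2 - S_1} = \frac{150 - 40}{500 - 200} \approx 36.7\%
$$

At the 50% margin used here, Scenario 2 wins. With a margin below roughly 36.7%, the lower-ACoS Scenario 1 would be the more profitable one.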

This is not enough information, however, to accurately optimize our profits. We also have to take the following into consideration:

  • Predicting the relationship between bid and sales
  • Budget constraints
  • Relationship between Budget Nodes
  • Various Profit Margins

Let's explore these considerations.

Building A Prediction Model

  • predicting increase in sales when raising bids

One of the concepts we explored above is that raising bids can result in more sales. While that is generally true, we need to consider a couple of factors.

  • Bid/Sales Curve - The relationship between bid and sales is non-linear. As you continue to raise bids, at some point you will have diminishing returns in sales, beyond which raising bids will result in a decrease in profit, and eventually a loss.

    We can expect the bid-sales curve to roughly follow a concave shape, where the curve is steep at the beginning and then flattens out as we raise bids.

Being able to accurately predict this bid/sales concave parabolic curve will be paramount in calculating our expected change in profit while adjusting bids.

Quadratic Regression

If there were no other factors involved we would just be looking at a standard quadratic regression, where we attempt to find the parabolic curve that best fits the data. The result would take the following format:

$$ y = ax^2 + bx + c $$

Where:

  • y is the sales
  • x is the bid
  • a, b, and c are coefficients estimated from data

To determine these coefficients, you can use least squares regression, which is available in many statistical software packages and programming environments (e.g., R, Python's numpy or scipy libraries, etc.).
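As a rough illustration of such a fit, the sketch below uses `numpy.polyfit` on a handful of made-up bid/sales observations. The data values are hypothetical and chosen only to show a concave shape; real data will be noisier.

```python
import numpy as np

# Hypothetical observations for one target: bid (x) and average daily sales (y).
bids = np.array([0.25, 0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 2.00])
sales = np.array([12.0, 31.0, 46.0, 57.0, 63.0, 66.0, 65.0, 61.0])

# Least-squares fit of y = a*x^2 + b*x + c; polyfit returns the highest power first.
a, b, c = np.polyfit(bids, sales, deg=2)

# A concave (downward-opening) parabola has a < 0 and peaks at x = -b / (2a).
peak_bid = -b / (2 * a)

print(f"a = {a:.2f}, b = {b:.2f}, c = {c:.2f}")
print(f"Fitted sales peak at a bid of about ${peak_bid:.2f}")
```

Note that the vertex marks the bid where fitted sales peak, not where profit peaks. Because cost also rises with the bid, the profit-maximizing bid would normally sit below it.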

However, we have to consider if there are other factors involved that could affect the bid/sales relationship. For example, does the Click-Through-Rate of a target keyword affect the bid/sales curve? It's very likely that it does.

What about recommended bid and conversion rate?

With all of these potential factors, we will need to use a more advanced model.

Accounting for Multiple Variables

We could use a multivariate regression model (multiple linear regression), which looks like this:

$$ Sales = a_0 + a_1 \times Bid + a_2 \times Clicks + \dots $$

but given the number of factors that we will be taking into account, we will likely be looking at machine learning based models like the following:

Decision Trees - Decision trees split the data into subsets based on rules, which makes them capable of capturing non-linear relationships.

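Random Forests - Random forests are an ensemble of decision trees, each trained on a random sample of the data and features. Averaging their predictions reduces the variance of any single tree while still capturing non-linear relationships.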

Gradient Boosted Trees - (e.g., XGBoost, LightGBM) Boosting algorithms train multiple models sequentially, with each model learning from the mistakes of its predecessors. These algorithms can capture complex relationships and are robust to overfitting with the right hyperparameters.
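To give a feel for what training and querying such a model could look like, here is a minimal sketch using scikit-learn's `GradientBoostingRegressor` as a stand-in for XGBoost or LightGBM. The training data is synthetic, the three features are an arbitrary subset, and the saturating "ground truth" formula is an assumption made purely so the example has something to learn.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Synthetic, purely illustrative training data: one row per target.
n = 500
bid = rng.uniform(0.2, 3.0, n)      # average daily bid
ctr = rng.uniform(0.001, 0.05, n)   # click-through rate
cvr = rng.uniform(0.02, 0.25, n)    # conversion rate

# Assumed ground truth for the demo only: units saturate as the bid rises.
units = 400 * ctr * cvr * (1 - np.exp(-1.5 * bid)) + rng.normal(0, 0.05, n)

X = np.column_stack([bid, ctr, cvr])
model = GradientBoostingRegressor(
    n_estimators=200, max_depth=3, learning_rate=0.05, random_state=0
)
model.fit(X, units)

# Query the model: same target (fixed CTR and CVR) at three candidate bids.
candidates = np.array([[0.5, 0.02, 0.10], [1.0, 0.02, 0.10], [2.0, 0.02, 0.10]])
predicted_units = model.predict(candidates)
print(predicted_units)
```

In practice you would hold out a validation set and tune the hyperparameters against it rather than judging the model on its training data.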

Machine Learning Model

Sample Data Set

Our objective is to build a machine learning model that can accurately predict the change in sales and cost when bids are adjusted. Training such a model requires a sample data set.

There are a couple things you should consider when creating your sample data set:

  • managing statistical significance
  • what data is available to us now, while generating the sample data set
  • what data will be available to us when we query the model

Aggregated/Averaged Metrics

The data the model is trained on should mirror, as closely as possible, the data we will query it with. Training on daily impressions, for example, would be a problem: when we run the model on a new target, its daily impressions are not available because they have not happened yet.

To start with we know that we could query the average Conversion Rate, Click-Through-Rate, Average Daily Conversions, Average Daily Sales, Average Daily Bid, Average Daily Cost and Average Daily Units.

In fact let's officially introduce those metrics now:

| Abbreviation | Definition |
| --- | --- |
| ADS | Average Daily Sales |
| ADCST | Average Daily Cost |
| ADCNV | Average Daily Conversions |
| ADU | Average Daily Units |
| ADB | Average Daily Bid |
| APP | Average Product Price |

Where each is the sum of all daily sales/conversions/units divided by the number of days in the time period.

$$ \text{Average Daily Sales} = \frac{\sum_{i=1}^{n} s_i}{n} $$
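A minimal pandas sketch of that aggregation is shown below. The column names (`TargetId`, `Sales`, `Cost`, `Units`, `Bid`) and the values are hypothetical and will differ from the headers in your actual target reports.

```python
import pandas as pd

# Hypothetical daily target report rows; the column names are illustrative.
daily = pd.DataFrame({
    "TargetId": ["t1", "t1", "t1", "t2", "t2", "t2"],
    "Date": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"] * 2),
    "Sales": [100.0, 0.0, 50.0, 20.0, 40.0, 0.0],
    "Cost": [20.0, 5.0, 11.0, 6.0, 9.0, 3.0],
    "Units": [2, 0, 1, 1, 2, 0],
    "Bid": [1.00, 1.00, 1.10, 0.50, 0.50, 0.50],
})

# Mean over days = sum of the daily values / number of days in the period.
averages = daily.groupby("TargetId").agg(
    ADS=("Sales", "mean"),
    ADCST=("Cost", "mean"),
    ADU=("Units", "mean"),
    ADB=("Bid", "mean"),
)
print(averages)
```

This assumes the data contains one row per target per day, including days with no activity. If zero-activity days are missing from the reports, the plain mean overstates the daily averages, and dividing the sums by the number of calendar days would be the safer choice.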

Because the ultimate goal is to predict profit, and we already know both the sale price and the profit margin (or COGS) of the product, we do not actually need to predict sales. Predicting daily conversions would be enough.

And building on that further, each conversion can have more than one unit sold, so we actually want to predict the daily units, which luckily is available to us in reports.

Product Price

That being said, if we are training a machine learning model, the more data we can give it the better. It's very likely that the price of a product (high price point vs low price point) is a factor in the bid/sales relationship.

Knowing that, you could add the price of the product to the model in a few different ways. You could calculate the APP from the historical data using the sales and units sold values. Alternatively, you could use the Ad Group IDs from your target reports to fetch the ASINs in those Ad Groups, fetch the price of each product using the Amazon Advertising API, and merge that data into your dataset.

Each method has some advantages. Calculating the average price from the historical data is easier, and it accounts for sales of non-target ASINs and for price fluctuations of the ASIN over time. Its major disadvantage is that you cannot calculate this metric for any target that has no sales.

One potential way around that would be to query the average product price by Ad Group, and then merge that data into your dataset. That way you would have the average price of the product for each Ad Group, and you could use that as the average price of the product for each target in that ad group.
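The fallback described above might look like the following in pandas. This is a sketch under assumed column names (`TargetId`, `AdGroupId`, `Sales`, `Units`); it pools sales and units per Ad Group and uses that ratio only where a target has no units of its own.

```python
import numpy as np
import pandas as pd

# Hypothetical aggregated target data; names and values are illustrative.
targets = pd.DataFrame({
    "TargetId": ["t1", "t2", "t3", "t4"],
    "AdGroupId": ["ag1", "ag1", "ag2", "ag2"],
    "Sales": [200.0, 0.0, 90.0, 30.0],
    "Units": [4, 0, 3, 1],
})

# APP per target = sales / units; undefined (NaN) when the target has no units.
targets["APP"] = targets["Sales"] / targets["Units"].replace(0, np.nan)

# Ad-group-level APP from pooled sales and units, used as a fallback.
group_totals = targets.groupby("AdGroupId")[["Sales", "Units"]].transform("sum")
group_app = group_totals["Sales"] / group_totals["Units"].replace(0, np.nan)
targets["APP"] = targets["APP"].fillna(group_app)
print(targets)
```

An Ad Group with no sales at all still ends up with a missing APP, so a further fallback (for example, a price fetched per ASIN) would be needed in that case.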

The key thing all of these metrics must have in common is that you can calculate them from the data available to you at the time of the query. If you have a data lake of historical target report data, you can calculate these metrics from that data, both for the training data set and for the data you will query the model with.

Another piece of data that is likely to correlate strongly with the outcome in our model is the recommended bid. Although daily recommended bids for each target are not available in reports, we could fetch the current recommended bid for each target using the Advertising API and add it to our sample data set. Because bid recommendations vary widely from target to target, we can use the recommended bid to see whether the current bid sits high or low in that range, and use that metric in our model.

Ideally we would have historical data on recommended bids, but we don't, so we will have to settle for the current recommended bid for each target.

We could then add the following features to our model:

  • deviation from recommended bid
  • deviation from recommended bid floor
  • deviation from recommended bid ceiling
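
These features could be derived along the following lines. The sketch assumes the current bid and the recommended bid with its floor and ceiling have already been merged into one dataframe; the column names are hypothetical, and expressing each deviation as a relative difference is one reasonable choice rather than the only one.

```python
import pandas as pd

# Hypothetical columns: current bid plus the recommended bid and its range.
df = pd.DataFrame({
    "Bid": [0.80, 1.50, 2.40],
    "RecBid": [1.00, 1.50, 2.00],
    "RecBidFloor": [0.70, 1.10, 1.50],
    "RecBidCeiling": [1.40, 2.00, 2.60],
})

# Relative deviations, so targets with very different bid levels are comparable.
df["DevFromRec"] = (df["Bid"] - df["RecBid"]) / df["RecBid"]
df["DevFromFloor"] = (df["Bid"] - df["RecBidFloor"]) / df["RecBidFloor"]
df["DevFromCeiling"] = (df["Bid"] - df["RecBidCeiling"]) / df["RecBidCeiling"]

# Position within the recommended range: 0 at the floor, 1 at the ceiling.
df["RangePosition"] = (df["Bid"] - df["RecBidFloor"]) / (
    df["RecBidCeiling"] - df["RecBidFloor"]
)
print(df.round(3))
```

`RangePosition` is an extra convenience feature, not one of the three listed above. It condenses "high or low on the recommended range" into a single number.
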

Statistical Significance (Bias/Variance Tradeoff)

If you remove low-impression items, you could introduce bias into your model. The model might become too optimistic about typical performance since it never sees examples of poor performance. On the other hand, including them could add noise and increase the variance of your model. Balancing bias and variance is key to building a robust model.

Instead of removing these low-impression data points, you can create features that account for the uncertainty due to low counts. For example, you could add a feature that indicates if the impression count is below a certain threshold, signaling to the model that this data point might be less reliable.

import pandas as pd

# Sample dataframe
data = {
    'Impressions': [5, 1500, 25000, 20, 40000],
    'Clicks': [1, 100, 2300, 2, 3600],
}
df = pd.DataFrame(data)

# Feature engineering: flag rows whose impression count is below a threshold
threshold = 100
df['LowImpression'] = (df['Impressions'] < threshold).astype(int)

# Another feature: Click-through rate
df['CTR'] = df['Clicks'] / df['Impressions']

print(df)

   Impressions  Clicks  LowImpression       CTR
0            5       1              1  0.200000
1         1500     100              0  0.066667
2        25000    2300              0  0.092000
3           20       2              1  0.100000
4        40000    3600              0  0.090000

Using the Model

Now that you have a model which can predict the change in daily units sold and cost when adjusting target bids, you can use that model to optimize your bids for profit. Here are some considerations to take into account while doing so.

Budget Constraints

  • How budget constraints affect profit optimization

We are typically constrained by a budget, which means that we have a limited amount of money to spend on advertising in a given time period. If we raise bids for "optimal" profit but then run out of budget halfway through the budget period and stop serving ads altogether, we will have lost out on potential profit.

We need to optimize profits over the entire budget period, not just at a single point in time.

Additionally we may need to prioritize certain targets over others. We can now predict which targets in a set are the most profitable relative to the others, in addition to how much cost those targets will incur on a daily basis.

We can use that information to predict if we will run out of budget before the end of the budget period, and if so, which targets we should prioritize.
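One simple way to act on those predictions is sketched below. It assumes the model has already produced a predicted daily cost and daily profit for each target; the `TargetForecast` class, the `prioritize` helper, and the all-or-nothing greedy rule are illustrative simplifications, not a prescribed method.

```python
from dataclasses import dataclass


@dataclass
class TargetForecast:
    name: str
    daily_cost: float    # predicted average daily ad cost (assumed > 0)
    daily_profit: float  # predicted average daily profit after ad cost


def prioritize(targets, budget, days):
    """Greedy sketch: fund targets in order of profit per ad dollar until the
    budget for the whole period is used up. Returns (funded, skipped)."""
    ranked = sorted(targets, key=lambda t: t.daily_profit / t.daily_cost, reverse=True)
    funded, skipped, remaining = [], [], budget
    for t in ranked:
        period_cost = t.daily_cost * days
        if period_cost <= remaining:
            funded.append(t.name)
            remaining -= period_cost
        else:
            skipped.append(t.name)
    return funded, skipped


targets = [
    TargetForecast("kw-a", daily_cost=10.0, daily_profit=8.0),
    TargetForecast("kw-b", daily_cost=20.0, daily_profit=10.0),
    TargetForecast("kw-c", daily_cost=15.0, daily_profit=3.0),
]
budget, days = 1000.0, 30

# At the combined predicted spend, how long does the budget last?
total_daily_cost = sum(t.daily_cost for t in targets)
days_until_exhausted = budget / total_daily_cost
print(f"Budget lasts about {days_until_exhausted:.1f} of {days} days")
print(prioritize(targets, budget, days))
```

A fuller version would lower the bids on the lowest-ranked targets instead of dropping them entirely, and would re-query the model for the new cost and profit at each candidate bid.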

Relationship Between Budget Nodes

  • Considering shifting budget between campaigns/portfolios.

In Amazon Advertising we can set budgets at either the Campaign or Portfolio level. Let us refer to each item that has a budget as a "budget node". Furthermore let us assume that we have multiple budget nodes that are all operating in parallel.

If we have optimized each budget node individually, we may not have optimized our profit across all budget nodes together.

Say for example that we have two budget nodes, A and B, that have both been optimized for profit individually to the best of our ability.

If node A is making more profit per ad dollar than node B, then it would be more profitable to shift budget from node B to node A.

It would be unwise however to make this the default behaviour, as there could be strategic reasons behind maintaining the budget in node B. Say for example that node B is a new product that we are trying to launch, and we are willing to take a loss on it for a period of time in order to gain market share. Or it could be a research campaign.
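The comparison, together with an opt-out for strategic nodes, could be expressed roughly as follows. The node records and the `strategic` flag are hypothetical. The sketch also uses average profit per ad dollar, which is only a proxy: what really matters is the profit on the next dollar moved, and that shrinks as a node receives more budget.

```python
# Hypothetical budget nodes; "strategic" marks nodes whose budget is protected.
nodes = [
    {"name": "A", "ad_cost": 500.0, "profit": 400.0, "strategic": False},
    {"name": "B", "ad_cost": 500.0, "profit": 150.0, "strategic": False},
    {"name": "Launch", "ad_cost": 300.0, "profit": -60.0, "strategic": True},
]


def profit_per_ad_dollar(node):
    return node["profit"] / node["ad_cost"]


# Only non-strategic nodes are candidates for giving up or receiving budget.
movable = [n for n in nodes if not n["strategic"]]
donor = min(movable, key=profit_per_ad_dollar)
recipient = max(movable, key=profit_per_ad_dollar)

if donor is not recipient:
    print(f"Consider shifting budget from {donor['name']} to {recipient['name']}")
```

Here the loss-making launch node is left untouched because it is flagged as strategic, even though it has the lowest profit per ad dollar.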

Strategic Target Considerations

What should be done with the targets that are deemed to be less profitable by the model?

We are now in a situation where we need to make a decision about two categories of targets:

  1. Targets that are still profitable, but less profitable than the most profitable targets
  2. Targets that have strategically set bids (catch-all, discovery/research)

For the first category, we could simply pause those targets, and shift the budget to the most profitable targets. Or we could give them very low bids like catch-all targets.

For targets whose bids are set strategically, you would want to avoid changing them for optimization purposes. The catch-all bids would get pushed higher, which would defeat the purpose. The research bids would be decimated, which would defeat the purpose of a research campaign.