Understanding Coffee Prices

news
code
analysis
Author
Affiliation
Published

June 21, 2025

Analysis of Coffee Prices with Segmentation and Markov Chains

The question of whether I will continue paying these high prices for coffee was a “nerd itch” I simply had to scratch. Below is a plot showing the coffee prices (in cents per pound) reported at a monthly frequency.

Raw Coffee Prices

As mentioned in the previous post on coffee prices, it’s evident that these prices follow distinct cycles in which prices rise to a peak and then decline. While some peaks are more pronounced than others, focusing on the major ones reveals seven to eight clear changepoints between 1990 and the present. To identify the beginning and end of these cycles, we can use a change point detection algorithm. In time series analysis, the process of dividing a series into segments with similar behavior is known as segmentation.

Note

I have not really explained how I arrive at the number of \(7-8\) changes in this post. This comes from an analysis of the decomposition of the time series into trend and seasonal components and a subsequent review of these components. If you are interested in the details, please check out the next post. An explaination is provided there.

The modeling approach we choose depends largely on how certain we are about the number of segments in the data. In some cases, this number is known with confidence; in others, we start with an estimate that must be refined through modeling. For the coffee price series, we have a fairly narrow estimate—between seven and eight segments. To determine the exact number and placement of these segments, we use a model selection technique. In essence, model selection helps us refine our estimate and identify the segmentation that best fits the data.

To identify segments in the coffee price time series, we use the PELT algorithm (Pruned Exact Linear Time)(Killick, Fearnhead, and and 2012). For a broader overview of change point detection methods, see (Truong, Oudre, and Vayatis 2020). The change point detection problem can be described as follows: we have a sequence of signal values—in this case, monthly coffee prices. Typically, prices from one month to the next are relatively consistent, but occasionally, a significant shift occurs, indicating the start of a new pattern. The goal of change point detection is to identify these moments of transition. Each segment between change points represents a distinct period of similar price behavior, also known as a price regime. When a change point is detected, it marks the beginning of a new regime.

A statistical model is assigned to each segment—for example, a probabilistic model where each segment is assumed to be a Gaussian Process. To prevent overfitting, a regularization or penalty term is included in the model. The parameters, along with the optimal segmentation, are determined by solving an optimization problem. For a comprehensive overview of this approach, see (Truong, Oudre, and Vayatis 2020), and for details on the specific change point detection method used here, refer to (Killick, Fearnhead, and and 2012). The method is briefly summarized in the equation below.

\(\sum_{i=1}^{i=m} \text{segment cost}_i + \beta * \text{penalty}\)

If we have \(m\) change points in the time series, then we have \((m+1)\) segments. Each seqment contributes a model component. This is the segment cost. To prevent overfitting we introduce a penalty, actually, a penalty function, \(f(n)\), where \(n\) is the length of the time series. One suggestion for \(f(n)\) is \(\log(n)\). (Killick, Fearnhead, and and 2012) suggests \(\beta\) to be a small number.

We need to adjust the penalty parameter to be consistent with our data. This is the crux of the model selection problem. The plot below shows the number of breakpoints (and consequently the number of cycles) as a function of the penalty. We need to use the plot below to see what value of penalty is consistent with the number of changes we see in reality. Number of Break Points vs Penalty

Correction

There is an edit to the previous version of the post. In the previous version I had mentioned that a Gaussian kernel is used as model for the segments. This is not a reasonable choice because the points in the series are not IID (identical, independently distributed), a Gaussian Process model is a much more reasonable choice. This model captures the correlation of values between data points. As a result of this change, the results from the modeling are different from what resulted from the Gaussian kernel.

Using a Gaussian Process for each segment and a penalty value of \(55\), we get a segmentation that is shown below.

Change Point Plot

Each segment, or regime, is a sequence of months with similar prices. The summary statistics of the segments is shown below. The size column represents the length of the segment in months.

Regime Summary

A violin plot of the prices during each of the cycles is shown below

Violin Plot of Prices

It is evident that each regime has a unimodal distribution and the Gaussian assumption seems reasonable. Strictly speaking, a Q-Q plot and a goodness of fit test are warranted.

Segmentation gives us time periods where prices can be explained by a particular random component, say a straight line with a particular slope, or, as in this example, samples from a particular Gaussian family. The goal here is to understand behavior and identify characteristics of coffee prices, as apposed to an overt task like forecasting (Can we predict next month’s price accurately?). In other words, this is an analysis exercise as opposed to a strict model development exercise. We need a modeling method that summarize the stochastic characteristics of each regime or segment and in the process reveal insights about how prices behaved in that segment. Markov models can do this for us.

The approach taken here is to model prices as a markov chain. In fact, a simplification is applied. The prices for each regime are first quantized into three discrete levels: Low, Medium and High, by applying a n-tile function to the prices for that regime. As a consequence, the sequence for each regime is a sequence of discretized labels. These are the states of the markov chain for each regime. We can compute a transition matrix corresponding to these states for each regime. The transition matrix is a contingency table between successive states in each regime. Such a table can be represented as a matrix. The \((i,j)^{\text{th}}\) entry of this table represents the count of transitions between predecessor state \(i\) and successor state \(j\). When we normalize the entries of this matrix by the row sum, we have the transition probablities for the regime. This is a stochastic matrix.

Markov chains are widely used to study, understand, and model stochastic processes across a range of fields, including economics, science, and engineering. Their versatility and broad applicability make them a powerful tool for analyzing systems governed by probabilistic behavior. For a comprehensive introduction to Markov chains, see:

  1. The QuantEcon website(QuantEcon 2025)
  2. Matthew Aldridge’s course (Aldridge 2025), “Introduction to Markov Processes”
  3. Jason Bramburger’s course (Bramburger 2025), “Introduction to Mathematical Modeling”

The stochastic matrix for the current period (now) is shown below.

Stochastic Matrix, R8

If the stochastic matrix is irreducible —meaning that it is possible to transition from any state to any other state in a finite number of steps—then the corresponding regime has a stationary distribution. This distribution represents the long-run probabilities of the prices being in each of the three states within that regime. For the current period, the stochastic matrix is indeed irreducible, and its associated stationary distribution is as follows:

Stationary Probablities, R8

Now that the modeling has been described, what do the results suggest. Here is what I was able to pick.

  1. Cycle Lengths Vary Widely: Coffee price cycles can range from as short as 25 months to as long as 70 months.

  2. Rising Volatility Since 2005: The variability in prices within each cycle—measured by the standard deviation—has been increasing since around 2005 (beginning with regime R5). This trend aligns with major global disruptions such as the financial crisis, the COVID-19 pandemic, and the Ukraine conflict. The rise in variance is also clearly visible in the violin plots. Increased volatility implies that businesses could potentially reduce costs by stockpiling coffee when demand is stable, though practical constraints and regulations may limit this strategy.

  3. Coffee Is Currently Expensive: We are presently in a high-price regime—bad news for coffee drinkers.

  4. Stationary Distributions Provide Insight: The stationary distribution of a regime captures the long-term probability of prices falling into different tiers. In the current regime, for example, prices are expected to be in the lowest third of the historical range 47% of the time, in the middle third 31% of the time, and in the highest third 22% of the time.

  5. Large Price Swings Are Rare: Within a regime, large jumps between low and high price levels are uncommon. Transitions typically occur gradually—for example, from low to medium before reaching high. This pattern means sudden, sharp price changes from one day to the next are unlikely.

These are some of the observations I could make. If you can identify others, please let me know along with a justification and rationale for your suggestion.

The code for this post is available. These are grouped as follows:

  1. Download Data
  2. Identify Changepoints
  3. Discretize Prices in Regimes
  4. Markov Analysis

References

Aldridge, Mathew. 2025. Introduction to Markov Processes.” Aldridge, M. (no date) Math2750 introduction to Markov Processes, Matthew Aldridge. Available at: https://mpaldridge.github.io/math2750/ (Accessed: 09 April 2025).
Bramburger, Jason. 2025. “- YouTube — Youtube.com.” https://www.youtube.com/watch?v=TCJdgmquZXU&list=PLXsDp0z6VWFT5ZM86xh8i1AMFYxnrefLk&index=28.
Killick, R., P. Fearnhead, and I. A. Eckley and. 2012. “Optimal Detection of Changepoints with a Linear Computational Cost.” Journal of the American Statistical Association 107 (500): 1590–98. https://doi.org/10.1080/01621459.2012.737745.
QuantEcon. 2025. QuantEcon — Quantecon.org.” https://quantecon.org/.
Truong, Charles, Laurent Oudre, and Nicolas Vayatis. 2020. “Selective Review of Offline Change Point Detection Methods.” Signal Processing 167: 107299. https://doi.org/https://doi.org/10.1016/j.sigpro.2019.107299.

Citation

BibTeX citation:
@online{sambasivan2025,
  author = {Sambasivan, Rajiv},
  title = {Understanding {Coffee} {Prices}},
  date = {2025-06-21},
  url = {https://rajivsam.github.io/r2ds-blog/posts/markov_analysis_coffee_prices/},
  langid = {en}
}
For attribution, please cite this work as:
Sambasivan, Rajiv. 2025. “Understanding Coffee Prices.” June 21, 2025. https://rajivsam.github.io/r2ds-blog/posts/markov_analysis_coffee_prices/.