Simplified historical data / probabilistic inventory control technique

Ronen E

Hello,

I'd like to raise an entry-level (?) question/topic in operations research. I have no formal training in this field and would appreciate any input from people with relevant expertise/experience. Perhaps this is not the right spot to post it, but given that statistical data and probabilistic modelling are involved, I thought it might not be too far off...

Imagine the following scenario:

I have a store that supplies goods in a specific (technical) field. The store caters for both retail and wholesale, serving a variety of different customers, and we have no formal insight into their demand patterns; for all we know, demand is completely random. But a comprehensive database of all transactions (buy, sell, goods in, goods out, etc.) exists, going back several years.

Every week, I need to decide what goods to order and in what amounts. I'd like to use the entirety of the data available to me (on a rolling basis) to optimise the store's performance. I'd like to minimise the probability that any given (specific) customer demand cannot be fulfilled from the store's local stock (i.e. minimise backorders), whilst keeping inventory costs minimal. Obviously there's a tradeoff, so part of the question is how to optimise... Minimise the product "disappointment probability" x inventory cost?
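To make the tradeoff concrete, here's the kind of mechanised calculation I have in mind - just a sketch with made-up numbers (demand history, unit cost, interest rate and "disappointment" penalty are all placeholders), sweeping candidate stock levels against historical weekly demand and scoring each by holding cost plus a penalty for unmet demand:

```python
# Sketch: sweep candidate stock levels against historical weekly demand and
# score each by holding cost plus a penalty for unmet demand.
# All numbers are made-up placeholders, not real data.

weekly_demand = [12, 7, 0, 23, 15, 9, 4, 18, 11, 6, 14, 2]  # units asked for per week

unit_cost = 40.0           # purchase price per unit (assumed)
annual_interest = 0.08     # cost of money tied up in stock (assumed)
holding_cost = unit_cost * annual_interest / 52   # per unit held, per week
backorder_penalty = 25.0   # assumed "disappointment" cost per unit short

def expected_weekly_cost(stock_level):
    """Average weekly cost if every week starts with stock_level units on hand."""
    total = 0.0
    for d in weekly_demand:
        leftover = max(stock_level - d, 0)   # units left sitting on the shelf
        short = max(d - stock_level, 0)      # units that would go on backorder
        total += leftover * holding_cost + short * backorder_penalty
    return total / len(weekly_demand)

best = min(range(max(weekly_demand) + 1), key=expected_weekly_cost)
print("candidate stock level:", best)
print("expected weekly cost:", round(expected_weekly_cost(best), 2))
```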

For discussion simplicity, let's assume I only have one item in my store. If I knew how to optimise its stock level (how many units to order every week), I could theoretically do the same for hundreds of different items.

Let's also assume that all I have is a general purpose spreadsheet software and I don't wish to invest in dedicated software (there is an ERP software in place but let's ignore it for discussion sake).

My database includes comprehensive information about my item:
- Every sale, including date, who the customer was (acc. #), what the unit sell price was, how many were supplied from store stock and how many went on backorder.
- Every purchase (by specific supplier), including date ordered and date goods-in, quantity ordered and unit price.

Let's assume storage space is infinite. Obviously there's overhead, but let's assume it's fixed, stable and low. My main inventory cost is in sunk purchase cost. Since space is unlimited and there are no special storage requirements (like unit volume, weight, hazard / special handling), let's assume the only cost in holding the inventory is interest on the purchase. The item I have is non-perishable and has infinite shelf life (simplifying assumption).
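For concreteness, this is roughly how I imagine turning that transaction log into a weekly demand series per item. The file name and column names are only my guess at what such an export might look like, not an actual schema:

```python
import pandas as pd

# Sketch: turn the raw sales log into a weekly demand series per item.
# "sales_export.csv" and the columns ("date", "item", "qty_supplied",
# "qty_backordered") are assumed placeholders for the real export.
sales = pd.read_csv("sales_export.csv", parse_dates=["date"])

# Demand = what the customer asked for, whether or not it was in stock.
sales["demand"] = sales["qty_supplied"] + sales["qty_backordered"]

weekly = (sales.set_index("date")
               .groupby("item")["demand"]
               .resample("W").sum()             # weekly totals per item
               .unstack("item", fill_value=0))  # one column per item, one row per week

print(weekly.tail())
```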

I guess my question is: Where do I start in thinking about this / trying to come up with a method or algorithm to deal with it? Can you recommend a paper, a book, a method?

Thank you,
Ronen.

PS
I researched the topic online a little and looked on Google Scholar. I found several references, but then I realised I have no way of telling which ones are "good" / worthwhile investing time in studying. And then I remembered Elsmar!
 
My initial thoughts:

1) Model the sales data based on "shelf time": find a functional fit that best describes how long products sit on a shelf before being sold. From this you get a CDF that lets you estimate how long it takes before some fraction of the items will be sold. [Hopefully it would be an exponential distribution, but a Weibull/GEV distribution may make more sense - see the rough sketch after this list.]

2) At this point, you can set a date target for when you want to "restock".

3) The date target can be used as variable input for other equations, such as a time-to-restock curve. (if all you cared about was maintaining inventory levels)

The financial considerations can be factored against the "shelf-time" curve... so that you aren't investing too much in the cost-of-goods versus expected income. You ought to be able to find relative maxima/minima by manipulating that one variable, and should something like your cost on inventory change, you can look at varying restock times relative to the expected profit (or vice versa).
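A rough sketch of step 1, with made-up shelf-time numbers and assuming scipy is available:

```python
import numpy as np
from scipy import stats

# Hypothetical "days on shelf before sale" for one item (placeholder data).
shelf_days = np.array([3, 8, 15, 4, 22, 9, 31, 6, 12, 18, 5, 27, 10, 14])

# Fit a two-parameter Weibull (location pinned at 0, since shelf time can't be negative).
shape, loc, scale = stats.weibull_min.fit(shelf_days, floc=0)

# The fitted CDF gives the time by which some fraction of a batch should be sold,
# e.g. ~80%, which can serve as the restock date target of step 2.
t80 = stats.weibull_min.ppf(0.80, shape, loc=loc, scale=scale)
print(f"Weibull shape={shape:.2f}, scale={scale:.1f} days")
print(f"~80% of a batch expected to sell within {t80:.0f} days")
```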
 
Thank you, but this sounds impractical. There are literally hundreds (if not thousands) of items in the store, and I can't devote more than an hour a day to grooming the data. I'm looking for something that can be mechanised: historical data >>> "Excel" >>> actionable decisions.

I also found papers that discuss distribution-free models, so I'm interested in that. It seems more powerful (if it can be pulled off) and less work-intensive.

Another important aspect is that this involves 2 independent stochastic "worlds" that interact via the store. The supply side is by no means deterministic. From my perspective, it presents as a random time for goods-in (and the price is also variable, but this seems a little less "arbitrary").
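One way I can imagine handling both random "worlds" at once is to resample history for demand and for lead time together, and look at how much demand can arrive while waiting for goods-in. A rough sketch with placeholder numbers:

```python
import random

# Sketch: resample history for both random "worlds" - weekly demand and
# supplier lead time - to see how much demand can arrive before goods come in.
# All numbers are illustrative placeholders.
random.seed(1)

weekly_demand = [12, 7, 0, 23, 15, 9, 4, 18, 11, 6, 14, 2]   # units per week
lead_times_weeks = [1, 2, 2, 3, 1, 4, 2]                      # observed order-to-delivery

def one_lead_time_demand():
    """Draw one plausible 'demand while waiting for a delivery' figure."""
    weeks = random.choice(lead_times_weeks)
    return sum(random.choice(weekly_demand) for _ in range(weeks))

samples = sorted(one_lead_time_demand() for _ in range(10_000))
reorder_point = samples[int(0.95 * len(samples))]   # covers ~95% of resampled cases
print("candidate reorder point:", reorder_point)
```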

When I went into this I thought "that wheel must have already been invented". However, after browsing a bit, my impression (by no means necessarily correct) was:
1. Like many other fields, there is a lag factor and people are still using methods from 50 years ago, and
2. There are new approaches around (quite a number of them) but no way of telling what's valid.
 
Thank you, but this sounds impractical. There are literally hundreds (if not thousands) of items in the store, and I can't devote more than an hour a day to grooming the data. I'm looking for something that can be mechanised: historical data >>> "Excel" >>> actionable decisions.
A couple of comments:

Excel can do everything I mentioned. Finding a model for the sales data won't be trivial, but it is straightforward. You'll need a model to be able to do predictions; how much effort goes into the model is up to you. Personally, I'd pick a handful of items I think are representative, if necessary dividing them into categories (for example: items with a shelf life, such as milk, would be different from "forever" items like plastic bags), and look at the sales data to see what model(s) are appropriate. I'm not suggesting that you need to generate Anderson-Darling statistics for every item for all time, just that you land on a predictive model that you feel comfortable with.

After you have the model(s), then use the data from "thousands of items" (or "dozens of categories") to generate (if desired) estimators for the chosen model, to predict how long a given item/category will sit on the shelf / how long it takes to sell x fraction of a batch... You can cross-check the model at any time for any specific item.

Every week, I need to decide what goods to order and in what amounts. I'd like to use the entirety of the data available to me (on a rolling basis) to optimise the store's performance. I'd like to minimise the probability that any given (specific) customer demand cannot be fulfilled from the store's local stock (i.e. minimise backorders), whilst keeping inventory costs minimal. Obviously there's a tradeoff, so part of the question is how to optimise... Minimise the product "disappointment probability" x inventory cost?
If the primary concern is "Will I have an item on the shelf?", this doesn't sound much different (to me) than "will this finished good have failed if I go to use it?" (modulo 'replenishment', which ought to be factored out) which is why I naively expect a Weibull/GEV distribution to be the model. The inventory target seems like it should be >= the typical customer order, with whatever confidence you decide. These all sound like classical one-sided distribution concerns. For this exercise I wouldn't sweat things like trying to find unbiased estimators, and critical thinking needs to be applied when trying to fit any given model. "All models are wrong, but some are useful."
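As a rough illustration of the one-sided idea, with made-up order sizes (the confidence levels are just examples):

```python
import numpy as np

# Sketch of "inventory target >= the typical customer order, with whatever
# confidence you decide": a one-sided quantile of single-order sizes.
# The order sizes below are made-up placeholders.
order_sizes = np.array([1, 2, 2, 5, 1, 30, 3, 2, 8, 1, 4, 12, 2, 6])

for conf in (0.90, 0.95, 0.99):
    level = np.quantile(order_sizes, conf)
    print(f"stock covering {conf:.0%} of single orders: {level:.0f} units")
```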
 
Well, there is no easy way to actually do this. Several ERP systems are pretty good at it, BUT regardless of system (Excel, some fake AI thing, etc.) you still have to load the data into the system, and you have to decide what model to use and how much buffer you need. This really hasn't changed in the last 50 years. Ask Toyota. Ask the grocery stores... neither uses fancy-schmancy statistics.

Basically you need to understand the 'sales' rate and the replenishment time, plus a buffer. Your replenishment time needs to be stable and predictable. It's just simple math. And don't just look at the average sales rate - remember the standard deviation.
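In formula form, the "simple math plus a buffer" looks something like this. All numbers are placeholders, and it assumes the replenishment time really is stable; a variable lead time would need a bigger buffer:

```python
import math

# Reorder point = average demand over the replenishment time + safety stock.
avg_weekly_sales = 11.0    # mean demand per week (assumed)
sd_weekly_sales = 6.5      # standard deviation of weekly demand (assumed)
lead_time_weeks = 2.0      # replenishment time (assumed stable and predictable)
z = 1.65                   # roughly a 95% service level multiplier

safety_stock = z * sd_weekly_sales * math.sqrt(lead_time_weeks)
reorder_point = avg_weekly_sales * lead_time_weeks + safety_stock
print(f"safety stock ~ {safety_stock:.0f} units, reorder point ~ {reorder_point:.0f} units")
```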

And remember any ‘sales’ or other promotions will disrupt the stable system…how will you adjust for that?
 
You need to understand your customers better and ask them about their buying needs.

There was a comparatively recent case where a major car company suddenly had a rush in demand for some specific spare parts; from memory I think it was suspension components. Those buying in the spares for the company's parts distribution network flagged a potential problem - is there a design issue that has suddenly caused a lot of these parts to break on customers' cars? Possible recall on the way?

Turned out it was the company's own R&D Department that was buying the parts: they thought the existing part could be used on a new design, so they'd started buying them up to use in tests and on development/prototype "mules" for a new model.
 
Also good to know: Excel has a Data Model and Power Query built in these days, and that can help you do more, in a more organized and branching fashion, than PivotTables and separate statistical runs of the data can do.

It's not as clicky as Power BI, but plenty good enough if you are the only one using the meaning that comes out of it. Another advantage is that you can 'automate' the cleaning you would do, or remove anomalous entries, in the input query, and have those steps repeated every time you update from the source. It does require a 'stable' source in terms of the shape and naming of the data.

Molding and kneading your data in that way might well show you month-over-month changes over a period, and highlight the items where things are starting to change.
 