Research article

Tutorial on prescriptive analytics for logistics: What to predict and how to predict

  • † The authors contributed equally to this work
  • Received: 04 October 2022 Revised: 17 January 2023 Accepted: 18 January 2023 Published: 24 February 2023
  • The development of the Internet of things (IoT) and online platforms enables companies and governments to collect data over much broader spatial and temporal ranges in the logistics industry. This wealth of data provides new opportunities to handle uncertainty in optimization problems within the logistics system. Accordingly, various prescriptive analytics frameworks have been developed to predict different parts of uncertain optimization problems, including the uncertain parameter, the combined coefficient containing the uncertain parameter, the objective function, and the optimal solution. This tutorial is a pioneering introduction to the literature on state-of-the-art prescriptive analytics methods, such as the predict-then-optimize framework, the smart predict-then-optimize framework, the weighted sample average approximation framework, the empirical risk minimization framework, and the kernel optimization framework. Building on these frameworks, this tutorial further proposes possible improvements and practical tips to consider when these methods are used. We hope that this tutorial will serve as a reference for future prescriptive analytics research on the logistics system in the era of big data.

    Citation: Xuecheng Tian, Ran Yan, Shuaian Wang, Yannick Liu, Lu Zhen. Tutorial on prescriptive analytics for logistics: What to predict and how to predict[J]. Electronic Research Archive, 2023, 31(4): 2265-2285. doi: 10.3934/era.2023116




    Uncertainty is ubiquitous in logistics: travel times vary with unexpected weather and traffic conditions, the prices of delivery services fluctuate with the changing supply-demand relationship, and transportation demand shifts with changes in the economy and society [1]. Uncertainty is generally perceived as harmful to the logistics system, as it increases running costs, lowers resource utilization, and reduces customer satisfaction [1]. Therefore, an increasing number of logistics studies account for uncertainty, aiming to mitigate its adverse effects on operations. For uncertain optimization problems in the logistics industry, we observe that the uncertainty can appear in different parts of the optimization problem. Some optimization problems have uncertainty in their objective functions, such as the routing problem (see Example 1, where the travel time on each arc is uncertain) and the energy-cost aware scheduling problem (see Example 2, where the energy price during each time period is uncertain). Other optimization problems have uncertainty in their constraints, such as problems with constraints established to fulfill a given level of service or uncertain demand (see Example 3, where the demand of each booking class is uncertain). It is also possible for uncertainty to appear in both the objective function and the constraints. To illustrate these observations, we present three examples from the logistics system with uncertainty in different parts of their optimization problems: the first two have uncertainty in their objective functions, and the third has uncertainty in its constraints.

    Example 1. Routing problem.

    Assume that there is a transport network denoted by $G=(N,S)$, where $N$ is the set of nodes and $S$ is the set of arcs. Each arc $s \in S$ has an uncertain travel time, denoted by $c_s$, and we define $c := (c_1, \ldots, c_{|S|})$. The objective is to decide a path on which to drive from origin $o \in N$ to destination $d \in N$ with the minimum travel time. Define $x := (x_1, \ldots, x_{|S|})$ as a binary decision vector, where $x_s$ is the decision variable that takes the value of one if arc $s$ is traversed and zero otherwise. The mathematical model is as follows:

    $$\min_{x \in X} Z_{\text{routing}}(c, x) = \min_{x \in X} \sum_{s \in S} c_s x_s, \quad (1.1)$$

    where $X$ is a given set that describes the network constraints.
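As a concrete illustration of model (1.1), the following minimal Python sketch enumerates the simple paths of a toy network and returns the minimum-time path; the network and travel times are illustrative, not taken from the paper.

```python
# Hypothetical travel times (minutes) on the arcs of a tiny network,
# with origin "o" and destination "d".
arcs = {
    ("o", "a"): 10, ("o", "b"): 15,
    ("a", "b"): 3,  ("a", "d"): 25,
    ("b", "d"): 12,
}

def shortest_path(arcs, origin, dest):
    """Brute-force stand-in for model (1.1): enumerate simple paths and
    return (total_time, path) with the minimum total travel time."""
    best = (float("inf"), None)

    def extend(node, visited, elapsed):
        nonlocal best
        if node == dest:
            best = min(best, (elapsed, visited))
            return
        for (u, v), c in arcs.items():
            if u == node and v not in visited:
                extend(v, visited + (v,), elapsed + c)

    extend(origin, (origin,), 0)
    return best

print(shortest_path(arcs, "o", "d"))  # the o-a-b-d path, 25 minutes
```

On large networks one would of course use a dedicated shortest-path algorithm; the enumeration above only serves to make the abstract model concrete.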

    Example 2. Energy-cost aware scheduling problem.

    Assume that $J$ is the set of tasks, $R$ is the set of available resources, and $T$ is the set of time periods of equal length. Each task $j \in J$ is specified by its duration $d_j$ (an integer multiple of a time period), earliest starting time at the beginning of period $e_j$, latest ending time at the beginning of period $l_j$, and power usage $p_j$. Denote $u_{jr}$ as the resource usage of task $j$ for resource $r$, $q_r$ as the available capacity of resource $r$, and $v_{jt}$ as the binary variable that takes the value of one if task $j$ starts at the beginning of time period $t$ and zero otherwise. Furthermore, we require that each task is scheduled exactly once, and the machine can be scheduled to work on more than one task simultaneously. Assuming that $y_t$ is the uncertain energy price during time period $t$, the objective is to minimize the total energy cost. Thus, we define $v$ as a $|J| \times |T|$ matrix with elements $v_{jt}, j \in J, t \in T$, and $y := (y_1, \ldots, y_{|T|})$. The mathematical model is as follows:

    $$\min_{v} Z_{\text{ener}}(y, v) = \min_{v} \sum_{j \in J} \sum_{t \in T} v_{jt} \Big( \sum_{t \le t' < t + d_j} p_j y_{t'} \Big) \quad (1.2)$$

    subject to

    $$\sum_{e_j \le t \le l_j - d_j} v_{jt} = 1 \quad \forall j \in J \quad (1.3)$$
    $$\sum_{j \in J} \sum_{\max\{0,\, t - d_j\} < t' \le t} u_{jr} v_{jt'} \le q_r \quad \forall r \in R, t \in T \quad (1.4)$$
    $$v_{jt} \in \{0, 1\} \quad \forall j \in J, t \in T. \quad (1.5)$$

    Constraints (1.3) ensure that each task is scheduled exactly once, starting no earlier than its earliest starting time and ending no later than its latest ending time. Constraints (1.4) ensure that the total resource usage of the machine in each time period does not exceed the available capacity.
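The scheduling model (1.2)-(1.5) can likewise be made concrete by brute-force enumeration on a tiny instance; all task data, prices, and the single-resource capacity below are hypothetical.

```python
from itertools import product

# Hypothetical 4-period instance of model (1.2)-(1.5): prices y_t, one
# shared resource with capacity q, and tasks given as dictionaries with
# duration d, earliest start e, latest end l, power p, resource usage u.
prices = [5.0, 2.0, 1.0, 4.0]
capacity = 1
tasks = [
    dict(d=2, e=0, l=4, p=1.0, u=1),
    dict(d=1, e=0, l=4, p=2.0, u=1),
]

def energy_cost(task, start):
    # The combined coefficient sum_{start <= t' < start+d} p_j * y_t' in (1.2).
    return sum(task["p"] * prices[t] for t in range(start, start + task["d"]))

def solve(tasks, prices, capacity):
    """Enumerate all feasible start-time combinations (constraint (1.3)),
    discard those violating the capacity constraint (1.4), and return the
    minimum total energy cost with its start times."""
    feasible_starts = [range(t["e"], t["l"] - t["d"] + 1) for t in tasks]
    best = (float("inf"), None)
    for starts in product(*feasible_starts):
        usage = [0] * len(prices)
        for task, s in zip(tasks, starts):
            for t in range(s, s + task["d"]):
                usage[t] += task["u"]
        if max(usage) > capacity:
            continue
        cost = sum(energy_cost(task, s) for task, s in zip(tasks, starts))
        best = min(best, (cost, starts))
    return best

print(solve(tasks, prices, capacity))  # cheapest non-overlapping schedule
```

In practice such models are solved as mixed-integer programs; the enumeration merely shows how the combined coefficient enters the objective.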

    Although both examples have uncertainty in their objective functions, a noticeable difference between the objective functions of Examples 1 and 2 is that a coefficient $p_j$ appears in objective function (1.2), in addition to decision variables and uncertain parameters. We finally show another example with uncertainty in its constraints.

    Example 3. Static network revenue management.

    Denote $K$ as the set of booking classes, $g_k$ as the decision variable representing the capacity that the freight company intends to reserve for bookings of class $k$ over the finite planning horizon, $c_k$ as the operating cost of reserving a booking of class $k$, $f_k$ as the revenue of completing a booking of class $k$, $h_k$ as the amount of capacity used by a booking of class $k$, $Q$ as the amount of available capacity, $D_k$ as the uncertain demand for bookings of class $k$, $\gamma_k$ as the penalty cost incurred if the real demand of class $k$ cannot be met because of the shortage in allocated capacity, and $\xi_k$ as the recourse variable that represents the shortage amount of capacity for bookings of class $k$. We define $g := (g_1, \ldots, g_{|K|})$, $D := (D_1, \ldots, D_{|K|})$, and $\xi := (\xi_1, \ldots, \xi_{|K|})$. The objective is to determine the optimal reserved capacities for bookings of different classes to maximize the expected profit, i.e., the expected revenue minus the expected penalty cost and the operating cost, over the finite planning horizon. The two-stage mathematical model is as follows:

    [Stage 1]

    $$\max_{g} Z_{\text{static}}(g, D) = \max_{g} \Big\{ \mathbb{E}[\pi(g, D)] - \sum_{k \in K} c_k g_k \Big\} \quad (1.6)$$

    subject to

    $$\sum_{k \in K} h_k g_k \le Q \quad (1.7)$$
    $$g_k \ge 0 \quad \forall k \in K. \quad (1.8)$$

    [Stage 2]

    $$\pi(g, D) = \max_{\xi} \sum_{k \in K} \big[ \min(g_k, D_k) f_k - \gamma_k \xi_k \big] \quad (1.9)$$

    subject to

    $$g_k + \xi_k \ge D_k \quad \forall k \in K \quad (1.10)$$
    $$\xi_k \ge 0 \quad \forall k \in K. \quad (1.11)$$

    Constraints (1.7) ensure that the accepted bookings do not exceed the available capacity. Constraints (1.10) ensure that the sum of the capacity allocated to bookings and the unsatisfied demand is no smaller than the uncertain demand.
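Given a reservation plan and demand samples, the stage-2 problem (1.9)-(1.11) has a closed-form optimal recourse, $\xi_k = \max(D_k - g_k, 0)$, so the objective (1.6) can be approximated by simple Monte Carlo averaging. A minimal sketch with illustrative numbers:

```python
# Illustrative parameters for two booking classes.
f = [10.0, 8.0]       # revenue f_k per completed booking
gamma = [3.0, 2.0]    # penalty gamma_k per unit of unmet demand
c = [1.0, 1.0]        # stage-1 operating cost c_k per reserved unit

def stage2_profit(g, D):
    """Stage-2 profit (1.9) for one demand realization D: the optimal
    recourse is xi_k = max(D_k - g_k, 0)."""
    return sum(min(gk, Dk) * fk - pen * max(Dk - gk, 0)
               for gk, Dk, fk, pen in zip(g, D, f, gamma))

def sampled_objective(g, demand_samples):
    """Monte Carlo approximation of (1.6): E[pi(g, D)] - sum_k c_k g_k."""
    avg = sum(stage2_profit(g, D) for D in demand_samples) / len(demand_samples)
    return avg - sum(ck * gk for ck, gk in zip(c, g))

g = [5, 3]                       # a candidate reservation plan
samples = [[4, 5], [6, 2]]       # two hypothetical demand scenarios
print(sampled_objective(g, samples))
```

Maximizing this sampled objective over $g$ subject to (1.7)-(1.8) is then a deterministic problem for any fixed set of scenarios.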

    To model and solve optimization problems under uncertainty, different frameworks have been developed. Bertsimas and Koduri [2] divided these frameworks into two main categories, according to whether or not they take data as a primitive. The first category contains the literature on stochastic programming [3] and robust optimization [4,5], which does not take data as a primitive. These methods generally preset distributions for uncertain parameters without using any real data. However, it is unrealistic for decision-makers to know the ground-truth distributions of uncertain parameters.

    In contrast, because the development of internet technologies allows us to collect and store huge amounts of data, frameworks in the second category have emerged; these take data as a primitive to characterize uncertainty. They can be further classified into two subcategories according to the kind of data they use: historical data of the uncertain parameters themselves, or additional auxiliary data that can be used to predict the uncertain parameters. Frameworks in the first subcategory use only historical data to approximate the scenarios or distributions of uncertain parameters, without considering auxiliary data that might help predict them; examples include the sample average approximation (SAA) framework [6] and the data-driven distributionally robust optimization framework [7,8]. Frameworks in the second subcategory apply various machine learning (ML) techniques to predict uncertain parameters by leveraging not only their historical data but also related auxiliary data. This paper focuses on introducing the state-of-the-art frameworks in the second subcategory.

    The advancement of business analytics techniques in the second subcategory is attributed to the development of the Internet of things (IoT) and online platforms, which enable companies and governments to collect data over much broader spatial and temporal ranges [9]. For example, on-demand ride-hailing companies, such as Uber and Lyft, have stored millions of trip records from passengers around the globe since their establishment, which can help them develop smarter dispatch and pricing algorithms to achieve a more cost-effective match between supply and demand in a dynamic environment [9]. More specifically, according to the classification by He et al. [9], data for logistics studies generally come from the private sector, the public sector, and other sources. Private sector data come from private transportation and logistics service providers [10,11], social media and map service platforms [12,13], and emerging micromobility service providers [14]. Public sector data mainly come from government agencies [15] and public transit system operators [16,17]. Other sources include nongovernmental organizations [18], field research [19,20], and third-party platforms. The most common types of data used in logistics studies include origin-destination demand of passengers and customers [21,22,23], retailer sales data across different outlets [24,25], and real-world road network data [26,27,28].

    Figure 1 depicts the workflow of business analytics [9]. Motivated by a business problem, the workflow consists of the collection, preprocessing, and interpretation of data; the selection and refinement of predictive analytics methods; and the modeling for decision making in prescriptive analytics. Common business analytics scenarios that apply big data techniques to the logistics industry include, but are not limited to, driving and commuting [22,29,30,31], freight transport [32,33,34,35], last-mile delivery [23,36,37,38], manufacturing [39], and public services (e.g., healthcare service delivery; efficient distribution of food, water, and humanitarian aid; and military industrial logistics) [19,40,41]. Ultimately, business analytics aims to prescribe sound decisions; that is, we generally focus on the paths called prescriptive analytics. To derive decisions from data, there are two different paths for prescriptive analytics, namely the indirect path and the direct path.

    Figure 1.  A general workflow of business analytics [9].

    The indirect path first derives estimations or predictions via predictive analytics; these estimations and predictions then serve as inputs to the downstream decision process. The indirect path involves data, prediction, and decision, and is generally termed the predict-then-optimize (PO) framework or the estimate-then-optimize framework. Currently, many applications of smart technologies and big data analytics methods have shown promise in enhancing the efficiency and effectiveness of various logistics operations and transportation systems [42]. During the estimation and prediction stage, statistical analysis, such as Poisson processes [43,44], kernel density models [45,46], and continuous approximation [27,47], is often used to characterize the demand process of the logistics system. We have also witnessed an increased use of econometric and statistical learning tools in logistics studies to explore the relationship between demand and various covariates [17,22]. Furthermore, a wide range of predictive models, ranging from classical statistical methods (e.g., the popular autoregressive integrated moving average (ARIMA) model [24]) to novel ML methods (e.g., decision trees, support vector machines, random forests, and neural networks) [15,48,49], have been used in logistics studies. Empirically, Gunasekaran et al. [50] analyzed how the assimilation of big data and predictive analytics affects supply chain and organizational performance. Their findings suggest that connectivity and information sharing, under the mediation effect of top management commitment, are positively related to big data predictive analytics acceptance. Finally, during the optimization stage, the predicted values or distributions of the unknown parameters are plugged into the downstream optimization problems. The corresponding literature has been thoroughly reviewed by Chung [42], Nguyen et al. [51], and Wang et al. [52].

    Although the PO framework is easy to understand and implement, there is always a mismatch between the objectives of the predictive model and the optimization model; sometimes, a good prediction does not lead to a good decision [2]. As an alternative to the indirect path, the direct path is a recent trend in prescriptive analytics, which goes directly from data to decision and contains many advanced frameworks, such as the smart predict-then-optimize (SPO) framework [53], the weighted sample average approximation (w-SAA) framework [54,55], the empirical risk minimization (ERM) framework [55,56], and the kernel optimization (KO) framework [2,46,54,55]. These frameworks are rarely reviewed and compared in the existing literature.

    Whichever path these prescriptive analytics frameworks take, their ultimate goal is to prescribe optimal decisions by using ML methods to predict one of three parts of the uncertain optimization problem, namely, the uncertain parameter, the objective function, or the optimal solution. Yan et al. [57] further proposed that the combined coefficient consisting of uncertain parameters, such as the $\sum_{t \le t' < t + d_j} p_j y_{t'}$ term in Example 2, can also be predicted. This kind of prediction exploits the structural features of the optimization problem, and the combined coefficient can take any form, such as polynomial or exponential expressions. Therefore, we summarize that four parts can be predicted by prescriptive analytics frameworks: the uncertain parameter, the combined coefficient, the objective function, and the optimal solution. Accordingly, for the three examples shown above, the parts that can be predicted are shown in Table 1.

    Table 1.  The parts that can be predicted for each example.
    Example/Part to be predicted Uncertain parameter Combined coefficient Objective function Optimal solution
    Example 1 Yes No Yes Yes
    Example 2 Yes Yes Yes Yes
    Example 3 Yes No Yes Yes


    Remark 1. For Example 1, three parts can be predicted: the uncertain parameter $c$, the objective function $Z_{\text{routing}}(c, x)$, and the optimal solution $x^* = \arg\min_{x \in X} Z_{\text{routing}}(c, x)$. For Example 2, four parts can be predicted: the uncertain parameter $y_t$, the combined coefficient $\sum_{t \le t' < t + d_j} p_j y_{t'}$, the objective function $Z_{\text{ener}}(y, v)$, and the optimal solution $v^* = \arg\min_{v} Z_{\text{ener}}(y, v)$. For Example 3, three parts can be predicted: the uncertain parameter $D_k$, the objective function $Z_{\text{static}}(g, D)$, and the optimal solution $g^* = \arg\max_{g} Z_{\text{static}}(g, D)$. The difference is that coefficient prediction can only be used in uncertain models with a structure like that of Example 2; what the examples share is that parameter prediction, objective prediction, and optimizer prediction can all be applied to uncertain problems regardless of where the uncertainty lies in the optimization model.

    The huge amount of data acts as a catalyst for the development of prescriptive analytics, giving rise to various methods for predicting different parts of uncertain optimization problems. This tutorial makes the following contributions. First, we classify the prediction targets in prescriptive analytics into four categories: the uncertain parameter, the combined coefficient, the objective function, and the optimal solution. Second, for each prediction target, we review the corresponding state-of-the-art prescriptive analytics frameworks, which are rarely summarized and compared in the existing literature. Third, for each prescriptive analytics framework, we further propose possible improvements and practical tips to be considered when these frameworks are used in practice. Accordingly, we use the three examples to show how these methods can be applied in real applications when and where appropriate.

    If we have access to auxiliary data related to the uncertain parameters in optimization problems, the most common method for solving uncertain problems is to predict the uncertain parameters using ML models, which turns uncertain problems into easy-to-solve deterministic problems. If combined coefficients exist in the model, e.g., the term $\sum_{t \le t' < t + d_j} p_j y_{t'}$ in Example 2, we can also predict the combined coefficients directly.

    Take Example 1, for instance. Assume that we have collected the travel time on each arc over the past $n$ days, denoted by $c_s^i, i \in \{1, \ldots, n\}, s \in S$, where we define $c^i := (c_1^i, \ldots, c_{|S|}^i)$, as well as the auxiliary feature vector associated with the travel time, including features such as whether it is a working day, rainfall, temperature, and wind, amongst others, denoted by $a^i \in A \subseteq \mathbb{R}^{d_a}, i \in \{1, \ldots, n\}$. Given the new feature vector of today based on the weather forecast, denoted by $a_0$, our goal is to find a good path; that is, a path with minimum travel time. If we randomly pick a day, the features (i.e., auxiliary data) and travel times are random, denoted by $(\tilde{a}, \tilde{c})$, and their joint distribution is denoted by $\mathcal{D}$. Given the new feature vector $\tilde{a} = a_0$, $\tilde{c}$ is still a random variable, whose conditional distribution derived from $\mathcal{D}$ is denoted by $\mathcal{D}_{a_0}$. Consequently, given the new feature vector $a_0$, we should solve the following model for Example 1:

    $$\min_{x \in X} \mathbb{E}_{(\tilde{a}, \tilde{c}) \sim \mathcal{D}} \big[ Z_{\text{routing}}(\tilde{c}, x) \,\big|\, \tilde{a} = a_0 \big] = \min_{x \in X} \mathbb{E}_{\tilde{c} \sim \mathcal{D}_{a_0}} \big[ Z_{\text{routing}}(\tilde{c}, x) \big]. \quad (2.1)$$

    Because the objective function is linear in the uncertain parameter of Example 1, we can further obtain that

    $$\min_{x \in X} \mathbb{E}_{\tilde{c} \sim \mathcal{D}_{a_0}} \big[ Z_{\text{routing}}(\tilde{c}, x) \big] = \min_{x \in X} Z_{\text{routing}} \big( \mathbb{E}_{\tilde{c} \sim \mathcal{D}_{a_0}}[\tilde{c}], x \big). \quad (2.2)$$

    In order to solve $\min_{x \in X} Z_{\text{routing}}(\mathbb{E}_{\tilde{c} \sim \mathcal{D}_{a_0}}[\tilde{c}], x)$ with the conditional expectation $\mathbb{E}_{\tilde{c} \sim \mathcal{D}_{a_0}}[\tilde{c}]$, the PO framework is a typical method: it first predicts the uncertain parameter $\tilde{c}$ given the new observation $a_0$ by developing an ML model $F$ based on the dataset $\{(a^i, c^i)\}_{i=1}^{n}$, and then plugs the prediction $\hat{c} = F(a_0)$ into the optimization problem to derive decisions. Considering that the cost is a continuous prediction target, we can use the mean squared error (MSE) loss to train $F$, expressed as follows:

    $$L_{\text{MSE}} = \frac{1}{n} \sum_{i=1}^{n} \big\| c^i - F(a^i) \big\|_2^2. \quad (2.3)$$

    Assuming that we have infinitely many data points, under mild conditions, we can obtain the best estimate

    $$\hat{c} = F(a_0) = \mathbb{E}(\tilde{c} \mid \tilde{a} = a_0) = \mathbb{E}_{\tilde{c} \sim \mathcal{D}_{a_0}}[\tilde{c}]. \quad (2.4)$$

    The conditional expectation $\mathbb{E}_{\tilde{c} \sim \mathcal{D}_{a_0}}[\tilde{c}]$ is approximated by the estimate $\hat{c}$, and the optimization problem of Example 1 is then solved using the conditional mean $\hat{c}$.
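A minimal end-to-end sketch of the PO framework for Example 1 follows, assuming a toy network of two parallel o-d arcs and a single rainfall feature (both hypothetical): each arc's travel time is fitted by least squares, in the spirit of the MSE loss (2.3), and the predictions are plugged into the deterministic routing problem.

```python
def fit_line(xs, ys):
    """Closed-form simple linear regression: returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    w1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
         sum((x - mx) ** 2 for x in xs)
    return my - w1 * mx, w1

# Hypothetical history: rainfall a^i and the travel time c^i_s of each arc.
rain = [0.0, 1.0, 2.0, 3.0]
times = {"arc_A": [20, 22, 24, 26],    # mildly rain-sensitive
         "arc_B": [15, 20, 25, 30]}    # strongly rain-sensitive

models = {s: fit_line(rain, cs) for s, cs in times.items()}

def predict_and_route(a0):
    """Predict each arc's time for feature a0, then solve the (trivial)
    deterministic routing problem: pick the arc with the smaller prediction."""
    preds = {s: w0 + w1 * a0 for s, (w0, w1) in models.items()}
    return min(preds, key=preds.get)

print(predict_and_route(0.5))  # light rain
print(predict_and_route(3.0))  # heavy rain
```

The decision flips with the forecast: under light rain the strongly rain-sensitive arc is still predicted faster, while under heavy rain the mildly rain-sensitive arc wins.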

    A general assumption underlying the PO framework is that the objective function is linear in the uncertain parameter. However, if objective function (1.1) is not linear in the uncertain parameter, the PO framework cannot solve the original problem. For example, suppose a student is going to take an exam that starts in 60 minutes, meaning that a route is good only when its overall travel time is less than or equal to 60 minutes; the original problem (1.1) should then be

    $$\min_{x \in X} \mathbb{E}_{\tilde{c} \sim \mathcal{D}_{a_0}} I \Big( \sum_{s \in S} \tilde{c}_s x_s > 60 \Big), \quad (2.5)$$

    where $I(\cdot)$ is an indicator function that takes the value of one if the condition is true and zero otherwise, and the objective is to minimize the probability that the chosen route is not good given the new observation $a_0$. In this case, the objective function is not linear in the uncertain parameter, so $\min_{x \in X} \mathbb{E}_{\tilde{c} \sim \mathcal{D}_{a_0}} I(\sum_{s \in S} \tilde{c}_s x_s > 60) \neq \min_{x \in X} I(\sum_{s \in S} \mathbb{E}_{\tilde{c} \sim \mathcal{D}_{a_0}}[\tilde{c}_s] x_s > 60)$. To be more specific, assume that there are two paths A and B. Path A's travel time is 60 minutes, and path B's travel time is 59 minutes with 50% chance or 61 minutes with 50% chance. For a student going to take an exam that starts in one hour, these two paths are very different. If we solve $\min_{x \in X} \mathbb{E}_{\tilde{c} \sim \mathcal{D}_{a_0}} I(\sum_{s \in S} \tilde{c}_s x_s > 60)$, the optimal solution selects path A only. However, if we solve $\min_{x \in X} I(\sum_{s \in S} \mathbb{E}_{\tilde{c} \sim \mathcal{D}_{a_0}}[\tilde{c}_s] x_s > 60)$, both path A and path B are optimal. Therefore, the PO framework cannot solve the original problem when the original objective function is not linear in the uncertain parameter. To remedy this issue, we can turn to the following methods: the w-SAA method and the quantile-regression based method.
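The path A versus path B argument can be verified numerically; the snippet below contrasts the expectation of the indicator with the indicator of the expectation for path B's two scenarios.

```python
# Path B: 59 or 61 minutes, each with probability 0.5.
scenarios_B = [59, 61]
probs = [0.5, 0.5]

# E[I(time > 60)]: path B is late half the time.
exp_of_indicator = sum(p * (t > 60) for p, t in zip(probs, scenarios_B))

# I(E[time] > 60): the mean travel time is exactly 60, so "on time".
indicator_of_exp = sum(p * t for p, t in zip(probs, scenarios_B)) > 60

print(exp_of_indicator)   # 0.5
print(indicator_of_exp)   # False
```

Path A has a deterministic time of 60 minutes, so both quantities are 0 for it; only the expectation-of-indicator criterion distinguishes the two paths.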

    Instead of predicting point values of uncertain parameters, Bertsimas and Kallus [54] proposed the w-SAA framework to predict the conditional distribution of the uncertain parameter given the new observation. Under this framework, taking objective function (2.5) as an example, given a new observation $a_0$, the conditional distribution of $\tilde{c}$ is approximated empirically by weights $w(a^i, a_0), i \in \{1, \ldots, n\}$, where $w(a^i, a_0)$ measures the similarity between the historical example $a^i$ and the new observation $a_0$, and its form depends on the ML model we use. If we use a k-nearest neighbor (kNN) model, $w(a^i, a_0) = 1/k$ if $a^i$ is a kNN of $a_0$ and zero otherwise. The weights $w(a^i, a_0)$ can be seen as an approximation of the conditional distribution of $\tilde{c}$ given $\tilde{a} = a_0$, namely $\mathcal{D}_{a_0}$; that is, the approximate distribution of $\tilde{c}$ has $n$ scenarios $c^1, c^2, \ldots, c^n$ with probabilities $w(a^1, a_0), w(a^2, a_0), \ldots, w(a^n, a_0)$ (and it is possible that some probabilities $w(a^i, a_0)$ are 0, meaning that the approximate distribution of $\tilde{c}$ has fewer than $n$ scenarios). After obtaining the conditional distribution $\mathcal{D}_{a_0}$, the approximation of objective function (2.5) is as follows:

    $$\min_{x \in X} \sum_{i=1}^{n} w(a^i, a_0) \, I \big( (c^i)^\top x > 60 \big). \quad (2.6)$$
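The kNN weighting scheme and the weighted objective (2.6) can be sketched as follows, using one-dimensional illustrative features and the travel time of a single candidate route.

```python
def knn_weights(history, a0, k):
    """w(a^i, a0) = 1/k if a^i is among the k nearest neighbors of a0,
    else 0 (the kNN instance of the w-SAA weights)."""
    order = sorted(range(len(history)), key=lambda i: abs(history[i] - a0))
    near = set(order[:k])
    return [1.0 / k if i in near else 0.0 for i in range(len(history))]

# Hypothetical history: one feature a^i and the total time of one route.
features = [1.0, 2.0, 7.0, 8.0]
route_times = [58, 62, 70, 75]

def prob_late(a0, k, threshold=60):
    """Weighted scenario objective (2.6) for a fixed route: the estimated
    probability that its travel time exceeds the threshold."""
    w = knn_weights(features, a0, k)
    return sum(wi * (t > threshold) for wi, t in zip(w, route_times))

print(prob_late(1.5, k=2))  # neighbors 58 and 62 -> estimate 0.5
print(prob_late(7.5, k=2))  # neighbors 70 and 75 -> estimate 1.0
```

Evaluating this weighted objective for every feasible route and picking the minimizer implements (2.6) exactly.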

    If we aim to predict the conditional distribution of an uncertain parameter, the w-SAA method is a local ML method, which predicts the conditional distribution by measuring closeness to existing data [2,58]. This method in some sense discards data that is not close to the observation, and so it needs a lot of data to work well [2]. As an alternative, Wang and Yan [59] proposed a quantile-regression based global method that takes all data into account when estimating a single-dimensional parameter. Because the quantile-regression based global method stems from the traditional regression model, for a particular arc $s \in S$ in Example 1, if we first assume the linear regression model $F_s(a) = w_s^\top a$ as the predictive model, where $w_s$ is a $d_a \times 1$ vector (recall that $d_a$ is the dimension of the feature vector), we have

    $$w_s^* \in \arg\min_{w_s} \frac{1}{n} \sum_{i=1}^{n} \big( w_s^\top a^i - c_s^i \big)^2 \quad (2.7)$$

    and $F_s^*(a) = (w_s^*)^\top a$. Next, given the new observation $a_0$, we can obtain $\hat{c}_s = F_s^*(a_0) = (w_s^*)^\top a_0$ and $\hat{c} := (\hat{c}_1, \ldots, \hat{c}_{|S|})$. However, by minimizing the sum of squared errors using the traditional regression model, we are estimating the conditional mean $\mathbb{E}_{\tilde{c} \sim \mathcal{D}_{a_0}}[\tilde{c}]$ instead of the conditional distribution of $\tilde{c} \sim \mathcal{D}_{a_0}$, which may not work well when the objective function is not linear in the uncertain parameters. Alternatively, we can introduce a parameter $\alpha \in [0, 1]$ and obtain $w_s^\alpha$ by solving

    $$\min_{w_s^\alpha} \frac{1}{n} \sum_{i=1}^{n} \Big[ (1 - \alpha) \max \big( (w_s^\alpha)^\top a^i - c_s^i, 0 \big) + \alpha \max \big( c_s^i - (w_s^\alpha)^\top a^i, 0 \big) \Big]. \quad (2.8)$$

    By minimizing the above weighted sum of over- and under-estimation errors, we are estimating the $100\alpha$-th percentile of the uncertain parameter. For Example 1, we can estimate the 5th, 15th, ..., 95th percentiles of $\tilde{c}_s$. The distribution of $\tilde{c}_s \mid a_0$ is thus approximately $\Pr(\tilde{c}_s = (w_s^\alpha)^\top a_0) = \frac{1}{10}$, $\alpha = 0.05, 0.15, \ldots, 0.95$. Next, there are two cases to consider when solving objective function (2.5) after $\tilde{c}_s \mid a_0$ is obtained under each percentile. First, if we assume that the travel times of different arcs are highly correlated, we should solve

    $$\min_{x \in X} \frac{1}{10} \sum_{\alpha = 0.05, 0.15, \ldots, 0.95} I \big( (a_0)^\top w^\alpha x > 60 \big), \quad (2.9)$$

    where $w^\alpha := (w_1^\alpha, \ldots, w_{|S|}^\alpha)$ is a $d_a \times |S|$ matrix. Otherwise, if we assume that the travel times of different arcs follow independent distributions, we need to define $w_{\text{sample}} := (w_1^{\alpha}, \ldots, w_{|S|}^{\alpha})$ as a $d_a \times |S|$ matrix, where $\alpha$ is randomly sampled from $\{0.05, 0.15, \ldots, 0.95\}$ for each arc. Considering that there are $|S|$ arcs and each arc has 10 percentile values, there would be $10^{|S|}$ possible combinations for $w_{\text{sample}}$. Because it is time-consuming to enumerate all possible combinations in a large network, we resample $\lambda$ times from all combinations and denote each sampled combination by $w_{\text{sample}}^\epsilon$. We should then solve

    $$\min_{x \in X} \frac{1}{\lambda} \sum_{\epsilon = 1}^{\lambda} I \big( (a_0)^\top w_{\text{sample}}^\epsilon x > 60 \big) \quad (2.10)$$

    to prescribe final decisions.
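A minimal sketch of the quantile estimation in (2.8) follows, assuming the simplest possible model (a constant prediction, i.e., the feature is identically one) fitted by subgradient descent on the pinball loss; the travel-time data are illustrative.

```python
def fit_quantile(ys, alpha, steps=2000, lr=0.05):
    """Fit a constant prediction w minimizing the pinball loss
    (1 - alpha) * max(w - y, 0) + alpha * max(y - w, 0) over the data,
    i.e., the 100*alpha-th percentile of ys."""
    w = sum(ys) / len(ys)                 # start from the sample mean
    for _ in range(steps):
        g = 0.0
        for y in ys:
            # Subgradient of the pinball loss with respect to w.
            g += (1 - alpha) if w > y else -alpha
        w -= lr * g / len(ys)
    return w

# Hypothetical historical travel times (minutes) of one arc.
times = [50, 55, 58, 60, 61, 63, 66, 70, 74, 80]
print(fit_quantile(times, 0.5))   # close to the median
print(fit_quantile(times, 0.9))   # close to the 90th percentile
```

With a genuine feature vector $a^i$, the same subgradient recipe applies to $(w_s^\alpha)^\top a^i$; dedicated solvers (e.g., linear programming formulations of quantile regression) are preferable in practice.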

    For the frameworks mentioned above, the loss function used to train the ML models generally focuses only on minimizing the prediction error, such as the MSE loss function (2.3); it does not consider the impact of the predictions on the downstream optimization problems, leading to suboptimal solutions. Therefore, a more natural and appropriate method is to embed the optimization problem into the training process of the ML models, which is generally termed the SPO framework. The loss function commonly used under this framework for parameter prediction in Example 1, designed to measure decision error and termed the SPO loss, is expressed as follows:

    $$L_{\text{SPO}} = \frac{1}{n} \sum_{i=1}^{n} \big[ Z_{\text{routing}}(c^i, x^*(\hat{c}^i)) - Z_{\text{routing}}(c^i, x^*(c^i)) \big], \quad (2.11)$$

    where $x^*(\hat{c}^i) = \arg\min_{x \in X} Z_{\text{routing}}(\hat{c}^i, x)$ and $x^*(c^i) = \arg\min_{x \in X} Z_{\text{routing}}(c^i, x)$.
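The SPO loss (2.11) can be sketched as follows for a toy instance with two parallel routes: for each historical day, the regret is the true cost of the route chosen under the predicted times minus the true optimal cost. All numbers are illustrative.

```python
def best_route(times):
    """Solve the trivial routing problem: pick the cheapest route."""
    return min(times, key=times.get)

def spo_loss(true_samples, pred_samples):
    """Average decision regret (2.11) over historical (true, predicted)
    cost pairs: evaluate the prediction-driven decision at the true costs."""
    total = 0.0
    for c_true, c_pred in zip(true_samples, pred_samples):
        x_pred = best_route(c_pred)    # decision induced by the prediction
        x_true = best_route(c_true)    # full-information optimal decision
        total += c_true[x_pred] - c_true[x_true]
    return total / len(true_samples)

true_times = [{"A": 20, "B": 30}, {"A": 25, "B": 22}]
pred_times = [{"A": 21, "B": 29}, {"A": 24, "B": 26}]  # wrong ranking on day 2
print(spo_loss(true_times, pred_times))
```

Note that day 2's predictions are numerically close to the truth yet induce the wrong route, which is exactly the prediction-decision mismatch the SPO framework penalizes.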

    In order to synthesize predictive and prescriptive techniques into ML systems that learn to make decisions from empirical data, the resulting composite models often employ constrained optimization as a neural network layer and are trained in an end-to-end manner. Therefore, most SPO-related studies use feed-forward neural networks (NNs) with deep learning architectures composed of a sequence of layers [60]. However, training ML models with the SPO loss can be computationally difficult because the SPO loss function is nonconvex and discontinuous for combinatorial optimization problems: the discrete, discontinuous solution space prevents the decision loss from being easily differentiated with respect to the predicted values. Consequently, it is infeasible to back-propagate gradients that would inform the predictive model how it should adjust its weights to improve the decision quality of the prescribed solutions [61]. To overcome this problem, Wilder et al. [62] added a quadratic regularization term to the objective function of the relaxed form of the combinatorial problem, but this method can only be applied to combinatorial problems with a totally unimodular constraint matrix. Ferber et al. [61] strengthened this method by employing a cutting-plane solution approach, which tightens the continuous relaxation by adding constraints that remove fractional solutions. Furthermore, instead of computing the real decision loss by directly solving the combinatorial problem during training, some studies, such as Elmachtoub and Grigas [53] and Mandi et al. [63], have designed a class of surrogate loss functions based on a sub-gradient. A common issue with all of these approaches is that they need to repeatedly solve the (possibly relaxed) optimization problem, imposing a heavy computational burden. In contrast, Mulamba et al. [64] used a noise contrastive approach that views sub-optimal solutions as noise examples and caches them, replacing optimization calls with a look-up in the solution cache and thereby improving training efficiency.
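    To make the SPO (regret) loss discussed above concrete, the following minimal sketch evaluates it on a toy minimization problem whose feasible set is small enough to enumerate explicitly; the feasible set (binary vectors with exactly two ones), the cost vectors, and all numbers are synthetic illustrations, not taken from the paper:

```python
import numpy as np
from itertools import product

# Toy feasible set X: all binary 4-vectors with exactly two ones.
X = [np.array(x) for x in product([0, 1], repeat=4) if sum(x) == 2]

def best(cost):
    # x*(c) = argmin_{x in X} c^T x, found by enumeration.
    return min(X, key=lambda x: cost @ x)

c_true = np.array([1.0, 2.0, 3.0, 4.0])   # true arc costs
c_pred = np.array([4.0, 1.0, 2.0, 3.0])   # a (poor) prediction

# SPO loss = true cost of the decision prescribed under the prediction,
# minus the true optimal cost (the full-information benchmark).
spo_loss = c_true @ best(c_pred) - c_true @ best(c_true)  # here 2.0
```

    The nonconvexity discussed above shows up here directly: `best(c_pred)` is a piecewise-constant function of the prediction, so `spo_loss` has zero gradient almost everywhere with respect to `c_pred`.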

    Furthermore, some studies have begun to train decision trees to prescribe personalized decisions from a finite set of possible options instead of focusing only on the prediction error. Kallus [65] trained trees with a loss function that maximizes the effectiveness of the predictions rather than minimizing the prediction errors. Bertsimas et al. [66] studied a similar treatment recommendation problem, but adopted a weighted loss function that combines prediction and decision errors. Elmachtoub et al. [67] considered a more general class of decision-making problems that may involve a large number of decisions represented by a general feasible region. To train decision trees under the SPO loss, they proposed a tractable methodology called SPOTs. They claimed that SPOTs benefit from the interpretability of decision trees, allowing for an interpretable segmentation of a set of contextual features associated with different optimal solutions to the optimization problem of interest. In a recent study, Kallus and Mao [68] also studied how to fit the node splitting policies in contextual stochastic optimization problems to directly minimize the optimization costs.

    No matter which method we use, i.e., the PO frameworks or the SPO frameworks, local ML methods or global ML methods, our goal is to predict a perfect value or a perfect distribution of the uncertain parameter to help us prescribe an optimal solution that is near the full-information perfect solution. Recalling that we can also predict the combined coefficient in the objective function, these frameworks can be further applied to combined coefficient prediction. The only difference between predicting a parameter and predicting a combined coefficient is that the output value of the ML model for combined coefficient prediction should be computed beforehand, according to the structure of the expression. Take Example 2, for instance, where the prediction target is changed from the unit energy price during time period $t$, $y_t$, to the total energy cost of task $j$ starting from time period $t$, $\sum_{t \le t' < t + d_j} p_j y_{t'}$. Because there are $J$ tasks, and considering that each task has its own earliest starting time $e_j$, latest ending time $l_j$, and working duration $d_j$, we thus need to train $\sum_{j \in J} (l_j - e_j - d_j)$ ML models (since we assume that for each task and for each feasible starting time period, there is a corresponding predictor) or a multi-output regression model if we are going to predict $\sum_{t \le t' < t + d_j} p_j y_{t'}$. This indicates that combined coefficient prediction may lead to a greater computational burden.
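    As a concrete illustration of how the combined-coefficient training targets would be assembled before fitting the predictors, the following sketch enumerates every (task, feasible starting period) pair; the price series, powers, durations, and time windows below are all hypothetical stand-ins, not data from Example 2:

```python
import numpy as np

# Hypothetical instance: T periods, two tasks (all numbers illustrative).
T = 10
y = np.arange(1.0, T + 1)   # y[t]: unit energy price in period t
p = [2.0, 3.0]              # p[j]: power consumption of task j
d = [3, 2]                  # d[j]: working duration of task j
e = [0, 1]                  # e[j]: earliest starting period of task j
l = [6, 5]                  # l[j]: latest ending period of task j

def energy_cost(j, t):
    # Combined coefficient: sum of p[j] * y[t'] for t <= t' < t + d[j].
    return p[j] * y[t:t + d[j]].sum()

# One training target per task and feasible starting period; each would get
# its own predictor (or one multi-output regression model).
targets = {(j, t): energy_cost(j, t)
           for j in range(len(p))
           for t in range(e[j], l[j] - d[j] + 2)}
```

    The size of `targets` is exactly the number of predictors that must be trained, which is what makes combined coefficient prediction more computationally demanding than predicting the single price series $y_t$.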

    In summary, parameter and coefficient prediction methods are the most popular in prescriptive analytics. Among the different prescriptive analytics frameworks, the predict-then-optimize framework is the easiest to implement: we can use various ML models to predict the conditional mean of uncertain parameters or coefficients in linear objective functions, or adopt the local w-SAA method or a global quantile-regression model to predict the conditional distribution of uncertain parameters or coefficients in non-linear objective functions. However, because the frameworks mentioned above neglect the impact of predictions on downstream decisions, the SPO frameworks have been proposed, whose tractability and scalability are two major obstacles still to be overcome for non-convex and discontinuous combinatorial problems.

    As mentioned above, when we predict the uncertain parameter, we can use the PO or SPO frameworks to predict a conditional expectation if the objective function is linear in the uncertain parameter, or we can use the w-SAA framework or the quantile-regression based global method to predict a conditional distribution of the uncertain parameter if the objective function is non-linear in the uncertain parameter. Furthermore, it is worth noting that the w-SAA and the quantile-regression based global method can also be seen as methods for approximating the objective function, although they do not take the perspective of predicting the objective [54]. Objective prediction is a recent trend in prescriptive analytics [54], which generally relies on local learning methods. However, because local learning methods predict by measuring closeness to existing data, whereas global learning methods predict by choosing a functional form of the prediction that minimizes some loss function on existing data, the latter perform better with less data, extrapolate better to outliers, and perform better in higher dimensions [2]. Bertsimas and Koduri [2] thus proposed a global ML method to predict the objective function, which had not been done before in the literature. This section focuses on their method of objective function prediction and proposes a more general method based on it.

    Recall that we can use linear regression to predict the conditional expectation of $\tilde{c}$, and that $A$ is the matrix with rows $a_i^\top$ for Example 1. Assuming that $A^\top A$ is invertible, the optimal solution of optimization problem (2.7) takes the following form:

    $w^s = (A^\top A)^{-1} A^\top c^s$, (3.1)

    where $c^s := (c_1^s, \ldots, c_n^s)^\top$ is the vector of historical target values. Therefore, given the new observation $a_0$, the prediction of $c_s$ (note that $c_s$ is not an element of $c^s$) for arc $s \in S$ is

    $\mathbb{E}[c_s \mid a = a_0] \approx (w^s)^\top a_0 = (c^s)^\top A (A^\top A)^{-1} a_0$. (3.2)
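    The closed forms (3.1)–(3.2) can be checked numerically in a few lines; the sketch below uses synthetic, noise-free data (the feature matrix, true weights, and new observation are all illustrative):

```python
import numpy as np

# Synthetic historical data for one arc s.
rng = np.random.default_rng(0)
n, d_a = 50, 3
A = rng.normal(size=(n, d_a))             # rows a_i: historical features
w_true = np.array([1.0, -2.0, 0.5])       # hypothetical ground truth
c_s = A @ w_true                          # historical costs of arc s (no noise)

# (3.1): w^s = (A^T A)^{-1} A^T c^s, computed via a linear solve.
w_s = np.linalg.solve(A.T @ A, A.T @ c_s)

# (3.2): E[c_s | a = a_0] ~ (w^s)^T a_0.
a0 = np.array([0.2, 0.1, -0.3])
pred = w_s @ a0
```

    Using `np.linalg.solve` instead of forming the explicit inverse is the standard numerically stable way to evaluate $(A^\top A)^{-1} A^\top c^s$.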

    Alternatively, if we aim to predict not $c_s$ but the objective function $Z_{\text{routing}}(c, x) = \sum_{s \in S} c_s x_s$ by finding some function $w(x)$ such that $\mathbb{E}[Z_{\text{routing}}(c, x) \mid a] \approx w(x)^\top a$, we should compute

    $\min_{w(x)} \frac{1}{n} \sum_{i=1}^{n} \left( Z_{\text{routing}}(c_i, x) - w(x)^\top a_i \right)^2$. (3.3)

    The optimal solution of optimization problem (3.3) would be

    $w(x) = (A^\top A)^{-1} A^\top Z_{\text{routing}}(C, x)$, (3.4)

    and the approximation of $\mathbb{E}[Z_{\text{routing}}(c, x) \mid a = a_0]$ would be

    $\mathbb{E}[Z_{\text{routing}}(c, x) \mid a = a_0] \approx w(x)^\top a_0 = Z_{\text{routing}}(C, x)^\top A (A^\top A)^{-1} a_0$, (3.5)

    where $C$ is the matrix with rows $c_i^\top$, and $Z_{\text{routing}}(C, x)$ is the vector $(Z_{\text{routing}}(c_1, x), \ldots, Z_{\text{routing}}(c_n, x))^\top$. Following the method of using regression to predict the objective function, and to generalize the approach to non-linear predictions, Bertsimas and Koduri [2] further used kernel tricks to predict the objective function. For Example 1, they denote the approximate objective function by $h(a_i, x) \in \mathcal{H}$, where $\mathcal{H}$ is the Hilbert space defined by a positive-definite kernel function $K(a_i, a_j)$ (see Definition 1 in Bertsimas and Koduri [2]), and the objective (3.3) should be computed as:

    $\min_{h(\cdot, x) \in \mathcal{H}} \frac{1}{n} \sum_{i=1}^{n} \left( Z_{\text{routing}}(c_i, x) - h(a_i, x) \right)^2 + \sigma \sum_{i=1}^{n} (h(a_i, x))^2$, (3.6)

    which is computationally tractable thanks to the representer theorem (see Proposition 1 in Bertsimas and Koduri [2]); here $\sigma \sum_{i=1}^{n} (h(a_i, x))^2$ denotes the regularization term used to prevent overfitting. According to the representer theorem, the optimal solution of optimization problem (3.6) must take the form

    $h(a_i, x) = \sum_{j=1}^{n} \mu_j(x) K(a_j, a_i)$, (3.7)

    where $\mu_j(x) \in \mathbb{R}$ is a function of the decision vector $x$, and $K(a_i, a_j)$ is the kernel function. Plugging (3.7) into (3.6), following the same procedures as using regression to predict the objective function, and given the new observation $a_0$, the objective function of Example 1, $Z_{\text{routing}}(c, x)$, can be approximated by

    $\mathbb{E}[Z_{\text{routing}}(c, x) \mid a = a_0] \approx h(a_0, x) = K(A, a_0)^\top (\hat{K} + \sigma n I)^{-1} Z_{\text{routing}}(C, x)$, (3.8)

    where $K(A, a_0)$ is the vector $(K(a_1, a_0), \ldots, K(a_n, a_0))^\top$, $\hat{K}$ is the $n \times n$ kernel matrix with components $\hat{K}_{ij} = K(a_i, a_j)$, and $I$ is the $n \times n$ identity matrix. After we obtain the predicted objective function given a new observation $a_0$, as shown in (3.8), we then minimize it subject to the constraints $x \in X$ to obtain decisions.
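    A minimal sketch of the kernel prediction (3.8) for one fixed candidate solution $x$ follows; the RBF kernel choice, the stand-in vector of objective values, and all data are illustrative assumptions, not part of the source:

```python
import numpy as np

def rbf(u, v, gamma=1.0):
    # A common positive-definite kernel choice (assumption, not prescribed
    # by the source): K(u, v) = exp(-gamma * ||u - v||^2).
    return np.exp(-gamma * np.sum((u - v) ** 2))

rng = np.random.default_rng(1)
n, d_a, sigma = 30, 2, 0.1
A = rng.normal(size=(n, d_a))             # historical features a_i
# Stand-in for (Z_routing(c_1, x), ..., Z_routing(c_n, x)) at a fixed x.
Z_values = np.sin(A[:, 0]) + A[:, 1]

# Kernel matrix K_hat and the kernel vector K(A, a0) at a new observation.
K_hat = np.array([[rbf(A[i], A[j]) for j in range(n)] for i in range(n)])
a0 = np.array([0.3, -0.2])
k0 = np.array([rbf(A[i], a0) for i in range(n)])

# (3.8): E[Z(c, x) | a = a0] ~ K(A, a0)^T (K_hat + sigma * n * I)^{-1} Z_values.
pred = k0 @ np.linalg.solve(K_hat + sigma * n * np.eye(n), Z_values)
```

    In a full implementation this prediction would be recomputed (or parameterized) for every candidate $x$, and the resulting function of $x$ minimized over $X$.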

    Because the kernel function $K(a_i, a_j)$ only considers the auxiliary features $a$, we propose a more general case, which assumes that the predicted objective function for Example 1, denoted by $g(a, x)$, maintains the original structure, as follows:

    $g(a, x) = \sum_{l=1}^{n} \theta_l(a) Z_{\text{routing}}(c_l, x)$, (3.9)

    where $\theta_l(a) \in \mathbb{R}$. Under this prediction, the decision loss function over the solution space should be

    $\sum_{i=1}^{n} \int_{x \in X} \left[ g(a_i, x) - Z_{\text{routing}}(c_i, x) \right]^2 \mathrm{d}x$, (3.10)

    whose minimization is computationally intractable because we do not know the ground-truth distribution of $x$. In practice, we only have the empirical data points and their optimal solutions $x_j = \arg\min_{x \in X} Z_{\text{routing}}(c_j, x)$, $j = 1, \ldots, n$. Then, loss function (3.10) can be empirically approximated as:

    $L_{\text{obj}} = \sum_{i=1}^{n} \sum_{j=1}^{n} \left( g(a_i, x_j) - Z_{\text{routing}}(c_i, x_j) \right)^2 = \sum_{i=1}^{n} \sum_{j=1}^{n} \left( \sum_{l=1}^{n} \theta_l(a_i) Z_{\text{routing}}(c_l, x_j) - Z_{\text{routing}}(c_i, x_j) \right)^2$. (3.11)

    Furthermore, we should add a regularization term to prevent overfitting:

    $L_{\text{obj}} = \sum_{i=1}^{n} \sum_{j=1}^{n} \left( \sum_{l=1}^{n} \theta_l(a_i) Z_{\text{routing}}(c_l, x_j) - Z_{\text{routing}}(c_i, x_j) \right)^2 + \sigma \sum_{i=1}^{n} \sum_{l=1}^{n} (\theta_l(a_i))^2$. (3.12)

    Then, our goal is to determine $\theta_l(a)$ ($l = 1, \ldots, n$) by solving $\min_\theta L_{\text{obj}}$. Though $\theta_l(a)$ can take any form, for simplicity we assume that $\theta_l(a) = \theta_l^\top a$, where $\theta_l$ is a vector with the same dimension as $a$. Now, the minimization of loss function (3.12) is a regression problem with $n \times n$ records, indexed by $\{(1,1), \ldots, (i,j), \ldots, (n,n)\}$. For record $(i, j)$, the target is $Z_{\text{routing}}(c_i, x_j)$, and there are $n \times d_a$ features (recall that $a$ has $d_a$ features), indexed by $\{(1,1), \ldots, (l,d), \ldots, (n, d_a)\}$, where the value of feature $(l, d)$ is $a_{id} \times Z_{\text{routing}}(c_l, x_j)$. Therefore, we can use Ridge regression, whose regularization term is also in quadratic form, to minimize loss function (3.12).
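    The reduction to Ridge regression described above can be sketched as follows; the objective values `Z` are a random stand-in (in practice they would come from solving the $n$ empirical problems), and all sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d_a, sigma = 5, 3, 0.1
a = rng.normal(size=(n, d_a))   # a[i]: feature vector of observation i
Z = rng.normal(size=(n, n))     # Z[i, j]: stand-in for Z_routing(c_i, x_j)

# Flattened design: one row per record (i, j), one column per feature (l, d)
# with value a[i, d] * Z[l, j], matching the construction in the text.
rows, targets = [], []
for i in range(n):
    for j in range(n):
        rows.append(np.outer(Z[:, j], a[i]).ravel())
        targets.append(Z[i, j])
F, t = np.array(rows), np.array(targets)

# Ridge closed form: theta = (F^T F + sigma * I)^{-1} F^T t, where entry
# l * d_a + d of theta is the d-th component of theta_l.
theta = np.linalg.solve(F.T @ F + sigma * np.eye(n * d_a), F.T @ t)
```

    Fitting could equally be done with any off-the-shelf Ridge implementation once the flattened design matrix `F` is built.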

    In summary, regarding the methods for predicting the objective function, Bertsimas and Koduri [2] proposed the first global method that predicts the objective function in a functional form using kernel tricks. Considering that the kernel method does not maintain the original structure of the objective function, we propose a more general method, which deserves investigation and comparison in future studies.

    For prescriptive analytics frameworks, in addition to predicting the uncertain parameter, the combined coefficient, and the objective function, a more direct way is to predict the optimal solution, as our ultimate goal is to prescribe a solution that is near the perfect solution obtained under the condition that the uncertain information is known. Ban and Rudin [56] proposed two ERM algorithms to predict the optimal solution. For example, the ERM approach to solving Example 1 with auxiliary data is as follows:

    $\min_{x(\cdot) \in \mathcal{L}, \{x: \mathcal{A} \to \mathbb{R}^{|S|}\}} \hat{R}\big(x(a); (a_i, c_i)_{i=1}^{n}\big) = \min_{x(\cdot) \in \mathcal{L},\, x(a) \in X} \frac{1}{n} \sum_{i=1}^{n} Z_{\text{routing}}(c_i, x(a_i))$, (4.1)

    where the decision is now a function $x(\cdot)$ that maps the feature space $\mathcal{A}$ to $\mathbb{R}^{|S|}$, $\hat{R}$ is called the empirical risk of function $x(\cdot)$ with respect to the dataset $\{(a_i, c_i)\}_{i=1}^{n}$, and we need to specify the function class $\mathcal{L}$ and enforce $x(a) \in X$ to ensure that each training data record meets the network constraints. We note that, for a given new observation $a_0$, the prescribed decision may not satisfy the network constraints, namely $x(a_0) \notin X$; in that case, we can set the prescribed decision to the nearest neighbour of $x(a_0)$ in $X$, which requires solving a programming model $\min_{x \in X} \|\epsilon\|$, where $x = x(a_0) + \epsilon$ and $\epsilon \in \mathbb{R}^{|S|}$ is the decision variable. Consider applying linear decision rules to predict the optimal solution of the form

    $\mathcal{L} = \{x: \mathcal{A} \to \mathbb{R}^{|S|}: x(a) = \mathbf{X} a\}$, (4.2)

    where $\mathbf{X}$ is a $|S| \times d_a$ matrix with rows $x^s = (x_{s1}, \ldots, x_{sd_a})$, $s = 1, \ldots, |S|$. Using this linear form, the ERM problem (4.1) becomes:

    $\min_{x(a) = \mathbf{X} a} \hat{R}\big(x(a); (a_i, c_i)_{i=1}^{n}\big) = \min_{\mathbf{X} a \in X} \frac{1}{n} \sum_{i=1}^{n} Z_{\text{routing}}(c_i, \mathbf{X} a_i)$. (4.3)

    To prevent overfitting, we can add a regularization term to (4.3) as follows:

    $\min_{\mathbf{X} a \in X} \left[ \frac{1}{n} \sum_{i=1}^{n} Z_{\text{routing}}(c_i, \mathbf{X} a_i) + \sigma \sum_{j=1}^{d_a} \sum_{s \in S} (x_{sj})^2 \right]$. (4.4)

    Therefore, when we use the linear form to estimate the optimal solution, the learning task is to find the best $x_{sj}$, $s = 1, \ldots, |S|$, $j = 1, \ldots, d_a$, by solving (4.4). After we obtain $\mathbf{X}$, given a new observation $a_0$, the prescribed decision is $\mathbf{X} a_0$.
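    A minimal sketch of fitting a linear decision rule in the spirit of (4.3)-(4.4) follows. To keep a closed form, it ignores the feasibility constraint and swaps in the smooth stand-in decision cost $Z(c, x) = \|x - c\|^2$ (an assumption for illustration, not the routing objective); all data is synthetic:

```python
import numpy as np

rng = np.random.default_rng(3)
n, d_a, n_arcs, sigma = 40, 4, 6, 0.01
A = rng.normal(size=(n, d_a))            # observed features a_i
W_true = rng.normal(size=(n_arcs, d_a))  # hypothetical ground-truth rule
C = A @ W_true.T                         # historical "optimal" decisions c_i

# With Z(c, x) = ||x - c||^2, minimizing
#   (1/n) * sum_i ||X a_i - c_i||^2 + sigma * ||X||_F^2
# over the matrix X is a multi-output ridge problem with a closed form:
Xmat = np.linalg.solve(A.T @ A + n * sigma * np.eye(d_a), A.T @ C).T

# Prescription for a new observation: decision = X a_0.
a0 = rng.normal(size=d_a)
decision = Xmat @ a0
```

    For the actual routing cost, the inner problem is no longer a least-squares fit and (4.4) would be solved with a general-purpose (possibly constrained) optimizer instead of a closed form.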

    Furthermore, following the general formulation of the ERM approach, some studies have used kernel tricks to estimate the function that prescribes the optimal solution. Ban and Rudin [56] proposed an approach to predict the optimal solution using the kernel optimization method, but it can only be applied to the newsvendor problem. Notz and Pibernik [55] proposed a kernelized ERM approach for the flexible capacity management problem and proved its performance guarantees. Bertsimas and Koduri [2] proposed a general method that uses kernel functions to predict the optimal solution. Taking Example 1, for instance, when using kernel tricks to predict the optimal solution, we restrict each component $x_s(a)$ of $x(a)$, $s \in S$, in optimization problem (4.1) to be in a reproducing kernel Hilbert space $\mathcal{H}$, which is associated with a kernel $K$. Then, the empirical regularized kernelized version of (4.1) is as follows:

    $\min_{\substack{x_1(\cdot), \ldots, x_{|S|}(\cdot) \in \mathcal{H} \\ (x_1(a), \ldots, x_{|S|}(a)) \in X}} \left[ \frac{1}{n} \sum_{i=1}^{n} Z_{\text{routing}}\big(c_i; x_1(a_i), \ldots, x_{|S|}(a_i)\big) + \sigma \sum_{s \in S} \sum_{i=1}^{n} (x_s(a_i))^2 \right]$. (4.5)

    According to the conclusion of Bertsimas and Koduri [2], the optimal solution to the optimization problem (4.5) takes the form

    $x_s(a) = \sum_{i=1}^{n} \mu_{si} K(a_i, a), \quad s \in S$, (4.6)

    where $\mu$ is the solution to

    $\min_{\substack{\mu^1, \ldots, \mu^{|S|} \in \mathbb{R}^n \\ (\hat{K} \mu^1, \ldots, \hat{K} \mu^{|S|}) \in X}} \left[ \frac{1}{n} \sum_{i=1}^{n} Z_{\text{routing}}\big(c_i; (\hat{K} \mu^1)_i, \ldots, (\hat{K} \mu^{|S|})_i\big) + \sigma \sum_{s \in S} (\mu^s)^\top \hat{K} \mu^s \right]$, (4.7)

    where $\mu^s = (\mu_{s1}, \ldots, \mu_{sn})^\top$, and $\hat{K}$ is the $n \times n$ kernel matrix with components $\hat{K}_{ij} = K(a_i, a_j)$. After specifying the kernel function, the decision variables of (4.7) are the $\mu_{si}$'s. Once we obtain $\mu_{si}$, $i \in \{1, \ldots, n\}$, $s \in S$, given a new observation $a_0$, the prescribed decision for arc $s \in S$ is calculated as $x_s(a_0) = \sum_{i=1}^{n} \mu_{si} K(a_i, a_0)$.
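    The prescription step (4.6) is straightforward once $\mu$ has been fitted by solving (4.7); the sketch below shows only that step, with a random stand-in for the fitted coefficients and an assumed RBF kernel (both illustrative):

```python
import numpy as np

def rbf(u, v, gamma=0.5):
    # Assumed kernel choice for illustration.
    return np.exp(-gamma * np.sum((u - v) ** 2))

rng = np.random.default_rng(4)
n, d_a, n_arcs = 20, 3, 4
A = rng.normal(size=(n, d_a))        # training features a_i
mu = rng.normal(size=(n_arcs, n))    # mu[s, i]: stand-in for fitted coefficients

# (4.6): x_s(a0) = sum_i mu_si * K(a_i, a0), computed for all arcs at once.
a0 = rng.normal(size=d_a)
k0 = np.array([rbf(A[i], a0) for i in range(n)])
decision = mu @ k0                   # one component per arc s
```

    As noted below, such a `decision` vector may still violate the network constraints and require a post-hoc projection onto the feasible set.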

    In summary, ERM models and kernelized methods are both possible approaches to optimal solution prediction, whose performance guarantees have been established in the existing literature, such as Ban and Rudin [56], Bertsimas and Kallus [54], and Bertsimas and Koduri [2]. For all these methods, we need to note that the prescribed solution may violate the problem constraints, so post-hoc adjustments of prescribed solutions may be needed.

    This study summarizes the existing literature on prescriptive analytics methods for the logistics system. We first point out that four parts of an optimization problem can be predicted, namely the uncertain parameter, the combined coefficient, the objective function, and the optimal solution. The predictions of the uncertain parameter and the combined coefficient are the most common topics in the existing literature on prescriptive analytics in the logistics system, and they take the indirect path from data to decision via prediction. Among the methods for parameter and coefficient prediction, the PO framework is the easiest to implement, whereas the SPO framework focuses more on decision quality but may be computationally intractable. It is worth noting that, if the objective function is not linear in the uncertain parameter, the w-SAA method is one alternative that predicts the conditional distribution using local learning methods, and the quantile-regression based method is another alternative that takes global data into account. Furthermore, the predictions of the objective function and the optimal solution take the direct path, which goes from data to decision by choosing a functional form of the prediction that minimizes some loss function on existing data. The methods for predicting the objective function and the optimal solution are rooted in ERM algorithms, where the most commonly used functional form is the kernel function, owing to its good applicability in handling nonlinearities. Uncertainties are ubiquitous in logistics problems, and these prescriptive analytics frameworks may work well in providing sound decisions. For different uncertain optimization problems, we do not know the best method in advance; instead, this paper discusses a few existing alternatives and proposes possible improvements, which together constitute an arsenal of prescriptive analytics frameworks to be considered when and where appropriate.

    Apart from improvements regarding the different prescriptive analytics frameworks, we propose the following future research directions. First, as stated in this tutorial, integrating learning and optimization in prescriptive analytics for logistics requires tailored learning algorithms that consider the structural characteristics of downstream optimization problems. To achieve optimal prescriptive targets, we may need to propose new methodologies and tools for both the ML and optimization parts. Second, beyond proposing new prescriptive analytics methodologies, we wish to validate their value by applying them to real industrial problems; in this way, when and how prescriptive analytics can improve decision-making can be empirically investigated. Finally, the development of prescriptive analytics frameworks can also stimulate data collection in the industry. Investigating what kind of data we need, and examining the influence of data quality, can further promote better decision-making.

    The authors thank the three reviewers for their valuable comments.

    The authors declare there is no conflict of interest.



    [1] W. Wang, Y. Wu, Is uncertainty always bad for the performance of transportation systems, Commun. Transp. Res., 1 (2021), 100021. https://doi.org/10.1016/j.commtr.2021.100021
    [2] D. Bertsimas, N. Koduri, Data-driven optimization: A Reproducing Kernel Hilbert Space approach, Oper. Res., 70 (2021), 454–471. https://doi.org/10.1287/opre.2020.2069
    [3] J. R. Birge, F. Louveaux, Introduction to Stochastic Programming, Springer, New York, 2011. https://doi.org/10.1007/978-1-4614-0237-4
    [4] A. Ben-Tal, L. El Ghaoui, A. Nemirovski, Robust Optimization, Princeton University Press, Princeton, 2009.
    [5] D. Bertsimas, D. B. Brown, C. Caramanis, Theory and applications of robust optimization, SIAM Rev., 53 (2011), 464–501. https://doi.org/10.1137/080734510
    [6] A. J. Kleywegt, A. Shapiro, T. Homem-de-Mello, The sample average approximation method for stochastic discrete optimization, SIAM J. Optim., 12 (2002), 479–502. https://doi.org/10.1137/S1052623499363220
    [7] D. Bertsimas, V. Gupta, N. Kallus, Data-driven robust optimization, Math. Program., 167 (2018), 235–292. https://doi.org/10.1007/s10107-017-1125-8
    [8] E. Delage, Y. Ye, Distributionally robust optimization under moment uncertainty with application to data-driven problems, Oper. Res., 58 (2010), 595–612. https://doi.org/10.1287/opre.1090.0741
    [9] L. He, S. Liu, Z. J. M. Shen, Smart urban transport and logistics: A business analytics perspective, Prod. Oper. Manag., 31 (2022), 3771–3787. https://doi.org/10.1111/poms.13775
    [10] L. He, H. Y. Mak, Y. Rong, Z. J. M. Shen, Service region design for urban electric vehicle sharing systems, Manuf. Serv. Oper. Manag., 19 (2017), 309–327. https://doi.org/10.1287/msom.2016.0611
    [11] M. Lu, Z. Chen, S. Shen, Optimizing the profitability and quality of service in carshare systems under demand uncertainty, Manuf. Serv. Oper. Manag., 20 (2018), 162–180. https://doi.org/10.1287/msom.2017.0644
    [12] R. Cui, S. Gallino, A. Moreno, D. J. Zhang, The operational value of social media information, Prod. Oper. Manag., 27 (2018), 1749–1769. https://doi.org/10.1111/poms.12707
    [13] J. Carlsson, S. Song, Coordinated logistics with a truck and a drone, Manag. Sci., 64 (2018), 4052–4069. https://doi.org/10.1287/mnsc.2017.2824
    [14] Z. Zou, H. Younes, S. Erdoğan, J. Wu, Exploratory analysis of real-time e-scooter trip data in Washington, DC, Transp. Res. Rec., 2674 (2020), 285–299. https://doi.org/10.1177/0361198120919760
    [15] C. Glaeser, M. Fisher, X. Su, Optimal retail location: Empirical methodology and application to practice: Finalist–2017 M&SOM practice-based research competition, Manuf. Serv. Oper. Manag., 21 (2019), 86–102. https://doi.org/10.1287/msom.2018.0759
    [16] D. Bertsimas, Y. S. Ng, J. Yan, Joint frequency-setting and pricing optimization on multimodal transit networks at scale, Transp. Sci., 54 (2020), 839–853. https://doi.org/10.1287/trsc.2019.0959
    [17] D. Bertsimas, A. Delarue, P. Jaillet, S. Martin, Travel time estimation in the age of big data, Oper. Res., 67 (2019), 498–515. https://doi.org/10.1287/opre.2018.1784
    [18] H. de Vries, J. van de Klundert, A. Wagelmans, The roadside healthcare facility location problem: A managerial network design challenge, Prod. Oper. Manag., 29 (2020), 1165–1187. https://doi.org/10.1111/poms.13152
    [19] J. Boutilier, T. Chan, Ambulance emergency response optimization in developing countries, Oper. Res., 68 (2020), 1315–1334. https://doi.org/10.1287/opre.2019.1969
    [20] E. Gralla, J. Goentzel, C. Fine, Problem formulation and solution mechanisms: A behavioral study of humanitarian transportation planning, Prod. Oper. Manag., 25 (2016), 22–35. https://doi.org/10.1111/poms.12496
    [21] Z. Hao, L. He, Z. Hu, J. Jiang, Robust vehicle pre-allocation with uncertain covariates, Prod. Oper. Manag., 29 (2020), 955–972. https://doi.org/10.1111/poms.13143
    [22] A. Kabra, E. Belavina, K. Girotra, Bike-share systems: Accessibility and availability, Manag. Sci., 66 (2020), 3803–3824. https://doi.org/10.1287/mnsc.2019.3407
    [23] S. Liu, L. He, Z. J. M. Shen, On-time last-mile delivery: Order assignment with travel-time predictors, Manag. Sci., 67 (2021), 4095–4119. https://doi.org/10.1287/mnsc.2020.3741
    [24] S. Steinker, K. Hoberg, U. Thonemann, The value of weather information for e-commerce operations, Prod. Oper. Manag., 26 (2017), 1854–1874. https://doi.org/10.1111/poms.12721
    [25] M. Ang, Y. Lim, M. Sim, Robust storage assignment in unit-load warehouses, Manag. Sci., 58 (2012), 2114–2130. https://doi.org/10.1287/mnsc.1120.1543
    [26] M. Lim, H. Mak, Y. Rong, Toward mass adoption of electric vehicles: Impact of the range and resale anxieties, Manuf. Serv. Oper. Manag., 17 (2015), 101–119. https://doi.org/10.1287/msom.2014.0504
    [27] J. Carlsson, M. Behroozi, K. Mihic, Wasserstein distance and the distributionally robust TSP, Oper. Res., 66 (2018), 1603–1624. https://doi.org/10.1287/opre.2018.1746
    [28] G. Baloch, F. Gzara, Strategic network design for parcel delivery with drones under competition, Transp. Sci., 54 (2020), 204–228. https://doi.org/10.1287/trsc.2019.0928
    [29] J. Shu, M. Chou, Q. Liu, C. Teo, I. Wang, Models for effective deployment and redistribution of bicycles within public bicycle-sharing systems, Oper. Res., 61 (2013), 1346–1359. https://doi.org/10.1287/opre.2013.1215
    [30] G. Cachon, K. Daniels, R. Lobel, The role of surge pricing on a service platform with self-scheduling capacity, Manuf. Serv. Oper. Manag., 19 (2017), 368–384. https://doi.org/10.1287/msom.2017.0618
    [31] S. Datner, T. Raviv, M. Tzur, D. Chemla, Setting inventory levels in a bike sharing network, Transp. Sci., 53 (2019), 62–76. https://doi.org/10.1287/trsc.2017.0790
    [32] H. Abouee-Mehrizi, O. Berman, S. Sharma, Optimal joint replenishment and transshipment policies in a multi-period inventory system with lost sales, Oper. Res., 63 (2015), 342–350. https://doi.org/10.1287/opre.2015.1358
    [33] R. Yuan, S. Graves, T. Cezik, Velocity-based storage assignment in semi-automated storage systems, Prod. Oper. Manag., 28 (2019), 354–373. https://doi.org/10.1111/poms.12925
    [34] Q. Deng, X. Fang, Y. Lim, Urban consolidation center or peer-to-peer platform? The solution to urban last-mile delivery, Prod. Oper. Manag., 30 (2021), 997–1013. https://doi.org/10.1111/poms.13289
    [35] Z. Wang, J. Sheu, C. Teo, G. Xue, Robot scheduling for mobile-rack warehouses: Human–robot coordinated order picking systems, Prod. Oper. Manag., 31 (2022), 98–116. https://doi.org/10.1111/poms.13406
    [36] W. Qi, L. Li, S. Liu, Z. J. M. Shen, Shared mobility for last-mile delivery: Design, operational prescriptions, and environmental impact, Manuf. Serv. Oper. Manag., 20 (2018), 737–751. https://doi.org/10.1287/msom.2017.0683
    [37] B. Yildiz, M. Savelsbergh, Provably high-quality solutions for the meal delivery routing problem, Transp. Sci., 53 (2019), 1372–1388. https://doi.org/10.1287/trsc.2018.0887
    [38] M. Ulmer, B. Thomas, A. Campbell, N. Woyak, The restaurant meal delivery problem: Dynamic pickup and delivery with deadlines and random ready times, Transp. Sci., 55 (2021), 75–100. https://doi.org/10.1287/trsc.2020.1000
    [39] S. Jain, G. Shao, S. J. Shin, Manufacturing data analytics using a virtual factory representation, Int. J. Prod. Res., 55 (2017), 5450–5464. https://doi.org/10.1080/00207543.2017.1321799
    [40] A. Nasrollahzadeh, A. Khademi, M. Mayorga, Real-time ambulance dispatching and relocation, Manuf. Serv. Oper. Manag., 20 (2018), 467–480. https://doi.org/10.1287/msom.2017.0649
    [41] X. Li, X. Zhao, W. Pu, P. Chen, F. Liu, Z. He, Optimal decisions for operations management of BDAR: A military industrial logistics data analytics perspective, Comput. Ind. Eng., 137 (2019), 106100. https://doi.org/10.1016/j.cie.2019.106100
    [42] S. Chung, Applications of smart technologies in logistics and transport: A review, Transp. Res. Part E Logist. Transp. Rev., 153 (2021), 102455. https://doi.org/10.1016/j.tre.2021.102455
    [43] H. Mak, Y. Rong, Z. J. M. Shen, Infrastructure planning for electric vehicles with battery swapping, Manag. Sci., 59 (2013), 1557–1575. https://doi.org/10.1287/mnsc.1120.1672
    [44] L. He, G. Ma, W. Qi, X. Wang, Charging an electric vehicle-sharing fleet, Manuf. Serv. Oper. Manag., 23 (2021), 471–487. https://doi.org/10.1287/msom.2019.0851
    [45] T. Chan, D. Demirtas, R. Kwon, Optimizing the deployment of public access defibrillators, Manag. Sci., 62 (2016), 3617–3635. https://doi.org/10.1287/mnsc.2015.2312
    [46] T. Chan, Z. J. M. Shen, A. Siddiq, Robust defibrillator deployment under cardiac arrest location uncertainty via row-and-column generation, Oper. Res., 66 (2018), 358–379. https://doi.org/10.1287/opre.2017.1660
    [47] J. Carlsson, M. Behroozi, R. Devulapalli, X. Meng, Household-level economies of scale in transportation, Oper. Res., 64 (2016), 1372–1387. https://doi.org/10.1287/opre.2016.1533
    [48] T. Huang, D. Bergman, R. Gopal, Predictive and prescriptive analytics for location selection of add-on retail products, Prod. Oper. Manag., 28 (2019), 1858–1877. https://doi.org/10.1111/poms.13018
    [49] N. Salari, S. Liu, Z. J. M. Shen, Real-time delivery time forecasting and promising in online retailing: When will your package arrive, Manuf. Serv. Oper. Manag., 24 (2022), 1421–1436. https://doi.org/10.1287/msom.2022.1081
    [50] A. Gunasekaran, T. Papadopoulos, R. Dubey, S. Wamba, S. Childe, B. Hazen, et al., Big data and predictive analytics for supply chain and organizational performance, J. Bus. Res., 70 (2017), 308–317. https://doi.org/10.1016/j.jbusres.2016.08.004
    [51] A. Nguyen, L. Zhou, V. Spiegler, P. Ieromonachou, Y. Lin, Big data analytics in supply chain management: A state-of-the-art literature review, Comput. Oper. Res., 98 (2018), 254–264. https://doi.org/10.1016/j.cor.2017.07.004
    [52] G. Wang, A. Gunasekaran, E. Ngai, T. Papadopoulos, Big data analytics in logistics and supply chain management: Certain investigations for research and applications, Int. J. Prod. Econ., 176 (2016), 98–110. https://doi.org/10.1016/j.ijpe.2016.03.014
    [53] A. Elmachtoub, P. Grigas, Smart "predict, then optimize", Manag. Sci., 68 (2022), 9–26. https://doi.org/10.1287/mnsc.2020.3922
    [54] D. Bertsimas, N. Kallus, From predictive to prescriptive analytics, Manag. Sci., 66 (2020), 1025–1044. https://doi.org/10.1287/mnsc.2018.3253
    [55] P. Notz, R. Pibernik, Prescriptive analytics for flexible capacity management, Manag. Sci., 68 (2022), 1756–1775. https://doi.org/10.1287/mnsc.2020.3867
    [56] G. Ban, C. Rudin, The big data newsvendor: Practical insights from machine learning, Oper. Res., 67 (2019), 90–108. https://doi.org/10.1287/opre.2018.1757
    [57] R. Yan, S. Wang, K. Fagerholt, A semi-"smart predict then optimize" (semi-SPO) method for efficient ship inspection, Transp. Res. Part B Methodol., 142 (2020), 100–125. https://doi.org/10.1016/j.trb.2020.09.014
    [58] S. Wang, X. Tian, R. Yan, Y. Liu, A deficiency of prescriptive analytics—No perfect predicted value or predicted distribution exists, Electron. Res. Arch., 30 (2022), 3586–3594. https://doi.org/10.3934/era.2022183
    [59] S. Wang, R. Yan, "Predict, then optimize" with quantile regression: A global method from predictive to prescriptive analytics and applications to multimodal transportation, Multimodal Transp., 1 (2022), 100035. https://doi.org/10.1016/j.multra.2022.100035
    [60] J. Kotary, F. Fioretto, P. Van Hentenryck, B. Wilder, End-to-end constrained optimization learning: A survey, in Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, (2021), 4475–4482. https://doi.org/10.24963/ijcai.2021/610
    [61] A. Ferber, B. Wilder, B. Dilkina, M. Tambe, MIPaaL: Mixed integer program as a layer, in Proceedings of the AAAI Conference on Artificial Intelligence, (2020), 1504–1511. https://doi.org/10.1609/aaai.v34i02.5509
    [62] B. Wilder, B. Dilkina, M. Tambe, Melding the data-decisions pipeline: Decision-focused learning for combinatorial optimization, in Proceedings of the AAAI Conference on Artificial Intelligence, (2019), 1658–1665. https://doi.org/10.1609/aaai.v33i01.33011658
    [63] J. Mandi, E. Demirović, P. Stuckey, T. Guns, Smart predict-and-optimize for hard combinatorial optimization problems, in Proceedings of the AAAI Conference on Artificial Intelligence, (2020), 1603–1610. https://doi.org/10.1609/aaai.v34i02.5521
    [64] M. Mulamba, J. Mandi, M. Diligenti, M. Lombardi, V. Bucarey, T. Guns, Contrastive losses and solution caching for predict-and-optimize, in Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, (2021), 2833–2840. https://doi.org/10.24963/ijcai.2021/390
    [65] N. Kallus, Recursive partitioning for personalization using observational data, in Proceedings of the 34th International Conference on Machine Learning, (2017), 1789–1798.
    [66] D. Bertsimas, J. Dunn, N. Mundru, Optimal prescriptive trees, INFORMS J. Optim., 1 (2019), 164–183. https://doi.org/10.1287/ijoo.2018.0005
    [67] A. Elmachtoub, J. Liang, R. Mcnellis, Decision trees for decision-making under the predict-then-optimize framework, in Proceedings of the 37th International Conference on Machine Learning, (2020), 2858–2867.
    [68] N. Kallus, X. Mao, Stochastic optimization forests, Manag. Sci., 2022 (2022). https://doi.org/10.1287/mnsc.2022.4458
    11. Zhong Chu, Ran Yan, Shuaian Wang, Evaluation and prediction of punctuality of vessel arrival at port: a case study of Hong Kong, 2024, 51, 0308-8839, 1096, 10.1080/03088839.2023.2217168
    12. Chenliang Zhang, Zhongyi Jin, Kam K.H. Ng, Tie-Qiao Tang, Fangni Zhang, Wei Liu, Predictive and prescriptive analytics for robust airport gate assignment planning in airside operations under uncertainty, 2025, 195, 13665545, 103963, 10.1016/j.tre.2025.103963
  • © 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
