Optimal replacement in a system of n-machines with random horizon ROCIO

This paper considers a system of n independently operating machines, each of which follows a deterioration process with an associated cost function. The system is observed at discrete times, the objective function is the expected total cost, and the horizon of the problem is random. For this problem, an optimal replacement policy that minimizes the operating cost of the system is provided. In addition, a numerical example computed with a Matlab program is presented.


Introduction
In industrial processes the deterioration of electronic components or machines is common, so it is important to provide replacement strategies that optimize these systems. The optimal replacement problem has been modeled in different ways. For example, in [9] the problem is studied for a single machine that follows a stochastic deterioration process with various quality levels in continuous time. Optimal replacement has also been studied in the presence of future technological advances; in this case a non-stationary process is considered and the optimal decision is characterized using a forecast horizon approach (see [6]). On the other hand, in [2] the problem is studied for n machines under two heuristic replacement rules that make the search for optimal policies possible. The first rule states that a machine is replaced only if all older machines are replaced; the second states that, at any stage, all machines of the same age are either kept or replaced.
In this work the model proposed in [1] is taken as the starting point. It involves a single machine that follows a Markov deterioration process with D possible levels, where D is a positive integer, and an operating cost associated with each level. The machine is observed in discrete time and, depending on the deterioration level, one of the following actions is possible:
1. Leave the machine in operation for an additional period of time.
2. Replace it at a cost R > 0.
For a fixed finite operating horizon, an optimal replacement policy is determined such that the expected total cost incurred is minimal. In the present paper, a system consisting of n machines with independent deterioration processes is studied, under the assumption that the system operates over a random horizon. This new feature of the model is motivated by the possibility that external factors force the process to end earlier than expected, for example the bankruptcy of the firm in an economic model (see [8], p. 125). The problem is analyzed through Markov decision processes.
This paper is organized as follows. Section 2 presents the basic theory of Markov decision processes. Section 3 provides a general study of Markov decision processes with random horizon, which makes it possible to solve the proposed problem. The problem of optimal replacement in a system of n machines with random horizon is then described in Section 4. In Section 5 the Markov control model and the dynamic programming equations are set up, and a Matlab program for solving numerical cases is developed. Finally, Section 6 illustrates some numerical results.

Basic theory of Markov decision processes
The theory presented in this section can be consulted in [3].
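Consider the following non-homogeneous Markov decision or control model:
\[ (X, A, \{A(x) \mid x \in X\}, Q, c_t, \; t \in \{0, 1, 2, \ldots\}), \]
where
a) $X$ is a Borel space, called the state space;
b) $A$ is a Borel space, called the control or action set;
c) $\{A(x) \mid x \in X\}$ is a family of nonempty measurable subsets $A(x)$ of $A$, where $A(x)$ denotes the set of feasible controls or actions when the system is in state $x \in X$, and with the property that the set $K := \{(x, a) \mid x \in X, a \in A(x)\}$ of feasible state-action pairs is a measurable subset of $X \times A$;
d) $Q$ is a stochastic kernel on $X$ given $K$, called the transition law;
e) $c_t : K \to \mathbb{R}$ is a measurable function, called the cost-per-stage or one-stage cost function.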
Remark 2.1. The Markov control model considered in this paper is non-homogeneous only in the cost. General non-homogeneous Markov decision models can be consulted in [8].
Let $(X, A, \{A(x) \mid x \in X\}, Q, c_t, \; t \in \{0, 1, 2, \ldots\})$ be a Markov control model and, for each $t = 0, 1, \ldots$, define the space $H_t$ of admissible histories up to time $t$ as $H_0 := X$ and $H_t := K^t \times X$ for $t = 1, 2, \ldots$. An arbitrary element $h_t$ of $H_t$, called an admissible $t$-history or simply a $t$-history, is a vector of the form $h_t = (x_0, a_0, \ldots, x_{t-1}, a_{t-1}, x_t)$, with $(x_i, a_i) \in K$ for $i = 0, 1, \ldots, t-1$ and $x_t \in X$.
Then, a control policy is a sequence $\pi = \{\pi_t, \; t = 0, 1, \ldots\}$ of stochastic kernels $\pi_t$ on the control set $A$ given $H_t$ such that $\pi_t(A(x_t) \mid h_t) = 1$ for all $h_t \in H_t$, $t = 0, 1, \ldots$.
Let $\Phi$ be the set of all stochastic kernels $\varphi$ such that $\varphi(A(x) \mid x) = 1$ for all $x \in X$, and let $\mathbb{F}$ be the set of all measurable functions $f : X \to A$ satisfying $f(x) \in A(x)$ for all $x \in X$. The functions in $\mathbb{F}$ are called selectors of the multifunction $x \mapsto A(x)$, $x \in X$.
The set of all policies is denoted by $\Pi$. In this work, deterministic Markov policies are characterized. A policy $\pi = \{\pi_t\}$ is called a deterministic Markov policy if there is a sequence $\{f_t\}$ of functions $f_t \in \mathbb{F}$ such that $\pi_t(\cdot \mid h_t)$ is concentrated at $f_t(x_t)$ for all $h_t \in H_t$, $t = 0, 1, \ldots$.
Let $(\Omega, \mathcal{F})$ be the measurable space consisting of the canonical sample space $\Omega := H_\infty$ and the corresponding product $\sigma$-algebra $\mathcal{F}$. The elements of $\Omega$ are sequences of the form $\omega = (x_0, a_0, x_1, a_1, \ldots)$, with $x_t \in X$ and $a_t \in A$, $t = 0, 1, 2, \ldots$. The projections $x_t$ and $a_t$ from $\Omega$ to the sets $X$ and $A$ are called state and action variables, respectively. Let $\pi = \{\pi_t\}$ be an arbitrary control policy and $\mu$ an arbitrary probability measure on $X$, referred to as the initial distribution. Then, by the theorem of C. Ionescu-Tulcea (see [3]), there exists a unique probability measure $P^\pi_\mu$ on $(\Omega, \mathcal{F})$ induced by the initial distribution $\mu$, the policy $\pi$ and the transition law $Q$. The expectation operator with respect to $P^\pi_\mu$ is denoted by $E^\pi_\mu$. If $\mu$ is concentrated at the initial state $x \in X$, then we write $P^\pi_\mu$ and $E^\pi_\mu$ as $P^\pi_x$ and $E^\pi_x$, respectively.
Let $\pi \in \Pi$ and $x \in X$. The expected total cost with finite horizon $N$ is defined by
\[ J(\pi, x) := E^\pi_x \Big[ \sum_{t=0}^{N-1} c_t(x_t, a_t) + c_N(x_N) \Big], \]
where $c_N$ is a measurable function called the terminal cost function. Consider the Markov control model $(X, A, \{A(x) \mid x \in X\}, Q, c_t, \; t \in \{0, 1, 2, \ldots\})$ and suppose that we wish to minimize $J(\pi, x)$. Define the optimal value function as
\[ J(x) := \inf_{\pi \in \Pi} J(\pi, x), \quad x \in X. \]
The optimal control problem is to determine a policy $\pi^* \in \Pi$ such that $J(\pi^*, x) = J(x)$, $x \in X$.
One approach to the optimal control problem is dynamic programming (DP). In the following section, a theorem that guarantees the existence of optimal policies for the optimal control problem is presented.
Remark 3.1. $E$ denotes the expected value with respect to the joint distribution of the process $\{x_t, a_t\}$ and $\tau$.
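Define the optimal value function as
\[ J_\tau(x) := \inf_{\pi \in \Pi} j_\tau(\pi, x), \quad x \in X, \]
where $j_\tau(\pi, x)$ denotes the expected total cost with random horizon $\tau$. By Assumption 3.2, it follows that
\[ j_\tau(\pi, x) = E^\pi_x \Big[ \sum_{t=0}^{T} P_t \, c_t(x_t, a_t) \Big], \]
$\pi \in \Pi$ and $x \in X$, where $P_t := \sum_{k=t}^{T} \rho_k = P(\tau \geq t)$ and $\rho_k := P(\tau = k)$, $t = 0, 1, \ldots, T$. Thus, the optimal control problem with a random horizon $\tau$ is equivalent to an optimal control problem with a finite horizon $T + 1$, a cost per stage $P_t c_t$ and a null terminal cost (see [4] and [5]).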
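The equivalence above rests on weighting each stage cost by the probability that the horizon has not yet ended. The following is a minimal Python sketch of that identity, assuming that $P_t$ denotes the tail probability $P(\tau \geq t) = \rho_t + \cdots + \rho_T$ and that the cost of period $\tau$ itself is counted. The distribution `rho` and the deterministic cost sequence `costs` are hypothetical values chosen only for illustration.

```python
from fractions import Fraction


def survival_weights(rho):
    """Return [P_0, ..., P_T] with P_t = rho[t] + ... + rho[T] = P(tau >= t)."""
    weights = []
    tail = 0
    for r in reversed(rho):
        tail += r
        weights.append(tail)
    weights.reverse()
    return weights


# Hypothetical horizon distribution on {0, 1, 2, 3} (illustrative values only).
rho = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]
P = survival_weights(rho)

# Hypothetical deterministic stage costs c_0, ..., c_T.
costs = [5, 3, 8, 2]

# Left side: average over tau of the cost accumulated up to and including tau.
lhs = sum(rho[n] * sum(costs[: n + 1]) for n in range(len(rho)))
# Right side: fixed horizon T + 1 with stage costs weighted by P_t.
rhs = sum(P[t] * costs[t] for t in range(len(costs)))

print(P, lhs, rhs)
```

Exact fractions are used so that both sides can be compared without rounding error.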
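The following result is motivated by Theorem 3.2.1 in [3]; the proof has been adapted to the case of a random horizon.
Theorem 3.3. Let $J_0, J_1, \ldots, J_{T+1}$ be the functions on $X$ defined by $J_{T+1}(x) := 0$ and, for $t = T, T-1, \ldots, 0$,
\[ J_t(x) := \min_{a \in A(x)} \Big[ P_t \, c_t(x, a) + \int_X J_{t+1}(y) \, Q(dy \mid x, a) \Big], \quad x \in X. \tag{3.1} \]
Suppose that $J_t$ is a measurable function for each $t = 0, 1, \ldots, T$ and that there is a selector $f_t \in \mathbb{F}$ such that $f_t(x) \in A(x)$ attains the minimum in (3.1) for all $x \in X$ and $t = 0, 1, \ldots, T$; i.e.
\[ J_t(x) = P_t \, c_t(x, f_t(x)) + \int_X J_{t+1}(y) \, Q(dy \mid x, f_t(x)). \]
Then the deterministic Markov policy $\pi^* = \{f_0, \ldots, f_T\}$ is optimal and the optimal value function satisfies $J_\tau(x) = J_0(x)$, $x \in X$.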
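Proof. Let $\pi \in \Pi$ be an arbitrary policy and define
\[ C_t(\pi, x) := E^\pi \Big[ \sum_{s=t}^{T} P_s \, c_s(x_s, a_s) \,\Big|\, x_t = x \Big] \]
for $t = 0, 1, \ldots, T$, and $C_{T+1}(\pi, x) := 0$. $C_t(\pi, x)$ is called the cost from time $t$ onwards when the policy $\pi$ is used and $x_t = x$.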
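For a finite model, the backward recursion (3.1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the integral is replaced by a finite sum, the weights $P_t$ are passed in as a list, and the two-state model (states `good`/`bad`, actions `keep`/`fix`, all costs and probabilities) is hypothetical and not taken from the paper.

```python
def backward_induction(states, actions, cost, trans, weights):
    """Backward recursion with J_{T+1} = 0 and
    J_t(x) = min_a [ P_t c_t(x, a) + sum_y Q(y | x, a) J_{t+1}(y) ].

    states  : list of hashable states (finite X)
    actions : function x -> list of feasible actions A(x)
    cost    : function (t, x, a) -> c_t(x, a)
    trans   : function (x, a) -> dict {y: Q(y | x, a)}
    weights : list [P_0, ..., P_T]
    Returns (J, f): J[t][x] are the value functions, f[t][x] the minimizers.
    """
    T = len(weights) - 1
    J = [dict() for _ in range(T + 2)]
    f = [dict() for _ in range(T + 1)]
    J[T + 1] = {x: 0.0 for x in states}
    for t in range(T, -1, -1):
        for x in states:
            best_a, best_v = None, float("inf")
            for a in actions(x):
                v = weights[t] * cost(t, x, a) + sum(
                    p * J[t + 1][y] for y, p in trans(x, a).items()
                )
                if v < best_v:
                    best_a, best_v = a, v
            J[t][x], f[t][x] = best_v, best_a
    return J, f


# Hypothetical two-state model used only to exercise the recursion.
states = ["good", "bad"]


def actions(x):
    return ["keep", "fix"]


def cost(t, x, a):
    if a == "fix":
        return 3.0
    return 1.0 if x == "good" else 4.0


def trans(x, a):
    if a == "fix":
        return {"good": 1.0}
    return {"good": 0.5, "bad": 0.5} if x == "good" else {"bad": 1.0}


J, f = backward_induction(states, actions, cost, trans, weights=[1.0, 0.5])
print(J[0], f[0])
```

The selectors `f[t]` returned by the sketch play the role of the functions $f_t$ in the theorem; ties are broken in favor of the first action listed.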

Description of the problem
Consider a system consisting of n machines, each with an independent stochastic deterioration process, whose possible levels of deterioration are denoted by $1, 2, \ldots, D$, where $D$ is a positive integer. Level one means that the machine is in perfect condition. Suppose that the levels are ordered by increasing deterioration, i.e., a machine operating at level $i$ is in better condition than one operating at level $i + 1$, $i = 1, 2, \ldots, D - 1$.
Let $P = (p_{i,j})_{D \times D}$ be the matrix of transition probabilities from level $i$ to level $j$ (identical for the n machines). Since a machine cannot move to a better level of deterioration, $p_{i,j} = 0$ if $j < i$. Let $g : \{1, 2, \ldots, D\} \to \mathbb{R}$ be a known function that measures the operating cost of a machine. Suppose that $g$ is nondecreasing, i.e. $g(1) \leq g(2) \leq \ldots \leq g(D)$, and that at the beginning of each time period one of two options, a) or b), can be taken for each machine: keep it in operation or replace it. Also consider that the system can operate for $\tau$ time periods, where $\tau$ is a random variable, independent of the process followed by the system, with probability distribution $P(\tau = t) = \rho_t$, $t = 0, 1, 2, \ldots, T$, where $T$ is a positive integer.
The problem consists in determining optimal replacement policies that minimize the expected total cost of operation of the system.
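The structural assumptions on the data (no improvement without replacement, nondecreasing operating cost, a probability distribution for the horizon) can be checked mechanically before any computation. The sketch below is a hypothetical helper, not part of the paper's program; the function name `check_problem_data`, the tolerance and the sample values are assumptions made for illustration.

```python
def check_problem_data(P, g, rho, tol=1e-9):
    """Return True if (P, g, rho) satisfy the assumptions of the problem.

    P   : D x D list of lists, P[i][j] = probability of moving from
          level i + 1 to level j + 1 when the machine is not replaced
    g   : list of D operating costs, g[i] = cost at level i + 1
    rho : list [rho_0, ..., rho_T] with rho[t] = P(tau = t)
    """
    D = len(P)
    # Each row of P must be a probability distribution over the D levels.
    rows_ok = all(
        len(row) == D
        and all(p >= 0 for p in row)
        and abs(sum(row) - 1.0) <= tol
        for row in P
    )
    # A machine cannot move to a better level: p_{i,j} = 0 for j < i.
    no_improvement = rows_ok and all(
        P[i][j] == 0 for i in range(D) for j in range(i)
    )
    # The operating cost is nondecreasing in the deterioration level.
    g_monotone = len(g) == D and all(g[i] <= g[i + 1] for i in range(D - 1))
    # The horizon distribution must be a probability distribution.
    rho_ok = all(r >= 0 for r in rho) and abs(sum(rho) - 1.0) <= tol
    return rows_ok and no_improvement and g_monotone and rho_ok


# Hypothetical data with D = 3 levels and T = 3 (illustrative values only).
P_ok = [[0.6, 0.3, 0.1], [0.0, 0.7, 0.3], [0.0, 0.0, 1.0]]
P_bad = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.0, 0.0, 1.0]]  # improves 2 -> 1
g = [1.0, 2.0, 4.0]
rho = [0.1, 0.2, 0.3, 0.4]

print(check_problem_data(P_ok, g, rho), check_problem_data(P_bad, g, rho))
```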

Modeling the problem
The problem is solved through the theory of Markov decision processes, which requires building the corresponding Markov control model. At the beginning of an arbitrary time period, the state of the system can be recorded as $(d_1, d_2, \ldots, d_n)$, where $d_k$, $k = 1, 2, \ldots, n$, is the level of deterioration at which machine $k$ is operating. Therefore the state space is defined by
\[ X := \{(d_1, d_2, \ldots, d_n) \mid d_k \in \{1, 2, \ldots, D\}, \; k = 1, 2, \ldots, n\}, \]
with $\mathrm{card}(X) = D^n$ states. A replacement action can be represented by $(a_1, a_2, \ldots, a_n)$ with $a_k = 0$ or $a_k = 1$, where $a_k = 0$ means letting machine $k$ operate at level $d_k$ and $a_k = 1$ means replacing it. In this way
\[ A := \{(a_1, a_2, \ldots, a_n) \mid a_k \in \{0, 1\}, \; k = 1, 2, \ldots, n\}, \]
whose cardinality is $2^n$ actions.
For an arbitrary machine $k$, let $P^0 = (p^0_{i,j})_{D \times D} = P$ be the transition matrix of the deterioration process when the machine is not replaced. Let $P^1 = (p^1_{i,j})_{D \times D}$ be the transition matrix when the machine is replaced, where $p^1_{i,j} = 1$ if $j = 1$ and $p^1_{i,j} = 0$ otherwise (when machine $k$ is replaced, it moves to level one with certainty). Let $Q^a = (q^a_{i,j})_{D^n \times D^n}$ be the transition matrix of the system from state $i$ to state $j$ when the action $a \in A$ is taken, $i, j \in X$. By the independence of the deterioration processes of the machines, it is obtained that
\[ q^{(a_1, a_2, \ldots, a_n)}_{(i_1, \ldots, i_n), (j_1, \ldots, j_n)} = \prod_{k=1}^{n} p^{a_k}_{i_k, j_k}. \tag{5.3} \]
The one-stage cost is
\[ c(x_t, a_t) = \sum_{k=1}^{n} \gamma(x_{k,t}, a_{k,t}), \]
where $\gamma(x_{k,t}, a_{k,t}) = g(x_{k,t})$ if $a_{k,t} = 0$ and $\gamma(x_{k,t}, a_{k,t}) = g(1) + R$ if $a_{k,t} = 1$. Here $x_{k,t}$ and $a_{k,t}$ represent the state and the action at time $t$ of machine $k$, respectively. Therefore the performance criterion for this problem is
\[ j_\tau(\pi, x) = E \Big[ \sum_{t=0}^{\tau} \sum_{k=1}^{n} \gamma(x_{k,t}, a_{k,t}) \Big], \]
$\pi \in \Pi$ and $x \in X$, where $\tau$ is the planning horizon of the problem.
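The ingredients of the control model (state space, action set, joint transition matrices from the product formula (5.3), and the additive one-stage cost) can be assembled as in the following Python sketch. It assumes that every action is feasible in every state, and the function name `build_model` and the numerical data are hypothetical.

```python
from itertools import product


def build_model(n, D, P, g, R):
    """Build the finite control model for n machines with D levels each.

    Returns (states, actions, Q, cost), where Q[a][ix][iy] is the probability
    of moving from states[ix] to states[iy] under the joint action a.
    """
    states = list(product(range(1, D + 1), repeat=n))   # D**n states
    actions = list(product((0, 1), repeat=n))           # 2**n actions
    # Single-machine matrices: P0 = P (keep); P1 sends every level to level 1.
    P1 = [[1.0 if j == 0 else 0.0 for j in range(D)] for _ in range(D)]
    single = (P, P1)

    def q(a, x, y):
        # Product formula (5.3): prod_k p^{a_k}_{x_k, y_k}.
        prob = 1.0
        for a_k, x_k, y_k in zip(a, x, y):
            prob *= single[a_k][x_k - 1][y_k - 1]
        return prob

    def cost(x, a):
        # Sum over machines of g(x_k) if kept, g(1) + R if replaced.
        return sum(
            g[x_k - 1] if a_k == 0 else g[0] + R for x_k, a_k in zip(x, a)
        )

    Q = {a: [[q(a, x, y) for y in states] for x in states] for a in actions}
    return states, actions, Q, cost


# Hypothetical data: two machines, three deterioration levels.
P = [[0.6, 0.3, 0.1], [0.0, 0.7, 0.3], [0.0, 0.0, 1.0]]
g = [1.0, 2.0, 4.0]
R = 3.0
states, actions, Q, cost = build_model(2, 3, P, g, R)
print(len(states), len(actions))
```

Storing every $Q^a$ as a dense $D^n \times D^n$ matrix mirrors the description in the text, but memory grows as $2^n D^{2n}$, so this layout is only practical for small $n$ and $D$.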
Remark 5.1. The assumptions of measurability and existence of selectors in Theorem 3.3 hold because X and A are finite (see [8], p. 90).
The amount of numerical computation involved in equation (5.4) depends on the values of n, D and T. For this reason, a Matlab program that carries out the numerical calculations was implemented.
The algorithm followed in the program is presented below.
1. Read the following data:
• n, the number of machines.
• D, the number of deterioration levels.
• P , the transition matrix of deterioration process.
• R, the cost per unit replaced.
• T, the upper bound of the support of the horizon τ.
4. Calculate the transition matrices Q a for each a ∈ A, through (5.3).
5. Obtain the optimal policy and the optimal value function as follows:
5.1. Set t = T + 1 and J_t(x) = 0 for each x ∈ X.
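The steps of the algorithm can be sketched end to end in Python as follows. This is not the authors' Matlab program: it is a minimal illustration that assumes the stage weights are $P_t = P(\tau \geq t)$, that every action is feasible in every state, and that the recursion has the form of (3.1) with a finite sum. The function name `solve_replacement` and all numerical data are hypothetical.

```python
from itertools import product


def solve_replacement(n, D, P, g, R, rho):
    """Backward induction for n machines; returns (J_0, policy).

    J_0[x] is the computed optimal value for initial state x and
    policy[t][x] is a minimizing joint action at stage t in state x.
    """
    T = len(rho) - 1
    # Stage weights P_t = rho_t + ... + rho_T (assumed form of P_t).
    weights = [sum(rho[t:]) for t in range(T + 1)]
    states = list(product(range(1, D + 1), repeat=n))
    actions = list(product((0, 1), repeat=n))

    def row(level, a_k):
        # Next-level distribution of one machine: row of P if kept,
        # point mass at level 1 if replaced.
        return P[level - 1] if a_k == 0 else [1.0] + [0.0] * (D - 1)

    def cost(x, a):
        return sum(
            g[x_k - 1] if a_k == 0 else g[0] + R for x_k, a_k in zip(x, a)
        )

    # Step 4: joint transition probabilities via the product formula,
    # stored sparsely as Q[a, x] = {y: probability}.
    Q = {}
    for a in actions:
        for x in states:
            rows = [row(x_k, a_k) for x_k, a_k in zip(x, a)]
            Q[a, x] = {}
            for y in states:
                p = 1.0
                for r, y_k in zip(rows, y):
                    p *= r[y_k - 1]
                if p > 0.0:
                    Q[a, x][y] = p

    # Step 5: start from J_{T+1} = 0 and step backwards to t = 0.
    J = {x: 0.0 for x in states}
    policy = [None] * (T + 1)
    for t in range(T, -1, -1):
        J_new, f_t = {}, {}
        for x in states:
            values = {}
            for a in actions:
                future = sum(p * J[y] for y, p in Q[a, x].items())
                values[a] = weights[t] * cost(x, a) + future
            f_t[x] = min(values, key=values.get)
            J_new[x] = values[f_t[x]]
        J, policy[t] = J_new, f_t
    return J, policy


# Hypothetical data (illustrative values only).
P = [[0.6, 0.3, 0.1], [0.0, 0.7, 0.3], [0.0, 0.0, 1.0]]
g = [1.0, 2.0, 4.0]
R = 3.0
rho = [0.1, 0.2, 0.3, 0.4]

J1, policy1 = solve_replacement(1, 3, P, g, R, rho)
J2, policy2 = solve_replacement(2, 3, P, g, R, rho)
print(J1[(1,)], J2[(1, 1)])
```

Running the sketch for n = 1 and n = 2 on the same data gives a quick way to compare the value of the two-machine system with the sum of the corresponding single-machine values.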

Example
Consider the optimal replacement problem with a random horizon. Table 6.1 presents the optimal expected total cost for each initial state.
Table 6.2 reports the optimal action as a function of the stage and the state of the system.
For a given number of machines n, take as initial state the state in which every machine is in perfect condition, i.e. $x_0 = (1, 1, \ldots, 1)_{n \times 1}$. Let $J^n_\tau(x_0)$ be the optimal expected value. Table 6.3 shows the optimal value for different values of n; its graph is shown in Figure 6.1.
The numerical calculations show a linear relation between the number of machines and the optimal expected value with initial state $x_0$. Moreover, in this particular case the following relation holds: $J^n_\tau(x_0) = n J^1_\tau(x_0)$. If an analogous relation held for a general initial state, the problem with n machines could be divided into n problems with a single machine, which would bring considerable computational benefits.

Conclusion
In this paper, a problem of optimal replacement in a system of n machines with random horizon has been analyzed. The problem is modeled as a Markov decision process; with this approach the optimal replacement policy can be characterized through the dynamic programming equation. Moreover, a Matlab program that helps to solve numerical cases was developed. The results obtained for special cases suggest that the problem with n machines can be divided into n problems with a single machine, which would reduce the curse of dimensionality (see [7]).
a) Operate machine k, k = 1, 2, ..., n, at its current level of deterioration during this time period, or b) replace it by a new one at a fixed cost R > 0.
5.3. Replace t by t − 1 and calculate $J_t(x)$ for x ∈ X by means of the equation
\[ J_t(x) = \min_{a \in A(x)} \Big[ P_t \, c(x, a) + \sum_{y \in X} q^a_{x,y} \, J_{t+1}(y) \Big]. \]

Table 6.1: Optimal value for each initial state