Recent Advances in Information Technology (Special Issue)
Research Article | Open Access
Multiobjective Resource-Constrained Project Scheduling with a Time-Varying Number of Tasks
In resource-constrained project scheduling (RCPS) problems, ongoing tasks are restricted to utilizing a fixed number of resources. This paper investigates a dynamic version of the RCPS problem where the number of tasks varies in time. Our previous work investigated a technique called mapping of task IDs for centroid-based adaptation with random immigrants (McBAR) that was used to solve the dynamic problem. However, the solution-searching ability of McBAR was investigated over only a few instances of the dynamic problem. As a consequence, only a small number of characteristics of McBAR, under the dynamics of the RCPS problem, were found. Further, only a few techniques were compared to McBAR with respect to solution-searching ability on the dynamic problem. In this paper, (a) the significance of the subalgorithms of McBAR is investigated by comparing McBAR to several other techniques; and (b) the scope of investigation in the previous work is extended. In particular, McBAR is compared to a technique based on the Estimation Distribution Algorithm (EDA). As with McBAR, EDA is applied to solve the dynamic problem, an application that is unique in the literature.
In executing an air traffic schedule, bad weather or emergencies might occur whereby to-be-executed activities in this schedule are no longer feasible. Many real-world problems are set in this type of dynamic scenario where their objectives, constraints, or even dimensions may change in time [1–3]. This is particularly true for some resource-constrained project scheduling (RCPS) problems, a class of problems in which ongoing tasks are restricted to utilizing a limited number of resources. Some RCPS problems are NP-hard and are often approached using modern heuristic methods, in particular Evolutionary Algorithms (EA).
Many studies [6–9] have investigated scheduling problems that have fixed dimensions. However, there are important scheduling problems whose dimensions vary. For example, suppose that in implementing a resource-constrained schedule for an edifice construction, a large buried object is found. As a consequence, a new task to remove this object must be performed before other tasks in the schedule can continue or commence. If each variable in the problem of determining a schedule corresponds to a task, then the revision of the schedule to accommodate a new task entails a new problem that may have a greater number of variables (dimensions) than the original problem.
Many scheduling problems are multiobjective [10–13]. For example, in creating the edifice construction schedule above, one objective is to reduce the construction duration, which could be accomplished by employing a large number of laborers. However, due to overhead expenses (e.g., hazard fee) per laborer, the construction cost for many laborers could be higher than that for a few laborers for the same total man-hours. Now, if another objective is to reduce the construction cost, which could be accomplished by hiring fewer laborers, the construction duration could increase. Thus, in this example, the two conflicting objectives (minimization of construction cost and duration) cannot be achieved simultaneously. A problem that requires simultaneous optimization of conflicting objectives is referred to as a multiobjective optimization (MOO) problem.
This paper considers the biobjective dynamic RCPS problem in which minimization of the schedule cost and of the schedule duration are the conflicting objectives and in which the number of tasks varies in time, a variation that changes the dimension of the problem. It is referred to below as the dynamic problem. Our previous work investigated the performance (solution-searching ability) of the memory- and EA-based technique called mapping of task IDs for centroid-based adaptation with random immigrants (McBAR) in solving this problem. However, this investigation was performed on only a few instances of the problem. Consequently, only a few characteristics of McBAR were found under the dynamics of this problem. Further, only a few techniques were compared to McBAR with respect to performance in solving the problem.
The major goals of this paper are to
(1) legitimize some subalgorithms constituting McBAR. Legitimization of a subalgorithm of a technique is to manifest the decline in the effectiveness of this technique in solving a problem when the subalgorithm is replaced;
(2) extend the investigations in our previous work. In particular, the set of techniques compared to McBAR is augmented with a technique that utilizes the Estimation Distribution Algorithm (EDA). As with McBAR, this additional technique is applied to solve the problem. This application is unique in the literature.
This paper is organized as follows. Section 2 explores the knowledge useful to understand the problems and methods presented in this paper. Section 3 provides information on the problems and the methods used to solve these problems. The results obtained for applying these methods are described and investigated in Section 4, together with the demonstration of the way in which the goals above are accomplished. Section 5 presents the conclusion of this work.
2. Background Knowledge
The following five subsections contain general and specific background knowledge helpful for understanding the techniques and problems discussed in this paper. Section 2.1 investigates some RCPS problems, each with a time-varying number of tasks, and gives special emphasis to approaches applied to solve these problems. Section 2.2 presents the memory-based approach, upon which McBAR is founded. Section 2.3 provides information on some multiobjective optimization problems which are related to the dynamic problem considered here. Section 2.4 presents the Multiobjective Evolutionary Algorithm (MOEA). EDA is presented in Section 2.5, including its application to some RCPS problems.
2.1. Resource-Constrained Project Scheduling (RCPS)
A schedule, as a solution to any RCPS problem, is composed of tasks obeying some precedence relationship, such as that exemplified in Figure 1. In this figure, boxed numbers are IDs of tasks of an RCPS; directed links signify precedence relationships of these tasks; and labels “S” and “E” correspond to starting and ending tasks, respectively. Note that any task precedence network mentioned from here onwards is of the form just described. RCPS tasks are characterized by several attributes such as duration, starting time, and required resources (for example, personnel, materials, and fuel). In RCPS, the number of resources of the same type utilized by all ongoing tasks is constrained not to exceed a predefined limit.
A schedule implemented in a dynamic environment can turn into an infeasible or a low-quality one, for example, one involving a high cost of implementation. Thus, there could be a need to revise this schedule. However, the following rule must be taken into account in the revision: the schedule must be revised judiciously to avoid the high cost of altering its in-use components, which may therefore be preserved.
Given an RCPS problem, let the total number of tasks be defined as the sum of the number of finished, ongoing, and to-be-executed tasks at the moment of change in the dynamic environment that sets this problem. Note that task cancellation is not considered in this paper. In our previous work [14, 17] as well as in this paper, we consider three entities that could change in the environment: resource availability, task duration, and total number of tasks. The change in the total number of tasks is given the major emphasis.
2.1.1. Time-Varying Task Number
Several reactive scheduling approaches have been proposed to revise schedules to cope with the effects of a time-varying number of tasks. For example, in one job-shop scheduling approach, during the EA process of evolving a population of schedules, genes which correspond to newly arrived jobs are inserted into, and genes which correspond to finished jobs are removed from, the genotypes which correspond to the schedules. The resulting population is then evolved further to search for new high-quality schedules. It is worth noting that, despite the genotype alteration, significant improvement in the EA’s search convergence and in the quality of the solutions (schedules) was found [16, 19–21]. In another approach, tasks arriving randomly at the processors of a distributed computing system are put in a queue. These tasks are then processed in batches, each with a varying number of tasks. This process follows a schedule created through a Genetic Algorithm to obtain a minimal combined execution time of the tasks in every batch. In a further approach, genes that correspond to new tasks are also inserted into genotypes, which are then processed by EA to obtain a new high-quality schedule for multiresource scheduling with cumulative constraints. In this EA process, the lateness of the new schedule is minimized and, at the same time, important properties of the original schedule are preserved. EDA was applied to solve a non-RCPS problem whose dimension changes in time [24, 25]. However, the objective functions utilized in this application were simple. More techniques for revising schedules, to cope with the variation of the number of tasks in the environments that set these schedules, can be found in the literature.
2.1.2. Schedule Generation
We now discuss a process of generating schedules referred to as the serial schedule generation scheme (SSGS). SSGS is also a popular method to form the initial populations used to solve RCPS problems through EA. Before proceeding, some terms will be defined. Let a genotype be viewed as an ordered set of slots into which the IDs of the tasks that comprise a schedule are filled consecutively (e.g., from the leftmost slot to the right). Consider a genotype devoid of IDs. Once an ID is filled in, its corresponding task is called a scheduled task. A root task is defined as the task from which all other tasks succeed, for instance, the task labeled “S” in Figure 1. The root task is considered a scheduled task even though its ID is not filled in to the genotype. Eligible tasks have IDs not yet filled in to the genotype and are the immediate successors of scheduled tasks and/or the root task. For example, given the task precedence in Figure 1, if the scheduled tasks have IDs 1, 2, and “S,” the set of eligible tasks is the union of the sets of not-yet-scheduled immediate successors, as read from the figure, of each of these scheduled and root tasks.
A version of the SSGS algorithm to create a resource-constrained schedule is depicted in Algorithm 1. It starts by determining the set $E$ of IDs of eligible tasks when there is not yet any scheduled task except the root task. This implies that, at the start, $E$ is composed of the IDs of the tasks immediately succeeding the root task; in the last example, these are the immediate successors of task “S” in Figure 1. Let $n$ be the number of tasks to be scheduled. A loop is then executed $n$ times to fill all slots of the genotype utilized in this algorithm. In each cycle of this loop, indexed $g$, a task ID $j$ is selected randomly from $E$ and placed in the genotype at the slot indexed $g$. The starting time $s_j$ of the task that corresponds to this ID is then set to the earliest time no earlier than the maximum end time (start time plus duration) of all its predecessors, whose IDs are in the set of scheduled tasks. Further, the starting time is such that there are enough available resources for the task with ID $j$ to utilize over its entire execution period. Note that some resources in an RCPS environment could already have been allocated to some scheduled tasks and hence could be unavailable. The task with ID $j$ is now considered scheduled. The last step in the loop is to update $E$. After $n$ cycles, the genotype is completely filled and the starting times of the tasks that correspond to the IDs in the genotype are determined. Thus, a schedule is formed.
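The loop just described can be sketched in Python as follows. This is a minimal illustration, not the authors' Algorithm 1: it assumes a single renewable resource type, integer time units, an acyclic precedence network, and that a task becomes eligible only once all of its predecessors are scheduled. The function name `ssgs` and all parameter names are hypothetical.

```python
import random

def ssgs(durations, predecessors, demand, capacity, rng=None):
    """Serial schedule generation sketch for one resource type (assumption).

    durations:    {task_id: duration in integer time units}
    predecessors: {task_id: set of predecessor IDs}; an empty set means the
                  task immediately succeeds the root task "S"
    demand:       {task_id: resource items held while the task runs}
    capacity:     resource items available at any instant
    Returns (genotype, start_times).
    """
    rng = rng or random.Random()
    if any(demand[t] > capacity for t in durations):
        raise ValueError("a task needs more resource items than exist")
    usage = {}                      # time unit -> items used by scheduled tasks
    start, genotype, scheduled = {}, [], set()
    eligible = {t for t in durations if not predecessors[t]}
    for _ in range(len(durations)):
        task = rng.choice(sorted(eligible))
        # Earliest start: not before the latest end time of the predecessors.
        t0 = max((start[p] + durations[p] for p in predecessors[task]), default=0)
        # Delay the start until resources suffice over the whole execution period.
        while any(usage.get(t, 0) + demand[task] > capacity
                  for t in range(t0, t0 + durations[task])):
            t0 += 1
        for t in range(t0, t0 + durations[task]):
            usage[t] = usage.get(t, 0) + demand[task]
        start[task] = t0
        genotype.append(task)
        scheduled.add(task)
        # Update the eligible set: unscheduled tasks whose predecessors are all scheduled.
        eligible = {t for t in durations
                    if t not in scheduled and predecessors[t] <= scheduled}
    return genotype, start
```

Calling `ssgs` repeatedly with different random generators yields different task-precedence-feasible genotypes, which is how an initial EA population could be formed under these assumptions.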
2.2. Memory-Based Approaches
EA techniques which utilize memory record relevant information that corresponds to previous problems they have solved or to past evolutionary generations they have gone through. This information is retrieved to assist in solving a current problem or a problem at a current evolutionary generation. Suppose that a previous planning problem and a current problem are close, based on a certain measure. It could then be expected that the fitness landscapes that correspond to these problems are also close, by a different measure. Thus, the solutions to these problems could also be close, based on another measure. Owing to this proximity, few EA cycles could be required to evolve an initial population, containing solutions (or representatives of solutions) to the previous problem, into solutions to the current problem [6, 28]. This expectation underlies several memory-based approaches in EA [6, 28–30].
The performance (solution-searching ability) of memory-based EA approaches is highly dependent on the diversity of the population which they create and then evolve. These approaches can be undertaken in either explicit or implicit styles.
2.2.1. Explicit Memory-Based Approach
An explicit memory-based approach defines how information produced by EA in solving problems, and/or information about the environments that set these problems, is stored and retrieved. The combination of competitive learning and an explicit memory-based EA approach was applied in [29, 32] to solve some problems in an artificial dynamic environment. Another explicit memory-based EA approach is based on the human immune system. For each previous form of a dynamic knapsack problem that has analogues of antigens (molecules foreign to the immune system), a representative of the analogues of the pathogens in the system is kept in a memory and is used for solving a current form of the problem. The basic idea of a further approach was to search for, and then store in memory, solutions suitable for many considered environmental states. Based on some mathematical measures, the solution measured as being the most suitable was retrieved from the memory and then implemented on a current environmental state. However, if none of these solutions was measured as being suitable, a new optimal solution was searched for. In another work, a prioritized list of categories of scheduling task properties was created and stored in a memory. Tasks to be executed had their properties matched to the categories to produce a prioritized list of tasks, from which an initial population was formed and then evolved to determine the schedule of the tasks. Yet another system continuously learns of changes in an environment to dynamically update its knowledge base of pairs of environmental properties and solutions to the problem set in this environment. A pair whose environmental properties match, based on a mathematical measure, those of the current environmental state was retrieved from the knowledge base. The solutions contained in the retrieved pairs were then utilized to form an initial population, which was evolved through EA to search for optimal solutions to a problem set in the current state of the environment.
A closely related system was applied to a dynamic resource allocation problem in a command and control environment. This allocation problem considered risk and cost in a project implementation, factors which were lumped into one objective function.
2.2.2. Implicit Memory-Based Approach
An implicit memory-based approach defines the way in which representatives of information produced by EA for solving problems are stored and retrieved. As in the explicit style, this information may or may not be added to the information on the environment that sets the problems. In the scheduling domain, the implicit style has an advantage over the explicit style because, in a dynamic scenario, the schedule produced by the explicit style will swiftly become irrelevant due to variation in the priority and precedence order of schedule components.
Multistranded chromosomes (polyploidy) were utilized in [36–38] to store representatives of information that correspond to past environmental changes. This utilization was shown to be useful in searching for solutions, through a Genetic Algorithm, to some problems in a dynamic environment. In another work, the chromosomes have a multilevel genetic structure which serves as long-term memory and facilitates quick adaptation of a function optimizer to environmental changes.
2.3. Multiobjective Optimization (MOO)
Let us now explore some important features of the MOO problem. Let $\mathbf{x}$ be referred to as the decision vector (e.g., genotype) of decision variables (e.g., ID of a task in a schedule) and $f_i$ as the objective function which relates a decision vector to the $i$th objective value (e.g., cost to implement an entire schedule). The objective vector for a $k$-objective problem is denoted by
$$\mathbf{F}(\mathbf{x}) = \left(f_1(\mathbf{x}), f_2(\mathbf{x}), \ldots, f_k(\mathbf{x})\right). \tag{1}$$
The concept of dominance of a solution is relevant for comparing the quality of this solution to another solution of the same MOO problem. To explain this concept, consider two sets of indices, $I$ and $J$, where $I \cup J = \{1, \ldots, k\}$, $I \cap J = \emptyset$, and $I \neq \emptyset$. Let the indices be those of the objective functions contained in the expression of $\mathbf{F}$ in (1). A solution $\mathbf{x}$ dominates another solution $\mathbf{y}$, denoted by $\mathbf{x} \prec \mathbf{y}$, if such sets exist with $f_i(\mathbf{x}) < f_i(\mathbf{y})$ for all $i \in I$ and $f_j(\mathbf{x}) = f_j(\mathbf{y})$ for all $j \in J$; that is, if one or more of the objective functions each yield an objective value at $\mathbf{x}$ less than that at $\mathbf{y}$ and the rest of the objective functions each yield an objective value at $\mathbf{x}$ equal to that at $\mathbf{y}$. This definition is applicable when objectives are to be minimized. Otherwise, the inequality sign is reversed and “greater than” is used instead of “less than” in the definition of dominance.
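For example, suppose that there are two conflicting objectives to minimize, schedule duration and cost, which correspond to objective functions $f_1(\mathbf{x})$ and $f_2(\mathbf{x})$, respectively, where $\mathbf{x}$ is a schedule. If the cost of implementing schedule $\mathbf{x}_1$ is less than that of schedule $\mathbf{x}_2$, that is, $f_2(\mathbf{x}_1) < f_2(\mathbf{x}_2)$, and the duration of schedule $\mathbf{x}_1$ is shorter than that of schedule $\mathbf{x}_2$, that is, $f_1(\mathbf{x}_1) < f_1(\mathbf{x}_2)$, then $\mathbf{x}_1 \prec \mathbf{x}_2$; that is, schedule $\mathbf{x}_1$ is of better quality than $\mathbf{x}_2$. In this example, the sets of indices $I$ and $J$ in the definition of dominance above are equal to $\{1, 2\}$ and $\emptyset$, respectively.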
If $\mathbf{x}$ does not dominate $\mathbf{y}$ and $\mathbf{y}$ also does not dominate $\mathbf{x}$, they are referred to as nondominated. A set of nondominated solutions is called a nondominated set (NDS). There could be many solutions to a MOO problem. At least one of them may constitute an NDS. In practice, guided by experience and intuition, a decision maker may choose one solution from an NDS to implement in his or her field of interest. From here onwards, let the chosen solution be referred to as the chosen schedule if it is in the context of scheduling.
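A minimal sketch of the dominance test and of extracting a nondominated set is given below, assuming that every objective is minimized and that each solution is represented only by its tuple of objective values. The function names are illustrative and do not come from the paper.

```python
def dominates(fx, fy):
    """True if objective vector fx dominates fy (all objectives minimized).

    Equivalent to the definition in the text: fx is strictly smaller on at
    least one objective and equal on all the others where it is not smaller.
    """
    no_worse = all(a <= b for a, b in zip(fx, fy))
    strictly_better = any(a < b for a, b in zip(fx, fy))
    return no_worse and strictly_better

def nondominated_set(points):
    """Keep the points that no other point in the collection dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Objective vectors given as (duration, cost) pairs; values are made up.
schedules = [(10, 5), (8, 7), (12, 6)]
nds = nondominated_set(schedules)   # (12, 6) is dominated by (10, 5)
```

The pairwise scan is quadratic in the number of points, which is adequate for illustration; fast nondominated sorting, mentioned in Section 2.4, is the usual choice inside an evolutionary loop.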
One way to compare the quality of a set of solutions $A$, obtained by a method $M_A$, to another set of solutions $B$, obtained by another method $M_B$, is through the set coverage. Let $B_A$ be the set containing the elements of $B$ which are dominated by an element in $A$,
$$B_A = \{\mathbf{b} \in B : \exists\, \mathbf{a} \in A,\ \mathbf{a} \prec \mathbf{b}\}. \tag{2}$$
The set coverage is defined as
$$C(A, B) = \frac{|B_A|}{|B|}, \tag{3}$$
where $|\cdot|$ is the set cardinality. This definition can also be applied to monoobjective optimization.
Based on (3), the set coverage has a range of $[0, 1]$. It will be convenient for later discussions to have a set coverage-related quantity that has a range symmetric around zero. For this purpose, we define
$$D(A, B) = C(A, B) - C(B, A), \tag{4}$$
where $D$ is referred to as the differential set coverage, which is antisymmetric in its arguments. Assuming that $A$ and $B$ are nonempty and based on (4), $D$ has the range of values
$$-1 \le D(A, B) \le 1. \tag{5}$$
We then take the following. Suppose that $A$ and $B$ are sets of solutions computed, respectively, by techniques $M_A$ and $M_B$ under similar conditions. Technique $M_A$ performs better than technique $M_B$ under these conditions if $D(A, B) > 0$.
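The set coverage and the differential set coverage can be computed directly from their definitions. The sketch below assumes minimized objectives, nonempty solution sets, and solutions represented by tuples of objective values; the names `set_coverage` and `differential_set_coverage` are chosen here for illustration.

```python
def dominates(fx, fy):
    """True if objective vector fx dominates fy (all objectives minimized)."""
    return (all(a <= b for a, b in zip(fx, fy))
            and any(a < b for a, b in zip(fx, fy)))

def set_coverage(A, B):
    """Fraction of the elements of B dominated by at least one element of A."""
    dominated = [b for b in B if any(dominates(a, b) for a in A)]
    return len(dominated) / len(B)

def differential_set_coverage(A, B):
    """Antisymmetric difference of the two coverages; lies between -1 and 1."""
    return set_coverage(A, B) - set_coverage(B, A)

# Made-up (duration, cost) results of two hypothetical techniques.
A = [(1, 4), (2, 2)]
B = [(2, 5), (3, 3), (0, 9)]
d = differential_set_coverage(A, B)   # positive: A covers more of B than B covers of A
```

In this made-up example two of the three elements of `B` are dominated by `A`, and no element of `A` is dominated by `B`, so the technique that produced `A` would be judged the better one under the criterion above.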
2.4. Multiobjective Evolutionary Algorithm (MOEA)
As was pointed out in the Introduction, some RCPS problems which are NP-hard are often solved using EA. Further, recall that the dynamic problem considered here is an RCPS problem compounded by multiobjectivity. A class of EA-based methods suitable for solving multiobjective NP-hard problems is referred to as Multiobjective Evolutionary Algorithm (MOEA). A popular member of MOEA is the Nondominated Sorting Genetic Algorithm-II (NSGA-II). Being a type of EA, NSGA-II has an evolutionary process which starts from an initial population. Further, at a given evolutionary stage, the selection process in NSGA-II gives preference to individuals (solutions) far from crowded individuals in the search space associated with the problem being solved. This preference helps diversify the offspring that pass into the next evolutionary cycle.
Selection. The selection mechanism of NSGA-II is described as follows: at each evolutionary cycle of NSGA-II, fast nondominated sorting is applied to the population $P$, and as a consequence, each individual in $P$ is endowed with an integral Pareto rank $r$ and a crowding distance $d$. Let $F_r$ be the set of individuals each having a Pareto rank of $r$. Further, let $Q$ be the population of selected individuals. Starting from $r = 1$, each $F_r$ is included, in ascending rank, in $Q$, which starts empty. This inclusion stops before including an $F_r$ which would result in $|Q| > N$, where $N$ is the number of individuals to be selected. If at the moment of stopping $|Q| < N$, then the individuals in $F_r$ are sorted in descending order of their crowding distances, so that $F_r$ becomes an ordered set $F'_r$. Then the set of the first $N - |Q|$ elements of $F'_r$ is included in $Q$, so that $Q$ has exactly $N$ elements. As exemplified in Figure 2, if including a set $F_r$ in $Q$ would give $|Q| > N$, only a subset of it is included in $Q$. Now, if at the moment of stopping $Q$ already has $N$ elements, then there is no need to include any element of $F_r$.
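The truncation step of this selection can be sketched as below. The sketch assumes that fast nondominated sorting and the crowding-distance computation have already been carried out elsewhere, so the fronts and distances are simply passed in; `nsga2_select` and its parameters are hypothetical names, not part of any NSGA-II library.

```python
def nsga2_select(fronts, crowding, n_select):
    """Select n_select individuals from ranked fronts.

    fronts:   list of lists of individuals; fronts[0] holds Pareto rank 1
    crowding: {individual: crowding distance}
    """
    selected = []
    for front in fronts:
        if len(selected) + len(front) <= n_select:
            # The whole front fits, so include all of it.
            selected.extend(front)
        else:
            # Only part of this front fits: prefer the least crowded individuals.
            room = n_select - len(selected)
            by_distance = sorted(front, key=lambda ind: crowding[ind], reverse=True)
            selected.extend(by_distance[:room])
            break
    return selected

# Made-up fronts and crowding distances for illustration.
fronts = [["a", "b"], ["c", "d", "e"], ["f"]]
crowding = {"a": 1.0, "b": 1.0, "c": 0.5, "d": float("inf"), "e": 1.2, "f": 0.1}
chosen = nsga2_select(fronts, crowding, 4)
```

With four individuals to select, the first front is taken whole and the second front is truncated to its two least crowded members.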
2.5. Estimation Distribution Algorithm (EDA)
Estimation Distribution Algorithm (EDA) is an evolutionary heuristic which, instead of using EA operators, makes use of sampling and estimation of probability density functions (PDFs) to create its next-generation population. It is a heuristic that has the ability to detect and preserve good-quality building blocks of chromosomes, an ability that could be important in some applications. It had not previously been applied to the dynamic RCPS problem considered here; this application is unique in the literature.
Before discussing a particular algorithm of EDA, let us describe Figure 3. Suppose that genes in genotypes that correspond to individuals in a given population are task IDs. Each color block in this figure signifies the density of the genotypes that have task ID equal to the task ID in the vertical axis at gene index in the horizontal axis. Let the matrix of density depicted in the figure be referred to as the probability matrix.
Algorithm 2 presents a particular EDA algorithm. Beginning at its first generation ($t = 0$), EDA creates a probability matrix $P_0$ with equal entries. At each generation $t$, the probability matrix $P_t$ is sampled to form a set of genotypes $G_t$, sized $M$. Genotypes in $G_t$ are then selected to form another set of genotypes, $G'_t$, sized $\lceil \tau M \rceil$, where $\lceil \cdot \rceil$ is a round-up operator and $\tau$ is a constant. Their selection is based on the fitness of their corresponding phenotypes. A new probability matrix $P_{t+1}$ is then estimated from $G'_t$. The cycle of estimation, sampling, and selection is repeated until a stopping condition is met.
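The cycle of sampling, selection, and estimation can be illustrated with the following sketch. It is not the authors' Algorithm 2: it assumes that each gene is sampled independently from its own column of the probability matrix (so precedence constraints and repeated values are ignored), that fitness is maximized, and that the stopping condition is a fixed number of generations. All names and default values are hypothetical.

```python
import math
import random

def eda(fitness, n_values, length, pop_size=60, tau=0.3, generations=30, seed=0):
    """Univariate EDA sketch: P[g][v] is the probability that gene g takes value v."""
    rng = random.Random(seed)
    # First generation: every entry of the probability matrix is equal.
    P = [[1.0 / n_values] * n_values for _ in range(length)]
    best = None
    for _ in range(generations):
        # Sampling: draw each gene of each genotype from its column of P.
        pop = [[rng.choices(range(n_values), weights=P[g])[0] for g in range(length)]
               for _ in range(pop_size)]
        # Selection: keep the ceil(tau * pop_size) fittest genotypes.
        pop.sort(key=fitness, reverse=True)
        keep = pop[:math.ceil(tau * pop_size)]
        if best is None or fitness(keep[0]) > fitness(best):
            best = keep[0]
        # Estimation: relative frequency of each value at each gene position.
        P = [[sum(ind[g] == v for ind in keep) / len(keep) for v in range(n_values)]
             for g in range(length)]
    return best, P

# Toy fitness (an assumption for illustration): matches against a target string.
target = [2, 0, 3, 1, 2]
def matches(ind):
    return sum(a == b for a, b in zip(ind, target))

best, P = eda(matches, n_values=4, length=5)
```

Because the new matrix is built only from relative frequencies in the selected set, a value that disappears from that set at some gene gets probability zero and can never be sampled there again, which is one way to see the diversity loss discussed next.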
Based on elementary statistics, the probability that a genotype in the genotype set of some generation $t$ is also in the next generation ($t + 1$) after sampling is derivable from the probability matrix estimated at generation $t$. A promising genotype of high quality (e.g., fitness value) that has a low probability of persisting is therefore unlikely to appear in the next generation. Remedies for this EDA drawback have been proposed in the literature. Another drawback of EDA is that, after several generations, the diversity of the genotypes in the genotype set is lost for large $t$; remedies for this drawback have also been proposed.
Consider, in Figure 3, the visually detectable high-probability (blue-hued) block of genes indexed 15 to 17 with task IDs 5, 15, and 17, respectively. During the sampling of the probability matrix depicted in the figure, the task IDs that correspond to such high-probability blocks are more likely to appear at the gene indices of the offspring genotypes that correspond to the blocks, thereby preserving the blocks.
2.5.1. EDA in Scheduling
In [9, 45], EDA was applied to solve an RCPS problem in a static environment with tasks on various execution modes. The authors utilized a probability matrix $P_t = [p_{ij}(t)]$, where $i$ is the task index, $j$ is the gene index of the genotype used in EDA, and $t$ is the EDA generation/cycle count resembling that in Algorithm 2. Before the start of the first generation ($t = 0$) in the applied EDA, all entries of $P_0$ were set to the same value, which depends on the number of tasks in the environment and the genotype length. This implies that all of the tasks have equal probability of being placed into any gene location in any genotype formed during sampling of $P_0$ at the first generation.
A copy $P'$ of $P_t$ was used to generate a genotype as follows: the $j$th column of $P'$ is sampled to obtain a task that can be assigned to the gene indexed $j$ of the genotype. However, it could happen that not all tasks with nonzero corresponding probability in this column are eligible for the assignment due to the task precedence constraint in the RCPS problem. To remedy this, let $E_j$ be the set of indices of eligible tasks with nonzero corresponding probability in the column. The corresponding probabilities of these eligible tasks are provisionally revised, not for updating the entries of $P'$, by renormalizing them over $E_j$, that is, to $p'_{ij} / \sum_{k \in E_j} p'_{kj}$, where $i \in E_j$. The provisional probabilities are then sampled to obtain a task that is assigned to the gene indexed $j$. After this assignment, all entries of the row of $P'$ that corresponds to the obtained task are set to zero to avoid reassigning this task thereafter. Further, each column of $P'$ is renormalized. The steps described in this paragraph enable the assignment of a task to one gene only. They are repeated for each gene, from the first gene to the last, consecutively, thereby creating one genotype in the next-generation genotype set (step 2 of Algorithm 2).
To produce the other genotypes in the next-generation genotype set, the steps described in the last paragraph are repeated, each time starting with a fresh copy of $P_t$. The remaining steps in Algorithm 2 are executed to complete one cycle.
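The precedence-aware sampling of one genotype can be sketched as follows. The sketch assumes a single execution mode per task, an acyclic precedence network, and a uniform fallback when every eligible task has zero probability at a gene (a case the text does not cover); `sample_genotype` and its arguments are hypothetical names. Python's `random.choices` treats its weights as relative, so the provisional renormalization over the eligible tasks is implicit.

```python
import random

def sample_genotype(P, predecessors, rng):
    """Sample one task-precedence-feasible genotype.

    P:            {task_id: list of probabilities, one per gene index}
    predecessors: {task_id: set of predecessor task IDs}
    """
    work = {t: list(row) for t, row in P.items()}   # fresh copy of the matrix
    n = len(work)
    genotype, placed = [], set()
    for g in range(n):
        # Eligible: not yet placed and all predecessors already placed.
        eligible = [t for t in work if t not in placed and predecessors[t] <= placed]
        weights = [work[t][g] for t in eligible]
        if sum(weights) == 0:
            # Assumption, not from the paper: fall back to a uniform choice.
            weights = [1.0] * len(eligible)
        task = rng.choices(eligible, weights=weights)[0]
        genotype.append(task)
        placed.add(task)
        work[task] = [0.0] * n    # zero the row so the task is never reassigned
    return genotype

# Made-up four-task network: 1 precedes 2 and 3, which both precede 4.
preds = {1: set(), 2: {1}, 3: {1}, 4: {2, 3}}
P = {t: [0.25] * 4 for t in preds}
population = [sample_genotype(P, preds, random.Random(seed)) for seed in range(20)]
```

Each call starts from a fresh copy of the matrix, so the caller's `P` is left unchanged and can be reused for every genotype of the generation.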
2.5.2. Other EDA Applications
EDA has been applied to improve the performance of a technique, based on reinforcement learning (RL), for solving a multiobjective problem. In this work, a probability matrix was revised every time the technique’s RL system dynamically interacted with an environment. EDA has also been applied to classify tissues at the molecular level. The selection of genotypes (an implementation of step 3 in Algorithm 2) in this work is based on the Pareto ranks and crowding distances of their corresponding phenotypes.
EDA was applied in the optimization of a simple function with time-varying dimension in [24, 25]. In this application, the utilized probability model was a mixed Gaussian model whose number of clusters was determined through the Bayesian information criterion. EDA was applied to other dynamic optimization problems with fixed dimensions [44, 49, 50]. In one such approach, some parameters of the current state of a dynamic environment were used to retrieve, from a memory, the parameters of the probability matrix utilized for solving a problem set in a previous state of the environment. These retrieved parameters were utilized to form the probability matrix of an EDA process that solved the problem set in the current state of the environment. A correction for the loss of population diversity in the EDA process was applied.
3. Materials and Methods
This section presents a test environment that sets some instances of the problem solved by each of the techniques investigated. Further, it explores these techniques. Section 3.1 provides an intuitive description of the problem. The formal definition of this problem is in Appendix A. Section 3.2 describes the various instances of the problem on which the dynamics of McBAR’s performance are demonstrated. Section 3.3 presents the technique referred to as centroid-based adaptation with random immigrants (CBAR), which is the precursor of McBAR. Section 3.4 provides information on McBAR. Section 3.5 elaborates our approach in applying EDA to the problem. Further, it describes other techniques used to achieve goal 1 in the Introduction.
3.1. Intuitive Description of the Problem
The selection of the dynamic RCPS problem as the test problem in this paper is based on the popularity of RCPS in the study of adaptation in dynamic environments. The problem is set in a military operation environment, and so it is referred to as the military mission scheduling (MMS) problem. MMS is a process by which the goals of a commanding officer reach fruition. Extensive reviews of MMS can be found in the literature. Intuitive descriptions of the problem are given in this section with examples. In this section, the military operation environment that sets the problem is simply referred to as the environment or the dynamic environment (to stress its variability).
Consider the dynamic environment in which a snapshot is taken of its original state and at every moment it changes state. For example, a snapshot is taken of the environment that involves a valley occupied by rebels that pose a certain scheduling problem. Then, the next snapshot is taken when the environment changes to a state that, in addition to its last state, has a road-blocking landslide which hampers transport of logistics from a depot to the battlefield in the environment. This next snapshot poses a different scheduling problem. Note that a snapshot of the dynamic environment is a static environment.
Let $P_i$ be a biobjective RCPS problem where schedule cost and duration are the conflicting objectives to minimize. Further, let this problem be set in a snapshot of a dynamic environment changing state for the $i$th time. Note that since a snapshot is a static environment, $P_i$ is a static problem. Let the integral-valued index $i$ in $P_i$ denote the sequential order of state alteration (SOSA) of the dynamic environment. For instance, the third SOSA denotes the third moment at which the environment changes state. Let the zeroth SOSA correspond to the original state of the environment. Considering that the snapshot of the dynamic environment is taken immediately after this environment changes state, the sequential order of taking the snapshots is the same as the SOSA.
The dynamic RCPS problem, denoted here by $P$, is viewed as a sequence of static problems $P_i$; that is,
$$P = \langle P_0, P_1, \ldots, P_N \rangle,$$
where $\langle \cdot \rangle$ denotes an ordered set; subscripts denote SOSA; and $N$ denotes the number of moments at which the environment changes state. From here onwards, $P_i$ will be referred to as a subproblem of $P$. In addition, the index $i$ in $P_i$ has the range $0 \le i \le N$, assumed from here onwards. Intuitive descriptions of the subproblem are given first, before its formal definition, found in Appendix A. These intuitive descriptions are as follows.
Tasks in the environment are characterized by the following:
(1) task ID,
(2) type of performed activity,
(3) duration,
(4) status,
(5) number of items of a specified type of resource utilized,
(6) precedence relationship to other tasks,
(7) starting time.
The status of each of the tasks can be ongoing, finished, or not yet executed. Canceled unexecuted tasks are not considered in this paper. The input information to the subproblem is items 1 to 6 of each of the tasks in the environment. Some features of the tasks in the environment are found in Table 1. The first and second columns in this table are task IDs and durations, respectively.
The precedence relationships of tasks can be expressed as a network of nodes, which represent tasks, and arcs, which represent precedence relationships. An example network is illustrated in Figure 1. Now, suppose that, in Table 1, task 2 is to transport weapons and ammunition, task 16 is to interrogate prisoners of war (POWs), and task 17 is to clear a bombed area. Thus, the network in Figure 1 expresses the concept that the bombing of enemies must be done after the transportation of weapons and ammunition and before the clearing of the bombed area and the interrogation of POWs.
Resources such as soldiers, fuel, and weapons are assets that can be utilized for a system to function. In RCPS, each type of resource has a limited number of items, a number that is input information to the subproblem. The third to the last columns of Table 1 are, respectively, the resource types utilized by the tasks with IDs in the first column. For instance, task 12 is to bomb enemies for 16 time units and to utilize at most four light mortar batteries.
After one task is finished, resource items it had utilized are transferred to another task that will utilize all or some of the items or are returned to a depot called central base. Another input to the subproblem is the cost of moving each resource item from one task location to the next. Task activity location is identified by the ID of the task. For instance, location 7 in a battlefield is where the task with ID 7, for example, to care for refugees by one infantry company, is executed. Central base has a location label of zero.
The transfer status of a resource item can be either moved or unmoved. The availability-to-task status of a resource item can be available, broken/dead, or occupied (meaning utilized by a task). Additional inputs to the subproblem are the transfer and availability statuses of the resource items and the execution locations of the tasks.
3.1.3. Objective Functions
Minimizing schedule cost and minimizing schedule duration are the two objectives in . They are mathematically defined in Appendix A. As pointed out in the Introduction, these objectives are conflicting.
One constraint in the subproblem is implied by the precedence network of tasks in the environment; namely, the starting time of any task in this network must be later than or equal to the latest end time of its predecessors. For example, based on Table 1, task 3 has a duration of 16 time units. Based on Figure 1, if task 3 starts at five time units then, to satisfy the constraint, task 8 must start at or after 21 time units (5 + 16).
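The precedence constraint can be illustrated with a minimal Python sketch. The function name `earliest_start` and the dictionary layout are assumptions made for illustration, and the arc from task 3 to task 8 is taken from the worked example in the text rather than from the full network of Figure 1.

```python
def earliest_start(task_id, predecessors, start, duration):
    """Lower bound on a task's start: the latest end time of its direct
    predecessors, or 0 when the task has no predecessor."""
    return max(
        (start[p] + duration[p] for p in predecessors.get(task_id, ())),
        default=0,
    )

# Worked example from the text: task 3 lasts 16 time units and starts at time 5.
predecessors = {8: [3]}   # assumed arc 3 -> 8
start = {3: 5}
duration = {3: 16}
bound_for_task_8 = earliest_start(8, predecessors, start, duration)
print(bound_for_task_8)   # 5 + 16 = 21
```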
As mentioned above, is an RCPS type of subproblem. Thus, in , the number of items of any type of resource utilized by all ongoing tasks in the environment is constrained. For example, at most five C130 airplanes ( resource type) can be used simultaneously at any instant during the military operation in the environment. This constraint is denoted by the column heading of Table 1.
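One way to check the resource constraint for a single resource type is to sweep over the start and end events of the tasks that use it. The sketch below is illustrative only: the function names and the sample tasks are hypothetical, durations are assumed positive, and a task that ends at time t is assumed to release its items before another task starts at t.

```python
def peak_usage(tasks):
    """tasks: list of (start, duration, items_used) for ONE resource type.
    Returns the largest number of items in use at any instant."""
    events = []
    for start, duration, items in tasks:
        events.append((start, items))               # items are taken at the start
        events.append((start + duration, -items))   # and released at the end
    # At equal times, releases (negative deltas) sort before acquisitions.
    events.sort()
    peak = current = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak

def respects_capacity(tasks, capacity):
    return peak_usage(tasks) <= capacity

# Hypothetical C130 usage with the limit of five airplanes quoted in the text.
c130_tasks = [(0, 10, 3), (5, 10, 2), (10, 4, 3)]
print(respects_capacity(c130_tasks, 5))                  # True: the peak is 5
print(respects_capacity(c130_tasks + [(6, 2, 1)], 5))    # False: the peak is 6
```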
Some computational products of solving the subproblem are genotypes, each expressed as an ordered set where s are the IDs of tasks in the environment, is the gene index, and is the genotype length. Note that any genotype mentioned from here onwards has a form similar to that in (9). The ordering of IDs in the genotype is based on a given task precedence network. The ordering rule is the following: any ID in the genotype must correspond to a task whose successors do not have IDs where . In other words, each task appears in the genotype before all of its successors. For example, the section of a genotype satisfies the ordering rule based on Figure 4. Any genotype which satisfies the ordering rule is characterized as task-precedence feasible. Genotypes are generated either through SSGS (explained in Section 2.1.2) or by some other means, such as described in Section 3.3.
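The ordering rule can be sketched as a simple check in Python. The function name and the small successor map below are hypothetical and are not the network of Figure 4; the check also accepts a section of a genotype, in which case only pairs of tasks that are both present are compared.

```python
def is_precedence_feasible(genotype, successors):
    """True if every task appears before each of its successors.

    genotype   : list of task IDs (a whole genotype or a section of one)
    successors : dict mapping a task ID to the IDs of its direct successors
    """
    position = {task_id: idx for idx, task_id in enumerate(genotype)}
    if len(position) != len(genotype):        # a task ID occurs twice
        return False
    for task_id, succs in successors.items():
        if task_id not in position:
            continue
        for succ in succs:
            if succ in position and position[succ] < position[task_id]:
                return False
    return True

# Hypothetical network: task 1 precedes task 14, task 2 precedes task 7.
successors = {1: [14], 2: [7]}
print(is_precedence_feasible([1, 14, 2, 7], successors))   # True
print(is_precedence_feasible([14, 1, 2, 7], successors))   # False
```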
Further computational products of solving are the starting times of tasks in the environment. These starting times are determined either through SSGS or by some other means, such as described in Section 3.3.4. Any solution to the subproblem is an ordered set of the tasks with determined starting times and is referred to as a schedule, where task has ID in (9). Based on the above-mentioned ordering rule of IDs in any genotype, every task in (10) must not have a successor where .
Sample schedules are depicted in Figures 5(a) and 5(b), where the horizontal axis is time; filled rectangles represent tasks whose task IDs are the labels of the rectangles; and the lengths of the rectangles denote the durations of the corresponding tasks. All rectangles inside the region labeled correspond to tasks that utilize resources with the same label . All tasks in the schedules obey the task precedence relationships shown in Figure 4. For instance, in Figure 5(a), task 2 ends at 13 time units and task 7 starts at that same time. Hence, task 7 succeeds task 2, thereby maintaining the task precedence relationship. More details of Figures 5(a) and 5(b) will be given in the next section.
3.1.6. Dynamic Factors of
Sections 3.1.2 to 3.1.5 described the static subproblem of . Let us now describe the features of the dynamic . In the above-mentioned military mission environment that sets , it could happen that the estimated completion duration of a military task will not be met because of difficulties in defeating enemies. Thus, the task completion duration could vary from one snapshot of the environment to the next. The dynamic duration of any task with ID in the environment is modeled as where is the normal distribution with the mean as the predefined task duration (e.g., those listed in Table 1) and as the standard deviation.
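As a rough illustration of the duration model, the following Python sketch draws a duration from a normal distribution whose mean is the predefined duration. It follows only the verbal description above, since (11) itself is not reproduced here; the function name, the standard deviation of 3.0, and the clamp to a minimum duration are assumptions of this sketch.

```python
import random

def sample_duration(planned, sigma, rng, minimum=1.0):
    """Draw one duration from a normal distribution centred on `planned`.

    The clamp to `minimum` is an assumption of this sketch, added so that a
    sampled duration can never be zero or negative.
    """
    return max(minimum, rng.gauss(planned, sigma))

rng = random.Random(0)       # fixed seed, for repeatability only
planned = {12: 16.0}         # task 12 lasts 16 time units (from the text)
snapshot = {tid: sample_duration(d, 3.0, rng) for tid, d in planned.items()}
print(snapshot)
```

With a standard deviation of zero the function returns the planned duration unchanged, which corresponds to an environment without duration dynamics.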
It could also happen that the availability status of resources in the environment will change due to such factors as the fatigue of soldiers, the breakdown of equipment, and the delay in logistics. Further, it could also transpire that enemies retreat to recuperate and then attack at some later time. Consequently, new tasks, not accounted for in an original plan (a plan prior to any change in the environment), could be elicited. Thus, the total number of different tasks in the environment can change. The total number of tasks can only increase or be constant. Further, the number of finished tasks does not reduce the total number of tasks and there are no canceled unexecuted tasks.
To recapitulate, the availability status of resources used, duration, and total number of tasks are the dynamic factors in . To exemplify certain dynamics of , consider Figures 5(a) and 5(b) again. The schedule depicted in Figure 5(a) is one of the solutions to the subproblem set in the snapshot taken at the fifth SOSA of the environment and at 13 time units. The schedule depicted in Figure 5(b) is one of the solutions to the subproblem set in the snapshot taken at the sixth SOSA of the environment and at 16 time units. Each task in the schedule depicted in Figure 5(b) has a duration that is an alteration, following (11), of the duration of its counterpart task with the same ID in the schedule depicted in Figure 5(a). The starting times of the two tasks in each pair are identical, but their durations may differ because of the dynamics of the environment. This is true for all pairs of tasks from the figures with the same ID. The four tasks, with IDs 31 to 34, had already been added to the 30 tasks, with IDs 1 to 30, in the original plan.
In Figures 5(a) and 5(b), the darkest-background (red-background) rectangles correspond to finished tasks, for example, task 2; the lightest-background (yellow) rectangles to ongoing tasks, for example, tasks 1 and 7; and the other colored rectangles to tasks yet to be executed. Based on Table 1 and on the task precedence in Figure 4, at 13 time units (which corresponds to Figure 5(a)), tasks 1, 3, and 7 are still ongoing, task 2 is finished, and the rest are yet to be executed. Further, at 16 time units (which corresponds to Figure 5(b)), tasks 1 and 7 are still ongoing, tasks 2 and 3 are finished, and the rest are yet to be executed.
3.2. Subproblem Instances
Section 3.1 provides an intuitive description of the general features of the problem, while Appendix A presents specific details of these features. This section provides information on the parametric values and categorical types of the features. Let problem with concrete features be referred to as a problem instance. Different problem instances of could be set in different environments; for example, scheduling problems for sea and land battles are expected to differ.
Section 3.2.1 provides information on problem instances of , such as the types of resources utilized by the tasks and the number of new tasks at every SOSA of an environment. Section 3.2.2 defines the functions used to obtain the concrete features of a problem instance given some parameters. Section 3.2.3 gives details of the computer simulations of the instances of . Section 3.2.4 defines averages that are useful for describing the dynamic performance, in solving the problem instances, of the techniques in (defined in Section 3.5.2).
3.2.1. Instances of Problem
Problem instance of is expressed as where is an instance of which, based on (8), is a subproblem of set in the snapshot of an environment that has number of moments of changes. Note that there could be several simultaneous changes in the environment at a single moment of changes. As a sample notation of subproblem instance, denotes a subproblem of the instance, labeled 12, of problem set in the seventh snapshot of the environment. Based on the ordered set in (12), also denotes the seventh subproblem of instance labeled 12 of problem . Any environment mentioned in this section is one that sets the problem instance and will be referred to as the environment.
Each task in the environment utilizes only one of the following four types of resources:
(i) light mortar batteries () ;
(ii) infantry companies () ;
(iii) C130s () ;
(iv) Apache helicopters () .
The constraint on the number of items of any listed resources that can be utilized by all ongoing tasks in the environment is indicated beside the resource label . Note that the availability status (defined in Section 3.1.2) of any of the resources can change from available to occupied (e.g., mortars firing) or vice versa (e.g., mortars returned from combat to depot) in the course of executing a schedule, that is, without the effects of the environmental dynamics. The only effect of the environment on the availability status, being proposed in this paper, is to change this status to broken (e.g., killed infantry soldier) from being available or occupied.
As mentioned in Section 3.1.2, properties of the tasks in the environment are found in Table 1. The environment has 30 original tasks (existing in the environment before it is changed) whose precedence relationships are depicted in Figure 1. A total of ten new tasks are added to the original tasks during the entire dynamics of the environment. New tasks that occur at a particular SOSA of the environment receive consecutive IDs that start from one more than the maximum task ID prior to this SOSA. For example, if, at the last SOSA, the maximum task ID is 30 and four new tasks are added at the current SOSA, then these new tasks will have IDs 31 to 34.
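The ID assignment for new tasks amounts to a one-line rule, sketched here in Python with a hypothetical helper name.

```python
def new_task_ids(max_existing_id, n_new):
    """IDs given to a batch of n_new tasks that appears at one SOSA."""
    return list(range(max_existing_id + 1, max_existing_id + n_new + 1))

# Example from the text: maximum ID 30 and four new tasks.
print(new_task_ids(30, 4))   # [31, 32, 33, 34]
```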
The considered types of changes in the environment are listed in Table 2. The first column contains labels of types of changes, and the second contains the parameters that change simultaneously. For example, type 6 is a change in which task duration, total number of tasks, and resource availability change simultaneously.
The types of changes are chained to form a sequence of changes. Each type of sequence of changes (TSC) is presented as a column, labeled , in Table 3. The first and last columns of this table contain, respectively, the SOSAs and times at which specific types of changes occur. For example, sequence begins and is followed by changes in task duration only (type 0 in Table 2). The next type of change in this sequence is simultaneous changes in task duration, total number of tasks, and resource availability (type 6 in Table 2) that occur at the third SOSA of the environment. The number of moments at which the environment changes state is .
Some components of each TSC are types of changes that involve an increase in the total number of tasks. For example, has types of changes 6, 5, 2, and 3 that occur, respectively, at the third, eighth, tenth, and SOSAs of the environment and all involve an increase in the total number of tasks, based on Table 2. Task number increase sequence (TNIS), labeled as in Table 4, is a sequence of the numbers of new tasks that appear at some SOSAs of the environment. The sequence order of these numbers is the order of appearance of the batches of new tasks. Using the example above, if has TNIS of , there will be five, three, one, and one new tasks that appear at the third, eighth, tenth, and SOSAs of the environment, respectively, based on Table 4. The first column of Table 4 is the order in which the batches of new tasks appear. Note that, even with the same TSC, a different TNIS could give a different number of new tasks at a given SOSA of the environment.
Task precedence networks that correspond to TNIS of , , , and are illustrated in Figures 6(a) to 6(d), respectively, while that of is in Figure 4. They are formed by placing the new tasks that appear in the environment into the original task precedence network, illustrated in Figure 1, and they differ in form according to the locations in which the new tasks are placed.
Some components of the sequences of changes in Table 3 involve change in task duration. The amount of change in task duration is modeled by (11) whose has to be specified. To recapitulate, a instance is defined by a particular sequence (e.g., ) of changes in the environment, subsequence (e.g., ) of increases in the total number of tasks in the sequence, and the value of in (11).
Table 5 lists instances of labeled from 1 to 30. With the column of types of TNIS as the reference column, the instances with labels at the left and right correspond to and , respectively; instances with labels under have TSC of ; and instances with labels at the same row as have TNIS of . For example, instance 25 has TSC of , TNIS of , and task duration changes modeled by (11) with of 3.0.
3.2.2. Tables as Functions
Let us define some functions that yield values in Tables 2 and 4 and which are useful in the succeeding sections. Given a problem instance labeled , a TSC can be obtained using Table 5. Given an SOSA of an environment that sets , a change type label can be determined using in Table 3. Thus, Tables 5 and 3 serve as a function , Using in Table 2, the environmental attributes that change simultaneously can be determined. For example, instance 3 has TSC of , based on Table 5. Based on the column of Table 3, the change type at the eighth SOSA of the environment is 5; that is, . Based on Table 2, type 5 of changes signifies simultaneous changes in resource availability and the total number of tasks in the environment.
Table 5 serves as a function to map an instance label to type of task-precedence network, Note that, based on the column of Table 3, the total number of tasks increases for the 3rd occasion at the eighth SOSA of the environment. Further, ; that is, instance 3 has TNIS of . The intersection of the third row, which is the third occasion order, and column in Table 4 is two, the number of new tasks appearing at the eighth SOSA of the environment. Thus, the set of Tables 2, 4, and 5 serve as a function to determine the number of new tasks, This function is applicable only when yields a type of changes that is 2, 3, 5, or 6 which, based on Table 2, all involve an increase in the total number of tasks. Otherwise, there is no new task at a given SOSA of the environment.
For convenience in the succeeding discussions, let the functions , , and be distributive to vector elements; that is, if , For example, for an ordered set of instance labels , .
The environment is computer simulated whereby, to reflect a real-life scenario, its resources may be moved from one location of its task to another (e.g., tanks are moved from base camp to a valley); the status of its tasks and resources (enumerated in Section 3.1.2) may be altered (e.g., soldiers killed); and its type of dynamics (e.g., ) is implemented. The environment is simulated several times where, in each simulation:
(1) Random samples of the model in (11) are added to the original (before any change in the environment) durations of unfinished tasks in the environment. Consequently, the duration of an unfinished task in the snapshot taken at the SOSA of one environment simulation could differ from the duration of the task with the same ID in the snapshot taken at the same SOSA of other environment simulations. For example, if task 43, in the fifth snapshot of the second environment simulation, has a duration of 16 time units, its duration can be 25 time units in the fifth snapshot of the ninth environment simulation. Such a difference could occur for all unfinished tasks in all snapshots taken from the first to the SOSA of all environment simulations. All other features (e.g., number of in-use resources) of the environment are identical across simulations.
(2) Subproblem instances are sequentially solved (e.g., from to ) independently by every technique in . Further, the environment is simulated, at its SOSA, before solving the subproblem instance ( could be any given instance label) set in this environment. Note that despite differences between elements of a set of the subproblem instance simulations (due to the addition of different random values to the durations of tasks in different simulations), each technique in solves this same set. This approach reduces the number of variables to consider in comparing the techniques.
(3) The random seed used, by the evolutionary processes of techniques in , to solve subproblem, , set in a simulation of the environment, could differ from that of the other simulations of the environment.
Let us discuss the types of averages utilized to analyze the performances of techniques in . The average differential set coverage of technique over technique will be used to compare the performances of these techniques in solving subproblem instances of , at different values of . The parameter is found in (11) which models the change in duration of tasks in environments that set the instances. The average is defined as where is the differential set coverage (defined in (4)) of technique over technique determined for subproblem instance (in instance with label in Table 5) and at the simulation of this instance. Further, is the number of simulations and is the number of subproblems in , including . Furthermore, is a set of labels of instances in Table 5 that utilize a given value of ; for example, based on Table 5, .
The average differential set coverage of technique over technique will be used to compare the performances of these techniques at different types of changes listed in Table 2. It is defined as where is the number of all instances in Table 5; is the set of SOSAs of the environment that sets instance labeled at which type of changes occurs. To exemplify, consider instance labeled 1 which, based on Table 5, has TSC . In this TSC, SOSAs with type of changes (task duration change only, based on Table 2) are in the set , based on Table 3. Note that zeroth SOSA is excluded in the definition of the averages.
The average differential set coverage of technique over technique will be used to compare the performances of these techniques at different simulations of subproblem instances of . It is defined as where all indices are already defined in this subsection.
Note that all of the averages are derived from differential set coverage which, as mentioned in Section 2.3, is a measure of the performance of one technique over another. We therefore adopt the following interpretation: if any of the averages is greater than zero, then technique performs better than technique , in determining solutions to instances of listed in Table 5, over the domain at which this average is taken.
3.3. Centroid-Based Adaptation with Random Immigrants (CBAR)
McBAR is an extension of the memory-based EA technique referred to as centroid-based adaptation with random immigrants (CBAR). CBAR was applied in  to solve a class of biobjective dynamic RCPS problems with a fixed total number of tasks. is viewed as a sequence of the static RCPS subproblems and each has schedule cost and duration as objectives to minimize where , as in Section 3.1.1, is the index of a snapshot taken at the SOSA of an environment that sets . From this point to Section 3.3.6, the environment that sets is simply referred to as the environment or the dynamic environment.
Being an implicit memory-based technique (explained in Section 2.2), CBAR utilizes representatives of sets of solutions to past subproblems to compute the solutions to the current subproblem , where . Each representative is the centroid of genotypes that correspond to nondominated solutions to the problem. A centroid is a genotype whose gene is  where ; is the total number of tasks in the snapshot taken at the SOSA of the environment; is a genotype that corresponds to a schedule as a solution to the problem; is the gene/ID in the genotype (defined in (9)); is a population of genotypes that have one-to-one correspondence to all solutions in the nondominated set of solutions to the subproblem; and is the operator to round its real-value argument to an integer. Note in (23) that the equally-weighted average (mean) of IDs forms centroid genes.
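The centroid computation described above can be sketched in Python as a gene-wise rounded mean. The function name is hypothetical, and rounding halves upward is an assumption, since the text only states that the mean is rounded to an integer.

```python
import math

def centroid(population):
    """Gene-wise rounded mean of equal-length genotypes (lists of task IDs).

    Rounding halves upward (floor(x + 0.5)) is an assumption of this sketch.
    """
    length = len(population[0])
    size = len(population)
    return [
        math.floor(sum(genotype[j] for genotype in population) / size + 0.5)
        for j in range(length)
    ]

# Two hypothetical genotypes over tasks 1, 2, and 3.
print(centroid([[1, 2, 3], [2, 1, 3]]))   # [2, 2, 3]
```

The sample output repeats ID 2 and omits ID 1, which illustrates why a centroid may need to be repaired before it can be used as a genotype.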
3.3.1. Centroid Repair
A centroid formed through (23) is not necessarily task-precedence feasible, as defined in Section 3.1.5. If it is infeasible, it is repaired. Before explaining the repair process, a definition is needed. Given a genotype , its complementary genotype is less the elements of . For example, let the genotype and the centroid where numbers to the right of 3 are fixed but not shown for brevity. Thus, . The repair of centroid is undertaken by successively appending (as described below) IDs to a genotype that starts empty. The ID from is appended to if it satisfies the following appending rule: appending ID should result in a new whose corresponding complementary genotype contains no ID of a predecessor of the task with ID , and ID should not be found in the former . In other words, an ID may be appended only if it is not already in the genotype and all predecessors of its task are already there. However, if does not satisfy the appending rule, a different ID that satisfies the rule is randomly picked from and then appended to the former . Note that the complementary genotype used in checking whether satisfies the appending rule is different from the one from which is picked.
At the start of the repair of centroid , an attempt is made to append the first element of to an empty genotype following the appending process described previously. Note that, based on the explanation above, the element or a different element may be appended to . Next, an attempt is made to append the second element of to (which has one element at this stage). This repair process continues until an attempt has been made to append the last element of to . After this stage, is the repaired version of . The following three paragraphs provide examples of this repair process.
Consider the centroid in (24) whose repair will be based on the task-precedence network in Figure 4. The above-mentioned repair process starts by attempting to append the first gene/ID 14 of to an empty genotype . So let the new whose complementary genotype . However, ID 1 in is the ID of the task that is the predecessor (based on Figure 4) of the task with ID 14. Thus, the above-mentioned appending rule is violated. A random pick of an ID, say 1, from is then undertaken. If 1 is appended to the former (which is empty), the new becomes whose complementary genotype . Now, no task with ID in the new is a predecessor (based on Figure 4) of the task with ID 1. Thus, the appending of 1 to is allowed, thereby obtaining . Notice that, based on Section 3.1.5, the obtained is task-precedence feasible.
The next step of the repair process is to attempt to append the second gene/ID 1 of the centroid to . Note that 1 is already present in whose complementary genotype . Thus, based on the above appending rule, an ID, say 14, is randomly picked from . It is then appended to to obtain whose complementary genotype . Now, no task with ID in the last is a predecessor of the task with ID 14. Thus, the appending of ID 14 is allowed, thereby obtaining . Again, notice that this obtained is task-precedence feasible.
Consider now the third gene/ID 2 of . Appending it to yields whose complementary genotype is . Now, no task with ID in is a predecessor of the task with ID 2. Thus, the appending of 2 is permitted. Again, notice that the obtained is task-precedence feasible. The repair process continues until an attempt has been made to append the last gene of to . Based on this sample repair process, the resulting at each appending cycle is task-precedence feasible. It is straightforward to prove that after is completely appended, it is task-precedence feasible.
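The repair process walked through above can be sketched in Python as follows. This is a minimal reading of the appending rule, not the authors' implementation: the function name, the predecessor map, and the sample centroid are hypothetical, the centroid is assumed to have one gene per task, and the network is assumed to be acyclic.

```python
import random

def repair_centroid(centroid, predecessors, rng=None):
    """Build a task-precedence-feasible genotype from a centroid.

    centroid     : list of task IDs, possibly repeated or out of order
    predecessors : dict mapping EVERY task ID to the set of its direct
                   predecessors (empty set when there is none)
    """
    rng = rng or random.Random()
    all_ids = set(predecessors)
    repaired, placed = [], set()

    def may_append(task_id):
        # Appending rule: not yet in the genotype, all predecessors already in it.
        return task_id not in placed and predecessors[task_id] <= placed

    for gene in centroid:
        if gene in all_ids and may_append(gene):
            chosen = gene
        else:
            # Random pick among the remaining IDs that satisfy the rule.
            chosen = rng.choice(sorted(t for t in all_ids if may_append(t)))
        repaired.append(chosen)
        placed.add(chosen)
    return repaired

# Hypothetical network: task 1 precedes task 14, task 2 precedes task 7.
predecessors = {1: set(), 2: set(), 14: {1}, 7: {2}}
repaired = repair_centroid([14, 1, 2, 2], predecessors, random.Random(0))
print(repaired)
```

Because only IDs whose predecessors are already placed can be appended, the partial genotype is task-precedence feasible after every step, which mirrors the observation made in the worked example.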
Consider the expressions of the completely appended and the centroid , where is defined in connection with (23) and is a SOSA of the environment. The element of could be viewed as the result of mapping the element of . Let this mapping be denoted by and be referred to as random repairer. Formally stating the repair, where is defined in (23).
3.3.2. Initial Population
One subalgorithm of CBAR is the creation of an initial population which CBAR evolves to obtain solutions to the , , subproblem (defined in (22)) set in the snapshot of the environment. For , this initial population is a set of SSGS-generated genotypes. For , it is expressed as where
(1) is a set of centroids,
(2) is the repaired centroid of the population of genotypes that corresponded to all the nondominated solutions to the subproblem;
(3) The starting index of population is
(4) is a given maximum number of centroids in any initial population;
(5) is a set of SSGS-generated genotypes, ;
(6) is a fixed size of the initial population;
(7) is the chosen genotype in .
Note that the definition of in (28) restricts the maximum number of centroids in to . Recall from Section 2.1.2 that SSGS randomly selects IDs of eligible tasks, in a given RCPS environment, to form genotypes. Hence, these genotypes have a stochastic facet. Thus, the SSGS-generated set in (26) constitutes the random/diversifying component of the initial population .
3.3.3. Genetic Operators
Being an EA-based technique, CBAR has an evolutionary process. The crossover and mutation operators in this process are designed to suit our system. The designed crossover operator has the following algorithm. Consider two parent genotypes and for crossover from which two offspring genotypes and are generated. Each of the parent genotypes is broken into three parts at two randomly selected crossing points labeled and . The crossing point is between gene locations and and the crossing point is between gene locations and , where and is the length of the parent and offspring genotypes. These crossing points are the same for both parents. First, offspring inherits the first part of parent . Second, it inherits the genes of parent and consecutively places these genes at its gene locations to . Suppose that the gene location of is to be filled with a gene, where . The genes of parent , located from one to , are consecutively searched for a gene different from all genes in located before location . Once found, the search is stopped and the different gene is placed at location of . Third, the second step is repeated with genes of , located from to , inheriting from where the search process is applied to all genes of parent . The three parts of are inherited by consecutively using parents , , and in the first to the third steps, respectively. This inheritance process has similarities to the PMX crossover . The mutation of a genotype swaps two of its consecutive genes, at a randomly selected gene location with predefined probability, provided that the resulting genotype is task-precedence feasible, as defined in Section 3.1.5.
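The two operators can be sketched in Python as follows. This is one reading of the description above rather than the authors' code: the function names and the crossing-point arguments `q1` and `q2` are hypothetical, both parents are assumed to contain the same task IDs, and the mutation assumes that its input genotype is already task-precedence feasible.

```python
import random

def three_part_crossover(first, second, q1, q2):
    """Offspring = first[:q1], then locations q1..q2-1 filled from `second`,
    then the remaining locations filled from `first`. In each fill step the
    donor is scanned from its start and genes already inherited are skipped."""
    child = list(first[:q1])
    used = set(child)
    for donor, stop in ((second, q2), (first, len(first))):
        for gene in donor:
            if len(child) >= stop:
                break
            if gene not in used:
                child.append(gene)
                used.add(gene)
    return child

def swap_mutation(genotype, predecessors, rng, probability):
    """With the given probability, swap two consecutive genes at a random
    location, unless the swap would place a task before its predecessor."""
    if len(genotype) < 2 or rng.random() >= probability:
        return list(genotype)
    i = rng.randrange(len(genotype) - 1)
    left, right = genotype[i], genotype[i + 1]
    if left in predecessors.get(right, ()):
        return list(genotype)          # swap would break feasibility
    mutated = list(genotype)
    mutated[i], mutated[i + 1] = right, left
    return mutated

parent_a = [1, 2, 3, 4, 5]
parent_b = [2, 1, 4, 3, 5]
child_a = three_part_crossover(parent_a, parent_b, 2, 4)
child_b = three_part_crossover(parent_b, parent_a, 2, 4)
print(child_a, child_b)   # [1, 2, 4, 3, 5] [2, 1, 3, 4, 5]
```

Each offspring is a merge of two orderings that are both consistent with the precedence network, so an offspring of two feasible parents keeps every task ahead of its successors.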
3.3.4. Schedule Formation
Let us discuss a method, referred to as Schedule Formation, for determining a schedule from a genotype. Given the genotype of length as described in Section 3.1.5, its corresponding schedule is formed through the following scheme. Each gene/ID in this genotype is used consecutively, from first to last, to determine the starting time of the task with this ID. At the stage of using the gene/ID , the starting time of task with this ID is set to the earliest time later than or equal to where all terms in this equation are defined in Section 2.1.2, except . After the consecutive usage of genes, the starting times of all tasks with IDs in the genotype are all determined. Based on the definition of schedule in Section 3.1.5, the schedule that corresponds to the genotype is obtained. The cost to implement this schedule is determined through and its makespan is the end time of its last task to finish.
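A simplified version of such a decoding step can be sketched in Python. The sketch assumes that the lower bound on a starting time is the latest end time of the predecessors and that each task uses a single resource type with a fixed capacity; the exact bound of the paper (the equation referenced above), the handling of ongoing and finished tasks, and the cost computation are not reproduced. All names and the sample data are hypothetical, and durations are assumed positive.

```python
def form_schedule(genotype, duration, predecessors, resource_type, demand, capacity):
    """Assign start times gene by gene; returns (start_times, makespan).
    The genotype is assumed non-empty and task-precedence feasible."""
    start = {}

    def usage_at(x, rtype):
        return sum(demand[k] for k, s in start.items()
                   if resource_type[k] == rtype and s <= x < s + duration[k])

    def fits(task, t):
        rtype = resource_type[task]
        # Usage only rises at start times, so checking t and later starts suffices.
        points = [t] + [s for k, s in start.items()
                        if resource_type[k] == rtype and t < s < t + duration[task]]
        return all(usage_at(x, rtype) + demand[task] <= capacity[rtype] for x in points)

    for task in genotype:
        rtype = resource_type[task]
        if demand[task] > capacity[rtype]:
            raise ValueError(f"task {task} needs more items than exist")
        bound = max((start[p] + duration[p] for p in predecessors.get(task, ())), default=0)
        ends = {s + duration[k] for k, s in start.items() if resource_type[k] == rtype}
        candidates = sorted({bound} | {e for e in ends if e > bound})
        start[task] = next(t for t in candidates if fits(task, t))
    makespan = max(start[k] + duration[k] for k in start)
    return start, makespan

# Hypothetical data: three tasks that share five C130s; task 1 precedes task 3.
duration = {1: 4, 2: 5, 3: 2}
demand = {1: 3, 2: 3, 3: 2}
resource_type = {1: "C130", 2: "C130", 3: "C130"}
schedule, makespan = form_schedule([1, 2, 3], duration, {3: [1]},
                                   resource_type, demand, {"C130": 5})
print(schedule, makespan)   # {1: 0, 2: 4, 3: 4} 9
```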
3.3.5. The Algorithm of CBAR
Static subproblems in the dynamic problem (expressed in (22)) are solved by CBAR sequentially, from to , where is the number of SOSAs of the environment. To determine solutions to the subproblem (set in the original state of the environment), CBAR simply executes SSGS (explained in Section 2.1.2) to generate an initial population of genotypes. CBAR then evolves to obtain the population of evolved genotypes. Then by using the genotypes in for the Schedule Formation method described in Section 3.3.4, the population of baseline schedules is then determined. These schedules are solutions to the subproblem.
The evolutionary process of CBAR is performed through NSGA-II. In each cycle of this process, NSGA-II creates an offspring genotype population from a parent genotype population. We let NSGA-II use the revised genetic operators described in Section 3.3.3 to create the offspring. The offspring genotypes are fed to the Schedule Formation method to obtain the cost and duration of the schedules that correspond to these genotypes. The Pareto rank and crowding distance for each of these schedules are determined based on its cost and duration . Following the explanation in Section 2.4, NSGA-II uses the Pareto ranks and crowding distances of the schedules in its selection process to obtain its next-generation genotype and schedule populations.
To determine solutions to the subproblem , , CBAR starts by determining the centroid of genotypes that correspond to the nondominated solutions to the subproblem (set in the last snapshot before the snapshot of the environment) through (23) then repairs this centroid using (explained in Section 3.3.1), followed by forming an initial population through (26), and then evolves this initial population using NSGA-II as described above. The schedules that correspond to the evolved genotypes are the solutions to the subproblem and form a population .
From here onwards, CBAR denotes the technique that uses the SSGS algorithm to generate the initial population and the SSGS-generated genotype population in (26), at any , and that uses NSGA-II as its evolutionary engine to evolve the initial populations described in this section.
3.3.6. Chosen Schedule
After CBAR has computed the population of solutions to the subproblem, a schedule is chosen from . This chosen schedule is utilized in the simulation of the environment. When, in this simulation, the environment changes state for the SOSA, CBAR evolves an initial population to obtain the solutions to the subproblem set in this state of the environment. The initial population contains, based on (26), the chosen genotype that corresponds to the chosen schedule .
Consider the tasks, of a schedule that corresponds to a genotype in different from , whose IDs are the same as those of either ongoing or finished tasks of the chosen schedule (corresponds to ) at the instant of the SOSA of the environment. The starting time of each of these tasks is set equal to that of its counterpart (the task with the same ID) in the chosen schedule. This process is performed on all schedules that correspond to genotypes in different from and also on offspring genotypes in every cycle of the evolutionary process of NSGA-II. At the end of the evolutionary process of NSGA-II, the evolved schedules have copies of the ongoing or finished tasks of the chosen schedule. Note that these evolved schedules, as solutions to the subproblem (set in the current state of the environment), are revisions of the chosen schedule which is one of the solutions to the subproblem (set in the last state of the environment). Thus, the preservation (copying) of the ongoing and finished tasks in the revised schedules shows that CBAR abides by the schedule revision rule described in Section 2.1.
3.3.7. Inapplicability of CBAR
Let us now discuss why CBAR becomes inapplicable for solving the dynamic problem, which involves a change in the total number of tasks. Suppose that a static subproblem in is set in an environment snapshot that has total number of tasks. For this subproblem, a sample task precedence network is depicted in Figure 4 where original tasks (the only tasks found in the original state of the environment) and additional tasks are represented by numbered rectangles and circles, respectively.
Suppose that the total number of tasks is increased for the first time to at the SOSA of an environment that sets the subproblem; . When or more tasks are finished at the SOSA of the environment, IDs of new tasks can be placed instead of the IDs of finished tasks in each genotype in the initial population used to solve the subproblem through CBAR. Thereby, the need to change genotype length may be avoided. However, in the case where there is no task finished at the SOSA of the environment, the genotype must have a sufficient number of genes to accommodate IDs of original and new tasks. Thus, this condition requires the genotype to be implemented with number of genes. Therefore, the nondominated set of solutions (schedules) to the subproblem must correspond to genotypes—forming the set —each of length .
Before continuing our discussion, let us take the notation, to denote the combined genotype populations that correspond to the nondominated sets of solutions to static problems to , where is defined in (28). Based on the definition of and on (30), the number of genotype populations combined to form is limited at most to , the maximum number of centroids used in (28).
Suppose that there is no finished task prior to the SOSA of an environment that sets the dynamic problem, where . Thus, each genotype in the combined populations has number of genes. Based on (26), repaired centroids and chosen genotype in the initial population are derived from . Thus, they each have a length of . Considering that CBAR does not increase the length of each of the genotypes as it evolves them, then it cannot evolve (whose genotypes have the length ) to produce whose genotypes each must have the length .
3.4. Mapping of Task ID for Centroid-Based Adaptation with Random Immigrants (McBAR)
Let us now consider how CBAR is revised to overcome its unsuitability to solve the dynamic problem (defined in (8)). Let any environment mentioned in this section be the one that sets the problem and be referred to as the environment or the dynamic environment. Now, suppose that new tasks (not found in the original state of the environment) appear for the first time at the SOSA of the environment. Following , the partial remedy for the unsuitability is to insert new genes, which correspond to the new tasks, into each genotype in the population that corresponds to the nondominated set of solutions to the static subproblem where with as defined in (28). The insertions of the new genes into genotypes in populations to are depicted in Figure 7 as downward-pointing arrows. In this figure, the horizontal line is the SOSA of the environment. Further, the SOSA at which an arrow points corresponds to the subproblem whose nondominated solutions have corresponding genotypes that are being inserted with the new genes. Based on the range of , subproblems encountered in the gene-insertion process are to . These subproblems are set in the snapshots of the environment taken, respectively, from until before the first increase in the total number of tasks.
After the insertion of new genes into each genotype of , , the initial population in (26) is formed and then evolved by CBAR. The evolved genotypes are inputted to the Schedule Formation scheme in Section 3.3.4 to obtain the solutions to subproblem . The technique referred to as gene-inserting CBAR (GIBAR) is CBAR that includes executing the gene-insertion process every time new tasks appear in the environment. This nomenclature will be useful in Section 3.5.2.
Now, our previous work  showed the performance (solution-searching ability) of CBAR to be degraded by the gene insertion. The adopted resolution of this side effect is to map task IDs in each of the gene-inserted genotypes in to reduce the discontinuity of IDs along each genotype, where is the population of all gene-inserted genotypes of . The mapping process is applied for each , . To facilitate ease in succeeding discussions, let be the set of all ID-mapped gene-inserted genotypes of . Note that is derived from which is a set of genotypes that correspond to the nondominated solutions of the subproblem. Further, let the combined population be
In addition to the gene-insertion and mapping operation, the centroid repair in CBAR is revised to further increase the performance of CBAR. The resulting CBAR evolves (as explained in Section 3.3.5) an initial population derived from the ID-mapped genotypes in .
Let be the computing system used to solve the subproblem, , where is a given value. One attribute of any resource in the environment is the ID of the task that utilizes it. Thus, in system , IDs are part of the information about (a) resources, (b) genotypes used in the evolutionary process of CBAR to solve the subproblem, and (c) the task-precedence network of tasks in the environment. The insertion of new genes into each genotype in mentioned above is accompanied by the insertion of new nodes, with IDs of the new tasks, into the precedence network of tasks. The mapping of IDs in each gene-inserted genotype in mentioned above is accompanied by the mapping of IDs in the node-inserted precedence network of tasks and in the resources utilized by these tasks. When all IDs in system are transformed by the mapping, system is referred to as being in a mapped mode.
Under system in mapped mode, in each cycle of the evolutionary process of CBAR, the ID-mapped genotypes in the -derived initial population are inputted to the Schedule Formation method. This method then uses the mapped task IDs in resources and the precedence network of tasks in the snapshot of the environment, to obtain schedules with mapped IDs. At the end of the evolutionary process, evolved genotypes have corresponding schedules, with mapped IDs, as solutions to . Let be the set of evolved genotypes which correspond to the nondominated solutions (schedules) to .
Now, copies of the ID-mapped genotypes in and the node-inserted ID-mapped task precedence network are made. Then IDs in each of the copies are unmapped. The ID-unmapped genotypes in are processed in the Schedule Formation method that utilizes the ID-unmapped node-inserted task precedence network. Thereby, solutions to subproblem are produced. This production is referred to as solution production.
The copying and the subsequent unmapping of the ID-mapped genotypes are only performed when required, for example, for a decision maker, to view the solutions (schedules) to subproblem . However, the evolved ID-mapped genotypes in are utilized for finding solutions to problems set in the snapshots taken after the SOSA of the environment. The gene-insertion, mapping/unmapping, initial population of gene-inserted genotypes, and the revision of the centroid repair are fully described in Sections 3.4.1 to 3.4.4 below, respectively.
Suppose that no new tasks appear from the to the SOSA of the environment, where and that the second batch of new tasks occurs at the SOSA of the environment. Recall from the above discussion that gene insertion is performed only due to the occurrence of new tasks. Thus, there is no need to insert genes into each genotype in the populations to which correspond to the nondominated sets of solutions to the to , respectively. The non-insertion of genes is depicted in Figure 7 by the absence of arrows pointing towards the SOSAs. Now, although the population of genotypes that correspond to the nondominated solutions to is evolved from the initial population derived from the gene-inserted genotype populations in (31), genotypes in have not undergone the gene-insertion process. Thus, there is no downward-pointing arrow at SOSA in the figure.
To solve the subproblem , first, an initial population is formed using the populations in where is defined in (28); could be a population of evolved ID-mapped gene-inserted genotypes (if ) or a population of evolved ID-mapped uninserted genotypes (if ). Note that the insertion or noninsertion is depicted in Figure 7. Further, for , is the population of genotypes that correspond to nondominated solutions (schedules) to the subproblem. Furthermore, is a mix of populations of inserted and uninserted genotypes. Next, the initial population derived from is evolved by CBAR to obtain a population of evolved genotypes. Let be a set of evolved genotypes that, if fed to the solution production, yield all the nondominated solutions to the subproblem . Note that system is in mapped mode during this evolutionary process.
Let us now consider the SOSA of the environment where the second batch of new tasks occurs. Let be a set of populations , where and each could contain either ID-mapped gene-inserted genotypes or ID-mapped uninserted genotypes, a composition similar to that of (32). To solve subproblem , first, the ID-mapped genotypes in are unmapped to obtain a collection of populations , , of ID-unmapped genotypes. ID unmapping is also applied to task IDs in resources and in the precedence network of tasks in the snapshot of the environment. Thus, system is restored to unmapped mode. Second, new genes that correspond to the second batch of new tasks are inserted into the unmapped genotypes in to obtain a collection of populations , of ID-unmapped gene-inserted genotypes. This second batch of gene insertion is depicted in Figure 7 as upward-pointing arrows. New nodes, which correspond to the new tasks, are placed into the ID-unmapped task-precedence network. Third, a different mapping function is applied to IDs in each genotype in to obtain which is a collection of populations , , of ID-unmapped gene-inserted then ID-remapped genotypes. This mapping is also applied to the task IDs in resources and in the precedence network of tasks in the snapshot of the environment. Thus, system is restored to mapped mode once again. Fourth, the collection of populations is used to form an initial population which is then evolved by CBAR to obtain the population of evolved genotypes which has mapped IDs. Note again that system is in mapped mode during the evolution of the initial population. Fifth, when required, the evolved genotypes in are used in the solution production to obtain the nondominated solutions (schedules) to the subproblem. The successive processes of (a) restoring system to unmapped mode, (b) inserting new genes that correspond to new tasks, and (c) reverting to mapped mode are repeated every time a batch of new tasks appears in the environment after the first batch.
Based on the discussions until this point, system is always in mapped mode when CBAR evolves initial populations to determine solutions to subproblems set at or after the snapshot of the environment where the first batch of new tasks is found. In addition, the formation of the initial population uses the genotype populations that correspond to sets of nondominated solutions.
CBAR that undergoes all of the above-mentioned innovations is labeled as McBAR. In summary, McBAR comprises the following subalgorithms used to solve subproblem set in the SOSA of the environment:
(1) gene insertion,
(2) ID mapping operation ,
(3) minimal repair of centroid,
(4) initial population defined in (34) below,
(5) NSGA-II (explained in Section 3.3.5) used to evolve the initial population,
(6) maintenance of system at mapped mode,
(7) preservation of ongoing and finished tasks in a chosen schedule (described in Section 3.3.6),
(8) SSGS to form (0) and in (26).
Items (1) to (4) are fully discussed in Sections 3.4.1 to 3.4.4. The algorithm of McBAR is similar to that of CBAR (described in Section 3.3.5), except that items (1) to (4) are applied to form an initial population and that system is maintained at mapped mode, that is, item (6). Based on the definition of performance in Section 2.3, McBAR was demonstrated in  to perform better than CBAR when the environment undergoes an increase in total number of tasks.
3.4.1. Gene Insertion
Suppose that new tasks appear at the SOSA of the environment and consider a genotype in a collection (e.g., defined in (30) to (33)) of genotype populations. The new gene that corresponds to a new task is inserted to the right of and as near as possible to the gene, currently in , that corresponds to the immediate predecessor of the new task. This gene insertion abides by the gene ordering rule (mentioned in Section 3.1.5) of genes in any genotype. It is performed for each of the new tasks. The complete insertion process of the new genes is applied to every genotype in . A sample gene insertion is illustrated in Figure 8 where genes are represented by boxed numbers. Genes/IDs to correspond to the new tasks and the rest of IDs to those of original tasks (the only tasks found in the original state of the environment). The gene arrangement in this figure abides by the gene ordering rule that uses the task-precedence network in Figure 4 with circles that represent tasks 34 to 40 replaced with arcs.
Empirical investigations in  showed gene insertion to degrade the performance (as defined in Section 2.3) of McBAR. In the research of [16, 19–21], gene insertion was shown to be beneficial for searching for high quality solutions to some problems. However, no centroid (as defined in (23)) is utilized in their strategies, so the benefit does not necessarily apply to the performance of McBAR.
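The insertion rule can be illustrated with a short sketch. It is written under stated assumptions rather than taken from the implementation in this work: a genotype is a list of task IDs, the new task's immediate predecessors are given as a set, and when there are several predecessors the new gene is placed directly after the rightmost one so that it follows all of them. The helper names are hypothetical, and the sketch does not check the successors of the new task.

```python
from typing import Dict, List, Set


def insert_gene(genotype: List[int], new_id: int, predecessors: Set[int]) -> List[int]:
    """Insert `new_id` immediately to the right of its rightmost predecessor gene."""
    positions = [i for i, gene in enumerate(genotype) if gene in predecessors]
    insert_at = max(positions) + 1 if positions else 0  # no predecessor: front
    return genotype[:insert_at] + [new_id] + genotype[insert_at:]


def is_precedence_feasible(genotype: List[int], network: Dict[int, List[int]]) -> bool:
    """True if every task appears after all of its listed predecessors."""
    position = {gene: i for i, gene in enumerate(genotype)}
    return all(
        position[pred] < position[task]
        for task, preds in network.items()
        for pred in preds
    )


base = [1, 2, 3, 4, 5]
extended = insert_gene(base, 31, {2})  # new task 31 sits on the arc from 2 to 3
network = {2: [1], 31: [2], 3: [31]}
```

Because the new gene is placed as close as possible to its predecessor, the relative order of the original genes is unchanged and only the genotype length grows.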
Let us describe a possible cause of the degradation in the performance of McBAR. It is possible that new tasks in the environment cannot be anticipated; this could happen in practice. It is then sensible to label original tasks in the environment with IDs from one to , the number of original tasks. In this set-up, new genes that correspond to the new tasks must have task IDs (e.g., 31 to 33 in Figure 8) greater than . Now, as exemplified in Figure 8, the insertion of the new genes brings about a large discontinuity of task ID values along a gene-inserted genotype. As demonstrated in , this large discontinuity is a factor in the degradation of the performance of McBAR; that is, the large ID discontinuity is a gene insertion side effect.
3.4.2. Resolution of Effects
Before proceeding to discuss the resolution of the gene insertion side effect, let us define the precedence order of a task. This order is the maximum number of directed links that connect the task to the start of a given task precedence network following the reversed direction of the links. For instance, task 16 in Figure 4 is of seventh precedence order.
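The precedence order defined above is the length of the longest path from the start of the network to the task. A minimal sketch follows, assuming the network is given as a dictionary from task ID to its list of immediate predecessors and that a node without predecessors is the start and has order 0 (the numbering convention of Figure 4 may differ). The function name is hypothetical.

```python
from typing import Dict, List


def precedence_orders(predecessors: Dict[int, List[int]]) -> Dict[int, int]:
    """Longest number of links from the start of the network to each task."""
    orders: Dict[int, int] = {}

    def order(task: int) -> int:
        if task not in orders:
            preds = predecessors.get(task, [])
            # A node without predecessors is treated as the start (order 0).
            orders[task] = 0 if not preds else 1 + max(order(p) for p in preds)
        return orders[task]

    for task in predecessors:
        order(task)
    return orders


network = {0: [], 1: [0], 2: [0], 3: [1, 2], 4: [3], 5: [2], 6: [1, 4]}
orders = precedence_orders(network)
```

Task 6 has a short path through task 1 and a longer one through task 4; its order is determined by the longer path.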
The insertion side effect was resolved in  by mapping the IDs of tasks that belong to the same precedence order to unique values that are as close to each other as possible. Let the function represent the ID-mapping operation, where is the ID to be mapped. For example, the second-precedence-order tasks in Figure 4, namely 4, 6, 7, 8, 9, 10, 12, 14, and 38, are each mapped to themselves, except task 38, which is mapped to 15. As empirically demonstrated in , the mapping reduces, with high likelihood, the abrupt change of IDs along a gene-inserted genotype.
Note that although a task ID may be transformed, the task that it represents remains the same. For example, the starting time of and the type of resource utilized by the task are not affected by , except the task ID in the utilized resource. Further, aside from mapping task IDs in genotypes and resources, node IDs in an associated task precedence network are also mapped by .
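To make the idea of a reversible ID mapping concrete, the sketch below builds one possible bijection. It is not the mapping function of this work, which keeps most original IDs unchanged; here, as a simplifying assumption, tasks are simply renumbered consecutively after sorting by precedence order and then by original ID, so that tasks of the same order receive contiguous values. All names are hypothetical.

```python
from typing import Dict, List, Tuple


def build_id_mapping(orders: Dict[int, int]) -> Tuple[Dict[int, int], Dict[int, int]]:
    """Return (forward, inverse) maps; same-order tasks get contiguous new IDs."""
    ranked = sorted(orders, key=lambda task: (orders[task], task))
    forward = {old: new for new, old in enumerate(ranked, start=1)}
    inverse = {new: old for old, new in forward.items()}
    return forward, inverse


def apply_mapping(genotype: List[int], mapping: Dict[int, int]) -> List[int]:
    """Map (or unmap) every task ID in a genotype."""
    return [mapping[gene] for gene in genotype]


def max_jump(genotype: List[int]) -> int:
    """Largest absolute ID difference between neighbouring genes."""
    return max(abs(a - b) for a, b in zip(genotype, genotype[1:]))


# Task 31 is a new task that shares precedence order 1 with tasks 1 and 2.
orders = {1: 1, 2: 1, 3: 2, 4: 2, 31: 1}
forward, inverse = build_id_mapping(orders)
inserted = [1, 31, 2, 3, 4]
mapped = apply_mapping(inserted, forward)
unmapped = apply_mapping(mapped, inverse)
```

The mapped genotype has a much smaller largest jump between neighbouring IDs than the gene-inserted one, and applying the inverse map recovers the original IDs, which corresponds to switching between mapped and unmapped mode.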
3.4.3. Initial Population for a Current Static Subproblem
The subproblem set in the original state of the environment is solved by McBAR by evolving an initial population composed of SSGS-generated genotypes. The initial population used to determine the solutions to subproblem , where , is formed as follows. Suppose that new tasks first appear at the SOSA of the environment. Let be a collection of genotype populations , , where and is defined in (28). could be a population of gene-inserted ID-mapped genotypes (e.g., s in (31) or s in (33)) or a population of uninserted ID-mapped genotypes (e.g., s in (32)). As mentioned above, genotypes in correspond to the nondominated solutions to the subproblem. The collection is used to form an initial population, where
(1) is a set of centroids;
(2) is the -repaired (described below) centroid of ;
(3) is the set of genotypes generated through SSGS, which utilizes the ID-mapped and possibly node-inserted precedence network of tasks in the snapshot of the environment. Being produced through SSGS (which endows stochasticity to its produced genotypes), the elements of are referred to as random immigrants. As noted in Section 2.2, memory-based EA approaches need to diversify the solutions at some stages of their evolutionary cycles. In McBAR, this diversification is implemented through the random immigrants, thereby justifying the inclusion of in (34);
(4) is a fixed size of the initial population;
(5) is the ID-mapped chosen genotype in which corresponds to the chosen schedule from the population of nondominated solutions to the subproblem.
Based on the enumerated definitions above, all components of the initial population have genotypes with mapped IDs. This initial population is then evolved (as explained in Section 3.3.5) by McBAR to determine an ID-mapped evolved population. When necessary, this evolved population is fed to the solution production method to obtain a set of schedules as solutions to the problem.
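The composition of the initial population (repaired centroids, the chosen genotype, and random immigrants that fill the population to a fixed size) can be sketched as follows. This is an illustration only: `random_feasible_genotype` is a hypothetical stand-in for SSGS that merely draws a random precedence-feasible ordering of an acyclic network, and the function and parameter names are not taken from this work.

```python
import random
from typing import Callable, Dict, List


def random_feasible_genotype(predecessors: Dict[int, List[int]], rng: random.Random) -> List[int]:
    """Stand-in for SSGS: a random task order that respects precedence."""
    remaining = set(predecessors)
    genotype: List[int] = []
    while remaining:
        eligible = sorted(
            task for task in remaining
            if all(pred in genotype for pred in predecessors[task])
        )
        pick = rng.choice(eligible)
        genotype.append(pick)
        remaining.remove(pick)
    return genotype


def build_initial_population(
    repaired_centroids: List[List[int]],
    chosen_genotype: List[int],
    make_immigrant: Callable[[], List[int]],
    size: int,
) -> List[List[int]]:
    """Centroids and chosen genotype first, then immigrants up to `size`."""
    population = [list(c) for c in repaired_centroids] + [list(chosen_genotype)]
    if len(population) > size:
        raise ValueError("memory components exceed the population size")
    while len(population) < size:
        population.append(make_immigrant())
    return population


network = {1: [], 2: [1], 3: [1], 4: [2, 3]}
rng = random.Random(0)
population = build_initial_population(
    repaired_centroids=[[1, 2, 3, 4]],
    chosen_genotype=[1, 3, 2, 4],
    make_immigrant=lambda: random_feasible_genotype(network, rng),
    size=5,
)
```

The memory components carry information from past snapshots, while the immigrants supply the diversity discussed in item (3).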
3.4.4. Centroid Repair
The centroid components of (35) could be task-precedence infeasible based on the ID-mapped node-inserted precedence network of tasks in the snapshot of the environment. McBAR repairs task-precedence infeasible centroids using a function different from of CBAR. The centroid repair function perturbs the centroids as little as possible so that, intuitively, they retain as much of their character as centroids as possible.
Recall from Section 3.3.1 that the ID of a centroid is allowed to be appended to a genotype if it satisfies the appending rule. Otherwise, the function of CBAR randomly picks a different ID , that satisfies the appending rule, from and then appends this to , where is the complementary genotype of . Now, the function of McBAR differs from in the following. Instead of randomly picking an element of , the element of that is nearest in value to , and satisfies the appending rule, is the one appended to . If two IDs from are equidistant to and satisfy the appending rule, one of these is randomly picked for appending to . The appending rule uses the node-inserted ID-mapped precedence network of tasks in the snapshot of the environment if in (34) is greater than or equal to which is the SOSA of the environment at which new tasks appear for the first time. If , the appending rule uses the precedence network of original tasks.
As an example, consider the centroid ; its first gene/ID 14; the current ; and . As mentioned in Section 3.3.1, ID 14 does not satisfy the appending rule. However, the ID 3 from satisfies the appending rule and is nearest to 14. Consequently, and becomes . The ID 1 of is considered for appending to next. It satisfies the appending rule such that . The ID 2 of is considered next and also satisfies the appending rule, such that .
Let the first three genes of form the vector and the last be treated as vector. The last , produced through the function, has a distance of 11 to , while , produced (as mentioned in Section 3.3.1) through the function, has a distance of 18.38 to . This result supports the lesser perturbation effected by than by on repairing centroids.
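The minimal repair described in this section can be sketched as below. Since Section 3.3.1 is not reproduced here, the appending rule is assumed to be: an ID may be appended if it is not yet in the partial genotype and all of its predecessors already are. The network, the centroid values, and the function name are hypothetical and do not correspond to Figure 4.

```python
import random
from typing import Dict, List


def repair_centroid_minimal(
    centroid: List[int],
    predecessors: Dict[int, List[int]],
    rng: random.Random,
) -> List[int]:
    """Replace each infeasible centroid gene by the nearest-valued eligible ID."""
    repaired: List[int] = []
    complement = set(predecessors)  # IDs not yet appended
    for wanted in centroid:
        eligible = [
            task for task in complement
            if all(pred in repaired for pred in predecessors[task])
        ]
        if wanted in eligible:
            pick = wanted
        else:
            best = min(abs(task - wanted) for task in eligible)
            ties = sorted(t for t in eligible if abs(t - wanted) == best)
            pick = rng.choice(ties)  # equidistant IDs: pick one at random
        repaired.append(pick)
        complement.remove(pick)
    return repaired


network = {1: [], 2: [], 3: [], 4: [1], 5: [2, 3]}
repaired = repair_centroid_minimal([5, 1, 2, 4, 3], network, random.Random(0))
```

In this toy case the first centroid gene, 5, cannot be appended because tasks 2 and 3 are not yet placed, so the nearest eligible ID, 3, is appended instead; the remaining genes are kept wherever the rule allows. A random repair would differ only in drawing the replacement uniformly from the eligible IDs.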
3.4.5. Using the Median
The centroid expressed in (23) is computed using the mean, which is one measure of central tendency in statistics . It is of interest to determine the performance of McBAR when the centroid is computed using a median, in which case the centroid is renamed as medoid and McBAR as MedianBAR. The gene of a medoid is where is an element of the ordered set ; ; ; sort is a sorting operation; is the task ID in the gene of the genotype in a population of size ; and .
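The difference between the mean-based centroid and the median-based medoid can be illustrated gene by gene. The sketch assumes that both are taken independently at each gene position over the genotypes of a population; for an even population size, `statistics.median` averages the two middle values, which may differ from the element selection used in (36). The function names are hypothetical.

```python
from statistics import mean, median
from typing import List


def mean_centroid(population: List[List[int]]) -> List[float]:
    """Gene-wise mean of the task IDs over a population of genotypes."""
    return [mean(genes) for genes in zip(*population)]


def median_medoid(population: List[List[int]]) -> List[float]:
    """Gene-wise median of the task IDs over a population of genotypes."""
    return [median(genes) for genes in zip(*population)]


population = [
    [1, 2, 3, 4],
    [1, 3, 2, 4],
    [2, 1, 3, 4],
]
centroid = mean_centroid(population)
medoid = median_medoid(population)
```

With an odd population size every gene of the medoid is an existing task ID, whereas the mean can produce non-integer values. Both vectors may still be precedence infeasible and are passed through the centroid repair.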
3.5. Other Techniques
This section provides information on techniques, other than CBAR, McBAR, and MedianBAR, that are utilized to legitimize some components of McBAR, that is, to achieve goal 1 in the Introduction.
3.5.1. EDA on Problem
The EDA algorithm described in Section 2.5 is modified in order to solve problem and is consequently labeled as . Recall from Section 3.1.1 that subproblem is set in the original state of a dynamic environment. With , determines a set of solutions to through the following steps, starting with EDA cycle .
(1) The probability matrix in (6) is relabeled as to indicate the SOSA of the environment that sets the problem which it is being used to solve. It is set to have equal entries of ; that is, , where is the total number of tasks at the snapshot of the environment. This step is equivalent to step 1 in Algorithm 2.
(2) The probability matrix is then sampled, following the sampling scheme in Section 2.5, to form a population of genotypes. This step is related to step 2 in Algorithm 2.
(3) From , a set of schedules is formed through the schedule formation scheme described in Section 3.3.4.
(4) Costs and makespans of schedules in are utilized in NSGA-II's fast nondominated sorting to obtain each schedule's Pareto rank and crowding distance .
(5) genotypes are selected from , where is a round-up operator; ; and is a predefined constant referred to as percent of population. This selection is performed following the scheme described in Section 2.4 and is based on the Pareto rank and crowding distance of schedules in . This step is related to step 3 in Algorithm 2.
(6) These selected genotypes are combined with SSGS-generated genotypes to form a new population of size . Note that, as explained in Section 3.3.2, SSGS-generated genotypes add diversity to the population in which they are included. The inclusion of the SSGS-generated genotypes is intended to remedy the loss of diversity of the population produced by EDA as the evolutionary cycles of EDA progress .
(7) A new probability matrix is estimated from the new population . Following , the probability matrix to be used in the next step is where is a chosen value and is referred to as a learning rate. This equation allows the information, embedded in , from the last cycle to be carried over (learned) to the next EDA cycle. This step is related to step 5 in Algorithm 2.
(8) Steps 2 to 7 are repeated for a fixed number of generations, except at the last generation where only steps 2 and 3 are executed successively. The generation index is incremented at the end of every generation. At the last generation, where , the evolved population is utilized in the schedule formation scheme in Section 3.3.4 to form a set of schedules as solutions to the subproblem . Let and be relabeled as and , respectively.
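Steps 1 and 7 can be illustrated with a small sketch. Several details are assumptions rather than statements about this work: the probability matrix is taken to hold, in row i and column j, the probability that gene position i carries task j + 1; the uniform initial entry is taken to be one over the number of tasks; and the learning-rate update is written as a convex combination of the previous and the newly estimated matrix, a form commonly used in incremental-learning EDAs. The function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def uniform_matrix(n_tasks: int) -> Matrix:
    """Step 1 (assumed form): every entry equals 1 / n_tasks."""
    return [[1.0 / n_tasks] * n_tasks for _ in range(n_tasks)]


def estimate_matrix(population: List[List[int]], n_tasks: int) -> Matrix:
    """Relative frequency of each task ID at each gene position."""
    counts = [[0] * n_tasks for _ in range(n_tasks)]
    for genotype in population:
        for position, task in enumerate(genotype):
            counts[position][task - 1] += 1
    return [[c / len(population) for c in row] for row in counts]


def blend(previous: Matrix, estimated: Matrix, learning_rate: float) -> Matrix:
    """Assumed update: (1 - rate) * previous + rate * estimated, entry by entry."""
    return [
        [(1.0 - learning_rate) * p + learning_rate * e for p, e in zip(p_row, e_row)]
        for p_row, e_row in zip(previous, estimated)
    ]


selected = [[1, 2], [1, 2], [2, 1], [1, 2]]
estimated = estimate_matrix(selected, n_tasks=2)
updated = blend(uniform_matrix(2), estimated, learning_rate=0.5)
```

Because both inputs have rows that sum to one, each row of the blended matrix also sums to one, so it can be sampled again in the next cycle. The same estimation routine could be applied to stored genotypes of past snapshots when the uniform initialization is replaced by a memory-based one.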
Steps 6 and 1 are revised to determine solutions to subproblem with . Step 6 is replaced as follows. The selected genotypes are included with SSGS-generated genotypes and with a chosen genotype to complete a population of size . The chosen genotype corresponds to the chosen schedule randomly picked from the set of nondominated solutions in . The finished and ongoing tasks in the chosen schedule are preserved (as explained in Section 3.3.6) in the evolved schedules as solutions to the subproblem .
When there is no increase in the total number of tasks at the SOSA of the environment, , step 1 is replaced as follows. Let where is defined in (28) and is a set of genotypes that correspond to all nondominated solutions in the set of solutions to subproblem . From , the probability matrix is estimated. In this way, solutions (e.g., ) to subproblems set in past snapshots are utilized to search for solutions to the current () snapshot of the environment.
If there is an increase in the total number of tasks, say at the SOSA of the environment, , step 1 is replaced as follows. New genes that correspond to new tasks which appear at the SOSA are inserted into all genotypes in of (38) following the scheme in Section 3.4.1. From the gene-inserted , the probability matrix is estimated, where is the total number of tasks at the SOSA.
The algorithm of above implies that any genotype it processes has a fixed length during its evolutionary process. Thus, following the discussion in Section 3.3.7, the gene insertion is a legitimate step.
As mentioned in the Introduction, goal 1 in this paper is to legitimize some subalgorithms of McBAR. These subalgorithms are items 2 to 5 of those enumerated in Section 3.4. The legitimization is based on the idea that if technique differs from technique in regard to some components, and if, in general, performs better than in solving a given set of problems, then the components of distinct from are legitimate for to solve this set of problems. Consider a sample case of the legitimization. Both McBAR and (described in Section 3.5.1) individually solve some instances of problem . Note that McBAR uses EA operators while uses distribution sampling to produce offspring. If, in solving most of these instances, McBAR performs better than , then the EA operators utilized in McBAR are legitimate components of McBAR to solve these instances. To accomplish the legitimization goal, a set of techniques is formed, comprising McBAR and other techniques utilized to legitimize some subalgorithms of McBAR. These other techniques are as follows.
(1) Centroid-based adaptation with minimal repair (CBAM) differs from GIBAR (described in Section 3.4) only in using instead of to repair centroids. Based on the descriptions in Sections 3.3 and 3.4 of the subalgorithms of CBAR and McBAR, respectively, GIBAR differs from McBAR in using , in the mapping of task IDs in genotypes, and in the maintenance of system at mapped mode. Considering that CBAM differs from GIBAR by using which is also used by McBAR, it differs from McBAR in the mapping and in the maintenance of system at mapped mode. As implied in Section 3.4, the application of the mapping causes the maintenance of the system at mapped mode. Given this dependence, the fundamental difference between CBAM and McBAR is only in regard to . This difference will be used as the basis of legitimizing the in McBAR.
(2) Mapping of task IDs for centroid-based adaptation with stochastic repair (McBAS) differs from McBAR only in using random centroid repair instead of the minimal centroid repair . This difference will be used as the basis of legitimizing the in McBAR.
(3) Mapping of task IDs for centroid-based adaptation (McBA) differs from McBAR in randomly selecting genotypes from the set of genotypes that correspond to the nondominated solutions to the last subproblem , instead of generating genotypes through SSGS, to form the component of the initial population in (34), where corresponds to the current subproblem being solved. This difference will be used as the basis of legitimizing the random immigrant component (SSGS-generated genotypes) in McBAR.
(4) Mapping of task IDs for centroid-based adaptation with medoids (MedianBAR) differs from McBAR only in using the median (defined in (36)) instead of the mean to compute for the centroid in (34). This difference will be used as the basis of legitimizing the process of taking the mean in McBAR.
(5) is defined in Section 3.5.1 and differs from McBAR in its use of sampling and the estimation of a probability matrix (described in Section 2.5), instead of using mutation and crossover operators, to form the next generation offspring in its evolutionary process. Note that, based on Section 3.5.1, does not apply the mapping function, maintain system in mapped mode, or repair the centroid using . Based on Section 3.4, these subalgorithms of McBAR are devised due to the use of centroids as memory in McBAR. On the other hand, uses the probability matrix as memory, based on the revised step 1 in Section 3.5.1. Thus, McBAR fundamentally differs from in the use of EA operators and the centroid. These differences will be used as the basis of legitimizing the use of the EA operators and the centroid in McBAR.
(6) NDS of last population (NDLPOP) differs from GIBAR in randomly selecting genotypes, from the above-mentioned set of genotypes , to form the component of the initial population in (26). Note that the resulting is no longer a set of centroids but rather of genotypes of . Thus, NDLPOP differs from McBAR in being an explicit memory-based approach. This difference will be used as the basis of legitimizing the implicit memory-based approach of McBAR.
It is also of interest to compare the performance of McBAR and the previously enumerated techniques to those of the following techniques.
(7) Random immigrants (RI) creates an initial population for NSGA-II to evolve through SSGS. The length of each genotype in this initial population is equal to the total number of tasks right after the environmental change of state. This technique differs from other techniques in in applying no rule in creating its initial population, except the rules followed by SSGS.
(8) Gene-inserting CBAR (GIBAR) is described in Section 3.4.
Techniques that apply the mapping function , such as McBAS, McBA, McBAR, and MedianBAR, are classified as variants (of McBAR) and the rest of techniques in as nonvariants.
The parametric values of and other techniques in are listed in Tables 6(b) and 6(a), respectively. These values enable each of the techniques to yield high quality solutions to a significant problem instance of and are determined through an approach, described in our other work [53, 54]. In our preliminary investigations, the performances of the techniques in